[zeromq-dev] Improving zeromq in OOM conditions
sustrik at 250bpm.com
Mon May 16 09:38:24 CEST 2011
> Martin encouraged me to fix zeromq in out-of-memory conditions. Here
> are the first patches and first questions.
> There are a lot of explicit and implicit (e.g. inserting into an STL
> container) memory allocations in constructors in zeromq code. As long as
> we are encouraged not to use exceptions in zeromq code, we can't
> gracefully propagate exceptions from there. So I see three options:
> 1. Refactor code to have all the memory allocations in an `init()` method
> (other name?)
> 2. Allow throwing and catching exceptions in code which is not on
> the critical path
> 3. Move memory allocation code to an overridden `new` (which will probably
> turn it into a mess)
> BTW, if catching exceptions is discouraged at all, we need to rewrite
> all code which uses STL containers.
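For reference, option 1 from the list above could look roughly like the sketch below. It is only an illustration: `buffer_t` and its members are hypothetical names, not taken from the zeromq code base. The constructor allocates nothing, and `init()` reports failure through a return code instead of an exception.

```cpp
#include <cerrno>
#include <cstddef>
#include <new>

//  Hypothetical class, used only to illustrate two-phase initialisation.
class buffer_t
{
public:
    //  The constructor allocates nothing, so it cannot fail.
    buffer_t () : data_ (NULL), size_ (0) {}

    ~buffer_t () { delete [] data_; }

    //  All allocation happens here. Returns 0 on success, or -1 with
    //  errno set to ENOMEM when the memory cannot be obtained.
    int init (std::size_t size)
    {
        unsigned char *p = new (std::nothrow) unsigned char [size];
        if (!p) {
            errno = ENOMEM;
            return -1;
        }
        //  Replace any previous buffer only after the new one exists.
        delete [] data_;
        data_ = p;
        size_ = size;
        return 0;
    }

    std::size_t size () const { return size_; }

private:
    unsigned char *data_;
    std::size_t size_;

    //  Copying is disabled to keep ownership of data_ unambiguous.
    buffer_t (const buffer_t &);
    buffer_t &operator= (const buffer_t &);
};
```

The caller checks the return value of `init()` and can back out cleanly, at the cost of every object having a "constructed but not initialised" state.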
I would start with something much simpler. The proposed roadmap requires
heavy refactoring upfront without actually being able to test the thing
until much later on.
99.9% of memory allocated by 0mq is allocated at two places:
With large messages, most allocation happens in 1.; with small
messages, most memory is allocated by 2.
So, if you create a test program which publishes, say, 10MB messages
in a tight loop while the peer is not receiving, you'll hit the
allocation error in 1.
If you do the same with messages 1 byte long, you'll hit the error in 2.
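The shape of such a test can be sketched as follows. To stay self-contained the sketch does not call libzmq: `fake_sender_t` and `publish_until_failure` are hypothetical names, and the memory limit is simulated with a byte budget plus an assumed fixed per-message bookkeeping cost. In a real test the `send` function would wrap the library's send call, and the failure would come from the allocator itself.

```cpp
#include <cstddef>

//  Hypothetical stand-in for a socket whose peer never receives:
//  every accepted message stays queued and keeps consuming memory.
struct fake_sender_t
{
    std::size_t budget;    //  bytes still available (simulated limit)
    std::size_t overhead;  //  assumed per-message bookkeeping cost

    //  Returns true if the message was queued, false on simulated OOM.
    bool send (std::size_t msg_size)
    {
        //  Compare the terms separately so that the sum cannot overflow.
        if (overhead > budget || msg_size > budget - overhead)
            return false;
        budget -= overhead + msg_size;
        return true;
    }
};

//  Publishes messages of msg_size bytes in a tight loop and returns how
//  many were accepted before the first failure (max_msgs if none failed).
template <typename sender_t>
std::size_t publish_until_failure (sender_t &sender, std::size_t msg_size,
    std::size_t max_msgs)
{
    std::size_t sent = 0;
    while (sent < max_msgs && sender.send (msg_size))
        ++sent;
    return sent;
}
```

Running the loop once with a large message size and once with a tiny one shows which cost exhausts the budget first: the message bodies in the first case, the per-message overhead in the second.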
With the test program in hand, I would write, test and submit small,
gradual patches that improve reliability in these cases.