[zeromq-dev] Improving zeromq in OOM conditions
paul at colomiets.name
Tue May 17 00:06:26 CEST 2011
On Mon, May 16, 2011 at 10:38 AM, Martin Sustrik <sustrik at 250bpm.com> wrote:
> Hi Paul,
>> Martin encouraged me to fix zeromq in out-of-memory conditions. Here
>> are the first patches and first questions.
>> There are a lot of explicit and implicit (e.g. inserting into an STL
>> container) memory allocations in constructors in zeromq code. Since
>> we are encouraged not to use exceptions in zeromq code, we can't
>> gracefully propagate exceptions from there. So I see three options:
>> 1. Refactor code to have all the memory allocations in an `init()` method
>> (other name?)
>> 2. Allow throwing and catching exceptions in code which is not on the
>> critical path
>> 3. Move memory allocation code to an overridden `new` (which will probably
>> turn it into a mess)
>> BTW, if catching exceptions is discouraged altogether, we need to rewrite
>> all code which uses STL containers.
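
Option 1 above can be pictured with a minimal sketch, assuming a made-up class `buffer_t` that is not taken from the zeromq sources. The constructor only sets members and so cannot fail; all allocation is moved into `init()`, which reports failure through a return code and `errno` rather than an exception:

```cpp
#include <cerrno>
#include <cstddef>
#include <new>

// Hypothetical class for illustration only; not part of zeromq.
class buffer_t
{
public:
    // The constructor performs no allocation, so it cannot fail.
    buffer_t () : data_ (nullptr), size_ (0) {}
    ~buffer_t () { delete [] data_; }

    // All allocation happens here. Returns 0 on success, or -1 with
    // errno set to ENOMEM when the allocation fails.
    int init (std::size_t size)
    {
        // The nothrow form returns a null pointer instead of throwing.
        unsigned char *p = new (std::nothrow) unsigned char [size];
        if (!p) {
            errno = ENOMEM;
            return -1;
        }
        delete [] data_;   // release any buffer from an earlier init ()
        data_ = p;
        size_ = size;
        return 0;
    }

    std::size_t size () const { return size_; }

private:
    unsigned char *data_;
    std::size_t size_;

    // Non-copyable: the object owns a raw buffer.
    buffer_t (const buffer_t &);
    buffer_t &operator= (const buffer_t &);
};
```

A caller would then write `buffer_t b; if (b.init (n) != 0) { /* handle ENOMEM */ }`. The cost of this style is that every object has a constructed-but-unusable state between construction and a successful `init()`.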
> I would start with something much simpler. The proposed roadmap requires
> heavy refactoring upfront without actually being able to test the thing
> until much later on.
> 99.9% of memory allocated by 0mq is allocated at two places:
> 1. src/msg.cpp:56
Fixed in one of the patches attached to the previous email.
> 2. src/yqueue.hpp:108
Will look into that.
> With large messages, most allocation happens in 1.; with small messages,
> most memory is allocated by 2.
> So, if you create a test program which would publish, say, 10MB messages in
> a tight loop while the peer is not receiving, you'll hit the allocation error
> in 1.
> If you do the same with messages 1 byte long, you'll hit the error in 2.
> Having the test program, I would try to write, test and submit small gradual
> patches that improve reliability in these cases.
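
The test described above needs a running peer and real memory exhaustion. As a self-contained stand-in, the sketch below simulates the allocation failure instead of provoking it, and checks that the allocation path reports `ENOMEM` rather than aborting. Every name in it (`test_malloc`, `message_t`, `message_init`, `fill_until_failure`) is hypothetical and none of it comes from the zeromq sources; fault injection is an assumption of this sketch, not something proposed in the thread.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdlib>

// Hypothetical allocation hook: succeeds a fixed number of times, then fails.
long allocs_left = -1;   // -1 means "never fail"

void *test_malloc (std::size_t size)
{
    if (allocs_left == 0)
        return nullptr;          // simulate out-of-memory
    if (allocs_left > 0)
        --allocs_left;
    return std::malloc (size);
}

// Hypothetical message type standing in for a real message object.
struct message_t
{
    void *data;
    std::size_t size;
};

// Returns 0 on success, or -1 with errno set to ENOMEM on failure.
int message_init (message_t *msg, std::size_t size)
{
    msg->data = test_malloc (size);
    if (!msg->data) {
        msg->size = 0;
        errno = ENOMEM;
        return -1;
    }
    msg->size = size;
    return 0;
}

void message_close (message_t *msg)
{
    std::free (msg->data);
    msg->data = nullptr;
    msg->size = 0;
}

// Creates messages in a tight loop until allocation fails and returns
// how many succeeded. The budget must be >= 0, otherwise the hook never
// fails and the loop does not terminate.
int fill_until_failure (std::size_t msg_size, long budget)
{
    allocs_left = budget;
    int created = 0;
    message_t msg;
    while (message_init (&msg, msg_size) == 0) {
        ++created;
        message_close (&msg);   // a real sender's queue would keep these
    }
    allocs_left = -1;           // restore normal behaviour
    return created;
}
```

Calling `fill_until_failure (1, 5)` creates five messages and then stops with `errno` equal to `ENOMEM`. This only covers the error-reporting path; it says nothing about how the rest of the library behaves once memory is really exhausted, which is what the test program with a non-receiving peer is meant to show.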
I've tried, but got errors that seem to be related to reconnection. Are there
any disconnects in OOM conditions? (will look myself, but maybe you
Anyway I've hit:
There were also errors in timers and a few other places I don't remember now.
You can look at my test at: