[zeromq-dev] HWM default
sustrik at 250bpm.com
Sun May 15 09:35:16 CEST 2011
On 05/14/2011 04:16 PM, Paul Colomiets wrote:
> What else would you do if you run out of memory? Any recovery
> mechanism in user space is likely to fail because there's no memory
> available :) Killing (and possibly restarting) the app seems like a
> reasonable option to me.
> On the application side you can flush buffers, flush cache, close
> connections. On zeromq side you can stop accepting connections, stop
> processing incoming data, drop incoming messages, drop messages already
> in queue (delivery is not reliable anyway), wait (for data to be
> processed), notify user about memory failure. Enough?
Have you ever tried to implement that? The problem is that in no-memory
situations even these emergency measures tend to fail. Flush the
buffers? Sorry, no memory. Send a message to the I/O thread to stop
accepting connections? Sorry, no memory. Print a dialog box to notify
user? Sorry, no memory. Etc.
What you can do is to switch off overcommit, allocate enough memory in
advance to be able to run the emergency measures, free that memory when
the no-memory error is hit and run the emergency measures afterwards. Even
then, the memory allocated can turn out to be insufficient or other
applications may steal it before you use it.
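
A minimal sketch of the reserve-then-release idea described above, assuming a C++ program that allocates through operator new. All names here (install_emergency_reserve, on_out_of_memory, in_emergency_mode) are hypothetical and are not part of ZeroMQ; std::set_new_handler is the standard hook that operator new calls when an allocation fails.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical names throughout; this is not ZeroMQ code.
static void *g_reserve = nullptr;   // memory set aside for the emergency path
static bool g_emergency = false;    // true once the reserve has been released

// Called by operator new each time an allocation fails.
void on_out_of_memory() {
    if (g_reserve != nullptr) {
        std::free(g_reserve);       // hand the reserve back to the allocator
        g_reserve = nullptr;
        g_emergency = true;         // the application should now shed load
        return;                     // operator new retries the allocation
    }
    // Reserve already spent: uninstall the handler so that the retried
    // allocation throws std::bad_alloc instead of looping forever.
    std::set_new_handler(nullptr);
}

// Allocate the reserve in advance and install the handler.
bool install_emergency_reserve(std::size_t bytes) {
    assert(bytes > 0);
    g_reserve = std::malloc(bytes);
    if (g_reserve == nullptr) return false;
    // Assumption: touching the pages makes it more likely that the memory
    // is actually backed, which matters when overcommit is enabled.
    std::memset(g_reserve, 0, bytes);
    std::set_new_handler(on_out_of_memory);
    return true;
}

bool in_emergency_mode() { return g_emergency; }
```

The application would poll in_emergency_mode() from its main loop and run its emergency measures there. As noted above, nothing guarantees that the released reserve is large enough or that another process does not take it first.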
All in all, it's so complex and unreliable that properly setting HWMs
and max message sizes suddenly sounds like a great idea.
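
To illustrate why those two limits bound memory use, here is a toy queue with a high-water mark and a per-message size cap. BoundedQueue is a hypothetical class written for this example, not ZeroMQ's pipe implementation; in libzmq the corresponding limits are socket options set through zmq_setsockopt, which this sketch does not use.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Toy queue with a high-water mark (HWM) and a maximum message size.
// Hypothetical class for illustration only.
class BoundedQueue {
public:
    BoundedQueue(std::size_t hwm, std::size_t max_msg_size)
        : hwm_(hwm), max_msg_size_(max_msg_size) {}

    // Refuses the message instead of letting the queue grow without bound.
    bool try_push(std::string msg) {
        if (msg.size() > max_msg_size_) return false;  // oversized message
        if (queue_.size() >= hwm_) return false;       // high-water mark hit
        queue_.push_back(std::move(msg));
        return true;
    }

    // Removes the oldest message, if any, and stores it in `out`.
    bool try_pop(std::string &out) {
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

    std::size_t size() const { return queue_.size(); }

private:
    std::size_t hwm_;
    std::size_t max_msg_size_;
    std::deque<std::string> queue_;
};
```

With both limits set, the payload held by one queue is at most roughly hwm * max_msg_size bytes, so the worst case can be sized up front instead of being discovered when allocation fails. What the caller does on a refused push (drop, block, or report an error) is a separate policy choice.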
> For some applications it's crucial, e.g. for databases which have a write
> cache, for persistent queues which postpone their writes, for game
> servers having some state in memory, whatever. You're right that it's
> reasonable for some *applications*. But it's quite mad for a networking
> library. It will be fixed in kernel implementation anyway (maybe using
> other methods). Still I'm pretty sure it should be fixed in a library.
If you are not discouraged by the above, feel free to try. Do it in
small incremental steps though as the whole patch is likely to be
extremely invasive and hard to merge.