[zeromq-dev] [PATCH] Fixed OOM handling while writing to a pipe
ph at imatix.com
Fri May 20 12:30:34 CEST 2011
On Fri, May 20, 2011 at 12:21 PM, Martin Sustrik <sustrik at 250bpm.com> wrote:
> There's one important point to be made: 0MQ currently behaves 100%
> predictably in OOM condition -- it terminates the process. User is then
> free to restart the process or take whatever emergency measures are
> Any patches to OOM handling should preserve this 100% predictability.
> zmq_send() can return ENOMEM instead of terminating the process,
> however, it must do so consistently. Introducing undefined behaviour
> under OOM conditions is not an option.
Sorry to say this rather late, but before we change the behavior of
0MQ under OOM conditions, I'd want the consensus of users here.
It is a radical change in semantics to go from asserting, to
continuing with an error response. We cannot make such changes without
being certain there is a consensus of approval for them.
My own experience goes strongly against handling OOM in any way except
assertion. We explored this quite exhaustively in OpenAMQ and found
that returning errors in case of OOM was very fragile. It is not even
clear that an application can deal with such errors sanely, since many
system calls will themselves fail if memory is exhausted. We tried
hard to make this work, and in the end had to settle on "assert" as
the only robust answer.
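To make "assert on OOM" concrete, here is a minimal sketch of an allocate-or-terminate wrapper. The names xmalloc_or_abort and XMALLOC are hypothetical and are not taken from 0MQ or OpenAMQ; the sketch only assumes standard C. It calls abort() explicitly rather than assert(), since assert() is compiled out when NDEBUG is defined.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of any 0MQ API: return a valid block
 * or terminate the process. The caller never sees NULL, so there is
 * no error path that could itself fail for lack of memory. */
static void *xmalloc_or_abort(size_t size, const char *file, int line)
{
    /* malloc(0) may legally return NULL, so request at least 1 byte. */
    void *p = malloc(size ? size : 1);
    if (p == NULL) {
        /* Report where the allocation failed, then die predictably. */
        fprintf(stderr, "FATAL ERROR: OUT OF MEMORY (%s:%d)\n", file, line);
        abort();
    }
    return p;
}

/* Convenience macro that records the call site. */
#define XMALLOC(size) xmalloc_or_abort((size), __FILE__, __LINE__)
```

A caller writes `char *buf = XMALLOC(len);` and uses buf without a NULL check. The process either has the memory or is gone, which is the predictability Martin describes above.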
It's particularly important for services because most of the time
there is a problem that must be raised and resolved, whether it's the
too-low default VM size, or the lack of HWMs on queues, or too-slow
The only exception to assertion, afaics, is for allocation requests
that are clearly unreasonable. And even then, assertion seems the
right response if these requests are internal. If they're driven by
user data (i.e. someone sending a 4GB message to a service), the
correct response is detecting over-sized messages and discarding them
(and we have this code in 2.2 and 3.0).
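As a rough illustration of that last point, the size check can happen before any allocation is attempted, so an over-sized message from a peer is dropped rather than driving the process into OOM. This is only a sketch: MAX_MSG_SIZE, msg_verdict and check_msg_size are hypothetical names, the 64MB limit is an arbitrary assumption, and none of this is the actual code in 2.2 or 3.0.

```c
#include <stdint.h>

/* Hypothetical limit chosen for illustration only. */
#define MAX_MSG_SIZE ((uint64_t)64 * 1024 * 1024)

enum msg_verdict { MSG_ACCEPT = 0, MSG_DISCARD = 1 };

/* Decide, from the size declared in the frame header alone, whether
 * the message body should be read at all. No memory is allocated
 * here, so user-driven sizes never reach the allocator unchecked. */
static enum msg_verdict check_msg_size(uint64_t declared_size,
                                       uint64_t max_size)
{
    return declared_size > max_size ? MSG_DISCARD : MSG_ACCEPT;
}
```

With this in place, a declared size of 4GB yields MSG_DISCARD and the service carries on, while internal allocation failures still go through the assert path.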
tl;dr - +1 for asserting on OOM, -1 for returning ENOMEM.