[zeromq-dev] [PATCH] Fixed OOM handling while writing to a pipe

Paul Colomiets paul at colomiets.name
Fri May 20 23:00:58 CEST 2011

> Hi Martin,

On Fri, May 20, 2011 at 12:57 PM, Martin Sustrik <sustrik at 250bpm.com> wrote:

> Hi Paul,
>  In case of OOM zmq_send will return EAGAIN, because there no way to
>> return other error from pipe (writer_t). Do you think it's worth fixing?
>> This
>> basically means writer_t::write should return 0 and set errno instead
>> of returning bool. (Or should it return ENOMEM directly?)
> Yes. Definitely. If error is to be returned it should be correct error.

> As for returning the error the most common POSIX practice is to return
> negative number and set errno. Return value of 0 is used for success.

Yup, but citing http://www.unix.org/whitepapers/reentrant.html:

  In addition, all POSIX.1c functions avoid using errno and, instead, return
the error number directly as the function return value, with a return value
of zero indicating that no error was detected. This strategy is, in fact,
being followed on a POSIX-wide basis for all new functions.

but, nevermind, it should be consistent with other functions

>  Sometimes write fail even if `more` is set on previous message, this
>> will trigger some assertions in zeromq code (they will be fixed in the
>> future patches),
> What assertions? If send fails we should revert the state to what it was
> before send() was called. That shouldn't trigger any assert.

e.g. lb.cpp:118
your code expects pipe->write never fail after first message. I would
improve that, you just said you want small incremental steps, and this patch
really fixed crash in one of my test scripts.

>  and probably can trigger assertions in user code.
> Same as above.

Well, I usually do something like:

rc = zmq_send(&msg, ZMQ_SNDMORE);
if(rc) return false;
rc = zmq_send(&msg2, 0);

Because previously sending all subsequent message parts could never fail.

> One additional issue: If you change function prototype to return error, you
> should check the error at each place the function is called from.

Sure. Am I missed something?

>  But as currently nobody expects zeromq to work under OOM,
>> it's probably fine to live with this for some time.
> There's one important point to be made: 0MQ currently behaves 100%
> predictably in OOM condition -- it terminates the process. User is then free
> to restart the process or take whatever emergency measures are necessary.
> Any patches to OOM handling should preseve this 100% predictability.
> zmq_send() can return ENOMEM instead of terminating the process, however, it
> must do so consistently. Introducing undefined behaviour under OOM
> conditions is not an option.

Sure. Termination in lb.cpp:118 will be quite unfortunate, but I've said
it's single step, and I think it's ok for some development version.

All in all, I'll fix the patch, if we decide it's needed at all (discussion
is in another thread).

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20110521/4102f427/attachment.htm>

More information about the zeromq-dev mailing list