[zeromq-dev] zeromq, abort(), and high reliability environments

Dylan Cali calid1984 at gmail.com
Sat Aug 9 05:44:08 CEST 2014


I just noticed the formatting with my previous post went all screwy.
I'm terribly sorry about that.  Re-posting with (hopefully) fixed
formatting.

Hey guys,

What is the right way to use zeromq in high reliability environments?
In certain insane/impossible situations (e.g. out of memory, out of
file descriptors, etc.) libzmq assertions will fail and the whole
process will abort.

I came across a thread by Martin where he addresses a similar
situation [1].  If I'm reading his argument correctly, the gist is:
if it's impossible to connect due to some error, then you're dead in
the water anyways.  Crash loudly and immediately with the error (the
Fail-Fast paradigm), fix the error, and then restart the process.

I actually agree with this philosophy, but a user would say "You
terminated my entire application stack and didn't give me a chance to
clean up!  I had very important data in memory and it's gone!"  This is
especially the case with Java programmers, who Always Expect an
Exception.

For example, in the case of being out of file descriptors, the jzmq
bindings will abort, but a Java programmer would expect to get an
Exception with the "Too Many Open Files" error.

I guess one possible retort is: if the data in memory was so
important, why didn't you have redundancy/failover/some kind of
playback log? Why did you put all your eggs in one basket assuming
your process would never crash?

Is that the right answer here (basically blame the user for not having
disaster recovery), or is there a different/better way to address the
high reliability scenario?
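
For concreteness, a playback log of the kind mentioned above can be
quite small.  The following is only a minimal sketch in C, assuming a
single writer and a local file; journal_append and journal_next are
hypothetical helper names, not part of libzmq or jzmq.  Each record is
length-prefixed (host byte order) and fsync'd before the caller treats
it as durable, so records that were already acknowledged should survive
a later abort().

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: appends one length-prefixed record and forces it
   to disk.  Returns 0 on success, -1 on error (short writes are treated
   as errors in this sketch). */
int journal_append(int fd, const void *data, uint32_t len)
{
    if (write(fd, &len, sizeof len) != (ssize_t)sizeof len)
        return -1;
    if (write(fd, data, len) != (ssize_t)len)
        return -1;
    return fsync(fd);   /* do not acknowledge the record before this returns */
}

/* Hypothetical helper: reads the next record into buf.  Returns 1 if a
   complete record was read, 0 at end of log or on a truncated or
   oversized record. */
int journal_next(int fd, void *buf, uint32_t cap, uint32_t *len)
{
    if (read(fd, len, sizeof *len) != (ssize_t)sizeof *len)
        return 0;
    if (*len > cap)
        return 0;
    if (read(fd, buf, *len) != (ssize_t)*len)
        return 0;
    return 1;
}
```

After a restart, the process would call journal_next in a loop to
rebuild its state before reconnecting any sockets.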

I came across another thread where Martin gets this very complaint
(zeromq aborted my application!) and basically says: well, if you
really, really want to, you can install a signal handler for SIGABRT,
but caveat emptor [2].

To me, this is playing with fire, dangerous, and just a Bad Idea. But
maybe it's worth the risk in high reliability environments?


Thanks in advance for any advice or thoughts.

[1] http://lists.zeromq.org/pipermail/zeromq-dev/2009-May/000784.html
[2] http://lists.zeromq.org/pipermail/zeromq-dev/2011-October/013608.html


