[zeromq-dev] zeromq, abort(), and high reliability environments

Dylan Cali calid1984 at gmail.com
Sat Aug 9 04:12:13 CEST 2014


Hey guys,

What is the right way to use zeromq in high reliability environments?  In
certain insane/impossible situations (e.g. out of memory or out of file
descriptors) libzmq assertions will fail and the library will call abort(),
killing the process.

I came across a thread by Martin where he addresses a similar situation [1].
If I'm reading his argument correctly, the gist is: if it's impossible to
connect due to some error, then you're dead in the water anyway.  Crash
loudly and immediately with the error (the Fail-Fast paradigm), fix the
error, and then restart the process.
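
A minimal sketch of the "restart the process" half of that approach, assuming
a POSIX system and a supervising parent process.  The worker command, the
function names, and the restart limit are all hypothetical; the worker here
just calls os.abort() to stand in for a failed libzmq assertion.

```python
import signal
import subprocess
import sys
import time

# Hypothetical worker: stands in for a process in which libzmq hits a failed
# assertion and calls abort(). os.abort() raises SIGABRT in the same way.
WORKER_CMD = [sys.executable, "-c", "import os; os.abort()"]


def run_once(cmd):
    """Run the worker to completion and return its exit status."""
    return subprocess.run(cmd).returncode


def died_from_abort(returncode):
    # On POSIX, subprocess reports death by signal N as returncode -N.
    return returncode == -signal.SIGABRT


def supervise(cmd, max_restarts=3, backoff_seconds=0.1):
    """Rerun the worker after each abort, giving up after max_restarts aborts.

    Returns the number of aborts observed.
    """
    aborts = 0
    while aborts < max_restarts:
        rc = run_once(cmd)
        if not died_from_abort(rc):
            break  # clean exit or a different failure: stop supervising
        aborts += 1
        print(f"worker aborted (SIGABRT), abort {aborts}/{max_restarts}",
              file=sys.stderr)
        time.sleep(backoff_seconds)
    return aborts


if __name__ == "__main__":
    supervise(WORKER_CMD)
```

The supervisor never tries to recover inside the crashed process; it only
observes the exit status from outside and starts a fresh one.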

I actually agree with this philosophy, but a user would say "You terminated
my entire application stack and didn't give me a chance to clean up!  I had
very important data in memory and it's gone!"  This is especially the case
with Java programmers who Always Expect an Exception.

For example, in the case of being out of file descriptors, the jzmq bindings
will abort, but a Java programmer would expect to get an Exception with the
"Too Many Open Files" error.

I guess one possible retort is: if the data in memory was so important, why
didn't you have redundancy/failover/some kind of playback log? Why did you
put all your eggs in one basket assuming your process would never crash?
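
As a rough illustration of what "some kind of playback log" could mean, here
is a minimal sketch of an append-only journal that is synced to disk on every
write and replayed after a restart.  The Journal class, the replay function,
and the JSON-lines file format are assumptions made for this example, not
anything zeromq or jzmq provides.

```python
import json
import os


class Journal:
    """Append-only log: each record is one JSON line, synced before returning."""

    def __init__(self, path):
        self.path = path
        self._fh = open(path, "a", encoding="utf-8")

    def append(self, record):
        self._fh.write(json.dumps(record) + "\n")
        self._fh.flush()
        # Ask the OS to push the data to disk so it survives an abort().
        os.fsync(self._fh.fileno())

    def close(self):
        self._fh.close()


def replay(path):
    """Return every complete record written before the crash, in order."""
    if not os.path.exists(path):
        return []
    records = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                break  # torn final write from the crash: stop here
    return records
```

With something like this in place, data that was journaled before the abort
can be rebuilt by the restarted process rather than living only in memory.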

Is that the right answer here (basically blame the user for not having
disaster recovery), or is there a different/better way to address the high
reliability scenario?

I came across another thread where Martin gets this very complaint (zeromq
aborted my application!), and basically says: well, if you really, really
want to, you can install a signal handler for SIGABRT, but caveat emptor [2].

To me, this is playing with fire, dangerous, and just a Bad Idea. But maybe
it's worth the risk in high reliability environments?


Thanks in advance for any advice or thoughts.

[1] http://lists.zeromq.org/pipermail/zeromq-dev/2009-May/000784.html
[2] http://lists.zeromq.org/pipermail/zeromq-dev/2011-October/013608.html