[zeromq-dev] [BUG] zmq_assert causes BOOM if you breath on OSX Lion kqueue wrong.

Martin Sustrik sustrik at 250bpm.com
Sat Oct 1 10:58:54 CEST 2011


Hi Zed,

> I was asked to report all asserts I encounter.  I went to the JIRA to
> submit this as a bug, but it looks like I have to create an account,
> or something, can't really figure it out even though I've used JIRA
> before.  I'm guessing this is the next place to report a bug, so here
> you go.

As Mikko says, the bug has already been reported.

> On another note:  Causing a full assert abort in *my* program from
> *your* library because of a little hicup in an external resource is
> stupid.

This is not a hiccup in an external resource. It looks more like a 
synchronisation issue.

>  I've been saying for close to a year now that *all* of the
> zmq_asserts need to go away.

Asserts check for bugs. To get rid of them we have to fix the bugs. The 
other option is to ignore the bugs and allow 0MQ to continue operating 
in a broken state. That's OK as long as you are happy with undefined 
behaviour.

If you really want that, I can add a compile-time option to ignore all 
the asserts. It's just a few lines of code, so let me know.
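
A minimal sketch of what such a compile-time switch could look like. The
names MY_IGNORE_ASSERTS, my_assert and checked_div are hypothetical and
chosen for illustration only; they are not libzmq's actual macro or
build flag.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names: MY_IGNORE_ASSERTS and my_assert are illustrative,
   not libzmq's real build flag or macro. */
#ifdef MY_IGNORE_ASSERTS
/* Checks compiled out: the condition is not even evaluated. */
#define my_assert(cond) ((void) 0)
#else
/* Default: report the failed condition and abort the process. */
#define my_assert(cond) \
    do { \
        if (!(cond)) { \
            fprintf (stderr, "Assertion failed: %s (%s:%d)\n", \
                     #cond, __FILE__, __LINE__); \
            abort (); \
        } \
    } while (0)
#endif

/* Example caller. With MY_IGNORE_ASSERTS defined, b == 0 is no longer
   caught and the division is undefined behaviour. */
static int checked_div (int a, int b)
{
    my_assert (b != 0);
    return a / b;
}
```

Building with -DMY_IGNORE_ASSERTS removes every check, so a violated
invariant is no longer detected and the program simply carries on in
whatever state it is in.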

> libzmq needs to return valid error codes
> and stop aborting *my* servers.  Until they're gone completely I can't
> trust that some random socket error I have no control over won't abort
> my whole world. And, having to troll through C++ code to debug why I'm
> getting the error is annoying.

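The errors can happen asynchronously, in 0MQ's background I/O threads, 
so there may be no API call in progress to return an error code from.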

What can be done is to set a global handler function that will be 
called when a bug is hit. It's not clear what the application should do 
then, though. Maybe it can save its state and restart itself?
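
One possible shape for such a handler, as a rough sketch. Everything
here (bug_handler_fn, set_bug_handler, bug_check, counting_handler) is
a hypothetical name invented for illustration; libzmq does not provide
this API.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical API, not part of libzmq. */
typedef void (*bug_handler_fn) (const char *expr, const char *file,
                                int line);

/* Default behaviour: print the failed check and abort. */
static void default_bug_handler (const char *expr, const char *file,
                                 int line)
{
    fprintf (stderr, "Assertion failed: %s (%s:%d)\n", expr, file, line);
    abort ();
}

static bug_handler_fn bug_handler = default_bug_handler;

/* Install a new handler and return the previous one.
   Passing NULL restores the aborting default. */
static bug_handler_fn set_bug_handler (bug_handler_fn fn)
{
    bug_handler_fn old = bug_handler;
    bug_handler = fn ? fn : default_bug_handler;
    return old;
}

/* Call the current handler instead of aborting directly. */
#define bug_check(cond) \
    do { \
        if (!(cond)) \
            bug_handler (#cond, __FILE__, __LINE__); \
    } while (0)

/* Example application handler: only records that a bug was hit, so
   the main loop can decide to save state and restart. */
static int bugs_seen = 0;

static void counting_handler (const char *expr, const char *file,
                              int line)
{
    (void) expr;
    (void) file;
    (void) line;
    ++bugs_seen;
}
```

This sketch is not thread-safe. Since the failing check may run on a
background thread, a real version would need to synchronise access to
the handler pointer and to whatever state the handler touches. If the
handler returns, execution continues past the failed check.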

> At a minimum, add a 3rd parameter that
> gives an error message that's other than something like "No such file
> or directory".

The asserts can be enhanced with longer messages such as, in this case, 
"kqueue has returned an unexpected error: no such file or directory". I 
am not sure how helpful that would be, though.
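
A sketch of an assert that carries such a longer message. The names
format_errno_msg and errno_assert_msg are hypothetical, and the message
wording is only an example of the idea above.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: writes
   "<what> returned an unexpected error: <strerror text>" into buf. */
static int format_errno_msg (char *buf, size_t size, const char *what,
                             int err)
{
    return snprintf (buf, size, "%s returned an unexpected error: %s",
                     what, strerror (err));
}

/* Hypothetical macro: like a plain errno assert, but names the
   subsystem whose call failed before aborting. */
#define errno_assert_msg(cond, what) \
    do { \
        if (!(cond)) { \
            char msg_ [256]; \
            format_errno_msg (msg_, sizeof msg_, (what), errno); \
            fprintf (stderr, "%s (%s:%d)\n", msg_, __FILE__, __LINE__); \
            abort (); \
        } \
    } while (0)
```

A call site would then read something like
errno_assert_msg (rc != -1, "kqueue"), which tells the user which
subsystem failed but still not why.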

> WTF does that even mean for kqueue?  I sure as hell
> didn't do anything to cause that. How could I possibly fix that?

[ENOENT] The event could not be found to be modified or deleted.

What's happening, I guess, is that an event is referenced that was 
already removed from the kqueue.
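
The following sketch shows one way to provoke that ENOENT on a system
with kqueue: register a read filter on a pipe, delete it, then delete
it again. It illustrates the documented kevent behaviour only and is
not a reproduction of what happens inside 0MQ. The function name
double_delete_errno and the HAVE_KQUEUE_DEMO guard are made up for this
example; on systems without kqueue (Linux, for instance) the function
just returns -1.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

#if defined(__APPLE__) || defined(__FreeBSD__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#define HAVE_KQUEUE_DEMO 1
#endif

/* Returns the errno of the second EV_DELETE (expected: ENOENT),
   0 if that delete unexpectedly succeeded, or -1 if kqueue is not
   available or the setup failed. */
int double_delete_errno (void)
{
#ifdef HAVE_KQUEUE_DEMO
    int fds [2];
    struct kevent ev;
    int kq;
    int err = -1;

    if (pipe (fds) != 0)
        return -1;
    kq = kqueue ();
    if (kq == -1) {
        close (fds [0]);
        close (fds [1]);
        return -1;
    }

    /* Register a read filter on the pipe's read end. */
    EV_SET (&ev, fds [0], EVFILT_READ, EV_ADD, 0, 0, NULL);
    if (kevent (kq, &ev, 1, NULL, 0, NULL) == 0) {
        /* First delete removes the event. */
        EV_SET (&ev, fds [0], EVFILT_READ, EV_DELETE, 0, 0, NULL);
        if (kevent (kq, &ev, 1, NULL, 0, NULL) == 0) {
            /* Second delete refers to an event that is already gone. */
            int rc = kevent (kq, &ev, 1, NULL, 0, NULL);
            err = (rc == -1) ? errno : 0;
        }
    }

    close (kq);
    close (fds [0]);
    close (fds [1]);
    return err;
#else
    return -1; /* no kqueue on this platform */
#endif
}
```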

In any case, I have no OSX system on which to reproduce the problem. If 
anyone cares to give me remote access, I can try to fix it.

Martin


