[zeromq-dev] Bad File Descriptor / nbytes != -1 error (mailbox.cpp:241)

Andrew Cholakian andrew at andrewvc.com
Thu Feb 10 20:43:41 CET 2011


A bit more data. First, thanks to Chuck Remes for helping me out quite a bit
in IRC today. Here's the relevant part of a GDB backtrace:

(gdb) backtrace
#0  0x00007ffff6b8eba5 in raise () from /lib/libc.so.6
#1  0x00007ffff6b926b0 in abort () from /lib/libc.so.6
#2  0x00007ffff4d2984f in zmq::mailbox_t::recv (this=<value optimized out>, cmd_=<value optimized out>, block_=<value optimized out>) at mailbox.cpp:244
#3  0x00007ffff4d33a2b in zmq::socket_base_t::process_commands (this=0x16b47e0, block_=<value optimized out>, throttle_=false) at socket_base.cpp:677
#4  0x00007ffff4d3405e in zmq::socket_base_t::getsockopt (this=0x3dd5, option_=<value optimized out>, optval_=0x7fffec000a30, optvallen_=<value optimized out>) at socket_base.cpp:270
#5  0x00007ffff4f66114 in ffi_call_unix64 () from /home/andrew/.rvm/gems/ruby-1.9.2-p136/gems/ffi-1.0.5/lib/ffi_c.so

So, it looks like repeatedly calling getsockopt with ZMQ_EVENTS on a writable
socket (a PUB socket in this case) is the cause. I instrumented the calls to
getsockopt and saw that the crash was always in a ZMQ_EVENTS getsockopt on a
writable socket. It doesn't seem to matter whether the socket is PUB, SUB, or
XREQ.

I disabled the ZMQ_EVENTS check and now just write blindly, and it works.
Thoughts?
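
For reference, the "Bad file descriptor" pattern described in the earlier
message (quoted below), a recv on one end of a socketpair after that end has
been closed, can be reproduced outside ZMQ. The following is only a minimal
Python sketch of that OS-level behaviour, assuming a Unix-like system; it is
not ZMQ's mailbox code, and the function name recv_errno_after_close is made
up for illustration.

```python
import errno
import socket


def recv_errno_after_close():
    """Return the errno raised by recv() on an already-closed socketpair end.

    Hypothetical helper for illustration only; returns None if recv()
    unexpectedly does not raise.
    """
    # Same kind of pair as in the strace: AF_UNIX (PF_FILE), SOCK_STREAM.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        b.close()  # close the end we are about to read from
        try:
            b.recv(1)  # recv on the closed descriptor
        except OSError as exc:
            return exc.errno
        return None
    finally:
        a.close()


if __name__ == "__main__":
    err = recv_errno_after_close()
    # Print the symbolic errno name if there is one.
    print(errno.errorcode.get(err, err))
```

If the mailbox's recv hits the same condition, that would be consistent with
the nbytes != -1 assertion and the abort in the backtrace, but the sketch
only shows the OS-level error, not why the fd was closed in the first place.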

On Thu, Feb 10, 2011 at 9:08 AM, Andrew Cholakian <andrew at andrewvc.com> wrote:

> I've got an (unfortunately) reproducible error that's been triggering on an
> app of mine. I briefly spoke about it in IRC yesterday, but I think I may
> have an strace log that indicates the error. I just tried it again this
> morning with ZMQ HEAD, no luck.
>
> The most interesting part is excerpted here:
> https://gist.github.com/820779
>
> Basically, it looks like recv is being called on fd 22 after it has already
> been closed. That fd was opened earlier with:
>
> socketpair(PF_FILE, SOCK_STREAM, 0, [21, 22]) = 0
>
> My app only uses ZMQ TCP sockets, so the socketpair is probably internal to
> ZMQ. Switching transports makes no difference.
>
> The full strace is here:
>
> http://dl.dropbox.com/u/7376989/weirdzmqstrace.txt
>
> I wish I had a minimal test case, but it's part of a complex Ruby app, and
> extracting one might not even be feasible. However, it's 100% reproducible
> given a certain sequence of events in the app: it always occurs after the
> same number of messages.
>
> If anyone DOES want to set up the app, it's OSS, and I'd be glad to help in
> IRC; I'm andrewvc in the chat room.
>
> --
> Andrew Cholakian
> http://www.andrewvc.com
>



-- 
Andrew Cholakian
http://www.andrewvc.com