[zeromq-dev] Bad file descriptor in rm_fd()

Martin Hurton hurtonm at gmail.com
Wed Nov 6 00:23:54 CET 2013


I vaguely remember this being reported a long time ago. I spent some
time on it but never found anything suspicious. Will look at it. - Martin

On Tue, Nov 5, 2013 at 4:10 PM, Pieter Hintjens <ph at imatix.com> wrote:
> Hi MinRK,
>
> I'll try to reproduce it tomorrow. Any suggestions for the kind of test
> case I could make?
>
> -Pieter
>
> On Tue, Nov 5, 2013 at 11:44 PM, MinRK <benjaminrk at gmail.com> wrote:
>> Once in a while, when running either the IPython or PyZMQ test suite, I
>> still get this error:
>>
>>     Bad file descriptor (kqueue.cpp:77)
>>
>> or
>>
>>     Bad file descriptor (epoll.cpp:81)
>>
>> Stack trace suggests that this happens when destroying a context:
>>
>> Thread 0:
>> 1   libzmq.3.dylib                 0x000000010f26b170 zmq::signaler_t::send() + 52
>> 2   libzmq.3.dylib                 0x000000010f261b2f zmq::object_t::send_stop() + 35
>> 3   libzmq.3.dylib                 0x000000010f2534a7 zmq::ctx_t::~ctx_t() + 59
>> 4   libzmq.3.dylib                 0x000000010f253a29 zmq::ctx_t::terminate() + 439
>> 5   libzmq.3.dylib                 0x000000010f27c071 zmq_ctx_term + 35
>>
>>
>> Thread 6 Crashed:
>> 0   libsystem_kernel.dylib         0x00007fff94a4d866 __pthread_kill + 10
>> 1   libsystem_pthread.dylib        0x00007fff8cac835c pthread_kill + 92
>> 2   libsystem_c.dylib              0x00007fff97570bba abort + 125
>> 3   libzmq.3.dylib                 0x000000010f25a9e1 zmq::zmq_abort(char const*) + 9
>> 4   libzmq.3.dylib                 0x000000010f25d0fe zmq::kqueue_t::kevent_delete(int, short) + 142
>> 5   libzmq.3.dylib                 0x000000010f25d1b0 zmq::kqueue_t::rm_fd(void*) + 42
>> 6   libzmq.3.dylib                 0x000000010f2687a3 zmq::reaper_t::process_stop() + 59
>> 7   libzmq.3.dylib                 0x000000010f26862b zmq::reaper_t::in_event() + 161
>> 8   libzmq.3.dylib                 0x000000010f25d40c zmq::kqueue_t::loop() + 362
>>
>>
>> I am still seeing this error once in a while with libzmq-master as of today.
>> I don't think it's a recent regression.  A minimal test case is difficult,
>> since the error only seems to occur after at least a hundred tests, and only
>> a small fraction of the time even then.  Because the assert is always hit
>> late in the process, I have always assumed that FD exhaustion is causing the
>> problem, but I am not actually sure, and I am fairly careful about cleaning
>> up sockets.
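
One way to check the FD-exhaustion guess directly would be to record the
process's open descriptors before and after each test and report any that a
test leaves behind. The sketch below is not part of PyZMQ or libzmq:
`open_fds` and `leaked_fds` are hypothetical helper names, and it assumes a
Unix system where /dev/fd lists the calling process's own descriptors (as on
Linux and OS X).

```python
import os


def open_fds():
    """Return the set of descriptor numbers currently open in this process.

    Assumes /dev/fd lists the process's own descriptors; that is an
    assumption about the platform, not something every Unix guarantees.
    """
    fds = set()
    for name in os.listdir("/dev/fd"):
        if not name.isdigit():
            continue
        fd = int(name)
        try:
            os.fstat(fd)
        except OSError:
            # Typically the descriptor listdir() itself used, already closed.
            continue
        fds.add(fd)
    return fds


def leaked_fds(before):
    """Return descriptors that are open now but were not in `before`."""
    return sorted(open_fds() - before)
```

Calling `before = open_fds()` in a test's setup and `leaked_fds(before)` in
its teardown, after the context has been terminated, would show whether the
number of open descriptors really climbs toward the limit reported by
`resource.getrlimit(resource.RLIMIT_NOFILE)`.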
>>
>> Properties of the test suite that sees the issue:
>>
>> - create and destroy many contexts and sockets
>> - the previous test's context should always be destroyed before the next
>> test starts
>> - it is not reliably the same test where the assert is hit
>>
>> I'm afraid I don't know enough about the internals to really tell what's
>> going on here, or figure out why the deleted FD is invalid (maybe it was
>> already closed, and the error should be ignored?).
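
For reference, "Bad file descriptor" is the message for errno EBADF, which
the kernel returns when a call is handed a number that does not name an open
descriptor. The sketch below only reproduces that errno in isolation using
Python's `select.select`. It illustrates the "already closed" guess and says
nothing about what libzmq's reaper or `kevent_delete` actually do;
`poll_errno` is a hypothetical helper name.

```python
import errno
import os
import select


def poll_errno(fd):
    """Poll `fd` for readability once; return the errno raised, or 0."""
    try:
        select.select([fd], [], [], 0)
    except OSError as exc:
        return exc.errno
    return 0


r, w = os.pipe()
still_open = poll_errno(r)      # descriptor is valid, so no error
os.close(r)
after_close = poll_errno(r)     # same number, but it is no longer open
os.close(w)
print(still_open, errno.errorcode.get(after_close))
```

An EBADF from removing a descriptor from the poller would be consistent with
that descriptor having been closed somewhere else first, but whether that is
what happens in the crash above remains an open question.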
>>
>> Does anyone have insight into what might be causing the problem, or how I
>> might dig deeper to get more useful information?
>>
>> -MinRK
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>
>
>
>
> --
> -
> Pieter Hintjens
> CEO of iMatix.com
> Founder of ZeroMQ community
> blog: http://hintjens.com