[zeromq-dev] Bad file descriptor in rm_fd()

Pieter Hintjens ph at imatix.com
Wed Nov 6 13:46:12 CET 2013


OK. When you have a test case, you can add it to the issues repository at
https://github.com/zeromq/issues <https://github.com/zeromq/issues/pull/6>.

Speaking of sleeps in the test cases, I've cut most of them down to a few
msec, since the tests were needlessly slow. I'm sending a pull request now.

-Pieter


On Wed, Nov 6, 2013 at 1:26 PM, <Richard_Newton at waters.com> wrote:

> Thanks.
>
> It's the test_multiple_threads test that fails, which I guess makes sense.
> I'll test to see if any of the other ones (that do everything on a single
> thread) can reproduce it, just in case.
>
> Ric.
>
>
> From: "Pieter Hintjens" <ph at imatix.com>
> To: "ZeroMQ development list" <zeromq-dev at lists.zeromq.org>
> Date: 06/11/2013 12:14 PM
> Subject: Re: [zeromq-dev] Bad file descriptor in rm_fd()
> Sent by: zeromq-dev-bounces at lists.zeromq.org
> ------------------------------
>
>
>
> Great, nice to have a test case.
>
> I think this is a very old issue:
> https://zeromq.jira.com/browse/LIBZMQ-76
>
> I'll put Richard's test case into the issues repository, and update issue
> 76.
>
>
> On Wed, Nov 6, 2013 at 12:47 PM, <Richard_Newton at waters.com> wrote:
>
>    Hi,
>
>    I managed to reproduce something that looks like this; I've seen it on
>    both Linux and Windows.
>
>    I modified test_inproc_connect to run the tests in a tight loop
>    (except for test_connect_before_bind_pub_sub as that has a sleep in it), so
>    the main looks like:
>
>    while (true)
>    {
>        test_bind_before_connect();
>        test_connect_before_bind();
>        //test_connect_before_bind_pub_sub();
>        test_multiple_connects();
>        test_multiple_threads();
>        test_identity();
>    }
>
>    This gave me the output:
>
>    Bad file descriptor (/home/richard/code/libzmq/src/epoll.cpp:79)
>    Aborted
>
>    On today's master.
>
>    It does take a few hours to occur on my machine.
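Since the failure reportedly takes hours to show up, it may help to bound the loop and print progress, so that an abort can be tied to an approximate pass count. The following is only a sketch under stated assumptions: run_in_loop is a hypothetical helper, not part of the libzmq test suite, and the progress interval of 1000 passes is arbitrary.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <vector>

// Hypothetical helper (not part of libzmq): run every test in `tests`
// repeatedly until `max_iterations` passes have completed or `budget` has
// elapsed, whichever comes first. Returns the number of completed passes.
inline std::uint64_t
run_in_loop (const std::vector<std::function<void ()>> &tests,
             std::uint64_t max_iterations,
             std::chrono::seconds budget)
{
    assert (!tests.empty ());
    const auto deadline = std::chrono::steady_clock::now () + budget;
    std::uint64_t passes = 0;
    while (passes < max_iterations
           && std::chrono::steady_clock::now () < deadline) {
        for (const auto &test : tests)
            test ();
        ++passes;
        // Report progress on stderr every 1000 passes, so that an abort late
        // in the run still shows roughly which pass it happened on.
        if (passes % 1000 == 0)
            std::fprintf (stderr, "completed %llu passes\n",
                          static_cast<unsigned long long> (passes));
    }
    return passes;
}
```

In the modified test_inproc_connect main, the vector would hold test_bind_before_connect, test_connect_before_bind, test_multiple_connects, test_multiple_threads and test_identity, with a budget of several hours.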
>
>    Ric.
>
>
>    From: MinRK <benjaminrk at gmail.com>
>    To: "0MQ development list" <zeromq-dev at lists.zeromq.org>
>    Date: 05/11/2013 10:44 PM
>    Subject: [zeromq-dev] Bad file descriptor in rm_fd()
>    Sent by: zeromq-dev-bounces at lists.zeromq.org
>    ------------------------------
>
>    Once in a while, when running either the IPython or PyZMQ test suite,
>    I still get this error:
>
>        Bad file descriptor (kqueue.cpp:77)
>
>    or
>
>        Bad file descriptor (epoll.cpp:81)
>
>    Stack trace suggests that this happens when destroying a context:
>
>    Thread 0:
>    1   libzmq.3.dylib                 0x000000010f26b170 zmq::signaler_t::send() + 52
>    2   libzmq.3.dylib                 0x000000010f261b2f zmq::object_t::send_stop() + 35
>    3   libzmq.3.dylib                 0x000000010f2534a7 zmq::ctx_t::~ctx_t() + 59
>    4   libzmq.3.dylib                 0x000000010f253a29 zmq::ctx_t::terminate() + 439
>    5   libzmq.3.dylib                 0x000000010f27c071 zmq_ctx_term + 35
>
>
>    Thread 6 Crashed:
>    0   libsystem_kernel.dylib         0x00007fff94a4d866 __pthread_kill + 10
>    1   libsystem_pthread.dylib        0x00007fff8cac835c pthread_kill + 92
>    2   libsystem_c.dylib              0x00007fff97570bba abort + 125
>    3   libzmq.3.dylib                 0x000000010f25a9e1 zmq::zmq_abort(char const*) + 9
>    4   libzmq.3.dylib                 0x000000010f25d0fe zmq::kqueue_t::kevent_delete(int, short) + 142
>    5   libzmq.3.dylib                 0x000000010f25d1b0 zmq::kqueue_t::rm_fd(void*) + 42
>    6   libzmq.3.dylib                 0x000000010f2687a3 zmq::reaper_t::process_stop() + 59
>    7   libzmq.3.dylib                 0x000000010f26862b zmq::reaper_t::in_event() + 161
>    8   libzmq.3.dylib                 0x000000010f25d40c zmq::kqueue_t::loop() + 362
>
>
>    I am still seeing this error once in a while with libzmq-master as of
>    today. I don't think it's a recent regression. A minimal test case is
>    difficult, since the error only seems to arise after at least a hundred
>    tests, and only a small fraction of the time even then. Given that the
>    assert is always hit late in the process, I have always assumed that FD
>    exhaustion is causing the problem, but I am not actually sure, and I am
>    fairly careful about cleaning up sockets.
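One way to check the FD-exhaustion hypothesis is to log how many descriptors the process holds between tests. What follows is a minimal sketch for POSIX systems, not anything libzmq provides: count_open_fds is a hypothetical helper, and the cap of 65536 on the scan is an arbitrary assumption to keep it cheap when the limit is very large.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

// Hypothetical diagnostic helper: count the descriptors currently open in
// this process by probing each candidate number with fcntl(F_GETFD), which
// fails with EBADF for numbers that are not open. Returns -1 if the
// descriptor limit cannot be read.
inline int count_open_fds ()
{
    struct rlimit lim;
    if (getrlimit (RLIMIT_NOFILE, &lim) != 0)
        return -1;

    // The soft limit can be unlimited or very large; cap the scan
    // (assumption: descriptors above 65535 are not of interest here).
    rlim_t max_fd = lim.rlim_cur;
    if (max_fd == RLIM_INFINITY || max_fd > 65536)
        max_fd = 65536;

    int open_count = 0;
    for (int fd = 0; fd < static_cast<int> (max_fd); ++fd) {
        if (fcntl (fd, F_GETFD) != -1)
            ++open_count;
    }
    return open_count;
}
```

If the count printed after each context is destroyed stays flat over hundreds of tests, exhaustion becomes a less likely explanation; a steady climb would point to a leak.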
>
>    Properties of the test suite that sees the issue:
>
>    - create and destroy many contexts and sockets
>    - the previous test's context should always be destroyed before the
>    next test starts
>    - it is not reliably the same test where the assert is hit
>
>    I'm afraid I don't know enough about the internals to really tell
>    what's going on here, or figure out why the deleted FD is invalid (maybe it
>    was already closed, and the error should be ignored?).
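The "already closed" guess can at least be checked for plausibility outside libzmq. The sketch below is Linux-only and uses plain epoll calls: it registers a descriptor, closes it behind the poller's back, then tries to remove it. It does not show that libzmq follows this sequence, only that such a sequence produces the same errno as the abort message. The function name is hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/epoll.h>
#include <unistd.h>

// Hypothetical demonstration: returns the errno produced by EPOLL_CTL_DEL on
// a descriptor that was closed while still registered, 0 if the call
// unexpectedly succeeded, or -1 if the setup itself failed.
inline int errno_after_deleting_closed_fd ()
{
    int ep = epoll_create1 (0);
    if (ep == -1)
        return -1;

    int fds[2];
    if (pipe (fds) != 0) {
        close (ep);
        return -1;
    }

    epoll_event ev {};
    ev.events = EPOLLIN;
    ev.data.fd = fds[0];
    if (epoll_ctl (ep, EPOLL_CTL_ADD, fds[0], &ev) != 0) {
        close (fds[0]);
        close (fds[1]);
        close (ep);
        return -1;
    }

    // Close the read end while it is still registered with the epoll set.
    close (fds[0]);

    // Removing the now-invalid descriptor is expected to fail.
    int rc = epoll_ctl (ep, EPOLL_CTL_DEL, fds[0], &ev);
    int err = (rc == -1) ? errno : 0;

    close (fds[1]);
    close (ep);
    return err;
}
```

On Linux this should return EBADF, which strerror renders as "Bad file descriptor". Whether the error could safely be ignored in rm_fd() is a separate question that this sketch does not answer.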
>
>    Anyone have insight on what might be causing the problem, or how I
>    might dig deeper into more useful information?
>
>    -MinRK
>
>    _______________________________________________
>    zeromq-dev mailing list
>    zeromq-dev at lists.zeromq.org
>    http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>
>    ===========================================================
>    The information in this email is confidential, and is intended solely
>    for the addressee(s).
>    Access to this email by anyone else is unauthorized and therefore
>    prohibited.  If you are
>    not the intended recipient you are notified that disclosing, copying,
>    distributing or taking
>    any action in reliance on the contents of this information is strictly
>    prohibited and may be unlawful.
>    ===========================================================
>
> --
> -
> Pieter Hintjens
> CEO of iMatix.com
> Founder of ZeroMQ community
> blog: http://hintjens.com


-- 
-
Pieter Hintjens
CEO of iMatix.com
Founder of ZeroMQ community
blog: http://hintjens.com