[zeromq-dev] Bad file descriptor in rm_fd()

Pieter Hintjens ph at imatix.com
Wed Nov 6 13:13:20 CET 2013


Great, nice to have a test case.

I think this is a very old issue: https://zeromq.jira.com/browse/LIBZMQ-76

I'll put Richard's test case into the issues repository, and update issue
76.


On Wed, Nov 6, 2013 at 12:47 PM, <Richard_Newton at waters.com> wrote:

> Hi,
>
> I managed to reproduce something that looks like this, and I have seen it
> on both Linux and Windows.
>
> I modified test_inproc_connect to run the tests in a tight loop (except
> for test_connect_before_bind_pub_sub as that has a sleep in it), so the
> main looks like:
>
>     while (true)
>     {
>         test_bind_before_connect();
>         test_connect_before_bind();
>         //test_connect_before_bind_pub_sub();
>         test_multiple_connects();
>         test_multiple_threads();
>         test_identity();
>     }
>
> This gave me the output:
>
> Bad file descriptor (/home/richard/code/libzmq/src/epoll.cpp:79)
> Aborted
>
> This is on today's master.
>
> It does take a few hours to occur on my machine.
>
> Ric.
>
>
> From: MinRK <benjaminrk at gmail.com>
> To: "0MQ development list" <zeromq-dev at lists.zeromq.org>
> Date: 05/11/2013 10:44 PM
>
> Subject: [zeromq-dev] Bad file descriptor in rm_fd()
> Sent by: zeromq-dev-bounces at lists.zeromq.org
> ------------------------------
>
>
>
> Once in a while, when running either the IPython or PyZMQ test suite, I
> still get this error:
>
>     Bad file descriptor (kqueue.cpp:77)
>
> or
>
>     Bad file descriptor (epoll.cpp:81)
>
> Stack trace suggests that this happens when destroying a context:
>
> Thread 0:
> 1   libzmq.3.dylib                 0x000000010f26b170 zmq::signaler_t::send() + 52
> 2   libzmq.3.dylib                 0x000000010f261b2f zmq::object_t::send_stop() + 35
> 3   libzmq.3.dylib                 0x000000010f2534a7 zmq::ctx_t::~ctx_t() + 59
> 4   libzmq.3.dylib                 0x000000010f253a29 zmq::ctx_t::terminate() + 439
> 5   libzmq.3.dylib                 0x000000010f27c071 zmq_ctx_term + 35
>
>
> Thread 6 Crashed:
> 0   libsystem_kernel.dylib         0x00007fff94a4d866 __pthread_kill + 10
> 1   libsystem_pthread.dylib        0x00007fff8cac835c pthread_kill + 92
> 2   libsystem_c.dylib              0x00007fff97570bba abort + 125
> 3   libzmq.3.dylib                 0x000000010f25a9e1 zmq::zmq_abort(char const*) + 9
> 4   libzmq.3.dylib                 0x000000010f25d0fe zmq::kqueue_t::kevent_delete(int, short) + 142
> 5   libzmq.3.dylib                 0x000000010f25d1b0 zmq::kqueue_t::rm_fd(void*) + 42
> 6   libzmq.3.dylib                 0x000000010f2687a3 zmq::reaper_t::process_stop() + 59
> 7   libzmq.3.dylib                 0x000000010f26862b zmq::reaper_t::in_event() + 161
> 8   libzmq.3.dylib                 0x000000010f25d40c zmq::kqueue_t::loop() + 362
>
>
> I am still seeing this error once in a while with libzmq-master as of
> today. I don't think it's a recent regression.  A minimal test case is
> difficult, since it only seems to arise after at least a hundred tests, and
> only a small fraction of the time even then.  Given that it is always late
> in the process that the assert is hit, I have always assumed that it is FD
> exhaustion that is causing the problem, but I am not actually sure, and I
> am fairly careful about cleaning up sockets.
>
> Properties of the test suite that sees the issue:
>
> - create and destroy many contexts and sockets
> - the previous test's context should always be destroyed before the next
> test starts
> - it is not reliably the same test where the assert is hit
>
> I'm afraid I don't know enough about the internals to really tell what's
> going on here, or figure out why the deleted FD is invalid (maybe it was
> already closed, and the error should be ignored?).
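
On the "maybe it was already closed" idea: the sketch below is not libzmq
code. It is a minimal, Linux-only illustration (the function name is made up
for this example) of what the kernel reports when a descriptor is closed while
still registered with epoll and is then removed with EPOLL_CTL_DEL. If
something similar happens inside the poller, it would yield the same
"Bad file descriptor" text, but that is an assumption about the cause, not a
diagnosis.

```cpp
#include <cerrno>
#include <sys/epoll.h>
#include <unistd.h>

// Hypothetical helper, for illustration only.
// Registers a pipe's read end with epoll, closes it, then tries to remove it.
// Returns the errno of the failed EPOLL_CTL_DEL, 0 if the delete unexpectedly
// succeeds, or -1 if the setup itself fails.
int errno_of_del_after_close()
{
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    int fds[2];
    if (pipe(fds) != 0) {
        close(epfd);
        return -1;
    }

    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) != 0) {
        close(fds[0]);
        close(fds[1]);
        close(epfd);
        return -1;
    }

    // Close the descriptor while it is still registered with the poller.
    close(fds[0]);

    // Removing the now-invalid descriptor is the step that fails.
    int rc = epoll_ctl(epfd, EPOLL_CTL_DEL, fds[0], &ev);
    int err = (rc == -1) ? errno : 0;  // capture errno before further calls

    close(fds[1]);
    close(epfd);
    return err;
}
```

On Linux this should return EBADF, which strerror() renders as
"Bad file descriptor". Whether it is safe to ignore that error in rm_fd() is a
separate question, since it may be hiding a double close elsewhere.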
>
> Anyone have insight on what might be causing the problem, or how I might
> dig deeper into more useful information?
>
> -MinRK
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev


-- 
-
Pieter Hintjens
CEO of iMatix.com
Founder of ZeroMQ community
blog: http://hintjens.com