[zeromq-dev] Bad file descriptor in rm_fd()

Richard_Newton at waters.com Richard_Newton at waters.com
Wed Nov 6 13:26:56 CET 2013


Thanks.

It's the test_multiple_threads test that fails, which I guess makes sense.  I'll test to see if any of the other ones (that do everything on a single
thread) can reproduce it, just in case.

Ric.




From:	"Pieter Hintjens" <ph at imatix.com>
To:	"ZeroMQ development list" <zeromq-dev at lists.zeromq.org>,
Date:	06/11/2013 12:14 PM
Subject:	Re: [zeromq-dev] Bad file descriptor in rm_fd()
Sent by:	zeromq-dev-bounces at lists.zeromq.org



Great, nice to have a test case.

I think this is a very old issue: https://zeromq.jira.com/browse/LIBZMQ-76

I'll put Richard's test case into the issues repository, and update issue 76.


On Wed, Nov 6, 2013 at 12:47 PM, <Richard_Newton at waters.com> wrote:
  Hi,

  I managed to reproduce something that looks like this; I have seen it on both Linux and Windows.

  I modified test_inproc_connect to run the tests in a tight loop (except for test_connect_before_bind_pub_sub as that has a sleep in it), so the main
  looks like:

  while (true)
  {
      test_bind_before_connect();
      test_connect_before_bind();
      // test_connect_before_bind_pub_sub();  // skipped: contains a sleep
      test_multiple_connects();
      test_multiple_threads();
      test_identity();
  }

  This gave me the output:

  Bad file descriptor (/home/richard/code/libzmq/src/epoll.cpp:79)
  Aborted

  On today's master.

  It does take a few hours to occur on my machine.

  Ric.



  From: MinRK <benjaminrk at gmail.com>
  To: "0MQ development list" <zeromq-dev at lists.zeromq.org>
  Date: 05/11/2013 10:44 PM
  Subject: [zeromq-dev] Bad file descriptor in rm_fd()
  Sent by: zeromq-dev-bounces at lists.zeromq.org




  Once in a while, when running either the IPython or PyZMQ test suite, I still get this error:

      Bad file descriptor (kqueue.cpp:77)

  or


      Bad file descriptor (epoll.cpp:81)


  Stack trace suggests that this happens when destroying a context:

  Thread 0:
  1   libzmq.3.dylib                 0x000000010f26b170 zmq::signaler_t::send() + 52
  2   libzmq.3.dylib                 0x000000010f261b2f zmq::object_t::send_stop() + 35
  3   libzmq.3.dylib                 0x000000010f2534a7 zmq::ctx_t::~ctx_t() + 59
  4   libzmq.3.dylib                 0x000000010f253a29 zmq::ctx_t::terminate() + 439
  5   libzmq.3.dylib                 0x000000010f27c071 zmq_ctx_term + 35


  Thread 6 Crashed:
  0   libsystem_kernel.dylib         0x00007fff94a4d866 __pthread_kill + 10
  1   libsystem_pthread.dylib        0x00007fff8cac835c pthread_kill + 92
  2   libsystem_c.dylib              0x00007fff97570bba abort + 125
  3   libzmq.3.dylib                 0x000000010f25a9e1 zmq::zmq_abort(char const*) + 9
  4   libzmq.3.dylib                 0x000000010f25d0fe zmq::kqueue_t::kevent_delete(int, short) + 142
  5   libzmq.3.dylib                 0x000000010f25d1b0 zmq::kqueue_t::rm_fd(void*) + 42
  6   libzmq.3.dylib                 0x000000010f2687a3 zmq::reaper_t::process_stop() + 59
  7   libzmq.3.dylib                 0x000000010f26862b zmq::reaper_t::in_event() + 161
  8   libzmq.3.dylib                 0x000000010f25d40c zmq::kqueue_t::loop() + 362


  I am still seeing this error once in a while with libzmq-master as of today. I don't think it's a recent regression.  A minimal test case is
  difficult, since it only seems to arise after at least a hundred tests, and only a small fraction of the time even then.  Given that it is always
  late in the process that the assert is hit, I have always assumed that it is FD exhaustion that is causing the problem, but I am not actually sure,
  and I am fairly careful about cleaning up sockets.
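  One way to test the FD-exhaustion hypothesis is to log how many descriptors the process holds between tests. The following is a minimal sketch, assuming a POSIX system; `count_open_fds` and the 65536 scan cap are hypothetical choices for illustration and are not part of libzmq.

```cpp
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>
#include <algorithm>
#include <cassert>

// Hypothetical helper (not a libzmq function): counts the file descriptors
// currently open in this process by probing each number with fcntl().
// Returns -1 if the descriptor limit cannot be read.
int count_open_fds()
{
    struct rlimit lim = {};
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;

    // Assumption: cap the scan so an "unlimited" or very large soft limit
    // does not turn this into a multi-million iteration loop.
    const rlim_t cap = 65536;
    const rlim_t max_fd =
        (lim.rlim_cur == RLIM_INFINITY) ? cap : std::min(lim.rlim_cur, cap);

    int open_count = 0;
    for (rlim_t fd = 0; fd < max_fd; ++fd) {
        // F_GETFD fails with EBADF when fd is not an open descriptor.
        if (fcntl(static_cast<int>(fd), F_GETFD) != -1)
            ++open_count;
    }
    return open_count;
}
```

  Calling it from the test suite's setup and teardown hooks and printing the result would show whether the count climbs toward the RLIMIT_NOFILE soft limit before the assert is hit. If the count stays flat, exhaustion is an unlikely explanation.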

  Properties of the test suite that sees the issue:

  - create and destroy many contexts and sockets
  - the previous test's context should always be destroyed before the next test starts
  - it is not reliably the same test where the assert is hit

  I'm afraid I don't know enough about the internals to really tell what's going on here, or figure out why the deleted FD is invalid (maybe it was
  already closed, and the error should be ignored?).
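  The "already closed" scenario can be reproduced outside libzmq. Below is a minimal Linux-only sketch, assuming epoll (the kqueue equivalent is not shown); `errno_after_deleting_closed_fd` is a hypothetical helper name, and the sketch makes no claim about where or whether libzmq closes the descriptor early.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper (not libzmq code): registers a pipe's read end with an
// epoll instance, closes that descriptor, then tries to remove it from the
// epoll set. Returns the errno of the failed removal, or 0 if it succeeded.
int errno_after_deleting_closed_fd()
{
    int ep = epoll_create1(0);
    assert(ep >= 0);

    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);

    epoll_event ev = {};
    ev.events = EPOLLIN;
    ev.data.fd = fds[0];
    rc = epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev);
    assert(rc == 0);

    // Close the descriptor BEFORE removing it from the epoll set.
    close(fds[0]);

    // The removal now refers to a descriptor number that is no longer open.
    rc = epoll_ctl(ep, EPOLL_CTL_DEL, fds[0], &ev);
    int err = (rc == -1) ? errno : 0;

    close(fds[1]);
    close(ep);
    return err;
}
```

  On Linux the removal fails with EBADF, whose message text is "Bad file descriptor", matching the abort output. This only shows that a close-before-remove ordering produces the same error; it does not establish that this ordering is what happens inside the reaper.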

  Anyone have insight on what might be causing the problem, or how I might dig deeper into more useful information?

  -MinRK
  _______________________________________________
  zeromq-dev mailing list
  zeromq-dev at lists.zeromq.org
  http://lists.zeromq.org/mailman/listinfo/zeromq-dev



  ===========================================================
  The information in this email is confidential, and is intended solely for the addressee(s).
  Access to this email by anyone else is unauthorized and therefore prohibited.  If you are
  not the intended recipient you are notified that disclosing, copying, distributing or taking
  any action in reliance on the contents of this information is strictly prohibited and may be unlawful.
  ===========================================================





--
-
Pieter Hintjens
CEO of iMatix.com
Founder of ZeroMQ community
blog: http://hintjens.com