[zeromq-dev] Zmq I/O thread abort() due to bad file descriptor (EBADF)
Ottawa Guy
ottawaguy81 at yahoo.com
Thu Apr 15 04:19:56 CEST 2021
Hi,
I am using ZeroMQ 4.3.1. In our design, each microservice sends periodic heartbeats to its peers (ROUTER->DEALER model). The sockets are configured with ZMQ_IMMEDIATE and ZMQ_SNDTIMEO, which keeps the send operation from blocking when the peer is not up.
We are seeing cases where the ZMQ I/O thread crashes (abort) with "Bad file descriptor". It only happens for a peer that is not reachable; there is no issue with peers that are reachable. The abort is triggered by epoll_ctl() returning EBADF for EPOLL_CTL_DEL or EPOLL_CTL_ADD. Our application only uses ZMQ sockets; it does not use ZMQ_FD, so I am not sure how there could be a race condition. It seems the socket file descriptor gets closed after the epoll_wait() event. The problem is rare but does happen, and I don't have a recipe to reproduce it.
Any pointers would be helpful.
Thanks,
Hadi
(gdb) bt
#0 0x00003fff7cdee530 in __libc_signal_restore_set (set=0x3fff712e8040) at ../sysdeps/unix/sysv/linux/internal-signals.h:84
#1 __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:48
#2 0x00003fff7cdd4648 in __GI_abort () at abort.c:79
#3 0x00003fff7c971818 in zmq::zmq_abort (errmsg_=<optimized out>) at /usr/src/debug/zeromq/4.3.1-r0/zeromq-4.3.1/src/err.cpp:88
#4 0x00003fff7c970d88 in zmq::epoll_t::add_fd (this=0x104797d0, fd_=<optimized out>, events_=<optimized out>) at /usr/src/debug/zeromq/4.3.1-r0/zeromq-4.3.1/src/epoll.cpp:100
#5 0x00003fff7c972438 in zmq::io_object_t::add_fd (this=<optimized out>, fd_=<optimized out>) at /usr/src/debug/zeromq/4.3.1-r0/zeromq-4.3.1/src/io_object.cpp:65
#6 0x00003fff7c9b1e98 in zmq::tcp_connecter_t::start_connecting (this=0x3fff385bfe70) at /usr/src/debug/zeromq/4.3.1-r0/zeromq-4.3.1/src/tcp_connecter.cpp:203
#7 zmq::tcp_connecter_t::start_connecting (this=0x3fff385bfe70) at /usr/src/debug/zeromq/4.3.1-r0/zeromq-4.3.1/src/tcp_connecter.cpp:190
#8 0x00003fff7c9b1fe4 in zmq::tcp_connecter_t::timer_event (this=<optimized out>, id_=<optimized out>) at /usr/src/debug/zeromq/4.3.1-r0/zeromq-4.3.1/src/tcp_connecter.cpp:186
#9 0x00003fff7c98dad0 in zmq::poller_base_t::execute_timers (this=0x104797d0) at /usr/src/debug/zeromq/4.3.1-r0/zeromq-4.3.1/src/poller_base.cpp:103
#10 0x00003fff7c9709c4 in zmq::epoll_t::loop (this=0x104797d0) at /usr/src/debug/zeromq/4.3.1-r0/zeromq-4.3.1/src/epoll.cpp:173
#11 0x00003fff7c98d3ac in zmq::worker_poller_base_t::worker_routine (arg_=<optimized out>) at /usr/src/debug/zeromq/4.3.1-r0/zeromq-4.3.1/src/poller_base.cpp:139
#12 0x00003fff7c9b3658 in thread_routine (arg_=0x10479828) at /usr/src/debug/zeromq/4.3.1-r0/zeromq-4.3.1/src/thread.cpp:182
#13 0x00003fff7cfabb14 in start_thread (arg=0x0) at pthread_create.c:486
#14 0x00003fff7cec72e8 in .__clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:82

#0 0x00003fffb323e530 in __libc_signal_restore_set (set=0x3fff6afe7060) at ../sysdeps/unix/sysv/linux/internal-signals.h:84
#1 __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:48
#2 0x00003fffb3224648 in __GI_abort () at abort.c:79
#3 0x00003fffb2dc1818 in .zmq::zmq_abort(char const*) () from /usr/lib64/libzmq.so.5
#4 0x00003fffb2dc155c in .zmq::epoll_t::rm_fd(void*) () from /usr/lib64/libzmq.so.5
#5 0x00003fffb2dc2474 in .zmq::io_object_t::rm_fd(void*) () from /usr/lib64/libzmq.so.5
#6 0x00003fffb2e011e0 in .zmq::tcp_connecter_t::rm_handle() () from /usr/lib64/libzmq.so.5
#7 0x00003fffb2e01c3c in .zmq::tcp_connecter_t::out_event() () from /usr/lib64/libzmq.so.5
#8 0x00003fffb2e00cbc in .zmq::tcp_connecter_t::in_event() () from /usr/lib64/libzmq.so.5
#9 0x00003fffb2dc0a9c in ?? () from /usr/lib64/libzmq.so.5
#10 0x00003fffb2ddd3ac in .zmq::worker_poller_base_t::worker_routine(void*) () from /usr/lib64/libzmq.so.5
#11 0x00003fffb2e03658 in ?? () from /usr/lib64/libzmq.so.5
#12 0x00003fffb33fbb14 in start_thread (arg=0x0) at pthread_create.c:486
#13 0x00003fffb33172e8 in .__clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:82