[zeromq-dev] need advice! hitting assertion in epoll.cpp:131 (zmq 4.2.1) when going from RHEL6 to RHEL7?!
zmqdev
zmqdev at amitego.com
Thu Feb 16 14:59:46 CET 2017
Hello,
I could use some advice to diagnose the following issue.
I have a program that has been running without problems for a couple of
years on Red Hat Enterprise Linux 6 at various sites.
On RHEL7, the program triggers the assertion
Bad file descriptor (src/epoll.cpp:131)
in about 1/3 of executions, during startup (sometimes during shutdown).
Less often, I see
Bad file descriptor (src/epoll.cpp:100)
The problem persists after upgrading to ZeroMQ 4.2.1 from 4.1.6.
I don't get it!
Programming errors aside, I do check all return codes and log errors as
they occur in the main thread, and nothing is logged until libzmq aborts
from one of its own threads.
Any idea/advice on how I could track down this problem?
What makes RHEL7 different enough from RHEL6 to cause this kind of error?
Cheers :-(
________________________________________________________________________
GDB BACKTRACE FROM CORE FILE:
Thread 3 (Thread 0xf736b900 (LWP 5039)):
#0 0xf7751430 in __kernel_vsyscall ()
#1 0xf745694b in poll () from /lib/libc.so.6
#2 0xf6ff5457 in zmq::socket_poller_t::wait(zmq::socket_poller_t::event_t*, int, long) () from $TOP/lib/platform/libzmq.so.5
#3 0xf6ff325f in zmq_poller_wait_all(void*, zmq_poller_event_t*, int, long) () from $TOP/lib/platform/libzmq.so.5
#4 0xf6ff3aa5 in zmq_poller_poll(zmq_pollitem_t*, int, long) () from $TOP/lib/platform/libzmq.so.5
#5 0xf6ff2bb1 in zmq_poll () from $TOP/lib/platform/libzmq.so.5
#6 0xf702cec1 in zt_reactor_loop (r=<optimized out>) at $TOP/src/reactor.c:268
(...)
#17 0x080487da in main ()
Thread 2 (Thread 0xf6e6db40 (LWP 5066)):
#0 0xf7751430 in __kernel_vsyscall ()
#1 0xf7463a16 in epoll_wait () from /lib/libc.so.6
#2 0xf6fa17d0 in zmq::epoll_t::loop() () from $TOP/lib/platform/libzmq.so.5
#3 0xf6fa1a35 in zmq::epoll_t::worker_routine(void*) () from $TOP/lib/platform/libzmq.so.5
#4 0xf6fe36f2 in thread_routine () from $TOP/lib/platform/libzmq.so.5
#5 0xf7574b2c in start_thread () from /lib/libpthread.so.0
#6 0xf746308e in clone () from /lib/libc.so.6
Thread 1 (Thread 0xf666cb40 (LWP 5067)):
#0 0xf7751430 in __kernel_vsyscall ()
#1 0xf739a1f7 in raise () from /lib/libc.so.6
#2 0xf739ba33 in abort () from /lib/libc.so.6
#3 0xf6fa2726 in zmq::zmq_abort(char const*) () from $TOP/lib/platform/libzmq.so.5
#4 0xf6fa164b in zmq::epoll_t::set_pollout(void*) () from $TOP/lib/platform/libzmq.so.5
#5 0xf6fa3951 in zmq::io_object_t::set_pollout(void*) () from $TOP/lib/platform/libzmq.so.5
#6 0xf6fdafe1 in zmq::stream_engine_t::restart_output() () from $TOP/lib/platform/libzmq.so.5
#7 0xf6fcae20 in zmq::session_base_t::read_activated(zmq::pipe_t*) () from $TOP/lib/platform/libzmq.so.5
#8 0xf6fb9dd3 in zmq::pipe_t::process_activate_read() () from $TOP/lib/platform/libzmq.so.5
#9 0xf6fb2a9e in zmq::object_t::process_command(zmq::command_t&) () from $TOP/lib/platform/libzmq.so.5
#10 0xf6fa3f77 in zmq::io_thread_t::in_event() () from $TOP/lib/platform/libzmq.so.5
#11 0xf6fa1948 in zmq::epoll_t::loop() () from $TOP/lib/platform/libzmq.so.5
#12 0xf6fa1a35 in zmq::epoll_t::worker_routine(void*) () from $TOP/lib/platform/libzmq.so.5
#13 0xf6fe36f2 in thread_routine () from $TOP/lib/platform/libzmq.so.5
#14 0xf7574b2c in start_thread () from /lib/libpthread.so.0
#15 0xf746308e in clone () from /lib/libc.so.6
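________________________________________________________________________
ILLUSTRATION (NOT FROM THE FAILING PROGRAM):
The Thread 1 backtrace aborts inside zmq::epoll_t::set_pollout(). Assuming
that the assertion there wraps an epoll_ctl() call (an assumption, not
checked against the 4.2.1 source), one way to get "Bad file descriptor"
(EBADF) is for a descriptor to be closed behind the poller's back after it
was registered. The sketch below reproduces that errno with plain Linux
calls and no libzmq. The function name errno_after_close_then_mod is
hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: registers a pipe's read end with epoll, closes that
   descriptor, then tries to modify the registration. Returns the errno of
   the failing EPOLL_CTL_MOD, or 0 if the call unexpectedly succeeds. */
int errno_after_close_then_mod (void)
{
    int pipefd[2];
    int rc = pipe (pipefd);
    assert (rc == 0);

    int epfd = epoll_create (1);
    assert (epfd != -1);

    struct epoll_event ev;
    memset (&ev, 0, sizeof ev);
    ev.events = EPOLLIN;
    ev.data.fd = pipefd[0];
    rc = epoll_ctl (epfd, EPOLL_CTL_ADD, pipefd[0], &ev);
    assert (rc == 0);

    /* Simulate other code closing the descriptor while the poller
       still believes it is registered. */
    close (pipefd[0]);

    /* Comparable to asking for POLLOUT on an already registered fd. */
    ev.events = EPOLLIN | EPOLLOUT;
    errno = 0;
    rc = epoll_ctl (epfd, EPOLL_CTL_MOD, pipefd[0], &ev);
    int saved = (rc == -1) ? errno : 0;

    close (pipefd[1]);
    close (epfd);
    return saved;
}
```

Calling errno_after_close_then_mod() from a single-threaded test program
should return EBADF on Linux. In a multi-threaded process the descriptor
number may be reused by another open() or socket() before the
EPOLL_CTL_MOD, in which case the call fails differently or hits an
unrelated file, so treat this only as a picture of the failure mode and not
as a diagnosis of the problem above.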