[zeromq-dev] libzmq crash closing socket with pending messages
tucker at brkt.com
Wed Nov 6 18:47:55 CET 2013
Hi, I have a program that sends messages on a ZMQ_DEALER socket with
ZMQ_DONTWAIT. If the send fails with EAGAIN (perhaps because the other end is
responding slowly or has gone away), it calls zmq_close to close the socket
and then re-establish the connection (possibly to a new endpoint) with a
new socket. ZMQ_LINGER is set to 0 (the crash described below doesn't appear
to happen if ZMQ_LINGER isn't set, but leaving it unset can cause other issues).
I'm occasionally seeing crashes in the libzmq epoll_t thread with either
"pure virtual method called" or a segmentation fault. The stack looks like
this (taken from libzmq 3.2.4; other versions are similar):
#4 0x00007f8928939ca3 in std::terminate() () from
#5 0x00007f892893a77f in __cxa_pure_virtual () from
#6 0x00007f8929649db1 in zmq::v1_encoder_t::message_ready
(this=0x7f8918000b90) at v1_encoder.cpp:66
#7 0x00007f892964a2a4 in zmq::encoder_base_t<zmq::v1_encoder_t>::get_data
(this=0x7f8918000b90, data_=0x7f8918000928, size_=0x7f8918000930,
offset_=0x0) at encoder.hpp:93
#8 0x00007f892963fb42 in zmq::stream_engine_t::out_event
(this=0x7f89180008e0) at stream_engine.cpp:261
#9 0x00007f8929627d1a in zmq::epoll_t::loop (this=0x8eace0) at
#10 0x00007f8929644996 in thread_routine (arg_=0x8ead50) at thread.cpp:83
#11 0x00007f8928be6e9a in start_thread (arg=0x7f89271b9700) at
#12 0x00007f89293453fd in clone () at
Looking at the core, it appears that the memory pointed to by the
msg_source field in the encoder has been freed (the "pure virtual method
called" is because the vtbl pointer has been munged by something that
re-allocated the buffer). The msg_source field points to the
session_base_t, but that was freed by the zmq_close. The session_base_t
destructor calls engine->terminate(), which would normally free the engine
state but doesn't do anything if the encoder still has data left to be sent.
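
To make the lifetime problem concrete, here is a toy model of the
relationship described above. It is only a sketch: ToySession, ToyEncoder,
MsgSource, pull_msg and fetched_after_close are hypothetical names invented
for illustration, not libzmq's real classes, and this is not a proposed
patch. The assumption is that the encoder only observes its message source
and does not own it. With a raw pointer in that position, a call made after
the session is freed goes through a dangling pointer. The sketch uses
std::weak_ptr instead, so the encoder can detect that the source is gone and
stop.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <string>

// Toy stand-ins for illustration only; these are NOT libzmq's real classes.
struct MsgSource {
    virtual ~MsgSource() = default;
    // Fetch the next outgoing message; returns false when none is queued.
    virtual bool pull_msg(std::string &out) = 0;
};

struct ToySession : MsgSource {
    std::deque<std::string> pending;

    bool pull_msg(std::string &out) override {
        if (pending.empty()) return false;
        out = pending.front();
        pending.pop_front();
        return true;
    }
};

struct ToyEncoder {
    // Non-owning observer: expires when the session is destroyed.
    std::weak_ptr<MsgSource> msg_source;

    // Returns true if a message was fetched, false if the queue is empty
    // or the source no longer exists.
    bool message_ready(std::string &out) {
        std::shared_ptr<MsgSource> src = msg_source.lock();
        if (!src) return false;  // source is gone: do not dereference
        return src->pull_msg(out);
    }
};

// Simulates closing the socket while messages are still queued.
// Returns the number of messages the encoder fetched after the close.
int fetched_after_close() {
    auto session = std::make_shared<ToySession>();
    session->pending = {"a", "b", "c"};

    ToyEncoder encoder;
    encoder.msg_source = session;

    std::string msg;
    bool ok = encoder.message_ready(msg);  // fetches "a" while session lives
    assert(ok && msg == "a");

    session.reset();  // analogous to the session being freed by zmq_close

    int fetched = 0;
    while (encoder.message_ready(msg)) ++fetched;
    return fetched;
}
```

In this model, the remaining two messages are simply dropped once the
session is destroyed, which is roughly what one would expect from a linger
of 0. Whether an equivalent check belongs in the engine, the encoder, or the
session's teardown path is a separate question for the libzmq code itself.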
I've reproduced this with 3.2.4, 4.0.1, and master (as of a few days ago).
I filed LIBZMQ-576 and attached a small test program to the issue.
This looks like a libzmq bug to me, though if I'm misusing the API in some
way (or if there's a reasonable workaround) please let me know.