[zeromq-dev] libzmq crash closing socket with pending messages

AJ Lewis aj.lewis at quantum.com
Wed Nov 6 19:28:55 CET 2013


I've recently seen the same thing in 3.2.3, but hadn't been able to pinpoint
whether the problem was in zmq proper, or in the application using it.  I
look forward to the results of this question.

On Wed, Nov 06, 2013 at 09:47:55AM -0800, Andy Tucker wrote:
> Hi, I have a program that sends messages on a ZMQ_DEALER socket with
> ZMQ_DONTWAIT. If it gets back EAGAIN (perhaps because the other end is
> responding slowly or has gone away) it calls zmq_close to close the socket
> and then re-establish the connection (possibly to a new endpoint) with a
> new socket. ZMQ_LINGER is set to 0 (this doesn't appear to happen if
> ZMQ_LINGER isn't set, but that can cause other issues).
> 
> I'm occasionally seeing crashes in the libzmq epoll_t thread with either
> "pure virtual method called" or a segmentation fault. The stack looks like
> (this is with libzmq 3.2.4 but others are similar):
> 
> #4  0x00007f8928939ca3 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #5  0x00007f892893a77f in __cxa_pure_virtual () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #6  0x00007f8929649db1 in zmq::v1_encoder_t::message_ready (this=0x7f8918000b90) at v1_encoder.cpp:66
> #7  0x00007f892964a2a4 in zmq::encoder_base_t<zmq::v1_encoder_t>::get_data (this=0x7f8918000b90, data_=0x7f8918000928, size_=0x7f8918000930, offset_=0x0) at encoder.hpp:93
> #8  0x00007f892963fb42 in zmq::stream_engine_t::out_event (this=0x7f89180008e0) at stream_engine.cpp:261
> #9  0x00007f8929627d1a in zmq::epoll_t::loop (this=0x8eace0) at epoll.cpp:158
> #10 0x00007f8929644996 in thread_routine (arg_=0x8ead50) at thread.cpp:83
> #11 0x00007f8928be6e9a in start_thread (arg=0x7f89271b9700) at pthread_create.c:308
> #12 0x00007f89293453fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> 
> Looking at the core, it appears that the memory pointed to by the
> msg_source field in the encoder has been freed (the "pure virtual method
> called" is because the vtbl pointer has been munged by something that
> re-allocated the buffer). The msg_source field points to the
> session_base_t, but that was freed by the zmq_close. The session_base_t
> destructor calls engine->terminate(), which would normally free the engine
> state but doesn't do anything if the encoder still has data left to be sent.
> 
> I've reproduced this with 3.2.4, 4.0.1, and master (as of a few days ago).
> I filed LIBZMQ-576 and attached a small test program to the issue.
> 
> This looks like a libzmq bug to me, though if I'm misusing the API in some
> way (or if there's a reasonable workaround) please let me know.
> 
> Andy
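
The lifetime problem described above can be modelled in a few lines of
standalone C++. This is only a sketch under stated assumptions, not libzmq
code and not the fix that was applied upstream: MessageSource, Encoder and
close_with_pending_messages are hypothetical names standing in for the roles
of session_base_t, the encoder and the zmq_close sequence. The model replaces
the raw msg_source pointer with a std::weak_ptr, so that a source destroyed
while data is still pending is detected instead of dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the object the encoder pulls messages from
// (the role session_base_t plays in the report). Not libzmq code.
struct MessageSource {
    std::deque<std::string> pending;

    // Returns true and fills `out` if a message was available.
    bool pull(std::string &out) {
        if (pending.empty()) return false;
        out = pending.front();
        pending.pop_front();
        return true;
    }
};

// Hypothetical stand-in for the encoder. It observes the source through a
// weak_ptr rather than a raw pointer, so a destroyed source is detectable.
class Encoder {
public:
    explicit Encoder(std::weak_ptr<MessageSource> source)
        : source_(std::move(source)) {}

    // Called when the socket is writable. Returns true if a message was
    // fetched, false if nothing is queued or the source no longer exists.
    bool message_ready(std::string &out) {
        std::shared_ptr<MessageSource> src = source_.lock();
        if (!src) return false;  // source gone: stop instead of crashing
        return src->pull(out);
    }

    bool source_alive() const { return !source_.expired(); }

private:
    std::weak_ptr<MessageSource> source_;
};

// Models the reported sequence: three messages are queued, one is sent,
// then the owner destroys the source while two are still pending.
// Returns the total number of messages the encoder delivered.
std::size_t close_with_pending_messages() {
    auto source = std::make_shared<MessageSource>();
    source->pending.push_back("a");
    source->pending.push_back("b");
    source->pending.push_back("c");
    Encoder encoder(source);

    std::size_t delivered = 0;
    std::string msg;
    if (encoder.message_ready(msg)) ++delivered;  // delivers "a"

    source.reset();  // analogous to zmq_close with ZMQ_LINGER set to 0
    assert(!encoder.source_alive());

    // The I/O loop fires again. With a raw pointer this would be a
    // use-after-free; here the call reports that there is nothing to send.
    while (encoder.message_ready(msg)) ++delivered;
    return delivered;
}
```

Calling close_with_pending_messages() from a main function delivers only the
first message; the two still queued at the time of the reset are dropped,
which matches what a linger of 0 asks for. Whether a weak reference is
practical inside libzmq's own threading model is a separate question that
this model does not answer.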


-- 
AJ Lewis
Software Engineer
Quantum Corporation

Work:    651 688-4346
email:   aj.lewis at quantum.com
