[zeromq-dev] OpenPGM & segfault - assertion failed

Olivier olivier.chamoux at fr.thalesgroup.com
Thu May 6 10:11:14 CEST 2010


Hi all,

I'm using zeromq and the pub/sub model.

Here is my (very simple) use case:
A master that receives messages, and a node with one or two senders
started over SSH (see attachments for the programs).
Master and node are connected through a switch on a 100MB LAN.
zmq version: 2.06 trunk version (zeromq2-1ad6ade)
OS: Debian

The messages are transmitted correctly, but a few seconds after the beginning
I get a lot of segfaults or "Assertion failed" errors (more or less randomly :/)
on the subscriber's side (see the attachment for the list).

Besides these segfaults/assertions, I observe a strange behavior of the
subscriber program:
sometimes it blocks and doesn't receive messages anymore, even though the
sender is still active. If I start a second sender, it unblocks the
subscriber for a short while.

So I am a bit confused by (all) these errors.
Is it a bug in OpenPGM, or am I doing something wrong in my code?

Regards,
Olivier.
-------------- next part --------------
*** glibc detected *** /home/olivier/ZMQ/0mq_prgrm/Debug/segfault/sub: free(): invalid next size (fast): 0x091761e0 ***
======= Backtrace: =========
/lib/i686/cmov/libc.so.6[0xb7cfe6b4]
/lib/i686/cmov/libc.so.6(cfree+0x96)[0xb7d008b6]
/usr/local/lib/libzmq.so.0(zmq_msg_close+0x47)[0xb7f429a7]
/usr/local/lib/libzmq.so.0(_ZN3zmq13zmq_decoder_tD1Ev+0x28)[0xb7f44668]
/usr/local/lib/libzmq.so.0(_ZN3zmq14pgm_receiver_t8in_eventEv+0x35a)[0xb7f2ff0a]
/usr/local/lib/libzmq.so.0(_ZN3zmq7epoll_t4loopEv+0x15e)[0xb7f2a95e]
/usr/local/lib/libzmq.so.0(_ZN3zmq7epoll_t14worker_routineEPv+0x1d)[0xb7f2aa4d]
/usr/local/lib/libzmq.so.0(_ZN3zmq8thread_t14thread_routineEPv+0x57)[0xb7f402a7]
/lib/i686/cmov/libpthread.so.0[0xb7bb54c0]
/lib/i686/cmov/libc.so.6(clone+0x5e)[0xb7d7061e]
======= Memory map: ========
08048000-0804a000 r-xp 00000000 03:03 11167572   /home/olivier/ZMQ/0mq_prgrm/Debug/segfault/sub
0804a000-0804b000 rw-p 00001000 03:03 11167572   /home/olivier/ZMQ/0mq_prgrm/Debug/segfault/sub
0916b000-091ad000 rw-p 0916b000 00:00 0          [heap]
b7200000-b7221000 rw-p b7200000 00:00 0 
b7221000-b7300000 ---p b7221000 00:00 0 
b7377000-b7378000 ---p b7377000 00:00 0 
b7378000-b7b78000 rw-p b7378000 00:00 0 
b7b78000-b7b82000 r-xp 00000000 03:02 5406915    /lib/i686/cmov/libnss_files-2.7.so
b7b82000-b7b84000 rw-p 00009000 03:02 5406915    /lib/i686/cmov/libnss_files-2.7.so
b7b84000-b7b86000 rw-p b7b84000 00:00 0 
b7b86000-b7bae000 r-xp 00000000 03:02 757918     /usr/lib/libpcre.so.3.12.1
b7bae000-b7baf000 rw-p 00027000 03:02 757918     /usr/lib/libpcre.so.3.12.1
b7baf000-b7bc4000 r-xp 00000000 03:02 5406920    /lib/i686/cmov/libpthread-2.7.so
b7bc4000-b7bc6000 rw-p 00014000 03:02 5406920    /lib/i686/cmov/libpthread-2.7.so
b7bc6000-b7bc8000 rw-p b7bc6000 00:00 0 
b7bc8000-b7bcb000 r-xp 00000000 03:02 5382204    /lib/libuuid.so.1.2
b7bcb000-b7bcc000 rw-p 00002000 03:02 5382204    /lib/libuuid.so.1.2
b7bcc000-b7bcd000 rw-p b7bcc000 00:00 0 
b7bcd000-b7c81000 r-xp 00000000 03:02 757924     /usr/lib/libglib-2.0.so.0.1600.6
b7c81000-b7c82000 rw-p 000b3000 03:02 757924     /usr/lib/libglib-2.0.so.0.1600.6
b7c82000-b7c89000 r-xp 00000000 03:02 5406922    /lib/i686/cmov/librt-2.7.so
b7c89000-b7c8b000 rw-p 00006000 03:02 5406922    /lib/i686/cmov/librt-2.7.so
b7c8b000-b7c8f000 r-xp 00000000 03:02 757927     /usr/lib/libgthread-2.0.so.0.1600.6
b7c8f000-b7c90000 rw-p 00003000 03:02 757927     /usr/lib/libgthread-2.0.so.0.1600.6
b7c90000-b7de5000 r-xp 00000000 03:02 5406906    /lib/i686/cmov/libc-2.7.so
b7de5000-b7de6000 r--p 00155000 03:02 5406906    /lib/i686/cmov/libc-2.7.so
b7de6000-b7de8000 rw-p 00156000 03:02 5406906    /lib/i686/cmov/libc-2.7.so
b7de8000-b7deb000 rw-p b7de8000 00:00 0 
b7deb000-b7df7000 r-xp 00000000 03:02 5383802    /lib/libgcc_s.so.1
b7df7000-b7df8000 rw-p 0000b000 03:02 5383802    /lib/libgcc_s.so.1
b7df8000-b7e1c000 r-xp 00000000 03:02 5406910    /lib/i686/cmov/libm-2.7.so
b7e1c000-b7e1e000 rw-p 00023000 03:02 5406910    /lib/i686/cmov/libm-2.7.so
b7e1e000-b7e1f000 rw-p b7e1e000 00:00 0 
b7e1f000-b7f02000 r-xp 00000000 03:02 755000     /usr/lib/libstdc++.so.6.0.10
b7f02000-b7f05000 r--p 000e2000 03:02 755000     /usr/lib/libstdc++.so.6.0.10
b7f05000-b7f07000 rw-p 000e5000 03:02 755000     /usr/lib/libstdc++.so.6.0.10
b7f07000-b7f0d000 rw-p b7f07000 00:00 0 
b7f0d000-b7f89000 r-xp 00000000 03:02 765791     /usr/local/lib/libzmq.so.0.0.0
b7f89000-b7f8c000 rw-p 0007b000 03:02 765791     /usr/local/lib/libzmq.so.0.0.0
b7fa0000-b7fa3000 rw-p b7fa0000 00:00 0 
b7fa3000-b7fa4000 r-xp b7fa3000 00:00 0          [vdso]
b7fa4000-b7fbe000 r-xp 00000000 03:02 5383803    /lib/ld-2.7.so
b7fbe000-b7fc0000 rw-p 0001a000 03:02 5383803    /lib/ld-2.7.so
bfaaa000-bfabf000 rw-p bffeb000 00:00 0          [stack]
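
As a reading aid for the glibc backtrace above: the libzmq frames are printed with mangled C++ names. A minimal sketch of how they could be decoded, assuming a GCC or Clang toolchain that provides <cxxabi.h>; the helper name demangle is hypothetical and not part of zeromq.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Return the demangled form of an Itanium-ABI symbol name.
// If demangling fails, the input is returned unchanged.
std::string demangle(const std::string& mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        std::free(out);  // free(nullptr) is a no-op
        return mangled;
    }
    std::string result(out);
    std::free(out);      // the buffer is allocated with malloc()
    return result;
}
```

For example, _ZN3zmq14pgm_receiver_t8in_eventEv from the backtrace decodes to zmq::pgm_receiver_t::in_event(), which matches the frames shown in the gdb sessions further down.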

Program received signal SIGABRT, Aborted.
[Switching to Thread 0xb7b77b90 (LWP 11744)]
0xb7fa3424 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7fa3424 in __kernel_vsyscall ()
#1  0xb7cbb640 in raise () from /lib/i686/cmov/libc.so.6
#2  0xb7cbd018 in abort () from /lib/i686/cmov/libc.so.6
#3  0xb7cf83dd in ?? () from /lib/i686/cmov/libc.so.6
#4  0x0000000f in ?? ()
#5  0xb7b76034 in ?? ()
#6  0x00000400 in ?? ()
#7  0xb7dce5c8 in ?? () from /lib/i686/cmov/libc.so.6
#8  0x00000017 in ?? ()
#9  0xbfabe9db in ?? ()
#10 0x0000002e in ?? ()
#11 0xb7dce5e1 in ?? () from /lib/i686/cmov/libc.so.6
#12 0x00000002 in ?? ()
#13 0xb7dce614 in ?? () from /lib/i686/cmov/libc.so.6
#14 0x00000020 in ?? ()
#15 0xb7dce5e5 in ?? () from /lib/i686/cmov/libc.so.6
#16 0x00000004 in ?? ()
#17 0xb7b76563 in ?? ()
#18 0x00000008 in ?? ()
#19 0xb7dce5eb in ?? () from /lib/i686/cmov/libc.so.6
#20 0x00000005 in ?? ()
#21 0xb7b75f48 in ?? ()
#22 0xb7f8a6f4 in ?? () from /usr/local/lib/libzmq.so.0
#23 0x09172330 in ?? ()
#24 0xb7d7e930 in ?? () from /lib/i686/cmov/libc.so.6
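
As a reading aid for the raw addresses above: a frame address can be matched against the memory map to see which file it belongs to and at what offset. The following is only a sketch under assumptions: it expects map lines shaped like the ones in this report, does not handle paths containing spaces, and the names Mapping, parse_mapping and file_offset are hypothetical helpers, not part of zeromq, glibc or gdb. Whether the resulting offset is what a given symbol tool expects depends on how the object was linked.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// One parsed line of a /proc/<pid>/maps-style listing.
struct Mapping {
    std::uint64_t start = 0;     // first address of the mapping
    std::uint64_t end = 0;       // one past the last address
    std::uint64_t file_off = 0;  // offset column of the map line
    std::string path;            // backing file, empty if none is listed
};

// Parse a line such as
// "b7f0d000-b7f89000 r-xp 00000000 03:02 765791     /usr/local/lib/libzmq.so.0.0.0".
// Returns false if the line does not have the expected shape.
bool parse_mapping(const std::string& line, Mapping& m) {
    unsigned long long start = 0, end = 0, off = 0;
    char perms[5] = {0};
    char path[512] = {0};
    // The device and inode columns are skipped; the path column is optional.
    int n = std::sscanf(line.c_str(), "%llx-%llx %4s %llx %*s %*s %511s",
                        &start, &end, perms, &off, path);
    if (n < 4) return false;
    m.start = start;
    m.end = end;
    m.file_off = off;
    m.path = (n == 5) ? path : "";
    return true;
}

// If addr lies inside m, store its offset relative to the backing file
// in out and return true; otherwise return false.
bool file_offset(const Mapping& m, std::uint64_t addr, std::uint64_t& out) {
    if (addr < m.start || addr >= m.end) return false;
    out = addr - m.start + m.file_off;
    return true;
}
```

For the first glibc backtrace, the frame address 0xb7f2ff0a lies in the r-xp mapping of libzmq.so.0.0.0 that starts at b7f0d000 with file offset 0, which gives offset 0x22f0a.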



############################
############################

*** glibc detected *** /home/olivier/ZMQ/0mq_prgrm/Debug/segfault/sub: malloc(): memory corruption: 0xb7217350 ***
37823
======= Backtrace: =========
/lib/i686/cmov/libc.so.6[0xb7c9d2e6]
/lib/i686/cmov/libc.so.6(__libc_malloc+0x95)[0xb7c9e6e5]
/lib/i686/cmov/libc.so.6(vasprintf+0x23)[0xb7c92ff3]
/usr/lib/libglib-2.0.so.0(g_vasprintf+0x37)[0xb7bd8257]
/usr/lib/libglib-2.0.so.0(g_strdup_vprintf+0x26)[0xb7bc4586]
/usr/lib/libglib-2.0.so.0(g_set_error+0x52)[0xb7b91dc2]
/usr/local/lib/libzmq.so.0(pgm_recvmsgv+0x246)[0xb7eff966]
/usr/local/lib/libzmq.so.0(_ZN3zmq12pgm_socket_t7receiveEPPvPPK9pgm_tsi_t+0xbc)[0xb7ecd59c]
/usr/local/lib/libzmq.so.0(_ZN3zmq14pgm_receiver_t8in_eventEv+0x79)[0xb7ecbc29]
/usr/local/lib/libzmq.so.0(_ZN3zmq7epoll_t4loopEv+0x15e)[0xb7ec695e]
/usr/local/lib/libzmq.so.0(_ZN3zmq7epoll_t14worker_routineEPv+0x1d)[0xb7ec6a4d]
/usr/local/lib/libzmq.so.0(_ZN3zmq8thread_t14thread_routineEPv+0x57)[0xb7edc2a7]
/lib/i686/cmov/libpthread.so.0[0xb7b514c0]
/lib/i686/cmov/libc.so.6(clone+0x5e)[0xb7d0c61e]
======= Memory map: ========
08048000-0804a000 r-xp 00000000 03:03 11167738   /home/olivier/ZMQ/0mq_prgrm/Debug/segfault/sub
0804a000-0804b000 rw-p 00001000 03:03 11167738   /home/olivier/ZMQ/0mq_prgrm/Debug/segfault/sub
09d2d000-09d5b000 rw-p 09d2d000 00:00 0          [heap]
b7200000-b7272000 rw-p b7200000 00:00 0 
b7272000-b7300000 ---p b7272000 00:00 0 
b7313000-b7314000 ---p b7313000 00:00 0 
b7314000-b7b14000 rw-p b7314000 00:00 0 
b7b14000-b7b1e000 r-xp 00000000 03:02 5406915    /lib/i686/cmov/libnss_files-2.7.so
b7b1e000-b7b20000 rw-p 00009000 03:02 5406915    /lib/i686/cmov/libnss_files-2.7.so
b7b20000-b7b22000 rw-p b7b20000 00:00 0 
b7b22000-b7b4a000 r-xp 00000000 03:02 757918     /usr/lib/libpcre.so.3.12.1
b7b4a000-b7b4b000 rw-p 00027000 03:02 757918     /usr/lib/libpcre.so.3.12.1
b7b4b000-b7b60000 r-xp 00000000 03:02 5406920    /lib/i686/cmov/libpthread-2.7.so
b7b60000-b7b62000 rw-p 00014000 03:02 5406920    /lib/i686/cmov/libpthread-2.7.so
b7b62000-b7b64000 rw-p b7b62000 00:00 0 
b7b64000-b7b67000 r-xp 00000000 03:02 5382204    /lib/libuuid.so.1.2
b7b67000-b7b68000 rw-p 00002000 03:02 5382204    /lib/libuuid.so.1.2
b7b68000-b7b69000 rw-p b7b68000 00:00 0 
b7b69000-b7c1d000 r-xp 00000000 03:02 757924     /usr/lib/libglib-2.0.so.0.1600.6
b7c1d000-b7c1e000 rw-p 000b3000 03:02 757924     /usr/lib/libglib-2.0.so.0.1600.6
b7c1e000-b7c25000 r-xp 00000000 03:02 5406922    /lib/i686/cmov/librt-2.7.so
b7c25000-b7c27000 rw-p 00006000 03:02 5406922    /lib/i686/cmov/librt-2.7.so
b7c27000-b7c2b000 r-xp 00000000 03:02 757927     /usr/lib/libgthread-2.0.so.0.1600.6
b7c2b000-b7c2c000 rw-p 00003000 03:02 757927     /usr/lib/libgthread-2.0.so.0.1600.6
b7c2c000-b7d81000 r-xp 00000000 03:02 5406906    /lib/i686/cmov/libc-2.7.so
b7d81000-b7d82000 r--p 00155000 03:02 5406906    /lib/i686/cmov/libc-2.7.so
b7d82000-b7d84000 rw-p 00156000 03:02 5406906    /lib/i686/cmov/libc-2.7.so
b7d84000-b7d87000 rw-p b7d84000 00:00 0 
b7d87000-b7d93000 r-xp 00000000 03:02 5383802    /lib/libgcc_s.so.1
b7d93000-b7d94000 rw-p 0000b000 03:02 5383802    /lib/libgcc_s.so.1
b7d94000-b7db8000 r-xp 00000000 03:02 5406910    /lib/i686/cmov/libm-2.7.so
b7db8000-b7dba000 rw-p 00023000 03:02 5406910    /lib/i686/cmov/libm-2.7.so
b7dba000-b7dbb000 rw-p b7dba000 00:00 0 
b7dbb000-b7e9e000 r-xp 00000000 03:02 755000     /usr/lib/libstdc++.so.6.0.10
b7e9e000-b7ea1000 r--p 000e2000 03:02 755000     /usr/lib/libstdc++.so.6.0.10
b7ea1000-b7ea3000 rw-p 000e5000 03:02 755000     /usr/lib/libstdc++.so.6.0.10
b7ea3000-b7ea9000 rw-p b7ea3000 00:00 0 
b7ea9000-b7f25000 r-xp 00000000 03:02 765791     /usr/local/lib/libzmq.so.0.0.0
b7f25000-b7f28000 rw-p 0007b000 03:02 765791     /usr/local/lib/libzmq.so.0.0.0
b7f3c000-b7f3f000 rw-p b7f3c000 00:00 0 
b7f3f000-b7f40000 r-xp b7f3f000 00:00 0          [vdso]
b7f40000-b7f5a000 r-xp 00000000 03:02 5383803    /lib/ld-2.7.so
b7f5a000-b7f5c000 rw-p 0001a000 03:02 5383803    /lib/ld-2.7.so
bff46000-bff5b000 rw-p bffeb000 00:00 0          [stack]

Program received signal SIGABRT, Aborted.
[Switching to Thread 0xb7b13b90 (LWP 11795)]
0xb7f3f424 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7f3f424 in __kernel_vsyscall ()
#1  0xb7c57640 in raise () from /lib/i686/cmov/libc.so.6
#2  0xb7c59018 in abort () from /lib/i686/cmov/libc.so.6
#3  0xb7c943dd in ?? () from /lib/i686/cmov/libc.so.6
#4  0x00000011 in ?? ()
#5  0xb7b11804 in ?? ()
#6  0x00000400 in ?? ()
#7  0xb7d6a5c8 in ?? () from /lib/i686/cmov/libc.so.6
#8  0x00000017 in ?? ()
#9  0xbff5a9db in ?? ()
#10 0x0000002e in ?? ()
#11 0xb7d6a5e1 in ?? () from /lib/i686/cmov/libc.so.6
#12 0x00000002 in ?? ()
#13 0xb7d67551 in ?? () from /lib/i686/cmov/libc.so.6
#14 0x0000001b in ?? ()
#15 0xb7d6a5e5 in ?? () from /lib/i686/cmov/libc.so.6
#16 0x00000004 in ?? ()
#17 0xb7b11db3 in ?? ()
#18 0x00000008 in ?? ()
#19 0xb7d6a5eb in ?? () from /lib/i686/cmov/libc.so.6
#20 0x00000005 in ?? ()
#21 0x00000000 in ?? ()


############################
############################


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7b05b90 (LWP 11812)]
0xb7bb2adb in g_slice_alloc () from /usr/lib/libglib-2.0.so.0
(gdb) bt
#0  0xb7bb2adb in g_slice_alloc () from /usr/lib/libglib-2.0.so.0
#1  0xb7b83da3 in g_set_error () from /usr/lib/libglib-2.0.so.0
#2  0xb7ef1966 in pgm_recvmsgv (transport=0x85649e0, msg_start=0x8560da8, msg_len=6, flags=64, 
    _bytes_read=0x8560410, error=0xb7b045d8)
    at ../foreign/openpgm/libpgm-2.0.24/openpgm/pgm/recv.c:908
#3  0xb7ebf59c in zmq::pgm_socket_t::receive (this=0x85603bc, raw_data_=0xb7b04674, 
    tsi_=0xb7b04670) at pgm_socket.cpp:459
#4  0xb7ebdc29 in zmq::pgm_receiver_t::in_event (this=0x8560398) at pgm_receiver.cpp:137
#5  0xb7eb895e in zmq::epoll_t::loop (this=0x855d5e8) at epoll.cpp:197
#6  0xb7eb8a4d in zmq::epoll_t::worker_routine (arg_=0x855d5e8) at epoll.cpp:210
#7  0xb7ece2a7 in zmq::thread_t::thread_routine (arg_=0x855d608) at thread.cpp:99
#8  0xb7b434c0 in start_thread () from /lib/i686/cmov/libpthread.so.0
#9  0xb7cfe61e in clone () from /lib/i686/cmov/libc.so.6




#########################
#########################


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7b87b90 (LWP 11819)]
0xb7f3feae in zmq::pgm_receiver_t::in_event (this=0x8d45398) at pgm_receiver.cpp:213
213	    inout->flush ();
Current language:  auto; currently c++
(gdb) bt
#0  0xb7f3feae in zmq::pgm_receiver_t::in_event (this=0x8d45398) at pgm_receiver.cpp:213
#1  0xb7f3a95e in zmq::epoll_t::loop (this=0x8d425e8) at epoll.cpp:197
#2  0xb7f3aa4d in zmq::epoll_t::worker_routine (arg_=0x8d425e8) at epoll.cpp:210
#3  0xb7f502a7 in zmq::thread_t::thread_routine (arg_=0x8d42608) at thread.cpp:99
#4  0xb7bc54c0 in start_thread () from /lib/i686/cmov/libpthread.so.0
#5  0xb7d8061e in clone () from /lib/i686/cmov/libc.so.6


############################
############################


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7b86b90 (LWP 11861)]
0xb7d16cac in memcpy () from /lib/i686/cmov/libc.so.6
(gdb) bt
#0  0xb7d16cac in memcpy () from /lib/i686/cmov/libc.so.6
#1  0xb7f3ee31 in zmq::pgm_receiver_t::in_event (this=0x8d0d398) at decoder.hpp:118
#2  0xb7f3995e in zmq::epoll_t::loop (this=0x8d0a5e8) at epoll.cpp:197
#3  0xb7f39a4d in zmq::epoll_t::worker_routine (arg_=0x8d0a5e8) at epoll.cpp:210
#4  0xb7f4f2a7 in zmq::thread_t::thread_routine (arg_=0x8d0a608) at thread.cpp:99
#5  0xb7bc44c0 in start_thread () from /lib/i686/cmov/libpthread.so.0
#6  0xb7d7f61e in clone () from /lib/i686/cmov/libc.so.6


############################
############################


Assertion failed: pgm_msgv [pgm_msgv_processed].msgv_len == 1 (pgm_socket.cpp:496)
138419

Program received signal SIGABRT, Aborted.
[Switching to Thread 0xb7b00b90 (LWP 12219)]
0xb7f2c424 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7f2c424 in __kernel_vsyscall ()
#1  0xb7c44640 in raise () from /lib/i686/cmov/libc.so.6
#2  0xb7c46018 in abort () from /lib/i686/cmov/libc.so.6
#3  0xb7eba64c in zmq::pgm_socket_t::receive (this=0x98663bc, raw_data_=0xb7aff674, 
    tsi_=0xb7aff670) at pgm_socket.cpp:461
#4  0xb7eb8c29 in zmq::pgm_receiver_t::in_event (this=0x9866398) at pgm_receiver.cpp:137
#5  0xb7eb395e in zmq::epoll_t::loop (this=0x98635e8) at epoll.cpp:197
#6  0xb7eb3a4d in zmq::epoll_t::worker_routine (arg_=0x98635e8) at epoll.cpp:210
#7  0xb7ec92a7 in zmq::thread_t::thread_routine (arg_=0x9863608) at thread.cpp:99
#8  0xb7b3e4c0 in start_thread () from /lib/i686/cmov/libpthread.so.0
#9  0xb7cf961e in clone () from /lib/i686/cmov/libc.so.6



############################
############################

sender 10000 : ./sub blocks; starting another sender unblocks it
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sender.cpp
Type: text/x-c++src
Size: 587 bytes
Desc: not available
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20100506/75e51e45/attachment.cpp>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sub.cpp
Type: text/x-c++src
Size: 342 bytes
Desc: not available
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20100506/75e51e45/attachment-0001.cpp>

