[zeromq-dev] Another test_shutdown_stress test
ntupel at googlemail.com
Thu Nov 4 14:41:12 CET 2010
Hi all,
Because I sometimes get segmentation faults on shutdown when using 0MQ
with jzmq [1], I was keen to try out test_shutdown_stress. To cause even
more stress, I changed THREAD_COUNT from 100 to 400. Two problems
showed up:
1. The program sometimes seems to deadlock. In GDB it looks like this:
...
[New Thread 0x7fff456fa710 (LWP 20687)]
[Thread 0x7fff45efb710 (LWP 20686) exited]
[New Thread 0x7fff44ef9710 (LWP 20688)]
[Thread 0x7fff456fa710 (LWP 20687) exited]
[Thread 0x7fff44ef9710 (LWP 20688) exited]
[New Thread 0x7fff446f8710 (LWP 20690)]
[New Thread 0x7fff43ef7710 (LWP 20691)]
^C
Program received signal SIGINT, Interrupt.
0x00007ffff74c7e54 in __lll_lock_wait () from /lib/libpthread.so.0
(gdb) bt
#0 0x00007ffff74c7e54 in __lll_lock_wait () from /lib/libpthread.so.0
#1 0x00007ffff74c3344 in _L_lock_511 () from /lib/libpthread.so.0
#2 0x00007ffff74c315a in pthread_mutex_lock () from /lib/libpthread.so.0
#3 0x00007ffff7b3777a in zmq::mutex_t::lock (this=0x7fffec000928) at
mutex.hpp:95
#4 0x00007ffff7b36c1e in zmq::ctx_t::create_socket
(this=0x7fffec0008b0, type_=2) at ctx.cpp:165
#5 0x00007ffff7b632c2 in zmq_socket (ctx_=0x7fffec0008b0, type_=2) at
zmq.cpp:265
#6 0x0000000000400a90 in main (argc=1, argv=0x7fffffffdf28) at
test_shutdown_stress.cpp:64
2. The program blocks on send calls. Again in GDB:
...
[Thread 0x7fff4a6ec710 (LWP 10327) exited]
[New Thread 0x7fff496ea710 (LWP 10329)]
[Thread 0x7fff49eeb710 (LWP 10328) exited]
[New Thread 0x7fff48ee9710 (LWP 10330)]
[New Thread 0x7fff486e8710 (LWP 10331)]
[New Thread 0x7fff47ee7710 (LWP 10332)]
[New Thread 0x7fff476e6710 (LWP 10333)]
[New Thread 0x7fff46ee5710 (LWP 10334)]
[New Thread 0x7fff466e4710 (LWP 10335)]
[New Thread 0x7fff45ee3710 (LWP 10336)]
[New Thread 0x7fff456e2710 (LWP 10337)]
[New Thread 0x7fff44ee1710 (LWP 10338)]
[New Thread 0x7fff446e0710 (LWP 10339)]
[New Thread 0x7fff43edf710 (LWP 10340)]
[New Thread 0x7fff436de710 (LWP 10341)]
[New Thread 0x7fff42edd710 (LWP 10342)]
[New Thread 0x7fff426dc710 (LWP 10343)]
[New Thread 0x7fff41edb710 (LWP 10344)]
[New Thread 0x7fff416da710 (LWP 10345)]
...
[New Thread 0x7fff2369e710 (LWP 10405)]
[New Thread 0x7fff22e9d710 (LWP 10406)]
[New Thread 0x7fff2269c710 (LWP 10407)]
^C
Program received signal SIGINT, Interrupt.
0x00007ffff74c1e55 in pthread_join () from /lib/libpthread.so.0
(gdb) bt
#0 0x00007ffff74c1e55 in pthread_join () from /lib/libpthread.so.0
#1 0x0000000000400b3d in main (argc=1, argv=0x7fffffffdf28) at
test_shutdown_stress.cpp:71
(gdb) info threads
408 Thread 0x7fff2269c710 (LWP 10407) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
407 Thread 0x7fff22e9d710 (LWP 10406) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
406 Thread 0x7fff2369e710 (LWP 10405) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
405 Thread 0x7fff23e9f710 (LWP 10404) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
404 Thread 0x7fff246a0710 (LWP 10403) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
403 Thread 0x7fff24ea1710 (LWP 10402) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
402 Thread 0x7fff256a2710 (LWP 10401) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
401 Thread 0x7fff25ea3710 (LWP 10400) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
400 Thread 0x7fff266a4710 (LWP 10399) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
399 Thread 0x7fff26ea5710 (LWP 10398) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
398 Thread 0x7fff276a6710 (LWP 10397) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
397 Thread 0x7fff27ea7710 (LWP 10396) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
396 Thread 0x7fff286a8710 (LWP 10395) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
...
332 Thread 0x7fff486e8710 (LWP 10331) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
331 Thread 0x7fff48ee9710 (LWP 10330) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
330 Thread 0x7fff496ea710 (LWP 10329) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
8 Thread 0x7ffff37a7710 (LWP 10000) 0x00007ffff6a89ff3 in
epoll_wait () from /lib/libc.so.6
7 Thread 0x7ffff3fa8710 (LWP 9999) 0x00007ffff6a89ff3 in epoll_wait
() from /lib/libc.so.6
6 Thread 0x7ffff47a9710 (LWP 9998) 0x00007ffff6a89ff3 in epoll_wait
() from /lib/libc.so.6
5 Thread 0x7ffff4faa710 (LWP 9997) 0x00007ffff6a89ff3 in epoll_wait
() from /lib/libc.so.6
4 Thread 0x7ffff57ab710 (LWP 9996) 0x00007ffff6a89ff3 in epoll_wait
() from /lib/libc.so.6
3 Thread 0x7ffff5fac710 (LWP 9995) 0x00007ffff74c87ec in send ()
from /lib/libpthread.so.0
2 Thread 0x7ffff67ad710 (LWP 9994) 0x00007ffff6a89ff3 in epoll_wait
() from /lib/libc.so.6
* 1 Thread 0x7ffff7fcb760 (LWP 9991) 0x00007ffff74c1e55 in
pthread_join () from /lib/libpthread.so.0
(gdb) thread 338
[Switching to thread 338 (Thread 0x7fff456e2710 (LWP 10337))]#0
0x00007ffff74c87ec in send () from /lib/libpthread.so.0
(gdb) bt
#0 0x00007ffff74c87ec in send () from /lib/libpthread.so.0
#1 0x00007ffff7b578af in zmq::signaler_t::send (this=0x603f50,
cmd_=...) at signaler.cpp:290
#2 0x00007ffff7b36e08 in zmq::ctx_t::send_command (this=0x602c50,
slot_=1, command_=...) at ctx.cpp:221
#3 0x00007ffff7b449b1 in zmq::object_t::send_command
(this=0x7fffec05cfc0, cmd_=...) at object.cpp:407
#4 0x00007ffff7b44103 in zmq::object_t::send_plug
(this=0x7fffec05cfc0, destination_=0x7a6b10, inc_seqnum_=true) at
object.cpp:179
#5 0x00007ffff7b45d37 in zmq::own_t::launch_child
(this=0x7fffec05cfc0, object_=0x7a6b10) at own.cpp:76
#6 0x00007ffff7b5935b in zmq::socket_base_t::connect
(this=0x7fffec05cfc0, addr_=0x400cf0 "tcp://127.0.0.1:5555") at
socket_base.cpp:421
#7 0x00007ffff7b63418 in zmq_connect (s_=0x7fffec05cfc0,
addr_=0x400cf0 "tcp://127.0.0.1:5555") at zmq.cpp:314
#8 0x0000000000400961 in worker (s=0x7fffec05cfc0) at
test_shutdown_stress.cpp:31
#9 0x00007ffff74c0cb0 in start_thread () from /lib/libpthread.so.0
#10 0x00007ffff6a899fd in clone () from /lib/libc.so.6
#11 0x0000000000000000 in ?? ()
(gdb) thread 6
[Switching to thread 6 (Thread 0x7ffff47a9710 (LWP 9998))]#0
0x00007ffff6a89ff3 in epoll_wait () from /lib/libc.so.6
(gdb) bt
#0 0x00007ffff6a89ff3 in epoll_wait () from /lib/libc.so.6
#1 0x00007ffff7b3d06d in zmq::epoll_t::loop (this=0x6045f0) at epoll.cpp:141
#2 0x00007ffff7b3d2ea in zmq::epoll_t::worker_routine (arg_=0x6045f0)
at epoll.cpp:173
#3 0x00007ffff7b5e0e2 in zmq::thread_t::thread_routine
(arg_=0x604660) at thread.cpp:79
#4 0x00007ffff74c0cb0 in start_thread () from /lib/libpthread.so.0
#5 0x00007ffff6a899fd in clone () from /lib/libc.so.6
#6 0x0000000000000000 in ?? ()
(gdb) thread 400
[Switching to thread 400 (Thread 0x7fff266a4710 (LWP 10399))]#0
0x00007ffff74c87ec in send () from /lib/libpthread.so.0
(gdb) bt
#0 0x00007ffff74c87ec in send () from /lib/libpthread.so.0
#1 0x00007ffff7b578af in zmq::signaler_t::send (this=0x603f50,
cmd_=...) at signaler.cpp:290
#2 0x00007ffff7b36e08 in zmq::ctx_t::send_command (this=0x602c50,
slot_=1, command_=...) at ctx.cpp:221
#3 0x00007ffff7b449b1 in zmq::object_t::send_command
(this=0x7fffec069770, cmd_=...) at object.cpp:407
#4 0x00007ffff7b44103 in zmq::object_t::send_plug
(this=0x7fffec069770, destination_=0x84f410, inc_seqnum_=true) at
object.cpp:179
#5 0x00007ffff7b45d37 in zmq::own_t::launch_child
(this=0x7fffec069770, object_=0x84f410) at own.cpp:76
#6 0x00007ffff7b5935b in zmq::socket_base_t::connect
(this=0x7fffec069770, addr_=0x400cf0 "tcp://127.0.0.1:5555") at
socket_base.cpp:421
#7 0x00007ffff7b63418 in zmq_connect (s_=0x7fffec069770,
addr_=0x400cf0 "tcp://127.0.0.1:5555") at zmq.cpp:314
#8 0x0000000000400961 in worker (s=0x7fffec069770) at
test_shutdown_stress.cpp:31
#9 0x00007ffff74c0cb0 in start_thread () from /lib/libpthread.so.0
#10 0x00007ffff6a899fd in clone () from /lib/libc.so.6
#11 0x0000000000000000 in ?? ()
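One possible reading of the worker backtraces, offered as a guess and not
a confirmed diagnosis: every worker is stuck in a blocking send () inside
zmq::signaler_t::send, which is what you would see if the signaler wrote
commands into a socketpair-style channel whose buffer had filled up
because the reader was not draining it. The traces do not show what the
signaler is actually built on, so the socketpair is an assumption. The
sketch below uses plain POSIX only, no 0MQ; the helper name
count_records_until_full and the record size are hypothetical. It counts
how many small records fit into one end of a socketpair before a send
would have to block:

```cpp
// Sketch only, not 0MQ code: a socketpair has finite buffer space, so a
// blocking send () on it stalls once the peer stops reading.
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>

// Hypothetical helper: counts successful send () calls of up to
// record_size_ bytes (capped at 64) on one end of a socketpair while
// nobody reads the other end. Returns the count at the point where a
// blocking send () would have stalled, or -1 on any other error.
long count_records_until_full (std::size_t record_size_)
{
    int sv [2];
    if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    char record [64] = {0};
    if (record_size_ == 0 || record_size_ > sizeof record)
        record_size_ = sizeof record;

    long sent = 0;
    long result = -1;
    while (true) {
        // MSG_DONTWAIT makes this one call non-blocking, so a full
        // buffer is reported as EAGAIN instead of hanging the caller.
        ssize_t rc = send (sv [0], record, record_size_, MSG_DONTWAIT);
        if (rc >= 0) {
            ++sent;
            continue;
        }
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            result = sent;  // buffer is full; a blocking send would hang here
        break;
    }

    close (sv [0]);
    close (sv [1]);
    return result;
}
```

Calling count_records_until_full (32) returns a finite, system-dependent
number. If something similar applies here, then with THREAD_COUNT at 400
many workers could be queueing commands to the same slot (slot_=1 in the
traces above) faster than they are consumed.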
Any idea how to fix this?
-nt
------
[1] Here is the relevant part of a JVM dump:
Stack: [0x00007fac72b13000,0x00007fac72c14000],
sp=0x00007fac72c11628, free space=3f90000000000000018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libzmq.so.0+0x2a620] _ZN3zmq5own_t9terminateEv+0x0
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j org.zeromq.ZMQ$Context.finalize()V+0
j org.zeromq.ZMQ$Context.term()V+1
...