[zeromq-dev] wedging bug

Andrew Hume andrew at research.att.com
Wed Mar 14 23:19:59 CET 2012

i have a program called portal that takes one input socket and several output sockets.
i have a thread R that receives messages from the input and a thread S that
sends messages out on one of the output sockets. pseudocode is

tmp_in and tmp_out are the input and output ends of a PUSH/PULL inproc socket
with no queue bounds.

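	// thread R: input socket -> inproc
	while(zmq_recv(isock, &msg, 0) == 0){
		// do statistics
		zmq_send(tmp_out, &msg, 0);
	}

	// thread S: inproc -> chosen output socket
	while(zmq_recv(tmp_in, &msg, 0) == 0){
		// do statistics
		// determine which output socket osock
		zmq_send(osock, &msg, 0);
	}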

the input socket is a PUSH/PULL with a bound of about 20000 messages, and maybe
	a hundred or so inputs (PUSHers).
the output sockets are PUSH/PULL with a bound of 5000 messages, each going to a
	single process.
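
to make the role of the bounds concrete, here is a toy model of one bounded output queue. it is only a sketch and not libzmq code: the names bounded_queue, bq_try_send, bq_recv and bq_fill are hypothetical, and the queue just counts messages. it assumes the usual PUSH behaviour, where a send waits once the bound is reached; the model returns -1 at the point where a blocking send would wait.

```c
#include <stddef.h>

/* toy model of one bounded output queue (hypothetical names, not libzmq) */
enum { OUT_BOUND = 5000 };  /* the output bound quoted above */

typedef struct {
    size_t count;  /* messages currently queued */
    size_t bound;  /* maximum number of messages the queue may hold */
} bounded_queue;

/* try to enqueue one message: 0 on success, -1 if the queue is full.
   a blocking send would wait here instead of returning -1. */
static int bq_try_send(bounded_queue *q)
{
    if (q->count >= q->bound)
        return -1;
    q->count++;
    return 0;
}

/* the consumer removes one message: 0 on success, -1 if the queue is empty */
static int bq_recv(bounded_queue *q)
{
    if (q->count == 0)
        return -1;
    q->count--;
    return 0;
}

/* send up to n messages while no consumer is draining;
   returns how many messages were accepted before the queue filled */
static size_t bq_fill(bounded_queue *q, size_t n)
{
    size_t sent = 0;
    while (sent < n && bq_try_send(q) == 0)
        sent++;
    return sent;
}
```

with a queue initialised as { 0, OUT_BOUND }, bq_fill(&q, 6000) accepts 5000 messages and then stops; a further bq_try_send fails until bq_recv has removed at least one message. in this model, a sender that stays stuck means the consumer has stopped draining.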

ordinarily, this works great; the internal inproc socket remains empty (we drain
it as fast as input comes in). under heavy load, about once or twice a day, this setup wedges;
that is, S is blocked on the zmq_send and the destination process is blocked on a

this wedging occurs with both TCP transport and ipc transport.
when it occurs, killing just the receiving process does not fix the problem;
all the receiving processes have to be killed.
this occurs under 2.1.7, and under 2.1.11.
i have several portals, each handling messages of different sizes and contents, on each
server (there are 8 servers). when the portal on one server wedges, the portal of the same
type on all the other servers soon (within 5-10 minutes) will wedge.

	any clues or advice?


Andrew Hume  (best -> Telework) +1 623-551-2845
andrew at research.att.com  (Work) +1 973-236-2014
AT&T Labs - Research; member of USENIX and LOPSA
