[zeromq-dev] wedging bug

William Brown william.brown at ericsson.com
Fri Mar 16 13:44:16 CET 2012


	Are you sure your output (S) threads are operating on independent output
	sockets? If you're sharing output sockets across multiple threads of
	control, you will experience exactly the behavior you're seeing now.
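
One cheap way to check for accidental sharing is an ownership guard around each output socket. The sketch below is only an illustration: struct owned_sock and the owned_sock_* helpers are hypothetical names, not part of libzmq, and the socket is treated as an opaque void * so the snippet builds with pthreads alone. It records the creating thread next to the socket and lets the caller verify, before each send, that it is still on that thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical wrapper: pairs a socket handle with the thread that owns it. */
struct owned_sock {
    void *sock;       /* opaque handle, e.g. what zmq_socket() returned */
    pthread_t owner;  /* thread that created the socket */
};

/* Record the calling thread as owner. Call this from the thread that
 * creates the socket. */
void owned_sock_init(struct owned_sock *os, void *sock)
{
    assert(os != NULL);
    os->sock = sock;
    os->owner = pthread_self();
}

/* Returns 1 if the calling thread is the owner, 0 otherwise. */
int owned_sock_is_mine(const struct owned_sock *os)
{
    return pthread_equal(os->owner, pthread_self()) != 0;
}

/* Helper state for the cross-thread probe below. */
struct probe {
    const struct owned_sock *os;
    int result;
};

void *probe_main(void *arg)
{
    struct probe *p = arg;
    p->result = owned_sock_is_mine(p->os);
    return NULL;
}

/* Runs the ownership check on a freshly created thread and returns its
 * result (expected 0 for a socket owned by the caller), or -1 if the
 * thread could not be started. */
int owned_sock_check_from_new_thread(const struct owned_sock *os)
{
    struct probe p = { os, -1 };
    pthread_t t;

    if (pthread_create(&t, NULL, probe_main, &p) != 0)
        return -1;
    pthread_join(t, NULL);
    return p.result;
}
```

In the S loop, the idea would be to call owned_sock_is_mine() just before each zmq_send and log or abort when it returns 0.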


-----Original Message-----
From: zeromq-dev-bounces at lists.zeromq.org [mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Jon Dyte
Sent: Thursday, March 15, 2012 4:11 PM
To: ZeroMQ development list
Subject: Re: [zeromq-dev] wedging bug

Hi Andrew

Just reading this and trying to make sense of what you are describing.

Each S thread has its own set of output sockets, yes?

And each one of these sockets is connected to an external process over either tcp or ipc?

Could you create a simple example which just replicates a few 'S'
threads spinning very fast, just pushing messages out over the various output sockets to these external processes?


On 14/03/12 22:19, Andrew Hume wrote:
> i have a program called portal that takes a socket as input and 
> several output sockets.
> i have a thread R that receives messages from the input and a thread S 
> that sends messages out on one of the output sockets. pseudocode is 
> below; tmp_in and tmp_out are the input and output ends of a PUSH/PULL 
> inproc socket with no queue bounds.
> R:
> while(zmq_recv(isock, &msg)){
>     // do statistics
>     zmq_send(tmp_out, &msg)
> }
> S:
> while(zmq_recv(tmp_in, &msg)){
>     // do statistics
>     // determine which output socket osock
>     zmq_send(osock, &msg)
> }
> the input socket is a PUSH/PULL with a bound of about 20000 messages, 
> and maybe a hundred or so inputs (PUSHers).
> the output sockets are PUSH/PULL with a bound of 5000 messages, each 
> going to a single process.
> ordinarily, this works great; the internal inproc socket remains empty 
> (we drain it as fast as input comes in). under heavy load, about once 
> or twice a day, this setup wedges; that is, S is blocked on the 
> zmq_send and the destination process is blocked on a zmq_recv.
> this wedging occurs with both TCP transport and ipc transport.
> when it occurs, killing just the receiving process does not fix the 
> problem; all the receiving processes have to be killed.
> this occurs under 2.1.7, and under 2.1.11.
> i have several portals, each handling messages of different sizes and 
> contents, on each server (there are 8 servers). when the portal on one 
> server wedges, the portal of the same type on all the other servers 
> soon (within 5-10 minutes) will wedge.
> any clues or advice?
> andrew
> ------------------
> Andrew Hume (best -> Telework) +1 623-551-2845
> andrew at research.att.com (Work) +1 973-236-2014
> AT&T Labs - Research; member of USENIX and LOPSA
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
