[zeromq-dev] hangs when opening/closing sockets "frequently" [2.0.7, OS/X] [re-post]

Matt Weinstein mattweinstein at gmail.com
Thu Jul 8 23:01:44 CEST 2010


[Moderator - Please kill my prior message, I joined the list ;-) ]

Folks,

> I have a client and server using REQ and REP, running through a
> ZMQ_QUEUE device.
>
> REQ -- [TCP localhost] - XREP - ZMQ_QUEUE - XREQ - [INPROC] - REP
>
> To handle timeouts, the client is closing its socket, and opening a  
> new socket, whenever it sees a null packet coming from the server:
>
>
> 		// check to see if we're a special case
> 		if (reply.size() == 0) {
> 			delete psocket;
> 			psocket = new zmq::socket_t(*pctx, ZMQ_REQ);
> 			assert(psocket != NULL);
> 			psocket->connect(client_connect);
> 		}
>
> I have a server sending close replies every 10th message.
>
> After a few hundred cycles, things hang, see below.
>
> I've pulled the latest 2.0.7 from git, as I needed the fix for bug 38
> (Assertion failed: fetched (xrep.cpp:196)), which had been biting me.
>
> Any thoughts?
>

I played around a bit, and the problem goes away if I insert a
usleep() in either of two places (marked "helps here" below).  My
feeling is that there may be a race condition related to tearing down
the actual TCP socket, or a timing problem allocating and deallocating
a ypipe.  I tried using an OSMemoryBarrier (OS/X), but that didn't
help.  I haven't tried different usleep() values:

		if (reply.size() == 0) {
//			usleep(10000); -- does not help
			delete psocket;
//			usleep(10000); //-- helps here
			psocket = new zmq::socket_t(*pctx, ZMQ_REQ);
			assert(psocket != NULL);
			usleep(10000); //-- helps here
			psocket->connect(client_connect);
		}


The problem is easily reproducible on OS/X.
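
A minimal sketch of the close/reopen sequence above, with the pause made
an explicit parameter so different delay values are easy to try.  The
names reopen and FakeSocket are hypothetical; FakeSocket is only a
stand-in for zmq::socket_t so the sketch builds without ZeroMQ, and it
just records the order of events.  This illustrates where the delay
sits; it is not a fix for the underlying hang.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for zmq::socket_t (an assumption for illustration).
// It only appends lifecycle events to a caller-owned log.
struct FakeSocket {
    explicit FakeSocket(std::vector<std::string>& log) : log_(log) {
        log_.push_back("open");
    }
    ~FakeSocket() { log_.push_back("close"); }
    void connect(const char* endpoint) {
        log_.push_back(std::string("connect ") + endpoint);
    }
    std::vector<std::string>& log_;
};

// Replace the socket held in `sock`, in the same order as the snippet above:
// destroy the old socket, create a new one, pause, then connect.
// `make` is any callable returning std::unique_ptr<Socket>.
template <typename Socket, typename Factory>
void reopen(std::unique_ptr<Socket>& sock, Factory make,
            const std::string& endpoint,
            std::chrono::microseconds pause) {
    sock.reset();                        // same effect as `delete psocket`
    sock = make();                       // same role as `new zmq::socket_t(...)`
    assert(sock);                        // the factory must return a socket
    std::this_thread::sleep_for(pause);  // the position where usleep() helped
    sock->connect(endpoint.c_str());
}
```

With the real library, the factory would presumably wrap
new zmq::socket_t(*pctx, ZMQ_REQ) in a std::unique_ptr, and the pause
would be std::chrono::microseconds(10000) to match the usleep(10000)
above.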

> Code is available.  Environment: OS/X Leopard.
>
>
> Thanks,
>
> Best,
>
> Matt
>
> client recv: Xthread# 0x10040a000 request# 297
> client send: thread# 0x10040a000 request# 298
> server recv: thread# 0x10040a000 request# 298
> server send thread# 0x10040a000 request# 298
> server send complete
> client recv: Xthread# 0x10040a000 request# 298
> client send: thread# 0x10040a000 request# 299
> server recv: thread# 0x10040a000 request# 299
> server send thread# 0x10040a000 request# 299
> server send complete
> client recv: Xthread# 0x10040a000 request# 299
> client send: thread# 0x10040a000 request# 300
> server recv: thread# 0x10040a000 request# 300
> server send null for thread# 0x10040a000 request# 300
> client recv:
> client send: thread# 0x10040a000 request# 301
> server recv: thread# 0x10040a000 request# 301
> server send thread# 0x10040a000 request# 301
> server send complete
>
> --- I expected to see this, it never showed up:
> client recv: Xthread# 0x10040a000 request# 301
>
