[zeromq-dev] SUB connect socket causes zmq_term to block

Justin Karneges justin at affinix.com
Tue Feb 3 08:27:41 CET 2015


Hi,

I've noticed that termination hangs after the following sequence: create
a SUB socket, call setsockopt(...ZMQ_SUBSCRIBE...), connect to a peer
that doesn't exist (I'm using "ipc://foo"), call zmq_close() on the
socket, and then call zmq_term().

There is slightly more to it that I cannot figure out yet, though. I
can't reproduce the bug in a small test program that performs exactly
these steps. However, I can reproduce it 100% of the time in my larger
event-driven application, which uses ZMQ_FD and ZMQ_EVENTS with an event
loop (Qt). I'm reporting it in case anyone has an idea about how this
can happen. I've tried strace to see what differs between my app and a
small test case, but nothing stands out.

Fortunately, it is possible to work around this by setting ZMQ_LINGER on
the socket. However, I consider it a bug: SUB technically has no write
queue, at least from the application's perspective, so it should not
block shutdown.

My theory is that, since SUB sends the subscription to the publisher
under the hood, the hang comes from the combination of a pending
subscription and a connecting socket (which causes the queue to be
created even in the absence of a peer). The zmq engine then considers
the socket to have a pending write and applies its blocking behavior on
shutdown, even though the application didn't actually write anything.

If I don't subscribe, or if I bind instead of connect, or if I use a
non-SUB socket, then the problem doesn't occur. Further, if I create a
second application that binds to "ipc://foo" and start it while the
original is blocking on zmq_term, then the original finishes and exits.
So it's clearly waiting on a connect.

Happy to investigate this further if anyone can direct me.

Justin


