[zeromq-dev] Closed dealer socket may cause context termination to hang

Pieter Hintjens ph at imatix.com
Thu Nov 6 08:56:12 CET 2014


You have to set linger=0 on the socket before closing it.

There are two old design mistakes in libzmq which cause this. One is
the default linger of infinity. I'd have changed that ages ago except
it would break compatibility. The other is libzmq's shutdown logic,
which can't handle even the simple case you showed. It's worse... try
just creating the socket and connecting, then terminating, and it blocks.

The good news is you can trivially fix this. Set linger=0 on all
sockets before destroying them, and destroy all sockets before
terminating. Or use a binding like CZMQ which does this for you.
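
As a minimal sketch of that fix in pyzmq, reusing the snippet from the
original report: the function name and the default endpoint are
placeholders for illustration, and nothing is assumed to be listening
on the port. The only real change is setting zmq.LINGER to 0 before
closing the socket.

```python
import zmq


def send_and_shutdown(endpoint="tcp://127.0.0.1:66"):
    """Queue a message, then shut down without waiting for delivery.

    Hypothetical helper for illustration; the endpoint is a placeholder
    and no peer is assumed to be listening there.
    """
    ctx = zmq.Context()
    sock = ctx.socket(zmq.DEALER)
    sock.connect(endpoint)

    # The message is queued locally even though no TCP connection exists.
    sock.send_multipart([b"foo", b"bar", b"meh"])

    # Drop any unsent messages on close instead of waiting forever
    # (the libzmq default linger is infinite).
    sock.setsockopt(zmq.LINGER, 0)
    sock.close()

    # Every socket is closed with linger=0, so this should return promptly.
    ctx.term()
    return sock.closed, ctx.closed


if __name__ == "__main__":
    print(send_and_shutdown())
```

pyzmq also accepts the linger value directly, as in sock.close(linger=0),
and ctx.destroy(linger=0) closes every socket created from the context
before terminating it.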

There is IME no value in "waiting for messages to be sent" in normal
operation. That happens only in apps which fire up, send a message,
then exit. Rare. Normal ZeroMQ apps run forever and message loss at
the end of a connection is always dealt with at a higher level.

-Pieter

On Thu, Nov 6, 2014 at 1:13 AM, André Caron <andre.l.caron at gmail.com> wrote:
> Hi there,
>
> I've been bitten by this nasty little bugger and I thought I'd share.
>
> In this little snippet, terminating the context hangs unless the connection
> was established, even though we're closing the socket properly before
> terminating the context.
>
>     import zmq
>     c = zmq.Context()
>     s = c.socket(zmq.DEALER)
>     s.connect('tcp://127.0.0.1:66')
>     s.send_multipart([b'foo', b'bar', b'meh'])
>     s.close()
>     c.term()  # hangs (until someone starts listening on port 66).
>
> It happens that if you set a finite linger on the socket before closing
> it, the program shuts down nicely.
>
> Took me a while to figure out why, but it seems this is a symptom of that
> little subtlety between bind and connect in ZeroMQ.  I remember reading
> about this somewhere in the guide, but I probably thought I'd come back to
> this concept and never did.  So apparently, the dealer socket immediately
> creates a queue for messages when you call "zmq_connect()", even if the TCP
> connection is never established, and somehow this queue never gets flushed
> if you don't establish a connection.
>
> I'm really surprised by this behavior.  I would think that closing the
> dealer socket would stop the reconnection attempts, and then the socket
> would drop the queue and allow the context to terminate (as when I set the
> linger on the socket).
>
> I know the queue creation on "zmq_connect()" is by design, but is the
> attempt to connect to the peer after the dealer socket is closed desired as
> well?
>
> If it is, I'd like to know if there is a place to document these kinds of
> gotchas.  If so, I'd probably send in a documentation patch for this
> particular one.
>
> Cheers,
>
> André