[zeromq-dev] [BUG] forever reconnect even after deleting the socket

Thijs Terlouw thijsterlouw at gmail.com
Wed Jan 19 06:09:57 CET 2011


I see a huge flood of reconnect attempts from the connecting zeromq
side when I'm shutting down the zeromq socket that did the bind.

It basically looks like this:
xxx.xxx.xxx.xxx.17050 > yyy.yyy.yyy.yyy.5555: S
xxx.xxx.xxx.xxx.17052 > yyy.yyy.yyy.yyy.5555: S
xxx.xxx.xxx.xxx.17057 > yyy.yyy.yyy.yyy.5555: S

where yyy.yyy.yyy.yyy.5555 is the "server" that is now shut down. At
first I didn't understand why there were so many reconnects, but it
appears this is the ZMQ_RECONNECT_IVL default value of 100ms in action
(indeed, there is ~100ms between each reconnect attempt).

These reconnects go on non-stop until my application gets a
notification that the target server is no longer available, destroys
the socket, creates a new zeromq socket and connects to the remaining
'up' servers. Since the other servers are all 'up', the reconnect flood
stops.

So far so good, though I think the reconnect interval should be
augmented by exponential back-off, perhaps by splitting it into two
options: ZMQ_RECONNECT_IVL (the max) and ZMQ_RECONNECT_IVL_MIN. A patch
for this should be easy.
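
To make the back-off idea concrete, here is a rough sketch of how the
delay could grow between attempts. This is not ZeroMQ code and not a
patch: next_reconnect_ivl and reconnect_schedule are made-up helper
names, ZMQ_RECONNECT_IVL_MIN is only the option proposed above, and the
3000ms maximum in the usage note is an arbitrary value picked for
illustration. It assumes min_ms is positive and min_ms <= max_ms.

```cpp
#include <vector>

// Hypothetical helper, not part of the ZeroMQ API.
// Returns the delay to wait before the next reconnect attempt:
// starts at min_ms, doubles after every failed attempt, never exceeds max_ms.
// Assumes 0 < min_ms <= max_ms.
int next_reconnect_ivl(int prev_ms, int min_ms, int max_ms)
{
    // First attempt (or a stale value below the minimum): start at min_ms.
    if (prev_ms < min_ms)
        return min_ms;
    // Doubling would pass the cap (this check also avoids integer overflow).
    if (prev_ms > max_ms / 2)
        return max_ms;
    return prev_ms * 2;
}

// Hypothetical helper: the delays used for the first `attempts` attempts.
std::vector<int> reconnect_schedule(int attempts, int min_ms, int max_ms)
{
    std::vector<int> delays;
    int delay = 0;  // 0 means "no attempt made yet"
    for (int i = 0; i < attempts; ++i) {
        delay = next_reconnect_ivl(delay, min_ms, max_ms);
        delays.push_back(delay);
    }
    return delays;
}
```

With a minimum of 100ms and a maximum of 3000ms, reconnect_schedule(7,
100, 3000) gives 100, 200, 400, 800, 1600, 3000, 3000, so a server that
stays down is probed far less often than once every 100ms.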

The BUG:

When I upgrade from 2.0.10 to the latest github version
(zeromq-zeromq2-v2.1.0-35-ga249d15.tar.gz), the flood doesn't stop
when I destroy the socket in my application. It keeps flooding forever,
which also leads to problems using the socket (timeouts etc.). It seems
the socket is not cleaned up correctly internally? I haven't had time
yet to look into the problem in more detail.


Extra descriptions:
The socket appears to be still connected to the two servers. So about
half of the requests succeed, while the other half (those sent to the
down server) time out. With tcpdump I can see non-stop (>100ms)
reconnect attempts to the down server. I'm using XREQ from the client.

I use the C++ interface and create a socket on the heap like this:
m_tcp_sock = new socket_t([zmqcontext], ZMQ_XREQ);
and delete it with a simple:
delete m_tcp_sock;

The only difference between the two versions of my application is
which ZeroMQ version I use.

Thijs


