[zeromq-dev] ZMQ reconnect/ephemeral ports

Bill Torpey wallstprog at gmail.com
Fri Sep 1 22:59:35 CEST 2017

I'm curious about how ZMQ handles re-connection.  I understand that re-connection is supposed to happen "automagically" under the covers, but that poses an interesting question.

To make a long story short, the application I'm working on uses pub/sub sockets over TCP, and works as follows:

At startup:
1.  connects to a proxy/broker at a well-known address, using a pub/sub socket pair ("discovery");
2.  subscribes to a well-known topic using the "discovery" sub socket;
3.  binds a different pub/sub socket pair ("data") and retrieves the actual endpoints assigned;
4.  publishes the "data" endpoints from step 3 on the "discovery" pub socket.
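
To make step 3 concrete, here is a minimal sketch of binding to an OS-assigned port and reading back the endpoint that was actually assigned. It uses plain TCP sockets from the Python standard library rather than ZMQ, so it only illustrates the mechanism; the helper name `bind_ephemeral` and the loopback address are assumptions for illustration, not part of the application.

```python
import socket


def bind_ephemeral(host="127.0.0.1"):
    """Bind a listening TCP socket to an OS-assigned port.

    Returns the socket and the endpoint string that would be published.
    (Hypothetical helper; plain TCP stands in for the ZMQ "data" socket.)
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the OS to choose a free ephemeral port.
    listener.bind((host, 0))
    listener.listen()
    # getsockname() reports the port that was actually assigned.
    assigned_host, assigned_port = listener.getsockname()
    endpoint = f"tcp://{assigned_host}:{assigned_port}"
    return listener, endpoint


if __name__ == "__main__":
    listener, endpoint = bind_ephemeral()
    print(endpoint)  # the string step 4 would publish on "discovery"
    listener.close()
```

With libzmq itself, the analogous step would be binding to a wildcard port (for example `tcp://*:*`) and reading the `ZMQ_LAST_ENDPOINT` socket option to learn the assigned endpoint.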

When the application receives a message on the "discovery" sub socket, it connects the "data" socket pair to the endpoints specified in the "discovery" message.

So far, this seems to be working relatively well, and allows the high-volume, low-latency "data" messages to be sent/received directly between peers, avoiding the extra hop caused by a proxy/broker connection.  The discovery messages use the proxy/broker, but since these are (very) low-volume the extra hop doesn't matter.  The use of the proxy also eliminates the "slow joiner" problem that can happen with other configurations.

My question is what happens when one of the "data" peer sockets disconnects.  Since ZMQ (apparently) keeps trying to reconnect, what would prevent another process from binding to the same ephemeral port?  

- Can I assume that if the new application at that port is not a ZMQ application, the reconnect will (silently) fail and continue to be retried?

- What if the new application at that port *IS* a ZMQ application?  Would the reconnect succeed?  And if so, what would happen if it's a *DIFFERENT* ZMQ application, and the messages that it's sending/receiving don't match what the original application expects?
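
The underlying concern can be reproduced at the TCP level. The sketch below, again using only standard-library sockets and not ZMQ, shows that once a listener releases its port, an unrelated listener can bind the same port and a client retrying the old endpoint will reach it. The function names are hypothetical, and the sketch says nothing about what ZMQ does after the TCP connection is established; it only shows that nothing at the socket layer reserves the port for the original application.

```python
import socket


def listen_on(port, host="127.0.0.1"):
    """Open a listening TCP socket on the given port (0 = OS-assigned)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEADDR lets the port be rebound promptly after the old owner exits.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


def port_can_be_reused():
    """Return True if a second listener can take over a released port."""
    original = listen_on(0)
    port = original.getsockname()[1]
    original.close()  # the original "data" peer goes away

    try:
        replacement = listen_on(port)  # an unrelated process could do this
    except OSError:
        return False
    # A client retrying the old endpoint now reaches the replacement.
    client = socket.create_connection(("127.0.0.1", port), timeout=2)
    client.close()
    replacement.close()
    return True


if __name__ == "__main__":
    print(port_can_be_reused())
```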

It's reasonable for the application to publish a disconnect message when it terminates normally, so that the connected peers can disconnect that endpoint.  But applications don't always terminate normally ;-)

Any guidance, hints or tips would be much appreciated -- thanks in advance!
