[zeromq-dev] Help on the PUB/SUB socket reconnection.
Doron Somech
somdoron at gmail.com
Thu May 16 17:20:18 CEST 2013
There are a few solutions to the problem:
1. As mentioned above, you can enable TCP keepalive.
2. Make the publisher connect to the subscriber and make the subscriber
bind. If the connection is dead, the publisher will recognize that and
reconnect.
3. Use XSUB and XPUB and send keep-alive messages from the subscriber
to the publisher; on the publisher, just ignore the messages. If the
connection is dead, the subscriber will recognize that when trying to send
the message, and it will reconnect.
4. From the publisher, send keep-alive messages every X seconds. On the
subscriber, if you don't get a message within Y seconds (usually 2*X),
close the existing socket and reconnect.
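For the first option, a minimal sketch in Python, assuming pyzmq is installed. The function names and the default values (30 s idle, 5 s interval, 3 probes) are illustrative assumptions, not recommendations from this thread. The keepalive options are set before connect() so that they apply to the new connection. Once the kernel declares the peer dead, the expectation is that ZeroMQ drops the connection and reconnects on its own.

```python
def worst_case_detection_seconds(idle_s, interval_s, probes):
    """Rough upper bound on how long a silent peer goes unnoticed.

    The kernel waits idle_s before the first probe, then sends up to
    `probes` probes spaced interval_s apart before giving up.
    """
    return idle_s + interval_s * probes


def connect_sub_with_keepalive(endpoint, idle_s=30, interval_s=5, probes=3):
    """Create a SUB socket whose TCP connection uses keepalive probes.

    Hypothetical helper; the defaults are assumptions chosen for illustration.
    """
    import zmq  # pyzmq; imported here so the helper above has no dependency

    sock = zmq.Context.instance().socket(zmq.SUB)
    # Set the options before connect() so they apply to the new connection.
    sock.setsockopt(zmq.TCP_KEEPALIVE, 1)
    sock.setsockopt(zmq.TCP_KEEPALIVE_IDLE, idle_s)
    sock.setsockopt(zmq.TCP_KEEPALIVE_INTVL, interval_s)
    sock.setsockopt(zmq.TCP_KEEPALIVE_CNT, probes)
    sock.setsockopt(zmq.SUBSCRIBE, b"")  # subscribe to every message
    sock.connect(endpoint)
    return sock
```

With these assumed defaults, a publisher that vanishes without sending a FIN would be noticed after roughly worst_case_detection_seconds(30, 5, 3) seconds; the exact timing depends on the operating system.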
Out of the four, I prefer the second if it is possible (the number of
subscribers is fixed or small) and the fourth if not.
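The subscriber side of the fourth option could look roughly like the sketch below, assuming pyzmq. HeartbeatMonitor, run_subscriber, the HEARTBEAT payload, and the 100 ms poll interval are all hypothetical names and values chosen for illustration. The publisher is assumed to send HEARTBEAT every heartbeat_s seconds (X), and the subscriber gives up after 2*X of silence (Y).

```python
import time

HEARTBEAT = b"HEARTBEAT"  # assumed payload; the publisher must send the same bytes


class HeartbeatMonitor:
    """Tracks when the last message arrived and flags a silent publisher."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_seen = clock()

    def mark_alive(self):
        """Record that a message (data or heartbeat) has just arrived."""
        self._last_seen = self._clock()

    def expired(self):
        """True once more than timeout_s has passed since the last message."""
        return self._clock() - self._last_seen > self.timeout_s


def run_subscriber(endpoint, handle, heartbeat_s=1.0):
    """Receive forever; recreate the socket after 2 * heartbeat_s of silence."""
    import zmq  # pyzmq; imported here so HeartbeatMonitor has no dependency

    ctx = zmq.Context.instance()
    while True:
        sock = ctx.socket(zmq.SUB)
        sock.setsockopt(zmq.SUBSCRIBE, b"")
        sock.connect(endpoint)
        monitor = HeartbeatMonitor(timeout_s=2 * heartbeat_s)
        while not monitor.expired():
            if sock.poll(timeout=100):  # wait up to 100 ms for a message
                msg = sock.recv()
                monitor.mark_alive()  # any traffic proves the link is up
                if msg != HEARTBEAT:
                    handle(msg)
        # Silence for too long: close the existing socket and reconnect.
        sock.close(linger=0)
```

Treating every received message as proof of life means the publisher only needs to send heartbeats when it has no real data to publish.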
On Thu, May 16, 2013 at 2:27 PM, Joshua Foster <jhawk28 at gmail.com> wrote:
> You may also be able to use the "ZMQ_TCP_KEEPALIVE" option to help with
> observation 2.
>
> Joshua
>
> 许海玲 <hailingxu at gmail.com>
> May 15, 2013 11:05 PM
> Hello zmq guys,
>
> I am writing to confirm whether the zmq PUB/SUB socket's TCP
> connection has no timeout mechanism.
>
> Recently, I have been coding against the ZMQ 3.2.2 API. In my program there
> are two nodes, a publisher and a subscriber, which communicate over zmq TCP
> PUB/SUB sockets. Both nodes run on virtual machines. When the
> publisher's VM is RESET (code restart), the subscriber no longer receives any
> messages from the restarted publisher. However, restarting the publisher
> program, rebooting the guest OS, or disconnecting the interface for a period
> of time does not lead to this result; the subscriber always reconnects to
> the newly started publisher.
>
> With tcpdump, we noticed that:
> 1) Restarting the publisher program or rebooting the guest OS makes the
> publisher's zmq send a FIN to terminate the connection. When the publisher
> restarts, the subscriber reconnects to it automatically.
>
> 2) When the publisher's VM is reset, no FIN is sent, and when the publisher
> restarts, the subscriber does nothing. No communication is observed, so the
> subscriber cannot detect that the previous connection is lost.
>
> 3) When the subscriber's VM is reset, no FIN is sent, and the publisher
> keeps sending messages to the previous connection. Of course, a new
> connection is established, and messages are also sent over this new
> connection.
>
> Based on the above observations, we guess that zmq PUB/SUB has no connection
> timeout mechanism and that a connection is only terminated when one of the
> ends sends a FIN. I am curious whether this is by design or a
> bug. Maybe we must implement a timeout with a heartbeat in the upper layer
> to avoid a lost connection when the publisher goes down due to power loss.
>
> Thanks for your information.
>
> Hailing.
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev