[zeromq-dev] 0MQ keepalive

Meng Zhang jammy.linux at gmail.com
Sun Nov 2 00:51:59 CET 2014


Thanks @Andrew. 


-----Original Message-----
From: "Andrew Hume" <andrew at research.att.com>
Sent: 2014/11/2 0:32
To: "ZeroMQ development list" <zeromq-dev at lists.zeromq.org>
Subject: Re: [zeromq-dev] 0MQ keepalive

no.
we used PUSH (clients) and PULL (server) for heartbeats.
it worked well, but every now and then
(few to several months) the connection would stop working
(although no errors were seen) for one of the clients.
networking sucks.


On Nov 1, 2014, at 6:26 AM, Meng Zhang <jammy.linux at gmail.com> wrote:



Thanks, Doron, for sharing this pattern ;)

Does anyone in the community use simple PUB/SUB with TCP keepalive enabled to achieve what we're talking about here?

Regards,
Meng



From: Doron Somech
Sent: 2014/11/1 18:40
To: ZeroMQ development list
Subject: Re: [zeromq-dev] 0MQ keepalive


This is what we do to overcome this issue: the publishers are the clients and the subscribers are the servers (publishers connect, subscribers bind). Each publisher publishes a heartbeat message every second; that way zeromq will recognize a disconnection and reconnect automatically.
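
The pattern above relies on the sending side to notice a broken connection. If the subscriber also wants to notice that a publisher has gone quiet, it can track when the last heartbeat arrived. The following is a minimal sketch of that bookkeeping only, not of the messaging itself. The names hb_tracker, hb_init, hb_seen and hb_is_dead are hypothetical helpers invented for this example (they are not part of libzmq or czmq), and the tolerance of three missed heartbeats is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of libzmq or czmq: remembers when the
   last heartbeat from one publisher arrived. All times are milliseconds
   taken from whatever monotonic clock the caller uses. */
typedef struct {
    int64_t interval_ms;  /* expected gap between heartbeats (1000 in the thread) */
    int     max_missed;   /* how many silent intervals are tolerated (assumed) */
    int64_t last_seen_ms; /* arrival time of the most recent message */
} hb_tracker;

static void hb_init(hb_tracker *t, int64_t interval_ms, int max_missed,
                    int64_t now_ms)
{
    assert(interval_ms > 0 && max_missed > 0);
    t->interval_ms = interval_ms;
    t->max_missed = max_missed;
    t->last_seen_ms = now_ms;
}

/* Call whenever any message (heartbeat or data) arrives from the peer. */
static void hb_seen(hb_tracker *t, int64_t now_ms)
{
    t->last_seen_ms = now_ms;
}

/* True once more than max_missed whole intervals have passed in silence. */
static bool hb_is_dead(const hb_tracker *t, int64_t now_ms)
{
    return now_ms - t->last_seen_ms > t->interval_ms * t->max_missed;
}
```

A subscriber would call hb_seen() after each successful receive and check hb_is_dead() whenever a receive or poll times out.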


We use beacon to do the discovery of the subscribers.






On Sat, Nov 1, 2014 at 11:17 AM, Meng Zhang <jammy.linux at gmail.com> wrote:

Hi, @Benjamin

Thanks for your quick response. I'm aware of the way to implement the heartbeat function.

I was just wondering how TCP keepalive helps. What happens in the zeromq library when TCP keepalive detects a failure?

Regards,
Meng


From: Benjamin
Sent: 2014/11/1 16:02
To: ZeroMQ development list
Subject: Re: [zeromq-dev] 0MQ keepalive


Hi,


the standard way is the Paranoid Pirate Protocol: http://rfc.zeromq.org/spec:6


The Guide discusses this in chapter 4: http://zguide.zeromq.org/php:chapter4


For heartbeating with publishers, I think you have to define your use case. For example, say the client discovers that the service is down: can it switch to another service? In a P2P context this happens all the time: one peer discovers that another peer is dead and switches. Or in a client/server context, the client might just wait a bit because the server is overloaded. So "dead" can mean different things.
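
One way for a client to "just wait a bit" before giving up on a service is to back off between reconnect attempts, doubling the wait each time up to a ceiling. This is a minimal sketch in the spirit of the reconnect logic in the Guide's Paranoid Pirate worker, not a copy of it. The function name next_backoff_ms and both constants are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values chosen for this sketch. */
#define BACKOFF_INIT_MS 1000
#define BACKOFF_MAX_MS  32000

/* Hypothetical helper: given the current wait, return the wait to use
   before the next reconnect attempt. Doubles until the ceiling. */
static int64_t next_backoff_ms(int64_t current_ms)
{
    assert(current_ms > 0);
    int64_t doubled = current_ms * 2;
    return doubled > BACKOFF_MAX_MS ? BACKOFF_MAX_MS : doubled;
}
```

Starting from BACKOFF_INIT_MS, a client would sleep for the current wait, try to reconnect, and reset the wait to BACKOFF_INIT_MS as soon as the service answers again.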


Regards,

Benjamin



On Sat, Nov 1, 2014 at 3:50 AM, Meng Zhang <jammy.linux at gmail.com> wrote:



Hi, @there,


The following is the issue we encountered in our production environment:


We are using the ZeroMQ PUB/SUB pattern,
but the weird thing is that at the SUB end, netstat showed the zeromq socket in the ESTABLISHED state,
while at the PUB end, the LISTEN socket was still there, but the corresponding ESTABLISHED socket had disappeared.
Given that there is no built-in heartbeat mechanism in ZeroMQ,
what is the best practice for using TCP keepalive to detect this situation?


So, at the SUB end, if I set ZMQ_TCP_KEEPALIVE/ZMQ_TCP_KEEPALIVE_IDLE properly:
* if I choose to use czmq, how can I tell that the socket is dead through zstr_recv()?
* if I use libzmq directly, how can I do the same thing with zmq_msg_recv()?
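
As background for the question above: to my understanding, the ZMQ_TCP_KEEPALIVE family of options is passed through to the operating system's keepalive settings on the underlying TCP socket, and libzmq handles a dropped connection internally by reconnecting, so a blocking zstr_recv() or zmq_msg_recv() is not expected to report it. The sketch below therefore uses a plain POSIX socket, not a ZeroMQ one, to show what those OS-level settings are and how long detection can take. The helper names enable_keepalive and keepalive_detect_seconds are hypothetical, and the option names are the Linux ones (other systems may name TCP_KEEPIDLE differently).

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: turn on keepalive probing for a plain TCP socket.
   idle_s:  seconds of silence before the first probe
   intvl_s: seconds between probes
   count:   unanswered probes before the kernel drops the connection
   Returns 0 on success, -1 if any setsockopt() call fails. */
static int enable_keepalive(int fd, int idle_s, int intvl_s, int count)
{
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof intvl_s) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof count) != 0)
        return -1;
    return 0;
}

/* Approximate worst-case seconds before an idle dead peer is detected. */
static int keepalive_detect_seconds(int idle_s, int intvl_s, int count)
{
    assert(idle_s >= 0 && intvl_s >= 0 && count >= 0);
    return idle_s + intvl_s * count;
}
```

With 60 seconds idle, 10 seconds between probes and 3 probes, a silent peer is dropped after roughly 90 seconds. Keepalive only tears down the TCP connection, so an application that needs to know a peer is gone still needs its own heartbeat, as the replies in this thread suggest.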


Regards,
Meng




_______________________________________________
zeromq-dev mailing list
zeromq-dev at lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev









-----------------------
Andrew Hume
949-707-1964 (VO)
andrew at research.att.com