[zeromq-dev] Heartbeating using TCP keepalives
Alex Grönholm
alex.gronholm at nextday.fi
Fri Jan 3 19:00:54 CET 2014
03.01.2014 11:24, Laurent Alebarde wrote:
> The existing apps do not yet use 0MQ heartbeating anyhow, so there is
> no "backward compatibility" issue.
What 0MQ heartbeating? There is no 0MQ heartbeating implemented
anywhere. I'm talking about TCP keepalives, which are a whole different beast.
TCP keepalive parameters were exposed by libzmq 3 already, so there
could very well be existing apps using it.
>
> I support Matt's point of view: keep libzmq homogeneous. I would add that I
> don't like degrading specifications or performance when it can be
> avoided. There are standard usages that support your argument,
> and sometimes you think: "why on earth have they limited this?".
>
> But there is always a middle path: it would not be costly to have one
> API in seconds and one in milliseconds for that case, in order
> not to break the standard usage Alex describes.
>
> On 03/01/2014 02:03, Alex Grönholm wrote:
>> 02.01.2014 23:48, Pieter Hintjens wrote:
>>> Seconds is fine for this case but surprising overall since all other
>>> durations in the API are in msec.
>>>
>>> I'm not sure what you mean about backwards compatibility.
>> As it stands, the TCP keepalive intervals are given in seconds on the
>> vast majority of operating systems.
>> If we change it so the values are given in milliseconds instead (meaning
>> that we divide the given value by 1000 before calling setsockopt()),
>> this will break existing apps that set the keepalive intervals as seconds.
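To make the compatibility concern above concrete, here is a minimal Python sketch. The function name `to_seconds_if_api_took_ms` is hypothetical and not part of libzmq; it only models the divide-by-1000 step described above, assuming integer division.

```python
def to_seconds_if_api_took_ms(value):
    """Model of a hypothetical API change: treat `value` as milliseconds
    and convert it to whole seconds before handing it to the OS."""
    return value // 1000


# An existing app that means "10 seconds" and passes 10 unchanged
# would end up asking the OS for a 0 second interval.
legacy_idle = 10
print(to_seconds_if_api_took_ms(legacy_idle))

# An updated app would have to pass 10000 to keep the old behaviour.
print(to_seconds_if_api_took_ms(10_000))
```

In this model every caller that still passes seconds gets a much shorter interval than intended, without any error being raised.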
>>> On Thu, Jan 2, 2014 at 7:55 PM, Alex Grönholm<alex.gronholm at nextday.fi> wrote:
>>>> 02.01.2014 15:59, Pieter Hintjens wrote:
>>>>> It makes sense, and I'd try this; the timeout should be in msec, to be
>>>>> consistent with other duration arguments. You can take any of the
>>>>> existing socket options like ZMQ_SNDBUF as a template, and make a pull
>>>>> request.
>>>> Wouldn't it be enough to document that the values are expressed in
>>>> seconds and not ms?
>>>> Who needs sub-second accuracy with keepalives?
>>>> Besides, converting the values on non-Windows systems would break
>>>> backwards compatibility.
>>>> Are you fine with that? This definitely should not be done in a micro
>>>> release.
>>>>> On Mon, Dec 30, 2013 at 11:29 PM, Alex Grönholm
>>>>> <alex.gronholm at nextday.fi> wrote:
>>>>>> This isn't directly related to ZeroMQ, but it is somewhat relevant now given
>>>>>> A) the addition of the (yet unimplemented) heartbeating feature in ZMTP/3.0
>>>>>> and B) the Windows TCP keepalive parameters fix I committed recently.
>>>>>> The question is: has someone here used TCP keepalives as a substitute for
>>>>>> application level heartbeating? Given the operating model of ZeroMQ, using
>>>>>> TCP keepalives for this purpose would transparently shield the user from
>>>>>> stale connections. Are there any downsides to this?
>>>>>> TCP keepalives, when turned on, use a 2 hour interval by default (this is a
>>>>>> de facto standard). This makes them impractical unless the values are
>>>>>> adjusted.
>>>>>> I've done some research on that. From what I've gathered, it seems that
>>>>>> setting TCP keepalive parameters on a per-socket level is supported at least
>>>>>> on the following operating systems:
>>>>>>
>>>>>> Linux
>>>>>> FreeBSD
>>>>>> Windows (since Windows 2000; set only, read not supported; number of
>>>>>> keepalive probes is fixed at 10; must be set before connecting; values in
>>>>>> milliseconds, not seconds)
>>>>>> Mac OS X (since Mountain Lion)
>>>>>> AIX
>>>>>> Solaris (values in milliseconds, not seconds)
>>>>>>
>>>>>> It seems that both iOS and Android support sending TCP keepalives, but
>>>>>> setting keepalive parameters is not supported.
>>>>>> Note that the Windows TCP keepalive parameters patch takes the time
>>>>>> intervals in seconds and multiplies by 1000 on Windows for cross platform
>>>>>> compatibility. There is no similar fix for Solaris yet so Solaris users need
>>>>>> to do it on the application level for now.
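The application-level conversion mentioned above could look roughly like the following Python sketch. The helper name `native_keepalive_value` and the `sys.platform` prefixes used to detect Windows and Solaris are assumptions made for illustration; they are not taken from the libzmq patch.

```python
import sys

# Platforms whose native keepalive options take milliseconds, per the list
# above. The sys.platform prefixes are assumptions made for this sketch.
MILLISECOND_PLATFORMS = ("win32", "sunos")


def native_keepalive_value(seconds, platform=sys.platform):
    """Convert a keepalive interval given in seconds to the unit the
    platform's socket option expects."""
    if platform.startswith(MILLISECOND_PLATFORMS):
        return seconds * 1000
    return seconds


# The caller always works in seconds; only the last step differs.
print(native_keepalive_value(10, "linux"))
print(native_keepalive_value(10, "win32"))
```

Keeping the public value in seconds and converting at the last step is the approach the Windows patch is described as taking.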
>>>>>>
>>>>>> Setting the keepalive idle and retransmission delay to values like 10 and 5
>>>>>> seconds would make a lot of sense to me. If the peer fails to respond to the
>>>>>> probes, zmq will just see a disconnection.
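As a rough illustration of the 10 and 5 second settings suggested above, here is a sketch that applies them to a plain TCP socket from Python's standard library, not to a ZeroMQ socket. The helper name `enable_keepalive` is hypothetical. The `TCP_KEEPIDLE` and `TCP_KEEPINTVL` constants are not defined on every platform, so the sketch skips any that are missing.

```python
import socket


def enable_keepalive(sock, idle=10, interval=5):
    """Turn on TCP keepalives and, where the platform exposes the options,
    set the idle time and probe interval (both in seconds).
    Returns a dict of the options that were actually applied."""
    applied = {}
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied["SO_KEEPALIVE"] = 1
    # These constants are missing on some platforms, so check first.
    for name, value in (("TCP_KEEPIDLE", idle), ("TCP_KEEPINTVL", interval)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
    return applied


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(enable_keepalive(s))
```

The returned dict shows which parameters the local platform accepted, which is useful given how uneven the support listed above is.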
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> zeromq-dev mailing list
>>>>>> zeromq-dev at lists.zeromq.org
>>>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>>>>>