[zeromq-dev] Forwarder stops forwarding
robin at scout-trading.com
Tue Jun 15 16:38:20 CEST 2010
Obviously the answer here is it has to be a socket option :-)
In my case it would depend on whether I'm connecting over a WAN, the internet or a local network. I don't have Matt's requirements, though, so 1 second would be fine for all scenarios for me. 10 ms would obviously cause problems over a WAN that crossed the United States, though...
From: zeromq-dev-bounces at lists.zeromq.org [mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Matt Weinstein
Sent: Tuesday, June 15, 2010 10:18 AM
To: 0MQ development list
Subject: Re: [zeromq-dev] Forwarder stops forwarding
On Jun 15, 2010, at 9:39 AM, Martin Sustrik wrote:
> Hi Robin,
>> I think we've figured out how this happens, and I don't think it's
>> specific to the forwarder.
>> The exact setup is zmq_forwarder A connecting to zmq_forwarder B,
>> which is binding. The messages are forwarded in the reverse direction
>> i.e. messages are forwarded from B to A. This is done because of
>> firewall restrictions. When there is a network interruption, the
>> server (B) detects it and drops the connection. However, forwarder A
>> never realizes this, because the connection isn't shut down cleanly
>> while the host B is on is unreachable.
>> Netstat confirms that forwarder A still believes it's connected to B,
>> but B does not see the connection. In fact you can restart forwarder
>> B many hours after the network disconnect and forwarder A still
>> believes it's connected. I'm guessing the issue is that since there are
>> no messages flowing from A to B, writes never fail, so A never realizes
>> it has been disconnected. I'm somewhat surprised by this; I thought
>> TCP would be able to figure out a connection was disconnected after a
>> minute or 2 on both client and server. Any TCP experts that can
>> confirm that this behavior is expected?
> Yes. This is exactly how TCP behaves.
> What has to be done is implementing heartbeats on 0MQ level.
> Btw, a survey: What's the timeout you would be willing to accept to
> assert the connection is dead?
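For reference on the TCP behavior confirmed above: an idle TCP connection sends no packets at all, so a peer that only reads has no way to notice the other side is gone. The operating system can send keepalive probes on an idle connection, but this is off by default, and the Linux defaults once enabled are 7200 seconds of idle time before the first probe. The following is a minimal sketch for a plain TCP socket (not a 0MQ socket); the function name and parameter names are hypothetical, and the three tuning options are platform-specific. Their granularity is whole seconds, so this does not replace heartbeats at the 0MQ level when tighter timeouts are needed.

```python
import socket


def enable_tcp_keepalive(sock, idle_s=1, interval_s=1, probes=3):
    """Turn on OS-level keepalive probes for an otherwise idle TCP socket.

    Hypothetical helper for illustration; the values are in whole seconds.
    """
    # SO_KEEPALIVE is the portable switch; it is off by default.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The tuning options below exist on Linux; other platforms may lack
    # some of them, so skip any that the local socket module does not expose.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval_s),  # time between unanswered probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

With these illustrative values, a dead peer would be reported on the next read or write after roughly idle_s + interval_s * probes seconds, on platforms that honor all three options.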
Brocade switches and the like consider a trunk down after 3 seconds (the
default), which I think is very much an upper bound, but good for human I/O. My
application requires <100ms response times. I'm going to be inserting
keep-alives as required at 10 ms, connections will be marked down and
failed over after 20 ms of silence (two cycles).
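The scheme described here (a keep-alive every 10 ms, peer marked down after two silent cycles) could be tracked with something like the sketch below. It is only the bookkeeping, with no networking or 0MQ calls; the class name, method names and the injectable clock are assumptions made for illustration, not part of any 0MQ API.

```python
import time


def _monotonic_ms():
    """Default clock: monotonic time in milliseconds."""
    return time.monotonic() * 1000.0


class PeerLiveness:
    """Tracks traffic to and from one peer and decides when it is down.

    Hypothetical helper: the caller is expected to invoke heard_from_peer()
    on every received message or keep-alive, and sent_to_peer() on every send.
    """

    def __init__(self, interval_ms=10, missed_cycles=2, clock=_monotonic_ms):
        self.interval_ms = interval_ms
        # Silence longer than this many milliseconds marks the peer down.
        self.timeout_ms = interval_ms * missed_cycles
        self._clock = clock
        now = clock()
        self._last_heard = now
        self._last_sent = now

    def heard_from_peer(self):
        # Any inbound traffic counts as proof of life, not only keep-alives.
        self._last_heard = self._clock()

    def sent_to_peer(self):
        # Any outbound traffic postpones the next keep-alive.
        self._last_sent = self._clock()

    def keepalive_due(self):
        # True when nothing has been sent for a full interval.
        return self._clock() - self._last_sent >= self.interval_ms

    def is_down(self):
        # True after timeout_ms of silence from the peer.
        return self._clock() - self._last_heard >= self.timeout_ms
```

A send loop would check keepalive_due() on each pass and emit an empty keep-alive message when it returns true, and would fail over once is_down() returns true. Passing a different clock makes the timing testable without sleeping.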
Should there be a more general QoS framework?
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org