[zeromq-dev] Cleaning up file descriptors for dead router peers

Marcin Romaszewicz marcin at brkt.com
Tue Jun 30 02:11:29 CEST 2015


I've debugged the ZMQ code a bit, and I now know exactly what's causing
this.

The underlying TCP socket (at least on Linux) takes quite a while to go
from the ESTABLISHED state to a failed state once packets stop flowing:
up to approximately 90 minutes. If, during this time, my own code continues
to heartbeat, the file descriptors are eventually closed, but that doesn't
help me, because my rate of "black hole" peers can exceed the file
descriptor limit within a 90-minute window.

I was leaking file descriptors indefinitely: my own heartbeats told me the
peer was dead, so I stopped issuing sends to that peer after a couple of
minutes. As far as ZMQ is concerned, it still has a file descriptor in
ESTABLISHED state in its poll loop which, for those 90 minutes, returns
success on all writes, so there is simply no way to tell that the
connection is dead.
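For anyone hitting the same wall at the OS level: on Linux, the kernel knob
that bounds this window is TCP_USER_TIMEOUT (plus SO_KEEPALIVE for the
fully idle case). A plain-socket sketch, independent of ZMQ, just to show
the options involved (not something libzmq 4.1.x exposes directly, though
the keepalive knobs are available as the ZMQ_TCP_KEEPALIVE* socket
options):

```python
import socket

# Sketch, not from the original post: TCP_USER_TIMEOUT caps how long
# unacknowledged data may sit on an ESTABLISHED connection before the
# kernel errors the socket out, bounding the long window described above.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, 30000)  # ms
# SO_KEEPALIVE covers the idle case, where nothing is in flight at all:
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
```

This only helps writes that actually reach the kernel, which is exactly
why ZMQ dropping the message for a dead peer (as described above) defeats
it.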

jbreams' heartbeat fix does indeed fix this issue.

What would be really nice is some sort of API call to tell a router socket
to close a peer.
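For reference, once that heartbeat fix (PR 1448, linked below) is in a
release, the options can be set like this from pyzmq; a sketch assuming a
pyzmq built against libzmq >= 4.2, with the constants mirroring the
ZMQ_HEARTBEAT_* names and the values matching the test setup discussed
later in this thread:

```python
import zmq

# Sketch, assuming libzmq >= 4.2 with the merged heartbeat support.
ctx = zmq.Context.instance()
router = ctx.socket(zmq.ROUTER)
router.setsockopt(zmq.HEARTBEAT_IVL, 3000)       # send a ZMTP PING every 3 s
router.setsockopt(zmq.HEARTBEAT_TIMEOUT, 30000)  # close a peer that never PONGs within 30 s
router.setsockopt(zmq.HEARTBEAT_TTL, 30000)      # advertise a 30 s TTL to peers
```

With these set, the engine itself closes the underlying file descriptor for
a silent peer, which is what makes the leak described above stop.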




On Fri, Jun 26, 2015 at 4:58 PM, Marcin Romaszewicz <marcin at brkt.com> wrote:

> Hi All,
>
> I've got a trivial bit of code to reproduce this issue on a single host
> using iptables to simulate network partition.
>
> https://s3-us-west-2.amazonaws.com/marcin-zmq-example/zmq_test.cpp
>
> The file has comments on how to run the executable, but the short version
> is that you start a ZMQ_ROUTER listener which accepts connections from
> peers, remembers their identities, and pings them every 5 seconds.
> Then, you start a number of peers which connect to this router and start
> pinging it every few seconds.
>
> Once you use the iptables command (also in the comments in the file), the
> router can't ping the peers, and the peers can't ping the router. The file
> descriptors and connections remain open forever on both sides.
>
> Furthermore, when you undo the iptables block, the connections never come
> back.
>
> On Thu, Jun 25, 2015 at 3:35 PM, Marcin Romaszewicz <marcin at brkt.com>
> wrote:
>
>> I've done some testing of the github head version with the heartbeat
>> code, and it didn't work for me: I still see monotonically increasing
>> file descriptor counts, but I'm not sure whether I set up my test
>> scenario properly. I had the following setup.
>>
>> 1 ZMQ_ROUTER socket with
>>   ZMQ_HEARTBEAT_IVL = 3000 (3 seconds)
>>   ZMQ_HEARTBEAT_TTL = 30000 (30 seconds)
>>   ZMQ_HEARTBEAT_TIMEOUT = 30000 (30 seconds)
>>
>> However, due to logistical reasons, the clients connecting to this ZMQ
>> socket were on ZMQ 4.1.2. Was this a valid test scenario? It would take
>> me a couple of days to set up the AMIs to test Router (4.2.0) <->
>> client (4.2.0).
>>
>> Another question:
>> If I switch a router socket into ZMQ_ROUTER_RAW mode, send it a
>> disconnect frame (peer identity followed by an empty frame), then switch off
>> RAW mode, would I be doing something completely unsupported, or is it worth
>> a try? My tests take a very long time and a lot of work to set up right
>> now, so I'm reluctant to try something if it's probably a waste of time.
>>
>> Thanks,
>> -- Marcin
>>
>>
>> On Wed, Jun 24, 2015 at 1:01 PM, Pieter Hintjens <ph at imatix.com> wrote:
>>
>>> The underlying sockets should indeed error out. Presumably the code
>>> isn't handling this properly.
>>>
>>>
>>> On Wed, Jun 24, 2015 at 8:16 PM, Marcin Romaszewicz <marcin at brkt.com>
>>> wrote:
>>> > Thanks, this would probably solve our problem; however, I'm reluctant
>>> > to deploy the bleeding edge from your git repo into our production
>>> > systems, even if it does work on my test cluster.
>>> >
>>> > When I detect that a peer is dead with my own heartbeats, why doesn't
>>> > attempting to send data to the dead peer force some kind of connection
>>> > cleanup or reset? The underlying OS sockets should error out
>>> > eventually.
>>> >
>>> > On Wed, Jun 24, 2015 at 10:52 AM, Pieter Hintjens <ph at imatix.com>
>>> wrote:
>>> >>
>>> >> For what it's worth, we just merged a pull request that adds
>>> >> connection heartbeating. It could be fun to see if this solves your
>>> >> problem. (In theory it should...)
>>> >>
>>> >> https://github.com/zeromq/libzmq/pull/1448
>>> >>
>>> >>
>>> >> On Wed, Jun 24, 2015 at 6:48 PM, Marcin Romaszewicz <marcin at brkt.com>
>>> >> wrote:
>>> >> > Yes, you can easily reproduce this by pulling a network cable or
>>> >> > shutting the host down before it can do any sort of TCP connection
>>> >> > cleanup. I'm seeing it in AWS when instances get terminated, because
>>> >> > they're given so little time to respond to TERM that connections
>>> >> > aren't cleaned up.
>>> >> >
>>> >> > The iptables approach which Francis mentioned should work as well.
>>> >> >
>>> >> > I'll see if I can come up with a simple example of reproducing
>>> >> > this. It might even be possible to repro this on a single machine
>>> >> > simply by suspending a peer.
>>> >> >
>>> >> > -- Marcin
>>> >> >
>>> >> > On Wed, Jun 24, 2015 at 2:47 AM, Pieter Hintjens <ph at imatix.com>
>>> wrote:
>>> >> >>
>>> >> >> Do you think there's any way to reproduce this in the lab, e.g.
>>> >> >> killing a peer before it can shut down TCP properly?
>>> >> >>
>>> >> >> On Tue, Jun 23, 2015 at 10:08 PM, Marcin Romaszewicz <
>>> marcin at brkt.com>
>>> >> >> wrote:
>>> >> >> > Hi All,
>>> >> >> >
>>> >> >> > I've got an issue with ZMQ_ROUTER sockets which I'm having a
>>> >> >> > hard time working around, and I'd love some advice, but I
>>> >> >> > suspect the answer is that what I want to do isn't possible.
>>> >> >> >
>>> >> >> > Say I have a router socket listening on a port, with peers
>>> >> >> > connecting and disconnecting randomly over TCP. These peers
>>> >> >> > have, for all intents and purposes, random identities.
>>> >> >> >
>>> >> >> > Most of the time, a peer will disconnect "cleanly", meaning the
>>> >> >> > TCP connection is terminated via FIN or RST packets, and ZMQ
>>> >> >> > cleans up the file descriptor.
>>> >> >> >
>>> >> >> > However, some of the time, a peer will die silently, e.g. due
>>> >> >> > to a network outage or power loss.
>>> >> >> >
>>> >> >> > In these cases, the router socket keeps the file descriptor
>>> >> >> > around forever. I know that the peer is dead because all my
>>> >> >> > peers heartbeat to each other, and the heartbeats have gone
>>> >> >> > away. I thought that trying to send some data to a dead peer
>>> >> >> > would tear down that connection, since the underlying TCP socket
>>> >> >> > would eventually start erroring, but it doesn't; ZMQ must be
>>> >> >> > dropping my message before it reaches the underlying socket.
>>> >> >> >
>>> >> >> > The socket monitor tells me that someone has connected to the
>>> >> >> > router socket on its bound port with a specific file descriptor,
>>> >> >> > but I've got so many of these coming in that I can't associate a
>>> >> >> > specific file descriptor with a specific peer.
>>> >> >> >
>>> >> >> > TCP keep-alives don't work all that well at raising errors on a
>>> >> >> > dead connection.
>>> >> >> >
>>> >> >> > What I know on the app side, due to my heartbeats, is that peer
>>> >> >> > XYZ is dead. I'd like to tell the router socket to close the
>>> >> >> > underlying file descriptor. What I know via the monitor is that I
>>> >> >> > have a bunch of file descriptors open, but I can't map them to
>>> >> >> > peers. If I could, I'd just call os.close() on that file
>>> >> >> > descriptor and hopefully ZMQ would handle it gracefully.
>>> >> >> >
>>> >> >> > Eventually, after a few hours of uptime, my process hits the OS
>>> >> >> > file descriptor limit and stops receiving new connections at the
>>> >> >> > ZeroMQ level. I can have the process quit when it detects this,
>>> >> >> > but that forces all the functioning peers to reconnect and re-do
>>> >> >> > some work, so I'd like to avoid it.
>>> >> >> >
>>> >> >> > I scanned the previous discussions about this, and there has
>>> >> >> > been mention of exposing it somehow, but I don't see anything
>>> >> >> > along these lines in the latest API (looking at the 4.1.2
>>> >> >> > release).
>>> >> >> >
>>> >> >> > Any suggestions on how I could work around this?
>>> >> >> >
>>> >> >> > I'm thinking of extending the socket monitor to have a new event
>>> >> >> > type, like ZMQ_PEER_CONNECT/DISCONNECT, which passes back the
>>> >> >> > peer ID and file descriptor, but I've not gone through the zmq
>>> >> >> > code enough yet to know how much work this would be.
>>> >> >> >
>>> >> >> > Thanks in advance,
>>> >> >> > -- Marcin
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> > _______________________________________________
>>> >> >> > zeromq-dev mailing list
>>> >> >> > zeromq-dev at lists.zeromq.org
>>> >> >> > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>> >> >> >