[zeromq-dev] reusing a mount point

Rob Starling 00zmq-dev at robstarling.org
Thu Nov 28 23:08:27 CET 2013


If it's ESTABLISHED, can't you get lsof to tell you what other
process has a port 46502 TCP socket open?

e.g.

# lsof -i :80  # but use 46502 :)
COMMAND    PID     USER   FD   TYPE     DEVICE SIZE/OFF NODE NAME
curl      2619  robstar    3u  IPv4 3188168535      0t0  TCP foo.bar:51400->baz.biff:http (ESTABLISHED)
lighttpd  3217 www-data    4u  IPv4       9500      0t0  TCP *:http (LISTEN)
lighttpd  3217 www-data    5u  IPv6       9501      0t0  TCP *:http (LISTEN)
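
If lsof isn't handy, roughly the same question can be put to /proc/net/tcp on Linux. A minimal Python sketch, assuming the usual column layout (index, local address, remote address, state, all in hex). The function name and the sample rows are made up for illustration, and unlike lsof this does not name the owning process:

```python
# Hypothetical helper: list TCP sockets touching a port, from /proc/net/tcp text.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def sockets_on_port(proc_net_tcp_text, port):
    """Return (local_port, remote_port, state) for rows whose local or remote port matches."""
    matches = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # Addresses look like HEXIP:HEXPORT; only the port part is decoded here.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        state = TCP_STATES.get(fields[3].upper(), fields[3])
        if port in (local_port, remote_port):
            matches.append((local_port, remote_port, state))
    return matches

# Made-up sample rows: 0xB5A6 == 46502, state 01 == ESTABLISHED, 0A == LISTEN.
SAMPLE = """\
  sl  local_address rem_address   st
   0: 0100007F:B5A6 0100007F:B5A6 01
   1: 00000000:0050 00000000:0000 0A
"""

if __name__ == "__main__":
    print(sockets_on_port(SAMPLE, 46502))
```

On a live box the input would be the contents of /proc/net/tcp (IPv6 sockets are listed separately in /proc/net/tcp6).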

--Rob*

On Thu, Nov 28, 2013 at 07:00:25PM +0100, Pieter Hintjens wrote:
> Lacking more information, it looked like normal TCP behavior. Since
> you're on Linux and the socket is ESTABLISHED, it's something else.
> 
> Can you get a small test case that reproduces this reliably?
> 
> On Thu, Nov 28, 2013 at 6:05 PM, Andrew Hume <andrew at research.att.com> wrote:
> > fedora 18
> > but netstat didn’t say WAIT, it said ESTABLISHED
> >
> > tcp        0      0 135..249:46502   135..249:46502   ESTABLISHED
> >
> > (eliding identical interior octets for privacy).
> >
> > i’m trying to understand your argument; are you saying this is a TCP hiccup?
> > it doesn’t smell like that; especially as it never used to happen and now
> > it happens quite often.
> >
> > On Nov 28, 2013, at 7:11 AM, Pieter Hintjens <ph at imatix.com> wrote:
> >
> > What OS are you using?
> >
> > I've seen this symptom before, where a server cannot re-bind to a TCP
> > socket when there is an old client connection still connected to the
> > defunct socket. If you run netstat -a you'll see the socket in a wait
> > state, forever. When the client disconnects and restarts, it all works
> > again.
> >
> > The problem is not solvable afaik at the lower levels. The new server
> > cannot force the socket out of a wait state (SO_REUSEADDR does
> > nothing), and the client does not (afaik) get an error on the socket.
> >
> > One solution is to detect the error using heartbeats, and then
> > explicitly close the socket at the client side, which frees the
> > server-side port for new connections.
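
The heartbeat idea above can be sketched over a plain stream socket. This is not a libzmq or czmq API (the thread says heartbeating would still have to be added to ZMTP and libzmq). The PING/PONG framing, the function names, and the timeout values are all assumptions made for illustration:

```python
import socket

def peer_alive(sock, timeout=1.0):
    """Send PING and report whether PONG comes back within `timeout` seconds."""
    sock.settimeout(timeout)
    try:
        sock.sendall(b"PING")
        # Sketch only: a real protocol would loop until all 4 bytes have arrived.
        return sock.recv(4) == b"PONG"
    except OSError:  # covers timeouts and connection errors
        return False

def check_or_close(sock, timeout=1.0):
    """Close the client side explicitly when the peer stays silent."""
    if peer_alive(sock, timeout):
        return True
    sock.close()
    return False

if __name__ == "__main__":
    # socketpair() stands in for a real client/server TCP connection.
    client, server = socket.socketpair()
    print(check_or_close(client, timeout=0.2))  # the peer never answers here
    server.close()
```

The client would call check_or_close() periodically. Closing on a missed reply is the explicit client-side close described above.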
> >
> > I do not recall seeing the problem on Linux, only on AIX and Windows,
> > which is why I wonder what OS you're using.
> >
> > It would be nice to add the heartbeating into ZMTP and libzmq if we
> > had budget to do that (and if this is in fact the problem).
> >
> > -Pieter
> >
> > On Thu, Nov 28, 2013 at 3:42 PM, Andrew Hume <andrew at research.att.com>
> > wrote:
> >
> > a few months ago, i moved to czmq 3.2.3 and i’ve been quite happy except for
> > one issue.
> > i notice this rarely, so i’ve let it sit but now it’s become a nuisance.
> >
> > i have a stats_server which binds a PULL on port 46502.
> > i have a hist_server which connects a PUSH to port 46502.
> > ordinarily, the stats_server stays up for ever, while every now and then,
> > we restart the hist_server process. so far, so good. and like always,
> > we can start these servers in either order and it all works.
> >
> > what happens when we forget to restart the stats_server?
> > the hist_server runs happily, sending stats over the channel on 46502.
> > after (hours, days), we finally observe the stats_server is not up and we
> > start it. it now fails because port 46502 “is in use”.
> >
> > this seems to be a bug to me.
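
The "is in use" failure itself can be reproduced with plain sockets, without ZeroMQ. A sketch, assuming a live connection still holds the port as its local port (the netstat row shows 46502 as a local port in state ESTABLISHED). It uses throwaway loopback ports rather than 46502, and the function name is made up:

```python
import errno
import socket

def bind_errno_while_port_held():
    """Try to bind a port that an established connection uses as its local port.

    Returns the errno from the failed bind, or None if the bind succeeded.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    conn, _ = listener.accept()  # the connection is now ESTABLISHED
    held_port = client.getsockname()[1]  # local port owned by the client side

    newcomer = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        newcomer.bind(("127.0.0.1", held_port))
        return None
    except OSError as exc:
        return exc.errno
    finally:
        for s in (newcomer, conn, client, listener):
            s.close()

if __name__ == "__main__":
    result = bind_errno_while_port_held()
    print(errno.errorcode.get(result, result))
```

This only illustrates the symptom. It does not say which process was holding 46502 in the case reported here.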


