[zeromq-dev] reusing a mount point

Pieter Hintjens ph at imatix.com
Thu Nov 28 19:00:25 CET 2013


Lacking more information, it looked like normal TCP behavior. Since
you're on Linux and the socket is ESTABLISHED, it's something else.

Can you get a small test case that reproduces this reliably?
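
As a starting point for such a test case, here is a minimal sketch. It uses plain TCP sockets from the Python standard library rather than czmq, so it only exercises the TCP-level hypothesis (a restarted server re-binding while an old client is still connected) and not anything ZeroMQ-specific. The helper names `try_bind` and `reproduce` are made up for illustration, and the port is chosen by the OS instead of being 46502. Whether the second bind succeeds depends on the OS and on SO_REUSEADDR, so the script reports the result rather than assuming one.

```python
import errno
import socket


def try_bind(port, reuse_addr=False, host="127.0.0.1"):
    """Try to bind and listen on (host, port).

    Returns (listening_socket, None) on success,
    or (None, errno_value) if bind/listen fails.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse_addr:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        srv.bind((host, port))
        srv.listen(1)
    except OSError as exc:
        srv.close()
        return None, exc.errno
    return srv, None


def reproduce(reuse_addr=True):
    """Bind, accept one client, drop the server side, then try to re-bind."""
    # First "server": port 0 asks the OS for any free port.
    first, err = try_bind(0, reuse_addr)
    if first is None:
        raise OSError(err, "initial bind failed")
    port = first.getsockname()[1]

    # Long-lived "client", standing in for the connecting side.
    client = socket.create_connection(("127.0.0.1", port))
    accepted, _ = first.accept()

    # Simulate the server going away while the client stays connected.
    accepted.close()
    first.close()

    # Restarted "server": can it bind to the same port again?
    second, err = try_bind(port, reuse_addr)
    if second is not None:
        second.close()
    client.close()
    return port, err


if __name__ == "__main__":
    for reuse in (False, True):
        port, err = reproduce(reuse)
        outcome = "ok" if err is None else errno.errorcode.get(err, str(err))
        print(f"SO_REUSEADDR={reuse} port={port} rebind: {outcome}")
```

Running it on the affected machine and comparing the output with `netstat -a` for the printed port would show whether plain TCP alone reproduces the "in use" failure; if it does not, the test case needs to involve the real czmq sockets.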

On Thu, Nov 28, 2013 at 6:05 PM, Andrew Hume <andrew at research.att.com> wrote:
> fedora 18
> but netstat didn’t say WAIT, it said ESTABLISHED
>
> tcp        0      0 135..249:46502   135..249:46502   ESTABLISHED
>
> (eliding identical interior octets for privacy).
>
> i’m trying to understand your argument; are you saying this is a TCP hiccup?
> it doesn’t smell like that; especially as it never used to happen and now
> it happens quite often.
>
> On Nov 28, 2013, at 7:11 AM, Pieter Hintjens <ph at imatix.com> wrote:
>
> What OS are you using?
>
> I've seen this symptom before, where a server cannot re-bind to a TCP
> socket when there is an old client connection still connected to the
> defunct socket. If you run netstat -a you'll see the socket in a wait
> state, forever. When the client disconnects and restarts, it all works
> again.
>
> The problem is not solvable afaik at the lower levels. The new server
> cannot force the socket out of a wait state (SO_REUSEADDR does
> nothing), and the client does not (afaik) get an error on the socket.
>
> One solution is to detect the error using heartbeats, and then
> explicitly close the socket at the client side, which frees the
> server-side port for new connections.
>
> I do not recall seeing the problem on Linux, only on AIX and Windows,
> which is why I wonder what OS you're using.
>
> It would be nice to add the heartbeating into ZMTP and libzmq if we
> had budget to do that (and if this is in fact the problem).
>
> -Pieter
>
> On Thu, Nov 28, 2013 at 3:42 PM, Andrew Hume <andrew at research.att.com>
> wrote:
>
> a few months ago, i moved to czmq 3.2.3 and i’ve been quite happy except for
> one issue.
> i notice this rarely, so i’ve let it sit but now it’s become a nuisance.
>
> i have a stats_server which binds a PULL on port 46502.
> i have a hist_server which connects a PUSH to port 46502.
> ordinarily, the stats_server stays up for ever, while every now and then,
> we restart the hist_server process. so far, so good. and like always,
> we can start these servers in either order and it all works.
>
> what happens when we forget to restart the stats_server?
> the hist_server runs happily, sending stats over the channel on 46502.
> after (hours, days), we finally observe the stats_server is not up and we
> start it. it now fails because port 46502 “is in use”.
>
> this seems to be a bug to me.
>
> -----------------------
> Andrew Hume
> 949-707-1964 (VO and best)
> 732-420-0907 (NJ)
> andrew at research.att.com
>
>
>
>
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>
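
The heartbeat workaround described in the quoted message (detect a silent peer, then explicitly close the socket on the client side) can be sketched roughly as follows. This is only an illustration of the timeout logic, not czmq or libzmq API: `HeartbeatMonitor`, `close_if_dead`, and the `interval` and `liveness` parameters are hypothetical names, and how heartbeat replies actually reach the client (a PUSH socket cannot receive, so some return channel would be needed) is left outside the sketch.

```python
import time


class HeartbeatMonitor:
    """Tracks when the peer was last heard from (hypothetical helper)."""

    def __init__(self, interval=1.0, liveness=3, clock=time.monotonic):
        self.interval = interval    # assumed seconds between heartbeats
        self.liveness = liveness    # missed intervals tolerated before giving up
        self.clock = clock          # injectable clock, useful for testing
        self.last_seen = clock()

    def heard_from_peer(self):
        """Call whenever any message or heartbeat arrives from the peer."""
        self.last_seen = self.clock()

    def peer_is_dead(self):
        """True once the peer has been silent for more than interval * liveness."""
        return self.clock() - self.last_seen > self.interval * self.liveness


def close_if_dead(monitor, sock):
    """Close sock and return True if the peer has been silent too long."""
    if monitor.peer_is_dead():
        sock.close()
        return True
    return False
```

The client would call `heard_from_peer()` on every incoming heartbeat and `close_if_dead()` on a timer; after a close it would open a fresh socket and reconnect, which is what releases the stale connection on the server's port.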


