[zeromq-dev] Signaler.cpp hangs on windows intermittently

KIU Shueng Chuan nixchuan at gmail.com
Fri May 17 02:40:13 CEST 2013


For the 1st one, a related entry is LIBZMQ-492. The submitter changes the
Event to a Mutex, which is better since the OS will release the Mutex
automatically if the program aborts.
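
To illustrate why a Mutex recovers where an Event does not, here is a minimal sketch of guarding the port setup with a named mutex and treating WAIT_ABANDONED as a successful acquire. It is not the LIBZMQ-492 patch itself: the class name PortLock, the function with_port_lock and the mutex name "zmq-signaler-port-sync-demo" are hypothetical, and the non-Windows branch is only a stub so the sketch compiles elsewhere.

```cpp
// Sketch only. PortLock, with_port_lock and the mutex name below are
// hypothetical illustrations, not the identifiers used in libzmq.
#ifdef _WIN32
#include <windows.h>

class PortLock {
public:
    PortLock ()
        : handle_ (CreateMutexA (NULL, FALSE, "zmq-signaler-port-sync-demo")),
          held_ (false)
    {
        if (handle_ == NULL)
            return;
        DWORD rc = WaitForSingleObject (handle_, INFINITE);
        // WAIT_ABANDONED means the previous owner terminated without
        // releasing the mutex; ownership is still granted to this thread,
        // so a crashed process cannot block everyone else forever.
        held_ = (rc == WAIT_OBJECT_0 || rc == WAIT_ABANDONED);
    }

    ~PortLock ()
    {
        if (held_)
            ReleaseMutex (handle_);
        if (handle_ != NULL)
            CloseHandle (handle_);
    }

    bool held () const { return held_; }

private:
    PortLock (const PortLock &);            // non-copyable
    PortLock &operator= (const PortLock &); // non-copyable
    HANDLE handle_;
    bool held_;
};
#else
// Stub for non-Windows builds: no cross-process lock is taken here.
class PortLock {
public:
    bool held () const { return true; }
};
#endif

// Runs the critical section under the lock; returns false if the lock
// could not be acquired.
bool with_port_lock ()
{
    PortLock lock;
    if (!lock.held ())
        return false;
    // ... bind/listen/connect/accept on the fixed port would go here ...
    return true;
}
```

With an auto-reset Event, a process that dies inside the critical section never calls SetEvent, so later waiters block for as long as any handle to the event stays open. With a Mutex, the next waiter gets WAIT_ABANDONED and can proceed.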

For the 2nd one, LIBZMQ-519 seems related. The submitter says that creating
and closing a socket fails after 16000 iterations on Windows 7. This sounds
like exhaustion of unique 4-tuples due to the socketpair emulation sockets
hanging around in the TIME_WAIT state. On WinXP, the failure will occur sooner
due to the lower number of ephemeral ports.
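
A rough back-of-the-envelope check of the numbers, as a sketch. It assumes the default dynamic port ranges (1025 to 5000 on XP, 49152 to 65535 on Vista and later) and that every emulated socketpair connects to the same loopback address and port, so only the source port distinguishes one 4-tuple from the next. The function names are hypothetical, and the TIME_WAIT duration is left as a parameter because it depends on system configuration.

```cpp
// Number of ports in the inclusive range [first, last].
constexpr long port_count (long first, long last)
{
    return last - first + 1;
}

// Upper bound on connections per second that can be opened and closed
// indefinitely if each closed connection holds its source port for
// time_wait_seconds.
constexpr double sustainable_rate (long ports, double time_wait_seconds)
{
    return ports / time_wait_seconds;
}

// Assumed default ranges; both can be changed by the administrator.
constexpr long xp_ports = port_count (1025, 5000);
constexpr long vista_plus_ports = port_count (49152, 65535);

static_assert (xp_ports == 3976, "XP default range size");
static_assert (vista_plus_ports == 16384, "Vista+ default range size");
```

Under those assumptions the Windows 7 range holds 16384 ports, which is consistent with a failure after roughly 16000 create/close iterations run faster than TIME_WAIT entries expire, and the XP range is about a quarter of that.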

On May 17, 2013 1:02 AM, "Pieter Hintjens" <ph at imatix.com> wrote:

> Hi Shueng Chuan,
>
> Thanks for this. I've backported these two commits to zeromq3-x
> stable. Do you know if there are Jira issues for them?
>
> -Pieter
>
> On Thu, May 16, 2013 at 7:06 AM, KIU Shueng Chuan <nixchuan at gmail.com>
> wrote:
> > Hi Pieter,
> >
> > This is the commit in master for the above-mentioned signaler.cpp assertion:
> > Release Event on error
> > 8c71ac47e83dc4ae116ab4abb5e4a76e8249d888
> >
> > Another change that might help Windows (especially XP) users is the
> > following. It prevents the socketpair emulation on win32 from staying in
> > the TIME_WAIT state after close.
> > avoid TIME_WAIT socket structures
> > 151a80619bf3f9c4696788f79cd2c934ed26246d
> >
> >
> >
> >
> > On Tue, May 7, 2013 at 3:54 AM, Pieter Hintjens <ph at imatix.com> wrote:
> >>
> >> Has this been fixed on master but not backported to 3.x?
> >>
> >> Usually it's up to individual contributors to either do the backport
> >> (a git cherry-pick) or ask someone to help do it.
> >>
> >> I'm happy to backport the fix for the next 3.2.4 release if someone
> >> tells me what commit it was.
> >>
> >> -Pieter
> >>
> >> On Mon, May 6, 2013 at 8:32 PM, Felipe Farinon
> >> <felipe.farinon at powersyslab.com> wrote:
> >> > Why hasn't this patch been released yet?
> >> >
> >> > In my application I'm trying to fix it by using a Mutex, handling the
> >> > case where waiting (WaitForSingleObject) on the mutex results in
> >> > WAIT_ABANDONED. It seems that it has been fixed. Should I submit a
> >> > patch?
> >> >
> >> > On 19/03/2013 16:45, Pau wrote:
> >> >> Hi,
> >> >>
> >> >> I had this problem some weeks ago. This was my problem:
> >> >>
> >> >> I do not know all possible reasons to generate a fatal wsa_assert(..)
> >> >> but there is at least one:
> >> >>
> >> >> I have seen that on XP, line 301
> >> >> rc = connect (*w_, (sockaddr *) &addr, sizeof (addr));
> >> >> can return an error when a socket tries to connect to 5905, and this
> >> >> has happened many times.
> >> >> Windows uses port numbers starting near 1400 and XP has a limit at
> >> >> 5000.
> >> >> On W7 this does not look like a problem because the maximum is 65000.
> >> >> It may sound as if the number were big enough, but apart from the fact
> >> >> that ZMQ uses a large number of connections (at least in my tests), I
> >> >> have seen that Windows jumps port numbers by 7, so 5000 is sometimes
> >> >> reached, with catastrophic consequences.
> >> >>
> >> >> Perhaps there are other reasons (this problem does not actually happen
> >> >> like that on W7). In any case, whatever crashes between
> >> >>
> >> >> HANDLE sync = CreateEvent (NULL, FALSE, TRUE, TEXT
> >> >> ("zmq-signaler-port-sync"));
> >> >> and
> >> >> SetEvent (sync);
> >> >>
> >> >> will leave the event unsignaled, and any other application in the
> >> >> system that waits on it will hang. Closing all apps in a system fixes it.
> >> >>
> >> >> KIU Shueng Chuan submitted a patch that sets the event when a crash
> >> >> occurs, so that other applications do not hang:
> >> >>
> >> >> https://github.com/zeromq/libzmq/pull/514
> >> >>
> >> >> That worked for me...
> >> >>
> >> >> If you search for "zmq-signaler-port-sync" in previous mails in this
> >> >> list you will see the complete thread.
> >> >>
> >> >> best,
> >> >>
> >> >> Pau
> >> >>
> >> >> On 19/03/2013 17:18, Felipe Farinon wrote:
> >> >>> Hi,
> >> >>>
> >> >>> The code
> >> >>> HANDLE sync = CreateEvent (&sa, FALSE, TRUE, TEXT
> >> >>> ("Global\\zmq-signaler-port-sync"));
> >> >>>
> >> >>> in signaler.cpp:262 hangs intermittently when starting a zeromq context
> >> >>> through zmq_init in my application. I'm not able to reproduce this bug
> >> >>> with a minimal test case. I'm running my application on Windows 7
> >> >>> with zeromq compiled with VS2010.
> >> >>> _______________________________________________
> >> >>> zeromq-dev mailing list
> >> >>> zeromq-dev at lists.zeromq.org
> >> >>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> >> >>>