[zeromq-dev] Signaler.cpp hangs on windows intermittently

Pieter Hintjens ph at imatix.com
Fri May 17 11:03:43 CEST 2013


I think we need specific issues for the two commits; any backport to
stable needs issues so users know exactly what changed and why.

Shueng Chuan, could you create two new issues? I don't have enough
information to make them.

Also, if we want the patch for 492 to go into libzmq, we need to make
a pull request (whoever does this needs Windows to test on).

Thanks!
-Pieter


On Fri, May 17, 2013 at 2:40 AM, KIU Shueng Chuan <nixchuan at gmail.com> wrote:
> For the 1st one, a related entry is LIBZMQ-492. The submitter changes the
> Event to a Mutex, which is better since the OS releases a Mutex
> automatically when the program aborts.
>
> For the 2nd one, LIBZMQ-519 seems related. The submitter says that creating
> and closing a socket fails after 16000 iterations on Windows 7. This sounds
> like exhaustion of unique 4-tuples due to socketpair emulation sockets
> hanging around in the TIME_WAIT state. On WinXP, the failure will occur
> sooner due to the lower number of ephemeral ports.
>
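
A rough check of the exhaustion argument above, as a sketch only. The
default dynamic port ranges used here (1025-5000 on XP, 49152-65535 on
Vista and later) are Microsoft's documented defaults, not figures taken
from this thread, and the function names are made up for illustration.
Since the server end of the emulated socketpair is fixed, only the client
port varies in the 4-tuple, so the size of the ephemeral range bounds how
many closed connections can sit in TIME_WAIT at once.

```cpp
#include <cassert>

// Number of ports in the inclusive range [first, last].
int ports_in_range (int first, int last)
{
    assert (first <= last);
    return last - first + 1;
}

// Upper bound on distinct (client ip, client port, server ip, server port)
// tuples when only the client port varies. Ranges are assumed OS defaults.
int max_tuples_xp ()
{
    return ports_in_range (1025, 5000);
}

int max_tuples_vista_and_later ()
{
    return ports_in_range (49152, 65535);
}
```

The second bound is 16384, which is close to the 16000 iterations reported
in LIBZMQ-519; the XP bound is 3976. Ports already in use by other programs
lower both numbers in practice.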
> On May 17, 2013 1:02 AM, "Pieter Hintjens" <ph at imatix.com> wrote:
>>
>> Hi Shueng Chuan,
>>
>> Thanks for this. I've backported these two commits to zeromq3-x
>> stable. Do you know if there are Jira issues for them?
>>
>> -Pieter
>>
>> On Thu, May 16, 2013 at 7:06 AM, KIU Shueng Chuan <nixchuan at gmail.com> wrote:
>> > Hi Pieter,
>> >
>> > This is the commit in master for the above-mentioned signaler.cpp
>> > assertion:
>> > Release Event on error
>> > 8c71ac47e83dc4ae116ab4abb5e4a76e8249d888
>> >
>> > Another change that might help Windows (especially XP) users is the
>> > following. It keeps the win32 socketpair emulation from staying in
>> > the TIME_WAIT state after close.
>> > avoid TIME_WAIT socket structures
>> > 151a80619bf3f9c4696788f79cd2c934ed26246d
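
For readers unfamiliar with the technique: one common way to keep a TCP
socket out of TIME_WAIT is an abortive close, i.e. setting SO_LINGER with
a zero timeout before closing. Whether the commit above does exactly this
is an assumption here. The sketch below uses POSIX sockets as a stand-in
for Winsock, and all function names are hypothetical.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstring>

// Abortive close: with l_onoff = 1 and l_linger = 0 the stack sends RST
// instead of FIN, so the closing side does not enter TIME_WAIT.
int close_without_time_wait (int fd)
{
    struct linger lg;
    lg.l_onoff = 1;
    lg.l_linger = 0;
    if (setsockopt (fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) != 0)
        return -1;
    return close (fd);
}

// Build a connected loopback TCP pair, roughly what a socketpair
// emulation does. Returns 0 on success and stores the two ends in *r, *w.
int make_loopback_pair (int *r, int *w)
{
    assert (r != nullptr && w != nullptr);
    int listener = socket (AF_INET, SOCK_STREAM, 0);
    if (listener < 0)
        return -1;

    sockaddr_in addr;
    std::memset (&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
    addr.sin_port = 0; // let the OS pick a free port for this sketch
    socklen_t len = sizeof addr;

    if (bind (listener, (sockaddr *) &addr, sizeof addr) != 0
        || listen (listener, 1) != 0
        || getsockname (listener, (sockaddr *) &addr, &len) != 0) {
        close (listener);
        return -1;
    }

    *w = socket (AF_INET, SOCK_STREAM, 0);
    if (*w < 0) {
        close (listener);
        return -1;
    }
    if (connect (*w, (sockaddr *) &addr, sizeof addr) != 0) {
        close (*w);
        close (listener);
        return -1;
    }

    *r = accept (listener, nullptr, nullptr);
    close (listener);
    if (*r < 0) {
        close (*w);
        return -1;
    }
    return 0;
}

// Create and tear down n pairs; returns how many were handled cleanly.
int churn_pairs (int n)
{
    int ok = 0;
    for (int i = 0; i < n; ++i) {
        int r, w;
        if (make_loopback_pair (&r, &w) != 0)
            break;
        // Abort the connecting end first; the peer then sees a reset and
        // can be closed normally without lingering either.
        int rc_w = close_without_time_wait (w);
        int rc_r = close (r);
        if (rc_w == 0 && rc_r == 0)
            ++ok;
    }
    return ok;
}
```

The trade-off is that an abortive close discards any unsent data, which is
acceptable only when the socket carries nothing that still matters at
close time.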
>> >
>> >
>> >
>> >
>> > On Tue, May 7, 2013 at 3:54 AM, Pieter Hintjens <ph at imatix.com> wrote:
>> >>
>> >> Has this been fixed on master but not backported to 3.x?
>> >>
>> >> Usually it's up to individual contributors to either do the backport
>> >> (a git cherry-pick) or ask someone to help do it.
>> >>
>> >> I'm happy to backport the fix for the next 3.2.4 release if someone
>> >> tells me what commit it was.
>> >>
>> >> -Pieter
>> >>
>> >> On Mon, May 6, 2013 at 8:32 PM, Felipe Farinon <felipe.farinon at powersyslab.com> wrote:
>> >> > Why hasn't this patch been released yet?
>> >> >
>> >> > In my application I'm trying to fix it by using a Mutex, handling the
>> >> > case where waiting (WaitForSingleObject) on the mutex returns
>> >> > WAIT_ABANDONED. It seems that it has been fixed. Should I submit a
>> >> > patch?
>> >> >
>> >> > On 19/03/2013 16:45, Pau wrote:
>> >> >> Hi,
>> >> >>
>> >> >> I had this problem some weeks ago. This was my problem:
>> >> >>
>> >> >> I do not know all the possible causes of a fatal wsa_assert(..),
>> >> >> but there is at least one:
>> >> >>
>> >> >> I have seen that on XP it is possible for line 301,
>> >> >> rc = connect (*w_, (sockaddr *) &addr, sizeof (addr));
>> >> >> to return an error when a socket tries to connect to port 5905, and
>> >> >> this has happened many times.
>> >> >> Windows uses port numbers starting near 1400 and XP has a limit at
>> >> >> 5000. In W7 this does not look like a problem because the maximum is
>> >> >> 65000. That range sounds big enough, but apart from the fact that
>> >> >> ZMQ uses a large number of connections (at least in my tests), I have
>> >> >> observed that Windows jumps port numbers by 7, so the limit of 5000
>> >> >> is sometimes reached, with catastrophic consequences.
>> >> >>
>> >> >> Perhaps there are other causes (this problem does not happen like
>> >> >> that in W7). In any case, anything that crashes between
>> >> >>
>> >> >> HANDLE sync = CreateEvent (NULL, FALSE, TRUE, TEXT ("zmq-signaler-port-sync"));
>> >> >> and
>> >> >> SetEvent (sync);
>> >> >>
>> >> >> will leave the event non-signaled, and any other application in the
>> >> >> system that waits on it will hang. Closing all apps in the system
>> >> >> fixes it.
>> >> >>
>> >> >> KIU Shueng Chuan submitted a patch that sets the event on error,
>> >> >> which keeps other applications from hanging:
>> >> >>
>> >> >> https://github.com/zeromq/libzmq/pull/514
>> >> >>
>> >> >> That worked for me...
>> >> >>
>> >> >> If you search for "zmq-signaler-port-sync" in previous mails in this
>> >> >> list you will see the complete thread.
>> >> >>
>> >> >> best,
>> >> >>
>> >> >> Pau
>> >> >>
>> >> >> On 19/03/2013 17:18, Felipe Farinon wrote:
>> >> >>> Hi,
>> >> >>>
>> >> >>> The code
>> >> >>> HANDLE sync = CreateEvent (&sa, FALSE, TRUE, TEXT
>> >> >>> ("Global\\zmq-signaler-port-sync"));
>> >> >>>
>> >> >>> in signaler.cpp:262 hangs intermittently when my application starts
>> >> >>> a zeromq context through zmq_init. I'm not able to reproduce this
>> >> >>> bug with a minimal test case. I'm running my application on
>> >> >>> Windows 7 with zeromq compiled with VS2010.
>> >> >>> _______________________________________________
>> >> >>> zeromq-dev mailing list
>> >> >>> zeromq-dev at lists.zeromq.org
>> >> >>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev