[zeromq-dev] Zyre Wi-Fi Rejoin Issue

Steven Rasmussen Steve.Rasmussen at RasSimTech.com
Wed Jun 11 01:52:11 CEST 2014


Yeah, I figured that out, thanks.

-----Original Message-----
From: zeromq-dev-bounces at lists.zeromq.org
[mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Pieter Hintjens
Sent: Tuesday, June 10, 2014 5:45 PM
To: ZeroMQ development list
Subject: Re: [zeromq-dev] Zyre Wi-Fi Rejoin Issue

You can't define this; you will have to use the libzmq master version to get
this functionality.

On Tue, Jun 10, 2014 at 9:24 PM, Steve Rasmussen
<Steve.Rasmussen at rassimtech.com> wrote:
> Hey Pieter,
>
> I haven't quite got this working. After I define the symbol 
> ZMQ_ROUTER_HANDOVER, I start getting the following assert:
> lt-zpinger: zsock_option.c:82: zsock_set_router_handover: Assertion
> `rc == 0 || zmq_errno () == (156384712 + 53)' failed.
> Aborted (core dumped)
>
> Any ideas on what I'm doing wrong?
>
> Thanks,
>
> -Steve
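
The number in that assertion is easier to read once decoded. In zmq.h,
156384712 is ZMQ_HAUSNUMERO and offset 53 is ETERM, so the setter only
tolerates success or a terminated context. Any other failure from the
underlying setsockopt call aborts. A plausible cause here (an assumption;
the thread does not show the errno) is EINVAL from a libzmq build that
does not know the option. A minimal sketch of the decoding follows; the
SKETCH_ names and both helper functions are made up for illustration and
are not part of libzmq or CZMQ:

```c
#include <assert.h>     /* for the checks in the usage note */
#include <string.h>

/* Values copied from libzmq's zmq.h so this sketch builds without the
   library; the SKETCH_ prefix marks them as local stand-ins. */
#define SKETCH_HAUSNUMERO     156384712
#define SKETCH_EFSM           (SKETCH_HAUSNUMERO + 51)
#define SKETCH_ENOCOMPATPROTO (SKETCH_HAUSNUMERO + 52)
#define SKETCH_ETERM          (SKETCH_HAUSNUMERO + 53)
#define SKETCH_EMTHREAD       (SKETCH_HAUSNUMERO + 54)

/* Map a native libzmq error number to its symbolic name.
   Returns "unknown" for anything outside the four codes above. */
const char *
sketch_zmq_errno_name (int code)
{
    switch (code) {
        case SKETCH_EFSM:           return "EFSM";
        case SKETCH_ENOCOMPATPROTO: return "ENOCOMPATPROTO";
        case SKETCH_ETERM:          return "ETERM";
        case SKETCH_EMTHREAD:       return "EMTHREAD";
        default:                    return "unknown";
    }
}

/* Mirror of the condition in the failed assertion: the setter accepts
   success, or a failure caused by the context being terminated. */
int
sketch_setter_would_pass (int rc, int err)
{
    return rc == 0 || err == SKETCH_ETERM;
}
```

For example, sketch_zmq_errno_name (156384712 + 53) gives "ETERM", and
sketch_setter_would_pass (-1, EINVAL) is 0, which is the abort case.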
>
> -----Original Message-----
> From: zeromq-dev-bounces at lists.zeromq.org
> [mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Steve 
> Rasmussen
> Sent: Tuesday, June 10, 2014 10:03 AM
> To: 'ZeroMQ development list'
> Subject: Re: [zeromq-dev] Zyre Wi-Fi Rejoin Issue
>
> Hey Pieter,
>
> That is great news! I was just getting back into this problem. I'll 
> try out your fixes and let you know that they work :)
>
> Thanks again!
>
> Regards,
>
> Steve
>
> -----Original Message-----
> From: zeromq-dev-bounces at lists.zeromq.org
> [mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Pieter 
> Hintjens
> Sent: Tuesday, June 10, 2014 9:52 AM
> To: ZeroMQ development list
> Subject: Re: [zeromq-dev] Zyre Wi-Fi Rejoin Issue
>
> Hi Steve,
>
> I have found the cause of the Wi-Fi rejoin issue (#200) and fixed it, I
> think. The problem was old/new clients connecting with the same
> identity, where the router socket incorrectly delivered messages from
> the old client rather than the new one. It may be an issue in libzmq,
> but I think rather it's a combination of the TCP stack retrying, and
> delivering, old messages, plus the router socket doing something weird
> with the new client connection.
> I'm not quite sure where the HELLO messages disappear to...
>
> Anyhow, the fix is to use ZMQ_ROUTER_HANDOVER in zyre_node, and there 
> is no need to remove peers or do other hacks. It works as we'd expect.
>
> Pull request is on zyre master.
>
> -Pieter
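
To make the same-identity problem concrete, here is a toy model of a
router's identity table. It is a sketch under stated assumptions, not
libzmq's implementation: all toy_ names are hypothetical, and the
"handover" flag only imitates the documented effect of
ZMQ_ROUTER_HANDOVER (off: a second connection using an identity already
in use is rejected; on: the new connection takes the identity over).

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_ROUTES 8
#define TOY_ID_LEN     32

typedef struct {
    char identity[TOY_ID_LEN];
    int  conn_id;       /* connection that currently owns the identity */
    int  in_use;
} toy_route_t;

typedef struct {
    toy_route_t routes[TOY_MAX_ROUTES];
    int handover;       /* 0 = keep the old owner, 1 = newcomer wins */
} toy_router_t;

void
toy_router_init (toy_router_t *self, int handover)
{
    assert (self);
    memset (self, 0, sizeof *self);
    self->handover = handover;
}

/* Register conn_id under identity. Returns the connection that owns
   the identity afterwards, or -1 if the table is full. */
int
toy_router_attach (toy_router_t *self, const char *identity, int conn_id)
{
    assert (self && identity);
    int free_slot = -1;
    for (int i = 0; i < TOY_MAX_ROUTES; i++) {
        toy_route_t *route = &self->routes[i];
        if (route->in_use && strcmp (route->identity, identity) == 0) {
            /* Identity clash: only hand over when the flag is set */
            if (self->handover)
                route->conn_id = conn_id;
            return route->conn_id;
        }
        if (!route->in_use && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;
    toy_route_t *route = &self->routes[free_slot];
    /* Slot was zeroed by init, so the copy stays NUL-terminated */
    strncpy (route->identity, identity, TOY_ID_LEN - 1);
    route->conn_id = conn_id;
    route->in_use = 1;
    return conn_id;
}
```

With handover off, attaching connection 2 under an identity owned by
connection 1 still reports 1 as the owner, so traffic keeps flowing to
the stale link. With handover on, the same call reports 2.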
>
> On Sat, Jun 7, 2014 at 9:26 PM, Pieter Hintjens <ph at imatix.com> wrote:
>> OK, I did a simple test to try to reproduce this at the dealer-router 
>> level and it doesn't happen. So it's not a libzmq issue. I'll dig 
>> deeper, it has to be something in the way Zyre is managing its 
>> sockets...
>>
>> On Fri, Jun 6, 2014 at 11:25 PM, Steven Rasmussen 
>> <Steve.Rasmussen at rassimtech.com> wrote:
>>> A little more information:
>>>
>>> One of the first things I tried, when the Wi-Fi connection was
>>> re-established, was delaying sending the START message until after
>>> the old messages had been received. I couldn't figure out a good
>>> amount of time to delay, but if I delayed it long enough, the HELLO
>>> would get through and kick off the handshake. This made it seem to
>>> me that messages were being buffered somewhere.
>>>
>>> If I just started periodically sending HELLO messages, after 
>>> receiving beacons, without removing the peer, the HELLO messages 
>>> would not ever get through.
>>>
>>> -Steve
>>>
>>> -----Original Message-----
>>> From: zeromq-dev-bounces at lists.zeromq.org
>>> [mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Pieter 
>>> Hintjens
>>> Sent: Friday, June 6, 2014 1:18 PM
>>> To: ZeroMQ development list
>>> Subject: Re: [zeromq-dev] Zyre Wi-Fi Rejoin Issue
>>>
>>> OK, I've pushed a patch that fixes it, using your workaround more or
>>> less.
>>>
>>> I want to test this at the libzmq level, it's weird that old 
>>> messages are getting through and the new ones aren't.
>>>
>>> -Pieter
>>>
>>> On Fri, Jun 6, 2014 at 6:36 PM, Pieter Hintjens <ph at imatix.com> wrote:
>>>> OK, I've reproduced the problem quite easily. Something strange 
>>>> with messages being delivered even though the socket they're sent 
>>>> on is torn down entirely. I'm investigating...
>>>>
>>>> On Fri, Jun 6, 2014 at 5:57 PM, Pieter Hintjens <ph at imatix.com> wrote:
>>>>> OK, I'll simulate this in the code. The peers should automatically 
>>>>> resend HELLO if they lost contact.
>>>>>
>>>>> No thanks needed, we enjoy making this software and use it in 
>>>>> everything we make. :-)
>>>>>
>>>>> On Fri, Jun 6, 2014 at 4:12 PM, Steve Rasmussen 
>>>>> <Steve.Rasmussen at rassimtech.com> wrote:
>>>>>>> In principle if the connection is re-established there should be
>>>>>>> no new HELLO message sent.
>>>>>>
>>>>>> This problem occurs after the Wi-Fi connection has been down long
>>>>>> enough for the peers to remove each other. When the connection
>>>>>> comes back up, as I understand it, the HELLO message is necessary
>>>>>> to kick off handshaking.
>>>>>>
>>>>>>> Can you find a way to reproduce the problem easily?
>>>>>> The easiest method that I've found is using a modified version of 
>>>>>> the zpinger tool on two laptops. The modified zpinger tool is set 
>>>>>> up to send a whisper, after a time delay, anytime it receives a 
>>>>>> whisper from a peer. I either turn the Wi-Fi adapter off/on or 
>>>>>> move the laptop out of range to perform the test.
>>>>>>
>>>>>> It seems like this may have something to do with the sockets 
>>>>>> maintaining the TCP/IP connection during the break and then being 
>>>>>> in a bad state when the Wi-Fi connection comes back up. Is this 
>>>>>> possible? If so is there some way to reset the TCP/IP connection?
>>>>>>
>>>>>>> Thanks for taking the time to analyse the problem.
>>>>>>
>>>>>> I need this capability for the system I'm developing. Thank you 
>>>>>> and your colleagues for ZeroMQ, CZMQ, Zyre, ...
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Steve
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: zeromq-dev-bounces at lists.zeromq.org
>>>>>> [mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Pieter 
>>>>>> Hintjens
>>>>>> Sent: Thursday, June 5, 2014 5:22 PM
>>>>>> To: ZeroMQ development list
>>>>>> Subject: Re: [zeromq-dev] Zyre Wi-Fi Rejoin Issue
>>>>>>
>>>>>> On Thu, Jun 5, 2014 at 5:32 PM, Steve Rasmussen 
>>>>>> <Steve.Rasmussen at rassimtech.com> wrote:
>>>>>>
>>>>>>> The problem seems to be with the TCP/IP connection, not the beacon.
>>>>>>> After a network break, the beacon reestablishes the connection, but
>>>>>>> no data is getting through the TCP/IP connection.
>>>>>>> It looks as if there are messages that are being buffered before
>>>>>>> the break and then delivered after. This prevents the "HELLO"
>>>>>>> message from getting through. I've tried various things, but the
>>>>>>> closest I've come, so far, is to keep removing the peer until it is
>>>>>>> reported as being ready. I'm doing this in the
>>>>>>> "zyre_node_require_peer" function. If a peer exists, I check to see
>>>>>>> if it is ready ("zyre_peer_ready") and, if not, I remove the peer
>>>>>>> ("zyre_node_remove_peer"). This seems to fix the problem that I'm
>>>>>>> having, but it seems a little kludgy.
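
The workaround described above can be sketched roughly as follows. This
is a toy stand-in, not Zyre's source: the toy_ types and functions are
hypothetical and only borrow the shape of the names mentioned in the
message (require_peer, remove_peer, and a ready flag). It assumes
require_peer runs once per received beacon.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PEERS 16
#define TOY_UUID_LEN  33        /* 32 hex characters plus terminator */

typedef struct {
    char uuid[TOY_UUID_LEN];
    bool ready;                 /* true once the peer's HELLO arrived */
    bool in_use;
} toy_peer_t;

typedef struct {
    toy_peer_t peers[TOY_MAX_PEERS];
    int removals;               /* how often the workaround fired */
} toy_node_t;

void
toy_node_init (toy_node_t *self)
{
    assert (self);
    memset (self, 0, sizeof *self);
}

toy_peer_t *
toy_node_find_peer (toy_node_t *self, const char *uuid)
{
    for (int i = 0; i < TOY_MAX_PEERS; i++)
        if (self->peers[i].in_use
        &&  strcmp (self->peers[i].uuid, uuid) == 0)
            return &self->peers[i];
    return NULL;
}

void
toy_node_remove_peer (toy_node_t *self, toy_peer_t *peer)
{
    memset (peer, 0, sizeof *peer);     /* frees the slot */
    self->removals++;
}

/* Called for every beacon from uuid. Returns the peer, or NULL if the
   table is full. */
toy_peer_t *
toy_node_require_peer (toy_node_t *self, const char *uuid)
{
    assert (self && uuid);
    toy_peer_t *peer = toy_node_find_peer (self, uuid);
    /* The workaround: a known peer that never became ready is dropped
       so that the next step rebuilds it from scratch. */
    if (peer && !peer->ready) {
        toy_node_remove_peer (self, peer);
        peer = NULL;
    }
    if (peer)
        return peer;
    for (int i = 0; i < TOY_MAX_PEERS; i++) {
        if (!self->peers[i].in_use) {
            peer = &self->peers[i];
            strncpy (peer->uuid, uuid, TOY_UUID_LEN - 1);
            peer->in_use = true;
            /* Real code would connect and send HELLO at this point */
            return peer;
        }
    }
    return NULL;
}
```

Each beacon from a peer that is still not ready tears the peer down and
recreates it, which is why the approach feels heavy-handed: it retries
the whole connection rather than addressing why HELLO is not delivered.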
>>>>>>
>>>>>> Thanks for taking the time to analyse the problem.
>>>>>>
>>>>>> In principle if the connection is re-established there should be
>>>>>> no new HELLO message sent. Can you find a way to reproduce the
>>>>>> problem easily?
>>>>>>
>>>>>> Feel free to make a pull request with your change anyhow. I'm 
>>>>>> reworking a lot of this code atm so will try to include your 
>>>>>> change if I can reproduce the error.
>>>>>>
>>>>>> -Pieter
>>>>>> _______________________________________________
>>>>>> zeromq-dev mailing list
>>>>>> zeromq-dev at lists.zeromq.org
>>>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev