[zeromq-dev] Zyre Wi-Fi Rejoin Issue

Pieter Hintjens ph at imatix.com
Tue Jun 10 23:44:55 CEST 2014


You can't define this symbol yourself; the option has to be implemented
in libzmq. You will have to use the libzmq master version
to get this functionality.

On Tue, Jun 10, 2014 at 9:24 PM, Steve Rasmussen
<Steve.Rasmussen at rassimtech.com> wrote:
> Hey Pieter,
>
> I haven't quite got this working. After I define the symbol
> ZMQ_ROUTER_HANDOVER, I start getting the following assert:
>
> lt-zpinger: zsock_option.c:82: zsock_set_router_handover: Assertion `rc == 0 || zmq_errno () == (156384712 + 53)' failed.
> Aborted (core dumped)
>
> Any ideas on what I'm doing wrong?
>
> Thanks,
>
> -Steve
>
> -----Original Message-----
> From: zeromq-dev-bounces at lists.zeromq.org
> [mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Steve Rasmussen
> Sent: Tuesday, June 10, 2014 10:03 AM
> To: 'ZeroMQ development list'
> Subject: Re: [zeromq-dev] Zyre Wi-Fi Rejoin Issue
>
> Hey Pieter,
>
> That is great news! I was just getting back into this problem. I'll try out
> your fixes and let you know that they work :)
>
> Thanks again!
>
> Regards,
>
> Steve
>
> -----Original Message-----
> From: zeromq-dev-bounces at lists.zeromq.org
> [mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Pieter Hintjens
> Sent: Tuesday, June 10, 2014 9:52 AM
> To: ZeroMQ development list
> Subject: Re: [zeromq-dev] Zyre Wi-Fi Rejoin Issue
>
> Hi Steve,
>
> I have found the cause of the Wi-Fi rejoin issue (#200) and fixed it, I
> think. The problem was old/new clients connecting with the same identity,
> where the router socket incorrectly delivered messages from the old client
> rather than the new one. It may be an issue in libzmq but I think rather
> it's a combination of the TCP stack retrying, and delivering, old messages,
> plus the router socket doing something weird with the new client connection.
> I'm not quite sure where the HELLO messages disappear to...
>
> Anyhow, the fix is to use ZMQ_ROUTER_HANDOVER in zyre_node, and there is no
> need to remove peers or do other hacks. It works as we'd expect.
>
> Pull request is on zyre master.
>
> -Pieter
>
> On Sat, Jun 7, 2014 at 9:26 PM, Pieter Hintjens <ph at imatix.com> wrote:
>> OK, I did a simple test to try to reproduce this at the dealer-router
>> level and it doesn't happen. So it's not a libzmq issue. I'll dig
>> deeper, it has to be something in the way Zyre is managing its
>> sockets...
>>
>> On Fri, Jun 6, 2014 at 11:25 PM, Steven Rasmussen
>> <Steve.Rasmussen at rassimtech.com> wrote:
>>> A little more information:
>>>
>>> One of the first things I tried, when the Wi-Fi connection was
>>> re-established, was delaying sending the START message until after
>>> the old messages had been received. I couldn't figure out a good
>>> length for the delay, but if I delayed it long enough, the HELLO
>>> would get through and kick off the handshake. This made it seem to
>>> me that messages were being buffered somewhere.
>>>
>>> If I just started periodically sending HELLO messages, after
>>> receiving beacons, without removing the peer, the HELLO messages
>>> would not ever get through.
>>>
>>> -Steve
>>>
>>> -----Original Message-----
>>> From: zeromq-dev-bounces at lists.zeromq.org
>>> [mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Pieter
>>> Hintjens
>>> Sent: Friday, June 6, 2014 1:18 PM
>>> To: ZeroMQ development list
>>> Subject: Re: [zeromq-dev] Zyre Wi-Fi Rejoin Issue
>>>
>>> OK, I've pushed a patch that fixes it, using your workaround more or
>>> less.
>>>
>>> I want to test this at the libzmq level, it's weird that old messages
>>> are getting through and the new ones aren't.
>>>
>>> -Pieter
>>>
>>> On Fri, Jun 6, 2014 at 6:36 PM, Pieter Hintjens <ph at imatix.com> wrote:
>>>> OK, I've reproduced the problem quite easily. Something strange with
>>>> messages being delivered even though the socket they're sent on is
>>>> torn down entirely. I'm investigating...
>>>>
>>>> On Fri, Jun 6, 2014 at 5:57 PM, Pieter Hintjens <ph at imatix.com> wrote:
>>>>> OK, I'll simulate this in the code. The peers should automatically
>>>>> resend HELLO if they lost contact.
>>>>>
>>>>> No thanks needed, we enjoy making this software and use it in
>>>>> everything we make. :-)
>>>>>
>>>>> On Fri, Jun 6, 2014 at 4:12 PM, Steve Rasmussen
>>>>> <Steve.Rasmussen at rassimtech.com> wrote:
>>>>>>> In principle if the connection is re-established there should be
>>>>>>> no new HELLO message sent.
>>>>>>
>>>>>> This problem occurs after the Wi-Fi connection has been down long
>>>>>> enough for the peers to remove each other. When the connection
>>>>>> comes back up, as I understand it, the HELLO message is necessary
>>>>>> to kick off handshaking.
>>>>>>
>>>>>>> Can you find a way to reproduce the problem easily?
>>>>>> The easiest method that I've found is using a modified version of
>>>>>> the zpinger tool on two laptops. The modified zpinger tool is set
>>>>>> up to send a whisper, after a time delay, anytime it receives a
>>>>>> whisper from a peer. I either turn the Wi-Fi adapter off/on or
>>>>>> move the laptop out of range to perform the test.
>>>>>>
>>>>>> It seems like this may have something to do with the sockets
>>>>>> maintaining the TCP/IP connection during the break and then being
>>>>>> in a bad state when the Wi-Fi connection comes back up. Is this
>>>>>> possible? If so is there some way to reset the TCP/IP connection?
>>>>>>
>>>>>>> Thanks for taking the time to analyse the problem.
>>>>>>
>>>>>> I need this capability for the system I'm developing. Thank you
>>>>>> and your colleagues for ZeroMQ, CZMQ, Zyre, ...
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Steve
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: zeromq-dev-bounces at lists.zeromq.org
>>>>>> [mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Pieter
>>>>>> Hintjens
>>>>>> Sent: Thursday, June 5, 2014 5:22 PM
>>>>>> To: ZeroMQ development list
>>>>>> Subject: Re: [zeromq-dev] Zyre Wi-Fi Rejoin Issue
>>>>>>
>>>>>> On Thu, Jun 5, 2014 at 5:32 PM, Steve Rasmussen
>>>>>> <Steve.Rasmussen at rassimtech.com> wrote:
>>>>>>
>>>>>>> The problem seems to be with the TCP/IP connection, not the beacon.
>>>>>>> After a network break, the beacon reestablishes the connection, but
>>>>>>> no data is getting through the TCP/IP connection.
>>>>>>>
>>>>>>> It looks as if there are messages that are being buffered before the
>>>>>>> break and then delivered after. This prevents the "HELLO" message
>>>>>>> from getting through. I've tried various things, but the closest
>>>>>>> I've come, so far, is to keep removing the peer until it is reported
>>>>>>> as being ready. I'm doing this in the "zyre_node_require_peer"
>>>>>>> function. If a peer exists, I check whether it is ready
>>>>>>> ("zyre_peer_ready") and, if not, I remove the peer
>>>>>>> ("zyre_node_remove_peer"). This seems to fix the problem that I'm
>>>>>>> having, but it seems a little kludgy.
>>>>>>
>>>>>> Thanks for taking the time to analyse the problem.
>>>>>>
>>>>>> In principle if the connection is re-established there should be
>>>>>> no new HELLO message sent. Can you find a way to reproduce the
>>>>>> problem easily?
>>>>>>
>>>>>> Feel free to make a pull request with your change anyhow. I'm
>>>>>> reworking a lot of this code atm so will try to include your
>>>>>> change if I can reproduce the error.
>>>>>>
>>>>>> -Pieter
>>>>>> _______________________________________________
>>>>>> zeromq-dev mailing list
>>>>>> zeromq-dev at lists.zeromq.org
>>>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev


