[zeromq-dev] Weird whisper messages arriving with Pyre

Arnaud Loonstra arnaud at sphaero.org
Tue Dec 22 11:47:14 CET 2015


The default HWMs for recv and send are 1000, so sockets can block when
there is a lot of traffic, while the background threads keep running.
From the log you've sent, it looks like frames are getting mixed up. If
that happens, very strange things can indeed follow. If this is really
the case, we should find out when it happens. You say it only happens
under high traffic? That would indeed point to the HWMs. We really need
a reproducible case.
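
As a rough illustration of what a high-water mark does, here is a sketch that uses only Python's standard queue module as a stand-in. It is not ZeroMQ or Pyre code, and the limit of 1000 simply mirrors the default mentioned above. A bounded queue accepts messages until it is full, after which a sender either blocks or gets an error:

```python
import queue

HWM = 1000  # mirrors the default HWM mentioned above; illustrative only


def fill_until_full(limit=HWM):
    """Push messages into a bounded queue that nobody is draining.

    Returns the number of messages accepted before the queue refused one.
    This is a stand-in for a socket pipe, not ZeroMQ's implementation.
    """
    pipe = queue.Queue(maxsize=limit)
    accepted = 0
    while True:
        try:
            # put_nowait raises queue.Full instead of blocking; a plain
            # put() would block here until a reader made room.
            pipe.put_nowait(b"msg-%d" % accepted)
        except queue.Full:
            return accepted
        accepted += 1


if __name__ == "__main__":
    print(fill_until_full())
```

With a slow reader and a fast writer, the writer stalls at this point, which is why a longer sleep between sends can appear to make the problem go away.
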
There's the high_volume example in the examples directory. Can you
reproduce the problem by tweaking it?
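
To help pin down when it happens, something like the sketch below could log suspicious frame lists before they are parsed. It assumes the layout seen in the quoted log (event type, 16-byte peer UUID, peer name, then payload frames) and Python 3 bytes. check_whisper is a hypothetical helper, not part of Pyre:

```python
def check_whisper(frames):
    """Return a list of problems found in a WHISPER frame list (empty if none).

    Assumed layout, taken from the log quoted in this thread:
    [b'WHISPER', <16-byte peer uuid>, <peer name>, <payload frame>, ...]
    """
    problems = []
    if len(frames) < 4:
        problems.append("expected at least 4 frames, got %d" % len(frames))
        return problems
    if frames[0] != b"WHISPER":
        problems.append("first frame is %r, not b'WHISPER'" % (frames[0],))
    peer = frames[1]
    # uuid.UUID(bytes=...) rejects anything that is not exactly 16 bytes,
    # which is the ValueError in the quoted traceback.
    if not isinstance(peer, bytes) or len(peer) != 16:
        problems.append("peer id frame is not 16 bytes: %r" % (peer,))
    # Hypothetical heuristic: a payload frame that is itself a protocol
    # keyword hints that frames from two messages were interleaved.
    for frame in frames[3:]:
        if frame in (b"SHOUT", b"WHISPER"):
            problems.append("payload frame looks like a keyword: %r" % (frame,))
    return problems
```

Run on the frame list from the quoted log, this flags the trailing 'SHOUT' frame, while the peer id frame itself is a valid 16 bytes.
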

Rg,

Arnaud


On 2015-12-22 09:37, Pieter Hintjens wrote:
> It looks like partial messages being delivered and/or incorrectly
> passed between layers.
>
> On Mon, Dec 21, 2015 at 11:36 PM, Axel Voitier
> <axel.voitier at gmail.com> wrote:
>> Well, no, it can't be the HWM settings. Pyre sets them on DEALER and PAIR
>> sockets only, which should block if they get full.
>>
>>
>> Axel
>>
>>
>> 2015-12-21 23:15 GMT+01:00 Axel Voitier <axel.voitier at gmail.com>:
>>>
>>> Yes, the unittests pass.
>>>
>>> I kind of solved it by increasing the regular sleep I do to 100ms. Am I
>>> right to think it really is due to the HWM?
>>>
>>> How are the HWM settings handled in the internal (zactor pipe) and
>>> external sockets of pyre? Is it configurable?
>>> I see them set with rather fixed values (constants or magic numbers) :S.
>>> Only zcreate_pipe takes it as a default parameter, but it is not used
>>> when called.
>>>
>>>
>>> Cheers,
>>> Axel
>>>
>>>
>>> 2015-12-21 22:44 GMT+01:00 Arnaud Loonstra <arnaud at sphaero.org>:
>>>>
>>>> I can have a look tomorrow, although we'd be best helped with a
>>>> reproducible error. In this case it seems messages are either malformed
>>>> or getting mixed up. If some error happens somewhere in a node it is
>>>> very hard to predict what will happen due to the asynchronous behaviour
>>>> of pyre.
>>>>
>>>> Are the unit tests running fine on your setup?
>>>>
>>>> Rg,
>>>>
>>>> Arnaud
>>>>
>>>> On December 21, 2015 9:24:29 PM GMT+01:00, Axel Voitier
>>>> <axel.voitier at gmail.com> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I observe an odd behaviour in my application which seems to relate to
>>>>> Pyre whisper messages being malformed (incomplete and/or mixed).
>>>>>
>>>>> It is still difficult to reproduce as it happens randomly and when
>>>>> having quite some traffic between the two nodes.
>>>>>
>>>>> Here is an example:
>>>>>>
>>>>>> DEBUG:isac.transport.pyre_node:(alidron-archiver-influxdb) received
>>>>>> stuff: ['WHISPER', '\xe7\x19`\xf2a\xddM\xe1\x85\x1e\xe6s~\xd6@\x0b',
>>>>>> 'alidron-openzwave-controller', 'SHOUT']
>>>>>
>>>>>
>>>>> It is quite strange to get a 'SHOUT' in a whisper message, knowing this
>>>>> is not the kind of payload my application sends.
>>>>>
>>>>>
>>>>> In another case (actually happened during another run), on the other
>>>>> node:
>>>>>>
>>>>>> Exception in thread Thread-1:
>>>>>> Traceback (most recent call last):
>>>>>>   File "/usr/local/lib/python2.7/threading.py", line 801, in __bootstrap_inner
>>>>>>     self.run()
>>>>>>   File "/usr/local/lib/python2.7/threading.py", line 754, in run
>>>>>>     self.__target(*self.__args, **self.__kwargs)
>>>>>>   File "/usr/local/lib/python2.7/site-packages/pyre/zactor.py", line 57, in run
>>>>>>     self.shim_handler(*self.shim_args, **self.shim_kwargs)
>>>>>>   File "/usr/local/lib/python2.7/site-packages/pyre/pyre_node.py", line 52, in __init__
>>>>>>     self.run()
>>>>>>   File "/usr/local/lib/python2.7/site-packages/pyre/pyre_node.py", line 504, in run
>>>>>>     self.recv_api()
>>>>>>   File "/usr/local/lib/python2.7/site-packages/pyre/pyre_node.py", line 182, in recv_api
>>>>>>     peer_id = uuid.UUID(bytes=request.pop(0))
>>>>>>   File "/usr/local/lib/python2.7/uuid.py", line 146, in __init__
>>>>>>     raise ValueError('bytes is not a 16-char string')
>>>>>> ValueError: bytes is not a 16-char string
>>>>>
>>>>>
>>>>> Here, line 182 of pyre_node.py is actually trying to read the peer_id
>>>>> off the WHISPER message but seems to be reading something wrong?!
>>>>>
>>>>> In several runs the context is not the same (not the same "transaction"
>>>>> going on if you want). I can provide the full logs if you want, but in
>>>>> debug level it's about 13.2MB for three runs...
>>>>>
>>>>>
>>>>> Would you have an idea about what is going on? I thought about too much
>>>>> traffic going over the HWM of the various sockets involved here and
>>>>> there. But I already limit the rate of transactions by pausing 10ms
>>>>> between each.
>>>>>
>>>>>
>>>>> Cheers,
>>>>> Axel
>>>>>
>>>>> ________________________________
>>>>>
>>>>> zeromq-dev mailing list
>>>>> zeromq-dev at lists.zeromq.org
>>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>>>
>>>>
>>>> Sent from my feature bloated phone.
>>>>
>>>
>>
>>



