[zeromq-dev] PUB/SUB on an epgm socket stops receiving eventually …
Steven McCoy
steven.mccoy at miru.hk
Fri Jun 3 23:43:45 CEST 2011
On 4 June 2011 03:49, Ladan Gharai <lgharai at gmail.com> wrote:
>
>
> On Wed, Jun 1, 2011 at 4:41 PM, Steven McCoy <steven.mccoy at miru.hk> wrote:
>
>> On 2 June 2011 04:17, Ladan Gharai <lgharai at gmail.com> wrote:
>>
>>> I’ve turned on the openpgm trace/debug messages – afaict once the epgm
>>> receiver sustains “a lot” of packet loss it is just not able to start over
>>> again
>>>
>>
>> Every time the receiver sees packet loss it closes the socket and
>> schedules a new socket to be created to reconnect to the PGM stream.
>>
>
> I am not sure I understand this - do you mean the zmq socket gets a new
> zmq socket if the ePGM receiver experiences unrecoverable loss? (I don't see
> any new socket opening I just see the zmq recv not receiving anymore)
>
ZMQ creates a new PGM socket. PGM is a socket based API beneath ZMQ.
>
>>
>>>
>>>
>>> My questions are:
>>>
>>> 1. Is there a way to reset the receiver once this happens?
>>>
>>>
>> Reconnects occur with the same engine as TCP reconnects.
>>
>>
>>>
>>> 2. Has anyone experimented with changing the size of the rxw (it
>>> currently uses 33333) – and the various timers NAK_RB_IVL, NAK_RPT_IVL and
>>> NAK_RDATA_IVL (something akin to TCP tuning?)
>>>
>>>
>> If you find PGM is non-productive you should investigate tightening the
>> recovery settings so failure is raised sooner rather than later. The
>> default settings are friendly towards 10 Mb/s networks, so running at high
>> speed on 1 Gb/s networks may pose a problem with high data loss.
>>
>> For example, drop the retry count for DATA & NCF from the default 50 to 2.
>>
>> ~line 211 in pgm_socket.cpp:
>> nak_data_retries = 2,
>> nak_ncf_retries = 2;
>>
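A rough way to see why lowering the retry counts raises the failure sooner is to bound how long a receiver could keep retrying before it gives up. The sketch below is only an illustration under stated assumptions: NakBudget and worst_case_wait are hypothetical names, not part of libzmq or OpenPGM; the 200 ms per-attempt interval in the usage note is an assumed value, not one quoted in this thread; and the model (every attempt waits out the full interval) simplifies the real PGM state machine.

```cpp
#include <chrono>

// Hypothetical helper for illustration only; not a libzmq or OpenPGM type.
struct NakBudget {
    int retries;                         // e.g. nak_data_retries or nak_ncf_retries
    std::chrono::milliseconds interval;  // assumed wait per attempt
};

// Simplified upper bound: each retry is assumed to wait out the full interval
// before the next attempt, so the total is retries * interval.
constexpr std::chrono::milliseconds worst_case_wait(NakBudget b) {
    return b.retries * b.interval;
}
```

With an assumed 200 ms interval, worst_case_wait(NakBudget{50, std::chrono::milliseconds(200)}) comes to 10 seconds per lost packet, while NakBudget{2, std::chrono::milliseconds(200)} comes to 400 ms, which is the sense in which the smaller retry count surfaces unrecoverable loss sooner.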
>
> Yes - this seems the most sensible approach, except now it crashes -
> Segmentation fault - once it falls into a long series of packet losses.
>
Can you provide a trace? A coredump should make it more expedient to
diagnose the bug.
--
Steve-o