[zeromq-dev] 0MQ multicast message lost?
Steven McCoy
steven.mccoy at miru.hk
Thu Oct 7 16:20:27 CEST 2010
On 7 October 2010 17:53, <ntupel at googlemail.com> wrote:
> I am currently wrapping my head around 0MQ + PGM. In particular I try
> to come up with an understanding of reliability guarantees. Given this
> receiver (receiver.cc http://pastebin.com/PHbYCawK) and this sender
> (sender.cc http://pastebin.com/gnrgZtD3) I try to publish messages
> from sender to receiver (ZMQ_PUB/ZMQ_SUB) and the receiver sends back
> the message received (ZMQ_PUSH/ZMQ_PULL). Only after the sender has
> gotten back the message it will publish the next one. I am using two
> separate machines within the same subnet. My problem is that often
> sender and receiver get stuck which means that some message got lost.
> I invoke these processes as follows:
>
> On receiver side: $ ./receiver epgm://eth0\;231.192.0.1:9999
> tcp://10.0.10.2:1234 255
> On sender side: $ ./sender epgm://eth0\;231.192.0.1:9999
> tcp://eth0:1234 255 10000
>
> Then after some time where numbers scroll on both sides they get stuck
> at some n < 9999.
>
> Can somebody shed some light on this?
>
>
You are setting the rate limit at 64 kb/s and sending 10,000 messages, or
approximately 108 Mbit, which means the ODATA is clearly maxing out the
channel capacity. There is no reservation for repairs within the specified
data rate, as there is no such definition. The protocol assumes that the
rate is an absolute maximum and that your normal throughput is less.
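
To put the two figures side by side, here is a back-of-the-envelope
calculation. It assumes the limit is 64 kilobits per second and that the
prefixes are decimal (k = 1,000, M = 1,000,000); the function and constant
names are made up for illustration only.

```cpp
// Seconds needed to push total_bits through a channel capped at
// rate_bits_per_sec, ignoring repairs and protocol overhead.
constexpr double seconds_to_send(double total_bits, double rate_bits_per_sec) {
    return total_bits / rate_bits_per_sec;
}

// Figures from this thread: roughly 108 Mbit of ODATA against a 64 kb/s limit.
constexpr double kOdataSeconds = seconds_to_send(108e6, 64e3);  // 1687.5
```

That is about 28 minutes of ODATA alone at the configured rate, so any
repair traffic has to compete with original data for the same budget.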
This has been raised before and is on the roadmap for subsequent
investigation, although mainly in terms of limiting repair bandwidth and the
repairs sent to an individual receiver TSI, to prevent failing clients from
bringing down a source. Nothing has been decided about ODATA, and it is not
covered by the protocol RFC.
What you can do is investigate setting a high rate limit and implementing an
application-layer, coarse-grained throttle.
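
One possible shape for such a throttle is a token bucket that the sender
consults before each publish. This is only a sketch: the Throttle class and
its parameters are hypothetical, it is not part of 0MQ or OpenPGM, and the
budget you give it would have to sit below the configured PGM rate limit to
leave headroom for repairs.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <thread>

// Hypothetical application-layer throttle (token bucket), not a 0MQ API.
class Throttle {
public:
    using Clock = std::chrono::steady_clock;

    // bits_per_sec: budget for original data.
    // burst_bits: most that may go out back to back; it must be at least
    // the size of the largest message, or acquire() will never return.
    Throttle(double bits_per_sec, double burst_bits,
             Clock::time_point start = Clock::now())
        : rate_(bits_per_sec), burst_(burst_bits), tokens_(burst_bits),
          last_(start) {}

    // Non-blocking: take budget for the message if it is available now.
    bool try_acquire(std::size_t msg_bytes,
                     Clock::time_point now = Clock::now()) {
        refill(now);
        const double need = 8.0 * static_cast<double>(msg_bytes);
        if (need > tokens_) return false;
        tokens_ -= need;
        return true;
    }

    // Blocking: poll in coarse 1 ms steps until the message fits.
    void acquire(std::size_t msg_bytes) {
        while (!try_acquire(msg_bytes)) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

private:
    // Credit the bucket for the time elapsed since the last call.
    void refill(Clock::time_point now) {
        if (now <= last_) return;
        const std::chrono::duration<double> elapsed = now - last_;
        last_ = now;
        tokens_ = std::min(burst_, tokens_ + rate_ * elapsed.count());
    }

    double rate_;
    double burst_;
    double tokens_;
    Clock::time_point last_;
};
```

In a sender like the one in this thread, the idea would be to call
acquire(message_size) immediately before each publish, so the application
paces itself instead of relying on the transport's rate limit alone.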
There is PGMCC, congestion control for PGM, which implements the
functionality that prevents TCP from imploding with the same issue. However,
after testing I found it doesn't work above 10,000 pps and requires some
extra modifications to the 0MQ integration to permit blocking and unblocking
of transmission channels.
Fixing PGMCC is probably a good student project and requires going back to
the ns-3 network simulator and testing the protocol configuration at higher
rates than previous thesis authors have done.
--
Steve-o