[zeromq-dev] ZMQ_MCAST_LOOP with PGM

Pierre Ynard linkfanel at yahoo.fr
Mon Aug 6 14:19:53 CEST 2012



> It's been removed because of discussions like this

This is now a discussion about the underlying issues of the PGM
transport, not limited to ZMQ_MCAST_LOOP. Removing the option that made
the issue most visible so that people don't complain about it is just
sweeping the problem under the rug.

> 0MQ makes a host broker trivial to implement: the method other
> middleware systems resolve this issue: TIBCO's Rendezvous Daemon for
> example.

ZeroMQ is advertised as zero broker, zero configuration, N-to-N
communication... but surprisingly enough at some point I always stumble
upon "easy, just use a broker!" I don't want a broker, I don't want more
complexity in my system and I don't want to code and monitor an extra
daemon.

> > But I'm worried by the rest too because regardless of receivers and
> > loopback, I plan on running several senders in parallel on the same
> > host.
>
> As long as they are on different ports it is fine, it's an OS
> multiplexing issue with a single port.

I want to run several instances of the same process in parallel for
redundancy and scalability. They do the same job and send data to the
same destination, hence the same port. That's also why I'm not
interested in adding a new SPOF downstream of them.

> Look back at the protocol, the options to support multiple senders on
> a host are a bit limited:
>
> 1)  Only use multicast NAKs.
> 2)  Include IP source port in the protocol.
> 3)  Redefine UDP encapsulation for senders to bind to both
> data-destination and data-source ports: thus each application on the
> host has a unique data-source port to bind to for NAK receipt.

Agreed. It looks possible, if not easy.
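
To illustrate the idea behind option 3, here is a minimal sketch in
plain UDP over loopback. It is not PGM and not anything ZeroMQ or
OpenPGM does today; the function names and the "NAK" payload are made
up for the example. It only shows that when each sender binds its own
source port, a receiver can tell the senders apart and answer each one
individually, even though they all send to the same destination port.

```python
import socket

LOOPBACK = "127.0.0.1"


def make_receiver():
    """Receiver bound to the single, shared data-destination port."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind((LOOPBACK, 0))  # port 0: let the OS pick a free port for the demo
    rx.settimeout(2.0)
    return rx


def make_sender():
    """Sender bound to its own data-source port, used for NAK receipt."""
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx.bind((LOOPBACK, 0))  # each sender gets a distinct source port
    tx.settimeout(2.0)
    return tx


def demo():
    rx = make_receiver()
    senders = [make_sender(), make_sender()]
    dest = rx.getsockname()
    try:
        # Both senders target the same destination port.
        for i, tx in enumerate(senders):
            tx.sendto(b"data from sender %d" % i, dest)

        # The receiver sees one source (address, port) pair per sender and
        # replies to exactly that pair (a stand-in for a unicast NAK).
        sources = []
        for _ in senders:
            _, src = rx.recvfrom(2048)
            sources.append(src)
            rx.sendto(b"NAK", src)

        # Each sender reads the reply addressed to its own source port.
        replies = [tx.recvfrom(2048)[0] for tx in senders]
        return sources, replies
    finally:
        rx.close()
        for tx in senders:
            tx.close()


if __name__ == "__main__":
    print(demo())
```

The real work would of course be in carrying that source port through
the PGM UDP encapsulation, which this sketch does not attempt.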

> #3 belies the important fact that UDP loopback is terrible and easily
> suffers 50%+ packet loss on Linux as normal people use TCP or Unix
> sockets. This is an outstanding issue with the Linux kernel and no one
> is bothered enough to fix it.

There is no real reason why UDP would be worse over loopback. The only
cause I can see for loss there is receive buffer overrun. That is bound
to happen if you don't size the buffer according to the amount of
traffic you send, or if you send data faster than you read it, etc. Of
course there is no network constraint on loopback, so you're more likely
to hit these buffer limits first, but that doesn't actually make it
worse.
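
For what it's worth, the buffer effect is easy to poke at with a small
sketch like the one below. The function name and parameters are mine,
chosen for the example. It sends a burst to a loopback UDP socket
before reading anything, then counts what is still there. The kernel
may round or cap the SO_RCVBUF value you ask for, so treat the numbers
you get as system-dependent.

```python
import socket


def loopback_burst(count, payload_size, rcvbuf):
    """Send `count` datagrams over loopback, then read; return how many arrived."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Request a receive buffer size; the OS may adjust the effective value.
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    rx.bind(("127.0.0.1", 0))
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        payload = b"x" * payload_size
        dest = rx.getsockname()
        # Send the whole burst without reading in between.
        for _ in range(count):
            tx.sendto(payload, dest)

        # Only now drain the socket: datagrams that did not fit in the
        # receive buffer were dropped and will never show up here.
        rx.settimeout(0.2)
        received = 0
        try:
            while True:
                rx.recv(payload_size)
                received += 1
        except socket.timeout:
            pass
        return received
    finally:
        tx.close()
        rx.close()


if __name__ == "__main__":
    for size in (4096, 1 << 20):
        got = loopback_burst(count=2000, payload_size=1024, rcvbuf=size)
        print("rcvbuf=%d: received %d of 2000" % (size, got))
```

Comparing a small and a large buffer for the same burst should show
whether the loss tracks the buffer size rather than loopback as such.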

It's not that the normal, reliable protocols were important enough to
be made reliable; UDP simply stands out as unreliable by design, so
yes, users are left to deal with this "issue". Besides, many local DNS
resolvers are reached over loopback UDP, and - correct me if I'm
missing something - I don't think they're considered terrible.

The fact that more reliable options are available on the local host
doesn't make UDP loopback bad either. Unless we're talking about ZeroMQ
seamlessly emulating a faulty local UDP channel with a better transport
in parallel behind the scenes, then yeah, maybe.

-- 
Pierre Ynard
"Une âme dans un corps, c'est comme un dessin sur une feuille de papier."


