[zeromq-dev] zeromq-dev Digest, Vol 34, Issue 124

Pieter Hintjens ph at imatix.com
Wed Oct 27 17:26:53 CEST 2010


On Wed, Oct 27, 2010 at 12:12 PM, Martin Sustrik <sustrik at 250bpm.com> wrote:

> If there's an intermediary node C in between A and B, B has no way to
> find out that the connection between A and C has failed.

What you are saying is that 0MQ cannot create a connection from A to B
but rather creates two connections, A to C and C to B.  So A could see
that C is present, while C finds that B is dead.

> That in turn breaks the scalability. The scenario goes like this: You
> start with a simple distributed application where A speaks to B. Later
> on your business grows and you want to scale your application up. To do
> so you add intermediary node C. At that point the original code that was
> using connect/disconnect notifications stops working :(

0MQ cannot do this, agreed.  It cannot create connections across
devices.  However you're assuming that A *needs* to know about B's
presence.  In the cases we've seen, A only cares about C.

Typical example is reliable workload distribution to workers.  A gives
to C, a queue.  C finds an available B, gives it the work.  If B dies,
C chooses another B.  A *does not* want to know.  Only if C dies, or if
no B replies within some time, does A consider C to be dead, and find
another C to talk to.
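
As a rough sketch of A's side of this, in Python and under stated
assumptions: `send` is a hypothetical transport hook (not a 0MQ API)
that returns the reply, or None when nothing arrives within the
timeout.  The endpoint strings, timeout and retry count are
illustrative only.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical transport hook: send `request` to one queue (C) and return the
# reply, or None if nothing came back within `timeout` seconds.
Send = Callable[[str, bytes, float], Optional[bytes]]


def request_with_failover(
    queues: Sequence[str],
    request: bytes,
    send: Send,
    timeout: float = 2.5,
    retries: int = 2,
) -> Tuple[str, bytes]:
    """Try each queue in turn; return (endpoint, reply) from the first that answers."""
    for endpoint in queues:
        for _ in range(retries):
            reply = send(endpoint, request, timeout)
            if reply is not None:
                return endpoint, reply
        # A never learns which B failed; it only sees that this C went silent,
        # so it moves on to the next C in its list.
    raise RuntimeError("no queue replied")
```

Note that A holds a list of queues, not a list of workers: the only
presence it tracks is C's.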

You are thinking of devices as dumb switches, and this works for
pubsub and perhaps pipeline, but for request-reply devices are not
dumb, they have state and in every case I've seen, do routing.  That
means that A and B are by definition abstracted from each other.
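
To make "devices have state and do routing" concrete, here is a toy
model in Python of what such a queue (C) might track.  It is a sketch
only, not how any 0MQ device is implemented; the class and method
names are hypothetical, and how C learns that a worker died is left
out.

```python
from collections import deque
from typing import Deque, Dict, Optional


class WorkQueue:
    """Toy model of the state a request-reply device (C) keeps about workers (B)."""

    def __init__(self) -> None:
        self.ready: Deque[str] = deque()       # idle workers, least recently used first
        self.in_flight: Dict[str, bytes] = {}  # worker id -> job it is holding

    def worker_ready(self, worker: str) -> None:
        """A worker announced itself or finished its job."""
        self.in_flight.pop(worker, None)
        if worker not in self.ready:
            self.ready.append(worker)

    def dispatch(self, job: bytes) -> Optional[str]:
        """Route a job to an idle worker; None means no B is available."""
        if not self.ready:
            return None
        worker = self.ready.popleft()
        self.in_flight[worker] = job
        return worker

    def worker_died(self, worker: str) -> Optional[str]:
        """Forget a dead worker and hand its job, if any, to another one.

        If no worker is idle the job is dropped here; a real queue would hold it.
        """
        if worker in self.ready:
            self.ready.remove(worker)
        job = self.in_flight.pop(worker, None)
        return self.dispatch(job) if job is not None else None
```

All of the bookkeeping about B lives in C; nothing in it is visible
to A.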

> Now, I suppose those that need connect/disconnect functionality need it
> in cases where 0MQ library acts as a simple client with a single
> connection to the rest of the system. (Please, correct me if I am wrong.)

Seems wrong.  Knowing when a peer has gone away seems basic to any
reliability (retry, retransmission) framework.

> 1. Implement a dumb sync client that happens to speak 0MQ wire-protocol.

The protocol's not documented in a usable manner and it would be
pathological to start writing custom stacks just to get heartbeating.

> 2. Create a connect/disconnect patch and distribute it out of the 0MQ
> mainline.

?

> 3. Try to solve the problem in a systematic manner. That, AFAIU, boils
> down to a presence service, as seen in XMPP. The problem I see with that
> approach is that 0MQ is meant to be able to handle a much larger number of
> potential peers than XMPP where you mostly have just a few buddies.

Disagree about the "larger number of potential peers", this seems
based on a misconception of how large networks actually interconnect.
Billions of nodes can interconnect (and do) with only dozens or
hundreds of peers for any given node.  By logical analysis the more
distributed the network, the more peers will have an 'average' number
of connections, which is never very high.

Disagree also (sorry! :-) that this requires a presence server.  It
requires (and I think this is an inevitable future functionality of
0MQ) configurable heartbeating in the TCP protocol and notification to
applications when peers connect and disconnect (as in, 0MQ considers
them to be disconnected, rather than lower-level TCP disconnect).
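
A minimal sketch of what such heartbeat-driven notification could
look like, in Python.  This is an illustration under assumptions, not
an existing or proposed 0MQ API: the class name, the `interval` and
`liveness` parameters and the event tuples are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple


class PeerMonitor:
    """Track peer liveness from heartbeats and record connect/disconnect events."""

    def __init__(
        self,
        interval: float = 1.0,
        liveness: int = 3,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # A peer silent for more than `liveness` heartbeat intervals is "gone".
        self.timeout = interval * liveness
        self.clock = clock
        self.last_seen: Dict[str, float] = {}
        self.events: List[Tuple[str, str]] = []

    def heartbeat(self, peer: str) -> None:
        """Call whenever any traffic (heartbeat or data) arrives from `peer`."""
        if peer not in self.last_seen:
            self.events.append(("connect", peer))
        self.last_seen[peer] = self.clock()

    def check(self) -> None:
        """Call periodically; expires peers that have been silent too long."""
        now = self.clock()
        for peer, seen in list(self.last_seen.items()):
            if now - seen > self.timeout:
                del self.last_seen[peer]
                self.events.append(("disconnect", peer))
```

The "disconnect" here is a logical judgement (no traffic for too
long), which can fire while the underlying TCP connection still looks
healthy.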

It's possible to do heartbeating / keep-alive over XREP/XREQ sockets.
It's not possible over REQ/REP or PUSH/PULL sockets.  Thus it cannot
be done only at the application level; 0MQ has to help.
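
The reason, roughly: a REQ socket enforces strict send/receive
alternation, so there is no slot for an unsolicited heartbeat, and
PUSH/PULL is one-way.  The toy model below (Python, hypothetical
class, not 0MQ code) shows the lockstep rule rejecting a heartbeat
sent while a request is outstanding.

```python
class ReqLockstep:
    """Toy model of the REQ socket rule: send, then recv, then send, ..."""

    def __init__(self) -> None:
        self.awaiting_reply = False

    def send(self, msg: bytes) -> None:
        if self.awaiting_reply:
            raise RuntimeError("cannot send: still waiting for a reply")
        self.awaiting_reply = True

    def recv(self, reply: bytes) -> bytes:
        """Model the arrival of `reply` from the peer."""
        if not self.awaiting_reply:
            raise RuntimeError("cannot recv: no request outstanding")
        self.awaiting_reply = False
        return reply


sock = ReqLockstep()
sock.send(b"request")
try:
    sock.send(b"heartbeat")  # rejected: a reply is still pending
    heartbeat_sent = True
except RuntimeError:
    heartbeat_sent = False
```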

Anyhow, this is stuff for the next 0MQ Developers Conference, and OT
for this thread, so I'll shut up :-)

-Pieter


