[zeromq-dev] Publish / Subscribe vs Multicast

gonzalo diethelm gdiethelm at dcv.cl
Fri Feb 12 21:16:18 CET 2010

Hi Martin,

> This is a classic example of multi-hop request/reply scenario.
> Supporting it is on the roadmap, part of the functionality is already
> implemented, resources for implementing the rest are still missing :(

As I said, I intend this to be more of a pipeline scenario, not so much
request/reply. Please see comments below.

> >    1. What would be the practical differences between using a PubSub
> >       approach and using Multicast to pass the requests from the
> >       distributor to workers?
> In this scenario each message is passed to a single worker so using
> multicast would be an overkill.

As I see it, there are two alternatives:

1. The distributor has the logic to decide which worker should get a
specific message, and sends it directly to it. This means: the
distributor knows how many workers there are (in order to evenly
distribute messages); it knows what workers are not responding; it is
made aware of new workers. I agree that in this scenario Multicast is not
needed, nor is PubSub.

2. The distributor sends every message to all workers, and they decide
whether to ignore it or if they are the ones to process it. This means:
the distributor doesn't need any special knowledge about the workers; it
should use PubSub or Multicast; the workers need logic to determine
whether to process or ignore the message, and this logic would most
likely require knowing about all the other workers and their state.
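
To make alternative 2 concrete, here is a minimal sketch in Python of one
way each worker could decide on its own. It assumes every worker knows its
own index and the total number of workers; the names owner_index and
should_process are made up for illustration, and hashing the message id is
just one possible rule, not something 0MQ provides.

```python
import zlib


def owner_index(message_id: str, worker_count: int) -> int:
    """Map a message id to exactly one worker index in [0, worker_count)."""
    # crc32 is stable across processes, unlike Python's built-in hash().
    return zlib.crc32(message_id.encode("utf-8")) % worker_count


def should_process(message_id: str, my_index: int, worker_count: int) -> bool:
    """Return True only on the worker that owns this message."""
    return owner_index(message_id, worker_count) == my_index


# Every worker sees every message; exactly one of them keeps each message.
messages = [f"req-{n}" for n in range(10)]
worker_count = 3
claimed = {
    w: [m for m in messages if should_process(m, w, worker_count)]
    for w in range(worker_count)
}
```

The weakness shows in the signature: as soon as worker_count changes, the
ownership of messages changes too, so all workers would have to agree on
the pool size at all times.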

What do you think?

> >    2. By going with PubSub or Multicast, all the workers will receive
> >       all task requests and will have to decide whether they are the
> >       worker which should process it. What are practical ways of making
> >       this decision? It looks like this approach requires the workers to
> >       know in advance the total number of workers in the pool,
> As noted above, there's little point in distributing the request to
> the workers (unless you are aiming for hot-hot failover) thus TCP
> transport should be used.

Maybe I should have described my ideal scenario first. I would love to
be able to add workers at will, or even kill them at will, without
having to restart the distributor or any current workers. This is not
strictly hot-hot failover, but you might call it "dynamic load
distribution". That is why I think sending the messages to all workers
might make sense.

> >    3. How to handle crashed workers? How about workers that are not
> >       responding? What if I want to add workers?
> The only 100% reliable algorithm is end-to-end reliability, meaning
> the sending application tags each request with a unique tag and waits
> for a reply with the same tag. In the meantime it drops all
> non-matching replies. If the reply is not delivered within a specified
> time, the request is resent.
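
For reference, the tag-and-resend scheme quoted above could be sketched
roughly as follows in Python. This is only my reading of it, not 0MQ code:
the send and receive_until_timeout callables are hypothetical stand-ins for
whatever transport is used, and the retry limit is an assumption.

```python
import itertools
from typing import Callable, Iterable, Optional, Tuple

Reply = Tuple[str, str]  # (tag, payload)

_tag_counter = itertools.count(1)


def request_with_retry(
    payload: str,
    send: Callable[[str, str], None],
    receive_until_timeout: Callable[[], Iterable[Reply]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Send a tagged request; return the first reply carrying the same tag."""
    tag = f"tag-{next(_tag_counter)}"
    for _ in range(max_attempts):
        send(tag, payload)
        for reply_tag, reply_payload in receive_until_timeout():
            if reply_tag == tag:
                return reply_payload
            # Any reply with a different tag is dropped.
        # No matching reply arrived in time: loop around and resend.
    return None


# Fake transport for demonstration: the first attempt gets no matching reply.
sent = []
inbox_per_attempt = [
    [("tag-0", "stale")],
    [("tag-0", "stale"), ("tag-1", "done")],
]


def fake_send(tag: str, payload: str) -> None:
    sent.append((tag, payload))


def fake_receive() -> Iterable[Reply]:
    return inbox_per_attempt[len(sent) - 1]


result = request_with_retry("work", fake_send, fake_receive)
```

In this run the request is sent twice with the same tag, and the stale
reply is ignored both times.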

In my pipeline, I could also do this:

1. The distributor receives a request for work.
2. It creates a unique tag for it and sends it, together with current
time, to the final program in the pipeline (let's call it the checker).
3. It then passes the request to the first stage in the pipeline (one of
N workers for this stage).
4. The request moves from stage to stage, where stage i has Ni workers.
5. The final stage passes a notification to the checker including the
request id and current time.

This is a strict pipeline with no request-reply (so I can fire and
forget), yet I can still know when requests finish, how long they take,
and even take measures for requests that take too long.
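
The checker's bookkeeping could look something like this minimal Python
sketch. The Checker class, its method names and the timeout value are all
assumptions for illustration; it also assumes the distributor and the final
stage have comparable clocks, and that the start notice reaches the checker
before the finish notice.

```python
from typing import Dict, List, Optional


class Checker:
    """Tracks in-flight requests by tag; all names here are illustrative."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self.started: Dict[str, float] = {}    # tag -> start time
        self.durations: Dict[str, float] = {}  # tag -> seconds in pipeline

    def on_start(self, tag: str, start_time: float) -> None:
        """Step 2: the distributor announces a new request."""
        self.started[tag] = start_time

    def on_finish(self, tag: str, end_time: float) -> Optional[float]:
        """Step 5: the final stage reports completion; returns the duration."""
        start_time = self.started.pop(tag, None)
        if start_time is None:
            return None  # unknown tag, or already reported as finished
        self.durations[tag] = end_time - start_time
        return self.durations[tag]

    def overdue(self, now: float) -> List[str]:
        """Tags that have been in flight for longer than the timeout."""
        return sorted(
            tag
            for tag, start_time in self.started.items()
            if now - start_time > self.timeout_seconds
        )


checker = Checker(timeout_seconds=5.0)
checker.on_start("req-1", start_time=100.0)
checker.on_start("req-2", start_time=101.0)
elapsed = checker.on_finish("req-1", end_time=102.5)
late = checker.overdue(now=107.0)
```

Here req-1 finishes after 2.5 seconds, while req-2 is still in flight
after 6 seconds and shows up as overdue.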

> >    4. Maybe I should have the distributor handle the load,
> >       not using PubSub or Multicast, but choosing a specific worker and
> >       sending the task request directly to it. Same questions apply,
> >       right?
> This scenario can be implemented even now. However, the requester would
> need to have the addresses of all the workers so that it is able to
> connect to them. Probably not what you want.


> In case you would like to give a hand with the implementation, let us
> know.

Would love to. I still am at the beginning stages of wrapping my head
around 0mq, so please give me some time.

> Martin

Gonzalo Diethelm
