[zeromq-dev] Efficient and reliable pub/sub mechanism

Marko Mikulicic marko.mikulicic at isti.cnr.it
Mon Jan 24 13:56:57 CET 2011

On 24 January 2011 01:06, Yusuf Simonson <simonson at gmail.com> wrote:

> Hi,
> I'm trying to build a pub/sub engine on top of jzmq for performing
> distributed computation in cloud environments. Systems have a relatively
> high chance of failure, so I want to ensure reliability.
> I was wondering if there was any advice on how to do this in an efficient
> and reliable manner. By efficient, I mean that I don't want to rely on a
> central broker for federating events because whatever system running the
> broker could easily get its network card saturated. By reliable, I mean that
> I want the engine to be resilient to individual system failure. So if
> process P publishes an event with a channel that process Q is subscribed to,
> and Q dies before it finishes processing the event, then P should persist
> the event. At some point in time, a process R will be created to take Q's
> place. It will notify P that it is the substitute, and P will send it the
> event.
> I could use zmq's PUB/SUB socket types but they seem to fail the
> reliability clause. From what I'm reading, it appears as if I could do
> many-to-many socket connections with them though, which means they would not
> fail the efficiency clause. Although unless the Publisher Side Messaging
> Filtering topic is out of date (http://www.zeromq.org/topics:new-topics),
> events are filtered at the subscriber side, which could be a bottleneck.

> I could also do REP/REQ, where each process is running a REP socket. When a
> process wants to publish, it queries a broker to see who is subscribed to
> the relevant channel. Then it connects to each of the processes' REP
> sockets, sends the event, and waits for an acknowledgement before moving on.
> This seems like a bad solution because afaics messages have to be processed
> sequentially, and it might take a while for a process to handle a message.

REP is sequential, it's true, but that is only a programming aid. If you
need asynchronous replies, use the underlying XREP socket type.
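As a rough illustration of the difference, here is a minimal sketch in Python with pyzmq (an assumption on my part; the thread is about jzmq, but the socket semantics are the same). In current ZeroMQ releases XREP is called ROUTER. The endpoint name and the function name are made up for the example. Two REQ clients send a request each, and the XREP side answers them in reverse order, which a plain REP socket would not allow.

```python
import zmq


def out_of_order_replies():
    """Two REQ clients send requests; one ROUTER (XREP) answers in reverse order."""
    ctx = zmq.Context()
    router = ctx.socket(zmq.ROUTER)  # ROUTER is the current name for XREP
    router.bind("inproc://async-rep")  # hypothetical endpoint name

    clients = []
    for name in (b"first", b"second"):
        req = ctx.socket(zmq.REQ)
        req.connect("inproc://async-rep")
        req.send(name)
        clients.append(req)

    # For each REQ message a ROUTER receives [identity, empty delimiter, body].
    pending = [router.recv_multipart() for _ in clients]

    # Unlike REP, nothing forces the replies to follow arrival order:
    # the identity frame decides which client each reply goes to.
    for identity, empty, body in reversed(pending):
        router.send_multipart([identity, empty, b"ack:" + body])

    replies = [req.recv() for req in clients]

    for sock in clients + [router]:
        sock.close()
    ctx.term()
    return replies
```

Each client still gets the reply addressed to it, regardless of the order in which the replies were sent.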

The nice thing about ZMQ is that it actually allows you to use it for all
concurrency needs. For example, if you have a CPU-bound multithreaded worker
that you want to consume as many incoming events as possible (and reply),
you can have multiple threads with REP sockets, each sequentially taking a
message, computing, and replying. Then all those threads can be connected to
an inproc:// queue device, which accepts requests on an XREP socket and
dispatches them to your worker threads through an XREQ socket; see image at:

So, from what I understood, REQ/REP are useful tools if you are going to
model your concurrency needs within the zmq framework. If you already base
your in-process concurrency on other solutions (actors, locking), perhaps
because of legacy code, then REQ/REP isn't for you, and you have to talk
directly to an XREP/XREQ socket.
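The worker-thread arrangement described above (REP threads behind an inproc:// queue device) could look roughly like the sketch below, again in Python with pyzmq as an assumption. In the stock queue device the client-facing socket is XREP (ROUTER in current naming) and the worker-facing socket is XREQ (DEALER). The endpoint names, the function names, and the upper-casing "computation" are placeholders, and zmq.proxy stands in for the older queue device call.

```python
import threading
import zmq


def worker(ctx, url):
    """REP worker: take one message, compute, reply, repeat."""
    sock = ctx.socket(zmq.REP)
    sock.connect(url)
    try:
        while True:
            msg = sock.recv()
            sock.send(msg.upper())  # placeholder for the CPU-bound work
    except zmq.ZMQError:
        pass  # ETERM: the context was terminated, time to exit
    finally:
        sock.close(linger=0)


def queue_device(frontend, backend):
    """Shuttle messages between clients and workers until the context ends."""
    try:
        zmq.proxy(frontend, backend)
    except zmq.ZMQError:
        pass  # zmq.proxy returns only when the context is terminated
    finally:
        frontend.close(linger=0)
        backend.close(linger=0)


def run_queue_demo(n_workers=3, requests=(b"a", b"b", b"c", b"d")):
    ctx = zmq.Context()
    frontend = ctx.socket(zmq.ROUTER)  # XREP side, faces the clients
    frontend.bind("inproc://clients")  # hypothetical endpoint names
    backend = ctx.socket(zmq.DEALER)   # XREQ side, faces the worker threads
    backend.bind("inproc://workers")

    threads = [threading.Thread(target=queue_device, args=(frontend, backend))]
    threads += [
        threading.Thread(target=worker, args=(ctx, "inproc://workers"))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()

    client = ctx.socket(zmq.REQ)
    client.connect("inproc://clients")
    replies = []
    for request in requests:
        client.send(request)
        replies.append(client.recv())
    client.close()

    ctx.term()  # unblocks the device and the workers with ETERM
    for t in threads:
        t.join()
    return replies
```

The DEALER side distributes requests across whichever workers are connected, so adding worker threads needs no change on the client side.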

Anyway, you can also send acknowledgments through a different channel, like
a PUSH socket back to the publisher. The advantage I see in going the
XREQ/XREP route is that with XREP you can reply to whatever your event
source was, even with multiple event publishers connected to your fabric,
whereas with a secondary push-back channel you have to handle that case
manually and open more sockets.
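To make the XREQ/XREP acknowledgment route concrete, here is a small sketch in Python with pyzmq (an assumption; it is not taken from the original poster's setup). Two publishers use XREQ (DEALER) sockets and keep each event in a pending table until it is acknowledged. The consumer uses one XREP (ROUTER) socket and sends every ack back to the publisher that the identity frame names. The event ids, the publisher names P1 and P2, and the in-memory dict used as the "persisted" store are all hypothetical.

```python
import zmq


def ack_demo():
    ctx = zmq.Context()
    consumer = ctx.socket(zmq.ROUTER)  # XREP: sees who sent each event
    consumer.bind("inproc://events")   # hypothetical endpoint name

    publishers = {}
    pending = {}  # per-publisher store of events not yet acknowledged
    for name in (b"P1", b"P2"):
        pub = ctx.socket(zmq.DEALER)        # XREQ
        pub.setsockopt(zmq.IDENTITY, name)  # must be set before connect
        pub.connect("inproc://events")
        publishers[name] = pub
        pending[name] = {}

    # Each publisher records the event before sending it.
    for name, pub in publishers.items():
        event_id, payload = b"evt-1", b"hello from " + name
        pending[name][event_id] = payload
        pub.send_multipart([event_id, payload])

    # The consumer handles events in whatever order they arrive and acks
    # each one to its own source through the same socket.
    sources = []
    for _ in publishers:
        identity, event_id, payload = consumer.recv_multipart()
        sources.append(identity)
        consumer.send_multipart([identity, event_id])

    # A publisher forgets an event only once its ack has come back.
    for name, pub in publishers.items():
        (acked_id,) = pub.recv_multipart()
        pending[name].pop(acked_id)

    for sock in list(publishers.values()) + [consumer]:
        sock.close()
    ctx.term()
    return sorted(sources), pending
```

If the consumer died before acking, the event would simply stay in the publisher's pending table, ready to be resent to a replacement process. With a separate PUSH channel the consumer would instead need one outgoing socket per publisher and its own bookkeeping of which ack goes where.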
