[zeromq-dev] Migrating IPC-based Pub/Sub System To ZMQ

Pieter Hintjens ph at imatix.com
Sat Aug 14 13:32:07 CEST 2010

On Fri, Aug 13, 2010 at 5:57 PM, Santy, Michael
<Michael.Santy at dynetics.com> wrote:

> I'm currently working to refactor an existing pub/sub distributed system
> built around CMU IPC[1] and raw sockets to solely use ZMQ.  I've not been
> able to fit it into any one of the common patterns, so I'd like to get some
> feedback from the list on how to improve my proposed design.

It's important to tell us what your actual use case is. It seems to be this:

> Our existing system...
> collects high-speed data from a number of sources, combines the data
> centrally into processing packets, and farms out these processing packets
> (mostly) round-robin to one of many data processors.

So you have:

* A set of sources producing high volume data streams
* A set of combiners that turn these streams into workloads
* A set of workers that turn these workloads into results
* Control messages that flow between sources, combiners, and workers.

There are key things you've not explained:

* How many sources are there, what are the data rates, and message sizes?
* How many workers are there, what are the workload rates, and message sizes?
* Are the workloads idempotent, i.e. can they be run more than once
without risk?
* What is the requirement for reliability, in terms of how much work
can be lost?

However, making some assumptions, the design you proposed seems the right one:

* Sources each publish their data on a PUB socket.
* A single combiner subscribes to all sources and offers a PUSH socket
for workers.
* Workers connect to the combiner via a PULL socket.
* Sources, combiners, and workers all connect to a forwarder device.

Any node can publish a control message to the forwarder, and all nodes
will receive it, a simple multicast model.

If the combiner crashes, work in progress will be lost.  Workers will
automatically reconnect when the combiner restarts.  Sources will
queue and then eventually discard data.

As a first step you should prototype this in Python or similar, that
will take a few days.  It's actually a nice scenario and I might use
it in the Guide as a worked example.

There is probably a way to run multiple combiners in a redundant array
but it gets complex and I'd aim rather for making all nodes more
robust, and recovering quickly from any failure.

-Pieter Hintjens

More information about the zeromq-dev mailing list