[zeromq-dev] Migrating IPC-based Pub/Sub System To ZMQ
Pieter Hintjens
ph at imatix.com
Sat Aug 14 13:32:07 CEST 2010
On Fri, Aug 13, 2010 at 5:57 PM, Santy, Michael
<Michael.Santy at dynetics.com> wrote:
> I'm currently working to refactor an existing pub/sub distributed system
> built around CMU IPC[1] and raw sockets to solely use ZMQ. I've not been
> able to fit it into any one of the common patterns, so I'd like to get some
> feedback from the list on how to improve my proposed design.
It's important to tell us what your actual use case is. It seems to be this:
> Our existing system...
> collects high-speed data from a number of sources, combines the data
> centrally into processing packets, and farms out these processing packets
> (mostly) round-robin to one of many data processors.
So you have:
* A set of sources producing high volume data streams
* A set of combiners that turn these streams into workloads
* A set of workers that turn these workloads into results
* Control messages that flow between sources, combiners, and workers.
There are key things you've not explained:
* How many sources are there, what are the data rates, and message sizes?
* How many workers are there, what are the workload rates, and message sizes?
* Are the workloads idempotent, i.e. can they be run more than once
without risk?
* What is the requirement for reliability, in terms of how much work
can be lost?
However, making some assumptions, the design you proposed seems the right one:
* Sources each publish their data on a PUB socket.
* A single combiner subscribes to all sources and offers a PUSH socket
for workers.
* Workers connect to the combiner via a PULL socket.
* Sources, combiners, and workers all connect to a forwarder device.
Any node can publish a control message to the forwarder, and all nodes
will receive it: a simple multicast model.
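To make the data path concrete, here is a minimal single-process sketch using pyzmq, under several assumptions: one source, the inproc transport, and a "combiner" that only prefixes each message. The endpoint names, the `packet:` prefix, the `READY` handshake, and the `run_pipeline` helper are all hypothetical and only there for illustration. A real system would run each role in its own process over tcp.

```python
import zmq

def run_pipeline(payloads, n_workers=2):
    """Send payloads source -> combiner -> workers; return a list per worker."""
    ctx = zmq.Context()

    source = ctx.socket(zmq.PUB)              # a source publishes raw data
    source.bind("inproc://source-0")          # hypothetical endpoint name

    sub = ctx.socket(zmq.SUB)                 # combiner input: subscribe to all
    sub.setsockopt(zmq.SUBSCRIBE, b"")
    sub.connect("inproc://source-0")

    push = ctx.socket(zmq.PUSH)               # combiner output: work for workers
    push.bind("inproc://work")                # hypothetical endpoint name

    workers = [ctx.socket(zmq.PULL) for _ in range(n_workers)]
    for w in workers:
        w.connect("inproc://work")

    # A PUB socket drops messages until the subscription has arrived,
    # so repeat a handshake message until the combiner sees one.
    for _ in range(100):
        source.send(b"READY")
        if sub.poll(50):
            break
    else:
        raise RuntimeError("combiner never saw the source")

    for p in payloads:
        source.send(p)

    # Combiner loop: skip handshakes, wrap each payload, hand it to a worker.
    forwarded = 0
    while forwarded < len(payloads) and sub.poll(1000):
        msg = sub.recv()
        if msg == b"READY":
            continue
        push.send(b"packet:" + msg)
        forwarded += 1

    # Collect whatever each worker pulled.
    poller = zmq.Poller()
    for w in workers:
        poller.register(w, zmq.POLLIN)
    received = [[] for _ in workers]
    remaining = forwarded
    while remaining:
        events = dict(poller.poll(1000))
        if not events:
            break
        for i, w in enumerate(workers):
            if w in events:
                received[i].append(w.recv())
                remaining -= 1

    for s in [source, sub, push] + workers:
        s.close(linger=0)
    ctx.term()
    return received
```

Calling `run_pipeline([b"a", b"b", b"c", b"d"])` returns one list per worker; the PUSH socket decides which worker gets which packet, so the sketch makes no promise about the exact split.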
If the combiner crashes, work in progress will be lost. Workers will
automatically reconnect when the combiner restarts. Sources will
queue and then eventually discard data.
As a first step you should prototype this in Python or similar; that
will take a few days. It's actually a nice scenario and I might use
it in the Guide as a worked example.
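For the control path, such a prototype could start from something like the following pyzmq sketch. It assumes a hand-written forwarder (a SUB socket relaying into a PUB socket) running in the same process as three nodes over inproc. The endpoint names, the `HELLO` handshake, and the `broadcast_control` helper are hypothetical. pyzmq also provides `zmq.proxy` for running such a relay in its own thread, which this sketch does not use.

```python
import zmq

def broadcast_control(message, n_nodes=3):
    """Publish one control message from node 0; return what each node received."""
    ctx = zmq.Context()

    # Forwarder: everything arriving on the frontend is republished on the backend.
    frontend = ctx.socket(zmq.SUB)
    frontend.setsockopt(zmq.SUBSCRIBE, b"")
    frontend.bind("inproc://control-in")      # hypothetical endpoint name
    backend = ctx.socket(zmq.PUB)
    backend.bind("inproc://control-out")      # hypothetical endpoint name

    # Each node (source, combiner, or worker) gets a PUB to talk and a SUB to listen.
    nodes = []
    for _ in range(n_nodes):
        pub = ctx.socket(zmq.PUB)
        pub.connect("inproc://control-in")
        sub = ctx.socket(zmq.SUB)
        sub.setsockopt(zmq.SUBSCRIBE, b"")
        sub.connect("inproc://control-out")
        nodes.append((pub, sub))

    def pump():
        # Relay every message currently waiting on the frontend.
        while frontend.poll(50):
            backend.send(frontend.recv())

    def recv_control(sub):
        # Return the next non-handshake message, or None after a timeout.
        while sub.poll(1000):
            msg = sub.recv()
            if msg != b"HELLO":
                return msg
        return None

    # PUB sockets drop messages until subscriptions arrive, so repeat a
    # handshake until every node has heard node 0 through the forwarder.
    sender = nodes[0][0]
    joined = set()
    for _ in range(100):
        sender.send(b"HELLO")
        pump()
        for i, (_, sub) in enumerate(nodes):
            if sub.poll(0):
                sub.recv()
                joined.add(i)
        if len(joined) == n_nodes:
            break
    else:
        raise RuntimeError("not every node joined the forwarder")

    sender.send(message)
    pump()
    received = [recv_control(sub) for _, sub in nodes]

    for sock in [frontend, backend] + [s for pair in nodes for s in pair]:
        sock.close(linger=0)
    ctx.term()
    return received
```

Note that the sending node also receives its own control message, since it subscribes to the forwarder like everyone else.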
There is probably a way to run multiple combiners in a redundant array,
but it gets complex, and I'd rather aim at making all nodes more
robust and recovering quickly from any failure.
-Pieter Hintjens
iMatix