[zeromq-dev] Requirements for reliability

Pieter Hintjens ph at imatix.com
Tue Apr 6 11:05:39 CEST 2010


One of the main questions that AMQP users have when coming to 0MQ is
how we do reliability.

So I'm collecting requirements for reliability on top of 0MQ.  My idea
is to build this as an application layer on top of 0MQ, using a
different API and a separate protocol that sits on top of the 0MQ SPB

Here is a sketch of what seems to be the simplest design for
fire-and-forget reliability:

* It covers only one-to-one delivery of messages (e.g. for trade
execution or request-response).
* The publisher delivers a message to a local dispatcher service
(perhaps based on a 0MQ device).
* The dispatcher queues the message on persistent storage and the
publisher continues.
* The dispatcher in a second thread delivers messages to their destination.
* Recipients reply to messages with an acknowledgment.
* Recipients store received messages and if they get a duplicate, they
discard it but resend the acknowledgement.
* The dispatcher marks acknowledged messages, and ignores duplicate

In the most brutal implementation, both sides use a light database to
hold messages, and store everything.  There are more intelligent ways
to hold persistent queues but that's an implementation detail.

The above design covers crashes in the publisher, consumer, and
network afaics.  Throughput would depend on message size, since
messages can easily be batched and acknowledged in blocks.

Has anyone been implementing such reliability on top of 0MQ already?


More information about the zeromq-dev mailing list