[zeromq-dev] thoughts on pub-sub messaging and "reliability"

Justin Karneges justin at affinix.com
Fri Oct 17 21:39:53 CEST 2014


Hi Holger,

You've got a good list here.

If you're looking for opinions, I'll say that I'm fond of the first
approach (0), which is to treat pubsub as a best-effort transport. This
is also in spirit with what is discussed in the ZeroMQ guide:
http://zguide.zeromq.org/page:all#Pros-and-Cons-of-Pub-Sub

I mostly work on client/server web applications, and I want reliability
just like anyone else. However, 100% guaranteed delivery in the pubsub
layer is not necessary to achieve this. Not only that, guaranteed
delivery is quite difficult to get right and comes with all sorts of
implications. Instead, I create a reliable request/response mechanism
for synchronization (of application data, not "messages") and combine it
with best-effort pubsub messaging.

On Fri, Oct 17, 2014, at 02:38 AM, Holger Joukl wrote:
> 
> Hi,
> 
> I realize I'm sometimes confused by terms used by myself and others.
> 
> So, in an attempt to clarify my thoughts, first and foremost to myself,
> by
> writing them down here's how I tend to think about pub-sub messaging:
> 
> (0) Publish-subscribe (pub-sub) messaging:
> - one or many sender(s) publish data on one or more "channels"
> - one or many listener(s) subscribe to one or more channels
> - communication is asynchronous
> - listeners might not receive all messages even if up-and-running when
> the messages are being sent (due to network glitches or whatever)
> 
> (1) Reliable pub-sub messaging:
> - one or many sender(s) publish data on one or more "channels"
> - one or many listener(s) subscribe to one or more channels
> - communication is asynchronous
> - as long as a listener is up-and-running when the messages are being
> sent
> it will
> receive all messages from all senders publishing on the channels they've
> subscribed to,
> in order (order per sender)
> 
> (2) Guaranteed/Certified pub-sub messaging:
> - one or many sender(s) publish data on one or more "channels"
> - one or many listener(s) subscribe to one or more channels
> - communication is asynchronous
> - as long as a listener is up-and-running when the messages are being
> sent
> it will
> receive all messages from all senders publishing on the channels they've
> subscribed to,
> in order (order per sender)
> - even if a listener is *not* up-and-running when messages are being sent
> a
> listener
> will still get all the messages, i.e. the missed messages will get
> re-delivered,
> in order (order per sender); the listener will not receive new messages
> until it
> has received all missed message
> - as a consequence, messages need to get persisted:
>   - either each listeners' subscriptions need to get registered when
> opening a
> subscription, or by predefined configuration, so persisted messages can
> be
> safely
> deleted from persistent storage when all registered listeners have
> received
> them
> an explicitly acknowledged that fact
>   - or all sent messages need to get persisted for a period of time so a
> listener
> can request missed messages for retransmission
> 
> Sometimes the "reliability" distinction is called different "qualities of
> service"
> (QoS).
> 
> Note that I've deliberately ignored any problems that might arise in
> pub-sub
> communications, e.g. slow consumers in (unreliable) high frequency
> scenarios
> or the like.
> 
> Pub-sub messaging can be implemented over a variety of transports and
> protocols:
> (UDP) broadcast, multicast, TCP, ...
> The transport + protocol used determines the properties of
> pub-sub-messaging, e.g.:
> - plain UDP broadcast is unreliable
> - PGM or NORM multicast is reliable
> - a protocol on top of TCP/UDP/reliable multicast is ususally necessary
> for
> guaranteed/certified messaging
> 
> Design approaches you encounter in the wild:
> 
> - central broker with queues, optionally persistent (e.g. AMQP brokers,
> JMS
> providers, IBM MQ, ...)
>   - senders do not know anything about listeners
>   - listeners do not know anything about senders
>   - the broker is "the rendezvous point" for communication, often called
> "message bus"
>   - senders connect to the broker (usually TCP)
>   - listeners connect to the broker (usually TCP)
>   - channels are usually called "topics" (basically a special case of
>   queue
> that allows
>     for many listeners to receive the topic messages)
>   - broker knows registered guaranteed/certified listeners: When all
>   known
> listeners
>     have retrieved and acknowledged a certain message on a topic this
> message will get
>     deleted from the queue
>   - queue persistence to make queued messages survive broker failure
>   - broker might be distributed, i.e. multiple broker working in
> cooperation, e.g. for reasons
>     of throughput scaling, partitioning of data channels, replication
>   - broker is a single point of failure so will normally get clustered
>   and
> replicated for
>     some notion of cold/hot standby for mission critical communication
> 
> - central broker with a persistent commit log or journal (e.g. Apache
> Kafka, ZPER, ...)
>   - senders do not know anything about listeners
>   - listeners do not know anything about senders
>   - the broker is "the rendezvous point" for communication
>   - senders connect to the broker (usually TCP)
>   - listeners connect to the broker (usually TCP)
>   - messages sent on the channels are persisted by the broker as
>   sequenced
> continous message
>     "commit logs" for a configurable period of time
>   - listeners can get retransmission of any historic message within the
> configured period of
>     time
>   - listeners are responsible for their state, i.e. which messages they
> have already
>     processed
>   - listener state is basically just a pointer to its current position in
> the commit log
>    - broker might be distributed, i.e. multiple broker working in
> cooperation, e.g. for reasons
>     of throughput scaling, partitioning of data channels, replication
>   - broker is a single point of failure so will normally get clustered
>   and
> replicated for
>     some notion of cold/hot standby for mission critical communication
> 
> - distributed message bus (e.g. TIB/Rendezvous, http://iris.karalabe.com/
> , ...)
>   - consists of a network of "micro brokers", sometimes called daemons or
> agent or relay process,
>     ususally one per host
>   - senders connect to a (local) broker (usually IPC or TCP)
>   - listeners connect to a (local) broker (usually IPC or TCP)
>   - the "network of brokers" is "the rendezvous point" for communication,
> i.e. the brokers
>     collectively form a "message bus"
>   - senders do not know anything about listeners in case of reliable
> pub-sub
>   - listeners do not know anything about senders in case of reliable
> pub-sub
>   - senders know about registered listeners in case of
>   guaranteed/certified
> pub-sub
>   - message persistence is local to the sending clients (called "ledger"
> file e.g. in TIB/Rv
>     terms); might also be possible local to the local broker but I
>     haven't
> seen such an approach
>     yet
>   - registered listeners need to acknowledge messages with their senders
>   so
> they can remove them
>     from persistent storage
>   - brokers are not single points of failure
>   - brokers do not need separate clustering/replication apart from what
> needs to be done
>     for the mission critical applications running on their host systems
> anyway; they are
>     basically guarded by the same safety measures taken for the host
> system, ranging from none
>     to whatever
> 
> - somewhat hybrid forms, e.g. brokers that scale out elastically like
> http://roq-messaging.org/
> 
> Hope no one's annoyed by the
> lengthy-not-really-a-question-nor-problem-description
> kind of post. Glad to get anyone's thoughts or totally different views or
> hints on the
> glaringly obvious that I missed.
> 
> Best regards
> Holger
> 
> Landesbank Baden-Wuerttemberg
> Anstalt des oeffentlichen Rechts
> Hauptsitze: Stuttgart, Karlsruhe, Mannheim, Mainz
> HRA 12704
> Amtsgericht Stuttgart
> 
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev



More information about the zeromq-dev mailing list