[zeromq-dev] thoughts on pub-sub messaging and "reliability"

Holger Joukl Holger.Joukl at LBBW.de
Mon Oct 20 09:21:05 CEST 2014


Hi Ben,

thanks for your thoughts on this.

> From: Ben Kloosterman
> 100% guaranteed delivery is a myth... pull the cable out and don't
> put it back. QED

100% everything is a myth ;-).

> In most cases attempts at guaranteed delivery create more bugs and
> problems than "non-guaranteed" delivery. I have seen several cases
> where people introduced persistence, which crippled performance and so
> required clustering and a lot of extra work to overcome the
> performance limitations, which created even more bugs.

As always, it's a tradeoff you have to make: higher "QoS" usually
means lower throughput, much like memory vs. speed tradeoffs.

What makes sense for a certain situation will depend on the use case
at hand.

> When the
> upstream link failed for 4 hours (a digger went through the cable before
> DR could come on), the disk system lost the guaranteed messages
> because it ran out of disk quota. Note here also bad blocks,
> corrupt files or indexes for SQL message storage, etc. The big
> issue here was the psychological "guarantee" that was part of the
> design, and the guarantee is false / conditional. It should be
> termed "more reliable" at best, as that makes you think about what you
> want.

I'll gladly agree that guaranteed/certified is an exaggeration
if you read it as 100%-guaranteed. I think you have to take it like
the guarantees you get when you buy a car: this certainly doesn't
cover each and every potential problem and some things are even
explicitly excluded from the guarantee.

> This does not mean you don't send acks and retransmit, but that you
> think about what you need, since relying on the lower layer to
> guarantee delivery does not always work. E.g. for systems with unreliable
> networks I prefer retransmit at the app layer, because sometimes the
> message / transport layer acks (like TCP) and the message gets to the
> message layer but not to the application, e.g. on a shutdown, and then you
> never get the message.

I'd say the retransmit and ack stuff is indeed at the app layer if
you think in terms of a layer model.
But it's the app layer of the middleware implementation that in turn
provides the messaging service to the business or functional application.
The reasoning is that this is hard to get right, so it should be packaged
in a layer beneath the functional application.
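
To make the layering a bit more concrete, here is a minimal Python sketch
of ack-and-retransmit packaged beneath the functional application. It is
not ZeroMQ code and not how any particular middleware does it: LossyLink,
Receiver and send_reliably are hypothetical names made up for illustration,
and the "network" is simulated in-process. The ack is only sent after the
application-facing handler has seen the message, and duplicates caused by
lost acks are filtered by sequence number.

```python
import random


class LossyLink:
    """Hypothetical stand-in for an unreliable network (assumption)."""

    def __init__(self, drop_rate, seed=0):
        self.drop_rate = drop_rate
        self._rng = random.Random(seed)  # seeded so runs are repeatable

    def deliver(self, msg):
        # Return the message, or None if the link dropped it.
        return None if self._rng.random() < self.drop_rate else msg


class Receiver:
    """Messaging layer on the receiving side, beneath the application."""

    def __init__(self):
        self.seen = set()   # sequence numbers already handed up
        self.handled = []   # what the functional application received

    def on_message(self, msg):
        seq, payload = msg
        if seq not in self.seen:        # drop retransmitted duplicates
            self.seen.add(seq)
            self.handled.append(seq)    # hand over to the application
        return seq                      # ack only after the handover


def send_reliably(link, receiver, seq, payload, max_attempts=10):
    """Retransmit until acked; return attempts used, or None on give-up."""
    for attempt in range(1, max_attempts + 1):
        arrived = link.deliver((seq, payload))
        if arrived is None:
            continue                    # message lost, retransmit
        ack = link.deliver(receiver.on_message(arrived))
        if ack == seq:
            return attempt              # ack made it back
        # ack lost: retransmit, the receiver will filter the duplicate
    return None                         # "more reliable", not guaranteed
```

Note that send_reliably gives up after max_attempts and says so, which is
the car-guarantee kind of contract rather than a 100% one: with the cable
pulled (drop_rate=1.0) it simply returns None.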

Best regards
Holger

Landesbank Baden-Wuerttemberg
Anstalt des oeffentlichen Rechts
Hauptsitze: Stuttgart, Karlsruhe, Mannheim, Mainz
HRA 12704
Amtsgericht Stuttgart



