[zeromq-dev] Can we customize a behaviour for in-mem queued messages after reconnect?

Brian Knox brian.knox at neomailbox.net
Sat Dec 21 20:50:36 CET 2013

People in this discussion may be interested in taking a look at RELP, the
Reliable Event Logging Protocol. It's used in rsyslog when you want to
absolutely guarantee that log messages are transmitted.
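
The core idea, as I understand it, is that every message carries a transaction number and the sender keeps the message until the receiver acknowledges that number. Below is a minimal Python sketch of that bookkeeping, independent of any transport. The `AckWindow` class and its method names are made up for illustration; this is not RELP's wire format and not a 0MQ API.

```python
from collections import OrderedDict


class AckWindow:
    """Tracks messages that were sent but not yet acknowledged.

    Hypothetical helper for illustration; not part of RELP or 0MQ.
    """

    def __init__(self):
        self._next_txn = 1
        self._unacked = OrderedDict()  # txn number -> payload, in send order

    def send(self, payload):
        """Assign a transaction number and remember the payload until acked."""
        txn = self._next_txn
        self._next_txn += 1
        self._unacked[txn] = payload
        return txn

    def ack(self, txn):
        """Forget a message once the peer confirms it; unknown numbers are ignored."""
        self._unacked.pop(txn, None)

    def at_risk(self):
        """Messages whose fate is unknown, e.g. after a connection drop."""
        return list(self._unacked.items())


window = AckWindow()
a = window.send(b"bet 1")
b = window.send(b"bet 2")
c = window.send(b"bet 3")
window.ack(a)
window.ack(c)
# Connection drops here: only "bet 2" is still unconfirmed.
print(window.at_risk())
```

With a record like this the application can decide for itself, after a fault, whether each unconfirmed message should be resent, discarded, or reported back to the user.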


On 12/20/2013 3:01 PM, Lindley French wrote:
> I'll agree to a limited extent, though I don't see things exactly as 
> you do.
> The problem, in my view, is that normally you can trust TCP to get 
> your packets through intact. When something goes wrong and a 
> connection fails, you can take appropriate action to test what got
> through and what didn't, and fix it. But when TCP connections go down 
> and then come back within 0MQ, there's no way to react to that, and 
> 0MQ doesn't do a whole lot (from my understanding) to make sure no 
> messages got lost in the ether. So nothing is done automatically and
> nothing can be done manually when a fault occurs, which means you
> are forced to write a higher-level protocol on top of 0MQ *as if* it 
> is totally unreliable and failures can happen at any time, even though 
> in reality TCP is pretty good 99% of the time.
> I'm not going to request that 0MQ do its own acking and
> retransmissions or anything like that (I've been down that road;
> sooner or later you're basically writing TCP over TCP), but I do
> think there should be hooks to let you know what range of messages 
> might be at risk when a connection goes down, so you can give them 
> whatever special treatment you like.
> On Fri, Dec 20, 2013 at 2:42 PM, artemv zmq <artemv.zmq at gmail.com> wrote:
>     hi Gregg,
>     As for the "acks": the game on the mobile device waits (with a
>     timeout) for "acks". So, yes, we do "acks", sure.
>     I also was thinking about
>     >> timestamping the messages and giving them a TTL
>     and considered it not reliable in my case. The problem is that we
>     don't have control over where our software is deployed. We can't
>     check whether the time settings are the same on all nodes in a
>     cluster. And we can't ask our customers: "you have to ensure that
>     time settings are the same on all nodes in your datacenter." I'm
>     pretty sure that wouldn't work (at least, in my company).
>     As for
>     >>This sounds like an application problem, not a 0MQ problem
>     I wouldn't put it like that. It's not a problem, it's rather a
>     missing feature in 0mq. I think behaviour like
>     "_unconditionally_ deliver messages on a reconnected socket" is
>     somewhat too strict. It's more designed to support some kind of
>     historical data flow, where you don't want to lose even one
>     message. What could that be? E.g. weather data from sensors, or
>     quotes from a stock exchange. But it is not very suitable when you
>     deal with something like "place a bet", "create a purchase order",
>     "book hotel room". Agree?
>     2013/12/20 Gregg Irwin <gregg at pointillistic.com>
>         Hi Artem,
>         az> Real example from gambling.
>         az> We have thousands of users betting from their phones. For the
>         az> end user a bet is just a click in the UI, but for the backend
>         az> it's a bunch of remote calls to services. If a service is not
>         az> available, then the bet message will get stuck in the 0mq
>         az> in-mem message queue (up to hwm). The game UI can wait up to a
>         az> certain timeout and then render something akin to "We have
>         az> communication problem with our backend. Try again later." So at
>         az> this point the user believes that the bet didn't succeed (.. this
>         az> is important). What happens then -- ITOps get paged, and then
>         az> for 1hr they do their best to restart the failed service. Ok?
>         az> After 1hr or so the service restarts and now what? Now the queued
>         az> bet will be delivered to the restarted service. And this is not
>         az> good, because 1hr earlier we assured the user that "we had a
>         az> backend issue" and his bet didn't succeed.
>         az> So the question arose -- how to not redeliver messages upon
>         az> reconnect?
>         This sounds like an application problem, not a 0MQ problem. A
>         request to place the bet can be received, which doesn't guarantee
>         that the bet has been placed (if other work needs to be done). To
>         know that the bet was placed, you need an ack. You can also ack
>         that the *request* was received. In your scenario above,
>         timestamping the messages and giving them a TTL lets you handle
>         cases where requests could not be processed in a timely manner,
>         and possibly ask the user what they want to do.
>         -- Gregg
>         _______________________________________________
>         zeromq-dev mailing list
>         zeromq-dev at lists.zeromq.org <mailto:zeromq-dev at lists.zeromq.org>
>         http://lists.zeromq.org/mailman/listinfo/zeromq-dev
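
On the TTL versus clock-skew point in the quoted thread: one way around it is to enforce the deadline entirely on the sending node, using a monotonic clock, so that no two machines ever compare timestamps. This only helps if the application holds pending requests in its own queue and hands them to the socket when it is ready to send; it does nothing for messages already sitting in 0MQ's internal queue. A minimal sketch follows; the `ExpiringQueue` name and the injectable `clock` parameter are assumptions for illustration, not anything 0MQ provides.

```python
import time
from collections import deque


class ExpiringQueue:
    """Outgoing queue that drops entries older than ttl seconds.

    Deadlines use the local monotonic clock only, so clock skew between
    nodes does not matter. Hypothetical helper, not a 0MQ API.
    """

    def __init__(self, ttl, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._items = deque()  # (deadline, payload), oldest first

    def put(self, payload):
        """Queue a payload together with its local expiry deadline."""
        self._items.append((self._clock() + self._ttl, payload))

    def drain(self):
        """Return payloads still within their TTL; silently drop the rest."""
        now = self._clock()
        live = [p for deadline, p in self._items if deadline > now]
        self._items.clear()
        return live


# Fake clock so the example is deterministic.
fake_now = [0.0]
q = ExpiringQueue(ttl=5.0, clock=lambda: fake_now[0])
q.put("bet placed at t=0")
fake_now[0] = 4.0
q.put("bet placed at t=4")
fake_now[0] = 6.0      # the service comes back at t=6
fresh = q.drain()      # the t=0 bet has expired, the t=4 bet has not
print(fresh)
```

The TTL here would be matched to the UI timeout, so a bet the user was already told had failed is never sent once the service returns.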
