[zeromq-dev] Can we customize a behaviour for in-mem queued messages after reconnect?
Brian Knox
brian.knox at neomailbox.net
Sat Dec 21 20:50:36 CET 2013
People in this discussion may be interested in taking a look at RELP.
It's used in Rsyslog when you want to absolutely guarantee that log
messages are transmitted.
http://www.rsyslog.com/doc/relp.html
On 12/20/2013 3:01 PM, Lindley French wrote:
> I'll agree to a limited extent, though I don't see things exactly as
> you do.
>
> The problem, in my view, is that normally you can trust TCP to get
> your packets through intact. When something goes wrong and a
> connection fails, you can take appropriate action to check what got
> through and what didn't, and fix it. But when TCP connections go down
> and then come back within 0MQ, there's no way to react to that, and
> 0MQ doesn't do a whole lot (from my understanding) to make sure no
> messages got lost in the ether. So nothing is done automatically and
> nothing can be done manually when a fault occurs, which means you are
> forced to write a higher-level protocol on top of 0MQ *as if* it were
> totally unreliable and failures could happen at any time, even though
> in reality TCP is pretty good 99% of the time.
>
> I'm not going to request that 0MQ do its own acking and
> retransmissions or anything like that (I've been down that road;
> sooner or later you're basically writing TCP over TCP), but I do
> think there should be hooks to let you know what range of messages
> might be at risk when a connection goes down, so you can give them
> whatever special treatment you like.
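As a rough illustration of such a higher-level protocol, here is a
minimal sketch in Python with pyzmq: the sender tags every message with
a sequence number and tracks which ones are still unacknowledged. The
envelope layout, the ACK_TIMEOUT value, and the assumption that the
service echoes each sequence number back as an ack are all invented for
the example; none of this is something 0MQ provides out of the box.

    import time
    import zmq

    ACK_TIMEOUT = 5.0  # seconds before an unacked message counts as "at risk"

    ctx = zmq.Context()
    sock = ctx.socket(zmq.DEALER)
    sock.connect("tcp://localhost:5555")  # hypothetical service endpoint

    pending = {}  # seq -> (payload, time sent); everything here is unacked
    seq = 0

    def send(payload: bytes):
        global seq
        seq += 1
        # Frame 0: sequence number, frame 1: payload.
        sock.send_multipart([str(seq).encode(), payload])
        pending[seq] = (payload, time.time())

    def handle_acks():
        # Drain any acks that have arrived; each carries the sequence
        # number the service says it has processed.
        while sock.poll(0):
            acked = int(sock.recv_multipart()[0])
            pending.pop(acked, None)

    def at_risk():
        # Messages sent a while ago and still unacked are the ones that
        # may have been sitting in a queue across a disconnect/reconnect.
        now = time.time()
        return [s for s, (_, t) in pending.items() if now - t > ACK_TIMEOUT]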
>
>
> On Fri, Dec 20, 2013 at 2:42 PM, artemv zmq <artemv.zmq at gmail.com> wrote:
>
> hi Gregg,
>
> As for the "acks". The game on mobile device is awaiting (with
> timeout) for "acks". So, yes, we do "acks", sure.
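A client-side "wait for the ack with a timeout" can be as small as the
following sketch (Python with pyzmq; the endpoint, the 5-second timeout,
and the JSON message shape are placeholders, not anything prescribed by
0MQ):

    import zmq

    ctx = zmq.Context()
    req = ctx.socket(zmq.REQ)
    req.setsockopt(zmq.RCVTIMEO, 5000)  # wait at most 5 s for the ack
    req.connect("tcp://backend:5555")   # placeholder endpoint

    req.send_json({"op": "place_bet", "bet_id": "123"})
    try:
        ack = req.recv_json()           # the service's ack
    except zmq.Again:
        ack = None                      # no ack in time: show "try again later"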
>
> I was also thinking about
> >> timestamping the messages and giving them a TTL
>
> and concluded it's not reliable in my case. The problem is that we
> have no control over where we deploy our software. We can't check
> whether the time settings are the same on all nodes in a cluster,
> and we can't ask our customers: "you have to ensure that the time
> settings are the same on all nodes in your datacenter." I'm pretty
> sure that wouldn't work (at least, in my company).
>
> As for
> >> This sounds like an application problem, not a 0MQ problem
>
> I wouldn't put it like that. It's not a problem; it's rather a
> missing feature in 0MQ. I think behaviour like
> "_unconditionally_ deliver messages on a reconnected socket" is
> somewhat too strict. It's designed more to support some kind of
> historical data flow, where you don't want to lose even one
> message: for example, weather data from sensors, or quotes from a
> stock exchange. But it is not very suitable when you deal with
> something like "place a bet", "create a purchase order", or "book
> a hotel room". Agree?
>
>
>
>
> 2013/12/20 Gregg Irwin <gregg at pointillistic.com>
>
> Hi Artem,
>
> az> A real example from gambling.
>
> az> We have thousands of users betting from their phones. For the
> az> end user a bet is just a click in the UI, but for the backend
> az> it's a bunch of remote calls to services. If a service is not
> az> available, the bet message will get stuck in the 0MQ in-memory
> az> message queue (up to the HWM). The game UI can wait up to a
> az> certain timeout and then render something akin to "We have a
> az> communication problem with our backend. Try again later." So at
> az> this point the user believes that the bet did not succeed (this
> az> is important). What happens then? ITOps get paged, and over the
> az> next hour they do their best to restart the failed service. Ok?
>
> az> After an hour or so the service restarts, and now what? Now the
> az> queued bet will be delivered to the restarted service. And this
> az> is not good, because an hour earlier we assured the user that
> az> "we had a backend issue" and his bet did not go through.
>
> az> So the question arises: how do we avoid redelivering messages
> az> upon reconnect?
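One way to get that today is on the client side: after the UI timeout
expires, close the socket with LINGER set to 0, which drops whatever is
still sitting in the outbound queue, so nothing gets replayed when the
service comes back an hour later. A rough sketch with pyzmq; the
endpoint, the timeout, and the message shape are placeholders:

    import zmq

    ctx = zmq.Context()

    def place_bet(bet, timeout_ms=5000):
        sock = ctx.socket(zmq.REQ)
        sock.setsockopt(zmq.LINGER, 0)      # discard unsent messages on close()
        sock.connect("tcp://backend:5555")  # placeholder endpoint
        sock.send_json(bet)
        if sock.poll(timeout_ms, zmq.POLLIN):
            reply = sock.recv_json()
        else:
            reply = None                    # tell the user "try again later"
        # Closing with LINGER=0 throws away the queued request, so it is
        # not delivered later when the service restarts.
        sock.close()
        return reply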
>
> This sounds like an application problem, not a 0MQ problem. A
> request to place the bet can be received, which doesn't guarantee
> that the bet has been placed (if other work needs to be done). To
> know that the bet was placed, you need an ack. You can also ack
> that the *request* was received. In your scenario above,
> timestamping the messages and giving them a TTL lets you handle
> cases where requests could not be processed in a timely manner,
> and possibly ask the user what they want to do.
>
> -- Gregg
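On the receiving side, the TTL check Gregg describes might look roughly
like the sketch below (Python with pyzmq). The field names, the
30-second TTL, and the reply format are invented for illustration, and
it assumes the sender's and receiver's clocks are reasonably in sync,
which is exactly the caveat Artem raises above.

    import time
    import zmq

    TTL_SECONDS = 30  # how long a queued request stays valid

    ctx = zmq.Context()
    rep = ctx.socket(zmq.REP)
    rep.bind("tcp://*:5555")  # placeholder endpoint

    while True:
        msg = rep.recv_json()               # e.g. {"op": "place_bet", "sent_at": 1387570000.0}
        age = time.time() - msg["sent_at"]  # needs sender/receiver clocks to agree
        if age > TTL_SECONDS:
            # Too old: the user was already told the bet failed, so
            # reject it instead of silently placing it now.
            rep.send_json({"status": "expired"})
        else:
            # ... place the bet here ...
            rep.send_json({"status": "accepted"})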