[zeromq-dev] Fundamental question on ZMQ. How to determine message fail/success without HWM-arking?

Lindley French lindleyf at gmail.com
Fri Dec 6 17:39:13 CET 2013


There is a protocol called NORM which you might want to look at. It's kind
of like PGM on steroids. It's NACK-based, but it allows you to specify that
the sender wants to know whether a given peer received the message. It can
work in both multicast and unicast cases.
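
For a feel of what NACK-based means in practice, here is a minimal sketch of the receiver side of the sequence-numbering idea that comes up further down the thread. It is not NORM's wire protocol and not a ZeroMQ API; the class name `NackReceiver` and its fields are hypothetical, chosen only for illustration. The assumption is that the sender stamps every message with an increasing integer, and that the receiver reports only the numbers it skipped.

```python
from dataclasses import dataclass, field


@dataclass
class NackReceiver:
    """Tracks sequence numbers from one sender and reports gaps as NACKs."""
    next_expected: int = 0
    missing: set = field(default_factory=set)

    def on_message(self, seq: int) -> list:
        """Record an arriving sequence number; return newly detected gaps."""
        if seq in self.missing:
            # A retransmission filled an earlier gap.
            self.missing.discard(seq)
            return []
        if seq < self.next_expected:
            # Duplicate of something already seen; ignore it.
            return []
        # Everything between next_expected and seq was skipped.
        gaps = list(range(self.next_expected, seq))
        self.missing.update(gaps)
        self.next_expected = seq + 1
        return gaps


receiver = NackReceiver()
# Messages 2 and 3 are skipped when 4 arrives; 2 is later retransmitted.
nacks = [receiver.on_message(seq) for seq in (0, 1, 4, 2, 5)]
```

The sender never waits on the happy path. It only hears back when the receiver notices a hole, and it can then resend or give up on those specific messages.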


On Fri, Dec 6, 2013 at 9:24 AM, Justin Cook <jhcook at gmail.com> wrote:

> Artem,
>
> The “high-water mark” is simply used to prevent memory exhaustion:
>
> "It has ways of dealing with over-full queues (called "high water mark").
> When a queue is full, ØMQ automatically blocks senders, or throws away
> messages, depending on the kind of messaging you are doing (the so-called
> "pattern")."
>
> It has nothing to do with specific peers. What you are doing is setting
> the HWM to a low number and then hoping for a send() to block, rather than
> return false, if the queue is exhausted. You then assume that a peer is
> down if you were to block. Given that the behaviour differs based on the
> messaging pattern you are using, you will need to set up a test case.
>
> If I were you, I would abandon this idea and investigate what Pieter said
> @1311. Sequence numbering and NACKs are what I would look at.
>
> Once you have done that, feel free to generate a test case and share with
> us on the list.
>
> --
> Justin Cook
>
>
> On Friday, 6 December 2013 at 13:59, artemv zmq wrote:
>
> > Thanks Pieter. All this sounds new to me ... :|
> >
> > But if we return to the HWM question -- when I set hwm=0 and send to a
> > nonexistent peer, every .send() call returns "true". Isn't this an
> > issue in the ZMQ core?
> >
> >
> > BR
> > -artemv
> >
> >
> >
> > 2013/12/6 Pieter Hintjens <ph at imatix.com>
> > > You should probably think about a mix of sequence numbering, credit
> > > based flow control and negative acks that flow asynchronously against
> > > the message flow. You can then send without waiting, ensure you never
> > > overrun buffers, and catch errors if they happen.
> > >
> > > On Fri, Dec 6, 2013 at 2:02 PM, artemv zmq <artemv.zmq at gmail.com> wrote:
> > > > 2Matt:
> > > >
> > > > > > Sending a message may take some time (connection latency, etc.),
> > > > > > so how long do you think it will take to send the message before
> > > > > > you assume it has been sent or not?
> > > > >
> > > >
> > > > Sure thing -- I can go with some reasonable timeout specified via
> > > > properties/cmdline. Not a problem.
> > > >
> > > > > > If you want send() to return false
> > > > The only way I found that ZMQ can give me false on .send() is to set
> > > > hwm=1 and send two messages every time:
> > > >
> > > > --send_dummy_scout_msg-->
> > > > --send_bet_msg-->
> > > > --send_dummy_scout_msg-->
> > > > -- X
> > > >
> > > > Got the idea? If "dummy_scout" gets stuck in a queue, then "bet_msg"
> > > > will not be sent, so .send() will return "false". Pretty stupid... Not
> > > > sure I can seriously explain this to the chief architects :) Is there
> > > > another way?
> > > >
> > > >
> > > >
> > > > 2013/12/6 Diego Duclos <diego.duclos at palmstonegames.com>
> > > > >
> > > > > If ALL you need to know is "has message left NIC on sending process
> > > > > or not", there is a socket option for that. It's called
> > > > > ZMQ_ROUTER_MANDATORY.
> > > > >
> > > > >
> > > > > On Fri, Dec 6, 2013 at 1:07 PM, Matt Connolly <matt.connolly at me.com> wrote:
> > > > > >
> > > > > > Could you use the socket monitoring to check the connected state
> > > > > > of the dealer socket?
> > > > > >
> > > > > > Sending a message may take some time (connection latency, etc.),
> > > > > > so how long do you think it will take to send the message before
> > > > > > you assume it has been sent or not?
> > > > > >
> > > > > > If you want send() to return false, you would need it to be a
> > > > > > blocking synchronous call, which goes against the idea of queuing
> > > > > > messages to be sent (as far as I understand).
> > > > > >
> > > > > > Good luck
> > > > > >
> > > > > > Cheers,
> > > > > > Matt.
> > > > > >
> > > > > > On 6 Dec 2013, at 9:19 pm, artemv zmq <artemv.zmq at gmail.com> wrote:
> > > > > >
> > > > > > Sorry for the confusion.
> > > > > >
> > > > > > When I said out-of-control, I meant they do have ZMQ but they may
> > > > > > have a different release cycle and QoS. It's just a service on ZMQ,
> > > > > > on a ROUTER.
> > > > > >
> > > > > > Our application is meant to take a message, read its headers, decide
> > > > > > which service ROUTER to send it to, and that's it, without waiting
> > > > > > for a reply. Essentially we are a DEALER.
> > > > > > Replies are important, but only as long as they come back. If they
> > > > > > don't, that's not a problem. The client application (an iPhone game)
> > > > > > checks replies and correlation by itself, and keeps watching: "ahha,
> > > > > > I didn't receive an ack for betting. hmmm. Let's try again". Is it
> > > > > > clearer now?
> > > > > >
> > > > > > I really don't need PUB/SUB. I need DEALER/ROUTER. Here, in my
> > > > > > company, the biggest concern so far with ZMQ is misleading
> > > > > > behaviour: when .send() returns "true", that should mean the message
> > > > > > was "sent", whatever that means: left our PID, left our NIC and so
> > > > > > on. We have to guarantee that the message is no longer with us.
> > > > > > I know what PUB/SUB is. And again, I'm telling you that it's not
> > > > > > suitable. The problem statement is simple:
> > > > > >
> > > > > > - don't use HWM for DEALER/ROUTER (prohibit message queueing).
> > > > > > - raise immediately if you can't .send() (don't collect in an
> > > > > >   internal queue)
> > > > > >
> > > > > >
> > > > > > Is it possible?
> > > > > >
> > > > > >
> > > > > > BR
> > > > > > -artemv
> > > > > >
> > > > > >
> > > > > > 2013/12/6 Justin Cook <jhcook at gmail.com>
> > > > > > >
> > > > > > > Ok, this is confusing. If you are sending a message to a service
> > > > > > > that is out of your control, either they use 0MQ or not. I assume
> > > > > > > they do not. If that’s the case, it should not be a part of the use
> > > > > > > case.
> > > > > > >
> > > > > > > You say you need to know if a message has been received. But then
> > > > > > > you say no ACKs or timeouts. I’m even more confused. If you are
> > > > > > > making a request to a foreign service over (I assume) HTTP, which
> > > > > > > uses TCP, you are very well getting HTTP return codes with the TCP
> > > > > > > session doing all the hard work. You already have what you are
> > > > > > > looking for there.
> > > > > > >
> > > > > > > As far as your system (going out to mobile devices) using PUB/SUB
> > > > > > > and ACKing messages, this is something you will have to do in
> > > > > > > another channel with 0MQ. Multicast uses UDP because it is not
> > > > > > > feasible to send TCP ACKs from every single subscriber. It’s simply
> > > > > > > not scalable.
> > > > > > >
> > > > > > > You may very well need to develop your own application protocol to
> > > > > > > send ACKs, or have the publisher retransmit. I highly suggest you
> > > > > > > have a look at this:
> > > > > > >
> > > > > > > http://stackoverflow.com/questions/12956685/what-are-the-retransmission-rules-for-tcp
> > > > > > >
> > > > > > > It may be something you will want to mimic in your implementation.
> > > > > > > Someone else has already suggested a timeout for resending
> > > > > > > unacknowledged messages. As you can see, this is one of the ways TCP
> > > > > > > retransmissions work. You also may have corrupt data that fails a
> > > > > > > CRC or hash.
> > > > > > >
> > > > > > > I will finish by saying that if you do have a PUB/SUB design using
> > > > > > > another channel for unicast communication, you will need to be very
> > > > > > > aware of scalability issues. You may need to use a lockstep pattern
> > > > > > > such as REQ/REP if you need a guarantee of communication.
> > > > > > >
> > > > > > > --
> > > > > > > Justin Cook
> > > > > > >
> > > > > > >
> > > > > > > On Friday, 6 December 2013 at 09:46, artemv zmq wrote:
> > > > > > >
> > > > > > > > Thanks for the heads-up.
> > > > > > > >
> > > > > > > > 2crocket:
> > > > > > > > No acks. No timeouts. Nothing should be kept. Messages should just
> > > > > > > > flow back and forth. But for every message we have to answer one
> > > > > > > > question: "has message left NIC on sending process or not". Let me
> > > > > > > > give an example with betting: the game on the iPhone sends us a
> > > > > > > > "make-a-bet" message, then we send it to BettingService, which
> > > > > > > > isn't in our control, so all we have to guarantee is that the
> > > > > > > > "make-a-bet" message has left our NIC and been "sent" to
> > > > > > > > BettingService. If "make-a-bet" is dropped on the network, ok; if
> > > > > > > > BettingService itself drops it, ok.
> > > > > > > >
> > > > > > > > Back to HWM. Let's say we send to an unavailable peer.
> > > > > > > > hwm=1. It means you can send 1 message "blindly" and the .send()
> > > > > > > > function returns success. Of course sending a second time will
> > > > > > > > fail. But... the trick is that we need the answer the first time.
> > > > > > > > hwm=0. It means you can send any number of messages and the
> > > > > > > > .send() function _always_ returns success :(( Again, isn't this a
> > > > > > > > bug?
> > > > > > > >
> > > > > > > >
> > > > > > > > So let me rephrase the original question: how to fail at the
> > > > > > > > .send() function in ZMQ?
> > > > > > > >
> > > > > > > >
> > > > > > > > BR
> > > > > > > > -artemv
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > 2013/12/6 crocket <crockabiscuit at gmail.com>
> > > > > > > > > Why don't you set a timeout for asynchronous ACKs?
> > > > > > > > > You receive ACKs asynchronously and keep the associated messages
> > > > > > > > > until their ACKs arrive or a timeout occurs.
> > > > > > > > > A timeout of 20 seconds is a reasonable estimate.
> > > > > > > > > After a timeout, if a message doesn't have a corresponding ACK, it
> > > > > > > > > is assumed that the message wasn't delivered, and the message is
> > > > > > > > > sent again or discarded.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Dec 6, 2013 at 3:19 AM, artemv zmq <artemv.zmq at gmail.com> wrote:
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > My name is Artem. I have been working with ZMQ (on Java) for a year
> > > > > > > > > > or so. Got a cool question for you, ppl!
> > > > > > > > > >
> > > > > > > > > > Here's my story. Recently I joined a new company (gambling
> > > > > > > > > > games). After working a few weeks and getting accustomed to the
> > > > > > > > > > code, I found that they are building a
> > > > > > > > > > very-unnecessarily-complex-distributed-application ... I was
> > > > > > > > > > unhappy for a few days, because I couldn't even imagine how to
> > > > > > > > > > support ALL THAT CRAP in the future. So I suggested ZMQ, hoping
> > > > > > > > > > that ZMQ would "open the eyes" of others. But as feedback I got
> > > > > > > > > > one big fundamental concern (from the chief architects):
> > > > > > > > > >
> > > > > > > > > > - we have to know only one thing about every message: whether it
> > > > > > > > > >   has been delivered to the remote peer or not
> > > > > > > > > >
> > > > > > > > > > And a few additional comments:
> > > > > > > > > > - we don't care if a message gets lost on the network
> > > > > > > > > > - we don't need guaranteed delivery
> > > > > > > > > > - no RPC / everything is asynchronous
> > > > > > > > > > - we don't need HWM
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > So I'm here, because I really can't answer this question: "for
> > > > > > > > > > every single message, how to know whether it was delivered or
> > > > > > > > > > not".
> > > > > > > > > >
> > > > > > > > > > Thanks in advance. I appreciate your help.
> > > > > > > > > > _______________________________________________
> > > > > > > > > > zeromq-dev mailing list
> > > > > > > > > > zeromq-dev at lists.zeromq.org
> > > > > > > > > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev