[zeromq-dev] Fundamental question on ZMQ. How to determine message fail/success without HWM-arking?

Bruno D. Rodrigues bruno.rodrigues at litux.org
Fri Dec 6 18:17:17 CET 2013


What socket.send() does depends on the socket type. If it’s a socket type that drops (ROUTER, PUB), it will drop the message. If it’s one of the others, it will queue the message.
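
Since a successful send() only tells you the message was accepted for queueing, finding out whether it reached the peer has to happen at the application level, as crocket suggests further down. Here is a minimal sketch of that bookkeeping in Python, with no ZeroMQ calls in it. The class and method names are made up for illustration, and the 20-second default is just crocket's estimate:

```python
import time


class PendingAcks:
    """Tracks sent messages until they are acknowledged or time out."""

    def __init__(self, timeout_s=20.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the logic can be tested
        self.pending = {}           # msg_id -> (payload, deadline)

    def sent(self, msg_id, payload):
        """Record a message right after handing it to socket.send()."""
        self.pending[msg_id] = (payload, self.clock() + self.timeout_s)

    def acked(self, msg_id):
        """Forget a message once its ACK arrives; True if it was pending."""
        return self.pending.pop(msg_id, None) is not None

    def expired(self):
        """Remove and return (msg_id, payload) pairs whose deadline passed."""
        now = self.clock()
        late = [(m, p) for m, (p, d) in self.pending.items() if d <= now]
        for m, _ in late:
            del self.pending[m]
        return late
```

The sender would call sent() after each send(), acked() whenever an ACK comes back, and expired() periodically to decide what to resend or discard.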

On Dec 6, 2013, at 16:15, artemv zmq <artemv.zmq at gmail.com> wrote:

> Thanks Justin.
> 
> But the following points still remain unanswered:
> - why does socket.send() return "true" when sending to a nonexistent peer (socket.hwm=0)?
> - why does poller.pollout() return "true" (socket registered on POLLOUT) when polling a socket which is connected to a nonexistent peer?
> 
> For the first case: either prohibit 0 as an argument value or please shed some light on the expected behaviour. For the second -- such "polling" looks strange, doesn't it?
> 
> 
> BR
> -artemv
> 
> 
> 
> 
> 2013/12/6 Justin Cook <jhcook at gmail.com>
> Artem,
> 
> The “high-water mark” is simply used to prevent memory exhaustion:
> 
> "It has ways of dealing with over-full queues (called "high water mark"). When a queue is full, ØMQ automatically blocks senders, or throws away messages, depending on the kind of messaging you are doing (the so-called "pattern")."
> 
> It has nothing to do with specific peers. What you are doing is setting the HWM to a low number and then hoping for a send() to block, not return false, if the queue is exhausted. You then assume that a peer is down if you were to block. Given that the behaviour differs based on the messaging pattern you are using, you will need to set up a test case.
> 
> If I were you, I would abandon this idea and investigate what Pieter said @1311. Sequence numbering and NACKs is what I would look at.
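
Here is a minimal sketch of the sequence-numbering and NACK idea on the receiving side. It assumes the sender stamps each message with a consecutive integer and that messages from one sender arrive in order. The class and method names are hypothetical, and sending the NACKs back is left to whatever return path the application already has:

```python
class GapDetector:
    """Receiver-side check for missing sequence numbers from one sender."""

    def __init__(self, first_seq=0):
        self.expected = first_seq   # next sequence number we expect to see

    def receive(self, seq):
        """Return the sequence numbers to NACK after seeing message `seq`."""
        if seq < self.expected:
            # Duplicate or late retransmission: nothing new is missing.
            return []
        missing = list(range(self.expected, seq))
        self.expected = seq + 1
        return missing
```

Each number returned by receive() is a message the sender should be asked to resend, so the sender never has to wait on send() to learn that something was lost.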
> 
> Once you have done that, feel free to generate a test case and share with us on the list.
> 
> --
> Justin Cook
> 
> 
> On Friday, 6 December 2013 at 13:59, artemv zmq wrote:
> 
> > Thanks Pieter. All this sounds new to me ... :|
> >
> > But if we return to the HWM question -- when I set hwm=0 and send to a nonexistent peer, every .send() call returns "true". Isn't this an issue in ZMQ core?
> >
> >
> > BR
> > -artemv
> >
> >
> >
> > 2013/12/6 Pieter Hintjens <ph at imatix.com>
> > > You should probably think about a mix of sequence numbering, credit
> > > based flow control and negative acks that flow asynchronously against
> > > the message flow. You can then send without waiting, ensure you never
> > > overrun buffers, and catch errors if they happen.
> > >
> > > On Fri, Dec 6, 2013 at 2:02 PM, artemv zmq <artemv.zmq at gmail.com> wrote:
> > > > 2Matt:
> > > >
> > > > > > Sending a message may take some time (connection latency, etc) so how
> > > > > > long do you think it will take to send the message before you assume it has
> > > > > > been sent or not?
> > > > >
> > > >
> > > > Sure thing -- I can go with some reasonable timeout specified via
> > > > properties/cmdline. Not a problem.
> > > >
> > > > > > If you want send() to return false
> > > > The only way I found to make ZMQ give me false on .send() is to set hwm=1
> > > > and send two messages every time:
> > > >
> > > > --send_dummy_scout_msg-->
> > > > --send_bet_msg-->
> > > > --send_dummy_scout_msg-->
> > > > -- X
> > > >
> > > > Get the idea? If "dummy_scout" gets stuck in a queue, then "bet_msg" will not be
> > > > sent, so .send() will return "false". Pretty stupid... Not sure I can
> > > > seriously explain this to the chief architects :) Is there another way?
> > > >
> > > >
> > > >
> > > > 2013/12/6 Diego Duclos <diego.duclos at palmstonegames.com>
> > > > >
> > > > > If ALL you need to know is "has message left NIC on sending process or
> > > > > not", there is a socket option for that. It's called ZMQ_ROUTER_MANDATORY.
> > > > >
> > > > >
> > > > > On Fri, Dec 6, 2013 at 1:07 PM, Matt Connolly <matt.connolly at me.com>
> > > > > wrote:
> > > > > >
> > > > > > Could you use the socket monitoring to check the connected state of the
> > > > > > dealer socket?
> > > > > >
> > > > > > Sending a message may take some time (connection latency, etc) so how
> > > > > > long do you think it will take to send the message before you assume it has
> > > > > > been sent or not?
> > > > > >
> > > > > > If you want send() to return false, you would need it to be a blocking
> > > > > > synchronous call, which goes against the idea of queuing messages to be sent (as
> > > > > > far as I understand).
> > > > > >
> > > > > > Good luck
> > > > > >
> > > > > > Cheers,
> > > > > > Matt.
> > > > > >
> > > > > > On 6 Dec 2013, at 9:19 pm, artemv zmq <artemv.zmq at gmail.com> wrote:
> > > > > >
> > > > > > Sorry for the confusion.
> > > > > >
> > > > > > When I said out-of-control -- I meant they do have ZMQ but they may have
> > > > > > a different release cycle and QoS. It's just a service on ZMQ, on a ROUTER.
> > > > > >
> > > > > > Our application is meant to take a message, read its headers, decide
> > > > > > which service ROUTER to send it to, and that's it. Without waiting for a reply.
> > > > > > Essentially we are a DEALER.
> > > > > > Replies are important, but only as long as they come back. If they don't, it's
> > > > > > not a problem. The client application (iPhone game) checks replies and
> > > > > > correlation by itself,
> > > > > > and keeps watching: "aha, I didn't receive an ack for betting. Hmm. Let's
> > > > > > try again". Is it clearer now?
> > > > > >
> > > > > > I really don't need PUB/SUB. I need DEALER/ROUTER. Here, in my company,
> > > > > > the biggest concern so far with ZMQ is misleading behaviour:
> > > > > > when .send() returns "true" that should mean that the message was "sent",
> > > > > > whatever that means: left our PID, left our NIC and so on. We have to
> > > > > > guarantee that the message is not on us.
> > > > > > I know what PUB/SUB is. And again, I'm telling you that it's not suitable. The
> > > > > > problem statement is simple:
> > > > > >
> > > > > > - don't use HWM for DEALER/ROUTER (prohibit message queueing).
> > > > > > - raise immediately if you can't .send() (don't collect in internal
> > > > > > queue)
> > > > > >
> > > > > >
> > > > > > Is it possible?
> > > > > >
> > > > > >
> > > > > > BR
> > > > > > -artemv
> > > > > >
> > > > > >
> > > > > > 2013/12/6 Justin Cook <jhcook at gmail.com>
> > > > > > >
> > > > > > > Ok, this is confusing. If you are sending a message to a service that is
> > > > > > > out of your control, either they use 0MQ or not. I assume they do not. If
> > > > > > > that’s the case, it should not be a part of the use case.
> > > > > > >
> > > > > > > You say you need to know if a message has been received. But, then you
> > > > > > > say no ACKs or timeouts. I’m even more confused. If you are making a request
> > > > > > > to a foreign service over — I assume — HTTP which uses TCP, you are very
> > > > > > > well getting HTTP return codes with the TCP session doing all the hard work.
> > > > > > > You already have what you are looking for there.
> > > > > > >
> > > > > > > As far as your system — going out to mobile devices — using PUB/SUB and
> > > > > > > ACKing messages, this is something you will have to do in another channel
> > > > > > > with 0MQ. Multicast uses UDP because it is not feasible to send TCP ACKs
> > > > > > > from every single subscriber. It’s simply not scalable.
> > > > > > >
> > > > > > > You very well may need to develop your own application protocol to send
> > > > > > > ACKs or the publisher retransmits. I highly suggest you have a look at this:
> > > > > > >
> > > > > > >
> > > > > > > http://stackoverflow.com/questions/12956685/what-are-the-retransmission-rules-for-tcp
> > > > > > >
> > > > > > > It may be something you will want to mimic in your implementation.
> > > > > > > Someone else has already suggested a timeout for resending unacknowledged
> > > > > > > messages. As you can see, this is one of the ways TCP retransmissions work.
> > > > > > > You also may have corrupt data that fail a CRC or hash.
> > > > > > >
> > > > > > > I will finish by saying that if you do have a PUB/SUB design using
> > > > > > > another channel for unicast communication, you will need to be very aware of
> > > > > > > scalability issues. You may need to use a lockstep pattern such as REQ/REP
> > > > > > > if you need a guarantee of communication.
> > > > > > >
> > > > > > > --
> > > > > > > Justin Cook
> > > > > > >
> > > > > > >
> > > > > > > On Friday, 6 December 2013 at 09:46, artemv zmq wrote:
> > > > > > >
> > > > > > > > Thanks for the heads-up.
> > > > > > > >
> > > > > > > > 2crocket:
> > > > > > > > No acks. No timeouts. Nothing should be kept. Messages should just
> > > > > > > > flow back and forth. But for every message we have to answer a question:
> > > > > > > > "has message left NIC on sending process or not". Let me give an example with
> > > > > > > > betting: a game on iPhone sends us a "make-a-bet" message, then we send this
> > > > > > > > to BettingService, which isn't in our control,
> > > > > > > > so all we have to guarantee is that the "make-a-bet" message has left our NIC
> > > > > > > > and been "sent" to BettingService. If "make-a-bet" has been dropped on the
> > > > > > > > network - ok, if BettingService itself drops it - ok.
> > > > > > > >
> > > > > > > > Back to HWM. Let's consider that we send to an unavailable peer.
> > > > > > > > hwm=1. It means you can send 1 message "blindly" and the .send() function
> > > > > > > > returns success. Of course sending a second time will fail. But... the trick
> > > > > > > > is -- we need the answer the first time.
> > > > > > > > hwm=0. It means you can send any number of messages and the .send()
> > > > > > > > function _always_ returns success :(( Again, isn't this a bug?
> > > > > > > >
> > > > > > > >
> > > > > > > > So let me re-phrase the original question -- how do I make the .send()
> > > > > > > > function fail in ZMQ?
> > > > > > > >
> > > > > > > >
> > > > > > > > BR
> > > > > > > > -artemv
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > 2013/12/6 crocket <crockabiscuit at gmail.com>
> > > > > > > > > Why don't you set a timeout for asynchronous ACKs?
> > > > > > > > > You receive ACKs asynchronously and keep associated messages until
> > > > > > > > > ACKs come or a timeout occurs.
> > > > > > > > > A timeout of 20 seconds is a reasonable estimate.
> > > > > > > > > After a timeout, if a message doesn't have a corresponding ACK, it
> > > > > > > > > is determined that the message wasn't delivered, and the message is sent
> > > > > > > > > again or discarded.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Dec 6, 2013 at 3:19 AM, artemv zmq <artemv.zmq at gmail.com> wrote:
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > My name is Artem. I have been working with ZMQ (on java) for a year or so. Got a
> > > > > > > > > > cool question for you, ppl!
> > > > > > > > > >
> > > > > > > > > > Here's my story. Recently I joined a new company (gambling
> > > > > > > > > > games). After working a few weeks and getting accustomed to the code, I
> > > > > > > > > > found that they are building a
> > > > > > > > > > very-unnecessarily-complex-distributed-application ... I was unhappy for a few days,
> > > > > > > > > > because I couldn't even imagine how to support ALL THAT CRAP in the
> > > > > > > > > > future. So I suggested ZMQ, hoping that ZMQ would "open the eyes" of the others. But,
> > > > > > > > > > as feedback I got one big fundamental concern (from the chief architects):
> > > > > > > > > >
> > > > > > > > > > - we have to know only one thing about every message: whether it has been
> > > > > > > > > > delivered to the remote peer or not
> > > > > > > > > >
> > > > > > > > > > And a few additional comments:
> > > > > > > > > > - we don't care if a message gets lost on the network
> > > > > > > > > > - we don't need guaranteed delivery
> > > > > > > > > > - no RPC / everything is asynchronous
> > > > > > > > > > - we don't need HWM
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > So I'm here, because I really can't answer this question: "for
> > > > > > > > > > every single message, how to know whether it was delivered or not".
> > > > > > > > > >
> > > > > > > > > > Thanks in advance. I appreciate your help.
> > > > > > > > > > _______________________________________________
> > > > > > > > > > zeromq-dev mailing list
> > > > > > > > > > zeromq-dev at lists.zeromq.org
> > > > > > > > > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
