[zeromq-dev] Fundamental question on ZMQ. How to determine message fail/success without HWM-arking?

Justin Cook jhcook at gmail.com
Fri Dec 6 19:42:25 CET 2013


Artem,

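What do you think socket.hwm=0 does? I think it’s safe to say you are assuming it sets the high water mark to zero. Guess again: in 0MQ a high-water mark of 0 means "no limit", so the queue never fills up and send() never has a reason to fail. Remember, 0MQ is an asynchronous messaging library.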

You have a DEALER going to a ROUTER. The ROUTER is out of your control. I assume the protocol the “Betting Company” is using is out of your control.  

If you set the high-water mark to 1, the socket will queue one message; every send after that will block if you are a DEALER socket. That is pretty much the only indication you will get that a message was not sent, and it assumes a simple setup that talks to only one ROUTER.

As for pollout(), it will return the fd for the socket because the queue has room and is accepting messages. It says nothing about whether the peer exists.
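
To make the two points above concrete, here is a toy model (plain Python, not the 0MQ API) of a sender-side queue with a high-water mark and a peer that never drains it. The `OutboundQueue` class and the `try_sends` helper are made-up names for illustration only; the sketch just shows why pollout() says "writable", why the first send succeeds with hwm=1, and why hwm=0 never refuses anything:

```python
import errno
from collections import deque


class OutboundQueue:
    """Toy model of a sender-side queue with a high-water mark (not the 0MQ API)."""

    def __init__(self, hwm):
        self.hwm = hwm          # 0 is treated as "no limit", as in 0MQ
        self.pending = deque()  # messages accepted but not yet on the wire

    def pollout(self):
        # "Writable" only means the queue has room; the peer is not consulted.
        return self.hwm == 0 or len(self.pending) < self.hwm

    def send_nowait(self, msg):
        # Non-blocking send: refuse with EAGAIN instead of blocking.
        if not self.pollout():
            raise BlockingIOError(errno.EAGAIN, "queue is at its high-water mark")
        self.pending.append(msg)
        return True


def try_sends(hwm, count):
    """Send `count` messages to a peer that never drains the queue."""
    q = OutboundQueue(hwm)
    results = []
    for i in range(count):
        try:
            results.append(q.send_nowait(f"msg-{i}"))
        except BlockingIOError:
            results.append("EAGAIN")
    return results


if __name__ == "__main__":
    print(try_sends(hwm=1, count=3))  # first send accepted, the rest refused
    print(try_sends(hwm=0, count=3))  # no limit: every send is accepted
```

With hwm=1 this gives [True, 'EAGAIN', 'EAGAIN']; with hwm=0 it gives [True, True, True], which is the behaviour Artem is seeing.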

You really need to read this:

http://zguide.zeromq.org/page:all#Dealing-with-Blocked-Peers

EAGAIN is your friend.  
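
EAGAIN only tells you that the local queue is full. For a per-message answer, what is left is the approach suggested elsewhere in this thread: sequence numbers, asynchronous ACKs and a timeout. Here is a minimal sketch of the bookkeeping, with no 0MQ calls in it. The `AckTracker` class and its method names are illustrative assumptions, the 20-second default is taken from crocket's mail, and the clock is passed in so the logic can be exercised without sleeping:

```python
import itertools


class AckTracker:
    """Tracks sent messages by sequence number until they are ACKed or time out."""

    def __init__(self, timeout=20.0):
        self.timeout = timeout
        self._seq = itertools.count(1)
        self.pending = {}  # seq -> (send time, payload)

    def register(self, payload, now):
        # Call right before handing the message to the socket.
        seq = next(self._seq)
        self.pending[seq] = (now, payload)
        return seq

    def ack(self, seq):
        # Call when the peer's ACK for `seq` arrives; unknown or late ACKs return False.
        return self.pending.pop(seq, None) is not None

    def expire(self, now):
        # Return (seq, payload) for every message that has waited at least `timeout`.
        late = [(s, p) for s, (t, p) in self.pending.items() if now - t >= self.timeout]
        for s, _ in late:
            del self.pending[s]
        return late


if __name__ == "__main__":
    tracker = AckTracker(timeout=20.0)
    first = tracker.register(b"bet-1", now=0.0)
    second = tracker.register(b"bet-2", now=5.0)
    tracker.ack(first)                 # the ACK for the first message came back
    print(tracker.expire(now=26.0))    # the second one is reported as undelivered
```

Whether an expired message is resent or discarded is an application decision; the tracker only reports it.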

--  
Justin Cook

+44 7500 960 000
+1 682 738 5380



On Friday, 6 December 2013 at 16:15, artemv zmq wrote:

> Thanks Justin.
>  
> But the following points still remain unanswered:  
> - why does socket.send() return "true" when sending to a nonexistent peer (socket.hwm=0)?  
> - why does poller.pollout() return "true" (socket registered on POLLOUT) when polling a socket that is connected to a nonexistent peer?  
>  
> For the first case: either prohibit 0 as an argument value or please shed some light on the expected behaviour. For the second: such "polling" looks strange, doesn't it?
>  
>  
> BR
> -artemv
>  
>  
>  
>  
> 2013/12/6 Justin Cook <jhcook at gmail.com>
> > Artem,
> >  
> > The “high-water mark” is simply used to prevent memory exhaustion:
> >  
> > "It has ways of dealing with over-full queues (called "high water mark"). When a queue is full, ØMQ automatically blocks senders, or throws away messages, depending on the kind of messaging you are doing (the so-called "pattern")."
> >  
> > It has nothing to do with specific peers. What you are doing is setting the HWM to a low number and then hoping for a send() to block (not return false) if the queue is exhausted. You then assume that a peer is down if you block. Given that the behaviour differs based on the messaging pattern you are using, you will need to set up a test case.
> >  
> > If I were you, I would abandon this idea and investigate what Pieter said @1311. Sequence numbering and NACKs are what I would look at.
> >  
> > Once you have done that, feel free to generate a test case and share with us on the list.
> >  
> > --
> > Justin Cook
> >  
> >  
> > On Friday, 6 December 2013 at 13:59, artemv zmq wrote:
> >  
> > > Thanks Pieter. All this sounds new to me ... :|
> > >  
> > > But if we return to the HWM question: when I set hwm=0 and send to a nonexistent peer, every .send() call returns "true". Isn't this an issue in the ZMQ core?
> > >  
> > >  
> > > BR
> > > -artemv
> > >  
> > >  
> > >  
> > > 2013/12/6 Pieter Hintjens <ph at imatix.com>
> > > > You should probably think about a mix of sequence numbering, credit
> > > > based flow control and negative acks that flow asynchronously against
> > > > the message flow. You can then send without waiting, ensure you never
> > > > overrun buffers, and catch errors if they happen.
> > > >  
> > > > On Fri, Dec 6, 2013 at 2:02 PM, artemv zmq <artemv.zmq at gmail.com> wrote:
> > > > > 2Matt:
> > > > >  
> > > > > > > Sending a message may take some time (connection latency, etc) so how
> > > > > > > long do you think it will take to send the message before you assume it has
> > > > > > > been sent or not?
> > > > > >  
> > > > >  
> > > > >  
> > > > > Sure thing -- I can go with some reasonable timeout specified via
> > > > > properties/cmdline. Not a problem.
> > > > >  
> > > > > > > If you want send() to return false
> > > > > The only way I found to make ZMQ give me false on .send() is to set hwm=1
> > > > > and send two messages every time:
> > > > >  
> > > > > --send_dummy_scout_msg-->
> > > > > --send_bet_msg-->
> > > > > --send_dummy_scout_msg-->
> > > > > -- X
> > > > >  
> > > > > Get the idea? If "dummy_scout" is stuck in the queue, then "bet_msg" will not be
> > > > > sent, so .send() will return "false". Pretty stupid... Not sure I can
> > > > > seriously explain this to the chief architects :) Is there another way?
> > > > >  
> > > > >  
> > > > >  
> > > > > 2013/12/6 Diego Duclos <diego.duclos at palmstonegames.com>
> > > > > >  
> > > > > > If ALL you need to know is "has the message left the NIC on the sending process or
> > > > > > not", there is a socket option for that. It's called ZMQ_ROUTER_MANDATORY.
> > > > > >  
> > > > > >  
> > > > > > On Fri, Dec 6, 2013 at 1:07 PM, Matt Connolly <matt.connolly at me.com>
> > > > > > wrote:
> > > > > > >  
> > > > > > > Could you use the socket monitoring to check the connected state of the
> > > > > > > dealer socket?
> > > > > > >  
> > > > > > > Sending a message may take some time (connection latency, etc) so how
> > > > > > > long do you think it will take to send the message before you assume it has
> > > > > > > been sent or not?
> > > > > > >  
> > > > > > > If you want send() to return false, you would need it to be a blocking
> > > > > > > synchronous call, which goes against the idea of queuing messages to be sent (as
> > > > > > > far as I understand).
> > > > > > >  
> > > > > > > Good luck
> > > > > > >  
> > > > > > > Cheers,
> > > > > > > Matt.
> > > > > > >  
> > > > > > > On 6 Dec 2013, at 9:19 pm, artemv zmq <artemv.zmq at gmail.com> wrote:
> > > > > > >  
> > > > > > > Sorry for the confusion.
> > > > > > >  
> > > > > > > When I said out-of-control, I meant they do have ZMQ but they may have
> > > > > > > a different release cycle and QoS. It's just a service on ZMQ, on a ROUTER.
> > > > > > >  
> > > > > > > Our application is meant to take a message, read its headers, decide
> > > > > > > which service ROUTER to send it to, and that's it, without waiting for a reply.
> > > > > > > Essentially we are a DEALER.
> > > > > > > Replies are important, but only as long as they are coming back. If they are not,
> > > > > > > that is not a problem. The client application (an iPhone game) checks replies and
> > > > > > > correlation by itself,
> > > > > > > and keeps watching: "aha, I didn't receive an ack for the bet. Hmm. Let's
> > > > > > > try again". Is it clearer now?
> > > > > > >  
> > > > > > > I really don't need PUB/SUB. I need DEALER/ROUTER. Here, in my company,
> > > > > > > the biggest concern so far with ZMQ is misleading behaviour:
> > > > > > > when .send() returns "true", that should mean that the message was "sent",
> > > > > > > whatever that means: left our PID, left our NIC and so on. We have to
> > > > > > > guarantee that the message is no longer with us.
> > > > > > > I know what PUB/SUB is. And again, I am telling you that it's not suitable. The
> > > > > > > problem statement is simple:
> > > > > > >  
> > > > > > > - don't use HWM for DEALER/ROUTER (prohibit message queueing).
> > > > > > > - raise immediately if you can't .send() (don't collect in an internal
> > > > > > > queue)
> > > > > > >  
> > > > > > >  
> > > > > > > Is it possible?
> > > > > > >  
> > > > > > >  
> > > > > > > BR
> > > > > > > -artemv
> > > > > > >  
> > > > > > >  
> > > > > > > 2013/12/6 Justin Cook <jhcook at gmail.com>
> > > > > > > >  
> > > > > > > > Ok, this is confusing. If you are sending a message to a service that is
> > > > > > > > out of your control, either they use 0MQ or not. I assume they do not. If
> > > > > > > > that’s the case, it should not be a part of the use case.
> > > > > > > >  
> > > > > > > > You say you need to know if a message has been received. But, then you
> > > > > > > > say no ACKs or timeouts. I’m even more confused. If you are making a request
> > > > > > > > to a foreign service over — I assume — HTTP which uses TCP, you are very
> > > > > > > > well getting HTTP return codes with the TCP session doing all the hard work.
> > > > > > > > You already have what you are looking for there.
> > > > > > > >  
> > > > > > > > As for your system (going out to mobile devices) using PUB/SUB and
> > > > > > > > ACKing messages: the ACKs are something you will have to do in another channel
> > > > > > > > with 0MQ. Multicast uses UDP because it is not feasible to send TCP ACKs
> > > > > > > > from every single subscriber. It’s simply not scalable.
> > > > > > > >  
> > > > > > > > You very well may need to develop your own application protocol to send
> > > > > > > > ACKs or the publisher retransmits. I highly suggest you have a look at this:
> > > > > > > >  
> > > > > > > >  
> > > > > > > > http://stackoverflow.com/questions/12956685/what-are-the-retransmission-rules-for-tcp
> > > > > > > >  
> > > > > > > > It may be something you will want to mimic in your implementation.
> > > > > > > > Someone else has already suggested a timeout for resending unacknowledged
> > > > > > > > messages. As you can see, this is one of the ways TCP retransmissions work.
> > > > > > > > You also may have corrupt data that fail a CRC or hash.
> > > > > > > >  
> > > > > > > > I will finish by saying that if you do have a PUB/SUB design using
> > > > > > > > another channel for unicast communication, you will need to be very aware of
> > > > > > > > scalability issues. You may need to use a lockstep pattern such as REQ/REP
> > > > > > > > if you need a guarantee of communication.
> > > > > > > >  
> > > > > > > > --
> > > > > > > > Justin Cook
> > > > > > > >  
> > > > > > > >  
> > > > > > > > On Friday, 6 December 2013 at 09:46, artemv zmq wrote:
> > > > > > > >  
> > > > > > > > > Thanks for the heads up.
> > > > > > > > >  
> > > > > > > > > 2crocket:
> > > > > > > > > No acks. No timeouts. Nothing should be kept. Messages should just
> > > > > > > > > flow back and forth. But for every message we have to answer a question:
> > > > > > > > > "has the message left the NIC on the sending process or not". Let me give an example with
> > > > > > > > > betting: the game on the iPhone sends us a "make-a-bet" message, then we send this
> > > > > > > > > to BettingService, which isn't in our control,
> > > > > > > > > so all we have to guarantee is that the "make-a-bet" message has left our NIC
> > > > > > > > > and been "sent" to BettingService. If "make-a-bet" has been dropped on the
> > > > > > > > > network, ok; if BettingService itself drops it, ok.
> > > > > > > > >  
> > > > > > > > > Back to HWM. Let's consider that we send to an unavailable peer.
> > > > > > > > > hwm=1. It means you can send 1 message "blindly" and the .send() function
> > > > > > > > > returns success. Of course sending a second time will fail. But... the trick
> > > > > > > > > is that we need the answer the first time.
> > > > > > > > > hwm=0. It means you can send any number of messages and the .send()
> > > > > > > > > function _always_ returns success :(( Again, isn't this a bug?
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > So let me re-phrase the original question: how do I make the .send()
> > > > > > > > > function fail in ZMQ?
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > BR
> > > > > > > > > -artemv
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > > 2013/12/6 crocket <crockabiscuit at gmail.com>
> > > > > > > > > > Why don't you set a timeout for asynchronous ACKs?
> > > > > > > > > > You receive ACKs asynchronously and keep associated messages until
> > > > > > > > > > ACKs come or a timeout occurs.
> > > > > > > > > > A timeout of 20 seconds is a reasonable estimate.
> > > > > > > > > > > After a timeout, if a message doesn't have a corresponding ACK, it
> > > > > > > > > > > is determined that the message wasn't delivered, and the message is sent
> > > > > > > > > > > again or discarded.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > > On Fri, Dec 6, 2013 at 3:19 AM, artemv zmq <artemv.zmq at gmail.com> wrote:
> > > > > > > > > > > Hi,
> > > > > > > > > > >  
> > > > > > > > > > > > My name is Artem. I have been with ZMQ (on Java) for a year or so. Got a
> > > > > > > > > > > > cool question for you, ppl!
> > > > > > > > > > > >  
> > > > > > > > > > > > Here's my story. Recently I joined a new company (gambling
> > > > > > > > > > > > games). After working a few weeks and getting accustomed to the code, I
> > > > > > > > > > > > found that they are building a
> > > > > > > > > > > > very-unnecessarily-complex-distributed-application ... I was unhappy for a few days,
> > > > > > > > > > > > because I couldn't even imagine how to support ALL THAT CRAP in the
> > > > > > > > > > > > future. So I suggested ZMQ, hoping that ZMQ would "open the eyes" of the others. But
> > > > > > > > > > > > as feedback I got one big fundamental concern (from the chief architects):
> > > > > > > > > > > >  
> > > > > > > > > > > > - we have to know only one thing about every message: whether it has been
> > > > > > > > > > > > delivered to the remote peer or not
> > > > > > > > > > > >  
> > > > > > > > > > > > And a few additional comments:
> > > > > > > > > > > > - we don't care if a message gets lost on the network
> > > > > > > > > > > > - we don't need guaranteed delivery
> > > > > > > > > > > > - no RPC / everything is asynchronous
> > > > > > > > > > > > - we don't need HWM
> > > > > > > > > > > >  
> > > > > > > > > > > >  
> > > > > > > > > > > > So I'm here, because I really can't address this question: "for
> > > > > > > > > > > > every single message, how to know whether it was delivered or not".
> > > > > > > > > > > >  
> > > > > > > > > > > > Thanks in advance. I appreciate your help.
> > > > > > > > > > > _______________________________________________
> > > > > > > > > > > zeromq-dev mailing list
> > > > > > > > > > > > zeromq-dev at lists.zeromq.org
> > > > > > > > > > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > _______________________________________________
> > > > > > > > > > zeromq-dev mailing list
> > > > > > > > > > zeromq-dev at lists.zeromq.org (mailto:zeromq-dev at lists.zeromq.org) (mailto:zeromq-dev at lists.zeromq.org) (mailto:zeromq-dev at lists.zeromq.org)
> > > > > > > > > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > _______________________________________________
> > > > > > > > > zeromq-dev mailing list
> > > > > > > > > zeromq-dev at lists.zeromq.org (mailto:zeromq-dev at lists.zeromq.org) (mailto:zeromq-dev at lists.zeromq.org) (mailto:zeromq-dev at lists.zeromq.org)
> > > > > > > > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > _______________________________________________
> > > > > > > > zeromq-dev mailing list
> > > > > > > > zeromq-dev at lists.zeromq.org (mailto:zeromq-dev at lists.zeromq.org) (mailto:zeromq-dev at lists.zeromq.org)
> > > > > > > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > > _______________________________________________
> > > > > > > zeromq-dev mailing list
> > > > > > > zeromq-dev at lists.zeromq.org (mailto:zeromq-dev at lists.zeromq.org) (mailto:zeromq-dev at lists.zeromq.org)
> > > > > > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > > > > > >  
> > > > > > >  
> > > > > > > _______________________________________________
> > > > > > > zeromq-dev mailing list
> > > > > > > zeromq-dev at lists.zeromq.org (mailto:zeromq-dev at lists.zeromq.org) (mailto:zeromq-dev at lists.zeromq.org)
> > > > > > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > > _______________________________________________
> > > > > > zeromq-dev mailing list
> > > > > > zeromq-dev at lists.zeromq.org (mailto:zeromq-dev at lists.zeromq.org) (mailto:zeromq-dev at lists.zeromq.org)
> > > > > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > > > >  
> > > > >  
> > > > >  
> > > > >  
> > > > > _______________________________________________
> > > > > zeromq-dev mailing list
> > > > > zeromq-dev at lists.zeromq.org (mailto:zeromq-dev at lists.zeromq.org) (mailto:zeromq-dev at lists.zeromq.org)
> > > > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > > >  
> > > >  
> > > > _______________________________________________
> > > > zeromq-dev mailing list
> > > > zeromq-dev at lists.zeromq.org (mailto:zeromq-dev at lists.zeromq.org) (mailto:zeromq-dev at lists.zeromq.org)
> > > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > >  
> > >  
> > >  
> > > _______________________________________________
> > > zeromq-dev mailing list
> > > zeromq-dev at lists.zeromq.org (mailto:zeromq-dev at lists.zeromq.org) (mailto:zeromq-dev at lists.zeromq.org)
> > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> >  
> >  
> >  
> >  
> > _______________________________________________
> > zeromq-dev mailing list
> > zeromq-dev at lists.zeromq.org (mailto:zeromq-dev at lists.zeromq.org)
> > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>  
>  
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org (mailto:zeromq-dev at lists.zeromq.org)
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev






More information about the zeromq-dev mailing list