[zeromq-dev] exiting without missing messages

Mathijs Kwik bluescreen303 at gmail.com
Sat Aug 20 15:29:31 CEST 2011



On 20 aug, 14:40, Chuck Remes <cremes.devl... at mac.com> wrote:
> On Aug 20, 2011, at 3:44 AM, Mathijs Kwik wrote:
>
> > Hi all,
>
> > I would like to be able to scale a part of my application by just
> > adding processes/boxes.
> > When load drops again, I would like to remove them again.
>
> > This part of the application uses a PULL socket to get work.
> > And just like in the guide/example some kind of ventilator is
> > PUSH'ing.
>
> > Now, adding processes is easy: just zmq_connect them and the load
> > balancing push socket will do its magic trick automatically.
>
> > However, I couldn't find a way to disconnect again.
> > The manpage for zmq_close tells me that it is going to drop
> > messages that have been received from the network but not yet
> > handed out to the app itself.
>
> > So zmq_close isn't gonna work for me.
> > Even if there were a way to make sure nothing is in the "incoming"
> > buffer, I understand that the push socket might already have queued
> > messages for a specific pull socket, so if that disappears, messages
> > get stuck waiting on the push side.
> > So it would be nice if there were a way to tell the push socket to
> > stop sending to / disconnect a certain puller.
>
> You need another pair of sockets to carry this communication. You can't accomplish your goal with only a PUSH/PULL setup.

Adding another socket pair is only one part of the solution.
With it, workers can tell the ventilator "I'm leaving, please stop
sending me work".
However, there is nothing in the zeromq API that the ventilator can
use to make its PUSH socket stop sending messages to that one
specific worker.
So how do I proceed _after_ the extra socket?
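
To be concrete, the signalling half is easy enough. Here is roughly
what I have in mind (a sketch only: the endpoint, message format and
worker id are made up, using the 0MQ 2.x C API):

#include <zmq.h>
#include <string.h>

/* Worker side: announce departure over a separate control channel.
   Endpoint and message format are hypothetical. */
static void announce_departure (void *ctx, const char *worker_id)
{
    void *ctl = zmq_socket (ctx, ZMQ_PUSH);
    zmq_connect (ctl, "tcp://ventilator:5560");  /* made-up endpoint */

    zmq_msg_t msg;
    zmq_msg_init_size (&msg, strlen (worker_id));
    memcpy (zmq_msg_data (&msg), worker_id, strlen (worker_id));
    zmq_send (ctl, &msg, 0);   /* 2.x signature: (socket, msg, flags) */
    zmq_msg_close (&msg);

    zmq_close (ctl);   /* default linger flushes the pending message */
}

A one-way PUSH->PULL control channel is all the signalling needs; the
ventilator just polls its control PULL socket alongside its other work.
It's what the ventilator does _after_ reading that message that I'm
stuck on.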

>
> > Is there a graceful way to just take out a few workers without
> > dropping messages?
> > Or do I need to build checks around it, where the ventilator and
> > the sink communicate to detect missing messages, and keep in-app
> > buffers in the ventilator so it can resend missing ones?
> > I hope this isn't needed, since it will cause latency. This isn't an
> > error/exception situation either, just a "planned" change in the
> > network.
>
> Take a look at the more advanced patterns in the 0mq guide. You probably want some kind of reliable request/reply, so look at the "pirate" patterns in chapter 4.

No, I don't want request/reply, I want a pipeline.

I can imagine the ventilator->workers->sink pipeline as a whole
acting like a request/reply pattern, but even then that chapter
doesn't buy me much. It's all about reliability and handling
unexpected situations like hardware crashes and network problems.

My question isn't about "broken" situations: I only want to remove
some workers, in a fully managed and expected way, so handling it as
if someone pulled the plug sounds like overkill, since that involves
caching requests and re-sending them after a timeout.
It would also introduce latency: the sink has to detect that a
message got lost, wait some arbitrary time, and re-request it.

For real crashes, I already have ways to check consistency and do a
recovery to resume operation, so I don't gain much by handling
everything as if it's a failure.

So I would rather keep the system simple. I don't mind adding extra
sockets so parts can signal each other about adds/removals, but as
far as I can tell, zeromq doesn't provide the means to act on this
(disconnect / remove one peer from the load balancer).
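
For completeness: the worker's own half I can handle. Before calling
zmq_close I can drain whatever has already arrived, roughly like the
sketch below. But it only helps if the ventilator has already stopped
sending to this worker, which is exactly the part I can't arrange;
process_task() is a placeholder.

#include <zmq.h>

void process_task (zmq_msg_t *msg);   /* hypothetical handler */

static void drain_and_close (void *pull)
{
    zmq_pollitem_t item = { pull, 0, ZMQ_POLLIN, 0 };

    for (;;) {
        /* 1 second grace period (2.x zmq_poll takes microseconds) */
        int rc = zmq_poll (&item, 1, 1000000);
        if (rc <= 0)
            break;   /* quiet for a full second: assume drained */

        zmq_msg_t msg;
        zmq_msg_init (&msg);
        zmq_recv (pull, &msg, 0);   /* 2.x signature */
        process_task (&msg);
        zmq_msg_close (&msg);
    }
    zmq_close (pull);
}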

Am I missing something trivial here?

Thanks,
Mathijs

>
> cr
