[zeromq-dev] ZMQ_ROUTER modification

Whit Armstrong armstrong.whit at gmail.com
Thu Dec 15 16:10:01 CET 2011


Thanks, Andrew.

I agree; the problem comes from zmq distributing the messages so quickly.

So, basically, use a 'mama' worker (a REQ worker that asks for work)
instead of a 'papa' (a REP worker that waits for it) for job scheduling.

I've been thinking about that, but I think there is a different
problem with using mamas.  Assume that the client connects to several
queue devices with a REP socket, and the workers connect to those
devices via REQ.  We submit our job, and it completes.  The problem
arises if a worker is fast enough to send again before the client can
disconnect.  Since zmq is so fast, the disconnecting client's socket
can receive the 'please give me work' message that was meant for the
next client who connects...

I'm sure there is an intelligent way to avoid this problem, but I
haven't thought of it.
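
To make the race concrete, here is a minimal pyzmq sketch of the
worker loop I mean (the endpoint and the process() handler are
placeholders):

    import zmq

    def process(job):
        pass                               # placeholder for real work

    ctx = zmq.Context()
    work = ctx.socket(zmq.REQ)
    work.connect("tcp://client:5555")      # placeholder endpoint

    while True:
        work.send(b"please give me work")  # the 'ready' request
        job = work.recv()
        process(job)
        # nothing stops the next send() from being routed to a client
        # socket that is already tearing down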

I'm thinking of a different design, which I'll send in a separate email.

Thanks again, Andrew.

-Whit


On Thu, Dec 15, 2011 at 9:27 AM, Andrew Hume <andrew at research.att.com> wrote:
> whit,
>
> i believe this is a common mistake, with an easy solution.
> the fundamental error is confusing message distribution
> with job scheduling. zeromq is partially to blame
> because it does a good job at what it does (fair share
> and load balancing) and tempts you into thinking it
> solves the job scheduling problem as well.
>
> in general, the best solution is that each worker
> asks for a job when it is ready for more work. typically,
> we might use REQ/REP for this. this works cleanly
> if the request overhead is not significant (normally the case).
> even when we get near the edge condition of the latency
> becoming an issue, i normally solve that by keeping an internal
> queue on the worker of 2-3 jobs (so that there is always something to do).
> then, the only bad case is when the time to do a job is comparable
> to the time to transmit the job description. in this case, life is hard,
> but generally in this case, volume is high, so you can afford to simply
> batch jobs into groups (of 100 or somesuch) and then treat those
> as a single managed unit.
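>
> a minimal pyzmq sketch of that prefetch idea (the endpoint, depth,
> and process() handler are placeholders, not a definitive
> implementation):
>
>     import threading, queue, zmq
>
>     ENDPOINT = "tcp://broker:5555"       # placeholder
>     DEPTH = 3                            # keep 2-3 jobs on hand
>     backlog = queue.Queue(maxsize=DEPTH)
>
>     def process(job):
>         pass                             # placeholder for real work
>
>     def fetch():
>         sock = zmq.Context.instance().socket(zmq.REQ)
>         sock.connect(ENDPOINT)
>         while True:
>             sock.send(b"READY")          # ask for one job
>             backlog.put(sock.recv())     # blocks once DEPTH are queued
>
>     threading.Thread(target=fetch, daemon=True).start()
>     while True:
>         process(backlog.get())           # always something to do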
>
> andrew
>
>
> On Dec 14, 2011, at 7:58 AM, Whit Armstrong wrote:
>
> Well, let me explain what I'm trying to do.  Perhaps someone can show
> me a better way.
>
> I have a client using a DEALER socket, talking to a mixed server
> environment: a couple of 6-core machines and a 12-core machine.
>
> Each of the servers uses a simple queue device to fan out the jobs to
> the workers over ipc:
>
> So, basically this pattern, but with the client connecting to many
> machines with different numbers of cores:
>
> client(DEALER)->Queue(ROUTER,DEALER)->worker(REP)
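>
> A minimal pyzmq sketch of one of those per-server queue devices
> (addresses are placeholders):
>
>     import zmq
>
>     ctx = zmq.Context()
>     frontend = ctx.socket(zmq.ROUTER)
>     frontend.bind("tcp://*:5555")        # clients connect here
>     backend = ctx.socket(zmq.DEALER)
>     backend.bind("ipc:///tmp/workers")   # workers connect over ipc
>     # older pyzmq: zmq.device(zmq.QUEUE, frontend, backend)
>     zmq.proxy(frontend, backend)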
>
> Because the DEALER socket on the client round-robins the messages to
> all the queue devices equally, the 12-core machine quickly becomes
> idle after working off its queue while the 6-core machines are still
> working off theirs.
>
> My thought was that I could set the HWM to 1 on the ROUTER socket,
> which would keep it from reading messages too aggressively, but since
> ROUTER drops messages once it hits its HWM, I can't do that.
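>
> For reference, the knob I mean (a sketch using the split option
> names from libzmq 3.x+; 2.x had only the single ZMQ_HWM):
>
>     import zmq
>
>     ctx = zmq.Context()
>     router = ctx.socket(zmq.ROUTER)
>     router.setsockopt(zmq.RCVHWM, 1)  # limit how far ahead it reads
>     router.setsockopt(zmq.SNDHWM, 1)  # but ROUTER drops, not blocks
>     router.bind("tcp://*:5555")       # placeholder address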
>
> Can anyone suggest a better pattern?
>
> -Whit
>
> On Wed, Dec 14, 2011 at 3:35 AM, Martin Sustrik <sustrik at 250bpm.com> wrote:
>
> On 12/14/2011 11:49 AM, Whit Armstrong wrote:
>
>
> Is it possible to construct a ZMQ_ROUTER socket that does not drop on HWM?
>
> Technically it is possible. It can block instead of dropping. The question
> is whether a single dead or slow peer should really block sending messages
> to all the other peers.
>
>
> Martin
>
> ------------------
> Andrew Hume  (best -> Telework) +1 623-551-2845
> andrew at research.att.com  (Work) +1 973-236-2014
> AT&T Labs - Research; member of USENIX and LOPSA


