[zeromq-dev] ZMQ_ROUTER modification
Whit Armstrong
armstrong.whit at gmail.com
Thu Dec 15 18:35:33 CET 2011
Andrew,
> why is anyone disconnecting?
> maybe i don't grok your setup.
Sorry, it's tough to give the whole story in a canned example for the list.
Basically, we have several servers in-house whose init scripts
start a queue device and one worker per core. The workers all
connect to the queue via ipc, and the queue binds to a tcp port.
So, any person in the firm can use any of these servers by connecting
to the queue devices and firing jobs into them.
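For context, each worker process is roughly the shape below. This is only a
sketch: the ipc endpoint and the job payload format are assumptions for
illustration, not the actual protocol the workers use.

library(rzmq)

## one worker process; the queue device's backend endpoint and the job payload
## (a list holding a function and its argument) are assumptions for this
## sketch, not the real wire format
ctx <- init.context()
work <- init.socket(ctx, "ZMQ_REP")
connect.socket(work, "ipc:///tmp/queue_backend.ipc")

while (TRUE) {
    job <- receive.socket(work)                  ## e.g. list(fun = f, arg = x)
    result <- tryCatch(job$fun(job$arg), error = function(e) e)
    send.socket(work, result)                    ## REP: exactly one reply per request
}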
In our setup, the 'stable' pieces are the queue devices and workers.
And clients connect and disconnect to the queue devices at any point.
Here's a sample (using R):
library(rzmq)
## Monte Carlo estimate of pi; Sys.sleep simulates a longer-running job
estimatePi <- function(seed) {
    set.seed(seed)
    numDraws <- 1e5
    Sys.sleep(2)
    r <- .5
    x <- runif(numDraws, min=-r, max=r)
    y <- runif(numDraws, min=-r, max=r)
    inCircle <- ifelse((x^2 + y^2)^.5 < r, 1, 0)
    sum(inCircle) / length(inCircle) * 4
}
cluster <- c("krypton","mongodb") ## occasionally ec2 nodes are used for big jobs
zmq.cluster.lapply(cluster=cluster,as.list(1:100),estimatePi)
(you can find the zmq.cluster.lapply code here:
https://github.com/armstrtw/rzmq/blob/master/R/rzmq.R)
My point was that with mama workers (workers connecting with REQ), if I
connect to the queue device on one of the servers, odds are that after
my job completes I will pick up the 'ready' messages from the workers
before I disconnect. Hence, the next client who connects will not be
able to pick up those 'ready' messages and could potentially wait
forever.
What I'm thinking of as an alternative is to run a service on a
different port of each server that simply returns the number of idle
workers (the spare capacity). In that case, the client would actually
open a separate connection to each queue device, and then always fire
jobs out to the server that has the most spare capacity. It's
somewhat more complicated from the client side, but it will keep the
server setup extremely simple. And, as you noted, timeouts, redos, and
node-failure checks could be built into the client code.
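To make that concrete, here is a rough sketch of both halves using rzmq. The
port number (6001), the 'idle?' request, the reply format, and the n.idle.fn
hook for tracking idle workers are all assumptions for illustration, and real
client code would also want timeouts around the REQ/REP exchange:

library(rzmq)

## capacity service, one per server alongside the queue device; replies with
## the current number of idle workers (how that count is maintained is omitted)
## NOTE: port 6001 and the "idle?" convention are made up for this sketch
serve.capacity <- function(n.idle.fn, port = 6001) {
    ctx <- init.context()
    status <- init.socket(ctx, "ZMQ_REP")
    bind.socket(status, sprintf("tcp://*:%s", port))
    while (TRUE) {
        receive.socket(status)             ## request payload is ignored
        send.socket(status, n.idle.fn())   ## reply with the idle-worker count
    }
}

## client side: ask each server for its spare capacity, then send the next
## batch of jobs to the least-loaded server
spare.capacity <- function(ctx, host, port = 6001) {
    s <- init.socket(ctx, "ZMQ_REQ")
    connect.socket(s, sprintf("tcp://%s:%s", host, port))
    send.socket(s, "idle?")
    receive.socket(s)
}

ctx <- init.context()
cluster <- c("krypton", "mongodb")
idle <- sapply(cluster, function(h) spare.capacity(ctx, h))
best.host <- cluster[which.max(idle)]      ## target for the next job batch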
-Whit
>
> to me, in a situation like this where there is a job-assigner
> (it seems zmq folks like to use ventilator for some reason)
> and worker-bees, then these entities are relatively long-lived,
> and thus have stable permanent control channels between them.
>
> if your style is to fork off a process to do the job, then the worker-bee
> would do that. in this case, the process doing the work need not have
> any zmq connections at all.
>
> alternatively, if numbers don't get goofily large, then each process
> finishing a job can use its job-id and a well-known address to
> communicate back a short status.
> that way, the job-assigner can do timeouts etc and redo jobs.
>
> there are lots of ways to embellish this stuff.
>
> andrew
>
> On Dec 15, 2011, at 8:10 AM, Whit Armstrong wrote:
>
> Thanks, Andrew.
>
> I agree, the problem comes from zmq being so fast distributing the messages.
>
> So, basically use a 'mama' worker instead of 'papa' for job scheduling.
>
> I've been thinking about that, but I think there is a different
> problem with using mamas. Assume that the client connects to several
> queue devices using a REP socket to which the workers connect via REQ. We
> submit our job, which completes. The problem arises if the workers
> are fast enough to do a send before the client can disconnect. Since
> zmq is so fast, the disconnecting client socket can receive the
> 'please give me work' message that was meant for the next client who
> connects...
>
> I'm sure there is an intelligent way to avoid this problem, but I
> haven't thought of it.
>
> I'm thinking of a different design which I'll send under a separate email.
>
> Thanks again, Andrew.
>
> -Whit
>
>
> On Thu, Dec 15, 2011 at 9:27 AM, Andrew Hume <andrew at research.att.com> wrote:
>
> whit,
>
> i believe this is a common mistake, with an easy solution.
> the fundamental error is confusing message distribution
> with job scheduling. zeromq is partially to blame
> because it does a good job at what it does (fair share
> and load balancing) and tempts you into thinking it
> solves the job scheduling problem as well.
>
> in general, the best solution is that each worker
> asks for a job to do when it is ready for that work. typically,
> we might use a REQ/REP for this. this works cleanly
> if the request overhead is not significant (normally the case).
> even when we get near the edge condition of the latency
> becoming an issue, i normally solve that by keeping an internal
> queue on the worker of 2-3 jobs (so that there is always something to do).
> then, the only bad case is when the time to do a job is comparable
> to the time to transmit the job description. in this case, life is hard,
> but generally in this case, volume is high, so you can afford to simply
> batch jobs into groups (of 100 or somesuch) and then treat those
> as a single managed unit.
>
> andrew
>
> On Dec 14, 2011, at 7:58 AM, Whit Armstrong wrote:
>
> Well, let me explain what I'm trying to do. Perhaps someone can show
> me a better way.
>
> I have a client using a dealer socket, talking to a mixed server
> environment: a couple of 6-core machines and a 12-core machine.
>
> Each of the servers uses a simple queue device to fan out the jobs to
> the workers over ipc.
>
> So, basically this pattern, but the client connects to many machines
> w/ different numbers of cores:
>
> client(DEALER)->Queue(ROUTER,DEALER)->worker(REP)
>
> Because the dealer socket on the client fair-queues the messages to
> all the queue devices equally, the 12-core machine quickly becomes
> idle after working off its queue while the 6-core machines continue
> to work off their queues.
>
> My thought was that I could set the HWM to 1 on the ROUTER socket,
> which would prevent the messages from being read so aggressively, but
> since ROUTER drops on HWM, I can't do that.
>
> Can anyone suggest a better pattern?
>
> -Whit
>
> On Wed, Dec 14, 2011 at 3:35 AM, Martin Sustrik <sustrik at 250bpm.com> wrote:
>
> On 12/14/2011 11:49 AM, Whit Armstrong wrote:
>
> Is it possible to construct a ZMQ_ROUTER socket that does not drop on HWM?
>
> Technically it is possible. It can block instead of dropping. The question
> is whether a single peer being dead/slow should really block sending messages
> to all the other peers.
>
> Martin
>
> ------------------
> Andrew Hume (best -> Telework) +1 623-551-2845
> andrew at research.att.com (Work) +1 973-236-2014
> AT&T Labs - Research; member of USENIX and LOPSA
>
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>
More information about the zeromq-dev mailing list