[zeromq-dev] LRU broker queue in intuitive way on 3.0
Henry Baragar
Henry.Baragar at instantiated.ca
Wed Apr 6 15:00:48 CEST 2011
On April 5, 2011 10:30:31 am Andrew Hume wrote:
> i observe that your 20% growth per month is a factor of 9x per year.
> you can therefore stay within a single server for probably 2 years,
> given you are not even using up one core for your processing window.
>
> given this degenerate case of a single server and no per-task overhead,
> then greedy will be your friend. set up one worker thread per cpu,
> connected to the scheduler by a PULL socket with a queue length of 1.
> each worker pulls a task description, performs it, and then asks for more.
> when it receives a 'all done' message, it exits.
>
Basically, that is what I was thinking. However, my understanding is that you
have to have all workers connected to the boss before you start sending out
tasks. Otherwise both the longest running task and the second longest running
task get assigned to the first worker that connects (which is very
suboptimal).
If I have to do a lot of bookkeeping to keep track of the workers, then I
suspect piping to subprocesses is an easier solution. I was intrigued by the
BOSS/WORKER pattern email because it sounded like it handled all this
bookkeeping, resulting in much simpler code in my application.
Now that I have elaborated my use case, would the BOSS/WORKER pattern be
appropriate for it?
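For concreteness, here is the rough (and untested) shape of what I take your
greedy PUSH/PULL suggestion to be, in Python with pyzmq. The addresses, HWM
settings, task fields and the run_subtask() helper are just my assumptions for
the sketch, not working code from my application:

    # boss.py -- push subtasks, longest expected runtime first
    import zmq

    NUM_WORKERS = 4

    # Stand-ins for the ~200 real subtask descriptions, sorted longest-first.
    tasks = sorted(
        [{"id": i, "transactions": n} for i, n in enumerate([100000, 50000, 7, 3])],
        key=lambda t: t["transactions"],
        reverse=True,
    )

    ctx = zmq.Context()
    boss = ctx.socket(zmq.PUSH)
    boss.setsockopt(zmq.SNDHWM, 1)   # 3.x option; keeps the send side from
    boss.bind("tcp://*:5557")        # buffering a pile of tasks for one worker

    for task in tasks:
        boss.send_json(task)         # blocks once all connected workers are busy

    for _ in range(NUM_WORKERS):     # naive shutdown: one end marker per worker
        boss.send_json({"id": None}) # (a robust version would use a control socket)

    # worker.py -- one of these per CPU
    import zmq

    ctx = zmq.Context()
    work = ctx.socket(zmq.PULL)
    work.setsockopt(zmq.RCVHWM, 1)   # take one task at a time, no pre-fetching
    work.connect("tcp://localhost:5557")

    while True:
        task = work.recv_json()
        if task["id"] is None:       # end marker from the boss
            break
        run_subtask(task)            # hypothetical: process one subtask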
> the scheduler should just push task descriptions on its channel to the
> workers. for 200ish such messages, i wouldn't worry about queue length
> restrictions on the sending side.
>
> to make it neat and tidy, i would add a control socket,
> PULLed by the scheduler and PUSHed by the workers,
> where the traffic is an announce msg by the worker when it's ready
> and a 'done' message (including task-related statistics) as it exits.
> the scheduler process can then announce that all tasks are done
> and emit any statistical summary of the work done.
>
Yes, this would be an elaboration on what is already being done.
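Something like this is how I picture the control channel (again only a sketch;
the port number and message fields are placeholders of mine):

    import zmq

    NUM_WORKERS = 4
    ctx = zmq.Context()

    # scheduler side: collect announcements and final statistics
    control = ctx.socket(zmq.PULL)
    control.bind("tcp://*:5558")

    finished = 0
    while finished < NUM_WORKERS:
        msg = control.recv_json()
        if msg.get("event") == "ready":
            pass                      # a worker has come up and wants work
        elif msg.get("event") == "done":
            finished += 1             # accumulate task counts / timings here
    print("all tasks done")

    # worker side (lives in each worker process):
    # report = ctx.socket(zmq.PUSH)
    # report.connect("tcp://localhost:5558")
    # report.send_json({"event": "ready"})
    # ... pull and run subtasks ...
    # report.send_json({"event": "done", "tasks": 12, "seconds": 314.2})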
> this scheme also allows the scheduler to predict
> the total run time and what the effects would be of adding additional
> workers.
>
> use this 2 year window to develop a model for how long subtasks take
> (for example, t = a + b*(number of transactions)) and how much work
> a server can do (for example, server A can process 6 tasks simultaneously
> and its a is 23ms and its b is 2us.) you then will be able to transition
> sensibly (assuming you continue to grow) to the more normal case of
> multiple heterogeneous servers of varying capacity. for that scenario, you
> will need the scheduler to be able to place jobs on specific servers.
>
As you say, this is probably about two years out, which is beyond the horizon
of my current thinking.
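Still, for my own notes, the model you describe works out to something like
this (illustrative only, using your "server A" numbers):

    # t = a + b * transactions, with a = 23 ms and b = 2 us
    A = 0.023      # seconds of fixed per-task overhead
    B = 0.000002   # seconds per transaction

    def predicted_runtime(transactions):
        return A + B * transactions

    print(predicted_runtime(100000))   # a 100,000-transaction subtask: ~0.223 s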
> hope this helps,
Thanks,
Henry
> andrew
>
> On Apr 5, 2011, at 6:04 AM, Henry Baragar wrote:
> > On April 4, 2011 11:35:25 pm Andrew Hume wrote:
> > > i like creative ways to solve problems as much as the next sapient,
> > > but this problem (optimal job scheduling) is much more complicated
> > > than that, and to run well, requires adaptive scheduling stuff in
> > > realtime. you can't really do it with just load balancing and back
> > > pressure.
> >
> > Right now, all the subtasks are being run on a single server in
> > sequential order in a single process. Currently, the overall task is
> > completing within its operational window, but the transaction volumes
> > are growing at 20% per month. My initial thoughts for a solution were to
> > fork a few subprocesses and use polling to find the next available
> > subprocess.
> >
> > > how accurately can you predict the runtime of a task?
> >
> > I know the transaction volumes that need to be processed by each subtask.
> >
> > > how accurately can you predict the task capacity of a server or worker?
> >
> > Initially, I would like to be able to take advantage of all 4 CPUs on
> > the server, meaning the capacities should be identical. Until I
> > saturate the single server (potentially with the addition of more CPUs)
> > I don't need to worry about where the other workers will be run.
> >
> > > what is the model relating task capacity, workers and servers?
> >
> > I'm not sure that I understand this question. Currently, I hope to be
> > able to run everything on one server.
> >
> > > is there a significant overhead to starting and documenting a task?
> >
> > No.
> >
> > > do you have a real job scheduler? (or are you trying to wrench this
> > > functionality out of 0mq?)
> >
> > No. A job scheduler does not seem to be at the correct level of
> > granularity for this project and I suspect would add unnecessary
> > complications.
> >
> > > do you need to consider failures (of servers, workers, ...)?
> >
> > That would be nice, but not essential. The application already has a number
> > of checks to verify that tasks have completed successfully.
> > Regards,
> > Henry
> >
> > > if you have the answers to these, i can advise you on a path forward.
> > >
> > > andrew
> > >
> > > On Apr 4, 2011, at 8:16 PM, Henry Baragar wrote:
> > > > I think this is the pattern I need, and I am trying to figure out
> > > > whether it would be easy to implement in zeromq. Here is my use case...
> > > > I have a "day end" task that can be split up into 200 subtasks and I
> > > > want the task to run as fast as possible. The interesting thing is
> > > > that the longest subtask could take 10K times as long to run as the
> > > > shortest, and I have information that allows me to sort them in
> > > > descending order from longest to shortest run time. I want the task
> > > > manager (the Boss) to hand out subtasks to workers one at a time, so
> > > > that the first worker to connect gets the longest running task, the
> > > > second one gets second longest, etc. I would only start 4 workers,
> > > > so I want to assign the fifth longest task to the first available
> > > > worker, which probably would be the fourth worker. The sixth
> > > > longest subtask would not be assigned until there was an available
> > > > worker, etc. until all the subtasks are complete, at which point all
> > > > the processes would be shut down. Oh, if the task is taking too
> > > > long, I would like to be able to add workers as needed (probably on
> > > > other servers). Would this BOSS/WORKER pattern you envision address
> > > > my use case?
> > > > Regards,
> > > > Henry
> > >
> > > ------------------
> > > Andrew Hume (best -> Telework) +1 623-551-2845
> > > andrew at research.att.com (Work) +1 973-236-2014
> > > AT&T Labs - Research; member of USENIX and LOPSA
>
> ------------------
> Andrew Hume (best -> Telework) +1 623-551-2845
> andrew at research.att.com (Work) +1 973-236-2014
> AT&T Labs - Research; member of USENIX and LOPSA
--
Henry Baragar
Instantiated Software