[zeromq-dev] General understanding of ZMQ and architecture advices
dev at innercircleproject.com
Sat Dec 15 21:18:55 CET 2012
Hello.
We are building a data mapping tool that gathers several sources of
information and tries to fit them into a common document model for
further consumption.
Stage one was to design the "mapping machinery". From the start we
intended to distribute the workload among processes and machines, so the
design uses self-contained data units that we can process independently.
The only thorny issues are, of course, the distribution at the start
(with retries on failure and TTL problems) and the gathering at the end.
We are working in Python and so are planning to use pyZMQ.
My plan is to stick as closely as possible to proven designs, because I
fully understand that it is really easy to "fuck up in mysterious ways"
in the realm of distributed and asynchronous processing, and the team
lacks experience in that domain. But one needs to start one day, right? :)
Reading the docs, and reading them again, I was at first attracted to
the "ventilator design", but several questions came to mind:
The ventilator design seems to be a big "spread it as long as you have
something to spread". If the workers' processing time is longer than the
ventilator's "dividing time" (probably a frequent case, and definitely
the case in our situation), the ventilator will quickly divide the work
between the "n" workers and fill their queues, possibly until they
overflow (we can have bursts of 270,000 job units to process when
dealing with some inbound feeds).
And of course, even if the worker queues (they are on the worker socket
side, right?) can withstand the pressure, a worker failure will send
potentially thousands of jobs to oblivion, and they will have to be
flagged as lost and spread again.
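
To make the concern concrete, here is a minimal sketch of the push-based
layout, assuming pyzmq built against libzmq 3.x or later (where the
SNDHWM and RCVHWM options exist; libzmq 2.x had a single HWM option).
The endpoint names, the worker names, and the high-water mark of 1 are
illustrative assumptions, not a recommended configuration. As far as I
understand, a PUSH socket blocks in send() once every peer's queue has
reached its high-water mark rather than dropping messages, so small
marks limit how many jobs a dead worker can take with it.

```python
import threading
import zmq

def worker(ctx, name, stop):
    pull = ctx.socket(zmq.PULL)
    pull.setsockopt(zmq.RCVHWM, 1)      # assumption: keep the worker-side queue tiny
    pull.connect("inproc://jobs")       # hypothetical endpoint
    push = ctx.socket(zmq.PUSH)
    push.connect("inproc://results")    # hypothetical endpoint
    poller = zmq.Poller()
    poller.register(pull, zmq.POLLIN)
    while not stop.is_set():
        if poller.poll(50):             # wait up to 50 ms for a job
            job = pull.recv()
            push.send_multipart([name, job])   # "processing" is just an echo here
    pull.close(linger=0)
    push.close(linger=0)

def run(n_jobs=30, n_workers=3):
    ctx = zmq.Context()
    vent = ctx.socket(zmq.PUSH)
    vent.setsockopt(zmq.SNDHWM, 1)      # must be set before bind/connect
    vent.bind("inproc://jobs")
    sink = ctx.socket(zmq.PULL)
    sink.bind("inproc://results")
    stop = threading.Event()
    threads = [threading.Thread(target=worker, args=(ctx, b"w%d" % i, stop))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for i in range(n_jobs):
        vent.send(b"job-%d" % i)        # blocks while every worker queue is full
    counts = {}
    for _ in range(n_jobs):
        name, _job = sink.recv_multipart()
        counts[name] = counts.get(name, 0) + 1
    stop.set()
    for t in threads:
        t.join()
    vent.close(linger=0)
    sink.close(linger=0)
    ctx.term()
    return counts                       # jobs handled per worker
```

Calling run() returns how many jobs each worker handled. The split
between workers is not guaranteed to be even, and the sketch does
nothing about jobs that are lost when a worker dies.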
My first reaction was to only spread a job when the sink receives
something, thus ensuring that no overflow can occur. But that means the
ventilator must know how many workers are connected, and the ventilator
and the sink must communicate about that somehow: doable, but a
complicated design.
My second idea was to reverse the design and have the workers request
jobs from the ventilator. But that means the "load balancing"
capabilities of ZMQ become useless, and that the "mutated ventilator"
(more of a dispatcher now) needs to manage by itself as many two-way
communications as there are workers connected. Doable again, and it no
longer has much to do with the ventilator design, but we can rely on
the Queue Device of pyZMQ...
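
A minimal sketch of this second idea, assuming a plain REQ/REP pair is
enough: each worker sends a request (first "READY", then the result of
its previous job) and the dispatcher answers with the next job. The
endpoint and the READY/JOB/RESULT/STOP frames are hypothetical names I
made up for illustration. Since a REP socket routes each reply back to
the peer that sent the request, the dispatcher here does not track the
connected workers itself.

```python
import threading
import zmq

ENDPOINT = "inproc://dispatcher"   # hypothetical; tcp://host:port across machines

def worker(ctx):
    sock = ctx.socket(zmq.REQ)
    sock.connect(ENDPOINT)
    request = [b"READY"]                      # first request carries no result
    while True:
        sock.send_multipart(request)
        reply = sock.recv_multipart()
        if reply[0] == b"STOP":               # dispatcher has no more jobs
            break
        job = reply[1]
        # "Processing" is just upper-casing; the result rides on the next request.
        request = [b"RESULT", job, job.upper()]
    sock.close(linger=0)

def dispatch(jobs, n_workers=3):
    ctx = zmq.Context()
    rep = ctx.socket(zmq.REP)
    rep.bind(ENDPOINT)                        # bind before the workers connect
    threads = [threading.Thread(target=worker, args=(ctx,))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    pending = list(jobs)                      # jobs are assumed to be unique bytes
    results = {}
    stopped = 0
    while stopped < n_workers:
        msg = rep.recv_multipart()            # a worker is idle and asking for work
        if msg[0] == b"RESULT":
            results[msg[1]] = msg[2]
        if pending:
            rep.send_multipart([b"JOB", pending.pop()])
        else:
            rep.send_multipart([b"STOP"])
            stopped += 1
    for t in threads:
        t.join()
    rep.close(linger=0)
    ctx.term()
    return results
```

With this layout a worker holds at most one job at a time, so a worker
failure loses one job instead of a full queue. The sketch has no
timeouts or retries, so a worker that dies mid-job would still need to
be detected some other way.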
My first question: is my "analysis" of the ventilator design right, and
am I right to assume that it is a simple teaching design that is not
really practical in "real life" when worker processing time is
significant, or do I misunderstand something?
Second question: of my two "ideas", which one would a more seasoned ZMQ
user than me (nearly anybody ;) ) recommend to achieve paced
dispatching of the "jobs" to the workers?
Thanks a lot for your advice.
.X.