[zeromq-dev] help with feng shui
Andrew Hume
andrew at research.att.com
Fri Aug 27 16:01:06 CEST 2010
thanks! that was just the input i was after.
my intent is to do out-of-band signalling,
but because 0MQ doesn't provide clean startup/termination semantics,
and because of the uncertainty caused by buffering, i had to simulate
one step of the signalling by sending NO-OPs.
if i don't use NO-OPs, and purely use OOB signalling,
how do i know when a worker is done with its work?
how do i know when a ventilator's work messages have all been delivered?
and, if possible, the answer shouldn't contain any time-related waits.
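
One way to read the UUID suggestion in Matt's reply (quoted below) is that the master tracks a set of node identities instead of a bare count, so "everyone is done" depends only on which acknowledgements have arrived and never on a timer. The sketch below is a minimal illustration of that bookkeeping only. The `Roster` class and its `ack` method are hypothetical names, and how the acknowledgements travel (for example over the PUSH/PULL feedback socket) is left out.

```python
import uuid


class Roster:
    """Hypothetical helper: tracks which known nodes have not yet acknowledged."""

    def __init__(self, node_ids):
        # Every node the master expects to hear from, keyed by its UUID.
        self.pending = set(node_ids)

    def ack(self, node_id):
        # discard() ignores duplicate and unknown ids, both of which
        # would throw off a plain counter.
        self.pending.discard(node_id)
        # True once every expected node has reported.
        return not self.pending


# Assumed setup: each worker picks a UUID at startup and registers it.
worker_ids = [uuid.uuid4().hex for _ in range(3)]
roster = Roster(worker_ids)
results = [roster.ack(w) for w in worker_ids]
```

With three registered workers, `results` ends up as `[False, False, True]`: only the last acknowledgement completes the roster, and a repeated acknowledgement from the same worker would not change that.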
On Aug 27, 2010, at 9:01 AM, Matt Weinstein wrote:
> IMO
>
> You're trying to get state control messages to flow through the
> system, this method is a hybrid "in band" and "out of band" system.
>
> You probably should choose one or the other.
>
> OOB - You mirror the topology with a group of PUB/SUB sockets, top
> to bottom
> IB - you put an input at the top of the ventilators and send
> inband messages downstream. In this case it might be useful to
> have signaling points (devices) that let local components know
> what's going on without the stream of NOPs.
>
> I don't think both IB and OOB are necessary, and it will be easier
> to build a correct solution if you choose just one.
>
> In both cases UUIDs would be good to ensure that all nodes have
> been accounted for. Counting is not particularly safe in a
> distributed environment.
>
> Best,
> Matt
>
> On Aug 26, 2010, at 10:05 PM, Andrew Hume wrote:
>
>> i need some advice. i do not yet grok the feng shui of zeromq,
>> and thus seek advice from those who do.
>>
>> i have a fairly normal setup similar to the parallel pipeline
>> example in the guide.
>> except that i have a handful of ventilators, and a handful of sinks.
>> so far, so good. we just use the PUSH/PULL pattern.
>>
>> here is where it gets harder. i need to be able to essentially pause
>> the ventilators, adjust the number of workers and sinks, and then
>> unpause the ventilators WITHOUT losing any packets.
>>
>> the best (!?) solution i have so far is
>>
>> a) add a PUSH/PULL feedback socket (with all sinks and workers PUSH,
>> and the master is a PULL)
>> b) add a PUB/SUB command socket (with all ventilators, sinks and
>> workers SUB,
>> and the master PUB)
>>
>> c) we send an "IDLE" command to the ventilators; they pause their
>> normal work
>> and start sending NO-OP work items
>> d) as each worker starts getting NO-OPs, they push a "LAZY"
>> message to the master.
>> they forward the NO-OP to the sinks.
>> e) when the master sees k LAZY messages (where k is the existing
>> number of workers),
>> it rearranges the workers (killing some or starting new ones). new
>> workers send NO-OPs.
>> f) when each sink starts getting NO-OPs, it sends a "LAZY" message
>> to the master.
>> g) when the master has done e), and seen NO-OPs from each of the j
>> sinks, it
>> rearranges the sinks. when each new sink starts getting NO-OPs, it
>> sends a LAZY to the master.
>>
>> h) when the master receives m "LAZY"s (where m is the number of
>> new sinks), it sends a "GO"
>> command to the ventilators, who then stop sending NO-OPs and start
>> sending real work.
>>
>> -------------------------------------
>>
>> pros: i believe this scheme will work. and the additional cost of
>> two sockets is modest.
>> cons: it is tedious to send NO-OPs, but i don't know how else to
>> flush the buffers
>> and synchronise everyone. it does involve knowing how many things
>> there are,
>> but that is part of an external configuration in any case.
>>
>> is this the (or a) right way to do this? is there a better way?
>>
>> andrew
>>
>> ------------------
>> Andrew Hume (best -> Telework) +1 732-886-1886
>> andrew at research.att.com (Work) +1 973-360-8651
>> AT&T Labs - Research; member of USENIX and LOPSA
>>
>>
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>
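
The master's side of steps c) through h) in the original message quoted above can be sketched as a small state machine. This is a sketch under assumptions, not the poster's implementation: the `Master` class, the `Phase` names, and the `kind` labels are all hypothetical, the sockets are replaced by a plain list of issued actions, and it keeps the counting scheme from the original message (k workers, j sinks, m new sinks) rather than UUIDs.

```python
from enum import Enum, auto


class Phase(Enum):
    RUNNING = auto()
    DRAIN_WORKERS = auto()   # IDLE sent, waiting for k worker LAZYs (c, d)
    DRAIN_SINKS = auto()     # workers rearranged, waiting for j sink LAZYs (e, f)
    WAIT_NEW_SINKS = auto()  # sinks rearranged, waiting for m new-sink LAZYs (g)


class Master:
    def __init__(self, k, j, m):
        self.need = {"worker": k, "sink": j, "new_sink": m}
        self.seen = dict.fromkeys(self.need, 0)
        self.phase = Phase.RUNNING
        self.log = []  # stands in for commands published and actions taken

    def pause(self):
        self.seen = dict.fromkeys(self.need, 0)
        self.log.append("IDLE")
        self.phase = Phase.DRAIN_WORKERS
        self._advance()

    def on_lazy(self, kind):
        # Sink LAZYs can arrive before the last worker LAZY, so counts are
        # recorded regardless of the current phase.
        self.seen[kind] += 1
        self._advance()

    def _done(self, kind):
        return self.seen[kind] >= self.need[kind]

    def _advance(self):
        if self.phase is Phase.DRAIN_WORKERS and self._done("worker"):
            self.log.append("rearrange workers")
            self.phase = Phase.DRAIN_SINKS
        if self.phase is Phase.DRAIN_SINKS and self._done("sink"):
            self.log.append("rearrange sinks")
            self.phase = Phase.WAIT_NEW_SINKS
        if self.phase is Phase.WAIT_NEW_SINKS and self._done("new_sink"):
            self.log.append("GO")  # step h: ventilators resume real work
            self.phase = Phase.RUNNING
```

Calling `pause()` and then feeding in the expected number of LAZY messages of each kind walks the master through every phase and ends with "GO" as the last logged action.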
------------------
Andrew Hume (best -> Telework) +1 732-886-1886
andrew at research.att.com (Work) +1 973-360-8651
AT&T Labs - Research; member of USENIX and LOPSA