[zeromq-dev] queue length

Andrew Hume andrew at research.att.com
Fri Apr 22 19:25:32 CEST 2011


the data is processed in complicated ways.

it is already being deduped, and i am already
spreading the data across multiple nodes.
(for reasons too obnoxious to state, the incoming data is
split among 160 sockets connected evenly across 8 nodes;
from these 160 receiving processes, the data is split into messages
and then redistributed to 20-30 worker processes (across all 8 nodes)
and the processed results are redistributed again.)
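
For illustration only, here is a minimal sketch of the sequence-number, hash-spread and de-duplicate idea raised in the quoted message below. It is not the code of the system described here. The names pick_worker, spread and dedupe, and the (seq, key, payload) message shape, are assumptions made up for the example.

```python
import zlib


def pick_worker(key: bytes, n_workers: int) -> int:
    """Map a message key to a worker index with a stable hash."""
    # zlib.crc32 gives the same value in every process, unlike the
    # built-in hash(), which is randomized per interpreter for bytes/str.
    return zlib.crc32(key) % n_workers


def spread(messages, n_workers):
    """Distribute (seq, key, payload) tuples into one bucket per worker."""
    buckets = [[] for _ in range(n_workers)]
    for seq, key, payload in messages:
        buckets[pick_worker(key, n_workers)].append((seq, key, payload))
    return buckets


def dedupe(results):
    """Keep the first result seen for each sequence number.

    The 'seen' set grows without bound; a real system would need to
    expire old sequence numbers (assumption: not handled here).
    """
    seen = set()
    unique = []
    for seq, key, payload in results:
        if seq not in seen:
            seen.add(seq)
            unique.append((seq, key, payload))
    return unique
```

Because the worker is chosen from the key alone, duplicates of a message land on the same worker, so the de-duplication could also be done per worker rather than after merging.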

in many ways, i have implemented the whaleshark.
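
On the "sloppy" queue length asked for in the quoted thread below: one way to get an estimate that does not depend on querying 0MQ or TCP buffers is to count at the application level. The sketch below assumes the sender stamps each message with a sequence number and that the downstream process reports back, over some return channel, the highest sequence number it has finished. That return channel and the InFlightEstimator name are assumptions for the example; nothing here is a 0MQ API.

```python
class InFlightEstimator:
    """Estimate how many messages sit between a sender and its consumer.

    The estimate covers everything not yet acknowledged, wherever it is
    buffered (sender queue, network buffers, receiver queue). It lags
    by however stale the latest acknowledgement is.
    """

    def __init__(self, limit: int):
        self.limit = limit   # depth above which we call it overloaded
        self.sent = 0        # last sequence number handed out
        self.acked = 0       # highest sequence number reported as done

    def on_send(self) -> int:
        """Call once per outgoing message; returns the seq to stamp on it."""
        self.sent += 1
        return self.sent

    def on_ack(self, seq: int) -> None:
        """Record a progress report; stale or reordered reports are ignored."""
        if seq > self.acked:
            self.acked = seq

    def depth(self) -> int:
        return self.sent - self.acked

    def overloaded(self) -> bool:
        return self.depth() > self.limit
```

A sender could check overloaded() before each send and take whatever automatic action suits it (shed load, raise an alarm, start another worker). The reports can be infrequent, for example one per few thousand messages, since only a rough figure is needed.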

On Apr 22, 2011, at 9:50 AM, Matt Weinstein wrote:

> what transforms take place on the data?
> 
> is data being reduced, or can results be de-duplicated?
> 
> if so, what about using PGM? you could sequence-number the data, use a hash to spread the data among k/n nodes, and de-duplicate after reduction.
> 
> On Apr 22, 2011, at 10:51 AM, Andrew Hume wrote:
> 
>> a queue length independent of TCP buffers etc would be fine.
>> 
>> maybe the problem is that i have an edge case.
>> 
>> in general, everything works well if you can use back-pressure from
>> processes farther down the pipeline to regulate the processing.
>> processes that always have a queue are the bottlenecks; those with no or
>> short queues are not the bottleneck.
>> 
>> unfortunately, the front end, or root process, for this flotilla of processes
>> is on the receiving end of a tcp socket that is delivering data at a rate
>> that can't be controlled. if this receiving process can't handle the data rate,
>> data gets discarded (at the sending side of this socket).
>> 
>> i had thought of queueing to disk rather than memory and somehow measuring that,
>> but at my high data rate, i am scared of touching disk.
>> 
>> andrew
>> 
>> On Apr 21, 2011, at 10:31 PM, Martin Sustrik wrote:
>> 
>>> Hi Andrew,
>>> 
>>>> i know this was discussed earlier in the 2.0 context, but i can't recall
>>>> what the resolution was. is there a way to find out how many
>>>> messages are queued for a zmq socket? especially for PUSH/PULL.
>>>> i know it's a hard problem in general, but i need something.
>>> 
>>> The problem is that messages can be stored in network buffers which we have no way of querying.
>>> 
>>>> my real problem is that i have a system of zeromq-connected processes
>>>> and the system runs out of memory. so far, it seems like zeromq is the
>>>> cause, although it is no zeromq bug. it is because processes are queueing
>>>> up serious amounts of messages, thus enlarging the memory footprint.
>>>> if i had a notion of queue length (even a sloppy one), i could take
>>>> automatic action.
>>> 
>>> Why not set hard buffer limits (0MQ's HWMs & TCP buffer sizes)?
>>> 
>>> Martin
>> 
>> 
>> ------------------
>> Andrew Hume  (best -> Telework) +1 623-551-2845
>> andrew at research.att.com  (Work) +1 973-236-2014
>> AT&T Labs - Research; member of USENIX and LOPSA
>> 
>> 
>> 
>> 
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> 


------------------
Andrew Hume  (best -> Telework) +1 623-551-2845
andrew at research.att.com  (Work) +1 973-236-2014
AT&T Labs - Research; member of USENIX and LOPSA



