[zeromq-dev] queue length
Andrew Hume
andrew at research.att.com
Fri Apr 22 19:25:32 CEST 2011
the data is processed in complicated ways.
it is already being deduped, and i am already
spreading the data across multiple nodes.
(for reasons too obnoxious to state, the incoming data is
split among 160 sockets connected evenly across 8 nodes;
from these 160 receiving processes, the data is split into messages
and then redistributed to 20-30 worker processes (across all 8 nodes)
and the processed results redistributed again.)
in many ways, i have implemented the whaleshark.
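
for anyone following along, here is a minimal sketch of the "sequence # the data, hash to spread it, de-duplicate" idea from matt's message quoted below. it is illustrative python using only the standard library, not the code running on the system described above. the names (pick_worker, Deduper, distribute) and the (source, seq, key, payload) message shape are assumptions made up for the example, and the unbounded "seen" set would need pruning at any real data rate.

```python
import hashlib


def pick_worker(key: bytes, n_workers: int) -> int:
    """Map a message key to a worker index using a stable hash.

    hashlib is used instead of the built-in hash() so that every
    process computes the same index for the same key.
    """
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest[:8], "big") % n_workers


class Deduper:
    """Drop messages whose (source, seq) pair has already been seen.

    Assumption: each receiving process stamps its messages with its own
    id and a sequence number. The set grows without bound here; a real
    system would expire old entries.
    """

    def __init__(self):
        self.seen = set()

    def accept(self, source: int, seq: int) -> bool:
        tag = (source, seq)
        if tag in self.seen:
            return False
        self.seen.add(tag)
        return True


def distribute(messages, n_workers: int):
    """Spread de-duplicated payloads across n_workers buckets.

    messages is an iterable of (source, seq, key, payload) tuples.
    Each bucket stands in for the outgoing queue to one worker.
    """
    dedup = Deduper()
    buckets = [[] for _ in range(n_workers)]
    for source, seq, key, payload in messages:
        if not dedup.accept(source, seq):
            continue  # duplicate delivery, drop it
        buckets[pick_worker(key, n_workers)].append(payload)
    return buckets
```

in this sketch, messages sharing a key always land on the same worker, so per-key state can stay local to that worker; whether that property matters depends on the processing.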
On Apr 22, 2011, at 9:50 AM, Matt Weinstein wrote:
> what transforms take place on the data?
>
> is data being reduced, or can results be de-duplicated?
>
> if so, what about using PGM? you could sequence # the data, use a hash to spread the data among k/n nodes, and de-duplicate after reduction.
>
> On Apr 22, 2011, at 10:51 AM, Andrew Hume wrote:
>
>> a queue length independent of TCP buffers etc would be fine.
>>
>> maybe the problem is that i have an edge case.
>>
>> in general, everything works well if you can use back-pressure from
>> processes farther down the pipeline to regulate the processing.
>> processes that always have a queue are the bottlenecks; those with no or
>> short queues are not the bottleneck.
>>
>> unfortunately, the front end, or root process, for this flotilla of processes
>> is on the receiving end of a tcp socket that is delivering data at a rate
>> that can't be controlled. if this receiving process can't handle the data rate,
>> data gets discarded (at the sending side of this socket).
>>
>> i had thought of queueing to disk rather than memory and somehow measuring that,
>> but at my high data rate, i am scared of touching disk.
>>
>> andrew
>>
>> On Apr 21, 2011, at 10:31 PM, Martin Sustrik wrote:
>>
>>> Hi Andrew,
>>>
>>>> i know this was discussed earlier in the 2.0 context, but i can't recall
>>>> what the resolution was. is there a way to find out how many
>>>> messages are queued for a zmq socket? especially for PUSH/PULL.
>>>> i know it's a hard problem in general, but i need something.
>>>
>>> The problem is that messages can be stored in network buffers which we have no way of querying.
>>>
>>>> my real problem is that i have a system of zeromq-connected processes
>>>> and the system runs out of memory. so far, it seems like zeromq is the
>>>> cause, although it is no zeromq bug. it is because processes are queueing
>>>> up serious amounts of messages, thus enlarging the memory footprint.
>>>> if i had a notion of queue length (even a sloppy one), i could take
>>>> automatic action.
>>>
>>> Why not set hard buffer limits (0MQ's HWMs & TCP buffer sizes)?
>>>
>>> Martin
>>
>>
>> ------------------
>> Andrew Hume (best -> Telework) +1 623-551-2845
>> andrew at research.att.com (Work) +1 973-236-2014
>> AT&T Labs - Research; member of USENIX and LOPSA
>>
>>
>>
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
------------------
Andrew Hume (best -> Telework) +1 623-551-2845
andrew at research.att.com (Work) +1 973-236-2014
AT&T Labs - Research; member of USENIX and LOPSA