[zeromq-dev] Pings get dropped for certain listeners in the Forwarder pattern

Varun Vijayaraghavan varun.kvv at gmail.com
Thu Feb 7 17:16:07 CET 2013


Hey,

I had to wait for it to start dropping pings again, which is why I did not
get back sooner. :)

After restarting the process again, the throughput returned to normal.


Ian:
>So is this throughput dropping (msgs/second) or dropped data (messages not
>arriving)?

It looks like at some point the messages started arriving 290 - 400 seconds
late. So it's quite likely that the backlog crossed the HWM that we have on
the forwarder.
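
One way to tell the two cases in Ian's question apart is to have each
producer stamp its messages with a sequence number and a send time, and let
the subscriber count gaps and track the worst delay. The following is only a
minimal sketch: `StreamStats` and its fields are hypothetical names, it
assumes one counter per producer, and it assumes the producer and subscriber
clocks are reasonably in sync (for example via NTP).

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class StreamStats:
    """Per-producer counters kept by a subscriber (hypothetical helper)."""
    last_seq: Optional[int] = None
    missing: int = 0        # messages counted as gaps in the sequence
    max_delay: float = 0.0  # worst send-to-receive delay, in seconds

    def observe(self, seq: int, sent_at: float, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        # A jump in the sequence number means messages were dropped.
        if self.last_seq is not None and seq > self.last_seq + 1:
            self.missing += seq - self.last_seq - 1
        if self.last_seq is None or seq > self.last_seq:
            self.last_seq = seq
        # A large delay with no gaps means messages are queued, not lost.
        self.max_delay = max(self.max_delay, now - sent_at)


stats = StreamStats()
stats.observe(seq=1, sent_at=0.0, now=1.0)
stats.observe(seq=2, sent_at=1.0, now=301.0)   # arrived 300 s late
stats.observe(seq=5, sent_at=2.0, now=302.0)   # seq 3 and 4 never arrived
print(stats.missing, stats.max_delay)
```

If `missing` stays at zero while `max_delay` keeps growing, the messages are
being queued somewhere rather than dropped.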



Pieter: I should have been more specific in my initial question.

>- what version of 0MQ you are using
We are using zeromq-2.2.0.

>- the operating system and hardware configurations
Ubuntu 12.04 on AWS EC2 instances. 2 virtual cores + 7 GB RAM on the
forwarder, and 4 virtual cores + 15 GB RAM on the subscriber (although that
one also runs other processes with lower network / CPU requirements).
Amazon claims that both of them have high I/O throughput.

>- the message rate (messages per second) and typical message size
8000 messages per second at peak, 1.5KB typical size.

>- whether consumers may be fighting for CPU cores with other processes
Yes they could be. Would that explain getting delayed over a long period of
time though?

>- precisely the types of sockets you are using.
We have 4 PUB-SUB forwarders. The producers each "publish" to one
forwarder. We have many processes that subscribe to all four forwarders.
The forwarders have an HWM of 100000.

>- whether you're losing one in every two messages, or bursts of messages.
When the throughput decreased, I noticed that the messages were getting
delayed almost consistently by 300 - 400 seconds. I am pretty certain that
this triggered the HWM.
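
As a rough sanity check on the numbers in this message, the arithmetic can be
sketched as below. This is a back-of-the-envelope estimate, not a
measurement: the function names are made up for illustration, the peak rate
is assumed to apply to a single queue, and 1.5KB is taken as 1500 bytes.

```python
PEAK_RATE = 8000   # messages per second at peak (from this thread)
MSG_SIZE = 1500    # bytes; "1.5KB typical size", assuming 1 KB = 1000 bytes
HWM = 100_000      # messages; HWM configured on the forwarders


def seconds_until_hwm(hwm: int, rate_in: float, rate_out: float) -> float:
    """Time for an empty queue to reach hwm when input outpaces output."""
    growth = rate_in - rate_out
    if growth <= 0:
        return float("inf")  # the consumer keeps up; the queue never fills
    return hwm / growth


def backlog_for_delay(rate_out: float, delay_s: float) -> float:
    """Little's law: messages in flight = drain rate * time spent queued."""
    return rate_out * delay_s


# Consumer completely stalled at peak rate: the HWM is hit in 12.5 s.
print(seconds_until_hwm(HWM, PEAK_RATE, 0))
# Consumer draining at half the peak rate: the HWM is hit in 25 s.
print(seconds_until_hwm(HWM, PEAK_RATE, PEAK_RATE / 2))
# Messages that must be queued to see a 300 s delay at half the peak rate.
print(backlog_for_delay(PEAK_RATE / 2, 300))
# Memory held by one queue sitting at the HWM, in megabytes.
print(HWM * MSG_SIZE / 1e6)
```

Under these assumptions a 300 second delay corresponds to a backlog of over a
million messages, which is well above 100000. That would be consistent with
the HWM having been reached, and it suggests that part of the backlog may
have been sitting outside the forwarder's queue.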





On Mon, Jan 28, 2013 at 4:53 PM, Varun Vijayaraghavan
<varun.kvv at gmail.com> wrote:

> Pieter, Ian,
>
> Thanks for your replies, and you have raised good points.
>
> I am not certain about some of the questions you have asked. It's a good
> place for me to start exploring.
>
> I'll reply back to this thread once I find something.
>
> Thanks again!
>
>
>
>
> On Mon, Jan 28, 2013 at 4:17 PM, Ian Barber <ian.barber at gmail.com> wrote:
>
>>
>>
>> On Mon, Jan 28, 2013 at 9:32 AM, Varun Vijayaraghavan <
>> varun.kvv at gmail.com> wrote:
>>>
>>> On some of our processes, which incidentally run on smaller instances,
>>> we see the message count in the consumer suddenly drop to about 50%.
>>> This happens once a week, and does not fix itself until we restart
>>> the consumer process.
>>>
>>> Is this expected behavior related to smaller machines, or .. something
>>> else? Also, could someone explain the mechanism that would cause the ping
>>> count to drop like that?
>>>
>>
>> So is this throughput dropping (msgs/second) or dropped data (messages
>> not arriving)?
>>
>> Ian
>>
>>
>>
>
>
> --
> - varun :)
>



-- 
- varun :)