[zeromq-dev] I'm losing messages doing 2 way heartbeat checks

Joseph Bowman bowman.joseph at gmail.com
Mon May 23 05:30:07 CEST 2011


I'm new to ZeroMQ. After reading about it and going through the guide, I had
the idea of building a self configuring load balancing application using the
Least Recently Used methodology demonstrated throughout the guide.

I've started work and run into my first problem. I've been beating on it
for a while and think I must be missing something, but I can't figure out
what it is.

Basic information:
ZeroMQ v 2.1.7
pyzmq v 2.1.7
Python 2.6.1

Operating systems tested:
OS X on a 13in MBP (Core 2 Duo)
Rackspace Ubuntu 9.10 server, 4 cores.

Results were the same on both machines.

The high level description is that I've got 2 applications. One is the
broker, the other is the worker. I haven't gotten as far as having multiple
workers talk to the broker yet.

Direct links to code are:
github repo - https://github.com/joerussbowman/Scale0
broker code -
worker code -

To run it, start scale0.py, then start test_worker.py. Each side keeps a
list of the pings/heartbeats it sends. The pings/heartbeats include
timestamps. When a side gets a response, it deletes that timestamp from its
list. If you run them you will see that the lists start growing on both
ends. It's like it switches off, with one heartbeat working for a while,
then the other.
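
To make the bookkeeping above concrete, here is a minimal sketch of the
idea. It is not the code from the repo: the HeartbeatTracker class and its
method names are made up for illustration, and it assumes the timestamp is
kept as the exact string that goes on the wire, so the reply can be matched
without any float round-tripping.

```python
import time


class HeartbeatTracker(object):
    """Hypothetical helper: remembers heartbeats that have no reply yet."""

    def __init__(self):
        # Timestamps (as wire strings) of heartbeats still waiting on a reply.
        self.outstanding = []

    def sent(self, timestamp=None):
        """Record a heartbeat; return the string to put in the message."""
        if timestamp is None:
            # Assumption: fixed-precision string so both ends compare equal.
            timestamp = "%.6f" % time.time()
        self.outstanding.append(timestamp)
        return timestamp

    def replied(self, timestamp):
        """Drop the timestamp echoed back in a reply.

        Returns False if the timestamp was never recorded (or was already
        removed), which is worth logging when chasing lost messages.
        """
        try:
            self.outstanding.remove(timestamp)
            return True
        except ValueError:
            return False

    def missed(self):
        """Number of heartbeats sent that have not been answered."""
        return len(self.outstanding)
```

With something like this on each side, a steadily growing missed() count is
the symptom described above.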

The only other thing I can think of to note is I am using the eventloop.
I've done my best to add in some comments to the code to make it easier to
understand what's going on, though I'm sure it could be commented better.

More detailed description follows.

Broker has 1 XREP socket open.

Worker has 1 XREP socket and 1 XREQ socket open.

Both send heartbeat messages to each other, expecting a response to validate
the other is alive. Broker sends ping, expects pong. Worker sends heartbeat,
expects heartbeatreply.

Workers connect to the Broker and send a ready. The Broker adds the worker
to the LRU. The Worker also starts sending heartbeat requests. Every second,
the Broker goes through its LRU queue and sends a ping to each worker in
the queue. It does this by creating a socket and sending the ping to the
connection the worker told the broker it has available.
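
As a rough sketch of that once-a-second ping round (again, not the actual
broker code): the LRUQueue class, the ["ping", timestamp] message layout and
the send callable are all assumptions for illustration. The send argument
stands in for whatever really creates the socket and sends to the worker's
connection.

```python
from collections import deque


class LRUQueue(object):
    """Hypothetical broker-side queue of ready workers, oldest first."""

    def __init__(self):
        self.workers = deque()

    def ready(self, address):
        """Add a worker that sent a ready; ignore duplicates."""
        if address not in self.workers:
            self.workers.append(address)

    def next_worker(self):
        """Pop the least recently used worker."""
        return self.workers.popleft()

    def ping_all(self, send, timestamp):
        """Send one ping to every queued worker; return how many were sent.

        `send(address, message)` is a stand-in for the real socket code.
        """
        # Iterate over a copy so `send` may safely modify the queue.
        for address in list(self.workers):
            send(address, ["ping", timestamp])
        return len(self.workers)
```

In the real broker, ping_all would be driven by a periodic callback on the
eventloop, and each ping's timestamp would be recorded until the matching
pong arrives.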

If anyone could get me pointed in the right direction to not get the
heartbeat misses, I'd be grateful.