[zeromq-dev] TCP Based Message Bus

Doron Somech somdoron at gmail.com
Wed May 8 09:18:42 CEST 2013

Hi All,

We usually use ZeroMQ with PGM as our message bus. We use the message bus
to publish events between server-side services.

The issue is that we need to support environments where multicast is not
supported (such as the Amazon cloud).

I'm working on a design for a TCP-based message bus and would like to get
your thoughts on it.

There are three major requirements: we want services to be able to come and
go without reconfiguring the system, we want a brokerless design, and we
want to be able to recover messages lost between a publisher and a
subscriber (because of a connection problem), as PGM does.

We have three types of components: a discovery service, a publisher and a
subscriber.

The discovery service is a standalone service that holds the list of all
the subscribers in the network. Each subscriber pings the discovery service
every X seconds; when a subscriber has not pinged the service for more than
Y seconds, it is considered dead. For every new subscriber the discovery
service publishes a message to all the publishers. For high availability
there is more than one discovery service (probably 3).
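To make the ping/timeout rule concrete, here is a minimal sketch of the bookkeeping a discovery service might do. It is plain Python with no networking; the class name SubscriberRegistry, the timeout_s parameter (the "Y seconds" above) and the injectable clock are assumptions made for illustration, not part of any ZeroMQ API.

```python
import time


class SubscriberRegistry:
    """Tracks subscribers by their last ping time (hypothetical helper)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s  # the "Y seconds" after which a subscriber is dead
        self.clock = clock          # injectable so the sketch can be tested
        self.last_ping = {}         # subscriber address -> time of last ping

    def ping(self, address):
        """Record a ping. Returns True if the subscriber was not known before,
        which is when the service would notify the publishers."""
        is_new = address not in self.last_ping
        self.last_ping[address] = self.clock()
        return is_new

    def expire(self):
        """Remove and return subscribers silent for more than timeout_s."""
        now = self.clock()
        dead = [a for a, t in self.last_ping.items() if now - t > self.timeout_s]
        for address in dead:
            del self.last_ping[address]
        return dead

    def alive(self):
        """Return the addresses currently considered alive, sorted."""
        return sorted(self.last_ping)
```

A real service would call expire() periodically and publish the result of ping() and expire() to the publishers.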

When a publisher starts, it asks the discovery service for all of the
subscribers and subscribes to notifications about new subscribers (it asks
all configured discovery services and takes the first answer, and it
subscribes to all of the discovery services). After getting the list, the
publisher connects to all of the subscribers. The publisher also connects
to every new subscriber. The publisher ignores dead subscribers (mostly
because I don't know how to handle them: a dead notification can come from
one discovery service while the subscriber is still alive on the others).
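Because the publisher listens to several discovery services, it will probably hear about the same subscriber more than once. A rough sketch of how the publisher side could connect once per address and ignore dead notifications follows. The PublisherPeers class and the connect callable are hypothetical; in a real system connect would open the actual socket connection.

```python
class PublisherPeers:
    """Remembers which subscribers the publisher has connected to (hypothetical)."""

    def __init__(self, connect):
        self.connect = connect  # assumed callable that opens a connection to an address
        self.connected = set()

    def add(self, address):
        """Connect once per address. Returns True if a new connection was made."""
        if address in self.connected:
            return False  # already known, e.g. announced by another discovery service
        self.connect(address)
        self.connected.add(address)
        return True

    def on_snapshot(self, addresses):
        """Handle the initial subscriber list; returns the newly connected addresses."""
        return [a for a in addresses if self.add(a)]

    def on_dead(self, address):
        """Dead notifications are ignored on purpose: one discovery service may
        report a subscriber dead while the others still see it alive."""
        pass
```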

All the messages the publisher sends are numbered, and the publisher also
keeps the last X messages it sent to support recovery of lost messages.
Each publisher has a unique random id.
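The numbering and the bounded history could be sketched roughly as below. This is only an illustration: NumberedPublisher, history_size (the "last X messages"), the send callable and the replay method are names I made up, and starting the sequence at 1 is an assumption.

```python
import uuid
from collections import deque


class NumberedPublisher:
    """Numbers outgoing messages and keeps a bounded history (hypothetical)."""

    def __init__(self, history_size, send):
        self.publisher_id = uuid.uuid4().hex       # unique random id
        self.next_seq = 1                          # assumption: numbering starts at 1
        self.history = deque(maxlen=history_size)  # oldest entries fall off automatically
        self.send = send                           # assumed callable that does the real send

    def publish(self, payload):
        """Number the payload, remember it, and send it. Returns the number used."""
        seq = self.next_seq
        self.next_seq += 1
        self.history.append((seq, payload))
        self.send((self.publisher_id, seq, payload))
        return seq

    def replay(self, first, last):
        """Return messages first..last (inclusive), or None if any of them
        is no longer (or not yet) in the history."""
        if not self.history or first > last:
            return None
        if first < self.history[0][0] or last > self.history[-1][0]:
            return None
        return [(s, p) for s, p in self.history if first <= s <= last]
```

The replay method is what the request-response side of the publisher would answer recovery requests with.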

If a publisher doesn't send a message for X seconds, it sends a keep-alive
message to all subscribers.
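The keep-alive rule is just an idle timer that is reset on every send. A small sketch, with KeepAliveTimer, interval_s and the injectable clock as assumed names:

```python
import time


class KeepAliveTimer:
    """Tells the publisher when it has been idle long enough to send a keep-alive."""

    def __init__(self, interval_s, clock=time.monotonic):
        self.interval_s = interval_s  # the "X seconds" of allowed silence
        self.clock = clock            # injectable so the sketch can be tested
        self.last_sent = clock()

    def mark_sent(self):
        """Call after every message sent, whether a real one or a keep-alive."""
        self.last_sent = self.clock()

    def due(self):
        """True if nothing has been sent for at least interval_s seconds."""
        return self.clock() - self.last_sent >= self.interval_s
```

The publisher loop would check due() periodically, send the keep-alive when it returns True, and then call mark_sent().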

As mentioned, the subscriber pings the discovery services every X seconds.
When the subscriber gets a message from a publisher for the first time, it
saves the message number. From then on, if the subscriber detects a gap in
the message numbers, it connects directly to the publisher (using
request-response) and asks for the missing messages. The only problem is
that while messages are missing the subscriber stops handling new messages
from all publishers until the missing messages are restored.

If the publisher no longer has those messages, the subscriber should raise
an exception or restart the entire service.
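The subscriber-side logic from the last two paragraphs might look something like this. It is a sketch under assumptions: GapDetector, MessagesLost and the request_missing callable (standing in for the request-response call to the publisher, returning None when the messages are gone) are all hypothetical names, and duplicates are simply dropped.

```python
class MessagesLost(Exception):
    """Raised when the publisher can no longer supply the missing messages."""


class GapDetector:
    """Tracks the last message number seen per publisher (hypothetical)."""

    def __init__(self, request_missing):
        # Assumed callable(publisher_id, first, last) -> list of (seq, payload),
        # or None if the publisher no longer has those messages.
        self.request_missing = request_missing
        self.last_seq = {}

    def on_message(self, publisher_id, seq, payload):
        """Return the payloads to handle, in order, recovering any gap first."""
        last = self.last_seq.get(publisher_id)
        if last is None:
            # First message from this publisher: just remember its number.
            self.last_seq[publisher_id] = seq
            return [payload]
        if seq <= last:
            return []  # duplicate or old message, nothing to deliver
        deliver = []
        if seq > last + 1:
            # Gap detected: this call blocks, so all processing stops until it returns.
            missing = self.request_missing(publisher_id, last + 1, seq - 1)
            if missing is None:
                raise MessagesLost(
                    f"{publisher_id}: messages {last + 1}..{seq - 1} are gone")
            deliver.extend(p for _, p in missing)
        deliver.append(payload)
        self.last_seq[publisher_id] = seq
        return deliver
```

Because request_missing is called synchronously, the sketch also shows the drawback mentioned above: nothing else is handled while the recovery is in progress.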

The only things the subscribers and publishers need to know are the
addresses of the discovery services.

The reason I want the publisher to connect to the subscriber is to make
sure that when the connection is dropped the publisher is able to recognize
it and reconnect (the subscriber may not be able to recognize it because it
doesn't send any data to the publishers).

Thanks, I would very much appreciate your comments.
