[zeromq-dev] Our router architecture - suggestions going forward?
noah at ooyala.com
Thu Feb 2 20:18:54 CET 2012
Hi! My team at Ooyala are putting together a zmq-based architecture for
some monitoring stuff we're doing. We're trying to figure out if it's
reasonable to keep compatibility options for ZMQ 4.x. I'm hoping you might
have suggestions, for that or in general.
** First, why we're doing this:
The idea is that a monitoring client runs on each monitored machine. Local
processes send registrations, statistics, heartbeats and notifications
(errors, warnings, etc). They also declare plugins to run periodically to
assess process and machine health, roughly like what Nagios does. Client
sets are dynamic, and a lot of this runs in EC2.
We send this information to Graphite, to our alerting system, to our
scheduling system for running plugins and to a few other places. Then we
can see the results and determine the health of our cluster -- what
machines are running and what applications they're running, as well as
health checks from the plugins.
** Next, what we're doing with ZMQ:
The clients send JSON with the stats, notifications, etc. over a ZMQ_DEALER
connected to central routers (six routers, to start with). The routers
bind a ZMQ_ROUTER socket for client traffic, which is resent via a ZMQ_PUSH
socket to our back-end message sinks.
A few high-value messages like error notifications require acknowledgements
from the sink, and will be resent periodically until the ack is received by
the client. The router doesn't store any state about that, it just
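
A minimal sketch of the client-side "resend until acked" bookkeeping described above, in Python. It is only an illustration under assumptions, not the actual implementation: the class name PendingAcks, the idea of a per-message id, and the five-second interval are all hypothetical.

```python
import time


class PendingAcks:
    """Client-side table of messages that still await an ack from a sink."""

    def __init__(self, resend_interval=5.0, clock=time.monotonic):
        self.resend_interval = resend_interval  # hypothetical value, in seconds
        self.clock = clock                      # injectable so tests can fake time
        self._pending = {}                      # msg_id -> (payload, time of last send)

    def sent(self, msg_id, payload):
        # Record a high-value message right after it goes out on the socket.
        self._pending[msg_id] = (payload, self.clock())

    def acked(self, msg_id):
        # The ack arrived, so stop resending. Unknown ids are ignored.
        self._pending.pop(msg_id, None)

    def due_for_resend(self):
        # Return (msg_id, payload) pairs whose ack is overdue, restarting their timers.
        now = self.clock()
        due = []
        for msg_id, (payload, last_sent) in list(self._pending.items()):
            if now - last_sent >= self.resend_interval:
                self._pending[msg_id] = (payload, now)
                due.append((msg_id, payload))
        return due
```

The client's main loop would call due_for_resend() periodically and send each returned payload again. All of the state lives on the client, which fits a router that keeps no record of outstanding acks.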
Each client has a UUID. It's sent in their JSON messages, and it's what they
set as the socket identity. That's how we send them things. It's how we
identify things like statistics from them. It persists across reboots, but
we can generate new ones easily when provisioning new virtual machine
The message sinks connect to the routers with a ZMQ_PULL socket. They
receive messages (stats, notifications, etc.) and put them in various
back-end storage, including sending out notifications by email or pager
where appropriate. Each message sink has a type (heartbeat sink, stats
sink, registration sink, etc.), and the router's push socket distributes the
work among the available sinks.
The routers also bind a REP socket for traffic from the back end *to* the
clients. At the moment, the traffic to the client is either acks or "run
this plugin now" messages.
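
A minimal sketch of how a router might turn one back-end request on the REP socket into frames for the client-facing ROUTER socket. The JSON field names ("to", "body", "ok", "error") and the function name are assumptions made for illustration; the post does not describe the wire format.

```python
import json


def handle_backend_request(request, routing_ids):
    """Turn one REP request into (frames for the ROUTER socket, reply bytes).

    routing_ids maps client UUID -> routing frame for that client. When the
    client sets its socket identity to the UUID, the frame is just the UUID
    as bytes.
    """
    msg = json.loads(request)
    routing_id = routing_ids.get(msg["to"])
    if routing_id is None:
        # A REP socket has to answer every request, so report the miss.
        reply = json.dumps({"ok": False, "error": "unknown client"}).encode()
        return None, reply
    body = json.dumps(msg["body"]).encode()
    # A ROUTER socket expects the routing frame first, then the payload.
    return [routing_id, body], json.dumps({"ok": True}).encode()
```

The caller would send the returned frames as a multipart message on the ROUTER socket (when they are not None) and send the reply back on the REP socket.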
A scheduler (in practice, several machines) looks at that storage,
determines what plugins need to run, and then sends "run this plugin"
messages to the REP socket on the router to be forwarded to the clients by
** What we're worried about with 4.0:
From the mailing list, it sounds like ZMQ 4.0 router sockets won't support
setting identity, which makes it difficult to send to a client by UUID.
Presumably we could make each client, when it connects to the router, send
its UUID in a "hello" message so that the router could then save its
identity and forward messages to it. Does that sound like the right
approach? Should we be doing this already in 3.1?
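
A minimal sketch of the "hello" idea in Python: the router keeps its own table from client UUID to whatever routing frame the ROUTER socket reports for that connection, instead of relying on a client-chosen identity. The class name ClientTable and the JSON fields "type" and "uuid" are hypothetical, chosen only for this illustration.

```python
import json


class ClientTable:
    """Router-side map from client UUID to the routing frame of its connection."""

    def __init__(self):
        self._by_uuid = {}

    def handle_client_frames(self, frames):
        # frames is what a ROUTER socket delivers: [routing_id, payload].
        routing_id, payload = frames
        msg = json.loads(payload)
        # Refresh on every message so a reconnect under a new routing id is noticed.
        self._by_uuid[msg["uuid"]] = routing_id
        # A "hello" only registers the client; anything else goes on to the sinks.
        return None if msg.get("type") == "hello" else payload

    def frames_for(self, uuid, payload):
        # Build the multipart message that reaches the client, if we know it.
        routing_id = self._by_uuid.get(uuid)
        if routing_id is None:
            return None
        return [routing_id, payload]
```

Because every message already carries the UUID, the table is refreshed by normal traffic as well, so the explicit "hello" mainly covers clients that have not yet sent anything else.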
Right now we're using ROUTER and DEALER for the client/router connection,
which lets us send everything over a single socket - very nice for keeping
our firewall rules simple. But it sounds like there's not any way to do
this in a way that's both 3.1- and 4.0-compatible. Is that true, or am I
Right now we're in the early stages. We have a basic ZMQ topology running
and a few tests, but there will never be a better time to change this
architecture. What are we doing wrong?
Software Engineer |
noah at ooyala.com | (510) 260-5409 (cell)
www.ooyala.com | blog <http://www.ooyala.com/blog> |