[zeromq-dev] Understanding ØMQ PubSub Scaling

Martin Sustrik sustrik at 250bpm.com
Tue Jan 11 22:01:19 CET 2011


Hi Dave,

> I am looking into products for a pub-sub problem that I am working through
> and I think that ZeroMQ might fit the bill. I am curious to understand
> the economies of scale that ZeroMQ works upon and how I can efficiently
> tackle the problem at hand. My problem is best described with a simple
> scenario:
>
> 10 machines, each acting as both publishers and subscribers.
> 1 publisher per machine.
> 10,000 subscribers per machine.
> Subscribers are non-blocking and being looped over in a single thread
> looking for data.
> Messages being passed are relatively small. Let's say 1K for the sake of
> this scenario.
> It is not a pure fan-out scenario. A single subscriber will be
> interested in only a subset of the messages published. From the 100,000
> subscribers, maybe 1, maybe 20, maybe 80,000 will be interested in the
> message, depending on the routing key for the message.
> A subscriber/consumer will be interested in several routing keys, but it
> will have an explicit list of what those are. There will be no fuzzy
> matching on the routing key. A routing key will either match exactly
> something in its list or it won't.
>
> How does a scenario like this perform, theoretically? Assuming an
> adequately beefy machine (whatever that means), if I send a single
> message to all 100,000 subscribers, how quickly can I expect them to see
> it as available? How does memory usage correlate? What happens if I have
> more than 10,000 subscribers in a single thread? What happens if I have
> each machine run 4 processes instead of 1, thus having 4 publishers and
> 40,000 subscribers on a machine? What happens if I increase the number
> of machines from 10 to 100, with 1 million total subscribers?
>
> And what sort of topology best makes this work? Can I set up multicast
> and do it all over TCP or should I have relays at each of the servers
> that subscribe to each other indiscriminately and then forward the
> messages with the routing keys over something like ipc/inproc?
>
> I know that I'm asking a lot and I'm sure that there are more variables I
> haven't thought of here. I'm looking for some sage advice or direction
> that can help me understand ZeroMQ better and what it can or can't do.

So, in short, 0MQ is optimised for a large number of subscribers (using
epoll/kqueue, O(1) message routing algorithms, etc.).
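
To give a feel for why exact-match routing need not slow down as the
subscriber count grows, here is a toy routing table in Python. It is
only a sketch for the scenario you describe, not 0MQ's internal
implementation; the class name, method names and routing keys are all
made up for illustration.

```python
from collections import defaultdict


class ExactMatchRouter:
    """Toy routing table: routing key -> set of subscriber ids."""

    def __init__(self):
        self._by_key = defaultdict(set)

    def subscribe(self, subscriber_id, keys):
        # Register the subscriber under each routing key it lists.
        for key in keys:
            self._by_key[key].add(subscriber_id)

    def route(self, key):
        # One hash lookup, however many subscribers exist in total.
        # .get() avoids creating empty entries for unknown keys.
        return self._by_key.get(key, frozenset())


router = ExactMatchRouter()
router.subscribe(1, ["quotes.ibm", "quotes.msft"])
router.subscribe(2, ["quotes.ibm"])
router.subscribe(3, ["news"])

print(sorted(router.route("quotes.ibm")))
print(sorted(router.route("weather")))
```

The lookup itself costs the same for 3 subscribers as for 100,000. What
grows is the work of actually sending the message to each subscriber
that matched.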

Moreover, you can build a message distribution tree using forwarder
devices. Think of devices as routers that re-distribute the messages for
a particular sub-network. There can be an arbitrary number of levels in
the distribution tree.
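
The shape of such a tree can be modelled in a few lines of Python. This
is a plain simulation, not 0MQ code: the Node class, build_tree() and
the figure of 100 subscribers per machine are assumptions chosen to keep
the example small.

```python
class Node:
    """A forwarder (has children) or a subscriber (leaf) in a toy tree."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.received = []

    def handle(self, message):
        """Record the message, re-send it to every child, and return
        the number of copies sent at or below this node."""
        self.received.append(message)
        sent = len(self.children)
        for child in self.children:
            sent += child.handle(message)
        return sent


def build_tree(machines, subscribers_per_machine):
    # Two levels: publisher -> one forwarder per machine -> subscribers.
    forwarders = []
    for m in range(machines):
        leaves = [Node(f"sub-{m}-{s}") for s in range(subscribers_per_machine)]
        forwarders.append(Node(f"forwarder-{m}", leaves))
    return Node("publisher", forwarders)


root = build_tree(machines=10, subscribers_per_machine=100)
total_copies = root.handle("hello")

print(len(root.children))  # copies the publisher itself sends: 10
print(total_copies)        # 10 uplink copies + 10 * 100 local copies: 1010
```

The point of the tree is that the publisher sends one copy per machine
rather than one copy per subscriber. The per-subscriber fan-out happens
locally at each forwarder.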

However, 0MQ lacks "subscription forwarding", i.e. the ability to pass
the subscriptions up the distribution tree so that each node in the tree
forwards only the messages that the nodes below it are actually
interested in. This feature is much desired but, unfortunately, it
hasn't been implemented yet.
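
To make the difference concrete, the following Python sketch counts how
many copies of one message leave the publisher with and without
subscription forwarding. It is a model only; aggregate() and
copies_on_uplinks() are hypothetical helpers, and the routing keys are
invented.

```python
def aggregate(subscriber_keys):
    """Union of the routing keys wanted by the subscribers below one
    forwarder. This is what the forwarder would pass up the tree."""
    wanted = set()
    for keys in subscriber_keys:
        wanted |= set(keys)
    return wanted


def copies_on_uplinks(machine_subscriptions, key, subscription_forwarding):
    """Count copies of one message sent from the publisher to the
    machine-level forwarders."""
    if subscription_forwarding:
        # Only machines with at least one interested subscriber get a copy.
        return sum(1 for keys in machine_subscriptions if key in keys)
    # Without it, every forwarder gets every message.
    return len(machine_subscriptions)


machines = [
    aggregate([["a"], ["a", "b"]]),
    aggregate([["b"]]),
    aggregate([["c"]]),
    aggregate([]),  # a machine with no subscriptions at all
]

print(copies_on_uplinks(machines, "b", subscription_forwarding=False))  # 4
print(copies_on_uplinks(machines, "b", subscription_forwarding=True))   # 2
```

In your scenario, a message that only one subscriber cares about would
still travel to all ten machines today, and would be dropped there.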

Martin



