[zeromq-dev] discussion on routing algorithms

Martin Sustrik sustrik at fastmq.com
Fri Jan 23 14:43:10 CET 2009


I'm forwarding the discussion we've had with Kirill Smelkov (with his 
kind permission) on the topic of routing algorithms.

Martin

 > Actually the main load is located in routing, i.e. the task to decide
 > which subscribers to notify takes the most of the time.

We have some advanced routing algorithms described here, in case you are 
interested:

http://www.zeromq.org/whitepapers:message-matching

 > Actually we used to put not-so-high traffic on the bus, because most of
 > the on-board digital equipment produce small traffic (e.g. GPS sends 1-5
 > messages a second)
 >
 > *but* now we are also planning to put several additional sources,
 > converted from analog to digital with much more higher traffic load.
 >
 > e.g. radar/sonar signals and video signals, etc...
 > The idea is to put everything into the same bus, so that the system is
 > unified I/O wise.

 > [One thing I could suggest to also implement in zmq as well, is to try
 >  to automatically offload routers with the help of hardware, or OS
 >  stack.
 >
 >  For example if we talk about UDP/Multicast transport there could be a
 >  rule to automatically split traffic for different topics on different
 >  ports, so e.g. '/radar/...' goes to port1, and the rest goes to port2.
 >
 >  If this is done automatically on both publisher and subscriber ends,
 >  we'll not load every receiver process which does not need radar data
 >  with unneeded stuff -- it simply will not open a listening socket on
 >  port1
 >
 >  I think this idea could be handy,

Yes, that's the way we do it. However, this means you have to have small 
number of pre-configured topics so that they can be assigned to 
multicast groups/ports. In the generic case of topic routing (e.g. 
"ibm.stocks.*.usd.*" the set is too large to be handled this way (also 
the topic subscriptions are overlapping, i.e. "ibm.stocks.futures.jan10" 
matches both "ibm.*.*.*" and "*.*.*.jan10" meaning that the message 
would have to be sent twice on different multicast groups). Actually I 
recall discussing offloading such complex routing into NICs with Intel 
guys once...

 >  Another interesting thing is to try to implement 'topic compiler'. e.g.
 >  instead of matching strings (or regexps) topics at runtime, try to map
 >  them into numbers, so that matching could be implemented with only
 >  checking whether
 >
 >     i_topic is in [i_low,i_high]
 >
 >  range, or several such ranges.
 >
 >  For general routing engine this would allow to reduce memory for
 >  routing tables from O(ntopics2) to something which is O(ntopic)
 >
 >  The compiler could not neccessarily be made static, I mean such a
 >  mapping tables could be constructed at runtime, maybe ... ]

Ah, the generic idea of dynamically compiling the subscriptions is 
described in detail in the whitepaper linked above.

What I see as particularly interesting is the case where's the set of 
topics is fixed tree-like hierarchy and the subscriptions have to have 
wildcard on the end only (e.g. "ibm.stocks.*"). In this case we can get 
routing overhead as low as 2 machine instructions as you've suggested 
above. This would mean just couple of CPU ticks = several nanoseconds 
per match.




More information about the zeromq-dev mailing list