[zeromq-dev] Migrating IPC-based Pub/Sub System To ZMQ
Santy, Michael
Michael.Santy at dynetics.com
Fri Aug 13 17:57:53 CEST 2010
All,
I'm currently working to refactor an existing pub/sub distributed system built around CMU IPC[1] and raw sockets to solely use ZMQ. I've not been able to fit it into any one of the common patterns, so I'd like to get some feedback from the list on how to improve my proposed design.
For those of you not familiar with CMU IPC: it is a brokered pub/sub framework in which every process can publish and subscribe through the CMU IPC API. Each process connects via CMU IPC to a "central" server process over TCP/IP to register subscriptions and publish messages, and typically this central server also routes the messages. While CMU IPC works relatively well for its intended use, we've found its performance and fault tolerance lacking. In our benchmarks with large messages the performance is unacceptable: on top of the multi-hop latency forced by a brokered system, network utilization was less than 30% on 1GbE and much worse on 20Gb IB. The IPC central server is also a single point of failure; if it goes down, it takes all of the clients with it, and we've found no way to recover.
The top half of the attached diagram illustrates our existing system. It collects high-speed data from a number of sources, combines the data centrally into processing packets, and farms these packets out (mostly) round-robin to one of many data processors. Because of the performance issues, we use CMU IPC only for administrative messages (shown in dashed red). We've implemented a raw-socket mechanism for distributing the high-speed data (shown in blue) and use CMU IPC to coordinate those socket connections.
I believe that by adopting ZMQ we can improve and simplify our system by moving both the high-speed data and the administrative messages into a single ZMQ-based messaging framework. As an added benefit, it appears we gain some fault tolerance, since ZMQ abstracts away the socket bind/connect ordering and handles reconnection. The lower half of the diagram shows my best guess at how to adopt ZMQ. In this diagram, all connections between processes are ZMQ, and each endpoint is labeled with whether the ZMQ socket binds or connects and whether it is a PUB or SUB socket.
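To give a feel for what I mean by the connection abstraction, here is a minimal subscriber sketch, assuming the 2.0-series C API; the endpoint and the "ADMIN." topic prefix are placeholders I've made up. The connect call succeeds even if nothing is bound at the other end yet, and ZMQ re-establishes the TCP connection if the peer is restarted:

/* Minimal admin-message subscriber sketch (2.0-series C API assumed).
 * "tcp://forwarder-host:5551" and the "ADMIN." prefix are placeholders. */
#include <zmq.h>
#include <stdio.h>

int main (void)
{
    void *ctx = zmq_init (1);
    void *sub = zmq_socket (ctx, ZMQ_SUB);
    zmq_setsockopt (sub, ZMQ_SUBSCRIBE, "ADMIN.", 6);   /* hypothetical topic prefix */
    zmq_connect (sub, "tcp://forwarder-host:5551");     /* ok even if nothing is bound yet */

    while (1) {
        zmq_msg_t msg;
        zmq_msg_init (&msg);
        zmq_recv (sub, &msg, 0);                         /* 2.x signature: (socket, msg, flags) */
        printf ("received %zu bytes\n", zmq_msg_size (&msg));
        zmq_msg_close (&msg);
    }

    /* not reached */
    zmq_close (sub);
    zmq_term (ctx);
    return 0;
}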
The zmq_forwarder device seems very analogous to the CMU IPC central server, so it appears easiest to have it bind one endpoint for publishers and one for subscribers. Each process that needs to publish or subscribe would connect to one or both of the endpoints provided by zmq_forwarder, and all administrative messaging would go through this process. For the high-speed data, I don't think going through the forwarder is best: each processing packet is destined for only one process, and I'd like to avoid the overhead of two network hops. In that case I think it makes sense to use PUB/SUB ZMQ sockets directly from the publisher of the high-speed data to the subscriber.
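In case it helps clarify the design, here is a rough sketch of the standalone forwarder I have in mind, again assuming the 2.0-series C API; the port numbers are placeholders:

/* Admin-message forwarder sketch (2.0-series C API assumed).
 * Publishers connect to the frontend, subscribers to the backend;
 * both port numbers are placeholders. */
#include <zmq.h>
#include <assert.h>

int main (void)
{
    void *ctx = zmq_init (1);

    /* Publishers connect their PUB sockets here. */
    void *frontend = zmq_socket (ctx, ZMQ_SUB);
    zmq_setsockopt (frontend, ZMQ_SUBSCRIBE, "", 0);   /* forward every message */
    assert (zmq_bind (frontend, "tcp://*:5550") == 0);

    /* Subscribers connect their SUB sockets here. */
    void *backend = zmq_socket (ctx, ZMQ_PUB);
    assert (zmq_bind (backend, "tcp://*:5551") == 0);

    /* Shovel messages from frontend to backend until the context is terminated. */
    zmq_device (ZMQ_FORWARDER, frontend, backend);

    zmq_close (frontend);
    zmq_close (backend);
    zmq_term (ctx);
    return 0;
}

As far as I understand it, this is essentially what the stock zmq_forwarder device does, but writing it out makes the bind/connect arrangement explicit.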
In this design there are still two central points of failure: the zmq_forwarder and the data_combiner. If either of these processes goes down, the system stops operating. In the previous system it would stop catastrophically; in the ZMQ-based refactoring, I believe we can gracefully ride out the unavailability of either process and resume when it comes back, purely thanks to ZMQ's connection abstraction.
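For the high-speed path in particular, my rough idea for the data_combiner side looks something like the following (2.0-series C API again; the endpoint, topic prefix, and payload are made up). Since the data processors connect their SUB sockets to this endpoint, they should reconnect on their own whenever the combiner is restarted:

/* data_combiner high-speed publishing sketch (2.0-series C API assumed).
 * The endpoint, "PROC." topic prefix, and payload are placeholders. */
#include <zmq.h>
#include <string.h>

int main (void)
{
    void *ctx = zmq_init (1);
    void *pub = zmq_socket (ctx, ZMQ_PUB);
    zmq_bind (pub, "tcp://*:5560");                 /* data processors connect here */

    const char *packet = "PROC.packet-payload";     /* hypothetical topic + payload */
    zmq_msg_t msg;
    zmq_msg_init_size (&msg, strlen (packet));
    memcpy (zmq_msg_data (&msg), packet, strlen (packet));
    zmq_send (pub, &msg, 0);                        /* 2.x signature: (socket, msg, flags) */
    zmq_msg_close (&msg);

    zmq_close (pub);
    zmq_term (ctx);
    return 0;
}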
I've played around with the examples and have a superficial understanding of ZMQ, but have yet to apply it to a non-academic problem. I'd really appreciate any feedback on this design from those of you using ZMQ in production systems.
Thanks!
Mike Santy
[1] http://www.cs.cmu.edu/~IPC/
Attachment: zmq_topology.svgz (image/svg+xml-compressed, 2527 bytes)
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20100813/4982eb50/attachment.bin>