[zeromq-dev] Not sure if ZeroMQ is right for me

Van Klaveren, Brian N. bvan at slac.stanford.edu
Sat Dec 14 01:14:12 CET 2013


I’m trying to determine if ZeroMQ is right for me. 

I have a local batch system of a few thousand nodes, several external batch systems distributed globally, and a server that distributes jobs to the appropriate batch system.

Upon startup and job completion, a job will notify the server that it has started/ended (traditionally done through email, although this has many problems, which is why we’re moving away from it).
Typically jobs are staggered, but occasionally thousands of jobs all start up at once (I'm still trying to explain to a coworker why querying Oracle from 1100 hosts yesterday was a bad idea).

What I think I’d like is a system where the server and clients preferentially talk through a broker.
There will be a main broker and a failover broker.
When the server goes down or becomes unresponsive, the broker should deliver the messages destined for it to a persistent-queue client (similar to the Titanic service protocol?), and that persistent queue should keep those messages until the server reconnects.
When a client isn’t responsive (which happens semi-frequently when you are dealing with thousands of batch machines), messages originating from the server should be delivered to the persistent queue, which will retry delivery at two intervals, after which the client is disconnected, the server is notified, and the messages are discarded. (The persistent queue will have write-back caching.)
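
The persistent-queue behaviour described above can be roughly sketched in plain Python. This is only an illustration under several assumptions, not a ZeroMQ or Titanic implementation: the class name `PersistentQueue`, the `keep` flag, the `send` callback, the SQLite storage, and the interval values are all hypothetical, and write-back caching is left out. Messages for the server are stored with `keep=True` and held until delivery succeeds; messages for a client are retried at two intervals and then discarded.

```python
import sqlite3
import time

# Assumed values: the post only says "two intervals", not how long they are.
RETRY_INTERVALS = (5.0, 30.0)


class PersistentQueue:
    """Hypothetical store-and-forward queue backed by SQLite."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " recipient TEXT NOT NULL,"
            " body BLOB NOT NULL,"
            " keep INTEGER NOT NULL,"
            " attempts INTEGER NOT NULL DEFAULT 0,"
            " next_try REAL NOT NULL)"
        )
        self.db.commit()

    def store(self, recipient, body, keep=False, now=None):
        """Save an undeliverable message; keep=True means never discard it."""
        now = time.time() if now is None else now
        self.db.execute(
            "INSERT INTO messages (recipient, body, keep, next_try)"
            " VALUES (?, ?, ?, ?)",
            (recipient, body, int(keep), now + RETRY_INTERVALS[0]),
        )
        self.db.commit()

    def pending(self, recipient):
        """Number of messages still held for a recipient."""
        row = self.db.execute(
            "SELECT COUNT(*) FROM messages WHERE recipient = ?", (recipient,)
        ).fetchone()
        return row[0]

    def retry_due(self, send, now=None):
        """Retry every due message via send(recipient, body) -> bool.

        Returns the recipients given up on, so the caller can disconnect
        them and notify the server.
        """
        now = time.time() if now is None else now
        rows = self.db.execute(
            "SELECT id, recipient, body, keep, attempts FROM messages"
            " WHERE next_try <= ? ORDER BY id",
            (now,),
        ).fetchall()
        gave_up = []
        for msg_id, recipient, body, keep, attempts in rows:
            if recipient in gave_up:
                continue  # this recipient's messages were already discarded
            if send(recipient, body):
                self.db.execute("DELETE FROM messages WHERE id = ?", (msg_id,))
            elif keep:
                # Server-bound message: hold it until the server is back.
                self.db.execute(
                    "UPDATE messages SET next_try = ? WHERE id = ?",
                    (now + RETRY_INTERVALS[-1], msg_id),
                )
            elif attempts + 1 >= len(RETRY_INTERVALS):
                # Both retries failed: discard everything for this client.
                self.db.execute(
                    "DELETE FROM messages WHERE recipient = ? AND keep = 0",
                    (recipient,),
                )
                gave_up.append(recipient)
            else:
                self.db.execute(
                    "UPDATE messages SET attempts = ?, next_try = ? WHERE id = ?",
                    (attempts + 1, now + RETRY_INTERVALS[attempts + 1], msg_id),
                )
        self.db.commit()
        return gave_up
```

In a real deployment the `send` callback would wrap whatever the broker uses to reach the recipient, and `retry_due` would be called from the queue's polling loop. Passing `now` explicitly is only there to make the retry schedule easy to exercise without waiting.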

For firewall reasons and other networking reasons, it may be necessary to have an intermediate broker (and possibly a persistent queue) at each of the external batch locations.

The server implementation would probably use JeroMQ, since the rest of the server is written in Java; clients would likely use pyczmq.

For message security, I was thinking I’d just pre-distribute a symmetric key to each batch system, which its jobs would use to encrypt their messages (leaving any necessary metadata in the clear).

I couldn’t really determine whether a library or framework already exists that would make this easy, although it seems like it shouldn’t be too hard with RabbitMQ. However, the support for ZeroMQ and the minimalism of the library are desirable to me and to possible users. Even if I didn’t use RabbitMQ for the whole thing, I’d probably end up using it for the persistent queue at least. Should I even bother with ZeroMQ?

