[zeromq-dev] RFH: PUB/SUB + REQ/REP + PPP combo needed

Goswin von Brederlow goswin-v-b at web.de
Thu Jun 26 16:55:43 CEST 2014


Hi,

I need a combination of PUB/SUB and REQ/REP with some form of PPP
(Paranoid Pirate Protocol) added to the mix, and I wonder how best to
do this. I have something in mind but I don't want to influence your
thinking.

So let's look at it fresh from the outside.

Peers:
------

- I have a central master (M) that acts as a controller for a large
number of workers, and as the interface to a MariaDB database and
internal config settings.

- I have a large number of workers (W) connected to the master with
heartbeats, so the master knows which workers are online. Take that
part as given.

- I have a small number of clients that users start/stop at any time.
A client is a frontend for configuration and running jobs on the
workers (through the master).


Message traffic:
----------------

1) A client sends simple requests to the master, e.g. set config
BAR="foo". The master should ACK the request if it is correct or NACK
on error. I think I can make those messages idempotent, so a client can
resend a request until it gets an ACK or NACK back. Simple requests are
synchronous, atomic and fast. If they aren't done within 1s then
something is wrong.

2) The master tells all clients that a config option has changed, e.g.
now BAR="foo". That message must not be dropped or clients get out of
sync.

3) The master tells all clients that a worker is now online/offline
(same as 2 but different source).

4) A client sends a work order to the master, e.g. run "date" on
worker "beo-[1-5]". The master should ACK the request and send it to
the respective workers. Each worker ACKs the command, sends output for
the command as it appears and finally sends a FINISHED for the command
including the exit status. The worker output needs to be forwarded to
the client. When all workers have sent their FINISHED the master sends
a final ALLFINISHED to the client.

So this is a complex req/rep pattern with many async replies for a
single request. Ideally other clients should be able to subscribe to a
running work order too.
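
To make the flow in 4) concrete, here is a rough sketch of the
bookkeeping the master would need per work order. It is transport
agnostic (no sockets), and everything in it is an assumption for
illustration only: the names expand_hosts and WorkOrder, the tuple
layout of the forwarded messages, and the reading of "beo-[1-5]" as the
numeric range beo-1 .. beo-5.

```python
import re


def expand_hosts(pattern):
    """Expand 'beo-[1-5]' into ['beo-1', ..., 'beo-5'].

    Assumed syntax: at most one '[lo-hi]' numeric range per pattern.
    """
    m = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if m is None:
        return [pattern]  # plain host name, nothing to expand
    prefix, lo, hi, suffix = m.groups()
    return [f"{prefix}{i}{suffix}" for i in range(int(lo), int(hi) + 1)]


class WorkOrder:
    """Tracks one work order on the master (hypothetical helper)."""

    def __init__(self, command, workers):
        self.command = command
        self.pending = set(workers)  # workers that have not sent FINISHED
        self.exit_status = {}        # worker -> exit status

    def on_worker_message(self, worker, kind, payload=None):
        """Return the list of messages to forward to the client."""
        if worker not in self.pending:
            return []  # unknown worker, or traffic after its FINISHED
        if kind in ("ACK", "OUTPUT"):
            return [(kind, worker, payload)]
        if kind == "FINISHED":
            self.pending.discard(worker)
            self.exit_status[worker] = payload
            forward = [("FINISHED", worker, payload)]
            if not self.pending:
                # Last worker is done: close the order for the client.
                forward.append(("ALLFINISHED", None, dict(self.exit_status)))
            return forward
        raise ValueError(f"unexpected message kind: {kind!r}")
```

Each returned tuple would be sent to the requesting client, and the
same tuples could be sent to any other client subscribed to that work
order.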


Requirements:
-------------

- must handle network outage
- must handle crash/restart of master (pending requests must not be lost)
- must handle crash of client (pending requests can be lost)
- clients should use a single socket to make port configuration and
  tunneling easy
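
For the network outage case combined with the simple requests in 1),
the "resend until ACK or NACK" idea could look roughly like the sketch
below. It assumes each request carries a client-chosen request ID and
that the master caches its reply per ID, so a resend is answered again
but not applied twice. The names ToyMaster and request_with_retry and
the transport callable are hypothetical stand-ins; with pyzmq the
transport would be a socket send plus a poll with a timeout.

```python
class ToyMaster:
    """Stand-in for the master: caches replies by request ID."""

    def __init__(self):
        self.config = {}
        self.replies = {}  # request ID -> reply already sent
        self.applied = 0   # how many requests actually changed state

    def handle(self, req_id, key, value):
        if req_id in self.replies:
            return self.replies[req_id]  # duplicate: replay old answer
        if not key:
            reply = (req_id, "NACK")     # malformed request
        else:
            self.config[key] = value
            self.applied += 1
            reply = (req_id, "ACK")
        self.replies[req_id] = reply
        return reply


def request_with_retry(transport, req_id, key, value, retries=3):
    """Resend the same request until an ACK or NACK for req_id arrives.

    transport(req_id, key, value) returns a (req_id, status) tuple, or
    None to model a timeout (request or reply lost).
    """
    for _ in range(retries):
        reply = transport(req_id, key, value)
        if reply is not None and reply[0] == req_id:
            return reply[1]
    raise TimeoutError(f"no answer for request {req_id!r}")
```

In this sketch the reply cache lives in memory only, so it would not
survive a master restart; to meet the "pending requests must not be
lost" requirement it would presumably have to be persisted somewhere,
for example in the database the master already uses.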


So what are your ideas or recommendations?

Note: This uses Python3 and the latest libzmq/pyzmq, and on the client PySide (Qt).

MfG
	Goswin


