[zeromq-dev] content based routing for a thread pool

Brett Viren bv at bnl.gov
Sun Oct 22 18:37:21 CEST 2023


Hi again,

Lei Jiang <lei.jiang.29 at gmail.com> writes:

> The MDP pattern is very interesting but I can't use it in my case. I
> am aiming to share data and resources inside a process so can't go
> inter-process.

Like all ZeroMQ things, there is no need to choose the transport
mechanism when writing the code.  You may simply use an "inproc://"
address to have inter-thread communication.  No code changes needed.

> I'm also trying to maximize concurrency and minimize latency so
> ideally as few hops as possible.

The hops from client to broker to worker do not contribute significantly
to latency.  ZeroMQ transfer latency is very low, especially with
inproc://.  Any but the most trivial application-level processing in the
worker will dominate the overall latency.

And, even then, MDP and PPP use a clever latency-reducing feature.  The
worker-broker latency in task issuing is removed by the worker making a
request for a task *before* any task may be available.  The worker then
just blocks until the broker has a task to give it, and the task is
delivered with the reply.  The worker's *next* request to the broker for
a new task includes the result from the last task.  This pattern can't be
done with a direct client-worker pattern (unless the client effectively
internalizes the broker's role).
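
A rough, socket-free model of that request-before-task idea may help.
Everything here is hypothetical (the Broker class, its method names and
the "w1"/"task-a" strings are made up for illustration) and it is not
the MDP or PPP wire protocol, only the bookkeeping behind it:

```python
from collections import deque


class Broker:
    """Toy model of the broker's bookkeeping; no sockets involved."""

    def __init__(self):
        self.idle = deque()     # workers whose request is parked, awaiting a task
        self.pending = deque()  # tasks that arrived while no worker was idle
        self.results = []       # results carried in by workers' requests
        self.sent = []          # (worker, task) pairs, standing in for replies

    def request(self, worker, last_result=None):
        """A worker asks for work, handing over its previous result if any."""
        if last_result is not None:
            self.results.append(last_result)
        if self.pending:
            self.sent.append((worker, self.pending.popleft()))
        else:
            self.idle.append(worker)  # park the request until a task shows up

    def submit(self, task):
        """A client task arrives; answer a parked request at once if there is one."""
        if self.idle:
            self.sent.append((self.idle.popleft(), task))
        else:
            self.pending.append(task)


broker = Broker()
broker.request("w1")               # request made before any task exists
broker.submit("task-a")            # goes out immediately as the reply
broker.request("w1", "result-a")   # next request carries the last result
broker.submit("task-b")
```

Because the worker's request is already parked when "task-a" arrives,
the broker can hand it over without waiting for a round trip.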

> The other thing is the clients don't know the worker IDs in advance.

Just to be clear: in MDP/PPP, clients do not know anything about workers
and vice versa.

> The workers are of the same type of service.

Okay, then PPP would be enough.

> The reason I want to route messages by key(s) is that I must maintain
> sequence of messages of the same key so the only possible way seems to
> me to be to hash the key and send the messages to certain workers
> based on the hash.

I see.

So, you must record the order of tasks from client and then sort the
results to reflect that initial ordering.

You can do that with the socket "routing ID", but that will require
maintaining a map from "routing ID" to a "sequence number".  This map
would be filled when tasks go out, and when results come back one must
map backwards to get the sequence number with which to sort.

It is simpler (imo) to not make "routing ID" serve both purposes but to
split up the job.

To do that, a "sequence number" can be passed with the task message and
you can require that the task result message include the same "sequence
number".  You can then sort by sequence number in code that need not be
tied to the more detailed act of servicing sockets.
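
As a minimal sketch of that socket-independent sorting, assuming the
sequence numbers start at 0 and increase by one per task (the
Resequencer name and its push method are hypothetical, not part of any
ZeroMQ API):

```python
import heapq


class Resequencer:
    """Release results in sequence-number order, however they arrive."""

    def __init__(self):
        self.next_seq = 0  # assumed: numbering starts at 0 with no gaps
        self.heap = []     # min-heap of (seq, result) waiting for their turn

    def push(self, seq, result):
        """Store one result; return every result that is now deliverable."""
        heapq.heappush(self.heap, (seq, result))
        ready = []
        while self.heap and self.heap[0][0] == self.next_seq:
            ready.append(heapq.heappop(self.heap)[1])
            self.next_seq += 1
        return ready


reseq = Resequencer()
ordered = []
# Results come back from the workers out of order.
for seq, result in [(1, "b"), (0, "a"), (3, "d"), (2, "c")]:
    ordered.extend(reseq.push(seq, result))
```

If ordering only matters among messages of the same key, one such
buffer per key would presumably be enough.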

> The server/client socket looks very promising. But there are 2
> difficulties for me. First they are in draft so I have to get them
> from an unstable version, small hurdle. Second by reading the code
> they do not seem to support multi-part messages, a deal breaker.

Yes, both these are true statements.

Now that we just got a fresh release of libzmq I hope we can make
progress to remove the "draft" label.  At least for SERVER/CLIENT.  In
practice, the "draft" sockets work well and their development appears to
be rather frozen/stable.

Next, the lack of multi-part is inherently tied to thread safety in the
draft sockets.  I think this cannot be avoided even in principle
(without adding a great deal of complexity inside the socket code).

However, in practice the lack of multi-part messages does not pose any
big limitation.  At least not in my experience.

One can always concatenate multiple "messages" into one message at the
application level.  Or, likewise, the application can split one
"message" into multiple independent messages.

I don't actually understand why sockets were exposed to multi-part
messages in the first place.
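
The application-level concatenation mentioned above can be sketched
with a simple length prefix per part.  The 4-byte big-endian prefix and
the pack_parts/unpack_parts names are assumptions chosen for
illustration; this is not the framing libzmq itself uses:

```python
import struct


def pack_parts(parts):
    """Join several byte strings into one blob, each with a 4-byte length prefix."""
    return b"".join(struct.pack("!I", len(p)) + p for p in parts)


def unpack_parts(blob):
    """Split a blob produced by pack_parts back into its parts."""
    parts, pos = [], 0
    while pos < len(blob):
        (size,) = struct.unpack_from("!I", blob, pos)
        pos += 4
        parts.append(blob[pos:pos + size])
        pos += size
    return parts


# What would have been three frames travels as one single-part message.
blob = pack_parts([b"key", b"", b"payload"])
parts = unpack_parts(blob)
```

The resulting blob can be sent as an ordinary single-part message and
split again on the receiving side.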

I do see why libzmq has a multi-part message schema, as it needed to add
some structure to the messages to support e.g. routing ID.  The multi-part
message schema that was chosen is, I think, reasonable, as it is simple
and its structure is defined at a "byte level", in keeping with libzmq's
low-level position.  ZeroMQ should certainly not dictate e.g. JSON or
Protocol Buffers or other "branded" schema.

But, even with a multi-part message schema, the sockets could have
required that all parts be sent/received in a single monolithic block.  In
hindsight, at least, this would have made thread-safety easier to assure.

I would be very curious to learn the history/reasons for this aspect.

-Brett.