[zeromq-dev] Limitations of patterns?
Kerby at inocode.com
Wed Aug 25 04:25:50 CEST 2010
I'm new here and still learning Zmq. I want to state first that I
love how it "just works" in most cases. The things I'm missing seem to go
against the usage patterns of Zmq, so maybe I've overlooked something; I
figured I'd ask about possible alternatives and ways to get things to work.
The first pattern I'm having minor issues with is the pub/sub stuff.
I can work around the missing item, but it's rather annoying and costs some
notable bandwidth. I have a centralized distributor which does nothing but
maintain the primary database. I send out a hell of a lot of changes from
this central item; while I would love to distribute this, it is not yet
Now, the problem is that when a new worker comes up, I need to
gather all data for a subclass of the data held by the primary distributor.
Currently, I bring up the new service and open a new req/rep socket to get
the initialization data. Of course this is an obvious issue, as changes are
going on while I'm bringing up the new service and fetching the
initialization data. It is an obvious race condition: I will potentially
have out-of-date information by the time I finalize the pub/sub connection.
The only solution I have currently is to put up the pub/sub first,
wait for a message, and then put up the req/rep connection to get my initial
data. I have to queue up all messages while I get the initial data and then
filter the queue manually to keep only the updates that came in after the
initialization data. This means that I have to add a "sequence" value to
each of the published updates, which is a very notable waste of bandwidth
given the frequency of the published messages. But it does work. Of course,
to "fix" this in zmq, I guess you'd have to do the same sequencing in the
framing. Ideas are welcome, or I just have to live with it. :)
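For concreteness, the filtering step of the workaround above might look
something like the sketch below. It is only an illustration, not ZeroMQ API:
the function name, the (seq, key, value) shape of an update, and the idea that
the init reply reports the sequence number of the last update it already
contains are all assumptions. Socket handling is left out so the merge logic
can be read on its own.

```python
def apply_snapshot_then_updates(snapshot, snapshot_seq, buffered_updates):
    """Merge init data with updates queued while the init data was fetched.

    snapshot         -- dict of key -> value from the (hypothetical) init reply
    snapshot_seq     -- sequence number of the last update already reflected
                        in the snapshot (assumed to be part of the reply)
    buffered_updates -- (seq, key, value) tuples in the order they arrived
                        on the sub socket
    """
    state = dict(snapshot)
    last_seq = snapshot_seq
    for seq, key, value in buffered_updates:
        if seq <= last_seq:
            # Already covered by the snapshot, or a duplicate: drop it.
            continue
        state[key] = value
        last_seq = seq
    return state, last_seq


# Made-up example: updates 9 and 10 predate the snapshot, 11 and 12 do not.
queued = [(9, "a", 0), (10, "a", 1), (11, "a", 2), (12, "b", 5)]
state, last_seq = apply_snapshot_then_updates({"a": 1}, 10, queued)
```

After the queue is drained, live updates can be applied with the same
"seq <= last_seq" check, so the handover from buffered to live needs no
special case.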
The second item is in the pipeline patterns. I'm not entirely sure
of the behavior behind the scenes (I believe round robin is mentioned), but
I've already seen some bad processing patterns using the pipelines. I send
down several thousand "process" messages, and often I end up with one
processor stuck doing several long processes at the end, all by its
lonesome. Nothing can solve the fact that I don't know which ones will be
long or short in this case; if 2 "REALLY" long ones end up on a single
receiver, that single receiver ends up blocking the entire process.
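To illustrate the effect, here is a toy simulation that assumes plain
round-robin assignment with no knowledge of task length. It is not ZeroMQ
code; the function name and the task durations are made up for the example.

```python
def round_robin_busy_times(durations, n_workers):
    """Assign tasks to workers in strict rotation, ignoring how busy each
    worker already is. Returns the total busy time per worker."""
    totals = [0] * n_workers
    for i, duration in enumerate(durations):
        totals[i % n_workers] += duration
    return totals


# Eight tasks, two of them long. With four workers, tasks 4 and 8 both
# land on the last worker.
durations = [1, 1, 1, 50, 1, 1, 1, 50]
totals = round_robin_busy_times(durations, 4)
makespan = max(totals)  # time until the slowest worker is done
```

In this made-up case three workers are idle after 2 time units while the
fourth works through both long tasks back to back.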
The first issue is easily solved with a sequence number and some
buffering. Unfortunately this is annoying and a pain in the butt for
something like Zmq, which hopes to be a standard. It's also unnecessary, as
the proper way to deal with this would be a method for the system to note
new connections, post the init data, and then post a "pay attention to the
rest of this" message in the normal stream of pub messages. (I suspect
this touches on the pub/sub filtering item.)
The second item is a very different problem. That one is a
bit more complicated in that it implies an ack for messages on
certain connection types. A non-even distribution requires knowledge of
completion states. As such, downstream/upstream seems to me to require a
new flag: "ZMQ_ACKREQUIRED". Before ZMQ tries to post more messages to a
downstream in this case, it will require a zmq_close to occur.
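As a rough sketch of what ack-driven distribution would buy, the toy
simulation below hands a worker its next task only once it has reported
completion of the previous one. It models the idea only; it does not use
ZeroMQ, and the function name and durations are invented for illustration.

```python
import heapq


def ack_driven_finish_times(durations, n_workers):
    """Give each worker one task at a time; the next task goes to whichever
    worker acks (finishes) first. Returns each worker's final finish time."""
    # Heap of (time the worker becomes free, worker id).
    free_at = [(0, w) for w in range(n_workers)]
    heapq.heapify(free_at)
    finish = [0] * n_workers
    for duration in durations:
        t, w = heapq.heappop(free_at)  # earliest worker to have acked
        finish[w] = t + duration
        heapq.heappush(free_at, (finish[w], w))
    return finish


# Eight tasks, two of them long, four workers.
durations = [1, 1, 1, 50, 1, 1, 1, 50]
finish = ack_driven_finish_times(durations, 4)
makespan = max(finish)
```

With these made-up durations the two long tasks end up on different workers,
because the worker holding the first long task never acks in time to be
handed the second. Under strict rotation with four workers, tasks 4 and 8
would both go to the same worker, for a total of 100 time units on that one
receiver.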
Please take this as intended; I'm a newbie to Zmq, so maybe I'm
missing things. But I am very experienced in networking and, as such, know
how to avoid silly waste. My current workarounds are wasteful and really
should not be required. Overall, being able to recv "connections" would
solve many issues.