[zeromq-dev] Limitations of patterns?

Martin Sustrik sustrik at 250bpm.com
Fri Aug 27 10:09:43 CEST 2010


On 08/26/2010 09:33 PM, Kelly Brock wrote:

>> As for the solution, last value cache cannot be an organic part of the
>> PUB/SUB messaging pattern. Here's the rationale:
>
> 	I forgot about the multicast case here.  Hmm, it seems like a
> generic wrapper for only the tcp/inproc/ipc types would be in order to
> handle this.  The only thing is that connections are relatively rare for my
> usage while the updates are very high frequency, so the sequence value
> still seems like a considerable waste of bandwidth given that it makes up a
> quarter of each update message.  Not that I'll complain much; I just need
> it working and can hack it later if it becomes an issue.
> :)

An alternative to sequence numbers would be for the publisher to post a 
tag message into the PUB/SUB feed that correlates with a tag attached 
to the snapshot.
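
Roughly, it could look like this (a sketch in Python with pyzmq; the 
ports, the two-frame wire format and the REP-based snapshot service are 
my own illustrative assumptions, not anything 0MQ prescribes):

import itertools
import zmq

ctx = zmq.Context()

pub = ctx.socket(zmq.PUB)       # live update feed
pub.bind("tcp://*:5556")

snap = ctx.socket(zmq.REP)      # snapshot requests from new subscribers
snap.bind("tcp://*:5557")

state = {}                      # last value per item id
tags = itertools.count()

def publish_update(item_id, value):
    state[item_id] = value
    pub.send_multipart([b"UPDATE", b"%d %d" % (item_id, value)])

def serve_snapshot():
    snap.recv()                                  # any request body will do
    tag = next(tags)
    pub.send_multipart([b"TAG", b"%d" % tag])    # tag injected into live feed
    snap.send_json({"tag": tag, "state": state})

A fresh subscriber would subscribe first, wait for the connection to 
settle, then request the snapshot and discard live messages until the TAG 
matching its snapshot arrives. This glosses over the usual slow-joiner 
timing, but it shows the correlation idea and needs no per-update 
sequence number.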

>>> 	The second item would be a very different problem.  That one is a
>>> bit more complicated in that it implies an ack to various messages in
>>> certain connection types.  A non-even distribution requires knowledge of
>>> completion states.  As such, downstream/upstream seems to me to require a
>>> new flag: "ZMQ_ACKREQUIRED".  Before ZMQ tries to post more messages to a
>>> downstream in this case, it will require a zmq_close to occur.
>>>
>> Yes. The problem here is that there's no way to control the number of
>> requests on the fly. The existing pushback mechanism is based on TCP
>> pushback, so it allows you to write messages until TCP buffers are full
>> and 0MQ's high watermark is reached. What you need instead is a hard
>> limit. To implement it you need ACKs sent from the receiver to the
>> sender. If you are interested in implementing this, let me know and I'll
>> help you with making sense of the existing code.
>
> 	I'm definitely going to want to look at this in the next couple of
> weeks.  Currently at work I'm using 0MQ to organize an asset dependency
> crawler in order to generate patch files for SWTOR.  The problem shows up
> because some files are leaf nodes which can't generate further
> dependencies, so those process very quickly, while others start in an
> Oracle DB, do a ton of queries, then open a bunch of binary files and parse
> through them, etc.  If several of the tough ones get stuck on one worker,
> the whole process gets backed up in a hurry.
>
> 	Also, there was a typo above; I meant that perhaps a message with an
> ack required could trigger the ack only when you call "zmq_msg_close".  You
> keep the msg around until you have completed processing, and when you close
> it, the ack is sent out.  So you would only need the new flag, and from the
> API point of view there would be no changes.  Or it could of course be made
> explicit, say "zmq_msg_ack".

You don't want to go that way unless you are IBM. Been there, seen it. 
It's a path that leads to transacted messages, later on to distributed 
transactions, then to redundant distributed message store clusters, etc.
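
That said, you can approximate the hard limit today purely in application 
code, with no new flags: have each worker explicitly ask for work, so the 
request itself acts as the ack for the previous task. A rough sketch in 
Python with pyzmq (the socket types, endpoint, framing and the process() 
handler are illustrative assumptions, not a prescribed 0MQ pattern):

import zmq

def distributor(tasks, endpoint="tcp://*:5558"):
    ctx = zmq.Context.instance()
    router = ctx.socket(zmq.ROUTER)     # ROUTER is the newer name for XREP
    router.bind(endpoint)
    for task in tasks:
        # Block until some worker announces it is idle (its implicit ack).
        ident, empty, _ready = router.recv_multipart()
        router.send_multipart([ident, empty, task])

def worker(endpoint="tcp://localhost:5558"):
    ctx = zmq.Context.instance()
    req = ctx.socket(zmq.REQ)
    req.connect(endpoint)
    while True:
        req.send(b"READY")     # acks the previous task and asks for the next
        task = req.recv()
        process(task)          # hypothetical, possibly slow, task handler

That caps the distributor at one unacknowledged task per worker, so a few 
slow jobs can't back up the whole pipeline, and it needs no changes to the 
library.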

>>> 	Please take this as intended; I'm a newbie to 0MQ so maybe I'm
>>> missing things.  But I am very experienced in networking and, as such,
>>> know how to avoid silly waste.  My current workarounds are wasteful and
>>> really should not be required.  Overall, being able to recv "connections"
>>> would solve many issues.
>>>
>> 0MQ is so easy to use and scale exactly because individual connections
>> are invisible to the user. Once you allow the user to handle individual
>> connections, you'll end up with an equivalent of TCP. If, on the other
>> hand, that's what you want, why not use TCP proper?
>
> 	Yes and no.  I like all the routing and easy handling; it just seems
> that if zmq_poll allowed you to catch new connections (only on
> tcp/inproc/ipc/etc. of course) you could write the "last value cache"
> pattern much more easily and save a bunch of bandwidth at the same time, at
> least in my case.  I.e. my message is "int id, int itemId, int valueChange,
> int flags", and the first id is unused except for the first little bit of
> time while the new connection gets initialized.  The pub/sub is perfect for
> this generally, except for the initialization and having to make sure I
> don't get a race condition, hence the id and the wasted bandwidth.

Well, the current way 0MQ works is the result of years of experimentation 
and design work. And here's my intuition after all those years:

If you allow users to deal with individual connections you:

1. You lose scalability, as you are no longer in control of the topology.

2. You lose the ability to abstract the underlying transport 
mechanism (TCP vs. multicast).

3. Allowing users to do even one operation on individual connections means 
you are asking them to write applications that take advantage of that 
feature. Thus you'll have connection-aware applications. These in turn 
will need more control over connections. Soon you'll end up reimplementing 
every single feature of TCP sockets.

> 	Overall, I'm very happy with 0MQ in conjunction with zeroconf
> (Bonjour/Avahi), since I can bring services/clients up anywhere on my
> network and debug/kill/restart/whatever with almost no problems.  I'd say
> the only real issues I'm running into are those already mentioned, plus one
> other which still seems to be a case where connect/disconnect knowledge
> would be the best way to solve the problem.  I'm looking at it from more of
> a 0MQ point of view now (I believe); if I can't figure something out I'll
> definitely post a follow-up.

Great. Thanks!
Martin



