[zeromq-dev] Java/Windows issues
Martin Sustrik
sustrik at fastmq.com
Fri Aug 21 15:50:05 CEST 2009
> 1. Consumers would publish special subscribe/unsubscribe messages on
> the bus which would allow producers not to physically send any
> messages that have no consumers. The obvious problem is
> late-joining producers - they would have to get a snapshot of
> current subscriptions either from a centralised repository or from
> each consumer. Also, consumers would have to be monitored and
> subscriptions should be deleted when a consumer dies unexpectedly.
> The whole thing is more or less doable, but pretty complex.
>
> I think there are actually 2 distinct ideas in here :)
> 1) The concept of a topic which can be used for filtering messages on
> a multicast group. Wildcards can be built on top of that which adds
> another layer of complexity.
Yes. Multicast groups are insufficient when more complex content-based
routing/filtering, such as regexp-based filtering, is required.
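To illustrate the difference, here is a minimal sketch contrasting a prefix-style topic filter (the kind that maps naturally onto a topic or group) with a regexp-based content filter. The topic names and both helper functions are hypothetical and are not part of 0MQ.

```python
import re

def prefix_match(prefixes, topic):
    """Cheap topic filter: pass if the topic starts with any subscribed prefix."""
    return any(topic.startswith(p) for p in prefixes)

def regexp_match(patterns, topic):
    """Content-based filter: pass if any compiled pattern matches the whole topic."""
    return any(p.fullmatch(topic) for p in patterns)

# Hypothetical topics of the form quotes.<exchange>.<symbol>
topics = ["quotes.nyse.IBM", "quotes.lse.IBM", "quotes.nyse.MSFT"]

# "Everything from NYSE" is a single prefix.
prefixes = ["quotes.nyse."]
# "IBM on any exchange" cannot be written as one prefix, but is a simple regexp.
patterns = [re.compile(r"quotes\.[a-z]+\.IBM")]

by_prefix = [t for t in topics if prefix_match(prefixes, t)]
by_regexp = [t for t in topics if regexp_match(patterns, t)]
```

The regexp subscription cuts across the prefix hierarchy, which is why it cannot be expressed by joining a fixed group.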
> 2) The concept of publishers being aware of individual subscribers
> (using join/leave/heartbeat messages). This is necessary for doing
> persistent messaging and has advantages when it comes to debugging (e.g.
> in some middlewares you can see who is "connected" to who from a web
> browser embedded in the API which is embedded in your app). It's also
> useful so publishers can realize that a particular listener is doing too
> many resend requests and is hurting the performance for everyone else so
> you can cut off/throttle responses to it so you don't end up w/ a
> multicast storm. I agree it's VERY complex though as it requires lots of
> coordinating. My point is IF you ever decide to tackle these problems
> you will find you have implemented most if not all of the PGM
> functionality. You may still want to support PGM for interoperability
> reasons.
Yes, it's complex. A lot of different problems have to be solved. Some of
them can be handled directly in the PGM stack, others have to be implemented
on top of it.
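As one example of a piece that would sit on top of the transport, here is a rough sketch of a publisher-side subscription table driven by join/leave/heartbeat messages, with subscriptions dropped when a consumer goes silent. The class, its method names and the timeout are assumptions made for illustration; time is passed in explicitly to keep the example deterministic.

```python
class SubscriptionRegistry:
    """Hypothetical publisher-side view of who is subscribed to what."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.topics = {}      # consumer id -> set of subscribed topics
        self.last_seen = {}   # consumer id -> time of the last message from it

    def subscribe(self, consumer, topic, now):
        self.topics.setdefault(consumer, set()).add(topic)
        self.last_seen[consumer] = now

    def unsubscribe(self, consumer, topic, now):
        if consumer in self.topics:
            self.topics[consumer].discard(topic)
            self.last_seen[consumer] = now

    def heartbeat(self, consumer, now):
        if consumer in self.topics:
            self.last_seen[consumer] = now

    def expire(self, now):
        """Delete all state for consumers that have been silent for too long."""
        dead = [c for c, t in self.last_seen.items() if now - t > self.timeout]
        for c in dead:
            del self.topics[c]
            del self.last_seen[c]
        return dead

    def has_consumers(self, topic):
        """A producer can skip sending when this returns False."""
        return any(topic in subscribed for subscribed in self.topics.values())


reg = SubscriptionRegistry(timeout=3.0)
reg.subscribe("c1", "quotes.IBM", now=0.0)
reg.subscribe("c2", "quotes.IBM", now=0.0)
reg.heartbeat("c1", now=2.5)
dead = reg.expire(now=4.0)          # c2 sent nothing since t=0 and is dropped
```

The sketch leaves out the hard parts mentioned in the thread: late-joining producers still need a snapshot of this table from somewhere.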
> Why? The ZRecoverableSock will write every message to a file
> and can replay the messages when any listener requests them. Any
> listener can request a resend. You may want to add features so
> listeners request replays over TCP or at least receive the
> replays over unicast UDP, so you don't bog down other listeners.
> You could also add options to make it aware of subscribers and
> track which messages have been received by which subscriber and
> then remove the messages that no longer need to be stored.
>
>
> The slow consumer problem. With TCP you can simply block the
> producer when consumer is not consuming fast enough and there's no
> more memory/disk space to store the messages to send. With multicast
> the same strategy would result in whole system blocking because of a
> single misbehaved/slow consumer.
>
>
> I think the assumption that you want the publisher to stop sending if
> one listener is slow is wrong. There are use cases for that but I don't
> think it's the most common one in a multicast world. Obviously you need
> to handle the boundary condition of running out of disk space/memory and
> you can provide the option of blocking or discarding oldest
> message/newest message etc. You may be able to resolve the slow listener
> issue before then though or you may have enough disk space to go the
> whole week w/ a listener down.
We've misunderstood each other. Publishing is not going to stop
because of a single slow consumer. The other way round: the consumer is
going to be disconnected.
As for storing the data on disk, our experience with market data is that
storing it while a consumer isn't available results in something like 300GB
a day, which is quite a lot.
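The policy described here can be sketched roughly as follows: the publisher never blocks, each consumer gets a bounded backlog, and a consumer whose backlog is full is disconnected rather than stored for indefinitely. The Publisher class and the max_backlog limit are hypothetical and only illustrate the idea; they are not how 0MQ implements it.

```python
from collections import deque

class Publisher:
    """Hypothetical publisher that drops slow consumers instead of blocking."""

    def __init__(self, max_backlog):
        self.max_backlog = max_backlog
        self.backlogs = {}   # consumer id -> deque of undelivered messages

    def connect(self, consumer):
        self.backlogs[consumer] = deque()

    def publish(self, message):
        """Queue the message for everyone; return consumers that were disconnected."""
        dropped = []
        for consumer, backlog in self.backlogs.items():
            if len(backlog) >= self.max_backlog:
                dropped.append(consumer)      # too far behind: disconnect it
            else:
                backlog.append(message)
        for consumer in dropped:
            del self.backlogs[consumer]
        return dropped

    def consume(self, consumer):
        return self.backlogs[consumer].popleft()


pub = Publisher(max_backlog=2)
pub.connect("fast")
pub.connect("slow")

dropped_log = []
for i in range(3):
    dropped_log.append(pub.publish(i))
    pub.consume("fast")              # "fast" keeps up, "slow" never reads
```

In this run the third publish finds the backlog of "slow" full and disconnects it, while "fast" receives every message without delay.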
> You could also have a ZReqReplySock that wraps any of the
> above
> sockets.
>
>
> Will be done in 0MQ/2.0
>
> OK, I'm curious to see how that's implemented in multicast. Does
> everyone see the responses and have to filter them?
>
>
> No point in implementing req/rep on top of multicast. It's always
> 1-to-1 dialogue.
>
>
> Strongly disagree w/ this. Using multicast to send requests to a server
> is very useful (this is my preferred method) because you don't have to
> know which machine a server is running on and you can have multiple
> servers load balancing the requests. Also useful if you failover or just
> don't want to always update your clients when a server moves. Sending
> replies on multicast is less useful and requires other requestors to
> receive and ignore responses not meant for them (although still useful
> if you want to log request/replies from an external process for
> debugging/stat collection purposes w/o impacting performance of the
> critical path). Other middleware implementations implement this feature
> by having the request message contain information for sending the
> response, e.g. you could include an ip/port which will receive unicast
> messages.
True. However, you should take the performance impact of such an
architecture into account. With multicast, each box on the network has to
process all messages and drop those that are not needed. Filtering is done
at the software level and is often expensive, say regexp evaluation. With
unicast, messages are routed only to the relevant box. The whole thing
is done ultra-efficiently in hardware. Possibly both scenarios
should be enabled and the user should choose one of them depending on
the ratio of messages to be filtered out and the performance requirements.
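A back-of-the-envelope way to look at that ratio is sketched below. It assumes unicast delivery is perfectly targeted and ignores the sender's cost of transmitting several unicast copies; the function name and the figures are made up for illustration.

```python
def per_box_load(messages, relevant):
    """Return ((received, discarded) for multicast, (received, discarded) for unicast)."""
    # Multicast: every message arrives and the software filter drops the rest.
    multicast = (messages, messages - relevant)
    # Unicast: the network delivers only what this box actually needs.
    unicast = (relevant, 0)
    return multicast, unicast

# Hypothetical figures: 1,000,000 messages on the wire, 10,000 relevant to this box.
multicast, unicast = per_box_load(messages=1_000_000, relevant=10_000)
wasted_filter_runs = multicast[1]
```

With these numbers the box evaluates its filter a million times to keep ten thousand messages. If nearly every message is relevant to every box, the discarded count approaches zero and multicast looks much better.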
Martin