[zeromq-dev] Publisher side filtering... (draft)

Martin Sustrik sustrik at 250bpm.com
Fri Dec 3 12:48:04 CET 2010


Hi Gerard,

Checked your patch. Pieces are missing, but it's a nice start...

The question now is how we can gradually incorporate the 
subscription-forwarding code into the mainline.

What about starting, say, by making the pipes between the PUB/SUB socket 
and the I/O thread bi-directional? At the moment the pipe in the opposite 
direction won't be used for anything, but it would allow us to check 
that there are no related problems (memleaks or such).

More comments are inlined:

> I've been working on this again yesterday and attached is a draft
> approach that works, but has some serious issues to be addressed.
> So this should be regarded a draft, intended to receive comments on the
> approach taken. The following issues are to be addressed:
>
>    1. I've now used the "fair-queue" object to maintain a list of pipe
>       readers in pub.cpp.

Yes. That prevents a malicious client that issues a large number of 
subscriptions from making other clients unable to subscribe.
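
To illustrate the point, here is a minimal sketch of round-robin fair 
queueing. It is not the actual fq_t code (which, among other things, 
tracks only the active pipes); fair_queue_t, push and recv are 
hypothetical names used only for this example. Each pipe is modelled as 
a plain queue of strings.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a fair queue over inbound pipes.
class fair_queue_t {
public:
    explicit fair_queue_t(std::size_t pipes) : pipes_(pipes) {}

    // Enqueue a message (e.g. a subscription request) on one pipe.
    void push(std::size_t pipe, std::string msg) {
        pipes_.at(pipe).push_back(std::move(msg));
    }

    // Fetch the next message, starting the scan at the pipe after the
    // one served last. Returns false if all pipes are empty.
    bool recv(std::string &out) {
        for (std::size_t i = 0; i < pipes_.size(); ++i) {
            std::size_t idx = (next_ + i) % pipes_.size();
            if (!pipes_[idx].empty()) {
                out = std::move(pipes_[idx].front());
                pipes_[idx].pop_front();
                next_ = (idx + 1) % pipes_.size();
                return true;
            }
        }
        return false;
    }

private:
    std::vector<std::deque<std::string> > pipes_;
    std::size_t next_ = 0;
};
```

If pipe 0 has three requests queued and pipe 1 has one, the request 
from pipe 1 is served second rather than last, so a flood on one pipe 
only delays the others by one message per round.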


> The pub socket should periodically check if
>       anything new was written by subscribers, which indicate topic
>       changes. At the moment, it checks fq_t whenever any message is
>       sent by the pub socket. The "if (has_in())" should probably become
>       a "while (fq.has_in())" call to process all subscription changes
>       first. fq_t only verifies all currently 'active' sockets, so this
>       should not be too expensive...?

First of all we should decide where exactly the filters should reside. 
Keep in mind that there are 2 possible places for each socket -- the 
socket itself and the associated session living in the I/O thread. That 
gives 4 possible places in a simple PUB/SUB setup. If a device is placed 
in the middle, there are 8 possible places (1 socket in the publisher, 2 
in the device and 1 in the subscriber, with 2 places per socket).

My thinking at the moment is that there should be a filter at the 
ultimate publisher (not in the intermediary devices) to limit the 
overall number of messages as early as possible, and another filter at 
the ultimate consumer (not in the intermediary devices) to filter out 
any stray messages that may be received because of delayed 
unsubscription.

Aside from that, there should be a "dispatcher" (as opposed to a filter) 
on the PUB side of every intermediary device that would send a message 
only to the relevant subscribers.
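
A rough sketch of the difference between the two, assuming the usual 
prefix-matching subscriptions: a filter answers yes/no for a single 
subscription list, while a dispatcher picks the set of peers a message 
should go to. subscriber_t, filter_accepts and dispatch are hypothetical 
names for this example only, not existing 0MQ code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one downstream peer and its subscriptions.
struct subscriber_t {
    std::string name;
    std::vector<std::string> prefixes;
};

// A subscription matches when it is a prefix of the message.
// An empty subscription therefore matches every message.
bool matches(const std::string &prefix, const std::string &msg) {
    return msg.compare(0, prefix.size(), prefix) == 0;
}

// Filter: accept or drop a message for one subscription list.
bool filter_accepts(const std::vector<std::string> &prefixes,
                    const std::string &msg) {
    for (std::size_t i = 0; i < prefixes.size(); ++i)
        if (matches(prefixes[i], msg))
            return true;
    return false;
}

// Dispatcher: return the names of the peers that should get the message.
std::vector<std::string> dispatch(const std::vector<subscriber_t> &subs,
                                  const std::string &msg) {
    std::vector<std::string> targets;
    for (std::size_t i = 0; i < subs.size(); ++i)
        if (filter_accepts(subs[i].prefixes, msg))
            targets.push_back(subs[i].name);
    return targets;
}
```

The linear scan is only there to keep the example short; a real 
dispatcher would presumably want a better data structure.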

The matter is complex. We should think about it in more detail, draw 
some diagrams etc.

>    2. The current implementation relies on the reader_t object from fq_t
>       to find the identity, then find the writer socket through the
>       identity and update the subscription list accordingly. There
>       should be better ways to do this.

Yes, presumably.

>    3. The pub socket now stores pipes three times, so there are three
>       lists. One is to keep track of readers, so that subscription
>       requests can be handled. This requires some correlation between
>       the reader list and a writer list, so that the actual subscription
>       list can be found and updated. I think a separate structure which
>       associates the reader and writer in a single structure together
>       with the subscription list and whatever else is needed for
>       multipart functionality is a better way of doing this. This means
>       that the class is essentially reimplemented. There's also a need
>       for a separate list to keep track of all readers that have data to
>       be read.

Yes. There's still a lot of work to do.

>    4. For subscriptions to be sent upstream, the sub should connect
>       first. Subscribes that took place prior to connecting are not
>       propagated yet. This should probably change.

Yes. The connecting side should keep a list of subscriptions and 
re-issue them after each reconnection.
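
Something along these lines, as a sketch only. resubscriber_t and its 
methods are hypothetical names, the "send upstream" step is modelled as 
a callback, and duplicate subscriptions are collapsed into one, which is 
a simplification. Unsubscription is left out.

```cpp
#include <functional>
#include <set>
#include <string>
#include <utility>

// Hypothetical helper that remembers subscriptions on the connecting
// side and replays them whenever the connection is (re)established.
class resubscriber_t {
public:
    typedef std::function<void(const std::string &)> send_fn;

    explicit resubscriber_t(send_fn send) : send_(std::move(send)) {}

    // Record the subscription; forward it right away only if connected.
    void subscribe(const std::string &prefix) {
        bool inserted = subs_.insert(prefix).second;
        if (inserted && connected_)
            send_(prefix);
    }

    // Called after each (re)connection: re-issue every subscription.
    void on_connected() {
        connected_ = true;
        for (std::set<std::string>::const_iterator it = subs_.begin();
             it != subs_.end(); ++it)
            send_(*it);
    }

    void on_disconnected() { connected_ = false; }

private:
    send_fn send_;
    std::set<std::string> subs_;
    bool connected_ = false;
};
```

This also covers the case in point 4: subscriptions made before the 
first connect are simply stored and sent once on_connected() fires.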

>    5. Should this really replace the current pub/sub implementations, or
>       should this implementation become a different socket type
>       altogether?  I can imagine that not everybody may need this and
>       there is always a small performance penalty to be paid due to list
>       updates, the subscription updating, etc. (some people may prefer
>       to do this client side to spread the load).

The performance penalty is negligible when compared to the improved 
functionality. It should be standard PUB/SUB.

> Multipart seems to be working this way, but looks a bit kludgy. Any tips
> on better ways of doing that are welcome.

Martin


