[zeromq-dev] Tries vs. Filters?

Martin Sustrik sustrik at imatix.com
Fri May 21 07:31:30 CEST 2010

Matt Weinstein wrote:
> Martin -
> Are you in userland when you do the matching?

Yes. At the moment.

> If so, a dispatch might work:
> struct zmq_filter_t {
>     int (subscribe*)(void const* prefix, unsigned size);
>     int (unsubscribe*)(void const* prefix, unsigned size);
>     int (match*)(void const* msg, unsigned size);
> };
> zmq_setsockopt(sock, ZMQ_SUBSFILTER, zmq_filter_t*, sizeof(zmq_filter_t);
> Alternatively it could be a classical:
> zmq_setsockopt(sock, ZMQ_SET_FILTER, filterfn*, sizeof(filterfn*)); // 0 
> means use default filter

Won't work if matching is done in kernelspace or even remotely. 
Something like this would be better:

zmq_setsockopt (s, ZMQ_MATCHING_ALGORITHM, &ZMQ_BLOOMFILTER, sizeof (int));

> And let me build my own filter chain (just like good old DOS device 
> chain days :-) )
> There's another approach, which I'm already up against : let me do it 
> myself :-)
> zmq_setsockopt(sock, ZMQ_SUBSCRIBE_ALL, 0, 0);  // equivalent to 
> subscribe ""

This already exist:

zmq_setsockopt (s, ZMQ_SUBSCRIBE, "", 0);

There's no performance overhead.

> In order to do the head end work I need SNDMORE, and I waste a lot of 
> buffer copies (if you need to do that internally), and I need the 
> SNDMORE bit back, or I could use a gather() form of zmq_send, e.g.:
> int zmq_send_gather(void *socket, zmq_msg_t **msg, int flags); // NULL 
> terminated vector
> int zmq_send_many(void *socket, int flags, zmq_msg_t* msg1, ...);  // 
> varargs style if you like

There are no copies made internally.

As for scatter/gather it was rejected in favour of SNDMORE/RCVMORE. The 
rationale was:

a.) Simpler API (no arrays)
b.) Complete message doesn't have to exist in the memory before sedining 
it or after reading it. It can be produced/processed part by part.

> I'd build my own header like a Bloom prehash, etc., and nail it.
> If you are in userland, you could also get some efficiencies with e.g. a 
> peek+upcall form, e.g.:
> int zmq_peek(sock, unsigned peek_size, int (my_filter*)(void const*, 
> unsigned size));
> This would be synchronous (wait style), and pass up to peek_size bytes 
> to the local filter, which could accept / reject the packet.  Then my 
> regular filter would kick in.

Sounds like separating the filtering into two steps. What would that be 
good for?

> I'm sure there are upcall latency issues, those have to be "caveat 
> emptor": if you build an inefficient function, you eat it.
> If you're not in userland, you could of course have a peek that just 
> copies only a message prefix:
> int zmq_peek(void *sock, unsigned peek_size, zmq_msg_t *msg, int flags);
> I really wanted to just say ZMQ_PEEK in zmq_recv, if you're happy with 
> OR style flags:
> int zmq_recv(void *sock, zmq_msg_t *msg, (ZMQ_PEEK | 4)) // peek 4 
> bytes, could have a byte, single, double, quadword instead, such as 

Why not send the peek section of the message as a separate message part 

> Regardless, I'd still propose some type of coarse filter for efficiency 
> -- if the namespace is properly designed there will be a huge 
> performance boost (for low hit rates). One might even argue that you 
> should just provide coarse filtering, and let userland do the final 
> filtering, and provide a prototype implementation e.g. Tries.

I think doing two (or more) levels of filtering is automatic. First 
level is done by 0MQ itself, second (and any subsequent) by application.

> I've built several systems that required tracking >>thousands of ticker 
> symbols.  No matter how you slice it, you'll never have enough 
> performance in your default algorithm for these types of special cases, 
> so I would argue that you should make the standard case work well e.g. 
> up to a few dozen carefully structured topics per connect, and let the 
> user replace the filter or process.

I would say that current algorithm works well with dozens of 
subscriptions. In such case whole trie is in the cache and thus lookup 
can be accomplished in nanoseconds.

Then your custom algorithm can kick on the application level.

> Anyway, that's my train of consciousness feedback (while on the train :-) )


Btw, I'm ccing the mailing list so that other may join the discussion.


More information about the zeromq-dev mailing list