[zeromq-dev] scalability of zeromq in terms fof queues and topics

Bhavin Turakhia bhavin.t at directi.com
Sat Apr 10 09:08:25 CEST 2010


Interesting. I like the idea. I have a few quick observations -

* If developing something like a chat server or a pub-sub server where
topics are domain names (user1 at domain.com, user2 at domain.com) this
optimization may help if the string were to be reversed (moc.niamod at 1resu)
possibly and interestingly increasing the number of single-branch nodes at
the top of the tree as opposed to only at the bottom :). Infact in general
topic strings could be created in a manner so as to utilize this
optimization

* I obviously haven't thought through this structure as much as you have -
but why the "limit" in terms of the number of characters stored at that node
(say 8). Why not store as many characters as represent no branching. The
moment another topic comes in that needs to branch out at a particular
place, then the split could be made at that time. This could increase
insertion complexity, but insertion is a one time activity. I might be
missing something obvious.

* I can share my experience here. We had done a ton of tests concerning
optimization of a similar data structure w.r.t DNS caches. In all our tests
the difference in performance was of the order of double-digits when the
memory structure could be stored in the CPU Cache vs RAM.

Bhavin
http://bhavin.directi.com


> There's still one possibility for optimisation though: If, at any
> particular place of the tree there's a single-branch node (currently
> the
> memory footprint optimisation you've noticed), the associated structure
> could possibly hold up to say 8 subsequent characters. The idea is that
> single-branch nodes tend to appear in strings. Consider following
> example. Plus signs correspond to real branchings while minus signs
> denote strings of single-branch nodes:
> 
> animals.dogs
> --------+++-
> 
> animals.donkeys
> --------+++----
> 
> animals.ducks
> --------++---
> 
> animals.cats
> --------+---
> 
> The whole point is to significantly lower memory footprint which in
> turn
> should have severe impact on performance. Specifically with large
> search
> trees it may happen that some parts of the tree are eliminted from the
> CPU cache. In such case, access to physical memory can slow down the
> algorithm by an order of magnitude. Thus keeping the data structure as
> compact as possible will make it much faster, even for sparsely used
> topics.
> 
> Thoughts?
> Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 3762 bytes
Desc: not available
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20100410/25c12adf/attachment.bin>


More information about the zeromq-dev mailing list