[zeromq-dev] To trie or not to trie :)

Martin Sustrik sustrik at 250bpm.com
Tue May 18 13:04:17 CEST 2010


Hi Bhavin,

I'm reading through the documents...

One idea I find compelling is using bitmaps and dense pointer arrays at 
a node rather than min/max values and sparse pointer arrays, i.e.

struct node_t
{
     unsigned char bitmap [32]; // All 256 possible next bytes,
                                // each represented as a single bit.
     node_t **nodes; // There are no NULL pointers in the array.
}

rather than:

struct node_t
{
     unsigned char min;
     unsigned char max;
     node_t **nodes; //  There may be NULL pointers in the array.
}

The rationale goes like this:

If there's only one subsequent node, the optimisation already in place 
uses 'nodes' pointer to point directly to that node rather then 
one-element long array of pointers to nodes.

Thus, at each non-trivial node we have at least 2 subsequent nodes. If 
these are close each to another (say 'a' and 'b') the array has only 2 
elements and no NULL pointers. However, if the nodes are more far apart, 
say 'a' and 'm', the array is rather long (13 elements) and contains a 
lot of (11) NULL pointers.

If bitmap solution is used instead we loose some memory (30 bytes) by 
using bitmap (32 bytes) instead of min/max (2 bytes), however, in case 
such as 'a' and 'm' we gain some memory (88 bytes on 64-bit CPU) by not 
storing NULL pointers.

The sweet spot seems to be at subsequent nodes being in average more 
than 5 positions apart (such as 'a' and 'f') on 64-bit CPU or 9 
positions apart (such as 'a' and 'k') on 32-bit CPU.

That seems reasonable...

Thoughts?
Martin

Bhavin Turakhia wrote:
> Hi
> 
>  
> 
> Since my discussion thread with Martin on the efficiency of in-memory 
> data structures of ZeroMQ, I have been reading up a bit about other data 
> structures from the perspective of memory utilization. Here is a 
> compendium of a few other data structures that are considerably more 
> efficient than a trie –
> 
>  
> 
> http://bhavin.directi.com/to-trie-or-not-to-trie-a-comparison-of-efficient-data-structures/
> 
>  
> 
> 0MQ may want to trie ;) some benchmarks with these
> 
> Warm Regards
> Bhavin Turakhia
> Founder & CEO
> Directi
> *********************
> Always hiring @ http://careers.directi.com <http://careers.directi.com/>
> -----------------------
> http://twitter.com/bhavintu
> Blog: http://bhavin.directi.com <http://bhavin.directi.com/>
> Wiki: http://wiki.directi.com <blocked::http://wiki.directi.com/>
> Web: http://www.directi.com <http://www.directi.com/>
> T: +91-22-30797600
> M (US): +1 (415) 366 7762
> M (IN): +91 9820097557
> F: +91-22-30797510
> *********************
> 
>  
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev




More information about the zeromq-dev mailing list