[zeromq-dev] max_app_threads = 512

Matt Weinstein matt_weinstein at yahoo.com
Wed Jun 16 14:31:29 CEST 2010


On Jun 16, 2010, at 7:37 AM, Martin Sustrik wrote:

> Matt,
>
>> First off, let me apologize for asking newbie questions. SMP has
>> really only been a serious subject for me for the last four weeks.
>> My graduate work predated the RAM model (80's), although I'm an EE so
>> I understand the pipelining tradeoffs and store ordering
>> "non guarantees".
>>
>> As near as I can tell it should be straightforward to build a 1R/1W
>> lockless pipe using barriers (certainly you can do a circular buffer
>> with a count, so this should extend to a queue, although it may
>> require getting efficiencies using an unrolled list, i.e. one lock
>> per allocation of N blocks).  I haven't had the time to dig into your
>> ypipes yet, which may be doing that.
>>
>> Am I correct here?
>
> Yes, you are.
>
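
For concreteness, the "circular buffer with barriers" idea quoted above
can be sketched roughly as follows. This is only an illustration under
stated assumptions, not a description of what ypipe actually does: the
names SpscRing and spsc_demo are made up, a 64-byte cache line is
assumed, and C++11 acquire/release atomics stand in for the explicit
barriers. It is only safe with exactly one producer thread and one
consumer thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical 1R/1W ring buffer (illustrative only, not 0MQ's ypipe).
template <typename T, std::size_t N>
class SpscRing {
    static_assert((N & (N - 1)) == 0, "N must be a power of two");
public:
    // Producer thread only. Returns false when the ring is full.
    bool push(const T& v) {
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        if (t - head_.load(std::memory_order_acquire) == N) return false;
        buf_[t & (N - 1)] = v;
        // Release: the slot write above becomes visible before the new tail.
        tail_.store(t + 1, std::memory_order_release);
        return true;
    }
    // Consumer thread only. Returns false when the ring is empty.
    bool pop(T& out) {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return false;
        out = buf_[h & (N - 1)];
        // Release: the slot read above completes before the slot is reused.
        head_.store(h + 1, std::memory_order_release);
        return true;
    }
private:
    T buf_[N];
    // Assumed 64-byte cache lines: keep the two indices on separate lines.
    alignas(64) std::atomic<std::size_t> head_{0};
    alignas(64) std::atomic<std::size_t> tail_{0};
};

// Demo: one thread pushes 0..count-1, another pops and sums them.
long long spsc_demo(int count) {
    SpscRing<int, 1024> ring;
    long long sum = 0;
    std::thread producer([&] {
        for (int i = 0; i < count; ++i)
            while (!ring.push(i)) std::this_thread::yield();
    });
    std::thread consumer([&] {
        int v = 0;
        for (int i = 0; i < count; ++i) {
            while (!ring.pop(v)) std::this_thread::yield();
            assert(v == i);  // items arrive in FIFO order
            sum += v;
        }
    });
    producer.join();
    consumer.join();
    return sum;
}
```

Calling spsc_demo(100000) from main should return 4999950000, the sum
of 0..99999. Build with thread support enabled (e.g. -pthread on GCC).
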
>> Intuitively I think it should be possible to build a zero barrier
>> 1R/1R lockless queue leveraging CPU affinity and cache alignment tricks
>> to preserve store ordering (essentially using memory as an
>> inter-processor messaging platform), but it seems pretty exotic, and
>> I've got to study the cache update guarantees for modern architectures.
>
> 1R/1R? Can you be more specific?
Typo: 1R/1W ... sorry, on a train (again :-) )

I've got to look at the cache models to see what's being enforced
between processors. The notion is to transmit state between processors
using cache-aligned writes that combine vector clocks and pointers in
the same line, essentially creating a micro-packet, and letting the
other end of the pipe handle version and consistency detection.  It's
an old approach applied to yet another network (the SMP cache).  It
depends on having an atomic cache write that's long enough to hold the
<vector clock, pointer> pair.  I'm not sure yet how to model the
protocol (my toolbox is a bit rusty), and I have to dig into the
hardware manuals to see what the guarantees are.
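
A rough sketch of the micro-packet shape, with the caveat that it does
not rely on any hardware "atomic cache line write" guarantee, which is
exactly the part still to be checked. Instead it swaps in a
seqlock-style protocol built on standard C++11 fences, and it uses a
single scalar sequence number where a real vector clock would go. The
names MicroPacket, publish and try_read are hypothetical, and a 64-byte
line is assumed.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical <version, pointer> pair sharing one (assumed 64-byte) line.
struct alignas(64) MicroPacket {
    std::atomic<std::uint64_t> seq{0};  // even = stable, odd = write in progress
    std::atomic<void*> ptr{nullptr};
};
static_assert(sizeof(MicroPacket) == 64, "packet should fill one assumed line");

// Single writer only: bump seq to odd, write the pointer, bump seq to even.
void publish(MicroPacket& p, void* v) {
    const std::uint64_t s = p.seq.load(std::memory_order_relaxed);
    assert((s & 1) == 0);  // no other write may be in progress
    p.seq.store(s + 1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release);
    p.ptr.store(v, std::memory_order_relaxed);
    p.seq.store(s + 2, std::memory_order_release);
}

// Reader: the receiving end does the version and consistency detection.
// Returns false if a write was in progress or raced with the read;
// the caller is expected to retry.
bool try_read(const MicroPacket& p, std::uint64_t& version, void*& out) {
    const std::uint64_t s1 = p.seq.load(std::memory_order_acquire);
    if (s1 & 1) return false;
    void* v = p.ptr.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    const std::uint64_t s2 = p.seq.load(std::memory_order_relaxed);
    if (s1 != s2) return false;
    version = s1 / 2;  // number of completed publishes
    out = v;
    return true;
}
```

The reader never writes to the line, so the only traffic is the
writer's update propagating to the reader's cache. Whether the fences
can be dropped on a given architecture is the open question.
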

I'm assuming that QPI and HyperTransport are where I should be
starting, since those seem to be the x86 inter-processor links these days?

>
>> I've been reading the web pubs, but probably have to buy an actual
>> book (!) to understand the basics of the RAM model (tall RAM
>> assumption, etc., etc.)  Recommendations would be appreciated.
>
> The whole area is poorly documented. I have no real recommendation for
> you :|
>
Yes, ironically the area is not well synchronized, but will become  
consistent ... eventually :-)

> Martin
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev

Best,
Matt


