[zeromq-dev] Async::Worker, C++ task offloading.
Oliver Smith
oliver at kfs.org
Sat Jul 24 09:51:45 CEST 2010
On 7/24/2010 2:38 AM, Martin Sustrik wrote:
>>> OpenMP/TBB/0MQ approaches, benchmarks etc. you can possibly write a blog
>>> about it to post on zeromq.org.
>>>
>>>
>> This reminded me of a discussion I started on TBBs forums, you might
>> like my last two posts on the thread:
>> http://software.intel.com/en-us/forums/showthread.php?t=73155&p=2&o=d&s=lr
>>
>> Ha - reviewing my original post there, I can see the seeds of
>> Async::Worker :)
>>
> The above comparison is interesting. It would be good to have it
> accessible somewhere on the website. I'll give it a thought.
>
I'll put some thought into a better write-up for you, something a little
less "wary" of getting a TBB-forum thread deleted or something :)
>>>> This is a somewhat weak example because the work being done by the
>>>> worker is so trivial, but even so on a virtual quad-core machine
>>>> building with -O0 I see a 35-40% reduction in processing time.
>>>>
>>>>
>>> Worker being trivial, the large reduction in processing time is even more
>>> impressive.
>>>
>>>
>> The great shame is that - by passing pointers - this first version would
>> /seem/ to preclude scalability across machines, but the very first thing
>> I wanted to pass was a { std::string ; std::vector ; }.
>>
> Actually, when using inproc:// transport 0MQ passes pointers between the
> threads under the hood. Yet you can trivially change it to tcp:// when
> scaling to multiple boxes. The only overhead is serialisation /
> deserialisation of your structures into the binary BLOB.
>
>
I suspected as much but my instinct these days is usually to get
something working before spending time under the hood. I'll probably do
some reworking on Async::Worker over the weekend.
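To make the serialisation / deserialisation overhead mentioned above concrete, here is a minimal sketch of flattening a { std::string ; std::vector ; } payload into a binary BLOB and back. The `Job` struct, its field names, the `std::int32_t` element type and the length-prefixed layout are all assumptions made for illustration; none of this is part of Async::Worker or of 0MQ, and it assumes both ends share the same byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical payload; the vector's element type is an assumption.
struct Job
{
    std::string name;
    std::vector<std::int32_t> values;
};

// Append a 32-bit value in host byte order (same-architecture peers assumed).
static void putU32(std::vector<unsigned char>& out, std::uint32_t v)
{
    unsigned char raw[sizeof(v)];
    std::memcpy(raw, &v, sizeof(v));
    out.insert(out.end(), raw, raw + sizeof(v));
}

// Read a 32-bit value at 'pos' and advance 'pos' past it.
static std::uint32_t getU32(const std::vector<unsigned char>& in, std::size_t& pos)
{
    assert(pos + sizeof(std::uint32_t) <= in.size());
    std::uint32_t v = 0;
    std::memcpy(&v, in.data() + pos, sizeof(v));
    pos += sizeof(v);
    return v;
}

// Layout: [name length][name bytes][element count][elements...]
std::vector<unsigned char> serialise(const Job& job)
{
    std::vector<unsigned char> blob;
    putU32(blob, static_cast<std::uint32_t>(job.name.size()));
    blob.insert(blob.end(), job.name.begin(), job.name.end());
    putU32(blob, static_cast<std::uint32_t>(job.values.size()));
    for (std::int32_t v : job.values)
        putU32(blob, static_cast<std::uint32_t>(v));
    return blob;
}

// Rebuild a Job from a blob produced by serialise().
Job deserialise(const std::vector<unsigned char>& blob)
{
    std::size_t pos = 0;
    Job job;
    const std::uint32_t nameLen = getU32(blob, pos);
    assert(pos + nameLen <= blob.size());
    job.name.assign(reinterpret_cast<const char*>(blob.data()) + pos, nameLen);
    pos += nameLen;
    const std::uint32_t count = getU32(blob, pos);
    job.values.reserve(count);
    for (std::uint32_t i = 0; i < count; ++i)
        job.values.push_back(static_cast<std::int32_t>(getU32(blob, pos)));
    return job;
}
```

With something like this in place, the bytes returned by serialise() would be what travels in the message body instead of a raw pointer, which is what would let the same worker code run over tcp:// as well as inproc://.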
>> The most obvious weak point in my current implementation is that I
>> failed to do zero-copy on the pointer itself! I need to figure out what
>> stupid thing I did wrong there because eliminating that extra allocation
>> would significantly improve throughput.
>>
> I'm a bit lost here, what extra allocation? If you are passing just the
> pointer, it's 8 bytes (on 64-bit microarchs). Messages below 30 bytes of
> length are called VSMs (very small messages) in 0MQ and are passed
> *without* any extra memory allocations.
>
Ahh! This I did not know.
In short: what I wanted to achieve was:
    void sendMe(void* ptr)
    {
        message_t msg(&ptr, sizeof(ptr)) ;
        outSocket.send(msg, 0) ;
    }
but for some reason this failed horribly. I didn't take the time to
investigate because, again, I wanted a POC before going under the hood :)
I had three overdue deadlines and needed parallelism a week ago and on
top of this MySQL's Prepared Statement system was kicking me in the behind.
In hindsight, I suspect that I forgot to do the '&' in '&ptr' :)
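The '&ptr' point can be illustrated without any 0MQ calls. The sketch below uses a plain byte array as a stand-in for the message body; `PointerPayload`, `packPointer` and `unpackPointer` are hypothetical names for illustration only. The payload is sizeof(void*) bytes (8 on a 64-bit build), which is under the VSM limit quoted above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for a message body: exactly sizeof(void*) bytes.
// (Hypothetical type for illustration; not part of the 0MQ API.)
using PointerPayload = std::array<unsigned char, sizeof(void*)>;

// Sender side: copy the pointer VALUE into the payload.
// Note the '&ptr': the source is the address of the pointer variable.
// Passing 'ptr' instead would copy the first bytes of the pointed-to
// object, which is the mistake suspected above.
PointerPayload packPointer(void* ptr)
{
    PointerPayload payload{};
    std::memcpy(payload.data(), &ptr, sizeof(ptr));
    return payload;
}

// Receiver side: check the size, then rebuild the pointer from the bytes.
void* unpackPointer(const unsigned char* data, std::size_t size)
{
    assert(size == sizeof(void*));
    void* ptr = nullptr;
    std::memcpy(&ptr, data, sizeof(ptr));
    return ptr;
}
```

This only makes sense between threads of one process (the inproc:// case), since the receiver dereferences an address that belongs to the sender's address space.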
- Oliver