[zeromq-dev] Re: IPC (again)
Erik Rigtorp
erik at rigtorp.com
Tue Jan 5 13:03:29 CET 2010
On Mon, Jan 4, 2010 at 12:48, Martin Sustrik <sustrik at 250bpm.com> wrote:
>>> 3. The above would work OK for VSMs (very small messages). Still, larger
>>> message contents are allocated via malloc (see zmq_msg_init_size
>>> implementation) and these would require allocating shmem for each
>>> message. While doable, it would make sense only for very large messages,
>>> and only those very large messages that are known in advance to be sent
>>> via shmem transport. It's kind of complex.
>>
>> That would be a neat optimization, but complex. I think as a start we
>> should implement a ringbuffer with byte elements and use it as a
>> shared memory pipe. Basically you would write() and read() from the
>> buffer just like a socket but without the overhead. If you know the
>> max message size you could optimize this and implement a ringbuffer
>> where each element is a message and let the user program work directly
>> on shared memory. That would be hard to integrate with ZeroMQ's API.
>
> What about passing just VSMs via the ringbuffer? You can increase
> MAX_VSM_SIZE when compiling 0MQ so that all the messages fit into the
> ringbuffer.
>
Is it correct that VSMs are messages that get copied within 0MQ and
other messages get passed by reference?
Then we would implement copy-on-write for VSMs and zero-copy reading
in the client. For large messages we would need a shared memory area
mapped into all processes, a lock-free allocator and a
reference-counting garbage collector. Doable, but complex, and it has
its own performance bottlenecks. To make use of this we would also
need to change or complement the API (I think).
I think we should forget about implementing zero-copy to begin with;
for small messages it's not necessarily better anyway.
>>
>> I'll try to find/write a good c++ lock-free ringbuffer template.
>
> I would start with yqueue_t and ypipe_t. We've spent a lot of time making
> them as efficient as possible. The only thing needed is to split each of
> them into read & write part. This shouldn't be that complex. Both classes
> have variables accessed exclusively by reader and variables accessed
> exclusively by the writer. Then there are shared variables manipulated by
> atomic operations that should reside in the shared memory.
They look efficient, but they assume you can allocate new memory. For
shm it's simpler to assume a fixed-length buffer. I also think the
code is not correct: on PPC you are not guaranteed that the memcpy()
is committed before you update the atomic_ptr. You need to add a
memory barrier.
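To illustrate the ordering problem (in C++11 terms, with names of my own invention): on a weakly ordered CPU like PPC, a plain store of the pointer may become visible to the reader before the payload bytes written by the memcpy. Publishing the pointer with release semantics, paired with an acquire load on the reader side, inserts the needed barrier:

```cpp
#include <atomic>
#include <cstring>

// Illustrative publish/consume pair; not 0MQ code.
struct msg_t { char data[64]; };

std::atomic<msg_t*> slot(nullptr);

void publish(msg_t *m, const char *payload) {
    std::memcpy(m->data, payload, std::strlen(payload) + 1);
    // Release store: the memcpy above is guaranteed visible to any
    // thread that observes this pointer with an acquire load.
    slot.store(m, std::memory_order_release);
}

const msg_t *consume() {
    // Acquire load: pairs with the release store in publish().
    return slot.load(std::memory_order_acquire);
}
```

On x86 the plain store happens to be strong enough, which is why the bug would only show up on weaker architectures such as PPC.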