[zeromq-dev] IPC (again)

Martin Sustrik sustrik at 250bpm.com
Mon Jan 4 09:43:28 CET 2010


Hi Erik, John,

>> I've read the two discussions on using ZeroMQ for IPC. I think ZeroMQ
>> should support IPC and in-process communication.
>>
> I think we all agree on this.
> 
>> TCP is nice to work with but it has one problem: on Linux (and others),
>> TCP over loopback doesn't bypass the TCP stack, which makes the latency
>> several times higher than using pipes or Unix domain sockets. I know
>> that on Solaris this is optimized so that a loopback TCP connection
> 
> Is that since a particular Solaris release (8, 9, 10)?
> I haven't got my Solaris internals book to hand right now ;-)
> 
>> becomes more or less a pipe. For low-latency IPC on Linux, ZeroMQ needs
>> pipes or Unix domain sockets.
>>
> Just before Xmas I exchanged an email with Martin about providing a fifo/pipe 
> interface. (I wasn't concerned about performance, but wanted a zmq socket 
> connection that could only be accessed from the same machine and not via 
> loopback.) Subsequently I think that providing AF_LOCAL (AF_UNIX) sockets 
> would be a good idea.
> 
>> For ultra low latency IPC there is only one way to go and that is to
>> use shared memory. I took a look at yqueue.hpp in zeromq2 and it's a
>> good start. We only need to add a lock free memory allocator (which
> 
> I'm glad some one else has looked at this because a while back I wondered 
> whether the yqueue.hpp could use shared memory.
> 
> 
>> can be implemented using a lock free queue) or implement a lock free
> 
> ypipe.hpp for example?
> 
>> ringbuffer that would hold a fixed number of messages and block the
>> writer when it's full. For signaling I suggest implementing two
>> different approaches: one using pthreads condition variables and one
>> using busy waiting. From my own testing I've seen that the pthreads
>> implementation would have latency similar to pipes/unix domain sockets
>> and a busy-waiting solution would achieve latencies <1µs.

Great that there's an interest in IPC out there! A few comments follow:

1. Pipes: Using pipes instead of TCP connections makes sense. It 
requires no changes to the codebase from the point where the connection 
is established onwards. Still, we should think of a mechanism for 
passing the file descriptor of the pipe from the connecting application 
to the binding application. (Maybe this way: open a TCP connection, pass 
the fd as a message, close the TCP connection, use the pipe instead?)
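
As a side note, on POSIX systems a file descriptor can only be handed to 
another process over an AF_UNIX socket using SCM_RIGHTS ancillary data 
(a plain TCP connection won't carry it), so the bootstrap connection 
would probably have to be a Unix domain socket. A rough sketch of the 
sending side, purely to illustrate the mechanism (the function name is 
made up):

//  Sketch only: hand a pipe descriptor to the binding application over
//  an already connected AF_UNIX socket. Not ZeroMQ code, just an
//  illustration of SCM_RIGHTS descriptor passing.
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

static ssize_t send_fd (int unix_sock_, int fd_to_pass_)
{
    char dummy = 0;
    struct iovec iov;
    iov.iov_base = &dummy;              //  at least one byte must be sent
    iov.iov_len = 1;

    char ctrl [CMSG_SPACE (sizeof (int))];
    memset (ctrl, 0, sizeof ctrl);

    struct msghdr msg;
    memset (&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl;
    msg.msg_controllen = sizeof ctrl;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;       //  this is what carries the descriptor
    cmsg->cmsg_len = CMSG_LEN (sizeof (int));
    memcpy (CMSG_DATA (cmsg), &fd_to_pass_, sizeof (int));

    return sendmsg (unix_sock_, &msg, 0);   //  returns -1 on error
}

The binding application would do the mirror-image recvmsg () and pull 
the descriptor out of the control message.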

2. Yes, yqueue_t could use shared memory. It uses malloc for each N 
elements (currently 256) in the queue, and the size of the allocated 
block is constant. As for multithreading, there are two threads 
accessing the yqueue: one writing to the queue (thus allocating chunks 
when needed), the other reading from the queue (thus deallocating 
chunks).
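
A minimal sketch of how those fixed-size chunks could come out of a 
shared memory segment instead of malloc (the segment name and chunk 
geometry are invented; recycling freed chunks back to the writer would 
still need the lock-free free list discussed above):

//  Sketch only: carve a shared memory segment into fixed-size chunks as
//  a would-be replacement for the per-chunk malloc in yqueue_t. The
//  segment name and sizes are illustrative, not real ZeroMQ code.
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

enum { chunk_size = 4096, chunk_count = 1024 };

static void *chunks = NULL;             //  base of the mapped region

static int shm_chunks_init (const char *name_)   //  e.g. "/zmq-ipc-queue"
{
    int fd = shm_open (name_, O_CREAT | O_RDWR, 0600);
    if (fd == -1)
        return -1;
    if (ftruncate (fd, (off_t) chunk_size * chunk_count) == -1) {
        close (fd);
        return -1;
    }
    chunks = mmap (NULL, (size_t) chunk_size * chunk_count,
        PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close (fd);                         //  the mapping survives the close
    return chunks == MAP_FAILED ? -1 : 0;
}

//  Chunks are addressed by index so that both processes can refer to
//  the same chunk regardless of where the region happens to be mapped.
static void *shm_chunk (size_t n_)
{
    return (char*) chunks + n_ * chunk_size;
}

Since exactly one thread allocates and one thread frees, a 
single-producer/single-consumer ring of free chunk indices (the same 
idea as ypipe) should be enough to recycle chunks without locks. Note 
that shm_open may require linking with -lrt on older glibc.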

3. The above would work OK for VSMs (very small messages). However, 
larger message contents are allocated via malloc (see the 
zmq_msg_init_size implementation) and these would require allocating 
shmem for each message. While doable, it would make sense only for very 
large messages, and only for those very large messages that are known in 
advance to be sent via the shmem transport. It's kind of complex.
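
To make the per-message cost concrete, allocating the body of a single 
large message in its own shared memory segment would look roughly like 
this (the naming scheme is invented; the receiver would shm_open the 
same name and mmap it):

//  Sketch only: allocate the body of one large message in its own shm
//  segment so that the receiving process can map it instead of copying
//  it. One shm_open/ftruncate/mmap per message, hence worthwhile only
//  for very large messages. Names are invented.
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static void *shm_msg_alloc (size_t size_, char *name_, size_t name_len_)
{
    //  Invent a unique segment name the receiver can be told about
    //  (e.g. carried in the message header instead of the content).
    static unsigned int seq = 0;
    snprintf (name_, name_len_, "/zmq-msg-%d-%u", (int) getpid (), seq++);

    int fd = shm_open (name_, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd == -1)
        return NULL;
    if (ftruncate (fd, (off_t) size_) == -1) {
        close (fd);
        shm_unlink (name_);
        return NULL;
    }
    void *data = mmap (NULL, size_, PROT_READ | PROT_WRITE,
        MAP_SHARED, fd, 0);
    close (fd);
    return data == MAP_FAILED ? NULL : data;
}

That's one shm_open/ftruncate/mmap round trip per message, plus a 
shm_unlink once both sides are done with it, which is why it would pay 
off only for very large messages.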

4. Signaling: Note that the receiver of the signals polls for incoming 
signals using file descriptors, so a condition variable won't do. 
Creating a fake file descriptor (always signaled) to implement busy-loop 
style polling is viable; however, using 100% CPU isn't exactly green. On 
Linux, eventfd can be used to implement signaling in an efficient 
manner. Not sure about other OSes.
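
On Linux, eventfd-based signaling could look roughly like this (a sketch 
only, not the actual signaler code; the function names are made up). The 
descriptor it returns can be handed straight to the existing poll-based 
code, and for cross-process use it could be passed with SCM_RIGHTS as in 
the pipe case above:

//  Sketch only: eventfd gives us a pollable descriptor for signaling
//  without burning CPU. Linux-specific; function names are made up.
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

static int signaler_init (void)
{
    return eventfd (0, 0);              //  -1 on error
}

static void signaler_send (int efd_)
{
    uint64_t inc = 1;
    write (efd_, &inc, sizeof inc);     //  wakes up the poller
}

static int signaler_wait (int efd_, int timeout_ms_)
{
    struct pollfd pfd = { efd_, POLLIN, 0 };
    int rc = poll (&pfd, 1, timeout_ms_);
    if (rc <= 0)
        return rc;                      //  0 = timeout, -1 = error
    uint64_t val;
    read (efd_, &val, sizeof val);      //  drain the counter
    return 1;
}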

To summarize, I would start by implementing the pipe transport and move 
to shmem once that part is done. Anyone interested in the task?

Martin


