[zeromq-dev] Non-contiguous message thoughts

Brian Granger ellisonbg at gmail.com
Thu Mar 4 21:43:36 CET 2010


Mike,

I too am using 0MQ in an HPC context and am worrying about sending
large objects....

> The code in question takes a few separate data structures (matrices and
> associated metadata) and sends it to another process over a socket.  The
> most straightforward way for me to accomplish this with 0MQ is to create a
> message (I know the data structure's size ahead of time), pack the
> structures into contiguous memory managed by the message, and send with the
> 0MQ API.  The issue that I'm running into is that the matrices are rather
> large, and I don't want to incur the penalty of copying them into contiguous
> memory as required by the 0MQ API.
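
For concreteness, the "pack everything into one contiguous message" approach Mike describes might look roughly like the sketch below. It is plain Python and does not touch 0MQ. The names `pack_regions` and `unpack_regions` and the length-prefixed layout are assumptions made up for illustration, not anything 0MQ defines.

```python
import struct
from array import array


def pack_regions(regions):
    """Copy several buffers into one contiguous message.

    Layout (an assumption for this sketch, not a 0MQ format):
    a 4-byte region count, one 8-byte length per region, then the raw data.
    """
    views = [memoryview(r).cast("B") for r in regions]
    header = struct.pack("<I", len(views)) + b"".join(
        struct.pack("<Q", v.nbytes) for v in views
    )
    total = len(header) + sum(v.nbytes for v in views)
    message = bytearray(total)  # total size is known ahead of time
    message[: len(header)] = header
    offset = len(header)
    for v in views:
        # This is the copy the original question is worried about.
        message[offset : offset + v.nbytes] = v
        offset += v.nbytes
    return message


def unpack_regions(message):
    """Return zero-copy views onto each region stored in a packed message."""
    view = memoryview(message)
    (count,) = struct.unpack_from("<I", view, 0)
    lengths = struct.unpack_from("<%dQ" % count, view, 4)
    offset = 4 + 8 * count
    parts = []
    for n in lengths:
        parts.append(view[offset : offset + n])
        offset += n
    return parts


metadata = b"rows=1,cols=3"
matrix = array("d", [1.0, 2.0, 3.0])
packed = pack_regions([metadata, matrix])
parts = unpack_regions(packed)
```

The receiving side can slice the regions back out without another copy, so the cost being debated here is the single copy on the sending side.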

In my experience so far, it is not clear that avoiding copying is
something that you want to do, even for large messages.  Here is my
logic:

* If you have large messages, there are two issues: the time cost of
  memcpy and the memory footprint.
* The latency improvement that you get by not memcpying is less than
  you might imagine, and it doesn't kick in until you get quite large
  messages.
* The memory footprint can be *worse* if you don't copy.  This may
  seem non-intuitive, but here is the argument.  If you don't copy the
  original object, but simply point 0MQ to it, the original object
  *has* to remain alive until 0MQ sends the message.  Options:

  1) Pass a deallocator function to 0MQ.  This is a nice idea, but if
     you are using multiple threads in 0MQ (most people do), your
     deallocator will be called in the IO thread.  Thus, you will have
     to introduce a lock or something similar to make the deallocator
     threadsafe.  BUT, the time it takes to acquire that lock destroys
     any latency benefit you got by not memcpying.

  2) You don't pass a deallocator, but instead just make sure to hold
     the original object until 0MQ sends the message.  But 0MQ doesn't
     have any way of telling you that the message has been sent.  Thus,
     you end up holding onto the original message longer than you
     wanted, increasing the memory footprint.
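
To make option 1 concrete, here is a minimal simulation of the pattern. It does not use 0MQ: an ordinary Python thread stands in for the IO thread, and `OutstandingBuffers` and `io_thread` are hypothetical names invented for this sketch. The point is only to show where the lock ends up, namely on both the application side and the deallocator side.

```python
import queue
import threading


class OutstandingBuffers:
    """Tracks buffers handed off without copying (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._held = {}

    def hold(self, buf):
        # Called on the application thread before the buffer is handed off.
        with self._lock:
            self._held[id(buf)] = buf
        return buf

    def release(self, buf):
        # Plays the role of the deallocator: it runs on the IO thread,
        # so it must take the same lock as the application thread.
        with self._lock:
            del self._held[id(buf)]

    def count(self):
        with self._lock:
            return len(self._held)


def io_thread(outbox, free_fn):
    """Stand-in for an IO thread: 'sends' each buffer, then frees it."""
    while True:
        buf = outbox.get()
        if buf is None:  # sentinel: shut down
            break
        # ... the bytes would be written to the wire here ...
        free_fn(buf)


tracker = OutstandingBuffers()
outbox = queue.Queue()
worker = threading.Thread(target=io_thread, args=(outbox, tracker.release))
worker.start()

for _ in range(3):
    outbox.put(tracker.hold(bytearray(1024)))
outbox.put(None)
worker.join()
remaining = tracker.count()
```

Every send now pays for two lock acquisitions (one in `hold`, one in `release`), which is the overhead weighed against the memcpy in the argument above.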

Summary:

* You don't want to mess with locks that manage data across the IO/app
  thread boundary.
* memcpy is not that slow, and by using it, you enable 0MQ to
  deallocate the msg ASAP.
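
If you want to check the "memcpy is not that slow" claim on your own machine, a rough timing sketch such as the one below may help. It times a plain buffer copy in Python at a few sizes. The sizes and the `time_copy` helper are arbitrary choices for illustration, and the numbers will vary by hardware, so treat them as a ballpark rather than a benchmark of 0MQ itself.

```python
import time


def time_copy(nbytes, repeats=5):
    """Return the best-of-N wall time, in seconds, for copying nbytes once."""
    src = bytearray(nbytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one full copy of the buffer
        best = min(best, time.perf_counter() - start)
    assert len(dst) == nbytes
    return best


sizes = [1 << 10, 1 << 16, 1 << 20, 1 << 24]  # 1 KiB up to 16 MiB
timings = {n: time_copy(n) for n in sizes}
for n, t in timings.items():
    print(f"{n:>10d} bytes  {t * 1e6:10.1f} us")
```

Comparing these figures with the round-trip latency of your transport gives a feel for the message size at which the copy starts to matter.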

With all that said, I would *love* to figure out a way of doing fast
non-copying sends, so hopefully others can help figure this out.

Cheers,

Brian


> 0MQ seems to have a mechanism to avoid excessive copying by providing the
> memory for the underlying message.  However, this functionality requires the
> input data to be already stored in a single chunk of contiguous memory.
>
> Has anyone considered implementing a message in 0MQ that would allow a
> developer to send several regions of memory as a message's content?  One
> idea to accomplish this would be to create a separate message constructor
> that took a list of memory regions (base + length) representing the
> message's content.  Another idea to accomplish this is to provide a
> streaming interface that would allow a developer to append to a message
> without having to copy data.
>
> What are your thoughts on this idea?  If this is something that would be
> generally useful to others, I'd be happy to contribute to an investigation
> or development effort.
>
> Thanks,
> Mike
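
The "list of memory regions (base + length)" idea resembles the gather writes that ordinary POSIX sockets already provide. As a rough illustration only, the sketch below uses Python's `socket.sendmsg` on a local socket pair, not 0MQ. The helpers `send_regions` and `recv_exact` are hypothetical names for this example, and it assumes a Unix-like platform where `sendmsg` is available.

```python
import socket
from array import array


def send_regions(sock, regions):
    """Send several separate buffers back to back without joining them first."""
    views = [memoryview(r).cast("B") for r in regions]
    while views:
        # Gather write: the kernel reads each region in turn.
        sent = sock.sendmsg(views)
        # Drop fully sent regions, then trim a partially sent one.
        while views and sent >= views[0].nbytes:
            sent -= views[0].nbytes
            views.pop(0)
        if views and sent:
            views[0] = views[0][sent:]


def recv_exact(sock, nbytes):
    """Read exactly nbytes from a stream socket."""
    chunks = []
    while nbytes:
        chunk = sock.recv(nbytes)
        if not chunk:
            raise ConnectionError("peer closed early")
        chunks.append(chunk)
        nbytes -= len(chunk)
    return b"".join(chunks)


a, b = socket.socketpair()
metadata = b"rows=2,cols=2;"
matrix = array("d", [1.0, 2.0, 3.0, 4.0])
send_regions(a, [metadata, matrix])
received = recv_exact(b, len(metadata) + matrix.itemsize * len(matrix))
a.close()
b.close()
```

The sender never builds a combined buffer in user space, but the regions must still stay alive and unmodified until the call returns, which is the same lifetime question raised earlier in this reply.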



-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com


