[zeromq-dev] (almost) zero-copy message receive

Thomas Rodgers rodgert at twrodgers.com
Tue Jun 2 18:28:51 CEST 2015


It is really hard to grok 3.9.2 of the standard regarding type punning
rules, but I *think* you are safe if you make the backing array of either
char or unsigned char. And if 5.2.10 is to be believed, reinterpret_cast
*should* work, but it does seem to run afoul of the strict-aliasing
warning, soooo....you have to do a double static_cast through void*, e.g. -

void* pv = static_cast<void*>(&u.lmsg.content->ref_stg[0]);
return *static_cast<zmq::atomic_counter_t*>(pv);

Boost (serialization, iterator to pointer conversion, flat set/map, etc)
also uses this technique.
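
A minimal sketch of the whole pattern (byte storage, placement new, accessor with the double static_cast) might look like the following. `counter_t` and `holder_t` are hypothetical stand-ins, not the libzmq classes, and `alignas` is C++11, used here only to keep the example short:

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for zmq::atomic_counter_t: non-POD because of its
// constructor and private member. Not the real libzmq class.
class counter_t {
public:
    explicit counter_t(unsigned value) : value_(value) {}
    void add(unsigned n) { value_ += n; }
    unsigned get() const { return value_; }
private:
    unsigned value_;
};

// Hypothetical holder that keeps the counter in a raw byte array.
struct holder_t {
    // unsigned char backing array, aligned for counter_t (alignas is C++11).
    alignas(counter_t) unsigned char ref_stg[sizeof(counter_t)];

    // Construct the counter in place; no heap allocation involved.
    void init(unsigned value) {
        new (static_cast<void*>(ref_stg)) counter_t(value);
    }

    // Explicit destructor call, mirroring the placement new above.
    void destroy() { refcnt().~counter_t(); }

    // Two static_casts through void* instead of one reinterpret_cast.
    counter_t& refcnt() {
        void* pv = static_cast<void*>(&ref_stg[0]);
        return *static_cast<counter_t*>(pv);
    }
};
```

The accessor is only valid between init() and destroy(); callers never see the cast.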



On Tue, Jun 2, 2015 at 11:10 AM, Thomas Rodgers <rodgert at twrodgers.com>
wrote:

> with reinterpret_cast?
>
> On Tue, Jun 2, 2015 at 10:55 AM, Auer, Jens <jens.auer at cgi.com> wrote:
>
>>  I already tried this, but then the compiler complained about a violation
>> of the strict aliasing rules and the code doesn't compile anymore. The problem
>> is that I access something of type uint8_t[] with a pointer of type
>> atomic_counter_t.
>>
>> Cheers,
>>   Jens
>>
>>      --
>>  *Jens Auer *| CGI | Software-Engineer
>>  CGI (Germany) GmbH & Co. KG
>> Rheinstraße 95 | 64295 Darmstadt | Germany
>> T: +49 6151 36860 154
>> *jens.auer at cgi.com* <jens.auer at cgi.com>
>> Our mandatory disclosures pursuant to § 35a GmbHG / §§ 161, 125a HGB can be found at
>> *de.cgi.com/pflichtangaben* <http://de.cgi.com/pflichtangaben>.
>>
>> CONFIDENTIALITY NOTICE: Proprietary/Confidential information belonging to
>> CGI Group Inc. and its affiliates may be contained in this message. If you
>> are not a recipient indicated or intended in this message (or responsible
>> for delivery of this message to such person), or you think for any reason
>> that this message may have been addressed to you in error, you may not use
>> or copy or deliver this message to anyone else. In such case, you should
>> destroy this message and are asked to notify the sender by reply e-mail.
>>       ------------------------------
>> *From:* zeromq-dev-bounces at lists.zeromq.org [
>> zeromq-dev-bounces at lists.zeromq.org] on behalf of Thomas Rodgers [
>> rodgert at twrodgers.com]
>> *Sent:* Tuesday, 2 June 2015 17:34
>>
>> *To:* ZeroMQ development list
>> *Subject:* Re: [zeromq-dev] (almost) zero-copy message receive
>>
>>   One other point, you would need to guarantee appropriate alignment of
>> the backing array (sadly, C++11 gave us std::aligned_storage but 03 not so
>> much).
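
On the alignment point: one C++03-compatible way to get suitably aligned raw storage is to overlay the byte array with a union of strictly aligned fundamental types. The sketch below assumes a hypothetical `counter_t` stand-in and a hypothetical `aligned_counter_storage` name; it covers ordinary scalar alignment only, not over-aligned types.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

// Hypothetical stand-in for the type that will live in the raw storage.
struct counter_t { unsigned value; };

// A union is aligned at least as strictly as its most-aligned member, so
// 'bytes' starts at an address suitable for long, double and void*.
union aligned_counter_storage {
    unsigned char bytes[sizeof(counter_t)];
    long   align_long;
    double align_double;
    void*  align_ptr;
};

// Returns true if the address p is a multiple of 'alignment'.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<uintptr_t>(p) % alignment == 0;
}
```

With C++11 available, `alignas` or `std::aligned_storage` expresses the same intent directly.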
>>
>> On Tue, Jun 2, 2015 at 10:11 AM, Thomas Rodgers <rodgert at twrodgers.com>
>> wrote:
>>
>>> I am unaware of any list, it seems (to me at least) it's whatever people
>>> have contributed support for.
>>>
>>>  As for a concrete suggestion. I am not sure this is any less ugly
>>> but...
>>>
>>>  libzmq goes out of its way generally to avoid calling new() in many
>>> cases (non-throwing failure on OOM presumably), preferring malloc() and
>>> placement new instead, content_t was no exception. It did embed an explicit
>>> atomic_counter_t and would placement new it, and explicitly call the
>>> destructor when the lmsg was destroyed.
>>>
>>>  You could, instead declare the storage for atomic_counter_t as an
>>> array, uint8_t ctr_storage[sizeof(atomic_counter_t)] and placement
>>> new/explicit dtor as before, and then provide an accessor to return an
>>> atomic_counter_t& by returning
>>> *reinterpret_cast<atomic_counter_t*>(&ctr_storage[0]).
>>>
>>>  It doesn't change the size of the lmsg type, but it does remove one
>>> level of indirection, and you have hidden the ugly bits behind an accessor
>>> method.
>>>
>>> On Tue, Jun 2, 2015 at 9:40 AM, Auer, Jens <jens.auer at cgi.com> wrote:
>>>
>>>>  Hi Thomas,
>>>>
>>>>
>>>>
>>>> I did not want to say that you intended to start a flame war and you
>>>> certainly did not. I know that things like these easily turn into flame
>>>> wars, so I wanted to prevent that. All I wanted to say was that I know that
>>>> switching to C++11 may not be applicable so I want to discuss the patch
>>>> first and search for an alternative to C++11. Do you have a suggestion for
>>>> the issue with the non-POD atomic_counter_t? My experience is quite
>>>> limited because I normally don’t write code which has to support older
>>>> compilers.
>>>>
>>>>
>>>>
>>>> Just out of curiosity, do you know a definite list of compilers zeroMQ
>>>> intends to support? I only found the document
>>>> http://zeromq.org/docs:builds, but this is quite outdated.
>>>>
>>>>
>>>>
>>>> Best wishes,
>>>>
>>>>   Jens
>>>>
>>>>
>>>>
>>>>
>>>> *From:* zeromq-dev-bounces at lists.zeromq.org [mailto:
>>>> zeromq-dev-bounces at lists.zeromq.org] *On Behalf Of *Thomas Rodgers
>>>> *Sent:* 02 June 2015 16:07
>>>> *To:* ZeroMQ development list
>>>> *Subject:* Re: [zeromq-dev] (almost) zero-copy message receive
>>>>
>>>>
>>>>
>>>> A better chart of compiler conformance is probably here -
>>>>
>>>>
>>>>
>>>> http://en.cppreference.com/w/cpp/compiler_support
>>>>
>>>>
>>>>
>>>> I don't know that there's been a "flamewar" around this topic, but
>>>> switching to -std=c++11 has been discussed recently in the context of
>>>> implementing thread safe sockets. IIRC the consensus at the time was to
>>>> stick with 98/03 conforming code.
>>>>
>>>>
>>>>
>>>> In my experience, supporting C++11 opens up a bit of a minefield
>>>> of portability "gotchas". For https://github.com/zeromq/azmq I can
>>>> generally fall back on Boost to paper over these issues, but that's not a
>>>> realistic option with libzmq. So, while I get that *this* change wouldn't
>>>> necessarily run afoul of the current state of C++ conformance for MSVC, I
>>>> am personally a bit leery of the general prospects of switching to C++11 in
>>>> libzmq because the next change that assumes C++11 conformance might not be
>>>> supported by the range of compilers used to build libzmq.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Jun 2, 2015 at 8:44 AM, Auer, Jens <jens.auer at cgi.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I don't want to start a flame war on this because I know that there can
>>>> be very good reasons to not rely on C++11. That's why I did not just send
>>>> the patch, but wanted to discuss it first and see if there are any other
>>>> ways to do this.
>>>> The main issue is that I want to put an atomic_counter_t into a union
>>>> in msg_t, which C++98/C++03 does not allow because it has constructors and
>>>> non-public data members. So an obvious way to solve this would be to make
>>>> atomic_counter_t a plain C struct with free functions to create and modify
>>>> it. I was hoping for other suggestions.
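
The plain-struct alternative mentioned here could look roughly like this. All names (`pod_counter_t`, the `pod_counter_*` functions, `msg_u`) are hypothetical, and the sketch deliberately leaves out the atomic operations a real reference counter would need:

```cpp
#include <cassert>

// Hypothetical POD counter: no constructors and only public members, so
// C++98/03 accepts it as a union member. NOT atomic; illustration only.
struct pod_counter_t { unsigned value; };

// Free functions take the place of member functions.
inline void pod_counter_set(pod_counter_t* c, unsigned v) { c->value = v; }

// Adds n and returns the value held before the addition.
inline unsigned pod_counter_add(pod_counter_t* c, unsigned n) {
    unsigned old = c->value;
    c->value += n;
    return old;
}

// Subtracts n and returns false once the counter has reached zero.
inline bool pod_counter_sub(pod_counter_t* c, unsigned n) {
    c->value -= n;
    return c->value != 0;
}

// Hypothetical message union with the counter embedded directly.
union msg_u {
    struct {
        unsigned char type;
        pod_counter_t refcnt;
    } lmsg;
    unsigned char raw[64];
};
```

The cost is that initialization and invariants are no longer enforced by the type, which is presumably why the class-based version is preferred.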
>>>>
>>>> For what it's worth, the change does not need a fully compliant C++11
>>>> compiler, just standard-layout and trivial types, which MSVC has
>>>> implemented since 2012.
>>>>
>>>> Cheers,
>>>>   Jens
>>>>
>>>>
>>>>
>>>>         ------------------------------
>>>>
>>>> *From:* zeromq-dev-bounces at lists.zeromq.org [
>>>> zeromq-dev-bounces at lists.zeromq.org] on behalf of Thomas Rodgers [
>>>> rodgert at twrodgers.com]
>>>> *Sent:* Tuesday, 2 June 2015 15:37
>>>>
>>>>
>>>> *To:* ZeroMQ development list
>>>> *Subject:* Re: [zeromq-dev] (almost) zero-copy message receive
>>>>
>>>>
>>>>
>>>> Personally, I think that after 4 years of C++11, this should not be an
>>>> issue, but there may be platforms with old compilers which you want to
>>>> support.
>>>>
>>>>
>>>>
>>>> 4 years of C++11 *should* be enough, but wide-spread use of fully
>>>> conforming compilers is still an issue, for instance -
>>>>
>>>>
>>>>
>>>> https://msdn.microsoft.com/en-us/library/hh567368.aspx
>>>>
>>>>
>>>>
>>>> On Tue, Jun 2, 2015 at 8:05 AM, Auer, Jens <jens.auer at cgi.com> wrote:
>>>>
>>>> Hi Pieter,
>>>>
>>>> the reason I wanted to ask first is because I had to switch on C++11 to
>>>> make it work without changing atomic_counter_t. The reason is that I
>>>> eliminated msg_t::content_t completely to save a malloc call by adding the
>>>> members in content_t to the msg_t class directly since there is now space
>>>> enough. However, atomic_counter_t is not a POD and cannot be put into the
>>>> union. For my proof-of-concept, switching on C++11 is fine, but I am not
>>>> sure if that is ok for the main branch. Personally, I think that after 4
>>>> years of C++11, this should not be an issue, but there may be platforms
>>>> with old compilers which you want to support.
>>>>
>>>> The only alternative I came up with would be to make atomic_counter_t a
>>>> classical C struct with free functions instead of a class. I don't like
>>>> this very much.
>>>>
>>>> Best wishes,
>>>>   Jens
>>>>
>>>>
>>>> ________________________________________
>>>> From: zeromq-dev-bounces at lists.zeromq.org [
>>>> zeromq-dev-bounces at lists.zeromq.org] on behalf of Pieter
>>>> Hintjens [ph at imatix.com]
>>>> Sent: Tuesday, 2 June 2015 10:18
>>>> To: ZeroMQ development list
>>>> Subject: Re: [zeromq-dev] (almost) zero-copy message receive
>>>>
>>>> Jens,
>>>>
>>>> Sounds great. Feel free to send such patches to libzmq master; please
>>>> make sure they are as atomic as possible, each with a clear problem
>>>> statement, each testable individually.
>>>>
>>>> -Pieter
>>>>
>>>> On Tue, Jun 2, 2015 at 10:13 AM, Arnaud Loonstra <arnaud at sphaero.org>
>>>> wrote:
>>>> > Although I'm not very familiar with zmq's internals this looks
>>>> > promising.
>>>> > Did you test if your implementation remains correct? ie. it doesn't
>>>> > introduce deadlocks or other race conditions?
>>>> >
>>>> > Rg,
>>>> >
>>>> > Arnaud
>>>> >
>>>> > On 2015-05-31 19:29, Jens Auer wrote:
>>>> >> Hi,
>>>> >>
>>>> >> I did some performance analysis of a program which receives data on a
>>>> >> (SUB or PULL) socket, filters it for some criteria, extracts a value
>>>> >> from the message and uses this as a subscription to forward the data
>>>> >> to a PUB socket. As expected, most time is spent in memory allocations
>>>> >> and memcpy operations, so I decided to check if there is an
>>>> >> opportunity to minimize these operations in the critical path. From
>>>> >> my analysis, the path is as follows:
>>>> >> 1. stream_engine receives data from a socket into a static buffer of
>>>> >>    8192 bytes
>>>> >> 2. decoder/v2_decoder implement a state machine which reads the flag
>>>> >>    and message size, creates a new message and copies the data into
>>>> >>    the message data field
>>>> >> 3. When sending, stream_engine copies the flags field, message and
>>>> >>    message data into a static buffer and sends this buffer completely
>>>> >>    to the socket
>>>> >>
>>>> >> Memory allocations are done in v2_decoder when a new message is
>>>> >> created, and deallocations are done when sending the message. Memcpy
>>>> >> operations are done in decoder to copy
>>>> >> - the flags byte into a temporary buffer
>>>> >> - the message size into a temporary buffer
>>>> >> - the message data into the dynamically allocated storage
>>>> >> Since the allocations and memcpy are the dominating operations, I
>>>> >> implemented a scheme where these operations are minimized. The main
>>>> >> idea is to allocate the receive buffer of 8192 bytes dynamically and
>>>> >> use this as the data storage for zero-copy messages created with
>>>> >> msg_t::init_data. This replaces n = 8192 / (m_size + 10) memory
>>>> >> allocations with one allocation, and it gets rid of the same number
>>>> >> of memcpy operations for the message data. I implemented this in a
>>>> >> fork (https://github.com/jens-auer/libzmq/tree/zero_copy_receive).
>>>> >> For testing, I ran the throughput test (message size 100, 100000
>>>> >> messages) locally and profiled for memory allocations and memcpy. The
>>>> >> results are promising:
>>>> >> - memory allocations reduced from 100,260 to 2,573
>>>> >> - memcpy operations reduced from 301,227 to 202,449. This is expected
>>>> >>   because for every message, three memcpys are done, and the patch
>>>> >>   removes the data memcpy only.
>>>> >> - throughput increased significantly by about 30-40% (I only did a
>>>> >>   couple of runs to test it, no thorough benchmarking)
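
As a worked instance of the n = 8192 / (m_size + 10) formula quoted above, here is a tiny helper. The function name is hypothetical, and the 10 bytes of per-message overhead are taken from the formula as given rather than derived here:

```cpp
#include <cassert>
#include <cstddef>

// Number of messages that fit into one receive buffer according to
// n = buffer_size / (m_size + 10); integer division rounds down.
inline std::size_t messages_per_buffer(std::size_t buffer_size,
                                       std::size_t m_size) {
    return buffer_size / (m_size + 10);
}
```

For the 100-byte messages of the throughput test, messages_per_buffer(8192, 100) is 74, so one buffer allocation stands in for 74 per-message allocations.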
>>>> >>
>>>> >> For the implementation, I had to change two other things. After my
>>>> >> first implementation, I realized that msg_t::init_data does a malloc
>>>> >> to create the content_t member. Given that msg_t's size is now 64
>>>> >> bytes, I removed content_t completely by adding the members of
>>>> >> content_t to the lmsg_t union. However, this is a problem with the
>>>> >> current code because one of the members is an atomic_counter_t, which
>>>> >> is a non-POD type and cannot be a union member. For my
>>>> >> proof-of-concept implementation, I switched on C++11 mode because
>>>> >> this relaxes the requirements for PODs.
>>>> >>
>>>> >> I hope this could be useful and maybe included in the main branch. My
>>>> >> next step is to change the encoder/stream engine to use writev to
>>>> >> skip the memcpy operations when sending messages.
>>>> >>
>>>> >> Best wishes,
>>>> >>   Jens Auer
>>>> >>
>>>> >>
>>>> >> _______________________________________________
>>>> >> zeromq-dev mailing list
>>>> >> zeromq-dev at lists.zeromq.org
>>>> >> http://lists.zeromq.org/mailman/listinfo/zeromq-dev