[zeromq-dev] (almost) zero-copy message receive
Thomas Rodgers
rodgert at twrodgers.com
Tue Jun 2 18:10:38 CEST 2015
with reinterpret_cast?
On Tue, Jun 2, 2015 at 10:55 AM, Auer, Jens <jens.auer at cgi.com> wrote:
> I already tried this, but then the compiler complained about a violation
> of the strict aliasing rules and the code no longer compiled. The problem
> is that I access something of type uint8_t[] through a pointer of type
> atomic_counter_t*.
>
> Cheers,
> Jens
>
> --
> *Jens Auer *| CGI | Software-Engineer
> CGI (Germany) GmbH & Co. KG
> Rheinstraße 95 | 64295 Darmstadt | Germany
> T: +49 6151 36860 154
> *jens.auer at cgi.com* <jens.auer at cgi.com>
> Our mandatory disclosures pursuant to § 35a GmbHG / §§ 161, 125a HGB can be found at
> *de.cgi.com/pflichtangaben* <http://de.cgi.com/pflichtangaben>.
>
> CONFIDENTIALITY NOTICE: Proprietary/Confidential information belonging to
> CGI Group Inc. and its affiliates may be contained in this message. If you
> are not a recipient indicated or intended in this message (or responsible
> for delivery of this message to such person), or you think for any reason
> that this message may have been addressed to you in error, you may not use
> or copy or deliver this message to anyone else. In such case, you should
> destroy this message and are asked to notify the sender by reply e-mail.
> ------------------------------
> *From:* zeromq-dev-bounces at lists.zeromq.org [
> zeromq-dev-bounces at lists.zeromq.org] on behalf of Thomas Rodgers [
> rodgert at twrodgers.com]
> *Sent:* Tuesday, 2 June 2015 17:34
>
> *To:* ZeroMQ development list
> *Subject:* Re: [zeromq-dev] (almost) zero-copy message receive
>
> One other point, you would need to guarantee appropriate alignment of
> the backing array (sadly, C++11 gave us std::aligned_storage but 03 not so
> much).
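
For readers unfamiliar with the C++11 facility mentioned above, here is a minimal sketch of how std::aligned_storage can provide raw storage that is both large enough and suitably aligned for a non-POD type. The class counter_t and the helper construct_counter are hypothetical names used only for illustration; they are not part of libzmq.

```cpp
#include <cassert>
#include <cstdint>
#include <new>          // placement new
#include <type_traits>  // std::aligned_storage, std::alignment_of (C++11)

// Hypothetical non-POD counter, used only for illustration.
class counter_t {
public:
    explicit counter_t(int v) : value_(v) {}
    int get() const { return value_; }
private:
    int value_;
};

// Raw storage that is large enough and suitably aligned for counter_t.
typedef std::aligned_storage<sizeof(counter_t),
                             std::alignment_of<counter_t>::value>::type
    counter_storage_t;

// Constructs a counter_t inside the given storage and returns a pointer to it.
// The caller is responsible for calling the destructor explicitly.
inline counter_t *construct_counter(counter_storage_t &storage, int initial) {
    return new (&storage) counter_t(initial);
}
```

Using the pointer returned by placement new, rather than casting the storage address again later, is one way to keep the access well-typed.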
>
> On Tue, Jun 2, 2015 at 10:11 AM, Thomas Rodgers <rodgert at twrodgers.com>
> wrote:
>
>> I am unaware of any list, it seems (to me at least) it's whatever people
>> have contributed support for.
>>
>> As for a concrete suggestion, I am not sure this is any less ugly, but...
>>
>> libzmq generally goes out of its way to avoid calling new() in many
>> cases (non-throwing failure on OOM presumably), preferring malloc() and
>> placement new instead; content_t was no exception. It did embed an explicit
>> atomic_counter_t and would placement new it, and explicitly call the
>> destructor when the lmsg was destroyed.
>>
>> You could instead declare the storage for atomic_counter_t as an
>> array, uint8_t ctr_storage[sizeof(atomic_counter_t)], placement
>> new it and explicitly call the dtor as before, and then provide an
>> accessor that returns an atomic_counter_t& by returning
>> *reinterpret_cast<atomic_counter_t*>(&ctr_storage[0]).
>>
>> It doesn't change the size of the lmsg type, but it does remove one
>> level of indirection, and you have hidden the ugly bits behind an accessor
>> method.
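
The suggestion above could be sketched roughly as follows. This is a sketch under assumptions, not libzmq code: atomic_counter_t here is a simplified, non-atomic stand-in for the real class, and lmsg_sketch_t, init_counter, destroy_counter and refcnt are hypothetical names. The anonymous union with plain members is one common C++03 way to give the byte array at least the alignment of those members; some compilers may still warn about strict aliasing on the cast.

```cpp
#include <cassert>
#include <new>       // placement new
#include <stdint.h>

// Hypothetical stand-in for libzmq's atomic_counter_t (NOT the real class).
// It is non-POD because it has a constructor and a private data member.
class atomic_counter_t {
public:
    explicit atomic_counter_t(uint32_t value = 0) : value_(value) {}
    void add(uint32_t n) { value_ += n; }
    // Returns false once the counter has dropped to zero.
    bool sub(uint32_t n) { value_ -= n; return value_ != 0; }
    uint32_t get() const { return value_; }
private:
    uint32_t value_;  // a real implementation would update this atomically
};

// Illustrative message part; the name and layout are assumptions.
struct lmsg_sketch_t {
    // Only POD members, so this union is legal in C++98/03. The extra
    // members exist solely to raise the alignment of the byte array.
    union {
        uint8_t ctr_storage[sizeof(atomic_counter_t)];
        long align_long;
        void *align_ptr;
    };

    // Construct the counter in place inside the byte array.
    void init_counter(uint32_t initial) {
        new (&ctr_storage[0]) atomic_counter_t(initial);
    }
    // Explicit destructor call, mirroring the placement new above.
    void destroy_counter() { refcnt().~atomic_counter_t(); }
    // Accessor hiding the cast from the rest of the code.
    atomic_counter_t &refcnt() {
        return *reinterpret_cast<atomic_counter_t *>(&ctr_storage[0]);
    }
};
```

Callers would only ever touch refcnt(), so the cast stays confined to one place.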
>>
>> On Tue, Jun 2, 2015 at 9:40 AM, Auer, Jens <jens.auer at cgi.com> wrote:
>>
>>> Hi Thomas,
>>>
>>>
>>>
>>> I did not want to say that you intended to start a flame war, and you
>>> certainly did not. I know that things like these easily turn into flame
>>> wars, so I wanted to prevent that. All I wanted to say was that I know
>>> switching to C++11 may not be applicable, so I want to discuss the patch
>>> first and search for an alternative to C++11. Do you have a suggestion for
>>> the issue with the non-POD atomic_counter_t? My experience is quite
>>> limited because I normally don't write code which has to support older
>>> compilers.
>>>
>>>
>>>
>>> Just out of curiosity, do you know of a definitive list of compilers ZeroMQ
>>> intends to support? I only found the document
>>> http://zeromq.org/docs:builds, but this is quite outdated.
>>>
>>>
>>>
>>> Best wishes,
>>>
>>> Jens
>>>
>>>
>>>
>>> *--*
>>>
>>> *Dr. Jens Auer *| CGI | Software Engineer
>>>
>>> CGI Deutschland Ltd. & Co. KG
>>> Rheinstraße 95 | 64295 Darmstadt | Germany
>>>
>>> T: +49 6151 36860 154
>>>
>>> jens.auer at cgi.com
>>>
>>>
>>>
>>>
>>> *From:* zeromq-dev-bounces at lists.zeromq.org [mailto:
>>> zeromq-dev-bounces at lists.zeromq.org] *On Behalf Of *Thomas Rodgers
>>> *Sent:* 02 June 2015 16:07
>>> *To:* ZeroMQ development list
>>> *Subject:* Re: [zeromq-dev] (almost) zero-copy message receive
>>>
>>>
>>>
>>> A better chart of compiler conformance is probably here -
>>>
>>>
>>>
>>> http://en.cppreference.com/w/cpp/compiler_support
>>>
>>>
>>>
>>> I don't know that there's been a "flamewar" around this topic, but
>>> switching to -std=c++11 has been discussed recently in the context of
>>> implementing thread safe sockets. IIRC the consensus at the time was to
>>> stick with 98/03 conforming code.
>>>
>>>
>>>
>>> In my experience, supporting C++11 opens up a bit of a minefield
>>> of portability "gotchas". For https://github.com/zeromq/azmq I can
>>> generally fall back on Boost to paper over these issues, but that's not a
>>> realistic option with libzmq. So, while I get that *this* change wouldn't
>>> necessarily run afoul of the current state of C++ conformance for MSVC, I
>>> am personally a bit leery of the general prospects of switching to C++11 in
>>> libzmq because the next change that assumes C++11 conformance might not be
>>> supported by the range of compilers used to build libzmq.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Jun 2, 2015 at 8:44 AM, Auer, Jens <jens.auer at cgi.com> wrote:
>>>
>>> Hi,
>>>
>>> I don't want to start a flame war on this because I know that there can
>>> be very good reasons to not rely on C++11. That's why I did not just send
>>> the patch, but wanted to discuss it first and see if there are any other
>>> ways to do this.
>>> The main issue is that I want to put an atomic_counter_t into a union in
>>> msg_t, which C++98/C++03 does not allow because it has constructors and
>>> non-public data members. So an obvious way to solve this would be to make
>>> atomic_counter_t a plain C struct with free functions to create and modify
>>> it. I was hoping for other suggestions.
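
A rough sketch of the "plain C struct with free functions" alternative mentioned above. All names here (atomic_counter_pod_t, atomic_counter_set, atomic_counter_add, atomic_counter_sub, and the union and struct types) are hypothetical, and the __sync builtins are GCC/Clang-specific, so this is an assumption-laden illustration rather than portable libzmq code.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical POD replacement for atomic_counter_t: no constructors,
// only public data, so C++98/03 accepts it as a union member.
struct atomic_counter_pod_t {
    volatile uint32_t value;
};

// Free functions take the place of the constructor and member functions.
inline void atomic_counter_set(atomic_counter_pod_t *c, uint32_t v) {
    c->value = v;
}

// Adds n atomically and returns the previous value (GCC/Clang builtin).
inline uint32_t atomic_counter_add(atomic_counter_pod_t *c, uint32_t n) {
    return __sync_fetch_and_add(&c->value, n);
}

// Subtracts n atomically; returns false once the counter reaches zero.
inline bool atomic_counter_sub(atomic_counter_pod_t *c, uint32_t n) {
    return __sync_sub_and_fetch(&c->value, n) != 0;
}

// Illustrative variants of a message; the layout is an assumption.
struct lmsg_part_t {
    atomic_counter_pod_t refcnt;
    void *data;
};
struct vsm_part_t {
    unsigned char data[16];
};

// Legal in C++98/03 because every member is a POD.
union msg_sketch_u {
    lmsg_part_t lmsg;
    vsm_part_t vsm;
};
```

The trade-off is the one raised in the thread: the counter can live in the union, but every use site has to call the free functions explicitly.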
>>>
>>> For what it's worth, the change does not need a fully compliant C++11
>>> compiler, just standard-layout and trivial types, which MSVC has
>>> implemented since 2012.
>>>
>>> Cheers,
>>> Jens
>>>
>>>
>>>
>>> --
>>>
>>> *Jens Auer *| CGI | Software-Engineer
>>>
>>> CGI (Germany) GmbH & Co. KG
>>> Rheinstraße 95 | 64295 Darmstadt | Germany
>>>
>>> T: +49 6151 36860 154
>>> jens.auer at cgi.com
>>> ------------------------------
>>>
>>> *From:* zeromq-dev-bounces at lists.zeromq.org [
>>> zeromq-dev-bounces at lists.zeromq.org] on behalf of Thomas Rodgers [
>>> rodgert at twrodgers.com]
>>> *Sent:* Tuesday, 2 June 2015 15:37
>>>
>>> *To:* ZeroMQ development list
>>> *Subject:* Re: [zeromq-dev] (almost) zero-copy message receive
>>>
>>>
>>>
>>> Personally, I think that after 4 years of C++11, this should not be an
>>> issue, but there may be platforms with old compilers which you want to
>>> support.
>>>
>>>
>>>
>>> 4 years of C++11 *should* be enough, but wide-spread use of fully
>>> conforming compilers is still an issue, for instance -
>>>
>>>
>>>
>>> https://msdn.microsoft.com/en-us/library/hh567368.aspx
>>>
>>>
>>>
>>> On Tue, Jun 2, 2015 at 8:05 AM, Auer, Jens <jens.auer at cgi.com> wrote:
>>>
>>> Hi Pieter,
>>>
>>> the reason I wanted to ask first is that I had to switch on C++11 to
>>> make it work without changing atomic_counter_t. The reason is that I
>>> eliminated msg_t::content_t completely to save a malloc call by adding the
>>> members of content_t to the msg_t class directly, since there is now
>>> enough space. However, atomic_counter_t is not a POD and cannot be put into
>>> the union. For my proof-of-concept, switching on C++11 is fine, but I am not
>>> sure if that is OK for the main branch. Personally, I think that after 4
>>> years of C++11, this should not be an issue, but there may be platforms
>>> with old compilers which you want to support.
>>>
>>> The only alternative I came up with would be to make atomic_counter_t a
>>> classical C struct with free functions instead of a class. I don't like
>>> this very much.
>>>
>>> Best wishes,
>>> Jens
>>>
>>> --
>>> Jens Auer | CGI | Software-Engineer
>>> CGI (Germany) GmbH & Co. KG
>>> Rheinstraße 95 | 64295 Darmstadt | Germany
>>> T: +49 6151 36860 154
>>> jens.auer at cgi.com
>>>
>>> ________________________________________
>>> From: zeromq-dev-bounces at lists.zeromq.org [
>>> zeromq-dev-bounces at lists.zeromq.org] on behalf of Pieter
>>> Hintjens [ph at imatix.com]
>>> Sent: Tuesday, 2 June 2015 10:18
>>> To: ZeroMQ development list
>>> Subject: Re: [zeromq-dev] (almost) zero-copy message receive
>>>
>>> Jens,
>>>
>>> Sounds great. Feel free to send such patches to libzmq master; please
>>> make sure they are as atomic as possible, each with a clear problem
>>> statement, each testable individually.
>>>
>>> -Pieter
>>>
>>> On Tue, Jun 2, 2015 at 10:13 AM, Arnaud Loonstra <arnaud at sphaero.org>
>>> wrote:
>>> > Although I'm not very familiar with zmq's internals this looks
>>> > promising.
>>> > Did you test if your implementation remains correct? ie. it doesn't
>>> > introduce deadlocks or other race conditions?
>>> >
>>> > Rg,
>>> >
>>> > Arnaud
>>> >
>>> > On 2015-05-31 19:29, Jens Auer wrote:
>>> >> Hi,
>>> >>
>>> >> I did some performance analysis of a program which receives data on a
>>> >> (SUB or PULL) socket, filters it for some criteria, extracts a value
>>> >> from the message and uses this as a subscription to forward the data to
>>> >> a PUB socket. As expected, most time is spent in memory allocations and
>>> >> memcpy operations, so I decided to check if there is an opportunity to
>>> >> minimize these operations in the critical path. From my analysis, the
>>> >> path is as follows:
>>> >> 1. stream_engine receives data from a socket into a static buffer of
>>> >>    8192 bytes
>>> >> 2. decoder/v2_decoder implement a state machine which reads the flags
>>> >>    and message size, creates a new message and copies the data into the
>>> >>    message data field
>>> >> 3. When sending, stream_engine copies the flags field, message and
>>> >>    message data into a static buffer and sends this buffer completely
>>> >>    to the socket
>>> >>
>>> >> Memory allocations are done in v2_decoder when a new message is created,
>>> >> and deallocations are done when sending the message. Memcpy operations
>>> >> are done in decoder to copy
>>> >> - the flags byte into a temporary buffer
>>> >> - the message size into a temporary buffer
>>> >> - the message data into the dynamically allocated storage
>>> >>
>>> >> Since the allocations and memcpy are the dominating operations, I
>>> >> implemented a scheme where these operations are minimized. The main idea
>>> >> is to allocate the receive buffer of 8192 bytes dynamically and use this
>>> >> as the data storage for zero-copy messages created with
>>> >> msg_t::init_data. This replaces n = 8192 / (m_size + 10) memory
>>> >> allocations with one allocation, and it gets rid of the same number of
>>> >> memcpy operations for the message data. I implemented this in a fork
>>> >> (https://github.com/jens-auer/libzmq/tree/zero_copy_receive). For
>>> >> testing, I ran the throughput test (message size 100, 100000 messages)
>>> >> locally and profiled for memory allocations and memcpy. The results are
>>> >> promising:
>>> >> - memory allocations reduced from 100,260 to 2,573
>>> >> - memcpy operations reduced from 301,227 to 202,449. This is expected
>>> >>   because for every message, three memcpys are done, and the patch
>>> >>   removes the data memcpy only.
>>> >> - throughput increased significantly by about 30-40% (I only did a
>>> >>   couple of runs to test it, no thorough benchmarking)
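
The arithmetic behind these figures can be checked with a small sketch. The helper names are hypothetical; the 10-byte per-message overhead is taken directly from the formula n = 8192 / (m_size + 10) above. For the test's message size of 100, one 8192-byte buffer holds 74 whole messages, and the idealized memcpy counts for 100000 messages are 300,000 before and 200,000 after the patch, close to the measured 301,227 and 202,449.

```cpp
#include <cassert>
#include <cstddef>

// Number of whole messages that fit in one receive buffer, assuming each
// message occupies m_size payload bytes plus a fixed per-message overhead
// (10 bytes in the formula quoted above).
inline std::size_t messages_per_buffer(std::size_t buffer_size,
                                       std::size_t m_size,
                                       std::size_t overhead = 10) {
    return buffer_size / (m_size + overhead);
}

// Idealized memcpy count: a fixed number of copies for every message,
// ignoring any copies made outside the per-message path.
inline std::size_t expected_memcpys(std::size_t messages,
                                    std::size_t copies_per_message) {
    return messages * copies_per_message;
}
```

With three copies per message (flags, size, data) the original path costs 3 * 100000 copies; dropping only the data copy leaves 2 * 100000.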
>>> >>
>>> >> For the implementation, I had to change two other things. After my first
>>> >> implementation, I realized that msg_t::init_data does a malloc to create
>>> >> the content_t member. Given that msg_t's size is now 64 bytes, I removed
>>> >> content_t completely by adding the members of content_t to the lmsg_t
>>> >> union. However, this is a problem with the current code because one of
>>> >> the members is an atomic_counter_t, which is a non-POD type and cannot
>>> >> be a union member. For my proof-of-concept implementation, I switched on
>>> >> C++11 mode because this relaxes the requirements for PODs.
>>> >>
>>> >> I hope this could be useful and maybe included in the main branch. My
>>> >> next step is to change the encoder/stream engine to use writev to skip
>>> >> the memcpy operations when sending messages.
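
To illustrate the writev idea mentioned above, here is a minimal POSIX sketch of a gather write: the flags byte, a length byte and the payload are handed to the kernel as three separate pieces, so they never need to be copied into one contiguous send buffer. The function name send_frame_gather and the simplified one-byte-length framing are assumptions made for this example; they are not the libzmq encoder or the exact wire format, and partial writes are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>   // writev, struct iovec (POSIX)
#include <unistd.h>    // pipe, read, close

// Sends flags byte + one-byte length + payload with a single writev call.
// Returns the number of bytes written, or -1 on error (as writev does).
inline ssize_t send_frame_gather(int fd, uint8_t flags,
                                 const void *data, uint8_t size) {
    struct iovec iov[3];
    iov[0].iov_base = &flags;
    iov[0].iov_len = 1;
    iov[1].iov_base = &size;
    iov[1].iov_len = 1;
    // iov_base is a non-const void*, but writev only reads from it.
    iov[2].iov_base = const_cast<void *>(data);
    iov[2].iov_len = size;
    return writev(fd, iov, 3);
}
```

A real encoder would also have to loop on short writes and keep the payload alive until the whole frame has been sent.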
>>> >>
>>> >> Best wishes,
>>> >> Jens Auer
>>> >>
>>> >>
>>> >> _______________________________________________
>>> >> zeromq-dev mailing list
>>> >> zeromq-dev at lists.zeromq.org
>>> >> http://lists.zeromq.org/mailman/listinfo/zeromq-dev