[zeromq-dev] Notes from a hackathon

MinRK benjaminrk at gmail.com
Fri Feb 6 22:48:23 CET 2015


On Fri, Feb 6, 2015 at 1:08 PM, Thomas Rodgers <rodgert at twrodgers.com>
wrote:

> libzmq has had iovec-like send/receive for a while (zmq_sendiov,
> zmq_recviov); there is no documentation for them, however.
>

Yes, though with its current implementation, using zmq_sendiov is the same as
several calls to zmq_send(..., ZMQ_SNDMORE), making it no better or worse than
making those calls yourself.
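
For illustration, a minimal sketch of that equivalence (zmq_sendiov's
prototype below is the undocumented one, so treat it as an assumption
rather than a stable API; the hdr/body buffers are placeholders):

#include <stddef.h>
#include <sys/uio.h>
#include <zmq.h>

/* A two-part message sent with explicit ZMQ_SNDMORE calls. */
static void send_parts_explicit (void *socket, void *hdr, size_t hdr_len,
                                 void *body, size_t body_len)
{
    zmq_send (socket, hdr, hdr_len, ZMQ_SNDMORE);
    zmq_send (socket, body, body_len, 0);
}

/* The same two-part message via the iovec call; with the current
   implementation this expands to the calls above, one part per iovec. */
static void send_parts_iov (void *socket, void *hdr, size_t hdr_len,
                            void *body, size_t body_len)
{
    struct iovec iov[2];
    iov[0].iov_base = hdr;  iov[0].iov_len = hdr_len;
    iov[1].iov_base = body; iov[1].iov_len = body_len;
    zmq_sendiov (socket, iov, 2, 0);
}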

-MinRK


>
> On Fri, Feb 6, 2015 at 2:40 PM, Michel Pelletier <
> pelletier.michel at gmail.com> wrote:
>
>> Something else that occurred to me today, perhaps incorrectly: there was
>> mention that nanomsg doesn't have multipart messages, but it does still
>> preserve multipart semantics in a way by supporting scatter and gather
>> I/O vectors.  Perhaps emulating this well-known API pattern is the right
>> approach to moving forward that still keeps everyone happy and brings us
>> closer to the ability to have thread safety.
>>
>> -Michel
>>
>> On Fri, Feb 6, 2015 at 12:28 PM, MinRK <benjaminrk at gmail.com> wrote:
>>
>>> If I recall correctly, libzmq-3.0 (or was it the old 4.0 experimental
>>> branch?) preserved multi-hop while separating routing from content by
>>> making the routing id a sequence rather than a single value. That way, it
>>> seems that the same push/pop behavior that happens on a ROUTER-DEALER
>>> device today could still work on CLIENT-SERVER. The push/pop would be
>>> happening on the routing stack instead of message content.
>>>
>>> If a message were something like:
>>>
>>> {
>>>    frame_t *frames;
>>>    int nframes;
>>>    routing_id_t *route;
>>>    int nroutes;
>>> }
>>>
>>> it would seem that you could have multi-part content (selfishly, in the
>>> way that affects me - messages composed of non-contiguous memory),
>>> multi-hop routing, and send it all with a single `zmq_send_newfangled`
>>> call, allowing future attempts at making `zmq_send_newfangled` a threadsafe
>>> call.
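>>>
>>> As a very rough sketch (every name here is a placeholder, not a real
>>> libzmq API), the call could look like:
>>>
>>> #include <zmq.h>
>>>
>>> /* Hypothetical: content frames and routing stack carried together. */
>>> typedef struct {
>>>    zmq_msg_t *frames;      /* content, possibly non-contiguous buffers */
>>>    int nframes;
>>>    zmq_msg_t *routes;      /* routing ids, outermost hop first */
>>>    int nroutes;
>>> } zmq_msg_full_t;
>>>
>>> /* Hypothetical: sends the routing stack plus all content frames as one
>>>    atomic operation, which is what would let it become threadsafe. */
>>> int zmq_send_newfangled (void *socket, zmq_msg_full_t *msg, int flags);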
>>>
>>> There may be reasons this would be super gross and horrible, but it's an
>>> idea, anyway.
>>>
>>> -MinRK
>>>
>>>
>>> On Fri, Feb 6, 2015 at 9:02 AM, Thomas Rodgers <rodgert at twrodgers.com>
>>> wrote:
>>>
>>>> Adding a mutex, even one that is never contended, to the socket will
>>>>> essentially triple this (one atomic CAS to acquire the mutex, one atomic
>>>>> CAS to put the message on the pipe, one atomic CAS to release the mutex).
>>>>
>>>>
>>>> This is a bit of a "blue sky" view of the cost of acquiring a mutex.
>>>> For the adventurous of spirit, chase down the call path of pthread_mutex
>>>> sometime in GLIBC. It is substantially more involved than a single pair of
>>>> 'lock; cmpxchg' instructions, but it tries really hard to make that the
>>>> rough cost of the happy path.
>>>>
>>>> On Fri, Feb 6, 2015 at 9:41 AM, Thomas Rodgers <rodgert at twrodgers.com>
>>>> wrote:
>>>>
>>>>> Having thought about this for a couple of more days, I want to at
>>>>> least take a stab at arguing against "threadsafe" sockets -
>>>>>
>>>>> libzmq's thread safety guarantees, to me anyway, are very clear,
>>>>> unsurprising and non-controversial - I cannot share a socket with another
>>>>> thread without a full fence.
>>>>>
>>>>> The kinds of systems I generally build have very strict requirements
>>>>> on overall latency, to the point that most of my networking IO is done
>>>>> through kernel-bypass libraries and NICs that support this, for raw TCP and
>>>>> UDP multicast. The latency-sensitive code that does IO is in its own
>>>>> thread, with exclusive access to the NICs which are accessed via kernel
>>>>> bypass. Coordination with other threads in the same process is done via
>>>>> inproc pair sockets. Pair sockets + very small messages (small enough that
>>>>> libzmq does not need to perform allocation) provide a very nice interface
>>>>> to a lock free queue with low overhead using a single atomic CAS operation.
>>>>> Atomic operations are cheap, but they are not free (~30 clocks on x86).
>>>>> Adding a mutex, even one that is never contended, to the socket will
>>>>> essentially triple this (one atomic CAS to acquire the mutex, one atomic
>>>>> CAS to put the message on the pipe, one atomic CAS to release the mutex). I
>>>>> would like to have the option to avoid this.
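>>>>>
>>>>> As a rough sketch of what that extra cost looks like (the wrapper
>>>>> type and function here are illustrative only, not a proposed API):
>>>>>
>>>>> #include <pthread.h>
>>>>> #include <zmq.h>
>>>>>
>>>>> typedef struct {
>>>>>    void *socket;            /* ordinary libzmq socket */
>>>>>    pthread_mutex_t lock;    /* the hypothetical built-in mutex */
>>>>> } guarded_socket_t;
>>>>>
>>>>> /* Uncontended path: one atomic RMW to take the lock, the single CAS
>>>>>    libzmq already performs to enqueue on the pipe, and one atomic RMW
>>>>>    to release - roughly triple the atomic cost of the bare send. */
>>>>> static int guarded_send (guarded_socket_t *gs, const void *buf,
>>>>>                          size_t len, int flags)
>>>>> {
>>>>>    pthread_mutex_lock (&gs->lock);
>>>>>    int rc = zmq_send (gs->socket, buf, len, flags);
>>>>>    pthread_mutex_unlock (&gs->lock);
>>>>>    return rc;
>>>>> }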
>>>>>
>>>>> If a wrapper wants thread safe sockets to enable certain use-cases
>>>>> that may be more idiomatic for the language in question, it can provide the
>>>>> full fence. AZMQ <https://github.com/zeromq/azmq> does exactly this
>>>>> by default, but you have the option to opt out of it. It does this because
>>>>> Boost Asio by default allows its sockets to be used from multiple threads
>>>>> for async IO, so I need to guard more than just exclusive access to the
>>>>> ZeroMQ socket at the fence in this case. Putting a mutex inside the
>>>>> libzmq socket essentially doubles the overhead for no gain in useful
>>>>> functionality and runs completely counter to one of C and C++'s overarching
>>>>> principles: "don't pay for what you don't use".
>>>>>
>>>>> If a class of apps really demands short lived exclusive access to a
>>>>> socket, provide a pool abstraction. The pool is thread safe: obtain a
>>>>> socket, perform I/O with it in a single thread, then return it to the pool.
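>>>>>
>>>>> A minimal sketch of that pattern (names are illustrative; the pool,
>>>>> not the socket, is the only thing that takes a lock, and each
>>>>> checked-out socket is still used from exactly one thread at a time):
>>>>>
>>>>> #include <pthread.h>
>>>>>
>>>>> #define POOL_SIZE 8
>>>>>
>>>>> typedef struct {
>>>>>    void *sockets[POOL_SIZE];  /* pre-created, pre-connected sockets */
>>>>>    int free_count;
>>>>>    pthread_mutex_t lock;
>>>>> } socket_pool_t;
>>>>>
>>>>> /* Borrow a socket; returns NULL if the pool is currently empty. */
>>>>> static void *pool_checkout (socket_pool_t *p)
>>>>> {
>>>>>    void *s = NULL;
>>>>>    pthread_mutex_lock (&p->lock);
>>>>>    if (p->free_count > 0)
>>>>>       s = p->sockets[--p->free_count];
>>>>>    pthread_mutex_unlock (&p->lock);
>>>>>    return s;
>>>>> }
>>>>>
>>>>> /* Return the socket once the borrowing thread has finished its I/O. */
>>>>> static void pool_checkin (socket_pool_t *p, void *s)
>>>>> {
>>>>>    pthread_mutex_lock (&p->lock);
>>>>>    p->sockets[p->free_count++] = s;
>>>>>    pthread_mutex_unlock (&p->lock);
>>>>> }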
>>>>>
>>>>> On Wed, Feb 4, 2015 at 1:04 PM, Michel Pelletier <
>>>>> pelletier.michel at gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 4, 2015 at 10:51 AM, Pieter Hintjens <ph at imatix.com>
>>>>>> wrote:
>>>>>>
>>>>>>> The discussion about thread safety was quite short iirc, though that
>>>>>>> contributor did discuss other things... at length. I merged his
>>>>>>> "thread safe socket" change rapidly, then we reverted it after a few
>>>>>>> days, and he disappeared. It was rather brute force and I suspect did
>>>>>>> not work at all; it simply wrapped all accesses to the socket
>>>>>>> structure in mutexes. No discussion at the time of multipart data and
>>>>>>> atomic send/recv.
>>>>>>>
>>>>>>
>>>>>> My memory of the conversation at the time is pretty dim. I agree the
>>>>>> changes were ugly and untested, and the contributor was difficult to
>>>>>> reason with and seemed to want to make the changes based on no real need
>>>>>> at all.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> As for socket safety, I've no strong opinion. I see that many people
>>>>>>> expect that to work and hit errors when it doesn't. I see that
>>>>>>> nanomsg
>>>>>>> has threadsafe sockets and no multipart. I see that sharing sockets
>>>>>>> across threads would make some actor models simpler, which is nice.
>>>>>>>
>>>>>>
>>>>>> This is the classic problem with thread safe anything.  Threads are
>>>>>> hard, and there is a balance between the complexity of making a thread safe
>>>>>> construct and the skill required of a programmer to use an "unsafe" construct
>>>>>> in a safe manner.  I still think if the concrete problem is very short
>>>>>> lived threads causing slow joiner problems, then the simple solution is
>>>>>> pools (troupes of actors?).
>>>>>>
>>>>>> -Michel
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Wed, Feb 4, 2015 at 7:35 PM, Michel Pelletier
>>>>>>> <pelletier.michel at gmail.com> wrote:
>>>>>>> > I think Brian has some good points here; there are numerous
>>>>>>> > unrelated issues being discussed in this thread.
>>>>>>> >
>>>>>>> > A few points that I have:
>>>>>>> >
>>>>>>> > Multi part messages have also bothered me.  However, as a Python
>>>>>>> > programmer I see Min's points about the expense of buffer creation.
>>>>>>> > To my knowledge zproto does not (yet) have Python generation support
>>>>>>> > either, nor something like generated cffi or ctypes wrappers around
>>>>>>> > the zproto-generated C code.  That being said, there are a variety of
>>>>>>> > serialization libraries for Python.  With some ctypes and mmap magic
>>>>>>> > they can also be done "zero copy", but it's not pretty:
>>>>>>> >
>>>>>>> > https://gist.github.com/michelp/7522179
>>>>>>> >
>>>>>>> > Multi part envelopes are also how multi-hop routing is done.  I
>>>>>>> > don't see how the new ideas handle that.  I don't think we can just
>>>>>>> > say "multi-hop routing is bad" and get rid of it.
>>>>>>> >
>>>>>>> > "Thread safe" sockets do not sound appealing to me.  We did that,
>>>>>>> had a long
>>>>>>> > and contentious discussion with the person championing them,
>>>>>>> merged it, then
>>>>>>> > reverted it and that person is now no longer in the community.
>>>>>>> Pieter was
>>>>>>> > the most vocal opponent to them then and now he wants them back.
>>>>>>> Of course,
>>>>>>> > anyone can change their mind, but the only current argument I hear
>>>>>>> now for
>>>>>>> > them though is improving the performance of short lived threads,
>>>>>>> but that
>>>>>>> > can be solved, more correctly in my opinion, with thread or
>>>>>>> connection
>>>>>>> > pools.  If you creating and tearing down threads that rapidly then
>>>>>>> you have
>>>>>>> > two problems.
>>>>>>> >
>>>>>>> > -Michel
>>>>>>> >
>>>>>>> > On Wed, Feb 4, 2015 at 3:37 AM, Brian Knox <bknox at digitalocean.com>
>>>>>>> wrote:
>>>>>>> >>
>>>>>>> >> After catching up on this thread, I feel like at least three
>>>>>>> problems are
>>>>>>> >> being conflated into one problem.  I'll state what I see being
>>>>>>> discussed
>>>>>>> >> from my perspective:
>>>>>>> >>
>>>>>>> >> 1. "Using multi part messages as a way to route to clients from a
>>>>>>> router
>>>>>>> >> socket is overly complicated and not how new users expect things
>>>>>>> to work"
>>>>>>> >>
>>>>>>> >> 2. "Using multi part messages for message serialization is
>>>>>>> costly, and
>>>>>>> >> potentially confusing to others."
>>>>>>> >>
>>>>>>> >> 3. "ZeroMQ sockets are not thread safe."
>>>>>>> >>
>>>>>>> >> While on an implementation level these three problems may be
>>>>>>> related, on a
>>>>>>> >> conceptual level I don't see them as related.  I may agree with
>>>>>>> some of
>>>>>>> >> these problem statements and not others.
>>>>>>> >>
>>>>>>> >> For me, my first priority is to always have the ability to get
>>>>>>> back a nice
>>>>>>> >> agnostic blob of bytes from ZeroMQ.   This makes it easy to make
>>>>>>> ZeroMQ
>>>>>>> >> socket use compatible with standard io interfaces in Go.
>>>>>>> Structure for what
>>>>>>> >> is contained in those bytes is a concern of a different layer.
>>>>>>> Sometimes I
>>>>>>> >> use zproto for this (which I like), and other times I don't.
>>>>>>> >>
>>>>>>> >> As a demonstration that the problems are different problems, I
>>>>>>> solved #1
>>>>>>> >> for myself in goczmq without addressing anything else.
>>>>>>> >>
>>>>>>> >> I would assert some of the confusion in this discussion is that
>>>>>>> we're
>>>>>>> >> talking about multiple problem statements at the same time.
>>>>>>> >>
>>>>>>> >> Cheers - and it was great meeting people this week!
>>>>>>> >>
>>>>>>> >> Brian
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Wed, Feb 4, 2015 at 12:50 AM, Pieter Hintjens <ph at imatix.com>
>>>>>>> wrote:
>>>>>>> >>>
>>>>>>> >>> Ironically, in my testing (at high message rates), allowing
>>>>>>> >>> multipart creates significant costs. Multipart is just one way of
>>>>>>> >>> getting zero-copy, and even then it only works on writing, not
>>>>>>> >>> reading.
>>>>>>> >>>
>>>>>>> >>> For high performance brokers like Malamute I'd *really* like to
>>>>>>> be
>>>>>>> >>> moving blobs around instead of lists of blobs.
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>> On Wed, Feb 4, 2015 at 12:41 AM, Gregg Irwin <
>>>>>>> gregg at pointillistic.com>
>>>>>>> >>> wrote:
>>>>>>> >>> > M> Perhaps it is because I spend my days in a higher level
>>>>>>> language
>>>>>>> >>> > M> like Python, but zproto is not an attractive option.
>>>>>>> >>> >
>>>>>>> >>> > Same here. I will read in detail about it shortly, but it may
>>>>>>> not make
>>>>>>> >>> > it into my toolbox as a multipart replacement. Multipart
>>>>>>> looked very
>>>>>>> >>> > cool when I found 0MQ, but I've ended up not using it much.
>>>>>>> I'm not
>>>>>>> >>> > doing high performance stuff though. Simplicity and ease of
>>>>>>> use are
>>>>>>> >>> > tops on my list.
>>>>>>> >>> >
>>>>>>> >>> > -- Gregg
>>>>>>> >>> >