[zeromq-dev] Inter thread communication for scalability
Lindley French
lindleyf at gmail.com
Tue Jan 14 23:40:27 CET 2014
Message passing can be an extremely elegant design, with intuitive behavior
and easy maintenance. Probably its biggest benefit is that it's trivial to
extend it from multiple threads to multiple processes, or even multiple
hosts, especially with zmq. However, I can't promise you it's more scalable
or performant in the 1-process case. You'll need to do some profiling to
see which is faster.
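On the empty-queue case you mention below: if you do stay with a locked
container rather than messages, a condition variable gives you exactly the
"wake me when something arrives" behavior you describe. A minimal C++11
sketch of that idea (the names and layout are mine, not from your code):

    #include <condition_variable>
    #include <list>
    #include <mutex>

    // Blocking pool sketch: pop() sleeps until push() signals that a
    // buffer is available, so nothing spins on an empty queue.
    template <typename Buffer>
    class BlockingPool {
    public:
        void push(Buffer* buf) {
            {
                std::lock_guard<std::mutex> lock(mutex_);
                items_.push_back(buf);
            }
            cond_.notify_one();   // wake one waiting thread
        }

        Buffer* pop() {
            std::unique_lock<std::mutex> lock(mutex_);
            cond_.wait(lock, [this] { return !items_.empty(); });
            Buffer* buf = items_.front();
            items_.pop_front();
            return buf;
        }

    private:
        std::mutex mutex_;
        std::condition_variable cond_;
        std::list<Buffer*> items_;
    };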
I agree with you on type safety. The zmqpp wrapper eases that pain
somewhat.
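For instance (a rough sketch from memory of the zmqpp API, so double-check
the calls against the version you're using), zmqpp lets you stream typed
fields into and out of a message instead of hand-packing bytes:

    #include <cstdint>
    #include <string>
    #include <zmqpp/zmqpp.hpp>

    int main() {
        zmqpp::context ctx;

        zmqpp::socket push(ctx, zmqpp::socket_type::push);
        push.bind("inproc://typed");

        zmqpp::socket pull(ctx, zmqpp::socket_type::pull);
        pull.connect("inproc://typed");

        // Each << appends a typed part; the receiver reads them back in order.
        zmqpp::message out;
        out << std::string("job-42") << int32_t(1024);
        push.send(out);

        zmqpp::message in;
        pull.receive(in);
        std::string name;
        int32_t size;
        in >> name >> size;
    }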
On Tue, Jan 14, 2014 at 4:24 PM, Kenneth Adam Miller <
kennethadammiller at gmail.com> wrote:
> I kind of think it would be the message passing, because obviously, if a
> thread tries to acquire a clean container from the shared set and the
> shared queue is empty, then it has to wait. I would much rather it receive
> a signal to wake it up, but maybe boost::lockfree has an answer to this
> too...
>
>
> On Tue, Jan 14, 2014 at 3:19 PM, Kenneth Adam Miller <
> kennethadammiller at gmail.com> wrote:
>
>> Actually, which do you think would result in a better design decision?
>> Would using message passing result in a more scalable architecture, where I
>> could just change a variable to increase throughput on better processors?
>>
>>
>> On Tue, Jan 14, 2014 at 3:09 PM, Kenneth Adam Miller <
>> kennethadammiller at gmail.com> wrote:
>>
>>> Well, I'm just a type safety dork, so I tend to think that losing type
>>> information over even one pointer, even if I know where that pointer is
>>> going to end up and what type it represents on the other side, is a bad
>>> thing. Plus it's an unnecessary performance cost, but I don't think it's a
>>> big deal. These aren't objects; they are indeed raw buffers, as you
>>> assumed.
>>>
>>> Also, awesome about the boost find! Appreciate you so much, you are a
>>> beast. But I'm actually still in a sprint, so there's no version or commit
>>> with which these hypothetical discussions directly coincide; you're helping
>>> me get it right the first time.
>>>
>>>
>>> On Tue, Jan 14, 2014 at 2:48 PM, Lindley French <lindleyf at gmail.com> wrote:
>>>
>>>> A visit to the Boost libraries reveals there's a brand-new
>>>> Boost.Lockfree library that must have arrived with one of the last few
>>>> versions. You should seriously consider simply replacing your std::lists
>>>> with boost::lockfree::queues using your existing logic, and see if that
>>>> gives you the performance you're looking for before you make any massive
>>>> changes.
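>>>> Something along these lines, say (a sketch only; I'm assuming the buffers
>>>> move between threads as raw pointers, since boost::lockfree::queue wants
>>>> trivially copyable elements, and Buffer/poolA are stand-in names):
>>>>
>>>>     #include <boost/lockfree/queue.hpp>
>>>>
>>>>     struct Buffer;  // stand-in for your existing buffer type
>>>>
>>>>     // Pre-allocated capacity; push/pop never take a lock and never block.
>>>>     boost::lockfree::queue<Buffer*> poolA(128);
>>>>
>>>>     void give_back(Buffer* buf) {
>>>>         while (!poolA.push(buf)) {
>>>>             // push() returns false if a node can't be allocated:
>>>>             // retry, yield, or back off
>>>>         }
>>>>     }
>>>>
>>>>     Buffer* take() {
>>>>         Buffer* buf = nullptr;
>>>>         while (!poolA.pop(buf)) {
>>>>             // queue empty: retry, yield, or back off
>>>>         }
>>>>         return buf;
>>>>     }
>>>>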
>>>>
>>>>
>>>> On Tue, Jan 14, 2014 at 3:40 PM, Lindley French <lindleyf at gmail.com> wrote:
>>>>
>>>>> I'm going to caution you about passing pointers through inproc. It may
>>>>> be possible to do safely, but I haven't yet figured out how to manage
>>>>> ownership semantics in an environment where messages (pointers) can be
>>>>> silently dropped.
>>>>>
>>>>> I didn't imagine serialization would be a problem since you referred
>>>>> to "buffers"; I thought these would be raw byte buffers. If you actually
>>>>> mean lists of objects, then yes, you'll need to serialize to use inproc.
>>>>> There are a number of options for serialization in C++; there's
>>>>> Boost.Serialization, Google Protobufs, a few others. You can also do it
>>>>> manually if your objects are simple.
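>>>>> If they are simple, "manually" can be as little as memcpy-ing the fields
>>>>> into a byte vector and handing that to zmq. A rough sketch (the Record
>>>>> struct and field layout are invented purely for illustration):
>>>>>
>>>>>     #include <cstdint>
>>>>>     #include <cstring>
>>>>>     #include <vector>
>>>>>     #include <zmq.h>
>>>>>
>>>>>     struct Record {          // illustrative, not from your code
>>>>>         uint32_t id;
>>>>>         uint64_t length;
>>>>>     };
>>>>>
>>>>>     // Flatten the fields into a contiguous byte buffer...
>>>>>     std::vector<uint8_t> serialize(const Record& r) {
>>>>>         std::vector<uint8_t> out(sizeof r.id + sizeof r.length);
>>>>>         std::memcpy(out.data(), &r.id, sizeof r.id);
>>>>>         std::memcpy(out.data() + sizeof r.id, &r.length, sizeof r.length);
>>>>>         return out;
>>>>>     }
>>>>>
>>>>>     // ...then send it as a single message; the receiver reverses the
>>>>>     // memcpys to rebuild the Record.
>>>>>     void send_record(void* socket, const Record& r) {
>>>>>         std::vector<uint8_t> bytes = serialize(r);
>>>>>         zmq_send(socket, bytes.data(), bytes.size(), 0);
>>>>>     }
>>>>>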
>>>>>
>>>>> Qt Signals & Slots is another solution for inter-thread communication,
>>>>> similar to inproc; it has the expected C++ object semantics and therefore
>>>>> doesn't require serialization. The downside is it's really only useful for
>>>>> one-to-one, one-to-many, or many-to-one semantics. This covers a lot, but I
>>>>> don't think it has a way to cover one-to-any, which is really what you want
>>>>> (and what the zmq push socket is designed for).
>>>>>
>>>>>
>>>>> On Tue, Jan 14, 2014 at 2:42 PM, Kenneth Adam Miller <
>>>>> kennethadammiller at gmail.com> wrote:
>>>>>
>>>>>> Yeah, it's in C/C++.
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 14, 2014 at 1:39 PM, Charles Remes <lists at chuckremes.com> wrote:
>>>>>>
>>>>>>> If you are doing this from C and can access the raw memory, an
>>>>>>> inproc socket can pass pointers around. If you are using a managed language
>>>>>>> or one where accessing raw memory is difficult, you’ll want to figure out
>>>>>>> how to “fake” passing a pointer (or an object reference). In your case it
>>>>>>> seems like serializing/deserializing would be a big performance hit. That
>>>>>>> said, if that is the direction you must go then pick something fast like
>>>>>>> msgpack as your serializer.
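>>>>>>> The raw-pointer trick is literally copying the pointer value itself into
>>>>>>> the message, something like this (a sketch; acquire_buffer() is a made-up
>>>>>>> helper, Buffer is a stand-in type, and error handling is omitted):
>>>>>>>
>>>>>>>     // Sender thread: ship the pointer value, not the data it points to.
>>>>>>>     Buffer* buf = acquire_buffer();
>>>>>>>     zmq_send(push_socket, &buf, sizeof buf, 0);
>>>>>>>
>>>>>>>     // Receiver thread: copy the pointer value back out. Ownership now
>>>>>>>     // travels with the message, so this thread must recycle or free it.
>>>>>>>     Buffer* received = nullptr;
>>>>>>>     zmq_recv(pull_socket, &received, sizeof received, 0);
>>>>>>>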
>>>>>>>
>>>>>>>
>>>>>>> On Jan 14, 2014, at 1:29 PM, Kenneth Adam Miller <
>>>>>>> kennethadammiller at gmail.com> wrote:
>>>>>>>
>>>>>>> @AJ No, but I understand exactly why you suggested that. It's
>>>>>>> because I haven't explained that thread 1 is doing critical work and it
>>>>>>> needs to offload tasks to other threads as quickly as possible.
>>>>>>>
>>>>>>> @Lindley, Thanks so much for helping me see the truth! I was getting
>>>>>>> awfully confused considering all the different baloney that could go on if
>>>>>>> I was stuck with semaphores, and I couldn't really re-envision it. Is there
>>>>>>> any kind of convenience function or core utility for deserializing the
>>>>>>> data you receive over inproc messages?
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jan 14, 2014 at 12:49 PM, AJ Lewis <aj.lewis at quantum.com> wrote:
>>>>>>>
>>>>>>>> In the zeromq example, couldn't you just skip thread 1 entirely? Then
>>>>>>>> the PULL socket from thread 2 takes uncompressed input from the source,
>>>>>>>> compresses it, and shoves it out the PUSH socket to thread 3 for output.
>>>>>>>>
>>>>>>>> In this case, the PULL socket is the uncompressed pool and the PUSH
>>>>>>>> socket is the compressed pool. Just make sure your uncompressed pool
>>>>>>>> doesn't fill up faster than thread 2 can compress it, or you'll need to
>>>>>>>> implement some logic to prevent it from using up all the memory.
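>>>>>>>> The thread 2 loop might look roughly like this (compress() and the
>>>>>>>> inproc endpoint names are placeholders, not anything you've shown):
>>>>>>>>
>>>>>>>>     #include <cstddef>
>>>>>>>>     #include <cstdint>
>>>>>>>>     #include <vector>
>>>>>>>>     #include <zmq.h>
>>>>>>>>
>>>>>>>>     // placeholder: your compressor, e.g. wrapping zlib or LZ4
>>>>>>>>     void compress(const void* src, size_t len, std::vector<uint8_t>& dst);
>>>>>>>>
>>>>>>>>     void compressor_thread(void* ctx) {
>>>>>>>>         void* in  = zmq_socket(ctx, ZMQ_PULL);   // uncompressed input
>>>>>>>>         void* out = zmq_socket(ctx, ZMQ_PUSH);   // compressed output
>>>>>>>>         zmq_connect(in,  "inproc://uncompressed");
>>>>>>>>         zmq_connect(out, "inproc://compressed");
>>>>>>>>
>>>>>>>>         std::vector<uint8_t> compressed;
>>>>>>>>         for (;;) {
>>>>>>>>             zmq_msg_t msg;
>>>>>>>>             zmq_msg_init(&msg);
>>>>>>>>             zmq_msg_recv(&msg, in, 0);     // blocks while there is no work
>>>>>>>>             compress(zmq_msg_data(&msg), zmq_msg_size(&msg), compressed);
>>>>>>>>             zmq_msg_close(&msg);
>>>>>>>>             zmq_send(out, compressed.data(), compressed.size(), 0);
>>>>>>>>         }
>>>>>>>>     }
>>>>>>>>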
>>>>>>>>
>>>>>>>> AJ
>>>>>>>>
>>>>>>>> On Tue, Jan 14, 2014 at 01:16:32PM -0500, Lindley French wrote:
>>>>>>>> > In this case your "buffers" are really just messages, aren't they? A
>>>>>>>> > thread grabs one (receives a message), processes it, and writes the
>>>>>>>> > result into another buffer (sends a message).
>>>>>>>> >
>>>>>>>> > The hard part is that ZeroMQ sockets don't like to be touched by
>>>>>>>> > multiple threads, which complicates the many-to-many pattern you have
>>>>>>>> > going here. I'm no expert, but I would suggest....
>>>>>>>> >
>>>>>>>> > Each "pool", A and B, becomes a single thread with two ZMQ inproc
>>>>>>>> > sockets, one push and one pull. These are both bound to well-known
>>>>>>>> > endpoints. All the thread does is continually shove messages from the
>>>>>>>> > pull socket to the push socket.
>>>>>>>> >
>>>>>>>> > Each thread in "Thread set 1" has a push inproc socket connected to
>>>>>>>> > pool A's pull socket.
>>>>>>>> >
>>>>>>>> > Each thread in "Thread set 2" has a pull inproc socket connected to
>>>>>>>> > pool A's push socket and a push inproc socket connected to pool B's
>>>>>>>> > pull socket. For each message it receives, it just processes it and
>>>>>>>> > spits it out the other socket.
>>>>>>>> >
>>>>>>>> > The thread in "Thread set 3" has a pull inproc socket connected to
>>>>>>>> > pool B's push socket. It just continually receives messages and
>>>>>>>> > outputs them.
>>>>>>>> >
>>>>>>>> > This may seem complicated because concepts that were distinct before
>>>>>>>> > (buffer pools and worker threads) are now the same thing: they're both
>>>>>>>> > just threads with sockets. The critical difference is that the "buffer
>>>>>>>> > pools" bind to well-known endpoints, so you can only have a few of
>>>>>>>> > them, while the worker threads connect to those well-known endpoints,
>>>>>>>> > so you can have as many as you like.
>>>>>>>> >
>>>>>>>> > Will this perform as well as your current code? I don't know. Profile
>>>>>>>> > it and find out.
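>>>>>>>> > The "pool" thread itself is almost nothing, by the way; zmq_proxy()
>>>>>>>> > will do the pull-to-push shoveling for you. A sketch for pool A (the
>>>>>>>> > endpoint and function names are made up):
>>>>>>>> >
>>>>>>>> >     #include <zmq.h>
>>>>>>>> >
>>>>>>>> >     // Set-1 workers connect PUSH sockets to "poolA-in" to deposit
>>>>>>>> >     // buffers; set-2 workers connect PULL sockets to "poolA-out".
>>>>>>>> >     void pool_a_thread(void* ctx) {
>>>>>>>> >         void* intake = zmq_socket(ctx, ZMQ_PULL);
>>>>>>>> >         void* outlet = zmq_socket(ctx, ZMQ_PUSH);
>>>>>>>> >         zmq_bind(intake, "inproc://poolA-in");   // well-known endpoints
>>>>>>>> >         zmq_bind(outlet, "inproc://poolA-out");
>>>>>>>> >         zmq_proxy(intake, outlet, NULL);         // forward until ctx ends
>>>>>>>> >     }
>>>>>>>> >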
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Tue, Jan 14, 2014 at 12:23 PM, Kenneth Adam Miller <
>>>>>>>> > kennethadammiller at gmail.com> wrote:
>>>>>>>> >
>>>>>>>> > > So, I have two pools of shared buffers: pool A, which is a set of
>>>>>>>> > > buffers of uncompressed data, and pool B, for compressed data. I have
>>>>>>>> > > three sets of threads.
>>>>>>>> > >
>>>>>>>> > > Thread set 1 pulls from pool A, and fills the buffers it receives
>>>>>>>> > > from pool A with uncompressed data.
>>>>>>>> > >
>>>>>>>> > > Thread set 2 is given a buffer from pool A that has recently been
>>>>>>>> > > filled. It pulls a buffer from pool B, compresses from A into B, and
>>>>>>>> > > then returns the buffer it was given, cleared, back to pool A.
>>>>>>>> > >
>>>>>>>> > > Thread set 3 is a single thread that is continually handed compressed
>>>>>>>> > > data from thread set 2, which it outputs. When the data has been
>>>>>>>> > > output, it returns the buffer to pool B, cleared.
>>>>>>>> > >
>>>>>>>> > > Can anybody describe a scheme to me that will allow thread sets 1 & 2
>>>>>>>> > > to scale?
>>>>>>>> > >
>>>>>>>> > > Also, suppose that for pools A and B I'm using shared queues that are
>>>>>>>> > > just C++ STL lists. When I pop from the front, I use a lock for
>>>>>>>> > > removal to make sure that removal is deterministic. When I enqueue, I
>>>>>>>> > > use a separate lock to ensure that the internals of the STL list are
>>>>>>>> > > respected (I don't want two threads receiving iterators to the same
>>>>>>>> > > beginning node; that would probably corrupt the container, cause data
>>>>>>>> > > loss, or both). Is this the appropriate way to go about it? Thread
>>>>>>>> > > sets 1 & 2 will likely have more than one thread each, but there's no
>>>>>>>> > > guarantee that thread sets 1 & 2 will have equal numbers of threads.
>>>>>>>> > >
>>>>>>>> > > I was reading the ZeroMQ manual, and I read the part about
>>>>>>>> > > multi-threading and message passing, and I was wondering what
>>>>>>>> > > approaches should be taken with message passing when data is
>>>>>>>> > > inherently shared between threads.
>>>>>>>> > >
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> AJ Lewis
>>>>>>>> Software Engineer
>>>>>>>> Quantum Corporation
>>>>>>>>
>>>>>>>> Work: 651 688-4346
>>>>>>>> email: aj.lewis at quantum.com
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>