[zeromq-dev] Inter thread communication for scalability

Lindley French lindleyf at gmail.com
Tue Jan 14 21:40:17 CET 2014


I'm going to caution you about passing pointers through inproc. It may be
possible to do this safely, but I haven't yet figured out how to manage
ownership semantics in an environment where messages (pointers) can be
silently dropped.
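
For illustration, this is the sort of handoff I mean (untested sketch; the
"Buffer" type, helper names, and sockets are placeholders):

    #include <zmq.h>
    #include <memory>

    struct Buffer { /* ... your buffer ... */ };

    // Sender side: give up ownership and ship the pointer value itself.
    void send_buffer(void *push, std::unique_ptr<Buffer> buf) {
        Buffer *raw = buf.release();            // we no longer own it
        zmq_send(push, &raw, sizeof(raw), 0);   // message body is the pointer
    }

    // Receiver side: take ownership back.
    std::unique_ptr<Buffer> recv_buffer(void *pull) {
        Buffer *raw = nullptr;
        zmq_recv(pull, &raw, sizeof(raw), 0);
        return std::unique_ptr<Buffer>(raw);    // freed when the owner lets go
    }

    // The catch: if the message is silently dropped (high-water mark,
    // socket closed, context terminated while it's queued), nothing ever
    // deletes the buffer. That's the ownership hole I mean.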

I didn't imagine serialization would be a problem since you referred to
"buffers"; I thought these would be raw byte buffers. If you actually mean
lists of objects, then yes, you'll need to serialize to use inproc. There
are a number of options for serialization in C++: Boost.Serialization,
Google Protocol Buffers, and a few others. You can also do it manually if
your objects are simple.
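
If you do it by hand, it's just a matter of flattening the objects into a
byte buffer and building a zmq message from it. A rough untested sketch (the
"Sample" type and helper names are made up; a raw memcpy like this only works
for flat structs with no pointers or STL members inside):

    #include <zmq.h>
    #include <cstring>
    #include <vector>

    struct Sample { int id; double value; };    // example flat type

    // Flatten a vector of flat structs into one message and send it.
    void send_samples(void *push, const std::vector<Sample> &v) {
        zmq_msg_t msg;
        zmq_msg_init_size(&msg, v.size() * sizeof(Sample));
        std::memcpy(zmq_msg_data(&msg), v.data(), zmq_msg_size(&msg));
        zmq_msg_send(&msg, push, 0);            // the socket takes ownership
    }

    // Rebuild the vector on the receiving side.
    std::vector<Sample> recv_samples(void *pull) {
        zmq_msg_t msg;
        zmq_msg_init(&msg);
        zmq_msg_recv(&msg, pull, 0);
        const Sample *p = static_cast<const Sample *>(zmq_msg_data(&msg));
        std::vector<Sample> v(p, p + zmq_msg_size(&msg) / sizeof(Sample));
        zmq_msg_close(&msg);
        return v;
    }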

Qt signals & slots are another solution for inter-thread communication,
similar to inproc, but with the expected C++ object semantics, so they don't
require serialization. The downside is that they're really only useful for
one-to-one, one-to-many, or many-to-one semantics. That covers a lot, but I
don't think there's a way to cover one-to-any, which is really what you want
(and what the zmq push socket is designed for).
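
For reference, a Qt version looks roughly like this (sketch only; it needs
moc/Q_OBJECT to build, and the class names are made up):

    #include <QObject>
    #include <QByteArray>

    class Worker : public QObject {
        Q_OBJECT
    public slots:
        void process(const QByteArray &buf) { /* compress buf ... */ }
    };

    class Producer : public QObject {
        Q_OBJECT
    signals:
        void ready(const QByteArray &buf);   // body generated by moc
    };

    // Wiring: move the worker to its own thread, then use a queued
    // connection so the QByteArray is copied across the thread boundary.
    //
    //   QThread t;  Worker w;  w.moveToThread(&t);  t.start();
    //   Producer p;
    //   QObject::connect(&p, &Producer::ready,
    //                    &w, &Worker::process, Qt::QueuedConnection);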


On Tue, Jan 14, 2014 at 2:42 PM, Kenneth Adam Miller <
kennethadammiller at gmail.com> wrote:

> Yeah, it's in C/C++.
>
>
>> On Tue, Jan 14, 2014 at 1:39 PM, Charles Remes <lists at chuckremes.com> wrote:
>
>> If you are doing this from C and can access the raw memory, an inproc
>> socket can pass pointers around. If you are using a managed language or one
>> where accessing raw memory is difficult, you’ll want to figure out how to
>> “fake” passing a pointer (or an object reference). In your case it seems
>> like serializing/deserializing would be a big performance hit. That said,
>> if that is the direction you must go then pick something fast like msgpack
>> as your serializer.
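>>
>> Roughly, with msgpack-c that would look something like this (untested
>> sketch; "Sample" is just a stand-in for your real type):
>>
>>     #include <msgpack.hpp>
>>     #include <string>
>>
>>     struct Sample {
>>         int id;
>>         std::string name;
>>         MSGPACK_DEFINE(id, name);      // adapts the struct for msgpack
>>     };
>>
>>     // Serialize into a flat byte buffer you can hand to zmq_send().
>>     Sample in;
>>     in.id = 42;
>>     in.name = "foo";
>>     msgpack::sbuffer sbuf;
>>     msgpack::pack(sbuf, in);
>>
>>     // Deserialize on the receiving side.
>>     msgpack::unpacked result;
>>     msgpack::unpack(&result, sbuf.data(), sbuf.size());
>>     Sample out;
>>     result.get().convert(&out);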
>>
>>
>> On Jan 14, 2014, at 1:29 PM, Kenneth Adam Miller <
>> kennethadammiller at gmail.com> wrote:
>>
>> @AJ No, but I understand exactly why you suggested that. It's because I
>> haven't explained that thread 1 is doing critical work and it needs to
>> offload tasks to other threads as quickly as possible.
>>
>> @Lindley, Thanks so much for helping me see the truth! I was getting
>> awfully confused considering all the different baloney that could go on if I
>> were stuck with semaphores, and I couldn't really re-envision it. Is there
>> any kind of convenience function or core utility for de-serializing the
>> data you receive over inproc messages?
>>
>>
>> On Tue, Jan 14, 2014 at 12:49 PM, AJ Lewis <aj.lewis at quantum.com> wrote:
>>
>>> In the zeromq example, couldn't you just skip thread 1 entirely?  Then the
>>> PULL socket from thread 2 takes uncompressed input from the source,
>>> compresses it, and shoves it out the PUSH socket to thread 3 for output.
>>>
>>> In this case, the PULL socket is the uncompressed pool and the PUSH socket
>>> is the compressed pool.  Just make sure your uncompressed pool doesn't fill
>>> up faster than thread 2 can compress it, or you'll need to implement some
>>> logic to prevent it from using up all the memory.
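>>>
>>> One simple guard is a send high-water mark on the upstream PUSH socket,
>>> so sends block once N buffers are queued instead of buffering without
>>> bound.  Untested sketch (the socket name is just an example):
>>>
>>>     int hwm = 100;                              /* queue at most 100 msgs */
>>>     zmq_setsockopt(push, ZMQ_SNDHWM, &hwm, sizeof(hwm));
>>>     /* once the downstream queues are full, zmq_send() on a PUSH socket
>>>        blocks (or returns EAGAIN with ZMQ_DONTWAIT)                      */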
>>>
>>> AJ
>>>
>>> On Tue, Jan 14, 2014 at 01:16:32PM -0500, Lindley French wrote:
>>> > In this case your "buffers" are really just messages, aren't they? A
>>> > thread grabs one (receives a message), processes it, and writes the
>>> > result into another buffer (sends a message).
>>> >
>>> > The hard part is that ZeroMQ sockets don't like to be touched by
>>> > multiple threads, which complicates the many-to-many pattern you have
>>> > going here. I'm no expert, but I would suggest...
>>> >
>>> > Each "pool", A and B, becomes a single thread with two ZMQ inproc
>>> > sockets, one push and one pull. These are both bound to well-known
>>> > endpoints. All the thread does is continually shove messages from the
>>> > pull socket to the push socket.
>>> >
>>> > Each thread in "Thread set 1" has a push inproc socket connected to
>>> > pool A's pull socket.
>>> >
>>> > Each thread in "Thread set 2" has a pull inproc socket connected to
>>> > pool A's push socket and a push inproc socket connected to pool B's
>>> > pull socket. For each message it receives, it just processes it and
>>> > spits it out the other socket.
>>> >
>>> > The thread in "Thread set 3" has a pull inproc socket connected to
>>> > pool B's push socket. It just continually receives messages and
>>> > outputs them.
>>> >
>>> > This may seem complicated because concepts that were distinct before
>>> > (buffer pools and worker threads) are now the same thing: they're both
>>> > just threads with sockets. The critical difference is that the "buffer
>>> > pools" bind to well-known endpoints, so you can only have a few of
>>> > them, while the worker threads connect to those well-known endpoints,
>>> > so you can have as many as you like.
>>> >
>>> > Will this perform as well as your current code? I don't know. Profile
>>> > it and find out.
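>>> >
>>> > Roughly, each pool thread would be something like this in C (untested
>>> > sketch; the endpoint names are made up, and note that every thread has
>>> > to share the same zmq context for inproc to work):
>>> >
>>> >     void *pull = zmq_socket(ctx, ZMQ_PULL);
>>> >     void *push = zmq_socket(ctx, ZMQ_PUSH);
>>> >     zmq_bind(pull, "inproc://poolA.in");   /* workers push into here  */
>>> >     zmq_bind(push, "inproc://poolA.out");  /* workers pull from here  */
>>> >     zmq_proxy(pull, push, NULL);           /* shovel messages forever */
>>> >
>>> > Bind the pool endpoints before any worker connects (inproc needs the
>>> > bind to exist first); then each worker in set 2 just connects a PULL
>>> > socket to "inproc://poolA.out" and a PUSH socket to "inproc://poolB.in".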
>>> >
>>> >
>>> > On Tue, Jan 14, 2014 at 12:23 PM, Kenneth Adam Miller <
>>> > kennethadammiller at gmail.com> wrote:
>>> >
>>> > > So, I have two pools of shared buffers: pool A, which is a set of
>>> > > buffers of uncompressed data, and pool B, for compressed data. I have
>>> > > three sets of threads.
>>> > >
>>> > > Thread set 1 pulls buffers from pool A and fills them with
>>> > > uncompressed data.
>>> > >
>>> > > Thread set 2 is given a buffer from pool A that has recently been
>>> > > filled. It pulls a buffer from pool B, compresses the A buffer into
>>> > > the B buffer, and then returns the buffer it was given, cleared, back
>>> > > to pool A.
>>> > >
>>> > > Thread set 3 is a single thread that is continually handed compressed
>>> > > data from thread set 2, which it outputs. When the data has been
>>> > > output, it returns the buffer to pool B, cleared.
>>> > >
>>> > > Can anybody describe a scheme to me that will allow thread sets 1 & 2
>>> > > to scale?
>>> > >
>>> > > Also, suppose that for pools A and B I'm using shared queues that are
>>> > > just C++ STL lists. When I pop from the front, I use a lock to make
>>> > > sure that removal is deterministic. When I enqueue, I use a separate
>>> > > lock to ensure that the internals of the STL list are respected (I
>>> > > don't want two threads receiving iterators to the same beginning
>>> > > node; that would probably corrupt the container, cause data loss, or
>>> > > both). Is this the appropriate way to go about it? Thread sets 1 & 2
>>> > > will likely have more than one thread each, but there's no guarantee
>>> > > that they will have an equal number of threads.
>>> > >
>>> > > I was reading the ZeroMQ manual, and I read the part about
>>> > > multi-threading and message passing, and I was wondering what
>>> > > approaches should be taken with message passing when data is
>>> > > inherently shared between threads.
>>> > >
>>>
>>> --
>>> AJ Lewis
>>> Software Engineer
>>> Quantum Corporation
>>>
>>> Work:    651 688-4346
>>> email:   aj.lewis at quantum.com
>>>