[zeromq-dev] Inter thread communication for scalability
Goswin von Brederlow
goswin-v-b at web.de
Sat Jan 18 11:19:51 CET 2014
On Fri, Jan 17, 2014 at 05:06:33PM -0600, Kenneth Adam Miller wrote:
> On Fri, Jan 17, 2014 at 4:35 AM, Goswin von Brederlow <goswin-v-b at web.de> wrote:
>
> > On Thu, Jan 16, 2014 at 09:11:48PM -0600, Kenneth Adam Miller wrote:
> > > On Wed, Jan 15, 2014 at 8:36 AM, Goswin von Brederlow <goswin-v-b at web.de> wrote:
> > > > All those threads get confusing. Lets draw a picture:
> > > >
> > > >   +--------------------------------------------------------------------+
> > > >   |                                                                    |
> > > >   v                                                                    |
> > > > Pool A          Thread Set 1         Router X                          |
> > > > PULL-PUSH --==> PULL-read-PUSH ==--> PULL-PUSH --==> PULL Thread Set 2 PUSH
> > > >                                                          \            /
> > > >                                                           +-compress-+
> > > > Pool B                                                   /            \
> > > > PULL-PUSH ---------------------------------------==> PULL              PUSH
> > > >   ^                                                                     ||
> > > >   |                                                                     ||
> > > >   +----------- PUSH-write-PULL <--======================================++
> > > >                   Thread 3
> > > >
> > > So, I believe this is pretty close, but if I'm right, but you created a
> > > router to deal with a N-M situation, when in actuality a slightly
> > > different configuration is needed in order to make that work. I could be
> > > understanding this wrong, but after I looked at the examples some more, I
> > > think things started to finally click. I think the multi-threaded example
> > > given in the manual that used a router and a dealer required that the
> > > sockets be of type REQ and REP in order to work. I don't think that PUSH
> > > and PULL is, at least not according to the reference guide on zmq_socket.
> > > Is that correct? I'm pretty sure that the only thing that you got wrong is
> > > to illustrate that router takes requests in from thread set 1 and shuttles
> > > them out to thread set 2 as replies.
> >
> > I just called the thread router because it routes messages from Thread
> > Set 1 to Thread Set 2. I didn't use a ROUTER/DEALER socket for it as
> > I believe the guide has for its router example.
> >
> > But that is because the other endpoints are PUSH and PULL so messages
> > go strictly one way and I only need PULL and PUSH to complement them
> > in the router thread. On the other hand I'm fairly new to zmq, too. So
> > I could be wrong.
> >
> > MfG
> > Goswin
> >
>
> Oh awesome, thanks!
> Ok, so last question-I think now that I'm actually implementing what you
> drew that request reply is actually the design that I want; because any
> time a thread group needs a resource, it should be able to request it that
> way the other can know to send it. Is this the right way to go about it?
> Because I was actually going to make a request pattern with PUSH and PULL
> and realized that if I did that, that I had no guarantee that the reply
> would get sent back to the sender. If I didn't have a pattern that would
> allow me to send a resource back to a requesting thread, then I would have
> to flush the thread groups' PULL sockets with handles, and I would have no
> way to know if even the number that I flushed in there was enough to
> satisfy their needs. It could result in performance degradation.

It depends on how many more blocks there are to compress than
compressors. If you don't have a high ratio of blocks to compressors
then the buffering of messages could lead to one compressor having a
bunch of buffers in its input queue while others are blocked waiting
for buffers.

On the other hand, if you have a ton of blocks then the readers will
produce buffer after buffer and all compressors will get a full input
queue. I'm assuming reading a block takes significantly less time than
compressing, so the uncompressed buffers will quickly pile up.

If you start with a fixed number of buffers in Pool A then the readers
will fill them all and pass them on to the compressors, and then they
will block waiting for the compressors to free up a buffer. The number
of buffers you use determines how much read-ahead the readers can do
before blocking. This limits the amount of memory being used nicely.

If you don't want them limited that way then the readers could poll
their PULL socket or use non-blocking recv(). If no buffer is
available for pulling then they can allocate a new one and add it to
the circle.
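
A rough sketch of that fallback in Python, in case it helps. This is
only a model: a queue.Queue stands in for the reader's PULL socket, and
the name get_buffer and the 4096-byte size are made up for the example.
With pyzmq the non-blocking step would be a recv() with zmq.NOBLOCK,
which raises zmq.Again when nothing is queued.

```python
import queue


def get_buffer(pool, size):
    """Return (buffer, freshly_allocated) without ever blocking on the pool."""
    try:
        # Non-blocking receive: take a free buffer if one is waiting.
        return pool.get_nowait(), False
    except queue.Empty:
        # No free buffer available, so allocate a new one. It joins the
        # circle once it has been passed along and returned to the pool.
        return bytearray(size), True


pool = queue.Queue()        # stand-in for the reader's PULL socket
pool.put(bytearray(4096))   # one buffer is already circulating

first, fresh_first = get_buffer(pool, 4096)    # taken from the pool
second, fresh_second = get_buffer(pool, 4096)  # pool is empty, so allocated
print(fresh_first, fresh_second)  # False True
```

Note that nothing here caps the number of buffers, so memory use is no
longer bounded by the initial pool size.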

As for knowing whether you have enough buffers or not, that is easy to
calculate. Every thread in thread set 2 should be able to get a
buffer. It should also have a few buffers (~receive high water mark)
in its incoming queue, so multiply |thread set 2| * (1 + recv HWM).
That is the minimum I would use.

Further, every thread in thread set 1 should have a buffer, be able to
fill its outgoing queue and have something in its incoming queue. So
add |thread set 1| * (1 + send HWM + recv HWM) more buffers.
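
To make the arithmetic concrete, here are the two formulas above as a
small Python function. The function name, the parameter names and the
example numbers are made up for illustration; plug in your own thread
counts and high water marks.

```python
def min_buffers(readers, compressors, send_hwm, recv_hwm):
    """Buffer count from the two formulas above (all names are illustrative)."""
    # Thread set 2: one buffer in use plus a filled incoming queue per thread.
    for_compressors = compressors * (1 + recv_hwm)
    # Thread set 1: one buffer in use plus filled outgoing and incoming queues.
    for_readers = readers * (1 + send_hwm + recv_hwm)
    return for_compressors + for_readers


# Hypothetical setup: 2 readers, 4 compressors, both HWMs set to 2.
print(min_buffers(readers=2, compressors=4, send_hwm=2, recv_hwm=2))  # 22
```

With those numbers the compressors account for 4 * 3 = 12 buffers and
the readers for 2 * 5 = 10.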

That leaves only the Pool A and Router threads with free space in their
incoming and outgoing queues. But since they just forward messages
without delay that is fine. In fact, keep them empty as a safety margin,
because if you have too many buffers then you could end up with every
thread having its incoming and outgoing queue full and trying to send
one more. At that point you would have a deadlock.

If you let the reader threads allocate new buffers when they can't pull
any then you will have to handle that situation, probably by having
the Pool A thread free buffers if it can't push them.
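
One way the Pool A thread could do that, again only as a sketch: a
bounded queue.Queue stands in for its PUSH socket, with maxsize playing
the role of the send HWM, and return_buffer is a made-up name. With
pyzmq the equivalent would be a send() with zmq.NOBLOCK that raises
zmq.Again when the message cannot be queued.

```python
import queue


def return_buffer(pool_out, buf):
    """Hand a buffer back to the readers, or drop it if the queue is full."""
    try:
        # Non-blocking send towards the readers.
        pool_out.put_nowait(buf)
        return True
    except queue.Full:
        # Outgoing queue is at its limit: do not keep the surplus buffer,
        # so it can be freed instead of blocking the pool thread.
        return False


pool_out = queue.Queue(maxsize=2)  # maxsize plays the role of the send HWM
results = [return_buffer(pool_out, bytearray(16)) for _ in range(3)]
print(results)  # [True, True, False]
```

The third buffer is dropped rather than queued, which shrinks the circle
back towards its intended size.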

MfG
Goswin