[zeromq-dev] [PATCH] Publisher side filtering

Gerard Toonstra gtoonstra at gmail.com
Mon Jan 31 16:06:05 CET 2011


On Mon, Jan 31, 2011 at 2:52 PM, Martin Sustrik <sustrik at 250bpm.com> wrote:

> Hi Gerard,
>
>
>> So what I see happen once in a while:
>>
>> - SUB is started
>> - broker is started
>> - PUB is started
>> - things work ok
>> - restart SUB couple of times after it sent the sub request to the broker
>> - no messages yet appear in the publisher (the printf debug statement
>> you put there about receiving unsub/sub requests).
>> - restart broker
>> - 1 second later, the PUB displays "couple of times" sub requests + 1.
>>
>
> So the subscriptions get to the publisher finally. It's only that they are
> delayed by 1 second, right?
>
> Can it be caused by invoking the PUB socket (zmq_send) once a second? The
> code currently processes the incoming subscriptions on a call to zmq_send().


Yes, the publisher has a 1-second send interval (test program).
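To make the effect concrete, here is a toy model (not the actual ZeroMQ code; the class and method names are made up for illustration) of a publisher that only looks at incoming subscriptions from inside its send call. A subscription that arrives between two sends sits unprocessed until the next send, so with a 1-second send interval it can lag by up to 1 second.

```python
from collections import deque


class ToyPublisher:
    """Hypothetical model: subscriptions are only processed inside send()."""

    def __init__(self):
        self.inbox = deque()  # subscriptions that arrived from the wire
        self.active = set()   # subscriptions the publisher has applied

    def on_wire(self, topic):
        # The subscription has arrived, but nothing looks at it yet.
        self.inbox.append(topic)

    def send(self, msg):
        # Pending subscriptions are drained here and nowhere else.
        while self.inbox:
            self.active.add(self.inbox.popleft())
        # Return the prefixes that match, i.e. who would get this message.
        return [t for t in sorted(self.active) if msg.startswith(t)]


pub = ToyPublisher()
pub.on_wire("App.Global.TEST")
pending_before_send = len(pub.inbox)  # still queued, not applied
matched = pub.send("App.Global.TEST hello")
```

In this model the subscription only becomes active on the call to send(), which matches the 1-second delay but not the case where the requests stay stuck across many sends.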


>> Because there are as many messages as restarts and both the broker and
>> subs were restarted, this looks like the PUB is
>> piling up messages for some reason.
>>
>
> Yes. At the moment the subscriptions are simply forwarded upstream. Later
> on we can filter the subscriptions and *not* send those that were already
> sent upstream.
>
> Example:
>
> 1 publisher, 1 forwarder, 2 subscribers.
>
> subscriber 1 subscribes to topic "A"
> the subscription is forwarded to the broker
> the broker forwards it to the publisher
> subscriber 2 subscribes to topic "A"
> the subscription is forwarded to the broker
> the broker realises that the subscription was already forwarded to the
> publisher and does nothing


Some refcounting should indeed take place there. The complication is the
"unsub", but this is for later.
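A minimal sketch of what that refcounting could look like in the broker, assuming a simple per-topic counter. The class and method names are hypothetical and not part of the patch; the point is only that a subscription goes upstream on the 0 to 1 transition and an unsubscription on the 1 to 0 transition.

```python
from collections import Counter


class SubscriptionTable:
    """Hypothetical per-topic refcount kept by the broker (not ZeroMQ API)."""

    def __init__(self):
        self._refs = Counter()

    def subscribe(self, topic):
        """Return True if the subscription must be forwarded upstream."""
        self._refs[topic] += 1
        return self._refs[topic] == 1  # first subscriber for this topic

    def unsubscribe(self, topic):
        """Return True if the unsubscription must be forwarded upstream."""
        if self._refs[topic] == 0:
            return False  # unknown topic, nothing to undo
        self._refs[topic] -= 1
        if self._refs[topic] == 0:
            del self._refs[topic]
            return True  # last subscriber for this topic left
        return False


table = SubscriptionTable()
first = table.subscribe("A")    # subscriber 1: forward to the publisher
second = table.subscribe("A")   # subscriber 2: already forwarded, do nothing
```

A SUB that is killed without sending an unsub would leave its count too high in this sketch, so the broker would presumably also have to drop a peer's references when it disconnects.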

The "piling up" shouldn't actually happen, because the publisher goes
through the "send" cycle every second. By piling up, I mean that the PUB
apparently receives the messages but only processes them when the broker
is killed. So it sounds like it has something to do with pipe management
in fq_t. I was thinking this may be caused by not calling "has_in" prior
to "recv" (since has_in also manipulates the active count), but the
active count is also managed in "recv" anyway, so that shouldn't be the
issue.

I took a look at "netstat" and saw the following:

tcp        0      0 0.0.0.0:5559            0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:5559          127.0.0.1:39997         TIME_WAIT
tcp      180      0 127.0.0.1:40055         127.0.0.1:5559          ESTABLISHED
tcp        0      0 127.0.0.1:5559          127.0.0.1:39919         TIME_WAIT
tcp        0      0 127.0.0.1:5559          127.0.0.1:40055         ESTABLISHED

5559 is the port of the broker, listening for PUB connections.

Whenever a SUB is killed and restarted, the receive buffer (the second
column, currently 180) grows by 18 bytes, which is one subscription for
topic "App.Global.TEST". Doing this many times with different subs doesn't
clear the buffer; it just sits there, growing by 18 bytes each time.
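As a sanity check on those numbers, the 18 bytes are consistent with the 15-byte topic plus three bytes of overhead. The breakdown below is an assumption on my part (one byte marking the message as a subscribe, plus one length byte and one flags byte of framing), not something confirmed in this thread. Under that assumption, 180 bytes would be ten unread subscriptions.

```python
TOPIC = b"App.Global.TEST"

# Assumptions (not confirmed in this thread):
#   - 1 byte marks the message as a subscribe request
#   - each frame carries 1 length byte and 1 flags byte on the wire
SUBSCRIBE_MARKER_BYTES = 1
FRAME_OVERHEAD_BYTES = 2

RECV_QUEUE_BYTES = 180  # value seen in the netstat output

frame_size = len(TOPIC) + SUBSCRIBE_MARKER_BYTES + FRAME_OVERHEAD_BYTES
pending = RECV_QUEUE_BYTES // frame_size

print(frame_size)  # 18
print(pending)     # 10
```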

So apparently, the bytes stay in the "recvQ" of the publisher socket until
the broker is killed (killed, not restarted).
As soon as the broker is killed, the PUB suddenly picks up all pending
messages and prints the sub requests (debug).

This condition is not consistent. Sometimes the broker is restarted and it
just works. Only sometimes, after frequent restarts, does the PUB get into
this state. It sounds like an issue with pipe state on the PUB side:
possibly the broker connects and immediately sends the sub changes, and
the PUB either misses them or assumes there is nothing to read yet?

Rgds,

-- 
Gerard Toonstra
-----------------------
http://www.radialmind.org