[zeromq-dev] [PATCH] Publisher side filtering
Martin Sustrik
sustrik at 250bpm.com
Mon Jan 31 14:52:10 CET 2011
Hi Gerard,
> So what I see happen once in a while:
>
> - SUB is started
> - broker is started
> - PUB is started
> - things work ok
> - restart SUB couple of times after it sent the sub request to the broker
> - no messages yet appear in the publisher (the printf debug statement
> you put there about receiving unsub/sub requests).
> - restart broker
> - 1 second later, the PUB displays "couple of times" sub requests + 1.
So the subscriptions get to the publisher finally. It's only that they
are delayed by 1 second, right?
Can it be caused by invoking the PUB socket (zmq_send) once a second?
The code currently processes the incoming subscriptions on a call to
zmq_send().
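
To illustrate the effect, here is a toy model in Python. It is only a sketch, not the actual libzmq code: the class ToyPub and its methods are hypothetical names made up for this example. It assumes that subscription requests queue up on arrival and are only looked at inside send.

```python
from collections import deque


class ToyPub:
    """Toy model of a PUB socket (hypothetical, not the libzmq implementation)."""

    def __init__(self):
        self.pending = deque()      # subscription requests received but not yet processed
        self.subscriptions = set()  # subscriptions the publisher has actually processed

    def deliver_subscription(self, topic):
        # The request arrives from downstream, but nothing processes it yet.
        self.pending.append(topic)

    def send(self, message):
        # Pending subscriptions are processed only here, as part of a send.
        while self.pending:
            self.subscriptions.add(self.pending.popleft())
        # Report whether any processed subscription matches the message prefix.
        return any(message.startswith(topic) for topic in self.subscriptions)


pub = ToyPub()
pub.deliver_subscription("A")
seen_before_send = set(pub.subscriptions)  # the request is still only pending
matched = pub.send("A: hello")
seen_after_send = set(pub.subscriptions)   # processed as a side effect of send
```

In this model a publisher that sends once a second notices a new subscription up to a second late, which would match the delay described above.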
> Because there are as many messages as restarts and both the broker and
> subs were restarted, this looks like the PUB is
> piling up messages for some reason.
Yes. At the moment the subscriptions are simply forwarded upstream.
Later on we can filter the subscriptions and *not* send those that were
already sent upstream.
Example:
1 publisher, 1 forwarder (the broker), 2 subscribers.
subscriber 1 subscribes to topic "A"
the subscription is forwarded to the broker
the broker forwards it to the publisher
subscriber 2 subscribes to topic "A"
the subscription is forwarded to the broker
the broker realises that the subscription was already forwarded to the
publisher and does nothing
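
A minimal sketch of that filtering, again as a toy Python model and not the libzmq code. ToyBroker and its fields are hypothetical names. Keeping a per-topic count rather than a plain flag is an assumption of this sketch (it would let unsubscriptions be handled later); the example above only requires knowing whether the topic was already forwarded.

```python
class ToyBroker:
    """Toy model of a forwarder that filters duplicate subscriptions (hypothetical)."""

    def __init__(self):
        self.refcount = {}   # topic -> number of downstream subscribers (assumption)
        self.forwarded = []  # subscriptions actually sent upstream to the publisher

    def subscribe(self, topic):
        count = self.refcount.get(topic, 0)
        self.refcount[topic] = count + 1
        if count == 0:
            # First subscriber for this topic: forward the subscription upstream.
            self.forwarded.append(topic)
            return True
        # The topic was already forwarded: do nothing upstream.
        return False


broker = ToyBroker()
first = broker.subscribe("A")   # subscriber 1: forwarded to the publisher
second = broker.subscribe("A")  # subscriber 2: filtered at the broker
```

With this in place the publisher sees a single subscription for "A" no matter how many subscribers ask for it.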
> I noticed that the "pub_t::xsend"
> method doesn't call "has_in" prior to executing
> the recv in the while loop and was wondering whether something may be
> failing over there?
It's OK. It's a non-blocking recv, so it returns EAGAIN if there are no
subscriptions to process.
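
The same pattern can be shown with plain sockets from the Python standard library. This is a sketch of the general "drain until EAGAIN" idiom, not the pub_t::xsend code itself. The helper name drain is hypothetical, and the example assumes a Unix-like system where AF_UNIX datagram socket pairs are available.

```python
import socket


def drain(sock):
    """Read every message that is ready now; stop when recv would block."""
    messages = []
    while True:
        try:
            messages.append(sock.recv(4096))
        except BlockingIOError:
            # Raised for EAGAIN/EWOULDBLOCK: nothing left to process.
            break
    return messages


# Datagram socket pair so that message boundaries are preserved (Unix only).
sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
receiver.setblocking(False)

empty = drain(receiver)    # no has_in-style check needed: the loop exits at once
sender.send(b"sub A")
sender.send(b"sub B")
drained = drain(receiver)  # both pending messages are picked up in one pass

sender.close()
receiver.close()
```

Calling the loop with nothing pending is harmless, which is why no separate readiness check is needed before it.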
> When this occurs at some point, it's usually after a broker restart. The
> messages from the sub fail to get to the pub.
> When the broker is restarted again, they all show up together in one go.
> I think this is related to the "has_in" method
> probably? (as if there's an invalid pipe ahead of the new valid pipe
> that didn't get removed, or something like that).
So you are able to get the system stuck, right? Do you have the test
programs? How can I reproduce it?
Martin