[zeromq-dev] i can't see what i am doing wrong

Brian Granger ellisonbg at gmail.com
Tue Jul 6 01:06:51 CEST 2010


I have run into this many times.  The issue is that a SUB socket only
gets messages that are sent after it has actually connected:

If the PUB starts before the SUB, the PUB begins broadcasting before
the SUB is listening, and the SUB never gets the messages that were
sent before it connected.  A PUB socket is like a radio broadcast: if
you aren't listening, you don't get the messages.

BUT (this is more subtle).  If the SUB starts before the PUB, you can
still miss messages.  This is because it takes a little bit of time (I
think 0.1 sec) for the SUB socket to notice that the PUB socket has
started.  In that short interval, the PUB socket has already started
sending, so you miss the first few.

So, if you want to make sure that you get all the messages, I would
use a separate REQ/REP channel between the two to synchronize
everything before the PUB/SUB starts.

Cheers,

Brian

On Mon, Jul 5, 2010 at 3:31 PM, Andrew Hume <andrew at research.att.com> wrote:
> folks,
> i am doing a simple case and can't see my error:
> in process a:
>         ctxt = zmq_init(1, 5, 0);
>         q = zmq_socket(ctxt, ZMQ_PUB);
>         sprintf(buf, "tcp://%s:%s", machine, port);
>         n = zmq_connect(q, buf);
>         for(n = 0; n < 2050; n++){
>                 get_goo(loc, &data, &len);
>                 zmq_msg_init_size(&msg, len);
>                 memcpy(zmq_msg_data(&msg), data, len);
>                 m = zmq_send(q, &msg, 0);
>                 assert(m == 0);
>         }
>         zmq_term(ctxt);
>         exit(0);
> in process b:
>         ctxt = zmq_init(1, 10, 0);
>         q = zmq_socket(ctxt, ZMQ_SUB);
>         sprintf(buf, "tcp://*:%s", port);
>         n = zmq_bind(q, buf);
>         assert(n == 0);
>         n = zmq_setsockopt(q, ZMQ_SUBSCRIBE, 0, 0);
>         assert(n == 0);
>         for(n = 0; ; n++){
>                 zmq_msg_init(&msg);
>                 m = zmq_recv(q, &msg, 0);
>                 assert(m == 0);
>                 zmq_msg_close(&msg);
>                 if((n%100) == 99){
>                         printf("got %d packets\n", n+1);
>                         sleep(1);
>                 }
>         }
> the problem:
> process b doesn't always see all 2050 messages from process a.
> maybe 5% it does. sometimes, 1300 get thru, other times, 2000.
> nothing i've checked returns an error. i'm running 2.0.6 on redhat 5.4.
> is my code in error? or am i misunderstanding something?
> thanks
>
> ------------------
> Andrew Hume  (best -> Telework) +1 732-886-1886
> andrew at research.att.com  (Work) +1 973-360-8651
> AT&T Labs - Research; member of USENIX and LOPSA
>
>
>
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>
>



-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com


