[zeromq-dev] Feedback on new PATCH socket
Fabien Ninoles
fabien.ninoles at ubisoft.com
Sun May 8 06:56:07 CEST 2011
Sorry, I was talking about mongrel2 handlers, which consist of a PULL socket from which requests are read, and a PUB socket where you send the responses back.
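
Roughly, a handler pair looks like this (just a minimal sketch with the 0MQ 2.x API; the endpoints are made up, and the mongrel2 request parsing and sender-identity setup are left out):

void *ctx = zmq_init (1);

/* requests from mongrel2 arrive on the PULL socket... */
void *in = zmq_socket (ctx, ZMQ_PULL);
zmq_connect (in, "tcp://127.0.0.1:9997");

/* ...and responses go back to mongrel2 on the PUB socket */
void *out = zmq_socket (ctx, ZMQ_PUB);
zmq_connect (out, "tcp://127.0.0.1:9996");

while (1) {
    zmq_msg_t request;
    zmq_msg_init (&request);
    zmq_recv (in, &request, 0);

    zmq_msg_t response;
    zmq_msg_init (&response);
    /* ... build the response from the request here ... */
    zmq_send (out, &response, 0);

    zmq_msg_close (&request);
}
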
Fabien
----- Original Message -----
> Something funny: as it is right now, the PATCH socket could be used to
> replace both handler sockets with a single one. It can't work with the
> collecting mechanism I describe below, however (nor with a timeout).
> Maybe we should put this policy in a higher-level socket?
>
> Fabien
>
> ----- Original Message -----
> >
> > ----- Original Message -----
> > > On 05/06/2011 10:15 PM, Fabien Ninoles wrote:
> > > > Census is just one example. It's more a "chain of command" pattern
> > > > where one or multiple GOVERNORs can send a command to one or multiple
> > > > WORKERS that obey and send the results back. In fact, my primary
> > > > example was a parallel pipeline where everybody works on the same
> > > > task.
> > >
> > > Is that meant to implement the hot/hot failover?
> >
> > Sorry, I think I mixed things up a little. We are currently in the
> > process of replacing a cluster P2P network stack over UDP with zmq. The
> > old stack has peer discovery, data replication, broadcasting and
> > direct communication between peers, as well as fault detection and
> > master migration. It does a good job so far but fails on two levels,
> > scalability and network splits, and requires a separate protocol for
> > inter-cluster communication.
> >
> > I'm currently redesigning the whole thing but have to make some
> > concessions in my design just to allow our deployment tools to work in
> > both cases. So, for example, all nodes need to be able to send a command
> > to and receive a reply from each other, even if a node doesn't know at
> > startup how many other nodes are up and running.
> >
> > So, maybe my specific need is not the best example to generalize from,
> > but I still think that a dispatch-and-collect pattern can be quite
> > useful in a grid-oriented network.
> >
> >
> > > This is part of the XREP vs. ROUTER confusion. XREP cannot add the
> > > delimiter because it's meant to reside in the middle of the topology,
> > > forwarding requests and replies to the next hop.
> >
> > I understand that. My usage of XREP sockets is more a practical issue.
> > Semantically, it is nearer to the REQ socket. I just cannot remove
> > the strict send/recv policy of REQ to allow it to recv multiple replies,
> > and that's why I use an XREP socket to implement it, but I would clearly
> > prefer a dedicated socket with clear endpoint semantics for both.
> >
> > > The obvious problem with any one-request/many-replies model is that the
> > > requester has no idea whether it has got all the answers yet.
> > > Specifically, think of large distributed topologies where at least a
> > > part of the topology is likely to be offline at any given moment.
> > >
> > > The only solution seems to be to set a deadline for the replies. The
> > > user code could then look something like this:
> > >
> > > s = socket (SURVEYOR);
> > > zmq_setsockopt (s, ZMQ_DEADLINE, 10);
> > > // create a request...
> > > zmq_send (s, request, 0);
> > > while (true) {
> > >     zmq_msg_t reply;
> > >     int rc = zmq_recv (s, &reply, 0);
> > >     if (rc < 0 && errno == EDEADLINE)
> > >         break;
> > >     // process reply here...
> > > }
> >
> > Personally, and it is really only a matter of taste, I don't like the
> > idea of the socket handling the deadline itself. If I wanted to take a
> > lock-step approach to the problem, I would say that a PATCH socket
> > requires all connected sockets to send a reply before processing
> > a new request. Since it doesn't know how many sub-connections are below
> > each socket, the protocol would require a signal to be sent back telling
> > it so.
> >
> > So, the pseudo-code for the PATCH socket would be something like:
> >
> > on_send(request):
> >     for each socket in out_:
> >         send(socket, request);
> >         push(wait_queue, socket);
> >     end.
> >     while not empty?(wait_queue):
> >         // wait for the next reply from any pending peer
> >         socket := poll(wait_queue, POLLIN);
> >         reply := recv(socket);
> >         // RCVEND: this peer has sent its last reply
> >         if getsockopt(socket, RCVEND):
> >             pop(wait_queue, socket);
> >         end.
> >         flags := 0;
> >         if getsockopt(socket, RCVMORE):
> >             flags := SNDMORE;
> >         end.
> >         // no peers left to wait for: flag the end of the reply set
> >         if empty?(wait_queue):
> >             flags := flags | SNDEND;
> >         end.
> >         send(_in, reply, flags);
> >     end.
> > end.
> >
> > The policy for a P-REP socket would be to always set the SNDEND flag on
> > the last frame. For the P-REQ, it would be to read all messages until the
> > SNDEND flag is received.
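> >
> > On the P-REQ side, the user code could then look something like this
> > (same sketch style as the SURVEYOR example above; P_REQ and RCVEND are
> > only the proposed names, nothing that exists today):
> >
> > s = socket (P_REQ);
> > // create a request...
> > zmq_send (s, request, 0);
> > while (true) {
> >     zmq_msg_t reply;
> >     zmq_recv (s, &reply, 0);
> >     // process reply here...
> >     // RCVEND is set on the last reply of the set (mirrors RCVMORE)
> >     if (getsockopt (s, RCVEND))
> >         break;
> > }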
> >
> > The first problems I see with this approach:
> > 1- It doesn't handle the "no out_ socket" case. That can be fixed by
> > sending a special "END_REPLY" message that would be dropped by the next
> > PATCH or P-REQ socket upstream (which also means that it must contain
> > the full address stack of the reply and be distinguishable from any
> > other reply).
> >
> > 2- It locks the PATCH socket until all replies have come back. Maybe
> > it's better this way, given that a single request can generate 1000
> > replies, but it can also completely lock down the full tree if one
> > socket downstream fails to answer. In this case, setting a maximum
> > timeout (in the poll call above) is the only viable solution (or could
> > we handle it with a HWM only, dropping the current wait_queue before
> > dropping any message?).
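> >
> > At the application level, the same effect can already be approximated
> > with a plain zmq_poll() timeout around the recv loop (a rough sketch;
> > with 0MQ 2.x the timeout is given in microseconds):
> >
> > zmq_pollitem_t item = { s, 0, ZMQ_POLLIN, 0 };
> > while (true) {
> >     int rc = zmq_poll (&item, 1, 10 * 1000000);   // wait at most 10 s
> >     if (rc == 0)
> >         break;          // deadline expired, give up on late replies
> >     zmq_msg_t reply;
> >     zmq_recv (s, &reply, 0);
> >     // process reply here...
> > }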
> >
> > Hope this helps a little bit,
> >
> > Fabien