[zeromq-dev] Important: backward incompatible changes for 0MQ/3.0!
Martin Sustrik
sustrik at 250bpm.com
Sun Apr 3 10:58:12 CEST 2011
Hi Paul,
>> I would say the question is how can we improve reliability of 0mq (NB:
>> not make it perfect, just improve it) without dragging all this madness in.
>
> That was exactly my intention. Maybe I was not clear about that. I'm thinking
> about an API similar to POSIX shutdown. First we call:
>
> zmq_shutdown(sock, SHUT_RD)
Ok. Couple of points:
1. Your proposal doesn't map to POSIX shutdown semantics. POSIX shutdown
is a non-blocking operation, i.e. it initiates a half-close and returns
immediately. After that, no more messages can be read or written in the
direction that was shut down.
2. The handshake with all the peers during the shutdown can take an
arbitrarily long time and can even cause a deadlock. This kind of thing is
extremely vulnerable to DoS attacks.
3. Note that the intention here is to improve the reliability rather
than make 0MQ "reliable". See previous email for the detailed
discussion. Given that we are only trying to provide some heuristic,
semi-reliable behaviour, it should not change the API in any way.
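To illustrate point 1, here is a minimal sketch of the POSIX half-close behaviour using plain stream sockets from Python's standard library (not 0MQ). The function name half_close_demo is made up for this example. It assumes a platform where socket.socketpair() is available. The shutdown call returns at once, without any handshake with the peer:

```python
import socket


def half_close_demo():
    """Show that shutdown() is an immediate, local half-close."""
    a, b = socket.socketpair()
    a.sendall(b"last message")

    # Returns immediately; no handshake with the peer takes place.
    a.shutdown(socket.SHUT_WR)

    # Writing on the closed direction now fails.
    try:
        a.sendall(b"too late")
        write_failed = False
    except OSError:
        write_failed = True

    data = b.recv(1024)  # data queued before the shutdown is still delivered
    eof = b.recv(1024)   # then the peer sees end-of-stream (empty bytes)

    # The opposite direction is unaffected by the half-close.
    b.sendall(b"reply")
    reply = a.recv(1024)

    a.close()
    b.close()
    return data, eof, write_failed, reply
```

Nothing here waits for the peer to acknowledge anything, which is the difference from a shutdown that performs a handshake with every peer.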
> Probably when we add sentinel messages we can make PUB/SUB more
> reliable. When the connection from a publisher is closed unexpectedly we
> can send the application an EIO error (or whatever we choose). For tcp we
> know when the connection is broken, for ipc it is broken only on application
> crash and we also know it, for pgm we have a retry timeout. Also we have to
> inject this kind of message when the queue is full and we lose some
> messages. This way you don't need to count messages to know when
> to die if the message stream is broken (and don't need to duplicate complex
> bookkeeping when there are several publishers). For devices it's up
> to the application what to do with the error. It has to forward it as some
> application-specific message if it needs to.
The problem here is that PUB/SUB allows for multiple publishers. Thus
numbering the messages wouldn't help. The real solution AFAICS is
breaking the pub/sub pattern into two distinct patterns: true pub/sub
with a single stream of messages (numbering makes sense here) and
"aggregation" where streams from multiple publishers are aggregated as
they are forwarded towards the consumer (no point in numbering).
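A small sketch of why numbering only makes sense for a single stream. The helper missing_ranges is hypothetical and not part of 0MQ; it assumes each publisher numbers its messages 1, 2, 3, ... and that the subscriber sees only the numbers, not who sent them:

```python
def missing_ranges(seqs, first=1):
    """Return (first_missing, last_missing) pairs for a numbered stream.

    Numbers below the next expected one are ignored, since a subscriber
    cannot tell a duplicate from another publisher's message.
    """
    expected = first
    gaps = []
    for seq in seqs:
        if seq > expected:
            gaps.append((expected, seq - 1))
        if seq >= expected:
            expected = seq + 1
    return gaps


# Single publisher: messages 4, 7 and 8 were lost and the gaps are visible.
single = [1, 2, 3, 5, 6, 9]

# Two publishers: A sent 1, 2, 3 and B sent 1, 2, 3, but B's message 2 was
# lost. Interleaved at the subscriber, the numbers hide the loss.
merged = [1, 1, 2, 3, 3]
```

With these inputs, missing_ranges(single) reports [(4, 4), (7, 8)], while missing_ranges(merged) reports no gaps at all even though a message was dropped.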
> BTW, it's much more important than repeating requests in the REQ socket,
> since the latter can easily be done in user code. Well, actually it forces
> me to always use XREQ/XREP sockets, which suggests that REQ sockets are
> probably useless for any realistic application. So probably for the blocking
> use case we need some option like ZMQ_RESET, to allow requesting
> again.
Yes. The reset seems to be a good idea. Use cases:
1. On REQ socket: I am not interested in this reply any more. Cancel the
request and start a new one.
2. On REP socket: The request I've got is malformed and possibly
malevolent. Drop the request without even responding to the requester.
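The two use cases can be sketched with a toy model of the strict send/recv alternation on REQ and REP sockets. This is not the 0MQ API: LockstepModel, StateError and reset() are names made up for this example, and ZMQ_RESET exists only as a proposal in this thread.

```python
class StateError(Exception):
    """Raised when send/recv is called out of turn."""


class LockstepModel:
    """Toy model of a socket that must alternate send and recv."""

    def __init__(self, first_op):
        if first_op not in ("send", "recv"):
            raise ValueError("first_op must be 'send' or 'recv'")
        self.first_op = first_op  # "send" models REQ, "recv" models REP
        self.next_op = first_op

    def _step(self, op):
        if op != self.next_op:
            raise StateError("expected %s, got %s" % (self.next_op, op))
        self.next_op = "recv" if op == "send" else "send"

    def send(self):
        self._step("send")

    def recv(self):
        self._step("recv")

    def reset(self):
        # Abandon the exchange in progress and return to the initial state.
        self.next_op = self.first_op


def can_send(sock):
    """Return True if send() is currently allowed on the model."""
    try:
        sock.send()
        return True
    except StateError:
        return False


# Use case 1 (REQ): give up on the pending reply and issue a new request.
req = LockstepModel("send")
req.send()
blocked = can_send(req)        # a second request is refused
req.reset()
allowed = can_send(req)        # after the reset a new request goes out

# Use case 2 (REP): drop a malformed request without replying.
rep = LockstepModel("recv")
rep.recv()
rep.reset()
rep.recv()                     # the next request can be received directly
```

The point of the model is that without reset() the only way out of a stuck exchange is to complete it, which is what pushes people towards XREQ/XREP.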
Martin