[zeromq-dev] Edge-triggered polling vs Level-triggered. Which one ZMQ is using? Why?

Goswin von Brederlow goswin-v-b at web.de
Mon Jul 21 20:23:43 CEST 2014

On Mon, Jul 14, 2014 at 08:42:14AM -0700, Michel Pelletier wrote:
> There are some discussions here:
> http://lwn.net/Articles/25137/
> http://www.kegel.com/c10k.html
> A quick scan of nanomsg source indicates it can use epoll, kqueue, or poll.
>  The first two are edge-triggered and the third is level-triggered.

Actually epoll supports both level and edge trigger. Level trigger is
the default; you get edge trigger by setting EPOLLET.

>  nanomsg appears in the first two cases to keep an internal array of events
> (self->events) to emulate level triggering (I could be wrong about that as
> I am not a nanomsg dev).
> As for one vs the other, I think it's a non-issue.  Neither one is right or
> wrong, correct code can be written for either case.  If you are doing
> event-driven programming you are either already aware of the issues of the
> underlying api (epoll/kqueue/poll) or will have to come to understand them
> eventually.
> -Michel

I think it is a big issue.

On read, level trigger is better: if you didn't consume all the input,
the next zmq_poll() must not block. With edge trigger it's too easy to
accidentally run into exactly that. Further, you actually do want to
consume only some input from each socket and then poll again. The
reason for that is so that you can round-robin all sockets fairly, e.g.
consume one message from each socket and then poll again. If you
instead consume as much as possible from every socket, then one socket
can be flooded with messages and starve the other sockets. So level
trigger on read is a must for me.
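
A minimal sketch of that round-robin read loop, using Python's
select.poll (level triggered) on plain socket pairs rather than zmq
sockets. The function name drain_fairly, the fixed 4-byte message size
and the zero timeout are assumptions made up for the illustration; a
real loop would block in poll() instead of stopping when nothing is
ready.

```python
import select
import socket


def drain_fairly(socks, msg_size=4):
    """Read at most one fixed-size message per ready socket per poll() round.

    Relies on level triggering: data left unread on a socket is reported
    again by the next poll() call, so nothing is lost by reading only a bit.
    """
    poller = select.poll()
    by_fd = {s.fileno(): s for s in socks}
    for s in socks:
        poller.register(s, select.POLLIN)

    order = []  # (socket index, message) in the order messages were consumed
    while True:
        ready = poller.poll(0)  # timeout 0 so the demo ends when all is read
        if not ready:
            break
        for fd, _events in ready:
            sock = by_fd[fd]
            order.append((socks.index(sock), sock.recv(msg_size)))
    return order


# Socket 0 is "flooded" with three messages, socket 1 has only one.
a_w, a_r = socket.socketpair()
b_w, b_r = socket.socketpair()
a_w.sendall(b"a001a002a003")
b_w.sendall(b"b001")

order = drain_fairly([a_r, b_r])
for index, message in order:
    print(index, message)

for s in (a_w, a_r, b_w, b_r):
    s.close()
```

The single message on the second socket is consumed in the first round,
before the second and third message of the flooded socket. With a
"read until empty" loop per socket it could be served only after the
flooded socket was fully drained.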

Now on write the opposite holds. Normally you write to the socket and
the data goes directly into the buffer/queue. But once in a while you
write too much and the socket returns EAGAIN. Then you keep the
message in some app-internal buffer and wait for POLLOUT. When you
get a POLLOUT you write the backlog. With level trigger you have to
start waiting for POLLOUT every time your buffer/queue runs full and
stop again once the backlog is cleared. Otherwise POLLOUT would get
triggered all the time, even when there is no backlog. With edge
trigger, on the other hand, you simply add POLLOUT once and make sure
to write as much backlog as possible whenever it gets triggered. In
rare cases it might get triggered with no backlog, but at most once
each time the backlog is cleared [this happens when the backlog exactly
fills the buffer].
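
To make the difference on the write side concrete, here is a small
Linux-only sketch using Python's select.epoll. It registers an idle,
writable socket for EPOLLOUT, once level triggered and once with
EPOLLET, and counts how often it gets reported while there is nothing
to write. The helper name count_writable_reports and the number of
rounds are made up for the illustration.

```python
import select
import socket


def count_writable_reports(flags, rounds=5):
    """Poll an idle, writable socket `rounds` times; count EPOLLOUT reports."""
    left, right = socket.socketpair()
    ep = select.epoll()  # Linux only
    try:
        ep.register(left.fileno(), flags)
        hits = 0
        for _ in range(rounds):
            events = ep.poll(0)  # timeout 0: never block in this demo
            hits += sum(1 for _fd, ev in events if ev & select.EPOLLOUT)
        return hits
    finally:
        ep.close()
        left.close()
        right.close()


# Level triggered (the epoll default): reported for as long as it is writable.
level = count_writable_reports(select.EPOLLOUT)
# Edge triggered: reported when it becomes ready, then silent until a new edge.
edge = count_writable_reports(select.EPOLLOUT | select.EPOLLET)
print("level:", level, "edge:", edge)
```

With level trigger every one of the five polls reports the socket, which
is why you have to unregister POLLOUT while there is no backlog. With
edge trigger only the first poll reports it.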

So in summary the best behaviour would be (if you can't select it):

read - level triggered
write - edge triggered

> On Mon, Jul 14, 2014 at 4:52 AM, artemv zmq <artemv.zmq at gmail.com> wrote:
> > Hi Pieter
> >
> > Not sure if it is related or not, but I experience an issue with using
> > ZMQ.Poller and java socket (SelectableChannel). I.e. poller api itself
> > allows me to register SelectableChannel on poller (for POLLIN),
> > but combination isn't workable -- poller does render presence of incoming
> > traffic only once.. ((
> >
> >
> > 2014-07-14 14:31 GMT+03:00 Pieter Hintjens <ph at imatix.com>:
> >
> > This was a design decision from very long ago, 2009 or so, and there
> >> was no real discussion of it. I've always assumed if it was really a
> >> problem in libzmq, someone would have changed that by now.
> >>
> >> On Mon, Jul 14, 2014 at 11:50 AM, artemv zmq <artemv.zmq at gmail.com>
> >> wrote:
> >> > Hi community
> >> >
> >> > Did read nanomsg' docs, the part where they have been explaining
> >> differences
> >> > against zmq. They mentioned that they use level-triggered polling
> >> instead of
> >> > edge-triggered one as it's in zmq.
> >> > Since I'm not expert in this low-level stuff, but I'm still very
> >> curious why
> >> > zeromq decided to go edge-triggered way instead of level-triggered one.
> >> >
> >> > Thanks in advance.

I'm pretty sure zmq uses level trigger, or my code would surely
deadlock all the time. I'm also pretty certain I saw that specified in
the docs, but now all I see is a mention that zmq_poll() behaves like
the system's poll(), which is level triggered.

