[zeromq-dev] control messages appearing in application message reads

Marcin Romaszewicz marcin at brkt.com
Thu Jul 9 18:10:59 CEST 2015


Huh, that's odd. I hadn't looked at your code too closely until now, and
yeah, given that you added the filtering in session_base, it would appear I
should see these everywhere! I'm currently fixing a giant mess of many
versions of ZMQ in the cloud, and it's hard to keep track of it all.
Looking at your code, I doubt I could produce a C test case which does what
I described. I made a mistake, and the fact that I'm seeing the heartbeats
on NetBSD is a coincidence. The NetBSD hosts also connect via a different
path than most hosts.
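
For the old peers that do hand the PING frame to the application, a guard
in front of the protobuf deserialization could drop it. This is only a
sketch: is_stray_heartbeat and application_messages are hypothetical names,
not libzmq or pyzmq API, and it assumes the leaked heartbeat shows up as the
last frame of the multipart message and starts with the \4PING bytes from my
original mail.

```python
# Hypothetical application-level guard; not part of libzmq or pyzmq.
PING_PREFIX = b"\x04PING"  # heartbeat payload prefix as described in the thread


def is_stray_heartbeat(frames):
    """Return True if a received multipart message looks like a leaked PING."""
    # Assumption: the leaked heartbeat is the final frame of the message.
    # An application payload that happens to start with these bytes would
    # also be dropped, so this is a stopgap rather than a real fix.
    return bool(frames) and bytes(frames[-1]).startswith(PING_PREFIX)


def application_messages(messages):
    """Yield only the messages that should go on to deserialization."""
    for frames in messages:
        if not is_stray_heartbeat(frames):
            yield frames
```

Each list returned by recv_multipart() would pass through
is_stray_heartbeat before being handed to the proto parser.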

It's too bad that the timers are started from
stream_engine_t::produce_ping_message. Any received packet already cancels
them, so it would be nice to be able to turn on the timer from the app
level and have the timeout actually decoupled from the heartbeats. I might
make that change locally. If I had the timeout timer without the pings, it
would solve my problem talking to old versions of ZMQ.
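
To illustrate the "timeout without pings" idea outside of libzmq, here is a
rough application-level approximation. It does not touch stream_engine_t at
all; IdleTimeout is a hypothetical helper, and it assumes the application
can identify the peer for each message it receives (for example from the
identity frame on a router socket). What to do with an expired peer is left
open.

```python
import time


class IdleTimeout:
    """Hypothetical helper: reports peers that have been silent too long."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout    # seconds of silence before a peer expires
        self.clock = clock        # injectable so the logic can be tested
        self.last_seen = {}       # peer id -> time of last received message

    def touch(self, peer):
        # Call for every message received from `peer`; any traffic resets
        # the timer, so no dedicated ping is required.
        self.last_seen[peer] = self.clock()

    def expired(self):
        # Peers whose last message is older than the timeout.
        now = self.clock()
        return [p for p, t in self.last_seen.items() if now - t > self.timeout]

    def forget(self, peer):
        # Stop tracking a peer once the application has dealt with it.
        self.last_seen.pop(peer, None)
```

The receive loop would call touch() on each message and check expired()
whenever its poll times out.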

-- Marcin



On Wed, Jul 8, 2015 at 8:00 PM, Jonathan Reams <jbreams at gmail.com> wrote:

> I'm actually very surprised that you aren't having this issue on all
> platforms - when I wrote the heartbeats code I had to fix this exact
> problem in master, so I think this behavior is expected. The heartbeating
> code filters command messages, and so far as I can tell there isn't
> anything in earlier versions of zeromq to distinguish command messages or
> filter them out. Any chance you could come up with a C test case where the
> PING messages properly get filtered out on older versions of zeromq?
>
>
> On Wed, Jul 8, 2015 at 7:15 PM, Marcin Romaszewicz <marcin at brkt.com>
> wrote:
>
>> Hi All,
>>
>> I've been working for a while on avoiding file descriptor leaks, and
>> various tests were going well using jbreams' heartbeat protocol, but I've
>> hit a snag.
>>
>> I've got router sockets in servers running in AWS on zmq 4.2.0 (fresh
>> from your git repo), which are accepting connections from a whole lot of
>> hosts in AWS, from various operating systems and ZMQ versions. Heartbeats
>> are enabled and the heartbeat protocol sends messages with the msg::command
>> flag set with a payload of \4PING. On most of my hosts, the ZMQ sockets
>> consume these command messages and don't pass them up to the application
>> code.
>>
>> I'm running the Router sockets on Ubuntu 14 with a libzmq built off the
>> top of git as of a couple of days ago (4.2.0). This socket emits heartbeats
>> to other sockets on a whole bunch of operating systems and ZMQ versions.
>> These are potentially long running workloads used by others, so I have to
>> support a lot of versions in flight.
>>
>> I've got a bunch of hosts running in AWS on NetBSD 6.1 using
>> py27-zmq-14.4.1, which bundles zmq 4.0.5. On these particular hosts, the
>> heartbeats are being delivered to application code via
>> zmq.socket.recv_multipart() (which calls zmq_msg_recv). All of our messages
>> delivered through zmq are wrapped in an outer protocol buffer which tells
>> us what the contents are, so we just read the message and deserialize it as
>> a proto, but when we get a control message, this fails.
>>
>> On all of our Ubuntu hosts, zmq handles heartbeats just fine. We have
>> 3.2.2, 3.2.4, 4.1.2, and 4.0.4 out in the wild, and none of these pass
>> control messages up to application code. Some are running in Python, while
>> others run in C/C++; all work.
>>
>> However, on NetBSD 6.1, I've got three versions in the wild: 4.0.3,
>> 4.0.4, and 4.0.5 (all via py-zmq). All of these pass command messages up to
>> application code, breaking it.
>>
>> Any ideas on what could be going wrong? Could there be a NetBSD bug in
>> filtering out command messages?
>>
>> Thanks,
>> -- Marcin
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>
>>