[zeromq-dev] weirdness

Andrew Hume andrew at research.att.com
Wed Feb 2 17:08:30 CET 2011


this has been resolved, and zeromq is blameless!
i added much more logging to help investigate and found the problem.
namely, the server in question had its date set wrong!
(it was 5 mins behind the other servers.)
this meant that its heartbeats were being processed but immediately discarded
as stale, so the process looked as if it had died (or rather, timed out).
if the date delta had been smaller, i wouldn't have timed them out,
and if it had been larger, i would have noticed just mucking around on the system.
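
for anyone hitting something similar, here is a minimal sketch of the failure mode.
it is not the actual code from this system: the function and class names, the
60-second timeout, and the idea that each heartbeat carries the sender's wall-clock
timestamp are all assumptions made for illustration. the first check compares the
sender's stamp against the receiver's clock, so a sender running 5 minutes behind
always looks timed out; the second records arrival time on the receiver's own
monotonic clock, which sidesteps the skew.

```python
import time

TIMEOUT_S = 60.0  # hypothetical timeout; the real value is not stated in this thread


def is_stale_by_sender_clock(sent_at, now, timeout=TIMEOUT_S):
    """Skew-sensitive check: compares the sender's wall-clock stamp with ours."""
    return (now - sent_at) > timeout


class PeerTracker:
    """Skew-tolerant alternative: remember when each heartbeat arrived locally."""

    def __init__(self, timeout=TIMEOUT_S, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock      # injectable so the logic can be tested
        self.last_seen = {}     # peer name -> local arrival time

    def heartbeat(self, peer):
        # Ignore any timestamp inside the message; use the receiver's clock only.
        self.last_seen[peer] = self.clock()

    def is_alive(self, peer):
        seen = self.last_seen.get(peer)
        return seen is not None and (self.clock() - seen) <= self.timeout


# A heartbeat sent "just now" by a server whose clock is 5 minutes behind.
receiver_now = 1_000_000.0
fresh_but_skewed = receiver_now - 5 * 60
print(is_stale_by_sender_clock(fresh_but_skewed, receiver_now))  # True
```

with the second approach a peer is only declared dead when no heartbeat has
arrived within the timeout as measured locally, whatever the sender's date says.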

thanks to those who responded, in any case. i will work on getting tcpdump
stuff working for the next time.

one of my steps in investigating was running stuff under valgrind, and of course,
zeromq internals generate quite a few (hopefully harmless) valgrind errors.
is there a plan to eliminate these errors?

------------------
Andrew Hume  (best -> Telework) +1 623-551-2845
andrew at research.att.com  (Work) +1 none currently
AT&T Labs - Research; member of USENIX and LOPSA



