[zeromq-dev] any hints?

Gerard Toonstra gtoonstra at gmail.com
Tue Oct 26 16:31:29 CEST 2010


Hi Andrew,

Given that the software on all nodes is identical and this fails on one node
only, I'd consider the possibility of an intermittent hardware problem. Have
you checked the memory with a memory tester and verified that the NIC works
correctly by running a stress test on it?
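
On the software side, one cheap thing to rule out is that errno is being
read too late. It is per-thread and any later library call can overwrite it,
so it has to be saved right after the failing call, in the same thread. Below
is a rough sketch of that pattern, not a diagnosis. To keep it buildable
without libzmq, open() on a missing path stands in for the failing
zmq_socket() call; probe_result, probe_worker, run_probe and the path are
names made up for illustration. In the real program you would wrap
zmq_socket() the same way and print zmq_strerror() of the saved value.

```c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

/* Result slot filled in by the worker thread (hypothetical helper type). */
struct probe_result {
    int failed;       /* 1 if the probed call failed, 0 otherwise */
    int saved_errno;  /* errno captured immediately after the failure */
};

/* Worker run in a non-main thread. The open() call is only a stand-in
 * for zmq_socket(); the path is assumed not to exist. */
static void *probe_worker(void *arg)
{
    struct probe_result *res = arg;
    int fd = open("/nonexistent/zmq-probe", O_RDONLY);

    if (fd < 0) {
        /* Save errno first: any later call may overwrite it. */
        res->saved_errno = errno;
        res->failed = 1;
    } else {
        res->saved_errno = 0;
        res->failed = 0;
        close(fd);
    }
    return NULL;
}

/* Runs the probe in a separate thread and waits for it.
 * Returns 0 if the thread was created and joined successfully. */
static int run_probe(struct probe_result *res)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, probe_worker, res);

    if (rc != 0)
        return rc;
    return pthread_join(tid, NULL);
}
```

Calling run_probe() from main and then printing strerror(res.saved_errno)
shows whether the error value survives the trip out of the worker thread.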

Rgds,

G>

On Tue, Oct 26, 2010 at 2:03 PM, Andrew Hume <andrew at research.att.com> wrote:

> i realise it's crass to ask for help in a complicated program, but i'd
> appreciate any hints on what to look at.
> i am running version 2.0.9 on a red hat enterprise 5.4 system
> (actually several nodes set up identically).
>
> the program (digest) has a main thread which opens several sockets
> successfully.
> it passes the zmq context to a few other threads, who then open one
> or two zmq sockets to work on.
>
> the problem is this: digest fails to work on exactly one system (nisus01)
> in the following way:
> any call to zmq_socket in a thread other than the main thread fails with no
> useful errno.
> (the types include ZMQ_SUB and ZMQ_PUSH.) the same binary and same libraries
> work on other nodes just fine. similar programs using zmq run just fine on
> nisus01.
>
> i've asked my sysadmin to reboot nisus01 to see if that changes anything.
> is there anything else to look at? i've run everything with valgrind to make
> sure i'm not stomping on memory. is there some non-thread-safe weakness with
> zmq_socket?
>
> andrew
>
> ------------------
> Andrew Hume  (best -> Telework) +1 623-551-2845
> andrew at research.att.com  (Work) +1 none currently
> AT&T Labs - Research; member of USENIX and LOPSA
>


-- 
Gerard Toonstra
-----------------------
http://www.radialmind.org