[zeromq-dev] assert fail 3.1.1 (master)

john skaller skaller at users.sourceforge.net
Mon Jan 30 02:30:52 CET 2012


[tasksink3] Process Monitoring pthread for test/zmq/tasksink2 start at 1327882691
[tasksink3] Process 1644 created for program test/zmq/tasksink2
Total expected cost: 0 msec
.:...................:...................:...................:....................:..................:...................:...................:...................:...................:..................Total elapsed time: 1000 msec
Assertion failed: size > 0 && (*data == 0 || *data == 1) (xpub.cpp:66)
[tasksink3] Exit 1644 status 6
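
The asserted condition looks like the check a publisher-side socket applies to an incoming subscription frame: as far as I know, in the 3.x wire format a subscriber sends a first byte of 1 to subscribe or 0 to unsubscribe, followed by the topic. A minimal C++ sketch of that predicate, where looks_like_subscription is a hypothetical name and not a libzmq function:

```cpp
#include <cstddef>

// Mirrors the condition quoted in the failed assertion (xpub.cpp:66):
//   size > 0 && (*data == 0 || *data == 1)
// Hypothetical helper for illustration only; not part of libzmq.
bool looks_like_subscription(const unsigned char *data, std::size_t size)
{
    // The size test comes first, so data is never read when size is 0.
    return size > 0 && (data[0] == 0 || data[0] == 1);
}
```

Under this reading, any frame arriving at the PUB socket whose first byte is neither 0 nor 1 (the bytes of "KILL", for example) would fail this predicate.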

The Felix code:
////////////////////////
//  Socket to receive messages on
var receiver = context.mk_socket ZMQ_PULL;
receiver.bind "tcp://*:5558";

//  Socket for worker control
var controller = context.mk_socket ZMQ_PUB;
controller.bind "tcp://*:5559";

//  Wait for start of batch
C_hack::ignore$ receiver.recv_string;

//  Start our clock now
start_time := #Time::time;

//  Process 100 confirmations
for var task_nbr in 0 upto 99 do
  C_hack::ignore$ receiver.recv_string;
  print if (task_nbr / 10) * 10 == task_nbr then ":" else "." endif;
  fflush (stdout);
done
println$ f"Total elapsed time: %d msec"$ 
  (#Time::time - start_time).int * 1000;

//  Send kill signal to workers
controller.send_string "KILL";

//  Finished
Faio::sleep (sys_clock, 0.001); //  Give 0MQ time to deliver

receiver.close;
controller.close;
context.term;
///////////////////

suggests the problem is in the send_string "KILL" call. Here is send_string:

  proc send_string (s:zmq_socket) (m:string) => 
     int_to_proc$ zmq_send(s,m.cstr.address,m.len,ZMQ_XMIT_OPTIONS_NONE);

where the Felix binding to C is given by:

  gen zmq_send : zmq_socket * address * size * zmq_xmit_options_t -> int;

i.e. this is a direct call to zmq_send without any glue. m.cstr is C++ m->c_str(),
which returns a pointer to the string's char array. So note: no message objects here,
just sending raw bytes.
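
To make the "raw bytes" point concrete, here is a small C++ sketch of the (pointer, length) pair such a binding would pass on. payload_of is a hypothetical helper, not part of Felix or libzmq, and it assumes the Felix string maps onto a std::string:

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: the buffer and length that a binding like
// send_string would hand to zmq_send for a given string.
std::pair<const char *, std::size_t> payload_of(const std::string &m)
{
    // c_str() points at the characters; size() excludes the trailing NUL,
    // so "KILL" is described as exactly 4 bytes.
    return std::make_pair(m.c_str(), m.size());
}
```

The pointer is only valid while the string is alive and unmodified, so the send has to finish with the buffer before the string goes away.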

Now, I have no idea if this is related but I have a segfault in another program:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0xfffffffffffffff8
0x00007fff84b1d0aa in std::string::_Rep::_M_grab ()

(gdb) bt
#0  0x00007fff84b1d0aa in std::string::_Rep::_M_grab ()
#1  0x00007fff84b1d171 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string ()
#2  0x00000001000021ed in flxusr::launch::_i18245_f18245_iterator__apos_2::apply ()
#3  0x00000001000055e7 in flxusr::launch::launch_managed_processes::resume ()
#4  0x000000010001bea5 in flx::rtl::fthread_t::run ()
#5  0x000000010001c03e in flx::run::sync_state_t::frun ()
#6  0x00000001000143ed in doflx ()
#7  0x0000000100014b19 in run_felix_pthread ()
#8  0x0000000100014db3 in run_felix ()
#9  0x0000000100015534 in main ()

which suggests a bug in the OS X GNU C++ standard library. However, this one is sending
strings across thread boundaries via a monitor (which should be safe, since it uses locks).

On my box, C++ strings are a single machine word. I have no idea how this works.
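
One possible explanation, sketched under assumptions: the older copy-on-write std::string in GNU libstdc++ stores only a pointer to the character data, with a header (length, capacity, reference count) placed immediately before that data. The model below assumes a 64-bit LP64 target and uses illustrative names (RepModel, refcount_offset_from_data); the real type is std::string::_Rep.

```cpp
#include <cstddef>

// Simplified model of the header kept just before the character data
// by the copy-on-write libstdc++ string. Field names are illustrative.
struct RepModel {
    std::size_t length;
    std::size_t capacity;
    int refcount;
};

// Offset of the reference count relative to the data pointer stored in
// the string object. Negative, because the header precedes the data.
std::ptrdiff_t refcount_offset_from_data()
{
    return static_cast<std::ptrdiff_t>(offsetof(RepModel, refcount))
         - static_cast<std::ptrdiff_t>(sizeof(RepModel));
}
```

On LP64 this offset is -8. If the data pointer were null, reading the reference count would touch address 0 - 8, which is 0xfffffffffffffff8 as an unsigned 64-bit value, the same as the faulting address in the trace. That would be consistent with copying a string object that was zeroed or never constructed, though this is only a guess.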

--
john skaller
skaller at users.sourceforge.net






