[zeromq-dev] Hang in a receiving application

Ville Aine ville.aine at blackshear.fi
Fri Jun 26 17:17:01 CEST 2009


Hi,

Sometimes when we are running our application, the receiving side
appears to hang. Unfortunately, I cannot reproduce this behavior
intentionally.

In our setup, we have one process running on a Windows box sending
messages, and a single client on a Linux box receiving them. When the
client hangs, the server is pushing several thousand messages per
second, at a rate far higher than the client can process them. Usually
things work OK for several minutes, and then the client freezes.

After the hang, the situation looks like this:

  - Attaching to the frozen client with GDB shows two threads.
    The backtrace for the "main" thread looks like this:

        #0  0x00007f10e0c96b04 in __lll_lock_wait () from /lib/libpthread.so.0
        #1  0x00007f10e0c921a0 in _L_lock_102 () from /lib/libpthread.so.0
        #2  0x00007f10e0c91aae in pthread_mutex_lock () from /lib/libpthread.so.0
        #3  0x00007f10e0eba2fa in zmq::api_thread_t::receive () from /usr/lib/libzmq.so.0
        #4  0x00007f10e194cf31 in zmq_receive () from /usr/lib/libczmq.so.0

    The pthread_mutex_lock() call appears to be made from
    ysemaphore_t::wait(), which in turn is called from
    ypollset_t::poll().

    The other thread appears to be blocked on epoll_wait() called from
    epoll_t::process_events().
  
  - The TCP connection between the client and the server is still
    alive according to netstat, but tcpdump does not show
    any traffic.

  - According to our logs, the server keeps sending messages at its
    normal pace.

  - If a new client application is started, it will receive data
    normally, while the original application is still frozen.
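
For anyone who wants to gather the same kind of information from a
frozen process, the backtraces of all threads can be collected
non-interactively. The following is only a minimal sketch: it assumes
gdb is on the PATH and that the caller is allowed to attach to the
target process. The helper names build_gdb_command and dump_backtraces
are hypothetical and exist only for this illustration.

```python
import subprocess


def build_gdb_command(pid):
    """Build a gdb invocation that prints backtraces for all threads of pid."""
    # -p attaches to a running process, -batch exits after the commands run,
    # -ex executes one gdb command ("thread apply all bt" = every thread).
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_backtraces(pid, timeout=30):
    """Attach to pid, capture every thread's backtrace, and detach.

    Returns gdb's standard output as text. Assumes gdb is installed and
    the caller has permission to attach to the process.
    """
    result = subprocess.run(
        build_gdb_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

Calling dump_backtraces() with the PID of the hung client should yield
one backtrace per thread, which can then be attached to a report like
this one.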

We are using zmq 0.6.1 with dummy_locator_t from the dj branch, and a
slightly modified libczmq as described in my previous email to
zeromq-dev[1]. The sending side is running on 32-bit Windows Server 2003
R2, and the receiving side is running on 64-bit Ubuntu Intrepid with
the 2.6.27-22-xen kernel. Both operating systems are running under Xen.

Thanks,
Ville

1. http://lists.zeromq.org/pipermail/zeromq-dev/2009-June/000857.html


