[zeromq-dev] Erlang Ports and zmq_poll

Martin Sustrik sustrik at moloch.sk
Mon Jul 26 14:53:31 CEST 2010


> BTW, if you want to flesh out some details of that migration with the
> list I'm sure many of us will be able to provide some useful feedback.

The migration work is already done. Here's some background:

Originally, OS threads communicated using a socketpair (the signaler_t 
class). There was one socketpair between each pair of intercommunicating 
threads. However, a single application thread can own multiple sockets, 
so communication with all the sockets belonging to the same application 
thread went over a single socketpair.

When a socket was migrated to a new application thread, the I/O threads 
had no idea this had happened and kept using the original socketpair to 
communicate with it. Obviously, the communication went to the wrong 
application thread...

What was done in the sustrik/zeromq2 branch is that there's now a 
separate socketpair for each _socket_ rather than for each thread. Thus, 
even when a socket is migrated to a different thread, the communication 
channel (the socketpair) migrates with it. The only requirement is that 
the user performs a full memory barrier after migrating the socket, so 
that CPU caches are synchronised in case socket ownership moved to a 
different CPU core.

It seems easy, but there is a problem. When a socket is closed, a final 
handshake has to be performed to deallocate the resources (queues) 
shared with other threads. zmq_close cannot block until the handshake 
finishes, because the peer may be another 0MQ socket (via the inproc 
transport). In that case the handshake happens between two application 
threads. If the peer application thread doesn't call a 0MQ function for 
an hour, the handshake cannot complete for an hour and zmq_close would 
block for an hour.

Thus, zmq_close has to return immediately and leave the socket in a 
"zombie" state. Note that the change made in sustrik/zeromq2 
disassociates a 0MQ socket from any particular application thread. Given 
that, which thread is going to finish the handshake after the socket 
becomes a zombie? A zombie is an orphan; nobody is responsible for it.

The current solution is that the zmq_socket function (which is not bound 
to any particular thread) tries to dezombify the environment before 
doing anything else. Dezombification is placed in a critical section so 
that parallel calls to zmq_socket won't corrupt memory.

Now zmq_term enters the picture. When 0MQ is terminated, all zombies 
need to be deallocated. The whole algorithm becomes pretty complex and 
extremely fragile.

This is the status quo of sustrik/zeromq2.

The shutdown system has to be thoroughly examined and fixed.

Alongside that, there's another plan mingled with the one above. You may 
recall that people often complain that "sent messages are lost when I 
terminate the application" and ask "Is there a way to ensure my messages 
are passed to the wire before my app exits?"

The problem is caused by the fact that, so far, zmq_close tore down the 
whole infrastructure associated with the socket -- including any pending 
outgoing messages. Now that we have "zombie" sockets, there's a 
possibility of delaying the actual socket deallocation until all 
outgoing messages are written to the wire.

The two features above (migration and flushing pending messages) are 
quite distinct; however, they have to be done in a single step. The 
reason is that the shutdown subsystem is so complex and costly to get 
right that refactoring it twice would simply be too expensive.

In short: migration is done, there are some bugs in the shutdown 
mechanism, and message flushing is still to be done.

Hope this helps.
