[zeromq-dev] [PATCH] Scalability improvements for large amounts of connections

Matt Weinstein matt_weinstein at yahoo.com
Thu Oct 14 12:35:10 CEST 2010

I've given this a bit of thought.

IMO the problem lies more with "unintended" migration.

Two cases come to mind:

1 - accidental_tourist - user doesn't understand the single-thread
limitation, and blindly plows ahead.
This is very common in 2.0.X usage, and causes an immediate failure.
However, in 2.1.X versions this will cause difficult-to-diagnose
ypipe-related errors, since there's no sense of thread "ownership".

2 - broken_barrier - user thinks they understand migration, but doesn't,
or slightly misimplements it.
This one is a new case, which will again result in ypipe-related  
issues in heavily multithreaded scenarios which work fine with e.g.  

Both of these will be REALLY hard to diagnose.  When you start seeing  
the bug reports from these, esp. as your uptake rate increases, you  
may want to have the ability to say one of two things:

1- You _must_ use ZMQ_MIGRATE, in which case we will ensure barriers  
and threads are correct; or
2- You _may_ use ZMQ_MIGRATE, in which case we will ensure barriers  
and threads are correct.

Case 2 is more interesting, since it gives you the option of
recommending an optional solution, i.e. if you use ZMQ_MIGRATE, you
get some additional "correctness" services.  Users may not realize
they're migrating in the first place, so you can have them start
inserting these before zmq calls until the problem is obvious...

If you _don't_ have the thread checking capability, someone will have  
to implement it IMO.

Also, unless I'm mistaken, it would merely be a matter of setting a  
pthread_t (or other system specific identifier) to a non-zero value as  
part of a ZMQ_MIGRATE, and only doing the related check if that were  

This would also allow the client to drop the ZMQ_MIGRATE call in  
exchange for a barrier once they were sure the code worked, e.g.

#ifdef	FASTER
	inline void migrate() { do_barrier(); }
#else
	inline void migrate() { do_setsockopt(); }
#endif

And you'd unroll the code with unlikely(thread_id), etc.
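
To make the ownership check concrete, here is a rough sketch in C of what I have in mind. Everything in it (sock_guard_t, sock_guard_migrate, sock_guard_ok) is a hypothetical name of mine, not anything in the 0MQ source. It also assumes a separate flag rather than a "non-zero" pthread_t, since pthread_t is not guaranteed to be an integer type. The memory barrier itself is left out; this only shows the thread check.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Branch hint; falls back to a no-op on compilers without the builtin. */
#if defined(__GNUC__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif

/* Hypothetical per-socket ownership record (not part of the 0MQ API). */
typedef struct {
    bool      checked;  /* false until the first migrate, so checks are skipped */
    pthread_t owner;    /* only meaningful once checked is true */
} sock_guard_t;

/* Start with checking disabled: existing code pays one predictable branch. */
static void sock_guard_init(sock_guard_t *g)
{
    assert(g != NULL);
    g->checked = false;
}

/* Called by the thread that takes over the socket, i.e. what a
 * ZMQ_MIGRATE-style call would do in addition to the barrier. */
static void sock_guard_migrate(sock_guard_t *g)
{
    g->owner = pthread_self();
    g->checked = true;
}

/* Called at the top of each socket operation.
 * Returns true if the calling thread is allowed to use the socket. */
static bool sock_guard_ok(const sock_guard_t *g)
{
    if (unlikely(g->checked))
        return pthread_equal(g->owner, pthread_self()) != 0;
    return true;  /* no migrate seen yet, nothing to verify */
}
```

A socket call would then begin with something like "if (!sock_guard_ok(&guard)) fail loudly", which turns a ypipe corruption into an immediate, explainable error for anyone who opted in.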

Please let me know what you think...



On Oct 12, 2010, at 7:06 AM, Pieter Hintjens wrote:

> Thanks, Martin, this makes it clearer.
> On Tue, Oct 12, 2010 at 10:37 AM, Martin Sustrik  
> <sustrik at 250bpm.com> wrote:
>> On 10/12/2010 10:29 AM, Pieter Hintjens wrote:
>>> On Mon, Oct 11, 2010 at 10:51 PM, Martin Sustrik<sustrik at 250bpm.com>
>>>  wrote:
>>>> If you are using socket from multiple threads you are abusing the
>>>> library and you fully deserve your application crashing. Muahaha!
>>> I'm not sure "0MQ supports socket migration but if you use it your
>>> apps will crash, Muahaha" is a compelling proposition.
>>> Even if the functionality only exists for specific cases such as the
>>> Erlang binding, it should be clearly documented and explicit.
>>> Well, you can try the "muahaha" approach but IMO it'll just result
>>> in confusion and complaints.
>> You should *not* use socket from multiple threads in parallel. That's
>> clearly documented and if you do so, you'll crash. To be serious, you
>> can't avoid that unless you remove the lockfree algorithms. It worked
>> that way till now and it still works that way.
>> What I am saying is there's no need for explicit socket migration,
>> because basically any scenario of migration you can imagine already
>> ensures the memory barrier.
>> If it does not, you are doing something very strange (probably using
>> some kind of lock-free signaling mechanism) and in that case you are
>> an expert and you understand well what "you have to ensure full
>> memory barrier" means.
>> Martin
> -- 
> -
> Pieter Hintjens
> iMatix - www.imatix.com
