[zeromq-dev] [PATCH] Scalability improvements for large amounts of connections

Martin Sustrik sustrik at moloch.sk
Thu Oct 14 13:08:39 CEST 2010


Matt,

I think it's too much work with almost no benefit.

ZMQ_MIGRATE won't solve your "accidental tourist" scenario. It would 
fail and it would be hard to diagnose anyway.

As for the second scenario, it would require the user to do *really*
strange things to get there, such as using x86 cache coherence mechanisms
to notify the other thread about the migration while the latter is
busy-looping, checking a particular memory location. If the user does
that kind of thing, it's reasonable to expect that he understands what a
full memory barrier means.
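
For illustration, the busy-loop hand-off described above might look roughly
like the sketch below. It is written with C++11 atomics, which is an
assumption of this example rather than anything 0MQ itself uses. The names
fake_socket_t, handed_over, give_away, take_over and run_handoff are all
hypothetical; the fake socket only stands in for state that a single thread
may touch at a time.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for a 0MQ socket: plain, non-atomic state that
// only one thread may touch at any given time.
struct fake_socket_t {
    int messages_sent = 0;
};

// The memory location the receiving thread busy-loops on.
static std::atomic<bool> handed_over{false};

// Old owner: finish using the socket, issue a full barrier, then publish.
void give_away(fake_socket_t &s)
{
    s.messages_sent += 1;                                  // last use by the old owner
    std::atomic_thread_fence(std::memory_order_seq_cst);   // full memory barrier
    handed_over.store(true, std::memory_order_relaxed);    // notify the other thread
}

// New owner: spin until notified, issue a full barrier, then use the socket.
int take_over(fake_socket_t &s)
{
    while (!handed_over.load(std::memory_order_relaxed)) {
        // busy-loop, checking the flag
    }
    std::atomic_thread_fence(std::memory_order_seq_cst);   // full memory barrier
    s.messages_sent += 1;                                  // first use by the new owner
    return s.messages_sent;
}

// Runs one hand-off from the calling thread to a freshly started thread.
int run_handoff()
{
    handed_over.store(false);
    fake_socket_t s;
    int seen = 0;
    std::thread new_owner([&] { seen = take_over(s); });
    give_away(s);
    new_owner.join();
    return seen;
}
```

With the two fences in place the new owner is guaranteed to see the old
owner's last write, so run_handoff() returns 2. Dropping either fence leaves
a data race on messages_sent, which is the kind of mistake the
"broken_barrier" case is about.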

Anyway, if you feel you absolutely need the functionality, feel free to
implement it. It's going to be a non-trivial endeavour though.

Martin

> I've given this a bit of thought.
>
> IMO the problem lies more with "unintended" migration.
>
> Two cases come to mind:
>
> 1 - accidental_tourist - the user doesn't understand the single-thread
> limitation and blindly plows ahead.
> This is very common in 2.0.X usage, and causes an immediate failure.
> However, in 2.1.X versions this will cause difficult-to-diagnose
> ypipe-related errors, since there's no sense of thread "ownership".
>
> 2 - broken_barrier - the user thinks they understand migration but
> doesn't, or they slightly misimplement it.
> This one is a new case, which will again result in ypipe-related
> issues in heavily multithreaded scenarios that appear to work fine once
> e.g. printfs are added.
>
> Both of these will be REALLY hard to diagnose.  When you start seeing 
> the bug reports from these, esp. as your uptake rate increases, you 
> may want to have the ability to say one of two things:
>
> 1- You _must_ use ZMQ_MIGRATE, in which case we will ensure barriers 
> and threads are correct; or
> 2- You _may_ use ZMQ_MIGRATE, in which case we will ensure barriers 
> and threads are correct.
>
> Case 2 is more interesting, since it gives you the option of
> recommending an optional solution, i.e. if you use ZMQ_MIGRATE, you
> get some additional "correctness" services.  Users may not realize
> they're migrating in the first place, so you can have them start
> inserting these calls before zmq calls until the problem is obvious...
>
> If you _don't_ have the thread checking capability, someone will have 
> to implement it IMO.
>
> Also, unless I'm mistaken, it would merely be a matter of setting a 
> pthread_t (or other system specific identifier) to a non-zero value as 
> part of a ZMQ_MIGRATE, and only doing the related check if that were 
> non-zero.
>
> This would also allow the client to drop the ZMQ_MIGRATE call in 
> exchange for a barrier once they were sure the code worked, e.g.
>
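> #ifdef FASTER
>     /* Code known to work: a plain barrier is enough. */
>     inline void migrate() { do_barrier(); }
> #else
>     /* Otherwise go through ZMQ_MIGRATE and get the thread check. */
>     inline void migrate() { do_setsockopt(); }
> #endif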
>
> And you'd unroll the code with unlikely(thread_id), etc.
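
As a rough illustration of the check proposed above, a minimal C++ sketch
follows. It is not 0MQ code: checked_socket_t, migrate_to_current_thread,
owned_by_caller and check_from_other_thread are hypothetical names, and
std::thread::id replaces the raw pthread_t from the proposal because a
pthread_t cannot be portably compared with zero. A default-constructed id
plays the role of the "zero" value that disables the check.

```cpp
#include <atomic>
#include <thread>

// Branch hint as in the proposal; __builtin_expect is a GCC/Clang
// extension, so fall back to a plain expression elsewhere.
#if defined(__GNUC__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif

// Hypothetical wrapper around a socket; not part of the 0MQ API.
struct checked_socket_t {
    // A default-constructed id means "no owner recorded, checking disabled".
    std::atomic<std::thread::id> owner{std::thread::id()};

    // What a ZMQ_MIGRATE-style call might do: issue a full barrier and
    // record the calling thread as the new owner.
    void migrate_to_current_thread()
    {
        std::atomic_thread_fence(std::memory_order_seq_cst);
        owner.store(std::this_thread::get_id());
    }

    // Check that would sit at the top of every socket operation.
    bool owned_by_caller() const
    {
        const std::thread::id recorded = owner.load();
        if (unlikely(recorded != std::thread::id()))
            return recorded == std::this_thread::get_id();
        return true;   // no owner recorded, nothing to check
    }
};

// Helper for trying the check from a second thread.
bool check_from_other_thread(const checked_socket_t &s)
{
    bool ok = true;
    std::thread t([&] { ok = s.owned_by_caller(); });
    t.join();
    return ok;
}
```

Until migrate_to_current_thread() is called the check passes from any
thread; afterwards it passes only on the thread that made the call. The
sketch detects use from the wrong thread, but it does not by itself make
such use safe.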
>
> Please let me know what you think...
>
> Best,
>
> Matt
>
> On Oct 12, 2010, at 7:06 AM, Pieter Hintjens wrote:
>
>> Thanks, Martin, this makes it clearer.
>>
>> On Tue, Oct 12, 2010 at 10:37 AM, Martin Sustrik <sustrik at 250bpm.com> wrote:
>>> On 10/12/2010 10:29 AM, Pieter Hintjens wrote:
>>>>
>>>> On Mon, Oct 11, 2010 at 10:51 PM, Martin Sustrik <sustrik at 250bpm.com> wrote:
>>>>
>>>>> If you are using a socket from multiple threads you are abusing the
>>>>> library and you fully deserve your application crashing. Muahaha!
>>>>
>>>> I'm not sure "0MQ supports socket migration but if you use it your
>>>> apps will crash, Muahaha" is a compelling proposition.
>>>>
>>>> Even if the functionality only exists for specific cases such as the
>>>> Erlang binding, it should be clearly documented and explicit.
>>>>
>>>> Well, you can try the "muahaha" approach but IMO it'll just result in
>>>> confusion and complaints.
>>>
>>> You should *not* use a socket from multiple threads in parallel. That's
>>> clearly documented and if you do so, you'll crash. To be serious, you
>>> can't avoid that unless you remove the lockfree algorithms. It worked
>>> that way till now and it still works that way.
>>>
>>> What I am saying is there's no need for explicit socket migration,
>>> because basically any scenario of migration you can imagine already
>>> ensures the memory barrier.
>>>
>>> If it does not, you are doing something very strange (probably using
>>> some kind of lock-free signaling mechanism) and in that case you are an
>>> expert and you understand well what "you have to ensure full memory
>>> barrier" means.
>>>
>>> Martin
>>>
>>>
>>
>>
>>
>> -- 
>> -
>> Pieter Hintjens
>> iMatix - www.imatix.com
>



