[zeromq-dev] SUB socket hangs after some time

Oliver Senn oliver.senn at smart.mit.edu
Thu Nov 11 05:12:42 CET 2010


I tried using * instead of 127.0.0.1, and different ports, but the problem 
persisted.

Then I also tested on different machines. I was able to reproduce the 
problem on an Ubuntu 10.04 box with Kernel 2.6.32-25. For this machine I 
was also able to reproduce the problem even when the publisher was 
running on a different machine and just the client was running on the 
Ubuntu box.

I then tried on an Ubuntu 10.10 server box with kernel 2.6.35-22 and 
there I *wasn't* able to reproduce the problem. It's running fine.

The only difference* between my Mac/Ubuntu 10.04 and the other box that 
I was able to find is that I upgraded ZeroMQ from 2.0.9 on the 
Mac/Ubuntu 10.04, whereas on the other box I installed 2.0.10 directly.
Do you think this could make a difference? Could there be something left 
over from 2.0.9 that is causing the problems?

* Apart from the differences in OS and kernel versions

Oliver

On 11/11/10 10:47 AM, Joshua Foster wrote:
> I just tried it with 2.0.9, 2.0.10, and the latest from the maint branch. I am not seeing the symptoms. The client continues to pull messages off. I suspect that it may be something specific to your environment. Does it still exhibit the problem if you bind to a different port? Or when you bind to '*' vs '127.0.0.1'?
>
> Joshua
>
> On Nov 10, 2010, at 6:48 PM, Oliver Senn wrote:
>
>> Hi Joshua,
>>
>> Thanks for the help. I will try again after making the variable
>> volatile. But the program hangs in the receiving method of the socket,
>> so I don't think it's because of that.
>>
>> I only tried the test program on one machine but the whole program was
>> used on multiple machines and was showing the same behavior.
>>
>> Oliver
>>
>> On 11/11/10 6:42 AM, Joshua Foster wrote:
>>> I'm running it on OS X with 2.0.9 and I haven't seen the issue yet. I'll compile 2.0.10 and see if it happens later tonight. Not sure if this affects it, but numMessages should be volatile since you have multiple threads accessing it. If you want the ability to restart the subscriber without losing messages, be sure to set the identity. Also, are you running both pub and client on the same machine with tcp://127.0.0.1?
>>>
>>> Joshua
>>>
>>> On Nov 10, 2010, at 6:32 AM, Oliver Senn wrote:
>>>
>>>> Hi list,
>>>>
>>>> In our code we use a simple PUB/SUB scheme: A publisher is sending data over a PUB socket and a subscriber is getting that data using a SUB socket.
>>>> Today I tested the code and (especially with a lot of messages) after some time the subscriber hangs in socket.recv(). The publisher happily goes on sending messages and does not return an error. The subscriber does not get any of those messages, though, and also does not report an error.
>>>>
>>>> I simplified the code we are using and attached it to this email. Sometimes the problem appears after 20 seconds and sometimes after 400, but eventually it happens.
>>>>
>>>> I am using ZeroMQ 2.0.10 and the Java bindings on Mac OS X 10.6.4 with .
>>>>
>>>> Best,
>>>>
>>>> Oliver
>>>> <Client.java><Publisher.java>_______________________________________________
>>>> zeromq-dev mailing list
>>>> zeromq-dev at lists.zeromq.org
>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
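The suggestion in the thread about making numMessages volatile can be illustrated with a minimal sketch. The attached Client.java is not reproduced here, so everything below is an assumption: the class MessageCounter, its methods, and the runDemo helper are hypothetical stand-ins, and only the field name numMessages comes from the thread. The sketch assumes a single thread increments the counter while another thread reads it; with several writers, java.util.concurrent.atomic.AtomicLong would be the safer choice, because ++ on a volatile field is not atomic.

```java
// Hypothetical stand-in for the counter mentioned in the thread; not the
// code from the attached Client.java.
final class MessageCounter {
    // volatile makes writes by the receiving thread visible to other threads.
    // Safe here only because exactly one thread ever writes the field.
    private volatile long numMessages = 0;

    // Called by the (single) receiving thread for each message.
    void onMessage() {
        numMessages++;
    }

    // May be called from any thread, e.g. one that prints statistics.
    long count() {
        return numMessages;
    }

    // Starts one writer thread that records n messages, while the calling
    // thread polls the counter until it has seen all of them.
    static long runDemo(final int n) {
        final MessageCounter counter = new MessageCounter();
        Thread receiver = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < n; i++) {
                    counter.onMessage();
                }
            }
        });
        receiver.start();
        // Without volatile, this loop is not guaranteed to observe updates.
        while (counter.count() < n) {
            Thread.yield();
        }
        try {
            receiver.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.count();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(100000));
    }
}
```

As noted in the thread, the hang was observed inside socket.recv(), so this only addresses the visibility of the counter and is not expected to explain the hang itself.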


