[zeromq-dev] ZMQ hanged :(

Ivan Pechorin ivan.pechorin at gmail.com
Thu Apr 11 14:19:21 CEST 2013

Hi Gonzalo,

I have a small app that accepts web service requests (Apache CXF inside a
Jetty servlet container) and uses JeroMQ to pass requests to a backend server.

There is a single dedicated thread that runs ZLoop with the following
zeromq sockets:
 1) inproc PULL socket to receive requests from multiple webservice threads
that process requests;
 2) tcp DEALER socket to communicate with backend server completely
 3) inproc PULL socket to receive "stop" signal (sent from some other
thread when it's time to stop).

There is one more socket for use by the multiple webservice threads:
 4) inproc PUSH socket (connected to socket #1 above).

Every webservice worker thread processes each request as follows:
 1) get a unique Id for the request;
 2) register the Id in a map of pending requests (basically, push an
instance of dummy "Waiter" class into the map, using the Id as a key);
 3) push the request to the loop using the inproc PUSH socket #4 - access
to the socket is protected by "synchronized" block, of course;
 4) wait for a reply on the waiter object - standard Java wait() is used.
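
For illustration, steps 1, 2 and 4 could look roughly like the sketch below.
This is not the code from the app: the names Waiter, PendingRequests,
register, awaitReply and complete are made up for the example, and the
timeout on the wait is an added assumption. Only the JDK is used, so the
ZeroMQ part (step 3) is left out.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical holder that a worker thread blocks on until its reply arrives. */
final class Waiter {
    private byte[] reply;   // guarded by this
    private boolean done;   // protects against spurious wakeups and early replies

    /** Called by the loop thread: store the reply and wake the worker. */
    synchronized void complete(byte[] data) {
        reply = data;
        done = true;
        notify();
    }

    /** Called by the worker: returns the reply, or null if the timeout expires. */
    synchronized byte[] await(long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!done) {
            long left = deadline - System.currentTimeMillis();
            if (left <= 0) {
                return null;
            }
            wait(left);
        }
        return reply;
    }
}

/** Hypothetical map of pending requests, keyed by request Id. */
final class PendingRequests {
    private final AtomicLong nextId = new AtomicLong();
    private final ConcurrentMap<Long, Waiter> pending = new ConcurrentHashMap<>();

    /** Steps 1 and 2: allocate a unique Id and register a waiter under it. */
    long register() {
        long id = nextId.incrementAndGet();
        pending.put(id, new Waiter());
        return id;
    }

    /** Step 4: block until the reply for this Id arrives, then forget the Id. */
    byte[] awaitReply(long id, long timeoutMillis) throws InterruptedException {
        Waiter w = pending.get(id);
        if (w == null) {
            throw new IllegalStateException("unknown request id " + id);
        }
        try {
            return w.await(timeoutMillis);
        } finally {
            pending.remove(id);
        }
    }

    /** Loop side: hand a reply to its waiter; false if nobody is waiting. */
    boolean complete(long id, byte[] reply) {
        Waiter w = pending.get(id);
        if (w == null) {
            return false;
        }
        w.complete(reply);
        return true;
    }
}
```

A worker would call register(), push the request together with the Id, and
then call awaitReply(id, timeout). The "done" flag matters because the reply
may arrive before the worker starts waiting.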

The loop works as follows:
 1) forwards every request from the inproc PULL socket (#1) through the TCP
DEALER socket (#2) to the backend server (including the request Id);
 2) every reply received from the backend server has the request Id inside;
this Id is used to find the "waiter", pass the reply to the waiter and
notify() it;
 3) plus some heartbeating with the backend server.
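
As a rough sketch of how the request Id can travel with the message and be
used to route the reply, assuming a made-up framing of an 8-byte Id followed
by the payload (the actual wire format is not described here). Envelope and
ReplyRouter are hypothetical names, and the handler could simply be the
waiter's notify logic.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical framing: 8-byte big-endian request Id, then the payload. */
final class Envelope {
    private Envelope() {}

    static byte[] wrap(long id, byte[] payload) {
        return ByteBuffer.allocate(Long.BYTES + payload.length)
                .putLong(id)
                .put(payload)
                .array();
    }

    /** Throws BufferUnderflowException if the frame is shorter than 8 bytes. */
    static long idOf(byte[] frame) {
        return ByteBuffer.wrap(frame).getLong();
    }

    static byte[] payloadOf(byte[] frame) {
        return Arrays.copyOfRange(frame, Long.BYTES, frame.length);
    }
}

/** Hypothetical reply router, meant to be called from the loop thread. */
final class ReplyRouter {
    private final Map<Long, Consumer<byte[]>> handlers = new ConcurrentHashMap<>();

    /** Registers who should receive the reply for the given request Id. */
    void expect(long id, Consumer<byte[]> handler) {
        handlers.put(id, handler);
    }

    /** Delivers a reply frame; returns false for unknown or late replies. */
    boolean dispatch(byte[] frame) {
        Consumer<byte[]> handler = handlers.remove(Envelope.idOf(frame));
        if (handler == null) {
            return false;
        }
        handler.accept(Envelope.payloadOf(frame));
        return true;
    }
}
```

Removing the handler on dispatch means a duplicated or late reply is simply
dropped instead of waking the wrong request.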

So, there are just 4 zeromq sockets used to process multiple concurrent
webservice requests.

P.S. I know that using a socket from multiple threads is contrary to the
ZeroMQ way, but the solution described above works fine for me, and I don't
think that using one or a few hundred inproc sockets (one inproc socket
for each worker thread in the servlet container) would be more efficient.
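
The locking around the shared PUSH socket amounts to something like the
sketch below. The Consumer<byte[]> is only a stand-in for the socket's send
call so that the example runs with the plain JDK, and SharedSender is a
made-up name.

```java
import java.util.function.Consumer;

/** Serializes access to one non-thread-safe sender shared by many threads. */
final class SharedSender {
    private final Object lock = new Object();
    private final Consumer<byte[]> socket; // stand-in for the inproc PUSH socket

    SharedSender(Consumer<byte[]> socket) {
        this.socket = socket;
    }

    void send(byte[] frame) {
        synchronized (lock) { // only one worker touches the socket at a time
            socket.accept(frame);
        }
    }
}
```

The synchronized block also gives the memory barrier between threads, which
is the part that matters when a socket is handed from one thread to another.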

2013/4/11 Gonzalo Vasquez <gvasquez at altiuz.cl>

> Thanks Min,
> Ok, understood. But as this code is part of a webservice (i.e. each
> request works in separate threads, and concurrently), opening a single
> socket is not an option, so I'm guessing a socket pool would help. Any
> hints/patterns/links on achieving this in Java?
> Regards,
>   Gonzalo Vásquez Sáez
> Gerente Investigación y Desarrollo (R&D)
> Altiuz Soluciones Tecnológicas de Negocios Ltda.
> Av. Nueva Tajamar 555 Of. 802, Las Condes - CP 7550099
> +56 2 335 2461
>   gvasquez at altiuz.cl
> http://www.altiuz.cl
> http://www.altiuzreports.com
>   <https://www.facebook.com/altiuz>  <http://twitter.com/altiuz> <http://www.linkedin.com/company/altiuz>
> On 11-04-2013, at 2:44, Yu Dongmin <miniway at gmail.com> wrote:
> Hi,
> I was able to reproduce your case.
> Basically, creating and closing a ZMQ socket for every message is not
> good practice.
> I would recommend connecting once at program start (e.g. on the line right
> after ZMQ.context).
> Even so, jeromq seems to have a bug in handling frequent socket
> creation.
> Under stress, jzmq also blocked for 4~5 seconds but then resumed, whereas
> jeromq did not resume after blocking.
> I'll look into this issue further.
> Thanks
> Min
> On Apr 11, 2013, at 5:14 AM, Gonzalo Vasquez <gvasquez at altiuz.cl> wrote:
> I'm using JeroMQ. I've updated the code to a much tidier one:
> private byte[] getByte(final String table, final String name,
>         final int doc_off, final int doc_len, final int comp_off,
>         final int comp_len, final char compressionType) throws Exception {
>     File file = new File(cacheRoot, table.substring(0, 3) + "/DOC/" + name); //$NON-NLS-1$
>     // Context ctx = ZMQ.context(1);
>     Socket req = null;
>     byte[] data = null;
>     try {
>         req = ctx.socket(ZMQ.REQ);
>         req.connect(ENDPOINT);
>         // TODO Create a POJO instead of a Map
>         Map<String, String> params = new HashMap<String, String>();
>         params.put("path", file.getAbsolutePath());
>         params.put("dOff", String.valueOf(doc_off));
>         params.put("dLen", String.valueOf(doc_len));
>         params.put("cOff", String.valueOf(comp_off));
>         params.put("clen", String.valueOf(comp_len));
>         params.put("cType", String.valueOf(compressionType));
>         ByteArrayOutputStream baos = null;
>         ObjectOutputStream oos = null;
>         try {
>             baos = new ByteArrayOutputStream();
>             oos = new ObjectOutputStream(baos);
>             oos.writeObject(params);
>         } finally {
>             params.clear();
>             if (oos != null) {
>                 oos.close();
>             }
>             if (baos != null) {
>                 baos.close();
>             }
>         }
>         LOG.info("Sending Request");
>         req.send(baos.toByteArray(), NO_FLAGS);
>         LOG.info("Request sent");
>         data = req.recv();
>         LOG.info("Response received");
>     } finally {
>         if (req != null) {
>             req.disconnect(ENDPOINT);
>             req.close();
>         }
>     }
>     // ctx.term();
>     return data;
> }
> But same problem arises :(
> On 10-04-2013, at 17:11, Eric Hill <eric at ijack.net> wrote:
> Sorry I missed the code in the first email.  Are you using zmq with the
> jni binding, or jzmq?
> Your code looks fine to me.  Can you strip that section of code out into a
> separate jar for testing outside of WAS?
> On Wed, Apr 10, 2013 at 2:37 PM, Gonzalo Vasquez <gvasquez at altiuz.cl> wrote:
>> Dear Eric,
>> 1.- Process 2472 actually is IBM WAS, where my client code runs.
>> 2.- I attached the code on the first email, nevertheless here is the
>> client side:
>> private byte[] getByte(final String table, final String name,
>>         final int doc_off, final int doc_len, final int comp_off,
>>         final int comp_len, final char compressionType) throws Exception {
>>     File file = new File(cacheRoot, table.substring(0, 3) + "/DOC/" + name); //$NON-NLS-1$
>>     // Context ctx = ZMQ.context(1);
>>     Socket req = ctx.socket(ZMQ.REQ);
>>     req.connect(ENDPOINT);
>>     // TODO Create a POJO instead of a Map
>>     Map<String, String> params = new HashMap<String, String>();
>>     params.put("path", file.getAbsolutePath());
>>     params.put("dOff", String.valueOf(doc_off));
>>     params.put("dLen", String.valueOf(doc_len));
>>     params.put("cOff", String.valueOf(comp_off));
>>     params.put("clen", String.valueOf(comp_len));
>>     params.put("cType", String.valueOf(compressionType));
>>     ByteArrayOutputStream baos = new ByteArrayOutputStream();
>>     ObjectOutputStream oos = new ObjectOutputStream(baos);
>>     oos.writeObject(params);
>>     oos.close();
>>     params.clear();
>>     baos.close();
>>     LOG.info("Sending Request");
>>     req.send(baos.toByteArray(), NO_FLAGS);
>>     LOG.info("Request sent");
>>     byte[] data = req.recv();
>>     LOG.info("Response received");
>>     req.close();
>>     // ctx.term();
>>     return data;
>> }
>> I'm now moving the close invocation into a finally block, just in case
>> something goes wrong in between.
>> 3.- Yes, I'm creating a new socket from the context on each request, but
>> I close it (using the close() method) upon completion. Do I have to use the
>> disconnect() method too?
>> On 10-04-2013, at 16:14, Eric Hill <eric at ijack.net> wrote:
>> Process 2472 looks like it had a large number of open sockets at the time
>> of this run.  Since I don't have your code, I can only guess that
>> you're connecting new sockets for every request?  I've got a fairly large
>> system going that has at most a few dozen sockets open at any given time.
>> The system is most likely slow because it's running out of IP ports.
>> Realize that there are only about 65,000 local ports for making outgoing
>> connections...
>> Eric
>> On Wed, Apr 10, 2013 at 1:30 PM, Gonzalo Vasquez <gvasquez at altiuz.cl> wrote:
>>> Please see attached file for "netstat -ano" output
>>> On 10-04-2013, at 15:03, Eric Hill <eric at ijack.net> wrote:
>>> "netstat -ano" would be an interesting metric to look at.
>>> On Wed, Apr 10, 2013 at 1:02 PM, Gonzalo Vasquez <gvasquez at altiuz.cl> wrote:
>>>> No antivirus is installed on the server. I suspect a socket
>>>> exhaustion issue (something like ulimit on Unix/Linux), as I even get
>>>> disconnected from the Remote Desktop in this scenario, but I'm able to
>>>> log in again immediately.
>>>> On 10-04-2013, at 14:50, Eric Hill <eric at ijack.net> wrote:
>>>> 100% unresponsive and extremely slow are mutually exclusive.  I've seen
>>>> problems with antivirus programs attempting to scan inbound and outbound
>>>> network connections for possible threats.  Are you running any form of
>>>> antivirus on the server?
>>>> On Wed, Apr 10, 2013 at 12:14 PM, Gonzalo Vasquez <gvasquez at altiuz.cl> wrote:
>>>>> Nope, no high CPU or memory usage detected :(
>>>>> On 10-04-2013, at 13:36, Wolfgang Richter <wolf at cs.cmu.edu>
>>>>> wrote:
>>>>> So you've noticed 100% CPU usage on this server or high memory usage
>>>>> when it's running (thus, the unresponsiveness?)?
>>>>> --
>>>>> Wolf
>>>>> On Wed, Apr 10, 2013 at 11:38 AM, Gonzalo Vasquez <gvasquez at altiuz.cl> wrote:
>>>>>> Wolf,
>>>>>> Yes, almost 100% unresponsive, even closing windows is extremely
>>>>>> slow.
>>>>>> The server component is terminated by a single CTRL-C, i.e. it's
>>>>>> interrupted, as the main is invoked in a black cmd window.
>>>>>> I've also realized that I had to terminate the client side as well
>>>>>> to recover 100% responsiveness; this part of the code is running as a
>>>>>> webapp in IBM WAS Server.
>>>>>> It's virtualized on an ESXi server.
>>>>>> Thanks.
>>>>>> On 10-04-2013, at 12:34, Wolfgang Richter <wolf at cs.cmu.edu>
>>>>>> wrote:
>>>>>> What do you mean by:
>>>>>>> the server gets really "stuck" until I terminate the server
>>>>>>> component.
>>>>>> Do you mean your Windows Server becomes almost unresponsive?
>>>>>> Other processes can't work properly?
>>>>>> How do you terminate the server component?
>>>>>> Also, is this in a virtualized/cloud environment, or bare metal
>>>>>> Windows Server?
>>>>>> --
>>>>>> Wolf
>>>>>> _______________________________________________
>>>>>> zeromq-dev mailing list
>>>>>> zeromq-dev at lists.zeromq.org
>>>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev