[zeromq-dev] REQ/REP with multiple REP workers question
Alexander Hill
alexander.d.hill.89 at gmail.com
Sat Apr 11 01:21:33 CEST 2020
It is expected, because DEALER doesn't have any concept of when a
worker in this pattern is "ready": it will round-robin outgoing
messages among all connected peers.
Imagine that the state of the world is this: clients 1 and 2 have no
outstanding requests, and the dealer socket is primed to forward its
next message to worker 2, which will take two seconds to produce a
reply. (Worker 1, of course, will only take one second to reply.)
Now, both clients send a request at about the same instant. Client 2's
request wins the race inside the router socket, and it gets forwarded
to worker 2. Immediately after, client 1's request is forwarded to
worker 1. Importantly, the dealer socket will send its next message to
worker 2.
One second passes. Worker 1 sends a reply, causing client 1 to send
its next request. This is routed to worker 2, which will still be
chewing on client 2's request for another full second! Client 1 needs
to wait one second for client 2's request to complete, then two more
for worker 2 to process its own request, for a total of three
seconds, which is what you observed.
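
The timeline above can be checked with a small simulation. This is only a sketch in plain Python, not ZeroMQ code: it assumes each worker handles one request at a time in arrival order, that a client sends its next request the instant it gets a reply, and that network time is zero. The function name and its parameters are made up for illustration.

```python
import heapq


def simulate_round_robin(service_times, first_sends, first_worker, horizon):
    """Simulate blind round-robin dispatch to workers with FIFO queues.

    service_times: seconds each worker takes per request (index 0 = worker 1)
    first_sends:   client ids, in the order their first requests arrive at t=0
    first_worker:  0-based index of the worker the dispatcher targets first
    horizon:       stop once a send would happen at or after this time
    Returns a list of (client, worker_number, sent_at, latency) tuples.
    """
    free_at = [0.0] * len(service_times)  # when each worker's queue drains
    results = []
    events = []  # min-heap of (send_time, tie_breaker, client)
    seq = 0
    for client in first_sends:
        heapq.heappush(events, (0.0, seq, client))
        seq += 1

    next_worker = first_worker
    while events:
        sent_at, _, client = heapq.heappop(events)
        if sent_at >= horizon:
            break
        # Round-robin: pick the next worker whether or not it is busy.
        w = next_worker
        next_worker = (next_worker + 1) % len(service_times)
        start = max(sent_at, free_at[w])  # wait behind queued work, if any
        done = start + service_times[w]
        free_at[w] = done
        results.append((client, w + 1, sent_at, done - sent_at))
        # The reply triggers the client's next request immediately.
        heapq.heappush(events, (done, seq, client))
        seq += 1
    return results


if __name__ == "__main__":
    # Worker 1 takes 1 s, worker 2 takes 2 s; client 2 wins the first race
    # and the dispatcher is primed to send to worker 2 (index 1).
    for client, worker, sent_at, latency in simulate_round_robin(
        [1.0, 2.0], first_sends=[2, 1], first_worker=1, horizon=8.0
    ):
        print(f"t={sent_at:.0f}s client {client} -> worker {worker}: {latency:.0f}s")
```

Under those assumptions, only the very first request handled by worker 2 completes in two seconds. Every later request sent to worker 2 arrives while it is still busy and takes three, while worker 1 always answers in one.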
----
As this exercise demonstrates, the simple round-robin behavior baked
into REQ, DEALER, and PUSH sockets can create suboptimal schedules
when workloads aren't homogeneous. A lot of the time, that's
acceptable. If it's not, you can build more sophisticated load
balancing algorithms yourself on top of a ROUTER socket, but the
exercise does involve a bit of protocol design.
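
For comparison, here is the same toy model with a dispatcher that only hands a request to a worker that has announced it is idle. Again, this is a plain-Python sketch under the same assumptions as before (one request per worker at a time, zero network delay, hypothetical names), not a ZeroMQ broker. In a real broker the "ready" event would be a message the worker sends to a ROUTER socket, and designing that message is the protocol work mentioned above.

```python
import heapq
from collections import deque

READY, SEND = 0, 1  # at equal timestamps, handle "worker ready" first


def simulate_ready_queue(service_times, first_sends, horizon):
    """Simulate dispatch to idle workers only.

    service_times: seconds each worker takes per request (index 0 = worker 1)
    first_sends:   client ids, in the order their first requests arrive at t=0
    horizon:       stop once an event would happen at or after this time
    Returns a list of (client, worker_number, sent_at, latency) tuples.
    """
    idle = deque(range(len(service_times)))  # assume all workers start idle
    pending = deque()  # (client, sent_at) requests waiting for a worker
    results = []
    events = []  # min-heap of (time, kind, tie_breaker, worker_or_client)
    seq = 0

    def push(time, kind, payload):
        nonlocal seq
        heapq.heappush(events, (time, kind, seq, payload))
        seq += 1

    for client in first_sends:
        push(0.0, SEND, client)

    while events:
        now, kind, _, payload = heapq.heappop(events)
        if now >= horizon:
            break
        if kind == READY:
            idle.append(payload)
        else:
            pending.append((payload, now))
        # Pair waiting requests with idle workers, oldest first.
        while idle and pending:
            w = idle.popleft()
            client, sent_at = pending.popleft()
            done = now + service_times[w]
            results.append((client, w + 1, sent_at, done - sent_at))
            push(done, READY, w)      # worker becomes idle again
            push(done, SEND, client)  # reply triggers the next request
    return results


if __name__ == "__main__":
    for client, worker, sent_at, latency in simulate_ready_queue(
        [1.0, 2.0], first_sends=[2, 1], horizon=8.0
    ):
        print(f"t={sent_at:.0f}s client {client} -> worker {worker}: {latency:.0f}s")
```

With two clients and two workers there is always an idle worker when a request arrives, so in this model no request waits behind another and every latency equals the chosen worker's own service time.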
On Fri, Apr 10, 2020 at 1:31 PM Jasper Jaspers <jaspers01995 at gmail.com> wrote:
>
> I'm testing the REQ->REP pattern with multiple reply workers to test concurrent behavior.
>
> I have 3 applications, 2 clients and 1 server, running on the same node.
>
> Each client, which is essentially the client from the zmq guide, has a REQ socket that connects to the server (tcp://127.0.0.1:nnnn). They simply loop, sending a message, waiting for the reply, and timing how long the reply takes to arrive.
>
> The server, which is essentially the mtserver from the zmq guide, has an external ROUTER that binds to (tcp://127.0.0.1:*). Internally it has a DEALER with n workers, where each worker is on its own thread and has a REP socket that connects locally to the DEALER. The ROUTER and DEALER use zmq_proxy to map external messages to internal messages. Each REP worker receives a message, sleeps for some number of seconds to simulate work time, and then sends a reply.
>
> In my test I have two REP workers configured on the server. I figured one for each client to get concurrent behavior. Worker1 sleeps for 1 sec and Worker2 sleeps for 2 seconds.
>
> Based on this I would expect concurrent behavior and the clients to show that messages take either 1 or 2 seconds to complete. When I start the first client I see messages take either 1 or 2 seconds based on which worker processed the message, which I expect. Then when I start the 2nd client I see, on both clients, that some messages take 3 seconds to complete. Looking further, all of the messages that take 3 sec to complete come from Worker2. It looks like only the first message processed by Worker2 after the second client starts completes in 2 seconds. My logs show that when Worker2 takes 3 seconds to complete, it's receiving the client's message 1 sec after the client sent it. This accounts for the additional time, but I don't understand why this is happening. Is this the correct behavior?
>
> I also re-ran the test where each Worker sleeps for 2 secs. In this case the clients showed that all work, from each Worker, completed in 2 secs which I expected.