[zeromq-dev] PyZMQ / High CPU usage on receive

Francesco francesco.montorsi at gmail.com
Fri Jul 12 00:05:41 CEST 2024


Hi Adam,
Try to set an RX timeout on the SUB socket so that your tight RX loop can
release the CPU back to the OS while waiting for a message to be received.
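
A minimal sketch of what I mean, assuming pyzmq: with zmq.RCVTIMEO set, a
blocking receive gives up after the timeout and raises zmq.Again instead of
waiting forever. The helper names, the 100 ms value and the URL argument are
only illustrative; they are not taken from your code.

```python
import zmq


def make_sub_socket(context, url, timeout_ms=100):
    """Create a bound SUB socket whose receives time out after timeout_ms."""
    sub = context.socket(zmq.SUB)
    # RCVTIMEO is in milliseconds; -1 (the default) means block forever.
    sub.setsockopt(zmq.RCVTIMEO, timeout_ms)
    sub.setsockopt_string(zmq.SUBSCRIBE, "")
    sub.bind(url)
    return sub


def recv_or_none(sub):
    """Return the next multipart message, or None if the timeout expired."""
    try:
        return sub.recv_multipart(copy=False)
    except zmq.Again:
        # Nothing arrived within RCVTIMEO; the caller can check its
        # "running" flag and call again.
        return None
```

In your loop that would be something like "parts = recv_or_none(zmq_socket)"
followed by "if parts is None: continue".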

HTH,
Francesco

On Wed, 10 Jul 2024, 23:21 Adam Cécile <acecile at le-vert.net> wrote:

> Hello,
>
>
> I'm trying to create an application with one central server gathering
> frames from multiple other processes, each following an H264 stream and
> yielding frames to the central server.
>
> However, I'm struggling with high CPU usage on the receiving Zmq part,
> with only a dozen 25 fps streams; see top below:
>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>  210591 usernam   20   0 5605252  63780  21056 S  42.9   0.1   0:06.37 streams-manager
>  210632 usernam   20   0 7508320 297676 104584 S  17.5   0.5   0:02.57 stream-cam13039
>  210637 usernam   20   0 7508192 297072 105496 S  16.5   0.5   0:02.65 stream-cam4004
>  210635 usernam   20   0 7508192 298248 105416 S  16.2   0.5   0:02.56 stream-cam13041
>  210628 usernam   20   0 7582068 295124 104408 S  15.8   0.4   0:02.60 stream-cam13035
>  210653 usernam   20   0 7508320 294120 105640 S  15.8   0.4   0:02.55 stream-cam13072
>  210640 usernam   20   0 7505236 236364 105708 S  10.6   0.4   0:01.91 stream-cam200
>  210629 usernam   20   0 7505188 236892 105000 S   9.6   0.4   0:01.98 stream-cam147
>  210642 usernam   20   0 7505112 234796 105128 S   8.9   0.4   0:01.75 stream-cam204
>  210650 usernam   20   0 7505236 234980 105084 S   8.9   0.4   0:01.59 stream-cam231
>  210644 usernam   20   0 7578860 234228 104480 S   8.6   0.4   0:01.80 stream-cam214
>  210646 usernam   20   0 7505112 235932 105384 S   8.6   0.4   0:01.53 stream-cam215
>  210652 usernam   20   0 7505112 234988 104972 S   8.6   0.4   0:01.62 stream-cam233
>  161809 nm-open+  20   0   63492  14264  12000 R   7.9   0.0   5:24.89 openconnect
>  210648 usernam   20   0 7505112 234688 104692 S   6.3   0.4   0:01.68 stream-cam218
>  210638 usernam   20   0 7503484 214092 105244 S   4.6   0.3   0:01.38 stream-cam167
>
>
> The sending part is doing fine, and the code that actually publishes the
> frames is the following:
>
> In init:
>
> self._zmq_socket = cast(zmq.Socket, self._zmq_context.socket(zmq.PUB))
> # Allow up to 5 seconds to flush messages before closing
> self._zmq_socket.setsockopt(zmq.LINGER, 5000)
> self._zmq_socket.setsockopt(zmq.HEARTBEAT_IVL, 1000)
> self._zmq_socket.setsockopt(zmq.HEARTBEAT_TIMEOUT, 5000)
> self._zmq_socket.setsockopt(zmq.HEARTBEAT_TTL, 5000)
> self._zmq_socket.setsockopt(zmq.RECONNECT_IVL, 10000)
> self._zmq_socket.connect(self._zmq_url)
>
> For each frame:
>
> height, width, channels = frame.shape
> payload = [int(110).to_bytes(2), height.to_bytes(2), width.to_bytes(2),
> channels.to_bytes(2), bytes() if frame is None else frame.tobytes()]
>
> self._zmq_socket.send_multipart(payload, flags=zmq.NOBLOCK, copy=False,
> track=False)
>
>
> The receiving part, which is causing the issue, is the following:
>
> zmq_context = zmq.Context()
> zmq_socket = zmq_context.socket(zmq.SUB)
> zmq_socket.setsockopt_string(zmq.SUBSCRIBE, "")
> zmq_socket.bind(self.socket_url)
>
> while self.running:
>      parts = cast(Tuple[zmq.Frame, zmq.Frame, zmq.Frame, zmq.Frame,
> zmq.Frame], zmq_socket.recv_multipart(copy=False, track=False))
>
>
> Both are communicating using Unix socket (ipc://).
>
> What I already tried:
>
> - Use tcp socket instead of ipc
>
> - Send one single bytes message instead of multipart
>
> - Switch to push/pull instead of pub/sub
>
> Sadly, nothing really makes any difference. The only thing that reduces
> "streams-manager" CPU usage to close to zero is reducing the size of the
> messages sent by the stream consumer processes:
>
> E.g: Changing: payload = [int(110).to_bytes(2), height.to_bytes(2),
> width.to_bytes(2), channels.to_bytes(2), bytes() if frame is None else
> frame.tobytes()]
>
> To: payload = [int(110).to_bytes(2), height.to_bytes(2),
> width.to_bytes(2), channels.to_bytes(2), bytes() if frame is None else
> frame.tobytes()[:10]]
>
> To keep only 10 bytes of the actual video frame instead of the full one.
>
>
> Am I trying to do something stupid, or did I miss something obvious?
> As libzmq is written in C and this is basically only I/O, I assumed
> receiving raw bytes on one central process would not be a bottleneck...
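
To put a rough number on the difference between the full and the truncated
payload, here is a back-of-the-envelope sketch. The 1920x1080x3 frame shape
is purely an assumption (the real shape is not given above); the 25 fps, the
dozen streams and the four 2-byte header parts come from the message.

```python
def bytes_per_second(frame_bytes, fps=25, streams=12, header_bytes=8):
    """Total payload the central process has to receive each second.

    header_bytes covers the four 2-byte parts sent before the frame data.
    """
    return (header_bytes + frame_bytes) * fps * streams


# Assumed raw frame: 1920 x 1080 pixels, 3 channels, 1 byte per channel.
full = bytes_per_second(1920 * 1080 * 3)
# The experiment that keeps only 10 bytes of each frame.
truncated = bytes_per_second(10)

print(f"full frames: {full / 1e9:.2f} GB/s")
print(f"truncated:   {truncated} B/s")
```

Under that assumption the receiver handles several orders of magnitude more
data with full frames than with truncated ones, even though the number of
messages per second is the same.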
>
>
> Thanks a lot in advance,
>
> Best regards, Adam.