[zeromq-dev] [External] Re: A PGM/EPGM question

Steven McCoy steven.mccoy at miru.hk
Fri Mar 23 21:13:47 CET 2018


Maybe you have to use zmq_poll with both in and out events?  Ultimately,
in_event() needs to fire on the pgm_sender, which calls process_upstream(),
which in turn processes the NAK.
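
To make that call chain concrete, here is a toy model in plain Python. It is
not libzmq or OpenPGM code: the class ToyPgmSender and the method deliver_nak
are made up for illustration, and only the names in_event and
process_upstream are borrowed from the source. It only shows why NAKs pile
up unread when nothing ever fires the readable event on the sender.

```python
from collections import deque


class ToyPgmSender:
    """Toy stand-in for a PGM sender; hypothetical, not libzmq/OpenPGM code."""

    def __init__(self):
        self.tx_window = {}        # sequence number -> payload kept for repairs
        self.next_sqn = 0
        self.recv_queue = deque()  # unread NAKs, playing the role of Recv-Q

    def send(self, payload):
        """Send original data and keep a copy in the transmit window."""
        sqn = self.next_sqn
        self.tx_window[sqn] = payload
        self.next_sqn += 1
        return sqn

    def deliver_nak(self, sqn):
        """The network queues a NAK on the sender; nothing has read it yet."""
        self.recv_queue.append(sqn)

    def process_upstream(self):
        """Drain queued NAKs and return the repairs that would be resent."""
        repairs = []
        while self.recv_queue:
            sqn = self.recv_queue.popleft()
            if sqn in self.tx_window:
                repairs.append((sqn, self.tx_window[sqn]))
        return repairs

    def in_event(self):
        """Readable event on the sender: the only path that reads NAKs here."""
        return self.process_upstream()


if __name__ == "__main__":
    sender = ToyPgmSender()
    for msg in (b"a", b"b", b"c"):
        sender.send(msg)
    sender.deliver_nak(1)
    print("unread NAKs before in_event:", len(sender.recv_queue))
    print("repairs produced by in_event:", sender.in_event())
```

If in_event() is never called, recv_queue only grows, which is the same
symptom as a Recv-Q that stays full.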

On 23 March 2018 at 13:43, Montero, Antonio UTC CCS <
Antonio.Montero at fs.utc.com> wrote:

> Understood, however that is not the behavior I am seeing. That is likely
> to be the case for EPGM, since those are UDP packets. From my
> understanding, though, regardless of whether incoming data is multicast or
> unicast, PGM binds to any address and a specific port. The kernel will pass
> all data received on an interface to any listening socket as long as the
> destination port matches that of the socket binding.
>
>
>
> Now let’s put UDP aside for a second: what about the pgm transport?
> These are raw sockets, and any unicast NAKs are actually sent from the
> remote SUB to the PUB’s unicast address and source port (which is randomly
> selected when the raw PGM PUB socket is created). At that point the PUB
> socket should be the only one listening on its own unicast address and
> source port. Correct?
>
>
>
> *This is a snapshot of what my netstat -ln output looks like at the moment,
> with both PUB and SUB created and running on the same host.*
>
>
>
> Proto Recv-Q Send-Q Local Address                                Foreign Address  State
>
> *Sockets associated with PUB:*
>
> raw   164672      0 2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%622984:*      113
> raw   164672      0 2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%623304:*      113
> raw        0      0 ::%2147479552:113                            ::%622984:*      113
>
> *Sockets associated with SUB:*
>
> raw   164672      0 2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%622984:*      113
> raw   164672      0 2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%623304:*      113
> raw        0      0 ::%2147479552:113                            ::%622984:*      113
>
>
>
> Notice that the Recv-Q is full on both the PUB-related and the SUB-related
> send and router-alert send sockets.
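
A check like this can be automated rather than read off by eye. The sketch
below is only an illustration: the function name busy_raw_sockets is
hypothetical, and it assumes each socket appears on one unwrapped line in
the column order Proto, Recv-Q, Send-Q, Local Address, Foreign Address,
State. The sample rows are taken from the listing above.

```python
def busy_raw_sockets(netstat_output, threshold=0):
    """Return (recv_q, local_address) for raw sockets with Recv-Q > threshold."""
    busy = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) < 5 or not fields[0].startswith("raw"):
            continue
        try:
            recv_q = int(fields[1])
        except ValueError:
            continue  # header line or anything else that is not a socket row
        if recv_q > threshold:
            busy.append((recv_q, fields[3]))
    return busy


SAMPLE = """\
Proto Recv-Q Send-Q Local Address Foreign Address State
raw 164672 0 2001:db8::2b0:19ff:fe73:d890%2147479552:113 ::%622984:* 113
raw 164672 0 2001:db8::2b0:19ff:fe73:d890%2147479552:113 ::%623304:* 113
raw 0 0 ::%2147479552:113 ::%622984:* 113
"""

if __name__ == "__main__":
    for recv_q, local in busy_raw_sockets(SAMPLE):
        print(recv_q, local)
```

Running it periodically while traffic flows would show whether a queue
drains at all or stays pinned at the same value.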
>
> Here are my thoughts on why they are full; the reason is not the same in
> both cases:
>
>
>
> For the SUB-associated sockets bound to the 2001 address, which are
> basically used to send NAKs to the remote PUB:
>
> These fill up as soon as a remote PUB starts sending multicast data. I
> think the SUB send socket is connecting with the destination port used to
> send multicast traffic: whenever a SUB sends a NAK to the PUB, I can see
> that the source port of the unicast packet matches the destination port of
> the multicast group. However, this is not really an issue, since the SUB
> socket is configured in PGM as receive-only and therefore any ODATA/SPM
> data received on its send socket is not processed. The SUB socket is
> also getting the multicast data via the local binding ::%2147479552:113,
> which, as seen above, is emptying its queue fine, and I could verify that
> the node is receiving data at the application level.
>
>
>
> For the PUB-associated sockets bound to the 2001 address, which are
> basically used to send ODATA/SPM/RDATA/NCF to the remote SUB:
>
> Its local binding ::%2147479552:113 also receives the multicast data sent
> by the remote PUB, but that ODATA/SPM data is thrown out since the PUB
> socket is configured as send-only at the PGM level.
>
> However, its send-associated sockets do receive unicast NAKs from the
> remote SUB. As seen above, these are placed on the socket’s Recv-Q, but the
> queue is full because the PUB socket is not processing the NAKs.
>
>
>
> Note: The exact same behavior is seen with EPGM. The only difference is
> that none of the sockets’ Recv-Qs fill up, because they are emptied at the
> UDP layer upon arrival. However, I suspect that once the data is forwarded
> to the PGM layer, the PGM socket buffers would show the same thing as the
> netstat -ln output above.
>
>
>
> I think it is redundant, and probably not a good idea, to run the same
> code when creating either a ZMQ PUB or SUB socket, since those socket
> types are essentially restricted to specific things like send-only or
> receive-only. Still, that does not appear to be the cause of the issue
> here. I have read in some of the OpenPGM documentation that the
> application needs to call pgm_recv frequently, as that somehow drives the
> PGM state machine. *My issue, however, is how to accomplish that from the
> ZMQ API layer*; that is the whole point of using ZMQ in my case in the
> first place.
>
>
>
> Any thoughts? And thanks for the comments.
>
>
>
> *From:* zeromq-dev [mailto:zeromq-dev-bounces at lists.zeromq.org] *On
> Behalf Of *Steven McCoy
> *Sent:* Friday, March 23, 2018 12:53 PM
> *To:* ZeroMQ development list
> *Subject:* Re: [zeromq-dev] [External] Re: A PGM/EPGM question
>
>
>
> The problem is that the kernel will not deliver a unicast UDP packet to
> every socket listening on the port, so the wrong socket is probably
> receiving the NAK.
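
This delivery behavior can be observed with ordinary UDP sockets. The sketch
below is a simplified stand-in for the setup in this thread: it uses IPv4
loopback rather than IPv6, plain UDP rather than PGM, and a hypothetical
function name, count_receivers. It binds several sockets to one port, sends
a single unicast datagram, and counts how many sockets have something to
read. On the platforms I am aware of, the expected count is one.

```python
import select
import socket
import time


def count_receivers(n_sockets=2, settle=0.2):
    """Bind n UDP sockets to one loopback port, send once, count readable ones."""
    listeners = []
    port = 0  # 0 lets the first bind pick a free port; later binds reuse it
    try:
        for _ in range(n_sockets):
            s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            listeners.append(s)
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            if hasattr(socket, "SO_REUSEPORT"):
                try:
                    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
                except OSError:
                    pass  # option defined but not supported on this system
            s.bind(("127.0.0.1", port))
            port = s.getsockname()[1]
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(b"NAK", ("127.0.0.1", port))
        time.sleep(settle)  # give the kernel time to queue the datagram
        readable, _, _ = select.select(listeners, [], [], 0)
        return len(readable)
    finally:
        for s in listeners:
            s.close()


if __name__ == "__main__":
    print("sockets that received the unicast datagram:", count_receivers())
```

Which of the sockets gets the datagram is up to the kernel, which is why a
NAK can land on a socket that never processes it.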
>
>
>
> On Fri, Mar 23, 2018 at 12:07 Montero, Antonio UTC CCS <
> Antonio.Montero at fs.utc.com> wrote:
>
> ZMQ’s implementation of the PUB socket type does not allow receive calls
> to be made (zmq_recv is disabled), which is why I am trying to figure out
> how to trigger ZMQ to call “pgm_recv” on the PUB socket so that it
> processes the NAKs received from a remote SUB socket.
>
> I have tried querying the PUB socket state via ZMQ_EVENTS to trigger
> the processing of any commands available for the socket, but that does
> not seem to move the PGM state machine in terms of processing NAKs.
>
>
>
> I am running both a PUB and a SUB in the same application on the same
> host. Although I see the same set of sockets being created at the PGM
> level for both the PUB and SUB ZMQ sockets, including multiple sockets
> bound to the same port, this does not appear to cause any issues: my SUB
> socket is able to receive multicast messages from a remote PUB and respond
> with unicast NAKs when data loss is detected.
>
>
>
> Any ideas as to how a user should get the ZMQ library to trigger NAK
> processing for a PUB socket using either the pgm or epgm transport?
>
>
>
> Thanks,
>
> Antonio Montero.
>
> *From:* zeromq-dev [mailto:zeromq-dev-bounces at lists.zeromq.org] *On
> Behalf Of *Steven McCoy
> *Sent:* Friday, March 23, 2018 9:55 AM
> *To:* ZeroMQ development list
> *Subject:* [External] Re: [zeromq-dev] A PGM/EPGM question
>
>
>
> You should check that the PUB socket has a loop processing the incoming
> NAK requests; this is usually based on a recv call.  The symptoms indicate
> that the protocol is operating TX-only.
>
>
>
> Steve-o
>
>
>
> On Wed, Mar 21, 2018 at 19:50 Montero, Antonio UTC CCS <
> Antonio.Montero at fs.utc.com> wrote:
>
> Hello,
>
> I am having a bit of a hard time getting a ZMQ PUB socket to react to PGM
> NAKs, which means that at this point I am not able to recover lost packets.
>
> I have tried both protocols (pgm and epgm) and still get the same
> result.
>
>
>
> I have a setup where I create both a PUB and a SUB socket, in that order,
> in the same ZMQ context, running on the same host and connected to the same
> IPv6 multicast address and port.
>
> I have N nodes and each node has a PUB and SUB. All N nodes send messages
> asynchronously and all N nodes receive all messages. My multicast network
> is working fine whether I use pgm or epgm and all N nodes communicate with
> each other over IPv6 multicast.
>
> The issue I am having is that when packet loss occurs, a remote SUB sends
> a unicast NAK back to the source PUB, but I am not seeing any NCF or RDATA
> being sent by the source PUB. I have verified that the packets in question
> are in fact still in the Tx window as reported by the SPMs being sent by
> the source PUB. I have ongoing periodic traffic which triggers a send and
> a receive on the PUB and SUB sockets respectively, and I am clearing
> out the ZMQ_EVENTS after every send and/or receive. I also have a polling
> thread running every 150ms to check for ZMQ_EVENTS on both PUB and SUB.
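
The Tx window check described above can be written down as a small helper.
This is a sketch, not OpenPGM code: the function name in_tx_window is
hypothetical, the trail and lead arguments stand for the SPM_TRAIL and
SPM_LEAD values read from a captured SPM, and it assumes the window never
covers half or more of the 32-bit sequence space.

```python
SQN_MOD = 1 << 32  # PGM sequence numbers are 32 bits wide and wrap around


def in_tx_window(sqn, trail, lead):
    """True if sqn lies in [trail, lead], allowing for 32-bit wraparound."""
    span = (lead - trail) % SQN_MOD
    if span >= SQN_MOD // 2:
        # Assumption: a span this large means an empty or invalid window,
        # for example lead == trail - 1.
        return False
    offset = (sqn - trail) % SQN_MOD
    return offset <= span


if __name__ == "__main__":
    # Hypothetical values: a NAK for sequence 105 against a window of 100..110.
    print(in_tx_window(105, trail=100, lead=110))
    # A window that wraps past the top of the sequence space.
    print(in_tx_window(2, trail=0xFFFFFFF0, lead=5))
```

If the NAKed sequence number passes this check and still no NCF or RDATA
appears, the sender has the data and is simply not acting on the NAK.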
>
>
>
> Nothing seems to work in terms of triggering the PUB to react and process
> the NAKs received from the remote SUB. Looking at the code a bit, I see the
> function zmq::pgm_socket_t::process_upstream but can’t tell if and how it
> is being triggered. From my perspective, it does not appear to be.
>
>
>
> Any help or direction would be appreciated. Thanks.
>
>
>
> --
>
> Antonio
>
>
>
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> https://lists.zeromq.org/mailman/listinfo/zeromq-dev
>
>
>