[zeromq-dev] [External] Re: A PGM/EPGM question

Luca Boccassi luca.boccassi at gmail.com
Fri Mar 23 21:17:09 CET 2018


ZMQ_XPUB has reads enabled as well, so unlike ZMQ_PUB it can be polled
for input and read from.
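
To illustrate why reads on the sending side matter for the repair path
discussed in the quoted messages (NAK in, NCF and RDATA out), here is a
rough toy model in Python. It is not OpenPGM or libzmq code: the class
and method names are made up for illustration, and the behaviour is a
heavy simplification of what RFC 3208 describes. It only shows that a
NAK sitting in a receive queue produces no repair until something on the
sender actually reads it.

```python
from collections import OrderedDict


class ToyPgmSender:
    """Toy model of a PGM source's repair path (hypothetical, not OpenPGM)."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.next_sqn = 0
        self.tx_window = OrderedDict()  # sqn -> payload, oldest first
        self.nak_queue = []             # NAKs received but not yet read

    def send(self, payload):
        """Publish one ODATA packet and retain it for possible repair."""
        sqn = self.next_sqn
        self.next_sqn += 1
        self.tx_window[sqn] = payload
        while len(self.tx_window) > self.window_size:
            self.tx_window.popitem(last=False)  # advance the trailing edge
        return ("ODATA", sqn, payload)

    def deliver_nak(self, sqn):
        """A NAK arrives from the network; it only sits in a queue for now."""
        self.nak_queue.append(sqn)

    def service_input(self):
        """Read queued NAKs; answer those still in the window with NCF + RDATA."""
        replies = []
        while self.nak_queue:
            sqn = self.nak_queue.pop(0)
            if sqn in self.tx_window:
                replies.append(("NCF", sqn))
                replies.append(("RDATA", sqn, self.tx_window[sqn]))
        return replies


sender = ToyPgmSender(window_size=4)
for text in (b"a", b"b", b"c"):
    sender.send(text)
sender.deliver_nak(1)
print(len(sender.nak_queue))   # 1: the NAK is queued, nothing repaired yet
print(sender.service_input())  # [('NCF', 1), ('RDATA', 1, b'b')]
```

In this model the symptom described in the thread (Recv-Q filling up, no
NCF or RDATA on the wire) corresponds to service_input() never being
called.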

On Fri, 2018-03-23 at 16:13 -0400, Steven McCoy wrote:
> Maybe you have to use zmq_poll with both in and out events?  Ultimately
> in_event() needs to fire on the pgm_sender, which calls
> process_upstream(), which in turn processes the NAK.
> 
> On 23 March 2018 at 13:43, Montero, Antonio UTC CCS <
> Antonio.Montero at fs.utc.com> wrote:
> 
> > Understood, however that is not the behavior I am seeing. That is
> > likely to be the case for EPGM, since those are UDP packets, although
> > from my understanding, regardless of whether incoming data is
> > multicast or unicast, PGM binds to the any address and a specific
> > port. The kernel will pass all data received on an interface to any
> > listening socket as long as the destination port matches that of the
> > socket binding.
> > 
> > 
> > 
> > Now let’s put aside UDP for a sec: what about when using the pgm
> > transport? These are raw sockets, and any unicast NAKs are actually
> > sent from the remote SUB to the PUB unicast address and source port
> > (which is randomly selected at the time of creating the raw PGM PUB
> > socket). At that point the PUB socket should be the only one
> > listening on its own unicast address and source port. Correct?
> > 
> > 
> > 
> > *This is a snapshot of what my netstat -ln looks like at the moment,
> > with both PUB and SUB created and running on the same host.*
> > 
> > 
> > 
> > Proto  Recv-Q  Send-Q  Local Address                                Foreign Address  State
> > 
> > *Sockets associated with PUB:*
> > 
> > raw    164672  0       2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%622984:*      113
> > raw    164672  0       2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%623304:*      113
> > raw    0       0       ::%2147479552:113                            ::%622984:*      113
> > 
> > *Sockets associated with SUB:*
> > 
> > raw    164672  0       2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%622984:*      113
> > raw    164672  0       2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%623304:*      113
> > raw    0       0       ::%2147479552:113                            ::%622984:*      113
> > 
> > 
> > 
> > You will notice that the Recv-Q is full on both the PUB-related and
> > SUB-related send and router-alert send sockets.
> > 
> > These are my thoughts as to why they are full, which is not for the
> > same reason in the two cases:
> > 
> > 
> > 
> > For the SUB-associated sockets, the 2001 address ones are basically
> > used to send NAKs to the remote PUB:
> > 
> > These fill up as soon as a remote PUB starts sending multicast data.
> > I think the SUB send socket is connecting with the destination port
> > used to send multicast traffic. Whenever a SUB sends a NAK to the
> > PUB, I can see that the source port on the unicast packet matches the
> > destination port of the multicast group. However, this is not really
> > an issue: the SUB socket is configured in PGM as receive-only, so any
> > ODATA/SPM data received on its send socket is not processed. The SUB
> > socket is also getting the multicast data via the local binding
> > ::%2147479552:113, which, as seen, is emptying its queue fine, and I
> > could verify the node is receiving data at the application level.
> > 
> > 
> > 
> > For the PUB-associated sockets, the 2001 address ones are basically
> > used to send ODATA/SPM/RDATA/NCF to the remote SUB:
> > 
> > Even though its local binding ::%2147479552:113 is also receiving the
> > multicast data sent by the remote PUB, the ODATA/SPM data received is
> > thrown out, since the PUB socket is configured as send-only at the
> > PGM level.
> > 
> > However, its send-associated sockets do receive unicast NAKs from the
> > remote SUB and, as seen above, these are put on the socket’s Recv-Q.
> > The queue is full because the NAKs are not being processed by the PUB
> > socket.
> > 
> > 
> > 
> > Note: The exact same behavior is seen with EPGM. The only difference
> > is that none of the sockets' Recv-Q get full, because they are
> > emptied at the UDP layer upon arrival. However, I suspect that once
> > forwarded to the PGM layer, the PGM socket buffers would show the same
> > thing as netstat -ln above.
> > 
> > 
> > 
> > Even though I think it is redundant, and probably not a good idea, to
> > run the same code when creating either a ZMQ PUB or SUB socket, since
> > those socket types are essentially restricted to specific things like
> > send-only or receive-only, that does not appear to be the cause of
> > the issue here. I have read in some of the openpgm docs that the
> > application needs to call pgm_recv frequently, as that somehow drives
> > the pgm state machine. *However, my issue here is how to accomplish
> > that from the ZMQ API layer*; that is the whole point of using ZMQ in
> > my case in the first place.
> > 
> > 
> > 
> > Any thoughts? And thanks for the comments.
> > 
> > 
> > 
> > *From:* zeromq-dev [mailto:zeromq-dev-bounces at lists.zeromq.org] *On
> > Behalf Of *Steven McCoy
> > *Sent:* Friday, March 23, 2018 12:53 PM
> > *To:* ZeroMQ development list
> > *Subject:* Re: [zeromq-dev] [External] Re: A PGM/EPGM question
> > 
> > 
> > 
> > The problem is that the kernel will not fan out unicast UDP packets
> > to every listening socket, so it is probable that the wrong socket is
> > hearing the NAK.
> > 
> > 
> > 
> > On Fri, Mar 23, 2018 at 12:07 Montero, Antonio UTC CCS <
> > Antonio.Montero at fs.utc.com> wrote:
> > 
> > ZMQ’s implementation of the PUB socket type does not allow receive
> > calls to be made (zmq_recv is disabled), which is why I am trying to
> > figure out how one triggers ZMQ to call “pgm_recv” on the PUB socket
> > so that the PUB socket processes NAKs received from a remote SUB
> > socket.
> > 
> > I have tried querying the PUB socket state via ZMQ_EVENTS to trigger
> > the processing of any commands available for the socket, however that
> > does not seem to move the PGM state machine in terms of processing
> > NAKs.
> > 
> > 
> > 
> > I am running both a PUB and a SUB in the same application on the same
> > host. Although I see the same set of sockets being created at the PGM
> > level for both the PUB and SUB ZMQ sockets, which includes multiple
> > sockets binding to the same port, this does not appear to cause any
> > issues in terms of my SUB socket being able to receive multicast
> > messages from a remote PUB and respond with unicast NAKs when data
> > loss is detected.
> > 
> > 
> > 
> > Any ideas as to how a user should get the ZMQ lib to trigger NAK
> > processing for a PUB socket using either the pgm or epgm transport?
> > 
> > 
> > 
> > Thanks,
> > 
> > Antonio Montero.
> > 
> > *From:* zeromq-dev [mailto:zeromq-dev-bounces at lists.zeromq.org] *On
> > Behalf Of *Steven McCoy
> > *Sent:* Friday, March 23, 2018 9:55 AM
> > *To:* ZeroMQ development list
> > *Subject:* [External] Re: [zeromq-dev] A PGM/EPGM question
> > 
> > 
> > 
> > You should check that the PUB socket has a loop that is processing
> > the incoming NAK requests; this is usually recv call based.  The
> > symptoms indicate that the protocol is operating TX-only.
> > 
> > 
> > 
> > —
> > 
> > Steve-o
> > 
> > 
> > 
> > On Wed, Mar 21, 2018 at 19:50 Montero, Antonio UTC CCS <
> > Antonio.Montero at fs.utc.com> wrote:
> > 
> > Hello,
> > 
> > I am having a bit of a hard time getting a ZMQ PUB socket to react to
> > PGM NAKs, which means at this point I am not able to recover lost
> > packets.
> > 
> > I have tried both protocols (pgm and epgm) and still get the same
> > result.
> > 
> > 
> > 
> > I have a setup where I create both a PUB and a SUB socket, in that
> > order, in the same ZMQ context running on the same host and connected
> > to the same IPv6 multicast address and port.
> > 
> > I have N nodes and each node has a PUB and SUB. All N nodes send
> > messages
> > asynchronously and all N nodes receive all messages. My multicast
> > network
> > is working fine whether I use pgm or epgm and all N nodes
> > communicate with
> > each other over IPv6 multicast.
> > 
> > The issue I am having is that when packet loss occurs, a remote SUB
> > sends a unicast NAK back to the source PUB, however I am not seeing
> > any NCF or RDATA being sent by the source PUB. I have verified that
> > the packets in question are in fact still in the Tx Window as
> > reported by the SPMs being sent by the source PUB. I have ongoing
> > periodic traffic which triggers a send and a receive on the PUB and
> > SUB sockets respectively, and I am clearing out the ZMQ_EVENTS after
> > every send and/or receive. I also have a polling thread running every
> > 150ms to check for ZMQ_EVENTS on both PUB and SUB.
> > 
> > 
> > 
> > Nothing seems to work in terms of triggering the PUB to react and
> > process the NAKs received from the remote SUB. Looking at the code a
> > bit, I see the function zmq::pgm_socket_t::process_upstream but can’t
> > tell if and how it is being triggered. From my perspective it does
> > not appear to be.
> > 
> > 
> > 
> > Any help or direction would be appreciated. Thanks.
> > 
> > 
> > 
> > --
> > 
> > Antonio
> > 
> > 
> > 
> > _______________________________________________
> > zeromq-dev mailing list
> > zeromq-dev at lists.zeromq.org
> > https://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > 
> 
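
On the point quoted above that a unicast UDP packet is not fanned out to
every socket sharing a port: this can be checked with a small standalone
script. The sketch below is only an approximation of the epgm situation.
It uses plain IPv4 loopback UDP rather than IPv6 or OpenPGM, the function
names are made up, and it says nothing about the raw sockets used by the
pgm transport. It binds several sockets to the same address and port,
sends one unicast datagram, and counts how many of them receive it.

```python
import socket


def bind_shared_udp(port):
    """Bind a UDP socket to 127.0.0.1:port, allowing other sockets to share it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if hasattr(socket, "SO_REUSEPORT"):
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("127.0.0.1", port))
    sock.settimeout(0.5)
    return sock


def count_receivers(n_listeners=2):
    """Send one unicast datagram and count how many sharing sockets get it."""
    first = bind_shared_udp(0)  # port 0: let the kernel pick a free port
    port = first.getsockname()[1]
    listeners = [first] + [bind_shared_udp(port) for _ in range(n_listeners - 1)]
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(b"NAK", ("127.0.0.1", port))
        received = 0
        for sock in listeners:
            try:
                sock.recvfrom(64)
                received += 1
            except socket.timeout:
                pass  # this socket never saw the datagram
        return received
    finally:
        sender.close()
        for sock in listeners:
            sock.close()


if __name__ == "__main__":
    print(count_receivers(2))
```

On a typical Linux host I would expect this to report a single receiver,
which is consistent with the idea that a NAK arriving over UDP can land
on a socket other than the one belonging to the sender.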

-- 
Kind regards,
Luca Boccassi