[vpp-dev] vpp router plugin threads? (vpp + router + netlink + FRRouting)


Jaeb
 

Hi Brian,


I've successfully built the router plugin on 18.10 and "19.01".


What errors are you encountering when you attempt to build it?


Kind regards,


John Biscevic
Systems Architect,  Bahnhof AB
Mobile:  +46 76 111 01 24
E-mail:  john.biscevic@...

From: vpp-dev@... <vpp-dev@...> on behalf of Brian Dickson <brian.peter.dickson@...>
Sent: Wednesday, December 12, 2018 4:17 AM
To: vpp-dev@...
Cc: vppsb-dev@...
Subject: [vpp-dev] vpp router plugin threads? (vpp + router + netlink + FRRouting)
 
Greetings, VPP folks.

I am continuing to work on my vpp + router-plugin (+FRRouting) set-up.

I have things mostly working with very large routing tables (sourced from multiple BGP peers), but I am running into challenges when I enable multiple worker threads to increase overall VPP forwarding performance.

When using just a single thread, the BGP peers take a long time to sync up, but the setup is relatively stable. Forwarding performance on a 10G NIC (i40e driver, vfio-pci selected) is pretty decent, but I am interested in finding ways to improve it (and in getting to the point where I can also use a 40G card in the system). The limit seems to be packets per second, maxing out at about 11 Mpps (for reference, 10GbE line rate with 64-byte frames is about 14.88 Mpps).

The problem is, when I try to use worker threads, I start running into issues with rtnetlink buffers, and BGP, ICMP, ARP, etc, all become "flaky".

My suspicion is that it has something to do with which thread(s) handle the netlink traffic, and which thread(s) handle the TCP port 179 (BGP) traffic, which needs to go via the tap-inject path to the kernel, and then to the BGP speaking application (FRR sub-unit "bgpd").

Is there anyone who can provide information or advice on this issue?

NB: the flakiness is in a COMPLETELY unloaded environment - no other traffic is being handled, nothing else is consuming CPU cycles. It is just the BGP traffic itself plus related stuff (ARP) and any diagnostic traffic I use (ping).

Is this a case where I need to adjust RSS to direct incoming packets to the right subset of cores, and do I also need to steer particular traffic (TCP 179) to the main core? Do I need to ensure anything else, like using a separate core (with the core affinity set via taskset -c) for my BGP speaker?

Any suggestions or advice would be greatly appreciated.

Also, any updates on bringing netlink and router plugins into the main vppsb tree? Building them on anything other than 18.07 just doesn't work for me, and even on 18.07 is rather brittle, and I'm not 100% sure about the build steps, which actually involve passing CFLAGS in to make, which suggests something isn't quite right...

Thanks in advance,
Brian


Brian Dickson <brian.peter.dickson@...>
 

FYI:

I have tried to build these in the last couple of days.

The current versions of vppsb/netlink and vppsb/router do not seem to compile against any VPP version.

I'm not sure who is maintaining the code or updating the repo, but it doesn't appear to be using git branches at all, and that breaks things for everyone.
Please don't do that.
If you are updating something other people use, please be polite and keep your edits on a branch until they compile, have been reviewed, and are merged (regardless of who reviews them).

(The git HEAD entries are 3a3b77f27b6d1469c5e1628cb508e193df20d6a0 and 9791ab9fa07347fd063a55dc44cc1b0b67ee2292 for the bad and good versions respectively.)

The currently included version of DPDK also seems to break unless the "make" target for the dpdk-install-dep (sp?) step is run separately first.

The older version of vppsb/router has a single bug, easily fixed, but it must be fixed for the router plugin to build.
(This error points to the fix: router/tap_inject_netlink.c:163:43: error: too many arguments to function 'vnet_unset_ip6_ethernet_neighbor')

If the older version of vppsb is used, AND the dpdk dep thing is installed first, AND the minor bug is fixed, 18.07 and 18.10 do compile.

I have found it necessary to pass some CFLAGS and LDFLAGS when building netlink-install and router-install:

make CFLAGS="-fPIC -std=gnu99 -I. -I/usr/local/src/vpp/netlink -I/usr/local/src/vpp/router -I/usr/local/src/vpp/router/router" LDFLAGS="-L/usr/local/src/vpp/build-root/install-vpp-native/netlink/lib64" V=0 PLATFORM=vpp TAG=vpp netlink-install router-install



It'd be nice if these relatively minor things could be cleaned up, independent of any actual development going on around these two plugins.

Thanks,
Brian




Brian Dickson <brian.peter.dickson@...>
 



On Wed, Dec 12, 2018 at 4:12 AM John Biscevic <John.Biscevic@...> wrote:

Hi Brian,


I've successfully built the router plugin on 18.10 and "19.01"


What errors are you encountering when you attempt to build it?

When following the instructions (the make step described in router/README.md, right after all the ln -sf steps), I get:

@@@@ Building netlink in /usr/local/src/vpp/build-root/build-vpp_debug-native/netlink @@@@
make[1]: Entering directory `/usr/local/src/vpp/build-root/build-vpp_debug-native/netlink'
  CC       librtnl/netns.lo
  CC       librtnl/rtnl.lo
  CC       librtnl/mapper.lo
  CC       test/test.lo
/usr/local/src/vpp/build-data/../netlink/librtnl/rtnl.c: In function 'rtnl_socket_open':
/usr/local/src/vpp/build-data/../netlink/librtnl/rtnl.c:269:39: error: 'RTNLGRP_MPLS_ROUTE' undeclared (first use in this function)
     grpmask(RTNLGRP_NOTIFY) | grpmask(RTNLGRP_MPLS_ROUTE),
                                       ^
/usr/local/src/vpp/build-data/../netlink/librtnl/rtnl.c:269:39: note: each undeclared identifier is reported only once for each function it appears in
/usr/local/src/vpp/build-data/../netlink/librtnl/netns.c:69:5: error: 'RTA_VIA' undeclared here (not in a function)
   _(RTA_VIA, via, 1)                            \
     ^
/usr/local/src/vpp/build-data/../netlink/librtnl/netns.c:82:13: note: in definition of macro '_'
     .type = t, .unique = u,                     \
             ^
/usr/local/src/vpp/build-data/../netlink/librtnl/netns.c:86:3: note: in expansion of macro 'ns_foreach_rta'
   ns_foreach_rta
   ^
make[1]: *** [librtnl/rtnl.lo] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: *** [librtnl/netns.lo] Error 1
make[1]: Leaving directory `/usr/local/src/vpp/build-root/build-vpp_debug-native/netlink'
make: *** [netlink-build] Error 2



Is this something you encountered? How did you resolve it?

Brian
 




Ni, Hongjun
 

Hi Burt and Brian,

 

VPP switched to cmake for its build system after release 18.10, but vppsb still uses make.

We tried porting the router plugin into VPP itself, and it works, but it was not accepted by the VPP community.

https://gerrit.fd.io/r/#/c/15062/

 

You will need to rework VPPSB's build system to match VPP's cmake in order to make the router plugin work.

 

Thanks,

Hongjun

 

From: vpp-dev@... [mailto:vpp-dev@...] On Behalf Of Burt Silverman
Sent: Friday, December 14, 2018 6:13 AM
To: brian.peter.dickson@...
Cc: John.Biscevic@...; vpp-dev <vpp-dev@...>; vppsb-dev@...
Subject: Re: [vpp-dev] vpp router plugin threads? (vpp + router + netlink + FRRouting)

 

I just tried building on master of both vpp and vppsb. I just had to add #include <sys/uio.h> to tap_inject_node.c. It seems like that bug has been around a long time. I used the directions in vppsb/router/README.md.

 

Burt

