CSIT failing perf tests for week 41 (09/26 – 10/02)


Viliam Luc -X (vluc - PANTHEON TECH SRO at Cisco) <vluc@...>
 

=====SUMMARY=====

New issues - 0

Unfixed issues - 15

Fixed issues - 3
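
The counts above can be cross-checked against the sections that follow. A minimal sketch, assuming the report keeps its current layout (`===SECTION===` headers, entries starting with `N) error:` or `#) error:`). `count_issues` is a hypothetical helper for illustration and not part of the CSIT tooling:

```python
import re
from collections import Counter

# Section headers look like "===OUTSTANDING UNFIXED===".
SECTION_RE = re.compile(r"^=+\s*([A-Z ]+?)\s*=+$")
# Issue entries look like "1) error: ..." or "#) error: ...".
ISSUE_RE = re.compile(r"^(?:\d+|#)\)\s*error:")


def count_issues(report_text):
    """Count issue entries found under each ===SECTION=== header."""
    counts = Counter()
    section = None
    for raw in report_text.splitlines():
        line = raw.strip()
        header = SECTION_RE.match(line)
        if header:
            section = header.group(1)
            counts[section] += 0  # register the section even if it stays empty
            continue
        if section is not None and ISSUE_RE.match(line):
            counts[section] += 1
    return dict(counts)
```

Run on this report, it should give 15 entries under OUTSTANDING UNFIXED and 3 under FIXED ISSUES, with the 2n-aws entry counted separately under OUTSTANDING FIXED.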

 

===NEW ISSUES===

 

===OUTSTANDING UNFIXED===

 

1) error: ALL ARM testbeds are failing with VPP failed to start

   rca:

   test: all

   frequency: all

   testbed: 2n-tx2, 3n-alt, 3n-tsh

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-tx2/404/log.html.gz

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-alt/99/log.html.gz

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-tsh/693/log.html.gz

                                                           

TICKET: https://jira.fd.io/browse/CSIT-1857

 

2) error: all testbeds DPDK l3fwd tests have wrong configuration

   rca:

   test: DPDK l3fwd

   frequency: all

   testbed: all

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-dpdk-perf-mrr-weekly-master-3n-alt/27/log.html.gz#s1-s1-s1-s2-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-dpdk-perf-verify-master-3n-icx/3/log.html.gz#s1-s1-s1-s1-t1-k2-k5-k14

 

TICKET: https://jira.fd.io/browse/CSIT-1858

 

NOTE: tests are passing but the traffic returns back to TG instead of being forwarded to DUT2.

 

3) error: 3n-icx: all NFV density-DCR memif-Chain ipsec tests failing with no traffic forwarded

   rca:

   test: all chain ipsec

   frequency: all

   testbed: 3n-icx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-weekly-master-3n-icx/35/log.html.gz#s1-s1-s1-s1-s1-s1-s1-t1

 

NOTE: This week 3n-icx didn't run due to the low cadence of runs caused by iterative tests for RC1.

TICKET: https://jira.fd.io/browse/CSIT-1865

 

4) error: 2n-clx: half of the packets lost on PDR tests (re-opened)

   rca:

   test: e810Cq ip4base, ip6base

   frequency: sporadic

   testbed: 2n-clx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/156/log.html.gz#s1-s1-s1-s2-s8-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/158/log.html.gz#s1-s1-s1-s4-s4-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/159/log.html.gz#s1-s1-s1-s2-s8-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/160/log.html.gz#s1-s1-s1-s2-s8-t1

 

NOTE: week37: issue appeared

NOTE: week39: happened again but only on ip6base e810Cq - build #158.

NOTE: week40: happened only on ip4base e810Cq, but 99.9% of packets are lost now. Build #159.

NOTE: week41: This week the issue happened only on ip4base e810Cq.

 

TICKET: https://jira.fd.io/browse/CSIT-1864
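
The notes above track this issue by build number, which can be read directly from the example links. A small sketch, assuming the `.../<job>/<build>/<file>#<test-anchor>` layout seen in the links of this report. `parse_log_url` is a hypothetical helper, not an existing CSIT function:

```python
from urllib.parse import urlsplit


def parse_log_url(url):
    """Split a CSIT log link into job name, build number and test anchor."""
    parts = urlsplit(url)
    # Path layout assumed from this report: /<jenkins-host>/<job>/<build>/<file>
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 4 or not segments[-2].isdigit():
        raise ValueError(f"unexpected log URL layout: {url}")
    return {
        "job": segments[-3],
        "build": int(segments[-2]),
        "anchor": parts.fragment or None,  # None for links without "#..."
    }
```

Sorting the parsed links by `build` gives the order in which the failures were seen across weeks.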

 

5) error: 3n-alt: testpmd tests fail with no traffic

   rca:

   test: testpmd

   frequency: all

   testbed: 3n-alt

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-dpdk-perf-mrr-weekly-master-3n-alt/26/log.html.gz#s1-s1-s1-s1

 

TICKET: https://jira.fd.io/browse/CSIT-1848

 

6) error: 3n-icx: IP4 tunnels GTPU and WIREGUARD tests failing with ~1700 packets lost

   rca:

   test: IP4 tunnels with E810Xxv nic

   frequency: sporadic

   testbed: 3n-icx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-3n-icx/39/log.html.gz#s1-s1-s1-s1-s1-t1

 

NOTE: TRex doesn't support E810Xxv (Columbiaville).

NOTE: Job failed with another issue so this wasn't observed this week.

TICKET: https://jira.fd.io/browse/CSIT-1862

 

7) error: 2n-tx2, 3n-tsh: Failed to create container DUT1_CNF1

   rca:

   test: 2n-tx2: all Container Memif

         3n-tsh: SRv6_PROXY

   frequency: all

   testbed: 2n-tx2, 3n-tsh

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-tx2/386/log.html.gz#s1-s1-s1-s1-s1-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-tsh/679/log.html.gz#s1-s1-s1-s6-s4-t1

  

NOTE: 2n-tx2 started with build #383 on 31st of August. Build #382 on 30th of August passed.

NOTE: 3n-tsh started with build #674 on 31st of August. Build #673 on 30th of August passed.

NOTE: ARM testbeds still failing so this wasn't observed.

TICKET: https://jira.fd.io/browse/CSIT-1860

 

8) error: 3n-icx: All 1000Tnlsw Fixtnlip non AVF tests failing. 1518B with no traffic forwarded, IMIX with excessive packet loss

   rca:

   test: 1518B crypto

   frequency: sporadic

   testbed: 3n-icx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/135/log.html.gz#s1-s1-s1-s1-s1-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-3n-icx/39/log.html.gz#s1-s1-s1-s1-s1-t1

 

TICKET: https://jira.fd.io/browse/CSIT-1844

 

9) error: 2n-dnv: sporadic 1518B tput tests failing to establish required sessions

   rca:

   test: 1518B tput

   frequency: sporadic

   testbed: 2n-dnv

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1247/log.html.gz#s1-s1-s1-s1-s9-t6

 

TICKET: https://jira.fd.io/browse/CSIT-1850

NOTE: In build #1240 all tput tests passed.

 

10) error: 3n-icx, 3n-skx: all 1518B AVF crypto tests failed with no traffic, all IMIX AVF crypto with excessive packet loss

   rca:

   test: all AVF crypto

   frequency: sporadic

   testbed: 3n-skx, 3n-icx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/133/log.html.gz#s1-s1-s1-s1-s4-t1

                                          

NOTE: This wasn't observed this week.

TICKET: https://jira.fd.io/browse/CSIT-1827

 

11) error: sporadic packet loss in NDR tests

   rca:

   test: af-xdp multicore tests

   frequency: low

   testbed: 2n-skx, 2n-clx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-skx/202/log.html.gz#s1-s1-s1-s2-s4-t3

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/152/log.html.gz#s1-s1-s1-s5-s12-t3

 

TICKET: https://jira.fd.io/browse/CSIT-1802

 

12) error: 3n-tsh, 3n-alt, 2n-clx testbed (Taishan, Altra, Cascade-lake): NDR tests failing from time to time.

   rca:

   test: Crypto, Ip4, L2, Srv6, Vm Vhost (all packet sizes, all core configurations affected)

   frequency: medium

   testbed: 3n-tsh, 3n-alt, 2n-clx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/151/log.html.gz#s1-s1-s1-s2-s56-t3

                                          

TICKET: https://jira.fd.io/browse/CSIT-1804

 

13) error: TRex STL runtime error

   rca: VPP code - X557 speed_capability set 1GE instead of 10GE

   test: sporadic

   frequency: all

   testbed: 2n-dnv and 3n-dnv

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1247/log.html.gz#s1-s1-s1-s3-s1-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-dnv/1257/log.html.gz#s1-s1-s1-s1-s3-t2

 

TODO: VPP to fix speed_capability.

TICKET: https://jira.fd.io/browse/VPP-2010

 

14) error: failed creating AVF interface

   rca: issue in Intel FVL driver

   test: multicore AVF

   frequency: sporadic

   testbed: all testbeds

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-clx/1194/log.html.gz#s1-s1-s1-s4-s17-t2

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-zn2/622/log.html.gz#s1-s1-s1-s2-s12-t3

 

NOTE: A long-standing issue without a final permanent fix.

TICKET: multicore AVF tests are failing when trying to create interface, https://jira.fd.io/browse/CSIT-1782

 

15) error: Not all DET44 sessions have been established: 4128767 != 4128768

   rca: unknown

   test: nat44det udp 4m and 16m (64k and 1m are ok)

   frequency: very sporadic. It failed in 1 out of 8 runs.

   testbed: 2n-zn2, 2n-skx, 2n-icx, 2n-clx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-icx/112/log.html.gz#s1-s1-s1-s2-s35-t2

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-clx/1145/log.html.gz#s1-s1-s1-s2-s48-t2

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-zn2/620/log.html.gz#s1-s1-s1-s2-s21-t2

 

TICKET: https://jira.fd.io/browse/CSIT-1795
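
The numbers in the error message show how small the shortfall is: a single session out of roughly 4.1 million. The sketch below works this out. The split of the expected count into 65536 hosts times 63 ports per host is an assumption about how the 4m scale is built, not something stated in this report:

```python
EXPECTED_SESSIONS = 4128768     # right-hand side of the error message
ESTABLISHED_SESSIONS = 4128767  # left-hand side of the error message

# Assumption: the 4m scale is 65536 inside hosts x 63 ports per host.
ASSUMED_HOSTS = 65536
ASSUMED_PORTS_PER_HOST = 63


def session_shortfall(established, expected):
    """Return (missing sessions, missing fraction of the expected total)."""
    missing = expected - established
    return missing, missing / expected


if __name__ == "__main__":
    # The assumed factors do multiply out to the expected session count.
    assert ASSUMED_HOSTS * ASSUMED_PORTS_PER_HOST == EXPECTED_SESSIONS
    missing, fraction = session_shortfall(ESTABLISHED_SESSIONS, EXPECTED_SESSIONS)
    print(f"missing sessions: {missing} ({fraction:.2e} of expected)")
```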

 

===OUTSTANDING FIXED===

#) error: 2n-aws all tests failing with InvalidPlacementGroup

   rca:

   test: all

   frequency: all

   testbed: 2n-aws

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-weekly-master-2n-aws/64/console.log.gz

  

   message:

   Error: error creating EC2 Placement Group (csit-2n-aws-c5n-csit-vpp-perf-mrr-weekly-master-2n-aws-pg): InvalidPlacementGroup.Duplicate: The placement group 'csit-2n-aws-c5n-csit-vpp-perf-mrr-weekly-master-2n-aws-pg' already exists.

              status code: 400, request id: e9120991-2048-4f8b-89d2-4cd0868e5e0e

 

TICKET: https://jira.fd.io/browse/CSIT-1866

 

===FIXED ISSUES===

#) error: 2n-icx: all tests failed with parent suite setup time-out

   rca: SSHTimeout: Timeout exception during execution of command: fgrep docker /proc/1/cgroup

   test: all

   frequency: all

   testbed: 2n-icx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-icx/39/console.log.gz

 

TICKET: https://jira.fd.io/browse/CSIT-1863
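
The command that timed out, `fgrep docker /proc/1/cgroup`, is a common way to check whether the target runs inside a Docker container. A rough Python equivalent, for illustration only; the function name is hypothetical and this is not the CSIT implementation:

```python
def runs_in_docker(cgroup_path="/proc/1/cgroup"):
    """Return True if any line of the cgroup file mentions 'docker'."""
    try:
        with open(cgroup_path, encoding="utf-8") as handle:
            return any("docker" in line for line in handle)
    except OSError:
        # File missing or unreadable (for example on non-Linux hosts).
        return False
```

The check itself is a single file read, so a timeout here more likely points at the SSH session to the testbed than at the command.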

 

#) error: 2n-clx, 2n-zn2, 2n-icx: QEMU NF failed to run on VM VHOST tests

   rca: svm_region_map(mmap open): No such file or directory

   test: MRR daily: 2n-clx: VM VHOST vppl2xc

         MRR weekly: 2n-clx, 2n-icx: VM VHOST chain vppip4

         NDRPDR: VM VHOST vppip4

   frequency: all

   testbed: 2n-clx, 2n-zn2, 2n-icx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-clx/1194/log.html.gz#s1-s1-s1-s6-s2-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-icx/143/log.html.gz#s1-s1-s1-s6-s2-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-zn2/622/log.html.gz#s1-s1-s1-s6-s2-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-weekly-master-2n-clx/161/log.html.gz#s1-s1-s1-s1-s2-s1-s1-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-weekly-master-2n-icx/40/log.html.gz#s1-s1-s1-s1-s2-s1-s1-t1

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/160/log.html.gz#s1-s1-s1-s6-s2

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-icx/41/log.html.gz#s1-s1-s1-s6-s2-t1

 

NOTE: 2n-zn2 started with build #603 on 31st of August. Build #602 on 30th of August passed.

NOTE: fix: 37254: fix(qemu): VM tests | https://gerrit.fd.io/r/c/csit/+/37254 (applied to oper branch on 10/03)

 

TICKET: https://jira.fd.io/browse/CSIT-1859

 

#) error: 2n-clx: X710 NICs interfere with TRex

   rca: i40e interface 0000:18:00.0 is under Linux and will interfere with TRex interface 0000:18:00.2

   test: X710 (ip4base, ip6base, l2bd)

   frequency: all

   testbed: 2n-clx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-clx/1192/log.html.gz#s1-s1-s1-s2-s18

                  https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/159/log.html.gz#s1-s1-s1-s2-s18

 

TICKET: https://jira.fd.io/browse/CSIT-1861
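
The rca message appears to mean that one PCI function of the NIC (0000:18:00.0) is still bound to the kernel `i40e` driver while TRex uses a sibling function (0000:18:00.2). On Linux the bound driver can be read from the `driver` symlink under `/sys/bus/pci/devices/<address>/`. A minimal sketch; the function name and the `sysfs_root` parameter are hypothetical, the latter added only so the function can be tried without real hardware:

```python
import os


def bound_driver(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Return the kernel driver name bound to a PCI device, or None if unbound."""
    link = os.path.join(sysfs_root, pci_address, "driver")
    if not os.path.islink(link):
        # No "driver" symlink: the device is absent or not bound to a driver.
        return None
    return os.path.basename(os.readlink(link))
```

On an affected testbed, `bound_driver("0000:18:00.0")` would be expected to return `"i40e"`.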

 

Best regards,

Viliam Luc
