Re: CSIT failing perf tests for week 31 (07/25 – 07/31)


Viliam Luc -X (vluc - PANTHEON TECH SRO at Cisco)
 

Update:

 

1) PM to apply a workaround until Fan and the Intel team fix the issue. The issue appeared after a change in the compatibility matrix in the CSIT 22.06 environment.

2) Frequency is now higher. VP to fix.

3) No progress since last week. VP to fix.

4, 5, 6, 7, 8 – long-term issues with no permanent fix.

 

Best regards,

Viliam Luc

 

From: Viliam Luc -X (vluc - PANTHEON TECH SRO at Cisco)
Sent: 01 August 2022 11:13
To: csit-report@...
Subject: CSIT failing perf tests for week 31 (07/25 – 07/31)

 

=====SUMMARY=====

 

===NEW ISSUES===

 

===OUTSTANDING UNFIXED===

 

1) error: 2n-clx, 3n-skx: First port is down when running traffic profile

   rca:

   test: 10Ge2P1X710

   frequency: all

   testbed: 2n-clx, 3n-skx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-3n-skx/204/log.html.gz#s1-s1-s1-s2-s1

            https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-skx/1602/log.html.gz#s1-s1-s1-s2-s1-t1

 

NOTE: The issue was previously observed on 2n-clx and fixed with a workaround. It has now (7/3) appeared on 3n-skx, which uses the same X710 NIC.

TICKET: https://jira.fd.io/browse/CSIT-1831

 

2) error: 2n-dnv: sporadic 1518B tput tests failing to establish required sessions

   rca:

   test: 1518B tput

   frequency: all

   testbeds: 2n-dnv

   examples: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1193/log.html.gz#s1-s1-s1-s1-s8-t6

             https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1195/log.html.gz#s1-s1-s1-s1-s7-t4

             https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1205/log.html.gz#s1-s1-s1-s1-s9-t6

 

TICKET: https://jira.fd.io/browse/CSIT-1850

 

3) error: 3n-icx, 3n-skx: all 1518B AVF crypto tests failed with no traffic, all IMIX AVF crypto with excessive packet loss

   rca:

   test: all AVF crypto

   frequency: sporadic

   testbed: NDRPDR: 3n-skx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-skx/1592/log.html.gz#s1-s1-s1-s2-s1-t1

            https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-3n-skx/206/log.html.gz#s1-s1-s1-s1-s1-t1

            https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/95/log.html.gz#s1-s1-s1-s1-s1-t1

                                          

TICKET: https://jira.fd.io/browse/CSIT-1827

NOTE: After a while, the tests failed in MRR daily runs. Only 1518B tests failed; IMIX passed.

NOTE: MRR on 3n-icx always fails (not sporadic).

 

4) error: 3n-tsh, 3n-alt, 2n-clx testbed (Taishan, Altra): NDR tests failing from time to time.

   rca:

   tests: Crypto, Ip4, L2, Srv6, Vm Vhost (all packet sizes, all core configurations affected)

   frequency: medium

   testbed: 3n-tsh, 3n-alt, 2n-clx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/151/log.html.gz#s1-s1-s1-s2-s56-t3

 

TICKET: https://jira.fd.io/browse/CSIT-1804

 

5) error: sporadic NDR packet loss

   rca:

   test: af-xdp multicore tests

   frequency: low

   testbed: 2n-skx, 2n-clx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/140/log.html.gz#s1-s1-s1-s2-s10-t2

            https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-skx/202/log.html.gz#s1-s1-s1-s2-s4-t3

 

TICKET: https://jira.fd.io/browse/CSIT-1802

NOTE: Not observed this week.

 

6) error: T-Rex STL runtime error

   rca: VPP code - X557 speed_capability set to 1GE instead of 10GE

   test: all

   frequency: sporadic

   testbed: 2n-dnv and 3n-dnv

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-dnv/1215/log.html.gz#s1-s1-s1-s1-s1-t3

            https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1205/log.html.gz#s1-s1-s1-s2-s2-t3

 

TODO: VPP to fix speed_capability.

TICKET: https://jira.fd.io/browse/VPP-2010

 

7) error: failed creating AVF interface

   rca: issue in Intel FVL driver

   test: multicore AVF

   frequency: sporadic

   testbed: all testbeds

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/95/log.html.gz#s1-s1-s1-s5-s3-t2

 

NOTE: A long-standing issue without a permanent fix.

TICKET: multicore AVF tests are failing when trying to create interface, https://jira.fd.io/browse/CSIT-1782

 

8) error: Not all DET44 sessions have been established: 4128767 != 4128768

   rca: unknown

   test: nat44det udp 4m and 16m (64k and 1m are ok)

   frequency: very sporadic. It failed in 1 out of 8 runs.

   testbed: 2n-zn2, 2n-skx, 2n-icx, 2n-dnv

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-icx/84/log.html.gz#s1-s1-s1-s2-s29-t2

            https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1190/log.html.gz#s1-s1-s1-s1-s9-t4

            https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-clx/1145/log.html.gz#s1-s1-s1-s2-s48-t2

            https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-zn2/580/log.html.gz#s1-s1-s1-s2-s22-t3

 

TICKET: https://jira.fd.io/browse/CSIT-1795

NOTE: First observed on 2n-icx during week 28.

NOTE: First observed on 2n-dnv during week 29.

 

===OUTSTANDING FIXED===

9) error: 3n-alt: failed to set port1 up

   rca:

   test: 1518B-2c-ethip4ipsec10000tnlsw-ip4base-int-aes128gcm-mrr

   frequency: all

   testbed: 3n-alt

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-alt/55/log.html.gz#s1-s1-s1-s1-s2-t2

 

NOTE: Only 1 test is failing, but it has failed in every run since Wednesday (07/19).

 

10) error: 2n-skx, 2n-clx, 2n-icx: all 1518B TCP tput tests failing with high packet loss

   rca:

   test: 1518B TCP tput

   frequency: all

   testbed: 2n-skx, 2n-clx, 2n-icx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-skx/204/log.html.gz#s1-s1-s1-s2-s17-t1

            https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/149/log.html.gz#s1-s1-s1-s2-s34-t1

 

TICKET: https://jira.fd.io/browse/CSIT-1846

 

11) error: 2n-skx: UDP 16m tput tests fail to create all sessions

   rca:

   test: UDP 16m TPUT

   frequency: all

   testbed: 2n-skx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2206-2n-skx/8/log.html.gz#s1-s1-s1-s1-s12-t1

                                          

 

TICKET: https://jira.fd.io/browse/CSIT-1849

NOTE: Not the same issue as https://jira.fd.io/browse/CSIT-1795, where only 1 session is not created.

NOTE: This is not observed in daily trending.

 

===FIXED ISSUES===

 

Best regards,

Viliam Luc
