CSIT failing perf tests for week 29 (07/11 – 07/17)


Viliam Luc -X (vluc - PANTHEON TECH SRO at Cisco)
 

=====SUMMARY=====

 

===NEW ISSUES===

 

===OUTSTANDING UNFIXED===

1) 2n-skx: UDP 16m tput tests fail to create all sessions

   rca:

   test: UDP 16m TPUT

   frequency: all

   testbed: 2n-skx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2206-2n-skx/8/log.html.gz#s1-s1-s1-s1-s12-t1

                                          

 

TICKET: https://jira.fd.io/browse/CSIT-1849

NOTE: Not the same issue as https://jira.fd.io/browse/CSIT-1795, where only 1 session is not created.

 

2) error: 2n-clx, 3n-skx: First port is down when running traffic profile

   rca:

   test: 10Ge2P1X710

   frequency: all

   testbed: 2n-clx, 3n-skx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-skx/1590/log.html.gz#s1-s1-s1-s2-s1-t1

                                           https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-3n-skx/204/log.html.gz#s1-s1-s1-s2-s1

 

NOTE: The issue was first observed on 2n-clx and fixed with a workaround. It has now (7/3) appeared on 3n-skx, with the same X710 NIC.

TICKET: https://jira.fd.io/browse/CSIT-1831

 

3) error: 2n-dnv: sporadic 1518B tput tests failing to establish required sessions

   rca:

   test: 1518B tput

   frequency: all

   testbeds: 2n-dnv

   examples: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1193/log.html.gz#s1-s1-s1-s1-s8-t6

                                           https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1195/log.html.gz#s1-s1-s1-s1-s7-t4

 

TICKET: https://jira.fd.io/browse/CSIT-1850

 

4) error: 2n-skx, 2n-clx, 2n-icx: ALL 1518B TCP tput tests failing with high packet loss

   rca:

   test: 1518B TCP tput

   frequency: all

   testbed: 2n-skx, 2n-clx, 2n-icx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-skx/202/log.html.gz#s1-s1-s1-s2-s17-t1

                                           https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/149/log.html.gz#s1-s1-s1-s2-s34-t1

 

TICKET: https://jira.fd.io/browse/CSIT-1846

 

5) error: all tcp tput tests failing

   rca:

   test: 100b tcp tput

   frequency: all

   testbed: 2n-clx, 2n-icx, 2n-skx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-skx/1743/log.html.gz#s1-s1-s1-s2-s17-t1

                                           https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-skx/201/log.html.gz#s1-s1-s1-s2-s17-t1

                                           https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-icx/85/log.html.gz#s1-s1-s1-s2-s15-t1

 

NOTE: 'reset' TRex is not valid while disconnected.

NOTE: On weekly-2n-skx/201 the issue is different: traffic goes through, but packets are lost.

TICKET: https://jira.fd.io/browse/CSIT-1830

 

6) error: 3n-icx, 3n-skx: all 1518B AVF crypto tests failed with no traffic, all IMIX AVF crypto with excessive packet loss

   rca:

   test: all AVF crypto

   frequency: sporadic

   testbed: NDRPDR: 3n-skx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-skx/1592/log.html.gz#s1-s1-s1-s2-s1-t1

                                           https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-3n-skx/205/log.html.gz#s1-s1-s1-s1-s1-t1

                                          

TICKET: https://jira.fd.io/browse/CSIT-1827

NOTE: After a while, failures appeared on MRR daily tests. Only 1518B tests failed; IMIX passed.

NOTE: Interestingly, MRR is now failing with higher frequency.

 

7) error: NDR sporadic packet loss

   rca:

   test: af-xdp multicore tests

   frequency: low

   testbed: 2n-skx, 2n-clx

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/140/log.html.gz#s1-s1-s1-s2-s10-t2

                                           https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-skx/202/log.html.gz#s1-s1-s1-s2-s4-t3

 

TICKET: https://jira.fd.io/browse/CSIT-1802

 

8) error: T-Rex STL runtime error

   rca: VPP code - X557 speed_capability set to 1GE instead of 10GE

   test: sporadic

   frequency: all

   testbed: 2n-dnv and 3n-dnv

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-dnv/1205/log.html.gz#s1-s1-s1-s1-s1-t1

                                           https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1195/log.html.gz#s1-s1-s1-s1-s2-t1

 

TODO: VPP to fix speed_capability.

TICKET: https://jira.fd.io/browse/VPP-2010

 

9) error: failed creating AVF interface

   rca: issue in Intel FVL driver

   test: multicore AVF

   frequency: sporadic

   testbed: all testbeds

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-zn2/570/log.html.gz#s1-s1-s1-s4-s6-t3

 

NOTE: A long-standing issue without a final permanent fix.

TICKET: multicore AVF tests are failing when trying to create interface, https://jira.fd.io/browse/CSIT-1782

 

10) error: Not all DET44 sessions have been established: 4128767 != 4128768

   rca: unknown

   test: nat44det udp 4m and 16m (64k and 1m are ok)

   frequency: very sporadic. It failed in 1 out of 8 runs.

   testbed: 2n-zn2, 2n-skx, 2n-icx, 2n-dnv

   example: https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-icx/84/log.html.gz#s1-s1-s1-s2-s29-t2

                                           https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1190/log.html.gz#s1-s1-s1-s1-s9-t4

                                           https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-clx/1145/log.html.gz#s1-s1-s1-s2-s48-t2

                                           https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-zn2/570/log.html.gz#s1-s1-s1-s2-s21-t2

 

TICKET: https://jira.fd.io/browse/CSIT-1795

NOTE: First observed on 2n-icx 2 weeks ago.

NOTE: First observed on 2n-dnv 1 week ago.

 

===OUTSTANDING FIXED===

 

===FIXED ISSUES===

 

Best regards,

Viliam Luc