Re: CSIT failing perf tests for week 42 (10/03 – 10/09)
Viliam Luc -X (vluc - PANTHEON TECH SRO at Cisco) <vluc@...>
Update: I’ve added a new row with priority. We can discuss and adjust the priorities during the public call today.
CSIT failing perf tests for week 42 (10/03 – 10/09)
=====SUMMARY=====
===NEW ISSUES===
1) error: 3n-snr: 25Ge Interface goes down randomly rca: priority: medium test: all frequency: sporadic testbed: 3n-snr
TICKET: https://jira.fd.io/browse/CSIT-1871 NOTE: sometimes 'TwentyFiveGigabitEthernetec/0/0' goes down and all subsequent tests fail.
===OUTSTANDING UNFIXED===
2) error: ALL ARM testbeds are failing with VPP failed to start rca: test: all priority: high frequency: all testbed: 2n-tx2, 3n-alt, 3n-tsh example:
https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-tx2/406/log.html.gz
https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-alt/102/log.html.gz
https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-tsh/695/log.html.gz
TICKET: https://jira.fd.io/browse/CSIT-1857
3) error: all testbeds DPDK l3fwd tests have wrong configuration rca: test: DPDK l3fwd priority: medium frequency: all testbed: all
TICKET: https://jira.fd.io/browse/CSIT-1858
NOTE: tests are passing, but the traffic returns to the TG instead of being forwarded to DUT2.
FIX: https://gerrit.fd.io/r/c/csit/+/37240
4) error: 2n-clx: half of the packets lost on PDR tests (re-opened) rca: test: e810Cq ip4base, ip6base priority: high frequency: sporadic testbed: 2n-clx
NOTE: week37: issue appeared.
NOTE: week39: happened again, but only on ip6base e810Cq - build #158.
NOTE: week40: happened only on ip4base e810Cq, but 99.9% of packets are lost now. Build #159.
NOTE: week41: this week the issue happened only on ip4base e810Cq.
NOTE: week42: issue happened on daily MRR tests. Only Ip6Base (E810Cq) with no traffic.
TICKET: https://jira.fd.io/browse/CSIT-1864
5) error: 3n-alt: testpmd tests fail with no traffic rca: test: testpmd priority: medium frequency: all testbed: 3n-alt
TICKET: https://jira.fd.io/browse/CSIT-1848
6) error: 3n-icx: IP4 tunnels GTPU and WIREGUARD tests failing with ~1700 packets lost rca: test: IP4 tunnels with E810Xxv nic priority: medium frequency: sporadic testbed: 3n-icx
NOTE: TRex doesn't support E810Xxv (Columbiaville). TICKET: https://jira.fd.io/browse/CSIT-1862
7) error: 2n-tx2, 3n-tsh: Failed to create container DUT1_CNF1 rca: test: 2n-tx2: all Container Memif 3n-tsh: SRv6_PROXY priority: medium - low frequency: all testbed: 2n-tx2, 3n-tsh
NOTE: 2n-tx2 started with build #383 on 31st of August. Build #382 on 30th of August passed.
NOTE: 3n-tsh started with build #674 on 31st of August. Build #673 on 30th of August passed.
NOTE: ARM testbeds still failing so this wasn't observed.
TICKET: https://jira.fd.io/browse/CSIT-1860
8) error: 3n-icx: All 1000Tnlsw Fixtnlip non AVF tests failing. 1518B with no traffic forwarded, IMIX with excessive packet loss rca: test: 1518B crypto priority: medium frequency: sporadic testbed: 3n-icx
TICKET: https://jira.fd.io/browse/CSIT-1844
9) error: 2n-dnv: sporadic 1518B tput tests failing to establish required sessions rca: test: 1518B tput priority: low frequency: sporadic testbed: 2n-dnv
TICKET: https://jira.fd.io/browse/CSIT-1850 NOTE: #1240 all tput tests passed
10) error: 3n-icx, 3n-skx, 3n-snr: all 1518B AVF crypto tests failed with no traffic, all IMIX AVF crypto with excessive packet loss rca: test: all AVF crypto frequency: sporadic priority: high testbed: 3n-skx, 3n-icx, 3n-snr
TICKET: https://jira.fd.io/browse/CSIT-1827
11) error: all testbeds: AF-XDP - NDR tests failing from time to time rca: test: af-xdp multicore tests priority: low frequency: low testbed: 2n-clx, 2n-skx, 2n-tx2, 2n-icx
TICKET: https://jira.fd.io/browse/CSIT-1802
12) error: 3n-tsh, 3n-alt, 2n-clx testbed (Taishan, Altra, Cascade-lake): NDR tests failing from time to time. rca: tests: Crypto, Ip4, L2, Srv6, Vm Vhost (all packet sizes, all core configurations affected) priority: medium frequency: medium testbed: 3n-tsh, 3n-alt, 2n-clx
TICKET: https://jira.fd.io/browse/CSIT-1804
13) error: T-Rex STL runtime error rca: VPP code - X557 speed_capability set 1GE instead of 10GE test: sporadic priority: low frequency: all testbed: 2n-dnv and 3n-dnv
TODO: VPP to fix speed_capability. TICKET: https://jira.fd.io/browse/VPP-2010
14) error: failed creating AVF interface rca: issue in Intel FVL driver test: multicore AVF priority: low frequency: sporadic testbed: all testbeds
NOTE: A long-standing issue without a final permanent fix. TICKET: multicore AVF tests are failing when trying to create interface, https://jira.fd.io/browse/CSIT-1782
15) error: Not all DET44 sessions have been established: 4128767 != 4128768 rca: unknown test: nat44det udp 4m and 16m (64k and 1m are ok) priority: low frequency: very sporadic. It failed in 1 out of 8 runs. testbed: 2n-zn2, 2n-skx, 2n-icx, 2n-clx
TICKET: https://jira.fd.io/browse/CSIT-1795
===OUTSTANDING FIXED===
===FIXED ISSUES===
#) error: 3n-tsh: failed to bind/unbind interface rca: test: all frequency: all testbed: 3n-tsh
TICKET: https://jira.fd.io/browse/CSIT-1874
TODO: JL to fix.
FIX: echo 1 | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
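To confirm that the FIX above took effect on a testbed before rerunning the jobs, a minimal shell sketch might look like the following. The function name noiommu_enabled is hypothetical and not part of CSIT. The sketch assumes the kernel reports this boolean module parameter as Y or N when read back from sysfs.

```shell
# Hypothetical helper (not part of CSIT): check whether VFIO unsafe
# no-IOMMU mode is enabled. The default path is the sysfs parameter
# named in the FIX above; another path can be passed for testing.
# Returns 0 if enabled, 1 if disabled, 2 if the parameter is unreadable.
noiommu_enabled() {
    param="${1:-/sys/module/vfio/parameters/enable_unsafe_noiommu_mode}"
    # The parameter is missing when the vfio module is not loaded.
    [ -r "$param" ] || return 2
    case "$(cat "$param")" in
        Y|y|1) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use on a DUT:
#   noiommu_enabled && echo "unsafe no-IOMMU mode is enabled"
```

A return code of 2 would suggest the vfio module is not loaded, in which case writing the parameter as in the FIX would also fail.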
#) error: 3n-icx: all NFV density-DCR memif-Chain ipsec tests failing with no traffic forwarded rca: test: all chain ipsec frequency: all testbed: 3n-icx
TICKET: https://jira.fd.io/browse/CSIT-1865
Best regards, Viliam Luc
From: Viliam Luc -X (vluc - PANTHEON TECH SRO at Cisco)
Sent: 10 October 2022 12:09
To: csit-report@...
Subject: CSIT failing perf tests for week 42 (10/03 – 10/09)
=====SUMMARY=====
New issues - 1
Unfixed issues - 15
Fixed issues - 2
===NEW ISSUES===
1) error: 3n-snr: 25Ge Interface goes down randomly rca: priority: medium test: all frequency: sporadic testbed: 3n-snr
TICKET: https://jira.fd.io/browse/CSIT-1871 NOTE: sometimes 'TwentyFiveGigabitEthernetec/0/0' goes down and all subsequent tests fail.
===OUTSTANDING UNFIXED===
2) error: ALL ARM testbeds are failing with VPP failed to start rca: test: all frequency: all testbed: 2n-tx2, 3n-alt, 3n-tsh example:
https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-tx2/406/log.html.gz
https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-alt/102/log.html.gz
https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-tsh/695/log.html.gz
TICKET: https://jira.fd.io/browse/CSIT-1857
3) error: all testbeds DPDK l3fwd tests have wrong configuration rca: test: DPDK l3fwd frequency: all testbed: all
TICKET: https://jira.fd.io/browse/CSIT-1858
NOTE: tests are passing but the traffic returns back to TG instead of being forwarded to DUT2.
4) error: 2n-clx: half of the packets lost on PDR tests (re-opened) rca: test: e810Cq ip4base, ip6base frequency: sporadic testbed: 2n-clx
NOTE: week37: issue appeared.
NOTE: week39: happened again, but only on ip6base e810Cq - build #158.
NOTE: week40: happened only on ip4base e810Cq, but 99.9% of packets are lost now. Build #159.
NOTE: week41: this week the issue happened only on ip4base e810Cq.
NOTE: week42: issue happened on daily MRR tests. Only Ip6Base (E810Cq) with no traffic.
TICKET: https://jira.fd.io/browse/CSIT-1864
5) error: 3n-alt: testpmd tests fail with no traffic rca: test: testpmd frequency: all testbed: 3n-alt
TICKET: https://jira.fd.io/browse/CSIT-1848
6) error: 3n-icx: IP4 tunnels GTPU and WIREGUARD tests failing with ~1700 packets lost rca: test: IP4 tunnels with E810Xxv nic frequency: sporadic testbed: 3n-icx
NOTE: TRex doesn't support E810Xxv (Columbiaville). TICKET: https://jira.fd.io/browse/CSIT-1862
7) error: 2n-tx2, 3n-tsh: Failed to create container DUT1_CNF1 rca: test: 2n-tx2: all Container Memif 3n-tsh: SRv6_PROXY frequency: all testbed: 2n-tx2, 3n-tsh
NOTE: 2n-tx2 started with build #383 on 31st of August. Build #382 on 30th of August passed.
NOTE: 3n-tsh started with build #674 on 31st of August. Build #673 on 30th of August passed.
NOTE: ARM testbeds still failing so this wasn't observed.
TICKET: https://jira.fd.io/browse/CSIT-1860
8) error: 3n-icx: All 1000Tnlsw Fixtnlip non AVF tests failing. 1518B with no traffic forwarded, IMIX with excessive packet loss rca: test: 1518B crypto frequency: sporadic testbed: 3n-icx
TICKET: https://jira.fd.io/browse/CSIT-1844
9) error: 2n-dnv: sporadic 1518B tput tests failing to establish required sessions rca: test: 1518B tput frequency: sporadic testbed: 2n-dnv
TICKET: https://jira.fd.io/browse/CSIT-1850 NOTE: #1240 all tput tests passed
10) error: 3n-icx, 3n-skx, 3n-snr: all 1518B AVF crypto tests failed with no traffic, all IMIX AVF crypto with excessive packet loss rca: test: all AVF crypto frequency: sporadic testbed: 3n-skx, 3n-icx, 3n-snr
TICKET: https://jira.fd.io/browse/CSIT-1827
11) error: all testbeds: AF-XDP - NDR tests failing from time to time rca: test: af-xdp multicore tests frequency: low testbed: 2n-clx, 2n-skx, 2n-tx2, 2n-icx
TICKET: https://jira.fd.io/browse/CSIT-1802
12) error: 3n-tsh, 3n-alt, 2n-clx testbed (Taishan, Altra, Cascade-lake): NDR tests failing from time to time. rca: tests: Crypto, Ip4, L2, Srv6, Vm Vhost (all packet sizes, all core configurations affected) frequency: medium testbed: 3n-tsh, 3n-alt, 2n-clx
TICKET: https://jira.fd.io/browse/CSIT-1804
13) error: T-Rex STL runtime error rca: VPP code - X557 speed_capability set 1GE instead of 10GE test: sporadic frequency: all testbed: 2n-dnv and 3n-dnv
TODO: VPP to fix speed_capability. TICKET: https://jira.fd.io/browse/VPP-2010
14) error: failed creating AVF interface rca: issue in Intel FVL driver test: multicore AVF frequency: sporadic testbed: all testbeds
NOTE: A long-standing issue without a final permanent fix. TICKET: multicore AVF tests are failing when trying to create interface, https://jira.fd.io/browse/CSIT-1782
15) error: Not all DET44 sessions have been established: 4128767 != 4128768 rca: unknown test: nat44det udp 4m and 16m (64k and 1m are ok) frequency: very sporadic. It failed in 1 out of 8 runs. testbed: 2n-zn2, 2n-skx, 2n-icx, 2n-clx
TICKET: https://jira.fd.io/browse/CSIT-1795
===OUTSTANDING FIXED===
===FIXED ISSUES===
#) error: 3n-tsh: failed to bind/unbind interface rca: priority: low test: all frequency: all testbed: 3n-tsh
TICKET: https://jira.fd.io/browse/CSIT-1874
TODO: JL to fix.
FIX: echo 1 | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
#) error: 3n-icx: all NFV density-DCR memif-Chain ipsec tests failing with no traffic forwarded rca: test: all chain ipsec frequency: all testbed: 3n-icx
NOTE: This week 3n-icx didn't run, due to the low cadence of runs caused by iterative tests for RC1.
TICKET: https://jira.fd.io/browse/CSIT-1865
Best regards, Viliam Luc