Freebsd-10.0-CURRENT problem in the bottom half
Venkata Duvvuru
VenkatKumar.Duvvuru at Emulex.Com
Thu Oct 31 14:00:09 UTC 2013
Hi,
In FreeBSD 10.0-CURRENT with Emulex's OCE driver, I observe that the bottom half is hogging all the CPU, which leads to system sluggishness. On the same hardware with 9.1-RELEASE everything is fine: the bottom half takes no more than 10% of the CPU even at line rate. But on FreeBSD 10.0-CURRENT, at only half of line rate, all the CPUs peak, and "top -aSCHIP" shows that it is entirely the bottom half eating the CPU. I also tried Intel's 10G NIC to see whether it has the same problem, and interestingly it does: there too the bottom half hogs all the CPU.
We did some profiling to find where the problem is; I'm pasting the profiling results below for reference. The observation from the results is that uma_zalloc_arg ==> __mtx_lock_sleep takes quite a bit of the CPU. We sleep at a few places in the bottom half, and I think bottom-half scheduling is what is causing the problem. Please let me know if you see something obvious in the profiling data.
I'm seeing a similar problem with the ixgbe driver as well: uma_zalloc_arg ==> __mtx_lock_sleep takes quite a bit of the CPU.
The top output is also pasted below, for both 10.0-CURRENT (where the problem is seen) and 9.1-RELEASE (where it is not).
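For anyone who wants to reproduce the measurement: the call graph below is the kind of output pmcstat(8) produces. A minimal sketch of how such a profile can be gathered while the iperf test runs (the "unhalted-cycles" event name is an assumption and varies by CPU; check `pmccontrol -L` on your hardware):

```shell
# Sketch only: sample callchains with hwpmc(4) for 30 seconds while
# the traffic test is running, then render the caller graph.
kldload hwpmc                                      # load the PMC driver if not compiled in
pmcstat -S unhalted-cycles -O /tmp/samples.out sleep 30   # collect samples system-wide
pmcstat -R /tmp/samples.out -G /tmp/callgraph.txt         # write a call-graph report
```

The `-G` report lists each hot function with its callers indented beneath it, as in the paste below.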
 42.99%  [372357]  __mtx_lock_sleep @ /boot/kernel/kernel
  100.0%  [372357]   __mtx_lock_flags
   80.37%  [299245]    sosend_generic
    100.0%  [299245]     soo_write
   17.27%  [64322]     uma_zalloc_arg
    50.43%  [32438]     m_copym
    45.85%  [29490]     m_getm2
    03.72%  [2393]      tcp_output
    00.00%  [1]         oce_alloc_rx_bufs @ /boot/kernel/oce.ko
   02.35%  [8762]      _sleep @ /boot/kernel/kernel
    100.0%  [8762]      sbwait
   00.00%  [11]        in_matroute
    100.0%  [11]        rtalloc1_fib
   00.00%  [10]        uma_zfree_arg
    60.00%  [6]         m_freem
    40.00%  [4]         sbdrop_internal
   00.00%  [3]         rtalloc1_fib
    100.0%  [3]         rtalloc_ign_fib
   00.00%  [3]         tcp_output
    100.0%  [3]         tcp_usr_send
   00.00%  [1]         _cv_wait
    100.0%  [1]         usb_process
 34.98%  [303027]  trash_ctor @ /boot/kernel/kernel
  87.44%  [264956]   mb_ctor_clust
   100.0%  [264956]    uma_zalloc_arg
    100.0%  [264956]     m_getm2
  12.51%  [37921]    mb_ctor_mbuf
   100.0%  [37921]     uma_zalloc_arg
    48.28%  [18308]     m_getm2
    48.18%  [18272]     m_copym
    03.54%  [1341]      tcp_output
  00.05%  [150]      mb_ctor_pack
   100.0%  [150]       uma_zalloc_arg
    96.67%  [145]       m_getm2
    03.33%  [5]         oce_alloc_rx_bufs @ /boot/kernel/oce.ko
 04.36%  [37786]   __rw_wlock_hard @ /boot/kernel/kernel
  100.0%  [37786]    _rw_wlock_cookie
   99.88%  [37741]     tcp_usr_send
    100.0%  [37741]     sosend_generic
   00.11%  [43]        in_pcblookup_hash
    100.0%  [43]        tcp_input
top -aSCHIP with FreeBSD 10.0-CURRENT (throughput: 4.5 Gbps):
PID USERNAME PRI NICE SIZE RES STATE C TIME CPU COMMAND
0 root -92 0 0K 960K - 11 4:20 100.00% [kernel{oce1 taskq}]
2325 root 103 0 70524K 6240K CPU5 5 0:39 100.00% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
0 root -92 0 0K 960K CPU8 8 5:17 99.37% [kernel{oce1 taskq}]
0 root -92 0 0K 960K CPU10 10 4:27 99.37% [kernel{oce1 taskq}]
0 root -92 0 0K 960K CPU9 9 3:35 98.88% [kernel{oce1 taskq}]
0 root -92 0 0K 960K CPU7 7 7:25 98.29% [kernel{oce1 taskq}]
11 root 155 ki31 0K 256K RUN 12 14:55 93.90% [idle{idle: cpu12}]
11 root 155 ki31 0K 256K CPU13 13 14:48 91.70% [idle{idle: cpu13}]
11 root 155 ki31 0K 256K CPU14 14 18:35 89.79% [idle{idle: cpu14}]
11 root 155 ki31 0K 256K RUN 15 20:55 87.99% [idle{idle: cpu15}]
11 root 155 ki31 0K 256K RUN 4 19:35 73.68% [idle{idle: cpu4}]
11 root 155 ki31 0K 256K RUN 0 19:47 66.89% [idle{idle: cpu0}]
11 root 155 ki31 0K 256K CPU2 2 18:53 64.89% [idle{idle: cpu2}]
11 root 155 ki31 0K 256K CPU6 6 19:55 61.18% [idle{idle: cpu6}]
11 root 155 ki31 0K 256K CPU1 1 18:46 55.66% [idle{idle: cpu1}]
11 root 155 ki31 0K 256K RUN 5 19:41 42.48% [idle{idle: cpu5}]
2325 root 52 0 70524K 6240K sbwait 6 0:15 38.77% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
11 root 155 ki31 0K 256K CPU3 3 19:06 36.38% [idle{idle: cpu3}]
2325 root 49 0 70524K 6240K CPU1 1 0:14 35.06% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
2325 root 44 0 70524K 6240K sbwait 0 0:13 33.59% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
2325 root 42 0 70524K 6240K sbwait 2 0:13 33.15% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
2325 root 41 0 70524K 6240K sbwait 6 0:12 29.98% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
2325 root 36 0 70524K 6240K sbwait 15 0:09 25.10% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
2325 root 34 0 70524K 6240K CPU13 13 0:09 24.46% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
top -aSCHIP with FreeBSD 9.1-RELEASE (throughput: 9.6 Gbps):
last pid: 16802; load averages: 0.68, 0.15, 0.05 up 3+06:17:33 07:35:29
187 processes: 15 running, 128 sleeping, 44 waiting
CPU 0: 0.0% user, 0.0% nice, 13.8% system, 0.4% interrupt, 85.8% idle
CPU 1: 0.0% user, 0.0% nice, 9.4% system, 0.8% interrupt, 89.8% idle
CPU 2: 0.0% user, 0.0% nice, 10.2% system, 0.0% interrupt, 89.8% idle
CPU 3: 0.0% user, 0.0% nice, 12.6% system, 0.0% interrupt, 87.4% idle
CPU 4: 0.0% user, 0.0% nice, 13.4% system, 0.0% interrupt, 86.6% idle
CPU 5: 0.4% user, 0.0% nice, 11.0% system, 0.0% interrupt, 88.6% idle
CPU 6: 0.0% user, 0.0% nice, 16.9% system, 0.0% interrupt, 83.1% idle
CPU 7: 0.0% user, 0.0% nice, 32.3% system, 0.0% interrupt, 67.7% idle
CPU 8: 0.0% user, 0.0% nice, 21.7% system, 0.0% interrupt, 78.3% idle
CPU 9: 0.0% user, 0.0% nice, 24.4% system, 0.0% interrupt, 75.6% idle
CPU 10: 0.4% user, 0.0% nice, 6.3% system, 0.0% interrupt, 93.3% idle
CPU 11: 0.0% user, 0.0% nice, 16.1% system, 0.0% interrupt, 83.9% idle
Mem: 21M Active, 2201M Inact, 1094M Wired, 1236M Buf, 8552M Free
Swap: 15G Total, 15G Free
PID USERNAME PRI NICE SIZE RES STATE C TIME CPU COMMAND
11 root 155 ki31 0K 192K CPU1 1 78.1H 95.75% [idle{idle: cpu1}]
11 root 155 ki31 0K 192K RUN 10 78.1H 95.07% [idle{idle: cpu10}]
11 root 155 ki31 0K 192K CPU4 4 78.2H 94.29% [idle{idle: cpu4}]
11 root 155 ki31 0K 192K CPU2 2 78.2H 94.19% [idle{idle: cpu2}]
11 root 155 ki31 0K 192K CPU5 5 78.2H 94.09% [idle{idle: cpu5}]
11 root 155 ki31 0K 192K CPU3 3 78.2H 94.09% [idle{idle: cpu3}]
11 root 155 ki31 0K 192K CPU6 6 78.1H 86.77% [idle{idle: cpu6}]
11 root 155 ki31 0K 192K CPU11 11 78.1H 86.57% [idle{idle: cpu11}]
11 root 155 ki31 0K 192K CPU0 0 77.5H 86.28% [idle{idle: cpu0}]
11 root 155 ki31 0K 192K RUN 8 78.1H 81.79% [idle{idle: cpu8}]
11 root 155 ki31 0K 192K CPU9 9 78.1H 78.47% [idle{idle: cpu9}]
11 root 155 ki31 0K 192K RUN 7 78.1H 76.86% [idle{idle: cpu7}]
0 root -92 0 0K 784K - 7 0:08 26.37% [kernel{oce2 taskq}]
0 root -92 0 0K 784K CPU4 9 0:06 22.36% [kernel{oce2 taskq}]
0 root -92 0 0K 784K - 8 0:07 20.65% [kernel{oce2 taskq}]
0 root -92 0 0K 784K CPU3 6 0:06 15.48% [kernel{oce2 taskq}]
0 root -92 0 0K 784K - 0 0:04 10.99% [kernel{oce2 taskq}]
0 root -92 0 0K 784K - 11 0:04 10.89% [kernel{oce2 taskq}]
16801 root 26 0 65196K 5292K sbwait 4 0:03 10.50% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root 23 0 65196K 5292K sbwait 0 0:02 6.79% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root 23 0 65196K 5292K sbwait 10 0:01 6.05% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root 24 0 65196K 5292K sbwait 5 0:02 5.96% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root 24 0 65196K 5292K sbwait 2 0:01 5.66% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root 22 0 65196K 5292K sbwait 10 0:01 4.79% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root 23 0 65196K 5292K sbwait 11 0:01 4.59% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root 23 0 65196K 5292K sbwait 10 0:01 3.96% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root 22 0 65196K 5292K sbwait 10 0:01 3.96% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root 21 0 65196K 5292K sbwait 3 0:01 3.27% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}