Interrupt storm and poor disk performance | mfi(4) driver | FreeBSD 8 | Dell PERC H730
Pallav Bose
pallav_bose at yahoo.com
Fri Sep 25 21:57:53 UTC 2015
Hello,
I have a Dell PowerEdge R430 server with a PERC H730 RAID controller. I'm trying to get FreeBSD 8 to install and run on this server. At this time, I have a patched version of the mfi(4) driver which attaches to the controller. I'm aware of mrsas(4), but since I have scripts that use mfiutil(8), I'd like to continue using the mfi(4) driver.
A simple dd test shows SSD performance to be very poor:
# dd if=/dev/mfid0 of=/dev/null bs=1m count=1024
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 27.978784 secs (38377001 bytes/sec)
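For scale, the figure reported by dd works out to roughly 36.6 MiB/s. A minimal sketch of that conversion follows; the helper name throughput_mib is my own and not part of any tool, and the inputs are just the byte count and elapsed time printed above.

```shell
# Convert a byte count and an elapsed time in seconds into MiB/s.
# throughput_mib is a hypothetical helper name used only for illustration.
throughput_mib() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1048576 }'
}

# Values copied from the dd output above.
throughput_mib 1073741824 27.978784
```

The same helper can be reused with the numbers from any later dd run to compare before and after a driver change.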
top -PHS shows a lot of CPU time being used by the swi6 software interrupt thread:
last pid: 81270; load averages: 0.01, 0.05, 0.05 up 0+05:34:20 15:45:51
302 processes: 7 running, 278 sleeping, 17 waiting
CPU 0: 0.0% user, 0.0% nice, 0.0% system, 52.6% interrupt, 47.4% idle
CPU 1: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 2: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 3: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 4: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 5: 0.0% user, 0.0% nice, 0.0% system, 0.7% interrupt, 99.3% idle
Mem: 48M Active, 4044K Inact, 997M Wired, 7144K Cache, 1248K Buf, 30G Free
Swap:

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   10 root     171 ki31     0K   192K CPU5    5 319:51 100.00% {idle: cpu5}
   10 root     171 ki31     0K   192K CPU2    2 293:32  94.58% {idle: cpu2}
   10 root     171 ki31     0K   192K CPU3    3 298:46  93.65% {idle: cpu3}
   10 root     171 ki31     0K   192K CPU4    4 278:55  92.58% {idle: cpu4}
   10 root     171 ki31     0K   192K CPU1    1 289:36  92.19% {idle: cpu1}
   10 root     171 ki31     0K   192K RUN     0 293:17  85.99% {idle: cpu0}
   11 root     -24    -     0K   544K WAIT    2 173:40  47.27% {swi6: task queue}
   11 root     -64    -     0K   544K WAIT    5  11:50   0.00% {irq256: mfi0}
   11 root     -32    -     0K   544K WAIT    1   6:26   0.00% {swi4: clock}
The interrupt rate for irq256: mfi0 is very high, even though there is no disk activity.
# vmstat -i
interrupt                          total       rate
irq4: uart0                          257          0
irq9: acpi0                            1          0
irq18: ehci0 ehci1                 71739          3
cpu0: timer                     40226355       1998
irq256: mfi0                     3642472        180
irq257: bge0                       34922          1
cpu3: timer                     40229128       1998
cpu5: timer                     40228959       1998
cpu4: timer                     40229014       1998
cpu1: timer                     40228629       1998
cpu2: timer                     40223967       1998
Total                          245115443      12175
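As far as I understand it, the rate column of vmstat -i is the total divided by the time since boot, so it is an average and not the current rate. A rough sketch of how the current rate for irq256: mfi0 could be measured from two samples is below. The helper names irq_total and irq_rate are my own (hypothetical), and the 10-second interval is an arbitrary choice.

```shell
# Read `vmstat -i` output on stdin and print the cumulative count for the
# interrupt source whose line starts with the given name. The count is the
# second-to-last field on the line. (irq_total is a hypothetical helper.)
irq_total() {
    awk -v name="$1" 'index($0, name) == 1 { print $(NF - 1); exit }'
}

# Interrupts per second between two counts sampled `secs` seconds apart.
# (irq_rate is a hypothetical helper.)
irq_rate() {
    awk -v first="$1" -v second="$2" -v secs="$3" \
        'BEGIN { printf "%.0f\n", (second - first) / secs }'
}

# Intended use on the affected host (assumed workflow, shown as comments):
#   a=$(vmstat -i | irq_total "irq256: mfi0"); sleep 10
#   b=$(vmstat -i | irq_total "irq256: mfi0")
#   irq_rate "$a" "$b" 10
```

Comparing this number with and without I/O running would show whether the controller keeps interrupting while the disks are idle.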
Procstat output:
# procstat -kk 11 # PID 11 taken from output of top
  PID    TID COMM TDNAME            KSTACK
   11 100008 intr swi3: vm
   11 100009 intr swi1: netisr 0    mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100010 intr swi4: clock       mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100011 intr swi4: clock       mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100012 intr swi4: clock       mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100013 intr swi4: clock       mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100014 intr swi4: clock       mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100015 intr swi4: clock       mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100021 intr swi5: +
   11 100023 intr swi6: Giant task  mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100024 intr swi6: task queue  mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100027 intr swi2: cambio      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100032 intr irq9: acpi0       mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100033 intr irq256: mfi0      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100034 intr irq18: ehci0 ehc  mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100039 intr swi0: uart uart   mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100040 intr irq1: atkbd0
# kldload dtraceall
# dtrace -n 'profile:::profile-276hz { @pc[stack()]=count(); }'
dtrace: description 'profile:::profile-276hz ' matched 1 probe
The above dtrace script is supposed to print all the stack traces seen during the sampling period.
The following stack trace occurs a large number of times:
kernel`DELAY+0x64
kernel`bus_dmamap_load+0x3a9
kernel`mfi_mapcmd+0x4f
kernel`mfi_startio+0x65
kernel`mfi_wait_command+0x9c
kernel`mfi_tbolt_sync_map_info+0xb4
kernel`mfi_handle_map_sync+0x39
kernel`taskqueue_run+0x91
kernel`intr_event_execute_handlers+0x66
kernel`ithread_loop+0x8e
kernel`fork_exit+0x112
kernel`0xffffffff8050624e

Can someone help me debug this problem? It's likely that the mfi(4) driver I currently have access to doesn't have all the necessary patches.
Thank you.
Regards,
Pallav