Weird I/O hangs (9.1R, arcsas, interrupt spikes on uhci0)

Dennis Kögel dk at neveragain.de
Wed Jun 19 13:01:16 UTC 2013


Hi,

Very regularly, roughly once per minute, we see I/O hang for about 10 seconds.

Each time this happens, the I/O rate simply drops to zero, and all disk access hangs; this is also very noticeable on the shell, for NFS clients etc. Everything else (networking, kernel, …) seems to continue normally.

Environment: FreeBSD 9.1R GENERIC on amd64, using ZFS, on an Areca ARC1320 PCIe controller with 24x Seagate ST33000650SS (third-party arcsas.ko driver).

It's easy to observe these hangs under write load, e.g. with 'zpool iostat 1':

void        22.4T  42.6T     34  2.73K  1.07M   293M
void        22.4T  42.6T     20  2.74K   623K   289M
void        22.4T  42.6T    144  2.62K  4.83M   279M
void        22.4T  42.6T     13  2.60K   437K   283M
void        22.4T  42.6T      0      0      0      0 <-- hang starts
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0      0      0      0
void        22.4T  42.6T      0    296  4.00K  34.2M <-- hang ends
void        22.4T  42.6T      2  2.64K  73.8K   288M
void        22.4T  42.6T      8  3.12K   278K   329M
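
A minimal sketch, assuming Python is available and the 'zpool iostat 1' output has been captured to a file, of how such stalls could be picked out automatically. The function names are made up for illustration, and the column layout (pool, alloc, free, read ops, write ops, read bandwidth, write bandwidth) is assumed from the output above.

```python
def parse_ops(line):
    """Return (read_ops, write_ops) as strings from a 'zpool iostat' data line, or None."""
    fields = line.split()
    # Assumed layout: pool, alloc, free, read ops, write ops, read bw, write bw.
    # Header and separator lines are skipped because their 4th field is not numeric.
    if len(fields) < 7 or not fields[3][0].isdigit():
        return None
    return fields[3], fields[4]


def find_stalls(lines, min_len=3):
    """Return (first_sample, length) for each run of samples with zero read and write ops.

    With 'zpool iostat 1', one sample is roughly one second, so min_len=3
    ignores idle gaps shorter than about three seconds.
    """
    stalls = []
    run_start = None
    sample = 0
    for line in lines:
        ops = parse_ops(line)
        if ops is None:
            continue
        idle = ops == ("0", "0")
        if idle and run_start is None:
            run_start = sample
        elif not idle and run_start is not None:
            if sample - run_start >= min_len:
                stalls.append((run_start, sample - run_start))
            run_start = None
        sample += 1
    # Close a run that is still open at the end of the input.
    if run_start is not None and sample - run_start >= min_len:
        stalls.append((run_start, sample - run_start))
    return stalls
```

Calling find_stalls(open("iostat.log")) on a captured log (the file name is hypothetical) gives the position and length of every stall, which makes it easier to check whether the interval between hangs really is constant.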

Each time this happens, there is a completely unexplained spike of interrupts on uhci0: 'systat -vm' then displays numbers around 270k.

# vmstat -i | grep -E '(arcsas|uhci0|Total)'
irq16: uhci0                  1227020890      67708
irq24: arcsas0                  12045211        664
Total                         1266417827      69882
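
The counters above are cumulative totals, so a short spike is hard to see in them. A rough sketch, again in Python and with made-up function names, of how two 'vmstat -i' snapshots could be turned into per-interval rates so that the uhci0 spikes can be lined up with the I/O stalls. It assumes every data line ends in two numeric columns (total and rate), as in the output above.

```python
import re
import subprocess
import time


def parse_vmstat_i(text):
    """Map interrupt source name -> cumulative total from 'vmstat -i' output."""
    totals = {}
    for line in text.splitlines():
        # Data lines look like 'irq16: uhci0   1227020890   67708';
        # the header line has no numeric columns and does not match.
        m = re.match(r"^(.*?)\s+(\d+)\s+(\d+)\s*$", line)
        if m:
            totals[m.group(1).strip()] = int(m.group(2))
    return totals


def interrupt_rates(before, after, seconds):
    """Interrupts per second for each source seen in both snapshots."""
    return {name: (after[name] - before[name]) / seconds
            for name in after if name in before}


def snapshot():
    """Run 'vmstat -i' on the local host and parse its output."""
    out = subprocess.run(["vmstat", "-i"], capture_output=True,
                         text=True, check=True).stdout
    return parse_vmstat_i(out)


def watch(interval=1.0):
    """Print timestamped per-interval rates until interrupted."""
    prev = snapshot()
    while True:
        time.sleep(interval)
        cur = snapshot()
        rates = interrupt_rates(prev, cur, interval)
        print(time.strftime("%H:%M:%S"), {k: round(v) for k, v in rates.items()})
        prev = cur
```

Running watch() next to 'zpool iostat 1' would show whether the uhci0 rate jumps in the same second the I/O rate drops to zero, or slightly before or after it.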

Things to note:

- Booting a USB-less kernel or disabling all USB in the BIOS makes the interrupt spikes disappear, but the hangs remain
- The hangs / interrupt spikes happen just as often when the system is idle
- Board is a Supermicro X8DTH
- There are two igb cards
- Root is ZFS as well (separate pool though)
- BIOS, Areca firmware and driver are already at the latest versions
- Moving the controller to a different slot doesn't change the behaviour
- We have two identical systems and both show the exact same symptoms, so flaky hardware is probably not the issue

Any ideas would be appreciated.

Thanks,
D.


More information about the freebsd-stable mailing list