Weird I/O hangs (9.1R, arcsas, interrupt spikes on uhci0)
Ronald Klop
ronald-freebsd8 at klop.yi.org
Wed Jun 19 13:28:45 UTC 2013
On Wed, 19 Jun 2013 15:01:14 +0200, Dennis Kögel <dk at neveragain.de> wrote:
> Hi,
>
> at regular intervals, we see I/O hangs lasting about 10 seconds, roughly
> once per minute.
>
> Each time this happens, the I/O rate simply drops to zero, and all disk
> access hangs; this is also very noticeable on the shell, for NFS clients
> etc. Everything else (networking, kernel, …) seems to continue normally.
>
> Environment: FreeBSD 9.1R GENERIC on amd64, using ZFS, on an ARC1320 PCIe
> controller with 24x Seagate ST33000650SS (3rd party arcsas.ko driver).
>
> It's easy to observe these hangs under write load, e.g. with 'zpool
> iostat 1':
>
> void  22.4T  42.6T     34  2.73K  1.07M   293M
> void  22.4T  42.6T     20  2.74K   623K   289M
> void  22.4T  42.6T    144  2.62K  4.83M   279M
> void  22.4T  42.6T     13  2.60K   437K   283M
> void  22.4T  42.6T      0      0      0      0  <-- hang starts
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0    296  4.00K  34.2M  <-- hang ends
> void  22.4T  42.6T      2  2.64K  73.8K   288M
> void  22.4T  42.6T      8  3.12K   278K   329M
>
> Each time this happens, there is a completely unexplained spike of
> interrupts on uhci0: 'systat -vm' then displays numbers around 270k.
>
> # vmstat -i | grep -E '(arcsas|uhci0|Total)'
> irq16: uhci0       1227020890      67708
> irq24: arcsas0       12045211        664
> Total              1266417827      69882
>
> Things to note:
>
> - Booting a USB-less kernel or disabling all USB in the BIOS doesn't
> change a thing (no interrupt spikes to be seen, but the hangs remain)
> - The hangs / interrupt spikes happen just as often when the system is
> idle
> - Board is a Supermicro X8DTH
> - There are two igb cards
> - Root is ZFS as well (separate pool though)
> - BIOS, Areca FW and driver are already at the latest versions
> - Moving the controller to a different slot doesn't change the behaviour
> - We have two identical systems and both show the exact same symptoms,
> so flaky hardware is probably not the issue
>
> Any ideas would be appreciated.
>
> Thanks,
> D.
First, please send more information about the system:
- The content of /var/run/dmesg.boot.
- Install /usr/ports/sysutils/zfs-stats and send the output of
  'zfs-stats -a'.
- Send the output of 'zpool status' and 'zpool list'.
- Did you configure compression or dedup on the pool?
- Do you keep a lot of snapshots?
- Do you run a cron job every minute that does something with the pool,
  for example one that gathers statistics?
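
If it helps, the items above could be collected in one go with something
like the following. This is only a rough sketch: the POOL and OUT variables
and the section helper are made up for the example, the pool name 'void' is
taken from your iostat output, and zfs-stats has to be installed from ports
first.

```sh
#!/bin/sh
# Rough sketch: collect the requested diagnostics into a single file.
# POOL, OUT and section() are illustrative names, not part of any tool.
POOL="${POOL:-void}"                 # pool name as seen in the iostat output
OUT="${OUT:-/tmp/zfs-report.txt}"    # hypothetical output location

section() {
    # Print a header, run the command, and note a failure instead of aborting.
    printf '\n===== %s =====\n' "$*"
    "$@" 2>&1 || printf '(failed or not available: %s)\n' "$1"
}

{
    section cat /var/run/dmesg.boot
    section zfs-stats -a
    section zpool status
    section zpool list
    # Compression and dedup settings of the pool's top-level dataset only.
    section zfs get compression,dedup "$POOL"
    # Number of snapshots (prints 0 if zfs is unavailable).
    printf '\n===== snapshot count =====\n'
    zfs list -H -t snapshot 2>/dev/null | wc -l
    # Cron jobs that might touch the pool once a minute.
    section cat /etc/crontab
    section crontab -l
} > "$OUT"

echo "Report written to $OUT"
```

Run it as root so that 'crontab -l' shows root's jobs; other users' crontabs
would need to be checked separately.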
Ronald.