Possible ZFS livelock or SCHED_ULE bug ?
Arnaud Houdelette
arnaud.houdelette at tzim.net
Wed Dec 16 16:04:36 UTC 2009
Hi all!
I have a uniprocessor AMD64 box with 512 MB of RAM and 2 ZFS pools,
used as a home NAS.
I have had I/O issues since I moved from 7.2 to 8.0.
With a GENERIC kernel (or a stripped-down one), during high I/O activity
(such as a make buildworld can cause), I encounter random hangs or
deadlocks.
top shows system CPU usage at 99%, and the process using the most CPU is
[zfskern] ({txg_thread_enter} if I switch to thread view).
The box still responds to ping. Processes that are already running keep
running, but I can't start new ones.
Sometimes I can get back to normal by hitting Ctrl-C on the buildworld
(or other operation); sometimes I can't, and I have to reboot the box.
The issue seems to have become less frequent with 8.0-STABLE than with
8.0-RELEASE, but it is still present (I get approximately a 75% chance
of a hang with a buildworld).
I get the hang with prefetch enabled or disabled. The same goes for the
ZIL.
I tried to enable kernel dumps, but the box hangs while saving the dump
(root is on ZFS) or when starting kgdb on it.
I recompiled the kernel with SCHED_4BSD, and so far I can't reproduce
the hang.
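For anyone who wants to try the same comparison, the scheduler switch is
a one-line change in the kernel configuration file. This is only a
minimal sketch, assuming a custom config copied from GENERIC; the exact
contents of the config used on this box are not shown here.

```
# Assumption: custom kernel config derived from GENERIC.
# GENERIC selects the ULE scheduler; comment it out ...
#options 	SCHED_ULE		# ULE scheduler
# ... and select the traditional 4BSD scheduler instead.
options 	SCHED_4BSD		# 4BSD scheduler
```

Only one of the two scheduler options can be compiled in, so the kernel
has to be rebuilt and reinstalled after the change.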
What do you think?
Did I misconfigure something?
cat /boot/loader.conf
zfs_load="YES"
vfs.root.mountfrom="zfs:unsafe/root"
vm.kmem_size="512M"
vm.kmem_size_max="512M"
vfs.zfs.arc_max="100M"
vfs.zfs.vdev.cache.size="10M"
vfs.zfs.prefetch_disable="0"
vfs.zfs.zil_disable="1"
[carenath] ~> zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad6p1   ONLINE       0     0     0
            ad10p1  ONLINE       0     0     0
            ad8p1   ONLINE       0     0     0
            ad4p1   ONLINE       0     0     0

errors: No known data errors

  pool: unsafe
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        unsafe      ONLINE       0     0     0
          ad0p3     ONLINE       0     0     0

errors: No known data errors