More than 32 CPUs under 8.4-P
Dennis Glatting
dg at pki2.com
Mon May 20 02:55:15 UTC 2013
Minutes after I typed that message, the 2x16 system panicked with the
following backtrace:
kdb_backtrace
panic
vdev_deadman
vdev_deadman
vdev_deadman
spa_deadman
softclock
intr_event_execute_handlers
ithread_loop
fork_exit
fork_trampoline
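
For what it's worth, that trace is the ZFS "deadman" timer: spa_deadman()
runs from a callout (hence softclock in the trace), walks the vdev tree
via vdev_deadman(), and deliberately panics when an outstanding I/O has
been pending longer than its threshold. In other words, it points at hung
disk I/O rather than a ZFS logic bug. Assuming this build exposes the
tunable (it may not; the deadman code is fairly new), the threshold can
be inspected and raised from loader.conf:

root at iirc:~ # sysctl vfs.zfs.deadman_synctime
root at iirc:~ # echo 'vfs.zfs.deadman_synctime="2000"' >> /boot/loader.conf

(The units are seconds, and "2000" is a hypothetical value, roughly
double the usual default.)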
I had just created a memory disk when that happened:
root at iirc:~ # mdconfig -a -t swap -s 1g -u 1
root at iirc:~ # newfs -U /dev/md1
root at iirc:~ # mount /dev/md1 /mnt
root at iirc:~ # cp -p procstat kgdb /mnt
root at iirc:~ # cd /rescue/
root at iirc:/rescue # cp -p * /mnt
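
The point of the memory disk, for anyone following along: once ZFS I/O
wedges, only statically linked executables on a non-ZFS volume still
run, so the tools have to be staged in advance. It's worth verifying the
staged copies really are static (the /rescue binaries are; the procstat
and kgdb above are assumed to be static builds):

root at iirc:~ # file /mnt/procstat /mnt/kgdb

file(1) should report "statically linked" for each.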
On Sun, 2013-05-19 at 18:45 -0700, Dennis Glatting wrote:
> On Sun, 2013-05-19 at 16:28 -0400, Paul Kraus wrote:
> > On May 19, 2013, at 11:51 AM, Dennis Glatting <freebsd at pki2.com> wrote:
> >
> > > ZFS hangs on multi-socket systems (Tyan, Supermicro) under 9.1. ZFS does
> > > not hang under 8.4. This (and one other 4-socket system) is a production
> > > system.
> >
> > Can you be more specific? I have been running 9.0 and 9.1 systems,
> > multi-CPU and all ZFS, with no (CPU-related*) issues.
> >
>
> I have (down to) ten FreeBSD/ZFS systems. Five of them have multiple
> sockets populated. All use AMD CPUs of the 6200 series. Two of those
> multi-socket systems are simply workstations and don't do much file
> I/O, so I have yet to see them fault.
>
> The remaining three perform significant I/O on files in the 1-8TB
> range, often simultaneously, including sorting, compression, backup,
> etc. (ZFS compression is enabled on some data sets, as is dedup on a
> few minor ones). I also serve iSCSI and NFS from one of those systems.
>
> Simply put: if I run 9.1 on those three busy systems, ZFS will
> eventually hang under load (within ten hours to a few days), whereas it
> does not under 8.3/8.4. Of the multi-socket systems, two are 4x16
> cores, one is 2x16, and two are 2x8. Multiple simultaneous pbzip2 runs
> on individual 2-5TB ASCII files generally cause a hang within 10-20
> hours.
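>
> To be concrete, the load is roughly of this shape (the paths and
> worker count are hypothetical, not my exact jobs):
>
> for f in /data/big-*.txt; do
>     pbzip2 -p16 -k "$f" &
> done
> wait
>
> Here -p sets the number of compression threads and -k keeps the input
> files around for the next run.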
>
> "Hang" means the system is alive and on the network but disk I/O has
> stopped. Run any command except statically linked executables on a
> memory volume and they will not run (no output or return to command
> prompt). This includes "reboot," which never really reboots.
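>
> When it's in that state, a statically linked procstat staged ahead of
> time is about the only window in. A sketch:
>
> /mnt/procstat -kk -a
>
> dumps a kernel stack for every thread, which is the sort of evidence
> the deadlock-debugging wiki page (cited below) asks for.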
>
> The volumes where the work is performed are typically 12-33TB RAIDz2
> pools. For example:
>
> root at mc:~ # zpool list disk-1
> NAME     SIZE   ALLOC   FREE   CAP  DEDUP  HEALTH  ALTROOT
> disk-1  16.2T   5.86T  10.4T   36%  1.32x  ONLINE  -
>
> root at mc:~ # zpool status disk-1
>   pool: disk-1
>  state: ONLINE
>   scan: scrub repaired 0 in 21h53m with 0 errors on Mon Apr 29 01:52:55 2013
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         disk-1      ONLINE       0     0     0
>           raidz2-0  ONLINE       0     0     0
>             da2     ONLINE       0     0     0
>             da3     ONLINE       0     0     0
>             da4     ONLINE       0     0     0
>             da7     ONLINE       0     0     0
>             da5     ONLINE       0     0     0
>             da6     ONLINE       0     0     0
>         cache
>           da0       ONLINE       0     0     0
>
> errors: No known data errors
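>
> (When one of these hangs, a statically linked zpool -- I believe
> /rescue ships one -- would allow "zpool iostat -v disk-1 5" to show
> whether the per-vdev I/O counters have gone flat. A sketch only.)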
>
>
> > * I say no CPU-related issues because I have run into SATA timeout
> > issues with an external SATA enclosure holding 4 drives (I know, SATA
> > port expanders are evil, but it is my best option here). Sometimes the
> > zpool hangs hard, sometimes it just becomes unresponsive for a while.
> > My "fix", such as it is, is to tune the ZFS per-vdev queue depth as
> > follows:
> >
> > vfs.zfs.vdev.min_pending="3"
> > vfs.zfs.vdev.max_pending="5"
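> >
> > (Those go in /boot/loader.conf. To check what a running system is
> > actually using:
> >
> > sysctl vfs.zfs.vdev.min_pending vfs.zfs.vdev.max_pending
> >
> > They may also be settable at runtime through sysctl; I apply them at
> > boot.)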
> >
>
> I've not tried those. Currently, these are mine:
>
> vfs.zfs.write_limit_override="1G"
> vfs.zfs.arc_max="8G"
> vfs.zfs.txg.timeout=15
> vfs.zfs.cache_flush_disable=1
>
> # Recommended from the net
> # April, 2013
> vfs.zfs.l2arc_norw=0 # Default is 1
> vfs.zfs.l2arc_feed_again=0 # Default is 1
> vfs.zfs.l2arc_noprefetch=0 # Default is 0
> vfs.zfs.l2arc_feed_min_ms=1000 # Default is 200
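>
> (All of those are set from /boot/loader.conf. For what it's worth,
> vfs.zfs.txg.timeout and the l2arc knobs can also be changed on a
> running system, e.g.
>
> sysctl vfs.zfs.txg.timeout=15
>
> whereas arc_max is boot-time only, as far as I know.)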
>
>
> > The defaults are 5 and 10 respectively, and when I run with those I
> > have the timeout issues, but only under very heavy I/O load. I only
> > generate such load when migrating large amounts of data, which
> > thankfully does not happen all that often.
> >
>
> Two days ago, when the 9.1 system hung, I was able to run a static
> procstat, and the kernel incidentally(?) printed on the console that
> da0 wasn't responding. Unfortunately I didn't have a static camcontrol
> ready, so I was unable to query the device.
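>
> (Next time I'll stage camcontrol on the memory disk as well; something
> like
>
> camcontrol devlist
> camcontrol tags da0 -v
>
> would at least show whether the device still answers and what its tag
> queue looks like. A sketch only -- if camcontrol isn't in /rescue it
> needs a static build.)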
>
> That said, according to the criteria at
> https://wiki.freebsd.org/AvgZfsDeadlockDebug, that hang isn't a true
> ZFS problem, yet hung it was.
>
> I have since (today) updated the firmware of most of the devices in
> that system, and it is currently running some tasks. Most of the disks
> in that system are Seagate, but the un-updated devices include three WD
> disks (RAID1 OS and a swap disk) -- un-updated because I haven't been
> able to figure out WD's firmware download process -- and an SSD whose
> manufacturer indicates the firmware diff is minor, though I plan to go
> back and flash it anyway.
>
> If my 4x16 system ever finishes its current work I will update its
> devices' firmware too, but it is an 8.4-P system and doesn't give me
> any trouble. Another 4x16 system gave me ZFS trouble under 9.1, but
> since I downgraded it to 8.4-P it has been stable as a rock for the
> past 22 days, often under heavy load.
>
--
Dennis Glatting <dg at pki2.com>