Extreme console latency during disk IO (8.0-RC1, previous releases also affected according to others)

Thomas Backman serenity at exscape.org
Mon Oct 12 19:49:14 UTC 2009


I'm copying this over from the freebsd-performance list, as I'm  
looking for a few more opinions - not on the problems *I* am having,  
but rather to check whether the problem is universal or not, and if  
not, find a possible common factor.
In other words: I want to hear about your experiences, *good or bad*!

Here's the original thread (not from the beginning, though): http://lists.freebsd.org/pipermail/freebsd-performance/2009-October/003843.html

Long story short, my version: when the disk is stressed hard enough,  
console IO becomes COMPLETELY unbearable. 10+ seconds to switch  
between windows in screen(1), running (or even typing) simple  
commands, etc. This happens both via SSH and the serial console.

How to reproduce/test:
1) time file /etc/* > /dev/null a few times, or something similar that  
uses the disk; write down a common/average/median/whatever time.
2) cat /dev/zero > /uncompressed_fs/filename # please make *sure* it's  
uncompressed, since ZFS with lzjb/gzip enabled will squish this into a  
kilobyte-sized file, thus creating virtually *no* IO.
3) When cat has been running say 10 seconds, re-time command #1 and do  
some interactive stuff - run commands, edit files, etc.

I couldn't actually reproduce the *completely* horrific increase in  
latency I posted about below just now (I did update my sources and  
rebuild, but I'm pretty sure the delta between ~Sep 29 and Oct 6 had  
no major IO changes in 8-STABLE), and the "time file /etc/*" test  
"only" jumped by about 3x (compared to 20-60x+ previously), but it's  
still bad enough: commands such as "ls" and "w" take 2-3 seconds to  
run, as opposed to 0.005s for ls without the added IO... On Linux, the  
increase in latency is closer to 4%. A bit better than, oh, 400  
times. ;)

Oh, and again: this post is not a complaint; this is a post asking for  
your experiences. I know I'm not alone in having these issues - I just  
want to know if there are a lot of people that *don't* too, and what  
could cause them. I can't possibly switch to FreeBSD in production  
with this behaviour - and I've been looking forward to doing so for  
quite a while now.

Regards,
Thomas

PS.

I'll leave my post to the original discussion below. (I don't usually  
top post, but I don't consider this a reply, more of a new post with  
an addition below.)

On Oct 5, 2009, at 10:45 AM, Thomas Backman wrote:

> Hey everyone,
> I'm having serious trouble with the same thing, and just found this  
> thread while looking for the correct place to post. Looks like I  
> found it. (I wrote most of the post before finding the thread, so  
> some of it will seem a bit odd.)
>
> I run 8.0-RC1/amd64 with ZFS on an A64 3200+ with 2GB RAM and an old  
> 80GB 7200rpm disk.
>
> My problem is that I get completely unacceptable latency on console  
> IO (both via SSH and serial console) when the system is performing  
> disk IO. The worst case I've noticed yet was when I tried copying a  
> core dump from a lzjb compressed ZFS file system to a gzip-9  
> compressed one, to compare the file size/compression ratio. screen 
> (1) took at LEAST ten seconds - probably a bit more - I'm not  
> exaggerating here - to switch to another window, and an "ls" in an  
> empty directory also about 5-10 seconds.
> Doing some silly CPU load with two instances of "yes >/dev/null" (on  
> a single core, remember) doesn't change anything, the system remains  
> very responsive. "cat /dev/zero > /uncompressed_fs/..." however  
> produces the extreme slowdown. (On a gzip-1 FS it doesn't, since the  
> file ends up extremely small - a kilobyte or so - even after a  
> while, thus performing minimal IO).
>
> I'm thinking about switching to FreeBSD on my beefier "production"  
> system (dual-core amd64, 4GB RAM, 4x1TB disks, compared to this one,  
> single-core, 2GB RAM, 80GB disk), but unless I feel assured this  
> won't happen there as well, I'm not so sure anymore. I can do any  
> kind of heavy IO/compilation/whatever on that box, currently running  
> Linux, and it's always unnoticable. In this case it's impossible  
> *not* to notice that your key input is lagging behind 5-10  
> seconds... I thought multiple times that the box must have panicked.
> I do realize that the hardware isn't the best, especially the disks,  
> but this is far worse than it should be!
>
> Here's some of the testing done in this thread (or at least  
> something like it):
>
> [root at chaos ~]# time file /etc/* >/dev/null
> real    0m1.725s
> user    0m0.993s
> sys     0m0.021s
> [root at chaos ~]# time file /etc/* >/dev/null
>
> real    0m1.008s
> user    0m0.990s
> sys     0m0.015s
> [root at chaos ~]# time file /etc/* >/dev/null
>
> real    0m1.008s
> user    0m0.967s
> sys     0m0.038s
> [root at chaos ~]# time file /etc/* >/dev/null
>
> real    0m1.015s
> user    0m0.998s
> sys     0m0.008s
>
> So, pretty much exactly 1 second every time once the cache is warmed  
> up. Now, let's try it 10 seconds after starting heavy disk writing...
>
> [root at chaos ~]# cat /dev/zero > /DELETE_ME &
> (wait for 10 seconds)
> [root at chaos ~]# time file /etc/* >/dev/null
>
> real    0m13.217s
> user    0m1.004s
> sys     0m0.023s
> [root at chaos ~]# time file /etc/* >/dev/null
>
> real    0m24.012s
> user    0m1.020s
> sys     0m0.008s
>
> ... the numbers speak for themselves. FWIW I tried the same on the  
> Linux box: "file" took 0.8 seconds the first time, 0.125s subsequent  
> times. While running cat, 0.13-0.21 seconds (0.21 being the first,  
> subsequent runs took 0.13s). The system remained completely  
> responsive, which cannot be said for the FreeBSD one!
>
> Any advice? Info below - please ask if you need more!
>
> Regards,
> Thomas
>
> -----------------------------------------------------------------------------
>
> Basic info:
>
> [root at chaos ~]# uname -a
> FreeBSD chaos.exscape.org 8.0-RC1 FreeBSD 8.0-RC1 #1 r197613M: Tue  
> Sep 29 15:04:44 CEST 2009     root at chaos.exscape.org:/usr/obj/usr/ 
> src/sys/DTRACE  amd64
> (KDB/DDB enabled, WITNESS/INVARIANTS *disabled*)
>
> [root at chaos ~]# mount
> tank/root on / (zfs, local, noatime)
> devfs on /dev (devfs, local, multilabel)
> /dev/ad0s1a on /bootdir (ufs, local, soft-updates)
> tank/export on /export (zfs, NFS exported, local, noatime)
> tank/tmp on /tmp (zfs, local, noatime)
> tank/usr on /usr (zfs, local, noatime)
> tank/usr/obj on /usr/obj (zfs, local, noatime)
> tank/usr/ports on /usr/ports (zfs, local, noatime)
> tank/usr/ports/distfiles on /usr/ports/distfiles (zfs, local, noatime)
> tank/usr/src on /usr/src (zfs, local, noatime)
> tank/usr/src_r196905 on /usr/src_r196905 (zfs, local, noatime, read- 
> only)
> tank/var on /var (zfs, local, noatime)
> tank/var/crash on /var/crash (zfs, local, noatime)
> tank/var/log on /var/log (zfs, local, noatime)
> tank/var/tmp on /var/tmp (zfs, local, noatime)
>
> dmesg:
>
> -----------------------------------------------------------------------------
> Copyright (c) 1992-2009 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,  
> 1994
> 	The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.0-RC1 #1 r197613M: Tue Sep 29 15:04:44 CEST 2009
>    root at chaos.exscape.org:/usr/obj/usr/src/sys/DTRACE
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: AMD Athlon(tm) 64 Processor 3200+ (2009.27-MHz K8-class CPU)
>  Origin = "AuthenticAMD"  Id = 0x10ff0  Stepping = 0
>   
> Features 
> = 
> 0x78bfbff 
> < 
> FPU 
> ,VME 
> ,DE 
> ,PSE 
> ,TSC 
> ,MSR 
> ,PAE 
> ,MCE 
> ,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
>  AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow!+,3DNow!>
>  AMD Features2=0x1<LAHF>
> real memory  = 2147483648 (2048 MB)
> avail memory = 2052362240 (1957 MB)
> ACPI APIC Table: <Nvidia AWRDACPI>
> ioapic0 <Version 1.1> irqs 0-23 on motherboard
> kbd1 at kbdmux0
> acpi0: <Nvidia AWRDACPI> on motherboard
> acpi0: [ITHREAD]
> acpi0: Power Button (fixed)
> acpi0: reservation of 0, a0000 (3) failed
> acpi0: reservation of 100000, 7fef0000 (3) failed
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
> acpi_button0: <Power Button> on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pci0: <memory> at device 0.0 (no driver attached)
> isab0: <PCI-ISA bridge> at device 1.0 on pci0
> isa0: <ISA bus> on isab0
> pci0: <serial bus, SMBus> at device 1.1 (no driver attached)
> ohci0: <OHCI (generic) USB controller> mem 0xfe02f000-0xfe02ffff irq  
> 21 at device 2.0 on pci0
> ohci0: [ITHREAD]
> usbus0: <OHCI (generic) USB controller> on ohci0
> ehci0: <NVIDIA nForce4 USB 2.0 controller> mem 0xfe02e000-0xfe02e0ff  
> irq 22 at device 2.1 on pci0
> ehci0: [ITHREAD]
> usbus1: EHCI version 1.0
> usbus1: <NVIDIA nForce4 USB 2.0 controller> on ehci0
> atapci0: <nVidia nForce CK804 UDMA133 controller> port  
> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfb00-0xfb0f at device 6.0 on  
> pci0
> ata0: <ATA channel 0> on atapci0
> ata0: [ITHREAD]
> ata1: <ATA channel 1> on atapci0
> ata1: [ITHREAD]
> atapci1: <nVidia nForce CK804 SATA300 controller> port  
> 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xf600-0xf60f mem  
> 0xfe02b000-0xfe02bfff irq 23 at device 7.0 on pci0
> atapci1: [ITHREAD]
> ata2: <ATA channel 0> on atapci1
> ata2: [ITHREAD]
> ata3: <ATA channel 1> on atapci1
> ata3: [ITHREAD]
> atapci2: <nVidia nForce CK804 SATA300 controller> port  
> 0x9e0-0x9e7,0xbe0-0xbe3,0x960-0x967,0xb60-0xb63,0xf100-0xf10f mem  
> 0xfe02a000-0xfe02afff irq 21 at device 8.0 on pci0
> atapci2: [ITHREAD]
> ata4: <ATA channel 0> on atapci2
> ata4: [ITHREAD]
> ata5: <ATA channel 1> on atapci2
> ata5: [ITHREAD]
> pcib1: <ACPI PCI-PCI bridge> at device 9.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> vgapci0: <VGA-compatible display> mem 0xfcff8000-0xfcffbfff, 
> 0xfd000000-0xfd7fffff,0xfc000000-0xfc7fffff irq 17 at device 7.0 on  
> pci1
> xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xdf00-0xdf7f mem  
> 0xfcfff000-0xfcfff07f irq 18 at device 9.0 on pci1
> miibus0: <MII bus> on xl0
> xlphy0: <3c905C 10/100 internal PHY> PHY 24 on miibus0
> xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> xl0: Ethernet address: 00:50:da:44:c0:4a
> xl0: [ITHREAD]
> nfe0: <NVIDIA nForce4 CK804 MCP9 Networking Adapter> port  
> 0xf000-0xf007 mem 0xfe029000-0xfe029fff irq 22 at device 10.0 on pci0
> miibus1: <MII bus> on nfe0
> e1000phy0: <Marvell 88E1111 Gigabit PHY> PHY 1 on miibus1
> e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX,  
> 1000baseT, 1000baseT-FDX, auto
> nfe0: Ethernet address: 00:13:d3:a2:aa:0f
> nfe0: [FILTER]
> pcib2: <ACPI PCI-PCI bridge> at device 11.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> pcib3: <ACPI PCI-PCI bridge> at device 12.0 on pci0
> pci3: <ACPI PCI bus> on pcib3
> pcib4: <ACPI PCI-PCI bridge> at device 13.0 on pci0
> pci4: <ACPI PCI bus> on pcib4
> pcib5: <ACPI PCI-PCI bridge> at device 14.0 on pci0
> pci5: <ACPI PCI bus> on pcib5
> amdtemp0: <AMD K8 Thermal Sensors> on hostb3
> acpi_tz0: <Thermal Zone> on acpi0
> atrtc0: <AT realtime clock> port 0x70-0x73 irq 8 on acpi0
> uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on  
> acpi0
> uart0: [FILTER]
> uart0: console (115200,n,8,1)
> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> atkbd0: [ITHREAD]
> cpu0: <ACPI CPU> on acpi0
> powernow0: <Cool`n'Quiet K8> on cpu0
> device_attach: powernow0 attach returned 6
> orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xcbfff, 
> 0xcc000-0xcc7ff on isa0
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on  
> isa0
> ZFS NOTICE: system has less than 4GB and prefetch enable is not  
> set... disabling.
> ZFS filesystem version 13
> ZFS storage pool version 13
> Timecounter "TSC" frequency 2009269513 Hz quality 800
> Timecounters tick every 1.000 msec
> usbus0: 12Mbps Full Speed USB v1.0
> usbus1: 480Mbps High Speed USB v2.0
> ad0: 76318MB <Seagate ST380021A 3.19> at ata0-master UDMA100
> ugen0.1: <nVidia> at usbus0
> uhub0: <nVidia OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on  
> usbus0
> ugen1.1: <nVidia> at usbus1
> uhub1: <nVidia EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on  
> usbus1
> ad2: 9768MB <Seagate ST310014A 3.09> at ata1-master UDMA100
> GEOM: ad2s1: geometry does not match label (255h,63s != 16h,63s).
> Root mount waiting for: usbus1 usbus0
> uhub0: 10 ports with 10 removable, self powered
> Root mount waiting for: usbus1
> Root mount waiting for: usbus1
> Root mount waiting for: usbus1
> uhub1: 10 ports with 10 removable, self powered
> Trying to mount root from zfs:tank/root
> netsmb_dev: loaded
>
> -----------------------------------------------------------------------------
> The 80GB disk is used for everything except swap (aka. dump device)  
> for which the 10GB disk is used, exclusively.
>
> loader.conf has nothing of value (just loading a few modules and a  
> vfs.root.mountfrom, plus serial console settings).



More information about the freebsd-stable mailing list