FreeBSD 11.1 Beta 2 ZFS performance degradation on SSDs

Caza, Aaron Aaron.Caza at ca.weatherford.com
Tue Jun 20 00:57:14 UTC 2017


> vfs.zfs.min_auto_ashift is a sysctl only; it's not a tunable, so setting it in /boot/loader.conf won't have any effect.
>
> There's no need for it to be a tunable as it only affects vdevs when they are created, which can only be done once the system is running.
>

The bsdinstall install script itself sets vfs.zfs.min_auto_ashift=12 in /boot/loader.conf yet, as you say, this doesn't do anything.  As a user, it's a bit confusing to see it in /boot/loader.conf and then run 'sysctl -a | grep min_auto_ashift' and see 'vfs.zfs.min_auto_ashift: 9', so I felt it was worth mentioning.
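
To make the mismatch concrete, here is a minimal Python sketch (not part of bsdinstall or any FreeBSD tool) that parses loader.conf-style key=value lines and reports keys whose running value differs.  The function names parse_conf and mismatches are made up for illustration, and the runtime dictionary is filled in by hand from the sysctl output mentioned above rather than read from the system.

```python
def parse_conf(text):
    """Parse key=value lines in the style of /boot/loader.conf or /etc/sysctl.conf."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')  # values may be quoted
    return settings


def mismatches(configured, runtime):
    """Return {key: (configured, runtime)} for keys whose runtime value differs."""
    return {
        key: (value, runtime[key])
        for key, value in configured.items()
        if key in runtime and runtime[key] != value
    }


# Two of the lines bsdinstall wrote to /boot/loader.conf on this system.
loader_conf = '''
vfs.zfs.min_auto_ashift=12
zfs_load="YES"
'''

# Assumption: entered by hand from 'sysctl -a | grep min_auto_ashift'.
runtime = {"vfs.zfs.min_auto_ashift": "9"}

print(mismatches(parse_conf(loader_conf), runtime))
```

Keys that have no runtime counterpart in the dictionary (such as zfs_load, which is a loader setting rather than a sysctl) are simply skipped.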

> You don't explain why you believe there is degrading performance?

As I related in my post, with my previous FreeBSD 11-Stable setup on this same hardware I was seeing 950MB/s after bootup.  I've been posting to the freebsd-hackers list, but have moved to the freebsd-fs list as this seemingly has something to do with FreeBSD+ZFS behavior, and user Jov had previously cross-posted to this list for me:
https://docs.freebsd.org/cgi/getmsg.cgi?fetch=2905+0+archive/2017/freebsd-fs/20170618.freebsd-fs

I've been using FreeBSD+ZFS ever since FreeBSD 9.0, admittedly, with a different zpool layout which is essentially as follows:
    adaXp1 - gptboot loader
    adaXp2 - 1GB UFS partition
    adaXp3 - UFS with UUID labeled partition hosting a GEOM ELI layer using NULL encryption to emulate 4k sectors (done before ashift was an option)

So, adaXp3 would show up as something like the following:

  /dev/gpt/b62feb20-554b-11e7-989b-000bab332ee8
  /dev/gpt/b62feb20-554b-11e7-989b-000bab332ee8.eli

Then, the zpool mirrored pair would be something like the following:

  pool: wwbase
 state: ONLINE
  scan: none requested
config:

        NAME                                              STATE     READ WRITE CKSUM
        wwbase                                            ONLINE       0     0     0
          mirror-0                                        ONLINE       0     0     0
            gpt/b62feb20-554b-11e7-989b-000bab332ee8.eli  ONLINE       0     0     0
            gpt/4c596d40-554c-11e7-beb1-002590766b41.eli  ONLINE       0     0     0

Using the above zpool configuration on this same hardware on FreeBSD 11-Stable, I was seeing read speeds of 950MB/s using dd (dd if=/testdb/test of=/dev/null bs=1m).  However, after anywhere from 5 to 24 hours, performance would degrade to less than 100MB/s for unknown reasons; the server was essentially idle, so it's a mystery to me why this occurs.  I've seen this behavior on FreeBSD 10.3R amd64 up through FreeBSD 11.0 Stable.

As I wasn't making any headway in resolving this, I opted today to use the FreeBSD 11.1 Beta 2 memstick image to create a basic FreeBSD 11.1 Beta 2 amd64 Auto(ZFS) installation to see whether this would resolve the original issue, since I would be using ZFS-on-root and vfs.zfs.min_auto_ashift=12 instead of my own emulation as described above.  However, instead of the 950MB/s that I expected, which is what I see with my alternative emulation, I'm seeing 450MB/s.  I've yet to determine whether this zpool setup, as done by the bsdinstall script, will suffer from the original performance degradation I observed.

> What is the exact dd command you're running, as that can have a huge impact on performance?

dd if=/testdb/test of=/dev/null bs=1m

Note that the file /testdb/test is 16GB, twice the size of the RAM available in this system.  The /testdb directory is a ZFS file system with recordsize=8k, chosen because ultimately it's intended to host a PostgreSQL database, which uses an 8k page size.
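
For reference, the dd figures quoted further down in this message (16000 records of 1 MiB in 37.407043 seconds) can be checked with a few lines of Python.  This is only arithmetic on the reported numbers; the helper name throughput is made up for illustration.

```python
MIB = 1024 * 1024  # dd's bs=1m means 1 MiB blocks


def throughput(records, block_size, seconds):
    """Return (total bytes, bytes per second) for a dd-style transfer."""
    total = records * block_size
    return total, total / seconds


total, rate = throughput(16000, MIB, 37.407043)

print(total)                            # total bytes read
print(round(rate / 1e6, 1), "MB/s")     # decimal megabytes per second
print(round(rate / MIB, 1), "MiB/s")    # binary mebibytes per second
```

The total matches the 16777216000 bytes that dd reported, and the rate agrees with dd's own bytes/sec figure to within rounding of the elapsed time.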

My understanding is that a ZFS mirrored pool with two drives can read from both drives at the same time hence double the speed.  This is what I've actually observed ever since I first started using this in FreeBSD 9.0 with the GEOM ELI 4k sector emulation.  This is actually my first time using FreeBSD's native installer's Auto(ZFS) setup with 4k sectors emulated using vfs.zfs.min_auto_ashift=12.  As it's a ZFS mirrored pool, I still expected it to be able to read at double-speed as it does with the GEOM ELI 4k sector emulation; however, it does not.
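
As a rough illustration of the numbers involved: ashift is the base-2 logarithm of the sector size ZFS assumes for a vdev, so ashift=12 means 4096-byte sectors and ashift=9 means 512-byte sectors.  The Python sketch below, with hypothetical helper names, works out how many sectors one 8k record spans in each case and compares the two read rates reported in this message.  It is arithmetic only and says nothing about why the mirror behaves differently under the two setups.

```python
def sector_size(ashift):
    """Sector size in bytes for a given ashift (ashift = log2 of sector size)."""
    return 1 << ashift


def sectors_per_record(recordsize, ashift):
    """Number of sectors needed to hold one record, rounded up."""
    size = sector_size(ashift)
    return -(-recordsize // size)  # ceiling division


RECORDSIZE = 8 * 1024  # recordsize=8k on /testdb

for ashift in (9, 12):
    print(ashift, sector_size(ashift), sectors_per_record(RECORDSIZE, ashift))

# Read rates in MB/s as reported in this message.
geli_rate = 950      # GEOM ELI 4k emulation
auto_zfs_rate = 450  # bsdinstall Auto(ZFS) setup
print(round(auto_zfs_rate / geli_rate, 2))  # fraction of the expected rate
```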

/var/run/dmesg.boot:
Copyright (c) 1992-2017 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.1-BETA2 #0 r319993: Fri Jun 16 02:32:38 UTC 2017
    root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
VT(vga): resolution 640x480
CPU: Intel(R) Xeon(R) CPU E31240 @ 3.30GHz (3292.60-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206a7  Family=0x6  Model=0x2a  Stepping=7
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x1dbae3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,XSAVE,OSXSAVE,AVX>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
real memory  = 8589934592 (8192 MB)
avail memory = 8232431616 (7851 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <SUPERM SMCI--MB>
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 hardware threads
random: unblocking device.
ioapic0 <Version 2.0> irqs 0-23 on motherboard
SMP: AP CPU #1 Launched!
SMP: AP CPU #7 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #6 Launched!
SMP: AP CPU #5 Launched!
Timecounter "TSC-low" frequency 1646297938 Hz quality 1000
random: entropy device external interface
kbd1 at kbdmux0
netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0xffffffff80f5a190, 0) error 19
nexus0
vtvga0: <VT VGA driver> on motherboard
cryptosoft0: <software crypto> on motherboard
acpi0: <SUPERM SMCI--MB> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
cpu4: <ACPI CPU> on acpi0
cpu5: <ACPI CPU> on acpi0
cpu6: <ACPI CPU> on acpi0
cpu7: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 550
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pcib0: _OSC returned error 0x10
pci0: <ACPI PCI bus> on pcib0
em0: <Intel(R) PRO/1000 Network Connection 7.6.1-k> port 0xf020-0xf03f mem 0xfba00000-0xfba1ffff,0xfba24000-0xfba24fff irq 20 at device 25.0 on pci0
em0: Using an MSI interrupt
em0: Ethernet address: 00:25:90:76:6b:41
em0: netmap queues/slots: TX 1/1024, RX 1/1024
ehci0: <Intel Cougar Point USB 2.0 controller> mem 0xfba23000-0xfba233ff irq 16 at device 26.0 on pci0
usbus0: EHCI version 1.0
usbus0 on ehci0
usbus0: 480Mbps High Speed USB v2.0
pcib1: <ACPI PCI-PCI bridge> irq 17 at device 28.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.4 on pci0
pci2: <ACPI PCI bus> on pcib2
em1: <Intel(R) PRO/1000 Network Connection 7.6.1-k> port 0xe000-0xe01f mem 0xfb900000-0xfb91ffff,0xfb920000-0xfb923fff irq 16 at device 0.0 on pci2
em1: Using MSIX interrupts with 3 vectors
em1: Ethernet address: 00:25:90:76:6b:40
em1: netmap queues/slots: TX 1/1024, RX 1/1024
ehci1: <Intel Cougar Point USB 2.0 controller> mem 0xfba22000-0xfba223ff irq 23 at device 29.0 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci1
usbus1: 480Mbps High Speed USB v2.0
pcib3: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci3: <ACPI PCI bus> on pcib3
vgapci0: <VGA-compatible display> mem 0xfe000000-0xfe7fffff,0xfb800000-0xfb803fff,0xfb000000-0xfb7fffff irq 23 at device 3.0 on pci3
vgapci0: Boot video device
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
ahci0: <Intel Cougar Point AHCI SATA controller> port 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf000-0xf01f mem 0xfba21000-0xfba217ff irq 19 at device 31.2 on pci0
ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahciem0: <AHCI enclosure management bridge> on ahci0
acpi_button0: <Power Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff on isa0
ppc0: cannot reserve I/O port range
est0: <Enhanced SpeedStep Frequency Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
est2: <Enhanced SpeedStep Frequency Control> on cpu2
est3: <Enhanced SpeedStep Frequency Control> on cpu3
est4: <Enhanced SpeedStep Frequency Control> on cpu4
est5: <Enhanced SpeedStep Frequency Control> on cpu5
est6: <Enhanced SpeedStep Frequency Control> on cpu6
est7: <Enhanced SpeedStep Frequency Control> on cpu7
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Timecounters tick every 1.000 msec
nvme cam probe device init
ugen0.1: <Intel EHCI root HUB> at usbus0
ugen1.1: <Intel EHCI root HUB> at usbus1
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
ses0 at ahciem0 bus 0 scbus2 target 0 lun 0
ses0: <AHCI SGPIO Enclosure 1.00 0001> SEMB S-E-S 2.00 device
ses0: SEMB SES Device
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <Samsung SSD 850 PRO 256GB EXM03B6Q> ACS-2 ATA SATA 3.x device
ada0: Serial Number S39KNB0HB00482Y
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 244198MB (500118192 512 byte sectors)
ada0: quirks=0x3<4K,NCQ_TRIM_BROKEN>
ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
ada1: <Samsung SSD 850 PRO 256GB EXM03B6Q> ACS-2 ATA SATA 3.x device
ada1: Serial Number S39KNB0HB00473Z
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada1: Command Queueing enabled
ada1: 244198MB (500118192 512 byte sectors)
ada1: quirks=0x3<4K,NCQ_TRIM_BROKEN>
Trying to mount root from zfs:zroot/ROOT/default []...
Root mount waiting for: usbus1 usbus0
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
Root mount waiting for: usbus1 usbus0
ugen0.2: <vendor 0x8087 product 0x0024> at usbus0
uhub2 on uhub0
uhub2: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus0
ugen1.2: <vendor 0x8087 product 0x0024> at usbus1
uhub3 on uhub1
uhub3: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus1
Root mount waiting for: usbus1 usbus0
uhub2: 6 ports with 6 removable, self powered
uhub3: 6 ports with 6 removable, self powered
ugen1.3: <Weatherford SPD> at usbus1
umodem0 on uhub3
umodem0: <Weatherford SPD, class 2/0, rev 1.10/0.01, addr 3> on usbus1
umodem0: data interface 1, has CM over data, has break

> On 19/06/2017 23:14, Caza, Aaron wrote:
> > I've been having a problem with FreeBSD ZFS SSD performance inexplicably degrading after < 24 hours uptime as described in a separate e-mail thread.  In an effort to get down to basics, I've now performed a ZFS-on-Root install of FreeBSD 11.1 Beta 2 amd64 using the default Auto(ZFS) install with the default 4k sector emulation (vfs.zfs.min_auto_ashift=12) setting (no swap, not encrypted).
> >
> > Firstly, vfs.zfs.min_auto_ashift=12 is set correctly in the /boot/loader.conf file, but doesn't appear to work because when I log in and do "sysctl -a | grep min_auto_ashift" it's set to 9 and not 12 as expected.  I tried setting it to vfs.zfs.min_auto_ashift="12" in /boot/loader.conf but that didn't make any difference, so I finally just added it to /etc/sysctl.conf where it seems to work.  So, something needs to be changed to make this function correctly.
> >
> > Next, after reboot I was expecting somewhere in the neighborhood of 950MB/s from the ZFS mirrored zpool of 2 Samsung 850 Pro 256GB SSDs that I'm using, as I was seeing this with my previous FreeBSD 11-Stable setup which, admittedly, is different from the way the bsdinstall script does it.  However, I'm seeing half that on bootup.
> >
> > Performance result:
> > Starting 'dd' test of large file...please wait
> > 16000+0 records in
> > 16000+0 records out
> > 16777216000 bytes transferred in 37.407043 secs (448504207 bytes/sec)
> >
> > Zpool Status:
> >    pool: zroot
> > state: ONLINE
> >    scan: none requested
> > config:
> >
> >          NAME        STATE     READ WRITE CKSUM
> >          zroot       ONLINE       0     0     0
> >            mirror-0  ONLINE       0     0     0
> >              ada0p2  ONLINE       0     0     0
> >              ada1p2  ONLINE       0     0     0
> >
> > /boot/loader.conf:
> > kern.geom.label.disk_ident.enable="0"
> > kern.geom.label.gptid.enable="0"
> > vfs.zfs.min_auto_ashift=12
> > vfs.zfs.arc_min="50M"
> > vfs.zfs.arc_max="51M"
> > zfs_load="YES"
> >
> > /etc/sysctl.conf:
> > vfs.zfs.min_auto_ashift=12
> >
> >
> > Is this the expected behavior now in FreeBSD 11.1?
> >
> > --
> > Aaron

--
Aaron
