FreeBSD 11.1 Beta 2 ZFS performance degradation on SSDs
Caza, Aaron
Aaron.Caza at ca.weatherford.com
Tue Jun 20 18:50:32 UTC 2017
> -----Original Message-----
> From: Karl Denninger [mailto:karl at denninger.net]
> Sent: Tuesday, June 20, 2017 11:58 AM
> To: freebsd-fs at freebsd.org
> Subject: Re: FreeBSD 11.1 Beta 2 ZFS performance degradation on SSDs
>
> On 6/20/2017 12:29, Caza, Aaron wrote:
> >> -----Original Message-----
> >> From: Karl Denninger [mailto:karl at denninger.net]
> >> Sent: Monday, June 19, 2017 7:28 PM
> >> To: freebsd-fs at freebsd.org
> >> Subject: Re: FreeBSD 11.1 Beta 2 ZFS performance degradation on SSDs
> >>
> >> Just one note below...
> >>
> >> On 6/19/2017 19:57, Caza, Aaron wrote:
> >>> Note that file /testdb/test is 16GB, twice the size of RAM available on this system. The /testdb directory is a ZFS file system with recordsize=8k, chosen as ultimately it's intended to host a PostgreSQL database, which uses an 8k page size.
> >> Do not make this assumption blindly. Yes, I know the docs say to set
> >> recordsize=8k but this is something you need to benchmark against
> >> your actual working data set.
> >>
> >> MANY Postgres workloads are MUCH faster (2x or more!) if you use a
> >> default page size and lz4 compression -- including one I have in
> >> production and have extensively benchmarked. The difference is NOT small.
> >> ....
> >>
> >> zroot/ticker  compressratio  1.53x                         -
> >> zroot/ticker  mounted        yes                           -
> >> zroot/ticker  quota          none                          default
> >> zroot/ticker  reservation    none                          default
> >> zroot/ticker  recordsize     128K                          default
> >> zroot/ticker  mountpoint     /usr/local/pgsql/data-ticker  local
> >> zroot/ticker  sharenfs       off                           default
> >> zroot/ticker  checksum       fletcher4                     inherited from zroot
> >> zroot/ticker  compression    lz4                           inherited from zroot
> >> zroot/ticker  atime          off                           inherited from zroot
> >>
> >> You may also want to consider setting logbias=throughput. In some
> >> cases the improvement there can be quite material as well --
> >> depending on the insert/update traffic to the database in question.
> >>
> >> --
> >> Karl Denninger
> >> karl at denninger.net <mailto:karl at denninger.net> /The Market Ticker/
> >> /[S/MIME encrypted email preferred]/
> > Thanks for the suggestions, Karl. I'll investigate further after I resolve the performance degradation issue I'm experiencing. I recently read another FreeBSD+ZFS+PostgreSQL user's SCALE 15x presentation, PostgreZFS, by Sean Chittenden if I recall correctly, which also advised lz4 compression and a 16K page size rather than 8K.
> >
> > With regards to my performance woes, I was originally using PostgreSQL in my posts to freebsd-hackers at freebsd.org but started using 'dd' to remove it as a point of contention. In attempting to resolve this issue, I tried using your patch to PR 187594 (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594). It took a bit of effort to find a revision of FreeBSD 10-Stable to which your FreeBSD 10 patch would both apply and compile cleanly; however, it didn't resolve the issue I'm experiencing.
> I would not have expected my PR to impact this issue.
>
> I'm suspicious of a drive firmware interaction with your I/O pattern;
> SSDs are somewhat notorious for having that come up under certain
> workloads that involve a lot of writes.
>
I've observed this performance degradation on 6 different hardware systems using 4 different SSD models (2x Intel 510 120GB, 2x Intel 520 120GB, 2x Intel 540 120GB, 2x Samsung 850 Pro) on FreeBSD 10.3-RELEASE, 10.3-RELEASE-p6, 10.3-RELEASE-p19, FreeBSD 10-Stable, FreeBSD 11.0-RELEASE, FreeBSD 11-Stable, and now FreeBSD 11.1 Beta 2. In this latest round of testing I'm not doing much in the way of writing: only logging the output of the 'dd' command, along with 'zfs-stats -a' and 'uptime', once an hour. The test ran for ~20 hours before the performance drop kicked in, and why it happens is inexplicable, as this server isn't doing anything other than running this test hourly.
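For reference, the hourly test amounts to a sequential read of the 16GB file; a sketch of it (the block size here is illustrative, not necessarily what I used):

  # Sequential read of the 16GB test file (twice RAM) through ZFS;
  # output discarded, throughput reported by dd on completion.
  dd if=/testdb/test of=/dev/null bs=1m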
I have a FreeBSD 9.0 system using 2x Intel 520 120GB SSDs that doesn't exhibit this performance degradation, maintaining ~400MB/s even after many days of uptime. That system uses the GEOM ELI layer to provide 4k sector emulation for the mirrored zpool, as I previously described.
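Roughly what that GELI-based 4k emulation looks like (device and pool names here are placeholders, not my actual layout):

  # Create GEOM ELI providers with a 4096-byte sector size; the
  # resulting .eli devices report 4k sectors, so the mirrored pool
  # gets ashift=12. Device and pool names are placeholders.
  geli init -s 4096 /dev/ada0p3
  geli init -s 4096 /dev/ada1p3
  geli attach /dev/ada0p3
  geli attach /dev/ada1p3
  zpool create tank mirror ada0p3.eli ada1p3.eli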
Interestingly, with the GEOM ELI layering in place, I was seeing the following:
- FreeBSD 10.3 RELEASE : performance ~750MB/s when dd'ing 16GB file
- FreeBSD 10 Stable : performance ~850MB/s when dd'ing 16GB file
- FreeBSD 11 Stable : performance ~950MB/s when dd'ing 16GB file
During the above testing, which was all done after a reboot, gstat would show 90-95% busy. When the performance degradation hits, %busy drops to ~15%.
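(The %busy figures come from watching the physical providers during the dd run, along these lines:)

  # Show per-provider stats for physical disks only (-p), refreshing
  # every second (-I 1s); %busy is the column quoted above.
  gstat -p -I 1s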
Switching to FreeBSD 11.1 Beta 2 with Auto(ZFS) ashift-based 4k emulation of the ZFS mirrored pool:
- FreeBSD 11.1 Beta 2 : performance ~450MB/s when dd'ing 16GB file with gstat %busy of ~60%. When performance degradation hits, %busy drops to ~15%.
Now, I expected that removing the GEOM ELI layer and just using vfs.zfs.min_auto_ashift=12 to do the 4k sector emulation would provide even better performance. It seems strange to me that it doesn't.
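For clarity, the ashift-based setup amounts to the following (device and pool names are again placeholders):

  # Force a minimum ashift of 12 (4k sectors) before creating the
  # pool, then verify the pool actually got it.
  sysctl vfs.zfs.min_auto_ashift=12
  zpool create tank mirror ada0p3 ada1p3
  zdb -C tank | grep ashift    # expect: ashift: 12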
> --
> Karl Denninger
> karl at denninger.net <mailto:karl at denninger.net>
> /The Market Ticker
--
Aaron