ZFS, SSDs, and TRIM performance

Nicolas Gilles nicolas.gilles at gmail.com
Tue Nov 3 09:12:03 UTC 2015


Not sure about the Samsung XS1715, but lots of SSDs seem to suck at
large amounts of TRIM in general, leading to a "let me pause everything
for a while" symptom. In fact, I think there is work under way in ZFS
to make TRIMs work better, and to throttle them when large amounts of
space are freed, to avoid this kind of starvation.
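
To line the stalls up against TRIM activity, something like the rough
sketch below might help. It only assumes the
kstat.zfs.misc.zio_trim.bytes sysctl Sean mentions (FreeBSD 10.x); the
function names trim_delta and watch_trim are made up for illustration,
and I have not run this on an XS1715 box.

```shell
#!/bin/sh
# Sketch: print per-interval growth of the ZFS TRIM byte counter.
# Assumption: kstat.zfs.misc.zio_trim.bytes exists (FreeBSD 10.x ZFS).
# trim_delta and watch_trim are hypothetical helper names.

# trim_delta PREV CUR: bytes trimmed between two counter samples.
trim_delta() {
    echo $(( $2 - $1 ))
}

# watch_trim [INTERVAL]: sample the counter every INTERVAL seconds
# (default 1) and print how much it grew since the previous sample.
watch_trim() {
    interval=${1:-1}
    prev=$(sysctl -n kstat.zfs.misc.zio_trim.bytes) || return 1
    while sleep "$interval"; do
        cur=$(sysctl -n kstat.zfs.misc.zio_trim.bytes) || return 1
        echo "$(date +%T) trimmed $(trim_delta "$prev" "$cur") bytes"
        prev=$cur
    done
}
```

Source the file and run "watch_trim 1" in one terminal with
"gstat -d -p" in another while destroying a snapshot; if the fsync
stalls coincide with the bursts, that points at TRIM handling rather
than something else.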

-- Nicolas


On Thu, Oct 29, 2015 at 7:22 PM, Steven Hartland
<killing at multiplay.co.uk> wrote:
> If you're running NVMe, are you running a version which has this:
> https://svnweb.freebsd.org/base?view=revision&revision=285767
>
> I'm pretty sure 10.2 does have that, so you should be good, but best to
> check.
>
> Other questions:
> 1. What does "gstat -d -p" show during the stalls?
> 2. Do you have any other zfs tuning in place?
>
> On 29/10/2015 16:54, Sean Kelly wrote:
>>
>> Me again. I have a new issue and I’m not sure if it is hardware or
>> software. I have nine servers running 10.2-RELEASE-p5 with Dell OEM’d
>> Samsung XS1715 NVMe SSDs. They are paired up in a single mirrored zpool on
>> each server. They perform great most of the time. However, I have a problem
>> when ZFS fires off TRIMs. Not during vdev creation, but like if I delete a
>> 20GB snapshot.
>>
>> If I destroy a 20GB snapshot or delete large files, ZFS fires off tons of
>> TRIMs to the disks. I can see the kstat.zfs.misc.zio_trim.success and
>> kstat.zfs.misc.zio_trim.bytes sysctls skyrocket. While this is happening,
>> any synchronous writes seem to block. For example, we’re running PostgreSQL
>> which does fsync()s all the time. While these TRIMs happen, Postgres just
>> hangs on writes. This causes reads to block due to lock contention as well.
>>
>> If I change sync=disabled on my tank/pgsql dataset while this is
>> happening, it unblocks for the most part. But obviously this is not an ideal
>> way to run PostgreSQL.
>>
>> I’m working with my vendor to get some Intel SSDs to test, but any ideas
>> if this could somehow be a software issue? Or does the Samsung XS1715 just
>> suck at TRIM and SYNC?
>>
>> We’re thinking of just setting the vfs.zfs.trim.enabled=0 tunable for now
>> since WAL segment turnover actually causes a lot of TRIM operations, but
>> unfortunately this requires a reboot. Disabling TRIM does seem to fix the
>> issue on other servers I’ve tested with the same hardware config.
>>
>
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"