[Bug 209571] ZFS and NVMe performing poorly. TRIM requests stall I/O activity
bugzilla-noreply at freebsd.org
Tue May 17 07:37:23 UTC 2016
Bug ID: 209571
Summary: ZFS and NVMe performing poorly. TRIM requests stall I/O activity
Product: Base System
Severity: Affects Many People
Assignee: freebsd-bugs at FreeBSD.org
Reporter: borjam at sarenet.es
Created attachment 170388
throughput graphs for two bonnie++ runs
On a test system with 10 Intel P3500 NVMEs I have found that TRIM
activity can cause a severe I/O stall. After running several bonnie++
tests, the ZFS file system has been almost unusable for 15 minutes (yes,
HOW TO REPRODUCE:
- Create a ZFS pool, in this case, a raidz2 pool with the 10 NVMEs.
- Create a dataset without compression (we want to test actual I/O
- Run bonnie++. As bonnie++ can quickly saturate a single CPU core and
hence it's unable to generate enough bandwidth for this setup, I run
four bonnie++ processes concurrently. In order to demonstrate this
issue, each bonnie++ performs two runs. So,
( bonnie++ -s 512g -x 2 -f) & # four times.
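The steps above could be scripted roughly as follows. This is only a sketch: the pool name, the dataset name, the default mountpoint and the nvd0..nvd9 device names are assumptions, not taken from the report. It prints the commands by default (dry run) so it can be reviewed before touching any disk.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Dry run by default: commands are printed,
# not executed. Set DRY_RUN=0 only on a disposable test system.
# ASSUMPTIONS (not from the report): pool name, dataset name, default
# mountpoint, and the nvd0..nvd9 device names.
POOL="${POOL:-testpool}"
DATASET="${DATASET:-$POOL/bench}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

repro() {
    # Build the list of the ten NVMe devices: nvd0 ... nvd9.
    devices=""
    i=0
    while [ "$i" -lt 10 ]; do
        devices="$devices nvd$i"
        i=$((i + 1))
    done

    # raidz2 pool over all ten devices ($devices is intentionally unquoted).
    run zpool create "$POOL" raidz2 $devices
    # Dataset without compression.
    run zfs create -o compression=off "$DATASET"

    # Four concurrent bonnie++ processes, each doing two runs (-x 2).
    # When run as root, bonnie++ additionally needs -u <user>.
    n=0
    while [ "$n" -lt 4 ]; do
        run bonnie++ -d "/$DATASET" -s 512g -x 2 -f &
        n=$((n + 1))
    done
    wait
}

repro
```

The -d option only points bonnie++ at the dataset's mountpoint; changing to that directory first and running the original command line works just as well.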
Graphs included. Made with devilator (an Orca compatible data collector)
pulling data from devstat(9). The disk is just one out of 10 (the other
9 graphs are identical, as expected).
The first run of four bonnie++ processes runs without flaws. On graph
1 (TwoBonniesTput) we have the first bonnie++ from the start of the
graph to around 08:30 (the green line is the "Intelligent reading"
phase), and a second bonnie++ starting right after it.
Bonnie++ does several tests, beginning with a write test (blue line
showing around 230 MBps, from the start to 07:40), followed by a
read/write test (from 07:40 to 08:15 on the graphs), showing
read/write/delete activity and finally a read test (green line showing
250 MBps from 08:15 to 08:30 more or less). After bonnie++ ends, the
files it created are deleted. In this particular test, four concurrent
bonnie++ processes created four files of 512 GB each, a total of 2 TB.
After the first run, the disks show the TRIM activity going on at a rate of
200 MB/s. It seems quite slow, since a test I did at home on an OCZ Vertex4 SSD
(albeit, a single one, not a pool) gave a peak of 2 GB/s. But I understand that
the ada driver is coalescing TRIM requests, while the nvd driver doesn't.
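To get a feeling for what 200 MB/s per disk means here, a back-of-the-envelope estimate can be done in shell arithmetic. This is a rough sketch under loud assumptions: every freed block gets trimmed, the rate stays constant, raidz2 overhead is simply disks/(disks - parity), padding and metadata are ignored, and MB is treated as MiB. The function name is made up for illustration.

```shell
#!/bin/sh
# trim_drain_seconds FILES FILE_GIB DISKS PARITY RATE_MIB_S
# Rough time for each disk to trim its share of the deleted data.
# ASSUMPTIONS: constant TRIM rate, all freed space is trimmed, parity
# overhead is disks/(disks - parity), no padding or metadata.
trim_drain_seconds() {
    files=$1; file_gib=$2; disks=$3; parity=$4; rate=$5
    data_gib=$((files * file_gib))                     # logical data deleted
    raw_gib=$((data_gib * disks / (disks - parity)))   # including parity
    per_disk_mib=$((raw_gib * 1024 / disks))           # share of one disk
    echo $((per_disk_mib / rate))
}

# Four 512 GB files on a 10-disk raidz2, trimmed at about 200 MB/s per disk.
secs=$(trim_drain_seconds 4 512 10 2 200)
echo "$secs seconds (about $((secs / 60)) minutes)"
# prints: 1310 seconds (about 21 minutes)
```

Under these assumptions the TRIM backlog takes on the order of tens of minutes to drain, which is the same order of magnitude as the stall described below.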
The trouble is: the second bonnie++ process is started right after the first,
and THERE IS ALMOST NO WRITE ACTIVITY FOR 15 MINUTES. The writing activity is
just frozen, and it doesn't pick up until about 08:45, stalling again, although
for a shorter time, around 08:50.
On exhibit 2, "TwoBonniesTimes", it can be seen that the write latency during
the stall
is zero, which means (unless I am wrong) that no write commands are actually
During the stalls the ZFS system was unresponsive. Any commands such as a
"zfs list" were painfully slow, taking even some minutes to complete.
I understand that heavy TRIM activity must have an impact, but in this case
it is causing complete starvation of the rest of the ZFS I/O activity, which is
wrong. This behavior could cause a severe problem, for example, when destroying
a snapshot. In this case, the system is deleting 2 TB of data.
ATTEMPTS TO MITIGATE IT:
The first thing I tried was to reduce the priority of the TRIM operations in
with no visible effect.
After reading the article describing the ZFS I/O scheduler I suspected that the
TRIM activity might be activating the write throttle. So I just disabled it.
But it didn't help either. The writing processes still got stuck, but on
There are two problems here. It seems that the nvd driver doesn't coalesce trim
requests. On the other hand, ZFS is dumping a lot of trim requests assuming the
lower layer will coalesce them.
I don't think it's a good idea to make such an assumption blindly in ZFS. On
the other hand, I think that there should be some throttling mechanism applied
to trim
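To illustrate what "coalescing" means in this context, here is a small sketch that merges adjacent or overlapping (offset, length) ranges into fewer, larger ones. It is only an illustration of the idea, not how the ada driver or ZFS actually implement it; the function name and the input format (one "offset length" pair per line) are made up for the example.

```shell
#!/bin/sh
# coalesce_ranges: read "offset length" pairs on stdin, print merged ranges.
# Illustration only; not the ada driver's or ZFS's actual algorithm.
coalesce_ranges() {
    sort -n -k1,1 | awk '
        # First range: start a new run.
        NR == 1   { start = $1; end = $1 + $2; next }
        # Range touches or overlaps the current run: extend it.
        $1 <= end { if ($1 + $2 > end) end = $1 + $2; next }
        # Gap found: emit the current run and start a new one.
                  { print start, end - start; start = $1; end = $1 + $2 }
        END       { if (NR > 0) print start, end - start }
    '
}

# Four 4 KB requests, three of them contiguous, submitted out of order.
printf '0 4096\n4096 4096\n16384 4096\n8192 4096\n' | coalesce_ranges
# prints:
# 0 12288
# 16384 4096
```

Four requests become two, so a device that handles each TRIM command slowly has far fewer commands to process.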