svn commit: r292074 - in head/sys/dev: nvd nvme

Alan Somers asomers at freebsd.org
Fri Mar 11 04:58:47 UTC 2016


Do they behave badly for writes that cross a 128KB boundary, but are
nonetheless aligned to 128KB boundaries?  Then I don't understand how this
change (or mav's replacement) is supposed to help.  The stripesize is
supposed to be the minimum write that the device can accept without
requiring a read-modify-write.  ZFS guarantees that it will never issue a
write smaller than the stripesize, nor will it ever issue a write that is
not aligned to a stripesize-boundary.  But even if ZFS worked with 128KB
stripesizes, it would still happily issue writes that are a multiple of
128KB in size, and these would cross those boundaries.  Am I not understanding
something here?
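
To make the question concrete, here is a minimal sketch of what I mean by
"crossing" a boundary.  It is my own illustration, not code from nvd(4) or
nvme(4): the names STRIPE_BYTES and crosses_stripe() are hypothetical, and
the 128KB value is simply the figure quoted in the commit log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stripe size: the 128KB figure from the commit log. */
#define STRIPE_BYTES	((uint64_t)128 * 1024)

/*
 * Hypothetical helper: return true if the byte range
 * [offset, offset + length) touches more than one stripe.
 */
static bool
crosses_stripe(uint64_t offset, uint64_t length, uint64_t stripe)
{
	if (length == 0 || stripe == 0)
		return (false);
	/* Compare the stripe index of the first and last byte. */
	return (offset / stripe != (offset + length - 1) / stripe);
}
```

Under that definition, a 128KB write at offset 0 stays inside one stripe,
but a 256KB write at offset 0 crosses a boundary even though it is
perfectly aligned, and so does an 8KB write starting at offset 124KB.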

-Alan

On Thu, Mar 10, 2016 at 9:34 PM, Warner Losh <imp at bsdimp.com> wrote:

> Some Intel NVMe drives behave badly when the LBA range crosses a 128k
> boundary. Their performance is worse for those transactions than for ones
> that don't cross the 128k boundary.
>
> Warner
>
> On Thu, Mar 10, 2016 at 11:01 AM, Alan Somers <asomers at freebsd.org> wrote:
>
>> Are you saying that Intel NVMe controllers perform poorly for all I/Os
>> that are less than 128KB, or just for I/Os of any size that cross a 128KB
>> boundary?
>>
>> On Thu, Dec 10, 2015 at 7:06 PM, Steven Hartland <smh at freebsd.org> wrote:
>>
>>> Author: smh
>>> Date: Fri Dec 11 02:06:03 2015
>>> New Revision: 292074
>>> URL: https://svnweb.freebsd.org/changeset/base/292074
>>>
>>> Log:
>>>   Limit stripesize reported from nvd(4) to 4K
>>>
>>>   Intel NVMe controllers have a slow path for I/Os that span a 128KB
>>>   stripe boundary, but ZFS limits ashift, which is derived from
>>>   d_stripesize, to 13 (8KB), so we limit the stripesize reported to
>>>   geom(8) to 4KB.
>>>
>>>   This may cause a small number of additional I/Os to require
>>>   splitting in nvme(4); however, the NVMe I/O path is very efficient, so
>>>   these additional I/Os will cause minimal (if any) difference in
>>>   performance or CPU utilisation.
>>>
>>>   This can be controlled by the new sysctl
>>>   kern.nvme.max_optimal_sectorsize.
>>>
>>>   MFC after:    1 week
>>>   Sponsored by: Multiplay
>>>   Differential Revision:        https://reviews.freebsd.org/D4446
>>>
>>> Modified:
>>>   head/sys/dev/nvd/nvd.c
>>>   head/sys/dev/nvme/nvme.h
>>>   head/sys/dev/nvme/nvme_ns.c
>>>   head/sys/dev/nvme/nvme_sysctl.c
>>>
>>>
>

