Implementing VOP_READPLUS() in FreeBSD 15?
- Reply: Rick Macklem : "Re: Implementing VOP_READPLUS() in FreeBSD 15?"
Date: Thu, 06 Nov 2025 19:39:49 UTC
This is a followup to a discussion with the nfs-ganesha developers. Could FreeBSD implement a VOP_READPLUS() in FreeBSD 15, please?

Citing Lionel Cons/CERN:

> But the point is to optimise the read(). First, you have less traffic over the wire (which is a
> thing if your reads are in the gigabyte range for large VMs), and it tells the VM host that it
> can just map all those MMU pages representing the hole to the "default zero page", which
> in turn saves lots of space in the L3 and L2 caches ----> THIS DOES WONDERS to VM
> performance.
>
> Example:
> The performance benefit here comes from the fact that instead of mapping a 1TB hole
> (1099511627776 bytes) to 524288 individual 2M pages (the x86 2M hugepage size), and then
> potentially reading from them, you just have ONE 2M page in the cache, and all reads come
> from that.
>
> READ_PLUS is THE game changer for that kind of application, especially in our case (HPC
> simulations).

I just played with that:
1. Intel XEON with 512GB of RAM
2. loaded 16 sparse 64GB files consisting only of holes
3. created a kernel core dump

Result: almost all pages in the file cache are zero bytes.

VOP_READPLUS() would optimize this case: it would map all ranges belonging to sparse file holes to the same read-only MMU page, backed by a physical address range of zero bytes. Because it is the same physical memory, it would consume very little L2/L3 cache space, and save space in the filesystem cache too.

Aurélien

--
Aurélien Couderc <aurelien.couderc2002@gmail.com>
Big Data/Data mining expert, chess enthusiast
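[Editor's note: the hole layout that READ_PLUS (and a VOP_READPLUS()) would let the server encode, instead of streaming zero bytes, is already visible from userland via lseek(2) with SEEK_DATA/SEEK_HOLE. A minimal sketch in Python, with an illustrative file name; on filesystems without hole reporting, the whole file simply shows up as one data extent:]

```python
# Enumerate the data/hole extents of a sparse file using
# lseek(SEEK_DATA/SEEK_HOLE) -- the same layout information
# NFSv4.2 READ_PLUS conveys to clients instead of wire zeroes.
import errno
import os


def extents(path):
    """Yield ("data" | "hole", start, end) tuples covering the file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        off = 0
        while off < size:
            try:
                data = os.lseek(fd, off, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:  # only a hole remains
                    yield ("hole", off, size)
                    break
                raise
            if data > off:
                yield ("hole", off, data)
            # Every file ends with an implicit hole, so this succeeds.
            hole = os.lseek(fd, data, os.SEEK_HOLE)
            yield ("data", data, hole)
            off = hole
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Build a mostly-hole 1 MiB file (the name is arbitrary).
    with open("sparse_demo.bin", "wb") as f:
        f.write(b"x" * 4096)
        f.truncate(1 << 20)
    for kind, start, end in extents("sparse_demo.bin"):
        print(f"{kind}: {start}-{end} ({end - start} bytes)")
```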