nfs pessimized by vnode pages changes
Bruce Evans
brde at optusnet.com.au
Sun Mar 27 04:34:10 UTC 2016
I debugged another pessimization of nfs.
ncl_getpages() is now almost always called with a count of 1 page,
because r292373 changed the count passed by vm_fault() from faultcount
to 1. The only exception seems to be for the initial pagein for
exec -- this is still normally the nfs rsize. ncl_getpages() doesn't
do any readahead stuff like vnode_pager_generic_getpages() does, so it
normally does read RPCs of size 1 page instead of the nfs rsize.
This gives the following increases in read RPCs for makeworld of an
old world:
- with rsize = 16K, from 24k to 39k (the worst case would be 4 times as many)
- with rsize = 8K, from 39k to 44k (the worst case would be 2 times as many).
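The worst-case factors above are just rsize divided by the page size. A minimal sketch of that arithmetic, assuming a 4K page size (inferred from the stated factors of 4 and 2, not stated in this mail); the function names are made up for illustration and are not kernel code:

```c
#include <assert.h>

/*
 * Worst-case multiplier for read RPCs when every RPC carries one page
 * instead of a full rsize.  Both sizes are in bytes.
 */
static unsigned
worst_case_rpc_factor(unsigned rsize, unsigned page_size)
{
	assert(page_size > 0 && rsize >= page_size);
	return (rsize / page_size);
}

/* Observed increase in RPC count, in percent, rounded down. */
static unsigned
observed_increase_pct(unsigned before, unsigned after)
{
	assert(before > 0 && after >= before);
	return ((after - before) * 100 / before);
}
```

With the counts above, worst_case_rpc_factor(16384, 4096) is 4 and worst_case_rpc_factor(8192, 4096) is 2, while the measured increases are about 62% (24k to 39k) and about 12% (39k to 44k), so makeworld stays well short of the worst case.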
Also, ncl_getpages() has buggy logic which works accidentally if the
count is 1:
X diff -c2 ./fs/nfsclient/nfs_clbio.c~ ./fs/nfsclient/nfs_clbio.c
X *** ./fs/nfsclient/nfs_clbio.c~ Sun Mar 27 01:31:38 2016
X --- ./fs/nfsclient/nfs_clbio.c Sun Mar 27 02:35:32 2016
X ***************
X *** 135,140 ****
X */
X VM_OBJECT_WLOCK(object);
X ! if (pages[npages - 1]->valid != 0 && --npages == 0)
X goto out;
X VM_OBJECT_WUNLOCK(object);
X
X --- 135,155 ----
X */
X VM_OBJECT_WLOCK(object);
X ! #if 0
X ! /* This matches the comment. but doesn't work (has little effect). */
X ! if (pages[0]->valid != 0)
X goto out;
The comment still says that the code checks the requested page, but
that page is no longer passed to the function in a_reqpage. The first
page is a better guess at the requested page than the last one, but when
npages is 1 these pages are the same.
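To make the difference between the two checks concrete, here is a userland sketch that models them. struct sketch_page and both function names are hypothetical stand-ins, not the kernel's vm_page or anything in nfs_clbio.c; only the boolean logic is copied from the diff:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one vm_page field the checks look at. */
struct sketch_page {
	int valid;
};

/*
 * Model of the old check: look at the LAST page.  Returns true where the
 * original would "goto out".  Like the original, it decrements *npages
 * whenever the last page is valid.
 */
static bool
early_out_last(struct sketch_page **pages, int *npages)
{
	return (pages[*npages - 1]->valid != 0 && --*npages == 0);
}

/* Model of the "#if 0" variant: look at the FIRST page only. */
static bool
early_out_first(struct sketch_page **pages)
{
	return (pages[0]->valid != 0);
}
```

For npages == 1 with a valid page, both models return early. For npages == 2 with only the last page valid, neither returns early, but the old check shrinks npages to 1 as a side effect.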
X + #else
X + if (pages[0]->valid != 0)
X + printf("ncl_getpages: page 0 valid; npages %d\n", npages);
X + for (i = 0; i < npages; i++)
X + if (pages[i]->valid != 0)
X + printf("ncl_getpages: page %d valid; npages %d\n",
X + i, npages);
X + for (i = 0; i < npages; i++)
X + if (pages[i]->valid != 0)
X + npages = i;
X + if (npages == 0)
X + goto out;
X + #endif
Debugging and more forceful guessing code. This makes little difference
except of course to spam the console.
X VM_OBJECT_WUNLOCK(object);
X
X ***************
X *** 199,202 ****
X --- 214,220 ----
X KASSERT(m->dirty == 0,
X ("nfs_getpages: page %p is dirty", m));
X + printf("ncl_getpages: partial page %d of %d %s\n",
X + i, npages,
X + pages[i]->valid != 0 ? "valid" : "invalid");
X } else {
X /*
X ***************
X *** 210,215 ****
X --- 228,239 ----
X */
X ;
X + printf("ncl_getpages: short page %d of %d %s\n",
X + i, npages,
X + pages[i]->valid != 0 ? "valid" : "invalid");
X }
X }
X + for (i = 0; i < npages; i++)
X + printf("ncl_getpages: page %d of %d %s\n",
X + i, npages, pages[i]->valid != 0 ? "valid" : "invalid");
X out:
X VM_OBJECT_WUNLOCK(object);
Further debugging code. Similar debugging code in the old working version
shows that normal operation for paging in a 15K file with an rsize of 16K is:
- call here with npages = 4
- page in 3 full pages and 1 partial page using 1 RPC
- call here again with npages = 1 for the partial page
- use the optimization of returning early for this page -- don't do another
  RPC
The buggy version does:
- call here with npages = 1; page in 1 full page using 1 RPC
- call here with npages = 1; page in 1 full page using 1 RPC
- call here with npages = 1; page in 1 full page using 1 RPC
- call here with npages = 1; page in 1 partial page using 1 RPC
- call here again with npages = 1 for the partial page; the optimization
  works as before.
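The two traces above reduce to a simple count. A rough sketch, assuming a 4K page size and assuming that each call which actually does i/o issues exactly one read RPC covering all of its pages (the final repeat call for the partial page returns early and is not counted); the names are invented for illustration:

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, not from the mail */

/*
 * Count the read RPCs needed to page in a file of file_size bytes when
 * each getpages call is handed at most pages_per_call pages and does
 * one RPC for them.
 */
static unsigned
count_read_rpcs(unsigned file_size, unsigned pages_per_call)
{
	unsigned total_pages;

	assert(pages_per_call > 0);
	/* Round up so that a trailing partial page counts as a page. */
	total_pages = (file_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	/* Round up again: one RPC per (possibly short) batch of pages. */
	return ((total_pages + pages_per_call - 1) / pages_per_call);
}
```

For the 15K file, count_read_rpcs(15 * 1024, 4) gives 1 for the old behaviour and count_read_rpcs(15 * 1024, 1) gives 4 for the new one, matching the traces.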
The partial page isn't handled very well, but there is no extra
physical i/o for it, at least if it is at EOF. vfs clustering handles
partial pages even worse than this.
Bruce