nfs pessimized by vnode pages changes

Bruce Evans brde at optusnet.com.au
Sun Mar 27 04:34:10 UTC 2016


I debugged another pessimization of nfs.

ncl_getpages() is now almost always called with a count of 1 page, due
to the change changing the count from faultcount to 1 in r292373 in
vm_fault().  The only exception seems to be for the initial pagein for
exec -- this is still normally the nfs rsize.  ncl_getpages() doesn't
do any readahead stuff like vnode_pager_generic_getpages() does, so it 
normally does read RPCs of size 1 page instead of the the nfs rsize.
This gives the following increases in read RPCs for makeworld of an
old world:
- with rsize = 16K, from 24k to 39k (the worst case would be 4 times as many)
- with rsize = 8K, from 39k to 44k (the worst case would be 2 times as many).

Also, nfs_getpages() has buggy logic which works accidentally if the
count is 1:

X diff -c2 ./fs/nfsclient/nfs_clbio.c~ ./fs/nfsclient/nfs_clbio.c
X *** ./fs/nfsclient/nfs_clbio.c~	Sun Mar 27 01:31:38 2016
X --- ./fs/nfsclient/nfs_clbio.c	Sun Mar 27 02:35:32 2016
X ***************
X *** 135,140 ****
X   	 */
X   	VM_OBJECT_WLOCK(object);
X ! 	if (pages[npages - 1]->valid != 0 && --npages == 0)
X   		goto out;
X   	VM_OBJECT_WUNLOCK(object);
X 
X --- 135,155 ----
X   	 */
X   	VM_OBJECT_WLOCK(object);
X ! #if 0
X ! 	/* This matches the comment. but doesn't work (has little effect). */
X ! 	if (pages[0]->valid != 0)
X   		goto out;

The comment still says that the code checks the requested page, but
that is no longer passed to the function in a_reqpage.  The first page
is a better geuss of the requested page than the last one, but when
npages is 1 these pages are the same.

X + #else
X + 	if (pages[0]->valid != 0)
X + 		printf("ncl_getpages: page 0 valid; npages %d\n", npages);
X + 	for (i = 0; i < npages; i++)
X + 		if (pages[i]->valid != 0)
X + 			printf("ncl_getpages: page %d valid; npages %d\n",
X + 			    i, npages);
X + 	for (i = 0; i < npages; i++)
X + 		if (pages[i]->valid != 0)
X + 			npages = i;
X + 	if (npages == 0)
X + 		goto out;
X + #endif

Debugging and more forceful guessing code.  This makes little difference
except of course to spam the console.

X   	VM_OBJECT_WUNLOCK(object);
X 
X ***************
X *** 199,202 ****
X --- 214,220 ----
X   			KASSERT(m->dirty == 0,
X   			    ("nfs_getpages: page %p is dirty", m));
X + 			printf("ncl_getpages: partial page %d of %d %s\n",
X + 			    i, npages,
X + 			    pages[i]->valid != 0 ? "valid" : "invalid");
X   		} else {
X   			/*
X ***************
X *** 210,215 ****
X --- 228,239 ----
X   			 */
X   			;
X + 			printf("ncl_getpages: short page %d of %d %s\n",
X + 			    i, npages,
X + 			    pages[i]->valid != 0 ? "valid" : "invalid");
X   		}
X   	}
X + 	for (i = 0; i < npages; i++)
X + 		printf("ncl_getpages: page %d of %d %s\n",
X + 		    i, npages, pages[i]->valid != 0 ? "valid" : "invalid");
X   out:
X   	VM_OBJECT_WUNLOCK(object);

Further debugging code.  Similar debugging code in the old working version
shows that normal operation for paging in a 15K file with an rsize of 16K is:
- call here with npages = 4
- page in 3 full pages and 1 partial page using 1 RPC
- call here again with npages = 1 for the partial page
- use the optimization of returning early for this page -- don't do another
   RPC

The buggy version does:
- call here with npages = 1; page in 1 full page using 1 RPC
- call here with npages = 1; page in 1 full page using 1 RPC
- call here with npages = 1; page in 1 full page using 1 RPC
- call here with npages = 1; page in 1 partial page using 1 RPC
- call here again with npages = 1 for the partial page; the optimization
   works as before.

The partial page isn't handled very well, but at least there is no extra
physical i/o for it, at least if it is at EOF.  vfs clustering handles
partial pages even worse than this.

Bruce


More information about the freebsd-fs mailing list