svn commit: r232156 - head/sys/kern

Bruce Evans brde at optusnet.com.au
Sun Feb 26 01:46:34 UTC 2012


On Sat, 25 Feb 2012, Rick Macklem wrote:

>> Log:
>> o Reduce chances for integer overflow.
>> o More verbose sysctl description added.
>>
>> MFC after: 2 weeks
>> Sponsored by: Nginx, Inc.
>>
>> Modified:
>> head/sys/kern/vfs_cache.c
>>
>> Modified: head/sys/kern/vfs_cache.c
>> ==============================================================================
>> --- head/sys/kern/vfs_cache.c Sat Feb 25 11:07:32 2012 (r232155)
>> +++ head/sys/kern/vfs_cache.c Sat Feb 25 12:06:40 2012 (r232156)
>> @@ -369,7 +369,7 @@ sysctl_debug_hashstat_nchash(SYSCTL_HAND
>> maxlength = count;
>> }
>> n_nchash = nchash + 1;
>> - pct = (used * 100 * 100) / n_nchash;
>> + pct = (used * 100) / (n_nchash / 100);
>
> You might want to consider a sanity check to make
> sure n_nchash is >= 100, to avoid a "divide by zero".
>
> There was an NFS PR# a while back, where "desiredvnodes" was
> set very small and resulted in a "divide by zero" in the NFS code.

Interesting.  I considered mentioning this possibility in my reply,
but decided that desiredvnodes is always initially at least a few
hundred, since even an unbootable machine with 4MB memory has
1024 4K-pages.  You can tune desiredvnodes down to a bad value,
but another old bug is that tuning desiredvnodes doesn't affect
the namecache, so you can't make the above divide by provided
n_nchash was initially not bad.

In nfs, the problem is larger and still exists in oldnfs:

% nfsclient/nfs_vfsops.c-		nmp->nm_wsize = NFS_WSIZE;
% nfsclient/nfs_vfsops.c-		nmp->nm_rsize = NFS_RSIZE;
% nfsclient/nfs_vfsops.c-	}
% nfsclient/nfs_vfsops.c:	nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);

Now the divisor is 1000, so problems occur with the initial value occurs
on machines with 10 times as much memory as ones which have a problem in
the namecache code, and a good initial value can be tuned down to ensure
division by zero.

% nfsclient/nfs_vfsops.c-	nmp->nm_readdirsize = NFS_READDIRSIZE;
% nfsclient/nfs_vfsops.c-	nmp->nm_numgrps = NFS_MAXGRPS;
% nfsclient/nfs_vfsops.c-	nmp->nm_readahead = NFS_DEFRAHEAD;
% --
% fs/nfsclient/nfs_clvfsops.c-	nmp->nm_timeo = NFS_TIMEO;
% fs/nfsclient/nfs_clvfsops.c-	nmp->nm_retry = NFS_RETRANS;
% fs/nfsclient/nfs_clvfsops.c-	nmp->nm_readahead = NFS_DEFRAHEAD;
% fs/nfsclient/nfs_clvfsops.c:	if (desiredvnodes >= 11000)
% fs/nfsclient/nfs_clvfsops.c:		nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
% fs/nfsclient/nfs_clvfsops.c-	else
% fs/nfsclient/nfs_clvfsops.c-		nmp->nm_wcommitsize = hibufspace / 10;
% fs/nfsclient/nfs_clvfsops.c-

Has been fixed, but if you only want to avoid division by 0, then a simpler
fix is to split up the powers of 10, e.g.:

 		nmp->nm_wcommitsize = hibufspace * 100 / (desiredvnodes / 10);

No one would set desiredvnodes to < 10, but now there is possible overflow
for the multiplication, and this is even easier to arrange then division
by zero since it is easy to set highbufspace to a non-physical value
(2**63-1 on 64-bit machines).  So this is too fragile.  Except the
sysctls are privileged and we should rely on their users to not misuse
them.

Bruce


More information about the svn-src-all mailing list