Serious performance issues, broken initialization, and a likely fix

Scott Long scottl at samsco.org
Tue Aug 9 01:43:35 GMT 2005


Ade Lovett wrote:
> Or perhaps it should be just "Here be dragons"...
> 
> Whilst attempting to nail down some serious performance issues (compared
> with 4.x) in preparation for a 6.x rollout here, we've come across
> something of a fundamental bug.
> 
> In this particular environment (a Usenet transit server, so very high
> network and disk I/O) we observed that processes were spending a
> considerable amount of time in state 'wswbuf', traced back to getpbuf()
> in vm/vm_pager.c
> 
> To cut a long story short, the order in which nswbuf is being
> initialized is completely, totally, and utterly wrong -- this was
> introduced by revision 1.132 of vm/vnode_pager.c just over 4 years ago.
> 
> In vnode_pager.c we find:
> 
> static void
> vnode_pager_init(void)
> {
> 	vnode_pbuf_freecnt = nswbuf / 2 + 1;
> }
> 
> Unfortunately, nswbuf hasn't been assigned to yet, just happens to be
> zero (in all cases), and thus the kernel believes that there is only
> ever *one* swap buffer available.
> 
> kern_vfs_bio_buffer_alloc() in kern/vfs_bio.c which actually does the
> calculation and assignment, is called rather further on in the process,
> by which time the damage has been done.
> 
> The net result is that *any* calls involving getpbuf() will be
> unconditionally serialized, completely destroying any kind of
> concurrency (and performance).
> 
> Given the memory footprint of our machines, we've hacked in a simple:
> 
> 	nswbuf = 0x100;
> 
> into vnode_pager_init(), since the calculation ends up giving us the
> maximum number anyway.  There are a number of possible 'correct' fixes
> in terms of re-ordering the startup sequence.
> 
> With the aforementioned hack, we're now seeing considerably better
> machine operation, certainly as good as similar 4.10-STABLE boxes.
> 
> As per $SUBJECT, this affects all of RELENG_5, RELENG_6, and HEAD, and
> should, IMO, be considered an absolutely required fix for 6.0-RELEASE.
> 
> -aDe
> 

My vote is to revert rev 1.132 and replace the XXX comment with a more
detailed explanation of the perils involved.  Do you have any kind of
easy-to-run regression test that could be used to quantify this problem
and guard against it in the future?  Thanks very much for looking
into it and providing such a good explanation.
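
For reference, the ordering problem described above can be modelled in
user space.  This is only a sketch, not kernel code: nswbuf,
vnode_pbuf_freecnt and the body of vnode_pager_init() come from the
quoted message, while model_buffer_alloc() and model_boot() are
hypothetical helpers that stand in for the sizing done in
kern_vfs_bio_buffer_alloc() and for the boot sequence.

```c
#include <stdio.h>

/* User-space stand-ins for the kernel globals named in the quoted message. */
static int nswbuf;		/* still zero until the sizing code runs */
static int vnode_pbuf_freecnt;

/* Same assignment as the vnode_pager.c snippet quoted above. */
static void
vnode_pager_init(void)
{
	vnode_pbuf_freecnt = nswbuf / 2 + 1;
}

/*
 * Hypothetical stand-in for the calculation and assignment done in
 * kern_vfs_bio_buffer_alloc(); the real sizing logic is not modelled.
 */
static void
model_buffer_alloc(int value)
{
	nswbuf = value;
}

/*
 * Hypothetical boot model: run the two steps in either order and
 * return the resulting vnode_pbuf_freecnt.
 */
static int
model_boot(int pager_first, int sized_nswbuf)
{
	nswbuf = 0;
	vnode_pbuf_freecnt = 0;
	if (pager_first) {
		vnode_pager_init();		/* sees nswbuf == 0 */
		model_buffer_alloc(sized_nswbuf);
	} else {
		model_buffer_alloc(sized_nswbuf);
		vnode_pager_init();		/* sees the sized value */
	}
	return (vnode_pbuf_freecnt);
}
```

With the 0x100 value from the hack above, model_boot(1, 0x100) returns 1
(the pager init ran first and saw zero), while model_boot(0, 0x100)
returns 129.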

Scott

