kern/84903: Incorrect initialization of nswbuf
Ade Lovett
ade at FreeBSD.org
Sun Aug 14 09:10:13 GMT 2005
>Number: 84903
>Category: kern
>Synopsis: Incorrect initialization of nswbuf
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sun Aug 14 09:10:11 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator: Ade Lovett
>Release: All FreeBSD > 5.0
>Organization:
Supernews
>Environment:
Any FreeBSD system (RELENG_5, RELENG_6, and HEAD) after
revision 1.132 of sys/vm/vnode_pager.c (4 years, 1 month ago)
>Description:
Whilst attempting to nail down some serious performance issues (compared
with 4.x) in preparation for a 6.x rollout here, we've come across
something of a fundamental bug.
In this particular environment (a Usenet transit server, so very high
network and disk I/O) we observed that processes were spending a
considerable amount of time in state 'wswbuf', traced back to getpbuf()
in vm/vm_pager.c.
To cut a long story short, the order in which nswbuf is being
initialized is completely, totally, and utterly wrong -- this was
introduced by revision 1.132 of vm/vnode_pager.c just over 4 years ago.
In vnode_pager.c we find:
static void
vnode_pager_init(void)
{
	vnode_pbuf_freecnt = nswbuf / 2 + 1;
}
Unfortunately, nswbuf has not been assigned at this point and is still
zero (in all cases), so vnode_pbuf_freecnt becomes 0 / 2 + 1 = 1 and the
kernel believes that there is only ever *one* swap buffer available.
kern_vfs_bio_buffer_alloc() in kern/vfs_bio.c, which actually does the
calculation and assignment of nswbuf, is called later in the startup
sequence, by which time the damage has been done.
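The ordering problem can be modelled in user space. What follows is only
a sketch under stated assumptions: the model_* and boot_* functions are
hypothetical stand-ins for the kernel routines named above, and 256 is an
assumed value for nswbuf (the 0x100 figure quoted later in this report),
not the result of the kernel's real calculation.

```c
#include <assert.h>

/* File-scope ints start at zero in C, just as nswbuf is zero until
 * kern_vfs_bio_buffer_alloc() assigns it. */
static int nswbuf;
static int vnode_pbuf_freecnt;

/* Stand-in for vnode_pager_init(): same formula as in the report. */
static void
model_vnode_pager_init(void)
{
	vnode_pbuf_freecnt = nswbuf / 2 + 1;
}

/* Stand-in for kern_vfs_bio_buffer_alloc(); 256 is an assumed value. */
static void
model_buffer_alloc(void)
{
	nswbuf = 256;
}

/* Order described in the report: the pager is initialized first. */
static int
boot_pager_first(void)
{
	nswbuf = 0;
	model_vnode_pager_init();
	model_buffer_alloc();
	return (vnode_pbuf_freecnt);
}

/* Reordered startup: nswbuf is assigned before the pager reads it. */
static int
boot_alloc_first(void)
{
	nswbuf = 0;
	model_buffer_alloc();
	model_vnode_pager_init();
	return (vnode_pbuf_freecnt);
}

/* Compare the free count produced by the two orderings. */
void
check_boot_order(void)
{
	assert(boot_pager_first() == 1);	/* 0 / 2 + 1 */
	assert(boot_alloc_first() == 129);	/* 256 / 2 + 1 */
}
```

Calling check_boot_order() from a test program exercises both orderings;
only the second one leaves the pager with more than a single buffer.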
The net result is that *any* calls involving getpbuf() will be
unconditionally serialized, completely destroying any kind of
concurrency (and performance).
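To illustrate why a free count of one serializes callers, here is a
simplified user-space model. It assumes that getpbuf() makes a caller
sleep whenever the pager's free count has reached zero; struct pbuf_gate
and the gate_* functions are hypothetical names for this sketch and do
not exist in the kernel.

```c
#include <assert.h>

/* Hypothetical model of a per-pager free count. */
struct pbuf_gate {
	int freecnt;	/* buffers still available */
	int sleepers;	/* callers that would have had to sleep */
};

/* Returns 1 if a buffer was handed out, 0 if the caller would sleep. */
int
gate_get(struct pbuf_gate *g)
{
	if (g->freecnt == 0) {
		g->sleepers++;
		return (0);
	}
	g->freecnt--;
	return (1);
}

/* Return a buffer to the gate. */
void
gate_release(struct pbuf_gate *g)
{
	g->freecnt++;
}

/* Issue n overlapping requests with no releases in between and
 * report how many of them would have had to wait. */
int
gate_contention(int freecnt, int n)
{
	struct pbuf_gate g = { freecnt, 0 };
	int i;

	for (i = 0; i < n; i++)
		(void)gate_get(&g);
	return (g.sleepers);
}
```

In this model, eight overlapping requests against a free count of 1
leave seven callers waiting, whereas a free count of 129 lets all eight
proceed at once.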
Given the memory footprint of our machines, we've hacked in a simple:
nswbuf = 0x100;
into vnode_pager_init(), since the calculation ends up giving us the
maximum number anyway. There are a number of possible 'correct' fixes
in terms of re-ordering the startup sequence.
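For reference, the local hack amounts to something like the sketch
below. The report only says the assignment was added to
vnode_pager_init(); placing it before the existing line is an assumption,
as are the _workaround suffix and the workaround_freecnt() helper, which
exist only so the sketch can be compiled and checked outside the kernel.

```c
#include <assert.h>

/* User-space stand-ins for the kernel globals. */
static int nswbuf;
static int vnode_pbuf_freecnt;

/* vnode_pager_init() with the local hack applied (assumed placement). */
void
vnode_pager_init_workaround(void)
{
	nswbuf = 0x100;				/* hard-coded: 256 */
	vnode_pbuf_freecnt = nswbuf / 2 + 1;	/* 256 / 2 + 1 */
}

/* Hypothetical helper: run the init and return the resulting count. */
int
workaround_freecnt(void)
{
	vnode_pager_init_workaround();
	return (vnode_pbuf_freecnt);
}
```

With nswbuf forced to 0x100 the free count comes out as 129 instead of 1.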
With the aforementioned hack, we're now seeing considerably better
machine operation, certainly as good as similar 4.10-STABLE boxes.
As per $SUBJECT, this affects all of RELENG_5, RELENG_6, and HEAD, and
should, IMO, be considered an absolutely required fix for 6.0-RELEASE.
>How-To-Repeat:
N/A
>Fix:
We have implemented a local hack as above, given that the
memory footprint of the machines would result in the
maximal value of nswbuf being assigned in any case.
This is not a real fix, however.
A solution has been offered by Alexander Kabaev <kabaev at gmail.com>
as follows, which appears to do the right thing, at least on
RELENG_6/i386, which is the only type of machine I have easy
access to for testing purposes.
In my opinion, it would be a fatal error to release 6.0 in
any shape or form without addressing this issue.
Index: vm_init.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_init.c,v
retrieving revision 1.46
diff -u -r1.46 vm_init.c
--- vm_init.c 25 Apr 2005 19:22:05 -0000 1.46
+++ vm_init.c 9 Aug 2005 01:59:12 -0000
@@ -124,7 +124,7 @@
 	vm_map_startup();
 	kmem_init(virtual_avail, virtual_end);
 	pmap_init();
-	vm_pager_init();
+	/* vm_pager_init(); */
 }
 
 void
Index: vm_pager.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_pager.c,v
retrieving revision 1.105
diff -u -r1.105 vm_pager.c
--- vm_pager.c 18 May 2005 20:45:33 -0000 1.105
+++ vm_pager.c 9 Aug 2005 01:59:55 -0000
@@ -202,6 +202,8 @@
 	struct buf *bp;
 	int i;
 
+	vm_pager_init();
+
 	mtx_init(&pbuf_mtx, "pbuf mutex", NULL, MTX_DEF);
 	bp = swbuf;
 	/*
>Release-Note:
>Audit-Trail:
>Unformatted: