kern/84903: Incorrect initialization of nswbuf

Ade Lovett ade at FreeBSD.org
Sun Aug 14 09:10:13 GMT 2005


>Number:         84903
>Category:       kern
>Synopsis:       Incorrect initialization of nswbuf
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Aug 14 09:10:11 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     Ade Lovett
>Release:        All FreeBSD > 5.0
>Organization:
Supernews
>Environment:

	Any FreeBSD system (RELENG_5, RELENG_6, and HEAD) after
	revision 1.132 of sys/vm/vnode_pager.c (4 years, 1 month ago)

>Description:

Whilst attempting to nail down some serious performance issues (compared
with 4.x) in preparation for a 6.x rollout here, we've come across
something of a fundamental bug.

In this particular environment (a Usenet transit server, so very high
network and disk I/O) we observed that processes were spending a
considerable amount of time in state 'wswbuf', traced back to getpbuf()
in vm/vm_pager.c.

To cut a long story short, the order in which nswbuf is being
initialized is completely, totally, and utterly wrong -- this was
introduced by revision 1.132 of vm/vnode_pager.c just over 4 years ago.

In vnode_pager.c we find:

static void
vnode_pager_init(void)
{
	vnode_pbuf_freecnt = nswbuf / 2 + 1;
}

Unfortunately, nswbuf has not been assigned at this point. It is still
zero (in all cases), so vnode_pbuf_freecnt becomes 0 / 2 + 1 = 1, and
the kernel believes that there is only ever *one* swap buffer available.
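
To make the ordering problem concrete, here is a minimal user-space
sketch. It is not kernel code: the model_* functions and the two
freecnt_*_order() helpers are hypothetical names invented for
illustration, and 256 is simply the 0x100 value mentioned in this
report, standing in for whatever kern_vfs_bio_buffer_alloc() would
compute.

```c
#include <assert.h>

/* User-space model only; names mirror the kernel globals. */
static int nswbuf;              /* zero until the buffer code sizes it */
static int vnode_pbuf_freecnt;

/* Same arithmetic as the vnode_pager_init() quoted above. */
static void model_vnode_pager_init(void)
{
	assert(nswbuf >= 0);
	vnode_pbuf_freecnt = nswbuf / 2 + 1;
}

/* Hypothetical stand-in for kern_vfs_bio_buffer_alloc(). */
static void model_buffer_alloc(void)
{
	nswbuf = 256;           /* assumed value: 0x100 from this report */
}

/* Order described in this report: pager init runs first. */
static int freecnt_broken_order(void)
{
	nswbuf = 0;
	model_vnode_pager_init();
	model_buffer_alloc();
	return vnode_pbuf_freecnt;
}

/* Intended order: nswbuf is sized before the pager reads it. */
static int freecnt_fixed_order(void)
{
	nswbuf = 0;
	model_buffer_alloc();
	model_vnode_pager_init();
	return vnode_pbuf_freecnt;
}
```

With these assumptions, the broken order leaves the free count at 1,
while the intended order yields 256 / 2 + 1 = 129.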

kern_vfs_bio_buffer_alloc() in kern/vfs_bio.c, which actually does the
calculation and assignment, is called rather later in the boot process,
by which time the damage has been done.

The net result is that *any* calls involving getpbuf() will be
unconditionally serialized, completely destroying any kind of
concurrency (and performance).
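
The serialization can be modelled roughly as a counting gate. The
sketch below is a simplified user-space model, not the real getpbuf():
model_trygetpbuf(), model_relpbuf(), and model_concurrent_grants() are
hypothetical names, and a return value of 0 stands in for the point
where a real caller would sleep in state 'wswbuf'.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 if a buffer is granted, 0 where a real caller would
 * have to sleep until another caller releases its buffer.
 */
static int model_trygetpbuf(int *pfreecnt)
{
	assert(pfreecnt != NULL);
	if (*pfreecnt == 0)
		return 0;
	--*pfreecnt;
	return 1;
}

/* Hands a buffer back, letting one waiter proceed. */
static void model_relpbuf(int *pfreecnt)
{
	assert(pfreecnt != NULL);
	++*pfreecnt;
}

/* How many of `callers` simultaneous requests proceed without sleeping. */
static int model_concurrent_grants(int freecnt, int callers)
{
	int granted = 0;

	for (int i = 0; i < callers; i++)
		granted += model_trygetpbuf(&freecnt);
	return granted;
}
```

With a free count of 1, only one of any number of simultaneous callers
proceeds; the rest wait for a release, which is the behaviour observed
here.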

Given the memory footprint of our machines, we've hacked in a simple:

	nswbuf = 0x100;

into vnode_pager_init(), since the calculation ends up giving us the
maximum number anyway.  There are a number of possible 'correct' fixes
in terms of re-ordering the startup sequence.

With the aforementioned hack, we're now seeing considerably better
machine operation, certainly as good as similar 4.10-STABLE boxes.

As noted under Environment, this affects all of RELENG_5, RELENG_6, and
HEAD, and should, IMO, be considered an absolutely required fix for
6.0-RELEASE.

>How-To-Repeat:

	N/A
>Fix:

	We have implemented a local hack as above, given that the
	memory footprint of the machines would result in the
	maximal value of nswbuf being assigned in any case.

	This is not a real fix, however.

	A solution has been offered by Alexander Kabaev <kabaev at gmail.com>
	as follows, which appears to do the right thing, at least on
	RELENG_6/i386, which is the only type of machine I have easy
	access to for testing purposes.

	In my opinion, it would be a fatal error to release 6.0 in
	any shape or form without addressing this issue.

Index: vm_init.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_init.c,v
retrieving revision 1.46
diff -u -r1.46 vm_init.c
--- vm_init.c	25 Apr 2005 19:22:05 -0000	1.46
+++ vm_init.c	9 Aug 2005 01:59:12 -0000
@@ -124,7 +124,7 @@
 	vm_map_startup();
 	kmem_init(virtual_avail, virtual_end);
 	pmap_init();
-	vm_pager_init();
+	/* vm_pager_init(); */
 }
 
 void
Index: vm_pager.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_pager.c,v
retrieving revision 1.105
diff -u -r1.105 vm_pager.c
--- vm_pager.c	18 May 2005 20:45:33 -0000	1.105
+++ vm_pager.c	9 Aug 2005 01:59:55 -0000
@@ -202,6 +202,8 @@
 	struct buf *bp;
 	int i;
 
+	vm_pager_init();
+
 	mtx_init(&pbuf_mtx, "pbuf mutex", NULL, MTX_DEF);
 	bp = swbuf;
 	/*
>Release-Note:
>Audit-Trail:
>Unformatted:

