kernel memory checks on boot vs. boot time

Alexander Best arundel at freebsd.org
Tue Mar 22 20:00:49 UTC 2011


On Tue Mar 22 11, John Baldwin wrote:
> On Tuesday, March 22, 2011 1:30:42 pm Bjoern A. Zeeb wrote:
> > Hi,
> > 
> > as part of the i386/pc98/amd64 boot process we are doing some basic
> > memory testing, mapping pages and running a couple of pattern
> > write/read tests on the first bytes (see getmemsize() implementations).
> > 
> > Depending on the features enabled and whether you boot with -v, you
> > may notice it as "nothing happens" when booting from the loader, after
> > any of these possible lines:
> >  	GDB: no debug ports present
> >  	KDB: debugger backends: ddb
> >  	KDB: current backend: ddb
> >  	SMAP type=...
> > but before the Copyright message.
> > 
> > With the growing amount of memory this can lead to a significant
> > fraction of kernel startup time on amd64 (~40s delays observed with
> > 96G of RAM).  Looping over the pages, but not mapping them and not
> > running the pattern tests reduces this significantly (to single digit
> > numbers of seconds).
> > 
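[Editorial sketch, not part of the original message.] The figures above can be turned into a rough per-page cost. This assumes that "96G" means 96 GiB, that pages are 4 KiB, and that the whole ~40 s delay is spent in the per-page loop; none of these is stated in the post, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Number of pages in `gib` GiB of RAM for a given page size in bytes. */
static uint64_t
pages_in(uint64_t gib, uint64_t page_size)
{
	return ((gib << 30) / page_size);
}

/* Average time per page in microseconds for a total delay in seconds. */
static double
usec_per_page(double seconds, uint64_t pages)
{
	return (seconds * 1e6 / (double)pages);
}
```

Under those assumptions, pages_in(96, 4096) gives 25165824 pages, and usec_per_page(40.0, 25165824) comes out at roughly 1.6 microseconds per page, so the cost is small per page and only adds up because of the page count.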
> > As a first step I'd like to discuss how worthwhile the actual memory
> > tests are these days, to figure out a sensible default.
> > 
> > Not wanting to remove them but maybe make more use of them in the
> > future (as we do not currently report any problems we find) I'd suggest
> > introducing a tunable to disable/enable them, say
> > 
> >  	hw.run_memtest
> > 
> > with the following values:
> > 
> >  	0	do not map the page and do not run the pattern tests
> >  	1	do run the pattern test on the beginning of the page
> >  		(current default).
> > and maybe add
> >  	2	run the pattern tests on the entire pages?
> > 
> > I would further suggest adding a printf there, independent of boot -v,
> > so that a user who is waiting will know what's (not) going on.
> > Something along the lines of:
> >  	"Testing physical address space (%s)."
> >  	0       "skipping extra pattern tests"
> >  	1       "pattern tests on beginning of each page"
> >  	2       "pattern tests on entire pages"
> > 
> > 
> > If this is something that makes sense, I'd suggest factoring things
> > out to sys/x86 and would provide a patch for further discussion and
> > improvements (like error reporting, etc).
> > 
> > Comments?  Suggestions?
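
[Editorial sketch, not part of the original message.] The proposed tunable levels and messages could look roughly like the following in userspace C. This is not the kernel's getmemsize() code: the enum, the function names, the 4 KiB page size, and the specific bit patterns are all assumptions chosen for illustration, and a plain buffer stands in for a mapped physical page.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096	/* assumed page size */

/* Hypothetical names for the proposed hw.run_memtest values. */
enum memtest_level {
	MEMTEST_SKIP = 0,	/* no mapping, no pattern tests */
	MEMTEST_PAGE_START = 1,	/* test the beginning of each page */
	MEMTEST_FULL_PAGE = 2	/* test every word of each page */
};

/* Text for the proposed "Testing physical address space (%s)." message. */
static const char *
memtest_describe(int level)
{
	switch (level) {
	case MEMTEST_SKIP:
		return ("skipping extra pattern tests");
	case MEMTEST_PAGE_START:
		return ("pattern tests on beginning of each page");
	case MEMTEST_FULL_PAGE:
		return ("pattern tests on entire pages");
	default:
		return ("unknown");
	}
}

/*
 * Write each pattern to the first nwords words and read it back.
 * The original contents are restored. Returns 0 on success, -1 on
 * the first word that fails to hold a pattern.
 */
static int
pattern_test(volatile uint32_t *p, size_t nwords)
{
	/* Illustrative alternating-bit, all-ones and all-zeros patterns. */
	static const uint32_t patterns[] = {
		0xaaaaaaaa, 0x55555555, 0xffffffff, 0x00000000
	};
	size_t npat = sizeof(patterns) / sizeof(patterns[0]);

	for (size_t w = 0; w < nwords; w++) {
		uint32_t saved = p[w];
		int bad = 0;

		for (size_t i = 0; i < npat; i++) {
			p[w] = patterns[i];
			if (p[w] != patterns[i])
				bad = 1;
		}
		p[w] = saved;
		if (bad)
			return (-1);
	}
	return (0);
}

/* Apply the selected level to one page; -1 for a failure or bad level. */
static int
memtest_page(void *page, int level)
{
	switch (level) {
	case MEMTEST_SKIP:
		return (0);
	case MEMTEST_PAGE_START:
		return (pattern_test(page, 1));
	case MEMTEST_FULL_PAGE:
		return (pattern_test(page, PAGE_SIZE / sizeof(uint32_t)));
	default:
		return (-1);
	}
}
```

A caller would print the message once, for example printf("Testing physical address space (%s).\n", memtest_describe(level)), and then call memtest_page() for each page. Level 2 touches 1024 words per page instead of one, which is why it would be noticeably slower than the current default.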
> 
> Do other platforms bother with these sorts of memory tests?  If not I'd vote 
> to just drop it.  I think this mattered more when you didn't have things like 
> SMAP (so you had to guess at where memory ended sometimes).  Also, modern 
> server class x86 machines generally support ECC RAM which will trigger a 
> machine check if there is a problem.  I doubt that the early checks are 
> catching anything even for the non-ECC case.
> 
> If nothing else, I would definitely drop this from amd64 (all those systems 
> have SMAP and machine check support, etc.).

also +1 for removing these routines on amd64.

i don't think these are necessary on i386/pc98 either. but if it's decided
that the mem tests should stay on these archs, i vote for the introduction of
a tunable.

cheers.
alex

> 
> -- 
> John Baldwin

-- 
a13x


More information about the freebsd-arch mailing list