kernel memory checks on boot vs. boot time
Alexander Best
arundel at freebsd.org
Tue Mar 22 20:00:49 UTC 2011
On Tue Mar 22 11, John Baldwin wrote:
> On Tuesday, March 22, 2011 1:30:42 pm Bjoern A. Zeeb wrote:
> > Hi,
> >
> > as part of the i386/pc98/amd64 boot process we are doing some basic
> > memory testing, mapping pages and running a couple of pattern
> > write/read tests on the first bytes (see getmemsize() implementations).
> >
> > Depending on the features enabled and whether you boot with -v, you
> > may notice it as "nothing happens" when booting from the loader, after
> > any of these possible lines:
> > GDB: no debug ports present
> > KDB: debugger backends: ddb
> > KDB: current backend: ddb
> > SMAP type=...
> > but before the Copyright message.
> >
> > With the growing amount of memory this can take up a significant
> > fraction of kernel startup time on amd64 (~40s delays observed with
> > 96G of RAM). Looping over the pages, but not mapping them and not
> > running the pattern tests reduces this significantly (to single digit
> > numbers of seconds).
> >
> > As a first step I'd like to discuss how worthwhile the actual memory
> > tests are these days, to figure out a sensible default.
> >
> > Rather than removing them, and perhaps to make more use of them in the
> > future (we currently do not report any problems we find), I'd suggest
> > introducing a tunable to disable/enable them, say
> >
> > hw.run_memtest
> >
> > with the following values:
> >
> > 0 do not map the page and do not run the pattern tests
> > 1 do run the pattern test on the beginning of the page
> > (current default).
> > and maybe add
> > 2 run the pattern tests on the entire pages?
> >
> > I would further suggest adding a printf there, independent of boot -v,
> > so that a user who is waiting will know what's (not) going on.
> > Something along the lines of:
> > "Testing physical address space (%s)."
> > 0 "skipping extra pattern tests"
> > 1 "pattern tests on beginning of each page"
> > 2 "pattern tests on entire pages"
> >
> >
> > If this is something that makes sense, I'd suggest to factor things
> > out to sys/x86 and would provide a patch for further discussion and
> > improvements (like error reporting, etc).
> >
> > Comments? Suggestions?
>
> Do other platforms bother with these sorts of memory tests? If not I'd vote
> to just drop it. I think this mattered more when you didn't have things like
> SMAP (so you had to guess at where memory ended sometimes). Also, modern
> server class x86 machines generally support ECC RAM which will trigger a
> machine check if there is a problem. I doubt that the early checks are
> catching anything even for the non-ECC case.
>
> If nothing else, I would definitely drop this from amd64 (all those systems
> have SMAP and machine check support, etc.).
also +1 for removing these routines on amd64.
i don't think these are necessary on i386/pc98 either. but if it's decided
that the mem tests should stay on these archs, i vote for introducing
a tunable.
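
for what it's worth, here is a rough userland sketch of how the three levels
bjoern listed could be selected by one integer. this is not the getmemsize()
code: the names (memtest_mode, page_test(), memtest_run()), the 4096-byte page
size and the pattern values are all made up for illustration, and it runs on an
ordinary buffer instead of mapped physical pages. in the kernel the value would
presumably be fetched early in boot, e.g. with TUNABLE_INT_FETCH().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE 4096

/* Hypothetical values for the proposed hw.run_memtest tunable. */
enum memtest_mode {
	MEMTEST_SKIP = 0,	/* no mapping, no pattern tests */
	MEMTEST_HEAD = 1,	/* pattern tests on beginning of each page */
	MEMTEST_FULL = 2	/* pattern tests on entire pages */
};

/* Map a mode to the text suggested for the boot message. */
static const char *
memtest_mode_desc(int mode)
{
	switch (mode) {
	case MEMTEST_SKIP:
		return ("skipping extra pattern tests");
	case MEMTEST_HEAD:
		return ("pattern tests on beginning of each page");
	case MEMTEST_FULL:
		return ("pattern tests on entire pages");
	default:
		return ("unknown mode");
	}
}

/*
 * Write each pattern to the first nwords words of a page and read it
 * back, restoring the original contents.  Returns 1 if every readback
 * matched, 0 otherwise.  The patterns are illustrative.
 */
static int
page_test(volatile uint32_t *page, size_t nwords)
{
	static const uint32_t patterns[] = {
		0xaaaaaaaa, 0x55555555, 0xffffffff, 0x00000000
	};
	size_t i, p;
	int ok = 1;

	for (i = 0; i < nwords; i++) {
		uint32_t saved = page[i];

		for (p = 0; p < sizeof(patterns) / sizeof(patterns[0]); p++) {
			page[i] = patterns[p];
			if (page[i] != patterns[p])
				ok = 0;
		}
		page[i] = saved;
	}
	return (ok);
}

/*
 * Walk npages pages starting at base and test them according to mode.
 * Returns the number of pages that failed.  Values other than 0 and 2
 * fall back to testing only the first word of each page.
 */
static size_t
memtest_run(void *base, size_t npages, int mode)
{
	size_t i, nwords, bad = 0;

	printf("Testing physical address space (%s).\n",
	    memtest_mode_desc(mode));
	if (mode == MEMTEST_SKIP)
		return (0);
	nwords = (mode == MEMTEST_FULL) ?
	    SKETCH_PAGE_SIZE / sizeof(uint32_t) : 1;
	for (i = 0; i < npages; i++) {
		volatile uint32_t *page = (volatile uint32_t *)
		    ((char *)base + i * SKETCH_PAGE_SIZE);

		if (!page_test(page, nwords))
			bad++;
	}
	return (bad);
}
```

calling memtest_run(buf, npages, 1) on a normal buffer prints the mode line and
returns the number of pages whose readback failed. mode 0 returns right after
the message.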
cheers.
alex
>
> --
> John Baldwin
--
a13x
More information about the freebsd-arch
mailing list