minidumps are unsafe on amd64
scottl at samsco.org
Fri Jan 25 11:23:19 PST 2008
Ruslan Ermilov wrote:
> Kernel minidumps on amd64 SMP can write beyond the bounds
> of the configured dump device, causing (as in our case) the
> file system data following the swap partition to be overwritten
> with the dump contents.
> The problem is that while we're in the process of dumping
> mapped physical pages via a bitmap (in minidump_machdep.c),
> other CPUs continue to work and may modify page mappings of
> processes. This in turn causes modifications to
> pv_entries, which in turn modify the bitmap of pages to
> dump. As a result, we can dump more pages than we
> calculated, and since dumps are written to the end of the
> dump device, we may end up writing past its end.
> The attached patch mitigates the problem, but the real solution
> seems to be to disable interrupts (there's an XXX about this
> in kern_shutdown.c before calling doadump()) and to stop the
> other CPUs, so we don't modify page tables while we're dumping.
> This only affects 7.x/8.x amd64 SMP systems configured with
> minidump. i386 systems aren't affected.
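To make the race described above concrete, here is a small user-space
sketch. It is not the code from minidump_machdep.c: the 64-bit bitmap,
the simulate_dump() function, the dump_result struct and the injected
"other CPU" update are all hypothetical names made up for illustration.
It sizes the dump from a bitmap, places it at the end of the device,
and then walks the bitmap again to write the pages. If bits get set
between the two passes, the second pass writes more than was sized.
The optional bounds check is only one conceivable mitigation and is
not necessarily what the attached patch does.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL
#define NPAGES    64u   /* hypothetical tiny "physical memory" */

struct dump_result {
    uint64_t start;     /* device offset where the dump begins */
    uint64_t overrun;   /* bytes written past the end of the device */
    int      error;     /* 1 if the bounds check aborted the dump */
};

/* Pass 1 helper: count the pages currently marked for dumping. */
static unsigned
count_pages(uint64_t bitmap)
{
    unsigned n = 0;

    while (bitmap != 0) {
        bitmap &= bitmap - 1;   /* clear the lowest set bit */
        n++;
    }
    return (n);
}

/*
 * Simulate a dump onto a device of devsize bytes.
 * racing_cpu: if nonzero, two extra pages get marked mid-dump,
 *             standing in for another CPU changing mappings.
 * bounded:    if nonzero, refuse to write past the device end.
 */
static struct dump_result
simulate_dump(uint64_t devsize, uint64_t bitmap, int racing_cpu, int bounded)
{
    struct dump_result r = { 0, 0, 0 };
    uint64_t dumpsize, offset;
    unsigned i;

    /* Pass 1: size the dump and place it at the end of the device. */
    dumpsize = (uint64_t)count_pages(bitmap) * PAGE_SIZE;
    assert(dumpsize <= devsize);
    r.start = offset = devsize - dumpsize;

    /* Pass 2: "write" every page that is marked in the bitmap now. */
    for (i = 0; i < NPAGES; i++) {
        if (racing_cpu && i == 8)
            bitmap |= (1ULL << 40) | (1ULL << 41);
        if ((bitmap & (1ULL << i)) == 0)
            continue;
        if (bounded && offset + PAGE_SIZE > devsize) {
            r.error = 1;        /* would run off the device */
            break;
        }
        offset += PAGE_SIZE;    /* stands in for the actual write */
    }
    if (offset > devsize)
        r.overrun = offset - devsize;
    return (r);
}
```

With four pages marked initially, simulate_dump(1 << 20, 0x0f, 1, 0)
runs two pages past the end of the device, while the same call with
bounded set to 1 stops with an error instead. Stopping the other CPUs
corresponds to racing_cpu being 0, in which case the two passes agree
and no check is needed.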
Is this a case where you are manually triggering a dump on a
system that is otherwise running fine? I thought that crashes
already disabled interrupts and made an attempt to stop other
CPUs. That's why there is dump-specific code in every storage
driver in the first place; it implements polled I/O so that
crashdump I/O can take place with interrupts disabled. If it's
a case where interrupts aren't actually getting disabled, then
that's one thing. If it's a case where you're trying to fix
something that isn't broken, then I'm very cautious about the
added complexity that you're proposing.