Re: Looking for rationale for the minidump format

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Tue, 23 Nov 2021 16:31:02 UTC

On 11/23/21 1:11 AM, Michał Górny wrote:
> On Mon, 2021-11-22 at 09:47 -0800, John Baldwin wrote:
>> On 11/21/21 6:42 AM, Michał Górny wrote:
>>> Hi, everyone.
>>>
>>> As part of the work contracted by the FreeBSD Foundation, I'm working
>>> on adding explicit minidump support to LLDB.  When discussing
>>> the options with upstream, I've been asked why FreeBSD created their own
>>> minidump format.
>>>
>>> I did a bit of digging but TBH all the rationale I could get was to
>>> create partial memory dumps.  However, unless I'm mistaken the ELF
>>> format is perfectly capable of that -- e.g. via creating an explicit
>>> segment for every contiguous active region.
>>>
>>> Does anyone happen to know what the original rationale for creating
>>> a custom file format was, or know where to find one?  Thanks in advance.
>>
>> The direct map aliases pages mapped via kmem.  You'd be double dumping
>> all the data mapped into kmem, once for the direct map and once for the
>> non-direct mappings.
>>
>> You can think of minidumps as being a dump of physical memory, whereas
>> an ELF core for a virtually-mapped kernel wants to dump virtual memory,
>> and there is the disconnect.
>>
>> [...]
>>
>> You could perhaps imagine something similar where you had an ELF core
>> with physical memory for PT_LOAD instead of virtual and a way to hint that
>> so that the debugger would handle all the virtual -> PA translation, but
>> you'd still need some home-grown notes for some of the other metadata we
>> pass along (like the message buffer, etc.).  Also, changing the format
>> doesn't help with reading existing crash dumps.
>>
> 
> Thank you for your reply.  If I understand correctly, you're comparing
> minidump with a "proper" (i.e. virtual memory-based) ELF core.  However,
> the "full memory dump" ELF core also uses a physical memory map model, is
> that correct?  Does that mean that using a different core format makes
> it clear that it's a physical memory dump and not virtual?

I think so, yes.

> That said, please correct me if I'm mistaken but I think we should be
> able to create a "virtual memory mapped" ELF core without too much
> duplication.  We could create multiple segments with different p_vaddr
> values but the same file p_offset (and maybe p_paddr), correct?  I'm not
> advocating for changing the format, just trying to improve my knowledge.

Humm, we could perhaps do that to avoid duplicate data, but that would be
a _lot_ of PT_LOADs.  Every physical discontinuity in kmem would generate
another PT_LOAD.  I fear you might have hundreds or thousands of those, but
we wouldn't really know without mocking it up and trying, I think.  You
could perhaps simulate it for now by writing a tool to convert an existing
vmcore to a "fat ELF", rather than having to change the kernel.
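
To make the segment-count concern concrete, here is a minimal Python sketch
of the counting step such a converter would need.  It is not an existing
tool and does not parse a real vmcore: the page table is a plain dict of
virtual page -> physical page, and the names (coalesce, file_offsets,
Segment), the 4 KiB page size, and the base addresses are all assumptions
made up for illustration.  A run of pages becomes one PT_LOAD only while it
stays contiguous in both virtual and physical space, and aliased mappings
share file offsets so each physical page is stored once.

```python
from typing import Dict, List, NamedTuple

PAGE_SIZE = 4096  # assumption: 4 KiB pages

# Illustrative base addresses only; not taken from any real kernel layout.
DMAP_BASE = 0xFFFFF80000000000
KMEM_BASE = 0xFFFFFE0000000000


class Segment(NamedTuple):
    """One would-be PT_LOAD entry (hypothetical, simplified)."""
    vaddr: int
    paddr: int
    size: int


def coalesce(mappings: Dict[int, int]) -> List[Segment]:
    """Merge page mappings (vaddr -> paddr) into maximal runs that are
    contiguous in both virtual and physical address space."""
    segments: List[Segment] = []
    for va in sorted(mappings):
        pa = mappings[va]
        if segments:
            last = segments[-1]
            if last.vaddr + last.size == va and last.paddr + last.size == pa:
                # Extends the previous run: no new PT_LOAD needed.
                segments[-1] = last._replace(size=last.size + PAGE_SIZE)
                continue
        # Any virtual or physical discontinuity starts a new PT_LOAD.
        segments.append(Segment(va, pa, PAGE_SIZE))
    return segments


def file_offsets(segments: List[Segment], data_start: int = 0) -> List[int]:
    """Assign a p_offset to each segment, storing every physical page once
    (packed in ascending physical order) so aliases share file data."""
    pages = sorted({seg.paddr + off
                    for seg in segments
                    for off in range(0, seg.size, PAGE_SIZE)})
    index = {pa: n for n, pa in enumerate(pages)}
    return [data_start + index[seg.paddr] * PAGE_SIZE for seg in segments]


def demo_mappings() -> Dict[int, int]:
    """Toy page table: a direct map of four physical pages plus a kmem
    region that aliases three of them in a different order."""
    phys = [0x1000, 0x2000, 0x3000, 0x5000]
    mappings = {DMAP_BASE + pa: pa for pa in phys}
    for i, pa in enumerate([0x3000, 0x1000, 0x2000]):
        mappings[KMEM_BASE + i * PAGE_SIZE] = pa
    return mappings


if __name__ == "__main__":
    segs = coalesce(demo_mappings())
    for seg, off in zip(segs, file_offsets(segs)):
        print(f"vaddr={seg.vaddr:#x} paddr={seg.paddr:#x} "
              f"size={seg.size:#x} offset={off:#x}")
```

In this toy mapping, seven virtual pages collapse into four segments backed
by four stored physical pages.  Feeding the same counting step with the page
tables from a real vmcore would be one way to find out whether the PT_LOAD
count actually lands in the hundreds or thousands.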

-- 
John Baldwin