From nobody Tue Nov 23 16:31:02 2021
X-Original-To: freebsd-hackers@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 14EA5188E709
	for <freebsd-hackers@mlmmj.nyi.freebsd.org>; Tue, 23 Nov 2021 16:31:04 +0000 (UTC)
	(envelope-from jhb@FreeBSD.org)
Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
	 client-signature RSA-PSS (4096 bits) client-digest SHA256)
	(Client CN "smtp.freebsd.org", Issuer "R3" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id 4Hz8lW6BhNz3lpn;
	Tue, 23 Nov 2021 16:31:03 +0000 (UTC)
	(envelope-from jhb@FreeBSD.org)
Received: from [10.0.1.4] (ralph.baldwin.cx [66.234.199.215])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(Client did not present a certificate)
	(Authenticated sender: jhb)
	by smtp.freebsd.org (Postfix) with ESMTPSA id 4A01A23E3E;
	Tue, 23 Nov 2021 16:31:03 +0000 (UTC)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <19c441de-69ee-7c49-60ad-60aa154efd82@FreeBSD.org>
Date: Tue, 23 Nov 2021 08:31:02 -0800
List-Id: Technical discussions relating to FreeBSD <freebsd-hackers.freebsd.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
List-Help: <mailto:freebsd-hackers+help@freebsd.org>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Subscribe: <mailto:freebsd-hackers+subscribe@freebsd.org>
List-Unsubscribe: <mailto:freebsd-hackers+unsubscribe@freebsd.org>
Sender: owner-freebsd-hackers@freebsd.org
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:91.0)
 Gecko/20100101 Thunderbird/91.3.0
Subject: Re: Looking for rationale for the minidump format
Content-Language: en-US
To: =?UTF-8?B?TWljaGHFgiBHw7Nybnk=?= <mgorny@moritz.systems>,
 freebsd-hackers@FreeBSD.org
Cc: emaste@FreeBSD.org, peter@FreeBSD.org
References: <305082d9f216bb8382773c074eaf7a5c3101cc13.camel@moritz.systems>
 <cd2f2575-db5b-db6a-db9d-1575d309a74f@FreeBSD.org>
 <4b1b49da6c83165fbad3c4804c1394473e1b67da.camel@moritz.systems>
From: John Baldwin <jhb@FreeBSD.org>
In-Reply-To: <4b1b49da6c83165fbad3c4804c1394473e1b67da.camel@moritz.systems>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 11/23/21 1:11 AM, Michał Górny wrote:
> On Mon, 2021-11-22 at 09:47 -0800, John Baldwin wrote:
>> On 11/21/21 6:42 AM, Michał Górny wrote:
>>> Hi, everyone.
>>>
>>> As part of the work contracted by the FreeBSD Foundation, I'm working
>>> on adding explicit minidump support to LLDB.  When discussing
>>> the options with upstream, I've been asked why FreeBSD created their own
>>> minidump format.
>>>
>>> I did a bit of digging but TBH all the rationale I could get was to
>>> create partial memory dumps.  However, unless I'm mistaken the ELF
>>> format is perfectly capable of that -- e.g. via creating an explicit
>>> segment for every contiguous active region.
>>>
>>> Does anyone happen to know what the original rationale for creating
>>> a custom file format was, or know where to find one?  Thanks in advance.
>>
>> The direct map aliases pages mapped via kmem.  You'd be double dumping
>> all the data mapped into kmem, once for the direct map and once for the
>> non-direct mappings.
>>
>> You can think of minidumps as being a dump of physical memory, whereas
>> an ELF core for a virtually-mapped kernel wants to dump virtual memory,
>> and there is the disconnect.
>>
>> [...]
>>
>> You could perhaps imagine something similar where you had an ELF core
>> with physical memory for PT_LOAD instead of virtual and a way to hint that
>> so that the debugger would handle all the virtual -> PA translation, but
>> you'd still need some home-grown notes for some of the other metadata we
>> pass along (like the message buffer, etc.).  Also, changing the format
>> doesn't help with reading existing crash dumps.
>>
> 
> Thank you for your reply.  If I understand correctly, you're comparing
> minidump with a "proper" (i.e. virtual memory-based) ELF core.  However,
> the "full memory dump" ELF core also uses a physical memory map model, is
> that correct?  Does that mean that using a different core format makes
> it clear that it's a physical memory dump and not virtual?

I think so, yes.

> That said, please correct me if I'm mistaken but I think we should be
> able to create a "virtual memory mapped" ELF core without too much
> duplication.  We could create multiple segments with different p_vaddr
> values but the same file p_offset (and maybe p_paddr), correct?  I'm not
> advocating for changing the format, just trying to improve my knowledge.

Hmm, we could perhaps do that to avoid duplicate data, but that would be
a _lot_ of PT_LOADs.  Every physical discontinuity in kmem would generate
another PT_LOAD.  I fear you might have hundreds or thousands of those, but
we wouldn't really know without mocking it up and trying, I think.  You
could perhaps simulate it by writing a tool that converts an existing
vmcore to a "fat ELF" for now, rather than having to change the kernel.
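
As a rough illustration of the counting involved, here is a minimal Python
sketch (not an existing tool).  It assumes you already have a list of
(vaddr, paddr) page mappings pulled out of a vmcore; the base addresses, the
4 KiB page size, the data_start value and both function names are made up for
the example.  It merges pages that are contiguous both virtually and
physically into one would-be PT_LOAD, and derives p_offset from the physical
page so that aliased mappings share file data.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def coalesce_segments(mappings, page_size=PAGE_SIZE):
    """Merge (vaddr, paddr) page mappings into runs that are contiguous in
    both virtual and physical space; each run would need its own PT_LOAD."""
    segments = []
    for vaddr, paddr in sorted(mappings):
        if segments:
            seg = segments[-1]
            if (seg["vaddr"] + seg["size"] == vaddr
                    and seg["paddr"] + seg["size"] == paddr):
                seg["size"] += page_size  # extends the current run
                continue
        # Any virtual or physical discontinuity starts a new segment.
        segments.append({"vaddr": vaddr, "paddr": paddr, "size": page_size})
    return segments


def assign_file_offsets(segments, data_start=0x1000, page_size=PAGE_SIZE):
    """Store each physical page once, in ascending physical order, and point
    every segment's offset at the single copy of its first page."""
    pages = set()
    for seg in segments:
        pages.update(range(seg["paddr"], seg["paddr"] + seg["size"], page_size))
    rank = {pa: i for i, pa in enumerate(sorted(pages))}
    for seg in segments:
        seg["offset"] = data_start + rank[seg["paddr"]] * page_size
    return segments, len(pages) * page_size


# Hypothetical layout: four physical pages visible through a direct map and
# again, in scrambled order, through a kmem-style virtual range.
DMAP_BASE = 0xFFFFF80000000000  # illustrative value only
KMEM_BASE = 0xFFFFFE0000000000  # illustrative value only
phys = [0x10000, 0x11000, 0x12000, 0x13000]
mappings = [(DMAP_BASE + pa, pa) for pa in phys]
mappings += [(KMEM_BASE + i * PAGE_SIZE, pa)
             for i, pa in enumerate([0x12000, 0x10000, 0x11000, 0x13000])]

segments, data_bytes = assign_file_offsets(coalesce_segments(mappings))
for seg in segments:
    print(f"p_vaddr={seg['vaddr']:#x} p_paddr={seg['paddr']:#x} "
          f"p_offset={seg['offset']:#x} p_filesz={seg['size']:#x}")
print(f"{len(segments)} segments, {data_bytes} bytes of page data")
```

Feeding it the real page tables from an existing vmcore would give an actual
segment count instead of a guess.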

-- 
John Baldwin