From: John Baldwin
Date: Tue, 23 Nov 2021 08:31:02 -0800
Subject: Re: Looking for rationale for the minidump format
To: Michał Górny, freebsd-hackers@FreeBSD.org
Cc: emaste@FreeBSD.org, peter@FreeBSD.org
Message-ID: <19c441de-69ee-7c49-60ad-60aa154efd82@FreeBSD.org>
In-Reply-To: <4b1b49da6c83165fbad3c4804c1394473e1b67da.camel@moritz.systems>
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

On 11/23/21 1:11 AM, Michał Górny wrote:
> On Mon, 2021-11-22 at 09:47 -0800, John Baldwin wrote:
>> On 11/21/21 6:42 AM, Michał Górny wrote:
>>> Hi, everyone.
>>>
>>> As part of the work contracted by the FreeBSD Foundation, I'm working
>>> on adding explicit minidump support to LLDB.  When discussing
>>> the options with upstream, I've been asked why FreeBSD created their own
>>> minidump format.
>>>
>>> I did a bit of digging but TBH all the rationale I could get was to
>>> create partial memory dumps.  However, unless I'm mistaken the ELF
>>> format is perfectly capable of that -- e.g. via creating an explicit
>>> segment for every continuous active region.
>>>
>>> Does anyone happen to know what the original rationale for creating
>>> a custom file format was, or know where to find one?  Thanks in advance.
>>
>> The direct map aliases pages mapped via kmem.  You'd be double dumping
>> all the data mapped into kmem, once for the direct map and once for the
>> non-direct mappings.
>>
>> You can think of minidumps as being a dump of physical memory, whereas
>> an ELF core for a virtually-mapped kernel wants to dump virtual memory,
>> and there is the disconnect.
>>
>> [...]
>>
>> You could perhaps imagine something similar where you had an ELF core
>> with physical memory for PT_LOAD instead of virtual and a way to hint that
>> so that the debugger would handle all the virtual -> PA translation, but
>> you'd still need some home-grown notes for some of the other metadata we
>> pass along (like the message buffer, etc.).  Also, changing the format
>> doesn't help with reading existing crash dumps.
>>
>
> Thank you for your reply.  If I understand correctly, you're comparing
> minidump with a "proper" (i.e. virtual memory-based) ELF core.  However,
> the "full memory dump" ELF core also uses physical memory map model, is
> that correct?  Does that mean that using a different core format makes
> it clear that it's a physical memory dump and not virtual?

I think so, yes.

> That said, please correct me if I'm mistaken but I think we should be
> able to create a "virtual memory mapped" ELF core without too much
> duplication.  We could create multiple segments with different p_vaddr
> values but the same file p_offset, correct (and maybe p_paddr)?  I'm not
> advocating for changing the format, just trying to improve my knowledge.

Hmm, we could perhaps do that to avoid duplicate data, but that would be
a _lot_ of PT_LOADs.  Every physical discontinuity in kmem would generate
another PT_LOAD.  I fear you might have hundreds or thousands of those, but
we wouldn't really know without mocking it up and trying, I think.
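
To make the "one PT_LOAD per physical discontinuity" point concrete, here is
a rough Python sketch of the counting step.  It is only an illustration and
not anything from the kernel or libkvm: the 4 KiB page size, the addresses,
and the names Segment and coalesce are all made up for the example.  It
merges per-page (vaddr, paddr) mappings into runs that are contiguous in
both address spaces; each run would need its own program header.

```python
from typing import NamedTuple

PAGE_SIZE = 4096  # assumed page size, for illustration only


class Segment(NamedTuple):
    """One would-be PT_LOAD: a run contiguous in both VA and PA."""
    vaddr: int
    paddr: int
    size: int


def coalesce(mappings):
    """Merge (vaddr, paddr) page mappings into maximal contiguous runs.

    A new segment starts whenever either the virtual or the physical
    address stops being contiguous with the previous page.
    """
    segments = []
    for vaddr, paddr in sorted(mappings):
        if segments:
            last = segments[-1]
            if (last.vaddr + last.size == vaddr
                    and last.paddr + last.size == paddr):
                # Extend the current run by one page.
                segments[-1] = last._replace(size=last.size + PAGE_SIZE)
                continue
        segments.append(Segment(vaddr, paddr, PAGE_SIZE))
    return segments


# Hypothetical kmem mappings: four virtually contiguous pages whose
# backing physical pages have one discontinuity after the second page.
KVA = 0xFFFFFE0000000000
mappings = [
    (KVA + 0 * PAGE_SIZE, 0x100000),
    (KVA + 1 * PAGE_SIZE, 0x101000),
    (KVA + 2 * PAGE_SIZE, 0x500000),
    (KVA + 3 * PAGE_SIZE, 0x501000),
]

segments = coalesce(mappings)
for seg in segments:
    print(f"PT_LOAD vaddr={seg.vaddr:#x} paddr={seg.paddr:#x} "
          f"memsz={seg.size:#x}")
```

With these made-up mappings the four pages collapse into two segments, split
at the physical discontinuity.  Feeding it the real mappings from a dump
would give a first estimate of how many PT_LOADs such an ELF core would need.
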
You could simulate it perhaps by just writing a tool to convert an existing
vmcore to a "fat ELF" for now vs having to change the kernel.

-- 
John Baldwin