FYI: textdump MFC to RELENG_7 over the next week or two

Robert Watson rwatson at FreeBSD.org
Mon Mar 24 05:28:30 PDT 2008


Dear 7-stable users:

Now that textdump support has settled for three months in 8-current, I'm going 
to begin MFC'ing it from HEAD to RELENG_7 over the next week or two. 
Textdumps come in a number of parts, each of which will be merged followed by 
a day or two of settling time:  DDB output capture, DDB scripting, and then 
finally textdump support itself, which also requires changes to savecore(8).

Once the MFC is done, I'll forward out the textdump Q&A I sent to current@ a 
few months ago which gives some ideas for how to use the various parts, which 
can be combined to give textdump support, or used separately for other sorts 
of debugging.  Hands to help update the kernel debugging section of the 
handbook to include information on textdumps, not to mention other 
improvements in kernel debugging in the last few years, would be most welcome.

I've received some requests to MFC textdumps to RELENG_6.  Once support is 
fully merged to RELENG_7, I'll wait a bit and then look at how difficult that 
would be.  My guess is that it will be relatively straightforward and as such 
will follow a month or so later.

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge

---------- Forwarded message ----------
Date: Tue, 18 Dec 2007 12:10:46 +0000 (GMT)
From: Robert Watson <rwatson at FreeBSD.org>
To: arch at FreeBSD.org
Cc: current at FreeBSD.org
Subject: DDB scripting, output capture, and textdumps


Dear all:

I've been hacking on-and-off for a while on a side project to improve our 
kernel debugging facilities.  Primarily, my concern has been to address three 
problems:

- The complications of employing kernel core dumps for debugging,
   including the large size of dumps making them unwieldy to distribute or
   store for any extended period (even with minidumps), the requirement to
   have relatively synchronized kernel source in order to use the dumps,
   the need to have a kernel with debugging symbols, and the problems with
   fsck causing sufficient swap use to invalidate dumps before they can be
   extracted.

- The decreasing likelihood that notebooks will ship with serial ports
   that can be used for interactive debugging using DDB.  Making end-users
   type in stack traces is cruel, photos are a pain, and X11 rules out
   both.

- The fact that a great many problems are most easily diagnosed using
   utility routines present in DDB, but not as easily using kgdb for
   offline analysis.  I find that for many bugs I analyze, simply looking
   at the DDB output is sufficient to identify the source of the problem.

An idea I punted around a bit at BSDCan earlier this year (or perhaps it was at 
EuroBSDCon the previous year) was that of a "textdump" -- that is, a new 
type of kernel dump based on capturing automatically extracted debugging 
information generated by DDB.  The result would be an ASCII text file that 
could be filed as a bug report, perhaps even automatically.

To this end, I have implemented three new facilities for use with DDB:

(1) DDB output capture.  The output of DDB is stored in a memory buffer,
     and can be extracted using a sysctl or textdumps (see below).  This
     can be turned on and off, both for use manually ("I'll want this
     later, but not that") and as part of scripts (see below).

(2) DDB scripting.  A limited number of named scripts can be defined to
     run a series of DDB commands.  No loops, etc, just simple command
     lists.  These can be caused to run automatically on entering DDB for
     various scenarios, including WITNESS violations and kernel panics.
     They can also be run by hand in order to save a bit of typing if you
     use DDB in a repetitive way (as I do).

(3) Textdumps.  A new dump type that stores a series of data files
     containing various pieces of information, including the DDB capture
     buffer, kernel message buffer, kernel configuration (if compiled into
     the kernel), panic message, and kernel version string.  These are
     stored in the ustar format inside the dump partition (aligned to the
     end), so savecore(8) requires almost no new logic to deal with them
     (it just drops numbered tar files in /var/crash).  This makes it
     straightforward to extend the textdump format to include new types of
     information, and avoids the issue of how to safely represent
     information in many different formats simultaneously in the same file.
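Since a saved textdump is just a ustar archive, ordinary tar tooling should
be enough to look inside one.  Below is a minimal Python sketch of that idea,
not part of the patch: the helper names read_textdump and
make_sample_textdump are hypothetical, and the member names panic.txt and
version.txt are assumptions made for illustration.  The sample archive is
built in memory so the sketch runs without a real dump.

```python
import io
import tarfile


def read_textdump(fileobj):
    """Return {member name: decoded text} for each regular file in the archive."""
    contents = {}
    with tarfile.open(fileobj=fileobj, mode="r:") as tar:
        for member in tar:
            if not member.isfile():
                continue
            data = tar.extractfile(member).read()
            # Textdump members are expected to be plain text; replace anything odd.
            contents[member.name] = data.decode("ascii", errors="replace")
    return contents


def make_sample_textdump(files):
    """Build an in-memory ustar archive standing in for a file saved by savecore(8)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, text in files.items():
            payload = text.encode("ascii")
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # Member names below are assumed, not taken from the patch.
    sample = make_sample_textdump({
        "panic.txt": "example panic message\n",
        "version.txt": "example kernel version string\n",
    })
    for name, text in read_textdump(sample).items():
        print(name, len(text), "bytes")
```

To try it on a real dump, pass an open binary file object for one of the
numbered tar files in /var/crash to read_textdump instead of the sample.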

These are pretty flexible tools, and you can imagine doing the following sorts 
of things:

- Setting the kdb.enter.panic script to automatically turn on output
   capture, do full backtraces of all threads, show open file information,
   dump UMA stats, and save it all to a textdump and then reboot.

- Setting the kdb.enter.witness script to show lock information, generate
   a coredump, and reboot.  Or, just to automatically do "show allocks" and
   drop to the DDB prompt.

- Adding a flag to rc.conf to automatically submit textdumps via e-mail to
   a specific address, perhaps including GNATS or an automated bug system.
   These could be unpacked and automatically analyzed, and due to the compact
   size, kept for long-term trend analysis or to identify when a problem
   started occurring.
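
As a rough illustration of the first scenario, here is a small Python sketch
that assembles a ddb(8) invocation defining a kdb.enter.panic script.  It
only builds and prints the command line; nothing is executed.  The
"ddb script name=commands" form, the semicolon separator, and the individual
DDB command names in the list are assumptions made for illustration and may
not match the snapshot exactly; ddb_script_argv is a hypothetical helper.
How a textdump is selected in place of a regular dump is not shown here.

```python
import shlex


def ddb_script_argv(name, commands):
    """Build an argv list that would define a named DDB script via ddb(8)."""
    for command in commands:
        # Commands are joined with ';', so one containing ';' would be split.
        if ";" in command:
            raise ValueError(f"command must not contain ';': {command!r}")
    return ["ddb", "script", f"{name}={'; '.join(commands)}"]


# Assumed DDB command names, one per step described in the scenario above.
PANIC_COMMANDS = [
    "capture on",    # turn on output capture
    "alltrace",      # backtraces of all threads
    "show files",    # open file information
    "show uma",      # UMA statistics
    "call doadump",  # write out the dump
    "reset",         # reboot
]


if __name__ == "__main__":
    # Print a shell-quoted command line for review rather than running it.
    print(shlex.join(ddb_script_argv("kdb.enter.panic", PANIC_COMMANDS)))
```

Keeping the commands as a list makes it easy to maintain a shorter variant
for kdb.enter.witness alongside the panic script.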

I've produced an initial snapshot of the above, which can be found here:

   http://www.watson.org/~robert/freebsd/20071218-ddb.tgz

This adds three files to DDB, patches quite a few kernel files (to pass more 
information into KDB about why it's being entered, in order to trigger the 
right script), enhances savecore(8) so that it knows how to extract textdumps, 
adds a ddb(8) command line tool so that userspace can manage DDB scripts from 
outside the debugger, extends the ddb(4) man page, and adds a new textdump(4) 
man page.

There are a number of known limitations; I've tried to document them at the top 
of the pertinent files where I am aware of them.  I also regret to say that to 
date I've been able to test only on i386, and not other platforms.  I'd welcome 
any feedback -- I'd like to get these changes into CVS in the next week or two.

Robert N M Watson
Computer Laboratory
University of Cambridge

