Reliable panic: dump -L on a gjournalled file system

Sun Apr 18 23:42:25 UTC 2010

Hi all,

I finally got jack of waiting for fsck on my large-ish /usr partition, so the weekend before last I re-organized my disks so that
I have all of my "big" partitions gjournalled.  Ever since then I've woken in the morning to discover my system hung and
unresponsive, and the nightly backup not done.  This weekend I went to run the backup manually, and I did it from the text
console, rather than the GUI, so that I could see any last gasps from the system.  And lo: it did panic.  (Note to self: having
the kernel set to fall into DDB on panic isn't a good idea when running an X11 workstation because once the kernel is waiting for
user input on the non-visible VGA console screen, all is lost.)

Anyway, here's what the panic said:

panic: Journal overflow (joffset=982329115136 active=982329273344 inactive=892328975360)
cpuid=1
KDB: enter: panic
[thread 15 tid 100049]
Stopped at kdb_enter+0x3d: movq $0,0x6ba080(%rip)
> where
kdb_enter()
panic()
g_journal_flush()
g_journal_worker()
fork_exit()
fork_trampoline()
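
As a rough sanity check on those three numbers, here is a small sketch that works out how far apart they are.  It assumes the
values are byte offsets and that the provider uses 512-byte sectors; neither is confirmed against the gjournal source, and the
function name is just an illustrative one.

```python
# Values copied verbatim from the panic message above.
JOFFSET = 982329115136
ACTIVE = 982329273344
INACTIVE = 892328975360

# Assumption: the values are byte offsets and sectors are 512 bytes.
SECTOR_SIZE = 512


def journal_gaps(joffset, active, inactive, sector_size=SECTOR_SIZE):
    """Return the distances between the three offsets reported by the panic."""
    return {
        # How far joffset sits below the active journal offset, in bytes.
        "active_minus_joffset": active - joffset,
        # How far the inactive journal offset sits below the active one, in bytes.
        "active_minus_inactive": active - inactive,
        # The first gap expressed in (assumed) 512-byte sectors.
        "active_minus_joffset_sectors": (active - joffset) / sector_size,
    }


if __name__ == "__main__":
    for name, value in journal_gaps(JOFFSET, ACTIVE, INACTIVE).items():
        print(f"{name}: {value}")
```

Under those assumptions, joffset sits 158208 bytes (309 sectors) below active, while inactive is about 90 GB below both.  What
that says about the cause depends on how gjournal defines these fields, which I haven't checked.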

I thought: a-ha! dump must be such a high data-rate process that I'm overfilling the journal before it can be flushed, so I
re-formatted my backup drive without a journal and tried again.  Same panic.

So my conclusion is that the failure is coming from the journal of the file system that I'm *dumping*, rather than the target, and
is caused by the file system snapshot that dump takes when given the -L argument.  I tested that theory yesterday by successfully
doing a backup from single-user mode with the /usr file system off-line and no -L argument to dump.

Is this just pilot error, or a known problem with gjournal?  How is everyone else dumping their journalled file systems?

I have another question to ask about gjournal, but I'll ask it in another message...

Cheers,

-- 
Andrew