kern/164252: [geom] gjournal overflow
Andreas Longwitz
longwitz at incore.de
Wed May 9 16:20:12 UTC 2012
The following reply was made to PR kern/164252; it has been noted by GNATS.
From: Andreas Longwitz <longwitz at incore.de>
To: bug-followup at freebsd.org
Cc:
Subject: Re: kern/164252: [geom] gjournal overflow
Date: Wed, 09 May 2012 18:12:51 +0200
The panic "gjournal overflow" is caused by design problems of snapshots
and/or gjournal on big disks (1 TB or more). Each journaled partition is
served by its own kernel thread "g_journal", and a single kernel thread
"g_journal switcher" is responsible for all journaled partitions. Aside
from mount/umount, the g_journal switcher and the ffs snapshot code are the
only kernel threads that call vfs_write_suspend() and hold the lock "suspwt".
After starting a snapshot on a big disk with
mksnap_ffs /backup/.snap/snapshot
the lock "suspwt" is acquired and held for a long time. A few seconds
later the g_journal switcher tries to switch the journal of the backup
partition and blocks on the "suspwt" lock. It therefore cannot service
the other journaled partitions anymore; it must wait until the snapshot
releases the "suspwt" lock.
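The blocking pattern described above can be illustrated with a small user-space toy model. This is only a sketch, not kernel code: the names simulate, suspend_lock and the partition labels "backup" and "other" are made up for illustration, and an ordinary Python lock stands in for the "suspwt" lock.

```python
import threading
import time


def simulate(hold_seconds=0.2):
    """Toy model: one switcher thread serves all journals in turn."""
    suspend_lock = threading.Lock()  # stands in for "suspwt" on /backup
    lock_taken = threading.Event()
    serviced = {}                    # partition -> seconds until serviced

    def snapshot():
        # The snapshot takes the lock and keeps it for a long time.
        with suspend_lock:
            lock_taken.set()
            time.sleep(hold_seconds)

    def switcher(start):
        # A single thread walks over all journaled partitions.
        for name in ("backup", "other"):
            if name == "backup":
                with suspend_lock:   # blocks until the snapshot is done
                    pass
            serviced[name] = time.monotonic() - start

    snap = threading.Thread(target=snapshot)
    snap.start()
    lock_taken.wait()                # make sure the snapshot holds the lock
    start = time.monotonic()
    sw = threading.Thread(target=switcher, args=(start,))
    sw.start()
    sw.join()
    snap.join()
    return serviced


if __name__ == "__main__":
    for name, delay in simulate().items():
        print(f"{name}: serviced after {delay:.3f}s")
```

In this model the "other" partition is not touched by the snapshot at all, yet it is serviced only after the lock on "backup" has been released, because the one switcher thread is stuck in front of it.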
On my test server (FreeBSD 8.3) /backup is on /dev/mirror/gm2p2
(1.8 TB), and kern.geom.journal.debug=1 gives
May 9 09:59:47 : Data has been copied.
May 9 09:59:47 : Msync time of /backup: 0.031111s
May 9 09:59:47 : Sync time of /backup: 0.086015s
May 9 09:59:47 : Cache flush time: 0.000705s
May 9 09:59:47 : BIO_FLUSH time of mirror/gm2p2: 0.015049s
May 9 10:04:12 : Journal mirror/gm2p2 71% full, forcing journal switch.
May 9 10:17:48 : Suspend time of /backup: 1080.955120s
May 9 10:17:48 : Starting copy of journal.
May 9 10:17:48 : Cache flush time: 0.013182s
May 9 10:17:48 : Cache flush time: 0.027241s
May 9 10:17:48 : Switch time of mirror/gm2p2: 0.206213s
May 9 10:17:48 : Entire switch time: 1081.589788s
The critical "Suspend time" was 18 minutes, with no I/O on any other
partition. The same test with some I/O on another partition triggers
the panic immediately, because my journal is not big enough to hold
18 minutes of I/O.
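As a rough sketch of the arithmetic behind this, the following takes two lines from the debug output above, extracts the times, and estimates how much journal space would fill up while the switcher is blocked. The write rate of 5 MiB/s is purely an assumption for illustration and is not a measurement from this report; the helper names seconds_after and journal_bytes_needed are made up as well.

```python
import re

# Two lines copied from the debug output quoted above.
LOG = """\
May 9 10:17:48 : Suspend time of /backup: 1080.955120s
May 9 10:17:48 : Entire switch time: 1081.589788s
"""


def seconds_after(label, text):
    """Return the number in '<label>...: <number>s', or None if absent."""
    match = re.search(re.escape(label) + r"[^:\n]*: ([0-9.]+)s", text)
    return float(match.group(1)) if match else None


def journal_bytes_needed(write_rate_bytes_per_s, stall_seconds):
    """Data that piles up in a journal while no switch can happen."""
    return write_rate_bytes_per_s * stall_seconds


suspend = seconds_after("Suspend time", LOG)
total = seconds_after("Entire switch time", LOG)

HYPOTHETICAL_RATE = 5 * 1024 * 1024  # 5 MiB/s, assumed, not from the report
needed = journal_bytes_needed(HYPOTHETICAL_RATE, suspend)

print(f"suspend: {suspend / 60:.1f} min, {suspend / total:.2%} of the switch")
print(f"journal needed at 5 MiB/s: {needed / 2**30:.1f} GiB")
```

The point of the estimate is only that the required journal size grows linearly with the suspend time, so a stall of many minutes needs far more space than one of a few seconds at the same write rate.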
The same problem occurs when removing the snapshot: the g_journal switcher
waits for 10 minutes on the "ufs" lock.
The same test on a 1 TB disk drops the "Suspend time" to 190 seconds for
the snapshot and 12 seconds for the remove.
My conclusion is that taking a snapshot (as used by dump -L) on a journaled
partition is not safe when the "Suspend time" for the biggest journaled
partition exceeds about 20 seconds.
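To put the numbers from this message side by side, here is a short sketch that compares each reported wait against the limit of about 20 seconds. The limit is an empirical figure for this setup and not a general constant; the names MEASURED, SAFE_LIMIT_S and is_safe are made up for illustration.

```python
# Suspend/wait times reported in this message, in seconds.
MEASURED = {
    "1.8 TB snapshot": 1080.955120,
    "1.8 TB snapshot removal": 10 * 60,  # "10 minutes", rounded in the report
    "1 TB snapshot": 190,
    "1 TB snapshot removal": 12,
}

SAFE_LIMIT_S = 20  # "about 20 seconds"; empirical for this setup only


def is_safe(seconds, limit=SAFE_LIMIT_S):
    """True if the switcher would be blocked no longer than the limit."""
    return seconds <= limit


for case, seconds in MEASURED.items():
    verdict = "ok" if is_safe(seconds) else "risk of journal overflow"
    print(f"{case:26s} {seconds:9.1f}s  {verdict}")
```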
--
Dr. Andreas Longwitz
Data Service GmbH
Beethovenstr. 2A
23617 Stockelsdorf
Amtsgericht Lübeck, HRB 318 BS
Geschäftsführer: Wilfried Paepcke, Dr. Andreas Longwitz, Josef Flatau