SUJ: fsck_ufs: Sparse journal inode
Mikolaj Golub
to.my.trociny at gmail.com
Tue May 4 21:05:07 UTC 2010
On Mon, 03 May 2010 21:19:52 +0300 Mikolaj Golub wrote:
> Hi,
>
> Experimenting with journaled soft-updates on HAST, I observed an error when
> running fsck on the fs on the secondary after a primary "crash":
>
> # fsck -y -t ufs /dev/hast/tank
> ** /dev/hast/tank
>
> USE JOURNAL?? yes
>
> ** SU+J Recovering /dev/hast/tank
> ** Reading 33554384 byte journal from inode 4.
> fsck_ufs: Sparse journal inode 4 (blocks = 16376, numfrags = 16383).
>
> (The text between the parentheses is a local modification to the fsck code to
> output some useful values).
>
> So to recover I needed to run fsck and answer "no" when prompted "USE
> JOURNAL?". But I am looking for a way to script automatic recovery from this
> situation. Currently the only way I have found is to disable the journal, run
> fsck, mount the fs somewhere temporary, remove .sujournal, unmount, and
> re-enable the journal. Is it really this complicated, or am I missing something?
>
> BTW, I observed this error on every "crash" test, and the "blocks" value was
> always the same: 16376. So I changed the journal size to 16376 * 2048 = 33538048.
> It looks like the issue has gone away since then.
Actually, it is tunefs that creates the sparse journal :-)
When creating a journal, tunefs allocates size/fs_bsize blocks
(journal_alloc(size)). If the journal size is not a multiple of fs_bsize, no
block is allocated for the tail fragments, and we end up with a sparse file.
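The arithmetic can be sketched in C as below. This is only a model of the
behaviour described above, not the tunefs source: the helper names
full_blocks() and tail_bytes() are made up for illustration, and the
16384-byte block size used in the note afterwards is the one reported by
newfs in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of whole blocks in a journal of `size` bytes.
 * Models the size / fs_bsize allocation described above (integer division).
 */
static uint64_t
full_blocks(uint64_t size, uint64_t bsize)
{
    assert(bsize > 0);
    return (size / bsize);
}

/*
 * Hypothetical helper: bytes past the last whole block. In this model no
 * block is allocated for them, which is what leaves the hole in the file.
 */
static uint64_t
tail_bytes(uint64_t size, uint64_t bsize)
{
    assert(bsize > 0);
    return (size % bsize);
}
```

For the 33554384 byte journal from the original report, full_blocks() gives
2047 and tail_bytes() gives 16336, while the 33538048 byte workaround size
leaves no tail at all.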
Steps to reproduce:
Choose a journal size of the form (blocksize * N) + fragsize + something, e.g.
4198400 (2048*2048 + 2*2048, i.e. 256 blocks of 16384 bytes plus 2 fragments).
[root@hasta ~]# newfs /dev/$dev
/dev/md0: 10.0MB (20480 sectors) block size 16384, fragment size 2048
using 4 cylinder groups of 2.52MB, 161 blks, 384 inodes.
super-block backups (for fsck -b #) at:
160, 5312, 10464, 15616
[root@hasta ~]# tunefs -j enable -S 4198400 /dev/$dev
Using inode 4 in cg 0 for 4198400 byte journal
tunefs: soft updates journaling set
[root@hasta ~]# fsck -f -t ufs /dev/$dev
** /dev/md0
USE JOURNAL?? [yn] y
** SU+J Recovering /dev/md0
** Reading 4198400 byte journal from inode 4.
fsck_ufs: Sparse journal inode 4.
Note that the size must be chosen so that the tail contains at least one full
fragment, because the code has:
    blocks = ino_visit(jip, sujino, suj_add_block, 0);
    if (blocks != numfrags(fs, DIP(jip, di_size)))
        errx(1, "Sparse journal inode %d.\n", sujino);
With only a partial fragment in the tail, numfrags() returns a value equal to
blocks, and the check passes.
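A small C model of this comparison, for illustration only. It assumes that
numfrags() truncates to whole fragments, which is consistent with the
numfrags = 16383 reported for the 33554384 byte journal, and that the visited
count covers only the whole blocks tunefs allocated. All model_* names are
hypothetical and do not exist in fsck.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed behaviour of numfrags(): whole fragments in `size` bytes. */
static uint64_t
model_numfrags(uint64_t size, uint64_t fsize)
{
    assert(fsize > 0);
    return (size / fsize);
}

/* Fragments covered by the whole blocks allocated for the journal. */
static uint64_t
model_allocated_frags(uint64_t size, uint64_t bsize, uint64_t fsize)
{
    assert(fsize > 0 && bsize >= fsize && bsize % fsize == 0);
    return ((size / bsize) * (bsize / fsize));
}

/* Returns 1 when the modelled check would print "Sparse journal inode". */
static int
model_reported_sparse(uint64_t size, uint64_t bsize, uint64_t fsize)
{
    return (model_allocated_frags(size, bsize, fsize) !=
        model_numfrags(size, fsize));
}
```

With bsize = 16384 and fsize = 2048, the model flags 4198400 and 33554384 as
sparse (16376 vs 16383 fragments in the latter case), but not 33538048, and
not a size such as 4195304 whose tail is shorter than one fragment.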
BTW, I am not sure this check would be correct even if tunefs allocated the
tail fragments. As far as I can see, indir_visit() adds for every block it finds:
    (*frags) += fs->fs_frag;
so the same would happen for the tail block, and ino_visit() in the code above
would return more than numfrags().
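To put numbers on that concern, here is a hypothetical C model. It assumes
that tunefs did allocate one extra block for the tail and that every visited
block is counted as a full fs_frag fragments, as in the indir_visit() line
quoted above. It is a simplification and not the real ino_visit() logic; the
model_* names are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: blocks visited if the tail also got a block of its own. */
static uint64_t
model_blocks_with_tail(uint64_t size, uint64_t bsize)
{
    assert(bsize > 0);
    return ((size + bsize - 1) / bsize);    /* round up */
}

/* Every visited block contributes a full block's worth of fragments. */
static uint64_t
model_visited_frags(uint64_t size, uint64_t bsize, uint64_t fsize)
{
    assert(fsize > 0 && bsize % fsize == 0);
    return (model_blocks_with_tail(size, bsize) * (bsize / fsize));
}

/* Assumed behaviour of numfrags(): whole fragments in `size` bytes. */
static uint64_t
model_numfrags(uint64_t size, uint64_t fsize)
{
    assert(fsize > 0);
    return (size / fsize);
}
```

Under these assumptions the 4198400 byte journal would be counted as 2056
fragments against numfrags() = 2050, so the comparison would still fail, only
in the other direction.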
--
Mikolaj Golub