SUJ: fsck_ufs: Sparse journal inode

Mikolaj Golub to.my.trociny at gmail.com
Tue May 4 21:05:07 UTC 2010


On Mon, 03 May 2010 21:19:52 +0300 Mikolaj Golub wrote:

> Hi,
>
> Experimenting with journaled soft-updates on HAST I observed the error when
> fscking fs on the secondary after primary "crash":
>
> # fsck -y -t ufs /dev/hast/tank
> ** /dev/hast/tank
>
> USE JOURNAL?? yes
>
> ** SU+J Recovering /dev/hast/tank
> ** Reading 33554384 byte journal from inode 4.
> fsck_ufs: Sparse journal inode 4 (blocks = 16376, numfrags = 16383).
>
> (The text between the parentheses is a local modification to the fsck code to
> output some useful values).
>
> So to recover I needed to run fsck and answer "no" when prompted "USE
> JOURNAL?". But I am looking for a way to script automatic recovery from this
> situation. Currently the only way I have found is to disable the journal, run
> fsck, mount the fs somewhere temporary, remove .sujournal, unmount, and
> re-enable the journal. Is it really this complicated, or am I missing something?
>
> BTW, I observed this error on every "crash" test, and the "blocks" value was
> always the same: 16376. So I changed the journal size to 16376 * 2048 = 33538048.
> It looks like the issue has gone away after this.

Actually, it is tunefs that creates the sparse journal :-)

When creating a journal, tunefs allocates size/fs_bsize blocks
(journal_alloc(size)). But if the journal size is not a multiple of fs_bsize,
no block is allocated for the tail fragments, and we end up with a sparse file.
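
To make the arithmetic concrete, here is a minimal sketch of how a requested
size splits into full blocks and an uncovered tail. This is not the tunefs
code: journal_full_blocks(), journal_tail_bytes() and journal_round_down() are
hypothetical helpers, and the block size is passed in as a plain number
instead of being read from the superblock.

```c
#include <stdint.h>

/* Hypothetical helper: number of full blocks given by size / bsize. */
uint64_t
journal_full_blocks(uint64_t size, uint64_t bsize)
{
	return (size / bsize);
}

/* Hypothetical helper: bytes left over past the last full block. */
uint64_t
journal_tail_bytes(uint64_t size, uint64_t bsize)
{
	return (size % bsize);
}

/* Hypothetical helper: largest multiple of bsize not exceeding size. */
uint64_t
journal_round_down(uint64_t size, uint64_t bsize)
{
	return (size - size % bsize);
}
```

Assuming a 16384 byte block size, the 33554384 byte journal from the first
report splits into 2047 full blocks and a 16336 byte tail, and rounding it
down gives 33538048, the size that made the error go away.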

Steps to reproduce:

Choose a journal size of the form (blocksize * N) + fragsize + something, e.g.
4198400 (2048*2048 + 2*2048, i.e. 256 full 16384-byte blocks plus two
2048-byte fragments).

[root at hasta ~]# newfs /dev/$dev
/dev/md0: 10.0MB (20480 sectors) block size 16384, fragment size 2048
        using 4 cylinder groups of 2.52MB, 161 blks, 384 inodes.
super-block backups (for fsck -b #) at:
 160, 5312, 10464, 15616
[root at hasta ~]# tunefs -j enable -S 4198400 /dev/$dev
Using inode 4 in cg 0 for 4198400 byte journal
tunefs: soft updates journaling set
[root at hasta ~]# fsck -f -t ufs /dev/$dev
** /dev/md0

USE JOURNAL?? [yn] y

** SU+J Recovering /dev/md0
** Reading 4198400 byte journal from inode 4.
fsck_ufs: Sparse journal inode 4.

Note that the size must be chosen so that the tail contains at least one full
fragment, because in the code we have:

        blocks = ino_visit(jip, sujino, suj_add_block, 0);
        if (blocks != numfrags(fs, DIP(jip, di_size)))
                errx(1, "Sparse journal inode %d.\n", sujino);

With only a partial fragment in the tail, numfrags() rounds down and returns a
value equal to blocks, so the check passes.
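
The comparison can be modelled in a few lines. This is only a sketch of the
arithmetic, not the fsck code: visited_frags(), model_numfrags() and
reports_sparse() are hypothetical names, and the model assumes that only
size/bsize full blocks were allocated and that each of them is counted as
bsize/fsize fragments.

```c
#include <stdint.h>

/* Frags seen by the visitor when only size / bsize full blocks exist. */
uint64_t
visited_frags(uint64_t size, uint64_t bsize, uint64_t fsize)
{
	return ((size / bsize) * (bsize / fsize));
}

/* Model of numfrags(): size divided by fragment size, rounded down. */
uint64_t
model_numfrags(uint64_t size, uint64_t fsize)
{
	return (size / fsize);
}

/* Nonzero when the comparison would report a sparse journal. */
int
reports_sparse(uint64_t size, uint64_t bsize, uint64_t fsize)
{
	return (visited_frags(size, bsize, fsize) !=
	    model_numfrags(size, fsize));
}
```

With a 16384 byte block and a 2048 byte fragment, a size of 4198400 gives 2048
against 2050 and is reported as sparse, while 4195304 (a tail of only 1000
bytes) gives 2048 on both sides and passes. For 33554384 the model gives 16376
against 16383, the same values as in the modified error message.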

BTW, I am not sure this check would be correct even if tunefs allocated the
tail fragments. As far as I can see, indir_visit() adds the following for
every block it finds:

  (*frags) += fs->fs_frag;

so the same would happen for the tail block, and ino_visit() in the code above
would return more than numfrags().
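
A rough model of that concern, under the assumption that every visited block,
including a tail block allocated to cover the remainder, is counted as a full
fs_frag fragments. counted_with_tail() and expected_frags() are hypothetical
names used only for illustration, not functions from fsck.

```c
#include <stdint.h>

/*
 * Model only: frags counted if every block, including one allocated
 * for the tail, contributes a full bsize / fsize fragments.
 */
uint64_t
counted_with_tail(uint64_t size, uint64_t bsize, uint64_t fsize)
{
	uint64_t nblocks = (size + bsize - 1) / bsize;	/* round up */

	return (nblocks * (bsize / fsize));
}

/* Frags implied by the file size, rounded down as numfrags() does. */
uint64_t
expected_frags(uint64_t size, uint64_t fsize)
{
	return (size / fsize);
}
```

For the 4198400 byte example this gives 257 * 8 = 2056 counted fragments
against 2050 expected, so under this assumption the two values would agree
only when the size is an exact multiple of the block size.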

-- 
Mikolaj Golub

