gmirror+gjournal often makes inconsistent file systems
Eugene Grosbein
egrosbein at rdtc.ru
Fri Sep 9 10:31:56 UTC 2011
Dear Pawel Jakub,
09.09.2011 12:17, Eugene Grosbein writes:
> Hi!
>
> For a long time I have been seeing the same UFS2 filesystem problems on several 8.2 systems
> running gmirror+gjournal+async. After an unclean shutdown, kernel panic or power failure,
> gjournal makes fsck skip its checks, and that's why I use it.
>
> But quite often my /var partition (and sometimes others) still has severe damage,
> and running with such a /var mounted read-write leads to further panics, hangs and so on.
>
> For example, I have an 8.2-STABLE system with the ad4 and ad6 drives combined into /dev/mirror/gm0.
> I have just removed ad6 from the mirror, run fsck -y manually on all its filesystems,
> shut the machine down cleanly again and booted it from ad6,
> leaving the mirror on ad4 neither mounted nor checked.
>
> Then I ran fsck -y /dev/mirror/gm0.journals1e (/var on the mirrored drive)
> and got LOTS of bad errors on a presumably clean file system.
> Of course, I had seen the same errors while checking ad6 after it was removed from the running mirror.
> I have the gmirror auto-sync feature turned ON. I've tried turning it OFF, but that just
> increases the frequency of such damage that is not fixed after reboot.
>
> It seems that gjournal cannot handle system crashes reliably, can it?
> I basically run it without any manual tuning. I've also tried tuning it, without luck:
> it works nicely when there are no unclean shutdowns, but it's there to deal with them in the first place.
>
> # fsck -t ffs -y /dev/mirror/gm0.journals1e
> ** /dev/mirror/gm0.journals1e
> ** Last Mounted on /var
> ** Phase 1 - Check Blocks and Sizes
> 3955872 DUP I=989242
> 3955873 DUP I=989242
> 3955874 DUP I=989242
> 3955875 DUP I=989242
> 3955876 DUP I=989242
> 3955877 DUP I=989242
> 3955878 DUP I=989242
> 3955879 DUP I=989242
> 3955880 DUP I=989242
> 3955881 DUP I=989242
> 3955882 DUP I=989242
> EXCESSIVE DUP BLKS I=989242
> CONTINUE? yes
>
> INCORRECT BLOCK COUNT I=989242 (448 should be 424)
> CORRECT? yes
>
> 3955888 DUP I=989289
> 3955889 DUP I=989289
> 3955890 DUP I=989289
> 3955891 DUP I=989289
> 3955892 DUP I=989289
> 3955893 DUP I=989289
> 3955894 DUP I=989289
> 3955895 DUP I=989289
> ** Phase 1b - Rescan For More DUPS
> 3955872 DUP I=989242
> 3955873 DUP I=989242
> 3955874 DUP I=989242
> 3955875 DUP I=989242
> 3955876 DUP I=989242
> 3955877 DUP I=989242
> 3955878 DUP I=989242
> 3955879 DUP I=989242
> 3955880 DUP I=989242
> 3955881 DUP I=989242
> 3955888 DUP I=989242
> 3955889 DUP I=989242
> 3955890 DUP I=989242
> 3955891 DUP I=989242
> 3955892 DUP I=989242
> 3955893 DUP I=989242
> 3955894 DUP I=989242
> 3955895 DUP I=989242
> ** Phase 2 - Check Pathnames
> DUP/BAD I=989289 OWNER=root MODE=100640
> SIZE=14367 MTIME=Sep 9 11:30 2011
> FILE=/log/kernel.log
>
> REMOVE? yes
>
> DUP/BAD I=989242 OWNER=root MODE=100640
> SIZE=202631 MTIME=Sep 8 19:52 2011
> FILE=/log/mpd.log.0
>
> REMOVE? yes
>
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> UNREF FILE I=376866 OWNER=root MODE=140666
> SIZE=0 MTIME=Sep 5 12:27 2011
> CLEAR? yes
>
> UNREF FILE I=376868 OWNER=root MODE=140666
> SIZE=0 MTIME=Sep 7 20:30 2011
> CLEAR? yes
>
> UNREF FILE I=376869 OWNER=root MODE=140666
> SIZE=0 MTIME=Sep 8 11:17 2011
> CLEAR? yes
>
> UNREF FILE I=376870 OWNER=root MODE=140666
> SIZE=0 MTIME=Sep 8 12:11 2011
> CLEAR? yes
>
> BAD/DUP FILE I=989242 OWNER=root MODE=100640
> SIZE=202631 MTIME=Sep 8 19:52 2011
> CLEAR? yes
>
> UNREF FILE I=989259 OWNER=root MODE=100640
> SIZE=648 MTIME=Aug 27 00:00 2011
> RECONNECT? yes
>
> BAD/DUP FILE I=989289 OWNER=root MODE=100640
> SIZE=14367 MTIME=Sep 9 11:30 2011
> CLEAR? yes
> LINK COUNT FILE I=989293 OWNER=root MODE=100640
> SIZE=961 MTIME=Sep 9 11:26 2011 COUNT 1 SHOULD BE 2
> ADJUST? yes
>
> UNREF FILE I=989327 OWNER=root MODE=100640
> SIZE=114 MTIME=Aug 27 00:00 2011
> RECONNECT? yes
>
> ** Phase 5 - Check Cyl groups
> FREE BLK COUNT(S) WRONG IN SUPERBLK
> SALVAGE? yes
>
> SUMMARY INFORMATION BAD
> SALVAGE? yes
>
> BLK(S) MISSING IN BIT MAPS
> SALVAGE? yes
>
> 1188 files, 90007 used, 4987072 free (360 frags, 623339 blocks, 0.0% fragmentation)
>
> ***** FILE SYSTEM IS CLEAN *****
>
> ***** FILE SYSTEM WAS MODIFIED *****
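
In the output above, blocks 3955888 through 3955895 are listed under I=989289 in Phase 1 and under I=989242 in Phase 1b, so the two log files appear to claim the same blocks. For anyone comparing longer fsck logs, here is a rough Python sketch that tallies the "DUP" lines per inode and lists blocks reported under more than one inode. The function names and the SAMPLE excerpt are only illustrative, and the script assumes the "<block> DUP I=<inode>" line format shown above (with or without the leading "> " quote marker); it is not part of fsck.

```python
import re
from collections import defaultdict

# Matches lines such as "3955872 DUP I=989242", optionally quoted with "> ".
DUP_LINE = re.compile(r"^(?:>\s*)?(\d+) DUP I=(\d+)$")


def dup_blocks_by_inode(log_text):
    """Map inode number -> set of block numbers reported as DUP for it."""
    claims = defaultdict(set)
    for line in log_text.splitlines():
        m = DUP_LINE.match(line.strip())
        if m:
            block, inode = int(m.group(1)), int(m.group(2))
            claims[inode].add(block)
    return dict(claims)


def shared_blocks(claims):
    """Blocks that were reported as DUP under more than one inode."""
    owners = defaultdict(set)
    for inode, blocks in claims.items():
        for block in blocks:
            owners[block].add(inode)
    return {b: sorted(i) for b, i in owners.items() if len(i) > 1}


# Short excerpt in the same format as the fsck output above (illustrative).
SAMPLE = """\
3955888 DUP I=989289
3955889 DUP I=989289
** Phase 1b - Rescan For More DUPS
3955872 DUP I=989242
3955888 DUP I=989242
3955889 DUP I=989242
"""

if __name__ == "__main__":
    claims = dup_blocks_by_inode(SAMPLE)
    for inode, blocks in sorted(claims.items()):
        print(inode, len(blocks), "DUP blocks")
    for block, inodes in sorted(shared_blocks(claims).items()):
        print("block", block, "reported under inodes", inodes)
```

To use it on a saved log, read the file and pass its contents to dup_blocks_by_inode() instead of SAMPLE.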
Could you please explain whether the following layering is supported?
physical drive - geom_mirror - geom_journal - geom_part_mbr - geom_part_bsd - journalled UFS2
If not, mounting such a UFS2 file system should warn us, shouldn't it?
There are no warnings at the moment.
Eugene Grosbein
More information about the freebsd-stable
mailing list