[Bug 227204] Combination of gmirror and enabled softupdates journalling cause slow filesystem degradation
bugzilla-noreply at freebsd.org
Mon Apr 2 15:46:34 UTC 2018
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227204
Bug ID: 227204
Summary: Combination of gmirror and enabled softupdates
journalling cause slow filesystem degradation
Product: Base System
Version: 10.3-RELEASE
Hardware: Any
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: freebsd-bugs at FreeBSD.org
Reporter: aeder at list.ru
Hello!
I support at least 10 FreeBSD installations in different parts of the
country (remotely). All of them use gmirror in a two-disk
configuration, all filesystems have softupdates enabled, and some of them also
have softupdates journalling enabled.
Most of the installations are rather old and have been binary-updated from
8.x-RELEASE to 10.3-RELEASE.
On at least 3 different systems, after a long period (years) of operation with
multiple reboots (sometimes due to power failure), the following problem has
appeared: filesystem inconsistency, causing
a) kernel panics
b) processes that hang forever when accessing some files or listing directories.
In all 3 cases, I solved it by booting into single-user mode, disabling
softupdates journalling (leaving softupdates only) and running fsck -f -y
on all filesystems.
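For reference, the recovery steps can be sketched as the following script. It is only a sketch: the device list is taken from the fsck log in this report, and the DEVICES and DRY_RUN variables and the run/repair_all helpers are names made up for illustration. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the workaround; meant to be run from single-user mode.
# DEVICES is an assumption taken from the fsck log in this report; adjust per system.
DEVICES="${DEVICES:-/dev/ada0s1a /dev/ada0s1d}"
# DRY_RUN=1 (the default here) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

repair_all() {
    for dev in $DEVICES; do
        # Turn off softupdates journalling; plain softupdates stay enabled.
        run tunefs -j disable "$dev"
        # Force a full check and answer yes to every repair prompt.
        run fsck -f -y "$dev"
    done
}

repair_all
```

Setting DRY_RUN=0 makes the script execute the commands. A filesystem for which fsck ends with PLEASE RERUN FSCK, as /usr does in the log in this report, needs a second fsck pass.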
Typical list of errors found by fsck (most duplicate lines omitted):
===================================================================
[root at yakutia /usr/home/der]# fsck -f -y /dev/ada0s1a
** /dev/ada0s1a
** Last Mounted on /
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=5858989 OWNER=root MODE=100600
SIZE=380 MTIME=Oct 2 17:50 2017
CLEAR? yes
UNREF FILE I=5858993 OWNER=root MODE=100600
SIZE=427 MTIME=Oct 2 17:54 2017
CLEAR? yes
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes
SUMMARY INFORMATION BAD
SALVAGE? yes
BLK(S) MISSING IN BIT MAPS
SALVAGE? yes
149860 files, 918866 used, 31578821 free (15053 frags, 3945471 blocks, 0.0%
fragmentation)
***** FILE SYSTEM MARKED CLEAN *****
***** FILE SYSTEM WAS MODIFIED *****
[root at yakutia /usr/home/der]# fsck -f -y /dev/ada0s1d
** /dev/ada0s1d
** Last Mounted on /usr
** Phase 1 - Check Blocks and Sizes
26629536 DUP I=13324783
UNEXPECTED SOFT UPDATE INCONSISTENCY
26629537 DUP I=13324783
UNEXPECTED SOFT UPDATE INCONSISTENCY
...
26629543 DUP I=13324783
UNEXPECTED SOFT UPDATE INCONSISTENCY
26629544 DUP I=13324783
UNEXPECTED SOFT UPDATE INCONSISTENCY
26629545 DUP I=13324783
UNEXPECTED SOFT UPDATE INCONSISTENCY
26629546 DUP I=13324783
UNEXPECTED SOFT UPDATE INCONSISTENCY
EXCESSIVE DUP BLKS I=13324783
CONTINUE? yes
INCORRECT BLOCK COUNT I=13324783 (11200 should be 80)
CORRECT? yes
INCORRECT BLOCK COUNT I=28090681 (8 should be 0)
CORRECT? yes
...
INCORRECT BLOCK COUNT I=28100016 (8 should be 0)
CORRECT? yes
INCORRECT BLOCK COUNT I=28100017 (8 should be 0)
CORRECT? yes
INCORRECT BLOCK COUNT I=28100020 (8 should be 0)
CORRECT? yes
INTERNAL ERROR: dups with softupdates
UNEXPECTED SOFT UPDATE INCONSISTENCY
** Phase 1b - Rescan For More DUPS
26629536 DUP I=13322839
UNEXPECTED SOFT UPDATE INCONSISTENCY
26629537 DUP I=13322839
UNEXPECTED SOFT UPDATE INCONSISTENCY
...
26629545 DUP I=13322839
UNEXPECTED SOFT UPDATE INCONSISTENCY
** Phase 2 - Check Pathnames
DUP/BAD I=13322839 OWNER=root MODE=100644
SIZE=946634 MTIME=Oct 30 17:43 2017
FILE=/local/lib/libfreetype.a
UNEXPECTED SOFT UPDATE INCONSISTENCY
REMOVE? yes
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
BAD/DUP FILE I=13322839 OWNER=root MODE=100644
SIZE=946634 MTIME=Oct 30 17:43 2017
CLEAR? yes
BAD/DUP FILE I=13324783 OWNER=root MODE=100555
SIZE=5691264 MTIME=Jun 26 22:35 2017
CLEAR? yes
...
UNREF FILE I=28098948 OWNER=root MODE=100644
SIZE=0 MTIME=Oct 23 20:32 2017
RECONNECT? yes
ZERO LENGTH DIR I=28098949 OWNER=root MODE=40755
SIZE=0 MTIME=Apr 2 23:33 2018
CLEAR? yes
UNREF FILE I=28098950 OWNER=root MODE=100644
SIZE=0 MTIME=Nov 13 07:12 2016
RECONNECT? yes
UNREF FILE I=28098951 OWNER=root MODE=100644
SIZE=0 MTIME=Nov 13 07:12 2016
RECONNECT? yes
...
...
UNREF FILE I=28100020 OWNER=root MODE=100644
SIZE=0 MTIME=Apr 2 23:33 2018
RECONNECT? yes
UNREF FILE I=28100049 OWNER=root MODE=120755
SIZE=0 MTIME=Apr 2 23:33 2018
RECONNECT? yes
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes
SUMMARY INFORMATION BAD
SALVAGE? yes
BLK(S) MISSING IN BIT MAPS
SALVAGE? yes
370932 files, 2653812 used, 82406856 free (23056 frags, 10297975 blocks, 0.0%
fragmentation)
***** FILE SYSTEM STILL DIRTY *****
***** FILE SYSTEM WAS MODIFIED *****
***** PLEASE RERUN FSCK *****
[root at yakutia /usr/home/der]# fsck -f -y /dev/ada0s1d
** /dev/ada0s1d
** Last Mounted on /usr
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
370932 files, 2653812 used, 82406856 free (23056 frags, 10297975 blocks, 0.0%
fragmentation)
***** FILE SYSTEM MARKED CLEAN *****
==========================================================
I can't find any direct statement in the Handbook on whether softupdates
journalling is incompatible with gmirror, so I decided to file this bug.
I'm sure that it's not a hardware problem: smartctl shows nothing wrong on those
hard drives, and we actually replace the disks about every 3 years using gmirror.
From the last case I can save the broken filesystem image (it mounts OK, but
causes a kernel panic when attempting to write some of the files), if it can be
used to find the root cause. For security reasons I can't give it to anyone
outside my company, but I can try to analyse it myself if I am given the right
questions.