kern/127420: panic: Journal overflow on gmirrored gjournal
Ruben van Staveren
ruben at verweg.com
Tue Sep 16 11:10:03 UTC 2008
>Number: 127420
>Category: kern
>Synopsis: panic: Journal overflow on gmirrored gjournal
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Tue Sep 16 11:10:02 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator: Ruben van Staveren
>Release: FreeBSD 7.1-PRERELEASE amd64
>Organization:
>Environment:
System: FreeBSD chassis 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #2: Tue Sep 16 11:29:52 CEST 2008 root at chassis:/opt/obj/usr/cvsup/7-stable/src/sys/CHASSIS-DEBUG amd64
>Description:
Crash 1
panic: Journal overflow (joffset=180955342336 active=180735900160 inactive=180952868864)
cpuid = 1
Uptime: 40m34s
Physical memory: 4085 MB
Dumping 625 MB:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x200
fault code = supervisor read instruction, page not present
instruction pointer = 0x8:0x200
stack pointer = 0x10:0xffffffffae1ece40
frame pointer = 0x10:0xffffffffae1ece70
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 47 (g_journal mirror/gm)
trap number = 12
Crash 2 (with debug kernel)
panic: Journal overflow (joffset=180542946816 active=181305220608 inactive=180542008320)
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
panic() at panic+0x17d
g_journal_flush() at g_journal_flush+0x8cb
g_journal_worker() at g_journal_worker+0x14ce
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffae1edd30, rbp = 0 ---
panic: BUF_UNLOCK 0xffffffff9a26e220 while B_REMFREE is still set.
cpuid = 1
panic: BUF_UNLOCK 0xffffffff9a04b420 while B_REMFREE is still set.
cpuid = 1
Uptime: 20m24s
Physical memory: 4084 MB
Dumping 625 MB:
Unfortunately, the crash dump no longer completes at this stage.
Kernel config (linked below): the -DEBUG version just includes that file and
adds these extra options:
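The two "Journal overflow" lines above can be checked against a wrap-around test for a circular journal. The sketch below is only an assumption about what such a test looks like; it is not copied from sys/geom/journal/g_journal.c, and the names parse_panic and looks_like_overflow are made up for illustration. Under that assumption, both reported triples are flagged as an overflow.

```python
import re

# Matches the three offsets printed in the panic message.
PANIC_RE = re.compile(r"joffset=(\d+) active=(\d+) inactive=(\d+)")


def parse_panic(line):
    """Return (joffset, active, inactive) from a 'Journal overflow' panic line."""
    m = PANIC_RE.search(line)
    if m is None:
        raise ValueError("not a Journal overflow panic line")
    return tuple(int(g) for g in m.groups())


def looks_like_overflow(joffset, active, inactive):
    """Assumed wrap-around test (hypothetical, not the kernel source):
    the write offset has reached the start of the inactive journal."""
    if active < inactive:
        # Active journal sits below the inactive one: writing upwards
        # must stop before the inactive journal begins.
        return joffset >= inactive
    if active > inactive:
        # The journal has wrapped: the write offset lies between the
        # start of the inactive journal and the start of the active one.
        return inactive <= joffset < active
    return False


CRASH_1 = ("panic: Journal overflow (joffset=180955342336 "
           "active=180735900160 inactive=180952868864)")
CRASH_2 = ("panic: Journal overflow (joffset=180542946816 "
           "active=181305220608 inactive=180542008320)")

for line in (CRASH_1, CRASH_2):
    joffset, active, inactive = parse_panic(line)
    # Print how far joffset is past the inactive start, and the verdict.
    print(joffset - inactive, looks_like_overflow(joffset, active, inactive))
```

Crash 1 falls under the first branch (active below inactive) and crash 2 under the second (wrapped journal), so the two panics appear to cover both layouts.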
options BREAK_TO_DEBUGGER
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options WITNESS_KDB
options DIAGNOSTIC
(I had to disable some KASSERTs in sys/geom/geom_io.c, as gjournal seems to
alter some data there; see also
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/current/2007-08/msg00648.html
)
http://ruben.is.verweg.com/stuff/gjournal-panic/CHASSIS
http://ruben.is.verweg.com/stuff/gjournal-panic/dmesg.boot
The machine is a Sun X2100M2 with two 250 GB SATA drives.
Geom name: gm0
State: COMPLETE
Components: 2
Balance: round-robin
Slice: 4096
Flags: NOFAILSYNC
GenID: 0
SyncID: 1
ID: 4042519102
Providers:
1. Name: mirror/gm0
Mediasize: 250055999488 (233G)
Sectorsize: 512
Mode: r6w6e8
Consumers:
1. Name: ad4
Mediasize: 250056000000 (233G)
Sectorsize: 512
Mode: r1w1e1
State: ACTIVE
Priority: 1
Flags: NONE
GenID: 0
SyncID: 1
ID: 2820405034
2. Name: ad6
Mediasize: 250056000000 (233G)
Sectorsize: 512
Mode: r1w1e1
State: ACTIVE
Priority: 0
Flags: NONE
GenID: 0
SyncID: 1
ID: 933275518
Geom name: gjournal 243051746
ID: 243051746
Providers:
1. Name: mirror/gm0s1a.journal
Mediasize: 3221224960 (3.0G)
Sectorsize: 512
Mode: r1w1e2
Consumers:
1. Name: mirror/gm0s1a
Mediasize: 4294967296 (4.0G)
Sectorsize: 512
Mode: r1w1e1
Jend: 4294966784
Jstart: 3221224960
Role: Data,Journal
Geom name: gjournal 3027218344
ID: 3027218344
Providers:
1. Name: mirror/gm0s1d.journal
Mediasize: 33285996032 (31G)
Sectorsize: 512
Mode: r1w1e2
Consumers:
1. Name: mirror/gm0s1d
Mediasize: 34359738368 (32G)
Sectorsize: 512
Mode: r1w1e1
Jend: 34359737856
Jstart: 33285996032
Role: Data,Journal
Geom name: gjournal 1964026446
ID: 1964026446
Providers:
1. Name: mirror/gm0s1e.journal
Mediasize: 3221224960 (3.0G)
Sectorsize: 512
Mode: r1w1e2
Consumers:
1. Name: mirror/gm0s1e
Mediasize: 4294967296 (4.0G)
Sectorsize: 512
Mode: r1w1e1
Jend: 4294966784
Jstart: 3221224960
Role: Data,Journal
Geom name: gjournal 3220754734
ID: 3220754734
Providers:
1. Name: mirror/gm0s1f.journal
Mediasize: 7516192256 (7.0G)
Sectorsize: 512
Mode: r1w1e2
Consumers:
1. Name: mirror/gm0s1f
Mediasize: 8589934592 (8.0G)
Sectorsize: 512
Mode: r1w1e1
Jend: 8589934080
Jstart: 7516192256
Role: Data,Journal
Geom name: gjournal 1120739874
ID: 1120739874
Providers:
1. Name: mirror/gm0s1g.journal
Mediasize: 180255252480 (168G)
Sectorsize: 512
Mode: r1w1e2
Consumers:
1. Name: mirror/gm0s1g
Mediasize: 181328994816 (169G)
Sectorsize: 512
Mode: r1w1e1
Jend: 181328994304
Jstart: 180255252480
Role: Data,Journal
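The Jstart/Jend values listed above can be used to work out which journal the panic offsets belong to. The sketch below copies the numbers from this report; the helper names are hypothetical. It assumes the offsets in the panic message are byte offsets on the consumer, in the same units as Jstart and Jend.

```python
# (consumer name, Jstart, Jend) copied from the gjournal list output.
JOURNALS = [
    ("mirror/gm0s1a", 3221224960, 4294966784),
    ("mirror/gm0s1d", 33285996032, 34359737856),
    ("mirror/gm0s1e", 3221224960, 4294966784),
    ("mirror/gm0s1f", 7516192256, 8589934080),
    ("mirror/gm0s1g", 180255252480, 181328994304),
]

# Offsets from the two panic messages: (joffset, active, inactive).
CRASH_1 = (180955342336, 180735900160, 180952868864)
CRASH_2 = (180542946816, 181305220608, 180542008320)


def journal_size(jstart, jend):
    """Size in bytes of the journal area on a consumer."""
    return jend - jstart


def consumers_containing(offsets):
    """Names of consumers whose journal area contains every given offset."""
    return [name for name, jstart, jend in JOURNALS
            if all(jstart <= off < jend for off in offsets)]


for name, jstart, jend in JOURNALS:
    print(name, journal_size(jstart, jend) // (1024 * 1024), "MiB")

print(consumers_containing(CRASH_1))
print(consumers_containing(CRASH_2))
```

With these numbers every journal area is 1 GiB, and both sets of panic offsets fall inside the journal of mirror/gm0s1g, which is the provider behind /opt where bonnie++ was running.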
      Name  Status  Components
label/swap     N/A  mirror/gm0s1b
  ufs/root     N/A  mirror/gm0s1a.journal
   ufs/var     N/A  mirror/gm0s1d.journal
   ufs/tmp     N/A  mirror/gm0s1e.journal
   ufs/usr     N/A  mirror/gm0s1f.journal
   ufs/opt     N/A  mirror/gm0s1g.journal
******* Working on device /dev/ad4 *******
parameters extracted from in-core disklabel are:
cylinders=484514 heads=16 sectors/track=63 (1008 blks/cyl)
Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=484514 heads=16 sectors/track=63 (1008 blks/cyl)
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
start 63, size 488375937 (238464 Meg), flag 80 (active)
beg: cyl 0/ head 1/ sector 1;
end: cyl 703/ head 254/ sector 63
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>
******* Working on device /dev/ad6 *******
parameters extracted from in-core disklabel are:
cylinders=484514 heads=16 sectors/track=63 (1008 blks/cyl)
Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=484514 heads=16 sectors/track=63 (1008 blks/cyl)
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
start 63, size 488375937 (238464 Meg), flag 80 (active)
beg: cyl 0/ head 1/ sector 1;
end: cyl 703/ head 254/ sector 63
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>
# /dev/mirror/gm0s1:
8 partitions:
#        size     offset    fstype   [fsize bsize bps/cpg]
  a:   8388608         16    4.2BSD     2048 16384 28528
  b:  33554432    8388624      swap
  c: 488375937          0    unused        0     0         # "raw" part, don't edit
  d:  67108864   41943056    4.2BSD     2048 16384 28528
  e:   8388608  109051920    4.2BSD     2048 16384 28528
  f:  16777216  117440528    4.2BSD     2048 16384 28528
  g: 354158193  134217744    4.2BSD     2048 16384 28528
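As a sanity check on the layout, the label sizes (in 512-byte sectors) can be compared with the consumer Mediasize values from the gjournal list output, and the partitions can be checked for gaps. This is a small sketch using only numbers from this report; the function names are made up for illustration.

```python
SECTOR = 512  # "Media sector size is 512" in the fdisk output

# partition letter: (size in sectors, offset in sectors), from the label
LABEL = {
    "a": (8388608, 16),
    "b": (33554432, 8388624),
    "d": (67108864, 41943056),
    "e": (8388608, 109051920),
    "f": (16777216, 117440528),
    "g": (354158193, 134217744),
}
ORDER = "abdefg"          # on-disk order; "c" is the raw partition
RAW_SIZE = 488375937      # size of partition "c"

# Mediasize of each gjournal consumer (mirror/gm0s1X), in bytes
CONSUMER_MEDIASIZE = {
    "a": 4294967296,
    "d": 34359738368,
    "e": 4294967296,
    "f": 8589934592,
    "g": 181328994816,
}


def size_mismatches():
    """Partitions whose label size disagrees with the consumer Mediasize."""
    return [p for p, size in CONSUMER_MEDIASIZE.items()
            if LABEL[p][0] * SECTOR != size]


def gaps():
    """Adjacent partition pairs that are not back to back."""
    out = []
    for prev, nxt in zip(ORDER, ORDER[1:]):
        end = LABEL[prev][1] + LABEL[prev][0]
        if end != LABEL[nxt][1]:
            out.append((prev, nxt))
    return out


print(size_mismatches(), gaps())
```

Both lists come out empty for the values above, and partition g ends exactly at the end of the raw partition, so the label itself looks consistent.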
/dev/ufs/root on / (ufs, asynchronous, local, gjournal)
devfs on /dev (devfs, local)
/dev/ufs/opt on /opt (ufs, asynchronous, local, gjournal)
/dev/ufs/tmp on /tmp (ufs, asynchronous, local, gjournal)
/dev/ufs/usr on /usr (ufs, asynchronous, local, gjournal)
/dev/ufs/var on /var (ufs, asynchronous, local, gjournal)
>How-To-Repeat:
In /opt/bonnie, run two instances of the following command in parallel:
bonnie++ -c 4 -s 4096 -r 4096 -u nobody -d $PWD
Both bonnie++ processes stall the system in the suspfs/wdrain states until it
panics.
Building a 1 GB nanobsd image also locks up on suspfs/wdrain during the disk
install phase, but that is not always reproducible: the build succeeds about
50% of the time.
The panic seems to take longer to trigger when the debugging options are enabled.
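The parallel run can be scripted. The sketch below is one possible way to do it: the bonnie++ flags are the ones given above, while the function names and the default of two processes (inferred from "both bonnie processes") are assumptions.

```python
import subprocess


def bonnie_cmd(directory):
    """Build the bonnie++ command line from this report for a directory."""
    return ["bonnie++", "-c", "4", "-s", "4096", "-r", "4096",
            "-u", "nobody", "-d", directory]


def run_parallel(directory, count=2):
    """Start `count` bonnie++ processes at once and wait for all of them.

    Returns the list of exit codes. On an affected system the machine is
    expected to stall or panic before the processes finish.
    """
    procs = [subprocess.Popen(bonnie_cmd(directory)) for _ in range(count)]
    return [p.wait() for p in procs]
```

Calling run_parallel("/opt/bonnie") corresponds to the steps described above; the directory must be writable by the nobody user.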
>Fix:
Maybe don't run a mirrored gjournal on FreeBSD/amd64?
>Release-Note:
>Audit-Trail:
>Unformatted: