geom_raid5 livelock?
CyberLeo Kitsana
cyberleo at cyberleo.net
Fri Jan 12 18:55:32 UTC 2007
Hi.
I've been using the geom_raid5 class on FreeBSD 6.2-PRERELEASE. With the
latest sources provided (as of 2007-01-10), I've noticed an odd behavior:
with 'safeop' enabled, the raid5 will eventually stop responding.
I have an 800MHz Celeron (i815 chipset) with 512MB RAM and 4x 400GB
PATA100 disks, two on a Promise PCI ATA controller, running FreeBSD
6.2-PRERELEASE (2007-01-10). The kernel is SMP, with DEVICE_POLLING and
HZ=100 set. Each disk is split into a 2GB slice 1, used as a member of a
4-disk geom_mirror containing /, and a slice 2 holding the remainder,
used as a member of a 4-disk geom_raid5 (32768-byte stripe size). The box
is designed to receive daily backups from production servers, so data
integrity is preferred over throughput or latency.
All works beautifully until several dozen gigabytes have been transferred
to or from the filesystem with safeop enabled. At that point the
filesystem quickly grows less responsive and eventually ceases
responding entirely (processes stuck in diskwait). CPU usage is at 0%,
and all four members of the raid5 are being read constantly at around
160kB/sec each (16kB/t, 10tps). It does not recover on its own within 72
hours. The mirror is unaffected by this behavior.
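As a quick sanity check of those figures, here is a minimal Python sketch. It assumes "kB" means 1024 bytes and that the rate stays constant for the whole stall; neither was measured precisely, and the variable names are only illustrative.

```python
# Figures quoted above; treating "kB" as 1024 bytes is an assumption.
KIB = 1024
transfer_size = 16 * KIB      # 16kB per transfer
transfers_per_sec = 10        # 10 tps
disks = 4                     # members of the raid5
stall_hours = 72              # how long the stall was observed

per_disk_rate = transfer_size * transfers_per_sec   # bytes/sec per disk
aggregate_rate = per_disk_rate * disks              # bytes/sec over all members
total_read = aggregate_rate * stall_hours * 3600    # bytes read during the stall

print(f"per disk:  {per_disk_rate // KIB} kB/sec")
print(f"aggregate: {aggregate_rate // KIB} kB/sec")
print(f"read in {stall_hours} h: {total_read / 2**30:.1f} GiB")
```

Under those assumptions the members together are read at 640kB/sec, which comes to roughly 158 GiB over 72 hours, well short of one full pass over the 1.1T provider.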
When this occurs, the moment safeop is disabled on the raid5, all the
problems cease: the filesystem begins responding and the programs resume.
Is this intentional, an artifact of the hardware or layout I'm using, or
could this be indicative of an obscure bug somewhere? Can I provide any
additional information which would assist in tracking this down?
----
[cyberleo at mikayla ~]$ gmirror list
Geom name: root
State: COMPLETE
Components: 4
Balance: load
Slice: 4096
Flags: NONE
GenID: 0
SyncID: 2
ID: 89087781
Providers:
1. Name: mirror/root
Mediasize: 1610563584 (1.5G)
Sectorsize: 512
Mode: r1w1e1
Consumers:
1. Name: ad0s1a
Mediasize: 1610564096 (1.5G)
Sectorsize: 512
Mode: r1w1e1
State: ACTIVE
Priority: 0
Flags: DIRTY
GenID: 0
SyncID: 2
ID: 3326083319
2. Name: ad2s1a
Mediasize: 1610564096 (1.5G)
Sectorsize: 512
Mode: r1w1e1
State: ACTIVE
Priority: 0
Flags: DIRTY
GenID: 0
SyncID: 2
ID: 1957052293
3. Name: ad4s1a
Mediasize: 1610564096 (1.5G)
Sectorsize: 512
Mode: r1w1e1
State: ACTIVE
Priority: 0
Flags: DIRTY
GenID: 0
SyncID: 2
ID: 3131999117
4. Name: ad6s1a
Mediasize: 1610564096 (1.5G)
Sectorsize: 512
Mode: r1w1e1
State: ACTIVE
Priority: 0
Flags: DIRTY
GenID: 0
SyncID: 2
ID: 2209607005
[cyberleo at mikayla ~]$ graid5 list
Geom name: raid5
State: COMPLETE CALM
Status: Total=4, Online=4
Type: AUTOMATIC
Pending: (wqp 0 // 0)
Stripesize: 32768
MemUse: 3467264 (msl 138)
Newest: -1
ID: 3906282509
Providers:
1. Name: raid5/raid5
Mediasize: 1193822846976 (1.1T)
Sectorsize: 512
Mode: r1w1e1
Consumers:
1. Name: ad6s2
Mediasize: 397940981760 (371G)
Sectorsize: 512
Mode: r2w2e2
DiskNo: 3
Error: No
2. Name: ad4s2
Mediasize: 397940981760 (371G)
Sectorsize: 512
Mode: r2w2e2
DiskNo: 2
Error: No
3. Name: ad2s2
Mediasize: 397940981760 (371G)
Sectorsize: 512
Mode: r2w2e2
DiskNo: 1
Error: No
4. Name: ad0s2
Mediasize: 397940981760 (371G)
Sectorsize: 512
Mode: r2w2e2
DiskNo: 0
Error: No
----
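The sizes in the listings above can be cross-checked with a little arithmetic. The following Python sketch only compares the reported numbers. Reading the gaps as per-member metadata or rounding is an assumption on my part, not something the output states.

```python
# Sizes copied from the gmirror/graid5 listings above.
STRIPE = 32768                   # Stripesize reported by graid5
SECTOR = 512                     # Sectorsize reported by both classes

raid5_consumer = 397940981760    # each of ad0s2, ad2s2, ad4s2, ad6s2
raid5_provider = 1193822846976   # raid5/raid5
raid5_disks = 4

mirror_consumer = 1610564096     # each of ad0s1a, ad2s1a, ad4s1a, ad6s1a
mirror_provider = 1610563584     # mirror/root

# RAID5 spends one disk's worth of space on parity,
# so at most (n - 1) disks' worth can hold data.
raid5_upper_bound = (raid5_disks - 1) * raid5_consumer
raid5_gap = raid5_upper_bound - raid5_provider
mirror_gap = mirror_consumer - mirror_provider

print(raid5_gap, raid5_gap // STRIPE)    # gap in bytes, gap in stripes
print(mirror_gap, mirror_gap // SECTOR)  # gap in bytes, gap in sectors
```

The raid5 provider is 98304 bytes (three stripes) smaller than three full consumers, and the mirror provider is 512 bytes (one sector) smaller than a consumer. Both gaps are tiny and regular, so the reported sizes look consistent with the layout described.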
--
Fuzzy love,
-CyberLeo
Technical Administrator
CyberLeo.Net Webhosting
http://www.CyberLeo.Net
<CyberLeo at CyberLeo.Net>
Furry Peace! - http://www.fur.com/peace/