geom_raid5 livelock?

Fri Jan 12 20:29:18 UTC 2007

R. B. Riddick wrote:
> --- CyberLeo Kitsana <cyberleo at cyberleo.net> wrote:
>> I've been making use of the geom_raid5 class for FreeBSD 6.2-PRERELEASE. 
>> I've noticed an odd behavior lately, with the latest sources provided 
>> (as of 2007-01-10) in which, with 'safeop' enabled, the raid5 will stop 
>> responding.
>>
> Ohoh...
> 
>> All works beautifully, until several dozen gigabytes are transferred to 
>> or from the filesystem with safeop enabled, at which point the 
>> filesystem will grow quickly less responsive, and eventually cease 
>> responding entirely (processes stuck in diskwait). CPU usage is at 0%, 
>> and all four members of the raid5 are being read at around 160kB/sec 
>> (16kB/t, 10tps) constantly. It does not naturally recover within 72 
>> hours. The mirror is unaffected by this behavior.
>>
> Strange... :-)
> SAFEOP means that a failed disk leads to an I/O error for every request, and
> that every read request reads all corresponding disk areas (if possible) and
> checks parity. SAFEOP mode is useful if you want to be sure that neither
> your disks nor your operating system provide bogus data... SAFEOP mode causes
> a lot of disk activity (e.g. for a sequential read, it reads the whole stripe
> (n blocks) n-1 times, where n is the disk count of the RAID5).
> This special form of read request is also used by the rebuild procedure, so
> it should work fine...
> 
> What does "graid5 list" say at those times?
> Are there any special messages logged via syslog at those times?
> 
>> When this occurs, the moment safeop is disabled on the raid5, all the 
>> problems cease, the filesystem begins responding and the programs resume.
>>
> So the kernel does not panic or so...? :-)
> 
>> Is this intentional, an artifact of the hardware or layout I'm using, or 
>> could this be indicative of an obscure bug somewhere? Can I provide any 
>> additional information which would assist in tracking this down?
>>
> I would guess: An obscure bug in graid5...
> 
> You could try to put gcache between the disks and graid5...
> And the syslog messages (if there are any) would be very interesting (like
> messages about read errors or disk failures or so)...
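
The SAFEOP read described in the quoted reply can be sketched roughly as
follows. This is a minimal illustration in Python, assuming plain XOR parity
over the stripe; xor_blocks, safe_read and block_reads_per_stripe are made-up
names, and this is not the graid5 code itself.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def safe_read(stripe, want):
    """Hypothetical verified read of one block.

    stripe: the n blocks of one stripe (n-1 data blocks plus 1 parity block),
            with None standing in for a member that could not be read.
    want:   index of the block the caller actually asked for.
    """
    # A failed member turns every request into an I/O error.
    if any(block is None for block in stripe):
        raise IOError("member unavailable, verified read refused")
    # With XOR parity, data blocks XOR parity block must be all zero bytes.
    if any(xor_blocks(stripe)):
        raise IOError("parity mismatch")
    return stripe[want]


def block_reads_per_stripe(n):
    """Block reads needed to read one stripe's n-1 data blocks sequentially,
    when each of those reads fetches all n blocks of the stripe."""
    return (n - 1) * n
```

With four disks, as in this array, reading the three data blocks of one stripe
costs 12 block reads instead of 3, which fits the "lot of disk activity"
mentioned above. It does not by itself explain a complete stall.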

Hence the testing. I wish to ensure there are as few problems as 
possible when this is put into production.

The kernel does not panic, and everything resumes just fine as soon as 
safeop is disabled. There are no new messages in the kernel log, nor in 
syslog, and all disks are operating properly, if slowly (around 40 
kilobytes per second each, with dd bs=4096).

Also, gcache doesn't seem to exist in the base system. Is it easier to 
build than gjournal? (Patching for that caused a whole ton of kernel and 
world problems I'd rather not revisit.)

The attached list was taken during the most recent lockup.

----
[cyberleo at mikayla ~]$ graid5 list
Geom name: raid5
State: COMPLETE CALM (safeop)
Status: Total=4, Online=4
Type: AUTOMATIC
Pending: (wqp 0 // 0)
Stripesize: 32768
MemUse: 147456 (msl 7)
Newest: -1
ID: 3906282509
Providers:
1. Name: raid5/raid5
    Mediasize: 1193822846976 (1.1T)
    Sectorsize: 512
    Mode: r1w1e1
Consumers:
1. Name: ad6s2
    Mediasize: 397940981760 (371G)
    Sectorsize: 512
    Mode: r2w2e2
    DiskNo: 3
    Error: No
2. Name: ad4s2
    Mediasize: 397940981760 (371G)
    Sectorsize: 512
    Mode: r2w2e2
    DiskNo: 2
    Error: No
3. Name: ad2s2
    Mediasize: 397940981760 (371G)
    Sectorsize: 512
    Mode: r2w2e2
    DiskNo: 1
    Error: No
4. Name: ad0s2
    Mediasize: 397940981760 (371G)
    Sectorsize: 512
    Mode: r2w2e2
    DiskNo: 0
    Error: No
----
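
As a quick sanity check on the listing above, the reported sizes can be
cross-checked with a few lines of Python. The constants are copied from the
output; the helper names are made up for this sketch. Why one stripe per
member is held back is not stated in the thread (space for on-disk metadata
would be one plausible reading, but that is an assumption).

```python
# Figures copied from the "graid5 list" output above.
STRIPE_SIZE = 32768
DISKS = 4
CONSUMER_SIZE = 397940981760
PROVIDER_SIZE = 1193822846976


def usable_per_member(provider_size, disks):
    """RAID5 spends one member's worth of space on parity, so the provider
    size is spread over disks - 1 members."""
    return provider_size // (disks - 1)


def held_back_per_member(consumer_size, provider_size, disks):
    """Bytes of each member that do not show up in the provider size."""
    return consumer_size - usable_per_member(provider_size, disks)


def throughput_kb_per_s(kb_per_transfer, transfers_per_s):
    """Per-disk throughput from transfer size and transfer rate."""
    return kb_per_transfer * transfers_per_s


print(usable_per_member(PROVIDER_SIZE, DISKS))                    # 397940948992
print(held_back_per_member(CONSUMER_SIZE, PROVIDER_SIZE, DISKS))  # 32768
print(CONSUMER_SIZE % STRIPE_SIZE)                                # 0
print(throughput_kb_per_s(16, 10))                                # 160
```

So each member contributes its size minus exactly one 32768-byte stripe, the
members are a whole number of stripes long, and 16kB/t at 10tps matches the
160kB/sec per disk seen during the stall.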

--
Fuzzy love,
-CyberLeo
Technical Administrator
CyberLeo.Net Webhosting
http://www.CyberLeo.Net
<CyberLeo at CyberLeo.Net>

Furry Peace! - http://www.fur.com/peace/