After a crash geom::graid5 may not resync correctly; geli complains about missing metadata

Joshua Dunham joshua.dunham at gmail.com
Thu Sep 25 05:20:09 UTC 2008


Hi Guys,

   So, I'm a graid5/geli newbie and I'm having some trouble. Here's the scoop.

Last week I had a perfectly running system: 6x 500GB WD SATA drives in a
geom::graid5 array, with geli managing the encryption layer, across two
SATA controllers. The controllers are the same ones as when the system
was working 100%, so I assume they aren't a source of conflict now.

My transfer rates started crawling, so I logged in over ssh and
immediately saw that one of the discs was throwing UDMA transfer errors.
I rebooted, changed the BIOS settings for the drive, and tried to boot
back into FreeBSD. The system then started freezing a few seconds after
beginning a resync. I declared the drive junk, shut the system down, and
ordered a spare. Once it arrived I swapped in the new drive, booted into
FreeBSD, prepared the disc for the array, and rebooted. The system
recognized the new drive as out of sync and immediately started
re-syncing it. After the resync finished I went to attach the encryption
layer, and here is where the trouble started: I get an error message
about missing metadata.

NAS:~# geli attach /dev/raid5/storage
Cannot read metadata from /dev/raid5/storage: Invalid argument.
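
Is it worth reading the provider's last sector directly to see whether
the geli metadata is actually gone? As I understand it geli keeps its
metadata in the provider's last sector, so something like this (a
sketch; the offset is computed from the 2500539187200-byte mediasize
reported by graid5 list below) should show the GEOM::ELI magic string,
mirroring the GEOM::RAID5 magic in the graid5 dump, if the metadata is
still intact:

NAS:~# dd if=/dev/raid5/storage bs=512 skip=$((2500539187200 / 512 - 1)) count=1 2>/dev/null | hd | head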

  In troubleshooting this problem I have learned about the geli dump
command (and that I should have kept a metadata backup), but it's
probably too late for that now. I have also heard from various sources
that it is very bad to swap the SATA cables on a RAID system. I don't
think I swapped any cables, but the BIOS has a section for defining the
system's hard drives for booting etc., and when you remove a drive, the
one below it (ad4, let's say) shifts up (to ad6, let's say). That has
probably happened here, but does it really kill the array? Since graid5
stores the disk number in its metadata, I'd assume it keeps the discs in
the correct order regardless of device names. And since geli sits on top
of the raid5, I'd also assume its metadata would not go missing as long
as the raid5 rebuilds correctly. Please help!
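
For the future (or in case I get this volume back), I understand from
the geli man page that it can save and restore its metadata sector
explicitly, along these lines (untested by me, so treat it as a sketch):

NAS:~# geli backup /dev/raid5/storage /root/storage.eli.backup
NAS:~# geli restore /root/storage.eli.backup /dev/raid5/storage

Of course the backup only helps if it was taken while things still
worked, which I never did. I've also read that some geli versions leave
an automatic metadata backup under /var/backups/ at init time; I don't
know whether mine did, but I'll check.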

Here are some stats about the geom::graid5 array if it's helpful.

NAS:~# graid5 list
Geom name: storage
State: COMPLETE CALM
Status: Total=6, Online=6
Type: AUTOMATIC
Pending: (wqp 0 // 0)
Stripesize: 131072
MemUse: 0 (msl 0)
Newest: -1
ID: 68917578
Providers:
1. Name: raid5/storage
Mediasize: 2500539187200 (2.3T)
Sectorsize: 512
Mode: r0w0e0
Consumers:
1. Name: ad20
Mediasize: 500107862016 (466G)
Sectorsize: 512
Mode: r1w1e1
DiskNo: 4
Error: No
2. Name: ad18
Mediasize: 500107862016 (466G)
Sectorsize: 512
Mode: r1w1e1
DiskNo: 1
Error: No
3. Name: ad16
Mediasize: 500107862016 (466G)
Sectorsize: 512
Mode: r1w1e1
DiskNo: 0
Error: No
4. Name: ad14
Mediasize: 500107862016 (466G)
Sectorsize: 512
Mode: r1w1e1
DiskNo: 5
Error: No
5. Name: ad12
Mediasize: 500107862016 (466G)
Sectorsize: 512
Mode: r1w1e1
DiskNo: 3
Error: No
6. Name: ad10
Mediasize: 500107862016 (466G)
Sectorsize: 512
Mode: r1w1e1
DiskNo: 2
Error: No

NAS:~# graid5 dump ad10
Metadata on ad10:
Magic string: GEOM::RAID5
Metadata version: 2
Device name: storage
Device ID: 68917578
Disk number: 2
Total number of disks: 6
Provider Size: 500107862016
Verified: -1
State: 0
Stripe size: 131072
Newest: 4294967295
NoHot: No
Hardcoded provider:

## The graid5 dump output for the other adXX devices looks exactly the
same apart from the 'Disk number:' line.
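
To double-check the ordering, this is roughly how I compared the dumps
(just grepping the one field that should differ):

NAS:~# for d in ad10 ad12 ad14 ad16 ad18 ad20; do
>   printf '%s: ' $d; graid5 dump $d | grep 'Disk number'
> done

The disk numbers 0 through 5 each appear exactly once across the six
discs, so graid5 at least agrees with itself about the ordering.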

   Any advice you guys can give to rescue the data would be soooo
appreciated, you have no idea.

         -Joshua

