LSI Trouble

Anton Nikiforov anton at nikiforov.ru
Mon Feb 20 13:05:25 PST 2006


Bakul Shah wrote:

> And TELL US WHAT WORKED (just on freebsd-hardware)!  That is
> your punishment for duplicate posting.

Hello all
Here we go, a report of getting my disks back alive :)

The situation:
The server went down and could not start up, claiming that the NVRAM
configuration differs from the DISK configuration of the LSI Logic RAID
controller and that it cannot resolve the mismatch.
There was no way to view the configuration on this computer, so I moved
the disks to another system with the same controller type.
Using the controller's software I found out that the NVRAM contains
some strange config while the disks contain NO CONFIG at all.

The solution steps
1. I reconnected the disks to a plain Adaptec SCSI controller.
2. After a while I found out that the disks contain clean FreeBSD
partitions.
3. I made a dd backup of all disks. That takes a long time, and I
found out that 3 out of 6 disks cannot do synchronous transfers at
320 MB/s, but only at 80.
4. After backing everything up I mounted the disks under /mnt and found
out that the disks that were RAID1 are just fine and I can access the
data with no problem.
5. The RAID5 disks contain some strange (strange for me) info: two 140GB
disks contain a partition of 280GB, and the third disk of this RAID5 set
contains only a 14MB (!!!) FreeBSD partition and some other data. As far
as I know the RAID5 algorithm, this is impossible: I should have 3 disks
with some data, maybe even with partitions, viewable with FreeBSD fdisk.
These two disks look exactly like software RAID0. And the third one may
contain parity (it contains nothing that looks like data; I tried to
analyze it), so this would be not a RAID5 set but RAID4 (which is
strange, since the controller does not support RAID4 ;) ).
6. Then I configured the RAID sets exactly as they were, connected my
data disks and tried to boot. But the controller started to complain
that the disks with ID 1, 3 and 5 are not present in the system. And
the spin-up time was about 20 minutes. In addition the controller
claimed "There is no LSI controller on the system"!!!
7. Watching how the LEDs were blinking I found out that the disk with
ID 4 had a delay and its LED was lit longer than the others'.
8. I removed this disk and the controller booted with no complaints.
In addition it rebuilt the RAID5 set onto the Hot Spare disk.
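
To make the RAID4/RAID5 question from step 5 concrete, here is a
minimal sketch in Python. It is only an illustration: the function
names are made up, the rotating RAID5 layout shown is just one common
convention, and the real on-disk layout of the LSI controller is not
known here. Note that the XOR check only confirms that a group of
blocks forms a parity set; on its own it cannot tell RAID4 from RAID5,
because in both cases the XOR of all members of a stripe is zero.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def stripe_is_consistent(blocks):
    """True if data blocks plus parity block XOR to all zeros."""
    return not any(xor_blocks(blocks))


def parity_disk(stripe, n_disks, level):
    """Index of the member that holds parity for a given stripe number.

    RAID4: parity always on the last member (dedicated parity disk).
    RAID5: parity rotates; this particular rotation is an ASSUMPTION,
    real controllers may use a different one.
    """
    if level == 4:
        return n_disks - 1
    if level == 5:
        return (n_disks - 1) - (stripe % n_disks)
    raise ValueError("only RAID levels 4 and 5 are handled")
```

With 3 disks, parity_disk() for RAID4 always returns member 2, while
for the assumed RAID5 rotation it walks through members 2, 1, 0 and
then starts over. A disk that never seems to hold readable data would
fit the first pattern.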

Summary:
1. It looks like the block size of the RAID affects only the cache and
other system characteristics, but not the data itself (in case I'm right
about RAID5 vs. RAID4. Otherwise I cannot understand why Strip size is
available for RAID1, where it makes no sense).
2. It looks like this controller claims that it supports RAID5 while it
is only RAID4.
3. When one disk becomes broken (and please note that I made a backup
of this disk with no problem on a different controller), the
controller can claim anything: that other disks are broken, that there
is no controller, or whatever else :) So do not believe the
controller messages :) This was checked on two systems with 3
controllers and the result was the same: while the LSIs were
complaining, the Adaptec just saw the disks and data.

ToDo:
Analyze RAID5 sets from LSI and, maybe (because I'm not a great C
coder), make some tool to get data from broken RAID sets.
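
As a possible starting point for such a tool, here is a rough Python
sketch that rebuilds one missing member of a single-parity set (RAID4
or RAID5) from dd images of the surviving members. It rests on
assumptions: the images are full-disk dumps of equal size, the set
uses plain XOR parity, and any controller metadata on the disks is
ignored. The function name and parameters are hypothetical.

```python
from contextlib import ExitStack


def rebuild_missing_member(surviving_paths, output_path, chunk_size=1 << 20):
    """Write the XOR of all surviving member images to output_path.

    In a single-parity set the missing member (data or parity) equals
    the XOR of all remaining members, block by block.
    """
    if not surviving_paths:
        raise ValueError("need at least one surviving member image")
    with ExitStack() as stack:
        files = [stack.enter_context(open(p, "rb")) for p in surviving_paths]
        out = stack.enter_context(open(output_path, "wb"))
        while True:
            chunks = [f.read(chunk_size) for f in files]
            if not any(chunks):
                break  # every image is exhausted
            if len({len(c) for c in chunks}) != 1:
                raise ValueError("member images differ in size")
            # XOR whole chunks at once via big integers (fast in CPython).
            acc = 0
            for c in chunks:
                acc ^= int.from_bytes(c, "little")
            out.write(acc.to_bytes(len(chunks[0]), "little"))
```

The rebuilt image is still one raw member. Putting the members back
together into a readable volume would additionally need the stripe
size and the parity layout, which are exactly the things still to be
analyzed.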

Best regards,
Anton Nikiforov