Help! zpool corrupted!

Paul Wootton paul at fletchermoorland.co.uk
Tue Mar 3 02:07:25 PST 2009


On Monday 02 March 2009 21:45:35 Cache wrote:
> I have FreeBSD 8.0-CURRENT r188913M (amd64) on notebook with dual core
> Turion and 4G of RAM. Disk controller is AMD SB600. Single HDD - SATA-150
> 250G WD2500BEVS, ad4 at ata2, formatted with one ufs root partition 256M
> (ad4s1a) and ZFS pool (ver. 6 from FreeBSD 7-STABLE, ad4s1d) on rest of the
> disk. On ZFS pool I have about ten datasets: /root /usr /home /usr/src etc.
>
> Now zpool status reports "One or more devices has experienced an error...".
> When I run a scrub, I see many errors in the pool. Every scrub after a reboot
> reports a different number of errors: 47, 176, or ~24000. The disk and disk
> controller seem to be OK (checked with mhdd), but with hw.ata.ata_dma=1
> there are sometimes error messages on the console (something like 'DMA
> error'). Sorry, I can't quote them exactly; I didn't save them last time and
> am now trying to reproduce them.
>
> When I set hw.ata.ata_dma=0 in loader.conf, there are no errors on the
> console.
>
> With 'zfs mount -a' the terminal never returns to the command prompt, but the
> system does not freeze: typing is still echoed to the display and
> ctrl-alt-del reboots the system as expected. I tried to mount the datasets
> manually; the system hangs on /home and /usr.
>
> Does anybody know how I can restore those two datasets, or just make them
> temporarily accessible for retrieving data? Any HOWTOs? I have some important
> data and many polished app configs on /home, and I just don't want to install
> ~1000 ports one more time... And yes, I was stupid, because the last backup
> was a long time ago... :(
>

Hi,

I had a similar issue a while back, but with an AMD SB700 chipset. I was
running CURRENT with GPT and ZFS boot on a mirrored SATA setup.
What I noticed was that every time I booted I had silent data corruption on
one or other of the disks, usually at the start or end of the disk (the GPT
partition tables would get screwed up). There might have been other data
corruption, but I just destroyed the GPT tables and re-added the partition
back into the mirror.
I know that there are issues with the Silicon Image SATA chipsets, and I
wonder if the AMD chipset is based on the Silicon Image ones.

The only solution I found was to use a SATA controller that is not based on
Silicon Image (in my case I went for a JMicron).

If the SMART data from the drive does not show anything bad then I would
probably agree that the drive is OK.
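
As a rough sketch of how that check could be done: this assumes smartmontools
(sysutils/smartmontools) is installed and that the disk is still ad4 as in
your report. The smart_summary helper is just a name made up for illustration.
It pulls out the few attributes that usually matter. A non-zero
UDMA_CRC_Error_Count tends to point at the cable or controller rather than
the disk surface, which would fit with the DMA errors you saw.

```sh
#!/bin/sh
# Assumption: smartmontools is installed and the disk is ad4.
DISK=${DISK:-/dev/ad4}

# Hypothetical helper: reads "smartctl -A" output on stdin and prints the
# name and raw value (10th column) of the attributes worth looking at.
smart_summary() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" ||
         $2 == "UDMA_CRC_Error_Count" { print $2, $10 }'
}

# Only touch the hardware if smartctl and the device are actually present.
if command -v smartctl >/dev/null 2>&1 && [ -e "$DISK" ]; then
    smartctl -H "$DISK"                  # overall health self-assessment
    smartctl -A "$DISK" | smart_summary  # the attributes picked out above
    smartctl -l error "$DISK"            # the drive's own error log
fi
```

If the raw values are all zero and the error log is empty, the controller
looks like the more likely suspect than the drive.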

As for data recovery, I don't know, but CURRENT should not really be used in
production environments or to store critical or important data. While CURRENT
has been very stable for me (for the most part), you really are in the hands
of the FreeBSD gods, and I have been bitten a few times... but it's CURRENT so
I can't complain.
Even with FreeBSD 7, you still get a warning saying "WARNING: ZFS is
considered to be an experimental feature in FreeBSD."

Paul


More information about the freebsd-fs mailing list