Constant minor ZFS corruption

Chris Forgeron cforgeron at acsi.ca
Tue Mar 15 13:13:43 UTC 2011


Hey,
 
- If you run -CURRENT in a semi-production environment, you don't build daily or weekly: you build once, test a lot, and then don't budge until you need to.  There are plenty of bug fixes, but if they are not in the parts you're using, it usually won't matter. That's why I recommend a very minimal kernel, to reduce exposure to bugs.  At the same time, I'm finding 9 to be shaping up nicely. This isn't like a beta from other sources; it's actually quite usable thanks to all the hard work from committers.  Stick with gcc as your compiler for now, and you should be fine.  At the state you're in right now, I don't think you could be more unstable. :-)

- I found better ZFS performance in 9.0-CURRENT than in 8.2-PRE back in Dec 2010. The difference wasn't huge, but it was enough for me to brave the waters of -CURRENT.

- 2 GB is very low, but you're right, it won't cause corruption. If anything, it means lower performance and a higher chance of a panic. Try that system with 8 GB, and you'll notice the difference for random I/O after the ARC fills up.

- I'm not familiar with the PIKE card. Do you have enough SATA ports on the motherboard to connect a few drives directly and see if your issues go away? We use the SuperMicro AOC LSI 2008 cards, and they look to be working well for us so far.
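
One rough way to make that before-and-after comparison concrete is to look at the per-device CKSUM column in `zpool status`. Here's a minimal Python sketch, assuming the usual five-column device rows (NAME STATE READ WRITE CKSUM) with plain integer counters. The sample text, pool name, and device names are made up for illustration, and `cksum_errors` is a hypothetical helper, not part of any ZFS tooling.

```python
import re

# Hypothetical sample in the shape of `zpool status` output. The pool
# name, device names, and counts are invented for illustration only.
SAMPLE = """\
  pool: tank
 state: ONLINE
config:

    NAME        STATE     READ WRITE CKSUM
    tank        ONLINE       0     0     0
      raidz2-0  ONLINE       0     0     0
        da0     ONLINE       0     0     3
        da1     ONLINE       0     0     0

errors: No known data errors
"""

# A device row: indented name, state, then three integer counters.
# Assumption: counters are plain integers (abbreviated values such as
# "1.2K" would not match and are silently skipped by this sketch).
ROW = re.compile(r"^\s+(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def cksum_errors(status_text):
    """Return {device: cksum_count} for rows with a nonzero CKSUM column."""
    bad = {}
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m and int(m.group(5)) > 0:
            bad[m.group(1)] = int(m.group(5))
    return bad


print(cksum_errors(SAMPLE))
```

Run against saved `zpool status` output from before and after moving the drives, this would show whether the checksum errors follow the disks or stay with the controller.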

-----Original Message-----
From: smckay at internode.on.net [mailto:smckay at internode.on.net] On Behalf Of Stephen McKay
Sent: Thursday, March 10, 2011 7:20 PM
To: Chris Forgeron
Cc: Stephen McKay; Mark Felder; freebsd-fs at freebsd.org
Subject: Re: Constant minor ZFS corruption 

On Thursday, 10th March 2011, Chris Forgeron wrote:

>You know,  I've had better luck with v28 and FreeBSD-9-CURRENT.  Make a 
>very minimal compile, test it well, and you should be fine. I just 
>upgraded my last 8.2 v14 ZFS FreeBSD system earlier this week, so I'm 
>now 9-Current with v28 across the board. The only issue I've found so 
>far is a small oddity with displaying files across ZFS, but pjd has 
>already patched that in r219404. (I'm about to test it now)

We are OK using -current if we really have to, but would prefer to stick with an official release (maybe with one or two hand-rolled patches if they are important enough).

We've already noticed the -current "upgrade treadmill", having to build a new kernel every day of our testing because important bug fixes are arriving.  And in the end, we saw no difference in behaviour, so -current doesn't fix our problems.

It's important to test -current, but not in production. :-)

>Oh - and you're AMD64, correct, not i386? I think we (royal we) should 
>remove support for i386 in ZFS, it has never been stable for me, and I 
>see a lot of grief about it on the boards.  I also think you need 8 GB 
>of RAM to play seriously. I've had reasonable success with 4GB and a 
>light load, but any serious file traffic needs 8GB of breathing room as 
>ZFS gobbles up the RAM in a very aggressive manner.

Yes, we are running the amd64 kernel.  Currently we're low on memory
(2GB) because I swapped out the RAM, but that, again, didn't affect our failures.

>Lastly, check what Mike Tancsa said about his hardware - All of my gear 
>is quality,  1000W dual redundant power supplies, LSI SAS controllers, 
>ECC registered ram, no overclocking, etc, etc.  You may have a software 
>issue, but it's more likely that ZFS is just exposing some instability 
>in your system. Has your RAM checked out with a Memtest run overnight? 
>We're talking small, intermittent errors here, not big red flags that 
>will be obvious to spot.

The ASUS PIKE2008 card is LSI based.  Our RAM is ECC.  We're not overclocking (in fact I disabled turbo-boost).  We haven't run memtest but we have done a few "make buildworld" runs.  All of these completed without error.  And with ECC RAM, we should see log messages if anything is wrong there anyway.

We have tried to buy quality hardware.  At least, we didn't deliberately skimp (except to build our own box vs buy a big name brand pre-built ZFS server).

We're starting to get suspicious of the PIKE card though.  Is there anyone here who is using an ASUS PIKE2008 (as opposed to other LSI SAS 2008 cards)?  We're kinda wishing we'd gotten an older PIKE 1068E instead...

Cheers,

Stephen.

