Large discrepancy in reported disk usage on USR partition

Brendan Hart brendanh at strategicecommerce.com.au
Thu Oct 30 17:45:53 PDT 2008


>> I took a look at using the smart tools as you suggested, but have now 
>> found that the disk in question is a RAID1 set on a DELL PERC 3/Di 
>> controller and smartctl does not appear to be the correct tool to 
>> access the SMART data for the individual disks.  After a little 
>> research, I have found the aaccli tool and used it to get the following
>> information:

> Sadly, that controller does not show you SMART attributes.  This is one of
> the biggest problems with the majority (but not all) of hardware RAID 
> controllers -- they give you no access to disk-level things like SMART.
> FreeBSD has support for such (using CAM's pass(4)), but the driver has
> to support/use it, *and* the card firmware has to support it.  At present,
> Areca, 3Ware, and Promise controllers support such; HighPoint might, but 
> I haven't confirmed it.  Adaptec does not.

> What you showed tells me nothing about SMART, other than the remote
> possibility it's basing some of its decisions on the "general SMART health
> status", which means jack squat.  I can explain why this is if need be, but
> it's not related to the problem you're having.

Thanks for this additional information. I hadn't understood that there was
far more information behind the simple SMART ok/not ok reported by the PERC
controller.
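
For the benefit of anyone else reading the archives: on the controllers you
mention as supporting pass-through, I gather smartctl can address the member
disks directly. A rough sketch of the invocations, untested on my hardware
(the device names and disk numbers are examples only, not from my system):

```shell
# Plain CAM pass(4) device, where the driver exposes one
smartctl -a /dev/pass0

# Disk 0 behind a 3ware controller
smartctl -a -d 3ware,0 /dev/twe0

# Disk 1 behind an Areca controller
smartctl -a -d areca,1 /dev/arcmsr0
```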

> Either way, this is just one of many reasons to avoid hardware RAID
> controllers if given the choice.

I have seen mentions of using gvinum and/or gmirror to protect against a
single disk being a single point of failure, which I believe is the reason
most people, myself included, have specified hardware RAID in their servers.
Is this what you mean by avoiding hardware RAID?
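
From the handbook, my understanding is that a software RAID1 set would be
built along these lines (a sketch only; the provider names da0/da1 and the
mirror name gm0 are examples, not my actual configuration):

```shell
# Load the gmirror kernel module and label a two-disk mirror
gmirror load
gmirror label -v -b round-robin gm0 /dev/da0 /dev/da1

# Create a filesystem on the resulting mirror provider and mount it
newfs /dev/mirror/gm0
mount /dev/mirror/gm0 /mnt
```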


> I hope these are SCSI disks you're showing here, otherwise I'm not sure how
> the controller is able to get the primary defect count of a SATA or SAS
> disk.  So, assuming the numbers shown are accurate, then yes, I don't think
> there's any disk-level problem.

Yes, they are SCSI disks. Not particularly relevant to this topic, but
interesting: I would have thought that SAS would expose the same information
as parallel SCSI, since it is a serial evolution of the SCSI command set. Is
this thinking incorrect?

> I understand at this point you're running around with your arms in the air,
> but you've already confirmed one thing: none of your other systems exhibit
> this problem.  If this is a production environment, step back a moment and
> ask yourself: "just how much time is this worth?"  It might be better to
> just newfs the filesystem and be done with it, especially if this is a
> one-time-never-seen-before thing.

>> I will wait and see if any other list member has any suggestions for 
>> me to try, but I am now leaning toward scrubbing the system. Oh well.

> When you say scrubbing, are you referring to actually formatting/wiping
> the system, or are you referring to disk scrubbing?

I meant reformatting and reinstalling, as a way to escape the issue without
spending much more time on it. I would of course like to understand the
problem so as to know what to avoid in the future, but as you point out
above, time is money and we are rapidly approaching the point where it isn't
worth any more effort.

Thanks for all your help.

Best Regards,
Brendan Hart




More information about the freebsd-questions mailing list