The need for initialising disks before use?
brooks at one-eyed-alien.net
Sat Aug 19 02:52:16 UTC 2006
On Fri, Aug 18, 2006 at 01:41:27PM -1000, Antony Mawer wrote:
> On 18/08/2006 4:29 AM, Brooks Davis wrote:
> >On Fri, Aug 18, 2006 at 09:19:04AM -0500, Kirk Strauser wrote:
> >>On Thursday 17 August 2006 8:35 am, Antony Mawer wrote:
> >>>A quick question - is it recommended to initialise disks before using
> >>>them to allow the disks to map out any "bad spots" early on?
> >>Note: once you actually start seeing bad sectors, the drive is
> >>almost dead. A drive can remap a pretty large number internally, but
> >>once that pool is exhausted (and the number of errors is still growing
> >>exponentially), there's not a lot of life left.
> >There are some exceptions to this. The drive cannot remap a sector
> >which fails to read. You must perform a write to cause the remap to
> >occur. If you get a hard write failure it's game over, but read failures
> >aren't necessarily a sign the disk is hopeless. For example, the drive
> >I've had in my laptop for most of the last year developed a three-sector
> >error within a week or so of arrival. After dd'ing zeros over the
> >problem sectors I've had no further problems.
> This is what prompted it -- I've been seeing lots of drives that are
> showing up with huge numbers of read errors - for instance:
> >Aug 19 04:02:27 server kernel: ad0: FAILURE - READ_DMA
> >status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=66293984
> >Aug 19 04:02:27 server kernel:
> >g_vfs_done():ad0s1f[READ(offset=30796791808, length=16384)]error = 5
> >Aug 19 04:02:31 server kernel: ad0: FAILURE - READ_DMA
> >status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=47702304
> >Aug 19 04:02:31 server kernel:
> >g_vfs_done():ad0s1f[READ(offset=21277851648, length=16384)]error = 5
> >Aug 19 04:02:36 server kernel: ad0: FAILURE - READ_DMA
> >status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=34943296
> >Aug 19 04:02:36 server kernel:
> >g_vfs_done():ad0s1f[READ(offset=14745239552, length=16384)]error = 5
> >Aug 19 04:03:08 server kernel: ad0: FAILURE - READ_DMA
> >status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=45514848
> >Aug 19 04:03:08 server kernel:
> >g_vfs_done():ad0s1f[READ(offset=20157874176, length=16384)]error = 5
> I have /var/log/messages flooded with incidents of these "FAILURE -
> READ_DMA" messages. I've seen it on more than one machine with
> relatively "young" drives.
> I'm trying to determine if running a dd if=/dev/zero over the whole
> drive prior to use will help reduce the incidence of this, or if it is
> likely that these are developing after the initial install, in which
> case this will make negligible difference...
I really don't know. The only way I can think of to find out is to own
a large number of machines and perform an experiment. We (the general
computing public) don't have the kind of models needed to really say
anything definitive. Drives are too darn opaque.
> Once I do start seeing these, is there an easy way to:
> a) determine what file/directory entry might be affected?
Not easily, but this question has been asked and answered on the mailing
lists recently (I don't remember the answer, but I think there were some
ports that can help).
> b) dd if=/dev/zero over the affected sectors only, in order to
> trigger a sector remapping without nuking the whole drive
You can use src/tools/tools/recoverdisk to refresh all of the disk
except the parts that don't work and then use dd and the console error
output to do the rest.
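
As a rough illustration of the "dd and the console error output" step, here
is a minimal Python sketch that pulls the LBA and request length out of
messages like the ones quoted above and prints a candidate dd command for
each. It rests on several assumptions: 512-byte sectors, each kernel message
joined back onto one line, and the reported LBA being the start of the failed
request rather than the exact bad sector. For the quoted numbers, LBA * 512
minus the g_vfs_done offset comes out to the same constant (3145728000) every
time, which is at least consistent with that reading. The function names are
made up for this example. The commands are only printed, never run: writing
zeros destroys whatever was in those sectors, so check each command by hand
first.

```python
import re

SECTOR_SIZE = 512  # assumption: the drive uses 512-byte sectors

# Two of the failures quoted above, with each wrapped message joined
# back onto a single line.
LOG = """\
ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=66293984
g_vfs_done():ad0s1f[READ(offset=30796791808, length=16384)]error = 5
ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=47702304
g_vfs_done():ad0s1f[READ(offset=21277851648, length=16384)]error = 5
"""

FAIL_RE = re.compile(r"(\w+): FAILURE - READ_DMA.*?LBA=(\d+)")
VFS_RE = re.compile(r"READ\(offset=(\d+), length=(\d+)\)")


def failed_reads(log):
    """Pair each READ_DMA failure with the g_vfs_done line that follows it."""
    results = []
    pending = None
    for line in log.splitlines():
        match = FAIL_RE.search(line)
        if match:
            pending = (match.group(1), int(match.group(2)))
            continue
        match = VFS_RE.search(line)
        if match and pending is not None:
            disk, lba = pending
            offset, length = int(match.group(1)), int(match.group(2))
            results.append({
                "disk": disk,
                "lba": lba,
                "count": length // SECTOR_SIZE,
                # Byte offset of the partition on the disk, if the LBA
                # really is the start of the failed request.
                "partition_start": lba * SECTOR_SIZE - offset,
            })
            pending = None
    return results


def dd_command(entry):
    """Build (but do not run) a dd command that zeros the failed request."""
    return "dd if=/dev/zero of=/dev/{disk} bs={bs} seek={lba} count={count}".format(
        bs=SECTOR_SIZE, **entry
    )


if __name__ == "__main__":
    for entry in failed_reads(LOG):
        print(dd_command(entry))
```

Each printed command covers the whole 16384-byte request (32 sectors), since
the log alone does not say which sector inside it is bad. Reading the sectors
one at a time first would narrow that down and cost less data.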
> c) depending on where that sector is allocated, I presume I'm
> either going to end up with:
> i) zero'd bytes within a file (how can I tell which?!)
> ii) a destroyed inode
> iii) ???
Presumably it will be one of i, ii or a mangled superblock. Off the top
of my head, I don't know how you'd tell which. This is one of the
reasons I think Sun is on the right track with ZFS's checksum-everything
approach. At least that way you actually know when something goes wrong.
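
To make the checksum idea concrete, here is a toy Python sketch. It is not
how ZFS does it; the SHA-256 hash, the 16384-byte block size and the function
names are all choices made for this example. It stores a checksum per block
and later reports which blocks no longer match, which is exactly the
information you lack after zeroing a sector on a filesystem without
checksums.

```python
import hashlib

BLOCK_SIZE = 16384  # illustrative; matches the read size in the logs above


def block_checksums(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 hex digest per block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def damaged_blocks(data, expected, block_size=BLOCK_SIZE):
    """Return the indexes of blocks whose checksum no longer matches."""
    actual = block_checksums(data, block_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


if __name__ == "__main__":
    original = bytes(range(256)) * 256           # 65536 bytes, four blocks
    sums = block_checksums(original)             # recorded when data is written

    corrupted = bytearray(original)
    corrupted[BLOCK_SIZE:2 * BLOCK_SIZE] = bytes(BLOCK_SIZE)  # zero block 1

    print(damaged_blocks(bytes(corrupted), sums))
```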
> Any thoughts/comments/etc appreciated...
> How do other operating systems handle this - Windows, Linux, Solaris,
> MacOSX ...? I would have hoped this would be a condition the OS would
> make some attempt to trigger a sector remap... or are OSes typically
> ignorant of such things?
The OS is generally unaware of such events, except to the extent that
it knows a fatal read error occurred or, in the case of write failures,
that it reads the SMART data from the drive.