Panic on 7.4-RELEASE-p5

Mon Jan 30 01:41:52 UTC 2012

On Fri, Jan 27, 2012 at 01:29:23PM +0100, Peter Maloney wrote:
> On 01/27/2012 04:43 AM, Gary Palmer wrote:
> >
> >   After scanning selected spans, do NOT read-scan remainder of disk.
> > If Selective self-test is pending on power-up, resume after 0 minute delay.
> >
> > I noticed a while ago that there were some "bad" sectors on the disk, and
> > at the time they were under the swap partition if my math was correct,
> > and the box never swaps so it wasn't a problem.  I don't know if
> > the errors above are the same ones I saw earlier or not.
> >
> > There were no read or write errors on the console prior to the panic
> > earlier today.  In fact the previous output on the console relates to
> > the last reboot for a software upgrade (fixing some packages) 11
> > days prior.  The only thing in logs going back to November relating
> > to ad1 are boot messages.
> >
> > Thanks,
> >
> > Gary
> >
> 
> Disable your swap, and then write zeros to it to relocate the bad sectors.
> 
> in one shell:
> gstat -I 100ms -f da#p#
> 
> in another:
> swapoff /dev/da#p#
> sysctl kern.geom.debugflags=0x10
> dd if=/dev/zero of=/dev/da#p# bs=1M
> (eventually it stops saying end of device or no space left; at this
> point I am not sure if you should then continue writing where it stopped
> in 512 byte blocks, or if it wrote a partial 1M in the last 1M)
> 
> Watch first shell. If the speed goes up, settles at a certain number,
> then wildly goes down low and back up to that number, it is possibly
> working.
> 
> Then repeat. If the same wild fluctuations happen, then the drive didn't
> relocate enough, because it is trying to keep some semi-bad ones, or
> they are only bad when reading. If it is just settling at a speed and
> staying there, then it is probably successful. I don't know how reliable
> it is. I have found it to be 100% reliable in my testing though. But
> some/most disks lie to you on the "relocated sector count".
> 
> And then remount the swap and change that kernel parameter back.
> sysctl kern.geom.debugflags=0
> swapon /dev/da#p#
> 
> 
> Your relocated sector count:
> 
>   5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
> 
> 
> 
> However, this does not fix your disk. E.g., if you have heads grinding the
> platter, you have dust flying around, and your disk will get worse.
> 
> Be VERY careful using dd to write directly to disks. If you use the
> wrong slice, or you use the main device without slices and miscalculate,
> bad things happen. This is why that kernel parameter was set to stop you.

Hi Peter,

I did things a little differently.  When I checked swapinfo, it turned out I
had set the swap partition up purely to act as a dump device; it wasn't
in use as swap.  So I tested it:

# recoverdisk /dev/ad1s1b /dev/ad1s1b
        start    size     block-len state          done     remaining    % done
    628097024 1040384       1040384     0     629137408             0 100.00000
Completed

smartctl still reports:

  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
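
For anyone repeating this check, the raw value is the last number on that
line. A minimal sketch of pulling it out, assuming the ten-column attribute
table layout shown above (the helper name smart_raw is made up for
illustration and is not part of smartmontools):

```shell
# Print the raw value (10th column) of one SMART attribute, reading
# attribute-table lines in the layout quoted above from stdin.
# smart_raw is a hypothetical helper name, not a smartmontools command.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Sample line copied from the smartctl output in this message.
sample='  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0'

printf '%s\n' "$sample" | smart_raw Reallocated_Sector_Ct
```

On a live system the same function could be fed from "smartctl -A /dev/ad1"
before and after a test run, so the two raw values can be compared directly.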

I then did a read test across the whole disk, with no errors:

# recoverdisk /dev/ad1 /dev/null
        start    size     block-len state          done     remaining    % done
 120033640448  483328        483328     0  120034123776             0 100.00000
Completed
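
The numbers in the two recoverdisk runs also bear on the question of what
happens at the end of the device: in both cases the start column is an exact
multiple of 1 MiB and the last block is shorter, which is consistent with
1 MiB blocks and a device size that is not a multiple of 1 MiB. A small
sketch of that arithmetic, assuming a 1 MiB block size (split_blocks is a
hypothetical helper written for this illustration):

```shell
# Split a device size in bytes into the number of full blocks and the
# size of the trailing partial block. The 1 MiB block size is an
# assumption inferred from the output above, not read from the tool.
split_blocks() {
    size=$1
    bs=$2
    echo "$((size / bs)) $((size % bs))"
}

split_blocks 629137408 1048576      # "done" value for ad1s1b
split_blocks 120034123776 1048576   # "done" value for ad1
```

The first call prints "599 1040384" and the second "114473 483328", matching
the size column of the final line in each run above.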

Reallocated_Sector_Ct is still the same.

I dunno where the problems are/were, but apparently I cannot hit them now
through just reading the disk or writing to swap.

Thanks,

Gary