fsdb&smartctl&/var/log/messages

Dmitry Lunts eingorn777 at gmail.com
Mon Jul 12 00:25:41 UTC 2010


Hello, all!
The question is as follows.
1) When I try to upgrade a port, or merely run the command pkgdb -uF,
I get the error:
Input/output error - /var/db/pkg/kdeutils-3.5.10_5/+CONTENTS

2) $ sudo cat /var/log/messages | grep DMA | tail -2
gives:
Jul 12 03:07:06 dim007 kernel: ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=4007967
Jul 12 03:07:09 dim007 kernel: ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=4007967
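
To see every distinct LBA the kernel has complained about, rather than only the last two lines, the log can be filtered as in the sketch below. The function name failed_lbas is made up for illustration, and it matches any line containing LBA=<number>, not only READ_DMA failures.

```sh
#!/bin/sh
# Print each distinct number that follows "LBA=" on lines read from stdin.
# Hypothetical helper, not part of any FreeBSD tool.
failed_lbas() {
    sed -n 's/.*LBA=\([0-9][0-9]*\).*/\1/p' | sort -un
}

# Demo on the two log lines quoted in this message.
printf '%s\n' \
    'Jul 12 03:07:06 dim007 kernel: ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=4007967' \
    'Jul 12 03:07:09 dim007 kernel: ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=4007967' \
    | failed_lbas
```

On the real system the input would be /var/log/messages, for example failed_lbas < /var/log/messages, run as a user allowed to read the log.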

3) From the output of
$ cat /etc/fstab
and
$ sudo bsdlabel ad6s1:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a:  1048576        0    4.2BSD     2048 16384     8
  b:  1048576  1048576      swap
  c: 100663227        0    unused        0     0         # "raw" part, don't edit
  d:  2097152  2097152    4.2BSD     2048 16384 28552
  e:   655360  4194304    4.2BSD     2048 16384 40968
  f: 37748736  4849664    4.2BSD     2048 16384 28552
  g: 58064827 42598400    4.2BSD     2048 16384 28552
I can conclude that LBA=4007967 falls into the /dev/ad6s1d partition, which is
mounted as /var.
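
The lookup behind that conclusion can be sketched as follows. The sizes and offsets are copied from the bsdlabel output above; SLICE_START=63 is an assumption (it is the value subtracted in the offset calculation in this message, and should be confirmed with fdisk), and find_partition is a hypothetical helper name.

```sh
#!/bin/sh
# Assumed start sector of slice ad6s1 on the disk; verify with fdisk.
SLICE_START=63

# "name size offset" per line, copied from the bsdlabel output
# (the raw partition c is left out on purpose).
LABEL='a 1048576 0
b 1048576 1048576
d 2097152 2097152
e 655360 4194304
f 37748736 4849664
g 58064827 42598400'

# Print the letter of the partition that contains the given absolute LBA.
find_partition() {
    rel=$(( $1 - SLICE_START ))   # sector number relative to the slice
    echo "$LABEL" | while read -r name size offset; do
        if [ "$rel" -ge "$offset" ] && [ "$rel" -lt $(( offset + size )) ]; then
            echo "$name"
        fi
    done
}

find_partition 4007967   # prints: d
find_partition 4007996   # prints: d
```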

Next:
4) $ sudo fsdb -r /dev/ad6s1d
** /dev/ad6s1d (NO WRITE)
Examining file system `/dev/ad6s1d'
Last Mounted on /var
current inode: directory
I=2 MODE=40755 SIZE=512
        BTIME=Jan  1 15:07:34 2009 [0 nsec]
        MTIME=Jul 12 03:03:19 2010 [0 nsec]
        CTIME=Jul 12 03:03:19 2010 [0 nsec]
        ATIME=Jul 10 01:09:22 2010 [0 nsec]
OWNER=root GRP=wheel LINKCNT=27 FLAGS=0 BLKCNT=4 GEN=5e655284

The offset of the bad sector within the /dev/ad6s1d (i.e., /var) partition is:
bad LBA - 63 - offset of /dev/ad6s1d = 4007967 - 63 - 2097152 = 1910752
(see the output of bsdlabel above).
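
The same subtraction as a small shell sketch, with a range check added so that an LBA outside the partition is rejected. The helper name lba_to_part_offset is hypothetical, and the 63 passed as the slice start is taken from the calculation above rather than read from the disk.

```sh
#!/bin/sh
# Convert an absolute LBA to a sector offset inside a partition.
# Arguments: absolute LBA, slice start, partition offset in the slice,
#            partition size in sectors.
lba_to_part_offset() {
    off=$(( $1 - $2 - $3 ))
    if [ "$off" -lt 0 ] || [ "$off" -ge "$4" ]; then
        echo "LBA $1 is outside the partition" >&2
        return 1
    fi
    echo "$off"
}

# Partition d: offset 2097152, size 2097152 (from bsdlabel).
lba_to_part_offset 4007967 63 2097152 2097152   # prints: 1910752
```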

5) Searching for the inode:
fsdb (inum: 2)> findblk 1910752
1910752: data block of inode 117934

6) Searching for the file:
$ sudo find /var -inum 117934
/var/db/pkg/kdeutils-3.5.10_5/+CONTENTS
which corresponds exactly to the error message from pkgdb -uF (see item 1
above).

On the other hand, the following script runs without errors:
$ export i=4007967
$ while [ $i -lt 4007976 ] # checking 9 sectors
> do echo $i
> dd if=/dev/ad6 of=/dev/null bs=512 count=1 skip=$i
> let i+=1
> done
4007967
1+0 records in
1+0 records out
512 bytes transferred in 0.008722 secs (58702 bytes/sec)
4007968
<SKIPPED>
So, no errors.
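
A variant of that loop as a function, shown only as a sketch: it uses POSIX arithmetic instead of let and reports the result per sector instead of relying on dd's own messages. The name scan_sectors is made up, and the 512-byte sector size is the one used in the commands above.

```sh
#!/bin/sh
# Read COUNT consecutive 512-byte sectors from DEV, starting at START,
# and print "<sector> ok" or "<sector> FAILED" for each one.
# Arguments: DEV START COUNT
scan_sectors() {
    dev=$1
    i=$2
    end=$(( $2 + $3 ))
    while [ "$i" -lt "$end" ]; do
        # dd exits non-zero when the read fails (e.g. Input/output error).
        if dd if="$dev" of=/dev/null bs=512 count=1 skip="$i" 2>/dev/null; then
            echo "$i ok"
        else
            echo "$i FAILED"
        fi
        i=$(( i + 1 ))
    done
}

# Example (needs read access to the raw device, so run it as root):
# scan_sectors /dev/ad6 4007967 10
```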

7) Moreover, the following commands report a different bad LBA:
$ sudo smartctl -t long /dev/ad6
$ sudo smartctl -l selftest /dev/ad6
smartctl 5.39.1 2010-01-28 r3054 [FreeBSD 7.3-RELEASE-p1 i386] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      7376         4007996
<SKIPPED>
8) In this case LBA_of_first_error=4007996 (not 4007967!),
which differs from the bad sector number previously found in
/var/log/messages.
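
For comparing runs, the LBA_of_first_error column can be pulled out of the self-test log with a short filter. This is a sketch that assumes the log layout shown above: entry lines start with "#", the LBA is the last field, and entries without an error show "-" there. The name first_error_lbas is hypothetical.

```sh
#!/bin/sh
# Print the last field of every self-test log entry that recorded an LBA.
first_error_lbas() {
    awk '/^#/ && $NF != "-" { print $NF }'
}

# Demo on the entry quoted in this message.
printf '%s\n' \
    '# 1  Extended offline    Completed: read failure       90%      7376         4007996' \
    | first_error_lbas
```

On the real system the input would come from sudo smartctl -l selftest /dev/ad6 piped into first_error_lbas.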

9) Again, trying to read the bad sectors:
$ export i=4007996
$ while [ $i -lt 4008006 ] # again checking 10 sectors
> do echo $i
> dd if=/dev/ad6 of=/dev/null bs=512 count=1 skip=$i
> let i+=1
> done
4007996
dd: /dev/ad6: Input/output error
0+0 records in
0+0 records out
0 bytes transferred in 2.704641 secs (0 bytes/sec)
4007997
<SKIPPED>

10) And what's more:
the offset of LBA_of_first_error within /dev/ad6s1d is
4007996 - 63 - 2097152 = 1910781,
and fsdb's findblk returns nothing for it:
fsdb (inum: 2)> findblk 1910781
fsdb (inum: 2)>
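
One piece of arithmetic that may be worth checking here: using the bsize (16384) and fsize (2048) columns that bsdlabel shows for partition d, the sketch below works out which 16384-byte span and which 2048-byte fragment each of the two offsets falls in. These are plain divisions from the start of the partition, not necessarily the addressing fsdb uses internally, and locate is a made-up helper name. Under these assumptions both offsets fall in the same 16384-byte span, 29 sectors apart; the sketch cannot tell whether that accounts for the two different LBAs.

```sh
#!/bin/sh
SECTOR=512
BSIZE=16384   # bsize column for partition d in the bsdlabel output
FSIZE=2048    # fsize column for partition d in the bsdlabel output

# Print the 16384-byte span and 2048-byte fragment that contain a
# partition-relative sector number.
locate() {
    spb=$(( BSIZE / SECTOR ))   # sectors per block: 32
    spf=$(( FSIZE / SECTOR ))   # sectors per fragment: 4
    echo "sector $1: block $(( $1 / spb )) (+$(( $1 % spb )) sectors), fragment $(( $1 / spf ))"
}

locate 1910752   # sector 1910752: block 59711 (+0 sectors), fragment 477688
locate 1910781   # sector 1910781: block 59711 (+29 sectors), fragment 477695
```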

So, the puzzle is:
on the one hand, pkgdb -uF, /var/log/messages and fsdb all point to the same
bad sector (4007967) and to the file that sector belongs to
(and even cat /var/db/pkg/kdeutils-3.5.10_5/+CONTENTS returns an Input/output
error),
yet a low-level read of that sector with dd completes without any sign of error.
On the other hand, the smartctl long test, run immediately after the low-level
read test with dd,
reports a different bad sector (4007996), which in turn doesn't
belong to any file.

And what leaves me completely lost is that
the tests in items 1)-10) were repeated twice and gave the same
results!
So where did the 29 (4007996-4007967) sectors go?
Could anyone give me a hint as to where I'm wrong?
TIA,
Dmitry


-- 
Best regards, Dmitry
email: eingorn777 at gmail.com


More information about the freebsd-fs mailing list