9-stable : geli + one-disk ZFS fails
Arno J. Klaassen
arno at heho.snv.jussieu.fr
Wed Feb 15 14:54:25 UTC 2012
Hello,
Martin Simmons <martin at lispworks.com> writes:
> Some random ideas:
>
> 1) Can you dd the whole of ada0s3.eli without errors?
[root at cc ~]# dd if=/dev/ada0s3.eli of=/dev/null bs=4096 conv=noerror
103746635+0 records in
103746635+0 records out
424946216960 bytes transferred in 18773.796016 secs (22635072 bytes/sec)
[root at cc ~]#
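A quick sanity check on the two dd runs in this thread (the raw /dev/ada0s3 read is quoted further down): the record counts differ by exactly one 4096-byte block. That is what one would expect if geli keeps its metadata in the last sector of the provider and the .eli device uses a 4096-byte sector size. The sector size is an assumption here; it is not stated anywhere in the thread. A minimal sketch of the arithmetic:

```shell
#!/bin/sh
# Record counts copied from the two dd transcripts in this thread (bs=4096).
bs=4096
raw_records=103746636   # dd if=/dev/ada0s3
eli_records=103746635   # dd if=/dev/ada0s3.eli

raw_bytes=$((raw_records * bs))
eli_bytes=$((eli_records * bs))
diff_bytes=$((raw_bytes - eli_bytes))

echo "raw:  $raw_bytes bytes"
echo "eli:  $eli_bytes bytes"
echo "diff: $diff_bytes bytes"
```

Both byte totals match what dd reported, so neither read was cut short by an I/O error.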
> 2) If you scrub a few more times, does it find the same number of errors each
> time and are they always in that XNAT.tar file?
It looks like each scrub makes the situation worse:
[root at cc ~]# zpool status -v
pool: zfiles
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scan: scrub repaired 148K in 0h14m with 26 errors on Mon Feb 13 18:54:33 2012
config:
NAME STATE READ WRITE CKSUM
zfiles ONLINE 0 0 26
ada0s3.eli ONLINE 0 0 87
errors: Permanent errors have been detected in the following files:
[ 11 files ]
[root at cc ~]# zpool status -v
pool: zfiles
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scan: scrub in progress since Wed Feb 15 14:36:52 2012
17.7G scanned out of 28.7G at 72.1M/s, 0h2m to go
0 repaired, 61.56% done
config:
NAME STATE READ WRITE CKSUM
zfiles ONLINE 0 0 54
ada0s3.eli ONLINE 0 0 143
errors: Permanent errors have been detected in the following files:
[ 11 files ]
[root at cc ~]# zpool status -v
pool: zfiles
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scan: scrub repaired 4K in 0h7m with 70 errors on Wed Feb 15 14:43:57 2012
config:
NAME STATE READ WRITE CKSUM
zfiles ONLINE 0 0 96
ada0s3.eli ONLINE 0 0 228
errors: Permanent errors have been detected in the following files:
[ 25 files (I cannot quickly see whether they include all 11 old files) ]
[root at cc ~]#
[root at cc ~]# zpool status -v
pool: zfiles
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scan: scrub repaired 0 in 0h6m with 70 errors on Wed Feb 15 15:19:28 2012
config:
NAME STATE READ WRITE CKSUM
zfiles ONLINE 0 0 166
ada0s3.eli ONLINE 0 0 368
errors: Permanent errors have been detected in the following files:
[ 25 files ]
[root at cc ~]#
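To compare scrubs more easily, the CKSUM column can be pulled out of each status report. The helper below is only a sketch: cksum_counts is a hypothetical name, and it assumes the five-column config table shown in the transcripts above. Note that the counters appear to accumulate between scrubs (96 - 26 = 70 and 166 - 96 = 70, matching the "70 errors" in the last two scan lines), so it is the per-scrub difference that matters, not the running total.

```shell
#!/bin/sh
# cksum_counts (hypothetical helper): read `zpool status` output on stdin
# and print "name cksum" for each row of the config table.
cksum_counts() {
    awk '
        # The header row starts the table.
        $1 == "NAME" && $NF == "CKSUM" { in_table = 1; next }
        # Table rows have five fields, the last one a count.
        in_table && NF == 5 && $5 ~ /^[0-9]/ { print $1, $5; next }
        # Anything else (blank line, "errors:" line) ends the table.
        in_table { in_table = 0 }
    '
}
```

It would be used as `zpool status zfiles | cksum_counts` after each scrub, saving the output to diff against the previous run.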
> 3) Can you try zfs without geli?
>
> 4) Is the slice/partition layout definitely correct?
>
> __Martin
>
>
>>>>>> On Mon, 13 Feb 2012 23:39:06 +0100, Arno J Klaassen said:
>>
>> hello,
>>
>> hoping to eventually attract some interest in this issue:
>>
>> I updated to today's -stable, tested with vfs.zfs.debug=1
>> and vfs.zfs.prefetch_disable=0, no difference.
>>
>> I also tested reading the raw partition:
>>
>> [root at cc /usr/ports]# dd if=/dev/ada0s3 of=/dev/null bs=4096 conv=noerror
>> 103746636+0 records in
>> 103746636+0 records out
>> 424946221056 bytes transferred in 13226.346738 secs (32128768 bytes/sec)
>> [root at cc /usr/ports]#
>>
>> The disk is brand new and looks OK, so either my setup is wrong or there
>> is a bug somewhere. I can play around with this box for some more time;
>> please feel free to give me hints on what I could do that would be
>> useful to you.
>>
>> Best,
>>
>> Arno
>>
>>
>> "Arno J. Klaassen" <arno at heho.snv.jussieu.fr> writes:
>>
>> > Hello,
>> >
>> >
>> > I finally decided to 'play' a bit with ZFS on a notebook, some years
>> > old, but I installed a brand new disk and memtest passes OK.
>> >
>> > I installed base+ports on partition 2, using 'classical' UFS.
>> >
>> > I encrypted partition 3 with geli and created a single zpool on it
>> > containing 4 ZFS file systems:
>> >
>> > [root at cc ~]# zfs list
>> > NAME USED AVAIL REFER MOUNTPOINT
>> > zfiles 10.7G 377G 152K /zfiles
>> > zfiles/home 10.6G 377G 119M /zfiles/home
>> > zfiles/home/arno 10.5G 377G 2.35G /zfiles/home/arno
>> > zfiles/home/arno/.priv 192K 377G 192K /zfiles/home/arno/.priv
>> > zfiles/home/arno/.scito 8.18G 377G 8.18G /zfiles/home/arno/.scito
>> >
>> >
>> > I exported the ZFS file systems via NFS and, from the other machine,
>> > rsynced a backup of my current notebook (geli + UFS, (almost) the same
>> > 9-stable version, no problems) to them.
>> >
>> >
>> > Quite quickly, I saw this on the notebook:
>> >
>> >
>> > [root at cc /usr/temp]# zpool status -v
>> > pool: zfiles
>> > state: ONLINE
>> > status: One or more devices has experienced an error resulting in data
>> > corruption. Applications may be affected.
>> > action: Restore the file in question if possible. Otherwise restore the
>> > entire pool from backup.
>> > see: http://www.sun.com/msg/ZFS-8000-8A
>> > scan: scrub repaired 0 in 0h1m with 11 errors on Sat Feb 11 14:55:34 2012
>> > config:
>> >
>> > NAME STATE READ WRITE CKSUM
>> > zfiles ONLINE 0 0 11
>> > ada0s3.eli ONLINE 0 0 23
>> >
>> > errors: Permanent errors have been detected in the following files:
>> >
>> > /zfiles/home/arno/.scito/contrib/XNAT.tar
>> > [root at cc /usr/temp]# md5 /zfiles/home/arno/.scito/contrib/XNAT.tar
>> > md5: /zfiles/home/arno/.scito/contrib/XNAT.tar: Input/output error
>> > [root at cc /usr/temp]#
>> >
>> >
>> > As I said, memtest is OK, nothing is logged to the console, UFS on the
>> > same disk works OK (I did some tests copying and comparing random data),
>> > and smartctl seems to trust the disk as well:
>> >
>> > SMART Self-test log structure revision number 1
>> > Num Test_Description Status Remaining LifeTime(hours)
>> > # 1 Extended offline Completed without error 00% 388
>> > # 2 Short offline Completed without error 00% 387
>> >
>> >
>> > Am I doing something wrong? Please let me know what extra info I could
>> > provide to help solve this (dmesg.boot at the end of this mail).
>> >
>> > Thanks a lot in advance,
>> >
>> > best, Arno
>> >
>> >
>> >