9-stable: one-device ZFS fails [was: 9-stable: geli + one-disk ZFS fails]
Arno J. Klaassen
arno at heho.snv.jussieu.fr
Sun Feb 19 16:55:46 UTC 2012
A follow-up to myself:
> Hello,
>
> Martin Simmons <martin at lispworks.com> writes:
>
>> Some random ideas:
>>
>> 1) Can you dd the whole of ada0s3.eli without errors?
>>
>> 2) If you scrub a few more times, does it find the same number of errors each
>> time and are they always in that XNAT.tar file?
>>
>> 3) Can you try zfs without geli?
>
>
> Yeah, and it seems to rule out geli:
>
> [ split the original /dev/ada0s3 into the two equally sized slices
> /dev/ada0s3 and /dev/ada0s4 ]
>
> geli init /dev/ada0s3
> geli attach /dev/ada0s3
>
> zpool create zgeli /dev/ada0s3.eli
>
> zfs create zgeli/home
> zfs create zgeli/home/arno
> zfs create zgeli/home/arno/.priv
> zfs create zgeli/home/arno/.scito
> zfs set copies=2 zgeli/home/arno/.priv
> zfs set atime=off zgeli
>
>
> [put some files on it, wait a little : ]
>
>
> [root at cc ~]# zpool status -v
> pool: zgeli
> state: ONLINE
> status: One or more devices has experienced an error resulting in data
> corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
> entire pool from backup.
> see: http://www.sun.com/msg/ZFS-8000-8A
> scan: scrub in progress since Sat Feb 18 17:46:54 2012
> 425M scanned out of 2.49G at 85.0M/s, 0h0m to go
> 0 repaired, 16.64% done
> config:
>
>         NAME          STATE   READ WRITE CKSUM
>         zgeli         ONLINE     0     0     1
>           ada0s3.eli  ONLINE     0     0     2
>
> errors: Permanent errors have been detected in the following files:
>
> /zgeli/home/arno/8.0-CURRENT-200902-amd64-livefs.iso
> [root at cc ~]# zpool scrub -s zgeli
> [root at cc ~]#
>
>
> [ then the same, directly on the next slice ]
>
> zpool create zgpart /dev/ada0s4
>
> zfs create zgpart/home
> zfs create zgpart/home/arno
> zfs create zgpart/home/arno/.priv
> zfs create zgpart/home/arno/.scito
> zfs set copies=2 zgpart/home/arno/.priv
> zfs set atime=off zgpart
>
> [put some files on it, wait a little : ]
>
> pool: zgpart
> state: ONLINE
> status: One or more devices has experienced an error resulting in data
> corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
> entire pool from backup.
> see: http://www.sun.com/msg/ZFS-8000-8A
> scan: scrub repaired 0 in 0h0m with 1 errors on Sat Feb 18 18:04:45 2012
> config:
>
>         NAME        STATE   READ WRITE CKSUM
>         zgpart      ONLINE     0     0     1
>           ada0s4    ONLINE     0     0     2
>
> errors: Permanent errors have been detected in the following files:
>
> /zgpart/home/arno/.scito/ ....
> [root at cc ~]#
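Both single-vdev runs above end with nonzero CKSUM counters. To compare counts across repeated scrub passes (Martin's question 2), a short awk filter over saved `zpool status -v` output does the bookkeeping; a minimal sketch, using the zgeli status quoted above as sample input:

```shell
# Extract per-device checksum-error counts from saved `zpool status -v`
# output; the sample text here is the zgeli status block quoted above.
status='NAME         STATE  READ WRITE CKSUM
zgeli        ONLINE     0     0     1
ada0s3.eli   ONLINE     0     0     2'

# Print "device cksum-count" for every ONLINE line (header has STATE in $2).
printf '%s\n' "$status" | awk '$2 == "ONLINE" { print $1, $5 }'
```

Running the same extraction after each scrub pass shows whether the counters keep growing and whether the same files reappear in the error list.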
I tested a bit more this afternoon:

- zpool create zgpart /dev/ada0s4d => KO
- split ada0s4 into two equally sized partitions and then
  zpool create zgpart mirror /dev/ada0s4d /dev/ada0s4e => works like a charm .....
( [root at cc /zgpart]# zpool status -v zgpart
   pool: zgpart
  state: ONLINE
   scan: scrub repaired 0 in 0h36m with 0 errors on Sun Feb 19 17:20:34 2012
 config:

        NAME          STATE   READ WRITE CKSUM
        zgpart        ONLINE     0     0     0
          mirror-0    ONLINE     0     0     0
            ada0s4d   ONLINE     0     0     0
            ada0s4e   ONLINE     0     0     0

 errors: No known data errors )
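For reference, the `ada0s4d`/`ada0s4e` names imply the slice was split with a BSD label; the label would have looked roughly like this (the sector counts here are illustrative, not the exact ones used):

```
# bsdlabel /dev/ada0s4 -- slice split into two equal partitions d and e
8 partitions:
#          size     offset    fstype
  c:   104857600          0    unused    # "raw" partition, whole slice
  d:    52428800          0    unused
  e:    52428800   52428800    unused
```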
FYI, best, Arno
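P.S. As a quick sanity check on the raw-read test quoted further down: dd's record count and byte count are self-consistent, so the whole slice really was read back without errors:

```shell
# The dd run below reported 103746636 records at bs=4096 and
# 424946221056 bytes transferred; verify that the two figures agree.
records=103746636
bs=4096
echo $((records * bs))   # 424946221056, matching dd's byte count
```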
>
> I still do not particularly suspect the disk, since I cannot reproduce
> similar behaviour on UFS.
>
> That said, this disk is supposed to be a 'hybrid SSD'; maybe there is
> something special about it that ZFS doesn't like? :
>
>
> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
> ada0: <ST95005620AS SD23> ATA-8 SATA 2.x device
> ada0: Serial Number 5YX0J5YD
> ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> ada0: Command Queueing enabled
> ada0: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
> ada0: Previously was known as ad4
> GEOM: new disk ada0
>
>
> Please let me know what further information to provide.
>
> Best,
>
> Arno
>
>
>
>
>> 4) Is the slice/partition layout definitely correct?
>>
>> __Martin
>>
>>
>>>>>>> On Mon, 13 Feb 2012 23:39:06 +0100, Arno J Klaassen said:
>>>
>>> hello,
>>>
>>> in the hope of attracting some interest in this issue:
>>>
>>> I updated to today's -stable, tested with vfs.zfs.debug=1
>>> and vfs.zfs.prefetch_disable=0, no difference.
>>>
>>> I also tested to read the raw partition :
>>>
>>> [root at cc /usr/ports]# dd if=/dev/ada0s3 of=/dev/null bs=4096 conv=noerror
>>> 103746636+0 records in
>>> 103746636+0 records out
>>> 424946221056 bytes transferred in 13226.346738 secs (32128768 bytes/sec)
>>> [root at cc /usr/ports]#
>>>
>>> The disk is brand new and looks OK; either my setup is bad or there is
>>> a bug somewhere. I can play around with this box for some more time, so
>>> please feel free to send me hints on what would be useful
>>> to you.
>>>
>>> Best,
>>>
>>> Arno
>>>
>>>
>>> "Arno J. Klaassen" <arno at heho.snv.jussieu.fr> writes:
>>>
>>> > Hello,
>>> >
>>> >
>>> > I finally decided to 'play' a bit with ZFS on a notebook, some years
>>> > old, but I installed a brand new disk and memtest passes OK.
>>> >
>>> > I installed base+ports on partition 2, using 'classical' UFS.
>>> >
>>> > I encrypted partition 3 (geli) and created a single zpool on it
>>> > containing four ZFS file systems:
>>> >
>>> > [root at cc ~]# zfs list
>>> > NAME                      USED  AVAIL  REFER  MOUNTPOINT
>>> > zfiles                   10.7G   377G   152K  /zfiles
>>> > zfiles/home              10.6G   377G   119M  /zfiles/home
>>> > zfiles/home/arno         10.5G   377G  2.35G  /zfiles/home/arno
>>> > zfiles/home/arno/.priv    192K   377G   192K  /zfiles/home/arno/.priv
>>> > zfiles/home/arno/.scito  8.18G   377G  8.18G  /zfiles/home/arno/.scito
>>> >
>>> >
>>> > I exported the ZFS file systems via NFS and, from another machine,
>>> > rsynced a backup of my current notebook (geli + UFS, (almost) the same
>>> > 9-stable version, no problems) onto them.
>>> >
>>> >
>>> > Quite soon, I saw this on the notebook:
>>> >
>>> >
>>> > [root at cc /usr/temp]# zpool status -v
>>> > pool: zfiles
>>> > state: ONLINE
>>> > status: One or more devices has experienced an error resulting in data
>>> > corruption. Applications may be affected.
>>> > action: Restore the file in question if possible. Otherwise restore the
>>> > entire pool from backup.
>>> > see: http://www.sun.com/msg/ZFS-8000-8A
>>> > scan: scrub repaired 0 in 0h1m with 11 errors on Sat Feb 11 14:55:34
>>> > 2012
>>> > config:
>>> >
>>> >         NAME          STATE   READ WRITE CKSUM
>>> >         zfiles        ONLINE     0     0    11
>>> >           ada0s3.eli  ONLINE     0     0    23
>>> >
>>> > errors: Permanent errors have been detected in the following files:
>>> >
>>> > /zfiles/home/arno/.scito/contrib/XNAT.tar
>>> > [root at cc /usr/temp]# md5 /zfiles/home/arno/.scito/contrib/XNAT.tar
>>> > md5: /zfiles/home/arno/.scito/contrib/XNAT.tar: Input/output error
>>> > [root at cc /usr/temp]#
>>> >
>>> >
>>> > As said, memtest is OK, nothing is logged to the console, UFS on the
>>> > same disk works fine (I did some tests copying and comparing random
>>> > data), and smartctl also seems to trust the disk:
>>> >
>>> > SMART Self-test log structure revision number 1
>>> > Num  Test_Description   Status                   Remaining  LifeTime(hours)
>>> > # 1  Extended offline   Completed without error     00%         388
>>> > # 2  Short offline      Completed without error     00%         387
>>> >
>>> >
>>> > Am I doing something wrong? Otherwise, please let me know what extra
>>> > info I could provide to help solve this (dmesg.boot at the end of this
>>> > mail).
>>> >
>>> > Thanx a lot in advance,
>>> >
>>> > best, Arno
>>> >
>>> >
>>> >
> _______________________________________________
> freebsd-fs at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"