slowdown of zfs (tx->tx)
Nicolas Rachinsky
fbsd-mas-0 at ml.turing-complete.org
Wed Jan 9 16:26:16 UTC 2013
* Artem Belevich <art at freebsd.org> [2013-01-08 12:47 -0800]:
> On Tue, Jan 8, 2013 at 9:42 AM, Nicolas Rachinsky
> <fbsd-mas-0 at ml.turing-complete.org> wrote:
> >   NAME                      STATE     READ WRITE CKSUM
> >   pool1                     DEGRADED     0     0     0
> >     raidz2-0                DEGRADED     0     0     0
> >       ada5                  ONLINE       0     0     0
> >       ada8                  ONLINE       0     0     0
> >       ada2                  ONLINE       0     0     0
> >       ada3                  ONLINE       0     0     0
> >       11846390416703086268  UNAVAIL      0     0     0  was /dev/dsk/ada1
> >       ada6                  ONLINE       0     0     0
> >       ada0                  ONLINE       0     0     1
> >       ada7                  ONLINE       0     0     0
> >       ada4                  ONLINE       0     0     3
>
> You seem to have some checksum errors, which does suggest hardware trouble.
I somehow missed these. Is there any way to learn when these checksum
errors happen?
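One possible approach, as a rough sketch rather than a tested tool: poll
`zpool status` at intervals and note the time whenever a CKSUM counter
changes. The Python below covers only the parsing step. The function name
and the sample text are illustrative assumptions, and it assumes the
NAME STATE READ WRITE CKSUM column order of the listing above.

```python
def cksum_counts(status_text):
    """Return {device: count} for rows whose CKSUM counter is nonzero.

    Assumes rows of the form: NAME STATE READ WRITE CKSUM [notes...].
    Counters abbreviated with a suffix (such as "1.2K") are skipped
    by this simple sketch.
    """
    counts = {}
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # too short to be a device row
        # READ, WRITE and CKSUM must all be plain integers;
        # this also filters out the header row.
        if not all(f.isdigit() for f in fields[2:5]):
            continue
        cksum = int(fields[4])
        if cksum > 0:
            counts[fields[0]] = cksum
    return counts


# Illustrative excerpt in the same shape as the listing above.
SAMPLE = """\
NAME STATE READ WRITE CKSUM
pool1 DEGRADED 0 0 0
11846390416703086268 UNAVAIL 0 0 0 was /dev/dsk/ada1
ada0 ONLINE 0 0 1
ada7 ONLINE 0 0 0
ada4 ONLINE 0 0 3
"""

print(cksum_counts(SAMPLE))  # {'ada0': 1, 'ada4': 3}
```

Fed with live `zpool status` output at regular intervals (from cron, for
example), a change between two consecutive results would bound the time
window in which a counter increased.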
> For starters, check smart info for all drives and see if they have any
> relocated sectors.
There are some disks with relocated sectors, but for both ada0 and
ada4 Reallocated_Sector_Ct is 0.
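For comparing several drives, a minimal sketch that pulls selected raw
values out of `smartctl -A` output. It assumes the usual ten-column
attribute table; the function name is hypothetical and the sample rows are
made up for illustration, not taken from this machine. Besides
Reallocated_Sector_Ct, the attributes Current_Pending_Sector and
UDMA_CRC_Error_Count may be worth a look, the latter because it tends to
point at cabling rather than the disk surface.

```python
def smart_raw_values(smart_text, names):
    """Return {attribute_name: raw_value} for the requested attributes.

    Assumes the attribute table printed by `smartctl -A`, where each row
    starts with a numeric ID and RAW_VALUE is the tenth column.
    """
    found = {}
    for line in smart_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in names:
            raw = fields[9]
            # Some raw values carry extra text; keep those as strings.
            found[fields[1]] = int(raw) if raw.isdigit() else raw
    return found


# Made-up sample rows, for illustration only.
SMART_SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       12
"""

WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector",
           "UDMA_CRC_Error_Count"}

print(smart_raw_values(SMART_SAMPLE, WATCHED))
```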
> Use gstat during your workload to see if any of the drives takes much
> longer than others to handle its job.
There is one disk sticking out a bit.
> > There is almost no disk activity during this time.
>
> What kind of disk activity *is* there?
What would be interesting?
> > sync is disabled for the whole pool.
>
> If that's the case (assuming you're talking about sync=disabled zfs
> property), then synchronous writes are probably not the cause of
> slowdown. My guess would be either failing HDD or something funky with
> cabling or sata controller.
Yes, sync=disabled for pool1.
Ok, I will start swapping hardware (sadly the machine is quite a drive
away).
Thank you very much for your help.
Nicolas
--
http://www.rachinsky.de/nicolas