Resilver ZIL & Sequential Resilvering?

Lawrence K. Chen, P.Eng. lkchen at ksu.edu
Thu Aug 28 18:08:02 UTC 2014


As I'm suffering through the process of watching ZFS resilver, I wondered if there are
any updates coming to ZFS that would help in these two areas.

The first is resilvering a ZIL:

  scan: resilver in progress since Thu Aug 28 08:35:38 2014
        996G scanned out of 1.15T at 75.7M/s, 0h41m to go
        0 resilvered, 84.33% done
-----------
  scan: resilver in progress since Thu Aug 28 08:37:12 2014
        3.80T scanned out of 4.92T at 298M/s, 1h5m to go
        0 resilvered, 77.24% done
-----------
  scan: resilver in progress since Thu Aug 28 08:37:49 2014
        7.83T scanned out of 9.41T at 616M/s, 0h44m to go
        0 resilvered, 83.23% done

Replaced one of the SSDs for my mirrored ZIL this morning.

Why does it have to scan all the storage in the main pool to resilver the ZIL?
 Why does it have to resilver the ZIL at all?

Granted, it goes a whole lot faster than if I were replacing a disk in one of
the pools (which I recently did, twice, for the top pool; when it started it
showed estimates of 400+ hours, though in reality it took ~60 hours).
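
(For the record, the swap itself is just the usual replace against the log vdev;
something like the following, with made-up pool and device names:

    # zpool replace pool0 gpt/slog-old gpt/slog-new
    # zpool status pool0

The odd part is only that the resulting "resilver" insists on scanning the whole
pool while reporting 0 resilvered.)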

Which brings me to the second item.... sequential resilvering?

We were way behind on updates for our 7420 (Oracle ZFS Storage Appliance), and I
happened to spot sequential resilvering listed as a new feature.

The gist is that all the random I/O, especially at the beginning, is what makes
resilvering slow.  So this enhancement splits the resilver into two phases: the
first phase scans the metadata for all the blocks that need to be copied and sorts
them into LBA order, and the second phase then issues the actual reads and writes
in that sorted order, so the disks see largely sequential I/O instead of random seeks.

In the meantime... more slow resilvering is on my horizon.  Since I replaced one
SSD with a larger SSD, I'm going to want to replace the other.  Plus I'm thinking
of migrating to a new root pool, to see if it'll rid me of the "ZFS: i/o error
- all block copies unavailable" messages during boot... and to make some layout
changes.
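
(The root pool move, as I'm picturing it, would be roughly the usual send/receive
dance; everything below is a sketch with placeholder pool, dataset, and device
names, assuming a GPT layout booting via gptzfsboot:

    # zpool create -o altroot=/mnt newroot gpt/newroot0
    # zfs snapshot -r oldroot@migrate
    # zfs send -R oldroot@migrate | zfs recv -Fdu newroot
    # zpool set bootfs=newroot newroot    # or whichever dataset ends up holding /
    # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada2

...plus fixing up loader.conf / vfs.root.mountfrom before retiring the old pool.)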

Plus I need to see about getting the first pool to expand to its new
size... someday I'll need to figure something out for the other two pools
(there's one drive about to go in the second pool...).
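
If memory serves, once both drives in that pair are the larger ones, the expansion
itself should just be a matter of autoexpand, or an explicit online -e against the
now-larger devices (pool and device names below are placeholders):

    # zpool set autoexpand=on pool1
    # zpool online -e pool1 gpt/disk0 gpt/disk1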

All the hard drives were 512-byte sector drives, but only the first pool (a pair of
ST31500341AS drives) was made with ashift=12.  So, when one drive reported
imminent failure with the sudden reallocation of 3000+ sectors, which grew to
4000+ the next day, it got replaced the day after that... on a Sunday.  But it wasn't
that big a problem, as I had a pair of 3TB WD Reds that were supposed to go
somewhere else.  The second pool is a raidz of similar 1.5TB drives, and the third
pool is a raidz of Hitachi 2TB drives... so far it's only been the 1.5TB drives that
keep leaving me.
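
For anyone curious about their own pools, zdb will show the ashift of each vdev
straight from the cache file, and newer FreeBSD has a sysctl to force new vdevs to
4K alignment (sysctl name is from memory, so double-check it):

    # zdb | grep ashift
    # sysctl vfs.zfs.min_auto_ashift=12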

I suspect that at the time, I expected the drives in the first pool to fail and
that I would be upgrading to bigger 4K drives, while the other two
pools... well... the pool of 1.5TB drives was temporary, only it's not anymore... plus
I still have a few extras from other failed arrays.  And the 2TB pool was
probably left at ashift=9 because I knew it was going to hold lots of tiny files, etc.

The overhead of 4K versus 512-byte sectors is pretty huge for small files.  I
originally created my /poudriere space on the second pool (ashift=9); a ports
directory was about 450MB.  After moving things over to the first pool (ashift=12),
the same directories are now about 840MB.
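
The easy way to see that padding is to compare apparent size against what's
actually allocated, e.g. with du's -A flag on FreeBSD (path here is just an example):

    # du -sh  /poudriere/ports/default     # space actually allocated
    # du -shA /poudriere/ports/default     # apparent size of the files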

I guess things could've been worse....

I had noticed similar things when I changed the page size of an SQLite file from
1k to 4k.
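
For the record, changing the page size of an existing SQLite database takes a
pragma plus a VACUUM to rewrite the file (filename is just an example):

    $ sqlite3 some.db 'PRAGMA page_size = 4096; VACUUM;'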

-- 
Who: Lawrence K. Chen, P.Eng. - W0LKC - Sr. Unix Systems Administrator
For: Enterprise Server Technologies (EST) -- & SafeZone Ally

