ZFS: drive replacement performance
Freddie Cash
fjwcash at gmail.com
Tue Jul 7 20:54:02 UTC 2009
On Tue, Jul 7, 2009 at 12:56 PM, Mahlon E. Smith <mahlon at martini.nu> wrote:
> I've got a 9 sata drive raidz1 array, started at version 6, upgraded to
> version 13. I had an apparent drive failure, and then at some point, a
> kernel panic (unrelated to ZFS.) The reboot caused the device numbers
> to shuffle, so I did an 'export/import' to re-read the metadata and get
> the array back up.
>
This is why we've started using glabel(8) to label our drives, and then adding
the labels to the pool:
# zpool create store raidz1 label/disk01 label/disk02 label/disk03
That way, it doesn't matter where the kernel detects the drives or what the
physical device node is called: GEOM picks up the label, and ZFS uses the
label.
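The labeling step itself is just a glabel label run on each disk before
building the pool; something like this (disk01/ad4 and friends are only
example names here, use whatever matches your hardware):
# glabel label -v disk01 /dev/ad4
# glabel label -v disk02 /dev/ad6
# glabel label -v disk03 /dev/ad8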
> Once I swapped drives, I issued a 'zpool replace'.
>
See comment at the end: what's the replace command that you used?
>
> That was 4 days ago now. The progress in a 'zpool status' looks like
> this, as of right now:
>
> scrub: resilver in progress for 0h0m, 0.00% done, 2251h0m to go
>
> ... which is a little concerning, since a) it appears to have not moved
> since I started it, and b) I'm in a DEGRADED state until it finishes...
> if it finishes.
>
There's something wrong here. It definitely should be incrementing. Even
when we did the foolish thing of creating a 24-drive raidz2 vdev and had to
replace a drive, the progress bar did change. Never got above 39% as it
kept restarting, but it did increment.
>
> So, I reach out to the list!
>
> - Is the resilver progress notification in a known weird state under
> FreeBSD?
>
> - Anything I can do to kick this in the pants? Tuning params?
>
I'd redo the replace command, and check the output of "zpool status" to make
sure it's showing the proper device node rather than the random string of
numbers it's showing now.
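As a sketch, assuming da8 is the disk you swapped in (which is what your
status output below suggests), I'd expect something like this; if ZFS
complains that a replacement is already in progress, detaching the
half-attached new disk first should cancel it:
# zpool detach store da8
# zpool replace store 2025342973333799752 da8
(The long number is the GUID zpool status shows for the missing disk; zpool
replace accepts that in place of the old device name.)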
> - This was my first drive failure under ZFS -- anything I should have
> done differently? Such as NOT doing the export/import? (Not sure
> what else I could have done there.)
>
If you knew which drive it was, I'd have shut down the server and replaced
it, so that the drives came back up numbered correctly.
This happened to us once when I was playing around with simulating dead
drives (pulling drives) and rebooting. That's when I moved over to using
glabels.
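For what it's worth, after a reboot "glabel status" shows which physical
device each label ended up on, so the da(4) renumbering stops mattering:
# glabel status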
> % zpool status store
> pool: store
> state: DEGRADED
> status: One or more devices is currently being resilvered. The pool will
> continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
> scrub: resilver in progress for 0h0m, 0.00% done, 2251h0m to go
> config:
>
>         NAME                     STATE     READ WRITE CKSUM
>         store                    DEGRADED     0     0     0
>           raidz1                 DEGRADED     0     0     0
>             da0                  ONLINE       0     0     0  274K resilvered
>             da1                  ONLINE       0     0     0  282K resilvered
>             replacing            DEGRADED     0     0     0
>               2025342973333799752  UNAVAIL    3 4.11K     0  was /dev/da2
>               da8                ONLINE       0     0     0  418K resilvered
>             da2                  ONLINE       0     0     0  280K resilvered
>             da3                  ONLINE       0     0     0  269K resilvered
>             da4                  ONLINE       0     0     0  266K resilvered
>             da5                  ONLINE       0     0     0  270K resilvered
>             da6                  ONLINE       0     0     0  270K resilvered
>             da7                  ONLINE       0     0     0  267K resilvered
>
> errors: No known data errors
>
>
> -----------------------------------------------------------------------
>
>
> % zpool iostat -v
>                                capacity     operations    bandwidth
> pool                         used  avail   read  write   read  write
> -------------------------  -----  -----  -----  -----  -----  -----
> store                      1.37T  2.72T     49    106   138K   543K
>   raidz1                   1.37T  2.72T     49    106   138K   543K
>     da0                        -      -     15     62  1017K  79.9K
>     da1                        -      -     15     62  1020K  80.3K
>     replacing                  -      -      0    103      0  88.3K
>       2025342973333799752      -      -      0      0  1.45K    261
>       da8                      -      -      0     79  1.45K  98.2K
>     da2                        -      -     14     62   948K  80.3K
>     da3                        -      -     13     62   894K  80.0K
>     da4                        -      -     14     63   942K  80.3K
>     da5                        -      -     15     62   992K  80.4K
>     da6                        -      -     15     62  1000K  80.1K
>     da7                        -      -     15     62  1022K  80.1K
> -------------------------  -----  -----  -----  -----  -----  -----
>
That definitely doesn't look right. It should be showing the device name
there in the "replacing" section.
What's the exact "zpool replace" command that you used?
--
Freddie Cash
fjwcash at gmail.com