some ZFS questions

Thu Aug 7 16:24:31 UTC 2014

On 2014.08.07 03:16, Scott Bennett wrote:
>      On Wed, 6 Aug 2014 03:49:37 -0500 Andrew Berg
> <aberg010 at my.hennepintech.edu> wrote:
>>On 2014.08.06 02:32, Scott Bennett wrote:
>>>      I have a number of questions that I need answered before I go about
>>> setting up any raidz pools.  They are in no particular order.
>>> 
>>> 	1) What is the recommended method of using geli encryption with
>>> 	ZFS?
>>
>>> Does one first create .eli devices and then specify those
>>> 	.eli devices in the zpool(8) command as the devices to include
>>> 	in the pool? 
>>This.
> 
>      Oh.  Well, that's doable, if not terribly convenient, but it brings up
> another question.  After a reboot, for example, what does ZFS do while the
> array of .eli devices is being attached one by one?  Does it see the first
> one attached without the others in sight and decide it has a failed pool?
Once you bring the .eli devices back online, zpool will see them and your pool
will be back online. Until then, it won't do anything except tell you that the
disks are not available and that, therefore, neither is your pool. The status
of the pool is 'unavailable', not 'faulted'.
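
A minimal sketch of that sequence after a reboot, assuming a hypothetical pool
named "tank" built on three hypothetical providers (ada1p1, ada2p1, ada3p1).
It only prints the commands by default; setting RUN to an empty string would
run them for real (as root). Whether the pool comes back by itself or needs a
further zpool(8) step is not something this sketch settles, so it just ends
with a status check.

```sh
#!/bin/sh
# Hypothetical names: adjust to the real pool and providers.
POOL="tank"
PROVIDERS="ada1p1 ada2p1 ada3p1"

# Dry run by default: each command is printed rather than executed.
# Use RUN="" to execute the commands for real (as root).
RUN="${RUN-echo}"

attach_all() {
    for p in $PROVIDERS; do
        # geli attach creates /dev/${p}.eli (it may prompt for a passphrase).
        $RUN geli attach "/dev/$p" || return 1
    done
    # Once every .eli device exists, ask ZFS what it makes of the pool.
    $RUN zpool status "$POOL"
}

attach_all
```

The point of attaching every provider before looking at the pool is that ZFS
never has to judge a half-attached set of devices.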

>>mercilessly thrash disks; standard reads and writes are given higher priority
>>in the scheduler than resilver and scrub operations.
> 
>      If two pools use different partitions on a drive and both pools are
> rebuilding those partitions at the same time, then how could ZFS *not*
> be hammering the drive?
A good reason not to set up your pools like that.

>>> 	3) If a raidz2 or raidz3 loses more than one component, does one
>>> 	simply replace and rebuild all of them at once?  Or is it necessary
>>> 	to rebuild them serially?  In some particular order?
>>AFAIK, replacement of several disks can't be done in a single command, but I
>>don't think you need to wait for a resilver to finish on one before you can
>>replace another.
> 
>      That looks good.  What happens if a "zpool replace failingdrive newdrive"
> is running when the failingdrive actually fails completely?
Assuming you don't trigger some race condition (which would be rare if you're
using decent controllers), nothing special. A disk doesn't need to be present
and functioning to be replaced.
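
For the multi-disk case, a sketch along these lines issues one zpool replace
per failed member without waiting for the previous resilver to finish. The
pool name and the old:new device pairs are hypothetical, and the commands are
only printed unless RUN is set to an empty string.

```sh
#!/bin/sh
# Hypothetical pool and old:new device pairs.
POOL="tank"
PAIRS="ada1p1.eli:ada4p1.eli ada2p1.eli:ada5p1.eli"

# Dry run by default: print each command instead of executing it.
RUN="${RUN-echo}"

replace_all() {
    for pair in $PAIRS; do
        old="${pair%%:*}"   # text before the colon
        new="${pair#*:}"    # text after the colon
        # One replace per disk; zpool replace takes a single old/new pair.
        $RUN zpool replace "$POOL" "$old" "$new" || return 1
    done
    # Resilver progress for all replacements shows up in the pool status.
    $RUN zpool status "$POOL"
}

replace_all
```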

>>> 	5) When I upgrade to amd64, the usage would continue to be low-
>>> 	intensity as defined above.  Will the 4 GB be enough?  I will not
>>> 	be using the "deduplication" feature at all.
>>It will be enough unless you are managing tens of TB of data. I recommend
>>setting an ARC limit of 3GB or so. There is a patch that makes the ARC handle
> 
>      3 GB for ARC plus whatever is needed for FreeBSD wouldn't leave much room
> for applications to run.  Maybe I won't be able to use ZFS if it requires
> so vastly more page-fixed memory than UFS. :-(
3GB is the hard limit here. If applications need more, they'll get it. The only
reason to set a limit at all is that the ARC currently has issues giving up
memory gracefully. As I said, there's a patch in discussion to fix it.
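
On FreeBSD the ARC ceiling is normally set with the vfs.zfs.arc_max loader
tunable. The sketch below only computes the byte value for the 3 GB figure
suggested above and prints the line one would add to /boot/loader.conf;
nothing is written to disk, and the function name is made up for
illustration.

```sh
#!/bin/sh
# ARC ceiling in GB (the 3 GB figure from the discussion above).
ARC_GB=3

arc_max_line() {
    # Convert GB to bytes and format it as a loader.conf entry.
    bytes=$((ARC_GB * 1024 * 1024 * 1024))
    printf 'vfs.zfs.arc_max="%s"\n' "$bytes"
}

arc_max_line
```

The tunable is read at boot, so a change made this way takes effect after the
next reboot.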

>      One thing I ran across was the following from the zpool(8) man page.
> 
>   "For pools to be portable, you must give the zpool command whole
>   disks, not just slices, so that ZFS can label the disks with portable
>   EFI labels. Otherwise, disk drivers on platforms of different endian-
>   ness will not recognize the disks."
Well, that is kind of confusing since slices != partitions and partitions
aren't mentioned. Slices are also something one would generally not use with
GPT. I'll look at that part of the man page and maybe bring it up on the doc
and fs MLs.

> If I have one raidzN comprising .eli partitions and another raidzN comprising
> a set of unencrypted partitions on those same drives, will I be able to
> export both raidzN pools from a 9-STABLE system and then import them
> into, say, a 10-STABLE system on a different Intel amd64 machine?  By your
> answer to question 1), it would seem that I need to have two raidzN pools,
> although there might be a number of benefits to having both encrypted and
> unencrypted file systems allocated inside a single pool were that an option.
Having any physical disk be a part of more than one pool is not recommended
(except perhaps for cache and log devices where failure is not a big deal). Not
only can it cause thrashing as you mentioned above, but one disk dying makes
both pools degraded. Lose two disks, and you lose both pools. If you need only
some things encrypted, perhaps something that works above the FS layer such as
PEFS would be a better option for you.