EBS snapshot backups from a FreeBSD zfs file system: zpool freeze?

Jeremy Chadwick jdc at koitsu.org
Thu Jul 4 02:15:51 UTC 2013


On Thu, Jul 04, 2013 at 01:40:07PM +1200, Berend de Boer wrote:
> >>>>> "Jeremy" == Jeremy Chadwick <jdc at koitsu.org> writes:
> 
> 
>     Jeremy>   Also, because nobody seems to warn others of this: if
>     Jeremy> you go the ZFS route on FreeBSD, please do not use
>     Jeremy> features like dedup or compression.
> 
> Exactly the two reasons why I'm experimenting with FreeBSD on AWS.
> 
> Please tell me more.

dedup has immense memory requirements; the commonly referenced model
(which is in no way precise, it's just a general recommendation) is that
for every 1TB of data you need 1GB of RAM just for the DDT
(deduplication table).  Understand that ZFS's ARC also eats lots of
memory, so when I say 1GB of RAM, I'm talking about that being *purely
dedicated* to the DDT.  The actual need varies depending on the type of
data you have.  When using dedup, the general attitude is "give ZFS
as much memory as possible.  Max your DIMM slots out with the biggest
DIMMs the MCH can support".
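
As a back-of-the-envelope sketch of that rule of thumb (the function
name and the strictly linear scaling are assumptions for illustration,
not anything ZFS itself computes, and real DDT size depends heavily on
the data):

```python
def ddt_ram_estimate_gb(pool_data_tb, gb_per_tb=1.0):
    """Rough RAM needed for the DDT alone, per the 1GB-per-1TB rule of thumb.

    This does NOT include memory for the ARC or anything else on the box.
    gb_per_tb is a tunable assumption; raise it to be more pessimistic.
    """
    if pool_data_tb < 0:
        raise ValueError("pool_data_tb must be non-negative")
    return pool_data_tb * gb_per_tb


if __name__ == "__main__":
    for tb in (1, 8, 32):
        est = ddt_ram_estimate_gb(tb)
        print(f"{tb:>3} TB of data: roughly {est:.0f} GB of RAM for the DDT alone")
```

Treat the result as a floor for planning purposes, not a guarantee.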

Many problems I have seen on the FreeBSD lists -- and one horror story
on Solaris -- often pertain to people trying dedup.  There have been
reported issues with resilvering pools that use dedup, or even simply
mounting filesystems using dedup.  The situation when dedup is in use
becomes significantly more complex in a "something is broken" scenario.
The horror story I've heard and retell is this one, and this is me going
off of memory:

There was supposedly an Oracle customer who had been using dedup for
some time, and they began to have problems (I don't remember whether it
was with ZFS, the controller, the disks, or something else).  Anyway,
the situation was such that the client needed to either resilver their
pool, or just get their data off -- but because they were using dedup,
they could not.  The system could not be upgraded to have more RAM
(which would have alleviated the pains).

The solution which was chosen was for Oracle to actually ship the
customer an entire bare metal system with a gargantuan amount of RAM
(hundreds of gigabytes; I often say 384GB because that's what sticks in
my mind for some reason, maybe it was 192GB, doesn't matter), just to
recover from the situation.

compression is generally safe to use on FreeBSD, but there are often
surprising changes to certain behaviours that people don't consider: the
most common one I see reported is conflicting information between what
"df", "du", and "zfs list" show.  AFAIK this applies to Solaris/Illumos
too, so it's just the nature of the beast.  compression doesn't have the
crazy memory requirements of dedup, obviously -- two separate things,
don't confuse the two.  :-)
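
To see where that kind of disagreement comes from, here is a small
sketch that compares a file's apparent size with the space actually
allocated for it.  It assumes a POSIX system where st_blocks is counted
in 512-byte units; the function name is made up for illustration.  On a
compressed dataset the allocated figure can be much smaller than the
apparent one, which is roughly the difference between what "ls -l" and
"du" report for the same file.

```python
import os
import sys


def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for a single file.

    apparent_bytes is the logical file length (st_size).
    allocated_bytes is st_blocks * 512, i.e. the space the filesystem
    says it has allocated for the file, assuming 512-byte block units.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    # Usage: python size_report.py FILE [FILE ...]
    for path in sys.argv[1:]:
        try:
            apparent, allocated = size_report(path)
        except OSError as exc:
            print(f"{path}: {exc}")
            continue
        print(f"{path}: apparent={apparent} bytes, allocated={allocated} bytes")
```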

The final item is the one that, still to this day, keeps me from using
either dedup or compression on FreeBSD (well actually I'd never consider
dedup, only compression): system interactivity is destroyed when using
either of these features.  The system will regularly stall/lock up
(for a few seconds, depending on the I/O), even at the VGA
console.  This problem is specific to the FreeBSD port of ZFS as of this
writing; Solaris/Illumos addressed this long ago.  Rather than re-write
it, I recommend you read my post from February 2013 which references my
convo with Bob Friesenhahn in October 2011 (please read all the quoted
material too):

http://lists.freebsd.org/pipermail/freebsd-stable/2013-February/072171.html

Changing the compression scheme does not solve the issue; the less
CPU-intensive schemes (e.g. lzjb) help decrease the impact but do not
solve it.

All that said: there are people (often FreeNAS folks using their systems
solely as a dedicated NAS, not as a shell server or desktop or other
things) who do use these features happily and do not care about the last
issue.  Cool/great, I'm glad it works for them.  But in my case it's not
acceptable.  If/when the above issue is addressed (putting the ZFS
writer threads into their own priority/scheduling class), I look forward
to using compression (but never dedup, I don't have the hardware/memory
for that kind of thing).

Otherwise please spend an afternoon looking through freebsd-fs and
freebsd-stable lists over the past 2 years (see web archives) and
reading about different stories/situations.  I always, *always* advocate
this.

-- 
| Jeremy Chadwick                                   jdc at koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


