ufs2 / softupdates / ZFS / disk write cache

Nathanael Hoyle nhoyle at hoyletech.com
Sun Jun 21 02:48:50 UTC 2009


Dan Naumov wrote:
> I decided to do some performance tests of my own; "bonnie -s 4096" was
> used to obtain the results. Note that these results should be used to
> compare "write cache on" to "write cache off" and not to compare UFS2
> vs ZFS, as the testing was done on different parts of the same
> physical disk (the UFS2 partition resides on the first 16 GB of the
> disk and the ZFS pool takes the remaining ~1.9 TB), and I am also
> using rather conservative ZFS tunables.
>
>
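(For anyone wanting to reproduce this: on FreeBSD the ATA disk write
cache is normally toggled with the hw.ata.wc loader tunable, which I'm
assuming is the mechanism used here; the scratch directory below is
just an example:

    # /boot/loader.conf -- takes effect on the next boot
    hw.ata.wc="0"    # 0 = write cache off, 1 = on (the default)

    # then, per filesystem under test:
    bonnie -d /mnt/test -s 4096
)
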
> UFS2 with write cache:
>               -------Sequential Output-------- ---Sequential Input-- --Random--
>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
>          4096 55457 95.9 91630 46.7 36264 37.5 46565 74.0 84751 33.7 164.3 10.3
>
> UFS2 without write cache:
>               -------Sequential Output-------- ---Sequential Input-- --Random--
>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
>          4096  4938 46.9  4685 18.0  4288 21.8 17453 34.0 74232 31.6 165.0  9.9
>
>
> As we can clearly see, the performance difference between having the
> disk cache enabled and disabled is _ENORMOUS_. In the case of
> sequential block writes on UFS2, the performance loss is a staggering
> 94.89%. More surprisingly, even reading seems to be affected in a
> noticeable way: per-char reads suffer a 62.62% penalty, while block
> reads take a 12.42% hit. Moving on to testing ZFS with and without
> the disk cache enabled:
>
>
> ZFS with write cache (384M ARC, 1GB max kmem):
>               -------Sequential Output-------- ---Sequential Input-- --Random--
>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
>          4096 25972 66.1 45026 40.6 34269 36.0 46371 86.5 93973 34.6  84.5  8.5
>
> ZFS without write cache (384M ARC, 1GB max kmem):
>               -------Sequential Output-------- ---Sequential Input-- --Random--
>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
>          4096  2399  6.7  2258  3.5  2290  3.9 34380 66.1 85971 32.8  56.7  6.1
>
>
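(For reference, I take "384M ARC, 1GB max kmem" to mean the usual
loader tunables; I'm guessing at the exact values and spelling, but it
would be something along these lines in /boot/loader.conf:

    vm.kmem_size="1024M"
    vm.kmem_size_max="1024M"
    vfs.zfs.arc_max="384M"

These are only read at boot.)
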
> Uh oh.... After some digging around, I found the following quote: "ZFS
> is designed to work with storage devices that manage a disk-level
> cache. ZFS commonly asks the storage device to ensure that data is
> safely placed on stable storage by requesting a cache flush." at
> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide I
> guess this might be somewhat related to why in the "disk cache
> disabled" scenario, ZFS suffers bigger losses than UFS2.
>
> It is quite obvious at this point that disabling the disk cache, in
> order to have softupdates live in harmony with disks "lying" about
> whether disk cache contents have actually been committed to the disk,
> is not in any way, shape or form a viable solution to the problem. On
> a side note, is there any way I can test whether *MY* disk is
> truthful about committing its write cache to disk or not?
>
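On the side note about testing whether a disk is truthful: the only
conclusive test is to cut power in the middle of a write burst and see
what actually survived, but a quick heuristic is to time a loop of
single-block write+fsync calls. A drive that really puts each request
on the platters can complete at most roughly one commit per platter
revolution (about 120/s at 7200 rpm), so a rate in the thousands means
the writes are being acknowledged from the cache. Here is a rough C
sketch along those lines (file name and iteration count are arbitrary,
and a low number only suggests, not proves, that the disk is honest):

/*
 * fsync-rate.c: crude check of whether a drive commits flushed writes.
 * Overwrites one 512-byte block and fsync()s it in a loop; a rate far
 * above one commit per platter revolution (~120/s at 7200 rpm) means
 * writes are being acknowledged from the drive's cache.
 * Compile with: cc -o fsync-rate fsync-rate.c
 */
#include <sys/types.h>
#include <sys/time.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	const char *path = (argc > 1) ? argv[1] : "fsync-test.dat";
	const int iters = 1000;
	char buf[512];
	struct timeval t0, t1;
	double secs;
	int fd, i;

	memset(buf, 'x', sizeof buf);
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	gettimeofday(&t0, NULL);
	for (i = 0; i < iters; i++) {
		/* rewrite the same block so seek time stays out of it */
		if (pwrite(fd, buf, sizeof buf, 0) != (ssize_t)sizeof buf) {
			perror("pwrite");
			return 1;
		}
		if (fsync(fd) != 0) {
			perror("fsync");
			return 1;
		}
	}
	gettimeofday(&t1, NULL);
	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	printf("%d fsyncs in %.2f s (%.0f/s)\n", iters, secs, iters / secs);
	close(fd);
	unlink(path);
	return 0;
}
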
> In the past (this was during my previous foray into the FreeBSD
> world, circa 2001/2002) I suffered severe data corruption (leading to
> an unbootable system) using UFS2 + softupdates on two different
> occasions, both times due to power loss, and that experience has me
> very worried about the proper way to configure my system to avoid
> such incidents in the future.
>
>
> - Sincerely,
> Dan Naumov

Dan,

<scold>Top posting on mailing lists is bad (and not the preferred 
convention for this list).</scold>

The performance numbers are startling, and good to have.  You could also
try setting the 'sync' flag on the FFS+SU mount to see what that looks
like; it should give a small extra measure of protection.  Since that
mount shouldn't be write-heavy, I wouldn't expect much of a (perceived)
performance hit (though the bonnie numbers may be ugly).
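
Something along these lines, where the device name is just an example
(check mount(8) on your release for how 'sync' interacts with soft
updates):

    tunefs -p /dev/ad0s1a   # print current FFS settings, incl. soft updates
    mount -u -o sync /      # remount the root filesystem synchronously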

As Peter Jeremy noted in response to your question about whether or not
your proposed configuration looked sane (your post from the 14th), one
solid strategy is to have an *offline* copy of your root filesystem.
This ensures that outstanding disk writes cannot leave that instance in
an unusable form, and helps protect you from all the evilness that can
occur to online/mounted filesystems.  On Linux systems, where the
kernel image and grub config usually reside in /boot, I usually make
that a separate partition and set 'noauto' in /etc/fstab so that it is
never mounted except when I'm installing a new kernel or updating my
grub config.
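
For the FreeBSD side of that, the fstab entry might look like this
(device name and mount point invented for illustration):

    # /etc/fstab: offline copy of root, mounted by hand only
    /dev/ad0s3a   /altroot   ufs   ro,noauto   0   0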

-Nathanael

