zfs very poor performance compared to ufs due to lack of cache?

Steven Hartland killing at multiplay.co.uk
Sun Sep 12 21:01:49 UTC 2010


----- Original Message ----- 
From: "Andriy Gapon" <avg at freebsd.org>
> 
> All :-)
> Revision of your code, all the extra patches, workload, graphs of ARC and memory
> dynamics and that's just for the start.
> Then, analysis similar to that of Wiktor.  E.g. trying to test with a single
> file and then removing it, or better yet, examining with DTrace actual code
> paths taken from sendfile(2).

All of those have been given in past posts on this thread, but they're quite fragmented,
sorry about that, so here's the current summary for reference:-

The machine is a stream server whose job is to serve mp4 HTTP streams via
nginx. It also exports the fs via nfs to an encoding box which does all the grunt
work of creating the streams, but that doesn't seem relevant here as NFS was
not in use during these tests.

We currently have two such machines: one which has been updated to zfs and one
which is still on ufs. After upgrading to 8.1-RELEASE and zfs all seemed ok until we
had a bit of a traffic hike, at which point we noticed the machine in question really
struggling even though it was serving fewer than 100 clients at under 3Mbps each for
a few popular streams which should all have easily fitted in cache.

Upon investigation it seems that zfs wasn't caching anything, so all streams were
being read directly from disk, overloading the Areca controller backing a 7-disk
RAID6 volume.
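
For anyone wanting to see the same thing, the per-vdev read load is easy to watch with
something along these lines (gstat gives a similar per-disk view):

zpool iostat -v 5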

After my original post we've done a number of upgrades and we are now running
8-STABLE as of 06/09 plus the following:
http://people.freebsd.org/~mm/patches/zfs/v15/stable-8-v15.patch
http://people.freebsd.org/~mm/patches/zfs/zfs_metaslab_v2.patch
http://people.freebsd.org/~mm/patches/zfs/zfs_abe_stat_rrwlock.patch
needfree.patch and vm_paging_needed.patch posted by jhell, plus the zfs_vnops.c
change quoted below:

> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> @@ -500,6 +500,8 @@ again:
>  			sched_unpin();
>  		}
>  		VM_OBJECT_LOCK(obj);
> +		if (error == 0)
> +			vm_page_set_validclean(m, off, bytes);
>  		vm_page_wakeup(m);
>  		if (error == 0)
>  			uio->uio_resid -= bytes;

When nginx is active and using sendfile we see a large amount of memory, seemingly
equivalent to the size of the files being accessed, slip into inactive according to top,
and the size of the ARC drop to at most the configured minimum and sometimes even less.

The machine now has 7GB of RAM and these are the loader.conf settings currently in use:-
# As we have battery backed cache we can do this
vfs.zfs.cache_flush_disable=1
vfs.zfs.prefetch_disable=0
# Physical Memory * 1.5
vm.kmem_size="11G"
vfs.zfs.arc_min="5G"
vfs.zfs.arc_max="6656M"
vfs.zfs.vdev.cache.size="20M"
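
For completeness, the values that actually took effect can be confirmed at runtime with:

sysctl vm.kmem_size vfs.zfs.arc_min vfs.zfs.arc_max vfs.zfs.vdev.cache.size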

Currently arc_summary reports the following after being idle for several hours:-
ARC Size:
        Current Size:                   76.92%  5119.85M (arcsize)
        Target Size: (Adaptive)         76.92%  5120.00M (c)
        Min Size (Hard Limit):          76.92%  5120.00M (c_min)
        Max Size (High Water):          ~1:1    6656.00M (c_max)

Column details as requested previously:-
cnt, time, kstat.zfs.misc.arcstats.size, vm.stats.vm.v_pdwakeups,
vm.stats.vm.v_cache_count, vm.stats.vm.v_inactive_count,
vm.stats.vm.v_active_count, vm.stats.vm.v_wire_count,
vm.stats.vm.v_free_count
1,1284323760,5368902272,72,49002,156676,27241,1505466,32523
2,1284323797,5368675288,73,51593,156193,27612,1504846,30682
3,1284323820,5368675288,73,51478,156248,27649,1504874,30671
4,1284323851,5368670688,74,22994,184834,27609,1504794,30698
5,1284323868,5368670688,74,22990,184838,27605,1504792,30698
6,1284324024,5368679992,74,22246,184624,27663,1505177,31171
7,1284324057,5368679992,74,22245,184985,27663,1504844,31170

Point notes:
1. Initial values
2. single file request size: 692M
3. repeat request #2
4. request for second file 205M
5. repeat request #4
6. multi request #2
7. complete
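
For anyone wanting to gather the same columns, a minimal loop like the following is
sufficient (just a sketch, one CSV line per sample, 30s interval picked arbitrarily):

#!/bin/sh
# counter, unix time, then the sysctls listed above, in order
cnt=0
while :; do
    cnt=$((cnt + 1))
    vals=$(sysctl -n kstat.zfs.misc.arcstats.size vm.stats.vm.v_pdwakeups \
        vm.stats.vm.v_cache_count vm.stats.vm.v_inactive_count \
        vm.stats.vm.v_active_count vm.stats.vm.v_wire_count \
        vm.stats.vm.v_free_count | paste -sd, -)
    echo "${cnt},$(date +%s),${vals}"
    sleep 30
done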

top details after tests:-
Mem: 106M Active, 723M Inact, 5878M Wired, 87M Cache, 726M Buf, 124M Free
Swap: 4096M Total, 836K Used, 4095M Free

arc_summary snip after test:-
ARC Size:
        Current Size:                   76.92%  5119.97M (arcsize)
        Target Size: (Adaptive)         76.92%  5120.09M (c)
        Min Size (Hard Limit):          76.92%  5120.00M (c_min)
        Max Size (High Water):          ~1:1    6656.00M (c_max)

If I put the box back into live service so it gets a real range of requests, after about an
hour I see something like:-
Mem: 104M Active, 2778M Inact, 3065M Wired, 20M Cache, 726M Buf, 951M Free
Swap: 4096M Total, 4096M Free

ARC Size:
        Current Size:                   34.37%  2287.36M (arcsize)
        Target Size: (Adaptive)         100.00% 6656.00M (c)
        Min Size (Hard Limit):          76.92%  5120.00M (c_min)
        Max Size (High Water):          ~1:1    6656.00M (c_max)

As you can see the size of the ARC has even dropped below c_min. The results of the live test
were gathered directly after a reboot, in case that's relevant.

If someone could suggest a set of tests that would help I'll be happy to run them, but
from what's been said thus far it seems that the use of sendfile is forcing memory use
outside of the ARC; is that what's expected?
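
For example, following the earlier DTrace suggestion, I could run something along these
lines (a rough sketch, assuming the fbt provider covers the zfs module) to see which ZFS
functions nginx ends up in via sendfile:

dtrace -n 'fbt:zfs::entry /execname == "nginx"/ { @[probefunc] = count(); }'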

Would running the same test with sendfile disabled in nginx help?
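
If so, presumably it's just a matter of flipping the directive in nginx.conf, e.g.:

# serve files via read()/write() instead of sendfile(2)
sendfile off;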

    Regards
    Steve




