NFS Server
Michael Conlen
meconlen at obfuscated.net
Wed Apr 28 14:27:07 PDT 2004
I've got an NFS server that's under some heavy load. It holds the
web pages, images and videos for a cluster of servers doing about 40
Mbit/sec of traffic (and 160 requests/second). The NFS server has been
doing about 40 Mbit/sec in and about 10 Mbit/sec out as daily
averages for over 45 days, and everything runs well.

Today I noticed that at *exactly* midnight the interrupt time went
through the roof on the system (from 5% to 20%). I checked the
system and noticed that it's actually going to the disks a lot: 2-7
MB/sec of disk usage in systat -vmstat. My first thought was that
something had the inactive pages hosed, so I made a 2 GB file (dd
if=/dev/zero of=foo bs=1024k count=2048), removed it, and ran sync; sync;
sync. Just like magic, the inactive page count vaporized as expected.
The disk usage is the same as it was when there was 1.6 GB of
inactive pages. After running for about half an hour, the system still
doesn't have much inactive page use. I've included systat -vmstat
output below, though it's difficult to read. The main thing is that
there's about 3500 KB of inactive page use with a system doing 2-7
MB/sec of disk activity, mostly read operations (despite the network
traffic, which I think is due to caching).
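
The experiment described above can be sketched as a small shell function. This is only a sketch: the function names, the two arguments (size in MB and scratch directory) and the counter readout are assumptions added for illustration. The post itself ran the dd, rm and sync steps by hand with a 2 GB file.

```shell
#!/bin/sh
# flush_inactive SIZE_MB DIR  (hypothetical helper, not from the post)
# Writes SIZE_MB megabytes of zeros to DIR/foo, removes the file, and
# syncs three times, mirroring the manual steps in the post.
flush_inactive() {
    size_mb="${1:-2048}"      # the post used count=2048, i.e. 2 GB
    scratch="${2:-.}/foo"     # the post wrote "foo" in the current directory

    dd if=/dev/zero of="$scratch" bs=1024k count="$size_mb" 2>/dev/null || return 1
    rm -f "$scratch" || return 1
    sync; sync; sync
}

# inactive_pages (hypothetical helper): prints the kernel's inactive page
# count on FreeBSD, or "n/a" where that sysctl is not available.
inactive_pages() {
    sysctl -n vm.stats.vm.v_inactive_count 2>/dev/null || echo "n/a"
}
```

Calling inactive_pages before and after flush_inactive (with no arguments, to match the 2 GB file from the post) would show whether the inactive count drops the way it did here.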

Now, the whole system performs like magic right now, so I'm not too
worried about it until I dump another 80 Mbit/sec of web traffic and
100 GB more files onto the system. At that time I plan to jump to 4
GB of memory, with the idea that the extra memory used for inactive
pages means less disk I/O than there would otherwise be, but today's
activity has me puzzled.

The only thing in the whole system that might cause this is the backup
process, which kicks off at... midnight! The catch is that it's been
kicking off every midnight for weeks and it's never affected the CPU.
The current backup process is (don't shoot me, please) that I mount the
filesystems on another server and rsync them on that server to local
filesystems. The process ran and finished as normal. The backup server
has since been rebooted (to address other needs) and is fine.
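
For readers unfamiliar with this style of backup, a rough sketch of what runs on the backup server might look like the following. The export name, mount point, destination directory, the DRY_RUN switch and the rsync -a option are all assumptions for illustration; the post does not give the actual paths or flags.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# backup_via_nfs EXPORT MOUNTPOINT DEST  (all three arguments are placeholders)
# Mounts the NFS export, copies it to a local directory with rsync, unmounts.
backup_via_nfs() {
    export_path="$1"; mountpoint="$2"; dest="$3"

    run mount -t nfs "$export_path" "$mountpoint" || return 1
    # -a (archive mode) is an assumed option; the post does not list rsync flags.
    run rsync -a "$mountpoint/" "$dest/"
    status=$?
    run umount "$mountpoint"
    return "$status"   # report the rsync result, not the umount result
}
```

Run with DRY_RUN=1 and placeholder arguments such as nfs1:/export/www /mnt/www /backup/www, it only prints the three commands. The point for this thread is that every file rsync examines is read over NFS from the server's disks.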

Any thoughts as to why I've lost my inactive pages and have gone
straight to disk for all operations?

Having written all this, the page count is still:

Mem: 16M Active, 3496K Inact, 270M Wired, 92K Cache, 199M Buf, 1719M Free
Swap: 4079M Total, 48K Used, 4079M Free

What follows is the systat -vmstat output:

4 users    Load 1.46 1.28 1.16    Apr 28 17:14

Mem:KB    REAL            VIRTUAL                 VN PAGER  SWAP PAGER
        Tot   Share      Tot   Share     Free     in  out     in  out
Act    6096    2652    16900    3956  1790332  count
All  266596    3884  2431192    8064           pages
                                                                         Interrupts
Proc:r p d s w Csw Trp Sys Int Sof Flt                  cow              8617 total
15 11 2640 6 101 8617 42 7                       246924 wire             8389 mux irq11
                                                  16236 act                   ata1 irq15
22.5%Sys 23.2%Intr 0.0%User 0.0%Nice 54.2%Idl      3344 inact                 fdc0 irq6
| | | | | | | | | |                                  92 cache                 atkbd0 irq
===========++++++++++++                         1790240 free                  ppc0 irq7
                                                        daefr             100 clk irq0
Namei Name-cache Dir-cache                              prcfr             128 rtc irq8
Calls hits % hits %                                     react
1050 1050 100                                           pdwake
                                         zfod           pdpgs
Disks aacd0 acd0 md0                     ofod           intrn
KB/t 16.26 0.00 0.00                     %slo-z  204096 buf
tps 467 0 0                         1734 tfree      219 dirtybuf
MB/s 7.41 0.00 0.00                              134716 desiredvnodes
% busy 15 0 0                                    121459 numvnodes
                                                 118582 freevnodes
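
As a sanity check on the numbers, the disk throughput can be recomputed from the aacd0 tps and KB/t figures, and the network rates quoted earlier can be put in the same units. The two function names below are made up for this sketch, and it assumes systat's MB is 1024 KB while Mbit/sec is converted simply by dividing by 8.

```shell
#!/bin/sh
# disk_mb_per_sec TPS KB_PER_TRANSFER  (hypothetical helper)
# Throughput in MB/s = transfers per second * KB per transfer / 1024.
disk_mb_per_sec() {
    awk -v tps="$1" -v kbt="$2" 'BEGIN { printf "%.2f\n", tps * kbt / 1024 }'
}

# mbit_to_mb_per_sec MBIT  (hypothetical helper)
# Converts a network rate in Mbit/sec to MB/sec (8 bits per byte).
mbit_to_mb_per_sec() {
    awk -v mbit="$1" 'BEGIN { printf "%.2f\n", mbit / 8 }'
}
```

With the values from the dump, disk_mb_per_sec 467 16.26 comes out at roughly 7.4, in line with the 7.41 MB/s systat reports for aacd0. By the same arithmetic the daily averages of 40 Mbit/sec in and 10 Mbit/sec out are about 5 MB/sec and 1.25 MB/sec, so the disk is currently moving more data than the server sends out over the network.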