6.1-R ? 6-Stable ? 5.5-R ?
Francisco Reyes
lists at stringsutils.com
Thu Jun 29 17:39:03 UTC 2006
Kostik Belousov writes:
>> > Approved by: pjd (mentor)
>> > Revision Changes Path
>> > 1.156.2.3 +16 -0 src/sys/nfsserver/nfs_serv.c
>> > 1.136.2.3 +4 -0 src/sys/nfsserver/nfs_srvsubs.c
>>
>> The above files are what I have.
Yes, from a 6.1-STABLE built around 6-25-06.
> What this means ? That you have _this_ revisions of the files,
> and your LA skyrocketed ?
LA = load average?
Our problem is that vmstat's 'b' (blocked) column keeps growing, with NFS
causing lockups on the server side. When the machine locked up it was running a
background fsck, and I saw "Giant" a lot in the status of the nfsd processes.
I am really wondering whether 6.1 is ready for production under heavy load. And
the NFS client in the whole 6.X line certainly seems problematic (see my post
on the stable list under the subject: NFS clients freeze and can not
disconnect).
As for the vmstat output, about the only thing that appears to be doing any
work is NFS.
For instance, I saw this in another thread:
ps ax -O ppid,flags,mwchan | awk '($6 ~ /^D/ || $6 == "STAT") && $3 !~ /^20.$/'
On the machine in question it shows:
PID PPID F MWCHAN TT STAT TIME COMMAND
16124 16123 0 biowr ?? D 46:24.76 nfsd: server (nfsd)
16125 16123 0 biowr ?? D 16:05.58 nfsd: server (nfsd)
16126 16123 0 biowr ?? D 11:05.53 nfsd: server (nfsd)
16127 16123 0 biowr ?? D 8:01.21 nfsd: server (nfsd)
16128 16123 0 biowr ?? D 6:19.15 nfsd: server (nfsd)
16129 16123 0 biowr ?? D 5:01.27 nfsd: server (nfsd)
16130 16123 0 biowr ?? D 3:55.56 nfsd: server (nfsd)
16131 16123 0 biowr ?? D 3:13.11 nfsd: server (nfsd)
16132 16123 0 biowr ?? D 2:43.26 nfsd: server (nfsd)
16133 16123 0 biowr ?? D 2:16.40 nfsd: server (nfsd)
16134 16123 0 biowr ?? D 1:57.00 nfsd: server (nfsd)
16135 16123 0 biowr ?? D 1:41.02 nfsd: server (nfsd)
16136 16123 0 biowr ?? D 1:27.07 nfsd: server (nfsd)
16137 16123 0 biowr ?? D 1:15.25 nfsd: server (nfsd)
16138 16123 0 biowr ?? D 1:06.54 nfsd: server (nfsd)
16139 16123 0 biowr ?? D 0:57.57 nfsd: server (nfsd)
16140 16123 0 biowr ?? D 0:50.65 nfsd: server (nfsd)
16141 16123 0 biowr ?? D 0:44.60 nfsd: server (nfsd)
16142 16123 0 biowr ?? D 0:38.29 nfsd: server (nfsd)
16143 16123 0 biowr ?? D 0:34.21 nfsd: server (nfsd)
16144 16123 0 biowr ?? D 0:29.34 nfsd: server (nfsd)
16145 16123 0 biowr ?? D 0:26.35 nfsd: server (nfsd)
16146 16123 0 biowr ?? D 0:22.25 nfsd: server (nfsd)
16147 16123 0 biowr ?? D 0:18.17 nfsd: server (nfsd)
16148 16123 0 biowr ?? D 0:15.95 nfsd: server (nfsd)
16149 16123 0 biowr ?? D 0:13.66 nfsd: server (nfsd)
16150 16123 0 biowr ?? D 0:10.81 nfsd: server (nfsd)
16151 16123 0 biowr ?? D 0:08.92 nfsd: server (nfsd)
16152 16123 0 biowr ?? D 0:06.82 nfsd: server (nfsd)
16153 16123 0 biowr ?? D 0:05.16 nfsd: server (nfsd)
84338 10043 4100 ufs ?? D 0:02.00 qmgr -l -t fifo -u
91632 10043 4100 biowr ?? D 0:00.02 cleanup -z -t unix -u
91650 10043 4100 ufs ?? D 0:00.04 [smtpd]
91912 86635 4100 biowr ?? Ds 0:00.01 /usr/local/bin/maildrop -d cathy at sitescape.com
91916 90579 4100 biowr ?? Ds 0:00.01 /usr/local/bin/maildrop -d jobs at sitescape.com
71677 71672 4002 ppwait p1 D 0:00.15 -su (csh)
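To unpack the awk filter used above: field 6 is STAT and field 3 is the flags
column, so it keeps the header row plus any process in 'D' (uninterruptible
disk wait) while dropping rows whose flags match /^20.$/. A self-contained demo
on canned sample rows (not live ps output):

```shell
# Demo of the filter on canned rows standing in for `ps ax -O ppid,flags,mwchan`.
# $6 is STAT, $3 is the flags column; keep the header and 'D' (disk-wait) rows.
printf '%s\n' \
  'PID PPID F MWCHAN TT STAT TIME COMMAND' \
  '16124 16123 0 biowr ?? D 46:24.76 nfsd: server (nfsd)' \
  '10043 1 4100 select ?? Ss 0:01.00 master' |
awk '($6 ~ /^D/ || $6 == "STAT") && $3 !~ /^20.$/'
```

Here the nfsd row survives (STAT is D) and the master row is dropped (STAT is
Ss), which is why the listing above is almost all nfsd processes.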
The iostat for that machine shows:
iostat 5
tty da0 pass0 cpu
tin tout KB/t tps MB/s KB/t tps MB/s us ni sy in id
0 130 15.35 109 1.63 0.00 0 0.00 6 0 6 1 87
0 36 10.43 230 2.34 0.00 0 0.00 3 0 2 1 93
0 12 10.81 280 2.96 0.00 0 0.00 6 0 2 0 92
0 12 13.03 259 3.30 0.00 0 0.00 0 0 1 1 98
0 12 12.87 259 3.26 0.00 0 0.00 5 0 2 1 91
0 12 17.17 228 3.82 0.00 0 0.00 8 0 3 1 87
0 12 18.38 306 5.49 0.00 0 0.00 3 0 2 1 94
0 12 14.53 284 4.04 0.00 0 0.00 6 0 3 1 89
0 12 26.03 213 5.41 0.00 0 0.00 5 0 3 2 91
Before that machine went into production, during stress testing I saw it do
700+ tps and substantially more MB/s.
We also have another machine with identical hardware; although its tps is 50 to
100 lower than this one's, that machine is always very low in the 'b' column.
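A simple filter over vmstat output pulls out just that 'b' column (field 2 in
vmstat's default layout) for side-by-side comparison of the two machines. A
sketch using canned sample lines in place of a live `vmstat 5`:

```shell
# Extract the 'b' (processes blocked on I/O) column, field 2 in vmstat output.
# Canned sample lines stand in for live `vmstat 5` output here:
printf '%s\n' \
  ' 1  9  0  51914  24779' \
  ' 0 12  0  51914  24779' |
awk '{ print "blocked:", $2 }'
```

On the live boxes, piping `vmstat 5` through the same awk would give a running
count to compare.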
I am now reading up on vmstat to see whether I can spot anything wrong in the vmstat -s output:
1660720108 cpu context switches
736683712 device interrupts
46973243 software interrupts
99310719 traps
3405487756 system calls
46 kernel threads created
385149 fork() calls
7785 vfork() calls
0 rfork() calls
2809 swap pager pageins
4449 swap pager pages paged in
2027 swap pager pageouts
4609 swap pager pages paged out
5068 vnode pager pageins
20399 vnode pager pages paged in
0 vnode pager pageouts
0 vnode pager pages paged out
2156 page daemon wakeups
58310018 pages examined by the page daemon
12161 pages reactivated
21541481 copy-on-write faults
3659 copy-on-write optimized faults
38628563 zero fill pages zeroed
30430314 zero fill pages prezeroed
5780 intransit blocking page faults
79476476 total VM faults taken
0 pages affected by kernel thread creation
30747781 pages affected by fork()
3054182 pages affected by vfork()
0 pages affected by rfork()
152627514 pages freed
6 pages freed by daemon
35726176 pages freed by exiting processes
51914 pages active
810514 pages inactive
47456 pages in VM cache
56444 pages wired down
24779 pages free
4096 bytes per page
184453449 total name lookups
cache hits (67% pos + 6% neg) system 2% per-directory
deletions 6%, falsehits 0%, toolong 0%
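The swap/vnode pager counters above are the easiest thing to compare between
the two machines; a small filter pulls just those lines out. Shown here on
canned sample lines; on the live box the same grep over `vmstat -s` would do
it:

```shell
# Keep only the pager counters from vmstat -s style output.
# Canned lines stand in for the live `vmstat -s` counters on the server:
printf '%s\n' \
  '2809 swap pager pageins' \
  '2027 swap pager pageouts' \
  '0 vnode pager pageouts' \
  '2156 page daemon wakeups' |
grep -i 'pager'
```

Snapshotting that output on both boxes at intervals would show whether the
busy machine is paging noticeably harder than the quiet one.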
root at mailstore12.simplicato.com:~/bin# uptime
1:35PM up 3 days, 14:48, 3 users, load averages: 0.26, 0.36, 0.29