vnodes - is there a leak? where are they going?
Marc G. Fournier
scrappy at hub.org
Wed Sep 1 14:53:48 PDT 2004
On Wed, 1 Sep 2004, Allan Fields wrote:
> On Tue, Aug 31, 2004 at 09:21:09PM -0300, Marc G. Fournier wrote:
>>
>> I have two servers, both running 4.10 from within a few days of each other (Aug 5 for
>> venus, Aug 7 for neptune) ... both running jail environments ... one with
>> ~60 running, the other with ~80 ... the one with 60 has been running for
>> ~25 days now, and is at the border of running out of vnodes:
>>
>> Aug 31 20:58:00 venus root: debug.numvnodes: 519920 - debug.freevnodes:
>> 11058 - debug.vnlru_nowhere: 256463 - vlrup
>> Aug 31 20:59:01 venus root: debug.numvnodes: 519920 - debug.freevnodes:
>> 13155 - debug.vnlru_nowhere: 256482 - vlrup
>> Aug 31 21:00:03 venus root: debug.numvnodes: 519920 - debug.freevnodes:
>> 13092 - debug.vnlru_nowhere: 256482 - vlruwt
>>
>> [..]
>>
>> I've tried shutting down all of the VMs on venus, and umount'd all of the
>> unionfs mounts, as well as the one nfs mount we have ... the above #s are
>> after the VMs (and mounts) are recreated ...
>>
>> Now, my understanding of the vnodes is that for every file opened, a vnode
>> is created ... in my case, since I'm using unionfs, there are two vnodes
>> per file ... is it possible that there are 'stale' vnodes that aren't
>> being freed up? Is there some way of 'viewing' the vnode structure?
>>
>> For instance, fstat shows:
>>
>> venus# fstat | wc -l
>> 19531
>
> You can also try pstat -f|more from the user side.
Even less:
venus# fstat | wc -l; pstat -f | wc -l
20930
6555
> You might want to setup for remote kernel debugging and peek around the
> system / further examine vnode structures. (If you have physical access
> to two machines you can setup a null modem cable.)
Unfortunately, I'm working with a remote server here, so am quite limited
right now in what I can do ... anything I can, I will though ...
>> So, where else are the vnodes going? Is there a 'leak'? What can I look
>> at to try and narrow this down / provide more information?
>
> If the use count isn't decremented (to zero) vnodes won't
> be placed on the freelist. Perhaps something isn't
> calling vrele() where it should in unionfs? You should check the
> reference counts: v_usecount and v_holdcnt on some of the suspect
> vnodes.
How do I do that? I'm at the limit of my current knowledge right now ...
willing to do the foot work, just don't know the directions to take from
here :(
> Any specific things you might suspect as possible cause?
Nothing specific, no ...
> Any messages preceding the ones you listed above?
The above is a script that I put together over a year ago to generate some
simple reports that I could look at after a crash ...
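For reference, a report line in that format could come from something like
the sketch below. This is not the script referred to above, which isn't shown
in this thread; the function names are made up for illustration. It assumes
the trailing field (vlrup / vlruwt) is the wait channel of the vnlru kernel
process as reported by ps, and that ps accepts the ucomm keyword on this
release. The collector only makes sense on a FreeBSD box exposing these
sysctls, so it is defined but not called here.

```sh
#!/bin/sh
# Hypothetical sketch; NOT the original reporting script from this thread.

# Format one report line from three counter values and a state string.
format_vnode_report() {
    printf 'debug.numvnodes: %s - debug.freevnodes: %s - debug.vnlru_nowhere: %s - %s\n' \
        "$1" "$2" "$3" "$4"
}

# Collect live values and hand the line to syslog via logger(1).
# Defined only; calling it requires a host with these debug.* sysctls.
log_vnode_report() {
    # Assumption: the last field is the wait channel of the vnlru process.
    state=$(ps -axo wchan,ucomm | awk '$2 == "vnlru" { print $1; exit }')
    format_vnode_report \
        "$(sysctl -n debug.numvnodes)" \
        "$(sysctl -n debug.freevnodes)" \
        "$(sysctl -n debug.vnlru_nowhere)" \
        "$state" | logger
}

# Reproduce the 21:00:03 sample quoted earlier, using fixed values.
format_vnode_report 519920 13092 256482 vlruwt
```

Run from cron once a minute, log_vnode_report would leave a trail in the
system log that can be read back after a crash.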
>> Even some way of determining a specific process that is sucking back a lot
>> of them, to move that to a different machine ... ?
>
> While this only works for open file entries you can get a top 10
> by using:
>
> fstat|perl -ane '
> $sum{$F[1]}++;
> END{print "$_: $sum{$_}\n" for sort {$sum{$b}<=>$sum{$a}} keys %sum}
> '|head -10
sh /tmp/t
httpd: 7416
master: 6618
syslogd: 1117
qmgr: 780
pickup: 779
smtpd: 609
sshd: 503
cron: 495
perl: 279
trivial-rewrite: 274
but, again, those are known/open files ... fstat | wc -l only accounts for
~20k or so of the ~520k vnodes :(
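To put a rough number on that gap, here is the arithmetic as a small sh
sketch. It assumes numvnodes minus freevnodes approximates the vnodes still
held, and it mixes the 21:00:03 counters with the later fstat count, so treat
the result as an order of magnitude only. The function name is hypothetical.

```sh
#!/bin/sh
# Rough estimate of held vnodes that open files do not explain.
# Assumption: numvnodes - freevnodes ~= vnodes still held.
# Arguments: numvnodes freevnodes open_files [vnodes_per_open_file]
unaccounted_vnodes() {
    per_file=${4:-1}
    echo $(( $1 - $2 - $3 * per_file ))
}

# Figures quoted in this thread.
unaccounted_vnodes 519920 13092 20930      # one vnode per open file
unaccounted_vnodes 519920 13092 20930 2    # two per file, as with unionfs
```

This prints 485898 and 464968. Even at two vnodes per open file, open files
would explain well under a tenth of the vnodes that are not on the free list.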
----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy at hub.org Yahoo!: yscrappy ICQ: 7615664
More information about the freebsd-stable
mailing list