vnodes - is there a leak? where are they going?

Marc G. Fournier scrappy at hub.org
Wed Sep 1 14:53:48 PDT 2004


On Wed, 1 Sep 2004, Allan Fields wrote:

> On Tue, Aug 31, 2004 at 09:21:09PM -0300, Marc G. Fournier wrote:
>>
>> I have two servers, both running 4.10 from within a few days of each other
>> (Aug 5 for venus, Aug 7 for neptune) ... both running jail environments ... one with
>> ~60 running, the other with ~80 ... the one with 60 has been running for
>> ~25 days now, and is at the border of running out of vnodes:
>>
>> Aug 31 20:58:00 venus root: debug.numvnodes: 519920 - debug.freevnodes:
>> 11058 - debug.vnlru_nowhere: 256463 - vlrup
>> Aug 31 20:59:01 venus root: debug.numvnodes: 519920 - debug.freevnodes:
>> 13155 - debug.vnlru_nowhere: 256482 - vlrup
>> Aug 31 21:00:03 venus root: debug.numvnodes: 519920 - debug.freevnodes:
>> 13092 - debug.vnlru_nowhere: 256482 - vlruwt
>>
>> [..]
>>
>> I've tried shutting down all of the VMs on venus, and umount'd all of the
>> unionfs mounts, as well as the one nfs mount we have ... the above #s are
>> after the VMs (and mounts) are recreated ...
>>
>> Now, my understanding of the vnodes is that for every file opened, a vnode
>> is created ... in my case, since I'm using unionfs, there are two vnodes
>> per file ... is it possible that there are 'stale' vnodes that aren't
>> being freed up?  Is there some way of 'viewing' the vnode structure?
>>
>> For instance, fstat shows:
>>
>> venus# fstat | wc -l
>>    19531
>
> You can also try pstat -f|more from the user side.

Even less:

venus# fstat | wc -l; pstat -f | wc -l
    20930
     6555

> You might want to set up for remote kernel debugging and peek around the 
> system / further examine vnode structures.  (If you have physical access 
> to two machines you can set up a null modem cable.)

Unfortunately, I'm working with a remote server here, so am quite limited 
right now in what I can do ... anything I can, I will though ...

>> So, where else are the vnodes going?  Is there a 'leak'?  What can I look
>> at to try and narrow this down / provide more information?
>
> If the use count isn't decremented (to zero) vnodes won't
> be placed on the freelist.  Perhaps something isn't
> calling vrele() where it should in unionfs?  You should check the
> reference counts: v_usecount and v_holdcnt on some of the suspect
> vnodes.

How do I do that?  I'm at the limit of my current knowledge right now ... 
willing to do the footwork, just don't know which direction to take from 
here :(

> Any specific things you might suspect as possible cause?

Nothing specific, no ...

> Any messages preceding the ones you listed above?

The lines above are output from a script that I put together over a year ago 
to generate some simple reports that I could look at after a crash ...
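
For readers who want similar reports, here is a minimal sketch of such a
script. It is not the script used in this thread, which is not shown. It
assumes the sysctl names that appear in the log excerpt exist on the running
kernel, and it leaves out the trailing vlrup/vlruwt field. The function names
and the crontab path are hypothetical.

```shell
#!/bin/sh
# Hypothetical reconstruction: the original script is not shown in the thread.

# Format one report line from three counter values passed as arguments.
format_vnode_report() {
    printf 'debug.numvnodes: %s - debug.freevnodes: %s - debug.vnlru_nowhere: %s\n' \
        "$1" "$2" "$3"
}

# Read the counters with sysctl(8) and send the line to syslog via logger(1).
# Assumes the debug.* sysctl names from the log excerpt exist on this kernel.
log_vnode_report() {
    logger "$(format_vnode_report \
        "$(sysctl -n debug.numvnodes)" \
        "$(sysctl -n debug.freevnodes)" \
        "$(sysctl -n debug.vnlru_nowhere)")"
}

# To use it, add a call to log_vnode_report at the end of the file and run
# the file once a minute from root's crontab (hypothetical path):
#   * * * * * /bin/sh /root/bin/vnode_report.sh
```

Keeping the formatting separate from the sysctl reads makes it possible to
check the output format on a machine that does not have these sysctls.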

>> Even some way of determining a specific process that is sucking back a lot
>> of them, to move that to a different machine ... ?
>
> While this only works for open file entries you can get a top 10
> by using:
>
> fstat|perl -ane '
>  $sum{$F[1]}++;
>  END{print "$_: $sum{$_}\n" for sort {$sum{$b}<=>$sum{$a}} keys %sum}
> '|head -10

sh /tmp/t
httpd: 7416
master: 6618
syslogd: 1117
qmgr: 780
pickup: 779
smtpd: 609
sshd: 503
cron: 495
perl: 279
trivial-rewrite: 274
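
The one-liner quoted above groups by command name, so all httpd processes are
lumped together. To single out one process, a variant can key on command and
PID instead. This is a sketch only: it assumes the usual fstat column order
(USER CMD PID FD ...) with a single header line, and the function name is
made up for illustration.

```shell
#!/bin/sh
# top_fd_holders [N]: read fstat output on stdin and print the N processes
# (default 10) with the most entries, one per line as "count command pid".
# Assumes column 2 is the command and column 3 the PID, with one header line.
top_fd_holders() {
    awk 'NR > 1 { n[$2 " " $3]++ }              # skip header, count per cmd+pid
         END    { for (k in n) print n[k], k }' |
        sort -rn |                               # largest count first
        head -n "${1:-10}"
}

# Usage on the affected machine:
#   fstat | top_fd_holders 10
```

Like the original, this only sees open file entries, so it cannot show
vnodes that no process currently has open.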

but, again, those are known/open files ... fstat | wc -l only accounts for 
~20k or so of the ~520k vnodes :(
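
To put a number on that gap, the arithmetic can be sketched as below. It is
only a rough estimate: it mixes figures taken at different times, fstat lines
are not one-to-one with vnodes (the count includes a header line and
non-file entries, and with unionfs there may be two vnodes per file), and the
function name is hypothetical.

```shell
#!/bin/sh
# unaccounted_vnodes NUMVNODES FREEVNODES FSTAT_LINES
# Prints allocated vnodes minus free vnodes minus open-file entries, i.e. a
# rough count of in-use vnodes that open files do not explain.
unaccounted_vnodes() {
    echo $(( $1 - $2 - $3 ))
}

# With the figures quoted in this thread (21:00:03 log line, second fstat run):
#   unaccounted_vnodes 519920 13092 20930
# prints 485898
```

Even if every fstat entry stood for two vnodes, well over 400k in-use vnodes
would remain unexplained by open files.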


----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy at hub.org           Yahoo!: yscrappy              ICQ: 7615664


More information about the freebsd-stable mailing list