vnodes - is there a leak? where are they going?

Allan Fields bsd at afields.ca
Wed Sep 1 14:40:08 PDT 2004


On Tue, Aug 31, 2004 at 09:21:09PM -0300, Marc G. Fournier wrote:
> 
> I have two servers, both running 4.10 of within a few days (Aug 5 for 
> venus, Aug 7 for neptune) ... both running jail environments ... one with 
> ~60 running, the other with ~80 ... the one with 60 has been running for 
> ~25 days now, and is at the border of running out of vnodes:
> 
> Aug 31 20:58:00 venus root: debug.numvnodes: 519920 - debug.freevnodes: 
> 11058 - debug.vnlru_nowhere: 256463 - vlrup
> Aug 31 20:59:01 venus root: debug.numvnodes: 519920 - debug.freevnodes: 
> 13155 - debug.vnlru_nowhere: 256482 - vlrup
> Aug 31 21:00:03 venus root: debug.numvnodes: 519920 - debug.freevnodes: 
> 13092 - debug.vnlru_nowhere: 256482 - vlruwt
>
> [..]
>
> I've tried shutting down all of the VMs on venus, and umount'd all of the 
> unionfs mounts, as well as the one nfs mount we have ... the above #s are 
> after the VMs (and mounts are recreated ...
> 
> Now, my understanding of the vnodes is that for every file opened, a vnode 
> is created ... in my case, since I'm using unionfs, there are two vnodes 
> per file ... is it possible that there are 'stale' vnodes that aren't 
> being freed up?  Is there some way of 'viewing' the vnode structure?
>
> For instance, fstat shows:
> 
> venus# fstat | wc -l
>    19531

You can also try pstat -f|more from the user side.

> So, obviously it isn't just open files that I'm dealing with here, for 
> even if I double that, that is nowhere near 519920 ...
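
To put a number on that gap: 519920 - 11058 = 508862 vnodes in use
at 20:58:00, against roughly 39062 if every fstat entry cost two
vnodes.  A rough sketch for tracking that difference over time,
assuming each syslog entry sits on one line laid out like the ones
you quoted (vnodes_in_use is just a name I made up, and the log
path in the usage line is a guess):

```shell
# Print "HH:MM:SS in_use" for each log line read from stdin, where
# in_use = debug.numvnodes - debug.freevnodes.
# Assumes one complete entry per line, in the format quoted above.
vnodes_in_use() {
    awk '{
        num = ""; free = ""
        for (i = 1; i < NF; i++) {
            if ($i == "debug.numvnodes:")  num  = $(i + 1)
            if ($i == "debug.freevnodes:") free = $(i + 1)
        }
        # Only report lines that carried both counters.
        if (num != "" && free != "") print $3, num - free
    }'
}
```

Usage would be something like
grep debug.numvnodes /var/log/messages | vnodes_in_use
(adjust the path to wherever your logger output ends up).  If the
in-use figure stays high after the jails and mounts are gone, that
points at references not being released rather than at open files.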

You might want to set up remote kernel debugging and peek around
the system / further examine vnode structures.  (If you have physical
access to two machines you can connect them with a null modem cable.)

> So, where else are the vnodes going?  Is there a 'leak'?  What can I look 
> at to try and narrow this down / provide more information?

If the use count isn't decremented (to zero), vnodes won't
be placed on the freelist.  Perhaps something isn't
calling vrele() where it should in unionfs?  You should check the
reference counts, v_usecount and v_holdcnt, on some of the suspect
vnodes.

Is there anything specific you suspect as a possible cause?
Any messages preceding the ones you listed above?

If you can escape to the debugger, some things to try are:
	show page
	show lockedvn

You could do a dump for later examination if you are forced to
reboot the machine (after trying unmount).

> Even some way of determining a specific process that is sucking back a lot 
> of them, to move that to a different machine ... ?

While this only works for open file entries, you can get a top 10
of commands by using:

fstat|perl -ane '
  next if $. == 1;   # skip the fstat header line
  $sum{$F[1]}++;     # $F[1] is the CMD column
  END{print "$_: $sum{$_}\n" for sort {$sum{$b}<=>$sum{$a}} keys %sum}
'|head -10
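
Since many of your processes probably share a command name, a
per-PID breakdown may be more useful for picking out one process.
A minimal sketch, assuming the usual fstat column order (USER CMD
PID ...); count_by_pid is a made-up helper name:

```shell
# Count fstat entries per (command, PID) pair rather than per command.
# Reads fstat-style output on stdin: field 2 = CMD, field 3 = PID.
count_by_pid() {
    awk 'NR > 1 { n[$2 " " $3]++ }              # NR > 1 skips the header
         END    { for (k in n) print n[k], k }' |
    sort -rn | head -10                          # largest counts first
}
```

Run it as fstat | count_by_pid; each output line is
"count command pid".  Like the one-liner above, it only sees open
file entries, so it won't account for vnodes held some other way.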

> ----
> Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
> Email: scrappy at hub.org           Yahoo!: yscrappy              ICQ: 7615664

-- 
 Allan Fields, AFRSL - http://afields.ca
 2D4F 6806 D307 0889 6125  C31D F745 0D72 39B4 5541


More information about the freebsd-stable mailing list