Remote upgrade possible?

M m at obmail.net
Fri Jan 7 10:16:48 PST 2005


On Jan 7, 2005, at 12:33 PM, Danny MacMillan wrote:

> I haven't looked at the code, but your assertion is extremely unlikely.
> I really want to say "impossible" but as I said, I haven't looked at
> the code.  If FreeBSD loaded entire executable images into RAM when
> starting new processes, it would perform very poorly.  What is more
> likely is that the kernel keeps the image file open during program
> execution.  When the xterm binary is replaced, the old binary is still
> on disk in its old location, it just doesn't have any directory
> entries pointing to it.  Since the kernel still has the file open it
> won't be overwritten.  Hence the kernel can and will still load
> pages from the old image.  This is a function of the same behaviour
> that causes df and du output to differ in some cases.
>
> The lsof(8) utility seems to bear this out, as each process seems to
> keep each image (program and shared object files) open during
> execution.
>
> A new instance of xterm would use the new, upgraded binary.
>

When you run a program, the program that launches it first makes a copy 
of itself in the process table, and the two share code pages. This is 
done through fork(). The new process, called the child, then calls one 
of the exec() functions, which all end up in a single syscall, 
execve(). execve() uses namei() to get the vnode pointer. Each vnode 
has three reference counts: v_usecount, v_holdcnt and v_writecount. A 
vnode is not recycled until both the usecount and holdcnt are 0. When 
namei() is called it calls VREF(), which is vref(), which does

         vp->v_usecount++;

so the vnode can't be recycled from a point in time before the program 
is actually loaded into memory. execve() calls exec_map_first_page(). 
Without tearing this apart, I'm going to guess that this memory maps 
the first page of text (code) through the VM subsystem, as evidenced by 
the conspicuous calls to vm_page*() functions, so I'd conclude the file 
is memory mapped. Presuming the command you're calling isn't a shell 
script or other script, execve() cleans up the environment so file 
descriptors and signal handlers don't get shared, sets up the process's 
environment, lets the calling (forking) process know it can continue on 
its merry way, sets uid/gid if necessary/possible, and it looks like 
the scheduler takes care of the rest (I'll be honest, the code seems to 
trail off at this point, as far as I can tell, into parts that are 
jumped to in case of error). In any case we have an increased usecount.

Now we are going to unlink that file and create a new one.

After some basic checks (you can't remove the root of a file system, 
for example) unlink() will call VOP_REMOVE(), which calls vrele(), 
which decrements the usecount when it's greater than one, which in this 
case it MUST be because the xterm process has one count on it and the 
file entry has another (hard links to the file may have additional 
counts on it).

Therefore it appears that you can unlink the file and install a new 
copy; the old file will remain on the disk to serve the memory mapped 
image used by the running process. I'm going to presume that when a 
process exits it decrements the usecount for the vnode, which, when it 
reaches 0, should put the vnode on the free list.



More information about the freebsd-questions mailing list