simplifying linux_emul_convpath()

Wed Jan 14 13:02:12 PST 2004

On 14 Jan, Robert Watson wrote:
> 
> On Wed, 14 Jan 2004, Don Lewis wrote:
> 
>> I just stumbled across a vnode locking violation in
>> linux_emul_convpath().  Rather than locking and unlocking each vnode for
>> the VOP_GETATTR() calls, is there any reason that this code should not
>> be simplified to just compare the vnode pointers rather than fetching
>> the vnode attributes and comparing the attributes for equality. 
> 
> For some time, I've been thinking of adding samefile() and fsamefile() 
> system calls to FreeBSD, which would allow userspace applications to
> determine if two names or file handles refer to the same object without
> playing games with inode numbers, device ids, etc.  The reason to do this
> would be that 32-bit inode numbers are subject to collision on large file
> systems.  My initial implementation simply compared vnode pointers, but
> that raises an interesting question about how stacked file systems should
> be treated, and depends a lot on the semantics of the stacked file system,
> really.  My leaning is that in general they should probably be treated as
> different objects if they have different vnodes, because with the
> exception of nullfs (and occasionally unionfs), that probably is the
> desired semantic.  You could imagine introducing a VOP to ask "Are you the
> same as this other vnode", and pointing it at both vnodes, but I think
> that adds unnecessary complexity without a whole lot of benefit.

The typical user of something like this would be tar when it is deciding
what to hardlink together.  One could make a case for making a nullfs
mounted copy match the original (or two separately mounted nullfs copies
match each other).  That would do the "right" thing when archiving a
file tree containing nullfs mount points and untarring into a single
file system, except that it would confuse the heck out of tar because
the link counts would be wrong.  The VOP would be cheap, too. But what
about a crypto or compression layer?

The problem for something like tar is that this mechanism doesn't scale
well. When creating an archive, tar keeps a database of pathnames of
files that have more than one link, with the inode number as the key.
Each time it encounters a file with multiple links, it does a lookup in the
database.  If it finds a match, it outputs a record with the pathname it
found in the database, and if it didn't find a match it adds a new
record to the database.  This can be done with reasonable efficiency in
userland.  If the only way to tell whether two files were the same were
a syscall, it would be terribly slow.  Tar would only be able to keep a
list of the pathnames and would have to iterate through the list, doing
the syscall for each entry in search of a match to the current file it
was processing.  This is an O(n^2) problem with a syscall in the
loop.  Tar might be able to narrow the search by matching file
attributes, but it would still be possible to have degenerate cases
unless the inode number were used as an attribute (which would not work
if you wanted nullfs copies to match).

There are programs that could make use of samefile(), such as cp.  It
would probably want a nullfs copy to match the original.
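
For comparison, the check a program like cp can do today without
samefile() looks roughly like the sketch below, which compares st_dev
and st_ino from stat(2).  The function name same_file_by_stat() is made
up for the example and this is not cp's actual code.  A test like this
would only treat a nullfs copy as the original if the nullfs layer
happened to report the same device and inode numbers, which is not
something the caller can count on.

```c
#include <sys/stat.h>

/*
 * Compare two pathnames by device and inode number.
 * Returns 1 if they name the same file, 0 if they do not,
 * and -1 if either stat(2) call fails.
 */
int
same_file_by_stat(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) == -1 || stat(b, &sb) == -1)
        return (-1);
    return (sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino);
}
```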