vn_fullpath() again

Matthew Dillon dillon at apollo.backplane.com
Tue Sep 6 19:15:53 PDT 2005


    At the cost of drawing ire from FreeBSD core developers, I will 
    point out that reverse-resolution is hardly a black-and-white issue.
    There are many shades of gray, and there is a huge problem set that
    can either be solved or 99.99% of the way solved (greatly reducing the
    time required to solve the remainder) with a more robust namecache
    implementation.  FreeBSD's implementation is basically at the lowest
    rung on the ladder.  DragonFly's is a couple of rungs up.  DragonFly
    can return a guaranteed consistent path (even through renames of
    any component) to any open vnode as well as tell you whether the 
    namespace used to access the vnode was remove()'d.  For an auditing
    program or for generating a high level journaling stream to generate
    a mirror on a remote host, that covers 99.99% of the filesystem.

    NFS views from the client are one of those shades of gray, since
    files and directories can be ripped up by other clients or the
    server, but since clients have to assume a certain level of 
    consistency anyway it's hardly a show stopper from the point of view
    of any real-life use or need.

    Hardlinks are one of those shades of gray, but they hardly invalidate
    the many uses that namepath resolution can be put to.  99.99% of the
    files on most filesystems are either not hardlinked or not removed
    once accessed, after all, and at least with UFS a directory CAN'T
    be hardlinked.  The important thing is to reduce the problem set to
    something manageable.  For a mirroring program or an auditing program,
    being able to get valid paths in real time for nearly all the changes
    made to a filesystem is no small thing, and you at least get a
    definitive red flag for any hardlinks (simply by the fact that st_nlink
    is greater than one) and can do it the slow way (aka scan/index all
    files with st_nlink > 1) for the remaining few, and track realtime
    namespace operations on hardlinked files after that (which is very
    easy to do).

    As to how to solve the basic problem in FreeBSD... well, it basically
    isn't solvable to the degree that DFly has solved it unless someone
    good spends a lot of time rewriting the namecache code and the VFS API.
    BUT, short of doing that, I think it *IS* possible to rewrite enough
    of the namecache to at least make the namecache records consistent
    against the active vnodes and to not throw away namecache records for
    the directory chain leading up to any vnode.  It's even possible to
    generate the chain for vnodes generated from file handles (inode
    numbers), which an NFS server op has to do quite often, because the
    directory is available in those cases (DragonFly does this for NFS
    server operations so I know it's possible).

    It is possible to do even less work to maintain the associations...
    you don't even NEED to have a working namecache, in fact.  All you need
    are ref'd directory vnodes in a chain from any leaf leading to the
    mount point... basically taking the vnode->v_dd field and changing it
    from a verifier heuristic to a real, ref'd directory vnode, with
    appropriate feedback from filesystem to fix things up for rename(), 
    and mark the namecache entry as invalid for remove().

    Given a valid directory vnode chain, you can ALWAYS regenerate a valid
    path (maybe not the only path, but a *VALID* path) to any vnode for all 
    cases except the case where you have a hardlinked file that you have
    open()'d and remove()'d.  Very few programs care about open but
    completely unlinked files.  DragonFly can provide the original path to
    such a file, but it flags it as having been removed and one can almost
    certainly ignore such files for, e.g. filesystem mirroring and even for
    auditing if the file is not otherwise important.  Considering the rarity
    of the case, it would be sufficient to simply red-flag the condition
    (which you can reliably do with a v_dd directory chain implementation).
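
    The idea of regenerating a path from a ref'd directory chain can be
    modeled in a few lines of user-space C.  This is a toy model under
    stated assumptions, not the FreeBSD or DragonFly struct vnode: the
    struct model_vnode type, its name and removed fields, and
    model_fullpath() are all hypothetical, and reference counting and
    locking are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for a vnode; not a kernel structure. */
struct model_vnode {
    const char *name;           /* name of this entry within its parent */
    struct model_vnode *v_dd;   /* parent directory, NULL at the mount root */
    int removed;                /* set once the name has been remove()'d */
};

/*
 * Build a path for vp by walking the v_dd chain up to the root, filling
 * buf from the end backwards.  Returns 0 for a live path, 1 when the path
 * is valid but some component is flagged as removed, and -1 when vp is
 * NULL or buf is too small.
 */
static int
model_fullpath(const struct model_vnode *vp, char *buf, size_t buflen)
{
    const struct model_vnode *cur;
    size_t pos = buflen;
    int flagged = 0;

    assert(buf != NULL);
    if (vp == NULL || buflen < 2)
        return -1;
    buf[--pos] = '\0';
    for (cur = vp; cur->v_dd != NULL; cur = cur->v_dd) {
        size_t len = strlen(cur->name);

        if (cur->removed)
            flagged = 1;
        if (pos < len + 1)          /* need room for "/" plus the name */
            return -1;
        pos -= len;
        memcpy(buf + pos, cur->name, len);
        buf[--pos] = '/';
    }
    if (buf[pos] == '\0')           /* vp is the root itself */
        buf[--pos] = '/';
    memmove(buf, buf + pos, buflen - pos);
    return flagged;
}
```

    In this model a rename() of any directory only has to update that
    one node's name and v_dd; every leaf below it picks up the new path
    on the next walk, which is the property the chain is meant to give.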

    In summary, the implementation would be:

    * Maintain vref'd v_dd pointers in leaf vnodes representing the directory
      tree to a leaf so they can't go away until the leaf vnode goes away.

    * Handle NFS server based file handle -> vnode translation by resolving
      the chain to root (doable because the NFS server has access to the
      related directory vnode for all such translations).

    * Use the namecache record when it exists, and

    * When asked to resolve a full path for which no namecache record
      exists, create the missing records by recursing upwards through the
      directory chain and scanning each directory to locate the name that
      translates to the underlying vnode.  DragonFly does this for the NFS
      server (see the cache_inefficient_scan() procedure in
      kern/vfs_cache.c in the DFly source for an example).

    That is more achievable in FreeBSD.  In fact, I would say that it is 
    VERY achievable in FreeBSD because you are not trying to maintain a
    fully coherent namecache like DragonFly does, you are simply maintaining
    enough information to be able to regenerate the path when the
    namecache record happens not to exist.  This means you don't have to
    fix all the places in FreeBSD where it unconditionally invalidates 
    large chunks of the namecache out of laziness in the original 
    implementation (that alone took several months for me to fix in
    DragonFly).   If nobody wants to do it, well, that's one thing, but 
    it's different from saying that it's impossible when it clearly is
    not impossible.

						-Matt



More information about the freebsd-hackers mailing list