SMP on FreeBSD 6.x and 7.0: Worth doing?
rwatson at FreeBSD.org
Tue Jan 1 16:52:05 PST 2008
On Wed, 26 Dec 2007, Adrian Chadd wrote:
> On 26/12/2007, Scott Long <scottl at samsco.org> wrote:
>> Yes, Squid is the ideal application for IFS. Do you still have any of your
>> work on this, and would you be able to share it?
> It'd be easy to rewrite it from scratch if IFS were recovered. In fact, the
> whole point behind IFS, way back when, is I could layer a user-space
> directory hierarchy on top of a kernel provided space and then do "stuff" (I
> had a POP3 Maildir-like server written using IFS back then.)
> The squid code wasn't difficult at all. The biggest problem back then was
> rebuilding the disk index - didn't I have some code to export the inode
> allocation bitmap via a special file in the filesystem so I didn't have to
> stat() each individual inode, or didn't I end up committing that?
> I'm happy to work on that later on next year. I've got enough non-disk Squid
> code to rewrite and optimise over the next few months; the storage side is
> going to have to wait a while.
Do you think the IFS model offers significant benefits from an application
perspective over, say, the fh*() model used by Arla? This approach originated,
as far as I am aware, with the AFS implementation from CMU, in which new
ioctls added by CMU provided give-me-a-free-inode and open-by-inode-number
operations, and flagged inodes as "in use by AFS" even though they weren't
hooked up to the
namespace. fsck then knew to skip them, but the UFS implementation was
otherwise largely unmodified.
In the slightly less intrusive Arla view of the world, cache files do appear
in the UFS name space, but an independent namespace is maintained by the cache
manager. Each cache file has two file system names: a normal path (used to
delete the cache file if required), and its NFS file handle, which can be used
to open, stat, etc, the file without a normal file system namespace operation.
The user application can allocate a set of inodes in some arbitrary directory
tree using normal operations (ideally in advance), but when it does so it also
queries the NFS file handles for the files using getfh(2). It then performs
all later accesses using the file handles (fhopen(2), fhstat(2), etc), unless
they are
invalidated due to, say, moving the cache to a new file system, in which case
the handle database can be rebuilt by re-getfh(2)'ing the files using the
actual file system namespace. It also passes the file handles to the kernel
for use by the nnpfs synthetic file system for file access...
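
To make the bookkeeping concrete, here is a rough sketch of the two-names-per-file
database and its rebuild step. It is not Arla's code. Python's standard os module
does not expose getfh(2) or fhopen(2), so the pair (st_dev, st_ino) stands in for
the NFS file handle, and a stat() comparison stands in for a handle-based access
failing. The function names (handle_of, populate, handle_is_valid, rebuild) and
the slotNNNN file names are made up for illustration.

```python
import os
import shutil
import tempfile


def handle_of(path):
    # Assumption: (st_dev, st_ino) is a stand-in for the handle that
    # getfh(2) would return; it identifies the file independently of its name.
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def populate(cache_dir, count):
    # Pre-allocate cache files with ordinary namespace operations and record
    # both names for each one: the real path and the (stand-in) handle.
    db = {}
    for slot in range(count):
        path = os.path.join(cache_dir, "slot%04d" % slot)
        with open(path, "wb"):
            pass
        db[slot] = {"path": path, "handle": handle_of(path)}
    return db


def handle_is_valid(entry):
    # Stand-in for a handle-based access succeeding: the recorded handle
    # must still identify the file behind the recorded name.
    try:
        return handle_of(entry["path"]) == entry["handle"]
    except FileNotFoundError:
        return False


def rebuild(db, cache_dir):
    # Re-derive every handle from the real names under cache_dir, as one
    # would after moving the cache to a new file system.
    for entry in db.values():
        entry["path"] = os.path.join(cache_dir, os.path.basename(entry["path"]))
        entry["handle"] = handle_of(entry["path"])


if __name__ == "__main__":
    old_dir = tempfile.mkdtemp()
    new_dir = old_dir + ".moved"
    db = populate(old_dir, 4)
    shutil.copytree(old_dir, new_dir)  # "move" the cache: files get new inodes
    shutil.rmtree(old_dir)
    stale = [slot for slot, entry in db.items() if not handle_is_valid(entry)]
    rebuild(db, new_dir)
    print(len(stale), "stale handles rebuilt")
    shutil.rmtree(new_dir)
```

In a real implementation on FreeBSD the handle would come from getfh(2) and the
accesses would go through fhopen(2) and fhstat(2); the real path is kept only so
that the file can be deleted and the handle re-queried.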
Last time I looked closely, it seemed like the main downside to this vs. IFS
was that you did in fact need real file system names for files with the fh*()
approach, even though you never used them except for create/destroy. As long
as the application effectively "cached" the inodes for reuse, rather than
unlinking/creating frequently, this wasn't a problem. It did, however, mean
that a whole new metadata layer didn't have to be created, as it would for an
IFS, and that fsck required no modifications, unlike the AFS approach. So
Squid (or
whatever) would need to populate a tree and build a DB with file handles as
well as real names in case the DB has to be rebuilt. You'd also have to be
careful about crash-recovery state to make sure the squid DB agreed with the
contents of the files when coming up after a crash, if reusing inodes rather
than unlinking/reallocating them.
Robert N M Watson
University of Cambridge