Meta-Data & stackable FS

Robert Watson rwatson at FreeBSD.org
Thu Jul 13 03:31:40 GMT 2000


On Thu, 13 Jul 2000, Marius Bendiksen wrote:

> CC:'ing R. Watson on this, as he is ExtAttr guy for FreeBSD.
> 
> > Ok, one problem I see with using extended attributes is that the amount
> > of storage space is limited.  Would something like the following be
> 
> Fault Robert Watson and the POSIX.1e list on this. I've been arguing that
> this needs to be fixed.

Actually, this is a little inaccurate :-).  It's important to distinguish
(a) the API and (b) the implementation in a specific file system.  It is
our belief that while the semantics of the extended attribute call are
limited, and intentionally so (atomic replacement, etc), they do not bound
potential size.  In fact, the API can quite happily front a number of
underlying implementations, including the Linux variable-size
implementation.  I believe that implementation places a bound of 256k per
attribute, but that is an implementation detail.  The FreeBSD
implementation is optimized for fixed-size attributes (such as security
labels), but could easily be replaced with something more flexible.
Patches are, as they say, welcome :-).  The POSIX.1e mailing list has had
the specific goal of having relatively well-defined semantics for the API
while retaining the ability to support different underlying
implementations.
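
To make the API-versus-implementation split concrete, here is a minimal
Python sketch. It is not the POSIX.1e draft interface or any kernel's code:
the names ExtAttrBackend, FixedSizeBackend, VariableSizeBackend and
AttrTooLarge are hypothetical, and the 16-byte slot and 256k bound are
illustrative values only. It shows one whole-value get/set interface
(each set replaces the value in one step) sitting in front of two
backends whose size limits differ.

```python
from abc import ABC, abstractmethod


class AttrTooLarge(Exception):
    """Raised when a backend cannot store a value of the requested size."""


class ExtAttrBackend(ABC):
    """Hypothetical interface: whole-value get, whole-value replace."""

    @abstractmethod
    def set(self, file_id, name, value):
        ...

    @abstractmethod
    def get(self, file_id, name):
        ...


class FixedSizeBackend(ExtAttrBackend):
    """Each attribute lives in one slot of at most slot_size bytes."""

    def __init__(self, slot_size):
        self.slot_size = slot_size
        self._slots = {}

    def set(self, file_id, name, value):
        if len(value) > self.slot_size:
            raise AttrTooLarge(f"{len(value)} > {self.slot_size}")
        # The old value is swapped for the new one in a single step;
        # callers never observe a partially written attribute.
        self._slots[(file_id, name)] = bytes(value)

    def get(self, file_id, name):
        return self._slots[(file_id, name)]


class VariableSizeBackend(ExtAttrBackend):
    """Attributes may be any length up to an implementation-chosen bound."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._values = {}

    def set(self, file_id, name, value):
        if len(value) > self.max_size:
            raise AttrTooLarge(f"{len(value)} > {self.max_size}")
        self._values[(file_id, name)] = bytes(value)

    def get(self, file_id, name):
        return self._values[(file_id, name)]


def relabel(backend, file_id, label):
    """Caller code is identical regardless of which backend is in use."""
    backend.set(file_id, "label", label)
    return backend.get(file_id, "label")
```

The caller (relabel) never sees a size limit in the interface itself; the
limit only surfaces as an error from whichever backend happens to be
underneath.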

> > Does the above sound feasible and kosher?
> 
> Actually, I'd say you want to have the metadata inode pointed to by the EA
> instead. And, as for limited storage, first realize that an EA can hold
> about 1k, IIRC. Second, nag R. Watson about changing this to doing things
> the way I've suggested ;).

One thing you may want to do is take a look at the Linux EA
implementation, which behaves in approximately that manner.  With
snapshots and soft updates in FFS, there's a bit more work involved than
in the ext2fs implementation, as you have to work out dependencies and
copy-on-write vnodes.

> Oh, as an aside. If neither of these resolve your problem, stick a file in
> /, called something like ".foolayer", and save an index in the EA instead.
> That saves you from cluttering up the disk with a million inodes.

This is in effect what the FFS attribute implementation does.  It indexes
an array file using the inode number.  It's important to observe, however,
that "inode number" is local to FFS, and not meaningful in distributed
file systems such as Coda and AFS, where such semantics are far from
useful (those two file systems use 96-bit unique identifiers, not 32-bit
ones).  I can use the inode number to index EAs in FFS because the
indexing happens within a single VFS layer, not between layers.  Between
layers, only the vnode pointer is unique, and only for the lifetime of a
vnode instantiation.  In Linux, where inode numbers do provide guarantees
in the kernel, this would be possible, but inadvisable given the enormous
suffering that occurs when an inode number collision occurs for AFS or
Coda :-).
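
As a rough illustration of indexing an array file by inode number, here
is a minimal user-space Python sketch. It is not the FreeBSD on-disk
format: the AttrArrayFile class, the 64-byte record payload and the
flag-plus-length header are all assumptions made for the example. The
point is only that a record's position is computed directly from the
inode number, which is why the scheme works inside one file system and
not across layers where that number has no meaning.

```python
import io
import struct

RECORD_DATA = 64                   # assumed maximum attribute size, in bytes
HEADER = struct.Struct("<BH")      # assumed header: in-use flag, data length
RECORD_SIZE = HEADER.size + RECORD_DATA


class AttrArrayFile:
    """One fixed-size attribute record per inode, stored in a flat file."""

    def __init__(self, backing):
        self.backing = backing     # any seekable binary file object

    def _offset(self, ino):
        # The inode number is the array index; no lookup structure needed.
        return ino * RECORD_SIZE

    def set(self, ino, value):
        if len(value) > RECORD_DATA:
            raise ValueError("attribute does not fit in a fixed-size record")
        record = HEADER.pack(1, len(value)) + value.ljust(RECORD_DATA, b"\0")
        self.backing.seek(self._offset(ino))
        self.backing.write(record)

    def get(self, ino):
        self.backing.seek(self._offset(ino))
        record = self.backing.read(RECORD_SIZE)
        if len(record) < RECORD_SIZE:
            return None            # beyond end of file: never written
        in_use, length = HEADER.unpack_from(record)
        if not in_use:
            return None            # zero-filled hole: never written
        return record[HEADER.size:HEADER.size + length]
```

For example, AttrArrayFile(io.BytesIO()) gives an in-memory store; set(5,
b"label-a") followed by get(5) returns the label, while get() on an inode
that was never written returns None.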

  Robert N M Watson 

robert at fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services



