Extended attribute interfaces

Andreas Gruenbacher a.gruenbacher at bestbits.at
Tue Jun 20 21:08:12 GMT 2000


All,

here are some more thoughts on extended attributes in response
to Robert Watson's message from 18 Jun 2000 12:50:25 -0400 (EDT).

Please comment...

I can imagine the following operations on extended attributes:
get, set (implicit create), remove, enumerate.

There are other, more complex operations, such as atomically
updating a number of extended attributes or passing flags like
O_CREAT and O_EXCL when modifying extended attributes. I
think we don't really need this level of complexity. Many such
ideas can be found in the Tru64 manual pages; see
<http://www.unix.digital.com/faqs/publications/pub_page/doc_list.html>.

I think we agree about get, set and remove. I think we also
agree that extended attributes are not like files. Accessing
them doesn't require an open()/.../close() sequence on a
stream. Instead, attribute values are copied in and out of a
buffer atomically. This simplifies the interface, and also
makes it faster.

Speed is important since extended attributes are accessed
frequently when they are used for ACLs and MAC labels, for
example. The key issue will be to cut down on the number of
disk seeks required to fetch extended attributes of a file.

ENUMERATING ATTRIBUTES

So far, FreeBSD has no mechanism to enumerate the attributes
that exist. My ideas about this operation are as follows.

Enumerating attributes is important for checking the state of
files/directories, and for backing up. I can think of at
least two different interfaces for implementing enumeration.
One possibility is an opendir()/readdir()/closedir() like
interface. The other choice is to simply copy the names of
all extended attributes into a buffer.

A readdir() like interface requires keeping state about the
enumeration in the kernel. This raises issues such as
cleaning up after a process terminates. Another problem is
what happens when EAs are changed while they are being
enumerated.

I have implemented and favor the latter approach (copy the list
of EA names into a buffer). I expect that there will not be
many extended attributes on a single file. I assume that a
fixed upper limit (of, say, 50 EAs) is no problem at all.
Assuming that there will only be a small number of extended
attributes, fetching a list of attribute names into a buffer is
faster than opening an enumeration, fetching entries,
and closing the enumeration. No state needs to be kept in the
kernel.

The disadvantage of this interface is that it has a scalability
problem with large numbers of attributes.
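
As a rough illustration of the buffer-based approach (not the
actual implementation), the sketch below models the call in
Python. The NUL-terminated packing, the "return the needed size"
convention, and the attribute names are assumptions made for the
example; the message does not specify the buffer layout.

```python
def pack_names(names):
    """Pack attribute names into one run of NUL-terminated strings."""
    return b"".join(n.encode("ascii") + b"\0" for n in names)


def list_attrs(names, buf):
    """Copy the packed name list into buf (a bytearray).

    Returns the number of bytes the full list needs. The copy only
    happens if buf is large enough, so the caller never sees a
    truncated name, and no state is kept between calls.
    """
    packed = pack_names(names)
    if len(packed) <= len(buf):
        buf[:len(packed)] = packed
    return len(packed)


def unpack_names(buf, length):
    """Split the first `length` bytes of buf back into names."""
    return [n.decode("ascii") for n in buf[:length].split(b"\0") if n]


# Hypothetical attribute names, for illustration only.
buf = bytearray(64)
needed = list_attrs(["$acl", "comment", "origin"], buf)
if needed <= len(buf):
    print(unpack_names(buf, needed))
```

A caller whose buffer is too small can retry once with a buffer
of the returned size. This is also where the scalability concern
shows up: the whole list has to fit in one buffer.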

There might be another problem with enumerating attributes.
In my current implementation, everyone with search access to
a file may obtain the list of attribute names defined for that
file. (Permission to read the attribute value may still be
denied.)

This seems fine for user attributes. I'm not sure whether the
knowledge that a file is associated with a certain system
attribute is sensitive information.

It might become necessary to hide system attributes from
non-privileged processes (e.g., expose all attributes to
processes capable of CAP_DAC_READ_SEARCH, but hide system
attributes from all processes not capable of
CAP_DAC_READ_SEARCH, or something similar).
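
A minimal sketch of such a filter, assuming the '$' prefix
convention for system attributes described under ATTRIBUTE
NAMESPACES. The function name and the boolean capability flag
are hypothetical stand-ins for a real capability check.

```python
def visible_names(names, has_dac_read_search):
    """Return the attribute names a process may see when enumerating.

    has_dac_read_search stands in for a check that the process is
    capable of CAP_DAC_READ_SEARCH.
    """
    if has_dac_read_search:
        return list(names)
    # Unprivileged: hide everything in the system namespace.
    return [n for n in names if not n.startswith("$")]


# Hypothetical attribute names, for illustration only.
names = ["$acl", "comment", "$cap"]
print(visible_names(names, has_dac_read_search=False))
print(visible_names(names, has_dac_read_search=True))
```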

ERROR VALUES

We (Robert and I) have discussed which error symbol to use for
a missing attribute. Currently, I'm using ENOATTR, which is an
alias for EDOM. This of course is a hack. I would favor having
a real ENOATTR symbol in glibc. Using any of the existing error
codes is not very appropriate; strerror() would give a wrong
message.
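
The effect of the alias can be seen in a small sketch. The
get_attr() helper and the dictionary standing in for a file's
attributes are hypothetical; only the ENOATTR-to-EDOM aliasing
comes from the text above.

```python
import errno
import os

# Stopgap alias as described above: ENOATTR borrows EDOM's number.
ENOATTR = errno.EDOM


def get_attr(attrs, name):
    """Look up an attribute in a dict; fail with ENOATTR if missing."""
    try:
        return attrs[name]
    except KeyError:
        raise OSError(ENOATTR, os.strerror(ENOATTR)) from None


try:
    get_attr({}, "comment")
except OSError as e:
    # The text shown is the C library's message for EDOM, which is
    # about math domain errors and says nothing about attributes.
    print(e.errno == errno.EDOM, e.strerror)
```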

ATTRIBUTE NAMESPACES

Robert has already mentioned there are two attribute namespaces
on Irix (user and root). In my Linux implementation, there is a
user and a system namespace. However, in my implementation the
only difference between a user EA and a system EA is the
attribute name: system attributes start with a '$' character,
while user attributes start with an English-alphabet character.

At the filesystem level, it makes no sense to distinguish
between user and system attributes. The difference lies only in
how the kernel uses extended attributes, not in how they are
stored. So encoding the attribute namespace in the attribute
name also simplifies the interface between the kernel and
file systems.
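
The naming rule can be sketched as a small classifier. Rejecting
names that start with any other character is an assumption made
for the example; the message only says how system and user names
begin.

```python
import string


def namespace_of(name):
    """Derive the namespace from the first character of the name."""
    if not name:
        raise ValueError("empty attribute name")
    first = name[0]
    if first == "$":
        return "system"
    if first in string.ascii_letters:
        return "user"
    # Assumption: anything else is not a valid attribute name.
    raise ValueError("invalid attribute name: %r" % name)


# Hypothetical attribute names, for illustration only.
for n in ("$acl", "comment"):
    print(n, namespace_of(n))
```

Because the namespace is recoverable from the name alone, a
filesystem can store both kinds of attribute through one code
path.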

ATTRIBUTE MANIPULATION SYSCALLS

If I understood correctly, Robert suggests manipulating user
attributes using one set of system calls, and manipulating
each different system attribute using a separate mechanism.

I think it's easier to use the same kernel interface for user
as well as system attributes. That way, fewer system calls
are necessary. Also, backup and restore utilities can be
implemented without having to know exactly which system
attributes the kernel implements.

That is also what I've implemented. When manipulation of an
attribute is requested, the kernel checks if it's a system
attribute. If it is, the kernel looks for a handler for that
attribute. If a handler exists, that handler takes care of
permission checking. The handler also checks for legal
attribute values, and for set operations, whether the user
is allowed to set that attribute to the value given.
If no handler is found for a system attribute, the kernel
declines the attribute manipulation request.

If it's a user attribute, a user attribute handler is used.

One disadvantage of this `unified' interface is that the
kernel needs to look up a handler in the system calls.
The overhead involved is minimal, though.
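
The lookup described above might be sketched as follows. This is
a toy model, not the kernel code: the handler signatures, the
"$acl" name, the dictionary used as storage, and the choice of
EPERM for a declined request are all assumptions.

```python
import errno


def user_handler(store, op, name, value=None):
    """Generic handler for user attributes (permission checks omitted)."""
    if op == "set":
        store[name] = value
        return None
    if op == "get":
        return store[name]
    if op == "remove":
        del store[name]
        return None
    raise ValueError("unknown operation: " + op)


def acl_handler(store, op, name, value=None):
    """Hypothetical handler for a "$acl" system attribute."""
    if op == "set" and not value:
        # The handler is where attribute values get validated.
        raise OSError(errno.EINVAL, "empty ACL")
    return user_handler(store, op, name, value)  # reuse the storage logic


# One table lookup per call is the whole cost of the unified interface.
SYSTEM_HANDLERS = {"$acl": acl_handler}


def dispatch(store, op, name, value=None):
    """Route a request to the handler for the attribute's namespace."""
    if name.startswith("$"):
        handler = SYSTEM_HANDLERS.get(name)
        if handler is None:
            # No handler registered: decline the request.
            raise OSError(errno.EPERM, "no handler for " + name)
        return handler(store, op, name, value)
    return user_handler(store, op, name, value)
```

A backup tool built on dispatch() can save and restore "$acl"
without knowing what an ACL is, which is the point made above
about backup and restore utilities.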

A possible advantage of using the extended-attribute interface
for system attributes as well is that system attributes are
already passed to the kernel as extended attributes. No further
conversion is necessary; the value constructed in user space
can be passed on to the filesystem for storing. (The current
implementation doesn't yet take advantage of this, though.)

ATTRIBUTE PERMISSIONS

In Irix, user attributes are subject to the same permissions as
the file/directory they are associated with. As this seemed to
make sense to me, I've implemented it that way for Linux, too.

For system attributes, it depends on the specific attribute
which permissions are required for retrieving and setting. For
example, setting ACLs is allowed for the owner and for processes
capable of CAP_FOWNER. For filesystem capabilities, the
CAP_SETFCAP capability is required.

In Robert's version, the different policies would each be
implemented by a separate system call. In my version, the
attribute handlers implement these rules.
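
As a sketch of per-attribute policies living in handlers, the
example below encodes the two rules mentioned above. The Process
type, the "$acl" and "$cap" attribute names, and the policy
table are hypothetical; only the capability names and the rules
themselves come from the text.

```python
from dataclasses import dataclass


@dataclass
class Process:
    uid: int
    caps: frozenset = frozenset()  # capability names held by the process


def may_set_acl(proc, file_owner_uid):
    """ACLs: allowed for the file owner or a CAP_FOWNER process."""
    return proc.uid == file_owner_uid or "CAP_FOWNER" in proc.caps


def may_set_fcap(proc, file_owner_uid):
    """Filesystem capabilities: CAP_SETFCAP is required."""
    return "CAP_SETFCAP" in proc.caps


# Each system attribute brings its own rule; no extra syscall needed.
SET_POLICY = {"$acl": may_set_acl, "$cap": may_set_fcap}


def may_set(proc, file_owner_uid, name):
    """Ask the attribute's own policy; unknown attributes are declined."""
    policy = SET_POLICY.get(name)
    return policy is not None and policy(proc, file_owner_uid)


owner = Process(uid=1000)
print(may_set(owner, 1000, "$acl"), may_set(owner, 1000, "$cap"))
```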


Regards,
Andreas.

------------------------------------------------------------------------
 Andreas Gruenbacher, a.gruenbacher at computer.org
 Contact information: http://www.bestbits.at/~ag/