final decision about *at syscalls
John Baldwin
jhb at FreeBSD.org
Thu Dec 20 10:43:15 PST 2007
On Tuesday 18 December 2007 04:22:22 am Roman Divacky wrote:
> Dear arch@
>
> Over this summer I was working (among other things) on the *at family of syscalls
> kindly sponsored by Google (in their Summer of Code). The resulting patch is
> almost finished but I need to decide one design question. If you are not interested
> in *at/namei feel free to skip this mail.
>
> The *at syscalls are a thread-oriented extension to the basic file syscalls (think
> of open(), stat(), etc.) adding the ability to specify where the lookup
> of a relative path should start.
>
> Imagine that we have /tmp/foo/bar
>
> and CWD is set to "/tmp/", and the process has opened "foo" as dirfd. With the ordinary
> open() syscall you have to either
>
> chdir("/tmp/foo");open("./bar");
>
> or
>
> open("/tmp/foo/bar");
>
> The first approach is problematic because it changes CWD for all threads in the process;
> the second is prone to race conditions, as some of the components of the path can
> change in parallel with the "open".
>
> So POSIX introduced a new API, called "Extended API set part 2, ISBN: 1-931624-67-4" (at
> least this was the latest when I looked last time), which solves that by introducing "*at"
> syscalls that take an fd of a previously opened directory, which is used instead of CWD
> for resolving relative paths, i.e. the previous example becomes
>
> dirfd = open("/tmp/foo"); openat(dirfd, "bar");
>
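
For reference, the dirfd-relative lookup described above can be sketched
in C roughly as follows. This is only an illustration, assuming the
openat(fd, path, flags) argument order from the POSIX spec;
open_in_dir() is a made-up helper name, not something from the patch.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open "name" relative to the directory "dirpath"
 * without changing the process-wide CWD.  Returns a file descriptor,
 * or -1 with errno set.
 */
int
open_in_dir(const char *dirpath, const char *name)
{
	int dirfd, fd, saved_errno;

	dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
	if (dirfd == -1)
		return (-1);

	/* The lookup of "name" starts at dirfd, not at the CWD. */
	fd = openat(dirfd, name, O_RDONLY);

	/* Keep openat()'s errno even if close() changes it. */
	saved_errno = errno;
	close(dirfd);
	errno = saved_errno;
	return (fd);
}
```

With the layout from the example, open_in_dir("/tmp/foo", "bar") would
open /tmp/foo/bar while other threads keep whatever CWD they had.
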
> I implemented the whole API as native FreeBSD syscalls + in the linuxulator emulation layer.
> Here's the problem:
>
> There are two approaches to the translation from "file descriptor" to "vnode".
>
> 1) we can do it in the kern_fooat() syscall and pass namei() the resulting vnode
> 2) we can pass namei() the file descriptor and do the translation there
>
> PROs of #1:
>
> o namei() does not need to know about the curthread, you can use this *at
> ability for different purposes, it's cleaner (imho)
>
> PROs of #2
>
> o raceless implementation
> o no code duplication
>
> CONs of #1
>
> o some very small code duplication (the translation is done in every
> kern_fooat() function)
> o there is a race between the name translation and the actual use of the result
> of the translation that needs to be handled, the "path_to_file" string is copied
> to the kernel space twice hence a race
>
> CONs of #2
>
> o namei() is made thread dependent
>
> Please tell me which approach you like more. I personally favour #1 because I don't like namei()
> being thread dependent; Kostik Belousov prefers #2.
Considering Robert's paper on security race problems in things like systrace
stemming from when you copy parameters out of userland and into the kernel
multiple times, I think #2 is definitely the better choice. Also, namei() is
already thread aware AFAICT since 'struct componentname' already contains a
'cnp_thread' member (was 'cnp_proc' in 4.x).
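
To make the double-copy concern concrete, here is a small user-space
sketch. It is not kernel code: fake_copyin(), racer_t, swap_path() and
both *_consistent() functions are hypothetical names invented for the
illustration, and the "racing thread" is simulated by a callback that
runs between the two steps.

```c
#include <stddef.h>
#include <string.h>

#define PATH_BUF 64

/* Stand-in for a kernel copyin-style routine (hypothetical). */
static void
fake_copyin(const char *uaddr, char *kaddr, size_t len)
{
	strncpy(kaddr, uaddr, len - 1);
	kaddr[len - 1] = '\0';
}

/* Simulates another thread rewriting the user buffer. */
typedef void (*racer_t)(char *ubuf);

void
swap_path(char *ubuf)
{
	strcpy(ubuf, "/etc/passwd");
}

/*
 * Double fetch: the path is copied once for the check and again for
 * the use.  Returns 1 if both steps saw the same string, 0 otherwise.
 */
int
double_fetch_consistent(char *ubuf, racer_t racer)
{
	char checked[PATH_BUF], used[PATH_BUF];

	fake_copyin(ubuf, checked, sizeof(checked));
	if (racer != NULL)
		racer(ubuf);		/* window between the two copies */
	fake_copyin(ubuf, used, sizeof(used));
	return (strcmp(checked, used) == 0);
}

/*
 * Single fetch: the path is copied once, and both the check and the
 * use read the kernel-side copy, so the racer cannot affect them.
 */
int
single_fetch_consistent(char *ubuf, racer_t racer)
{
	char kbuf[PATH_BUF], checked[PATH_BUF];

	fake_copyin(ubuf, kbuf, sizeof(kbuf));
	memcpy(checked, kbuf, sizeof(checked));	/* what the check saw */
	if (racer != NULL)
		racer(ubuf);		/* modifies ubuf only */
	return (strcmp(checked, kbuf) == 0);	/* the use reads kbuf */
}
```

Called with a buffer holding "/tmp/foo/bar" and swap_path as the racer,
the double-fetch variant sees two different paths while the single-fetch
variant does not, which is the property that motivates doing the
translation in one place.
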
--
John Baldwin