posix_fallocate(2) && posix_fadvise(2) are somewhat broken

Konstantin Belousov kostikbel at gmail.com
Tue Dec 8 19:13:35 UTC 2015


On Tue, Dec 08, 2015 at 07:54:06PM +0100, Dag-Erling Smørgrav wrote:
> Konstantin Belousov <kostikbel at gmail.com> writes:
> > Dag-Erling Smørgrav <des at des.no> writes:
> > > Maxim Sobolev <sobomax at FreeBSD.org> writes:
> > > > Hi, while working on some unrelated feature I've noticed that at least
> > > > those two system calls are not returning proper value (-1) on error.
> > > > Instead actual errno value is returned from the syscall verbatim,
> > > > i.e. posix_fadvise() returns 22 on EINVAL.
> > > That's how syscalls work.
> > No, this is not how typical syscalls work, but is how the posix_fallocate()
> > and posix_fadvise() are specified by Posix.  The patch is wrong, see also
> > r261080 and r288640.
> 
> Umm, I can't find the code ATM but syscalls store the actual return
> value in td_retval and return 0 or EWHATEVER and the syscall wrapper
> handles the translation.  If that's not what Maxim was talking about,
> then please ignore me.
I mean that a typical syscall does not return the error code to usermode:
it returns -1 and sets errno. But the usermode conventions for the
posix_f*e() functions are different, and I believe this is what tripped
up Maxim and what I reacted to.

Indeed, the kernel expects the syscall function from the sysentvec table
to return an error or zero. If zero is returned, then the td_retval array is
translated into the return value for usermode by cpu_set_syscall_retval().
If non-zero is returned, the typical kernel/libc interface returns the
syscall function's return value to usermode and additionally sets a flag
(like PSL_C in the processor status word). Of course, there is an
additional translation layer in the usermode syscall trampolines.
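
The flow described above can be modelled in plain user-space C. This is only
a toy sketch, not kernel code: every model_* name is invented for the
illustration, the "carry" field merely stands in for a flag like PSL_C, and
the real cpu_set_syscall_retval() is machine-dependent and handles more cases
than shown here.

```c
#include <errno.h>

/* Toy stand-in for the per-thread result slots (td_retval). */
struct model_thread {
	long td_retval[2];
};

/* Toy stand-in for the register state handed back to usermode. */
struct model_frame {
	long result;	/* value register */
	int carry;	/* error flag, plays the role of PSL_C */
};

/* Hypothetical syscall function: returns 0 or an error number. */
static int
model_sys_double(struct model_thread *td, long arg)
{
	if (arg < 0)
		return (EINVAL);
	td->td_retval[0] = arg * 2;	/* the actual result goes here */
	return (0);
}

/* Plays the role the text assigns to cpu_set_syscall_retval(). */
static void
model_set_retval(struct model_thread *td, struct model_frame *fr, int error)
{
	if (error == 0) {
		fr->result = td->td_retval[0];
		fr->carry = 0;
	} else {
		fr->result = error;	/* error number passed verbatim */
		fr->carry = 1;
	}
}

/* Usermode trampoline: maps the flag onto the -1/errno convention. */
static long
model_trampoline(const struct model_frame *fr)
{
	if (fr->carry) {
		errno = (int)fr->result;
		return (-1);
	}
	return (fr->result);
}

/* Full round trip for the hypothetical syscall. */
static long
model_syscall_double(long arg)
{
	struct model_thread td = { { 0, 0 } };
	struct model_frame fr = { 0, 0 };

	model_set_retval(&td, &fr, model_sys_double(&td, arg));
	return (model_trampoline(&fr));
}
```

In this model, a wrapper following the posix_f*e() convention would skip the
errno step in the trampoline and hand fr->result back directly when the flag
is set.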


> 
> Anyway, happy to hear that the X/Open group have found a new way to
> screw people over.
It is the same as the pthread_* conventions.  They are somewhat consistent.


More information about the freebsd-current mailing list