svn commit: r320900 - in head/sys: fs/cd9660 fs/ext2fs fs/fifofs fs/msdosfs fs/nandfs fs/nfsclient fs/smbfs fs/tmpfs ufs/ufs

Bruce Evans brde at optusnet.com.au
Thu Jul 13 10:42:39 UTC 2017


On Wed, 12 Jul 2017, John Baldwin wrote:

> On Wednesday, July 12, 2017 08:08:42 PM Bruce Evans wrote:
>> On Tue, 11 Jul 2017, John Baldwin wrote:
>>
>>> Log:
>>>  Consistently use vop_stdpathconf() for default pathconf values.
>> ...
>> This is sort of backwards.  Not many settings are the same for most
>> file systems, and many that did were wrong.  Now more are wrong.
>>
>>> Modified: head/sys/fs/cd9660/cd9660_vnops.c
>>> ...
>>> -	case _PC_PIPE_BUF:
>>> -		*ap->a_retval = PIPE_BUF;
>>> -		return (0);
>>
>> PIPE_BUF isn't system wide.  cd9660 does support fifos, but setting it
>> for files that aren't fifos or directories is bogus.  POSIX allows such
>> bogus settings, and needs complicated wording to describe this.  E.g.,
>> PIPE_BUF for a directory is not completely bogus and it means that the
>> value for the directory applies to all fifos in the directory and is
>> ignored for all non-fifos in the directory.  It is unclear if PIPE_BUF
>> must be set for a directory if there is any fifo in the directory.
>> FreeBSD's man pages give no details about this.
>
> PIPE_BUF is system wide because all FIFOs in FreeBSD are implemented via
> a common fifo implementation that has a single size.  The wording about
> PIPE_BUF from
> http://pubs.opengroup.org/onlinepubs/009695399/functions/fpathconf.html:
>
>    If path refers to a FIFO, or fildes refers to a pipe or FIFO, the
>    value returned shall apply to the referenced object. If path or fildes
>    refers to a directory, the value returned shall apply to any FIFO that
>    exists or can be created within the directory. If path or fildes refers
>    to any other type of file, it is unspecified whether an implementation
>    supports an association of the variable name with the specified file.
>
> This leaves the value for non-directories and non-FIFOs undefined.  We
> could perhaps add more complexity to fail with EINVAL for types other
> than VFIFO or VDIR, and filesystems that do not support FIFOs could
> fail _PC_PIPE_BUF in their fs-specific vnop, but this would not be any
> more correct as both behaviors fall into "undefined", but the current
> one is simpler to implement.

This is what I said was unclear.  It is clear to standards lawyers but
not even hinted at in FreeBSD man pages.

>> This is system-wide.
>>
>>> 	case _PC_NO_TRUNC:
>>> 		*ap->a_retval = 1;
>>> 		return (0);

There is another bug with this, and a similar, larger one with
CHOWN_RESTRICTED, like the one for NAME_MAX.

{NAME_MAX} is variable, so defining NAME_MAX is a bug and since it
varies a lot defaulting _PC_NAME_MAX is worse than useless.

Similarly for NO_TRUNC and CHOWN_RESTRICTED, except they are POSIX
limits and always have prefixes and there are further bugs in the
defaulting:
- _POSIX_NO_TRUNC is defined as 1.  This means that all file systems
   have it.  But at least shortnames for msdosfs don't have it.  As
   previously discussed, this is not defaulted, but it is better for
   defaulting than _PC_NAME_MAX since 1 is correct for most file systems.
- _POSIX_CHOWN_RESTRICTED is defined as 1.  This means that all file
   systems have it.  Perhaps they are actually do.  It is defaulted, but
   its default value is spelled 1 instead of _POSIX_CHOWN_RESTRICTED.

>> - NAME_MAX.  This is very far from system-wide
>> - PATH_MAX.  This is system-wide.
>> - LINK_MAX.  This is very far from system-wide, especially after expansion
>>    of nlink_t.  Most file systems get this wrong.  LINK_MAX is bogusly
>>    defined to be the constant 32767, but the value is very variable.
>>    zfs_pathconf() returns INT_MAX and zfs uses something like int32_t
>>    internally, but zfs_getattr() clamps to LINK_MAX.  ext2fs limits to
>>    32000 for ext2 and 65000 for ext4 internally but ext2_pathconf()
>>    only returns this limit in some cases -- other cases return INT_MAX.
>
> In theory if a filesystem supported an infinite link count or name count
> it would have no filesystem-specific limit and having LINK_MAX and
> NAME_MAX provide the system limits gives a reasonable default in that case.

NAME_MAX is required to give the limit for all file systems if it is
defined.  _POSIX_NAME_MAX is what gives a "reasonable" default (it actually
gives the minimum value for a "reasonable" {NAME_MAX}).

For newer POSIX limits, a macro value of 0 means that the feature might
be supported (and all its headers are present).  So it wouldn't be
such good style to use the macro value as the default -- pathconf() must
give the runtime value which is rarely 0.  We only use 0 for
_POSIX_IPV6 and _POSIX_PRIORITY_SCHEDULING which are not under pathconf().

> I don't know if any filesystem supports infinite limits in practice, but
> it's also true that the correct behavior in each filesystem would be to
> return a value like MIN(fs_limit, system_limit), not just the native
> filesystem limit.  Perhaps this could be implemented by a post-processing
> step for VOP_PATHCONF that truncates values that exceed system limits to
> system limits.

The file system should limit itself to the system limit.  This already
happens for {LINK_MAX}.  zfs does this with bugs by not limiting the
link count but reporting wrong link counts when the correct link count
is not representable.

pathconf() is limited to LONG_MAX, so there is always a system limit of
LONG_MAX.

nlink_t was recently bloated to uint64_t.  UINT64_MAX is not representable
by pathconf() even on 64-bit systems.

>> This is easy to improve:
>> - NAME_MAX, LINK_MAX: move to individual fs's
>
> That probably is true.  In this case I was just trying to reduce overrides
> to the ones that were really fs-specific.  The existing places that used
> NAME_MAX and LINK_MAX were also likely wrong as they did not return a
> filesystem-specific value.
>
>> - MAX_CANON, MAX_INPUT, VDISABLE: move to devfs_pathconf().
>
> That is probably fine as well as it is simple and doesn't add complexity.
>
>> - PIPE_BUF: move to fifo_pathconf()?  What about directories?
>
> I think this one should be standard for the reasons I gave above.  If
> we really wanted to, we could make it conditional on the type (VFIFO and
> VDIR) without adding a lot of complexity.

Special files are also very easy to classify in vop_stdpathconf().

>>> Modified: head/sys/fs/fifofs/fifo_vnops.c
>>> ...
>>> -static int
>>> -fifo_pathconf(ap)
>>> -	struct vop_pathconf_args /* {
>>> -		struct vnode *a_vp;
>>> -		int a_name;
>>> -		int *a_retval;
>>> -	} */ *ap;
>>> -{
>>> -
>>> -	switch (ap->a_name) {
>>> -	case _PC_LINK_MAX:
>>> -		*ap->a_retval = LINK_MAX;
>>> -		return (0);
>>> -	case _PC_PIPE_BUF:
>>> -		*ap->a_retval = PIPE_BUF;
>>> -		return (0);
>>> -	case _PC_CHOWN_RESTRICTED:
>>> -		*ap->a_retval = 1;
>>> -		return (0);
>>> -	default:
>>> -		return (EINVAL);
>>> -	}
>>> -	/* NOTREACHED */
>>> }
>>
>> This was small and hopefully correct.  Not many settings apply to fifos.
>> It seems a but too small to be correct -- why would LINK_MAX apply but
>> not PATH_MAX?  Using the default, a pathconf which wants to set only 3
>> value would need about 15 cases to kill all the defaults.
>
> This was not really correct since at least NAME_MAX was missing, but it also
> should never have been used as a full VOP.  The parent filesystem in which the
> FIFO resides dictates values like LINK_MAX and NAME_MAX.  The only safe way to

It is actually correct.  Name limits are properties of directories (really
file systems/mount points), and NAME_MAX and PATH_MAX are explicitly
unspecified for non-directories (requirement 4).

LINK_MAX and CHOWN_RESTRICTED are more properties of individual files.
POSIX seems to be broken since it only requires something for them for
directories (requirement 1 and 7).  Requirement 4 about PIPE_BUF has the
complications needed to not have this bug.

>* ...
>> I expect that some of the other defaulted limits will become more variable
>> (so unsuitable as defaults).  LINK_MAX is almost there.
>>
>>> @@ -2505,11 +2493,6 @@ ufs_pathconf(ap)
>>> 	case _PC_MIN_HOLE_SIZE:
>>> 		*ap->a_retval = ap->a_vp->v_mount->mnt_stat.f_iosize;
>>> 		break;
>>> -	case _PC_ASYNC_IO:
>>> -		/* _PC_ASYNC_IO should have been handled by upper layers. */
>>> -		KASSERT(0, ("_PC_ASYNC_IO should not get here"));
>>> -		error = EINVAL;
>>> -		break;
>>> 	case _PC_PRIO_IO:
>>> 		*ap->a_retval = 0;
>>> 		break;
>>
>> But some of the newer limits like are in vfs so should have been defaulted?
>
> _PC_ASYNC_IO used to be explicitly handled in the pathconf system calls as a
> special case.  I had moved it to vop_stdpathconf() so that it was handled in
> one place (rather than 3), but had missed these specific handlers in existing
> pathconf VOPs.  System-wide pathconf values should really be in one place in
> vop_stdpathconf().  We might want to alter the list of things in
> vop_stdpathconf(), but I think all filesystems should fall through to it
> and only override values that are filesystem-specific.

ufs, ext2fs and zfs have many newer POSIX limits.   I think it is a bug
for unistd.h to not define these as 0 (supported by headers and
libraries but not by all file systems), but not many of these should
be defaulted.

Bruce


More information about the svn-src-head mailing list