[Bug 294768] fdescfs: `fdesc_getattr` mutates `vp->v_type` to `VLNK` on first `stat()`, breaking subsequent `open()` of pipe-fd entries on `linrdlnk` mounts
Date: Sat, 25 Apr 2026 06:09:53 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=294768
Bug ID: 294768
Summary: fdescfs: `fdesc_getattr` mutates `vp->v_type` to
`VLNK` on first `stat()`, breaking subsequent `open()`
of pipe-fd entries on `linrdlnk` mounts
Product: Base System
Version: 15.0-RELEASE
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: bugs@FreeBSD.org
Reporter: mmpestorich@gmail.com
Attachment #270087 mime type: text/plain
Created attachment 270087
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=270087&action=edit
Small C program to reproduce the bug
## Summary
On a `linrdlnk` fdescfs mount (e.g., the standard Linuxulator fdescfs mount
inside a Linux jail), the first `stat()` (or `lstat()` / `fstatat()`) on
`/dev/fd/N` — where `N` is a pipe fd in the calling process — silently poisons
the cached fdescfs vnode for fd `N`. Every subsequent `open()` of that same
path, from any process in the same jail, returns `ENOENT`, until the fdescfs
filesystem is `umount`ed and remounted.
The same entry is reached via `/proc/self/fd/N`, because linprocfs simply
symlinks `fd` to `/dev/fd`, so the bug surfaces identically through both paths.
The bug is **per-fd-index** (poisoning fd 11 doesn't affect fd 12),
**pipe-specific** (regular-file fds are not affected), and **persistent across
`fork`/`exec`** (it's mount-scoped state).
User-visible impact: zsh's `source <(...)` always fails inside a Linux jail.
zsh's `source` builtin probes the path with `stat()` before opening; that probe
is the trigger. bash's `source` doesn't stat first and so doesn't trip the bug.
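The difference between the two shells comes down to whether a `stat()` precedes the `open()`. The following is a minimal sketch of the two access patterns as described above. The helper name `open_for_source` and its `probe_first` parameter are hypothetical and are not taken from the zsh or bash sources.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper modelling the two access patterns described in
 * the report. With probe_first != 0 it stat()s the path before opening
 * it (the pattern attributed to zsh's `source`); with probe_first == 0
 * it opens the path directly (the pattern attributed to bash).
 * Returns an open descriptor, or -1 with errno set.
 */
static int
open_for_source(const char *path, int probe_first)
{
	struct stat st;

	assert(path != NULL);
	if (probe_first && stat(path, &st) == -1)
		return (-1);
	return (open(path, O_RDONLY));
}
```

On an affected mount, `open_for_source("/dev/fd/11", 1)` for a pipe fd would be expected to fail with `ENOENT` in the `open()` step even though the `stat()` returns 0. The `probe_first == 0` variant would be expected to succeed as long as nothing has stat'ed that entry since the mount.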
## Reproducer
The attached 120-line standalone C program compiles inside any Linux jail
with `linrdlnk`-mounted fdescfs and prints a deterministic `POISONED` or `OK`
verdict. The relevant excerpt:
```c
/* Set up a pipe with read end at fd 11, fork a producer that
 * writes "x\n" to the write end and exits. */
setup_pipe_at_fd_11();
/* Step 1: external cat of /proc/self/fd/11 succeeds (baseline). */
verify_via_external_cat("/proc/self/fd/11"); /* OK */
/* Step 2: a single stat() call on the path. */
struct stat st;
stat("/proc/self/fd/11", &st); /* returns 0 */
/* Step 3: external cat of the same path now fails. */
verify_via_external_cat("/proc/self/fd/11"); /* POISONED, ENOENT */
/* Step 4: a fresh forked child setting up its own fd 11 also fails. */
fork_child_does_setup_then_external_cat(); /* POISONED, ENOENT */
```
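The excerpt relies on helpers from the attachment that are not reproduced here. As a rough idea of what the setup step involves, here is a hypothetical stand-in, not the attachment's code. The names `setup_pipe_at_fd` and `fd_is_pipe` are assumptions, and the payload is written by the calling process instead of a forked producer to keep the sketch short.

```c
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the reproducer's setup step: place the
 * read end of a pipe holding "x\n" at descriptor `target`.
 * Returns 0 on success, -1 on error.
 */
static int
setup_pipe_at_fd(int target)
{
	int fds[2];

	assert(target >= 0);
	if (pipe(fds) == -1)
		return (-1);
	/* Two bytes fit in an empty pipe buffer, so this cannot block. */
	if (write(fds[1], "x\n", 2) != 2) {
		close(fds[0]);
		close(fds[1]);
		return (-1);
	}
	close(fds[1]);		/* readers see EOF after "x\n" */
	if (fds[0] != target) {
		if (dup2(fds[0], target) == -1) {
			close(fds[0]);
			return (-1);
		}
		close(fds[0]);
	}
	return (0);
}

/* Returns 1 if fd refers to a pipe/FIFO, 0 if not, -1 on error. */
static int
fd_is_pipe(int fd)
{
	struct stat st;

	if (fstat(fd, &st) == -1)
		return (-1);
	return (S_ISFIFO(st.st_mode) ? 1 : 0);
}
```

`fd_is_pipe` uses `fstat()` on the descriptor itself, which does not go through a `/dev/fd/N` path lookup, so it can confirm the setup without acting as the trigger under test.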
Operations matrix (each row run on a freshly-mounted fdescfs):
| Operation | Triggers? |
| ------------------------------------------- | --------- |
| `noop` | NO |
| `stat("/proc/self/fd/11")` | **YES** |
| `lstat("/proc/self/fd/11")` | **YES** |
| `stat("/dev/fd/11")` | **YES** |
| `lstat("/dev/fd/11")` | **YES** |
| `fstatat(AT_FDCWD, ".../fd/11", ..., 0)` | **YES** |
| `access("/proc/self/fd/11", R_OK)` | NO |
| `readlink("/proc/self/fd/11")` | NO |
| `open("/proc/self/fd/11", O_RDONLY)` | NO |
| `stat("/proc/self/fd/0")` (regular file fd) | NO |
## Root cause
`sys/fs/fdescfs/fdesc_vnops.c:474` in `fdesc_getattr`:
```c
static int
fdesc_getattr(struct vop_getattr_args *ap)
{
	struct vnode *vp = ap->a_vp;
	struct vattr *vap = ap->a_vap;
	...
	case Fdesc:
		vap->va_type = (VFSTOFDESC(vp->v_mount)->flags &
		    (FMNT_RDLNKF | FMNT_LINRDLNKF)) == 0 ? VCHR : VLNK;
		...
		break;
	...
	vp->v_type = vap->va_type;	/* <-- THE BUG */
	return (0);
}
```
That assignment copies `vap->va_type` (which is `VLNK` for both `rdlnk` and
`linrdlnk` modes) into the cached vnode itself. For `linrdlnk` mode, that
mutates `v_type` from `VNON` (its allocation-time value) to `VLNK`.
`linrdlnk`'s design intent (`fdesc_allocvp` lines 191–196) is to keep `v_type`
non-`VLNK` while setting `VV_READLINK`, so that `readlink()` returns
Linux-style strings while `namei()` does NOT follow the entry as a symlink:
```c
if (ftype == Fdesc) {
	if ((fmp->flags & FMNT_RDLNKF) != 0)
		vp->v_type = VLNK;		/* rdlnk: real symlink */
	else if ((fmp->flags & FMNT_LINRDLNKF) != 0)
		vp->v_vflag |= VV_READLINK;	/* linrdlnk: readlink only */
}
```
`fdesc_open` depends on namei NOT following the entry as a symlink — it returns
`ENODEV` to signal `dupfdopen()`, which is the magic that makes
`open("/dev/fd/N")` equivalent to `dup(N)` for the caller. If namei follows the
vnode as a symlink first (because `v_type == VLNK`), it never reaches
`fdesc_open` and the dup magic never happens.
After the lazy mutation in `fdesc_getattr`, the check at
`kern/vfs_lookup.c:1345` triggers a symlink follow:
```c
if ((dp->v_type == VLNK) &&
    ((cnp->cn_flags & FOLLOW) || (cnp->cn_flags & TRAILINGSLASH) ||
    *ndp->ni_next == '/')) {
	cnp->cn_flags |= ISSYMLINK;
	...
}
```
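To make the effect of that condition concrete, here is a small user-space model of the predicate. It is a sketch, not kernel code: the `M_`-prefixed names and their values are stand-ins invented for illustration, and it assumes that the final-component lookup for a plain `open()` carries `FOLLOW`.

```c
#include <assert.h>
#include <stdbool.h>

/* Model-only stand-ins; the values are NOT the kernel's definitions. */
enum m_vtype { M_VNON, M_VCHR, M_VLNK };
#define	M_FOLLOW	0x1u
#define	M_TRAILINGSLASH	0x2u

/* Mirrors the quoted condition: follow only if the vnode is a VLNK. */
static bool
m_would_follow(enum m_vtype v_type, unsigned int cn_flags, char next)
{
	return (v_type == M_VLNK &&
	    ((cn_flags & M_FOLLOW) != 0 ||
	    (cn_flags & M_TRAILINGSLASH) != 0 || next == '/'));
}

/* Usage: a linrdlnk entry before and after the first stat(). */
static void
m_follow_examples(void)
{
	/* Before: v_type is VNON, so the lookup does not follow. */
	assert(!m_would_follow(M_VNON, M_FOLLOW, '\0'));
	/* After: v_type was mutated to VLNK, so the same lookup follows. */
	assert(m_would_follow(M_VLNK, M_FOLLOW, '\0'));
}
```

In this model the lookup flags are identical in both calls. Only the mutated `v_type` changes the outcome, which matches the report's claim that one `stat()` is enough to change how later lookups behave.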
namei calls `VOP_READLINK` → `fdesc_readlink`. For non-vnode-backed fd file
types (`fp->f_type != DTYPE_VNODE`, e.g. pipes, sockets) it returns the literal
string `"anon_inode:[unknown]"`. namei tries to walk that string as a path,
fails to find any such file, and returns `ENOENT`.
The pipe-specificity is explained by `fdesc_readlink`'s switch on `fp->f_type`:
regular-file fds (`DTYPE_VNODE`) take the `vn_fullpath` branch and return a
real path; everything else returns the unresolvable `anon_inode:[unknown]`.
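The final `ENOENT` can be illustrated from user space by trying to open the literal target string as a path. This is a simplified sketch: the helper name `errno_from_open` is hypothetical, and the string is resolved relative to the current directory here, not relative to the fdescfs directory as it would be during the real lookup.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* The literal target the report says fdesc_readlink returns for
 * descriptors that are not vnode-backed. */
#define	ANON_TARGET	"anon_inode:[unknown]"

/*
 * Hypothetical helper: returns 0 if `path` can be opened read-only,
 * otherwise the errno reported by open().
 */
static int
errno_from_open(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (errno);
	close(fd);
	return (0);
}
```

Unless a file with that exact name exists in the current directory, `errno_from_open(ANON_TARGET)` is expected to report `ENOENT`, the same error the report observes once the entry is followed as a symlink.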
## Prior art
Commit [`3bffa22`](https://cgit.freebsd.org/src/commit/?id=3bffa22) (kostikbel,
2023-06-27) attempted to set `vp->v_type = VLNK` directly in `fdesc_allocvp`
for `FMNT_LINRDLNKF`. Commit
[`9c3bfe2`](https://cgit.freebsd.org/src/commit/?id=9c3bfe2) (kostikbel,
2023-07-13) reverted it citing **"linuxolator expectations"** incompatibility —
i.e., the intent was specifically to keep `v_type` non-`VLNK` for linrdlnk so
that the dup-on-open magic works.
The revert restored `VV_READLINK` at allocation time but did not remove the
equivalent mutation in `fdesc_getattr` line 474, which runs lazily on the first
stat. That second mutation reproduces exactly the breakage the revert was
trying to avoid.
I did not find any existing Bugzilla PR matching this symptom, so filing this
as new.
## Suggested fix
Three options, in order of cleanliness:
```c
/* Option 1: remove line 474 entirely.
 *
 * v_type is correctly initialized at allocation for all three
 * fdescfs modes (RDLNKF=VLNK at alloc; LINRDLNKF=VNON+VV_READLINK
 * at alloc; neither flag → VNON, returning va_type=VCHR via getattr
 * but the vnode itself is never opened as a CHR device because
 * fdesc_open intercepts). The runtime mutation in getattr is at
 * best redundant (RDLNKF, no-op) and at worst harmful (LINRDLNKF,
 * the bug).
 */
/* Option 2: only set if currently unset. Caveat: per the analysis
 * above, a linrdlnk vnode is still VNON after allocation, so this
 * guard would let the first getattr set VLNK anyway; it is a no-op
 * only for vnodes whose v_type was already set at allocation. */
if (vp->v_type == VNON)
	vp->v_type = vap->va_type;
/* Option 3: explicitly skip for linrdlnk. */
if ((VFSTOFDESC(vp->v_mount)->flags & FMNT_LINRDLNKF) == 0)
	vp->v_type = vap->va_type;
```
I'd recommend Option 1 — the line shouldn't exist; coupling the caller-visible
`va_type` to the kernel-cached `vp->v_type` is what created the bug.
## Environment
- Host kernel: `FreeBSD 15.0-STABLE stable/15-n282575-973d607b284b GENERIC amd64`
- Architecture: amd64
- Kernel config: `GENERIC` (no `WITNESS`, no `INVARIANTS`; not a debug build)
- Jail: Devuan Excalibur (Linux ABI compat) with the standard four Linux-jail
mounts:
```
fdescfs <jailroot>/dev/fd fdescfs rw,linrdlnk
linprocfs <jailroot>/proc linprocfs rw
linsysfs <jailroot>/sys linsysfs rw
devfs <jailroot>/dev devfs rw
```
(`linrdlnk` is added in the mount options; userspace `mount(8)` doesn't
display it but the kernel honors it, as evidenced by `readlink()` returning
`anon_inode:[unknown]` for pipe fds.)
- The bug should reproduce on any Linux ABI jail with `linrdlnk` fdescfs on any
reasonably recent FreeBSD — the buggy line has been in `fdesc_getattr` for many
years; the 2023 `linrdlnk` revert (commit `9c3bfe2`) didn't catch it.
## Severity
Affects some people — anyone running zsh inside a FreeBSD Linux jail (a common
developer setup, especially with oh-my-zsh which calls `source <(...)` during
shell startup). Userland workaround exists (wrap `source` to copy through a
temp file, avoiding the triggering `stat()`), but it requires per-system or
per-user dotfile changes. The bug class — userspace `stat()` corrupting kernel
filesystem state across processes — is also worth fixing on its own merits.
## Attaching
- The C reproducer (`fdescfs-stat-poisons-pipe-fd.c`) as a `text/plain`
attachment.
- I have not built and tested a patched kernel myself; the one-line fix
proposals are based on source reading plus the behavioral evidence above. Happy
to test a patch.
## Reporter context
I encountered this bug in a production-ish Linuxulator + Devuan jail setup
hosting per-developer containers — zsh + oh-my-zsh became unusable on first
login, which is what surfaced it. The deterministic C reproducer was developed
empirically by narrowing what I observed in that environment.
The kernel source analysis (identifying `fdesc_vnops.c:474` as the trigger, the
2023 `9c3bfe2` revert as relevant prior art, the proposed fix shape) was done
with LLM assistance — I'm comfortable in C but not a kernel developer. I read
the cited line numbers in the local `/usr/src/sys/fs/fdescfs/` and
`/usr/src/sys/kern/` sources and verified that the analysis is consistent with
observed behavior, but I would appreciate independent verification by someone
who knows fdescfs and `namei` internals, which I do not. Per the FreeBSD Q2 2025
Core Team status report, this kind of LLM-assisted bug analysis ("tracking down
bugs", "helping to understand large code bases") is within the
explicitly-endorsed acceptable use; flagging the assistance here for
transparency, not because policy requires it. Happy to test patches and provide
additional data from this host.