[Bug 294768] fdescfs: `fdesc_getattr` mutates `vp->v_type` to `VLNK` on first `stat()`, breaking subsequent `open()` of pipe-fd entries on `linrdlnk` mounts

From: <bugzilla-noreply_at_freebsd.org>
Date: Sat, 25 Apr 2026 06:09:53 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=294768

            Bug ID: 294768
           Summary: fdescfs: `fdesc_getattr` mutates `vp->v_type` to
                    `VLNK` on first `stat()`, breaking subsequent `open()`
                    of pipe-fd entries on `linrdlnk` mounts
           Product: Base System
           Version: 15.0-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: mmpestorich@gmail.com
Attachment #270087 mime type: text/plain

Created attachment 270087
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=270087&action=edit
Small C program to reproduce the bug

## Summary

On a `linrdlnk` fdescfs mount (e.g., the standard Linuxulator fdescfs mount
inside a Linux jail), the first `stat()` (or `lstat()` / `fstatat()`) on
`/dev/fd/N` — where `N` is a pipe fd in the calling process — silently poisons
the cached fdescfs vnode for fd `N`. Every subsequent `open()` of that same
path, from any process in the same jail, returns `ENOENT`, until the fdescfs
filesystem is `umount`ed and remounted.

The same entry is also reachable as `/proc/self/fd/N`, because linprocfs
symlinks `fd` to `/dev/fd`, so the bug surfaces identically through both paths.
The bug is **per-fd-index** (poisoning fd 11 doesn't affect fd 12),
**pipe-specific** in my testing (regular-file fds are not affected; per the
root-cause analysis, other non-vnode fds such as sockets should behave like
pipes), and **persistent across `fork`/`exec`** (it's mount-scoped state).

User-visible impact: zsh's `source <(...)` always fails inside a Linux jail.
zsh's `source` builtin probes the path with `stat()` before opening; that probe
is the trigger. bash's `source` doesn't stat first and so doesn't trip the bug.

## Reproducer

The attached 120-line standalone C program compiles inside any Linux jail
with `linrdlnk`-mounted fdescfs and prints a deterministic `POISONED` or `OK`
verdict. The relevant excerpt:

```c
/* Set up a pipe with read end at fd 11, fork a producer that
 * writes "x\n" to the write end and exits. */
setup_pipe_at_fd_11();

/* Step 1: external cat of /proc/self/fd/11 succeeds (baseline). */
verify_via_external_cat("/proc/self/fd/11");   /* OK */

/* Step 2: a single stat() call on the path. */
struct stat st;
stat("/proc/self/fd/11", &st);                  /* returns 0 */

/* Step 3: external cat of the same path now fails. */
verify_via_external_cat("/proc/self/fd/11");   /* POISONED, ENOENT */

/* Step 4: a fresh forked child setting up its own fd 11 also fails. */
fork_child_does_setup_then_external_cat();      /* POISONED, ENOENT */
```
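
Steps 2 and 3 can also be approximated inside a single process, without the
external `cat`. The following is a minimal sketch rather than the attached
reproducer: the helper names `fd_path` and `stat_then_open` are hypothetical,
and the open happens in-process. On an affected mount, the analysis in this
report suggests `stat_then_open("/dev/fd", 11, 1)` would return `ENOENT`; on an
unaffected system it would be expected to return 0.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: write "<dir>/<fd>" into buf.
 * Returns 0 on success, -1 if the result does not fit. */
static int
fd_path(char *buf, size_t len, const char *dir, int fd)
{
    int n = snprintf(buf, len, "%s/%d", dir, fd);

    return ((n > 0 && (size_t)n < len) ? 0 : -1);
}

/* Hypothetical helper: put the read end of a fresh pipe at target_fd
 * (replacing whatever was open there), optionally stat() the path
 * "<dir>/<target_fd>", then open() the same path.
 * Returns 0 if every step succeeded, otherwise the first errno seen. */
static int
stat_then_open(const char *dir, int target_fd, int do_stat)
{
    char path[64];
    struct stat st;
    int p[2], fd, result = 0;

    if (fd_path(path, sizeof(path), dir, target_fd) != 0)
        return (ENAMETOOLONG);
    if (pipe(p) != 0)
        return (errno);
    if (p[0] == target_fd || p[1] == target_fd) {
        /* The pipe already landed on target_fd; keep the sketch simple. */
        close(p[0]);
        close(p[1]);
        return (EBUSY);
    }
    if (dup2(p[0], target_fd) < 0) {
        result = errno;
        close(p[0]);
        close(p[1]);
        return (result);
    }
    close(p[0]);

    /* The stat() probe is the suspected trigger. */
    if (do_stat && stat(path, &st) != 0)
        result = errno;

    /* The write end stays open, so reopening a FIFO would not block. */
    if (result == 0) {
        fd = open(path, O_RDONLY | O_NONBLOCK);
        if (fd < 0)
            result = errno;
        else
            close(fd);
    }
    close(target_fd);
    close(p[1]);
    return (result);
}
```

Calling it with `do_stat` set to 0 on a freshly mounted fdescfs gives the
baseline; a later call with `do_stat` set to 1 exercises the suspected trigger.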

Operations matrix (each row run on a freshly-mounted fdescfs):

| Operation                                   | Triggers? |
| ------------------------------------------- | --------- |
| `noop`                                      | NO        |
| `stat("/proc/self/fd/11")`                  | **YES**   |
| `lstat("/proc/self/fd/11")`                 | **YES**   |
| `stat("/dev/fd/11")`                        | **YES**   |
| `lstat("/dev/fd/11")`                       | **YES**   |
| `fstatat(AT_FDCWD, ".../fd/11", ..., 0)`    | **YES**   |
| `access("/proc/self/fd/11", R_OK)`          | NO        |
| `readlink("/proc/self/fd/11")`              | NO        |
| `open("/proc/self/fd/11", O_RDONLY)`        | NO        |
| `stat("/proc/self/fd/0")` (regular file fd) | NO        |

## Root cause

`sys/fs/fdescfs/fdesc_vnops.c:474` in `fdesc_getattr`:

```c
static int
fdesc_getattr(struct vop_getattr_args *ap)
{
    struct vnode *vp = ap->a_vp;
    struct vattr *vap = ap->a_vap;
    ...
    case Fdesc:
        vap->va_type = (VFSTOFDESC(vp->v_mount)->flags &
            (FMNT_RDLNKF | FMNT_LINRDLNKF)) == 0 ? VCHR : VLNK;
        ...
        break;
    ...
    vp->v_type = vap->va_type;     /* <-- THE BUG */
    return (0);
}
```

The marked line writes `vap->va_type` (which is `VLNK` in both `rdlnk` and
`linrdlnk` modes) back into the cached vnode. In `linrdlnk` mode, that changes
`v_type` from `VNON` (its allocation-time value) to `VLNK`.

`linrdlnk`'s design intent (`fdesc_allocvp` lines 191–196) is to keep `v_type`
non-`VLNK` while setting `VV_READLINK`, so that `readlink()` returns
Linux-style strings while `namei()` does NOT follow the entry as a symlink:

```c
if (ftype == Fdesc) {
    if ((fmp->flags & FMNT_RDLNKF) != 0)
        vp->v_type = VLNK;                 /* rdlnk: real symlink */
    else if ((fmp->flags & FMNT_LINRDLNKF) != 0)
        vp->v_vflag |= VV_READLINK;        /* linrdlnk: readlink only */
}
```

`fdesc_open` depends on namei NOT following the entry as a symlink — it returns
`ENODEV` to signal `dupfdopen()`, which is the magic that makes
`open("/dev/fd/N")` equivalent to `dup(N)` for the caller. If namei follows the
vnode as a symlink first (because `v_type == VLNK`), it never reaches
`fdesc_open` and the dup magic never happens.
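
The dup-like behaviour can be observed from userspace by reopening a descriptor
through `/dev/fd` and comparing `fstat()` identity. A minimal sketch, using the
hypothetical helper name `reopen_matches_fd`; it assumes `fd` is open for
reading and that `/dev/fd` is available. It only checks that both descriptors
refer to the same object, which is weaker than proving `dup()` semantics.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: reopen fd via /dev/fd/<fd> and compare identity.
 * Returns 1 if both descriptors report the same st_dev/st_ino,
 * 0 if they differ, and -1 if the reopen or an fstat() call failed. */
static int
reopen_matches_fd(int fd)
{
    char path[32];
    struct stat a, b;
    int nfd, same;

    snprintf(path, sizeof(path), "/dev/fd/%d", fd);
    nfd = open(path, O_RDONLY | O_NONBLOCK);
    if (nfd < 0)
        return (-1);            /* on a poisoned entry this is the ENOENT case */
    if (fstat(fd, &a) != 0 || fstat(nfd, &b) != 0) {
        close(nfd);
        return (-1);
    }
    same = (a.st_dev == b.st_dev && a.st_ino == b.st_ino);
    close(nfd);
    return (same);
}
```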

After the lazy mutation in `fdesc_getattr`, `kern/vfs_lookup.c:1345` triggers
symlink follow:

```c
if ((dp->v_type == VLNK) &&
    ((cnp->cn_flags & FOLLOW) || (cnp->cn_flags & TRAILINGSLASH) ||
     *ndp->ni_next == '/')) {
    cnp->cn_flags |= ISSYMLINK;
    ...
}
```

namei calls `VOP_READLINK` → `fdesc_readlink`. For non-vnode-backed fd file
types (`fp->f_type != DTYPE_VNODE`, e.g. pipes, sockets) it returns the literal
string `"anon_inode:[unknown]"`. namei tries to walk that string as a path,
fails to find any such file, and returns `ENOENT`.
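
The final step can be imitated from userspace, assuming no file named
`anon_inode:[unknown]` exists in the current directory: opening the readlink
string as a relative path fails with `ENOENT`. The helper name `open_errno` is
hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path read-only.
 * Returns 0 on success, otherwise the errno from open(). */
static int
open_errno(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return (errno);
    close(fd);
    return (0);
}

/* open_errno("anon_inode:[unknown]") resolves the string relative to the
 * current directory, much as a followed symlink target would be resolved
 * relative to the directory containing the link. */
```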

The pipe-specificity is explained by `fdesc_readlink`'s switch on `fp->f_type`:
regular-file fds (`DTYPE_VNODE`) take the `vn_fullpath` branch and return a
real path; everything else returns the unresolvable `anon_inode:[unknown]`.
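
The chain described in this section can be condensed into a toy userspace
model. It is not kernel code: every `model_*` name is hypothetical, the enums
merely stand in for `v_type`, the mount flags and `fp->f_type`, and the lookup
step is reduced to a single branch. Only the `linrdlnk` case reflects behaviour
observed in this report.

```c
#include <errno.h>
#include <stdbool.h>

enum model_vtype { M_VNON, M_VCHR, M_VLNK };
enum model_mode  { M_PLAIN, M_RDLNK, M_LINRDLNK };

struct model_vnode {
    enum model_vtype v_type;
    bool vv_readlink;       /* stands in for VV_READLINK */
    enum model_mode mode;   /* stands in for the fdescfs mount flags */
    bool vnode_backed;      /* stands in for fp->f_type == DTYPE_VNODE */
};

/* Mirrors the fdesc_allocvp excerpt: only rdlnk sets VLNK up front. */
static struct model_vnode
model_allocvp(enum model_mode mode, bool vnode_backed)
{
    struct model_vnode vp = { M_VNON, false, mode, vnode_backed };

    if (mode == M_RDLNK)
        vp.v_type = M_VLNK;
    else if (mode == M_LINRDLNK)
        vp.vv_readlink = true;
    return (vp);
}

/* Mirrors the tail of fdesc_getattr, including the write-back. */
static void
model_getattr(struct model_vnode *vp)
{
    enum model_vtype va_type = (vp->mode == M_PLAIN) ? M_VCHR : M_VLNK;

    vp->v_type = va_type;   /* the suspected bug */
}

/* Mirrors the lookup decision: a VLNK entry is followed through its
 * readlink string, which only resolves for vnode-backed descriptors. */
static int
model_open(const struct model_vnode *vp)
{
    if (vp->v_type != M_VLNK)
        return (0);         /* reaches fdesc_open; the dup succeeds */
    return (vp->vnode_backed ? 0 : ENOENT);
}
```

In this model a `linrdlnk` pipe entry opens fine until `model_getattr` runs
once, after which `model_open` returns `ENOENT`, while a vnode-backed entry
keeps opening.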

## Prior art

Commit [`3bffa22`](https://cgit.freebsd.org/src/commit/?id=3bffa22) (kostikbel,
2023-06-27) attempted to set `vp->v_type = VLNK` directly in `fdesc_allocvp`
for `FMNT_LINRDLNKF`. Commit
[`9c3bfe2`](https://cgit.freebsd.org/src/commit/?id=9c3bfe2) (kostikbel,
2023-07-13) reverted it citing **"linuxolator expectations"** incompatibility —
i.e., the intent was specifically to keep `v_type` non-`VLNK` for linrdlnk so
that the dup-on-open magic works.

The revert restored `VV_READLINK` at allocation time but did not remove the
equivalent mutation in `fdesc_getattr` line 474, which runs lazily on the first
stat. That second mutation reproduces exactly the breakage the revert was
trying to avoid.

I did not find any existing Bugzilla PR matching this symptom, so filing this
as new.

## Suggested fix

Three options, in order of cleanliness:

```c
/* Option 1: remove line 474 entirely.
 *
 * v_type is correctly initialized at allocation for all three
 * fdescfs modes (RDLNKF=VLNK at alloc; LINRDLNKF=VNON+VV_READLINK
 * at alloc; neither flag → VNON, returning va_type=VCHR via getattr
 * but the vnode itself is never opened as a CHR device because
 * fdesc_open intercepts). The runtime mutation in getattr is at
 * best redundant (RDLNKF, no-op) and at worst harmful (LINRDLNKF,
 * the bug).
 */

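/* Option 2: only set if currently unset and not a readlink-only
 * vnode. A bare `v_type == VNON` guard is not sufficient: linrdlnk
 * vnodes are still VNON at the first getattr (see Option 1), so the
 * VV_READLINK check is what keeps them untouched. */
if (vp->v_type == VNON && (vp->v_vflag & VV_READLINK) == 0)
    vp->v_type = vap->va_type;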

/* Option 3: explicitly skip for linrdlnk. */
if ((VFSTOFDESC(vp->v_mount)->flags & FMNT_LINRDLNKF) == 0)
    vp->v_type = vap->va_type;
```

I'd recommend Option 1 — the line shouldn't exist; coupling the caller-visible
`va_type` to the kernel-cached `vp->v_type` is what created the bug.
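
To compare what the write-back does to `v_type`, it can be modelled in
userspace. A sketch only: the `sk_*` names are hypothetical, the struct keeps
just the fields relevant here, and only the unpatched behaviour, Option 1 and
Option 3 are modelled.

```c
#include <stdbool.h>

enum sk_vtype { SK_VNON, SK_VCHR, SK_VLNK };

struct sk_vnode {
    enum sk_vtype v_type;
    bool rdlnk;             /* stands in for FMNT_RDLNKF */
    bool linrdlnk;          /* stands in for FMNT_LINRDLNKF */
};

/* The va_type computation from the Fdesc case of fdesc_getattr. */
static enum sk_vtype
sk_va_type(const struct sk_vnode *vp)
{
    return ((vp->rdlnk || vp->linrdlnk) ? SK_VLNK : SK_VCHR);
}

/* Unpatched: unconditional write-back into the vnode. */
static enum sk_vtype
sk_getattr_current(struct sk_vnode *vp)
{
    enum sk_vtype t = sk_va_type(vp);

    vp->v_type = t;
    return (t);
}

/* Option 1: report va_type, never touch v_type. */
static enum sk_vtype
sk_getattr_opt1(struct sk_vnode *vp)
{
    return (sk_va_type(vp));
}

/* Option 3: skip the write-back on linrdlnk mounts only. */
static enum sk_vtype
sk_getattr_opt3(struct sk_vnode *vp)
{
    enum sk_vtype t = sk_va_type(vp);

    if (!vp->linrdlnk)
        vp->v_type = t;
    return (t);
}
```

In this model, Options 1 and 3 both leave a `linrdlnk` vnode at `VNON` while
still reporting `VLNK` to the caller; they differ only on mounts with neither
flag, where Option 3 still writes `VCHR` into `v_type`.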

## Environment

- Host kernel: `FreeBSD 15.0-STABLE stable/15-n282575-973d607b284b GENERIC amd64`
- Architecture: amd64
- Kernel config: `GENERIC`, without `WITNESS` or `INVARIANTS` (not a debug build)
- Jail: Devuan Excalibur (Linux ABI compat) with the standard four Linux-jail
mounts:
  ```
  fdescfs    <jailroot>/dev/fd    fdescfs   rw,linrdlnk
  linprocfs  <jailroot>/proc      linprocfs rw
  linsysfs   <jailroot>/sys       linsysfs  rw
  devfs      <jailroot>/dev       devfs     rw
  ```
  (`linrdlnk` is added in the mount options; userspace `mount(8)` doesn't
display it but the kernel honors it, as evidenced by `readlink()` returning
`anon_inode:[unknown]` for pipe fds.)
- The bug should reproduce on any Linux ABI jail with `linrdlnk` fdescfs on any
reasonably recent FreeBSD — the buggy line has been in `fdesc_getattr` for many
years; the 2023 `linrdlnk` revert (commit `9c3bfe2`) didn't catch it.

## Severity

Affects some people — anyone running zsh inside a FreeBSD Linux jail (a common
developer setup, especially with oh-my-zsh which calls `source <(...)` during
shell startup). Userland workaround exists (wrap `source` to copy through a
temp file, avoiding the triggering `stat()`), but it requires per-system or
per-user dotfile changes. The bug class — userspace `stat()` corrupting kernel
filesystem state across processes — is also worth fixing on its own merits.

## Attaching

- The C reproducer (`fdescfs-stat-poisons-pipe-fd.c`) as a `text/plain`
attachment.
- I have not built and tested a patched kernel myself; the one-line fix
proposals are based on source reading plus the behavioral evidence above. Happy
to test a patch.

## Reporter context

I encountered this bug in a production-ish Linuxulator + Devuan jail setup
hosting per-developer containers — zsh + oh-my-zsh became unusable on first
login, which is what surfaced it. The deterministic C reproducer was developed
empirically by narrowing what I observed in that environment.

The kernel source analysis (identifying `fdesc_vnops.c:474` as the trigger, the
2023 `9c3bfe2` revert as relevant prior art, the proposed fix shape) was done
with LLM assistance; I'm comfortable in C but not a kernel developer. I read
the code at the cited line numbers in the local `/usr/src/sys/fs/fdescfs/` and
`/usr/src/sys/kern/` sources and verified that the analysis is consistent with
observed behavior, but I would appreciate independent verification by someone
who knows fdescfs and `namei` internals better than I do. Per the FreeBSD Q2 2025
Core Team status report, this kind of LLM-assisted bug analysis ("tracking down
bugs", "helping to understand large code bases") is within the
explicitly-endorsed acceptable use; flagging the assistance here for
transparency, not because policy requires it. Happy to test patches and provide
additional data from this host.
