git: 2694c869ff9f - main - ktls: fix a panic with INVARIANTS

Ian Lepore ian at freebsd.org
Thu Aug 5 17:41:58 UTC 2021


On Thu, 2021-08-05 at 17:10 +0000, Andrew Gallatin wrote:
> The branch main has been updated by gallatin:
> 
> URL:
> https://cgit.FreeBSD.org/src/commit/?id=2694c869ff9ff60fd8e3d4d7936b7dc61763c18a
> 
> commit 2694c869ff9ff60fd8e3d4d7936b7dc61763c18a
> Author:     Andrew Gallatin <gallatin at FreeBSD.org>
> AuthorDate: 2021-08-05 17:05:00 +0000
> Commit:     Andrew Gallatin <gallatin at FreeBSD.org>
> CommitDate: 2021-08-05 17:09:06 +0000
> 
>     ktls: fix a panic with INVARIANTS
>     
>     98215005b747fef67f44794ca64abd473b98bade introduced a new
>     thread that uses tsleep(..0) to sleep forever.  This hit
>     an assert due to sleeping with a 0 timeout.
>     
>     So spell "forever" using SBT_MAX instead, which does not
>     trigger the assert.
>     
>     Pointy hat to: gallatin
>     Pointed out by: emaste
>     Sponsored by: Netflix
> ---
>  sys/kern/uipc_ktls.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sys/kern/uipc_ktls.c b/sys/kern/uipc_ktls.c
> index 17b87195fc50..47815c266667 100644
> --- a/sys/kern/uipc_ktls.c
> +++ b/sys/kern/uipc_ktls.c
> @@ -2240,7 +2240,7 @@ ktls_alloc_thread(void *ctx)
>         nbufs = 0;
>         for (;;) {
>                 atomic_store_int(&sc->running, 0);
> -               tsleep(sc, PZERO, "waiting for work", 0);
> +               tsleep_sbt(sc, PZERO, "waiting for work", SBT_MAX,
> SBT_1S, 0);
>                 atomic_store_int(&sc->running, 1);
>                 sc->wakeups++;
>                 if (nbufs != ktls_max_alloc) {

Finding a different way to spell "forever" is not a valid way to fix a
problem where you're being warned that it is not safe to sleep forever.

The assert was warning you that the code was vulnerable to hanging
forever due to a missed wakeup.  The code is still vulnerable to that,
but now the problem is hidden and will be very difficult to find (more
so because the wait message also violates the convention of using a
short name that can be displayed by tools such as ps(1) and SIGINFO,
where the wait-channel display is currently likely to show as
"waitin").

I haven't looked at the code outside of the few lines shown in the
commit diff, but based on the names involved, I suspect the right fix
is to protect sc->running with a mutex and use msleep() instead of
trying to roll-your-own with atomics.  That would certainly be true if
the wakeup code is some form of "if (!sc->running) wakeup(sc);"

-- Ian




More information about the dev-commits-src-all mailing list