List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
In-Reply-To: <202111301334.1AUDYJEU014078@gitrepo.freebsd.org>
From: Kyle Evans
Date: Tue, 30 Nov 2021 10:14:22 -0600
Subject: Re: git: 3d9d64aa1846 - main - kern_tc: unify timecounter to bintime delta conversion
To: Andriy Gapon
Cc: src-committers , "" , dev-commits-src-main@freebsd.org

On Tue, Nov 30, 2021 at 7:34 AM Andriy Gapon wrote:
>
> The branch main has been updated by avg:
>
> URL: https://cgit.FreeBSD.org/src/commit/?id=3d9d64aa1846217eac9229f8cba5cb6646a688b7
>
> commit 3d9d64aa1846217eac9229f8cba5cb6646a688b7
> Author: Andriy Gapon
> AuthorDate: 2021-11-30 13:23:23 +0000
> Commit: Andriy Gapon
> CommitDate: 2021-11-30 13:23:23 +0000
>
> kern_tc: unify timecounter to bintime delta conversion
>
> There are two places where we convert from a timecounter delta to
> a bintime delta: tc_windup and bintime_off.
> Both functions use the same calculations when the timecounter delta is
> small. But for a large delta (greater than approximately an equivalent
> of 1 second) the calculations were different. Both functions use
> approximate calculations based on th_scale that avoid division. Both
> produce values slightly greater than a true value, calculated with
> division by tc_frequency, would be. tc_windup is slightly more
> accurate, so its result is closer to the true value and, thus, smaller
> than bintime_off result.
>
> As a consequence there can be a jump back in time when time hands are
> switched after a long period of time (a large delta). Just before the
> switch the time would be calculated with a large delta from
> th_offset_count in bintime_off. tc_windup does the switch using its own
> calculations of a new th_offset using the large delta. As explained
> earlier, the new th_offset may end up being less than the previously
> produced binuptime. So, for a period of time new binuptime values may
> be "back in time" comparing to values just before the switch.
>
> Such a jump must never happen. All the code assumes that the uptime is
> monotonically nondecreasing and some code works incorrectly when that
> assumption is broken. For example, we have observed sleepq_timeout()
> ignoring a timeout when the sbinuptime value obtained by the callout
> code was greater than the expiration value, but the sbinuptime obtained
> in sleepq_timeout() was less than it. In that case the target thread
> would never get woken up.
>
> The unified calculations should ensure the monotonic property of the
> uptime.
>
> The problem is quite rare as normally tc_windup should be called HZ
> times per second (typically 1000 or 100). But it may happen in VMs on
> very busy hypervisors where a VM's virtual CPU may not get an execution
> time slot for a second or more.
>

I wonder if this helps explain the behavior we saw when enabling TSC
on VirtualBox guests. Threads doing short sleeps of about 1 second or
less would start to miss their wakeups, so we'd consistently see, e.g.,
shutdown issues after applying a heavy load, while we were waiting for
the bufdaemon threads.