From nobody Tue Jul 26 09:03:47 2022 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4LsWDR3jcBz4Xcfb; Tue, 26 Jul 2022 09:03:51 +0000 (UTC) (envelope-from kp@FreeBSD.org) Received: from smtp.freebsd.org (smtp.freebsd.org [96.47.72.83]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4LsWDR39jTz3gjZ; Tue, 26 Jul 2022 09:03:51 +0000 (UTC) (envelope-from kp@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1658826231; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=1MjOJoPjFAZzQ/OH/QquivoIPpBbjUXXZjnhkrQJgoQ=; b=dEgHJax9ht9mX8AmTiwxPjGWxvb3CPQPogmACrGerIRGTWPLoM2y83jMGzc5bmBbZo013V pBA5ZryPuy+ROdaVRWl6Dhzls4aI3/1m94cV8v2gibVZk1dBFN9nzpTNwPN2LbJ1Q5lmrL aQhW6wHJqY2fBgBoPaSSyev3rnMI3cH21FZ8o7qoBoA9/LiYoRkscbRwrYAzH1jHGPlUv/ 5f1AA8a2r69yhQvK2qJP72uZvq59xJvAnFXJWmynLK6WoU8R0YJLWGTH3OKO9NZsX0Uga8 jgE/qRO1Ei838RvTzRVsx85nxkL2VVGmjXcHxWIqpdIOhRR8rPsuPCKHKazXCg== Received: from venus.codepro.be (venus.codepro.be [5.9.86.228]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "mx1.codepro.be", Issuer "R3" (verified OK)) (Authenticated sender: kp) by smtp.freebsd.org (Postfix) with ESMTPSA id 4LsWDR0wrgz1SSf; Tue, 26 Jul 2022 09:03:51 +0000 (UTC) (envelope-from kp@FreeBSD.org) Received: by venus.codepro.be (Postfix, authenticated sender kp) id 145593AB6E; Tue, 26 Jul 2022 11:03:48 +0200 (CEST) From: Kristof Provost To: John Baldwin Cc: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org Subject: Re: git: 151abc80cde7 - main - if_vlan: avoid hash table thrashing when adding and removing entries Date: Tue, 26 Jul 2022 11:03:47 +0200 X-Mailer: MailMate (1.14r5852) Message-ID: <057F4F07-EFE1-4F67-B15D-857F42C5588D@FreeBSD.org> In-Reply-To: <1ad901aa-ac87-3725-de1e-3c6a42c50cc2@FreeBSD.org> References: <202207222319.26MNJEVY032683@gitrepo.freebsd.org> <1ad901aa-ac87-3725-de1e-3c6a42c50cc2@FreeBSD.org> List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-main@freebsd.org X-BeenThere: dev-commits-src-main@freebsd.org MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="=_MailMate_5ECFD0FD-8C38-43CD-ACD5-6EDDAE48A5DF_=" ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1658826231; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=1MjOJoPjFAZzQ/OH/QquivoIPpBbjUXXZjnhkrQJgoQ=; b=FDcrY97sQhybpG020eMBNUwpqZcgNvg/IiK25Nr4c66YlsHeaMFnNuW7o3vyjnPpc6OhjH Z0rrpiyqMSTf8aULCjVgjWSZ8F3talkQS8Os27V5d4zFZghyQFAaU1K6i+G9iqbNRmd0i7 qN4nJ900RLE1ly9OhCcmXzOSHHQZif2tuq8iaxil7iLqD3oq8d8IKGTDOQs7IbPsG4ATLl yu6IX7VYWbvl1rscyMQBqrsfqNNcbKtP+byliYNYNBmGWss6jwdXT5kb73DDOSaHI32WBA 0sUIT0OQO9g7/ixgyTQ/7W8XdW6ijkOsKANppqN+GqJPNh87J38p/AGj9XLHtA== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1658826231; a=rsa-sha256; cv=none; b=FI1xwFpqwQ9J3XNk6MyL5Zvs3QN2ke2jdRRMCdHwm6sUgg34wYhjv2GEBcPNKCGR69Ngej 1zuQL7SCjtHm34/376Yndi/LcnhHNMDDr6e0pVnlhZeUIrpZVcClowyEBf0XEALgjOqfms YDuApyumoZVCilTkDzdvFG2INaG/+Zd6xEnuxBgHFwHMh9WsHW/xXYG5leXcF7LUF0e3QG PlZUh9l/8yXX91UhdNZxO5WwixViwPB3jm63nqKHQFiVp6jOKvSYSfh1eCleLbTIvwTuA4 jqTNNl4f375aycmHhR5VL6ZyhPKePBllHBFAFzWf4h6tA4ZB3eFw8SrwOysReg== ARC-Authentication-Results: i=1; mx1.freebsd.org; none X-ThisMailContainsUnwantedMimeParts: N --=_MailMate_5ECFD0FD-8C38-43CD-ACD5-6EDDAE48A5DF_= Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable On 25 Jul 2022, at 18:20, John Baldwin wrote: > On 7/22/22 4:19 PM, Kristof Provost wrote: >> The branch main has been updated by kp: >> >> URL: = >> https://cgit.FreeBSD.org/src/commit/?id=3D151abc80cde778bc18b91c334d07= fbd52bbb38fb >> >> commit 151abc80cde778bc18b91c334d07fbd52bbb38fb >> Author: Kristof Provost >> AuthorDate: 2022-07-22 17:17:04 +0000 >> Commit: Kristof Provost >> CommitDate: 2022-07-22 17:18:41 +0000 >> >> if_vlan: avoid hash table thrashing when adding and removing = >> entries >> vlan_remhash() uses incorrect value for b. >> When using the default value for VLAN_DEF_HWIDTH (4), the = >> VLAN hash-list table >> expands from 16 chains to 32 chains as the 129th entry is added. = >> trunk->hwidth >> becomes 5. Say a few more entries are added and there are now = >> 135 entries. >> trunk-hwidth will still be 5. If an entry is removed, = >> vlan_remhash() will >> calculate a value of 32 for b. refcnt will be decremented to = >> 134. The if >> comparison at line 473 will return true and vlan_growhash() will = >> be called. The >> VLAN hash-list table will be compressed from 32 chains wide to = >> 16 chains wide. >> hwidth will become 4. This is an error, and it can be seen when = >> a new VLAN is >> added. The table will again be expanded. If an entry is then = >> removed, again >> the table is contracted. >> If the number of VLANS stays in the range of 128-512, each = >> time an insert >> follows a remove, the table will expand. Each time a remove = >> follows an >> insert, the table will be contracted. >> The fix is simple. The line 473 should test that the number = >> of entries has >> decreased such that the table should be contracted using what = >> would be the new >> value of hwidth. line 467 should be: >> b =3D 1 << (trunk->hwidth - 1); >> PR: 265382 >> Reviewed by: kp >> MFC after: 2 weeks >> Sponsored by: NetApp, Inc. > > This does mean that there are some odd edge cases (e.g. if you remove = > the 513th entry > only to turn around and re-add it) that can still thrash even with = > this fixed. Not > sure if those are worth worrying about though? If they were, perhaps = > you could > imagine some sort of "dampener" where you have to remove some number N = > of entries > to actually rehash to a smaller table. But that is probably rather = > complex to > implement and probably not worth it? > Good point, although I think I agree it=E2=80=99s probably not worth worr= ying = too much about this. There=E2=80=99s a tangentially related issue I also want to look at one o= f = these days, namely that the current `trunk->refcnt > (b * b) / 2` check = results in far too many entries in each hash row. For example, until 128 = vlan interfaces we stick with width =3D 4, so end up with an average of 8= = entries in each row. That means that for every inbound packet we scan through a linked list = of 8 entries to find the correct vlan interface. We should tweak the = check to reduce this, as all it=E2=80=99ll cost is a tiny bit of memory. = (Or = we should just default VLAN_ARRAY on, paying a bit more memory for no = linked-list scanning). Kristof --=_MailMate_5ECFD0FD-8C38-43CD-ACD5-6EDDAE48A5DF_= Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

On 25 Jul 2022, at 18:20, John Baldwin wrote:

On 7/22/22 4:19 PM, Kristof Provost= wrote:

The branch main has been updated by kp:

URL: https://cgit.FreeBSD.org/src/co= mmit/?id=3D151abc80cde778bc18b91c334d07fbd52bbb38fb

commit 151abc80cde778bc18b91c334d07fbd52bbb38fb
Author: Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2022-07-22 17:17:04 +0000
Commit: Kristof Provost <kp@FreeBSD.org>
CommitDate: 2022-07-22 17:18:41 +0000

if_vlan: avoid hash table thrashing when adding and = removing entries
vlan_remhash() uses incorrect value for b.
When using the default value for VLAN_DEF_HWIDTH (4), the VLAN h= ash-list table
expands from 16 chains to 32 chains as the 129th entry is added. tru= nk->hwidth
becomes 5. Say a few more entries are added and there are now 135 en= tries.
trunk-hwidth will still be 5. If an entry is removed, vlan_remhash()= will
calculate a value of 32 for b. refcnt will be decremented to 134. Th= e if
comparison at line 473 will return true and vlan_growhash() will be = called. The
VLAN hash-list table will be compressed from 32 chains wide to 16 ch= ains wide.
hwidth will become 4. This is an error, and it can be seen when a ne= w VLAN is
added. The table will again be expanded. If an entry is then removed= , again
the table is contracted.
If the number of VLANS stays in the range of 128-512, each time = an insert
follows a remove, the table will expand. Each time a remove follows = an
insert, the table will be contracted.
The fix is simple. The line 473 should test that the number of e= ntries has
decreased such that the table should be contracted using what would = be the new
value of hwidth. line 467 should be:
b =3D 1 << (trunk->hwidth - 1);
PR: 265382
Reviewed by: kp
MFC after: 2 weeks
Sponsored by: NetApp, Inc.

This does mean that there are some odd edge = cases (e.g. if you remove the 513th entry
only to turn around and re-add it) that can still thrash even with this f= ixed. Not
sure if those are worth worrying about though? If they were, perhaps you= could
imagine some sort of "dampener" where you have to remove some number N of= entries
to actually rehash to a smaller table. But that is probably rather compl= ex to
implement and probably not worth it?


Good point, although I think I agree it=E2=80=99s probabl= y not worth worrying too much about this.

There=E2=80=99s a tangentially related issue I also want = to look at one of these days, namely that the current trunk->ref= cnt > (b * b) / 2 check results in far too many entries in each= hash row. For example, until 128 vlan interfaces we stick with width =3D= 4, so end up with an average of 8 entries in each row.
That means that for every inbound packet we scan through a linked list of= 8 entries to find the correct vlan interface. We should tweak the check = to reduce this, as all it=E2=80=99ll cost is a tiny bit of memory. (Or we= should just default VLAN_ARRAY on, paying a bit more memory for no linke= d-list scanning).

Kristof

--=_MailMate_5ECFD0FD-8C38-43CD-ACD5-6EDDAE48A5DF_=--