List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net
From: Rick Macklem
Date: Fri, 2 Feb 2024 19:15:39 -0800
Subject: Re: Increasing TCP TSO size support
To: Drew Gallatin
Cc: Richard Scheffenegger, "freebsd-net@FreeBSD.org", FreeBSD Transport, rmacklem@freebsd.org, kp@freebsd.org

On Fri, Feb 2, 2024 at 6:20 PM Drew Gallatin wrote:
>
>
>
> On Fri, Feb 2, 2024, at 9:05 PM, Rick Macklem wrote:
>
> > But the page size is only
> > 4K on most platforms. So while an M_EXTPGS mbuf can hold 5 pages
> > (..from memory, too lazy to do the math right now) and reduces socket
> > buffer mbuf chain lengths by a factor of 10 or so (2k vs 20k per mbuf),
> > the S/G list that a NIC will need to consume would likely decrease only
> > by a factor of 2. And even then only if the busdma code to map mbufs
> > for DMA is not coalescing adjacent mbufs. I know busdma does some
> > coalescing, but I can't recall if it coalesces physically adjacent
> > mbufs.
>
> I'm guessing the factor of 2 comes from the fact that each page is a
> contiguous segment?
>
>
> Actually, no, I'm being dumb. I was thinking that pages would be split
> up, but that's wrong. Without M_EXTPGS, each mbuf generated by sendfile
> (or nfs) would be an M_EXT with a wrapper around a single 4K page. So
> the scatter/gather list would be exactly the same.
>
> The win would be if the pages themselves were contiguous (which they
> often are), and if the bus_dma mbuf mapping code coalesced those
> segments, and if the device could handle DMA across a 4K boundary.
> That's what would get you shorter s/g lists.
>
> I think tcp_m_copy() can handle this now, as if_hw_tsomaxsegsize is set
> by the driver to express how long the max contiguous segment they can
> handle is.
Sounds good. I'll give it a try someday soon (April maybe).

Thanks for all the good info, rick

>
> BTW, I really hate the mixing of bus dma restrictions with the
> hw_tsomax stuff. It always makes my head explode..
>
> Drew
>