From: Mark Johnston
Date: Tue, 9 Nov 2021 21:57:47 -0500
To: jschauma@netmeister.org
Cc: freebsd-net@freebsd.org
Subject: Re: AF_UNIX socketpair dgram queue sizes
In-Reply-To: <20211110015719.GY3553@netmeister.org>
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

On Tue, Nov 09, 2021 at 08:57:20PM -0500, Jan Schaumann via freebsd-net wrote:
> Hello,
>
> I'm trying to wrap my head around the buffer sizes
> relevant to AF_UNIX/PF_LOCAL dgram socketpairs.
>
> On a FreeBSD/amd64 13.0 system, creating a socketpair
> and simply writing a single byte in a loop to the
> non-blocking write end without reading the data, I can
> perform 64 writes before causing EAGAIN, yielding 1088
> bytes in FIONREAD on the read end (indicating 16 bytes
> per datagram overhead).
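
For readers who want to repeat the experiment quoted above, here is a minimal sketch in Python. The original poster's test program is not shown in the thread, so the language, the function name `probe_dgram_socketpair`, and the iteration cap are assumptions made for illustration. The counts it reports depend on the operating system and its settings.

```python
import errno
import fcntl
import socket
import struct
import termios


def probe_dgram_socketpair(payload=b"x", max_writes=1_000_000):
    """Write `payload` to a non-blocking unix dgram socketpair until the
    kernel refuses, without ever reading. Returns (writes, fionread)."""
    wr, rd = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        wr.setblocking(False)
        writes = 0
        while writes < max_writes:  # cap is a safety net, not part of the test
            try:
                wr.send(payload)
            except OSError as e:
                # A full queue normally shows up as EAGAIN; ENOBUFS is
                # accepted too, as a precaution.
                if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK, errno.ENOBUFS):
                    break
                raise
            writes += 1
        # Ask the read end how many bytes it reports as pending.
        raw = fcntl.ioctl(rd.fileno(), termios.FIONREAD, struct.pack("i", 0))
        pending = struct.unpack("i", raw)[0]
        return writes, pending
    finally:
        wr.close()
        rd.close()


if __name__ == "__main__":
    writes, pending = probe_dgram_socketpair()
    print(f"writes before refusal: {writes}, FIONREAD: {pending}")
```

The quoted message reports 64 writes and 1088 bytes for this test on FreeBSD/amd64 13.0. Other systems account for queued datagrams differently, and FIONREAD itself may report something other than the total queued bytes there.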

When transmitting on a unix dgram socket, each message will include a
copy of the sender's address, represented by a dummy 16-byte sockaddr
in this case. This is stripped by the kernel when receiving, but still
incurs overhead with respect to socket buffer accounting.

> This is well below the total net.local.dgram.recvspace
> = 4096 bytes. I would have expected to be able to
> perform 240 1 byte writes (240 + 240*16 = 4080).
>
> Now if I try to write SO_SNDBUF = 2048 bytes on each
> iteration (or subsequently as many as I can until
> EAGAIN), then I can send one datagram with 2048 bytes
> and one datagram with 2016 bytes, filling recvspace as
> (2 * 16) + (2048 + 2016) = 4096.
>
> But at smaller sizes, it looks like the recvspace is
> not filled completely: writes in chunks of > 803 bytes
> will fill recvspace up to 4096 bytes, but below 803
> bytes, recvspace is not maxed out.
>
> Does anybody know why smaller datagrams can't fill
> recvspace? Or what I'm missing / misunderstanding
> about the recvspace here?

There is an additional factor: wasted space. When writing data to a
socket, the kernel buffers that data in mbufs. All mbufs have some
amount of embedded storage, and the kernel accounts for that storage,
whether or not it's used. With small byte datagrams there can be a lot
of overhead; with stream sockets the problem is mitigated somewhat by
compression, but for datagrams we don't have a smarter mechanism to
maintain message boundaries.

The kern.ipc.sockbuf_waste_factor sysctl controls the upper limit on
total bytes (used or not) that may be enqueued in a socket buffer. The
default value of 8 means that we'll waste up to 7 bytes per byte of
data, I think. Setting it higher should let you enqueue more messages.

As far as I know this limit can't be modified directly, it's a function
of the waste factor and the socket buffer size.
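
The accounting described above can be checked against the numbers in the thread with a rough model. This is a sketch under assumptions that do not come from the thread: an mbuf is taken to be 256 bytes (`MSIZE`), and each small datagram is taken to occupy two mbufs, one for the sender address and one for the data. The function name `max_small_datagrams` is hypothetical, and the model only applies to payloads small enough to fit in a single mbuf.

```python
MSIZE = 256    # assumed size of one mbuf in bytes (not stated in the thread)
ADDR_LEN = 16  # dummy sockaddr carried with each datagram (from the thread)


def max_small_datagrams(recvspace, waste_factor, payload, mbufs_per_dgram=2):
    """Estimate how many small datagrams fit in the receive buffer.

    Two limits apply and the smaller one wins:
      - data bytes queued (payload plus address) may not exceed recvspace
      - total mbuf storage, used or not, may not exceed
        recvspace * waste_factor
    """
    by_data = recvspace // (payload + ADDR_LEN)
    by_storage = (recvspace * waste_factor) // (mbufs_per_dgram * MSIZE)
    return min(by_data, by_storage)


if __name__ == "__main__":
    n = max_small_datagrams(recvspace=4096, waste_factor=8, payload=1)
    print(n, n * (1 + ADDR_LEN))  # datagram count, bytes seen by FIONREAD
```

With recvspace = 4096 and the default waste factor of 8, the storage limit is 32768 bytes, or 64 datagrams at 512 bytes each, which accounts for 64 * 17 = 1088 bytes of data. That agrees with the 64 writes and 1088 bytes reported in the question, though the match does not by itself confirm the two-mbuf assumption. Under the same model a waste factor of 16 would allow 128 one-byte datagrams.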