List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
From: Alan Somers
Date: Mon, 23 Aug 2021 06:54:31 -0600
Subject: Re: sysctl is too slow
To: Mateusz Guzik
Cc: FreeBSD Hackers

Ideally, but it's not very high priority, since it's merely a performance
issue in a monitoring tool.

On Mon, Aug 23, 2021 at 6:05 AM Mateusz Guzik wrote:

> So is this something you plan on fixing?
>
> On 8/17/21, Alan Somers wrote:
> > Actually, I did get a flamegraph, and only 0.77% of samples were in ZFS.
> >
> > On Mon, Aug 16, 2021 at 7:19 PM Mateusz Guzik wrote:
> >
> >> On 8/16/21, Alan Somers wrote:
> >> > Yes, I see what you're talking about now.  There are a bunch of linked
> >> > lists in sysctl_find_oid etc.  Good point.
> >> > -Alan
> >> >
> >>
> >> You still want to get a flamegraph, chances are most of the problem is
> >> in zfs.
> >>
> >> > On Mon, Aug 16, 2021 at 1:30 PM Mateusz Guzik wrote:
> >> >
> >> >> Last time I checked lookup of a sysctl was very bad with linear scans
> >> >> all over.
> >> >>
> >> >> Short of complete revamp of the entire thing I would start with
> >> >> replacing the scans with a RB tree at each level. As is if you indeed
> >> >> have 5000 datasets, you are doing increasingly longer walks.
> >> >>
> >> >> On 8/16/21, Alan Somers wrote:
> >> >> > ztop feels very sluggish on a server with 5000 ZFS datasets.  Dtrace
> >> >> > shows that almost all of its time is spent in sys_sysctl.  ktrace
> >> >> > shows that both ztop and sysctl(8) call sys_sysctl a total of five
> >> >> > times for each sysctl they care about:
> >> >> >
> >> >> > 1) To get the next oid
> >> >> > 2) To get the sysctl's name
> >> >> > 3) To get the oidfmt
> >> >> > 4) To get the size of the value
> >> >> > 5) To get the value itself.
> >> >> >
> >> >> > Each of these steps takes about equal time, and together all five
> >> >> > take about 100us.  If the time per call is mostly syscall overhead,
> >> >> > then the process could be sped up by 80% by combining all of these
> >> >> > things into a single syscall: return the next oid, its name, its
> >> >> > format, the size of its value, and optimistically the value itself,
> >> >> > assuming the user passed a sufficiently large buffer.
> >> >> >
> >> >> > Am I missing something?  Is there any other reason why sysctl is so
> >> >> > slow?  Or should I forget about it, and try to export ZFS's dataset
> >> >> > stats through devstat instead?
> >> >> > -Alan
> >> >> >
> >> >>
> >> >>
> >> >> --
> >> >> Mateusz Guzik
> >> >>
> >> >
> >>
> >>
> >> --
> >> Mateusz Guzik
> >>
> >
>
>
> --
> Mateusz Guzik
>
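
The figures quoted in the thread can be sanity-checked with a short
back-of-the-envelope model. This is only a sketch under stated assumptions:
it does not call sysctl(3), all function names are made up for illustration,
it assumes (as the original post does, conditionally) that the five calls cost
the same and that the cost is dominated by per-call overhead, and it uses the
5000 dataset count as a stand-in for the number of sibling OIDs at one level
of the tree, which the thread does not state explicitly.

```python
# Illustrative model only; nothing here touches the real sysctl interface.

CALLS_PER_OID = 5          # next oid, name, oidfmt, value size, value
TOTAL_US_PER_OID = 100.0   # "together all five take about 100us"


def combined_call_saving(calls_per_oid=CALLS_PER_OID):
    """Fraction of time saved if all calls collapse into one.

    Assumes every call costs the same and that the cost is pure
    per-call overhead, so one call costs 1/calls_per_oid of the total.
    """
    return (calls_per_oid - 1) / calls_per_oid


def total_walk_ms(n_oids, us_per_oid=TOTAL_US_PER_OID):
    """Wall time in milliseconds to fetch n_oids values at us_per_oid each."""
    return n_oids * us_per_oid / 1000.0


def linear_scan_steps(n_children):
    """List nodes visited when each of n siblings is looked up once by a
    linear scan from the head of the list: 1 + 2 + ... + n."""
    return n_children * (n_children + 1) // 2


def balanced_tree_steps(n_children):
    """Upper bound on nodes visited for the same n lookups in a perfectly
    balanced binary search tree, whose depth is ceil(log2(n + 1)).

    For n >= 1 that depth equals n.bit_length(). A red-black tree may be
    up to roughly twice as deep, so treat this as an optimistic bound.
    """
    return n_children * n_children.bit_length()


if __name__ == "__main__":
    n = 5000  # assumed number of sibling OIDs, taken from the dataset count
    print("saving from one combined call:", combined_call_saving())
    print("time for one pass (ms):", total_walk_ms(n))
    print("linear-scan node visits:", linear_scan_steps(n))
    print("balanced-tree node visits:", balanced_tree_steps(n))
```

Under these assumptions the combined call saves 4/5 of the time, matching the
80% figure above, and a single pass over 5000 OIDs costs about half a second.
The step counts cover one lookup per sibling at a single level; they show why
the walks get "increasingly longer" with linked lists, but they say nothing
about how much of the measured 100us is lookup cost versus syscall overhead.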