From: Rick Macklem
Date: Tue, 22 Apr 2025 08:49:43 -0700
Subject: Re: zfs (?) issues?
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
To: Sulev-Madis Silber
Cc: freebsd-current@freebsd.org

I wouldn't normally top post, but all I have is a generic question.
Do you have a swap partition set up?
(I'd use something like 6-8 Gbytes for a 4 Gbyte system.)

rick

On Tue, Apr 22, 2025 at 8:23 AM Sulev-Madis Silber wrote:
>
> well i don't have those errors anymore so there's nothing to give
>
> i've tried to tune arc but it didn't do anything so i took those things off again
>
> right now i'm looking at
>
> ARC: 1487M Total, 1102M MFU, 128M MRU, 1544K Anon, 56M Header, 199M Other
>      942M Compressed, 18G Uncompressed, 19.36:1 Ratio
>
> and wonder wtf
>
> i bet there's an issue somewhere and i somehow can't properly recreate it. on memory pressure it does resize arc down properly so it seems like i don't need any limits
>
> and there's no tmpfs. it would be useless at such low memory sizes
>
> the problem is that i can't figure out what all those problems are, how to recreate those conditions and how to work around them or maybe find bugs. i also don't have enough hw to test it on exclusively. unless i can maybe try it on a tiny 512m vm. and then i would need to know what to try
>
> i also don't know why those git settings help me:
>
> [core]
>     packedGitWindowSize = 32m
>     packedGitLimit = 128m
>     preloadIndex = false
> [diff]
>     renameLimit = 16384
>
> or how to tune it from some global place. and so on. and why would it even need so much fiddling? zfs has indeed improved a lot, previously it was quite a hell to use
>
> i don't even know if this is related to mmap. even then, i don't really get what that function even does.
> hence the "zfs (?) issue". it might even not be zfs at all
>
> there are probably multiple combined issues here
>
> i also don't really buy the idea that a ton of ram would automatically fix this
>
> so yeah unsure what to think of this
>
> some of the issues i found others also have. some of them seem new
>
> some fixes looked like trial and error and nobody even seemed to know what's wrong. granted, that was a forum so maybe it's better here?
>
> i mean i have used below average equipment my entire life and the usual way to cope with this is to just give it more time. add more swap and just wait
>
> i think someone tested my git issues in a 4g vm and found no issues at all? other things seem as if only i have them
>
> i also find it kind of confusing that, if this is hw, i don't see any other issues
>
> this is not the first time that i have found something confusing in fbsd that later turned out to be a bug and was further tested and fixed by others
>
> hence the current mailing list, so maybe someone else has ideas, or knows if it already has a fix. and i hope there are people with much larger labs who could easily tell / test things
>
> so in the end,
>
> 1) why should git on a large repo cause the machine to run out of memory, instead of just being as slow as it would need to be
>
> 2) why / what are the fs operations that could cause a low power machine to mysteriously fail on zfs, when the expected result would be slow fs behaviour
>
> i don't know what really happens and it's way too complex for me to get all the memory management that happens in the kernel. i only have this wild guess that any type of caching should happen in "leftover" ram and make things faster if possible. and any fs operations that have already been reported completed by the kernel can't be suddenly found incomplete later. whatever that fs-related stray buildworld error was that resolved itself somehow. and what i can recreate
>
> and i'm not expert in this so how do i even know?
>
> what's fun is how running rsync over several tb's of data doesn't seem to cause any issues at all. this is still the same machine, which many would not recommend using for this. different workload?
>
> hell knows what all this is. maybe later i could figure it out or actually save some logs. those i didn't save as i assumed it would repeat itself. it didn't, and it scrolled off the tmux window history
>
> oh well. yes, this is a questionable report but those are "heisenbugs" as well. at least some?
>
>
> On April 22, 2025 3:52:11 PM GMT+03:00, Ronald Klop wrote:
> >Hi,
> >
> >First, instead of writing "it gives vague errors", it really helps others on this list if you can copy-paste the errors into your email.
> >
> >Second, as far as I can see FreeBSD 13.4 uses OpenZFS 2.1.14. FreeBSD 14 uses OpenZFS 2.2.X which has bugfixes and improved tuning, although I cannot claim that will fix your issues.
> >What you can try is to limit the growth of the ARC.
> >
> >Run "sysctl vfs.zfs.arc_max=1073741824", or add "vfs.zfs.arc_max=1073741824" to /etc/sysctl.conf to set the value at boot.
> >
> >This will limit the ARC to 1GB. I used similar settings on small machines without really noticing a speed difference while usability increased. You can play a bit with the value. Maybe even 512MB will be enough for your use case.
> >
> >NB: sysctl vfs.zfs.arc_max was renamed to vfs.zfs.arc.max with arc_max as a legacy alias, but I don't know if that already happened in 13.4.
> >
> >Another thing to check is the usage of tmpfs. If you don't restrict the max size of a tmpfs filesystem it will compete for memory. Although this will also show an increase in swap usage.
> >
> >Regards,
> >Ronald.
> >
> >
> >From: Sulev-Madis Silber
> >Date: Monday, 21 April 2025 03:25
> >To: freebsd-current
> >Subject: zfs (?) issues?
> >>
> >> i have a long running issue in my 13.4 box (amd64)
> >>
> >> others don't get it at all and only suggest adding more than 4g ram
> >>
> >> it manifests as some mmap or other problems i don't really get
> >>
> >> basically unrestricted git consumes all the memory. i had to turn watchdog on because sometimes a git pull on the ports tree causes the kernel to take 100% of ram. it keeps killing userland off until it's just the kernel running there happily. it never panics and killing off userland obviously makes the problem disappear and nothing will do any fs operations anymore
> >>
> >> dovecot without tuning or with some tuning tended to do this too
> >>
> >> what is it?
> >>
> >> now i noticed another issue. if i happen to do too many src git pulls in a row, they never actually "pull" anything. and / or clean my obj tree out. i can't run buildworld anymore. it gives vague errors
> >>
> >> if i wait a little before starting buildworld, it always works
> >>
> >> what could possibly be happening here? the way the buildworld fails means there's a serious issue with the fs. and how could it be fixed by waiting? it means that some fs operations are still going on in the background
> >>
> >> i have no idea what's happening here. zfs doesn't report any issues. nor does storage. nothing was killed with out of memory but arc usage somehow increased a lot. and its compression ratio went weirdly high, like ~22:1 or so
> >>
> >> i don't know if it's acceptable zfs behaviour if it runs low on memory or not. how to test it. etc. and if this is fixed on 14, on stable, or on current. i don't have enough hw to test it on all
> >>
> >> i have done other stuff on that box that might also be improper for the amount of ram i have there but then it's just slow, nothing fails like this
> >>
> >> unsure how this could be fixed or tuned or something else. or why it behaves like this.
> >> as opposed to the usual low resource issues that just mean you need more time
> >>
> >> i mean it would be easy to add huge amounts of ram but people could also want to use zfs in slightly less powerful embedded systems where lack of power is expected but weird fails maybe not
> >>
> >> so is this a bug? a feature? something fixed? something that can't be fixed? what could be an acceptable ram size? 8g? 16g? and why can't it just tune everything down and become slower as expected
> >>
> >> i tried to look up any openzfs related bugs but zfs is huge and i'm not an fs expert either
> >>
> >> i also don't know what happens while i wait. it doesn't show any serious io load. no cpu is taken. load is down. system is responsive
> >>
> >> it all still feels like a bug
> >>
> >> i have wondered if this is second hand hw acting up but i checked and tested it as best as i could and why would it only bug out when i try more complex things on zfs?
> >>
> >> i'm curious about using zfs on super low memory systems too, because it offers certain features. maybe we could fix this if the whole issue is ram. or if it's elsewhere, maybe that too
> >>
> >> i don't know what to think of this all. esp the last issue. i'm not really alone here with the earlier issues but unsure
> >>
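
For anyone trying the ARC cap suggested in the quoted thread: the value is given in bytes, so 1GB is 1024 * 1024 * 1024 = 1073741824 and 512MB is 536870912. Below is a minimal Python sketch of that arithmetic. The helper names are made up for illustration, and the sysctl name is the one used in this thread (on newer releases it may be vfs.zfs.arc.max instead, as noted above).

```python
# Hypothetical helpers for illustration only; not part of any FreeBSD tool.

def arc_max_bytes(mib: int) -> int:
    """Convert a size in MiB to the byte count the sysctl takes."""
    return mib * 1024 * 1024


def sysctl_conf_line(mib: int, name: str = "vfs.zfs.arc_max") -> str:
    """Build a line suitable for /etc/sysctl.conf.

    The default name is the one quoted in the thread; pass a different
    name if your release uses the newer spelling.
    """
    return f"{name}={arc_max_bytes(mib)}"


if __name__ == "__main__":
    # Print candidate lines for a 1GB and a 512MB cap.
    for mib in (1024, 512):
        print(sysctl_conf_line(mib))
```

The printed lines can be pasted into /etc/sysctl.conf, or passed to sysctl directly to try a value without rebooting.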