From: paige@paige.bio
To: Pedro Giffuni
Cc: hackers@freebsd.org
Subject: Re: LLM and file systems (was Re: Porting BeFS to FreeBSD for GSoC2025)
Date: Mon, 3 Mar 2025 20:51:11 -0800
Message-Id: <3833F097-C8A6-4F06-8F81-CA919B291FBF@paige.bio>
In-Reply-To: <773517072.6945384.1741058461093@mail.yahoo.com>
References: <773517072.6945384.1741058461093@mail.yahoo.com>
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

It's up to you, man. You do you: if you want to write it from scratch, you'll probably do better than I could. FWIW, though, I don't think you're doing yourself any favors if it's a method of problem solving that you're already familiar with; it's a lot of work for essentially nothing. (I damn sure won't care about a BeOS filesystem.) But if you're just looking for an excuse to make it for yourself and can't justify the effort unless you submit it for GSoC, then don't; it's a waste.

osb and jails could use a little improvement; there are issues when you start activating multiple FIBs.
People have also talked about how to tackle the problem of secure and trusted boot, like IMA/EVM, but nobody ever has (a really interesting problem too, when you consider the implications it has for containers and jails). Podman/ocijail works on FreeBSD now too, just not with multiple FIBs activated at boot (ask me how I know).

And yeah, even though ExFAT is pretty damn useful to me and I don't like FUSE, getting people behind the idea of helping me bring that to fruition? You can forget it; nobody cares that much. I feel like it'd be the lamest GSoC proposal ever, but it has utility for backups and SD cards, and Microsoft's patents on file indexing using a hash table are just about cooked.

Your guess is pretty inaccurate; Anthropic doesn't provide any guidelines for this code, and as far as what legally counts as a derived work, the law is pretty obtuse and it's kind of whatever people are in the mood to say. There's no meaningful scientific basis behind it: they could take something that you honestly did write yourself, and as long as they can convince a jury that you stole it, that's all that matters.

So for that matter you're actually better off not open sourcing anything at all, because all it takes is one person who realizes your code looks a little like theirs. Seriously, it's not a joke; that person exists somewhere. So no: at some point, when I'm satisfied that this driver is as good as it can possibly be, I'll probably submit it. That is, if I don't decide to fork FreeBSD before then and start using the LLM to drive development efforts where I want them to go, which is starting to look like a real possibility, perhaps in the next couple of years.

But yeah, I'm just focusing on what I can reasonably expect to complete in the next year or so... maybe I'll submit it, I don't know yet.
I know that even with the help of an LLM it's an enormous amount of work, let alone for one person to do alone. There's no money in it, and it probably won't help you get a job either.

But yeah dude, no judgement from me. I have dozens of projects that nobody cares about except me, but really the only way I can justify them to myself is if I can find a way to accomplish something with as little work as possible. But yeah, in its current state you can git clone that, cd into it, and run make, and it will build if you have the kernel source tree in /usr/src/ and /usr/src/sys, and it will load too. There's no secure boot or module signing on FreeBSD, so it just works. But you know, it could be different; that's a thing some people want, and that might be a little more worthy of GSoC, but that's just my opinion and it's up to you.

Sent from my iPhone

> On Mar 3, 2025, at 7:22 PM, Pedro Giffuni <pfg@freebsd.org> wrote:
>
> Hi,
>
> There are good reasons to avoid LLM generated code in an OS like FreeBSD. In short it will take some time to understand the licensing implications of using code based on other code used to train it. My previous employer was concerned about sharing customer code with the owner of the AI provider.
>
> It may be conceptually acceptable to use LLM to generate test cases though, since the test cases do not end up being part of the end product, but it very much depends on the project. For FreeBSD nothing has been approved AFAICT.
>
> My guess for your project is that you can package it as an external loadable module and add a legal disclaimer (which I wouldn't know how to write since I am not a lawyer ;-) ).
> For a GSoC we expect a human programmer.
>
> Pedro.
>
>> On Monday, March 3, 2025 at 04:05:03 PM GMT-5, <paige@paige.bio> wrote:
>>
>> I've been making a native ExFAT driver (it uses VFS instead of FUSE):
>>
>> https://github.com/paigeadelethompson/exfat
>>
>> And I've been using an LLM to do it. I recommend using something like Claude if you can. I'm not sure when I'll be done with this, but if you want some advice:
>>
>> - Start with newfs and use a known good chkdsk or fsck program on another computer; macOS is a good starting point if you can get befs.fsck there, otherwise plan on having to copy stuff back and forth a bit.
>>
>> If you use an LLM and can get this converted to text: https://www.nobius.org/dbg/practical-file-system-design.pdf it will help you a lot.
>> ExFAT is documented extensively on MSDN, and Claude-3.5-sonnet seems to have pretty decent RAG. In any case I recommend having a look through my README and making heavy use of bootverbose. You will also want to enable the various kernel-level options in my README. VFS is a little tricky, but once you get through this initial mount trace:
>>
>> https://github.com/paigeadelethompson/exfat/commit/187c6694c68554f7961b427501373984a0742366
>>
>> the rest shouldn't be as bad. You can see that the snippet of bootverbose messages has the name of the function it's calling from (very helpful to have, honestly, especially if you're using an LLM), but be prepared to drop into DDB and reset/retry a few dozen or a hundred times until you figure out VFS in any case xD
>>
>> At least with lock debugging enabled in the kernel it's a little more actionable.
>>
>>> On Mar 3, 2025, at 6:05 AM, Pedro Giffuni <pfg@freebsd.org> wrote:
>>>
>> Hello Krutarth;
>>
>> Thank you for the interest!
>>
>> Yes, the idea is still open.
>> In all honesty, FreeBSD does have much better filesystems than openBFS, but we don't have a "true" journalling filesystem, and BFS is rather well documented with an open implementation, so it could still be a nice-to-have.
>>
>> At one time I spoke with some Haiku guys and Bruno was interested in co-mentoring this project.
>>
>> As I mentioned in private, you are probably better off checking the ext2fs sources (sys/fs/ext2fs) for a simplified UFS. We don't have any open issues AFAICT, but maybe fedor@ has something pending.
>>
>> For documentation, "The Design and Implementation of the FreeBSD OS" seems pretty much compulsory.
>>
>> Pedro.
>>
>> ps. I am somewhat retired from FreeBSD, if such a thing exists, but if no one else steps in I would co-mentor.
>>
>> On Monday, March 3, 2025 at 12:53:00 AM GMT-5, Krutarth Patel <krutarthpatel929@gmail.com> wrote:
>>
>> Hello,
>>
>> I am interested in porting BeFS from Haiku. I see that it is listed as one of the GSoC ideas.
>>
>> I have made some contributions to the PCI subsystem over at Haiku and have some Linux kernel debugging experience.
>>
>> I am new to FreeBSD (not entirely; I am in the process of porting a driver from FreeBSD to Haiku) and to filesystems in general (I have an idea of the basic terminology like inode, block, etc., but that's about it). But I am willing to learn.
>>
>> Here are my questions:
>>
>> - Is the idea still open?
>> - Are there any smaller issues I can resolve to get myself familiar with the codebase? (Something related to UFS/ZFS would be perfect.)
>> - Where are the UFS and ZFS implementations in the source tree?
>> - Any recommended resources for learning about filesystems (specifically FreeBSD; I am reading a guide about BeFS)?
>>
>> Looking forward to hearing from you
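
A note on the bootverbose messages mentioned in the quoted advice, where each message carries the name of the function it was emitted from: the sketch below shows one way such a trace line could be built. It is a userland illustration in C, not code from the exfat repository. The names trace_format, TRACE, and demo_mountfs are hypothetical, and the bootverbose variable here is only a local stand-in for the kernel global of the same name (declared in <sys/systm.h>). In a kernel module the same idea would typically reduce to guarding a printf(9) call with "if (bootverbose)" and passing __func__ as the first argument.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Userland stand-in for the kernel's global bootverbose flag.
 * Assumption: in the kernel this is provided by <sys/systm.h>.
 */
static int bootverbose = 1;

/*
 * Hypothetical helper: store "<function>: <message>" in buf.
 * Returns the number of characters stored, or 0 when tracing is off
 * (in which case buf is left as an empty string).
 */
static int
trace_format(char *buf, size_t len, const char *func, const char *fmt, ...)
{
    va_list ap;
    size_t used;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    if (!bootverbose)
        return (0);

    /* Prefix first; snprintf always NUL-terminates within len. */
    snprintf(buf, len, "%s: ", func);
    used = strlen(buf);

    /* Append the caller's formatted message after the prefix. */
    va_start(ap, fmt);
    vsnprintf(buf + used, len - used, fmt, ap);
    va_end(ap);
    return ((int)strlen(buf));
}

/* __func__ expands at the call site, so the caller's name is recorded. */
#define TRACE(buf, len, ...) \
    trace_format((buf), (len), __func__, __VA_ARGS__)

/* Hypothetical mount step, present only to show what a line looks like. */
static int
demo_mountfs(char *buf, size_t len, unsigned sector_size)
{
    return (TRACE(buf, len, "boot sector read, %u bytes per sector",
        sector_size));
}
```

Calling demo_mountfs(buf, sizeof(buf), 512) with tracing on stores "demo_mountfs: boot sector read, 512 bytes per sector" in buf; with bootverbose set to 0 it stores an empty string and returns 0. Having the function name at the front of every line is what makes a long mount trace easy to follow, or to paste into an LLM.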