Re: LLM and file systems (was Re: Porting BeFS to FreeBSD for GSoC2025)

From: <paige_at_paige.bio>
Date: Tue, 04 Mar 2025 04:51:11 UTC
It’s up to you, man. You do you. If you wanna write it from scratch, you’d probably do better than I could, but FWIW, I don’t think you’re doing yourself any favors if it’s a method of problem solving that you’re already familiar with; it’s a lot of work for essentially nothing. I damn sure won’t care about a BeOS filesystem, and if you’re just looking for an excuse to make it for yourself but can’t justify the effort unless you submit it for GSoC, then don’t; it’s a waste.

osb and jails could use a little improvement; there are issues when you start activating multiple FIBs. People have also talked about how to tackle the problem of secure and trusted boot, like IMA/EVM, but nobody ever has (a really interesting problem too, when you consider the implications it has for containers and jails). Podman/ocijail works on FreeBSD now too, just not with multiple FIBs activated at boot (ask me how I know).

And yeah, even though ExFAT is pretty damn useful to me and I don’t like FUSE, you can forget about getting people behind the idea of helping me bring that to fruition; nobody cares that much. I feel like it’d be the lamest GSoC proposal ever, but it has utility for backups and SD cards, and Microsoft’s patents on file indexing using a hash table are just about cooked.

Your guess is pretty inaccurate; Anthropic doesn’t provide any guidelines for this code, and as far as what legally counts as a derived work, the law is pretty obtuse and it’s kind of whatever people are in the mood to say. There’s no meaningful science behind that: they could take something that you honestly did write yourself, and as long as they can convince a jury that you stole it, that’s all that matters.

So for that matter you’re actually better off not open sourcing anything at all, because all it takes is one person who realizes your code looks a little like theirs; seriously, it’s not a joke, that person exists somewhere. So no; at some point, when I’m satisfied that this driver is as good as it can possibly be, I’ll probably submit it, that is if I don’t decide to fork FreeBSD before then and start using the LLM to drive development in the direction I want it to go, which is starting to look like a real possibility, perhaps in the next couple of years.

But yeah, just focusing on what I can reasonably expect to complete in the next year or so… maybe I’ll submit it, I don’t know yet. I know that even with the help of an LLM it’s an enormous amount of work, let alone for one person to do alone. There’s no money in it, and it probably won’t help you get a job either.

But yeah dude, no judgement from me. I have dozens of projects that nobody cares about except me, but really the only way I can justify them to myself is if I can find a way to accomplish something for as little work as possible. In its current state you can git clone that, cd into it, and run make, and it will build if you have the kernel source tree at /usr/src/ and /usr/src/sys, and it will load too. There’s no secure boot or module signing on FreeBSD, so it just works. But you know, it could be different; that’s a thing some people want, and that might be a little more worthy of GSoC, but that’s just my opinion and it’s up to you.


Sent from my iPhone

> On Mar 3, 2025, at 7:22 PM, Pedro Giffuni <pfg@freebsd.org> wrote:
> 
> 
> Hi,
> 
> There are good reasons to avoid LLM-generated code in an OS like FreeBSD. In short, it will take some time to understand the licensing implications of using code based on other code used to train it. My previous employer was concerned about sharing customer code with the owner of the AI provider.
> 
> It may be conceptually acceptable to use an LLM to generate test cases though, since the test cases do not end up being part of the end product, but it very much depends on the project. For FreeBSD nothing has been approved AFAICT.
> 
> My guess for your project is that you can package it as an external loadable module and add a legal disclaimer (which I wouldn't know how to write since I am not a lawyer ;-) ).
> For a GSoC we expect a human programmer.
> 
> Pedro.
> 
> 
>> On Monday, March 3, 2025 at 04:05:03 PM GMT-5, <paige@paige.bio> wrote:
>> 
>> 
>> I’ve collectively been making an ExFAT native driver (uses VFS instead of fuse) 
>> 
>> https://github.com/paigeadelethompson/exfat
>> 
>> And I’ve been using an LLM to do it. I recommend using something like Claude if you can, not sure when I’ll be done with this but if you want some advice: 
>> 
>> - start with newfs and use a known good chkdsk or fsck program on another computer; macOS is a good starting point if you can get befs.fsck there, otherwise plan on having to copy stuff back and forth a bit.
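
Before copying a freshly created image over to another machine for a known-good fsck, a quick look at the main boot sector can catch gross newfs mistakes early. The following is only a minimal Python sketch for the ExFAT case, not code from the driver repository: the function name `check_exfat_boot_sector` is hypothetical, and the field offsets and allowed ranges are taken from Microsoft's published exFAT specification. It checks a handful of fixed fields and is no substitute for a real fsck.

```python
import struct

SECTOR_SIZE = 512  # the fields checked here all live in the first 512 bytes


def check_exfat_boot_sector(sector: bytes) -> list[str]:
    """Return a list of problems found in an exFAT main boot sector.

    An empty list means the few fields checked here look sane; it does
    not mean the volume as a whole is valid.
    """
    if len(sector) < SECTOR_SIZE:
        return ["short read: need at least 512 bytes"]

    problems = []

    # JumpBoot (offset 0, 3 bytes) and FileSystemName (offset 3, 8 bytes).
    if sector[0:3] != b"\xEB\x76\x90":
        problems.append("JumpBoot is not EB 76 90")
    if sector[3:11] != b"EXFAT   ":
        problems.append("FileSystemName is not 'EXFAT   '")

    # MustBeZero (offset 11, 53 bytes) overlays the old FAT BPB area.
    if any(sector[11:64]):
        problems.append("MustBeZero region is not all zero")

    # BytesPerSectorShift (offset 108) and SectorsPerClusterShift (offset 109).
    bytes_per_sector_shift = sector[108]
    sectors_per_cluster_shift = sector[109]
    if not 9 <= bytes_per_sector_shift <= 12:
        problems.append("BytesPerSectorShift out of range 9..12")
    elif sectors_per_cluster_shift > 25 - bytes_per_sector_shift:
        problems.append("SectorsPerClusterShift too large for sector size")

    # NumberOfFats (offset 110) must be 1, or 2 for TexFAT volumes.
    if sector[110] not in (1, 2):
        problems.append("NumberOfFats is not 1 or 2")

    # BootSignature (offset 510) is the little-endian value 0xAA55.
    (signature,) = struct.unpack_from("<H", sector, 510)
    if signature != 0xAA55:
        problems.append("BootSignature is not 0xAA55")

    return problems
```

To use it, read the first 512 bytes of the image or device in binary mode and pass them in; any strings in the returned list name the fields worth looking at first.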
>> 
>> If you use an LLM and can get this converted to text: https://www.nobius.org/dbg/practical-file-system-design.pdf it will help you a lot 
>> ExFAT is documented extensively on MSDN and Claude-3.5-sonnet seems to have pretty decent RAG. In any case I recommend having a look through my README and making heavy use of bootverbose, but you will also want to enable the various kernel-level options in my README. VFS is a little tricky, but once you get through this initial mount trace:
>> 
>> https://github.com/paigeadelethompson/exfat/commit/187c6694c68554f7961b427501373984a0742366
>> 
>> The rest shouldn’t be as bad. You can see that the snippet of bootverbose messages includes the name of the function each one is called from (very helpful to have, honestly, especially if you’re using an LLM), but be prepared to drop into DDB and reset / retry a few dozen or a hundred times until you figure out VFS in any case xD
>> 
>> At least with lock debugging enabled in the kernel it’s a little more actionable. 
>> 
>> Sent from my iPhone
>> 
>>> On Mar 3, 2025, at 6:05 AM, Pedro Giffuni <pfg@freebsd.org> wrote:
>>> 
>> 
>> Hello Krutarth;
>> 
>> Thank you for the interest!
>> 
>> Yes, the idea is still open. In all honesty FreeBSD does have much better filesystems than openBFS, but we don't have a "true" journalling filesystem and BFS is rather well documented with an open implementation so it could still be a nice to have.
>> 
>> At one time I spoke with some Haiku guys and Bruno was interested in co-mentoring this project.
>> 
>> As I mentioned in private, you are probably better off checking the ext2fs sources (sys/fs/ext2fs), for a simplified UFS. We don't have any open issues AFAICT, but maybe fedor@ has something pending.
>> 
>> For documentation "The Design and Implementation of the FreeBSD OS", seems pretty much compulsory.
>> 
>> Pedro.
>> 
>> ps. I am somewhat retired from FreeBSD, if such a thing exists, but if no one else steps in I would co-mentor.
>> 
>> 
>> On Monday, March 3, 2025 at 12:53:00 AM GMT-5, Krutarth Patel <krutarthpatel929@gmail.com> wrote:
>> 
>> 
>> Hello,
>> 
>> I am interested in porting BeFS from Haiku. I see that it is listed as one of the GSoC ideas.
>> 
>> I have done some contributions in the PCI subsystem over at Haiku and have some Linux kernel debugging experience. 
>> 
>> I am new to FreeBSD (not entirely, I am in the process of porting a driver from FreeBSD to Haiku) and filesystems in general (I have an idea of the basic terminology like inode, block etc. but that's about it). But I am willing to learn.
>> 
>> Here are my questions:
>> 
>> Is the idea still open?
>> Are there any smaller issues I can resolve to get myself familiar with the codebase? (Something related to UFS/ZFS would be perfect.)
>> Where are the UFS and ZFS implementations in the source tree?
>> Any recommended resources for learning about filesystems (specifically FreeBSD; I am reading a guide about BeFS)?
>> Looking forward to hearing from you
>> 
>> 
>> 
>>