[rfc] 64-bit inode numbers
Garance A Drosehn
gad at FreeBSD.org
Mon Jun 27 18:40:04 UTC 2011
On 6/24/11 6:21 PM, mdf at FreeBSD.org wrote:
> On Fri, Jun 24, 2011 at 2:07 PM, Garance A Drosehn<gad at freebsd.org> wrote:
>> The AFS cell at RPI has approximately 40,000 AFS volumes, and each
>> volume should have its own dev_t (IMO).
>> Please realize that I do not mind if people felt that there was no
>> need to increase the size of dev_t at this time, and that we should
>> wait until we see more of a demand for increasing it. But given the
>> project to increase the size of inode numbers, I thought this was a
>> good time to also ask about dev_t. I ask about it every few years :-)
> I don't see why 32 bits are anywhere close to becoming tight to
> represent 40k unique values. Is there something wrong with how each
> new dev_t is computed, that runs out of space quicker than this

The 40K values are just for the AFS volumes at RPI. AFS presents the
entire world as a single filesystem, with the RPI cell as just one
small part of that worldwide filesystem. The public CellServDB.master
file lists 200 cells, where all of those cells would be available at
the same time to any user sitting on a single machine which has AFS
installed. And that's just the official public AFS cells.
Organizations can (and do) have private AFS cells which are not part
of the official public list.

I mentioned the 40K volumes at RPI because someone said "I do not
expect to see hundreds of thousands of mounts on a single system".
My example was just to show that I can access 40 thousand AFS volumes
in a single unix *command*, without even leaving RPI. That was not
meant to show how many volumes are reachable under all of /afs.

Also, it was really easy for me to come up with the number of AFS
volumes in the RPI cell. I'd be reluctant to try and probe all of
the publicly-reachable AFS cells to come up with a real number for
how many AFS volumes there are in the world.

(aside: actually there are more like 60K AFS volumes at RPI, but
at least 20K of those are not readily accessible via unix commands,
so I said 40K. And most users at RPI couldn't even access 40K of
those AFS volumes, but I suspect I can because I'm an AFS admin)

One reason RPI has so many AFS volumes is that each user has their
own AFS volume for their home directory. Given the way AFS works,
that is a very very reasonable thing to do. In fact, it'd almost
be stupid to *not* give every user their own AFS volume. Now
imagine the WWW, where every single http://www.place.tld/~username on
the entire planet was on a different disk volume. And any single
user on a single system can access any combination of those disk
volumes within a single login session. The WWW is a world-wide web.
AFS is meant as a world-wide distributed file system. When working
on a world-wide scale, you hit larger numbers. I think that many
people who have not worked with AFS keep thinking of it the same
way they think of NFS, but AFS was designed with much larger-scale
deployment in mind.

Again, I don't mind if we don't wish to tackle a larger dev_t right
now, and I definitely do not want the 64-bit ino_t project to get
bogged down with a debate over a larger dev_t. But I have been
working with OpenAFS for ten years now, and it is definitely true
that a larger dev_t would be helpful for that specific filesystem.
And it may be that some other solution would be even better, so I
don't want to push this one too much.
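
For context on why a per-volume dev_t matters: POSIX tools generally
treat the pair (st_dev, st_ino) as a file's identity, and inode numbers
are only guaranteed unique within one filesystem. If two volumes were to
share a dev_t, files in different volumes could in principle report the
same pair. A minimal sketch of that identity check follows. The helper
names are hypothetical, and nothing here is specific to AFS or to how
FreeBSD assigns dev_t values.

```python
import os


def file_identity(path):
    """Return the (st_dev, st_ino) pair reported by stat() for path."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def same_file(path_a, path_b):
    """True when both paths report the same (st_dev, st_ino) pair."""
    return file_identity(path_a) == file_identity(path_b)


def crosses_volume(path_a, path_b):
    """True when the two paths report different st_dev values."""
    return os.stat(path_a).st_dev != os.stat(path_b).st_dev
```

Tools that walk a directory tree commonly use a comparison like
crosses_volume() to notice a filesystem boundary, and one like
same_file() to detect hard links, which is why a distinct dev_t per
volume is useful.
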
Garance Alistair Drosehn = gad at gilead.netel.rpi.edu
Senior Systems Programmer or gad at freebsd.org
Rensselaer Polytechnic Institute or drosih at rpi.edu