Cluster Filesystem for FreeBSD - any interest?
yf-263
yfyoufeng at 263.net
Wed Jul 20 02:37:26 GMT 2005
On Tue, 2005-07-19 at 21:16 -0500, Eric Anderson wrote:
> Bakul Shah wrote:
> [..snip..]
> >>:) I understand. Any nudging in the right direction here would be
> >>appreciated.
> >
> >
> > I'd probably start with modelling a single filesystem and how
> > it maps to a sequence of disk blocks (*without* using any
> > code or worrying about details of formats but capturing the
> > essential elements). I'd describe various operations in
> > terms of preconditions and postconditions. Then, I'd extend
> > the model to deal with redundancy and so on. Then I'd model
> > various failure modes. etc. If you are interested _enough_
> > we can take this offline and try to work something out. You
> > may even be able to use perl to create an `executable'
> > specification:-)
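A rough idea of what such an `executable' specification could look
like is sketched below. It is only a toy under stated assumptions:
Python stands in for perl, and ModelFS, BLOCK_SIZE and all method names
are made up for illustration and do not describe UFS2 or any real
on-disk format. The model maps each file to a list of disk blocks, and
every operation checks its precondition on entry and its postcondition
on exit.

```python
# Toy model only: ModelFS and BLOCK_SIZE are hypothetical names, not a real format.
BLOCK_SIZE = 4  # bytes per block, deliberately tiny


class ModelFS:
    def __init__(self, nblocks):
        # The "disk" is just a sequence of fixed-size blocks.
        self.disk = [b"\0" * BLOCK_SIZE for _ in range(nblocks)]
        self.free = set(range(nblocks))
        self.files = {}  # name -> (length in bytes, [block numbers])

    def invariant(self):
        used = [b for _, blocks in self.files.values() for b in blocks]
        # Every block is either free or owned by exactly one file.
        return (len(used) == len(set(used))
                and not (set(used) & self.free)
                and len(used) + len(self.free) == len(self.disk))

    def read(self, name):
        assert name in self.files, "precondition: file exists"
        length, blocks = self.files[name]
        return b"".join(self.disk[b] for b in blocks)[:length]

    def write(self, name, data):
        need = -(-len(data) // BLOCK_SIZE)  # ceiling division
        old = self.files.get(name, (0, []))[1]
        # Precondition: enough room once the file's old blocks are released.
        assert need <= len(self.free) + len(old), "precondition: enough space"
        self.free.update(old)
        blocks = sorted(self.free)[:need]
        self.free.difference_update(blocks)
        for i, b in enumerate(blocks):
            chunk = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
            self.disk[b] = chunk.ljust(BLOCK_SIZE, b"\0")
        self.files[name] = (len(data), blocks)
        # Postcondition: the data reads back and the block accounting holds.
        assert self.read(name) == data and self.invariant()

    def remove(self, name):
        assert name in self.files, "precondition: file exists"
        _, blocks = self.files.pop(name)
        self.free.update(blocks)
        # Postcondition: the name is gone and its blocks are free again.
        assert name not in self.files and self.invariant()
```

Redundancy and failure modes could then be added as further operations
on the same model (for example, one that corrupts a block) together
with the postconditions that are expected to survive them.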
>
> I've done some research, and read some books/articles/white papers since
> I started this thread.
>
> First, porting GFS might be a more universal effort, and might be
> 'easier'. However, that doesn't get us a clustered filesystem with BSD
> license (something that sounds good to me).
It has been said that it would be a seven man-month effort for an FS expert.
>
> Clustering UFS2 would be cool. Here's what I'm looking for:
That is exactly how "Lustre" does its work, though it builds on Ext3.
What Lustre targets is described at http://www.lustre.org/docs/SGSRFP.pdf .
>
> A clustered filesystem (or layer?) that allows all machines in the
> cluster to see the same filesystem as if it were local, with read/write
> access. The cluster will need cache coherency across all nodes, and
> there will need to be some sort of lock manager on each node to
> communicate with all the other nodes to coordinate file locking. The
> filesystem will have to support journaling.
>
> I'm wondering if one could make a pseudo filesystem something like
> nullfs that sits on top of a UFS2 partition, and essentially monitors
> all VFS operations to the filesystem, and communicates them over TCP/IP
> to the other nodes in the cluster. That way, each node would know which
> inodes and blocks are changing, so they can flush those buffers, and
> they would know which blocks (or partial blocks) to view as locked as
> another node locks it. This could be done via multicast, so all nodes in
> the cluster would have to be running a distributed lock manager daemon
> (dlmd) that would coordinate this. I think also that the UFS2
> filesystem would have to have a bit set upon mount that tracked its
> mount as a 'clustered' filesystem mount. The reason for that is so that
> we could modify mount to only mount 'clustered' filesystems (mount -o
> clustered) if the dlmd was running, since that would be a dependency for
> stable coherent file control on a mount point.
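To make the idea above a bit more concrete, here is a minimal
in-memory sketch of the bookkeeping each node's lock manager would
need. Everything in it is an assumption for illustration: the Node
class, the message tuples and the shared "bus" list (which stands in
for multicast between dlmd daemons) are hypothetical, and nothing here
is a real FreeBSD VFS interface. It ignores lost messages, node
failure and two nodes requesting the same lock at the same instant,
all of which a real lock manager would have to handle.

```python
class Node:
    """One cluster member: a lock table plus a block cache (toy model)."""

    def __init__(self, name, bus):
        self.name = name
        self.bus = bus      # shared list of nodes; stand-in for multicast
        self.locks = {}     # (inode, block) -> name of the owning node
        self.cache = {}     # (inode, block) -> cached block contents
        bus.append(self)

    def _broadcast(self, msg):
        # Deliver the message to every node, including the sender.
        for node in self.bus:
            node.handle(msg)

    def handle(self, msg):
        kind, owner, key = msg
        if kind == "lock":
            self.locks[key] = owner
        elif kind == "unlock":
            self.locks.pop(key, None)
        elif kind == "dirty" and owner != self.name:
            # Another node changed this block: drop our stale buffer.
            self.cache.pop(key, None)

    def lock(self, inode, block):
        key = (inode, block)
        if self.locks.get(key, self.name) != self.name:
            return False    # some other node holds the lock
        self._broadcast(("lock", self.name, key))
        return True

    def write(self, inode, block, data):
        key = (inode, block)
        if self.locks.get(key) != self.name:
            raise PermissionError("block is not locked by this node")
        self.cache[key] = data
        self._broadcast(("dirty", self.name, key))

    def unlock(self, inode, block):
        key = (inode, block)
        if self.locks.get(key) == self.name:
            self._broadcast(("unlock", self.name, key))
```

With two nodes on the same bus, a lock taken on one node makes the
other node's lock() call return False until it is released, and a
write on one node evicts the other node's cached copy of that block.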
>
> Does anyone have any insight as to whether a layer would work? Or maybe
> I'm way off here and I need to do more reading :)
>
> Eric
>
>
>
--
yf-263 <yfyoufeng at 263.net>
Unix-driver.org
More information about the freebsd-fs
mailing list