mount ZFS snapshot on Linux system

Rick Macklem rmacklem at uoguelph.ca
Mon Dec 16 00:57:17 UTC 2013


Jason Keltz wrote:
> On 10/12/2013 7:21 PM, Rick Macklem wrote:
> > Jason Keltz wrote:
> >> I'm running FreeBSD 9.2 with various ZFS datasets.
> >> I export a dataset to a Linux system (RHEL64), and mount it.  It works
> >> fine...
> >> When I try to access the ZFS snapshot directory on the Linux NFS client,
> >> things go weird.
> >>
> >> With NFSv4:
> >>
> >> [jas at archive /]# cd /mnt/.zfs/snapshot
> >> [jas at archive snapshot]# ls
> >> 20131203  20131205  20131206  20131207  20131208  20131209  20131210
> >> [jas at archive snapshot]# cd 20131210
> >> 20131210: Not a directory.
> >>
> >> huh?
> >>
> >> [jas at archive snapshot]# ls -al
> >> total 77
> >> dr-xr-xr-x   9 root root   9 Dec 10 11:20 .
> >> dr-xr-xr-x   4 root root   4 Nov 28 15:42 ..
> >> drwxr-xr-x 380 root root 380 Dec  2 15:56 20131203
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131205
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131206
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131207
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131208
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131209
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131210
> >> [jas at archive snapshot]# stat *
> >> [jas at archive snapshot]# ls -al
> >> total 292
> >> dr-xr-xr-x 9 root      root         9 Dec 10 11:20 .
> >> dr-xr-xr-x 4 root      root         4 Nov 28 15:42 ..
> >> -rw-r--r-- 1 uax    guest   137647 Mar 17  2010 20131203
> >> -rw-r--r-- 1 uax    guest         865 Jul 31  2009 20131205
> >> -rw-r--r-- 1 uax    guest   137647 Mar 17  2010 20131206
> >> -rw-r--r-- 1 uax    guest         771 Jul 31  2009 20131207
> >> -rw-r--r-- 1 uax    guest         778 Jul 31  2009 20131208
> >> -rw-r--r-- 1 uax     guest       5281 Jul 31  2009 20131209
> >> -rw------- 1 btx      faculty      893 Jul 13 20:21 20131210
> >>
> >> But it gets even more fun..
> >>
Just to let everyone know, Jason sent me a packet capture and
it does appear that the FreeBSD NFSv4 server generates bogus
attributes (the ones listed just above) in a Readdir reply when
the .zfs/snapshot directory is read.

I have sent him a simple patch that makes the server use VOP_LOOKUP()
unconditionally, instead of trying VFS_VGET() first and switching to
VOP_LOOKUP() only upon an EOPNOTSUPP reply from VFS_VGET(). (It seems that
zfs_vget() returns vnodes from which VOP_GETATTR() gets the bogus attributes.)
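
To illustrate the difference between the two code paths, here is a toy
model in Python. It is not FreeBSD kernel code: the Vnode class, the two
lookup tables and all function names are hypothetical stand-ins, and the
sample values (fileno 863, size 137647, link count 380) are taken from the
listings quoted in this message. It assumes, as suspected above, that the
fileno-based path can hand back an unrelated object for a snapshot entry.

```python
import errno

# Toy model only; every name here is a hypothetical stand-in.
class Vnode:
    def __init__(self, name, vtype, size):
        self.name, self.vtype, self.size = name, vtype, size

# What a name-based lookup in .zfs/snapshot should yield: a directory.
BY_NAME = {"20131203": Vnode("20131203", "VDIR", 380)}
# Assumption: the fileno-based path resolves to an unrelated regular file.
BY_FILENO = {863: Vnode("unrelated-file", "VREG", 137647)}

def vfs_vget(fileno, supported=True):
    """Stand-in for VFS_VGET(): returns (error, vnode)."""
    if not supported:
        return errno.EOPNOTSUPP, None
    return 0, BY_FILENO[fileno]

def vop_lookup(name):
    """Stand-in for VOP_LOOKUP(): returns (error, vnode)."""
    return 0, BY_NAME[name]

def readdir_entry_old(name, fileno, vget_supported=True):
    """Old behaviour: try the fileno path, fall back only on EOPNOTSUPP."""
    err, vp = vfs_vget(fileno, vget_supported)
    if err == errno.EOPNOTSUPP:
        err, vp = vop_lookup(name)
    return vp

def readdir_entry_patched(name, fileno):
    """Patched behaviour: always resolve the entry by name."""
    err, vp = vop_lookup(name)
    return vp
```

In this model the old path returns the regular file whenever the
fileno-based call "succeeds", while the patched path always returns the
directory, which is the behaviour the patch is meant to produce.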

Hopefully he will be able to test the patch, but I'm not sure at
this point.

I still don't know if VOP_LOOKUP() will return a vnode with
v_mountedhere != NULL when it does a lookup of a snapshot in
.zfs/snapshot, but I should find out the answer to that if/when
he tests the patch.

If someone else knows what zfs_lookup() will return when doing
a lookup of a snapshot in .zfs/snapshot or is willing to test
the patch to find out, please email.

rick

> >> # ls -ali
> >> total 205
> >>     2 dr-xr-xr-x   9 root      root       9 Dec 10 11:20 .
> >>     1 dr-xr-xr-x   4 root      root       4 Nov 28 15:42 ..
> >> 863 -rw-r--r--   1 uax     guest 137647 Mar 17  2010 20131203
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131205
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131206
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131207
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131208
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131209
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131210
> >>
> >> This is not a user id mapping issue because all the files in /mnt have
> >> the proper owner/groups, and I can access them there fine.
> >>
> >> I also tried explicitly exporting .zfs/snapshot.  The result isn't any
> >> different.
> >>
> >> If I use NFSv3 it "works", but I'm seeing a whole lot of errors like
> >> these in syslog:
> >>
> >> Dec 10 12:32:28 jungle mountd[49579]: can't delete exports for /local/backup/home9/.zfs/snapshot/20131203: Invalid argument
> >> Dec 10 12:32:28 jungle mountd[49579]: can't delete exports for /local/backup/home9/.zfs/snapshot/20131209: Invalid argument
> >> Dec 10 12:32:28 jungle mountd[49579]: can't delete exports for /local/backup/home9/.zfs/snapshot/20131210: Invalid argument
> >> Dec 10 12:32:28 jungle mountd[49579]: can't delete exports for /local/backup/home9/.zfs/snapshot/20131207: Invalid argument
> >>
> >> It's not clear to me why this doesn't just "work".
> >>
> >> Can anyone provide any advice on debugging this?
> >>
> > As I think you already know, I know nothing about ZFS and never
> > use it.
> Yup! :)
> > Having said that, I suspect that there are filenos (i-node #s)
> > that are the same in the snapshot as in the parent file system
> > tree.
> >
> > The basic assumptions are:
> > - within a file system, all i-node# are unique (represent one file
> >    object only) and all file objects have the same fsid
> > - when the fsid changes, that indicates a file system boundary and
> >    fileno (i-node#s) can be reused in the subtree with a different
> >    fsid
> >
> > For NFSv3, the server should export single volumes only (all objects
> > have the same fsid and the filenos are unique). This is indicated to
> > the VFS by the use of the NOCROSSMOUNT flag on VOP_LOOKUP() and friends.
> >
> > For NFSv4, the server does export multiple volumes and the boundary
> > is indicated by a change in fsid value.
> >
> > I suspect ZFS snapshots don't obey the above in some way, but that is
> > just a hunch.
> >
> > Now, how to narrow this down...
> > - Do the above tests (both NFSv4 and NFSv3) and capture the packets,
> >    then look at them in wireshark. In particular, look at the fileid
> >    numbers and fsid values for the various directories under .zfs.
> 
> I gave this a shot, but I haven't used wireshark to capture NFS traffic
> before, so if I need to provide additional details, let me know..
> 
> NFSv4:
> 
> For /mnt/.zfs/snapshot/20131203:
> fileid=4
> fsid4.major=1446349656
> fsid4.minor=222
> 
> For /mnt/.zfs/snapshot/20131205:
> fileid=4
> fsid4.major=1845998066
> fsid4.minor=222
> 
> For /mnt/jas:
> fileid=144
> fsid4.major=597946950
> fsid4.minor=222
> 
> For /mnt/jas1:
> fileid=338
> fsid4.major=597946950
> fsid4.minor=222
> 
> So fsid is the same for all the different "data" directories, which is
> what I would expect given what you said.  I guess each snapshot is seen
> as a unique filesystem...  but then a repeating inode in different
> filesystems shouldn't be a problem...
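
As a cross-check of that reading, the sketch below tests the NFSv4 values
quoted above against the two assumptions (a fileid is unique within one
fsid, and a change of fsid marks a file system boundary). The data is
copied from the capture; the function names are made up for this example.

```python
from collections import defaultdict

# (path, fileid, fsid4.major, fsid4.minor) as reported in the NFSv4 capture.
CAPTURE = [
    ("/mnt/.zfs/snapshot/20131203", 4, 1446349656, 222),
    ("/mnt/.zfs/snapshot/20131205", 4, 1845998066, 222),
    ("/mnt/jas", 144, 597946950, 222),
    ("/mnt/jas1", 338, 597946950, 222),
]

def fileid_collisions(entries):
    """Return {(fsid, fileid): [paths]} for pairs claimed by more than one path."""
    seen = defaultdict(list)
    for path, fileid, major, minor in entries:
        seen[((major, minor), fileid)].append(path)
    return {key: paths for key, paths in seen.items() if len(paths) > 1}

def filesystems(entries):
    """Group paths by fsid; each distinct fsid marks a separate file system."""
    groups = defaultdict(list)
    for path, _fileid, major, minor in entries:
        groups[(major, minor)].append(path)
    return dict(groups)

print(fileid_collisions(CAPTURE))   # no (fsid, fileid) pair is reused
print(len(filesystems(CAPTURE)))    # number of distinct file systems seen
```

For these four entries there are no collisions and three distinct fsids:
one per snapshot plus one shared by /mnt/jas and /mnt/jas1. So the values
in the capture are consistent with the assumptions, and the repeated
fileid=4 is not by itself a violation.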
> 
> NFSv3:
> 
> For /mnt/.zfs/snapshot/20131203:
> fileid=4
> fsid=0x0000000056358b58
> 
> For /mnt/.zfs/snapshot/20131205:
> fileid=4
> fsid=0x000000006e07b1f2
> 
> For /mnt/jas
> fileid=144
> fsid=0x0000000023a3f246
> 
> For /mnt/jas1:
> fileid=338
> fsid=0x0000000023a3f246
> 
> Here, it seems it's the same, even though it's NFSv3... hmm.
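
The NFSv3 fsid values are shown in hex and the NFSv4 fsid4.major values in
decimal, which makes them hard to compare by eye. A short sketch that
compares them, using only the values quoted above (the dictionary and
function names are made up for this example):

```python
# NFSv3 fsid values as shown by wireshark (hex).
NFSV3_FSID = {
    "/mnt/.zfs/snapshot/20131203": 0x0000000056358B58,
    "/mnt/.zfs/snapshot/20131205": 0x000000006E07B1F2,
    "/mnt/jas": 0x0000000023A3F246,
    "/mnt/jas1": 0x0000000023A3F246,
}

# NFSv4 fsid4.major values as shown by wireshark (decimal).
NFSV4_FSID_MAJOR = {
    "/mnt/.zfs/snapshot/20131203": 1446349656,
    "/mnt/.zfs/snapshot/20131205": 1845998066,
    "/mnt/jas": 597946950,
    "/mnt/jas1": 597946950,
}

def same_fsid(path):
    """True if the NFSv3 fsid equals the NFSv4 fsid4.major for this path."""
    return NFSV3_FSID[path] == NFSV4_FSID_MAJOR[path]

for path in NFSV3_FSID:
    print(path, hex(NFSV4_FSID_MAJOR[path]), same_fsid(path))
```

All four paths match, so both protocol versions report the same fsid for
each object, and each snapshot has an fsid different from the parent
dataset under NFSv3 as well as NFSv4.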
> 
> 
> > - Try mounting the individual snapshot directory, like
> >     .zfs/snapshot/20131209 and see if that works (for both NFSv3
> >     and NFSv4).
> 
> Hmm .. I tried this:
> 
> /local/backup/home9/.zfs/snapshot/20131203  -ro archive-mrpriv.cs.yorku.ca
> V4: /
> 
> ... but syslog reports:
> 
> Dec 10 22:28:22 jungle mountd[85405]: can't export /local/backup/home9/.zfs/snapshot/20131203
> 
> ... and of course I can't mount from either v3/v4.
> 
> On the other hand, I kept it as:
> 
> /local/backup/home9 -ro archive-mrpriv.cs.yorku.ca
> V4:/
> 
> ... and was able to NFSv4 mount
> /local/backup/home9/.zfs/snapshot/20131203, and this does indeed
> work.
> 
> > - Try doing the mounts with a FreeBSD client and see if you get the
> >    same behaviour?
> I found this:
> http://forums.freenas.org/threads/mounting-snapshot-directory-using-nfs-from-linux-broken.6060/
> .. implies it will work from FreeBSD/Nexenta, just not Linux.
> Found this as well:
> https://groups.google.com/a/zfsonlinux.org/forum/#!topic/zfs-discuss/lKyfYsjPMNM
> 
> Jason.
> 
> 

