Unable to pwd in ZFS snapshot

Gleb Kurtsou gleb.kurtsou at gmail.com
Sun Feb 14 01:05:36 UTC 2010


On (10/02/2010 22:09), Jaakko Heinonen wrote:
> On 2010-02-05, Gleb Kurtsou wrote:
> > Comments in zfs_ctldir.c explain the inode numbering scheme under .zfs
> > in detail, each snapshot node has to have unique inode number.  Correct
> > inode number is returned by READDIR call, but not by GETATTR for the
> > same vnode. This breaks our pwd (getcwd). The patch attached adds a hack
> > to VOP_GETATTR to return expected inode numbers.
> > 
> > Also, with r197513 reverted all inode numbers are still the same, but it
> > seems to work as expected.
> 
> r196309 added a VOP_VPTOCNP(9) implementation for snapshots. However due
> to changes made in r197513 zfsctl_snapshot_vptocnp() never gets called.
> 
> There's also another problem with the hidden .zfs directory: if the
> directory is not in the name cache, __getcwd() will fail. To reproduce
> this set debug.vfscache sysctl to 0 before the directory enters to
> cache.
> 
> # sysctl debug.vfscache=0
> # cd /scratch/.zfs
> # /bin/pwd
> pwd: .: No such file or directory
> 
> Here's a patch which tries to fix/work around these problems:
> 
> 	http://people.freebsd.org/~jh/patches/zfs-ctldir-vptocnp.diff
> 
> The patch needs more work and I have tested it only very lightly.
I see no reason to implement VPTOCNP for directories under .zfs. The
problem here lies in incorrect inode numbers (an inconsistency between
VOP_READDIR and VOP_GETATTR). Fixing it only in the kernel (__getcwd)
doesn't necessarily fix it for all cases in userland: getcwd in our
libc falls back to comparing inode numbers (an algorithm equivalent to
the default VOP_VPTOCNP implementation). Besides, I think Midnight
Commander also does something very similar with inode numbers.
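
To illustrate the kind of inode comparison I mean, here is a rough sketch.
It is not the actual libc code: name_in_parent is a made-up helper, it
assumes the directory and its parent are on the same file system, and it
uses the POSIX d_ino name for what we call d_fileno.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not a libc function: find the name of directory
 * `dir` inside its parent by matching inode numbers.
 * Returns 0 and fills `name` on success, -1 with errno set on failure.
 */
static int
name_in_parent(const char *dir, char *name, size_t namelen)
{
	char parent[1024];
	struct stat st;
	struct dirent *de;
	DIR *dirp;
	int found = -1;

	/* Inode number as reported by stat (VOP_GETATTR). */
	if (stat(dir, &st) == -1)
		return (-1);
	if (snprintf(parent, sizeof(parent), "%s/..", dir) >=
	    (int)sizeof(parent)) {
		errno = ENAMETOOLONG;
		return (-1);
	}
	if ((dirp = opendir(parent)) == NULL)
		return (-1);
	while ((de = readdir(dirp)) != NULL) {
		if (strcmp(de->d_name, ".") == 0 ||
		    strcmp(de->d_name, "..") == 0)
			continue;
		/* Inode number as reported by readdir (VOP_READDIR). */
		if (de->d_ino == st.st_ino) {
			snprintf(name, namelen, "%s", de->d_name);
			found = 0;
			break;
		}
	}
	closedir(dirp);
	if (found == -1)
		errno = ENOENT;
	return (found);
}
```

In this sketch, if readdir in the parent and stat on the directory
disagree, no entry matches and the lookup fails with ENOENT. If several
entries share one inode number, the first one found wins, which may be
the wrong name.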

Fixing inode numbers would also fix pwd issues with the namecache disabled.

Here is what it looks like in my case (with the patch from my previous
email applied):

/.zfs % testdir 
./.                     1       1
./..                    1       3       !!!
./snapshot              2       2          
/.zfs % cd snapshot 
/.zfs/snapshot % testdir    
./.                     2       2
./..                    1       1
./2009-12-25            205     205
./2010-02-10            122     122
^^^ These two are fixed by the patch; both would show 3 3 otherwise.

/.zfs/snapshot % cd 2009-12-25
/.zfs/snapshot/2009-12-25 % testdir
./.                     3       205     !!!
./..                    3       2       !!!
./.cshrc                13102   13102
./media                 38      38
[...]
/usr/bin % testdir | head
./.                     85      85
./..                    45      45
./from                  123257  123257
./env                   123223  123223
./bzfgrep               470652  470652
./nm                    470564  470564
./calendar              122923  122923
[...]

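testdir sources attached. The first number is d_fileno from readdir, the
second is st_ino from lstat; "!!!" marks a mismatch.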

What do you think about it? Do you by chance have OpenSolaris running to
check whether the situation is different there?

Thanks,
Gleb.
> 
> -- 
> Jaakko
-------------- next part --------------
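#include <sys/param.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <err.h>
#include <stdint.h>
#include <stdio.h>

/*
 * List a directory, printing for each entry the inode number reported by
 * readdir (d_fileno) next to the one reported by lstat (st_ino).
 * Mismatches are flagged with "!!!".
 */
int main(int argc, char **argv)
{
	const char *path;
	char buf[MAXPATHLEN];
	DIR *dirp;
	struct dirent de_buf;
	struct dirent *de;
	struct stat st;
	int error;

	/* Default to the current directory when no argument is given. */
	if (argc < 2)
		path = ".";
	else
		path = argv[1];

	dirp = opendir(path);
	if (dirp == NULL)
		err(1, "opendir: %s", path);

	while (1) {
		/* readdir_r returns the error number instead of setting errno. */
		error = readdir_r(dirp, &de_buf, &de);
		if (error != 0)
			errc(1, error, "readdir_r");
		if (de == NULL)
			break;
		snprintf(buf, sizeof(buf), "%s/%.*s", path, (int)de->d_namlen,
		    de->d_name);
		if (lstat(buf, &st) == -1)
			err(1, "lstat: %s", buf);
		printf("%-20s\t%ju\t%ju\t%s\n", buf, (uintmax_t)de->d_fileno,
		    (uintmax_t)st.st_ino,
		    (de->d_fileno != st.st_ino ? "!!!" : ""));
	}
	closedir(dirp);

	return (0);
}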

