[old nfsclient] different nmount() args passed from mount vs. mount_nfs
Sergey Kandaurov
pluknet at gmail.com
Tue May 17 09:36:45 UTC 2011
Hi.
First, sorry for the long mail; I tried to describe the problem in full detail.
When mounting nfs with some options, I found that /sbin/mount and
/sbin/mount_nfs pass options to nmount() differently, which results
in bad things (TM). I traced the options and here they are:
From mount(8) -> mount_nfs(8):
"rw" -> ""
"addr" -> {something valid }
"fh" -> 5
"sec" -> "sys"
"nfsv3" -> 0x0 => NFSMNT_NFSV3
"hostname" -> "dev2.mail:/home/svn/freebsd/head"
"fstype" -> "oldnfs"
"fspath" -> "/usr/src"
"errmsg" -> ""
(nil)
From pre-r221124 mount(8):
= "fstype" -> "oldnfs"
"hostname" -> "dev2.mail"
= "fspath" -> "/usr/src"
"from" -> "dev2.mail:/home/svn/freebsd/head"
= "errmsg" -> ""
(nil)
Note that pre-r221124 mount(8) knows nothing about oldnfs.
1. The "hostname" option is passed differently by mount(8) and mount_nfs(8).
When I force mount(8) to mount an oldnfs file system directly (that is, to
call nmount(2) itself rather than hand the job to mount_nfs(8)), I get this error:
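For readers less familiar with nmount(2): the options traced above reach the
kernel as an array of struct iovec, two entries per option (the name, then the
value). The following is a minimal userland sketch of the list that the
pre-r221124 mount(8) sent. It never calls nmount(2); add_opt(), has_opt(),
build_old_mount_opts() and MAX_IOV are hypothetical names made up for this
illustration, and "errmsg" is simplified to an empty string.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

#define MAX_IOV 32	/* hypothetical limit, for this sketch only */

/* Hypothetical helper: append one name/value pair to the iovec array. */
static int
add_opt(struct iovec *iov, int *iovcnt, const char *name, const char *val)
{
	if (*iovcnt + 2 > MAX_IOV)
		return (-1);
	iov[*iovcnt].iov_base = (void *)name;
	iov[*iovcnt].iov_len = strlen(name) + 1;
	(*iovcnt)++;
	iov[*iovcnt].iov_base = (void *)val;
	iov[*iovcnt].iov_len = (val != NULL) ? strlen(val) + 1 : 0;
	(*iovcnt)++;
	return (0);
}

/* Hypothetical helper: return 1 if the named option is in the list. */
static int
has_opt(const struct iovec *iov, int iovcnt, const char *name)
{
	int i;

	/* Names sit at even indexes, values at odd indexes. */
	for (i = 0; i + 1 < iovcnt; i += 2)
		if (strcmp(iov[i].iov_base, name) == 0)
			return (1);
	return (0);
}

/*
 * Rebuild the option list from the pre-r221124 mount(8) trace.
 * Returns the number of iovec entries, or -1 if the array is too small.
 */
static int
build_old_mount_opts(struct iovec *iov)
{
	int n = 0;

	if (add_opt(iov, &n, "fstype", "oldnfs") != 0 ||
	    add_opt(iov, &n, "hostname", "dev2.mail") != 0 ||
	    add_opt(iov, &n, "fspath", "/usr/src") != 0 ||
	    add_opt(iov, &n, "from", "dev2.mail:/home/svn/freebsd/head") != 0 ||
	    add_opt(iov, &n, "errmsg", "") != 0)
		return (-1);
	return (n);
}
```

Calling has_opt() on this list for "addr" or "fh" finds nothing, which is
exactly the situation the kernel side has to cope with.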
./mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument
Hmm, this may be because mount(8) passes the value in $hostname:$path format
(see the traces above). It might be due to the old nfsclient parsing its args
differently, but I am not sure and may be wrong. Anyway, it does not matter now.
The actual problem manifests when running the command with a pre-r221124
mount(8) binary. It knows nothing about "oldnfs" and (attention!) calls
nmount(2) directly instead of passing the call on to the mount_nfs(8) binary
as is usually done, and this is where the "unsanitized nmount(2) args"
problem is hidden.
[The new mount(8) knows about "oldnfs" and passes the call to mount_oldnfs(8),
which prepares all the nmount(2) args correctly and so hides the problem.]
To prove it, here is how old and new mount(8) behave differently:
1) new mount(8) as of current
mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
exec: mount_oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
2) old mount(8) as of pre-r221124
./mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
Ok, back to the first paragraph: a different "hostname" mount option.
When I first ran into this, I tried to specify the value for "hostname"
explicitly. Here it comes:
./mount -t oldnfs -o hostname=dev2.mail dev2.mail:/home/svn/freebsd/head /usr/src
[CABOOM!]
It just crashed. Do not do this :)
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x1
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff805da299
stack pointer = 0x28:0xffffff807bef6240
frame pointer = 0x28:0xffffff807bef62a0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 2541 (mount)
db> bt
Tracing pid 2541 tid 100076 td 0xfffffe0001ace460
nfs_connect() at 0xffffffff805da299 = nfs_connect+0x79
nfs_request() at 0xffffffff805da978 = nfs_request+0x398
nfs_getattr() at 0xffffffff805e2a6c = nfs_getattr+0x2bc
VOP_GETATTR_APV() at 0xffffffff806f4283 = VOP_GETATTR_APV+0xd3
mountnfs() at 0xffffffff805de739 = mountnfs+0x329
nfs_mount() at 0xffffffff805dffc7 = nfs_mount+0xcf7
vfs_donmount() at 0xffffffff804d46ff = vfs_donmount+0x82f
nmount() at 0xffffffff804d54f3 = nmount+0x63
syscallenter() at 0xffffffff804861cb = syscallenter+0x1cb
syscall() at 0xffffffff806ae710 = syscall+0x60
Xfast_syscall() at 0xffffffff8069922d = Xfast_syscall+0xdd
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x800ab444c, rsp =
0x7fffffffca48, rbp = 0x801009058 ---
As you can see from the nmount(2) argument traces above, mount(8) itself
doesn't pass the "addr" option to the nmount(2) syscall, while nfs_mount()
expects to receive it. That is the problem.
Later, deep inside nmount(2), the code in /sys/nfsclient/nfs_krpc.c tries to
dereference the addr value and page faults here in nfs_connect():
vers = NFS_VER3;
else if (nmp->nm_flag & NFSMNT_NFSV4)
vers = NFS_VER4;
/* XXX saddr is NULL, the next line will crash */
if (saddr->sa_family == AF_INET)
if (nmp->nm_sotype == SOCK_DGRAM)
nconf = getnetconfigent("udp");
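One way to make the dereference above safe is a NULL check before touching
saddr->sa_family. Below is a minimal userland model of that selection logic,
not the kernel code itself: pick_netid() is a hypothetical name, and the
"tcp", "udp6" and "tcp6" branches are assumptions, since only the
AF_INET/SOCK_DGRAM -> "udp" branch appears in the excerpt.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Userland model of the netid selection in nfs_connect().
 * Returns NULL instead of dereferencing a missing address, so the
 * caller can fail the mount with an error rather than page fault.
 */
static const char *
pick_netid(const struct sockaddr *saddr, int sotype)
{
	if (saddr == NULL)
		return (NULL);	/* no "addr" option was supplied */
	if (saddr->sa_family == AF_INET)
		return (sotype == SOCK_DGRAM ? "udp" : "tcp");
	/* Assumption: any other family is treated as IPv6. */
	return (sotype == SOCK_DGRAM ? "udp6" : "tcp6");
}
```

Guarding this deep in the call chain only avoids the panic; rejecting the
mount earlier, in nfs_mount(), would give the user a clearer error.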
I think that nfsclient, probably in sys/nfsclient/nfs_vfsops.c:nfs_mount(),
should handle a missing value for the "addr" and/or "fh" mount options.
It doesn't check for that currently:
% static int
% nfs_mount(struct mount *mp)
% {
% struct nfs_args args = {
% [...]
% .addr = NULL,
% };
% int error, ret, has_nfs_args_opt;
% int has_addr_opt, has_fh_opt, has_hostname_opt;
% struct sockaddr *nam;
addr is initialized to NULL. nam is used later to hold a copy of the args.addr value.
% if ((mp->mnt_flag & (MNT_ROOTFS | MNT_UPDATE)) == MNT_ROOTFS) {
% error = nfs_mountroot(mp);
% goto out;
% }
We are not mounting root, so this is not our case.
% if (vfs_getopt(mp->mnt_optnew, "nfs_args", NULL, NULL) == 0) {
[...]
% has_nfs_args_opt = 1;
% }
We do not use the old mount(2) interface, so this is not our case either.
% if (vfs_getopt(mp->mnt_optnew, "nfsv3", NULL, NULL) == 0)
% args.flags |= NFSMNT_NFSV3;
mount(8) doesn't pass the nfsv3 option, so NFSMNT_NFSV3 isn't set.
% if (vfs_getopt(mp->mnt_optnew, "addr", (void **)&args.addr,
% &args.addrlen) == 0) {
% has_addr_opt = 1;
% if (args.addrlen > SOCK_MAXADDRLEN) {
% error = ENAMETOOLONG;
% goto out;
% }
% nam = malloc(args.addrlen, M_SONAME,
% M_WAITOK);
% bcopy(args.addr, nam, args.addrlen);
% nam->sa_len = args.addrlen;
% }
mount(8) doesn't pass the addr option, so args.addr isn't set; hence
struct sockaddr *nam is also NULL and has_addr_opt is 0.
% if (vfs_getopt(mp->mnt_optnew, "hostname", (void **)&args.hostname,
% NULL) == 0) {
% has_hostname_opt = 1;
% }
% if (args.hostname == NULL) {
% vfs_mount_error(mp, "Invalid hostname");
% error = EINVAL;
% goto out;
% }
I don't know why I got the error here; I didn't analyze it in depth, though.
"mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument"
% if (mp->mnt_flag & MNT_UPDATE) {
[...]
That's not the update case, so it's not ours.
% if (has_nfs_args_opt) {
has_nfs_args_opt is 0, as we don't use legacy mount(2) interface, see above.
So, the whole block is ignored. Though, see below.
% /*
% * In the 'nfs_args' case, the pointers in the args
% * structure are in userland - we copy them in here.
% */
% if (!has_fh_opt) {
% error = copyin((caddr_t)args.fh, (caddr_t)nfh,
% args.fhsize);
% if (error) {
% goto out;
% }
% args.fh = nfh;
% }
has_fh_opt is 0, as mount(8) didn't pass "fh" to nmount(2),
though this part is not executed anyway.
% if (!has_hostname_opt) {
% error = copyinstr(args.hostname, hst, MNAMELEN-1, &len);
% if (error) {
% goto out;
% }
% bzero(&hst[len], MNAMELEN - len);
% args.hostname = hst;
has_hostname_opt is 1, as mount(8) passes "hostname" to nmount(2),
though this part is not executed anyway.
% }
% if (!has_addr_opt) {
% /* sockargs() call must be after above copyin() calls */
% printf("args.addr: %p\n", args.addr);
% error = getsockaddr(&nam, (caddr_t)args.addr,
% args.addrlen);
% printf("error: %d\n", error);
% if (error) {
% goto out;
% }
% }
has_addr_opt is 0, as mount(8) didn't pass "addr" to nmount(2),
though this part is not executed anyway.
% }
% error = mountnfs(&args, mp, nam, args.hostname, &vp,
% curthread->td_ucred, negnametimeo);
mountnfs() is called with nam == NULL, then it crashes deep in
/sys/nfsclient/nfs_krpc.c:nfs_connect().
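The kind of check suggested above can be modeled in userland as follows. This
is only a sketch under stated assumptions, not a patch against nfs_mount():
struct opt_state and check_required_opts() are hypothetical names, and
returning EINVAL is an assumption modeled on the existing "Invalid hostname"
path.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the flags nfs_mount() collects. */
struct opt_state {
	int has_nfs_args_opt;	/* legacy mount(2) "nfs_args" blob present */
	int has_addr_opt;	/* "addr" passed via nmount(2) */
	int has_fh_opt;		/* "fh" passed via nmount(2) */
	int has_hostname_opt;	/* "hostname" passed via nmount(2) */
};

/*
 * Return 0 if it would be safe to go on to mountnfs(), or EINVAL if a
 * required option was supplied neither directly nor through "nfs_args".
 */
static int
check_required_opts(const struct opt_state *st)
{
	/* In the "nfs_args" case the missing values are copied in from it. */
	if (st->has_nfs_args_opt)
		return (0);
	if (!st->has_addr_opt || !st->has_fh_opt || !st->has_hostname_opt)
		return (EINVAL);
	return (0);
}
```

With the flags from the old mount(8) case (only has_hostname_opt set) this
returns EINVAL, so the mount would fail cleanly before mountnfs() ever sees
nam == NULL.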
Also compare the ddb backtrace with the one from the new mount(8),
which passes the call on to mount_nfs(8). I got it by adding
kdb_enter() just before the NULL pointer dereference.
db> bt
Tracing pid 2143 tid 100117 td 0xfffffe0001c58000
kdb_enter() at 0xffffffff80477d1b = kdb_enter+0x3b
nfs_connect() at 0xffffffff805da7e8 = nfs_connect+0x88
nfs_request() at 0xffffffff805daec8 = nfs_request+0x398
nfs_fsinfo() at 0xffffffff805ddec0 = nfs_fsinfo+0xd0
mountnfs() at 0xffffffff805ded44 = mountnfs+0x3e4
nfs_mount() at 0xffffffff805e051f = nfs_mount+0xcff
vfs_donmount() at 0xffffffff804d5092 = vfs_donmount+0xc92
nmount() at 0xffffffff804d5a33 = nmount+0x63
syscallenter() at 0xffffffff804866eb = syscallenter+0x1cb
syscall() at 0xffffffff806aec90 = syscall+0x60
Xfast_syscall() at 0xffffffff806997ad = Xfast_syscall+0xdd
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8008a544c, rsp =
0x7fffffffd258, rbp = 0x7fffffffd30c ---
The two backtraces differ slightly because NFSMNT_NFSV3 is not set
in the old mount(8) case. From sys/nfsclient/nfs_vfsops.c:mountnfs():
if (argp->flags & NFSMNT_NFSV3)
nfs_fsinfo(nmp, *vpp, curthread->td_ucred, curthread);
else
VOP_GETATTR(*vpp, &attrs, curthread->td_ucred);
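To make the difference between the two backtraces explicit, here is a tiny
userland model of that branch. first_rpc_path() is a hypothetical name, and
the NFSMNT_NFSV3 value below is a placeholder for this sketch, not taken from
the kernel headers.

```c
#include <stddef.h>

#define NFSMNT_NFSV3	0x00000200	/* placeholder value, for this sketch only */

/*
 * Name the function that shows up below mountnfs() in the backtrace,
 * depending on whether the "nfsv3" option set NFSMNT_NFSV3.
 */
static const char *
first_rpc_path(int flags)
{
	if (flags & NFSMNT_NFSV3)
		return ("nfs_fsinfo");
	/* VOP_GETATTR() ends up in nfs_getattr() for an NFS vnode. */
	return ("nfs_getattr");
}
```

The old mount(8) passes no "nfsv3" option, so flags stays 0 and the crash is
reached through nfs_getattr(); mount_nfs(8) sets the flag, so the second
trace goes through nfs_fsinfo().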
--
wbr,
pluknet