NFSv4: prob err=10036

Tue Apr 15 21:01:29 UTC 2014

Marcelo Araujo wrote:
> 
> Hello Rick,
> 
> 
> Thanks for the prompt reply, and I'm sorry for my late reply;
> unfortunately I'm located in Taiwan, so the timezone is an issue.
> 
> 
> So here attached is my pcap.
> 
> 
> Server IP: 172.17.32.42
> Client IP: 172.17.32.54
> 
> 
> It seems to be something related to RELEASE_LOCKOWNER. I'm still
> investigating; maybe I can find a solution before you reply. If so,
> I will post it here.
> 
Well, I looked at the packet trace and it is weird.

One field (the NFSv4 operation #) is incorrect in the packet.
It should have been 24 (0x18), which is PUTROOTFH, and instead
it is 39 (0x27), which is RELEASELOCKOWNER.
All the arguments after the operation # are correct for the
RPC, if that operation # were 24 (PUTROOTFH).

The call looks like this (around line#4303 in sys/fs/nfsclient/nfs_clrpcops.c):

   nfscl_reqstart(nd, NFSPROC_PUTROOTFH, nmp, NULL, 0, &opcntp, NULL);

so I can't imagine how NFSPROC_PUTROOTFH became NFSPROC_RELEASELCKOWN.
(Btw, there is a mapping from NFSPROC_xxx to NFSV4OP_xxx that occurs,
 so the arguments here are the NFSPROC_xxx values 33 and 34 respectively,
 not the wire opcodes 24 and 39.)
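
To make the two numbering schemes concrete, here is a minimal sketch of such a mapping. It is not the kernel's actual table: the function name nfsproc_to_wire_op() is made up for illustration and only the two procedures discussed here are listed. The procedure numbers are the ones quoted above and the wire opcodes are the ones from the NFSv4 protocol definition.

```c
/* Client-internal procedure numbers, as quoted in this message. */
#define NFSPROC_PUTROOTFH       33
#define NFSPROC_RELEASELCKOWN   34

/* Operation numbers that actually appear in the packet. */
#define NFSV4OP_PUTROOTFH       24
#define NFSV4OP_RELEASELCKOWN   39

/*
 * Hypothetical helper (not a kernel function): translate an internal
 * procedure number to the wire opcode, or return -1 if the procedure
 * is not one of the two covered by this sketch.
 */
static int
nfsproc_to_wire_op(int proc)
{
	switch (proc) {
	case NFSPROC_PUTROOTFH:
		return (NFSV4OP_PUTROOTFH);
	case NFSPROC_RELEASELCKOWN:
		return (NFSV4OP_RELEASELCKOWN);
	default:
		return (-1);
	}
}
```

With a mapping like this, an argument that changes from 33 to 34 before the lookup shows up on the wire as 39 instead of 24, which is what the trace contains.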

So, somehow the argument gets incremented by one while it is on the
stack for the call. It must already be 34 inside nfscl_reqstart(),
since the tag in the packet header is "Rellckown" and not "Dirpath".
This tag is for debugging only and doesn't affect the RPC's semantics.
For once, it was useful ;-) So, this isn't some data error introduced
later, such as "on the wire".
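
For anyone reading the trace by hand, here is a rough sketch of where the tag and the first operation # sit in the NFSv4 compound arguments (the part after the RPC header): an XDR string for the tag, then the minor version, then the operation count, then the first operation number. The struct, the function names and the sample bytes are made up for illustration; the sample is what a healthy PUTROOTFH request header might look like, not bytes copied from the pcap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative container for the decoded fields (not a kernel type). */
struct compound_hdr {
	char     tag[64];       /* debugging tag, NUL terminated here */
	uint32_t minorversion;
	uint32_t numops;        /* number of operations in the compound */
	uint32_t first_op;      /* operation # of the first operation */
};

/* Read one big-endian 32-bit XDR word, advancing *off. */
static int
get_u32(const uint8_t *buf, size_t len, size_t *off, uint32_t *val)
{
	if (*off > len || len - *off < 4)
		return (-1);
	*val = ((uint32_t)buf[*off] << 24) | ((uint32_t)buf[*off + 1] << 16) |
	    ((uint32_t)buf[*off + 2] << 8) | (uint32_t)buf[*off + 3];
	*off += 4;
	return (0);
}

/* Decode tag, minor version, op count and first op; 0 on success. */
static int
decode_compound_hdr(const uint8_t *buf, size_t len, struct compound_hdr *h)
{
	size_t off = 0, padded;
	uint32_t taglen;

	if (get_u32(buf, len, &off, &taglen) != 0 || taglen >= sizeof(h->tag))
		return (-1);
	padded = ((size_t)taglen + 3) & ~(size_t)3;  /* XDR pads to 4 bytes */
	if (len - off < padded)
		return (-1);
	memcpy(h->tag, buf + off, taglen);
	h->tag[taglen] = '\0';
	off += padded;
	if (get_u32(buf, len, &off, &h->minorversion) != 0 ||
	    get_u32(buf, len, &off, &h->numops) != 0 || h->numops == 0 ||
	    get_u32(buf, len, &off, &h->first_op) != 0)
		return (-1);
	return (0);
}

/* Made-up sample: tag "Dirpath", minor version 0, 2 ops, first op 24. */
static const uint8_t sample_putrootfh[] = {
	0, 0, 0, 7, 'D', 'i', 'r', 'p', 'a', 't', 'h', 0,
	0, 0, 0, 0,
	0, 0, 0, 2,
	0, 0, 0, 24
};
```

In the bad packet the tag and the first operation # would both read as the RELEASELOCKOWNER values, while everything after them still matches a PUTROOTFH request.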

All I can suggest is that something is stomping on this field on
the stack or there is a memory problem where this stack argument
sits?

Aren't computers fun? rick

> 
> Thanks again.
> 
> 
> 
> 2014-04-14 22:00 GMT+08:00 Rick Macklem < rmacklem at uoguelph.ca > :
> 
> 
> 
> Marcelo Araujo wrote:
> > Hi all,
> > 
> > Has anyone seen this "prob err" before when trying to mount an NFSv4 share?
> > 
> > machine_a# mount -t nfs -o nfsv4 192.168.2.100:/a /mnt/
> > machine_a# mount_nfs: /mnt, : Input/output error
> > machine_a# tail /var/log/messages |grep nfsv4
> > Apr 13 17:03:33 ESSD46B6E kernel: nfsv4 client/server protocol prob err=10036
> > 
> Well, 10036 is NFSERR_BADXDR (they are all in sys/fs/nfs/nfsproto.h).
> This means that the server didn't like the RPC message presented to
> it.
> (I have no idea why that would be the case for machine_a?)
> 
> If you capture packets while attempting the mount, you can look at
> them in wireshark and maybe see how they are trashed. (I just got home,
> so I can take a look at a packet capture if you email it to me as an
> attachment.)
> # tcpdump -s 0 -w mnt.pcap host 192.168.2.100
> - running this on machine_a during the mount attempt should do it (the
> capture ends up in mnt.pcap).
> 
> rick
> 
> 
> > I have another machine with the same settings that can mount the
> > same NFSv4 share successfully.
> > 
> > machine_c# mount -t nfs -o nfsv4 192.168.2.100:/a /mnt/
> > machine_c#
> > 
> > Best Regards,
> > --
> > Marcelo Araujo
> > araujo at FreeBSD.org
> > 
> 
> 
> 
> 
> --
> Marcelo Araujo
> araujo at FreeBSD.org