Linux NFS client and FreeBSD server strangeness

Mike Tancsa mike at sentex.net
Wed Apr 4 18:27:33 UTC 2018


Not sure where the tweaking needs to happen, but I am getting strange
behaviour between a Linux NFS client and a FreeBSD RELENG_11 NFS server.

The FreeBSD server starts with


nfs_client_enable="YES"
nfs_server_enable="YES"


rpcbind_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"
nfs_server_flags="-u -t -n 16"

and on the Linux client I have been trying various options to no avail.
The mount works, but on a straight-up write to the FreeBSD server,
everything is very bursty.  I first noticed this (I think) a few months
ago, when Linux dumps across an NFS mount seemed to take a lot longer and
were getting very bursty.

It seems that if there is a mixture of reads and writes, everything is
pretty fast. But if a client is only writing to the server, something
somewhere is blocking.  Doing something simple like
ls -l /nfsmount
from the client "wakes up" the server/client so that the write stream can
keep going. Otherwise, it does a big blast of writes and then pauses for
several seconds on the dump.

The Linux dump is a simple

/sbin/dump  u -0 -f - / | /bin/bzip2  >/backup/dump-root-0.bz2

Mount is

mount.nfs -o tcp,intr,noatime,vers=3  192.168.yy.xx:/path

If I run ifstat on the FreeBSD nfs server, the traffic pattern looks like

# ifstat -b -i cxl0
       cxl0
 Kbps in  Kbps out
    0.00      0.00
    0.00      0.00
    0.00      0.00
    0.00      0.00
    0.00      0.00
8.12e+06  45127.03
    0.00      0.00
    0.00      0.00
    0.00      0.00
    0.00      0.00
6.04e+06  33525.76
901122.1   4983.72
    0.00      0.00
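
One way to quantify the pauses is to count the runs of idle samples between bursts. The following is a minimal Python sketch, not a polished tool. It uses the inbound column copied from the ifstat output above, and it assumes one sample per second (ifstat's default interval) and that anything below 1 Kbps counts as idle. The function name and the threshold are made up for illustration.

```python
# Inbound/outbound samples copied from the ifstat output above.
SAMPLE = """\
    0.00      0.00
    0.00      0.00
    0.00      0.00
    0.00      0.00
    0.00      0.00
8.12e+06  45127.03
    0.00      0.00
    0.00      0.00
    0.00      0.00
    0.00      0.00
6.04e+06  33525.76
901122.1   4983.72
    0.00      0.00
"""


def idle_runs(text, threshold_kbps=1.0):
    """Return the lengths of consecutive runs of idle inbound samples."""
    runs = []
    current = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip the interface name and column header lines
        try:
            kbps_in = float(fields[0])
        except ValueError:
            continue  # skip anything that is not a numeric sample
        if kbps_in < threshold_kbps:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


print(idle_runs(SAMPLE))
```

If the one-second interval assumption holds, each number in the result is roughly the length of a pause in seconds.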

If I do a bunch of ls -l /nfsmount on the client, e.g.

while true
do
ls -l /backup/ > /dev/null
done

the traffic pattern is


       cxl0
 Kbps in  Kbps out
    0.00      0.00
3.31e+06  18520.03
5.89e+06  32571.52
4.84e+06  28325.71
2.12e+06  19466.56
614727.0  12246.10
874927.6  13557.18
1.06e+06  14386.78
917865.4  13696.87
1.09e+06  14608.64
1.06e+06  14376.12
164077.3   5286.64


A pcap snippet leading up to the stall is attached.
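
The stall can be located in a capture like this by looking for the largest gap between consecutive packet timestamps. Here is a minimal Python sketch of that idea. The timestamps are copied from the attached capture. It assumes tcpdump's default HH:MM:SS.ffffff timestamp format and a capture that does not cross midnight. The function name is hypothetical.

```python
from datetime import datetime

# Timestamps of four consecutive packets from the attached capture.
STAMPS = [
    "13:21:15.462907",
    "13:21:15.499625",
    "13:21:17.476419",
    "13:21:17.476429",
]


def largest_gap(stamps):
    """Return (before, after, seconds) for the largest inter-packet gap."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    return stamps[i], stamps[i + 1], gaps[i]


before, after, gap = largest_gap(STAMPS)
print(f"{gap:.6f} s idle between {before} and {after}")
```

On these four timestamps the gap is a little under two seconds, between the client's bare ACK and its next write request.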

Note, doing something like

dd if=/dev/zero of=/backup/test.bin bs=4096 count=5000000

I can saturate the 10G link and max out the disk on the server

# dd if=/dev/zero of=/backup/test.bin bs=4096 count=5000000
5000000+0 records in
5000000+0 records out
20480000000 bytes (20 GB, 19 GiB) copied, 36.6238 s, 559 MB/s

and it's a pretty steady stream, unlike the dump.  Any ideas what's going
on and how I might be able to work around this?
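
For comparison with the ifstat figures, the dd result above can be converted to a line rate. This is a small Python sketch using only the numbers dd printed. It counts payload bytes and ignores NFS/TCP/IP overhead, so the on-wire rate would be somewhat higher.

```python
# Figures copied from the dd output above.
bytes_copied = 20_480_000_000
seconds = 36.6238

# Decimal megabytes per second, the unit dd reports.
mb_per_s = bytes_copied / seconds / 1e6
# Payload bits per second, expressed in Gbit/s (no protocol overhead).
gbit_per_s = bytes_copied * 8 / seconds / 1e9

print(f"{mb_per_s:.0f} MB/s, about {gbit_per_s:.2f} Gbit/s of payload")
```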


192.168.xx.yy:/zbackup1/virtbox4b/backup on /backup type nfs
(rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.242.254,mountvers=3,mountport=774,mountproto=tcp,local_lock=none,addr=192.168.yy.xx)



	---Mike


-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, mike at sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada
-------------- next part --------------
13:21:15.457484 IP (tos 0x0, ttl 64, id 56097, offset 0, flags [DF], proto TCP (6), length 176)
    192.168.client.937 > 192.168.server.2049: Flags [P.], cksum 0x8ca8 (correct), seq 87513292:87513416, ack 109621, win 5632, options [nop,nop,TS val 1048683 ecr 500103872], length 124: NFS request xid 468733850 120 commit fh Unknown/B30BF41EDE48EC5C0A000B0000000000847D82010000000000000000 0 bytes @ 0
13:21:15.457502 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 208, bad cksum 0 (->d3ce)!)
    192.168.server.2049 > 192.168.client.937: Flags [P.], cksum 0x671c (incorrect -> 0x4109), seq 109621:109777, ack 87513416, win 29127, options [nop,nop,TS val 500103873 ecr 1048683], length 156: NFS reply xid 468733850 reply ok 152 commit PRE: sz 416022528 mtime 1522862475.456904000 ctime 1522862475.456904000 POST: REG 644 ids 28767/0 sz 416022528 nlink 1 rdev 102/830341216 fsid 1ef40bb3 fileid b a/m/ctime 1522862359.102177000 1522862475.456904000 1522862475.456904000
13:21:15.462504 IP (tos 0x0, ttl 64, id 56098, offset 0, flags [DF], proto TCP (6), length 164)
    192.168.client.937 > 192.168.server.2049: Flags [P.], cksum 0xa3ba (correct), seq 87513416:87513528, ack 109777, win 5632, options [nop,nop,TS val 1048684 ecr 500103873], length 112: NFS request xid 485511066 108 getattr fh Unknown/B30BF41EDE48EC5C0A000900000000006D7D82010000000000000000
13:21:15.462535 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 168, bad cksum 0 (->d3f6)!)
    192.168.server.2049 > 192.168.client.937: Flags [P.], cksum 0x66f4 (incorrect -> 0xccb2), seq 109777:109893, ack 87513528, win 29127, options [nop,nop,TS val 500103878 ecr 1048684], length 116: NFS reply xid 485511066 reply ok 112 getattr REG 644 ids 28767/0 sz 4096000000
13:21:15.462669 IP (tos 0x0, ttl 64, id 56099, offset 0, flags [DF], proto TCP (6), length 164)
    192.168.client.937 > 192.168.server.2049: Flags [P.], cksum 0x89d1 (correct), seq 87513528:87513640, ack 109893, win 5632, options [nop,nop,TS val 1048684 ecr 500103878], length 112: NFS request xid 502288282 108 getattr fh Unknown/B30BF41EDE48EC5C0A000A0000000000847D82010000000000000000
13:21:15.462713 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 168, bad cksum 0 (->d3f6)!)
    192.168.server.2049 > 192.168.client.937: Flags [P.], cksum 0x66f4 (incorrect -> 0x12fd), seq 109893:110009, ack 87513640, win 29127, options [nop,nop,TS val 500103878 ecr 1048684], length 116: NFS reply xid 502288282 reply ok 112 getattr REG 644 ids 28767/0 sz 3045
13:21:15.462845 IP (tos 0x0, ttl 64, id 56100, offset 0, flags [DF], proto TCP (6), length 188)
    192.168.client.937 > 192.168.server.2049: Flags [P.], cksum 0xe2db (correct), seq 87513640:87513776, ack 110009, win 5632, options [nop,nop,TS val 1048684 ecr 500103878], length 136: NFS request xid 519065498 132 readdirplus fh Unknown/B30BF41EDE48EC5C0A00080000000000547D82010000000000000000 4096 bytes @ 0 max 32768 verf 0000000057467a00
13:21:15.462907 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 968, bad cksum 0 (->d0d6)!)
    192.168.server.2049 > 192.168.client.937: Flags [P.], cksum 0x6a14 (incorrect -> 0xc83b), seq 110009:110925, ack 87513776, win 29127, options [nop,nop,TS val 500103878 ecr 1048684], length 916: NFS reply xid 519065498 reply ok 912 readdirplus POST: DIR 700 ids 28767/0 sz 5 nlink 2 rdev 0/9 fsid 1ef40bb3 fileid 8 a/m/ctime 1522862475.462872000 1522862359.102181000 1522862359.102181000 verf 0000000057467a00
13:21:15.499625 IP (tos 0x0, ttl 64, id 56101, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0xe242 (correct), seq 87513776, ack 110925, win 5632, options [nop,nop,TS val 1048694 ecr 500103878], length 0
13:21:17.476419 IP (tos 0x0, ttl 64, id 56102, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0xd6f1 (correct), seq 87513776:87522724, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500103878], length 8948: NFS request xid 535842714 8944 write fh Unknown/B30BF41EDE48EC5C0A000B0000000000847D82010000000000000000 131072 (131072) bytes @ 416022528 <unstable>
13:21:17.476429 IP (tos 0x0, ttl 64, id 56103, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0x07e8 (correct), seq 87522724:87531672, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500103878], length 8948
13:21:17.476449 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52, bad cksum 0 (->d46a)!)
    192.168.server.2049 > 192.168.client.937: Flags [.], cksum 0x6680 (incorrect -> 0x37df), seq 110925, ack 87531672, win 28847, options [nop,nop,TS val 500105892 ecr 1049188], length 0
13:21:17.476451 IP (tos 0x0, ttl 64, id 56104, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0x9a0c (correct), seq 87531672:87540620, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500103878], length 8948
13:21:17.476453 IP (tos 0x0, ttl 64, id 56105, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0x9285 (correct), seq 87540620:87549568, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500103878], length 8948
13:21:17.476455 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52, bad cksum 0 (->d46a)!)
    192.168.server.2049 > 192.168.client.937: Flags [.], cksum 0x6680 (incorrect -> 0xf30e), seq 110925, ack 87549568, win 28567, options [nop,nop,TS val 500105892 ecr 1049188], length 0
13:21:17.476455 IP (tos 0x0, ttl 64, id 56106, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0x1ebe (correct), seq 87549568:87558516, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500103878], length 8948
13:21:17.476457 IP (tos 0x0, ttl 64, id 56107, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0x0b17 (correct), seq 87558516:87567464, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500103878], length 8948
13:21:17.476459 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52, bad cksum 0 (->d46a)!)
    192.168.server.2049 > 192.168.client.937: Flags [.], cksum 0x6680 (incorrect -> 0xae3d), seq 110925, ack 87567464, win 28288, options [nop,nop,TS val 500105892 ecr 1049188], length 0
13:21:17.476460 IP (tos 0x0, ttl 64, id 56108, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0x0ee3 (correct), seq 87567464:87576412, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500103878], length 8948
13:21:17.476463 IP (tos 0x0, ttl 64, id 56109, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0x3657 (correct), seq 87576412:87585360, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500103878], length 8948
13:21:17.476465 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52, bad cksum 0 (->d46a)!)
    192.168.server.2049 > 192.168.client.937: Flags [.], cksum 0x6680 (incorrect -> 0x696d), seq 110925, ack 87585360, win 28008, options [nop,nop,TS val 500105892 ecr 1049188], length 0
13:21:17.476466 IP (tos 0x0, ttl 64, id 56110, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0x2539 (correct), seq 87585360:87594308, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500103878], length 8948
13:21:17.476483 IP (tos 0x0, ttl 64, id 56111, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0x3399 (correct), seq 87594308:87603256, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500103878], length 8948
13:21:17.476485 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52, bad cksum 0 (->d46a)!)
    192.168.server.2049 > 192.168.client.937: Flags [.], cksum 0x6680 (incorrect -> 0x203e), seq 110925, ack 87603256, win 28847, options [nop,nop,TS val 500105892 ecr 1049188], length 0
13:21:17.476567 IP (tos 0x0, ttl 64, id 56112, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0xf799 (correct), seq 87603256:87612204, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500105892], length 8948
13:21:17.476571 IP (tos 0x0, ttl 64, id 56113, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0x6a12 (correct), seq 87612204:87621152, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500105892], length 8948
13:21:17.476574 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52, bad cksum 0 (->d46a)!)
    192.168.server.2049 > 192.168.client.937: Flags [.], cksum 0x6680 (incorrect -> 0xdb6d), seq 110925, ack 87621152, win 28567, options [nop,nop,TS val 500105892 ecr 1049188], length 0
13:21:17.476575 IP (tos 0x0, ttl 64, id 56114, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0x1f3c (correct), seq 87621152:87630100, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500105892], length 8948
13:21:17.476578 IP (tos 0x0, ttl 64, id 56115, offset 0, flags [DF], proto TCP (6), length 9000)
    192.168.client.937 > 192.168.server.2049: Flags [.], cksum 0xa18c (correct), seq 87630100:87639048, ack 110925, win 5632, options [nop,nop,TS val 1049188 ecr 500105892], length 8948
13:21:17.476580 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52, bad cksum 0 (->d46a)!)

