Why is NFSv4 so slow?
Rick C. Petty
rick-freebsd2009 at kiwi-computer.com
Mon Jun 28 14:20:26 UTC 2010
On Sun, Jun 27, 2010 at 09:58:53PM -0700, Jeremy Chadwick wrote:
> >
> > Again, my ports tree is mounted as FSType nfs with option nfsv4.
> > FreeBSD/amd64 8.1-PRERELEASE r208408M GENERIC kernel.
>
> This sounds like NFSv4 is "tickling" some kind of bug in your NIC driver
> but I'm not entirely sure. Can you provide output from:
>
> 1) ifconfig -a (you can X out the IPs + MACs if you want)
On the NFSv4 server:
nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=8010b<RXCSUM,TXCSUM,VLAN_MTU,TSO4,LINKSTATE>
ether 00:22:15:b4:2d:XX
inet 172.XX.XX.4 netmask 0xffffff00 broadcast 172.XX.XX.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=3<RXCSUM,TXCSUM>
inet6 fe80::1 prefixlen 64 scopeid 0x2
inet6 ::1 prefixlen 128
inet 127.0.0.1 netmask 0xff000000
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
On one of the clients:
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=389b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
ether e0:cb:4e:cd:d3:XX
inet 172.XX.XX.9 netmask 0xffffff00 broadcast 172.XX.XX.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=3<RXCSUM,TXCSUM>
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
inet6 ::1 prefixlen 128
inet 127.0.0.1 netmask 0xff000000
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
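Both NICs advertise checksum offload (and nfe0 additionally TSO4). If the theory is that NFSv4 is tickling an offload bug in the driver, a cheap experiment is to switch those features off and re-run the workload. Interface names are taken from the output above; run as root, and note the link may bounce briefly:

```shell
# Offload-elimination test: force checksums and segmentation into
# software so the NIC's hardware offload paths are out of the picture.
ifconfig nfe0 -txcsum -rxcsum -tso    # server side, nfe(4)
ifconfig re0 -txcsum -rxcsum          # client side, re(4); no TSO4 advertised
```

If the slowness disappears with offload disabled, the driver's offload path is implicated; the flags can be made persistent via the ifconfig_nfe0 line in /etc/rc.conf.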
> 2) netstat -m
server:
1739/1666/3405 mbufs in use (current/cache/total)
257/1257/1514/25600 mbuf clusters in use (current/cache/total/max)
256/547 mbuf+clusters out of packet secondary zone in use (current/cache)
0/405/405/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
948K/4550K/5499K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
client:
264/2046/2310 mbufs in use (current/cache/total)
256/1034/1290/25600 mbuf clusters in use (current/cache/total/max)
256/640 mbuf+clusters out of packet secondary zone in use (current/cache)
3/372/375/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
590K/4067K/4657K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
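The main thing to check in this output is the "denied" lines: both machines report 0/0/0 across the board, so mbuf/cluster starvation does not look like the culprit. To keep an eye on those counters while reproducing the slowness, one could run `netstat -m | awk -F'[/ ]' '/denied/ {print $1, $2, $3}'` in a loop; the awk split is just a convenience, demonstrated below on one captured line from above:

```shell
# Extract the three "denied" counters; non-zero values would indicate
# failed mbuf/cluster allocations under load.
line="0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)"
echo "$line" | awk -F'[/ ]' '{print $1, $2, $3}'
```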
> 3) vmstat -i
Server:
interrupt total rate
irq1: atkbd0 24 0
irq18: atapci1 1883933 0
irq20: nfe0 ohci1 1712603504 793
cpu0: timer 4315963536 1999
irq256: hdac0 12 0
irq257: ahci0 139934363 64
cpu2: timer 4315960172 1999
cpu1: timer 4315960172 1999
cpu3: timer 4315960172 1999
Total 19118265888 8858
Client:
interrupt total rate
irq1: atkbd0 1063022 0
irq16: hdac0 16013959 6
irq17: atapci0+++ 6 0
irq18: ohci0 ohci1* 5324486 2
irq19: atapci1 7500968 2
irq20: ahc0 19 0
irq21: ahc1 112390 0
cpu0: timer 5125670841 1999
irq256: hdac1 2 0
irq257: re0 742537149 289
cpu1: timer 5125664297 1999
Total 11023887139 4301
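One detail worth noting above: on the server, nfe0 shares irq20 with ohci1 and is sustaining roughly 793 interrupts/s. To see how much of that is actually NFS traffic (versus the USB controller on the same line), a before/after snapshot around a timed transfer works; the file path below is only an example:

```shell
# Diff the interrupt totals around a reproducible NFS read, so the
# per-device deltas (nfe0 vs. ohci1) can be attributed to the workload.
vmstat -i > /tmp/irq.before
dd if=/mnt/ports-nfs4/README of=/dev/null bs=1m   # example path on the NFSv4 mount
vmstat -i > /tmp/irq.after
diff /tmp/irq.before /tmp/irq.after
```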
> 4) prtconf -lvc (only need the Ethernet-related entries)
I'll assume you meant to type "pciconf". On the server:
nfe0 at pci0:0:10:0: class=0x020000 card=0x82f21043 chip=0x076010de rev=0xa2 hdr=0x00
vendor = 'NVIDIA Corporation'
device = 'NForce Network Controller (MCP78 NIC)'
class = network
subclass = ethernet
cap 01[44] = powerspec 2 supports D0 D1 D2 D3 current D0
cap 05[50] = MSI supports 16 messages, 64 bit, vector masks
cap 08[6c] = HT MSI fixed address window disabled at 0xfee00000
client:
re0 at pci0:1:0:0: class=0x020000 card=0x84321043 chip=0x816810ec rev=0x06 hdr=0x00
vendor = 'Realtek Semiconductor'
device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
class = network
subclass = ethernet
cap 01[40] = powerspec 3 supports D0 D1 D2 D3 current D0
cap 05[50] = MSI supports 1 message, 64 bit enabled with 1 message
cap 10[70] = PCI-Express 2 endpoint IRQ 2 max data 128(256) link x1(x1)
cap 11[b0] = MSI-X supports 4 messages in map 0x20
cap 03[d0] = VPD
> 5) sysctl dev.XXX.N (ex. for em0, XXX=em, N=0)
server:
dev.nfe.0.%desc: NVIDIA nForce MCP77 Networking Adapter
dev.nfe.0.%driver: nfe
dev.nfe.0.%location: slot=10 function=0 handle=\_SB_.PCI0.NMAC
dev.nfe.0.%pnpinfo: vendor=0x10de device=0x0760 subvendor=0x1043 subdevice=0x82f2 class=0x020000
dev.nfe.0.%parent: pci0
dev.nfe.0.process_limit: 192
dev.nfe.0.stats.rx.frame_errors: 0
dev.nfe.0.stats.rx.extra_bytes: 0
dev.nfe.0.stats.rx.late_cols: 0
dev.nfe.0.stats.rx.runts: 0
dev.nfe.0.stats.rx.jumbos: 0
dev.nfe.0.stats.rx.fifo_overuns: 0
dev.nfe.0.stats.rx.crc_errors: 0
dev.nfe.0.stats.rx.fae: 0
dev.nfe.0.stats.rx.len_errors: 0
dev.nfe.0.stats.rx.unicast: 1762645090
dev.nfe.0.stats.rx.multicast: 1
dev.nfe.0.stats.rx.broadcast: 7608
dev.nfe.0.stats.tx.octets: 2036479975330
dev.nfe.0.stats.tx.zero_rexmits: 2090186021
dev.nfe.0.stats.tx.one_rexmits: 0
dev.nfe.0.stats.tx.multi_rexmits: 0
dev.nfe.0.stats.tx.late_cols: 0
dev.nfe.0.stats.tx.fifo_underuns: 0
dev.nfe.0.stats.tx.carrier_losts: 0
dev.nfe.0.stats.tx.excess_deferrals: 0
dev.nfe.0.stats.tx.retry_errors: 0
dev.nfe.0.stats.tx.unicast: 0
dev.nfe.0.stats.tx.multicast: 0
dev.nfe.0.stats.tx.broadcast: 0
dev.nfe.0.wake: 0
client:
dev.re.0.%desc: RealTek 8168/8111 B/C/CP/D/DP/E PCIe Gigabit Ethernet
dev.re.0.%driver: re
dev.re.0.%location: slot=0 function=0
dev.re.0.%pnpinfo: vendor=0x10ec device=0x8168 subvendor=0x1043 subdevice=0x8432 class=0x020000
dev.re.0.%parent: pci1
> check "dmesg" to see if there's any messages the kernel has
> been spitting out which look relevant? Thanks.
server, immediately after restarting all of the NFS-related rc scripts (rpcbind, nfsclient, nfsuserd, nfsserver, mountd, nfsd, statd, lockd, nfscbd):
Jun 27 18:04:44 rpcbind: cannot get information for udp6
Jun 27 18:04:44 rpcbind: cannot get information for tcp6
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
Jun 27 18:05:12 amanda kernel: NLM: failed to contact remote rpcbind, stat = 5, port = 28416
client, when noticing the mounting-over-directories problem:
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
nfsv4 client/server protocol prob err=10020
nfsv4 client/server protocol prob err=10020
...
No other related messages were found in /var/log/messages either.
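Two hedged observations on those log lines: the rpcbind "cannot get information for udp6/tcp6" messages normally just mean no IPv6 transport is configured, and are usually harmless; and if the err=10020 code follows RFC 3530 numbering, it corresponds to NFS4ERR_NOFILEHANDLE. A useful isolation step is to mount the same export over NFSv3 and repeat the slow operation: if v3 is fast across the same NICs, the driver theory weakens and the NFSv4 code path becomes the prime suspect. The mount points below are examples, mirroring the mount options described earlier:

```shell
# Compare the same export over NFSv3 and NFSv4; only the protocol differs.
mount -t nfs -o nfsv3,tcp amanda:/usr/ports /mnt/ports-v3
mount -t nfs -o nfsv4 amanda:/usr/ports /mnt/ports-v4
```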
--
-- Rick C. Petty
More information about the freebsd-stable mailing list