important NFS client patch for FreeBSD8.n
Chris H
chris# at 1command.com
Tue Jan 11 12:20:38 UTC 2011
Hello Jeremy, and thank you for your reply.
On Tue, January 11, 2011 12:17 am, Jeremy Chadwick wrote:
> On Mon, Jan 10, 2011 at 11:40:37PM -0800, Chris H wrote:
>
>> Greetings, and thank you for the "heads up".
>> On Mon, January 10, 2011 2:22 pm, Rick Macklem wrote:
>>
>>> I just committed a patch (r217242) to head. Anyone who is using client
>>> side NFS on FreeBSD8.n should apply this patch. It is also available at:
>>> http://people.freebsd.org/~rmacklem/krpc.patch
>>>
>>>
>>>
>>> It fixes a problem where the kernel rpc assumes that 4 bytes of data
>>> exist in the first mbuf without checking. If the data straddles multiple
>>> mbufs, it uses garbage and then a typical case will wedge for a minute or so
>>> until it times out and establishes a new TCP connection. It also replaces
>>> m_pullup() with m_copydata(), since m_pullup() can fail in rare cases even
>>> when data is available. (m_pullup() uses MGET(, M_DONTWAIT,) which can fail
>>> when mbuf allocation is constrained, for example.)
>>>
>>> Thanks to john.gemignani at isilon.com for spotting this problem, rick
>>>
>>
>> I just fired a message off to @amd64 && @net because I am seeing messages
>> like:
>>
>>
>> nfe0: tx v2 error 0x6204<UNDERFLOW>
>>
>>
>> on a recent 8.1/amd64 install which is connected to an 8.0/i386 via NFS. They
>> both run NFS client && server, and they both utilize mount points on each
>> other. They are only 2 of several interconnected servers. The others are all
>> 7.x/i386. But I only see these messages on the 8.1/amd64, and only when
>> connected to, and utilizing mounts on the 8.0/i386, and even then, only when
>> the data exceeds ~1.5Mb. I guess I'm asking if the messages I'm receiving
>> are related to the corrections your patch provides, or whether I should keep
>> looking for the cause of the messages I am seeing.
>
> The above message is coming from the nfe(4) NIC driver, not from NFS.
> It's possible that NFS tickles some kind of I/O throughput quirk in
> drivers such as nfe(4), given that they're intended for cheap desktops.
Well, I'd argue that point given I'm happily running an AM3 XIII 6-core
4GHz motherboard that is military grade, which /also/ sports the nfe(4).
Oh, and it wasn't cheap. :)
However, the one I'm working with here is only an AM2 with a 2-core.
>
> CC'ing Yong-Hyeon Pyun to assist in debugging/explaining the above
> error.
Yong-Hyeon Pyun kindly responded to my message to @amd64 || @net, and
requested much the same info - which I provided. I /assumed/ that it
was an amd64 issue, as this box is the only amd64 of the lot, or else
an 8.1 issue, as it is also the only 8.1 - the others are all <= 8.0.
After posting/responding @amd64 && @net, I noticed the NFS patch
announcement on @stable, and figured it worth asking about.
>
> In the interim, can you please provide output from the following
> commands:
>
>
> # uname -a
> # dmesg (please include relevant nfe details and miibus)
SEE ATTACHED FILE: dmesg.boot.udns0
> # pciconf -lvcb (please only include nfe-related output)
nfe0@pci0:0:10:0: class=0x068000 card=0x73101462 chip=0x005710de rev=0xf3 hdr=0x00
    vendor     = 'NVIDIA Corporation'
    device     = 'NVIDIA Network Bus Enumerator (CK804)'
    class      = bridge
    bar   [10] = type Memory, range 32, base 0xf9ffb000, size 4096, enabled
    bar   [14] = type I/O Port, range 32, base 0xc080, size 8, enabled
    cap 01[44] = powerspec 2 supports D0 D1 D2 D3 current D0
> # netstat -ind (you can XX-out MACs and/or IPs)
Name   Mtu   Network        Address            Ipkts  Ierrs Idrop Opkts  Oerrs Coll Drop
nfe0   1500  <Link#1>       00:19:db:22:74:87  729801 0     0     529029 182   0    0
nfe0   1500  XXX.XXX.XXX.0  XXX.XXX.XXX.26     695750 -     -     631781 -     -    -
nfe0   1500  fe80:1::219:d  fe80:1::219:dbff:  0      -     -     6      -     -    -
plip0  1500  <Link#2>                          0      0     0     0      0     0    0
lo0    16384 <Link#3>                          315    0     0     315    0     0    0
lo0    16384 127.0.0.0/8    127.0.0.1          313    -     -     313    -     -    -
lo0    16384 ::1/128        ::1                0      -     -     2      -     -    -
lo0    16384 fe80:3::1/64   fe80:3::1          0      -     -     0      -     -    -
> # ifconfig -a (you can XX-out MACs and/or IPs)
nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8010b<RXCSUM,TXCSUM,VLAN_MTU,TSO4,LINKSTATE>
	ether 00:19:db:22:74:87
	inet XXX.XXX.XXX.26 netmask 0xffffffe0 broadcast XXX.XXX.XXX.31
	inet6 fe80::219:dbff:fe22:7487%nfe0 prefixlen 64 scopeid 0x1
	nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
	media: Ethernet autoselect (100baseTX <half-duplex>)
	status: active
plip0: flags=8810<POINTOPOINT,SIMPLEX,MULTICAST> metric 0 mtu 1500
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=3<RXCSUM,TXCSUM>
	inet 127.0.0.1 netmask 0xff000000
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
	nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
>
>
> Thanks.
Thank you again Jeremy, for your thoughtful reply.
--Chris
>
>
> --
> | Jeremy Chadwick jdc at parodius.com |
> | Parodius Networking http://www.parodius.com/ |
> | UNIX Systems Administrator Mountain View, CA, USA |
> | Making life hard for others since 1977. PGP 4BD6C0CB |
>
>
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
>
>
--
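
For readers following the patch description quoted above, here is a minimal
userland sketch of the failure mode Rick describes: the 4-byte RPC record
mark may straddle two buffers, so it has to be copied out of the chain rather
than read from the first buffer directly. This is not the code from r217242.
The struct buf type and the functions chain_copydata() and read_record_mark()
are hypothetical stand-ins for the kernel's mbuf chain and m_copydata(), and
the layout of the mark (high bit = last fragment, low 31 bits = fragment
length, big-endian) is the standard RPC-over-TCP record marking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userland stand-in for one mbuf in a chain. */
struct buf {
	const struct buf *next;		/* next buffer in the chain, or NULL */
	const unsigned char *data;	/* bytes held by this buffer */
	size_t len;			/* number of valid bytes in data */
};

/*
 * Copy len bytes starting at offset off out of the chain, crossing buffer
 * boundaries as needed.  Returns 0 on success, or -1 if the chain holds
 * fewer than off + len bytes (out may then be partially written).
 * No allocation is performed, so it cannot fail for lack of memory.
 */
int
chain_copydata(const struct buf *b, size_t off, size_t len, unsigned char *out)
{
	assert(out != NULL);

	/* Skip whole buffers that lie entirely before the offset. */
	while (b != NULL && off >= b->len) {
		off -= b->len;
		b = b->next;
	}
	while (len > 0) {
		size_t n;

		if (b == NULL)
			return (-1);	/* ran out of data: caller must wait */
		n = b->len - off;
		if (n > len)
			n = len;
		if (n > 0)
			memcpy(out, b->data + off, n);
		out += n;
		len -= n;
		off = 0;
		b = b->next;
	}
	return (0);
}

/*
 * Decode the 4-byte record mark at the head of the chain.  Returns -1
 * without touching the outputs when fewer than 4 bytes have arrived,
 * instead of assuming the first buffer holds all of them.
 */
int
read_record_mark(const struct buf *chain, uint32_t *fraglen, int *last)
{
	unsigned char hdr[4];
	uint32_t mark;

	if (chain_copydata(chain, 0, sizeof(hdr), hdr) != 0)
		return (-1);
	mark = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
	    ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];
	*last = (mark & 0x80000000u) != 0;
	*fraglen = mark & 0x7fffffffu;
	return (0);
}
```

Copying into a small local array avoids both problems named in the patch
description: nothing is read from a buffer that does not hold the bytes, and
no buffer has to be allocated just to make the header contiguous.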
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg.boot.udns0
Type: application/octet-stream
Size: 29534 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20110111/335ba49c/dmesg.boot.obj