Frequent network access freeze (in 7.0)
Unga
unga888 at yahoo.com
Tue Feb 26 11:52:57 UTC 2008
--- Robert Watson <rwatson at FreeBSD.org> wrote:
>
> On Wed, 20 Feb 2008, Unga wrote:
>
> > I'm running 7.0-PRERELEASE (RC2, dated 15/02/2008), compiled from sources on
> > i386 machine (512MB RAM, 3.0GHz, tx0: <SMC EtherPower II 10/100>).
> >
> > Network access freezes very frequently. Cannot ping to any ip address. The
> > only way to get networking working again is reboot.
> >
> > I'm having this problem on 7.0 ever since I tried it from BETA4. I have
> > reported also to this list before but sadly nobody was interested on it.
> >
> > If somebody is interested to look into this problem, I could furnish with
> > more detail and participate in testing.
>
> This sort of problem frequently turns out to be a bug in a device driver or a
> problem with interrupt probing/configuration, so my first guess would be a
> problem with the if_tx driver. The usual starting diagnostics when ping fails
> are to try to use tcpdump to determine whether it's receive or transmit
> failing (or both). Quiet the network between two endpoints as much as you can
> so you can avoid noise from making the dumps more complex, and dump arp and
> icmp at both endpoints. Now try to ping from each end point to the other.
> One potential source of confusion is that ping requires ARP to work, and ARP
> can be a slightly confusing protocol as it usually resolves actively (query,
> response) but sometimes it receives passive updates or extends existing
> entries.
>
> What you want to look for is a packet sent by one side that isn't received by
> the other. You might find, for example, that your host receives packets fine,
> but the packets it transmits are never received. This would be indicative of a
> driver bug in which it fails to properly handle (for example) transmit queues
> filling, and might only trigger under very high load. Or, you might find that
> your host never receives anything the other side transmits, but can send fine.
> This might be indicative of a driver bug involving the receive code, or a
> problem with how interrupts are being handled more generally.
>
> It looks like the last non-routine maintenance to the driver was done by
> Maxime in about 2003; the more recent changes have all been updates to
> newbus/busdma infrastructure, ifnet changes, locking changes, etc. I've CC'd
> him as it sounds like he may have hardware... My advice would be to do the
> above tests and see if you can narrow down whether it's transmit, receive, or
> both failing.
>
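
For reference, the ARP/ICMP capture suggested above could be started roughly
like this on each endpoint. This is only a sketch: the interface name tx0 is
from my machine, and the helper name capture_cmd is made up for illustration.
It prints the tcpdump command rather than running it, because tcpdump
normally needs root.

```shell
#!/bin/sh
# Hypothetical helper: print a tcpdump command that limits the capture
# to ARP and ICMP traffic on a single interface.
#   -n  do not resolve addresses to names (avoids extra lookup traffic)
#   -e  print link-level (Ethernet) headers, which helps when reading ARP
#   -i  interface to capture on
capture_cmd() {
    iface="$1"
    printf 'tcpdump -n -e -i %s arp or icmp\n' "$iface"
}

# Show the command for the interface used in this report.
capture_cmd tx0
```

Run the printed command as root on both machines, then ping in each
direction and compare which packets show up on which side.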
Here are the details when network access is working and when it is not:

When net access is working
--------------------------
$ ifconfig
tx0: flags=108843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,NEEDSGIANT> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 00:e1:20:34:bb:36
        inet 192.168.1.20 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (10baseT/UTP)
        status: active
plip0: flags=108810<POINTOPOINT,SIMPLEX,MULTICAST,NEEDSGIANT> metric 0 mtu 1500
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000
$ netstat -r
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.1.1        UGS         0     1090    tx0
localhost          localhost          UH          0      186    lo0
192.168.1.0        link#1             UC          0        0    tx0
192.168.1.1        00:91:d2:4c:54:f8  UHLW        2        0    tx0    892

Internet6:
Destination        Gateway            Flags      Netif Expire
localhost          localhost          UHL         lo0
fe80::%lo0         fe80::1%lo0        U           lo0
fe80::1%lo0        link#3             UHL         lo0
ff01:3::           fe80::1%lo0        UC          lo0
ff02::%lo0         fe80::1%lo0        UC          lo0

When net access is NOT working
------------------------------
$ ifconfig
tx0: flags=108843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,NEEDSGIANT> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 00:e1:20:34:bb:36
        inet 192.168.1.20 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (10baseT/UTP)
        status: active
plip0: flags=108810<POINTOPOINT,SIMPLEX,MULTICAST,NEEDSGIANT> metric 0 mtu 1500
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000
$ netstat -r
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.1.1        UGS         0     3338    tx0
localhost          localhost          UH          0      204    lo0
192.168.1.0        link#1             UC          0        0    tx0
192.168.1.1        00:91:d2:4c:54:f8  UHLW        2       28    tx0    997
192.168.1.2        link#1             UHLW        1        1    tx0

Internet6:
Destination        Gateway            Flags      Netif Expire
localhost          localhost          UHL         lo0
fe80::%lo0         fe80::1%lo0        U           lo0
fe80::1%lo0        link#3             UHL         lo0
ff01:3::           fe80::1%lo0        UC          lo0
ff02::%lo0         fe80::1%lo0        UC          lo0

tcpdump -i tx0 -v
NOTE: While pinging 192.168.1.1, tcpdump shows no output.

ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
^C
--- 192.168.1.1 ping statistics ---
58 packets transmitted, 0 packets received, 100.0% packet loss

/var/log/messages:
Feb 26 15:26:14 blacktower kernel: tx0: ERROR! Can't stop Rx DMA
Feb 26 15:26:14 blacktower kernel: tx0: promiscuous mode enabled
Note: These two messages keep repeating in /var/log/messages.

/var/log/messages at the time of sending this email:
Feb 26 17:32:17 blacktower kernel: tx0: link state changed to DOWN
Feb 26 17:36:25 blacktower kernel: tx0: link state changed to UP
Feb 26 17:36:30 blacktower kernel: tx0: link state changed to DOWN
Feb 26 17:37:07 blacktower kernel: tx0: link state changed to UP
Feb 26 17:37:14 blacktower kernel: tx0: link state changed to DOWN
Feb 26 17:37:22 blacktower kernel: tx0: link state changed to UP
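
To get a rough idea of how often the link flaps, the link state messages
above can be counted with grep. This is just a sketch: the function name
count_link_events is made up for illustration, and it assumes each message
sits on a single line in the log file, as syslog writes it.

```shell
#!/bin/sh
# Hypothetical helper: read a syslog-style log on stdin and count the
# "link state changed to <STATE>" kernel messages for one interface.
#   $1  interface name, e.g. tx0
#   $2  link state to count, UP or DOWN
count_link_events() {
    iface="$1"
    state="$2"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c "kernel: ${iface}: link state changed to ${state}\$"
}

# Usage (not run here):
#   count_link_events tx0 DOWN < /var/log/messages
```

Comparing the DOWN and UP counts over the same period shows whether the
interface keeps recovering or stays down.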

After a reboot, network access starts working again.
Please let me know what other information is required.

Kind regards
Unga
More information about the freebsd-current
mailing list