[SOLVED] MPCP Opcode Pause and unresponsive computer

David Naylor dbn at freebsd.org
Sun Feb 23 17:51:22 UTC 2014


Hi,

The issue was a hardware error (a faulty memory module).  Once the module
was removed, all symptoms disappeared.

Please see below for specific follow-up comments.

Regards

On Monday, 17 February 2014 11:23:29 Yonghyeon PYUN wrote:
> On Thu, Feb 13, 2014 at 10:01:56PM +0300, David Naylor wrote:
> > Hi,
> > 
> > I recently installed FreeBSD 10.0-RELEASE on a headless Intense-PC.  I am
> > experiencing two network related issues with the computer.
> > 
> > First issue
> > -----------
> > When compiling lang/ruby19 the network freezes.  The build was done
> > directly from the command line using ssh.  After a while ssh reports
> > "Write failed: Broken pipe".  I attached the monitor and no messages were
> > displayed on the output (and the machine was still running).
> > 
> > The Intense-PC does not respond to pings at this point either.  Of note, I
> > was capable of transferring multiple GB of data and successfully compiled
> > other ports but compiling lang/ruby19 messes up everything.
> > 
> > Second issue
> > ------------
> > After a period of uptime (after the freeze from building lang/ruby19) the
> > entire network stops working, nothing is capable of connecting or
> > communicating on the network.  When I do a tcpdump (from a different,
> > affected computer) I find the following:
> > 
> > 20:57:58.254626 MPCP, Opcode Pause, length 46
> > 
> > These messages get repeated a few times a second.  The moment I disconnect
> > the Intense-PC from the network functionality is restored (and is clearly
> > illustrated by the tcpdump).
> > 
> > Information
> > -----------
> > # uname -a
> > FreeBSD dragonbsd 10.0-RELEASE FreeBSD 10.0-RELEASE #0
> > d44ce30(releng/10.0): Sun Feb  9 20:11:55 SAST 2014
> > root at dragon.dg:/tmp/home/freebsd/10.0/src/sys/MODULAR  amd64
> > 
> > # ifconfig
> > lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
> >         options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
> >         inet6 ::1 prefixlen 128
> >         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
> >         inet 127.0.0.1 netmask 0xff000000
> >         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> > 
> > em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> >         options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
> >         ether XX:XX:XX:XX:XX:XX
> >         inet 192.168.0.160 netmask 0xffffff00 broadcast 192.168.0.255
> >         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> >         media: Ethernet autoselect (100baseTX <full-duplex>)
> >         status: active
> > 
> > re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> >         options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
> >         ether XX:XX:XX:XX:XX:XX
> >         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> >         media: Ethernet autoselect (none)
> >         status: no carrier
> > 
> > Any assistance to resolve this issue will be greatly appreciated.
> 
> It's not normal to see pause frames with tcpdump.  If my memory
> serves me right, MAC control frames, which include pause frames,
> should not be passed to the host.  With which network driver do you
> see the above pause frames?  Some drivers like fxp(4) allow passing
> pause frames to the host, but I think that's a bug in the driver.  I
> didn't change that behavior of the driver just because it used to
> enable that feature in the past.

This is what a web search also indicated.  In this case the machine receiving 
pause frames has:
# dmesg | grep 'em0\|re0'
em0: <Intel(R) PRO/1000 Network Connection 7.3.8> port 0xf040-0xf05f mem 
0xf7300000-0xf731ffff,0xf7328000-0xf7328fff irq 20 at device 25.0 on pci0
em0: Using an MSI interrupt
DragonSA at dragon:/tmp> dmesg | grep re0
re0: <RealTek 8169SC/8110SC Single-chip Gigabit Ethernet> port 0xd000-0xd0ff 
mem 0xf7220000-0xf72200ff irq 16 at device 0.0 on pci3
re0: Chip rev. 0x18000000
re0: MAC rev. 0x00000000
miibus0: <MII bus> on re0

# ifconfig bridge0
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        inet 192.168.0.2 netmask 0xffffff00 broadcast 192.168.0.255
        nd6 options=9<PERFORMNUD,IFDISABLED>
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: re0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 3 priority 128 path cost 55
        member: em0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 2 priority 128 path cost 2000000

Could it be that bridge0 is causing the pause frames to be visible?

> I'm not sure what's happening there, but receiving pause frames will
> inhibit sending frames until the pause time expires, so you'll
> not get any response from the host.  You probably need to find out
> which host is sending all these pause frames.  Once you
> identify the guilty host, you have to narrow down what condition
> makes it send pause frames.
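
For anyone chasing a similar problem, here is a rough sketch of how the
advice above could be applied to a raw captured frame: read the source MAC
to find the sending host, and convert the pause time into seconds.  It
assumes the standard IEEE 802.3x PAUSE layout (EtherType 0x8808, opcode
0x0001, a 16-bit pause time counted in quanta of 512 bit times).  The
function names and the sample source MAC are hypothetical, and the 100 Mb/s
link speed is taken from the "100baseTX" media line in the ifconfig output;
this is not how the problem was actually diagnosed here.

```python
import struct

PAUSE_DST = bytes.fromhex("0180c2000001")  # reserved MAC control multicast address
ETHERTYPE_MAC_CONTROL = 0x8808             # EtherType for MAC control frames
OPCODE_PAUSE = 0x0001                      # MAC control opcode for PAUSE
BITS_PER_QUANTUM = 512                     # one pause quantum = 512 bit times


def parse_pause_frame(frame):
    """Return (source_mac, pause_quanta) for a PAUSE frame, else None."""
    if len(frame) < 18:                    # 14-byte header + opcode + pause time
        return None
    src = frame[6:12]
    ethertype, opcode, quanta = struct.unpack("!HHH", frame[12:18])
    if ethertype != ETHERTYPE_MAC_CONTROL or opcode != OPCODE_PAUSE:
        return None
    return ":".join("%02x" % b for b in src), quanta


def pause_seconds(quanta, link_bps):
    """Time the receiver is asked to stop transmitting, in seconds."""
    return quanta * BITS_PER_QUANTUM / link_bps


# Hypothetical frame: made-up source MAC, maximum pause time, and padding
# up to the 46-byte minimum payload (the "length 46" that tcpdump reports).
sample = (PAUSE_DST + bytes.fromhex("020000000001")
          + struct.pack("!HHH", ETHERTYPE_MAC_CONTROL, OPCODE_PAUSE, 0xFFFF)
          + bytes(42))

src_mac, quanta = parse_pause_frame(sample)
print(src_mac, quanta, round(pause_seconds(quanta, 100_000_000), 4))
# prints: 02:00:00:00:00:01 65535 0.3355
```

At 100 Mb/s a maximum-length pause lasts about a third of a second, so a
few such frames per second would be enough to keep a link partner silent
more or less continuously, which would be consistent with the symptoms
described.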

It turns out that the guilty host had a faulty memory module (one that didn't
show up in memtest86+ when run with another module installed).  I've removed
the offending memory module and the incidents have not recurred.


More information about the freebsd-stable mailing list