Diagnosing packet loss
m.seaman at infracaninophile.co.uk
Thu Nov 24 11:43:14 UTC 2011
On 24/11/2011 10:07, Kees Jan Koster wrote:
> This seems to be local to my machine. Here is another reason why I
> say that: I can reliably transmit data when I bind to the aliased IP
> address: If I use mtr to measure packet loss from saffron (the stricken
> machine) to cumin (another machine in a different data center) I see the following:
> saffron (ip address a) -> cumin: packet loss
> saffron (ip address b) -> cumin: no packet loss
> cumin -> saffron (ip address a): packet loss
> cumin -> saffron (ip address b): no packet loss
> This is consistent from running mtr for 5 minutes straight. This to
> me shows that the hardware is fine. Using the alias IP address I can
> run with no packet loss for as long as I like.
> Sooo.... Now what? I am completely at a loss. :-/
Hmm... I wouldn't dismiss hardware problems just yet. Earlier you showed
the ifconfig output for your problem machine:
> [kjkoster at saffron ~]$ ifconfig bge0
> bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> ether 00:e0:81:32:ed:b4
> inet 22.214.171.124 netmask 0xfffffff8 broadcast 126.96.36.199
> inet 188.8.131.52 netmask 0xffffffff broadcast 184.108.40.206
> media: Ethernet autoselect (100baseTX <full-duplex,flowcontrol,rxpause,txpause>)
> status: active
Those two addresses are adjacent: one is odd and the other even, so
they differ in the least significant bit. Can you try
temporarily using two even-numbered addresses and then two odd-numbered
addresses and repeat your mtr tests? If the packet loss problem
correlates with whether the address is even or odd, then I think that's
pretty good evidence for a dud network interface: a one-bit problem in a
memory register somewhere, occasionally flipping the least significant
bit in the address to 0.
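
To make the bit arithmetic concrete, here is a rough sketch in Python. It
assumes the final octets are 165 and 166 and only looks at that last octet;
the function name low_bit_report is made up for illustration. It shows
which of the two values is odd and what each becomes if the least
significant bit is forced to 0.

```python
def low_bit_report(a: int, b: int) -> dict:
    """Compare the final octets of two addresses at the bit level."""
    return {
        "a_bits": format(a, "08b"),        # binary form of the first octet
        "b_bits": format(b, "08b"),        # binary form of the second octet
        "a_is_odd": bool(a & 1),           # least significant bit set?
        "b_is_odd": bool(b & 1),
        "a_with_lsb_cleared": a & ~1,      # value if the LSB were flipped to 0
        "b_with_lsb_cleared": b & ~1,
    }


# Assumed example values: the final octets 165 and 166.
report = low_bit_report(165, 166)
for name, value in report.items():
    print(f"{name}: {value}")
```

An odd octet changes when its lowest bit is cleared (165 becomes 164), so
traffic for it would go astray, while an even octet is unaffected. That is
the reason for suggesting the even/odd test.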
Another test would be to swap the configuration order (i.e. make .166 the
primary address and .165 the alias) -- if it's always the first
configured address that has problems, again that indicates memory
trouble in the hardware.
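
If it helps to keep the two hypotheses apart once you have results, the
following Python sketch checks whether packet loss lines up with address
parity or with configuration position. The helper splits_cleanly and the
record layout are my own invention, and the sample rows are made-up
placeholders, not measurements: substitute what mtr actually shows.

```python
def splits_cleanly(observations, key):
    """Return True if `key` separates lossy from loss-free observations."""
    outcomes = {}
    for obs in observations:
        outcomes.setdefault(key(obs), set()).add(obs["loss"])
    return (
        len(outcomes) == 2                                     # both groups seen
        and all(len(seen) == 1 for seen in outcomes.values())  # each group consistent
        and set().union(*outcomes.values()) == {True, False}   # groups disagree
    )


def is_odd(obs):
    return obs["last_octet"] % 2 == 1


def is_primary(obs):
    return obs["position"] == "primary"


# HYPOTHETICAL placeholder rows for illustration only, not real results.
example = [
    {"last_octet": 165, "position": "primary", "loss": True},
    {"last_octet": 166, "position": "alias", "loss": False},
    {"last_octet": 166, "position": "primary", "loss": False},
    {"last_octet": 165, "position": "alias", "loss": True},
]

parity_explains = splits_cleanly(example, is_odd)
position_explains = splits_cleanly(example, is_primary)
print("loss follows parity:  ", parity_explains)
print("loss follows position:", position_explains)
```

If neither check comes out true for your real numbers, the fault is
probably something other than what I have guessed at here.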
Are these NICs built into your motherboard? If so, they will almost
certainly share a PHY, which is where the problem would be, and why
swapping the cables between interfaces made no difference.
Unfortunately, in that case, fixing the problem means either swapping
out the motherboard or adding a separate NIC to your system.
Hopefully the system is still under warranty.
Dr Matthew J Seaman MA, D.Phil. 7 Priory Courtyard
PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate
JID: matthew at infracaninophile.co.uk Kent, CT11 9PW