Diagnosing packet loss
Kees Jan Koster
kjkoster at gmail.com
Thu Nov 24 10:07:11 UTC 2011
Thank you so much for the excellent suggestions. I can tell some of you have a lot of experience troubleshooting this issue.
At this stage I have ruled out hardware and network issues. These are server-grade network interfaces with new cables, and the ifconfig configuration seems to be in order. netstat shows no collisions or packet errors for the past week or so.
I am dead certain there is no dupe IP. The other machines on the switch are currently off (test and load test box) and I still see packet loss. There simply is no other machine on the subnet that might have the same IP.
This seems to be local to my machine. Here is another reason why I say that: I can reliably transmit data when I bind to the aliased IP address. If I use mtr to measure packet loss between saffron (the stricken machine) and cumin (another machine in a different data center), I see the following:
saffron (ip address a) -> cumin: packet loss
saffron (ip address b) -> cumin: no packet loss
cumin -> saffron (ip address a): packet loss
cumin -> saffron (ip address b): no packet loss
These results stay consistent when I run mtr for 5 minutes straight. To me this shows that the hardware is fine. Using the alias IP address I can run with no packet loss for as long as I like.
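For anyone who wants to repeat the comparison above, here is a rough Python sketch of how the per-source-address runs could be scripted. It assumes an mtr build that supports the --report, --report-cycles, --no-dns and --address options, and that the last hop of the report carries a percentage in its loss column. The function names are my own invention, and the report layout differs between mtr versions, so treat the parsing as a guess rather than a reference.

```python
import re
import subprocess


def mtr_command(source, target, cycles=300):
    """Build an mtr report command bound to one local source address.

    Assumption: 300 cycles at mtr's default one-second interval is
    roughly the 5 minute run described above.
    """
    return [
        "mtr",
        "--report",
        "--report-cycles", str(cycles),
        "--no-dns",
        "--address", source,
        target,
    ]


def last_hop_loss(report_text):
    """Return the loss percentage of the last hop line, or None.

    Assumption: each hop line contains a value such as '12.5%'.
    The header line ('Loss%') has no digits before the percent sign,
    so it is skipped automatically.
    """
    loss = None
    for line in report_text.splitlines():
        match = re.search(r"(\d+(?:\.\d+)?)%", line)
        if match:
            loss = float(match.group(1))
    return loss


def measure(source, target, cycles=300):
    """Run mtr from the given source address and return last-hop loss."""
    result = subprocess.run(
        mtr_command(source, target, cycles),
        capture_output=True,
        text=True,
        check=True,
    )
    return last_hop_loss(result.stdout)
```

Calling measure() once with the primary address and once with the alias address, against the same target, gives the two numbers to compare. The same script run on the remote machine covers the reverse direction.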
Sooo.... Now what? I am completely at a loss. :-/
kjkoster at kjkoster.org
I hate unit tests; I much prefer the illusion that there are no errors in my code.
-- Hendrik Muller