rwatson at FreeBSD.org
Sat Jan 3 03:15:41 PST 2009
On Tue, 30 Dec 2008, Yony Yossef wrote:
>>> I'm facing lots of UDP "Connection refused" errors while running
>>> multistream iperf test. Analyzing it with wireshark showed several "ICMP
>>> Port Unreachable" problems.
>>> I've overridden it with "sysctl net.inet.udp.blackhole=1", but I'm not
>>> sure this is the correct thing to do, I feel like I've swept the problem
>>> under the carpet.
>>> PS - I see similar failures with TCP bidirectional iperf test, it can
>>> also be overridden by "sysctl net.inet.tcp.blackhole=1".
>>> My question is - can it be a NIC problem? If so, how can a driver problem
>>> cause an iperf UDP socket to be in a "non listening state"?
>> This is fairly unlikely to be a NIC problem, although anything is possible
>> in software. I'm not familiar with iperf, but generally speaking ICMP port
>> unreachable is a result of packets arriving at a closed socket;
>> net.inet.udp.blackhole suppresses that ICMP:
>>     if (udp_blackhole)
>>         goto badheadlocked;
>>     if (badport_bandlim(BANDLIM_ICMP_UNREACH) < 0)
>>         goto badheadlocked;
>>     *ip = save_ip;
>>     ip->ip_len += iphlen;
>>     icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_PORT, 0, 0);
>> I think I'd suspect an application bug/feature, in which a socket gets closed
>> and opened during execution and once in a while a datagram is delivered in
>> that window. Perhaps packets are being delivered with a non-trivial delay
>> causing them to arrive after the application has timed out waiting for them?
> I'm talking about a simple multistream UDP iperf test. One stream always
> works fine. More than one UDP stream has a chance of failing because of this
> problem. Wireshark analysis shows no such delay and no packet loss or
> corruption, from what I've seen and understood. On the other hand, same test
> on a 1Gig NIC (I'm using a 10Gig) doesn't suffer from this issue without the
> blackhole assistance.
Hmm. These results are confusing, given the code. Would it be possible for
you to provide a packet trace that includes a few UDP packets and the ICMP
errors they resulted in? While I can't preclude a network stack bug, related
parts of the UDP code are relatively straightforward and well-exercised,
which generally suggests an application-layer bug or interaction. If the
packets are becoming corrupted during driver processing, then the ICMP
rejections should show the packet header the stack saw leading to their
rejection. Also, could you provide sockstat output for the application during
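
To illustrate the point about ICMP rejections showing the header the stack
saw: a port unreachable message quotes the rejected datagram's IP header plus
at least the first 8 bytes of its payload, which for UDP is the whole UDP
header. The following is only a rough sketch of pulling those fields out of a
captured message, not code from the stack or from iperf. The names quoted_udp
and parse_port_unreach are made up for this example, and it assumes IPv4 and a
buffer that starts at the ICMP header rather than at the outer IP header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the fields of the datagram quoted in the error. */
struct quoted_udp {
	uint8_t  src[4];	/* source address the stack saw */
	uint8_t  dst[4];	/* destination address the stack saw */
	uint16_t sport;		/* UDP source port, host byte order */
	uint16_t dport;		/* UDP destination port, host byte order */
};

/* Read a 16-bit big-endian (network order) value. */
static uint16_t
be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Parse an ICMP destination unreachable / port unreachable message.
 * 'icmp' points at the ICMP header; 'len' is the number of bytes available.
 * Returns 0 and fills 'out' on success, -1 if the message is too short or
 * is not a port unreachable quoting an IPv4 UDP datagram.
 */
int
parse_port_unreach(const uint8_t *icmp, size_t len, struct quoted_udp *out)
{
	const uint8_t *ip, *udp;
	size_t ihl;

	if (len < 8 + 20)		/* ICMP header + minimal IP header */
		return (-1);
	if (icmp[0] != 3 || icmp[1] != 3)	/* type 3, code 3 */
		return (-1);

	ip = icmp + 8;			/* quoted IP header follows 8-byte ICMP header */
	if ((ip[0] >> 4) != 4)		/* IPv4 only in this sketch */
		return (-1);
	ihl = (size_t)(ip[0] & 0x0f) * 4;
	if (ihl < 20 || len < 8 + ihl + 8)	/* need the full UDP header too */
		return (-1);
	if (ip[9] != 17)		/* quoted protocol must be UDP */
		return (-1);

	udp = ip + ihl;
	memcpy(out->src, ip + 12, 4);
	memcpy(out->dst, ip + 16, 4);
	out->sport = be16(udp);
	out->dport = be16(udp + 2);
	return (0);
}
```

Comparing the extracted destination port against the ports the application
actually has open would show whether the stack saw the port the sender
intended.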
Robert N M Watson
University of Cambridge