6-CURRENT Network stack issues w/SMP? (Was: Re: TreeList failed:
Network write failure: ChannelMux.ProtocolError)
Robert Watson
rwatson at freebsd.org
Sun Sep 12 13:10:44 PDT 2004
On Sun, 12 Sep 2004, Andre Guibert de Bruet wrote:
> Using an rl-based network card, I am able to transfer data without any
> problems. Any idea who the nge maintainer is?
I'm not sure we have an nge maintainer, but I'm also not sure it has needed
much maintenance (perhaps until now). I believe Bill Paul wrote it,
however. I'm thinking there are a couple of things we should try doing:
- First, we should confirm that Giant really is properly held in some
strategic places in the driver. I.e., slap down GIANT_REQUIRED in a
bunch of interesting-looking places (perhaps the head of most of the
functions). We could be entering the ioctl code w/o Giant, perhaps, or
the watchdog.
- Attempt to identify whether or not the corruption corresponds with other
failure modes that may be present, such as packet loss. Perhaps we're
looking at a problem with reassembly and/or retransmission. It would be
useful to know, for example, if the counters relating to TCP packet loss
go up at about the time corruption occurs.
- We should probably build a test tool to characterize the corruption a
bit better. We could potentially start out just by dd'ing a big file of
zeros through netcat between two hosts using if_nge, and confirm that
the zeros get there in one piece, and then try with more complex data
patterns that would reveal improper ordering, etc.
- For grins, could you try running the same software with TCP SACK turned
off and confirm that the problem is still present?
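
As a rough starting point for the test tool mentioned above, here is a minimal Python sketch of the "more complex data pattern" idea. It is only an illustration, not an existing tool: the names (make_block, generate, verify, verify_bytes) and the 512-byte block size are made up for this example. Each 8-byte word carries the block index and its word number, so the receiver can tell a damaged block from an intact block that arrived in the wrong position.

```python
import io
import struct

BLOCK_SIZE = 512  # hypothetical block size; any multiple of 8 works


def make_block(index, size=BLOCK_SIZE):
    """Build one block: each 8-byte word holds (block index, word number)."""
    return b"".join(
        struct.pack("!II", index & 0xFFFFFFFF, word) for word in range(size // 8)
    )


def generate(num_blocks, size=BLOCK_SIZE):
    """Yield the blocks a sender would write, in order."""
    for index in range(num_blocks):
        yield make_block(index, size)


def verify(stream, size=BLOCK_SIZE):
    """Read blocks from a binary file-like object and report mismatches.

    Returns a list of (block_number, kind, detail) tuples, where kind is:
      "misordered": an intact block from another position (detail = its index)
      "corrupt":    detail = offset of the first differing byte in the block
      "truncated":  a short final block (detail = number of bytes received)
    """
    problems = []
    position = 0
    while True:
        block = stream.read(size)
        if not block:
            break
        expected = make_block(position, size)
        if len(block) < size:
            problems.append((position, "truncated", len(block)))
        elif block != expected:
            # The first four bytes claim which block this is; check the claim.
            found = struct.unpack("!I", block[:4])[0]
            if block == make_block(found, size):
                problems.append((position, "misordered", found))
            else:
                first_bad = next(
                    i for i in range(size) if block[i] != expected[i]
                )
                problems.append((position, "corrupt", first_bad))
        position += 1
    return problems


def verify_bytes(data, size=BLOCK_SIZE):
    """Convenience wrapper for checking data already held in memory."""
    return verify(io.BytesIO(data), size)
```

On the sending side one would write each block from generate() to sys.stdout.buffer and pipe that through netcat; on the receiving side, pipe netcat into a script that calls verify(sys.stdin.buffer). Note that a transfer cut off exactly on a block boundary is not reported by this sketch, so the number of blocks received should also be compared against the number sent.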
Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
robert at fledge.watson.org Principal Research Scientist, McAfee Research