bge driver autoneg failure and system-wide stalls

Emil Mikulic emil at cs.rmit.edu.au
Mon Nov 28 10:07:36 GMT 2005


On Mon, Nov 28, 2005 at 09:17:58AM +0000, Bill Paul wrote:
> > On Fri, Nov 25, 2005 at 04:22:28PM +0300, Gleb Smirnoff wrote:
> > > On Fri, Nov 25, 2005 at 01:20:41PM +1100, Emil Mikulic wrote:
> In your original e-mail, you write:
> 
> > I have a network port with bad wiring in the walls - a cable tester
> > shows only wires 1,2,3 and 6 are actually connected.
> 
> Actually, this is not 'bad' wiring. It's correct for 10/100 ethernet

That was probably the thinking when the office got wired up initially.

> as long as a) the cabling is actually cat5, and not moldy old cat3
> or something, and b) the four wires are actually connected in the right
> sequence. Pins 1 and 2 form one pair, and pins 3 and 6 form the second
> pair. A typical installation may have the orange/orange+white pair
> on pins 1 and 2, and the blue/blue+white pair on 3 and 6. And both
> sides must match. If it's not done this way, then while you may have
> a DC path between all 4 pins on each side, you won't be getting the
> proper noise cancellation effect of twisted pair cabling. This can
> cause signal distortion, dropped packets, and possibly botched autoneg.
> 
> You didn't say if you checked for this though, so we can't speculate
> if this is really the problem.

I can't see what's in the walls but I attached the two parts of the
cable tester to both ends of the line and 1-2-3-6 on one end was 1-2-3-6
on the other end.
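
A continuity tester of this kind only shows that pin N on one end reaches pin N on the other. It does not show which conductors are twisted together, which is the point made in the quoted paragraph. Below is a minimal sketch of that extra check. The `cable_run` model and the function names are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Hypothetical model: pair_of[i] is the ID of the twisted pair that the
 * conductor on pin i+1 belongs to (0 = pin not connected). */
struct cable_run {
    int pair_of[8];
};

/* True if both pins are wired and their conductors share one twisted pair. */
static bool same_pair(const struct cable_run *c, int pin_a, int pin_b)
{
    int a = c->pair_of[pin_a - 1];
    int b = c->pair_of[pin_b - 1];
    return a != 0 && a == b;
}

/* 10/100 wants pins 1+2 on one pair and pins 3+6 on a different pair. */
static bool ok_for_10_100(const struct cable_run *c)
{
    return same_pair(c, 1, 2) && same_pair(c, 3, 6) &&
           c->pair_of[0] != c->pair_of[2];
}
```

A run wired as `{1, 1, 2, 0, 0, 2, 0, 0}` passes. A run wired as `{1, 2, 1, 0, 0, 2, 0, 0}` has the same four pins connected and would look identical on a continuity tester, but fails this check because pins 1 and 2 are on different pairs.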

> A couple things you neglected to mention (and which Gleb failed to
> ask you about):

(yeah, sorry about that.  I spent like an hour with the original
message in a text editor then after I sent it I kept remembering things
I'd forgotten)

> - Exactly what kind of switch is on the other end of this wiring?

The switch is managed, and does gigabit.  It's a Nortel but I don't
know the exact model off by heart (I can look it up tomorrow if it
matters)

> - Is the port that corresponds to this wall jack a gigabit ethernet
>   port, or just 10/100?

All the wiring in the walls is, as you said, for 100Mbit.  The switch
and my network card are gigabit.

> If it is a gigE port, then you're being silly.

Yeah, I thought so.  =(

> 4 pairs are required for gigE. Period. The NWAY autonegotiation
> exchange can take place over just 2 pairs, but the gigE signalling
> scheme requires all 4 pairs to be present in order to establish a
> link. If there's just two pairs connected, both sides will
> announce that they support gigabit speeds, and both sides will try
> configuring themselves for gigE operation, but no link will ever be
> established.
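
The behaviour described above can be put into a small model. This is a simplified sketch, not the real NWAY exchange or the MII register layout: the `CAP_*` bits and both function names are made up for illustration, and it does not model any fallback logic a particular PHY might implement.

```c
#include <stdbool.h>

/* Hypothetical capability bits, for illustration only. */
enum {
    CAP_10   = 1 << 0,
    CAP_100  = 1 << 1,
    CAP_1000 = 1 << 2
};

/* Pick the fastest mode both ends advertise; 0 if nothing is in common. */
static int resolve_speed(unsigned local_adv, unsigned peer_adv)
{
    unsigned common = local_adv & peer_adv;

    if (common & CAP_1000)
        return 1000;
    if (common & CAP_100)
        return 100;
    if (common & CAP_10)
        return 10;
    return 0;
}

/* In this model gigE needs all 4 pairs, while 10/100 need only 2. */
static bool link_comes_up(int speed, int pairs_connected)
{
    if (speed == 0)
        return false;
    return pairs_connected >= (speed == 1000 ? 4 : 2);
}
```

With both ends advertising everything, `resolve_speed()` returns 1000 and `link_comes_up(1000, 2)` is false, which is the situation on this wiring. If one end advertises only `CAP_10 | CAP_100`, the result is 100 and the link comes up over two pairs.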

I understand that gigabit will never work over the current wiring.
I've accepted that and moved on.  =)

But are you saying the autonegotiation will never work over the current
wiring either?  As in, it can't try a slower link speed?

> If you manually override the autonegotiation in this case, you should
> do "ifconfig bge0 media 100baseTX" only. Do not specify full duplex.
> This won't work.
>
> [...]
>
> If you manually specify full, this will create a duplex mismatch, and
> you'll get rotten throughput.

10baseT/UTP and 100baseTX worked in both half-duplex and full-duplex
but I didn't check throughput...

Important bit:

> Also, the DELAY(10) here can probably be replaced with a tsleep() or
> something, which will allow the CPU to do other work while waiting for
> the PHY instead of hard busywaiting and blocking up the whole system
> (allowing a reschedule here should not hurt).

This would be cool!
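
To illustrate the idea, here is a userland analogue of a poll loop that sleeps between polls instead of spinning. It is only a sketch: `fake_phy`, `phy_ready()` and `wait_for_phy_sleeping()` are hypothetical names, not bge(4) code, and `nanosleep()` stands in for whatever sleep primitive the kernel would use. Whether the driver is allowed to sleep at that point is a separate question that this sketch does not address.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for nanosleep() */
#endif
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a PHY that becomes ready after some polls. */
struct fake_phy {
    int polls_until_ready;
};

static bool phy_ready(struct fake_phy *p)
{
    if (p->polls_until_ready > 0) {
        p->polls_until_ready--;
        return false;
    }
    return true;
}

/* Poll up to max_tries times, sleeping between polls so the scheduler
 * can run other work. Returns the number of polls used, or -1 on timeout. */
static int wait_for_phy_sleeping(struct fake_phy *p, int max_tries)
{
    const struct timespec pause = { 0, 10 * 1000 };   /* 10 microseconds */

    for (int i = 0; i < max_tries; i++) {
        if (phy_ready(p))
            return i + 1;
        nanosleep(&pause, NULL);   /* gives up the CPU, unlike a spin loop */
    }
    return -1;
}
```

The loop is still bounded by `max_tries`, so a PHY that never answers ends in a timeout rather than a hang. The difference from a busy-wait is only in what the CPU does between polls.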

I realise that I am doing stupid things with broken wiring and that it
won't work, but if the "periodic lockup" problem could be fixed, that
would be an improvement to the bge(4) driver, IMO.

When I first noticed this symptom, it took quite a while to go from
"FreeBSD is doing this annoying lockup thing every few seconds and I
keep losing keypresses" to "oh, it's the network card / cabling /
driver".

> If the switch is managed, and you have the password to it, you can try
> programming it to only announce 10/100 support on that port until such
> time as you can recable the place for gigE.

As I wrote in the original message, I'm currently connecting the machine
directly to the switch (fortunately it's right above my desk).  And the
place -is- going to get re-cabled sometime soon, which is cool.

It's just that I was in a situation where I could reproduce a problem I
had some time ago, and hoped that I could use the opportunity to get it
fixed.

Also, thanks a lot for the autoneg explanation, Bill.

--Emil

