em driver testing

Nikolay Pavlov quetzal at zone3000.net
Wed Nov 8 16:47:40 UTC 2006


On Wednesday,  8 November 2006 at  7:41:02 -0800, Jeremy Chadwick wrote:
> On Wed, Nov 08, 2006 at 04:40:03PM +0200, Nikolay Pavlov wrote:
> > Well i have 5.5 box with very similar symptomatic :)
> > I do not see watchdog timeouts on it, but a lot of UP/DOWN events.
> 
> Are you sure this is the same problem as what's being discussed
> here?  If you revert to a previous kernel or em driver, does the
> problem (link up/down) go away?  Are you sure you don't actually
> have a flaky cable or RJ45 connector?  What does the switch your
> NIC is connected to say? (does it show link going up and down)

I am pretty sure. All my servers use the same em chip, and on all my 6.1
boxes, either UP or SMP, I see watchdog timeouts. The average load on these
adapters is 5000 - 6000 interrupts per second. I have only one box with
5.5 (same task and same platform), but I am not claiming that this is
exactly the watchdog problem; the symptoms are just very similar in the
context of this discussion. In any case, the new 6.2 em patch works for me:
at least I do not see watchdog timeouts after 48 hours of uptime.

By the way, the box is connected to a 2950 switch, and I can't find any
problems with the cabling.

Here is what it looks like on 5.5:

Oct 18 05:38:45 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 05:38:50 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 05:39:21 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 05:39:32 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: is alive again
Oct 18 05:52:22 ms6 kernel: em0: Link is Down
Oct 18 05:55:13 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 05:55:13 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 05:55:44 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 05:55:46 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: is alive again
Oct 18 06:01:52 ms6 kernel: em0: Link is Down
Oct 18 06:03:54 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 06:03:54 ms6 kernel: em0: Link is Down
Oct 18 06:04:01 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 06:16:07 ms6 kernel: em0: Link is Down
Oct 18 06:18:16 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 06:21:55 ms6 kernel: em0: Link is Down
Oct 18 06:25:12 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 06:25:25 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 06:25:27 ms6 kernel: em0: Link is Down
Oct 18 06:25:33 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 06:25:43 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 06:26:10 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: is alive again
Oct 18 06:43:12 ms6 kernel: em0: Link is Down
Oct 18 06:45:13 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 06:45:44 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 06:46:15 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 06:46:27 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 06:46:28 ms6 kernel: em0: Link is Down
Oct 18 06:46:34 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 06:46:46 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 06:47:17 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 06:47:26 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: is alive again
Oct 18 07:02:51 ms6 kernel: em0: Link is Down
Oct 18 07:04:42 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 07:04:44 ms6 kernel: em0: Link is Down
Oct 18 07:04:50 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 07:05:13 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 18 07:05:25 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: is alive again
Oct 18 16:40:05 ms6 kernel: receive error 60 from nfs server 206.53.x.x:/usr/home/shared
Oct 19 03:55:13 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: not responding
Oct 19 03:55:15 ms6 kernel: nfs server 206.53.x.x:/usr/home/shared: is alive again

After that date the box was rebooted at least three times, and I don't
see such symptoms any more.
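
For anyone who wants to compare their own logs against the excerpt above, here is a minimal sketch that pairs each "Link is Down" message with the next "Link is up" and reports how long the link stayed down. It is only an illustration: the function names (link_events, outages) are made up for this example, the year is assumed to be 2006 because syslog lines carry no year, and the month parsing assumes an English (C) locale.

```python
import re
from datetime import datetime

# Matches kernel link messages such as:
#   Oct 18 05:52:22 ms6 kernel: em0: Link is Down
LINK_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"(?P<ifname>\w+):\s+Link is (?P<state>up|down)",
    re.IGNORECASE,
)


def link_events(lines, year=2006):
    """Yield (timestamp, interface, state) for each link state message.

    The year is an assumption supplied by the caller, since the
    traditional syslog timestamp does not include one.
    """
    for line in lines:
        m = LINK_RE.match(line)
        if m:
            ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
            yield ts, m.group("ifname"), m.group("state").lower()


def outages(events):
    """Pair each 'down' with the next 'up'; return outage lengths in seconds."""
    result = []
    down_at = None
    for ts, _ifname, state in events:
        if state == "down":
            down_at = ts
        elif state == "up" and down_at is not None:
            result.append((ts - down_at).total_seconds())
            down_at = None
    return result


# A few lines taken from the excerpt above.
SAMPLE = """\
Oct 18 05:38:45 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 05:52:22 ms6 kernel: em0: Link is Down
Oct 18 05:55:13 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
Oct 18 06:01:52 ms6 kernel: em0: Link is Down
Oct 18 06:03:54 ms6 kernel: em0: Link is up 1000 Mbps Full Duplex
""".splitlines()

if __name__ == "__main__":
    durations = outages(link_events(SAMPLE))
    print(len(durations), "outages, seconds down:", durations)
```

Feeding it /var/log/messages instead of SAMPLE should give a quick picture of whether the link drops are rare blips or minutes-long outages like the ones on this box.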

> 
> I feel horrible for both Scott and Jack -- I think there's tons
> of people coming out of the woodwork with "ME TOO" comments who
> may in fact be suffering from other problems, and are looking for
> a scapegoat thread.

Just ignore me. The patch works for me, and that is the end of it.

-- 
======================================================================  
- Best regards, Nikolay Pavlov. <<<-----------------------------------    
======================================================================  



More information about the freebsd-stable mailing list