em device hangs on ifconfig alias ...

User Freebsd freebsd at hub.org
Fri Jul 7 21:18:46 UTC 2006


On Fri, 7 Jul 2006, Atanas wrote:

> Robert Watson said the following on 7/7/06 7:17 AM:
>>> > I just left a "tcpdump -n arp host 10.10.64.40" on a third machine
>>> > sniffing around and tested all em module versions I had (the stock 6.1,
>>> > 6-STABLE and 6-STABLE with your patch), but got silence on all three:
>>> 
>>> That's odd. I've tested it on CURRENT and I could see the ARP packet. Are 
>>> you sure you patched correctly? If so I have to build a RELENG_6 machine 
>>> and give it a try.
>> 
>> Is it possible you're seeing an interaction between the reset generated as 
>> part of IP address changing, and the time it takes to negotiate link?  It's 
>> possible that the arp packets are being eaten during the link negotiation, 
>> so on systems that negotiate quickly (or not at all) the arp packet is 
>> seen on other hosts, and otherwise it is not...
>> 
> Looks like this is exactly what happens.
>
> I was able to see it by running two tcpdump instances - one on the EM machine 
> running in background and another running elsewhere on the same subnet.
>
> So on the EM machine the arp packet actually gets generated by em(4) and 
> caught by the tcpdump running there:
>
> EM# tcpdump -n arp and ether src 00:04:23:b5:1b:ff &
> EM#
> EM# ifconfig em1 inet alias 10.10.64.40
> EM# 11:28:37.178946 arp who-has 10.10.64.40 tell 10.10.64.40
> EM#
>
> But it doesn't reach the other tcpdump instance running on another host. It 
> seems that the arp packet gets killed before leaving the EM machine, due to 
> the card initialization or something else.
>
> I tried sending it manually with arping, just to make sure both tcpdumps 
> operate properly and yes, the packet got delivered to both.
>
> I think that I have patched, built and loaded the em(4) kernel module 
> correctly. After applying the patch there were no rejects, before building 
> the module I intentionally appended " (patched)" to its version string in 
> if_em.c, and could see that in dmesg every time I loaded the module:
> em1: <Intel(R) PRO/1000 Network Connection Version - 3.2.18 (patched)>
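
For reference, the packet tcpdump reports above ("arp who-has 10.10.64.40
tell 10.10.64.40") is a gratuitous ARP request: sender and target IP are the
same address. A minimal sketch of that frame's layout follows, using the MAC
and IP from the transcript. The function name gratuitous_arp_frame is
hypothetical, and the zeroed target hardware address is an assumption about
what the stack fills in; this only builds the bytes and sends nothing.

```python
import socket
import struct


def gratuitous_arp_frame(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for `ip`."""
    src_mac = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: broadcast destination, our source MAC, EtherType ARP.
    eth = broadcast + src_mac + struct.pack("!H", 0x0806)

    # Fixed ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)

    # Sender MAC/IP, then target MAC (zeroed here, an assumption) and a
    # target IP equal to the sender IP, which is what makes it gratuitous.
    arp += src_mac + ip_bytes + b"\x00" * 6 + ip_bytes
    return eth + arp


frame = gratuitous_arp_frame("00:04:23:b5:1b:ff", "10.10.64.40")
print(len(frame), frame.hex())
```

The result is 42 bytes before any Ethernet padding; comparing it against a
capture written with "tcpdump -w" on each side is one way to confirm that both
sniffers are looking for the same frame.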

Is it possible that we're going at this issue backwards?  Maybe it isn't the 
lack of an ARP packet going out that is causing the problems with moving IPs, 
but the delay that we're seeing when aliasing a new IP on the stack?  The 
ARP packet *is* being attempted, but it times out before the re-init 
completes?
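
One way to check the timing theory would be to measure how long the link
takes to come back after the alias is added. The sketch below is only an
illustration: it assumes ifconfig prints a "status: active" line once link is
up, and the helper names (link_status, read_ifconfig, seconds_until_active)
are made up for this example.

```python
import re
import subprocess
import time
from typing import Callable, Optional


def link_status(ifconfig_output: str) -> Optional[str]:
    """Return the text after 'status:' in ifconfig output, or None."""
    m = re.search(r"^\s*status:\s*(.+?)\s*$", ifconfig_output, re.MULTILINE)
    return m.group(1) if m else None


def read_ifconfig(ifname: str) -> str:
    """Run ifconfig for one interface and return its output."""
    result = subprocess.run(
        ["ifconfig", ifname], capture_output=True, text=True, check=True
    )
    return result.stdout


def seconds_until_active(
    read: Callable[[], str], timeout: float = 10.0, interval: float = 0.05
) -> Optional[float]:
    """Poll `read` until the link reports 'active'; None on timeout."""
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        if link_status(read()) == "active":
            return time.monotonic() - start
        time.sleep(interval)
    return None
```

Calling seconds_until_active(lambda: read_ifconfig("em1")) right after the
"ifconfig em1 inet alias ..." command would give a rough figure for the
re-init window. If a manual arping sent after that window reaches the other
host while the automatic ARP never does, that would point at the re-init
delay rather than at the ARP generation itself.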

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email . scrappy at hub.org                              MSN . scrappy at hub.org
Yahoo . yscrappy               Skype: hub.org        ICQ . 7615664


More information about the freebsd-stable mailing list