From imp at bsdimp.com Wed Apr 1 00:34:36 2009
From: imp at bsdimp.com (M. Warner Losh)
Date: Wed Apr 1 00:34:43 2009
Subject: Small change to ukphy
Message-ID: <20090401.013246.-1253043078.imp@bsdimp.com>

I've encountered a number of PHY chips that need auto negotiation
kicked off to come out of ISO state. This makes sense, because the
ukphy driver never seems to take the PHY out of isolation state
otherwise.

Index: ukphy.c
===================================================================
--- ukphy.c (revision 190463)
+++ ukphy.c (working copy)
@@ -146,6 +146,7 @@
        sc->mii_phy = ma->mii_phyno;
        sc->mii_service = ukphy_service;
        sc->mii_pdata = mii;
+       sc->mii_flags |= MIIF_FORCEANEG;

        mii->mii_instance++;


This forces auto negotiation. The reason for this is that it takes it
out of ISO state (Isolate). Once out of that state, things work
well. The question I have is will we properly go back into ISO state
for PHYs that should be isolated.

NetBSD has many of its NIC drivers setting this flag. Their APIs
allow them to set this directly at mii attach time. Ours don't, so
none of our drivers set this flag.

The other fix for this might be:
Index: mii_physubr.c
===================================================================
--- mii_physubr.c (revision 190463)
+++ mii_physubr.c (working copy)
@@ -113,7 +113,9 @@
        int bmcr, anar, gtcr;

        if (IFM_SUBTYPE(ife->ifm_media) == IFM_AUTO) {
-               if ((PHY_READ(sc, MII_BMCR) & BMCR_AUTOEN) == 0 ||
+               bmcr = PHY_READ(sc, MII_BMCR);
+               if ((bmcr & BMCR_AUTOEN) == 0 ||
+                   (bmcr & BMCR_ISO) ||
                    (sc->mii_flags & MIIF_FORCEANEG))
                        (void) mii_phy_auto(sc);
                return;

Which says that if auto negotiation is enabled, and ISO is set to go
ahead and kick off an auto negotiation. I'm less sure of this path,
but it is an alternative. Otherwise, we never write to the BMCR to
take the device out of isolation. If there's a better place to do
this, then I'm all ears.

Either one of these hacks make several PC Cards that I have start to
work...
In fact, I'm starting to approach 100% (up from 50%) of my
ed-based PC Cards working with this simple change (and others to the
ed driver). I know that these cards are a little behind the leading
edge, but I'd like to get them working since I've put a few hours into
investigating things here.

Comments?

Warner

From pyunyh at gmail.com Wed Apr 1 03:36:33 2009
From: pyunyh at gmail.com (Pyun YongHyeon)
Date: Wed Apr 1 03:36:40 2009
Subject: Small change to ukphy
In-Reply-To: <20090401.013246.-1253043078.imp@bsdimp.com>
References: <20090401.013246.-1253043078.imp@bsdimp.com>
Message-ID: <20090401100939.GB12246@michelle.cdnetworks.co.kr>

On Wed, Apr 01, 2009 at 01:32:46AM -0600, M. Warner Losh wrote:
> I've encountered a number of PHY chips that need auto negotiation
> kicked off to come out of ISO state. This makes sense, because the
> ukphy driver never seems to take the PHY out of isolation state
> otherwise.
>
> Index: ukphy.c
> ===================================================================
> --- ukphy.c (revision 190463)
> +++ ukphy.c (working copy)
> @@ -146,6 +146,7 @@
>         sc->mii_phy = ma->mii_phyno;
>         sc->mii_service = ukphy_service;
>         sc->mii_pdata = mii;
> +       sc->mii_flags |= MIIF_FORCEANEG;
>
>         mii->mii_instance++;
>
>
> This forces auto negotiation. The reason for this is that it takes it
> out of ISO state (Isolate). Once out of that state, things work

If the purpose is to take the PHY out of the isolated state, couldn't
this be handled in the ifm_change_cb_t handler of the parent interface?
I guess the callback can reset the PHY and a subsequent mii_mediachg()
call may start auto-negotiation.

> well. The question I have is will we properly go back into ISO state
> for PHYs that should be isolated.
>

If the PHY requires special handling for ISO state in reset it may
need a separate PHY driver, as ukphy(4) does not set MIIF_NOISOLATE.
As you said it would be really great if we had a generic way to
pass various MII flags or driver specific information to mii(4).

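For readers following the thread, the condition in the proposed
mii_physubr.c change can be modelled outside the kernel. The sketch below
is only an illustration under stated assumptions: would_start_autoneg()
and check_autoneg_cases() are hypothetical names, the BMCR bit values are
the ones IEEE 802.3 clause 22 assigns, and the MIIF_FORCEANEG value is a
placeholder rather than something taken from miivar.h.

```c
#include <assert.h>
#include <stdint.h>

/* BMCR bits as laid out by IEEE 802.3 clause 22. */
#define BMCR_AUTOEN     0x1000  /* auto-negotiation enable */
#define BMCR_ISO        0x0400  /* isolate */

/* Placeholder value for this sketch only; use the one from miivar.h. */
#define MIIF_FORCEANEG  0x0400

/*
 * Mirrors the patched IFM_AUTO test: returns nonzero when
 * auto-negotiation would be (re)started for this BMCR/flags pair.
 */
int
would_start_autoneg(uint16_t bmcr, unsigned int mii_flags)
{
        return ((bmcr & BMCR_AUTOEN) == 0 ||
            (bmcr & BMCR_ISO) != 0 ||
            (mii_flags & MIIF_FORCEANEG) != 0);
}

/* Walk through the cases discussed in the thread. */
void
check_autoneg_cases(void)
{
        /* Autoneg enabled but PHY isolated: the stuck case, now kicked. */
        assert(would_start_autoneg(BMCR_AUTOEN | BMCR_ISO, 0));
        /* Autoneg enabled, not isolated, no flag: left alone. */
        assert(!would_start_autoneg(BMCR_AUTOEN, 0));
        /* Autoneg disabled: started, as before the patch. */
        assert(would_start_autoneg(0, 0));
        /* MIIF_FORCEANEG set: always started. */
        assert(would_start_autoneg(BMCR_AUTOEN, MIIF_FORCEANEG));
}
```

The first case is the one the ukphy report describes: auto-negotiation is
already enabled, so the unpatched test never writes the BMCR and the ISO
bit stays set.
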
> NetBSD has many of its NIC drivers setting this flag. Their APIs
> allow them to set this directly at mii attach time. Ours don't, so
> none of our drivers set this flag.
>
> The other fix for this might be:
> Index: mii_physubr.c
> ===================================================================
> --- mii_physubr.c (revision 190463)
> +++ mii_physubr.c (working copy)
> @@ -113,7 +113,9 @@
>         int bmcr, anar, gtcr;
>
>         if (IFM_SUBTYPE(ife->ifm_media) == IFM_AUTO) {
> -               if ((PHY_READ(sc, MII_BMCR) & BMCR_AUTOEN) == 0 ||
> +               bmcr = PHY_READ(sc, MII_BMCR);
> +               if ((bmcr & BMCR_AUTOEN) == 0 ||
> +                   (bmcr & BMCR_ISO) ||
>                     (sc->mii_flags & MIIF_FORCEANEG))
>                         (void) mii_phy_auto(sc);
>                 return;
>
> Which says that if auto negotiation is enabled, and ISO is set to go
> ahead and kick off an auto negotiation. I'm less sure of this path,
> but it is an alternative. Otherwise, we never write to the BMCR to
> take the device out of isolation. If there's a better place to do
> this, then I'm all ears.
>
> Either one of these hacks make several PC Cards that I have start to
> work... In fact, I'm starting to approach 100% (up from 50%) of my
> ed-based PC Cards working with this simple change (and others to the
> ed driver). I know that these cards are a little behind the leading
> edge, but I'd like to get them working since I've put a few hours into
> investigating things here.
>
> Comments?
>
> Warner

From imp at bsdimp.com Wed Apr 1 08:37:55 2009
From: imp at bsdimp.com (M. Warner Losh)
Date: Wed Apr 1 08:38:02 2009
Subject: Small change to ukphy
In-Reply-To: <20090401100939.GB12246@michelle.cdnetworks.co.kr>
References: <20090401.013246.-1253043078.imp@bsdimp.com>
 <20090401100939.GB12246@michelle.cdnetworks.co.kr>
Message-ID: <20090401.093740.669301742.imp@bsdimp.com>

In message: <20090401100939.GB12246@michelle.cdnetworks.co.kr>
            Pyun YongHyeon writes:
: On Wed, Apr 01, 2009 at 01:32:46AM -0600, M.
: Warner Losh wrote:
: > I've encountered a number of PHY chips that need auto negotiation
: > kicked off to come out of ISO state. This makes sense, because the
: > ukphy driver never seems to take the PHY out of isolation state
: > otherwise.
: >
: > Index: ukphy.c
: > ===================================================================
: > --- ukphy.c (revision 190463)
: > +++ ukphy.c (working copy)
: > @@ -146,6 +146,7 @@
: >         sc->mii_phy = ma->mii_phyno;
: >         sc->mii_service = ukphy_service;
: >         sc->mii_pdata = mii;
: > +       sc->mii_flags |= MIIF_FORCEANEG;
: >
: >         mii->mii_instance++;
: >
: >
: > This forces auto negotiation. The reason for this is that it takes it
: > out of ISO state (Isolate). Once out of that state, things work
:
: If the purpose is to take PHY out of isolated state couldn't this
: be handled in ifm_change_cb_t handler of parent interface? I guess
: the callback can reset the PHY and subsequent mii_mediachg() call
: may start auto-negotiation.

This callback isn't called. The problem is that the PHY is in ISO
state. Since it is in ISO state with auto negotiation enabled, we
never kick off an explicit auto negotiation, so the state never
changes, so we never get this callback...

: > well. The question I have is will we properly go back into ISO state
: > for PHYs that should be isolated.
: >
:
: If the PHY requires special handling for ISO state in reset it may
: need separated PHY driver as ukphy(4) does not set MIIF_NOISOLATE.
: As you said it would be really great if we have a generic way to
: pass various MII flags or driver specific information to mii(4).

This seems to be a common quirk. I'd hate to have a driver that's
just ukphy but with the one line added above and play whack-a-mole with
all the odd-balls that are out there. Doesn't seem like a strategy
that will win the day.

I think we have a way to do this...
I could do the following in my attach routine:

        mii = device_get_softc(sc->miibus);
        LIST_FOREACH(miisc, &mii->mii_phys, mii_list) {
                miisc->mii_flags |= MIIF_FORCEANEG;
                mii_phy_reset(miisc);
        }
        mii_mediachg(mii);

which is similar to what fxp does in its change routine (it is what I
put in my status change routine). Also MIIF_NOISOLATE works as well.

Is the above too insane?

Warner

: > NetBSD has many of its NIC drivers setting this flag. Their APIs
: > allow them to set this directly at mii attach time. Ours don't, so
: > none of our drivers set this flag.
: >
: > The other fix for this might be:
: > Index: mii_physubr.c
: > ===================================================================
: > --- mii_physubr.c (revision 190463)
: > +++ mii_physubr.c (working copy)
: > @@ -113,7 +113,9 @@
: >         int bmcr, anar, gtcr;
: >
: >         if (IFM_SUBTYPE(ife->ifm_media) == IFM_AUTO) {
: > -               if ((PHY_READ(sc, MII_BMCR) & BMCR_AUTOEN) == 0 ||
: > +               bmcr = PHY_READ(sc, MII_BMCR);
: > +               if ((bmcr & BMCR_AUTOEN) == 0 ||
: > +                   (bmcr & BMCR_ISO) ||
: >                     (sc->mii_flags & MIIF_FORCEANEG))
: >                         (void) mii_phy_auto(sc);
: >                 return;
: >
: > Which says that if auto negotiation is enabled, and ISO is set to go
: > ahead and kick off an auto negotiation. I'm less sure of this path,
: > but it is an alternative. Otherwise, we never write to the BMCR to
: > take the device out of isolation. If there's a better place to do
: > this, then I'm all ears.
: >
: > Either one of these hacks make several PC Cards that I have start to
: > work... In fact, I'm starting to approach 100% (up from 50%) of my
: > ed-based PC Cards working with this simple change (and others to the
: > ed driver). I know that these cards are a little behind the leading
: > edge, but I'd like to get them working since I've put a few hours into
: > investigating things here.
: >
: > Comments?
: >
: > Warner
:

From craft at alacritech.com Wed Apr 1 10:27:56 2009
From: craft at alacritech.com (Peter Craft)
Date: Wed Apr 1 10:28:03 2009
Subject: pNFS
Message-ID: <2ed801c9b2ed$44c6c3c0$510a010a@alacritech.com>

Are there any efforts underway to implement parallel NFS in FreeBSD?

From marius at alchemy.franken.de Wed Apr 1 14:26:22 2009
From: marius at alchemy.franken.de (Marius Strobl)
Date: Wed Apr 1 14:26:29 2009
Subject: Small change to ukphy
In-Reply-To: <20090401100939.GB12246@michelle.cdnetworks.co.kr>
References: <20090401.013246.-1253043078.imp@bsdimp.com>
 <20090401100939.GB12246@michelle.cdnetworks.co.kr>
Message-ID: <20090401211254.GA83780@alchemy.franken.de>

On Wed, Apr 01, 2009 at 07:09:39PM +0900, Pyun YongHyeon wrote:
> On Wed, Apr 01, 2009 at 01:32:46AM -0600, M. Warner Losh wrote:
> > I've encountered a number of PHY chips that need auto negotiation
> > kicked off to come out of ISO state. This makes sense, because the
> > ukphy driver never seems to take the PHY out of isolation state
> > otherwise.
> >
> > Index: ukphy.c
> > ===================================================================
> > --- ukphy.c (revision 190463)
> > +++ ukphy.c (working copy)
> > @@ -146,6 +146,7 @@
> >         sc->mii_phy = ma->mii_phyno;
> >         sc->mii_service = ukphy_service;
> >         sc->mii_pdata = mii;
> > +       sc->mii_flags |= MIIF_FORCEANEG;
> >
> >         mii->mii_instance++;
> >
> >
> > This forces auto negotiation. The reason for this is that it takes it
> > out of ISO state (Isolate). Once out of that state, things work
>
> If the purpose is to take PHY out of isolated state couldn't this
> be handled in ifm_change_cb_t handler of parent interface? I guess
> the callback can reset the PHY and subsequent mii_mediachg() call
> may start auto-negotiation.
>
> > well. The question I have is will we properly go back into ISO state
> > for PHYs that should be isolated.
> >
>
> If the PHY requires special handling for ISO state in reset it may
> need separated PHY driver as ukphy(4) does not set MIIF_NOISOLATE.
> As you said it would be really great if we have a generic way to
> pass various MII flags or driver specific information to mii(4).
>
> > NetBSD has many of its NIC drivers setting this flag. Their APIs
> > allow them to set this directly at mii attach time. Ours don't, so
> > none of our drivers set this flag.
> >
> > The other fix for this might be:
> > Index: mii_physubr.c
> > ===================================================================
> > --- mii_physubr.c (revision 190463)
> > +++ mii_physubr.c (working copy)
> > @@ -113,7 +113,9 @@
> >         int bmcr, anar, gtcr;
> >
> >         if (IFM_SUBTYPE(ife->ifm_media) == IFM_AUTO) {
> > -               if ((PHY_READ(sc, MII_BMCR) & BMCR_AUTOEN) == 0 ||
> > +               bmcr = PHY_READ(sc, MII_BMCR);
> > +               if ((bmcr & BMCR_AUTOEN) == 0 ||
> > +                   (bmcr & BMCR_ISO) ||
> >                     (sc->mii_flags & MIIF_FORCEANEG))
> >                         (void) mii_phy_auto(sc);
> >                 return;
> >
> > Which says that if auto negotiation is enabled, and ISO is set to go
> > ahead and kick off an auto negotiation. I'm less sure of this path,
> > but it is an alternative. Otherwise, we never write to the BMCR to
> > take the device out of isolation. If there's a better place to do
> > this, then I'm all ears.
> >
> > Either one of these hacks make several PC Cards that I have start to
> > work... In fact, I'm starting to approach 100% (up from 50%) of my
> > ed-based PC Cards working with this simple change (and others to the
> > ed driver). I know that these cards are a little behind the leading
> > edge, but I'd like to get them working since I've put a few hours into
> > investigating things here.
> >
> > Comments?
> >

FYI, the idea I had for passing MIIF_DOPAUSE from the NIC
drivers to the PHY drivers as required by the flow-control
support without breaking the ABI was to use device flags.
A proof-of-concept patch with an example application of
that approach is:
http://people.freebsd.org/~marius/mii_flags.diff
One could even OR the flags together in miibus_attach(),
allowing MIIF_FORCEANEG etc. to be additionally set via
hints.

Marius

From brunner at nic-naa.net Wed Apr 1 17:19:32 2009
From: brunner at nic-naa.net (Eric Brunner-Williams)
Date: Wed Apr 1 17:19:39 2009
Subject: pNFS
In-Reply-To: <2ed801c9b2ed$44c6c3c0$510a010a@alacritech.com>
References: <2ed801c9b2ed$44c6c3c0$510a010a@alacritech.com>
Message-ID: <49D3FE6C.5050505@nic-naa.net>

Peter Craft wrote:
> Are there any efforts underway to implement parallel NFS in FreeBSD?
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
>
>

i've not implemented a line of pnfs, and i've not looked much at the
pnfs list since leaving panasas, however ... i thought the question was
interesting at the time and i'm responding to your query.

From pyunyh at gmail.com Wed Apr 1 17:53:33 2009
From: pyunyh at gmail.com (Pyun YongHyeon)
Date: Wed Apr 1 17:53:40 2009
Subject: Small change to ukphy
In-Reply-To: <20090401.093740.669301742.imp@bsdimp.com>
References: <20090401.013246.-1253043078.imp@bsdimp.com>
 <20090401100939.GB12246@michelle.cdnetworks.co.kr>
 <20090401.093740.669301742.imp@bsdimp.com>
Message-ID: <20090402005308.GA19091@michelle.cdnetworks.co.kr>

On Wed, Apr 01, 2009 at 09:37:40AM -0600, M. Warner Losh wrote:
> In message: <20090401100939.GB12246@michelle.cdnetworks.co.kr>
>             Pyun YongHyeon writes:
> : On Wed, Apr 01, 2009 at 01:32:46AM -0600, M. Warner Losh wrote:
> : > I've encountered a number of PHY chips that need auto negotiation
> : > kicked off to come out of ISO state. This makes sense, because the
> : > ukphy driver never seems to take the PHY out of isolation state
> : > otherwise.
> : >
> : > Index: ukphy.c
> : > ===================================================================
> : > --- ukphy.c (revision 190463)
> : > +++ ukphy.c (working copy)
> : > @@ -146,6 +146,7 @@
> : >         sc->mii_phy = ma->mii_phyno;
> : >         sc->mii_service = ukphy_service;
> : >         sc->mii_pdata = mii;
> : > +       sc->mii_flags |= MIIF_FORCEANEG;
> : >
> : >         mii->mii_instance++;
> : >
> : >
> : > This forces auto negotiation. The reason for this is that it takes it
> : > out of ISO state (Isolate). Once out of that state, things work
> :
> : If the purpose is to take PHY out of isolated state couldn't this
> : be handled in ifm_change_cb_t handler of parent interface? I guess
> : the callback can reset the PHY and subsequent mii_mediachg() call
> : may start auto-negotiation.
>
> This callback isn't called. The problem is that the PHY is in ISO

Oops, you're right.

> state. Since it is in ISO state with auto negotiation enabled, we
> never kick off an explicit auto negotiation, so the state never
> changes so we never get this callback...
>
> : > well. The question I have is will we properly go back into ISO state
> : > for PHYs that should be isolated.
> : >
> :
> : If the PHY requires special handling for ISO state in reset it may
> : need separated PHY driver as ukphy(4) does not set MIIF_NOISOLATE.
> : As you said it would be really great if we have a generic way to
> : pass various MII flags or driver specific information to mii(4).
>
> This seems to be a common quirk. I'd hate to have a driver that's
> just ukphy but with the one line added above and play whack-a-mole with
> all the odd-balls that are out there. Doesn't seem like a strategy
> that will win the day.
>
> I think we have a way to do this...
> I could do the following in my attach routine:
>
>         mii = device_get_softc(sc->miibus);
>         LIST_FOREACH(miisc, &mii->mii_phys, mii_list) {
>                 miisc->mii_flags |= MIIF_FORCEANEG;
>                 mii_phy_reset(miisc);
>         }
>         mii_mediachg(mii);
>
> which is similar to what fxp does in its change routine (it is what I
> put in my status change routine). Also MIIF_NOISOLATE works as well.
>
> Is the above too insane?
>

That looks ok to me but marius's patch would be the right direction.

From pyunyh at gmail.com Wed Apr 1 17:55:29 2009
From: pyunyh at gmail.com (Pyun YongHyeon)
Date: Wed Apr 1 17:55:35 2009
Subject: Small change to ukphy
In-Reply-To: <20090401211254.GA83780@alchemy.franken.de>
References: <20090401.013246.-1253043078.imp@bsdimp.com>
 <20090401100939.GB12246@michelle.cdnetworks.co.kr>
 <20090401211254.GA83780@alchemy.franken.de>
Message-ID: <20090402005503.GB19091@michelle.cdnetworks.co.kr>

On Wed, Apr 01, 2009 at 11:12:54PM +0200, Marius Strobl wrote:
> On Wed, Apr 01, 2009 at 07:09:39PM +0900, Pyun YongHyeon wrote:
> > On Wed, Apr 01, 2009 at 01:32:46AM -0600, M. Warner Losh wrote:
> > > I've encountered a number of PHY chips that need auto negotiation
> > > kicked off to come out of ISO state. This makes sense, because the
> > > ukphy driver never seems to take the PHY out of isolation state
> > > otherwise.
> > >
> > > Index: ukphy.c
> > > ===================================================================
> > > --- ukphy.c (revision 190463)
> > > +++ ukphy.c (working copy)
> > > @@ -146,6 +146,7 @@
> > >         sc->mii_phy = ma->mii_phyno;
> > >         sc->mii_service = ukphy_service;
> > >         sc->mii_pdata = mii;
> > > +       sc->mii_flags |= MIIF_FORCEANEG;
> > >
> > >         mii->mii_instance++;
> > >
> > >
> > > This forces auto negotiation. The reason for this is that it takes it
> > > out of ISO state (Isolate). Once out of that state, things work
> >
> > If the purpose is to take PHY out of isolated state couldn't this
> > be handled in ifm_change_cb_t handler of parent interface?
> > I guess the callback can reset the PHY and subsequent mii_mediachg()
> > call may start auto-negotiation.
> >
> > > well. The question I have is will we properly go back into ISO state
> > > for PHYs that should be isolated.
> > >
> >
> > If the PHY requires special handling for ISO state in reset it may
> > need separated PHY driver as ukphy(4) does not set MIIF_NOISOLATE.
> > As you said it would be really great if we have a generic way to
> > pass various MII flags or driver specific information to mii(4).
> >
> > > NetBSD has many of its NIC drivers setting this flag. Their APIs
> > > allow them to set this directly at mii attach time. Ours don't, so
> > > none of our drivers set this flag.
> > >
> > > The other fix for this might be:
> > > Index: mii_physubr.c
> > > ===================================================================
> > > --- mii_physubr.c (revision 190463)
> > > +++ mii_physubr.c (working copy)
> > > @@ -113,7 +113,9 @@
> > >         int bmcr, anar, gtcr;
> > >
> > >         if (IFM_SUBTYPE(ife->ifm_media) == IFM_AUTO) {
> > > -               if ((PHY_READ(sc, MII_BMCR) & BMCR_AUTOEN) == 0 ||
> > > +               bmcr = PHY_READ(sc, MII_BMCR);
> > > +               if ((bmcr & BMCR_AUTOEN) == 0 ||
> > > +                   (bmcr & BMCR_ISO) ||
> > >                     (sc->mii_flags & MIIF_FORCEANEG))
> > >                         (void) mii_phy_auto(sc);
> > >                 return;
> > >
> > > Which says that if auto negotiation is enabled, and ISO is set to go
> > > ahead and kick off an auto negotiation. I'm less sure of this path,
> > > but it is an alternative. Otherwise, we never write to the BMCR to
> > > take the device out of isolation. If there's a better place to do
> > > this, then I'm all ears.
> > >
> > > Either one of these hacks make several PC Cards that I have start to
> > > work... In fact, I'm starting to approach 100% (up from 50%) of my
> > > ed-based PC Cards working with this simple change (and others to the
> > > ed driver).
> > > I know that these cards are a little behind the leading
> > > edge, but I'd like to get them working since I've put a few hours into
> > > investigating things here.
> > >
> > > Comments?
> > >
>
> FYI, the idea I had for passing MIIF_DOPAUSE from the NIC
> drivers to the PHY drivers as required by the flow-control
> support without breaking the ABI was to use device flags.
> A proof-of-concept patch with an example application of
> that approach is:
> http://people.freebsd.org/~marius/mii_flags.diff
> One could even OR the flags together in miibus_attach(),
> allowing MIIF_FORCEANEG etc to be additionally set via
> hints.
>

This looks good. As you know, some PHY drivers (e.g. brgphy(4),
e1000phy(4)) have to know more information than mii flags. How about
passing one more pointer argument to mii_probe()? The pointer would
be used to point to driver specific data.

From upakul at gmail.com Thu Apr 2 01:04:16 2009
From: upakul at gmail.com (Upakul Barkakaty)
Date: Thu Apr 2 01:04:23 2009
Subject: Multicast routing
Message-ID:

Hi all,

I was trying to set up multicast tunneling with FreeBSD, using the
mrouted utility. However, my multicast router doesn't seem to be
forwarding those multicast packets.

It would really be helpful if someone could help me with the setup or the
mrouted.conf file contents.

Thanks in anticipation.

--
Regards,
Upakul Barkakaty

From uebershark at googlemail.com Thu Apr 2 02:38:59 2009
From: uebershark at googlemail.com (Tom)
Date: Thu Apr 2 02:39:06 2009
Subject: FreeBSD and ULI M526x NI
Message-ID: <20090402111050.625a5d35@ViciousVincent>

Hi,

Is the ULI M526x NIC somehow supported in FreeBSD or any other BSD?
If so, how do I go about installing support for it, without a working
internet connection?
The m526x belongs to the 'tulip' family in the Linux kernel.

Thanks for any pointers!
Tom

From erich at fuujingroup.com Thu Apr 2 15:46:40 2009
From: erich at fuujingroup.com (Erich Jenkins)
Date: Thu Apr 2 15:46:47 2009
Subject: IPF, IPNAT and Kernel Panic??
Message-ID: <49D54989.3010905@fuujingroup.com>

I've a FreeBSD 7.0 box in a production environment, now doing spurious
things. I've tried this on two servers with the same config (thinking
there was a possible hardware issue). As it turns out, I see the same
kernel panic and reboot no matter what I run this on. Every so often
(perhaps once or twice daily) this box will panic, reboot and cause many
people to call me at once to threaten my man bits...

Currently:
FreeBSD 7.0 Stable i386 kernel
The firewall kernel modules are loaded on boot and are not compiled in.
IPFilter is doing the firewall work on the public interfaces.
IPNAT is doing NAT and port mapping for the subnets behind this box.

This machine is a 2GHz AMD 64-bit box (being used as 32-bit) with a gig
of RAM and some Intel 10/100 NICs. I see the same thing on Intel x86
hardware, so I don't believe this is platform dependent.

Here's some KGDB BT info:

[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x4
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc387f94b
stack pointer           = 0x28:0xdceb59c8
frame pointer           = 0x28:0xdceb5a44
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 26 (irq23: vr0)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 18h43m50s
Physical memory: 742 MB
Dumping 113 MB: 98 82 66 50 34 18 2

#0  doadump () at pcpu.h:195
195             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:195
#1  0xc05ba397 in boot (howto=260) at ../../../kern/kern_shutdown.c:409
#2  0xc05ba659 in panic (fmt=Variable "fmt" is not available.
) at ../../../kern/kern_shutdown.c:563
#3  0xc080307c in trap_fatal (frame=0xdceb5988, eva=4)
    at ../../../i386/i386/trap.c:899
#4  0xc08032e0 in trap_pfault (frame=0xdceb5988, usermode=0, eva=4)
    at ../../../i386/i386/trap.c:812
#5  0xc0803c62 in trap (frame=0xdceb5988) at ../../../i386/i386/trap.c:490
#6  0xc07ea5eb in calltrap () at ../../../i386/i386/exception.s:139
#7  0xc387f94b in ?? ()
Previous frame inner to this frame (corrupt stack?)

Here's the kernel info:

cpu             I686_CPU
ident           KNL0329

# To statically compile in device wiring instead of /boot/device.hints
#hints          "GENERIC.hints"  # Default places to look for devices.

makeoptions     DEBUG=-g  # Build kernel with gdb(1) debug symbols

options         SCHED_4BSD  # 4BSD scheduler
options         PREEMPTION  # Enable kernel thread preemption
options         INET  # InterNETworking
options         INET6  # IPv6 communications protocols
options         SCTP  # Stream Control Transmission Protocol
options         FFS  # Berkeley Fast Filesystem
options         SOFTUPDATES  # Enable FFS soft updates support
options         UFS_ACL  # Support for access control lists
options         UFS_DIRHASH  # Improve performance on big directories
options         UFS_GJOURNAL  # Enable gjournal-based UFS journaling
options         MD_ROOT  # MD is a potential root device
options         NFSCLIENT  # Network Filesystem Client
options         NFSSERVER  # Network Filesystem Server
options         NFS_ROOT  # NFS usable as /, requires NFSCLIENT
options         MSDOSFS  # MSDOS Filesystem
options         CD9660  # ISO 9660 Filesystem
options         PROCFS  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS  # Pseudo-filesystem framework
options         GEOM_PART_GPT  # GUID Partition Tables.
options         GEOM_LABEL  # Provides labelization
options         COMPAT_43TTY  # BSD 4.3 TTY compat [KEEP THIS!]
options         COMPAT_FREEBSD4  # Compatible with FreeBSD4
options         COMPAT_FREEBSD5  # Compatible with FreeBSD5
options         COMPAT_FREEBSD6  # Compatible with FreeBSD6
options         SCSI_DELAY=5000  # Delay (in ms) before probing SCSI
options         KTRACE  # ktrace(1) support
options         SYSVSHM  # SYSV-style shared memory
options         SYSVMSG  # SYSV-style message queues
options         SYSVSEM  # SYSV-style semaphores
options         _KPOSIX_PRIORITY_SCHEDULING  # POSIX P1003_1B real-time extensions
options         KBD_INSTALL_CDEV  # install a CDEV entry in /dev
options         ADAPTIVE_GIANT  # Giant mutex is adaptive.
options         STOP_NMI  # Stop CPUS using NMI instead of IPI
options         AUDIT  # Security event auditing

# To make an SMP kernel, the next two lines are needed
options         SMP  # Symmetric MultiProcessor Kernel
device          apic  # I/O APIC

# CPU frequency control
device          cpufreq

# Bus support.
device          eisa
device          pci

# ATA and ATAPI devices
device          ata
device          atadisk  # ATA disk drives
device          ataraid  # ATA RAID drives
device          atapicd  # ATAPI CDROM drives
options         ATA_STATIC_ID  # Static device numbering

# SCSI peripherals
device          scbus  # SCSI bus (required for SCSI)
device          da  # Direct Access (disks)
device          sa  # Sequential Access (tape etc)
device          cd  # CD
device          pass  # Passthrough device (direct SCSI access)
device          ses  # SCSI Environmental Services (and SAF-TE)

# atkbdc0 controls both the keyboard and the PS/2 mouse
device          atkbdc  # AT keyboard controller
device          atkbd  # AT keyboard
device          psm  # PS/2 mouse

device          kbdmux  # keyboard multiplexer

device          vga  # VGA video card driver

device          splash  # Splash screen and screen saver support

# syscons is the default console driver, resembling an SCO console
device          sc

device          agp  # support several AGP chipsets

# Power management support (see NOTES for more options)
#device         apm
# Add suspend/resume support for the i8254.
device          pmtimer

# Serial (COM) ports
device          sio  # 8250, 16[45]50 based serial ports
device          uart  # Generic UART driver

# Parallel port
device          ppc
device          ppbus  # Parallel port bus (required)
device          lpt  # Printer
device          plip  # TCP/IP over parallel
device          ppi  # Parallel port interface device

# PCI Ethernet NICs.
device          de  # DEC/Intel DC21x4x (``Tulip'')
device          em  # Intel PRO/1000 adapter Gigabit Ethernet Card
device          ixgb  # Intel PRO/10GbE Ethernet Card
device          le  # AMD Am7900 LANCE and Am79C9xx PCnet
device          txp  # 3Com 3cR990 (``Typhoon'')
device          vx  # 3Com 3c590, 3c595 (``Vortex'')

# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device          miibus  # MII bus support
device          bce  # Broadcom BCM5706/BCM5708 Gigabit Ethernet
device          bfe  # Broadcom BCM440x 10/100 Ethernet
device          bge  # Broadcom BCM570xx Gigabit Ethernet
device          dc  # DEC/Intel 21143 and various workalikes
device          fxp  # Intel EtherExpress PRO/100B (82557, 82558)
device          lge  # Level 1 LXT1001 gigabit Ethernet
device          msk  # Marvell/SysKonnect Yukon II Gigabit Ethernet
device          nfe  # nVidia nForce MCP on-board Ethernet
device          nge  # NatSemi DP83820 gigabit Ethernet
#device         nve  # nVidia nForce MCP on-board Ethernet Networking
device          pcn  # AMD Am79C97x PCI 10/100 (precedence over 'le')
device          re  # RealTek 8139C+/8169/8169S/8110S
device          rl  # RealTek 8129/8139
device          sf  # Adaptec AIC-6915 (``Starfire'')
device          sis  # Silicon Integrated Systems SiS 900/SiS 7016
device          sk  # SysKonnect SK-984x & SK-982x gigabit Ethernet
device          ste  # Sundance ST201 (D-Link DFE-550TX)
device          stge  # Sundance/Tamarack TC9021 gigabit Ethernet
device          ti  # Alteon Networks Tigon I/II gigabit Ethernet
device          tl  # Texas Instruments ThunderLAN
device          tx  # SMC EtherPower II (83c170 ``EPIC'')
device          vge  # VIA VT612x gigabit Ethernet
device          vr  # VIA Rhine, Rhine II
device          wb  # Winbond W89C840F
device          xl  # 3Com 3c90x (``Boomerang'', ``Cyclone'')

# Pseudo devices.
device          loop  # Network loopback
device          random  # Entropy device
device          ether  # Ethernet support
device          sl  # Kernel SLIP
device          ppp  # Kernel PPP
device          tun  # Packet tunnel.
device          pty  # Pseudo-ttys (telnet etc)
device          md  # Memory "disks"
device          gif  # IPv6 and IPv4 tunneling
device          faith  # IPv6-to-IPv4 relaying (translation)
device          firmware  # firmware assist module
device          bpf  # Berkeley packet filter

Any help or thoughts would be greatly appreciated!

Erich

From kfl at xiplink.com Thu Apr 2 15:57:32 2009
From: kfl at xiplink.com (Karim Fodil-Lemelin)
Date: Thu Apr 2 15:57:38 2009
Subject: FreeBSD 7.1 Crash dump with WITNESS
Message-ID: <49D54083.5060504@xiplink.com>

Hi,

I got this crash while running tcpdump and saving to a file.
I can't reproduce it consistently but perhaps someone can give me some
pointers on how to fix this. It looks like the witness code is in some
infinite loop that gets stopped by an MPASS check.

(kgdb) bt
#0  kdb_enter_why (why=0xc092cefd "panic", msg=0xc092cefd "panic")
    at ../../../kern/subr_kdb.c:316
#1  0xc06a69c6 in panic (fmt=0xc091345b "Assertion %s failed at %s:%d")
    at ../../../kern/kern_shutdown.c:557
#2  0xc06e1e22 in isitmydescendant (parent=0xc0a2a208, child=0xc0a2c468)
    at ../../../kern/subr_witness.c:1634
#3  0xc06e1e33 in isitmydescendant (parent=0xc0a2ac80, child=0xc0a2c468)
    at ../../../kern/subr_witness.c:1636
#4  0xc06e1e33 in isitmydescendant (parent=0xc0a2abe0, child=0xc0a2c468)
    at ../../../kern/subr_witness.c:1636
#5  0xc06e1e33 in isitmydescendant (parent=0xc0a2ac08, child=0xc0a2c468)
    at ../../../kern/subr_witness.c:1636
#6  0xc06e3f82 in witness_checkorder (lock=0xca37c2d0, flags=Variable "flags" is not available.
) at ../../../kern/subr_witness.c:1019
#7  0xc0698705 in _mtx_lock_flags (m=0xca37c2d0, opts=0,
    file=0xc091253f "../../../dev/e1000/if_em.c", line=1136)
    at ../../../kern/kern_mutex.c:183
#8  0xc0527f18 in em_ioctl (ifp=0xca394c00, command=2149607696,
    data=0xf572fa04 ",\226\2235\226\2230rU:n\003\210@")
    at ../../../dev/e1000/if_em.c:1136
#9  0xc073eb91 in if_setflag (ifp=0xca394c00, flag=256, pflag=131072,
    refcount=0xca394c44, onswitch=0) at ../../../net/if.c:2098
#10 0xc073ec6a in ifpromisc (ifp=0xca394c00, pswitch=0)
    at ../../../net/if.c:2125
#11 0xc0738083 in bpf_detachd (d=0xcb262f00) at ../../../net/bpf.c:379
#12 0xc0739664 in bpfclose (dev=0xcb0ddd00, flags=3, fmt=8192, td=0xcb087230)
    at ../../../net/bpf.c:452
#13 0xc0633845 in devfs_close (ap=0xf572fb30)
    at ../../../fs/devfs/devfs_vnops.c:460
#14 0xc08d0306 in VOP_CLOSE_APV (vop=0xc09c52c0, a=0xf572fb30)
    at vnode_if.c:415
#15 0xc073427b in vn_close (vp=0xcb27f420, flags=3, file_cred=0xcb279300,
    td=0xcb087230) at vnode_if.h:228
#16 0xc0734389 in vn_closefile (fp=0xcaae44c0, td=0xcb087230)
at ../../../kern/vfs_vnops.c:867 #17 0xc0630b9c in devfs_close_f (fp=0xcaae44c0, td=0xcb087230) at ../../../fs/devfs/devfs_vnops.c:479 #18 0xc0675a79 in fdrop (fp=0xcaae44c0, td=0xcb087230) at file.h:299 #19 0xc06777b9 in closef (fp=0xcaae44c0, td=0xcb087230) at ../../../kern/kern_descrip.c:2033 #20 0xc0677b67 in kern_close (td=0xcb087230, fd=3) ---Type to continue, or q to quit--- at ../../../kern/kern_descrip.c:1125 #21 0xc0677bff in close (td=0xcb087230, uap=0xf572fcfc) at ../../../kern/kern_descrip.c:1077 #22 0xc08c3cdf in syscall (frame=0xf572fd38) at ../../../i386/i386/trap.c:1076 #23 0xc08aa9fa in Xlcall_syscall () at ../../../i386/i386/exception.s:229 #24 0x00000033 in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) up #1 0xc06a69c6 in panic (fmt=0xc091345b "Assertion %s failed at %s:%d") at ../../../kern/kern_shutdown.c:557 557 kdb_enter_why(KDB_WHY_PANIC, "panic"); (kgdb) #2 0xc06e1e22 in isitmydescendant (parent=0xc0a2a208, child=0xc0a2c468) at ../../../kern/subr_witness.c:1634 1634 MPASS(j < 1000); (kgdb) p j $1 = 1000 (kgdb) (kgdb) p *child $2 = {w_name = 0xc0932551 "bpf global lock", w_class = 0xc09ceb04, w_list = { stqe_next = 0xc0a2c490}, w_typelist = {stqe_next = 0xc0a2c490}, w_children = 0xc0a4c640, w_file = 0xc0939635 "net/bpf.c", w_line = 452, w_level = 0, w_refcount = 2, w_Giant_squawked = 0 '\0', w_other_squawked = 0 '\0', w_same_squawked = 0 '\0', w_displayed = 0 '\0'} Anyone that can shed some light on this? Btw I've never witnessed that crash without WITNESS on ;). Thanks! Karim. 
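
One way to read the trace above: only four isitmydescendant frames are on the stack, yet j has reached 1000, so the counter seems to bound iterations of a loop inside a single call rather than recursion depth. The sketch below is not the subr_witness.c code. It is a hypothetical, self-contained model (the names node, walk_bounded and WALK_LIMIT are made up for illustration) of how a capped list walk behaves on a healthy list and on one whose next pointers form a cycle.

```c
#include <stddef.h>

#define WALK_LIMIT 1000	/* assumption: same bound as in MPASS(j < 1000) */

/* Hypothetical list element; stands in for whatever the real loop walks. */
struct node {
	struct node *next;
};

/*
 * Walk the list and count its elements.
 * Returns the element count, or -1 once WALK_LIMIT iterations have been
 * done without reaching the end of the list (e.g. the list is cyclic).
 */
int
walk_bounded(const struct node *head)
{
	const struct node *n;
	int j = 0;

	for (n = head; n != NULL; n = n->next) {
		if (j >= WALK_LIMIT)
			return (-1);
		j++;
	}
	return (j);
}
```

If something like this is what happens in the panic, the interesting question would be how the list got corrupted or cyclic, not the check itself.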
From lists.br at gmail.com Sat Apr 4 06:50:55 2009
From: lists.br at gmail.com (Luiz Otavio O Souza)
Date: Sat Apr 4 08:55:19 2009
Subject: Setting the mss for socket
References: <3FD46C21A487490FB15B89E890790121@adnote989> <49d5c0de.E5bkeKr+p+fg4K00%perryh@pluto.rain.com> <64D5D9E633734200A603D067ED5A81E9@adnote989> <49D63315.6050108@elischer.org>
Message-ID: <3F89BC6021844DC58B956E5433650445@adnote989>

> Luiz Otavio O Souza wrote:
>>>> Is there a way to set the mss for a socket ? Like you can do
>>>> in linux with setsockopt(TCP_MAXSEG) ?
>>>>
>>>> So i can set the maximum size of packets (or sort of) from a
>>>> simple userland program.
>>>
>>> Depending on exactly what you need to accomplish, you may
>>> find something useful in this thread from last August in
>>> freebsd-questions@
>>>
>>> setting the other end's TCP segment size
>>
>> Very informative thread, thanks.
>>
>> This thread showed me that TCP_MAXSEG is implemented in FreeBSD but
>> doesn't work. You can call setsockopt(IPPROTO_TCP, TCP_MAXSEG), which
>> will set tp->t_maxseg, but this value is recalculated at tcp_input,
>> so in short, you cannot set the max segment size for a socket.
>>
>> I've posted a completely wrong patch (from a style point of view, and
>> using SOL_SOCKET instead of IPPROTO_TCP), but with that patch I'm able
>> to set the mss in iperf.
>
> this thread should be in FreeBSD-net@ so that the right people see it
> many developers do not read hackers every day as it tends to overload
> them.

The patch below is a better fix for this: it fixes setsockopt(IPPROTO_TCP, TCP_MAXSEG), so iperf (and other userland programs) work by default.

It's clear from the code that tp->t_maxseg should not be changed, at least in this situation (it holds the maximum MSS for the connection and is used to calculate the TCP window scaling). tp->t_maxseg is also reset to maxmtu (or rmx_mtu) in tcp_mss_update().
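
For reference, this is roughly what the userland side of the call discussed here looks like. It is a minimal sketch: the helper names set_tcp_maxseg and get_tcp_maxseg are made up for illustration, and whether the kernel actually honours the requested value is exactly what this thread is about.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/*
 * Ask the stack to cap the segment size on a TCP socket.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt).
 */
int
set_tcp_maxseg(int fd, int mss)
{
	return (setsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss)));
}

/*
 * Read back the segment size the stack reports for the socket.
 * Returns the value, or -1 on failure (errno is set by getsockopt).
 */
int
get_tcp_maxseg(int fd)
{
	int mss = 0;
	socklen_t len = sizeof(mss);

	if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) == -1)
		return (-1);
	return (mss);
}
```

A program such as iperf would call the setter on its socket and could use the getter afterwards to see which value the stack reports.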
So here is the patch: http://loos.no-ip.org/downloads/mss-patch

Thanks

Luiz
-------------- next part --------------
--- netinet/socketvar.h.orig	2009-04-03 22:29:34.000000000 -0300
+++ netinet/socketvar.h	2009-04-03 22:29:44.000000000 -0300
@@ -115,6 +115,7 @@
 		char	*so_accept_filter_str;	/* saved user args */
 	} *so_accf;
 	int so_fibnum;		/* routing domain for this socket */
+	int so_maxseg;		/* maxseg this socket */
 };
 
 /*
--- netinet/tcp_usrreq.c.orig	2009-04-03 22:24:53.000000000 -0300
+++ netinet/tcp_usrreq.c	2009-04-03 23:26:35.000000000 -0300
@@ -1352,9 +1352,8 @@
 				return (error);
 
 			INP_WLOCK_RECHECK(inp);
-			if (optval > 0 && optval <= tp->t_maxseg &&
-			    optval + 40 >= V_tcp_minmss)
-				tp->t_maxseg = optval;
+			if (optval >= 40 && optval <= tp->t_maxseg)
+				so->so_maxseg = optval;
 			else
 				error = EINVAL;
 			INP_WUNLOCK(inp);
@@ -1389,7 +1388,10 @@
 			error = sooptcopyout(sopt, &optval, sizeof optval);
 			break;
 		case TCP_MAXSEG:
-			optval = tp->t_maxseg;
+			if (so->so_maxseg)
+				optval = so->so_maxseg;
+			else
+				optval = tp->t_maxseg;
 			INP_WUNLOCK(inp);
 			error = sooptcopyout(sopt, &optval, sizeof optval);
 			break;
--- netinet/tcp_output.c.orig	2009-04-02 22:48:04.000000000 -0300
+++ netinet/tcp_output.c	2009-04-03 23:28:07.000000000 -0300
@@ -493,6 +493,11 @@
 		}
 	}
 
+	if (so->so_maxseg && len > so->so_maxseg) {
+		len = so->so_maxseg;
+		sendalot = 1;
+	}
+
 	if (sack_rxmit) {
 		if (SEQ_LT(p->rxmit + len, tp->snd_una + so->so_snd.sb_cc))
 			flags &= ~TH_FIN;
@@ -518,6 +523,8 @@
 	if (len) {
 		if (len >= tp->t_maxseg)
 			goto send;
+		if (so->so_maxseg && len >= so->so_maxseg)
+			goto send;
 		/*
 		 * NOTE! on localhost connections an 'ack' from the remote
 		 * end may occur synchronously with the output and cause

From ivoras at freebsd.org Sun Apr 5 05:21:00 2009
From: ivoras at freebsd.org (Ivan Voras)
Date: Sun Apr 5 05:21:07 2009
Subject: Advice on a multithreaded netisr patch?
Message-ID: Hi, I'm developing an application that needs a high rate of small TCP transactions on multi-core systems, and I'm hitting a limit where a kernel task, usually swi:net (but it depends on the driver) hits 100% of a CPU at some transactions/s rate and blocks further performance increase even though other cores are 100% idle. So I've got an idea and tested it out, but it fails in an unexpected way. I'm not very familiar with the network code so I'm probably missing something obvious. The idea was to locate where the packet processing takes place and offload packets to several new kernel threads. I see this can happen in several places - netisr, ip_input and tcp_input, and I chose netisr because I thought maybe it would also help other uses (routing?). Here's a patch against CURRENT: http://people.freebsd.org/~ivoras/diffs/mpip.patch It's fairly simple - starts a configurable number of threads in start_netisr(), assigns circular queues to each, and modifies what I think are entry points for packets in the non-netisr.direct case. I also try to have TCP and UDP traffic from the same host+port processed by the same thread. It has some rough edges but I think this is enough to test the idea. I know that there are several people officially working in this area and I'm not an expert in it so think of it as a weekend hack for learning purposes :) These parameters are needed in loader.conf to test it: net.isr.direct=0 net.isr.mtdispatch_n_threads=2 I expected things like the contention in upper layers (TCP) leading to not improving performance one bit, but I can't explain what I'm getting here. While testing the application on a plain kernel, I get approx. 100,000 - 120,000 packets/s per direction (by looking at "netstat 1") and a similar number of transactions/s in the application. 
With the patch I get up to 250,000 packets/s in netstat (3 mtdispatch threads), but for some weird reason the actual number of transactions processed by the application drops to less than 1,000 at the beginning (~~ 30 seconds), then jumps to close to 100,000 transactions/s, with netstat also showing a drop this number of packets. In the first phase, the new threads (netd0..3) are using CPU time almost 100%, in the second phase I can't see where the CPU time is going (using top). I thought this has something to deal with NIC moderation (em) but can't really explain it. The bad performance part (not the jump) is also visible over the loopback interface. Any ideas? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 258 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090405/6e457d3c/signature.pgp From rwatson at FreeBSD.org Sun Apr 5 06:21:23 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sun Apr 5 06:21:30 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: References: Message-ID: On Sun, 5 Apr 2009, Ivan Voras wrote: > I'm developing an application that needs a high rate of small TCP > transactions on multi-core systems, and I'm hitting a limit where a kernel > task, usually swi:net (but it depends on the driver) hits 100% of a CPU at > some transactions/s rate and blocks further performance increase even though > other cores are 100% idle. You can find a similar, if possibly more mature, implementation here: //depot/projects/rwatson/netisr2/... I haven't updated it in about six months since I've been waiting for the RSS-based flowid support in HEAD to mature. One of the fundamental problems with hashing packets to distribute work is that it involves taking cache misses on packet headers, not just once, but twice, which often is one of the largest costs in processing packets. 
Most modern, interesting high-performance network cards can already take the hash in hardware, and you want to use that hash to place work where possible. In 8.x, you shouldn't be experiencing high lock contention for the TCP receipt path when doing bulk transfers, as we use read locking for the tcbinfo lock in most cases. In fact, you can even get fairly decent scalability even in 7.x because the regular packet processing path for TCP uses mutual exclusion only briefly. However, the current approach does dirty a lot of cache lines, especially locks and stats, and does not scale well (in 8.x, or at all in 7.x) if you have lots of short connections. Also, be aware that if you're outputting to a single interface or queue, there's a *lot* of lock contention in the device driver. Kip Macy has patches to support multiple output queues on cxgb, which should facilitate support for other drivers as well, and the plan is to get that in 8.0 as well. The patch above doesn't know about the mbuf packetheader flowid yet, but it's trivial to teach it about that. I have plans to get back to the netisr2 code before we finalize 8.0, but have some other stuff in the queue first. We're, briefly, in a period where input queue count is about the same density as CPU cores; it's not entirely clear, but we may soon be back in a situation where CPU core count exceeds queues, in which case doing software work placement will continue to be important. Right now, as long as your high-performance card supports multiple input queues, we already do pretty effective work placement by virtue of RSS and multiple ithreads. Robert N M Watson Computer Laboratory University of Cambridge > > So I've got an idea and tested it out, but it fails in an unexpected > way. I'm not very familiar with the network code so I'm probably missing > something obvious. The idea was to locate where the packet processing > takes place and offload packets to several new kernel threads. 
I see > this can happen in several places - netisr, ip_input and tcp_input, and > I chose netisr because I thought maybe it would also help other uses > (routing?). Here's a patch against CURRENT: > > http://people.freebsd.org/~ivoras/diffs/mpip.patch > > It's fairly simple - starts a configurable number of threads in > start_netisr(), assigns circular queues to each, and modifies what I > think are entry points for packets in the non-netisr.direct case. I also > try to have TCP and UDP traffic from the same host+port processed by the > same thread. It has some rough edges but I think this is enough to test > the idea. I know that there are several people officially working in > this area and I'm not an expert in it so think of it as a weekend hack > for learning purposes :) > > These parameters are needed in loader.conf to test it: > > net.isr.direct=0 > net.isr.mtdispatch_n_threads=2 > > I expected things like the contention in upper layers (TCP) leading to > not improving performance one bit, but I can't explain what I'm getting > here. While testing the application on a plain kernel, I get approx. > 100,000 - 120,000 packets/s per direction (by looking at "netstat 1") > and a similar number of transactions/s in the application. With the > patch I get up to 250,000 packets/s in netstat (3 mtdispatch threads), > but for some weird reason the actual number of transactions processed by > the application drops to less than 1,000 at the beginning (~~ 30 > seconds), then jumps to close to 100,000 transactions/s, with netstat > also showing a drop this number of packets. In the first phase, the new > threads (netd0..3) are using CPU time almost 100%, in the second phase I > can't see where the CPU time is going (using top). > > I thought this has something to deal with NIC moderation (em) but can't > really explain it. The bad performance part (not the jump) is also > visible over the loopback interface. > > Any ideas? 
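
The "same host+port goes to the same thread" idea quoted above can be sketched as a small hash-and-modulo step. This is a hypothetical illustration only: the flow_key layout, the function names and the choice of FNV-1a are assumptions, not taken from mpip.patch or from netisr2, and it hashes in software, which is the very cost the reply above suggests avoiding when the card can supply a hash.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key: source IPv4 address and source port. */
struct flow_key {
	uint32_t src_addr;
	uint16_t src_port;
};

/* Fold a byte buffer into a running 32-bit FNV-1a hash. */
static uint32_t
fnv1a(uint32_t h, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	while (len-- > 0) {
		h ^= *p++;
		h *= 16777619u;		/* FNV-1a 32-bit prime */
	}
	return (h);
}

/*
 * Map a flow to one of nthreads worker threads so that packets with the
 * same key always land on the same worker. Returns 0 if nthreads is 0.
 */
unsigned int
flow_to_thread(const struct flow_key *key, unsigned int nthreads)
{
	uint32_t h = 2166136261u;	/* FNV-1a 32-bit offset basis */

	if (nthreads == 0)
		return (0);
	/* Hash the fields one by one so struct padding is never read. */
	h = fnv1a(h, &key->src_addr, sizeof(key->src_addr));
	h = fnv1a(h, &key->src_port, sizeof(key->src_port));
	return (h % nthreads);
}
```

Keeping a flow on one worker preserves per-connection packet ordering, at the price of reading the headers once more just to pick the thread.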
> > From rwatson at FreeBSD.org Sun Apr 5 06:24:02 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sun Apr 5 06:24:07 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: References: Message-ID: On Sun, 5 Apr 2009, Ivan Voras wrote: > I thought this has something to deal with NIC moderation (em) but can't > really explain it. The bad performance part (not the jump) is also visible > over the loopback interface. FYI, if you want high performance, you really want a card supporting multiple input queues -- igb, cxgb, mxge, etc. if_em-only cards are fundamentally less scalable in an SMP environment because they require input or output to occur only from one CPU at a time. Robert N M Watson Computer Laboratory University of Cambridge From ivoras at freebsd.org Sun Apr 5 06:35:11 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Sun Apr 5 06:35:18 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: References: Message-ID: Robert Watson wrote: > > On Sun, 5 Apr 2009, Ivan Voras wrote: > >> I thought this has something to deal with NIC moderation (em) but >> can't really explain it. The bad performance part (not the jump) is >> also visible over the loopback interface. > > FYI, if you want high performance, you really want a card supporting > multiple input queues -- igb, cxgb, mxge, etc. if_em-only cards are > fundamentally less scalable in an SMP environment because they require > input or output to occur only from one CPU at a time. Makes sense, but on the other hand - I see people are routing at least 250,000 packets per seconds per direction with these cards, so they probably aren't the bottleneck (pro/1000 pt on pci-e). -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 258 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090405/c056c6e3/signature.pgp From rwatson at FreeBSD.org Sun Apr 5 06:54:20 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sun Apr 5 06:54:27 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: References: Message-ID: On Sun, 5 Apr 2009, Ivan Voras wrote: >>> I thought this has something to deal with NIC moderation (em) but can't >>> really explain it. The bad performance part (not the jump) is also visible >>> over the loopback interface. >> >> FYI, if you want high performance, you really want a card supporting >> multiple input queues -- igb, cxgb, mxge, etc. if_em-only cards are >> fundamentally less scalable in an SMP environment because they require >> input or output to occur only from one CPU at a time. > > Makes sense, but on the other hand - I see people are routing at least > 250,000 packets per seconds per direction with these cards, so they probably > aren't the bottleneck (pro/1000 pt on pci-e). The argument is not that they are slower (although they probably are a bit slower), rather that they introduce serialization bottlenecks by requiring synchronization between CPUs in order to distribute the work. Certainly some of the scalability issues in the stack are not a result of that, but a good number are. Historically, we've had a number of bottlenecks in, say, the bulk data receive and send paths, such as: - Initial receipt and processing of packets on a single CPU as a result of a single input queue from the hardware. Addressed by using multiple input queue hardware with appropriately configured drivers (generally the default is to use multiple input queues in 7.x and 8.x for supporting hardware). - Cache line contention on stats data structures in drivers and various levels of the network stack due to bouncing around exclusive ownership of the cache line. 
ifnet introduces at least a few, but I think most of the interesting ones are at the IP and TCP layers for receipt. - Global locks protecting connection lists, all rwlocks as of 7.1, but not necessarily always used read-only for packet processing. For UDP we do a very good job at avoiding write locks, but for TCP in 7.x we still use a global write lock, if briefly, for every packet. There's a change in 8.x to use a global read lock for most packets, especially steady state packets, but I didn't merge it for 7.2 because it's not well-benchmarked. Assuming I get positive feedback from more people, I will merge them before 7.3. - If the user application is multi-threaded and receiving from many threads at once, we see contention on the file descriptor table lock. This was markedly improved by the file descriptor table locking rewrite in 7.0, but we're continuing to look for ways to mitigate this. A lockless approach would be really nice... On the transmit path, the bottlenecks are similar but different: - Neither 7.x nor 8.x supports multiple transmit queues as shipped; Kip has patches for both that add it for cxgb. Maintaining ordering here, and ideally affinity to the appropriate associated input queue, is important. As the patches aren't in the tree yet, or for single-queue drivers, contention on the device driver send path and queues can be significant, especially for device drivers where the send and receive path are protected by the same lock (bge!). - Stats at various levels in the stack still dirty cache lines. - We don't acquire, in the common case, any global connection list locks during transmit. - Routing table locks may be an issue. Kip has patches against 8.x to re-introduce inpcb route as well as link layer flow caching. These are in my review queue currently... In 8.x the global radix tree lock is a read-write lock and we use read-locking where possible, but in 7.x it's still a mutex. This probably isn't an MFCable change. 
Another change coming in 8.x is increased use of read-mostly locks, rmlocks, which avoid writes to shared cache lines for read-acquire, but have a more expensive write-acquire. We're already using this in a few spots, including for firewall registration, but need to use it in more. With a fast CPU, introducing more cores may not necessarily speed up, and might often slow down, processing even if all bottlenecks are eliminated--fundamentally, if you have the CPU capacity to do the work on one CPU, then moving the work to other CPUs is an overhead best avoided. Especially if the device itself forces serialization due to having a single input queue and a single output queue. However, if we, reasonably, assume a capping of core speed over time, and increasing CPU density, software work placement becomes more important. And with multi-queue devices, avoiding writing to common cache lines from CPUs is increasingly possible. We have a 32-thread MIPS embedded eval board in the Netperf cluster now, which we'll begin using for 10gbps testing fairly soon, I hope. One of its properties is that individual threads are decidedly non-zippy compared to, say, a 10gbps interface running at line-rate, so it will allow us to explore these issues more effectively than we could before. Robert N M Watson Computer Laboratory University of Cambridge From barney_cordoba at yahoo.com Sun Apr 5 10:25:44 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Sun Apr 5 10:25:50 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: Message-ID: <285323.31546.qm@web63901.mail.re1.yahoo.com> --- On Sun, 4/5/09, Robert Watson wrote: > From: Robert Watson > Subject: Re: Advice on a multithreaded netisr patch? > To: "Ivan Voras" > Cc: freebsd-net@freebsd.org > Date: Sunday, April 5, 2009, 9:54 AM > On Sun, 5 Apr 2009, Ivan Voras wrote: > > >>> I thought this has something to deal with NIC > moderation (em) but can't really explain it. 
The bad > performance part (not the jump) is also visible over the > loopback interface. > >> > >> FYI, if you want high performance, you really want > a card supporting multiple input queues -- igb, cxgb, mxge, > etc. if_em-only cards are fundamentally less scalable in an > SMP environment because they require input or output to > occur only from one CPU at a time. > > > > Makes sense, but on the other hand - I see people are > routing at least 250,000 packets per seconds per direction > with these cards, so they probably aren't the bottleneck > (pro/1000 pt on pci-e). > > The argument is not that they are slower (although they > probably are a bit slower), rather that they introduce > serialization bottlenecks by requiring synchronization > between CPUs in order to distribute the work. Certainly > some of the scalability issues in the stack are not a result > of that, but a good number are. > > Historically, we've had a number of bottlenecks in, > say, the bulk data receive and send paths, such as: > > - Initial receipt and processing of packets on a single CPU > as a result of a > single input queue from the hardware. Addressed by using > multiple input > queue hardware with appropriately configured drivers > (generally the default > is to use multiple input queues in 7.x and 8.x for > supporting hardware). > > - Cache line contention on stats data structures in drivers > and various levels > of the network stack due to bouncing around exclusive > ownership of the cache > line. ifnet introduces at least a few, but I think most > of the interesting > ones are at the IP and TCP layers for receipt. > > - Global locks protecting connection lists, all rwlocks as > of 7.1, but not > necessarily always used read-only for packet processing. > For UDP we do a > very good job at avoiding write locks, but for TCP in 7.x > we still use a > global write lock, if briefly, for every packet. 
> There's a change in 8.x to > use a global read lock for most packets, especially > steady state packets, > but I didn't merge it for 7.2 because it's not > well-benchmarked. Assuming I > get positive feedback from more people, I will merge them > before 7.3. > > - If the user application is multi-threaded and receiving > from many threads at > once, we see contention on the file descriptor table > lock. This was > markedly improved by the file descriptor table locking > rewrite in 7.0, but > we're continuing to look for ways to mitigate this. > A lockless approach > would be really nice... > > On the transmit path, the bottlenecks are similar but > different: > > - Neither 7.x nor 8.x supports multiple transmit queues as > shipped; Kip has > patches for both that add it for cxgb. Maintaining > ordering here, and > ideally affinity to the appropriate associated input > queue, is important. > As the patches aren't in the tree yet, or for > single-queue drivers, > contention on the device driver send path and queues can > be significant, > especially for device drivers where the send and receive > path are protected > by the same lock (bge!). I'm curious as to your assertion that hardware transmit queues are a big win. You're really just loading a transmit ring well ahead of actual transmission; there's no need to force a "start" for each packet queued. You then have more overheard managing the multiple queues; more memory used, more cpu cache needed, more interrupts (perhaps), overhead generating the flowid. It seems to me that a more efficient method of transmitting, such as offloading the transmit workload to a kernel task, would be more effective than using multiple transmit queues. All the source thread has to do is queue the packet and get out. As an aside, why is Kip doing development on a Chelsio card rather than a more mainstream product such as Intel or Broadcom that would generate more widespread interest? 
Barney From bms at incunabulum.net Sun Apr 5 10:27:25 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Sun Apr 5 10:27:32 2009 Subject: Multicast routing In-Reply-To: References: Message-ID: <49D8E9F8.7090800@incunabulum.net> Upakul Barkakaty wrote: > Hi all, > > I was trying to setup a multicast tunneling setup with freebsd, with the > mrouted utility. However, my multicast router doesnt seem to be forwarding > those multicast packets. > > It would really be helpful if someone could help me with the setup or the > mrouted.conf file contents. > > Thanks in anticipation. > > Please try the mcast-tools port to confirm that multicast forwarding works. There are tools in that port which will allow you to run basic UDP stream tests as well as installing static entries in the forwarding cache. The most likely culprit is a network interface which does not support ALLMULTI. Also, DVMRP has been dead for years, avoid mrouted -- try a PIM implementation e.g. XORP or pimsd. thanks BMS From ivoras at freebsd.org Sun Apr 5 10:29:46 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Sun Apr 5 10:29:53 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: References: Message-ID: Robert Watson wrote: > > On Sun, 5 Apr 2009, Ivan Voras wrote: > >>>> I thought this has something to deal with NIC moderation (em) but >>>> can't really explain it. The bad performance part (not the jump) is >>>> also visible over the loopback interface. >>> >>> FYI, if you want high performance, you really want a card supporting >>> multiple input queues -- igb, cxgb, mxge, etc. if_em-only cards are >>> fundamentally less scalable in an SMP environment because they >>> require input or output to occur only from one CPU at a time. >> >> Makes sense, but on the other hand - I see people are routing at least >> 250,000 packets per seconds per direction with these cards, so they >> probably aren't the bottleneck (pro/1000 pt on pci-e). 
> > The argument is not that they are slower (although they probably are a > bit slower), rather that they introduce serialization bottlenecks by > requiring synchronization between CPUs in order to distribute the work. > Certainly some of the scalability issues in the stack are not a result > of that, but a good number are. I'd like to understand more. If (in netisr) I have a mbuf with headers, is this data already transfered from the card or is it magically "not here yet"? In the first case, the package reception code path is not changed until it's queued on a thread, on which it's handled in the future (or is the influence of "other" data like timers and internal TCP reassembly buffers so large?). In the second case, why? > Historically, we've had a number of bottlenecks in, say, the bulk data > receive and send paths, such as: > > - Initial receipt and processing of packets on a single CPU as a result > of a > single input queue from the hardware. Addressed by using multiple input > queue hardware with appropriately configured drivers (generally the > default > is to use multiple input queues in 7.x and 8.x for supporting hardware). As the card and the OS can already process many packets per second for something fairly complex as routing (http://www.tancsa.com/blast.html), and TCP chokes swi:net at 100% of a core, isn't this indication there's certainly more space for improvement even with a single-queue old-fashioned NICs? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 258 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090405/1d0a28fb/signature.pgp From rwatson at FreeBSD.org Sun Apr 5 10:40:20 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sun Apr 5 10:40:27 2009 Subject: Advice on a multithreaded netisr patch? 
In-Reply-To: <285323.31546.qm@web63901.mail.re1.yahoo.com> References: <285323.31546.qm@web63901.mail.re1.yahoo.com> Message-ID: On Sun, 5 Apr 2009, Barney Cordoba wrote: > I'm curious as to your assertion that hardware transmit queues are a big > win. You're really just loading a transmit ring well ahead of actual > transmission; there's no need to force a "start" for each packet queued. You > then have more overheard managing the multiple queues; more memory used, > more cpu cache needed, more interrupts (perhaps), overhead generating the > flowid. It seems to me that a more efficient method of transmitting, such as > offloading the transmit workload to a kernel task, would be more effective > than using multiple transmit queues. All the source thread has to do is > queue the packet and get out. When using multiple cores, we've observed significant contention on the transmit-side locks protecting a single output queue; when multiple queues are used, that contention is avoided. The lock only covers the queue, but the overhead of taking a single highly contended lock twice for every packet (enqueue, later dequeue) is significant at high pps and with many cores. > As an aside, why is Kip doing development on a Chelsio card rather than a > more mainstream product such as Intel or Broadcom that would generate more > widespread interest? Because they paid him to write their driver? :-) Robert N M Watson Computer Laboratory University of Cambridge From oberman at es.net Sun Apr 5 14:24:22 2009 From: oberman at es.net (Kevin Oberman) Date: Sun Apr 5 14:24:34 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: Your message of "Sun, 05 Apr 2009 10:25:41 PDT."
<285323.31546.qm@web63901.mail.re1.yahoo.com> Message-ID: <20090405212420.31A311CC50@ptavv.es.net> > Date: Sun, 5 Apr 2009 10:25:41 -0700 (PDT) > From: Barney Cordoba > Sender: owner-freebsd-net@freebsd.org > > > As an aside, why is Kip doing development on a Chelsio card rather > than a more mainstream product such as Intel or Broadcom that would > generate more widespread interest? Because Chelsio pays him better than the makers of the "more mainstream" products. And, at 10GE, Chelsio and Myricom seem to have stronger products than others. (Just my opinion and not that of The US Dept. of Energy, The university of California, or Lawrence Berkeley National Labs.) I just hope Kip's legal problems are resolved soon. FreeBSD really needs him. -- R. Kevin Oberman, Network Engineer Energy Sciences Network (ESnet) Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab) E-mail: oberman@es.net Phone: +1 510 486-8634 Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751 From barney_cordoba at yahoo.com Sun Apr 5 14:32:37 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Sun Apr 5 14:32:44 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: <20090405212420.31A311CC50@ptavv.es.net> Message-ID: <496315.72401.qm@web63906.mail.re1.yahoo.com> --- On Sun, 4/5/09, Kevin Oberman wrote: > From: Kevin Oberman > Subject: Re: Advice on a multithreaded netisr patch? > To: barney_cordoba@yahoo.com > Cc: "Ivan Voras" , "Robert Watson" , freebsd-net@freebsd.org > Date: Sunday, April 5, 2009, 5:24 PM > > Date: Sun, 5 Apr 2009 10:25:41 -0700 (PDT) > > From: Barney Cordoba > > Sender: owner-freebsd-net@freebsd.org > > > > > > As an aside, why is Kip doing development on a Chelsio > card rather > > than a more mainstream product such as Intel or > Broadcom that would > > generate more widespread interest? > > Because Chelsio pays him better than the makers of the > "more mainstream" > products. 
And, at 10GE, Chelsio and Myricom seem to have > stronger > products than others. (Just my opinion and not that of The > US Dept. of > Energy, The university of California, or Lawrence Berkeley > National > Labs.) Sadly that's the small picture view that has plagued FreeBSD for the longest time. The bigger picture is that big OEMs aren't going to use Chelsio cards, and big OEMs running FreeBSD instead of Linux mean more testers, more hardware, more code give-backs and more money for the project. You don't really know how good or bad Intel or Broadcom is because you don't have good drivers for the cards. Unfortunately Intel does things ass-backwards, by putting out crap "sample" drivers that make their cards look like garbage. Maybe they are garbage, but you'd think they'd be a bit smarter. They can certainly afford more than Chelsio. Barney From sthaug at nethelp.no Sun Apr 5 14:37:28 2009 From: sthaug at nethelp.no (sthaug@nethelp.no) Date: Sun Apr 5 14:37:36 2009 Subject: IPv6 window scaling factor always 1 on initial SYN Message-ID: <20090405.231044.74688369.sthaug@nethelp.no> On 7-STABLE, with kern.ipc.maxsockbuf=2621440, both sides set a window scaling factor of 6 (i.e. SYN wscale 6, SYN-ACK wscale 6) using IPv4. With the same value of kern.ipc.maxsockbuf, using IPv6, the side which sends the initial SYN sets a window scaling factor of only 1, while the other side sets a scaling factor of 6 in the SYN-ACK. This will obviously limit throughput in many cases. In both cases net.inet.tcp.rfc1323=1. Anybody know why IPv6 behaves differently here?
tcpdump example: 22:20:37.282415 IP 193.75.4.50.53981 > 193.75.110.66.5555: S 1580765626:1580765626(0) win 65535 22:20:37.282442 IP 193.75.110.66.5555 > 193.75.4.50.53981: S 1408884711:1408884711(0) ack 1580765627 win 65535 22:21:49.749586 IP6 2001:8c0:9a00:1::2.53983 > 2001:8c0:8500:1::2.5555: S 565631163:565631163(0) win 65535 22:21:49.749633 IP6 2001:8c0:8500:1::2.5555 > 2001:8c0:9a00:1::2.53983: S 627173961:627173961(0) ack 565631164 win 65535 References: <20090405.231044.74688369.sthaug@nethelp.no> Message-ID: <20090405214757.E15361@maildrop.int.zabbadoz.net> On Sun, 5 Apr 2009, sthaug@nethelp.no wrote: > On 7-STABLE, with kern.ipc.maxsockbuf=2621440, both sides set a window > scaling factor of 6 (i.e. SYN wscale 6, SYN-ACK wscale 6) using IPv4. > > With the same value of kern.ipc.maxsockbuf, using IPv6, the side which > sends the initial SYN sets a window scaling factor of only 1, while > the other side sets a scaling factor of 6 in the SYN-ACK. This will > obviously limit throughput in many cases. > > In both cases net.inet.tcp.rfc1323=1. > > Anybody know why IPv6 behaves differently here? > > tcpdump example: > > 22:20:37.282415 IP 193.75.4.50.53981 > 193.75.110.66.5555: S 1580765626:1580765626(0) win 65535 > 22:20:37.282442 IP 193.75.110.66.5555 > 193.75.4.50.53981: S 1408884711:1408884711(0) ack 1580765627 win 65535 > > 22:21:49.749586 IP6 2001:8c0:9a00:1::2.53983 > 2001:8c0:8500:1::2.5555: S 565631163:565631163(0) win 65535 > 22:21:49.749633 IP6 2001:8c0:8500:1::2.5555 > 2001:8c0:9a00:1::2.53983: S 627173961:627173961(0) ack 565631164 win 65535 I think the answer to that is in sys/netinet/tcp_usrreq.c in the functions: tcp_connect 1106 /* 1107 * Compute window scaling to request: 1108 * Scale to fit into sweet spot. See tcp_syncache.c. 1109 * XXX: This should move to tcp_output(). 1110 */ 1111 while (tp->request_r_scale < TCP_MAX_WINSHIFT && 1112 (TCP_MAXWIN << tp->request_r_scale) < sb_max) ^^^^^^^^^^^ 1113 tp->request_r_scale++; and tcp6_connect 1174 /* Compute window scaling to request. */ 1175 while (tp->request_r_scale < TCP_MAX_WINSHIFT && 1176 (TCP_MAXWIN << tp->request_r_scale) < so->so_rcv.sb_hiwat) ^^^^^^^^^^^ 1177 tp->request_r_scale++; I'll have to check why they are un-equal... /bz -- Bjoern A.
Zeeb The greatest risk is not taking one. From bzeeb-lists at lists.zabbadoz.net Sun Apr 5 15:05:07 2009 From: bzeeb-lists at lists.zabbadoz.net (Bjoern A. Zeeb) Date: Sun Apr 5 15:05:14 2009 Subject: IPv6 window scaling factor always 1 on initial SYN In-Reply-To: <20090405214757.E15361@maildrop.int.zabbadoz.net> References: <20090405.231044.74688369.sthaug@nethelp.no> <20090405214757.E15361@maildrop.int.zabbadoz.net> Message-ID: <20090405215842.C15361@maildrop.int.zabbadoz.net> On Sun, 5 Apr 2009, Bjoern A. Zeeb wrote: > On Sun, 5 Apr 2009, sthaug@nethelp.no wrote: > >> On 7-STABLE, with kern.ipc.maxsockbuf=2621440, both sides set a window >> scaling factor of 6 (i.e. SYN wscale 6, SYN-ACK wscale 6) using IPv4. >> >> With the same value of kern.ipc.maxsockbuf, using IPv6, the side which >> sends the initial SYN sets a window scaling factor of only 1, while >> the other side sets a scaling factor of 6 in the SYN-ACK. This will >> obviously limit throughput in many cases. >> >> In both cases net.inet.tcp.rfc1323=1. >> >> Anybody know why IPv6 behaves differently here? >> >> tcpdump example: >> >> 22:20:37.282415 IP 193.75.4.50.53981 > 193.75.110.66.5555: S >> 1580765626:1580765626(0) win 65535 > 661320721 0> >> 22:20:37.282442 IP 193.75.110.66.5555 > 193.75.4.50.53981: S >> 1408884711:1408884711(0) ack 1580765627 win 65535 > 6,sackOK,timestamp 1581013561 661320721> >> >> 22:21:49.749586 IP6 2001:8c0:9a00:1::2.53983 > 2001:8c0:8500:1::2.5555: S >> 565631163:565631163(0) win 65535 > 661393190 0> >> 22:21:49.749633 IP6 2001:8c0:8500:1::2.5555 > 2001:8c0:9a00:1::2.53983: S >> 627173961:627173961(0) ack 565631164 win 65535 > 6,sackOK,timestamp 8 > > I think the answer to that is in sys/netinet/tcp_usrreq.c in the > functions: > tcp_connect > > 1106 /* > 1107 * Compute window scaling to request: > 1108 * Scale to fit into sweet spot. See tcp_syncache.c. > 1109 * XXX: This should move to tcp_output().
> 1110 */ > 1111 while (tp->request_r_scale < TCP_MAX_WINSHIFT && > 1112 (TCP_MAXWIN << tp->request_r_scale) < sb_max) > > ^^^^^^^^^^^ > > 1113 tp->request_r_scale++; > > > and tcp6_connect > > 1174 /* Compute window scaling to request. */ > 1175 while (tp->request_r_scale < TCP_MAX_WINSHIFT && > 1176 (TCP_MAXWIN << tp->request_r_scale) < so->so_rcv.sb_hiwat) > > ^^^^^^^^^^^ > > 1177 tp->request_r_scale++; > > > I'll have to check why they are unequal... Ok, both versions had: < so->so_rcv.sb_hiwat) http://svn.freebsd.org/viewvc/base?view=revision&revision=166403 changed it for IPv4 the first time, http://svn.freebsd.org/viewvc/base?view=revision&revision=172795 changed it a second time for IPv4. No one changed the IPv6 version. The syncache already seems to do it for both v4/v6 (common code). Can you try changing it to < sb_max) for IPv6 as well and see if things work (better) for you? /bz -- Bjoern A. Zeeb The greatest risk is not taking one. From rwatson at FreeBSD.org Sun Apr 5 15:18:00 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sun Apr 5 15:18:06 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: References: Message-ID: On Sun, 5 Apr 2009, Ivan Voras wrote: >> The argument is not that they are slower (although they probably are a bit >> slower), rather that they introduce serialization bottlenecks by requiring >> synchronization between CPUs in order to distribute the work. Certainly >> some of the scalability issues in the stack are not a result of that, but a >> good number are. > > I'd like to understand more. If (in netisr) I have an mbuf with headers, is > this data already transferred from the card or is it magically "not here > yet"? A lot depends on the details of the card and driver. The driver will take cache misses on the descriptor ring entry, if it's not already in cache, and the link layer will take a cache miss on the front of the ethernet frame in the cluster pointed to by the mbuf header as part of its demux.
What happens next depends on your dispatch model and cache line size. Let's make a few simplifying assumptions that are mostly true:

- The driver associates a single cluster with each receive ring entry for each packet to be stored in, and the cluster is cacheline-aligned. No header splitting is enabled.
- Standard ethernet encapsulation of IP is used, without additional VLAN headers or other encapsulation, etc. There are no IP options.
- We don't need to validate any checksums because the hardware has done it for us, so no need to take cache misses on data that doesn't matter until we reach higher layers.

In the device driver/ithread code, we'll now proceed to take some cache misses, unless we're pretty lucky:

(1) The descriptor ring entry
(2) The mbuf packet header
(3) The first cache line in the cluster

This is sufficient to figure out what protocol we're going to dispatch to, and depending on dispatch model, we now either enqueue the packet for delivery to a netisr, or we directly dispatch the handler for IP. If the packet is processed on the current CPU and we're direct dispatching, or if we've dispatched to a netisr on the same CPU and we're quite lucky, the mbuf packet header and front of the cluster will be in the cache. However, what happens next depends on the cache fetch and line size. If things happen in 32-byte cache lines or smaller, we cache miss on the end of the IP header, because the last two bytes of the destination IP address start at offset 32 into the cluster. If we have 64-byte fetching and line size, things go better because both the full IP and TCP headers should be in that first cache line. One big advantage to direct dispatch is that it maximizes the chances that we don't blow out the low-level CPU caches between link-layer and IP-layer processing, meaning that we might actually get through all the IP and TCP headers without a cache miss on a 64-byte line size.
If we netisr dispatch to another CPU without a shared cache, or we netisr dispatch to the current CPU but there's a scheduling delay, other packets queued first, etc, we'll take a number of the same cache misses over again as things get pulled into the right cache. This presents a strong cache motivation to keep a packet "on" a CPU and even in the same thread once you've started processing it. If you have to enqueue, you take locks, take a context switch, deal with the fact that LRU on cache lines isn't going to like your queue depth, and potentially pay a number of additional cache misses on the same data. There are also some other good reasons to use direct dispatch, such as avoiding doing work on packets that will later be dropped if the netisr queue overflows. This is why we direct dispatch by default, and why this is quite a good strategy for multiple input queue network cards, where it also buys us parallelism. Note that if the flow RSS hash is in the same cache line as the rest of the receive descriptor ring entry, you may be able to avoid the cache miss on the cluster and simply redirect it to another CPU's netisr without ever reading packet data, which avoids at least one and possibly two cache misses, but also means that you have to run the link layer in the remote netisr, rather than locally in the ithread. > In the first case, the package reception code path is not changed until it's > queued on a thread, on which it's handled in the future (or is the influence > of "other" data like timers and internal TCP reassembly buffers so large?). > In the second case, why? The good news about TCP reassembly is that we don't have to look at the data, only mbuf headers and reassembly buffer entries, so with any luck we've avoided actually taking a cache miss on the data. If things go well, we can avoid looking at anything but mbuf and packet headers until the socket copies out, but I'm not sure how well we do that in practice. 
> As the card and the OS can already process many packets per second for > something fairly complex as routing (http://www.tancsa.com/blast.html), and > TCP chokes swi:net at 100% of a core, isn't this indication there's > certainly more space for improvement even with a single-queue old-fashioned > NICs? Maybe. It depends on the relative costs of local processing vs redistributing the work, which involves schedulers, IPIs, additional cache misses, lock contention, and so on. This means there's a period where it can't possibly be a win, and then at some point it's a win as long as the stack scales. This is essentially the usual trade-off in using threads and parallelism: does the benefit of multiple parallel execution units make up for the overheads of synchronization and data migration? There are some previous e-mail threads where people have observed that for some workloads, switching to netisr wins over direct dispatch. For example, if you have a number of cores and are doing firewall processing, offloading work to the netisr from the input ithread may improve performance. However, this appears not to be the common case for end-host workloads on the hardware we mostly target, and this is increasingly true as multiple input queues come into play, as the card itself will allow us to use multiple CPUs without any interactions between the CPUs. This isn't to say that work redistribution using a netisr-like scheme isn't a good idea: in a world where CPU threads are weak compared to the wire workflow, and there's cache locality across threads on the same core, or NUMA is present, there may be a potential for a big win when available work significantly exceeds what a single CPU thread/core can handle. In that case, we want to place the work as close as possible to take advantage of shared caches or the memory being local to the CPU thread/core doing the deferred work. 
FYI, the localhost case is a bit weird -- I think we have some scheduling issues that are causing loopback netisr stuff to be pessimally scheduled. Here are some suggestions for things to try and see if they help, though:

- Comment out all ifnet, IP, and TCP global statistics in your local stack -- especially look for things like tcpstat.whatever++;.
- Use cpuset to pin ithreads, the netisr, and whatever else, to specific cores so that they don't migrate, and if your system uses HTT, experiment with pinning the ithread and the netisr on different threads on the same core, or at least, different cores on the same die.
- Experiment with using just the source IP, the source + destination IP, and both IPs plus TCP ports in your hash.
- If your card supports RSS, pass the flowid up the stack in the mbuf packet header flowid field, and use that instead of the hash for work placement.
- If you're doing pure PPS tests with UDP (or the like), and your test can tolerate disordering, try hashing based on the mbuf header address or something else that will distribute the work but not take a cache miss.
- If you have a flowid or the above disordered condition applies, try shifting the link layer dispatch to the netisr, rather than doing the demux in the ithread, as that will avoid cache misses in the ithread and do all the demux in the netisr.

Robert N M Watson Computer Laboratory University of Cambridge From ivoras at freebsd.org Sun Apr 5 15:48:36 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Sun Apr 5 15:48:44 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: References: Message-ID: Thanks for the ideas, I will try some of them. But I'd also like some more clarifications: Robert Watson wrote: > On Sun, 5 Apr 2009, Ivan Voras wrote: >> I'd like to understand more. If (in netisr) I have an mbuf with >> headers, is this data already transferred from the card or is it >> magically "not here yet"? > > A lot depends on the details of the card and driver.
The driver will > take cache misses on the descriptor ring entry, if it's not already in > cache, and the link layer will take a cache miss on the front of the > ethernet frame in the cluster pointed to by the mbuf header as part of > its demux. What happens next depends on your dispatch model and cache > line size. Let's make a few simplifying assumptions that are mostly true: So, an mbuf can reference data not yet copied from the NIC hardware? I'm specifically trying to understand what m_pullup() does. >> As the card and the OS can already process many packets per second for >> something fairly complex as routing >> (http://www.tancsa.com/blast.html), and TCP chokes swi:net at 100% of >> a core, isn't this indication there's certainly more space for >> improvement even with a single-queue old-fashioned NICs? > > Maybe. It depends on the relative costs of local processing vs > redistributing the work, which involves schedulers, IPIs, additional > cache misses, lock contention, and so on. This means there's a period > where it can't possibly be a win, and then at some point it's a win as > long as the stack scales. This is essentially the usual trade-off in > using threads and parallelism: does the benefit of multiple parallel > execution units make up for the overheads of synchronization and data > migration? Do you have any idea at all why I'm seeing the weird difference between netstat packets per second (250,000) and my application's TCP performance (< 1,000 pps)? Summary: each packet is guaranteed to be a whole message causing a transaction in the application - without the changes I see pps almost identical to tps. Even if the source of netstat statistics somehow manages to count packets multiple times (I don't see how that can happen), no relation can describe differences this huge. It almost looks like something in the upper layers is discarding packets (also not likely: TCP timeouts would occur and the application wouldn't be able to push 250,000 pps) - but what?
Where to look? > FYI, the localhost case is a bit weird -- I think we have some > scheduling issues that are causing loopback netisr stuff to be > pessimally scheduled. Here are some suggestions for things to try and > see if they help, though: > > - Comment out all ifnet, IP, and TCP global statistics in your local > stack -- > especially look for things tcpstat.whatever++;. You mean for the general code? I purposely don't lock my statistics variables because I'm not that interested in exact numbers (orders of magnitude are relevant). As far as I understand, unlocked "x++" should be trivially fast in this case? > - Use cpuset to pin ithreads, the netisr, and whatever else, to specific > cores > so that they don't migrate, and if your system uses HTT, experiment with > pinning the ithread and the netisr on different threads on the same > core, or > at least, different cores on the same die. I'm using em hardware; I still think there's a possibility I'm fighting the driver in some cases but this has priority #2. > - Experiment with using just the source IP, the source + destination IP, > and > both IPs plus TCP ports in your hash. Ok. Currently I'm using ip1+ip2+port1+port2. > - If your card supports RSS, pass the flowid up the stack in the mbuf > packet > header flowid field, and use that instead of the hash for work placement. Don't know about em. Don't really want to touch it if I don't have to :) -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 258 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090405/1d1e5a30/signature.pgp From linimon at FreeBSD.org Sun Apr 5 23:24:52 2009 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Sun Apr 5 23:24:59 2009 Subject: kern/133218: [carp] [hang] use of carp(4) causes system to freeze Message-ID: <200904060624.n366Oq76045363@freefall.freebsd.org> Synopsis: [carp] [hang] use of carp(4) causes system to freeze Responsible-Changed-From-To: freebsd-i386->freebsd-net Responsible-Changed-By: linimon Responsible-Changed-When: Mon Apr 6 06:24:37 UTC 2009 Responsible-Changed-Why: This does not sound i386-specific. http://www.freebsd.org/cgi/query-pr.cgi?pr=133218 From dfilter at FreeBSD.ORG Mon Apr 6 03:10:03 2009 From: dfilter at FreeBSD.ORG (dfilter service) Date: Mon Apr 6 03:10:14 2009 Subject: bin/131365: commit references a PR Message-ID: <200904061010.n36AA3ZX076019@freefall.freebsd.org> The following reply was made to PR bin/131365; it has been noted by GNATS. From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: bin/131365: commit references a PR Date: Mon, 6 Apr 2009 10:09:37 +0000 (UTC) Author: rrs Date: Mon Apr 6 10:09:20 2009 New Revision: 190758 URL: http://svn.freebsd.org/changeset/base/190758 Log: Class based addressing went out in the early 90's. Basically if a entry is not route add -net xxx/bits then we should use the addr (xxx) to establish the number of bits by looking at the first non-zero bit. So if we enter route add -net 10.1.1.0 10.1.3.5 this is the same as doing route add -net 10.1.1.0/24 Since the 8th bit (zero counting) is set to 1 we set bits to 32-8. 
Users can of course still use the /x to change this behavior or in cases where the network is in the trailing part of the address, a "netmask" argument can be supplied to override what is established from the interpretation of the address itself. e.g: route add -net 10.1.1.8 -netmask 0xff00ffff should overide and place the proper CIDR mask in place. PR: 131365 MFC after: 1 week Modified: head/sbin/route/route.c Modified: head/sbin/route/route.c ============================================================================== --- head/sbin/route/route.c Mon Apr 6 07:13:26 2009 (r190757) +++ head/sbin/route/route.c Mon Apr 6 10:09:20 2009 (r190758) @@ -713,7 +713,7 @@ newroute(argc, argv) #ifdef INET6 if (af == AF_INET6) { rtm_addrs &= ~RTA_NETMASK; - memset((void *)&so_mask, 0, sizeof(so_mask)); + memset((void *)&so_mask, 0, sizeof(so_mask)); } #endif } @@ -803,21 +803,22 @@ inet_makenetandmask(net, sin, bits) addr = net << IN_CLASSC_NSHIFT; else addr = net; - - if (bits != 0) - mask = 0xffffffff << (32 - bits); - else if (net == 0) - mask = 0; - else if (IN_CLASSA(addr)) - mask = IN_CLASSA_NET; - else if (IN_CLASSB(addr)) - mask = IN_CLASSB_NET; - else if (IN_CLASSC(addr)) - mask = IN_CLASSC_NET; - else if (IN_MULTICAST(addr)) - mask = IN_CLASSD_NET; - else - mask = 0xffffffff; + /* + * If no /xx was specified we must cacluate the + * CIDR address. 
+ */ + if ((bits == 0) && (addr != 0)) { + int i, j; + for(i=0,j=1; i<32; i++) { + if (addr & j) { + break; + } + j <<= 1; + } + /* i holds the first non zero bit */ + bits = 32 - i; + } + mask = 0xffffffff << (32 - bits); sin->sin_addr.s_addr = htonl(addr); sin = &so_mask.sin; _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From sthaug at nethelp.no Mon Apr 6 03:20:02 2009 From: sthaug at nethelp.no (sthaug@nethelp.no) Date: Mon Apr 6 03:20:09 2009 Subject: IPv6 window scaling factor always 1 on initial SYN In-Reply-To: <20090405215842.C15361@maildrop.int.zabbadoz.net> References: <20090405.231044.74688369.sthaug@nethelp.no> <20090405214757.E15361@maildrop.int.zabbadoz.net> <20090405215842.C15361@maildrop.int.zabbadoz.net> Message-ID: <20090406.121959.74751582.sthaug@nethelp.no> > Ok, both versions had: < so->so_rcv.sb_hiwat) > > http://svn.freebsd.org/viewvc/base?view=revision&revision=166403 > > changed it for IPv4 the first time, > > http://svn.freebsd.org/viewvc/base?view=revision&revision=172795 > > changed it a second time for IPv4. > > Noone changed the IPv6 version. > > The syncache already seems to do it for both v4/v6 (common code). > > Can you try changing it to < sb_max) for IPv6 as well and see if > things work (better) for you? I changed it, and that worked like a dream. Now I get basically the same throughput with IPv4 and IPv6. There are of course still issues like lots of IPv6 tunnels that add extra latency - but that's not the fault of FreeBSD. Anyway, thanks for your work. Below is a context diff (against 7-STABLE cvsupped last night). Do we need a PR to get this into FreeBSD? 
Steinar Haug, Nethelp consulting, sthaug@nethelp.no ---------------------------------------------------------------------- *** tcp_usrreq.c.orig Sun Apr 5 22:51:49 2009 --- tcp_usrreq.c Mon Apr 6 11:15:11 2009 *************** *** 1153,1159 **** /* Compute window scaling to request. */ while (tp->request_r_scale < TCP_MAX_WINSHIFT && ! (TCP_MAXWIN << tp->request_r_scale) < so->so_rcv.sb_hiwat) tp->request_r_scale++; soisconnecting(so); --- 1153,1159 ---- /* Compute window scaling to request. */ while (tp->request_r_scale < TCP_MAX_WINSHIFT && ! (TCP_MAXWIN << tp->request_r_scale) < sb_max) tp->request_r_scale++; soisconnecting(so); From barney_cordoba at yahoo.com Mon Apr 6 03:37:21 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Mon Apr 6 03:37:28 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: Message-ID: <86599.63596.qm@web63904.mail.re1.yahoo.com> --- On Sun, 4/5/09, Robert Watson wrote: > From: Robert Watson > Subject: Re: Advice on a multithreaded netisr patch? > To: "Ivan Voras" > Cc: freebsd-net@freebsd.org > Date: Sunday, April 5, 2009, 6:17 PM > On Sun, 5 Apr 2009, Ivan Voras wrote: > > >> The argument is not that they are slower (although > they probably are a bit slower), rather that they introduce > serialization bottlenecks by requiring synchronization > between CPUs in order to distribute the work. Certainly some > of the scalability issues in the stack are not a result of > that, but a good number are. > > > > I'd like to understand more. If (in netisr) I have > a mbuf with headers, is this data already transfered from > the card or is it magically "not here yet"? > > A lot depends on the details of the card and driver. The > driver will take cache misses on the descriptor ring entry, > if it's not already in cache, and the link layer will > take a cache miss on the front of the ethernet frame in the > cluster pointed to by the mbuf header as part of its demux. 
> What happens next depends on your dispatch model and cache > line size. Let's make a few simplifying assumptions > that are mostly true: > > - The driver associats a single cluster with each receive > ring entry for each > packet to be stored in, and the cluster is > cacheline-aligned. No header > splitting is enabled. > > - Standard ethernet encapsulation of IP is used, without > additional VLAN > headers or other encapsulation, etc. There are no IP > options. > > - We don't need to validate any checksums because the > hardware has done it for > us, so no need to take cache misses on data that > doesn't matter until we > reach higher layers. > > In the device driver/ithread code, we'll now proceed to > take some cache misses assuming we're not pretty lucky: > > (1) The descriptor ring entry > (2) The mbuf packet header > (3) The first cache line in the cluster > > This is sufficient to figure out what protocol we're > going to dispatch to, and depending on dispatch model, we > now either enqueue the packet for delivery to a netisr, or > we directly dispatch the handler for IP. > > If the packet is processed on the current CPU and we're > direct dispatching, or if we've dispatched to a netisr > on the same CPU and we're quite lucky, the mbuf packet > header and front of the cluster will be in the cache. > > However, what happens next depends on the cache fetch and > line size. If things happen in 32-byte cache lines or > smaller, we cache miss on the end of the IP header, because > the last two bytes of the destination IP address start at > offset 32 into the cluster. If we have 64-byte fetching and > line size, things go better because both the full IP and TCP > headers should be in that first cache line. 
> > One big advantage to direct dispatch is that it maximizes > the chances that we don't blow out the low-level CPU > caches between link-layer and IP-layer processing, meaning > that we might actually get through all the IP and TCP > headers without a cache miss on a 64-byte line size. If we > netisr dispatch to another CPU without a shared cache, or we > netisr dispatch to the current CPU but there's a > scheduling delay, other packets queued first, etc, we'll > take a number of the same cache misses over again as things > get pulled into the right cache. > > This presents a strong cache motivation to keep a packet > "on" a CPU and even in the same thread once > you've started processing it. If you have to enqueue, > you take locks, take a context switch, deal with the fact > that LRU on cache lines isn't going to like your queue > depth, and potentially pay a number of additional cache > misses on the same data. There are also some other good > reasons to use direct dispatch, such as avoiding doing work > on packets that will later be dropped if the netisr queue > overflows. > > This is why we direct dispatch by default, and why this is > quite a good strategy for multiple input queue network > cards, where it also buys us parallelism. > > Note that if the flow RSS hash is in the same cache line as > the rest of the receive descriptor ring entry, you may be > able to avoid the cache miss on the cluster and simply > redirect it to another CPU's netisr without ever reading > packet data, which avoids at least one and possibly two > cache misses, but also means that you have to run the link > layer in the remote netisr, rather than locally in the > ithread. > > > In the first case, the package reception code path is > not changed until it's queued on a thread, on which > it's handled in the future (or is the influence of > "other" data like timers and internal TCP > reassembly buffers so large?). In the second case, why? 
> > The good news about TCP reassembly is that we don't > have to look at the data, only mbuf headers and reassembly > buffer entries, so with any luck we've avoided actually > taking a cache miss on the data. If things go well, we can > avoid looking at anything but mbuf and packet headers until > the socket copies out, but I'm not sure how well we do > that in practice. > > > As the card and the OS can already process many > packets per second for something fairly complex as routing > (http://www.tancsa.com/blast.html), and TCP chokes swi:net > at 100% of a core, isn't this indication there's > certainly more space for improvement even with a > single-queue old-fashioned NICs? > > Maybe. It depends on the relative costs of local > processing vs redistributing the work, which involves > schedulers, IPIs, additional cache misses, lock contention, > and so on. This means there's a period where it > can't possibly be a win, and then at some point it's > a win as long as the stack scales. This is essentially the > usual trade-off in using threads and parallelism: does the > benefit of multiple parallel execution units make up for the > overheads of synchronization and data migration? > > There are some previous e-mail threads where people have > observed that for some workloads, switching to netisr wins > over direct dispatch. For example, if you have a number of > cores and are doing firewall processing, offloading work to > the netisr from the input ithread may improve performance. > However, this appears not to be the common case for end-host > workloads on the hardware we mostly target, and this is > increasingly true as multiple input queues come into play, > as the card itself will allow us to use multiple CPUs > without any interactions between the CPUs. 
> > This isn't to say that work redistribution using a > netisr-like scheme isn't a good idea: in a world where > CPU threads are weak compared to the wire workflow, and > there's cache locality across threads on the same core, > or NUMA is present, there may be a potential for a big win > when available work significantly exceeds what a single CPU > thread/core can handle. In that case, we want to place the > work as close as possible to take advantage of shared caches > or the memory being local to the CPU thread/core doing the > deferred work. > > FYI, the localhost case is a bit weird -- I think we have > some scheduling issues that are causing loopback netisr > stuff to be pessimally scheduled. Here are some suggestions > for things to try and see if they help, though: > > - Comment out all ifnet, IP, and TCP global statistics in > your local stack -- > especially look for things tcpstat.whatever++;. > > - Use cpuset to pin ithreads, the netisr, and whatever > else, to specific cores > so that they don't migrate, and if your system uses > HTT, experiment with > pinning the ithread and the netisr on different threads > on the same core, or > at least, different cores on the same die. > > - Experiment with using just the source IP, the source + > destination IP, and > both IPs plus TCP ports in your hash. > > - If your card supports RSS, pass the flowid up the stack > in the mbuf packet > header flowid field, and use that instead of the hash for > work placement. > > - If you're doing pure PPS tests with UDP (or the > like), and your test can > tolerate disordering, try hashing based on the mbuf > header address or > something else that will distribute the work but not take > a cache miss. > > - If you have a flowid or the above disordered condition > applies, try shifting > the link layer dispatch to the netisr, rather than doing > the demux in the > ithread, as that will avoid cache misses in the ithread > and do all the demux > in the netisr. 
> > Robert N M Watson > Computer Laboratory > University of Cambridge Is there a way to give a kernel thread exclusive use of a core? I know you can pin a kernel thread with sched_bind(), but is there a way to keep other threads from using the core? On an 8 core system it almost seems that the randomness of more cores is a negative in some situations. Also, I've noticed that calling sched_bind() during bootup is a bad thing in that it locks the system. I'm not certain but I suspect its the thread_lock that is the culprit. Is there a clean way to determine that its safe to lock curthread and do a cpu bind? Barney From bugmaster at FreeBSD.org Mon Apr 6 04:07:00 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Apr 6 04:08:36 2009 Subject: Current problem reports assigned to freebsd-net@FreeBSD.org Message-ID: <200904061106.n36B6wW0061943@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description -------------------------------------------------------------------------------- o kern/133235 net [netinet] [patch] Process SIOCDLIFADDR command incorre o kern/133218 net [carp] [hang] use of carp(4) causes system to freeze o kern/133060 net [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs o kern/132991 net [bge] if_bge low performance problem o kern/132984 net [netgraph] swi1: net 100% cpu usage f bin/132911 net ip6fw(8): argument type of fill_icmptypes is wrong and o kern/132889 net [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d o kern/132885 net [wlan] 802.1x broken after SVN rev 189592 o conf/132851 net [fib] [patch] allow to setup fib for service running f o bin/132798 net [patch] ggatec(8): ggated/ggatec connection slowdown p o kern/132734 net [ifmib] [panic] panic in net/if_mib.c o kern/132722 net [ath] Wifi ath0 associates fine with AP, but DHCP or I o kern/132715 net [lagg] [panic] Panic when creating vlan's on lagg inte o kern/132705 net [libwrap] [patch] libwrap - infinite loop if hosts.all o kern/132672 net [ndis] [panic] ndis with rt2860.sys causes kernel pani o kern/132669 net [xl] 3c905-TX send DUP! 
in reply on ping (sometime) o kern/132625 net [iwn] iwn drivers don't support setting country o kern/132554 net [ipl] There is no ippool start script/ipfilter magic t o kern/132354 net [nat] Getting some packages to ipnat(8) causes crash o kern/132285 net [carp] alias gives incorrect hash in dmesg o kern/132277 net [crypto] [ipsec] poor performance using cryptodevice f o conf/132179 net [patch] /etc/network.subr: ipv6 rtsol on incorrect wla o kern/132107 net [carp] carp(4) advskew setting ignored when carp IP us o kern/131781 net [ndis] ndis keeps dropping the link o kern/131776 net [wi] driver fails to init o kern/131753 net [altq] [panic] kernel panic in hfsc_dequeue o bin/131567 net [socket] [patch] Update for regression/sockets/unix_cm o kern/131549 net ifconfig(8) can't clear 'monitor' mode on the wireless o kern/131536 net [netinet] [patch] kernel does allow manipulation of su o bin/131365 net route(8): route add changes interpretation of network o kern/131310 net [netgraph] [panic] 7.1 panics with mpd netgraph interf o kern/131162 net [ath] Atheros driver bugginess and kernel crashes o kern/131153 net [iwi] iwi doesn't see a wireless network f kern/131087 net [ipw] [panic] ipw / iwi - no sent/received packets; iw f kern/130820 net [ndis] wpa_supplicant(8) returns 'no space on device' o kern/130628 net [nfs] NFS / rpc.lockd deadlock on 7.1-R o conf/130555 net [rc.d] [patch] No good way to set ipfilter variables a o kern/130525 net [ndis] [panic] 64 bit ar5008 ndisgen-erated driver cau o kern/130311 net [wlan_xauth] [panic] hostapd restart causing kernel pa o bin/130159 net [patch] ppp(8) fails to correctly set routes o kern/130109 net [ipfw] Can not set fib for packets originated from loc f kern/130059 net [panic] Leaking 50k mbufs/hour o kern/129750 net [ath] Atheros AR5006 exits on "cannot map register spa f kern/129719 net [nfs] [panic] Panic during shutdown, tcp_ctloutput: in o kern/129580 net [ndis] Netgear WG311v3 (ndis) causes kenel trap at boo o 
kern/129517 net [ipsec] [panic] double fault / stack overflow o kern/129508 net [carp] [panic] Kernel panic with EtherIP (may be relat o kern/129352 net [xl] [patch] xl0 watchdog timeout o kern/129219 net [ppp] Kernel panic when using kernel mode ppp o kern/129197 net [panic] 7.0 IP stack related panic o kern/129135 net [vge] vge driver on a VIA mini-ITX not working o bin/128954 net ifconfig(8) deletes valid routes o kern/128917 net [wpi] [panic] if_wpi and wpa+tkip causing kernel panic o kern/128884 net [msk] if_msk page fault while in kernel mode o kern/128840 net [igb] page fault under load with igb/LRO o bin/128602 net [an] wpa_supplicant(8) crashes with an(4) o kern/128598 net [bluetooth] WARNING: attempt to net_add_domain(bluetoo o kern/128448 net [nfs] 6.4-RC1 Boot Fails if NFS Hostname cannot be res o conf/128334 net [request] use wpa_cli in the "WPA DHCP" situation o bin/128295 net [patch] ifconfig(8) does not print TOE4 or TOE6 capabi o bin/128001 net wpa_supplicant(8), wlan(4), and wi(4) issues o kern/127928 net [tcp] [patch] TCP bandwidth gets squeezed every time t o kern/127834 net [ixgbe] [patch] wrong error counting o kern/127826 net [iwi] iwi0 driver has reduced performance and connecti o kern/127815 net [gif] [patch] if_gif does not set vlan attributes from o kern/127724 net [rtalloc] rtfree: 0xc5a8f870 has 1 refs f bin/127719 net [arp] arp: Segmentation fault (core dumped) s kern/127587 net [bge] [request] if_bge(4) doesn't support BCM576X fami f kern/127528 net [icmp]: icmp socket receives icmp replies not owned by o bin/127192 net routed(8) removes the secondary alias IP of interface f kern/127145 net [wi]: prism (wi) driver crash at bigger traffic o kern/127102 net [wpi] Intel 3945ABG low throughput o kern/127057 net [udp] Unable to send UDP packet via IPv6 socket to IPv o kern/127050 net [carp] ipv6 does not work on carp interfaces [regressi o kern/126945 net [carp] CARP interface destruction with ifconfig destro o kern/126924 net [an] [patch] 
printf -> device_printf and simplify prob o kern/126895 net [patch] [ral] Add antenna selection (marked as TBD) o kern/126874 net [vlan]: Zebra problem if ifconfig vlanX destroy o bin/126822 net wpa_supplicant(8): WPA PSK does not work in adhoc mode o kern/126714 net [carp] CARP interface renaming makes system no longer o kern/126695 net rtfree messages and network disruption upon use of if_ o kern/126688 net [ixgbe] [patch] 1.4.7 ixgbe driver panic with 4GB and o kern/126475 net [ath] [panic] ath pcmcia card inevitably panics under o kern/126339 net [ipw] ipw driver drops the connection o kern/126214 net [ath] txpower problem with Atheros wifi card o kern/126075 net [inet] [patch] internet control accesses beyond end of o bin/125922 net [patch] Deadlock in arp(8) o kern/125920 net [arp] Kernel Routing Table loses Ethernet Link status o kern/125845 net [netinet] [patch] tcp_lro_rx() should make use of hard o kern/125816 net [carp] [if_bridge] carp stuck in init when using bridg f kern/125502 net [ral] ifconfig ral0 scan produces no output unless in o kern/125258 net [socket] socket's SO_REUSEADDR option does not work o kern/125239 net [gre] kernel crash when using gre f kern/125195 net [fxp] fxp(4) driver failed to initialize device Intel o kern/124904 net [fxp] EEPROM corruption with Compaq NC3163 NIC o kern/124767 net [iwi] Wireless connection using iwi0 driver (Intel 220 o kern/124753 net [ieee80211] net80211 discards power-save queue packets o kern/124341 net [ral] promiscuous mode for wireless device ral0 looses o kern/124160 net [libc] connect(2) function loops indefinitely o kern/124127 net [msk] watchdog timeout (missed Tx interrupts) -- recov o kern/124021 net [ip6] [panic] page fault in nd6_output() o kern/123968 net [rum] [panic] rum driver causes kernel panic with WPA. 
p kern/123961 net [vr] [patch] Allow vr interface to handle vlans o kern/123892 net [tap] [patch] No buffer space available o kern/123890 net [ppp] [panic] crash & reboot on work with PPP low-spee o kern/123858 net [stf] [patch] stf not usable behind a NAT o kern/123796 net [ipf] FreeBSD 6.1+VPN+ipnat+ipf: port mapping does not o bin/123633 net ifconfig(8) doesn't set inet and ether address in one f kern/123617 net [tcp] breaking connection when client downloading file o kern/123603 net [tcp] tcp_do_segment and Received duplicate SYN o kern/123559 net [iwi] iwi periodically disassociates/associates [regre o bin/123465 net [ip6] route(8): route add -inet6 -interfac o kern/123463 net [ipsec] [panic] repeatable crash related to ipsec-tool o kern/123429 net [nfe] [hang] "ifconfig nfe up" causes a hard system lo o kern/123347 net [bge] bge1: watchdog timeout -- linkstate changed to D o conf/123330 net [nsswitch.conf] Enabling samba wins in nsswitch.conf c o kern/123256 net [wpi] panic: blockable sleep lock with wpi(4) f kern/123172 net [bce] Watchdog timeout problems with if_bce o kern/123160 net [ip] Panic and reboot at sysctl kern.polling.enable=0 o kern/122989 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/122954 net [lagg] IPv6 EUI64 incorrectly chosen for lagg devices o kern/122928 net [em] interface watchdog timeouts and stops receiving p f kern/122839 net [multicast] FreeBSD 7 multicast routing problem p kern/122794 net [lagg] Kernel panic after brings lagg(8) up if NICs ar o kern/122780 net [lagg] tcpdump on lagg interface during high pps wedge o kern/122772 net [em] em0 taskq panic, tcp reassembly bug causes radix o kern/122743 net [mbuf] [panic] vm_page_unwire: invalid wire count: 0 o kern/122697 net [ath] Atheros card is not well supported o kern/122685 net It is not visible passing packets in tcpdump(1) o kern/122551 net [bge] Broadcom 5715S no carrier on HP BL460c blade usi o kern/122319 net [wi] imposible to enable ad-hoc demo mode with Orinoco o 
kern/122290 net [netgraph] [panic] Netgraph related "kmem_map too smal f kern/122252 net [ipmi] [bge] IPMI problem with BCM5704 (does not work o kern/122195 net [ed] Alignment problems in if_ed o kern/122058 net [em] [panic] Panic on em1: taskq o kern/122033 net [ral] [lor] Lock order reversal in ral0 at bootup [reg o kern/121983 net [fxp] fxp0 MBUF and PAE o bin/121895 net [patch] rtsol(8)/rtsold(8) doesn't handle managed netw o kern/121872 net [wpi] driver fails to attach on a fujitsu-siemens s711 s kern/121774 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/121706 net [netinet] [patch] "rtfree: 0xc4383870 has 1 refs" emit o kern/121624 net [em] [regression] Intel em WOL fails after upgrade to o kern/121555 net [panic] Fatal trap 12: current process = 12 (swi1: net o kern/121443 net [gif] [lor] icmp6_input/nd6_lookup o kern/121437 net [vlan] Routing to layer-2 address does not work on VLA o bin/121359 net [patch] ppp(8): fix local stack overflow in ppp o kern/121298 net [em] [panic] Fatal trap 12: page fault while in kernel o kern/121257 net [tcp] TSO + natd -> slow outgoing tcp traffic o kern/121181 net [panic] Fatal trap 3: breakpoint instruction fault whi o kern/121080 net [bge] IPv6 NUD problem on multi address config on bge0 o kern/120966 net [rum] kernel panic with if_rum and WPA encryption p docs/120945 net [patch] ip6(4) man page lacks documentation for TCLASS o kern/120566 net [request]: ifconfig(8) make order of arguments more fr o kern/120304 net [netgraph] [patch] netgraph source assumes 32-bit time o kern/120266 net [udp] [panic] gnugk causes kernel panic when closing U o kern/120232 net [nfe] [patch] Bring in nfe(4) to RELENG_6 o kern/120130 net [carp] [panic] carp causes kernel panics in any conste o bin/120060 net routed(8) deletes link-level routes in the presence of o kern/119945 net [rum] [panic] rum device in hostap mode, cause kernel o kern/119791 net [nfs] UDP NFS mount of aliased IP addresses from a Sol o kern/119617 net [nfs] nfs 
error on wpa network when reseting/shutdown f kern/119516 net [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi o kern/119432 net [arp] route add -host -iface causes arp e o kern/119225 net [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr a bin/118987 net ifconfig(8): ifconfig -l (address_family) does not wor o sparc/118932 net [panic] 7.0-BETA4/sparc-64 kernel panic in rip_output a kern/118879 net [bge] [patch] bge has checksum problems on the 5703 ch o kern/118727 net [netgraph] [patch] [request] add new ng_pf module s kern/117717 net [panic] Kernel panic with Bittorrent client. o kern/117448 net [carp] 6.2 kernel crash [regression] o kern/117423 net [vlan] Duplicate IP on different interfaces o bin/117339 net [patch] route(8): loading routing management commands o kern/117271 net [tap] OpenVPN TAP uses 99% CPU on releng_6 when if_tap o kern/117043 net [em] Intel PWLA8492MT Dual-Port Network adapter EEPROM o kern/116837 net [tun] [panic] [patch] ifconfig tunX destroy: panic o kern/116747 net [ndis] FreeBSD 7.0-CURRENT crash with Dell TrueMobile o bin/116643 net [patch] [request] fstat(1): add INET/INET6 socket deta o kern/116328 net [bge]: Solid hang with bge interface o kern/116185 net [iwi] if_iwi driver leads system to reboot o kern/115239 net [ipnat] panic with 'kmem_map too small' using ipnat o kern/115019 net [netgraph] ng_ether upper hook packet flow stops on ad o kern/115002 net [wi] if_wi timeout. failed allocation (busy bit). 
ifco o kern/114915 net [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f f kern/114899 net [bge] bge0: watchdog timeout -- resetting o kern/114839 net [fxp] fxp looses ability to speak with traffic o kern/113895 net [xl] xl0 fails on 6.2-RELEASE but worked fine on 5.5-R o kern/112722 net [ipsec] [udp] IP v4 udp fragmented packet reject o kern/112686 net [patm] patm driver freezes System (FreeBSD 6.2-p4) i38 o kern/112570 net [bge] packet loss with bge driver on BCM5704 chipset o bin/112557 net [patch] ppp(8) lock file should not use symlink name o kern/112528 net [nfs] NFS over TCP under load hangs with "impossible p o kern/111457 net [ral] ral(4) freeze o kern/110140 net [ipw] ipw fails under load o kern/109733 net [bge] bge link state issues [regression] o kern/109470 net [wi] Orinoco Classic Gold PC Card Can't Channel Hop o kern/109308 net [pppd] [panic] Multiple panics kernel ppp suspected [r o kern/109251 net [re] [patch] if_re cardbus card won't attach o bin/108895 net pppd(8): PPPoE dead connections on 6.2 [regression] o kern/108542 net [bce] Huge network latencies with 6.2-RELEASE / STABLE o kern/107944 net [wi] [patch] Forget to unlock mutex-locks o kern/107850 net [bce] bce driver link negotiation is faulty o conf/107035 net [patch] bridge(8): bridge interface given in rc.conf n o kern/106438 net [ipf] ipfilter: keep state does not seem to allow repl o kern/106316 net [dummynet] dummynet with multipass ipfw drops packets o kern/106243 net [nve] double fault panic in if_nve.c on high loads o kern/105945 net Address can disappear from network interface s kern/105943 net Network stack may modify read-only mbuf chain copies o bin/105925 net problems with ifconfig(8) and vlan(4) [regression] o kern/105348 net [ath] ath device stopps TX o kern/104851 net [inet6] [patch] On link routes not configured when usi o kern/104751 net [netgraph] kernel panic, when getting info about my tr o kern/104485 net [bge] Broadcom BCM5704C: Intermittent on newer chip ve o 
kern/103191 net Unpredictable reboot o kern/103135 net [ipsec] ipsec with ipfw divert (not NAT) encodes a pac o conf/102502 net [netgraph] [patch] ifconfig name does't rename netgrap o kern/102035 net [plip] plip networking disables parallel port printing o kern/101948 net [ipf] [panic] Kernel Panic Trap No 12 Page Fault - cau o kern/100709 net [libc] getaddrinfo(3) should return TTL info o kern/100519 net [netisr] suggestion to fix suboptimal network polling o kern/98978 net [ipf] [patch] ipfilter drops OOW packets under 6.1-Rel o kern/98597 net [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu o bin/98218 net wpa_supplicant(8) blacklist not working f bin/97392 net ppp(8) hangs instead terminating o kern/97306 net [netgraph] NG_L2TP locks after connection with failed f kern/96268 net [socket] TCP socket performance drops by 3000% if pack o kern/96030 net [bfe] [patch] Install hangs with Broadcomm 440x NIC in o kern/95519 net [ral] ral0 could not map mbuf o kern/95288 net [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr o kern/95277 net [netinet] [patch] IP Encapsulation mask_match() return o kern/95267 net packet drops periodically appear s kern/94863 net [bge] [patch] hack to get bge(4) working on IBM e326m o kern/94162 net [bge] 6.x kenel stale with bge(4) o kern/93886 net [ath] Atheros/D-Link DWL-G650 long delay to associate f kern/93378 net [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo o kern/93019 net [ppp] ppp and tunX problems: no traffic after restarti o kern/92880 net [libc] [patch] almost rewritten inet_network(3) functi f kern/92552 net A serious bug in most network drivers from 5.X to 6.X s kern/92279 net [dc] Core faults everytime I reboot, possible NIC issu o kern/92090 net [bge] bge0: watchdog timeout -- resetting o kern/91859 net [ndis] if_ndis does not work with Asus WL-138 s kern/91777 net [ipf] [patch] wrong behaviour with skip rule inside an o kern/91594 net [em] FreeBSD > 5.4 w/ACPI fails to detect Intel Pro/10 o kern/91364 
net [ral] [wep] WF-511 RT2500 Card PCI and WEP o kern/91311 net [aue] aue interface hanging o kern/90890 net [vr] Problems with network: vr0: tx shutdown timeout s kern/90086 net [hang] 5.4p8 on supermicro P8SCT hangs during boot if f kern/88082 net [ath] [panic] cts protection for ath0 causes panic o kern/87521 net [ipf] [panic] using ipfilter "auth" keyword leads to k o kern/87506 net [vr] [patch] Fix alias support on vr interfaces o kern/87194 net [fxp] fxp(4) promiscuous mode seems to corrupt hw-csum s kern/86920 net [ndis] ifconfig: SIOCS80211: Invalid argument [regress o kern/86103 net [ipf] Illegal NAT Traversal in IPFilter o kern/85780 net 'panic: bogus refcnt 0' in routing/ipv6 o bin/85445 net ifconfig(8): deprecated keyword to ifconfig inoperativ o kern/85266 net [xe] [patch] xe(4) driver does not recognise Xircom XE o kern/84202 net [ed] [patch] Holtek HT80232 PCI NIC recognition on Fre o bin/82975 net route change does not parse classfull network as given o kern/82497 net [vge] vge(4) on AMD64 only works when loaded late, not f kern/81644 net [vge] vge(4) does not work properly when loaded as a K s kern/81147 net [net] [patch] em0 reinitialization while adding aliase o kern/80853 net [ed] [patch] add support for Compex RL2000/ISA in PnP o kern/79895 net [ipf] 5.4-RC2 breaks ipfilter NAT when using netgraph f kern/79262 net [dc] Adaptec ANA-6922 not fully supported o bin/79228 net [patch] extend arp(8) to be able to create blackhole r o kern/78090 net [ipf] ipf filtering on bridged packets doesn't work if p kern/77913 net [wi] [patch] Add the APDL-325 WLAN pccard to wi(4) o kern/77341 net [ip6] problems with IPV6 implementation o kern/77273 net [ipf] ipfilter breaks ipv6 statefull filtering on 5.3 s kern/77195 net [ipf] [patch] ipfilter ioctl SIOCGNATL does not match o kern/75873 net Usability problem with non-RFC-compliant IP spoof prot s kern/75407 net [an] an(4): no carrier after short time f kern/73538 net [bge] problem with the Broadcom BCM5788 
Gigabit Ethern o kern/71469 net default route to internet magically disappears with mu o kern/70904 net [ipf] ipfilter ipnat problem with h323 proxy support o kern/64556 net [sis] if_sis short cable fix problems with NetGear FA3 s kern/60293 net [patch] FreeBSD arp poison patch o kern/54383 net [nfs] [patch] NFS root configurations without dynamic f i386/45773 net [bge] Softboot causes autoconf failure on Broadcom 570 s bin/41647 net ifconfig(8) doesn't accept lladdr along with inet addr s kern/39937 net ipstealth issue a kern/38554 net [patch] changing interface ipaddress doesn't seem to w o kern/35442 net [sis] [patch] Problem transmitting runts in if_sis dri o kern/34665 net [ipf] [hang] ipfilter rcmd proxy "hangs". o kern/31647 net [libc] socket calls can return undocumented EINVAL o kern/30186 net [libc] getaddrinfo(3) does not handle incorrect servna o kern/27474 net [ipf] [ppp] Interactive use of user PPP and ipfilter c o conf/23063 net [arp] [patch] for static ARP tables in rc.network 287 problems total. From rwatson at FreeBSD.org Mon Apr 6 04:59:12 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Mon Apr 6 04:59:19 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: References: Message-ID: On Mon, 6 Apr 2009, Ivan Voras wrote: >>> I'd like to understand more. If (in netisr) I have a mbuf with headers, is >>> this data already transfered from the card or is it magically "not here >>> yet"? >> >> A lot depends on the details of the card and driver. The driver will take >> cache misses on the descriptor ring entry, if it's not already in cache, >> and the link layer will take a cache miss on the front of the ethernet >> frame in the cluster pointed to by the mbuf header as part of its demux. >> What happens next depends on your dispatch model and cache line size. >> Let's make a few simplifying assumptions that are mostly true: > > So, a mbuf can reference data not yet copied from the NIC hardware? 
I'm > specifically trying to understand what m_pullup() does.

I think we're talking slightly at cross purposes. There are two transfers of
interest:

(1) DMA of the packet data to main memory from the NIC
(2) Servicing of CPU cache misses to access data in main memory

By the time you receive an interrupt, the DMA is complete, so once you
believe a packet referenced by the descriptor ring is done, you don't have to
wait for DMA. However, the packet data is in main memory rather than your
CPU cache, so you'll need to take a cache miss in order to retrieve it. You
don't want to prefetch before you know the packet data is there, or you may
prefetch stale data from the previous packet sent or received from the
cluster.

m_pullup() has to do with mbuf chain memory contiguity during packet
processing. The usual usage is something along the following lines:

    struct whatever *w;

    m = m_pullup(m, sizeof(*w));
    if (m == NULL)
        return;
    w = mtod(m, struct whatever *);

m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
contiguously stored so that the cast of w to m's data will point at a
complete structure we can use to interpret packet data. In the common case
in the receipt path, m_pullup() should be a no-op, since almost all drivers
receive data in a single cluster. However, there are cases where it might
not happen, such as loopback traffic where unusual encapsulation is used,
leading to a call to M_PREPEND() that inserts a new mbuf on the front of the
chain, which is later m_defrag()'d leading to a higher level header crossing
a boundary or the like. This issue is almost entirely independent from
things like the cache line miss issue, unless you hit the uncommon case of
having to do work in m_pullup(), in which case life sucks.

It would be useful to use DTrace to profile a number of the workful m_foo()
functions to make sure we're not hitting them in normal workloads, btw.
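
To make the contiguity point concrete, here is a minimal userspace sketch of
the idea, not the kernel code. Everything in it is an assumption made for
illustration: struct toy_mbuf, toy_alloc(), toy_pullup(), toy_freem() and the
two demo functions are hypothetical names, and the fixed 64-byte buffer is far
simpler than a real mbuf. It only mirrors the behaviour described above: a
no-op when the leading bytes are already contiguous, a copy from later buffers
otherwise, and (as with m_pullup() itself) the chain is freed and NULL
returned when there is not enough data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an mbuf; this is NOT the kernel's struct mbuf. */
struct toy_mbuf {
    struct toy_mbuf *next;   /* next buffer in the chain */
    size_t len;              /* valid bytes in data[] */
    unsigned char data[64];  /* fixed-size storage, for simplicity */
};

/* Hypothetical header type, standing in for "struct whatever". */
struct whatever {
    uint16_t tag;
    uint16_t value;
};

/* Allocate one toy buffer holding a copy of len bytes from src. */
static struct toy_mbuf *toy_alloc(const void *src, size_t len)
{
    struct toy_mbuf *m;

    if (len > sizeof(m->data))
        return NULL;
    m = calloc(1, sizeof(*m));
    if (m == NULL)
        return NULL;
    memcpy(m->data, src, len);
    m->len = len;
    return m;
}

/* Free a whole chain; accepts NULL. */
static void toy_freem(struct toy_mbuf *m)
{
    while (m != NULL) {
        struct toy_mbuf *n = m->next;
        free(m);
        m = n;
    }
}

/*
 * Make the first `want` bytes of the chain contiguous in the head buffer.
 * Returns the head on success. On failure the chain is freed and NULL is
 * returned, so the caller must not touch the old pointer.
 */
static struct toy_mbuf *toy_pullup(struct toy_mbuf *m, size_t want)
{
    if (m == NULL)
        return NULL;
    if (want > sizeof(m->data)) {
        toy_freem(m);
        return NULL;
    }
    /* Common case: already contiguous, nothing to copy. */
    if (m->len >= want)
        return m;

    /* Uncommon case: pull bytes forward from the following buffers. */
    while (m->len < want && m->next != NULL) {
        struct toy_mbuf *n = m->next;
        size_t take = want - m->len;

        if (take > n->len)
            take = n->len;
        memcpy(m->data + m->len, n->data, take);
        m->len += take;
        memmove(n->data, n->data + take, n->len - take);
        n->len -= take;
        if (n->len == 0) {      /* drained: unlink and free it */
            m->next = n->next;
            free(n);
        }
    }
    if (m->len < want) {        /* chain too short */
        toy_freem(m);
        return NULL;
    }
    return m;
}

/* A header split 2+2 bytes across two buffers; returns w.value or -1. */
int demo_split_header(void)
{
    const struct whatever src = { 0xBEEF, 0x1234 };
    const unsigned char *bytes = (const unsigned char *)&src;
    struct whatever w;
    struct toy_mbuf *a, *b, *m;

    a = toy_alloc(bytes, 2);
    b = toy_alloc(bytes + 2, sizeof(src) - 2);
    if (a == NULL || b == NULL) {
        toy_freem(a);
        toy_freem(b);
        return -1;
    }
    a->next = b;

    m = toy_pullup(a, sizeof(w));
    if (m == NULL)
        return -1;
    assert(m->len >= sizeof(w));
    memcpy(&w, m->data, sizeof(w));   /* header is now contiguous */
    toy_freem(m);
    return w.value;
}

/* A chain shorter than the header; returns 1 if pullup reports failure. */
int demo_short_chain(void)
{
    const unsigned char two[2] = { 1, 2 };
    struct toy_mbuf *m = toy_alloc(two, sizeof(two));

    if (m == NULL)
        return -1;
    return toy_pullup(m, sizeof(struct whatever)) == NULL;
}
```

Calling demo_split_header() from a small main() exercises the uncommon path,
where the header crosses a buffer boundary and has to be copied together;
demo_short_chain() exercises the failure path. The memcpy() out of data[] is
there only to sidestep alignment questions in the toy; the kernel idiom is the
mtod() cast shown above.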
>>> As the card and the OS can already process many packets per second for
>>> something fairly complex as routing
>>> (http://www.tancsa.com/blast.html), and TCP chokes swi:net at 100% of
>>> a core, isn't this an indication there's certainly more space for
>>> improvement even with single-queue old-fashioned NICs?
>>
>> Maybe. It depends on the relative costs of local processing vs
>> redistributing the work, which involves schedulers, IPIs, additional
>> cache misses, lock contention, and so on. This means there's a period
>> where it can't possibly be a win, and then at some point it's a win as
>> long as the stack scales. This is essentially the usual trade-off in
>> using threads and parallelism: does the benefit of multiple parallel
>> execution units make up for the overheads of synchronization and data
>> migration?
>
> Do you have any idea at all why I'm seeing the weird difference of netstat
> packets per second (250,000) and my application's TCP performance (< 1,000
> pps)? Summary: each packet is guaranteed to be a whole message causing a
> transaction in the application - without the changes I see pps almost
> identical to tps. Even if the source of netstat statistics somehow manages
> to count packets multiple times (I don't see how that can happen), no
> relation can describe differences this huge. It almost looks like something
> in the upper layers is discarding packets (also not likely: TCP timeouts
> would occur and the application wouldn't be able to push 250,000 pps) - but
> what? Where to look?

Is this for the loopback workload? If so, remember that there may be some
other things going on:

- Every packet is processed at least two times: once when sent, and then
  again when it's received.

- A TCP segment will need to be ACK'd, so if you're sending data in chunks in
  one direction, the ACKs will not be piggy-backed on existing data
  transfers, and instead be sent independently, hitting the network stack two
  more times.
- Remember that TCP works to expand its window, and then maintains the
  highest performance it can by bumping up against the top of available
  bandwidth continuously. This involves detecting buffer limits by
  generating packets that can't be sent, adding to the packet count. With
  loopback traffic, the drop point occurs when you exceed the size of the
  netisr's queue for IP, so you might try bumping that from the default to
  something much larger.

And nothing beats using tcpdump -- have you tried tcpdumping the loopback to
see what is actually being sent? If not, that's always educational --
perhaps something weird is going on with delayed ACKs, etc.

> You mean for the general code? I purposely don't lock my statistics
> variables because I'm not that interested in exact numbers (orders of
> magnitude are relevant). As far as I understand, unlocked "x++" should be
> trivially fast in this case?

No. x++ is massively slow if executed in parallel across many cores on a
variable in a single cache line. See my recent commit to kern_tc.c for an
example: the updating of trivial statistics for the kernel time calls reduced
30m syscalls/second to 3m syscalls/second due to heavy contention on the
cache line holding the statistic. One of my goals for 8.0 is to fix this
problem for IP and TCP layers, and ideally also ifnet but we'll see. We
should be maintaining those stats per-CPU and then aggregating to report them
to userspace. This is what we already do for a number of system stats -- UMA
and kernel malloc, syscall and trap counters, etc.

>> - Use cpuset to pin ithreads, the netisr, and whatever else, to specific
>>   cores so that they don't migrate, and if your system uses HTT, experiment
>>   with pinning the ithread and the netisr on different threads on the same
>>   core, or at least, different cores on the same die.
>
> I'm using em hardware; I still think there's a possibility I'm fighting the
> driver in some cases but this has priority #2.
Have you tried LOCK_PROFILING? It would quickly tell you if driver locks
were a source of significant contention. It works quite well...

>> - If your card supports RSS, pass the flowid up the stack in the mbuf packet
>>   header flowid field, and use that instead of the hash for work placement.
>
> Don't know about em. Don't really want to touch it if I don't have to :)

if_em doesn't support it, but if_igb does. If this saves you a minimum of
one and possibly two cache misses per packet, it could be a huge performance
improvement.

Robert N M Watson
Computer Laboratory
University of Cambridge

From rwatson at FreeBSD.org Mon Apr 6 05:09:15 2009
From: rwatson at FreeBSD.org (Robert Watson)
Date: Mon Apr 6 05:09:21 2009
Subject: Advice on a multithreaded netisr patch?
In-Reply-To: <86599.63596.qm@web63904.mail.re1.yahoo.com>
References: <86599.63596.qm@web63904.mail.re1.yahoo.com>
Message-ID:

On Mon, 6 Apr 2009, Barney Cordoba wrote:

> Is there a way to give a kernel thread exclusive use of a core? I know you
> can pin a kernel thread with sched_bind(), but is there a way to keep other
> threads from using the core? On an 8 core system it almost seems that the
> randomness of more cores is a negative in some situations.
>
> Also, I've noticed that calling sched_bind() during bootup is a bad thing in
> that it locks the system. I'm not certain but I suspect it's the thread_lock
> that is the culprit. Is there a clean way to determine that it's safe to lock
> curthread and do a cpu bind?

There isn't an interface to cleanly express "Use CPUs 4-7 for only network
processing". You can configure the system this way using the cpuset command
(including directing the low-level interrupts to specific CPUs in 8.x), but
if we think this is going to be a frequently desired policy, a bit more
abstraction will be required.
I'm not familiar with the problem you're seeing with sched_bind() -- I'm
using it from within some of my code without a problem, and that's fairly
early in the boot. A number of deadlocks are possible if one isn't very
careful early in the boot though, so I might look specifically for some of
those: if you migrate a thread to a CPU that isn't yet started, it won't be
able to run until the CPU has started. This means it's important not to
migrate threads that might lead to priority inversion-like deadlocks:

- Be careful not to migrate threads that hold locks the system requires to
  get to the point where multiple CPUs run.

- Be careful not to migrate threads that will signal a resource being
  available, such as a device driver, required to get to the point where
  multiple CPUs run.

- Be careful not to migrate the main boot thread.

Could you be running into one of those cases? Usually they're fairly easy to
diagnose using DDB, if you can get into it, because you can see what the main
boot thread is waiting for, and reason about what's holding it. Are you able
to get into DDB when this occurs? (Perhaps using an NMI?)

Robert N M Watson
Computer Laboratory
University of Cambridge

From ivoras at freebsd.org Mon Apr 6 05:35:57 2009
From: ivoras at freebsd.org (Ivan Voras)
Date: Mon Apr 6 05:36:04 2009
Subject: Advice on a multithreaded netisr patch?
In-Reply-To:
References:
Message-ID:

Robert Watson wrote:
> On Mon, 6 Apr 2009, Ivan Voras wrote:

>> So, a mbuf can reference data not yet copied from the NIC hardware?
>> I'm specifically trying to understand what m_pullup() does.
>
> I think we're talking slightly at cross purposes. There are two
> transfers of interest:
>
> (1) DMA of the packet data to main memory from the NIC
> (2) Servicing of CPU cache misses to access data in main memory
>
> By the time you receive an interrupt, the DMA is complete, so once you

OK, this was what was confusing me - for a moment I thought you meant
it's not so.
> believe a packet referenced by the descriptor ring is done, you don't
> have to wait for DMA. However, the packet data is in main memory rather
> than your CPU cache, so you'll need to take a cache miss in order to
> retrieve it. You don't want to prefetch before you know the packet data
> is there, or you may prefetch stale data from the previous packet sent
> or received from the cluster.
>
> m_pullup() has to do with mbuf chain memory contiguity during packet
> processing. The usual usage is something along the following lines:
>
>     struct whatever *w;
>
>     m = m_pullup(m, sizeof(*w));
>     if (m == NULL)
>         return;
>     w = mtod(m, struct whatever *);
>
> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
> contiguously stored so that the cast of w to m's data will point at a

So, m_pullup() can resize / realloc() the mbuf? (not that it matters for
this purpose)

> Is this for the loopback workload? If so, remember that there may be
> some other things going on:

Both loopback and physical.

> - Every packet is processed at least two times: once when sent, and then
>   again when it's received.
>
> - A TCP segment will need to be ACK'd, so if you're sending data in
>   chunks in one direction, the ACKs will not be piggy-backed on existing
>   data transfers, and instead be sent independently, hitting the network
>   stack two more times.

No combination of these can make an accounting difference between 1,000
and 250,000 pps. I must be hitting something very bad here.

> - Remember that TCP works to expand its window, and then maintains the
>   highest performance it can by bumping up against the top of available
>   bandwidth continuously. This involves detecting buffer limits by
>   generating packets that can't be sent, adding to the packet count. With
>   loopback traffic, the drop point occurs when you exceed the size of the
>   netisr's queue for IP, so you might try bumping that from the default
>   to something much larger.

My messages are approx.
100 +/- 10 bytes. No practical way they will even span multiple mbufs. TCP_NODELAY is on.

> No. x++ is massively slow if executed in parallel across many cores on
> a variable in a single cache line. See my recent commit to kern_tc.c
> for an example: the updating of trivial statistics for the kernel time
> calls reduced 30m syscalls/second to 3m syscalls/second due to heavy
> contention on the cache line holding the statistic. One of my goals for

I don't get it:
http://svn.freebsd.org/viewvc/base/stable/7/sys/kern/kern_tc.c?r1=189891&r2=189890&pathrev=189891

you replaced x++ with no-ops if TC_COUNTER is defined? Aren't the timecounters actually needed somewhere?

> 8.0 is to fix this problem for IP and TCP layers, and ideally also ifnet
> but we'll see. We should be maintaining those stats per-CPU and then
> aggregating to report them to userspace. This is what we already do for
> a number of system stats -- UMA and kernel malloc, syscall and trap
> counters, etc.

How magic is this? Is it just a matter of declaring mystatarray[NCPU] and updating mystat[current_cpu] or (probably), the spacing between array elements should be magically fixed so two elements don't share a cache line?

>>> - Use cpuset to pin ithreads, the netisr, and whatever else, to specific
>>>   cores so that they don't migrate, and if your system uses HTT,
>>>   experiment with pinning the ithread and the netisr on different
>>>   threads on the same core, or at least, different cores on the same die.
>>
>> I'm using em hardware; I still think there's a possibility I'm
>> fighting the driver in some cases but this has priority #2.
>
> Have you tried LOCK_PROFILING? It would quickly tell you if driver
> locks were a source of significant contention. It works quite well...

I don't think I'm fighting against locking artifacts, it looks more like some kind of overly smart hardware thing, like interrupt moderation (but not exactly interrupt moderation since the number of IRQs/s remains approx.
the same).

>>> - If your card supports RSS, pass the flowid up the stack in the mbuf
>>>   packet header flowid field, and use that instead of the hash for work
>>>   placement.
>>
>> Don't know about em. Don't really want to touch it if I don't have to :)
>
> if_em doesn't support it, but if_igb does. If this saves you a minimum
> of one and possibly two cache misses per packet, it could be a huge
> performance improvement.

If I had the funds to upgrade hardware, I wouldn't be so interested in solving it in software :)

From barney_cordoba at yahoo.com Mon Apr 6 06:41:02 2009
From: barney_cordoba at yahoo.com (Barney Cordoba)
Date: Mon Apr 6 06:41:08 2009
Subject: Advice on a multithreaded netisr patch?
In-Reply-To:
Message-ID: <812958.41771.qm@web63906.mail.re1.yahoo.com>

--- On Mon, 4/6/09, Robert Watson wrote:

> From: Robert Watson
> Subject: Re: Advice on a multithreaded netisr patch?
> To: "Barney Cordoba"
> Cc: freebsd-net@freebsd.org, "Ivan Voras"
> Date: Monday, April 6, 2009, 8:09 AM
> On Mon, 6 Apr 2009, Barney Cordoba wrote:
>
> > Is there a way to give a kernel thread exclusive use of a core? I know
> > you can pin a kernel thread with sched_bind(), but is there a way to
> > keep other threads from using the core? On an 8 core system it almost
> > seems that the randomness of more cores is a negative in some
> > situations.
> >
> > Also, I've noticed that calling sched_bind() during bootup is a bad
> > thing in that it locks the system. I'm not certain but I suspect it's
> > the thread_lock that is the culprit. Is there a clean way to determine
> > that it's safe to lock curthread and do a cpu bind?
>
> There isn't an interface to cleanly express "Use CPUs 4-7 for only
> network processing".
> You can configure the system this way using the cpuset command
> (including directing the low-level interrupts to specific CPUs in 8.x),
> but if we think this is going to be a frequently desired policy, a bit
> more abstraction will be required.
>
> I'm not familiar with the problem you're seeing with sched_bind() -- I'm
> using it from within some of my code without a problem, and that's
> fairly early in the boot. A number of deadlocks are possible if one
> isn't very careful early in the boot though, so I might look
> specifically for some of those: if you migrate a thread to a CPU that
> isn't yet started, it won't be able to run until the CPU has started.
> This means it's important not to migrate threads that might lead to
> priority inversion-like deadlocks:
>
> - Be careful not to migrate threads that hold locks the system requires
>   to get to the point where multiple CPUs run.
> - Be careful not to migrate threads that will signal a resource being
>   available, such as a device driver, required to get to the point where
>   multiple CPUs run.
> - Be careful not to migrate the main boot thread.
>
> Could you be running into one of those cases? Usually they're fairly
> easy to diagnose using DDB, if you can get into it, because you can see
> what the main boot thread is waiting for, and reason about what's
> holding it. Are you able to get into DDB when this occurs? (Perhaps
> using an NMI?)

Yes, the CPUs are launched quite late, so that must be it. I guess mp_ncpus is set before they are launched. Is there a way to determine that a specific core has been launched?

Regarding using cpuset, John B indicated that you couldn't allocate "sets" for kernel threads, and that sched_bind() was the only function available. So that brings 2 questions:

1) How do you get the thread ID for a process from user space to use with cpuset? I don't see that ps displays it.

2) Can cpu sets be manipulated / set up from within the kernel?
Barney

From barney_cordoba at yahoo.com Mon Apr 6 08:53:17 2009
From: barney_cordoba at yahoo.com (Barney Cordoba)
Date: Mon Apr 6 08:53:24 2009
Subject: Advice on a multithreaded netisr patch?
In-Reply-To:
Message-ID: <146595.14120.qm@web63901.mail.re1.yahoo.com>

--- On Mon, 4/6/09, Ivan Voras wrote:

> From: Ivan Voras
> Subject: Re: Advice on a multithreaded netisr patch?
> To: freebsd-net@freebsd.org
> Date: Monday, April 6, 2009, 8:35 AM
> Robert Watson wrote:
> > On Mon, 6 Apr 2009, Ivan Voras wrote:
>
> [...]
>
> >>> - If your card supports RSS, pass the flowid up the stack in the mbuf
> >>>   packet header flowid field, and use that instead of the hash for
> >>>   work placement.
> >>
> >> Don't know about em. Don't really want to touch it if I don't have to :)
> >
> > if_em doesn't support it, but if_igb does. If this saves you a minimum
> > of one and possibly two cache misses per packet, it could be a huge
> > performance improvement.

There is no advantage to using if_igb.
While the cards support more features, the driver in FreeBSD really barely functions. There's also no multiqueue support. Don't waste your money on a card.

Barney

From barney_cordoba at yahoo.com Mon Apr 6 10:12:03 2009
From: barney_cordoba at yahoo.com (Barney Cordoba)
Date: Mon Apr 6 10:12:09 2009
Subject: Advice on a multithreaded netisr patch?
In-Reply-To:
Message-ID: <723620.1225.qm@web63905.mail.re1.yahoo.com>

--- On Mon, 4/6/09, Ivan Voras wrote:

> From: Ivan Voras
> Subject: Re: Advice on a multithreaded netisr patch?
> To: freebsd-net@freebsd.org
> Date: Monday, April 6, 2009, 8:35 AM
> Robert Watson wrote:
> > On Mon, 6 Apr 2009, Ivan Voras wrote:
>
> [...]
>
> >> I'm using em hardware; I still think there's a possibility I'm
> >> fighting the driver in some cases but this has priority #2.
> >
> > Have you tried LOCK_PROFILING? It would quickly tell you if driver
> > locks were a source of significant contention. It works quite well...

I enabled lock profiling in my kernel and the system panics on lock_init for one of my drivers. Are you aware of any issues that would be specific to lock profiling being enabled?

Barney

From bz at FreeBSD.org Mon Apr 6 10:24:13 2009
From: bz at FreeBSD.org (Bjoern A. Zeeb)
Date: Mon Apr 6 10:24:19 2009
Subject: IPv6 window scaling factor always 1 on initial SYN
In-Reply-To: <20090406.121959.74751582.sthaug@nethelp.no>
References: <20090405.231044.74688369.sthaug@nethelp.no> <20090405214757.E15361@maildrop.int.zabbadoz.net> <20090405215842.C15361@maildrop.int.zabbadoz.net> <20090406.121959.74751582.sthaug@nethelp.no>
Message-ID: <20090406165933.C15361@maildrop.int.zabbadoz.net>

On Mon, 6 Apr 2009, sthaug@nethelp.no wrote:

>> Ok, both versions had: < so->so_rcv.sb_hiwat)
>>
>> http://svn.freebsd.org/viewvc/base?view=revision&revision=166403
>>
>> changed it for IPv4 the first time,
>>
>> http://svn.freebsd.org/viewvc/base?view=revision&revision=172795
>>
>> changed it a second time for IPv4.
>>
>> No one changed the IPv6 version.
>>
>> The syncache already seems to do it for both v4/v6 (common code).
>>
>> Can you try changing it to < sb_max) for IPv6 as well and see if
>> things work (better) for you?
>
> I changed it, and that worked like a dream. Now I get basically the
> same throughput with IPv4 and IPv6.

That sounds great! :-)

> There are of course still issues
> like lots of IPv6 tunnels that add extra latency - but that's not the
> fault of FreeBSD.

> Anyway, thanks for your work. Below is a context diff (against 7-STABLE
> cvsupped last night). Do we need a PR to get this into FreeBSD?

No, not even the context diff would have been needed;-) I'll commit it as soon as I find a few quiet minutes and a src tree;-)

/bz

--
Bjoern A. Zeeb
The greatest risk is not taking one.

From rwatson at FreeBSD.org Mon Apr 6 11:52:17 2009
From: rwatson at FreeBSD.org (Robert Watson)
Date: Mon Apr 6 11:52:24 2009
Subject: Advice on a multithreaded netisr patch?
In-Reply-To:
References:
Message-ID:

On Mon, 6 Apr 2009, Ivan Voras wrote:

>> I think we're talking slightly at cross purposes.
>> There are two transfers of interest:
>>
>> (1) DMA of the packet data to main memory from the NIC
>> (2) Servicing of CPU cache misses to access data in main memory
>>
>> By the time you receive an interrupt, the DMA is complete, so once you
>
> OK, this was what was confusing me - for a moment I thought you meant it's
> not so.

It's a polite lie that we will choose to believe for the purposes of simplification. And probably true for all our drivers in practice right now.

>>     m = m_pullup(m, sizeof(*w));
>>     if (m == NULL)
>>         return;
>>     w = mtod(m, struct whatever *);
>>
>> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
>> contiguously stored so that the cast of w to m's data will point at a
>
> So, m_pullup() can resize / realloc() the mbuf? (not that it matters for
> this purpose)

Yes -- if it can't meet the contiguity requirements using the current mbuf chain, it may reallocate and return a new head to the chain (hence m being reassigned). If that reallocation fails, it may return NULL. Once you've called m_pullup(), existing pointers into the chain's data will be invalid, so if you've already called mtod() on it, you need to call it again.

>> - A TCP segment will need to be ACK'd, so if you're sending data in
>>   chunks in one direction, the ACKs will not be piggy-backed on existing
>>   data transfers, and instead be sent independently, hitting the network
>>   stack two more times.
>
> No combination of these can make an accounting difference between 1,000 and
> 250,000 pps. I must be hitting something very bad here.

Yes, you definitely want to run tcpdump to see what's going on here.

>> - Remember that TCP works to expand its window, and then maintains the
>>   highest performance it can by bumping up against the top of available
>>   bandwidth continuously. This involves detecting buffer limits by
>>   generating packets that can't be sent, adding to the packet count.
>>   With loopback traffic, the drop point occurs when you exceed the size
>>   of the netisr's queue for IP, so you might try bumping that from the
>>   default to something much larger.
>
> My messages are approx. 100 +/- 10 bytes. No practical way they will even
> span multiple mbufs. TCP_NODELAY is on.

Remember that TCP_NODELAY just disables Nagle, it doesn't disable delayed ACKs.

>> No. x++ is massively slow if executed in parallel across many cores on a
>> variable in a single cache line. See my recent commit to kern_tc.c for an
>> example: the updating of trivial statistics for the kernel time calls
>> reduced 30m syscalls/second to 3m syscalls/second due to heavy contention
>> on the cache line holding the statistic. One of my goals for
>
> I don't get it:
> http://svn.freebsd.org/viewvc/base/stable/7/sys/kern/kern_tc.c?r1=189891&r2=189890&pathrev=189891
>
> you replaced x++ with no-ops if TC_COUNTER is defined? Aren't the
> timecounters actually needed somewhere?

These are statistics, not the time counters themselves. Turning off the statistics led to an order-of-magnitude performance improvement by virtue of not thrashing cache lines.

>> 8.0 is to fix this problem for IP and TCP layers, and ideally also ifnet
>> but we'll see. We should be maintaining those stats per-CPU and then
>> aggregating to report them to userspace. This is what we already do for a
>> number of system stats -- UMA and kernel malloc, syscall and trap counters,
>> etc.
>
> How magic is this? Is it just a matter of declaring mystatarray[NCPU] and
> updating mystat[current_cpu] or (probably), the spacing between array
> elements should be magically fixed so two elements don't share a cache line?

The array needs to be appropriately spaced so that cache lines aren't potentially thrashed. One way to do that is to tag elements with a cache-line sized __aligned attribute.
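A rough userspace sketch of the cache-line spacing just described, assuming a 64-byte line. `CACHE_LINE`, `NSLOTS`, and the `mystat_*` names are made up for illustration; this is not the kernel's per-CPU machinery, only the layout idea:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed cache-line size in bytes */
#define NSLOTS     8    /* hypothetical number of CPUs */

/* One counter per cache line: the alignment pads the struct to 64 bytes. */
struct padded_counter {
    alignas(CACHE_LINE) uint64_t value;
};

static_assert(sizeof(struct padded_counter) == CACHE_LINE,
    "each counter must occupy exactly one cache line");

static struct padded_counter mystat[NSLOTS];

/* Bump the slot owned by `cpu`; assumes only that CPU writes its slot. */
static void
mystat_inc(unsigned cpu)
{
    mystat[cpu % NSLOTS].value++;
}

/* Aggregate for reporting; the total may be slightly stale under load. */
static uint64_t
mystat_total(void)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < NSLOTS; i++)
        sum += mystat[i].value;
    return sum;
}
```

Because each slot is written by a single CPU, increments do not bounce a shared line between cores; only the occasional reader touches all of the lines.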
Another way is to stick them on the tail of our existing per-cpu structure, which is what we do for things like trap counts, using PCPU_INC(). Notice that this is very slightly lazy and subject to a very narrow race if the current thread decides to migrate, but that happens only very infrequently in practice.

>>> I'm using em hardware; I still think there's a possibility I'm fighting
>>> the driver in some cases but this has priority #2.
>>
>> Have you tried LOCK_PROFILING? It would quickly tell you if driver locks
>> were a source of significant contention. It works quite well...
>
> I don't think I'm fighting against locking artifacts, it looks more like
> some kind of overly smart hardware thing, like interrupt moderation (but not
> exactly interrupt moderation since the number of IRQs/s remains approx. the
> same).

Ideally what you'll do next is run tcpdump on a machine not acting as part of the test, and see what's happening on the wire.

>> if_em doesn't support it, but if_igb does. If this saves you a minimum of
>> one and possibly two cache misses per packet, it could be a huge
>> performance improvement.
>
> If I had the funds to upgrade hardware, I wouldn't be so interested in
> solving it in software :)

Sure, but what I'm saying is: some problems are inherent to the hardware design of what you're using. We can work around them, but at the end of the day, some parts of the problem just require new hardware. Let's see how far we can get without that.

Robert N M Watson
Computer Laboratory
University of Cambridge

From wahjava.ml at gmail.com Mon Apr 6 17:06:06 2009
From: wahjava.ml at gmail.com (Ashish SHUKLA)
Date: Mon Apr 6 17:06:14 2009
Subject: getaddrinfo() unable to resolve IPv6 addresses
Message-ID: <87y6ud5p62.fsf@chateau.d.lf>

Hi everyone,

I'm running FreeBSD 8.0-CURRENT and am having problems with the libc's getaddrinfo() function. It seems it is not able to resolve addresses for SOCK_RAW socket type and ICMPv6 protocol.

#v+
abbe [~] monte-cristo% uname -a
FreeBSD monte-cristo.france 8.0-CURRENT FreeBSD 8.0-CURRENT #4: Thu Mar 26 03:18:32 IST 2009 root@monte-cristo.france:/usr/obj/usr/src/sys/GENERIC amd64
abbe [~] monte-cristo% ping6 -n ipv6.google.com
ping6: Invalid value for hints
abbe [~] monte-cristo% telnet ipv6.google.com 80
Trying 2001:4860:c003::68...
Connected to ipv6.l.google.com.
Escape character is '^]'.
#v-

Should I file a PR ?

TiA
--
Ashish SHUKLA

From ume at freebsd.org Mon Apr 6 19:48:43 2009
From: ume at freebsd.org (Hajimu UMEMOTO)
Date: Mon Apr 6 19:48:52 2009
Subject: getaddrinfo() unable to resolve IPv6 addresses
In-Reply-To: <87y6ud5p62.fsf@chateau.d.lf>
References: <87y6ud5p62.fsf@chateau.d.lf>
Message-ID:

Hi,

>>>>> On Tue, 07 Apr 2009 05:07:57 +0530
>>>>> Ashish SHUKLA said:

> I'm running FreeBSD 8.0-CURRENT and am having problems with the libc's
> getaddrinfo() function. It seems it is not able to resolve addresses for
> SOCK_RAW socket type and ICMPv6 protocol.

> #v+
> abbe [~] monte-cristo% uname -a
> FreeBSD monte-cristo.france 8.0-CURRENT FreeBSD 8.0-CURRENT #4: Thu Mar 26 03:18:32 IST 2009 root@monte-cristo.france:/usr/obj/usr/src/sys/GENERIC amd64
> abbe [~] monte-cristo% ping6 -n ipv6.google.com
> ping6: Invalid value for hints
> abbe [~] monte-cristo% telnet ipv6.google.com 80
> Trying 2001:4860:c003::68...
> Connected to ipv6.l.google.com.
> Escape character is '^]'.
> #v-

> Should I file a PR ?

No, I believe it was already fixed. Please re-cvsup and try it.

Sincerely,

--
Hajimu UMEMOTO @ Internet Mutual Aid Society Yokohama, Japan
ume@mahoroba.org ume@{,jp.}FreeBSD.org
http://www.imasy.org/~ume/

From sepherosa at gmail.com Mon Apr 6 22:09:39 2009
From: sepherosa at gmail.com (Sepherosa Ziehau)
Date: Mon Apr 6 22:09:46 2009
Subject: Advice on a multithreaded netisr patch?
In-Reply-To:
References:
Message-ID:

On Mon, Apr 6, 2009 at 7:59 PM, Robert Watson wrote:
>
> m_pullup() has to do with mbuf chain memory contiguity during packet
> processing.
> The usual usage is something along the following lines:
>
>     struct whatever *w;
>
>     m = m_pullup(m, sizeof(*w));
>     if (m == NULL)
>         return;
>     w = mtod(m, struct whatever *);
>
> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
> contiguously stored so that the cast of w to m's data will point at a
> complete structure we can use to interpret packet data. In the common case
> in the receipt path, m_pullup() should be a no-op, since almost all drivers
> receive data in a single cluster.
>
> However, there are cases where it might not happen, such as loopback traffic
> where unusual encapsulation is used, leading to a call to M_PREPEND() that
> inserts a new mbuf on the front of the chain, which is later m_defrag()'d
> leading to a higher level header crossing a boundary or the like.
>
> This issue is almost entirely independent from things like the cache line
> miss issue, unless you hit the uncommon case of having to do work in
> m_pullup(), in which case life sucks.
>
> It would be useful to use DTrace to profile a number of the workfull m_foo()
> functions to make sure we're not hitting them in normal workloads, btw.

I highly doubt m_pullup will have any real effect on the RX path, given how most drivers allocate the mbufs for the RX ring (all RX mbufs should be mclusters).

Best Regards,
sephe

--
Live Free or Die

From sepherosa at gmail.com Mon Apr 6 22:21:37 2009
From: sepherosa at gmail.com (Sepherosa Ziehau)
Date: Mon Apr 6 22:21:44 2009
Subject: Advice on a multithreaded netisr patch?
In-Reply-To:
References:
Message-ID:

On Sun, Apr 5, 2009 at 9:34 PM, Ivan Voras wrote:
> Robert Watson wrote:
>>
>> On Sun, 5 Apr 2009, Ivan Voras wrote:
>>
>>> I thought this has something to deal with NIC moderation (em) but
>>> can't really explain it. The bad performance part (not the jump) is
>>> also visible over the loopback interface.
>>
>> FYI, if you want high performance, you really want a card supporting
>> multiple input queues -- igb, cxgb, mxge, etc. if_em-only cards are

PCI-E em(4) supports 2 RX queues. 82571/82572 support 2 TX queues. I have not tested multi-TX queues, but em(4) multi-RX queues work well in dfly (tested with 82573 and 82571).

>> fundamentally less scalable in an SMP environment because they require
>> input or output to occur only from one CPU at a time.
>
> Makes sense, but on the other hand - I see people are routing at least
> 250,000 packets per second per direction with these cards, so they
> probably aren't the bottleneck (pro/1000 pt on pci-e).

It should be some variant of the 82571EB.

Best Regards,
sephe

--
Live Free or Die

From julian at elischer.org Mon Apr 6 23:35:22 2009
From: julian at elischer.org (Julian Elischer)
Date: Mon Apr 6 23:35:29 2009
Subject: Advice on a multithreaded netisr patch?
In-Reply-To:
References:
Message-ID: <49DAF447.5020407@elischer.org>

Sepherosa Ziehau wrote:
> On Mon, Apr 6, 2009 at 7:59 PM, Robert Watson wrote:
>> m_pullup() has to do with mbuf chain memory contiguity during packet
>> processing. The usual usage is something along the following lines:
>>
>>     struct whatever *w;
>>
>>     m = m_pullup(m, sizeof(*w));
>>     if (m == NULL)
>>         return;
>>     w = mtod(m, struct whatever *);

while this is true, m_pullup ALWAYS does things so in fact you want to always put it in a test to see if it is really needed..

from memory it is something like:

    if (m->m_len < headerlen && (m = m_pullup(m, headerlen)) == NULL) {
        log(LOG_WARNING,
            "nglmi: m_pullup failed for %d bytes\n", headerlen);
        return (0);
    }
    header = mtod(m, struct header *);

>>
>> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
>> contiguously stored so that the cast of w to m's data will point at a
>> complete structure we can use to interpret packet data.
In the common case >> in the receipt path, m_pullup() should be a no-op, since almost all drivers >> receive data in a single cluster. >> >> However, there are cases where it might not happen, such as loopback traffic >> where unusual encapsulation is used, leading to a call to M_PREPEND() that >> inserts a new mbuf on the front of the chain, which is later m_defrag()'d >> leading to a higher level header crossing a boundary or the like. >> >> This issue is almost entirely independent from things like the cache line >> miss issue, unless you hit the uncommon case of having to do work in >> m_pullup(), in which case life sucks. >> >> It would be useful to use DTrace to profile a number of the workfull m_foo() >> functions to make sure we're not hitting them in normal workloads, btw. > > I highly suspect m_pullup will take any real effect on RX path, given > how most of drivers allocate the mbuf for RX ring (all RX mbufs should > be mclusters). > > Best Regards, > sephe > From sepherosa at gmail.com Tue Apr 7 00:00:52 2009 From: sepherosa at gmail.com (Sepherosa Ziehau) Date: Tue Apr 7 00:00:59 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: <49DAF447.5020407@elischer.org> References: <49DAF447.5020407@elischer.org> Message-ID: On Tue, Apr 7, 2009 at 2:35 PM, Julian Elischer wrote: > Sepherosa Ziehau wrote: >> >> On Mon, Apr 6, 2009 at 7:59 PM, Robert Watson wrote: >>> >>> m_pullup() has to do with mbuf chain memory contiguity during packet >>> processing. The usual usage is something along the following lines: >>> >>> struct whatever *w; >>> >>> m = m_pullup(m, sizeof(*w)); >>> if (m == NULL) >>> return; >>> w = mtod(m, struct whatever *); > > while this is true, m_pullup ALWAYS does things so in fact you > want to always put it in a test to see if it is really needed.. This probably will not be much problem on RX path, drivers always have to set m->m_len, so m->m_len is probably still in cache. 
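
[Editorial sketch, not part of the original message.] The guard discussed here (check m_len first, call the pullup only when the header is not already contiguous) can be modelled in userland. Everything below is a toy: struct toy_mbuf, toy_pullup(), toy_ensure() and the 64-byte buffer are hypothetical names and sizes chosen for illustration, not the kernel's mbuf API. The only behaviour borrowed from the thread and from m_pullup() is that the slow path returns NULL and frees the chain on failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an mbuf; NOT the kernel's struct mbuf. */
struct toy_mbuf {
	struct toy_mbuf *next;
	size_t len;               /* valid bytes in data[] */
	unsigned char data[64];   /* arbitrary size for the sketch */
};

int toy_pullup_calls;         /* how many times the slow path ran */

/* Allocate one toy mbuf holding a copy of src[0..len). */
struct toy_mbuf *toy_alloc(const void *src, size_t len)
{
	struct toy_mbuf *m = calloc(1, sizeof(*m));

	assert(m != NULL && len <= sizeof(m->data));
	memcpy(m->data, src, len);
	m->len = len;
	return m;
}

/* Free a whole chain. */
void toy_freem(struct toy_mbuf *m)
{
	while (m != NULL) {
		struct toy_mbuf *n = m->next;

		free(m);
		m = n;
	}
}

/*
 * Slow path: make the first `want` bytes contiguous in the head by
 * copying from the following mbufs. On failure the chain is freed
 * and NULL is returned.
 */
struct toy_mbuf *toy_pullup(struct toy_mbuf *m, size_t want)
{
	toy_pullup_calls++;
	if (want > sizeof(m->data)) {
		toy_freem(m);
		return NULL;
	}
	while (m->len < want) {
		struct toy_mbuf *n = m->next;
		size_t take;

		if (n == NULL) {          /* chain is too short */
			toy_freem(m);
			return NULL;
		}
		take = want - m->len;
		if (take > n->len)
			take = n->len;
		memcpy(m->data + m->len, n->data, take);
		m->len += take;
		memmove(n->data, n->data + take, n->len - take);
		n->len -= take;
		if (n->len == 0) {        /* drained: unlink and free */
			m->next = n->next;
			free(n);
		}
	}
	return m;
}

/* Guarded form: skip the call entirely when the head is long enough. */
struct toy_mbuf *toy_ensure(struct toy_mbuf *m, size_t want)
{
	if (m->len >= want)
		return m;
	return toy_pullup(m, want);
}
```

In this model the fast path costs one comparison on a length field the driver has just written, which is the cache argument made above; whether the real m_pullup() needs such a guard is exactly what the thread is debating.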
> > from memory it is something like: > > if (m->m_len < headerlen && (m = m_pullup(m, headerlen)) == NULL) { > log(LOG_WARNING, > "nglmi: m_pullup failed for %d bytes\n", headerlen); > return (0); > } > header = mtod(m, struct header *); > > >>> >>> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are >>> contiguously stored so that the cast of w to m's data will point at a >>> complete structure we can use to interpret packet data. In the common >>> case >>> in the receipt path, m_pullup() should be a no-op, since almost all >>> drivers >>> receive data in a single cluster. >>> >>> However, there are cases where it might not happen, such as loopback >>> traffic >>> where unusual encapsulation is used, leading to a call to M_PREPEND() >>> that >>> inserts a new mbuf on the front of the chain, which is later m_defrag()'d >>> leading to a higher level header crossing a boundary or the like. >>> >>> This issue is almost entirely independent from things like the cache line >>> miss issue, unless you hit the uncommon case of having to do work in >>> m_pullup(), in which case life sucks. >>> >>> It would be useful to use DTrace to profile a number of the workfull >>> m_foo() >>> functions to make sure we're not hitting them in normal workloads, btw. >> >> I highly suspect m_pullup will take any real effect on RX path, given >> how most of drivers allocate the mbuf for RX ring (all RX mbufs should >> be mclusters). >> >> Best Regards, >> sephe >> > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > -- Live Free or Die From rwatson at FreeBSD.org Tue Apr 7 02:24:45 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Tue Apr 7 02:24:56 2009 Subject: Advice on a multithreaded netisr patch? 
In-Reply-To: References: Message-ID: On Tue, 7 Apr 2009, Sepherosa Ziehau wrote: >> This issue is almost entirely independent from things like the cache line >> miss issue, unless you hit the uncommon case of having to do work in >> m_pullup(), in which case life sucks. >> >> It would be useful to use DTrace to profile a number of the workfull >> m_foo() functions to make sure we're not hitting them in normal workloads, >> btw. > > I highly suspect m_pullup will take any real effect on RX path, given how > most of drivers allocate the mbuf for RX ring (all RX mbufs should be > mclusters). Agreed, but it's good to be sure one is right about these things. :-) Robert N M Watson Computer Laboratory University of Cambridge From rwatson at FreeBSD.org Tue Apr 7 02:26:32 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Tue Apr 7 02:26:43 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: <49DAF447.5020407@elischer.org> References: <49DAF447.5020407@elischer.org> Message-ID: On Mon, 6 Apr 2009, Julian Elischer wrote: > while this is true, m_pullup ALWAYS does things so in fact you want to > always put it in a test to see if it is really needed.. Then m_pullup() should be fixed? Keeping the expression of the pullup short makes the network code a lot more compact, which is a significant benefit. Robert N M Watson Computer Laboratory University of Cambridge > > from memory it is something like: > > if (m->m_len < headerlen && (m = m_pullup(m, headerlen)) == NULL) { > log(LOG_WARNING, > "nglmi: m_pullup failed for %d bytes\n", headerlen); > return (0); > } > header = mtod(m, struct header *); > > >>> >>> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are >>> contiguously stored so that the cast of w to m's data will point at a >>> complete structure we can use to interpret packet data. In the common >>> case >>> in the receipt path, m_pullup() should be a no-op, since almost all >>> drivers >>> receive data in a single cluster. 
>>> >>> However, there are cases where it might not happen, such as loopback >>> traffic >>> where unusual encapsulation is used, leading to a call to M_PREPEND() that >>> inserts a new mbuf on the front of the chain, which is later m_defrag()'d >>> leading to a higher level header crossing a boundary or the like. >>> >>> This issue is almost entirely independent from things like the cache line >>> miss issue, unless you hit the uncommon case of having to do work in >>> m_pullup(), in which case life sucks. >>> >>> It would be useful to use DTrace to profile a number of the workfull >>> m_foo() >>> functions to make sure we're not hitting them in normal workloads, btw. >> >> I highly suspect m_pullup will take any real effect on RX path, given >> how most of drivers allocate the mbuf for RX ring (all RX mbufs should >> be mclusters). >> >> Best Regards, >> sephe >> > > From barney_cordoba at yahoo.com Tue Apr 7 05:12:06 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Tue Apr 7 05:12:13 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: Message-ID: <952316.35609.qm@web63906.mail.re1.yahoo.com> --- On Mon, 4/6/09, Robert Watson wrote: > From: Robert Watson > Subject: Re: Advice on a multithreaded netisr patch? > To: "Ivan Voras" > Cc: freebsd-net@freebsd.org > Date: Monday, April 6, 2009, 7:59 AM > On Mon, 6 Apr 2009, Ivan Voras wrote: > > >>> I'd like to understand more. If (in > netisr) I have a mbuf with headers, is this data already > transfered from the card or is it magically "not here > yet"? > >> > >> A lot depends on the details of the card and > driver. The driver will take cache misses on the descriptor > ring entry, if it's not already in cache, and the link > layer will take a cache miss on the front of the ethernet > frame in the cluster pointed to by the mbuf header as part > of its demux. What happens next depends on your dispatch > model and cache line size. 
Let's make a few simplifying > assumptions that are mostly true: > > > > So, a mbuf can reference data not yet copied from the > NIC hardware? I'm specifically trying to undestand what > m_pullup() does. > > I think we're talking slightly at cross purposes. > There are two transfers of interest: > > (1) DMA of the packet data to main memory from the NIC > (2) Servicing of CPU cache misses to access data in main > memory > > By the time you receive an interrupt, the DMA is complete, > so once you believe a packet referenced by the descriptor > ring is done, you don't have to wait for DMA. However, > the packet data is in main memory rather than your CPU > cache, so you'll need to take a cache miss in order to > retrieve it. You don't want to prefetch before you know > the packet data is there, or you may prefetch stale data > from the previous packet sent or received from the cluster. > > m_pullup() has to do with mbuf chain memory contiguity > during packet processing. The usual usage is something > along the following lines: > > struct whatever *w; > > m = m_pullup(m, sizeof(*w)); > if (m == NULL) > return; > w = mtod(m, struct whatever *); > > m_pullup() here ensures that the first sizeof(*w) bytes of > mbuf data are contiguously stored so that the cast of w to > m's data will point at a complete structure we can use > to interpret packet data. In the common case in the receipt > path, m_pullup() should be a no-op, since almost all drivers > receive data in a single cluster. > > However, there are cases where it might not happen, such as > loopback traffic where unusual encapsulation is used, > leading to a call to M_PREPEND() that inserts a new mbuf on > the front of the chain, which is later m_defrag()'d > leading to a higher level header crossing a boundary or the > like. > > This issue is almost entirely independent from things like > the cache line miss issue, unless you hit the uncommon case > of having to do work in m_pullup(), in which case life > sucks. 
> > It would be useful to use DTrace to profile a number of the > workfull m_foo() functions to make sure we're not > hitting them in normal workloads, btw. > > >>> As the card and the OS can already process > many packets per second for > >>> something fairly complex as routing > >>> (http://www.tancsa.com/blast.html), and TCP > chokes swi:net at 100% of > >>> a core, isn't this indication there's > certainly more space for > >>> improvement even with a single-queue > old-fashioned NICs? > >> > >> Maybe. It depends on the relative costs of local > processing vs > >> redistributing the work, which involves > schedulers, IPIs, additional > >> cache misses, lock contention, and so on. This > means there's a period > >> where it can't possibly be a win, and then at > some point it's a win as > >> long as the stack scales. This is essentially the > usual trade-off in > >> using threads and parallelism: does the benefit of > multiple parallel > >> execution units make up for the overheads of > synchronization and data > >> migration? > > > > Do you have any idea at all why I'm seeing the > weird difference of netstat packets per second (250,000) and > my application's TCP performance (< 1,000 pps)? > Summary: each packet is guaranteed to be a whole message > causing a transaction in the application - without the > changes I see pps almost identical to tps. Even if the > source of netstat statistics somehow manages to count > packets multiple time (I don't see how that can happen), > no relation can describe differences this huge. It almost > looks like something in the upper layers is discarding > packets (also not likely: TCP timeouts would occur and the > application wouldn't be able to push 250,000 pps) - but > what? Where to look? > > Is this for the loopback workload? If so, remember that > there may be some other things going on: > > - Every packet is processed at least two times: once went > sent, and then again > when it's received. 
> > - A TCP segment will need to be ACK'd, so if you're > sending data in chunks in > one direction, the ACKs will not be piggy-backed on > existing data tranfers, > and instead be sent independently, hitting the network > stack two more times. > > - Remember that TCP works to expand its window, and then > maintains the highest > performance it can by bumping up against the top of > available bandwidth > continuously. This involves detecting buffer limits by > generating packets > that can't be sent, adding to the packet count. With > loopback traffic, the > drop point occurs when you exceed the size of the > netisr's queue for IP, so > you might try bumping that from the default to something > much larger. > > And nothing beats using tcpdump -- have you tried > tcpdumping the loopback to see what is actually being sent? > If not, that's always educational -- perhaps something > weird is going on with delayed ACKs, etc. > > > You mean for the general code? I purposely don't > lock my statistics variables because I'm not that > interested in exact numbers (orders of magnitude are > relevant). As far as I understand, unlocked "x++" > should be trivially fast in this case? > > No. x++ is massively slow if executed in parallel across > many cores on a variable in a single cache line. See my > recent commit to kern_tc.c for an example: the updating of > trivial statistics for the kernel time calls reduced 30m > syscalls/second to 3m syscalls/second due to heavy > contention on the cache line holding the statistic. One of > my goals for 8.0 is to fix this problem for IP and TCP > layers, and ideally also ifnet but we'll see. We should > be maintaining those stats per-CPU and then aggregating to > report them to userspace. This is what we already do for a > number of system stats -- UMA and kernel malloc, syscall and > trap counters, etc. 
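
[Editorial sketch, not part of the original message.] The per-CPU statistics idea quoted above (each CPU increments its own counter, and the values are summed only when reported) might look roughly like this. The names toy_stat, TOY_NCPU and TOY_CACHE_LINE, the 64-byte line size, and passing the CPU index as an argument are all assumptions for illustration; real kernel code would obtain the current CPU itself and would have to prevent migration during the update.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NCPU       4    /* hypothetical CPU count */
#define TOY_CACHE_LINE 64   /* assumed cache line size in bytes */

/* One counter per CPU, padded so no two counters share a cache line. */
struct toy_pcpu_counter {
	uint64_t value;
	char pad[TOY_CACHE_LINE - sizeof(uint64_t)];
};

struct toy_stat {
	struct toy_pcpu_counter cpu[TOY_NCPU];
};

/* Hot path: touch only the slot belonging to the calling CPU. */
void toy_stat_inc(struct toy_stat *s, unsigned cpu)
{
	assert(cpu < TOY_NCPU);
	s->cpu[cpu].value++;
}

/* Cold path: aggregate all slots when the statistic is reported. */
uint64_t toy_stat_read(const struct toy_stat *s)
{
	uint64_t sum = 0;
	unsigned i;

	for (i = 0; i < TOY_NCPU; i++)
		sum += s->cpu[i].value;
	return sum;
}
```

The padding is the point of the sketch: with a single shared x++ every core fights over one cache line, while here each increment stays on a line that only its own CPU writes. The sum read this way is approximate while updates are in flight, which matches the "orders of magnitude are relevant" requirement stated in the thread.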
> > >> - Use cpuset to pin ithreads, the netisr, and > whatever else, to specific > >> cores > >> so that they don't migrate, and if your > system uses HTT, experiment with > >> pinning the ithread and the netisr on different > threads on the same > >> core, or > >> at least, different cores on the same die. > > > > I'm using em hardware; I still think there's a > possibility I'm fighting the driver in some cases but > this has priority #2. > > Have you tried LOCK_PROFILING? It would quickly tell you > if driver locks were a source of significant contention. It > works quite well... When I enabled LOCK_PROFILING my side modules, such as if_ibg, stopped working. It seems that the ifnet structure or something changed with that option enabled. Is there a way to sync this without having to integrate everything into a specific kernel build? Barney From rwatson at FreeBSD.org Tue Apr 7 05:54:27 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Tue Apr 7 05:54:34 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: References: Message-ID: On Tue, 7 Apr 2009, Sepherosa Ziehau wrote: > On Sun, Apr 5, 2009 at 9:34 PM, Ivan Voras wrote: >> Robert Watson wrote: >>> >>> On Sun, 5 Apr 2009, Ivan Voras wrote: >>> >>>> I thought this has something to deal with NIC moderation (em) but >>>> can't really explain it. The bad performance part (not the jump) is >>>> also visible over the loopback interface. >>> >>> FYI, if you want high performance, you really want a card supporting >>> multiple input queues -- igb, cxgb, mxge, etc. if_em-only cards are > > PCI-E em(4) supports 2 RX queues. 82571/82572 support 2 TX queues. I have > not tested multi-TX queues, but em(4) multi-RX queues work well in dfly > (tested with 82573 and 82571) You may not have seen, but in FreeBSD 7.x and higher, we have a new if_igb driver to support more recent Intel gigabit devices, which now probes a few of the devices historically associated with if_em. 
For example, on one of the boxes I use: igb0: port 0x3000-0x301f mem 0xd8220000-0xd823ffff,0xd8200000-0xd821ffff,0xd8280000-0xd8283fff irq 32 at device 0.0 on pci8 igb0: Using MSIX interrupts with 3 vectors igb0: [ITHREAD] igb0: [ITHREAD] igb0: [ITHREAD] igb0: Ethernet address: 00:30:48:d2:ca:c2 igb1: port 0x3020-0x303f mem 0xd8260000-0xd827ffff,0xd8240000-0xd825ffff,0xd8284000-0xd8287fff irq 46 at device 0.1 on pci8 igb1: Using MSIX interrupts with 3 vectors igb1: [ITHREAD] igb1: [ITHREAD] igb1: [ITHREAD] igb1: Ethernet address: 00:30:48:d2:ca:c3 igb0: RX LRO Initialized igb1: RX LRO Initialized Robert N M Watson Computer Laboratory University of Cambridge From rwatson at FreeBSD.org Tue Apr 7 05:56:04 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Tue Apr 7 05:56:10 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: <952316.35609.qm@web63906.mail.re1.yahoo.com> References: <952316.35609.qm@web63906.mail.re1.yahoo.com> Message-ID: On Tue, 7 Apr 2009, Barney Cordoba wrote: >> Have you tried LOCK_PROFILING? It would quickly tell you if driver locks >> were a source of significant contention. It works quite well... > > When I enabled LOCK_PROFILING my side modules, such as if_ibg, stopped > working. It seems that the ifnet structure or something changed with that > option enabled. Is there a way to sync this without having to integrate > everything into a specific kernel build? LOCK_PROFILING changes the size of lock-related data structures, so requires both kernel and full set of modules to be rebuilt with the option. Robert N M Watson Computer Laboratory University of Cambridge From sepherosa at gmail.com Tue Apr 7 06:57:48 2009 From: sepherosa at gmail.com (Sepherosa Ziehau) Date: Tue Apr 7 06:57:54 2009 Subject: Advice on a multithreaded netisr patch? 
In-Reply-To: References: Message-ID: On Tue, Apr 7, 2009 at 8:54 PM, Robert Watson wrote: > > On Tue, 7 Apr 2009, Sepherosa Ziehau wrote: > >> On Sun, Apr 5, 2009 at 9:34 PM, Ivan Voras wrote: >>> >>> Robert Watson wrote: >>>> >>>> On Sun, 5 Apr 2009, Ivan Voras wrote: >>>> >>>>> I thought this has something to deal with NIC moderation (em) but >>>>> can't really explain it. The bad performance part (not the jump) is >>>>> also visible over the loopback interface. >>>> >>>> FYI, if you want high performance, you really want a card supporting >>>> multiple input queues -- igb, cxgb, mxge, etc. if_em-only cards are >> >> PCI-E em(4) supports 2 RX queues. 82571/82572 support 2 TX queues. I have >> not tested multi-TX queues, but em(4) multi-RX queues work well in dfly >> (tested with 82573 and 82571) > > You may not have seen, but in FreeBSD 7.x and higher, we have a new if_igb > driver to support more recent Intel gigabit devices, which now probes a few > of the devices historically associated with if_em. For example, on one of > the boxes I use: If I understand the code correctly, it only takes 82575 and 82576; I don't have the hardware, else I would have already added dfly support (with multi rx queues at least, it seems 82576 supports 16 RX queues :) 8257{1/2/3} are still taken by em(4) in FreeBSD. In dfly, I simply forked em(4) (named emx) to create a special version for pci-e devices, for which Intel published a developers' manual. I added multi-rxqueue support to it (multi-txqueue support is planned) and cleaned up the TX/RX path. IMHO, 82571 is too widely used to be ignored. Best Regards, sephe -- Live Free or Die From ivoras at freebsd.org Tue Apr 7 07:32:12 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Tue Apr 7 07:32:18 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: References: Message-ID: <9bbcef730904070700x6f38e83dka1fdc06c48c14111@mail.gmail.com> 2009/4/7 Sepherosa Ziehau : > IMHO, 82571 is too widely used to be > ignored.
+1 :) From bz at FreeBSD.org Tue Apr 7 07:45:08 2009 From: bz at FreeBSD.org (Bjoern A. Zeeb) Date: Tue Apr 7 07:45:15 2009 Subject: IPv6 window scaling factor always 1 on initial SYN In-Reply-To: <20090406.121959.74751582.sthaug@nethelp.no> References: <20090405.231044.74688369.sthaug@nethelp.no> <20090405214757.E15361@maildrop.int.zabbadoz.net> <20090405215842.C15361@maildrop.int.zabbadoz.net> <20090406.121959.74751582.sthaug@nethelp.no> Message-ID: <20090407144311.F15361@maildrop.int.zabbadoz.net> On Mon, 6 Apr 2009, sthaug@nethelp.no wrote: >> Can you try changing it to < sb_max) for IPv6 as well and see if >> things work (better) for you? > > I changed it, and that worked like a dream. Now I get basically the > same throughput with IPv4 and IPv6. There are of course still issues > like lots of IPv6 tunnels that add extra latency - but that's not the > fault of FreeBSD. > > Anyway, thanks for your work. Below is a context diff (against 7-STABLE > cvsupped last night). Do we need a PR to get this into FreeBSD? It's in HEAD now as of SVN r190800. -- Bjoern A. Zeeb The greatest risk is not taking one. From sthaug at nethelp.no Tue Apr 7 07:57:13 2009 From: sthaug at nethelp.no (sthaug@nethelp.no) Date: Tue Apr 7 07:57:20 2009 Subject: IPv6 window scaling factor always 1 on initial SYN In-Reply-To: <20090407144311.F15361@maildrop.int.zabbadoz.net> References: <20090405215842.C15361@maildrop.int.zabbadoz.net> <20090406.121959.74751582.sthaug@nethelp.no> <20090407144311.F15361@maildrop.int.zabbadoz.net> Message-ID: <20090407.165708.74744827.sthaug@nethelp.no> > > I changed it, and that worked like a dream. Now I get basically the > > same throughput with IPv4 and IPv6. There are of course still issues > > like lots of IPv6 tunnels that add extra latency - but that's not the > > fault of FreeBSD. > > > > Anyway, thanks for your work. Below is a context diff (against 7-STABLE > > cvsupped last night). Do we need a PR to get this into FreeBSD? 
> > It's in HEAD now as of SVN r190800. Excellent news, thank you! And presumably we'll get a MFC after a suitable settling time? Steinar Haug, Nethelp consulting, sthaug@nethelp.no From julian at elischer.org Tue Apr 7 09:47:42 2009 From: julian at elischer.org (Julian Elischer) Date: Tue Apr 7 09:47:49 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: <952316.35609.qm@web63906.mail.re1.yahoo.com> References: <952316.35609.qm@web63906.mail.re1.yahoo.com> Message-ID: <49DB83CB.9070707@elischer.org> Barney Cordoba wrote: > > > > --- On Mon, 4/6/09, Robert Watson wrote: > >> From: Robert Watson >> Subject: Re: Advice on a multithreaded netisr patch? >> To: "Ivan Voras" >> Cc: freebsd-net@freebsd.org >> Date: Monday, April 6, 2009, 7:59 AM >> On Mon, 6 Apr 2009, Ivan Voras wrote: >> >>>>> I'd like to understand more. If (in >> netisr) I have a mbuf with headers, is this data already >> transfered from the card or is it magically "not here >> yet"? >>>> A lot depends on the details of the card and >> driver. The driver will take cache misses on the descriptor >> ring entry, if it's not already in cache, and the link >> layer will take a cache miss on the front of the ethernet >> frame in the cluster pointed to by the mbuf header as part >> of its demux. What happens next depends on your dispatch >> model and cache line size. Let's make a few simplifying >> assumptions that are mostly true: >>> So, a mbuf can reference data not yet copied from the >> NIC hardware? I'm specifically trying to undestand what >> m_pullup() does. >> >> I think we're talking slightly at cross purposes. >> There are two transfers of interest: >> >> (1) DMA of the packet data to main memory from the NIC >> (2) Servicing of CPU cache misses to access data in main >> memory >> >> By the time you receive an interrupt, the DMA is complete, >> so once you believe a packet referenced by the descriptor >> ring is done, you don't have to wait for DMA. 
However, >> the packet data is in main memory rather than your CPU >> cache, so you'll need to take a cache miss in order to >> retrieve it. You don't want to prefetch before you know >> the packet data is there, or you may prefetch stale data >> from the previous packet sent or received from the cluster. >> >> m_pullup() has to do with mbuf chain memory contiguity >> during packet processing. The usual usage is something >> along the following lines: >> >> struct whatever *w; >> >> m = m_pullup(m, sizeof(*w)); >> if (m == NULL) >> return; >> w = mtod(m, struct whatever *); >> >> m_pullup() here ensures that the first sizeof(*w) bytes of >> mbuf data are contiguously stored so that the cast of w to >> m's data will point at a complete structure we can use >> to interpret packet data. In the common case in the receipt >> path, m_pullup() should be a no-op, since almost all drivers >> receive data in a single cluster. >> >> However, there are cases where it might not happen, such as >> loopback traffic where unusual encapsulation is used, >> leading to a call to M_PREPEND() that inserts a new mbuf on >> the front of the chain, which is later m_defrag()'d >> leading to a higher level header crossing a boundary or the >> like. >> >> This issue is almost entirely independent from things like >> the cache line miss issue, unless you hit the uncommon case >> of having to do work in m_pullup(), in which case life >> sucks. >> >> It would be useful to use DTrace to profile a number of the >> workfull m_foo() functions to make sure we're not >> hitting them in normal workloads, btw. >> >>>>> As the card and the OS can already process >> many packets per second for >>>>> something fairly complex as routing >>>>> (http://www.tancsa.com/blast.html), and TCP >> chokes swi:net at 100% of >>>>> a core, isn't this indication there's >> certainly more space for >>>>> improvement even with a single-queue >> old-fashioned NICs? >>>> Maybe. 
It depends on the relative costs of local >> processing vs >>>> redistributing the work, which involves >> schedulers, IPIs, additional >>>> cache misses, lock contention, and so on. This >> means there's a period >>>> where it can't possibly be a win, and then at >> some point it's a win as >>>> long as the stack scales. This is essentially the >> usual trade-off in >>>> using threads and parallelism: does the benefit of >> multiple parallel >>>> execution units make up for the overheads of >> synchronization and data >>>> migration? >>> Do you have any idea at all why I'm seeing the >> weird difference of netstat packets per second (250,000) and >> my application's TCP performance (< 1,000 pps)? >> Summary: each packet is guaranteed to be a whole message >> causing a transaction in the application - without the >> changes I see pps almost identical to tps. Even if the >> source of netstat statistics somehow manages to count >> packets multiple time (I don't see how that can happen), >> no relation can describe differences this huge. It almost >> looks like something in the upper layers is discarding >> packets (also not likely: TCP timeouts would occur and the >> application wouldn't be able to push 250,000 pps) - but >> what? Where to look? >> >> Is this for the loopback workload? If so, remember that >> there may be some other things going on: >> >> - Every packet is processed at least two times: once went >> sent, and then again >> when it's received. >> >> - A TCP segment will need to be ACK'd, so if you're >> sending data in chunks in >> one direction, the ACKs will not be piggy-backed on >> existing data tranfers, >> and instead be sent independently, hitting the network >> stack two more times. >> >> - Remember that TCP works to expand its window, and then >> maintains the highest >> performance it can by bumping up against the top of >> available bandwidth >> continuously. 
This involves detecting buffer limits by >> generating packets >> that can't be sent, adding to the packet count. With >> loopback traffic, the >> drop point occurs when you exceed the size of the >> netisr's queue for IP, so >> you might try bumping that from the default to something >> much larger. >> >> And nothing beats using tcpdump -- have you tried >> tcpdumping the loopback to see what is actually being sent? >> If not, that's always educational -- perhaps something >> weird is going on with delayed ACKs, etc. >> >>> You mean for the general code? I purposely don't >> lock my statistics variables because I'm not that >> interested in exact numbers (orders of magnitude are >> relevant). As far as I understand, unlocked "x++" >> should be trivially fast in this case? >> >> No. x++ is massively slow if executed in parallel across >> many cores on a variable in a single cache line. See my >> recent commit to kern_tc.c for an example: the updating of >> trivial statistics for the kernel time calls reduced 30m >> syscalls/second to 3m syscalls/second due to heavy >> contention on the cache line holding the statistic. One of >> my goals for 8.0 is to fix this problem for IP and TCP >> layers, and ideally also ifnet but we'll see. We should >> be maintaining those stats per-CPU and then aggregating to >> report them to userspace. This is what we already do for a >> number of system stats -- UMA and kernel malloc, syscall and >> trap counters, etc. >> >>>> - Use cpuset to pin ithreads, the netisr, and >> whatever else, to specific >>>> cores >>>> so that they don't migrate, and if your >> system uses HTT, experiment with >>>> pinning the ithread and the netisr on different >> threads on the same >>>> core, or >>>> at least, different cores on the same die. >>> I'm using em hardware; I still think there's a >> possibility I'm fighting the driver in some cases but >> this has priority #2. >> >> Have you tried LOCK_PROFILING? 
It would quickly tell you >> if driver locks were a source of significant contention. It >> works quite well... > > When I enabled LOCK_PROFILING my side modules, such as if_ibg, > stopped working. It seems that the ifnet structure or something > changed with that option enabled. Is there a way to sync this without > having to integrate everything into a specific kernel build? > no, I don't think there is any other way.. last time I checked the mutex structure changed size which meant that almost everything else that included a mutex changed size. That may not be true now but I haven't checked.. > Barney > > > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From wollman at hergotha.csail.mit.edu Tue Apr 7 13:12:05 2009 From: wollman at hergotha.csail.mit.edu (Garrett Wollman) Date: Tue Apr 7 13:12:11 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: References: Message-ID: <200904072012.n37KC3lA050334@hergotha.csail.mit.edu> In article , Robert Watson writes: >m_pullup() has to do with mbuf chain memory contiguity during packet >processing. Historically, m_pullup() also had one other extremely important function: to make sure that the header data you were about to modify was not stored in a (possibly shared) cluster. Thus, in the input path for a typical driver which puts the whole packet into a cluster, the very first m_pullup() would allocate a new plain mbuf, carefully align the data pointer to allow for both prepending more headers and pulling more header data out, and copy the requested data into the internal buffer of the mbuf. -GAWollman From barney_cordoba at yahoo.com Tue Apr 7 14:48:59 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Tue Apr 7 14:49:06 2009 Subject: Advice on a multithreaded netisr patch? 
In-Reply-To: Message-ID: <409843.2186.qm@web63904.mail.re1.yahoo.com> --- On Tue, 4/7/09, Sepherosa Ziehau wrote: > From: Sepherosa Ziehau > Subject: Re: Advice on a multithreaded netisr patch? > To: "Robert Watson" > Cc: freebsd-net@freebsd.org, "Ivan Voras" > Date: Tuesday, April 7, 2009, 9:57 AM > On Tue, Apr 7, 2009 at 8:54 PM, Robert Watson > wrote: > > > > On Tue, 7 Apr 2009, Sepherosa Ziehau wrote: > > > >> On Sun, Apr 5, 2009 at 9:34 PM, Ivan Voras > wrote: > >>> > >>> Robert Watson wrote: > >>>> > >>>> On Sun, 5 Apr 2009, Ivan Voras wrote: > >>>> > >>>>> I thought this has something to deal > with NIC moderation (em) but > >>>>> can't really explain it. The bad > performance part (not the jump) is > >>>>> also visible over the loopback > interface. > >>>> > >>>> FYI, if you want high performance, you > really want a card supporting > >>>> multiple input queues -- igb, cxgb, mxge, > etc. if_em-only cards are > >> > >> PCI-E em(4) supports 2 RX queues. 82571/82572 > support 2 TX queues. I have > >> not tested multi-TX queues, but em(4) multi-RX > queues work well in dfly > >> (tested with 82573 and 82571) > > > > You may not have seen, but in FreeBSD 7.x and higher, > we have a new if_igb > > driver to support more recent Intel gigabit devices, > which now probes a few > > of the devices historically associated with if_em. > For example, on one of > > the boxes I use: > > If I understand the code correctly, it only takes 82575 and > 82576; I > don't have the hardware, else I would have already > added dfly support > (with multi rx queues at least, it seems 82576 supports 16 > RX queues > :) Regarding if_igb: 1) Multiple TX queues are not supported. There's some hokey code to test, but it doesn't properly separate flows to the queues. 2) 2 Rx queues don't work, so only 1 and 4 work 3) With 4 queues, it just sucks up CPU under heavy load on 4 cpus. 
It will blow 4 cpus at a lower load than em will with 1 4) You'll need to fix DMA setup, as it sets the alignment requirement to PAGE_SIZE. I haven't been able to convince Jack that it's wrong, not that I've tried very hard since it's easy to just fix myself. Barney From barney_cordoba at yahoo.com Tue Apr 7 14:56:27 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Tue Apr 7 14:56:33 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: Message-ID: <497906.25422.qm@web63906.mail.re1.yahoo.com> --- On Tue, 4/7/09, Robert Watson wrote: > From: Robert Watson > Subject: Re: Advice on a multithreaded netisr patch? > To: "Barney Cordoba" > Cc: freebsd-net@freebsd.org, "Ivan Voras" > Date: Tuesday, April 7, 2009, 8:56 AM > On Tue, 7 Apr 2009, Barney Cordoba wrote: > > >> Have you tried LOCK_PROFILING? It would quickly > tell you if driver locks were a source of significant > contention. It works quite well... > > > > When I enabled LOCK_PROFILING my side modules, such as > if_ibg, stopped working. It seems that the ifnet structure > or something changed with that option enabled. Is there a > way to sync this without having to integrate everything into > a specific kernel build? > > LOCK_PROFILING changes the size of lock-related data > structures, so requires both kernel and full set of modules > to be rebuilt with the option. It might be good to mention this in the man page. Most 3rd party drivers build stand-alone, and even if you pull down the latest drivers from Intel or Broadcom they're usually built out of the kernel build. It's pretty frustrating to have random things failing, mbuf leaks, etc. without any warning. Barney From ivoras at freebsd.org Tue Apr 7 15:00:22 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Tue Apr 7 15:00:32 2009 Subject: Advice on a multithreaded netisr patch?
In-Reply-To: <409843.2186.qm@web63904.mail.re1.yahoo.com> References: <409843.2186.qm@web63904.mail.re1.yahoo.com> Message-ID: Barney Cordoba wrote: > 1) Multiple TX queues are not supported. There's some hokey code to > test, but it doesn't properly separate flows to the queues. > 2) 2 Rx queues don't work, so only 1 and 4 work > 3) With 4 queues, it just sucks up CPU under heavy load on 4 cpus. It will > blow 4 cpus at a lower load than em will with 1 > 4) You'll need to fix DMA setup, as it sets the alignment requirement > to PAGE_SIZE. I haven't been able to convince Jack that its wrong, not > that I've tried very hard since its easy to just fix myself. Reading this thread it looks like the development of both Intel drivers is a bit stalled, doesn't it? AFAIK the em driver is also semi-officially abandoned, and both from my experience and others it looks like new development and patches are being rejected. Time to shop other hardware? From barney_cordoba at yahoo.com Tue Apr 7 15:24:18 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Tue Apr 7 15:24:28 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: Message-ID: <900824.65358.qm@web63901.mail.re1.yahoo.com> --- On Tue, 4/7/09, Ivan Voras wrote: > From: Ivan Voras > Subject: Re: Advice on a multithreaded netisr patch? > To: freebsd-net@freebsd.org > Date: Tuesday, April 7, 2009, 5:59 PM > Barney Cordoba wrote: > > > 1) Multiple TX queues are not supported. There's > some hokey code to > > test, but it doesn't properly separate flows to > the queues. > > 2) 2 Rx queues don't work, so only 1 and 4 work > > 3) With 4 queues, it just sucks up CPU under heavy > load on 4 cpus.
It will > > blow 4 cpus at a lower load than em will with 1 > > 4) You'll need to fix DMA setup, as it sets the > alignment requirement > > to PAGE_SIZE. I haven't been able to convince Jack > that its wrong, not > > that I've tried very hard since its easy to just > fix myself. > > Reading this thread it looks like the development of both > Intel drivers > is a bit stalled, doesn't it? AFAIK the em driver is > also > semi-officially abandoned, and both from my experience and > others it > looks like new development and patches are being rejected. > Time to shop > other hardware? To be fair, the OS doesn't really support multiqueue yet, or has for only a few hours, so let's not go crazy. It makes a lot more sense to have someone on the "team" work with Jack on improving the performance and working out the kinks. When I asked Jack about the poor performance of if_igb, he indicated that Intel's position is that the drivers are "just samples", which really doesn't give anyone much confidence that they want to run their business on them. You already have Jack doing all of the hard work; that is, supporting the new-chip-per-week that Intel puts out, so it seems to me the best strategy would be to try to convince Intel that it's in their best interest to have drivers that work well so people don't think that their hardware stinks. As an example, the Chelsio 10gb bypass card is $3495 and an Intel card is ~$1000, so it's a big win for the community as a whole to have good Intel drivers going forward. My work is commercially proprietary so I can't share my code, but I can certainly share ideas on things that I've tested and discovered. Barney From rwatson at FreeBSD.org Tue Apr 7 15:52:52 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Tue Apr 7 15:52:58 2009 Subject: Advice on a multithreaded netisr patch?
In-Reply-To: <497906.25422.qm@web63906.mail.re1.yahoo.com> References: <497906.25422.qm@web63906.mail.re1.yahoo.com> Message-ID: On Tue, 7 Apr 2009, Barney Cordoba wrote: >>> When I enabled LOCK_PROFILING my side modules, such as >> if_ibg, stopped working. It seems that the ifnet structure or something >> changed with that option enabled. Is there a way to sync this without >> having to integrate everything into a specific kernel build? >> >> LOCK_PROFILING changes the size of lock-related data structures, so >> requires both kernel and full set of modules to be rebuilt with the option. > > It might be good to mention this in the man page. Most 3rd party drivers > build stand-alone, and even if you pull down the latest drivers from intel > or broadcom they're usually built out of the kernel build. Its pretty > frustrating to have random things failing, mbuf leaks, etc without any > warning. From the man page: NOTES The LOCK_PROFILING option increases the size of struct lock_object, so a kernel built with that option will not work with modules built without it. Robert N M Watson Computer Laboratory University of Cambridge From barney_cordoba at yahoo.com Tue Apr 7 16:00:33 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Tue Apr 7 16:00:38 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: Message-ID: <532949.28323.qm@web63907.mail.re1.yahoo.com> --- On Tue, 4/7/09, Robert Watson wrote: > From: Robert Watson > Subject: Re: Advice on a multithreaded netisr patch? > To: "Barney Cordoba" > Cc: freebsd-net@freebsd.org, "Ivan Voras" > Date: Tuesday, April 7, 2009, 6:52 PM > On Tue, 7 Apr 2009, Barney Cordoba wrote: > > >>> When I enabled LOCK_PROFILING my side modules, > such as > >> if_ibg, stopped working. It seems that the ifnet > structure or something changed with that option enabled. Is > there a way to sync this without having to integrate > everything into a specific kernel build?
> >> > >> LOCK_PROFILING changes the size of lock-related > data structures, so requires both kernel and full set of > modules to be rebuilt with the option. > > > > It might be good to mention this in the man page. Most > 3rd party drivers build stand-alone, and even if you pull > down the latest drivers from intel or broadcom they're > usually built out of the kernel build. Its pretty > frustrating to have random things failing, mbuf leaks, etc > without any warning. > > From the man page: > > NOTES > The LOCK_PROFILING option increases the size of struct > lock_object, so a > kernel built with that option will not work with > modules built without > it. Nice work. Its not in the 7.0 man page, unfortunately for me :( BC From barney_cordoba at yahoo.com Tue Apr 7 16:01:51 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Tue Apr 7 16:01:57 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: Message-ID: <362116.58661.qm@web63908.mail.re1.yahoo.com> --- On Tue, 4/7/09, Robert Watson wrote: > From: Robert Watson > Subject: Re: Advice on a multithreaded netisr patch? > To: "Barney Cordoba" > Cc: freebsd-net@freebsd.org, "Ivan Voras" > Date: Tuesday, April 7, 2009, 6:52 PM > On Tue, 7 Apr 2009, Barney Cordoba wrote: > > >>> When I enabled LOCK_PROFILING my side modules, > such as > >> if_ibg, stopped working. It seems that the ifnet > structure or something changed with that option enabled. Is > there a way to sync this without having to integrate > everything into a specific kernel build? > >> > >> LOCK_PROFILING changes the size of lock-related > data structures, so requires both kernel and full set of > modules to be rebuilt with the option. > > > > It might be good to mention this in the man page. Most > 3rd party drivers build stand-alone, and even if you pull > down the latest drivers from intel or broadcom they're > usually built out of the kernel build. 
Its pretty > frustrating to have random things failing, mbuf leaks, etc > without any warning. > > From the man page: > > NOTES > The LOCK_PROFILING option increases the size of struct > lock_object, so a > kernel built with that option will not work with > modules built without > it Nevermind. Obviously I just plain missed it. BC From wahjava.ml at gmail.com Tue Apr 7 23:25:25 2009 From: wahjava.ml at gmail.com (Ashish SHUKLA) Date: Tue Apr 7 23:25:32 2009 Subject: getaddrinfo() unable to resolve IPv6 addresses In-Reply-To: References: <87y6ud5p62.fsf@chateau.d.lf> Message-ID: <20090408062558.GA10933@chateau.d.lf> In , Hajimu UMEMOTO wrote: [...] > >No, I believe it was already fixed. Please, re-cvsup and try it. I re-cvsup'ed it and it worked, thanks for the reply. -- Ashish SHUKLA -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 196 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090408/c0770b66/attachment.pgp From barney_cordoba at yahoo.com Wed Apr 8 04:48:24 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Wed Apr 8 04:48:30 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: <49DC3961.8090707@sepehrs.com> Message-ID: <477001.91824.qm@web63902.mail.re1.yahoo.com> --- On Wed, 4/8/09, H.Fazaeli wrote: > From: H.Fazaeli > Subject: Re: Advice on a multithreaded netisr patch? > To: barney_cordoba@yahoo.com > Cc: freebsd-net@freebsd.org, "Ivan Voras" > Date: Wednesday, April 8, 2009, 1:42 AM > Barney Cordoba wrote: > > > > > > --- On Tue, 4/7/09, Ivan Voras > wrote: > > > > > >> From: Ivan Voras > >> Subject: Re: Advice on a multithreaded netisr > patch? > >> To: freebsd-net@freebsd.org > >> Date: Tuesday, April 7, 2009, 5:59 PM > >> Barney Cordoba wrote: > >> > >> > >>> 1) Multiple TX queues are not supported. 
> There's > >>> > >> some hokey code to > >> > >>> test, but it doesn't properly separate > flows to > >>> > >> the queues. > >> > >>> 2) 2 Rx queues don't work, so only 1 and 4 > work > >>> 3) With 4 queues, it just sucks up CPU under > heavy > >>> > >> load on 4 cpus. It will > >> > >>> blow 4 cpus at a lower load than em will with > 1 > >>> 4) You'll need to fix DMA setup, as it > sets the > >>> > >> alignment requirement > >> > >>> to PAGE_SIZE. I haven't been able to > convince Jack > >>> > >> that its wrong, not > >> > >>> that I've tried very hard since its easy > to just > >>> > >> fix myself. > >> > >> Reading this thread it looks like the development > of both > >> Intel drivers > >> is a bit stalled, doesn't it? AFAIK the em > driver is > >> also > >> semi-officially abandoned, and both from my > experience and > >> others it > >> looks like new development and patches are being > rejected. > >> Time to shop > >> other hardware? > >> > > > > To be fair, the OS doesn't really support > multiqueue yet, or has > > for only a few hours, so lets not go crazy. > > > > It makes a lot more sense to have someone on the > "team" work with > > Jack on improving the performance and working out the > kinks. When > > I asked Jack about the poor performance of if_igb, he > indicated that > > Intel's position is that the drivers are > "just samples", which really > > doesn't give anyone much confidence that they want > to run their business > > on them. You already have Jack doing all of the hard > work; that is > > supporting the new-chip-per-week that intel puts out, > so it seems to > > me the best strategy would be to try to convince Intel > that its in > > their best interest to have drivers that work well so > people don't > > think that their hardware stinks. > > > > As an example, the Chelsio 10gb bypass card is $3495. > and an Intel > > card is ~$1000, so its a big win for the community as > a whole to have > > good intel drivers going forward. 
> > > > My work is commercially proprietary so I can't > share my code, but > > I can certainly share ideas on things that I've > tested and discovered. > > > > > can you provide more details on the improvements you > achieved? > > > Barney > > > > > > > > _______________________________________________ > > freebsd-net@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-net > > To unsubscribe, send any mail to > "freebsd-net-unsubscribe@freebsd.org" > > > > > > -- As all developers know, programming is 90% learning and 10% code. So far, I've implemented multiqueue for 7.x and gotten everything to work for both igb and ixgbe. igb isn't all that interesting since em can easily handle 1 Gb/s; so ixgbe is really the goal. The igb and ixgbe are similar designs so the work is somewhat parallel. As of now, I'm working on separating the theory from the real world and getting a feel for which design techniques work best. I'm also *not* designing for a system that uses the stack (a filtering firewall type system), so the things that Robert talks about apply differently. A web server, for example, will likely only have 1 controller and will have many user threads; while a router or firewall will have 2 equally loaded NICs with few if any user threads. It's quite likely that completely different approaches are needed to optimize each. I'm at the point of testing design approaches. So the jury is out as to what can be achieved. What I can say is that multiqueue isn't a panacea or even desirable if it's not designed correctly. Out of the box, increasing the number of queues just to "spread interrupts" doesn't seem to have any advantage; in fact it seems to make things worse in terms of utilization. I'm not entirely sure why as of yet. Barney From barney_cordoba at yahoo.com Wed Apr 8 06:05:10 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Wed Apr 8 06:05:17 2009 Subject: Advice on a multithreaded netisr patch?
In-Reply-To: Message-ID: <871699.35154.qm@web63906.mail.re1.yahoo.com> --- On Mon, 4/6/09, Robert Watson wrote: > From: Robert Watson > Subject: Re: Advice on a multithreaded netisr patch? > To: "Ivan Voras" > Cc: freebsd-net@freebsd.org > Date: Monday, April 6, 2009, 2:52 PM > On Mon, 6 Apr 2009, Ivan Voras wrote: > > >> I think we're talking slightly at cross > purposes. There are two > >> transfers of interest: > >> > >> (1) DMA of the packet data to main memory from the > NIC > >> (2) Servicing of CPU cache misses to access data > in main memory > >> > >> By the time you receive an interrupt, the DMA is > complete, so once you > > > > OK, this was what was confusing me - for a moment I > thought you meant it's not so. > > It's a polite lie that we will choose to believe the > purposes of simplification. And probably true for all our > drivers in practice right now. > > >> m = m_pullup(m, sizeof(*w)); > >> if (m == NULL) > >> return; > >> w = mtod(m, struct whatever *); > >> > >> m_pullup() here ensures that the first sizeof(*w) > bytes of mbuf data are contiguously stored so that the cast > of w to m's data will point at a > > > > So, m_pullup() can resize / realloc() the mbuf? (not > that it matters for this purpose) > > Yes -- if it can't meet the contiguity requirements > using the current mbuf chain, it may reallocate and return a > new head to the chain (hence m being reassigned). If that > reallocation fails, it may return NULL. Once you've > called m_pullup(), existing pointers into the chain's > data will be invalid, so if you've already called mtod() > on it, you need to call it again. > > >> - A TCP segment will need to be ACK'd, so if > you're sending data in > >> chunks in > >> one direction, the ACKs will not be piggy-backed > on existing data > >> tranfers, > >> and instead be sent independently, hitting the > network stack two more > >> times. > > > > No combination of these can make an accounting > difference between 1,000 and 250,000 pps. 
I must be hitting > something very bad here. > > Yes, you definitely want to run tcpdump to see what's > going on here. > > >> - Remember that TCP works to expand its window, > and then maintains the > >> highest > >> performance it can by bumping up against the top > of available bandwidth > >> continuously. This involves detecting buffer > limits by generating > >> packets > >> that can't be sent, adding to the packet > count. With loopback > >> traffic, the > >> drop point occurs when you exceed the size of > the netisr's queue for > >> IP, so > >> you might try bumping that from the default to > something much larger. Robert, Is there any work being done on lighter weight locks for queues? It seems ridiculous to avoid using queues because of lock contention when the locks are only protecting a couple lines of code. Barney From rwatson at FreeBSD.org Wed Apr 8 06:16:54 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Wed Apr 8 06:17:01 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: <871699.35154.qm@web63906.mail.re1.yahoo.com> References: <871699.35154.qm@web63906.mail.re1.yahoo.com> Message-ID: On Wed, 8 Apr 2009, Barney Cordoba wrote: > Is there any work being done on lighter weight locks for queues? It seems > ridiculous to avoid using queues because of lock contention when the locks > are only protecting a couple lines of code. My reading is that there are two, closely related, things going on: the first is lock contention, and the second is cache line contention. We have a primitive in 8.x (don't think it's been MFC'd yet) for a lockless atomic buffer primitive for use in drivers and other parts of the stack. However, that addresses only lock contention, not line contention, which at a high PPS will be an issue as well. Only by moving to independent data structures (i.e., on independent cache lines) can we reduce line contention. 
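A minimal sketch of the line-contention point above, not code from the thread or from FreeBSD: it assumes 64-byte cache lines, and the names (`packed_counters`, `padded_counters`, `cache_line_index`) are hypothetical, chosen only for illustration. Per-queue counters packed into one array still share a cache line no matter how they are locked, so CPUs updating different queues keep bouncing that line between them. Padding each counter to its own line is one way to get the "independent cache lines" described above.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64	/* assumption: 64-byte cache lines */
#define NQUEUES         4	/* assumption: four RX queues */

/* Packed layout: all four counters fit inside a single cache line. */
struct packed_counters {
	uint64_t rx_packets[NQUEUES];
};

/* Padded layout: each counter is aligned to, and fills, its own line. */
struct padded_counter {
	alignas(CACHE_LINE_SIZE) uint64_t rx_packets;
};

struct padded_counters {
	struct padded_counter q[NQUEUES];
};

static_assert(sizeof(struct padded_counter) == CACHE_LINE_SIZE,
    "each padded counter should occupy exactly one cache line");

/*
 * Index of the cache line containing a byte offset, assuming the
 * enclosing structure itself starts on a cache-line boundary.
 */
static size_t
cache_line_index(size_t offset)
{
	return (offset / CACHE_LINE_SIZE);
}

/* Cache line holding queue's counter in the packed layout. */
static size_t
packed_line(size_t queue)
{
	return (cache_line_index(offsetof(struct packed_counters, rx_packets) +
	    queue * sizeof(uint64_t)));
}

/* Cache line holding queue's counter in the padded layout. */
static size_t
padded_line(size_t queue)
{
	return (cache_line_index(offsetof(struct padded_counters, q) +
	    queue * sizeof(struct padded_counter)));
}
```

With these layouts, `packed_line()` returns the same index for every queue, while `padded_line()` returns a distinct index per queue. The cost is memory: the padded structure is 256 bytes instead of 32.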
Robert N M Watson Computer Laboratory University of Cambridge From barney_cordoba at yahoo.com Wed Apr 8 06:18:48 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Wed Apr 8 06:18:54 2009 Subject: Advice on a multithreaded netisr patch? In-Reply-To: Message-ID: <75700.80930.qm@web63905.mail.re1.yahoo.com> --- On Tue, 4/7/09, Robert Watson wrote: > From: Robert Watson > Subject: Re: Advice on a multithreaded netisr patch? > To: "Barney Cordoba" > Cc: freebsd-net@freebsd.org, "Ivan Voras" > Date: Tuesday, April 7, 2009, 8:56 AM > On Tue, 7 Apr 2009, Barney Cordoba wrote: > > >> Have you tried LOCK_PROFILING? It would quickly > tell you if driver locks were a source of significant > contention. It works quite well... > > > > When I enabled LOCK_PROFILING my side modules, such as > if_ibg, stopped working. It seems that the ifnet structure > or something changed with that option enabled. Is there a > way to sync this without having to integrate everything into > a specific kernel build? > > LOCK_PROFILING changes the size of lock-related data > structures, so requires both kernel and full set of modules > to be rebuilt with the option. What are the units for lock profiling? For example, the "average wait" is in what units? Is there a way to reset the stats counters? If not, it might be nifty if toggling prof.enable reset the stats to run some different kinds of tests without rebooting. Barney From spawk at acm.poly.edu Wed Apr 8 06:54:39 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Wed Apr 8 06:54:47 2009 Subject: Multi-BSS problem with Atheros 5212 Message-ID: <49DCAC1F.9000708@acm.poly.edu> Ahoy. I'm having trouble with multiple hostap-mode wlan pseudo-devices. 
The machine is an 8-CURRENT from yesterday: # uname -a FreeBSD test 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Tue Apr 7 16:54:56 UTC 2009 root@test:/usr/obj/usr/src/sys/GENERIC i386 # dmesg | grep ath ath0: mem 0xf4100000-0xf410ffff irq 11 at device 13.0 on pci0 ath0: [ITHREAD] ath0: AR2413 mac 7.9 RF2413 phy 4.5 # cat /etc/rc.conf wlans_ath0="wlan0 wlan1 wlan2" create_args_wlan0="wlanmode hostap bssid" create_args_wlan1="wlanmode hostap bssid" create_args_wlan2="wlanmode hostap bssid" ifconfig_wlan0="ssid wlan0 wepmode off up" ifconfig_wlan1="ssid wlan1 wepmode off up" ifconfig_wlan2="ssid wlan2 wepmode off up" # ifconfig ath0: flags=8843 metric 0 mtu 2290 ether 00:18:e7:33:5e:24 media: IEEE 802.11 Wireless Ethernet autoselect mode 11g status: running fxp0: flags=8843 metric 0 mtu 1500 options=8 ether 00:90:27:72:c4:f3 inet 10.0.0.128 netmask 0xffffff00 broadcast 10.0.0.255 media: Ethernet autoselect (100baseTX ) status: active lo0: flags=8049 metric 0 mtu 16384 options=3 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3 inet6 ::1 prefixlen 128 inet 127.0.0.1 netmask 0xff000000 wlan0: flags=8843 metric 0 mtu 1500 ether 00:18:e7:33:5e:24 media: IEEE 802.11 Wireless Ethernet autoselect mode 11g status: running ssid wlan0 channel 11 (2462 Mhz 11g) bssid 00:18:e7:33:5e:24 country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 protmode CTS wme burst dtimperiod 1 -dfs wlan1: flags=8843 metric 0 mtu 1500 ether 06:18:e7:33:5e:24 media: IEEE 802.11 Wireless Ethernet autoselect mode 11g status: running ssid wlan1 channel 11 (2462 Mhz 11g) bssid 06:18:e7:33:5e:24 country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 protmode CTS wme burst dtimperiod 1 -dfs wlan2: flags=8843 metric 0 mtu 1500 ether 0a:18:e7:33:5e:24 media: IEEE 802.11 Wireless Ethernet autoselect mode 11g status: running ssid wlan2 channel 11 (2462 Mhz 11g) bssid 0a:18:e7:33:5e:24 country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 protmode CTS wme burst dtimperiod 1 -dfs The client is a 
7.0 machine with another 5212 card: # uname -a FreeBSD peer 7.0-RELEASE-p10 FreeBSD 7.0-RELEASE-p10 #0: Mon Mar 23 09:26:18 EDT 2009 root@peer:/usr/obj/usr/src/sys/PEER i386 # dmesg | grep ath ath_hal: 0.10.5.6 (AR5210, AR5211, AR5212, AR5416, RF5111, RF5112, RF2413, RF5413, RF2133, RF2425, RF2417) ath0: mem 0xa8410000-0xa841ffff irq 11 at device 0.0 on cardbus0 ath0: [ITHREAD] ath0: using obsoleted if_watchdog interface ath0: Ethernet address: 00:14:d1:42:21:5a ath0: mac 7.9 phy 4.5 radio 5.6 The three SSIDs configured on the CURRENT machine show up in a scan: # ifconfig ath0 scan | grep wlan wlan0 00:18:e7:33:5e:24 11 54M -66:-93 100 ES WME wlan1 06:18:e7:33:5e:24 11 54M -65:-93 100 ES WME wlan2 0a:18:e7:33:5e:24 11 54M -65:-93 100 ES WME The client is only able to associate with wlan1, however. When scanning channels while attempting to associate with any of the other ones, it gets stuck on channel 11 for a while before moving on, which seems relevant. Also interesting is the fact that if i do "ifconfig ath0 down" on the CURRENT machine, followed by, for example, "ifconfig ath0 ssid wlan0" (which did not associate before) on the client, followed by "ifconfig ath0 up" on the CURRENT machine, the client will associate with wlan0, but will not be able to associate with wlan1 or wlan2. Any ideas? -Boris From sam at freebsd.org Wed Apr 8 08:25:40 2009 From: sam at freebsd.org (Sam Leffler) Date: Wed Apr 8 08:25:47 2009 Subject: Multi-BSS problem with Atheros 5212 In-Reply-To: <49DCAC1F.9000708@acm.poly.edu> References: <49DCAC1F.9000708@acm.poly.edu> Message-ID: <49DCC1EB.3040706@freebsd.org> Boris Kochergin wrote: > Ahoy. I'm having trouble with multiple hostap-mode wlan > pseudo-devices. 
The machine is an 8-CURRENT from yesterday: > > # uname -a > FreeBSD test 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Tue Apr 7 16:54:56 > UTC 2009 root@test:/usr/obj/usr/src/sys/GENERIC i386 > > # dmesg | grep ath > ath0: mem 0xf4100000-0xf410ffff irq 11 at device 13.0 > on pci0 > ath0: [ITHREAD] > ath0: AR2413 mac 7.9 RF2413 phy 4.5 > > # cat /etc/rc.conf > wlans_ath0="wlan0 wlan1 wlan2" > create_args_wlan0="wlanmode hostap bssid" > create_args_wlan1="wlanmode hostap bssid" > create_args_wlan2="wlanmode hostap bssid" > ifconfig_wlan0="ssid wlan0 wepmode off up" > ifconfig_wlan1="ssid wlan1 wepmode off up" > ifconfig_wlan2="ssid wlan2 wepmode off up" > > # ifconfig > ath0: flags=8843 metric 0 mtu > 2290 > ether 00:18:e7:33:5e:24 > media: IEEE 802.11 Wireless Ethernet autoselect mode 11g > status: running > fxp0: flags=8843 metric 0 mtu > 1500 > options=8 > ether 00:90:27:72:c4:f3 > inet 10.0.0.128 netmask 0xffffff00 broadcast 10.0.0.255 > media: Ethernet autoselect (100baseTX ) > status: active > lo0: flags=8049 metric 0 mtu 16384 > options=3 > inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3 > inet6 ::1 prefixlen 128 > inet 127.0.0.1 netmask 0xff000000 > wlan0: flags=8843 metric 0 mtu > 1500 > ether 00:18:e7:33:5e:24 > media: IEEE 802.11 Wireless Ethernet autoselect mode 11g > status: running > ssid wlan0 channel 11 (2462 Mhz 11g) bssid 00:18:e7:33:5e:24 > country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 > protmode CTS wme burst dtimperiod 1 -dfs > wlan1: flags=8843 metric 0 mtu > 1500 > ether 06:18:e7:33:5e:24 > media: IEEE 802.11 Wireless Ethernet autoselect mode 11g > status: running > ssid wlan1 channel 11 (2462 Mhz 11g) bssid 06:18:e7:33:5e:24 > country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 > protmode CTS wme burst dtimperiod 1 -dfs > wlan2: flags=8843 metric 0 mtu > 1500 > ether 0a:18:e7:33:5e:24 > media: IEEE 802.11 Wireless Ethernet autoselect mode 11g > status: running > ssid wlan2 channel 11 (2462 Mhz 11g) bssid 0a:18:e7:33:5e:24 > 
country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 > protmode CTS wme burst dtimperiod 1 -dfs > > The client is a 7.0 machine with another 5212 card: > > # uname -a > FreeBSD peer 7.0-RELEASE-p10 FreeBSD 7.0-RELEASE-p10 #0: Mon Mar 23 > 09:26:18 EDT 2009 root@peer:/usr/obj/usr/src/sys/PEER i386 > > # dmesg | grep ath > ath_hal: 0.10.5.6 (AR5210, AR5211, AR5212, AR5416, RF5111, RF5112, > RF2413, RF5413, RF2133, RF2425, RF2417) > ath0: mem 0xa8410000-0xa841ffff irq 11 at device 0.0 on > cardbus0 > ath0: [ITHREAD] > ath0: using obsoleted if_watchdog interface > ath0: Ethernet address: 00:14:d1:42:21:5a > ath0: mac 7.9 phy 4.5 radio 5.6 > > The three SSIDs configured on the CURRENT machine show up in a scan: > > # ifconfig ath0 scan | grep wlan > wlan0 00:18:e7:33:5e:24 11 54M -66:-93 100 ES WME > wlan1 06:18:e7:33:5e:24 11 54M -65:-93 100 ES WME > wlan2 0a:18:e7:33:5e:24 11 54M -65:-93 100 ES WME > > The client is only able to associate with wlan1, however. When > scanning channels while attempting to associate with any of the other > ones, it gets stuck on channel 11 for a while before moving on, which > seems relevant. Also interesting is the fact that if i do "ifconfig > ath0 down" on the CURRENT machine, followed by, for example, "ifconfig > ath0 ssid wlan0" (which did not associate before) on the client, > followed by "ifconfig ath0 up" on the CURRENT machine, the client will > associate with wlan0, but will not be able to associate with wlan1 or > wlan2. Any ideas? wlandebug scan+auth+assoc on the client machine will show you why you cannot associate. You can also enable the same info on the ap side to see what it thinks is happening. Sam From barney_cordoba at yahoo.com Wed Apr 8 09:16:21 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Wed Apr 8 09:16:30 2009 Subject: Advice on a multithreaded netisr patch? 
In-Reply-To: <75700.80930.qm@web63905.mail.re1.yahoo.com> Message-ID: <564712.63955.qm@web63905.mail.re1.yahoo.com> --- On Wed, 4/8/09, Barney Cordoba wrote: > From: Barney Cordoba > Subject: Re: Advice on a multithreaded netisr patch? > To: "Robert Watson" > Cc: freebsd-net@freebsd.org, "Ivan Voras" > Date: Wednesday, April 8, 2009, 9:18 AM > --- On Tue, 4/7/09, Robert Watson > wrote: > > > From: Robert Watson > > Subject: Re: Advice on a multithreaded netisr patch? > > To: "Barney Cordoba" > > > Cc: freebsd-net@freebsd.org, "Ivan Voras" > > > Date: Tuesday, April 7, 2009, 8:56 AM > > On Tue, 7 Apr 2009, Barney Cordoba wrote: > > > > >> Have you tried LOCK_PROFILING? It would > quickly > > tell you if driver locks were a source of significant > > contention. It works quite well... > > > > > > When I enabled LOCK_PROFILING my side modules, > such as > > if_ibg, stopped working. It seems that the ifnet > structure > > or something changed with that option enabled. Is > there a > > way to sync this without having to integrate > everything into > > a specific kernel build? > > > > LOCK_PROFILING changes the size of lock-related data > > structures, so requires both kernel and full set of > modules > > to be rebuilt with the option. > > What are the units for lock profiling? For example, the > "average > wait" is in what units? > > Is there a way to reset the stats counters? If not, it > might be nifty if > toggling prof.enable reset the stats to run some different > kinds of > tests without rebooting. > > Barney I know, I know. Read the man page... From sam at freebsd.org Wed Apr 8 09:53:02 2009 From: sam at freebsd.org (Sam Leffler) Date: Wed Apr 8 09:53:19 2009 Subject: Multi-BSS problem with Atheros 5212 In-Reply-To: <49DCC1EB.3040706@freebsd.org> References: <49DCAC1F.9000708@acm.poly.edu> <49DCC1EB.3040706@freebsd.org> Message-ID: <49DCD66B.6040504@freebsd.org> Sam Leffler wrote: > Boris Kochergin wrote: >> Ahoy. 
I'm having trouble with multiple hostap-mode wlan >> pseudo-devices. The machine is an 8-CURRENT from yesterday: >> >> # uname -a >> FreeBSD test 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Tue Apr 7 16:54:56 >> UTC 2009 root@test:/usr/obj/usr/src/sys/GENERIC i386 >> >> # dmesg | grep ath >> ath0: mem 0xf4100000-0xf410ffff irq 11 at device 13.0 >> on pci0 >> ath0: [ITHREAD] >> ath0: AR2413 mac 7.9 RF2413 phy 4.5 >> >> # cat /etc/rc.conf >> wlans_ath0="wlan0 wlan1 wlan2" >> create_args_wlan0="wlanmode hostap bssid" >> create_args_wlan1="wlanmode hostap bssid" >> create_args_wlan2="wlanmode hostap bssid" >> ifconfig_wlan0="ssid wlan0 wepmode off up" >> ifconfig_wlan1="ssid wlan1 wepmode off up" >> ifconfig_wlan2="ssid wlan2 wepmode off up" >> >> # ifconfig >> ath0: flags=8843 metric 0 mtu >> 2290 >> ether 00:18:e7:33:5e:24 >> media: IEEE 802.11 Wireless Ethernet autoselect mode 11g >> status: running >> fxp0: flags=8843 metric 0 mtu >> 1500 >> options=8 >> ether 00:90:27:72:c4:f3 >> inet 10.0.0.128 netmask 0xffffff00 broadcast 10.0.0.255 >> media: Ethernet autoselect (100baseTX ) >> status: active >> lo0: flags=8049 metric 0 mtu 16384 >> options=3 >> inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3 >> inet6 ::1 prefixlen 128 >> inet 127.0.0.1 netmask 0xff000000 >> wlan0: flags=8843 metric 0 >> mtu 1500 >> ether 00:18:e7:33:5e:24 >> media: IEEE 802.11 Wireless Ethernet autoselect mode 11g >> status: running >> ssid wlan0 channel 11 (2462 Mhz 11g) bssid 00:18:e7:33:5e:24 >> country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 >> protmode CTS wme burst dtimperiod 1 -dfs >> wlan1: flags=8843 metric 0 >> mtu 1500 >> ether 06:18:e7:33:5e:24 >> media: IEEE 802.11 Wireless Ethernet autoselect mode 11g >> status: running >> ssid wlan1 channel 11 (2462 Mhz 11g) bssid 06:18:e7:33:5e:24 >> country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 >> protmode CTS wme burst dtimperiod 1 -dfs >> wlan2: flags=8843 metric 0 >> mtu 1500 >> ether 0a:18:e7:33:5e:24 >> media: IEEE 
802.11 Wireless Ethernet autoselect mode 11g >> status: running >> ssid wlan2 channel 11 (2462 Mhz 11g) bssid 0a:18:e7:33:5e:24 >> country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 >> protmode CTS wme burst dtimperiod 1 -dfs >> >> The client is a 7.0 machine with another 5212 card: >> >> # uname -a >> FreeBSD peer 7.0-RELEASE-p10 FreeBSD 7.0-RELEASE-p10 #0: Mon Mar 23 >> 09:26:18 EDT 2009 root@peer:/usr/obj/usr/src/sys/PEER i386 >> >> # dmesg | grep ath >> ath_hal: 0.10.5.6 (AR5210, AR5211, AR5212, AR5416, RF5111, RF5112, >> RF2413, RF5413, RF2133, RF2425, RF2417) >> ath0: mem 0xa8410000-0xa841ffff irq 11 at device 0.0 >> on cardbus0 >> ath0: [ITHREAD] >> ath0: using obsoleted if_watchdog interface >> ath0: Ethernet address: 00:14:d1:42:21:5a >> ath0: mac 7.9 phy 4.5 radio 5.6 >> >> The three SSIDs configured on the CURRENT machine show up in a scan: >> >> # ifconfig ath0 scan | grep wlan >> wlan0 00:18:e7:33:5e:24 11 54M -66:-93 100 ES WME >> wlan1 06:18:e7:33:5e:24 11 54M -65:-93 100 ES WME >> wlan2 0a:18:e7:33:5e:24 11 54M -65:-93 100 ES WME >> >> The client is only able to associate with wlan1, however. When >> scanning channels while attempting to associate with any of the other >> ones, it gets stuck on channel 11 for a while before moving on, which >> seems relevant. Also interesting is the fact that if i do "ifconfig >> ath0 down" on the CURRENT machine, followed by, for example, >> "ifconfig ath0 ssid wlan0" (which did not associate before) on the >> client, followed by "ifconfig ath0 up" on the CURRENT machine, the >> client will associate with wlan0, but will not be able to associate >> with wlan1 or wlan2. Any ideas? > wlandebug scan+auth+assoc on the client machine will show you why you > cannot associate. You can also enable the same info on the ap side to > see what it thinks is happening. FWIW I just setup 3 vap's as you did above and hooked them into a bridge. I verified I could associate and pass traffic using a MBPro. No problems. 
I also destroyed the bridge and re-tested w/o issues. Regardless, the debug msgs should identify what your problem is.

Sam

From fazaeli at sepehrs.com Wed Apr 8 15:35:13 2009
From: fazaeli at sepehrs.com (H.Fazaeli)
Date: Wed Apr 8 15:35:22 2009
Subject: Advice on a multithreaded netisr patch?
In-Reply-To: <900824.65358.qm@web63901.mail.re1.yahoo.com>
References: <900824.65358.qm@web63901.mail.re1.yahoo.com>
Message-ID: <49DC33DD.8000708@sepehrs.com>

Dear Jack,

Can you please comment on the statements below? Is the assertion true for all OSes (Windows, Linux, ...) or is it just FreeBSD? I am actually concerned about how production-ready the igb driver is, in your opinion. As a matter of fact, we have been (and are) using the em driver for years on production systems in the biggest ICPs/ISPs/organizations without problems and we have very good faith in it (I have not tested igb).

Barney Cordoba wrote:

--- On Tue, 4/7/09, Ivan Voras [1] wrote:

From: Ivan Voras [2]
Subject: Re: Advice on a multithreaded netisr patch?
To: [3]freebsd-net@freebsd.org
Date: Tuesday, April 7, 2009, 5:59 PM

Barney Cordoba wrote:

1) Multiple TX queues are not supported. There's some hokey code to test, but it doesn't properly separate flows to the queues.
2) 2 Rx queues don't work, so only 1 and 4 work.
3) With 4 queues, it just sucks up CPU under heavy load on 4 cpus. It will blow 4 cpus at a lower load than em will with 1.
4) You'll need to fix DMA setup, as it sets the alignment requirement to PAGE_SIZE. I haven't been able to convince Jack that it's wrong, not that I've tried very hard since it's easy to just fix myself.

Reading this thread it looks like the development of both Intel drivers is a bit stalled, doesn't it? AFAIK the em driver is also semi-officially abandoned, and both from my experience and others' it looks like new development and patches are being rejected. Time to shop for other hardware?

To be fair, the OS doesn't really support multiqueue yet, or has for only a few hours, so let's not go crazy. It makes a lot more sense to have someone on the "team" work with Jack on improving the performance and working out the kinks. When I asked Jack about the poor performance of if_igb, he indicated that Intel's position is that the drivers are "just samples", which really doesn't give anyone much confidence that they want to run their business on them.

You already have Jack doing all of the hard work, that is, supporting the new-chip-per-week that Intel puts out, so it seems to me the best strategy would be to try to convince Intel that it's in their best interest to have drivers that work well so people don't think that their hardware stinks. As an example, the Chelsio 10gb bypass card is $3495 and an Intel card is ~$1000, so it's a big win for the community as a whole to have good Intel drivers going forward.

My work is commercially proprietary so I can't share my code, but I can certainly share ideas on things that I've tested and discovered.

Barney

_______________________________________________
[4]freebsd-net@freebsd.org mailing list
[5]http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [6]"freebsd-net-unsubscribe@freebsd.org"

--
Best regards.
Hooman Fazaeli [7]
Sepehr S. T. Co. Ltd.
Web: [8]http://www.sepehrs.com
Tel: (9821)88975701-2
Fax: (9821)88983352

References
1. mailto:ivoras@freebsd.org
2. mailto:ivoras@freebsd.org
3. mailto:freebsd-net@freebsd.org
4. mailto:freebsd-net@freebsd.org
5. http://lists.freebsd.org/mailman/listinfo/freebsd-net
6. mailto:freebsd-net-unsubscribe@freebsd.org
7. mailto:hf@sepehrs.com
8.
http://www.sepehrs.com/ From barney_cordoba at yahoo.com Thu Apr 9 01:46:36 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Thu Apr 9 01:46:49 2009 Subject: Advice on a multithreaded netisr patch?
In-Reply-To: Message-ID: <792562.49628.qm@web63901.mail.re1.yahoo.com>

--- On Wed, 4/8/09, Robert Watson wrote:

> From: Robert Watson
> Subject: Re: Advice on a multithreaded netisr patch?
> To: "Barney Cordoba"
> Cc: "Ivan Voras" , freebsd-net@freebsd.org
> Date: Wednesday, April 8, 2009, 9:16 AM
>
> On Wed, 8 Apr 2009, Barney Cordoba wrote:
>
> > Is there any work being done on lighter weight locks for queues? It
> > seems ridiculous to avoid using queues because of lock contention
> > when the locks are only protecting a couple lines of code.
>
> My reading is that there are two, closely related, things going on:
> the first is lock contention, and the second is cache line contention.
> We have a primitive in 8.x (don't think it's been MFC'd yet) for a
> lockless atomic buffer primitive for use in drivers and other parts of
> the stack. However, that addresses only lock contention, not line
> contention, which at a high PPS will be an issue as well. Only by
> moving to independent data structures (i.e., on independent cache
> lines) can we reduce line contention.
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge

Are mutexes smart enough to know to yield immediately to higher priority threads that are waiting? Such as

    mtx_unlock()
    {
        do_unlock_stuff();
        if (higher_pri_waiting)
            sched_yield();
    }

Also, is there a way from the structure or flags to determine if some other thread is waiting on the lock, such as

    mtx_unlock(&mtx);
    if (mtx.someone_is_waiting)
        sched_yield();

or better yet

    if (higher_priority_is_waiting)
        sched_yield();

I don't quite have a handle on how the turnstile works, but it seems that there is a lot of time waiting for very short-lived locks. If the tasks are on different cpus, what is the granularity of the wait time for a lock that is cleared almost immediately after trying it? Also, is the waiting only extended when the threads are running on the same cpu?
Barney

From f.bonnet at esiee.fr Thu Apr 9 02:58:56 2009
From: f.bonnet at esiee.fr (Frank Bonnet)
Date: Thu Apr 9 02:59:03 2009
Subject: IBM X3650 at 7.1 with broadcom chips and CISCO LACP with LAGG driver ?
Message-ID: <49DDC2AB.1090100@esiee.fr>

Hello

I plan to migrate our mailhub to 7.1, but before I do I need some info about networking :-)

The machine is an IBM X3650 that has two Broadcom gigabit Ethernet interfaces.

I want to use the LAGG driver in LACP mode with a Cisco switch to connect the machine to my LAN in bonding mode.

Is there any known network problem at 7.1 with such a driver/machine?

This machine permanently has 500/600 IMAP(S) processes and high SMTP traffic.

Thanks for any info.

From pluknet at gmail.com Thu Apr 9 03:14:41 2009
From: pluknet at gmail.com (pluknet)
Date: Thu Apr 9 03:14:47 2009
Subject: IBM X3650 at 7.1 with broadcom chips and CISCO LACP with LAGG driver ?
In-Reply-To: <49DDC2AB.1090100@esiee.fr>
References: <49DDC2AB.1090100@esiee.fr>
Message-ID:

2009/4/9 Frank Bonnet :
> Hello
>
> I plan to migrate our mailhub to 7.1, but before I do I need some
> info about networking :-)
>
> The machine is an IBM X3650 that has two Broadcom gigabit Ethernet
> interfaces.
[..]
> Is there any known network problem at 7.1 with such a driver/machine?

At work we have one such machine running 7.1-R with MySQL and mail services, without high memory or network pressure though. The last uptime was 43 days. No network problems were discovered during that time.

--
wbr, pluknet

From xernet at hotmail.it Fri Apr 10 00:17:07 2009
From: xernet at hotmail.it (xer)
Date: Fri Apr 10 00:17:29 2009
Subject: watchdog timeout
Message-ID:

Hello, I sent this message of mine to the stable mailing list, then I found that your address is listed as a maintainer for some bugs.
I'm asking whether this report: http://www.freebsd.org/cgi/query-pr.cgi?pr=129352 has any updates, since I haven't found anything new. It talks about PRERELEASE and I'm working on 6.4-STABLE. Also, how is it possible that a kernel compiled in January still has this bug?

Thanks in advance for your response.

Regards

--------------------------------------------------
From: "xer"
Sent: Wednesday, April 08, 2009 10:41 AM
To:
Subject: watchdog timeout

> Hello
> I have some problems with 3Com NICs after an upgrade from 5.5-STABLE
> to 6.4-STABLE.
>
> This machine has two 3Com NICs (one is LAN, the other is WAN) and I
> see too many "watchdog timeout" messages on both cards.
> This up/down flapping on the cards interrupts clients that are
> downloading from the Apache web server, especially on large files.
>
> --------------------------------------------
> xer:/root# dmesg
> xl1: watchdog timeout
> xl1: link state changed to DOWN
> xl1: link state changed to UP
> xl1: watchdog timeout
> xl1: link state changed to DOWN
> xl1: link state changed to UP
> xl1: watchdog timeout
> xl1: link state changed to DOWN
> xl1: link state changed to UP
> ---------------------------------------------
>
> xer:/root# cat /var/run/dmesg.boot | grep xl
> xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xec00-0xec7f mem
> 0xfceffc00-0xfceffc7f irq 23 at device 11.0 on pci2
> miibus0: on xl0
> xlphy0: <3c905C 10/100 internal PHY> on miibus0
> xlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> xl0: Ethernet address: 00:01:02:e0:04:1b
> xl1: <3Com 3c905C-TX Fast Etherlink XL> port 0xe880-0xe8ff mem
> 0xfceff800-0xfceff87f irq 20 at device 12.0 on pci2
> miibus1: on xl1
> xlphy1: <3c905C 10/100 internal PHY> on miibus1
> xlphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> xl1: Ethernet address: 00:01:02:df:fe:ed
> ---------------------------------------------
> Another concern is my kernel config; maybe there is something wrong
> that I cannot see. I'll post it
at the end of this post, 'cause is too long. > > As you can see, the cards are 3c905C-TX model. > Someone told me to change drivers, but i cannot understand this advice. > I got same errors with same cards but with another mainboard, same > problem, watchdog appears after an upgrade from 5.4-STABLE to 6.4-STABLE. > > I don't think that to change nic's pci slots, will solve the problem, i > think that maybe change the nics would resolve the matter, but i cannot > access to both FreeBSD phisically, cause the boxes are too far from me > (about 3500 km). > > I'm asking you some advices, and i can i fix this problem. > p.s. with both 5.4 or 5.5 old kernel, the nics was fine. > > Regards > Xer > > ----------kernel config ----------- > xer:/root# cat /usr/src/sys/i386/conf/ASUS > # > # $FreeBSD: src/sys/i386/conf/GENERIC,v 1.429.2.18 2008/07/28 02:20:29 > yongari Exp $ > # > # custom kernel ASUS 01.15.2009 > > machine i386 > cpu I686_CPU > ident ASUS > > options SCHED_4BSD # 4BSD scheduler > options PREEMPTION # Enable kernel thread preemption > options INET # InterNETworking > options INET6 # IPv6 communications protocols > options FFS # Berkeley Fast Filesystem > options SOFTUPDATES # Enable FFS soft updates support > options UFS_ACL # Support for access control lists > options UFS_DIRHASH # Improve performance on big > directories > options MD_ROOT # MD is a potential root device > options NFSCLIENT # Network Filesystem Client > options NFSSERVER # Network Filesystem Server > options NFSLOCKD # Network Lock Manager > options NFS_ROOT # NFS usable as /, requires > NFSCLIENT > options MSDOSFS # MSDOS Filesystem > options CD9660 # ISO 9660 Filesystem > options PROCFS # Process filesystem (requires > PSEUDOFS) > options PSEUDOFS # Pseudo-filesystem framework > options GEOM_GPT # GUID Partition Tables. > options COMPAT_43 # Compatible with BSD 4.3 [KEEP > THIS!] 
> options COMPAT_FREEBSD4 # Compatible with FreeBSD4 > options COMPAT_FREEBSD5 # Compatible with FreeBSD5 > options KTRACE # ktrace(1) support > options SYSVSHM # SYSV-style shared memory > options SYSVMSG # SYSV-style message queues > options SYSVSEM # SYSV-style semaphores > options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time > extensions > options KBD_INSTALL_CDEV # install a CDEV entry in /dev > options ADAPTIVE_GIANT # Giant mutex is adaptive. > > device apic # I/O APIC > > # Bus support. > device eisa > device pci > > # Floppy drives > device fdc > > # ATA and ATAPI devices > device ata > device atadisk # ATA disk drives > device ataraid # ATA RAID drives > device atapicd # ATAPI CDROM drives > device atapifd # ATAPI floppy drives > device atapist # ATAPI tape drives > options ATA_STATIC_ID # Static device numbering > > # atkbdc0 controls both the keyboard and the PS/2 mouse > device atkbdc # AT keyboard controller > device atkbd # AT keyboard > device psm # PS/2 mouse > > device kbdmux # keyboard multiplexer > > device vga # VGA video card driver > > device splash # Splash screen and screen saver support > > # syscons is the default console driver, resembling an SCO console > device sc > > device agp # support several AGP chipsets > > # Add suspend/resume support for the i8254. > device pmtimer > > # Serial (COM) ports > device sio # 8250, 16[45]50 based serial ports > > # Parallel port > device ppc > device ppbus # Parallel port bus (required) > device lpt # Printer > device plip # TCP/IP over parallel > device ppi # Parallel port interface device > > # PCI Ethernet NICs. > device de # DEC/Intel DC21x4x (``Tulip'') > device em # Intel PRO/1000 adapter Gigabit Ethernet > Card > device ixgb # Intel PRO/10GbE Ethernet Card > device txp # 3Com 3cR990 (``Typhoon'') > device vx # 3Com 3c590, 3c595 (``Vortex'') > > # PCI Ethernet NICs that use the common MII bus controller code. 
> # NOTE: Be sure to keep the 'device miibus' line in order to use these > NICs! > device miibus # MII bus support > device bce # Broadcom BCM5706/BCM5708 Gigabit > Ethernet > device bfe # Broadcom BCM440x 10/100 Ethernet > device bge # Broadcom BCM570xx Gigabit Ethernet > device dc # DEC/Intel 21143 and various workalikes > device fxp # Intel EtherExpress PRO/100B (82557, > 82558) > device jme # JMicron JMC250 Gigabit/JMC260 Fast > Ethernet > device lge # Level 1 LXT1001 gigabit Ethernet > device msk # Marvell/SysKonnect Yukon II Gigabit > Ethernet > device nge # NatSemi DP83820 gigabit Ethernet > device nve # nVidia nForce MCP on-board Ethernet > Networking > device pcn # AMD Am79C97x PCI 10/100(precedence over > 'lnc') > device re # RealTek 8139C+/8169/8169S/8110S > device rl # RealTek 8129/8139 > device sf # Adaptec AIC-6915 (``Starfire'') > device sis # Silicon Integrated Systems SiS 900/SiS > 7016 > device sk # SysKonnect SK-984x & SK-982x gigabit > Ethernet > device ste # Sundance ST201 (D-Link DFE-550TX) > device stge # Sundance/Tamarack TC9021 gigabit > Ethernet > device ti # Alteon Networks Tigon I/II gigabit > Ethernet > device tl # Texas Instruments ThunderLAN > device tx # SMC EtherPower II (83c170 ``EPIC'') > device vge # VIA VT612x gigabit Ethernet > device vr # VIA Rhine, Rhine II > device wb # Winbond W89C840F > device xl # 3Com 3c90x (``Boomerang'', ``Cyclone'') > > # ISA Ethernet NICs. pccard NICs included. > device cs # Crystal Semiconductor CS89x0 NIC > # 'device ed' requires 'device miibus' > device ed # NE[12]000, SMC Ultra, 3c503, DS8390 > cards > device ex # Intel EtherExpress Pro/10 and Pro/10+ > device ep # Etherlink III based cards > device fe # Fujitsu MB8696x based cards > device ie # EtherExpress 8/16, 3C507, StarLAN 10 > etc. > device lnc # NE2100, NE32-VL Lance Ethernet cards > device sn # SMC's 9000 series of Ethernet chips > device xe # Xircom pccard Ethernet > > # Pseudo devices. 
> device loop # Network loopback > device random # Entropy device > device ether # Ethernet support > device sl # Kernel SLIP > device ppp # Kernel PPP > device tun # Packet tunnel. > device pty # Pseudo-ttys (telnet etc) > device md # Memory "disks" > device gif # IPv6 and IPv4 tunneling > device faith # IPv6-to-IPv4 relaying (translation) > > # The `bpf' device enables the Berkeley Packet Filter. > # Be aware of the administrative consequences of enabling this! > # Note that 'bpf' is required for DHCP. > device bpf # Berkeley packet filter > > # Firewall > options IPFIREWALL # enable ipfirewall > (required for dummynet) > options IPFIREWALL_VERBOSE # enable firewall output > logging to syslogd(8) > options IPFIREWALL_VERBOSE_LIMIT=0 # limit firewall verbosity > output > options IPDIVERT # divert sockets > options DUMMYNET # enable dummynet > operation > options HZ=1000 # set the timer > granularity > > From to.my.trociny at gmail.com Fri Apr 10 05:10:06 2009 From: to.my.trociny at gmail.com (Mikolaj Golub) Date: Fri Apr 10 05:10:12 2009 Subject: kern/131310: [netgraph] [panic] 7.1 panics with mpd netgraph interface changes Message-ID: <200904101210.n3ACA4Hp092072@freefall.freebsd.org> The following reply was made to PR kern/131310; it has been noted by GNATS. From: Mikolaj Golub To: bug-followup@FreeBSD.org,Vitaly Dodonov Cc: Semenchuk Oleg Subject: Re: kern/131310: [netgraph] [panic] 7.1 panics with mpd netgraph interface changes Date: Fri, 10 Apr 2009 15:09:38 +0300 This pr is closely related to kern/130977. You can try the patch from it, which adds if_delgroup(ifp, IFG_ALL) to if_detach(). -- Mikolaj Golub From bzeeb-lists at lists.zabbadoz.net Fri Apr 10 05:40:08 2009 From: bzeeb-lists at lists.zabbadoz.net (Bjoern A. 
Zeeb) Date: Fri Apr 10 05:40:14 2009 Subject: IPv6 window scaling factor always 1 on initial SYN In-Reply-To: <20090407.165708.74744827.sthaug@nethelp.no> References: <20090405215842.C15361@maildrop.int.zabbadoz.net> <20090406.121959.74751582.sthaug@nethelp.no> <20090407144311.F15361@maildrop.int.zabbadoz.net> <20090407.165708.74744827.sthaug@nethelp.no> Message-ID: <20090410123559.D15361@maildrop.int.zabbadoz.net> On Tue, 7 Apr 2009, sthaug@nethelp.no wrote: >>> I changed it, and that worked like a dream. Now I get basically the >>> same throughput with IPv4 and IPv6. There are of course still issues >>> like lots of IPv6 tunnels that add extra latency - but that's not the >>> fault of FreeBSD. >>> >>> Anyway, thanks for your work. Below is a context diff (against 7-STABLE >>> cvsupped last night). Do we need a PR to get this into FreeBSD? >> >> It's in HEAD now as of SVN r190800. > > Excellent news, thank you! And presumably we'll get a MFC after a > suitable settling time? If 3 days were suitable;) It'll be part of 7.2-R as it is in stable/7 now. Thanks a lot for reporting and testing! /bz -- Bjoern A. Zeeb The greatest risk is not taking one. From dfilter at FreeBSD.ORG Fri Apr 10 07:50:05 2009 From: dfilter at FreeBSD.ORG (dfilter service) Date: Fri Apr 10 07:50:16 2009 Subject: kern/131310: commit references a PR Message-ID: <200904101450.n3AEo44I008054@freefall.freebsd.org> The following reply was made to PR kern/131310; it has been noted by GNATS. From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/131310: commit references a PR Date: Fri, 10 Apr 2009 14:42:02 +0000 (UTC) Author: mlaier Date: Fri Apr 10 14:41:51 2009 New Revision: 190895 URL: http://svn.freebsd.org/changeset/base/190895 Log: Remove interfaces from IFG_ALL on detach. This cures a couple of pf panics when using the "self" keyword in tables or as ()-style host address and fixes "ifconfig -g all" output. 
PR: kern/130977, kern/131310 Submitted by: Mikolaj Golub MFC after: 3 days Modified: head/sys/net/if.c Modified: head/sys/net/if.c ============================================================================== --- head/sys/net/if.c Fri Apr 10 14:24:12 2009 (r190894) +++ head/sys/net/if.c Fri Apr 10 14:41:51 2009 (r190895) @@ -887,6 +887,7 @@ if_detach(struct ifnet *ifp) rt_ifannouncemsg(ifp, IFAN_DEPARTURE); EVENTHANDLER_INVOKE(ifnet_departure_event, ifp); devctl_notify("IFNET", ifp->if_xname, "DETACH", NULL); + if_delgroup(ifp, IFG_ALL); IF_AFDATA_LOCK(ifp); for (dp = domains; dp; dp = dp->dom_next) { _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From kfl at xiplink.com Fri Apr 10 08:02:20 2009 From: kfl at xiplink.com (Karim Fodil-Lemelin) Date: Fri Apr 10 08:02:27 2009 Subject: m_tag, malloc vs uma Message-ID: <49DF5F75.6080607@xiplink.com> Hello, Is there any plans on getting the mbuf tags sub-system integrated with the universal memory allocator? Getting tags for mbufs is still calling malloc in uipc_mbuf.c ... What would be the benefits of using uma instead? Karim. From spawk at acm.poly.edu Fri Apr 10 10:08:02 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Fri Apr 10 10:08:10 2009 Subject: Multi-BSS problem with Atheros 5212 In-Reply-To: <49DCD66B.6040504@freebsd.org> References: <49DCAC1F.9000708@acm.poly.edu> <49DCC1EB.3040706@freebsd.org> <49DCD66B.6040504@freebsd.org> Message-ID: <49DF7CE9.6060706@acm.poly.edu> Sam Leffler wrote: > Sam Leffler wrote: >> Boris Kochergin wrote: >>> Ahoy. I'm having trouble with multiple hostap-mode wlan >>> pseudo-devices. 
The machine is an 8-CURRENT from yesterday: >>> >>> # uname -a >>> FreeBSD test 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Tue Apr 7 16:54:56 >>> UTC 2009 root@test:/usr/obj/usr/src/sys/GENERIC i386 >>> >>> # dmesg | grep ath >>> ath0: mem 0xf4100000-0xf410ffff irq 11 at device 13.0 >>> on pci0 >>> ath0: [ITHREAD] >>> ath0: AR2413 mac 7.9 RF2413 phy 4.5 >>> >>> # cat /etc/rc.conf >>> wlans_ath0="wlan0 wlan1 wlan2" >>> create_args_wlan0="wlanmode hostap bssid" >>> create_args_wlan1="wlanmode hostap bssid" >>> create_args_wlan2="wlanmode hostap bssid" >>> ifconfig_wlan0="ssid wlan0 wepmode off up" >>> ifconfig_wlan1="ssid wlan1 wepmode off up" >>> ifconfig_wlan2="ssid wlan2 wepmode off up" >>> >>> # ifconfig >>> ath0: flags=8843 metric 0 >>> mtu 2290 >>> ether 00:18:e7:33:5e:24 >>> media: IEEE 802.11 Wireless Ethernet autoselect mode 11g >>> >>> status: running >>> fxp0: flags=8843 metric 0 >>> mtu 1500 >>> options=8 >>> ether 00:90:27:72:c4:f3 >>> inet 10.0.0.128 netmask 0xffffff00 broadcast 10.0.0.255 >>> media: Ethernet autoselect (100baseTX ) >>> status: active >>> lo0: flags=8049 metric 0 mtu 16384 >>> options=3 >>> inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3 >>> inet6 ::1 prefixlen 128 >>> inet 127.0.0.1 netmask 0xff000000 >>> wlan0: flags=8843 metric 0 >>> mtu 1500 >>> ether 00:18:e7:33:5e:24 >>> media: IEEE 802.11 Wireless Ethernet autoselect mode 11g >>> >>> status: running >>> ssid wlan0 channel 11 (2462 Mhz 11g) bssid 00:18:e7:33:5e:24 >>> country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 >>> protmode CTS wme burst dtimperiod 1 -dfs >>> wlan1: flags=8843 metric 0 >>> mtu 1500 >>> ether 06:18:e7:33:5e:24 >>> media: IEEE 802.11 Wireless Ethernet autoselect mode 11g >>> >>> status: running >>> ssid wlan1 channel 11 (2462 Mhz 11g) bssid 06:18:e7:33:5e:24 >>> country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 >>> protmode CTS wme burst dtimperiod 1 -dfs >>> wlan2: flags=8843 metric 0 >>> mtu 1500 >>> ether 0a:18:e7:33:5e:24 >>> media: IEEE 
802.11 Wireless Ethernet autoselect mode 11g >>> >>> status: running >>> ssid wlan2 channel 11 (2462 Mhz 11g) bssid 0a:18:e7:33:5e:24 >>> country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60 >>> protmode CTS wme burst dtimperiod 1 -dfs >>> >>> The client is a 7.0 machine with another 5212 card: >>> >>> # uname -a >>> FreeBSD peer 7.0-RELEASE-p10 FreeBSD 7.0-RELEASE-p10 #0: Mon Mar 23 >>> 09:26:18 EDT 2009 root@peer:/usr/obj/usr/src/sys/PEER i386 >>> >>> # dmesg | grep ath >>> ath_hal: 0.10.5.6 (AR5210, AR5211, AR5212, AR5416, RF5111, RF5112, >>> RF2413, RF5413, RF2133, RF2425, RF2417) >>> ath0: mem 0xa8410000-0xa841ffff irq 11 at device 0.0 >>> on cardbus0 >>> ath0: [ITHREAD] >>> ath0: using obsoleted if_watchdog interface >>> ath0: Ethernet address: 00:14:d1:42:21:5a >>> ath0: mac 7.9 phy 4.5 radio 5.6 >>> >>> The three SSIDs configured on the CURRENT machine show up in a scan: >>> >>> # ifconfig ath0 scan | grep wlan >>> wlan0 00:18:e7:33:5e:24 11 54M -66:-93 100 ES WME >>> wlan1 06:18:e7:33:5e:24 11 54M -65:-93 100 ES WME >>> wlan2 0a:18:e7:33:5e:24 11 54M -65:-93 100 ES WME >>> >>> The client is only able to associate with wlan1, however. When >>> scanning channels while attempting to associate with any of the >>> other ones, it gets stuck on channel 11 for a while before moving >>> on, which seems relevant. Also interesting is the fact that if i do >>> "ifconfig ath0 down" on the CURRENT machine, followed by, for >>> example, "ifconfig ath0 ssid wlan0" (which did not associate before) >>> on the client, followed by "ifconfig ath0 up" on the CURRENT >>> machine, the client will associate with wlan0, but will not be able >>> to associate with wlan1 or wlan2. Any ideas? >> wlandebug scan+auth+assoc on the client machine will show you why you >> cannot associate. You can also enable the same info on the ap side >> to see what it thinks is happening. > > FWIW I just setup 3 vap's as you did above and hooked them into a > bridge. 
I verified I could associate and pass traffic using a MBPro. > No problems. I also destroyed the bridge and re-tested w/o issues. > Regardless the debug msgs should identify what your problem is. > > Sam > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" I booted the hostap machine up and set wlandebug to scan+auth+assoc on wlan0, wlan1, and wlan2. I then inserted the PCMCIA card into the client machine, set wlandebug to scan+auth+assoc on it (ath0), and executed "ifconfig ath0 ssid wlan0 up". I let it scan around for a bit. The client-side debug messages are at http://acm.poly.edu/~spawk/wlan/wlan0.client, and the hostap machine did not emit any debug messages during the association attempts. I then ejected the card from the client and repeated the process for wlan1 (it associated). The client-side debug messages are at http://acm.poly.edu/~spawk/wlan/wlan1.client and the hostap-side debug messages are at http://acm.poly.edu/~spawk/wlan/wlan1.ap. I then ejected the card from the client and repeated the process for wlan2. The client-side debug messages are at http://acm.poly.edu/~spawk/wlan/wlan2.client, and the hostap machine did not emit any debug messages during the association attempts. In case it's relevant, the client card is a PCMCIA version of... ath0@pci0:5:0:0: class=0x020000 card=0x2051168c chip=0x0013168c rev=0x01 hdr=0x00 vendor = 'Atheros Communications Inc.' device = 'AR5212, AR5213 802.11a/b/g Wireless Adapter' class = network subclass = ethernet ...and the hostap card is a PCI version of the same thing: ath0@pci0:0:13:0: class=0x020000 card=0x2051168c chip=0x0013168c rev=0x01 hdr=0x00 vendor = 'Atheros Communications Inc.' 
    device     = 'AR5212, AR5213 802.11a/b/g Wireless Adapter'
    class      = network
    subclass   = ethernet

-Boris

From rwatson at FreeBSD.org  Fri Apr 10 11:55:20 2009
From: rwatson at FreeBSD.org (Robert Watson)
Date: Fri Apr 10 11:55:27 2009
Subject: m_tag, malloc vs uma
In-Reply-To: <49DF5F75.6080607@xiplink.com>
References: <49DF5F75.6080607@xiplink.com>
Message-ID:

On Fri, 10 Apr 2009, Karim Fodil-Lemelin wrote:

> Is there any plans on getting the mbuf tags sub-system integrated with the
> universal memory allocator? Getting tags for mbufs is still calling malloc
> in uipc_mbuf.c ... What would be the benefits of using uma instead?

Hi Karim:

Right now there are no specific plans for changes along these lines,
although we have talked about moving towards better support for deep
objects in m_tags.  Right now, MAC requires a "deep" copy, because labels
may be complex objects, and this is special-cased in the m_tag code.  One
way to move in that direction would be to move from an explicit m_tag free
pointer to a pointer to a vector of copy, free, etc, operations.  This
would make it easier to support more flexible memory models there, rather
than forcing the use of malloc(9).

That said, malloc(9) for "small" memory types is essentially a thin wrapper
accounting around a set of fixed-size UMA zones:

ITEM      SIZE   LIMIT    USED    FREE   REQUESTS  FAILURES
16:         16,      0,   3703,    966,  55930783,        0
32:         32,      0,   1455,    692,  30720298,        0
64:         64,      0,   4794,   1224,  38352819,        0
128:       128,      0,   3169,    341,   5705218,        0
256:       256,      0,   1565,    535,  48338889,        0
512:       512,      0,    386,    494,   9962475,        0
1024:     1024,      0,     66,    354,   3418306,        0
2048:     2048,      0,    314,    514,     29945,        0
4096:     4096,      0,    250,    279,   4567645,        0

For larger memory sizes, malloc(9) becomes instead a thin wrapper around VM
allocation of kernel address space and pages.  So as long as you're using
smaller objects, malloc(9) actually offers most of the benefits of slab
allocation.
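
As a rough illustration of the size-class behaviour described above, here is
a minimal userland sketch, not the kernel's malloc(9) code: a request is
served from the smallest fixed-size zone that fits.  The zone sizes are
copied from the table above; the names zone_sizes and bucket_for, and the
convention of returning 0 for requests too large for any zone, are
assumptions made only for this example.

```c
#include <stddef.h>

/* Zone sizes copied from the table above (16 .. 4096 bytes). */
const size_t zone_sizes[] = { 16, 32, 64, 128, 256, 512, 1024, 2048, 4096 };

/*
 * Hypothetical helper: return the size of the smallest zone that can hold
 * a request of `size` bytes, or 0 when the request is larger than every
 * zone (the case the text says falls through to VM page allocation).
 */
size_t
bucket_for(size_t size)
{
	size_t i;

	for (i = 0; i < sizeof(zone_sizes) / sizeof(zone_sizes[0]); i++) {
		if (size <= zone_sizes[i])
			return (zone_sizes[i]);
	}
	return (0);
}
```

Under these assumptions a 20-byte request lands in the 32-byte zone, and a
4097-byte request matches no zone at all.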
Because m_tag(9) is an interface used for a variety of base system and third party parts, changes to the KPI would need to be made with a major FreeBSD release -- for example with 8.0. Such a change is definitely not precluded at this point, but in a couple of months we'll hit feature freeze and it won't be possible to make those changes after that time. Robert N M Watson Computer Laboratory University of Cambridge From dfilter at FreeBSD.ORG Fri Apr 10 12:20:06 2009 From: dfilter at FreeBSD.ORG (dfilter service) Date: Fri Apr 10 12:20:16 2009 Subject: kern/131310: commit references a PR Message-ID: <200904101920.n3AJK5rg070896@freefall.freebsd.org> The following reply was made to PR kern/131310; it has been noted by GNATS. From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/131310: commit references a PR Date: Fri, 10 Apr 2009 19:16:29 +0000 (UTC) Author: mlaier Date: Fri Apr 10 19:16:14 2009 New Revision: 190903 URL: http://svn.freebsd.org/changeset/base/190903 Log: Follow up for r190895 It's not only the "all" group that is affected, but all groups on the given interface. 
  PR:		kern/130977, kern/131310
  MFC after:	3 days (%vnet)

Modified:
  head/sys/net/if.c

Modified: head/sys/net/if.c
==============================================================================
--- head/sys/net/if.c	Fri Apr 10 18:46:46 2009	(r190902)
+++ head/sys/net/if.c	Fri Apr 10 19:16:14 2009	(r190903)
@@ -141,6 +141,7 @@ static int	if_delmulti_locked(struct ifn
 static void	do_link_state_change(void *, int);
 static int	if_getgroup(struct ifgroupreq *, struct ifnet *);
 static int	if_getgroupmembers(struct ifgroupreq *);
+static void	if_delgroups(struct ifnet *);
 
 #ifdef INET6
 /*
@@ -887,7 +888,7 @@ if_detach(struct ifnet *ifp)
 	rt_ifannouncemsg(ifp, IFAN_DEPARTURE);
 	EVENTHANDLER_INVOKE(ifnet_departure_event, ifp);
 	devctl_notify("IFNET", ifp->if_xname, "DETACH", NULL);
-	if_delgroup(ifp, IFG_ALL);
+	if_delgroups(ifp);
 
 	IF_AFDATA_LOCK(ifp);
 	for (dp = domains; dp; dp = dp->dom_next) {
@@ -1025,6 +1026,54 @@ if_delgroup(struct ifnet *ifp, const cha
 }
 
 /*
+ * Remove an interface from all groups
+ */
+static void
+if_delgroups(struct ifnet *ifp)
+{
+	INIT_VNET_NET(ifp->if_vnet);
+	struct ifg_list *ifgl;
+	struct ifg_member *ifgm;
+	char groupname[IFNAMSIZ];
+
+	IFNET_WLOCK();
+	while (!TAILQ_EMPTY(&ifp->if_groups)) {
+		ifgl = TAILQ_FIRST(&ifp->if_groups);
+
+		strlcpy(groupname, ifgl->ifgl_group->ifg_group, IFNAMSIZ);
+
+		IF_ADDR_LOCK(ifp);
+		TAILQ_REMOVE(&ifp->if_groups, ifgl, ifgl_next);
+		IF_ADDR_UNLOCK(ifp);
+
+		TAILQ_FOREACH(ifgm, &ifgl->ifgl_group->ifg_members, ifgm_next)
+			if (ifgm->ifgm_ifp == ifp)
+				break;
+
+		if (ifgm != NULL) {
+			TAILQ_REMOVE(&ifgl->ifgl_group->ifg_members, ifgm,
+			    ifgm_next);
+			free(ifgm, M_TEMP);
+		}
+
+		if (--ifgl->ifgl_group->ifg_refcnt == 0) {
+			TAILQ_REMOVE(&V_ifg_head, ifgl->ifgl_group, ifg_next);
+			EVENTHANDLER_INVOKE(group_detach_event,
+			    ifgl->ifgl_group);
+			free(ifgl->ifgl_group, M_TEMP);
+		}
+		IFNET_WUNLOCK();
+
+		free(ifgl, M_TEMP);
+
+		EVENTHANDLER_INVOKE(group_change_event, groupname);
+
+		IFNET_WLOCK();
+	}
+	IFNET_WUNLOCK();
+}
+
+/* * Stores all groups from an interface in memory pointed * to by data */ _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From kfl at xiplink.com Fri Apr 10 12:32:04 2009 From: kfl at xiplink.com (Karim Fodil-Lemelin) Date: Fri Apr 10 12:32:11 2009 Subject: m_tag, malloc vs uma In-Reply-To: References: <49DF5F75.6080607@xiplink.com> Message-ID: <49DF9EAD.1050609@xiplink.com> Robert Watson wrote: > On Fri, 10 Apr 2009, Karim Fodil-Lemelin wrote: > >> Is there any plans on getting the mbuf tags sub-system integrated >> with the universal memory allocator? Getting tags for mbufs is still >> calling malloc in uipc_mbuf.c ... What would be the benefits of using >> uma instead? > > Hi Karim: > > Right now there are no specific plans for changes along these lines, > although we have talked about moving towards better support for deep > objects in m_tags. Right now, MAC requires a "deep" copy, because > labels may be complex objects, and this is special-cased in the m_tag > code. One way to move in that direction would be to move from an > explicit m_tag free pointer to a pointer to a vector of copy, free, > etc, operations. This would make it easier to support more flexible > memory models there, rather than forcing the use of malloc(9). 
> > That said, malloc(9) for "small" memory types is essentially a thin > wrapper accounting around a set of fixed-size UMA zones: > > ITEM SIZE LIMIT USED FREE REQUESTS > FAILURES > 16: 16, 0, 3703, 966, > 55930783, 0 > 32: 32, 0, 1455, 692, > 30720298, 0 > 64: 64, 0, 4794, 1224, > 38352819, 0 > 128: 128, 0, 3169, 341, > 5705218, 0 > 256: 256, 0, 1565, 535, > 48338889, 0 > 512: 512, 0, 386, 494, > 9962475, 0 > 1024: 1024, 0, 66, 354, > 3418306, 0 > 2048: 2048, 0, 314, 514, > 29945, 0 > 4096: 4096, 0, 250, 279, > 4567645, 0 > > For larger memory sizes, malloc(9) becomes instead a thin wrapper > around VM allocation of kernel address space and pages. So as long as > you're using smaller objects, malloc(9) actually offers most of the > benefits of slab allocation. > > Because m_tag(9) is an interface used for a variety of base system and > third party parts, changes to the KPI would need to be made with a > major FreeBSD release -- for example with 8.0. Such a change is > definitely not precluded at this point, but in a couple of months > we'll hit feature freeze and it won't be possible to make those > changes after that time. > > Robert N M Watson > Computer Laboratory > University of Cambridge Hi Robert, Thank you for the answer, clear and concise. I asked the question because I had modified pf_get_mtag() to use uma directly in the hope that it would be faster then calling malloc. 
But since pf_mtag is 20bytes, malloc will end up using a fixed 32bytes zone
and I shouldn't expect much speed gain from using something like (except
some savings from not having to select the 32bytes zone):

extern uma_zone_t pf_mtag_zone;

static __inline struct pf_mtag *
pf_get_mtag(struct mbuf *m)
{
	struct m_tag *mtag;

	if ((mtag = m_tag_find(m, PACKET_TAG_PF, NULL)) == NULL) {
		mtag = uma_zalloc(pf_mtag_zone, M_NOWAIT);
		if (mtag == NULL)
			return (NULL);
		m_tag_setup(mtag, MTAG_ABI_COMPAT, PACKET_TAG_PF,
		    sizeof(struct pf_mtag));
		mtag->m_tag_free = pf_mtag_delete;
		bzero(mtag + 1, sizeof(struct pf_mtag));
		m_tag_prepend(m, mtag);
	}

	return ((struct pf_mtag *)(mtag + 1));
}

Where pf_mtag_delete is a wrapper around uma_zfree().

Regards,

Karim.

From rwatson at FreeBSD.org  Fri Apr 10 13:02:44 2009
From: rwatson at FreeBSD.org (Robert Watson)
Date: Fri Apr 10 13:02:51 2009
Subject: m_tag, malloc vs uma
In-Reply-To: <49DF9EAD.1050609@xiplink.com>
References: <49DF5F75.6080607@xiplink.com> <49DF9EAD.1050609@xiplink.com>
Message-ID:

On Fri, 10 Apr 2009, Karim Fodil-Lemelin wrote:

> Thank you for the answer, clear and concise. I asked the question because I
> had modified pf_get_mtag() to use uma directly in the hope that it would be
> faster then calling malloc. But since pf_mtag is 20bytes, malloc will end up
> using a fixed 32bytes zone and I shouldn't expect much speed gain from using
> something like (except some savings from not having to select the 32bytes
> zone):

There is another small overhead, the critical section used to protect the
consistency of the per-CPU malloc type alloc and free counters, but it's
also very small.

I think it would be desirable to make a change to more flexible m_tag types
for 8.0, but I'm not sure I have time to implement/test it.  Is this
something you might be interested in working on?
I'm thinking of basically replacing the m_tag_free pointer with a pointer to a small vector of operations, possibly something along these lines: struct m_tag_ops { void (*m_tag_free)(struct m_tag *); struct m_tag (*m_tag_copy)(struct m_tag *); }; If the m_tag_ops pointer is NULL, we go with today's default (requiring minimal change of existing consumers). I'm not sure if there are any other function pointers we'd need at this point? Robert N M Watson Computer Laboratory University of Cambridge From glen.j.barber at gmail.com Fri Apr 10 13:30:05 2009 From: glen.j.barber at gmail.com (Glen Barber) Date: Fri Apr 10 13:30:14 2009 Subject: misc/129580: Netgear WG311v3 (ndis) causes kenel trap at boot. Message-ID: <200904102030.n3AKU4Lj067093@freefall.freebsd.org> The following reply was made to PR kern/129580; it has been noted by GNATS. From: Glen Barber To: bug-followup@freebsd.org Cc: Subject: Re: misc/129580: Netgear WG311v3 (ndis) causes kenel trap at boot. Date: Fri, 10 Apr 2009 16:04:33 -0400 Since malo(4) is available, I believe this PR can be closed. Thanks. -- Glen Barber From linimon at FreeBSD.org Fri Apr 10 14:13:08 2009 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Fri Apr 10 14:13:19 2009 Subject: kern/133572: [ppp] [hang] incoming PPTP connection hangs the system Message-ID: <200904102113.n3ALD7Fi037625@freefall.freebsd.org> Old Synopsis: incoming PPTP connection hangs the system New Synopsis: [ppp] [hang] incoming PPTP connection hangs the system Responsible-Changed-From-To: freebsd-bugs->freebsd-net Responsible-Changed-By: linimon Responsible-Changed-When: Fri Apr 10 21:11:38 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=133572 From max at love2party.net Fri Apr 10 16:10:05 2009 From: max at love2party.net (Max Laier) Date: Fri Apr 10 16:10:15 2009 Subject: kern/133572: [ppp] [hang] incoming PPTP connection hangs the system Message-ID: <200904102310.n3ANA46d086898@freefall.freebsd.org> The following reply was made to PR kern/133572; it has been noted by GNATS. From: Max Laier To: bug-followup@freebsd.org, dennis.melentyev@gmail.com Cc: Subject: Re: kern/133572: [ppp] [hang] incoming PPTP connection hangs the system Date: Fri, 10 Apr 2009 23:47:55 +0100 Is it possible for you to turn on WITNESS on this machine to obtain possible LORs that might be responsible for the hang? Also, do you have the possibility to enable DDB and drop into it from the console (if it is not a hard hang but a live lock)? -- Max From ccowart at rescomp.berkeley.edu Fri Apr 10 18:54:39 2009 From: ccowart at rescomp.berkeley.edu (Chris Cowart) Date: Fri Apr 10 18:54:45 2009 Subject: bridge(4) and IPv6 link-local address In-Reply-To: <20080630220842.X83875@maildrop.int.zabbadoz.net> References: <48693E39.4080104@ab.ote.we.lv> <20080630220842.X83875@maildrop.int.zabbadoz.net> Message-ID: <20090411013834.GB40655@hal.rescomp.berkeley.edu> Bjoern A. Zeeb wrote: > On Mon, 30 Jun 2008, Eugene M. Kim wrote: > > A quick question: Is bridge(4) supposed /not/ to automatically configure an > > IPv6 link-local address? > > yes there is a check for this in the code and if remoed (tried that > lately) more things go wrong. > > > I'm trying to use it to bridge a wired segment and a wireless segment, and > > router advertisement over bridge0 wouldn't work because, with bridge0 lacking > > a LL address, the router uses a non-LL address as the source address for RA > > packets, which then is ignored as invalid by other IPv6 nodes. > > yes, seem something similar lately but ETIMEOUT on debugging. 
The > problem basically was: > > lan bridge ath --- wlan client > > the LL address was on the "lan" interface. > > ping6 LL on lan from wlan client did not work. I could see the packets > being bridged and visible on all interfaces and even the router on lan > noticed them but there was no reply going to the client. ping6 from > the bridge ``box'' to the wlan client and everything was fine as nd > was seeded. > > Removing the check we ended up with the same LL address on both bridge > and the lan interface if I can remember correctly and you do not want > that... it's a bit tricky and there is something that does not work as > expected, right. If you find the time to debug it I'll happily test > patches;-) I seem to be reviving a fairly old thread here, but this is what I found when I went searching for the issue. I am personally bridging a wireless NIC (ath0) with a VLAN interface (vlan10). The bridge does not receive a link-local address. The bridge interface (bridge0) is the default gateway for my LAN, both for v4 and v6. My Mac was logging this message in response to router advertisements: | Apr 10 18:16:54 administrators-imac configd[29]: RTADV_VERIFY_PACKET: | invalid RA with non link-local source from 2001:4830:1679:10::1 on en0 and was refusing to acknowledge them. My solution was to assign a link-local address to bridge0 based on the ethernet address (I think I did the EUI-48 stuff correctly): | bridge0: flags=8843 metric 0 mtu 1500 | ether 92:db:a2:b4:8e:ba | inet 10.1.10.1 netmask 0xffffff00 broadcast 10.1.10.255 | inet6 2001:4830:1679:10::1 prefixlen 64 | inet6 fe80::90db:a2ff:feb4:83ba%bridge0 prefixlen 64 scopeid 0xc | id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15 | maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200 | root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0 According to ifconfig(8): | Basic IPv6 node operation requires a link-local address on each interface | configured for IPv6. 
Normally, such an address is automatically config- | ured by the kernel on each interface added to the system; this behaviour | may be disabled by setting the sysctl MIB variable | net.inet6.ip6.auto_linklocal to 0. The bridge(4) page does not add any disclaimer about bridge interfaces. Neither man page gives a good how-to on assigning your own link-local address (I guessed and got it right with the % notation). Shouldn't the kernel assign link-local addresses to these interfaces? Should this address be based on the ethernet address of the bridge interface? I'm not sure I really understood the challenges with the implementation. -- Chris Cowart Network Technical Lead Network & Infrastructure Services, RSSP-IT UC Berkeley -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 834 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090411/e7decf89/attachment.pgp From freebsd at levsha.org.ua Sat Apr 11 01:50:04 2009 From: freebsd at levsha.org.ua (Mykola Dzham) Date: Sat Apr 11 01:50:11 2009 Subject: bin/131365: r190758 break using 0 , 0/0, 0.0.0.0/0 as alias for 'default' Message-ID: <200904110850.n3B8o2ka010510@freefall.freebsd.org> The following reply was made to PR bin/131365; it has been noted by GNATS. From: Mykola Dzham To: bug-followup@FreeBSD.org, rrs@FreeBSD.org Cc: Subject: Re: bin/131365: r190758 break using 0 , 0/0, 0.0.0.0/0 as alias for 'default' Date: Sat, 11 Apr 2009 11:20:20 +0300 --UugvWAfsgieZRqgk Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi! 
r190758 breaks using 0.0.0.0/0 as an alias for the default route:

$ route -n get default
   route to: default
destination: default
       mask: default
    gateway: 192.168.1.1
  interface: em0
      flags:
 recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
       0         0         0         0         0         0      1500         0

$ route -n get -net 0.0.0.0
route: writing to routing socket: No such process

The attached patch fixes this.

--
Mykola Dzham, LEFT-(UANIC|RIPE)
JID: levsha@jabber.net.ua

--UugvWAfsgieZRqgk
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="route.c.patch"

Index: route.c
===================================================================
--- route.c	(revision 190880)
+++ route.c	(working copy)
@@ -818,7 +818,8 @@
 		/* i holds the first non zero bit */
 		bits = 32 - (i*8);
 	}
-	mask = 0xffffffff << (32 - bits);
+	if (bits != 0)
+		mask = 0xffffffff << (32 - bits);
 
 	sin->sin_addr.s_addr = htonl(addr);
 	sin = &so_mask.sin;

--UugvWAfsgieZRqgk--

From rrs at lakerest.net  Sat Apr 11 03:10:03 2009
From: rrs at lakerest.net (Randall Stewart)
Date: Sat Apr 11 03:10:10 2009
Subject: bin/131365: r190758 break using 0 , 0/0, 0.0.0.0/0 as alias for 'default'
Message-ID: <200904111010.n3BAA2QQ016570@freefall.freebsd.org>

The following reply was made to PR bin/131365; it has been noted by GNATS.

From: Randall Stewart
To: Mykola Dzham
Cc: bug-followup@FreeBSD.org, rrs@FreeBSD.org
Subject: Re: bin/131365: r190758 break using 0 , 0/0, 0.0.0.0/0 as alias for 'default'
Date: Sat, 11 Apr 2009 06:04:37 -0400

 Good catch Mykola.. I will get this in :)

 R

 On Apr 11, 2009, at 4:20 AM, Mykola Dzham wrote:

 > Hi!
> r190758 break using 0.0.0.0/0 as alias for default rote: > > $ route -n get default > route to: default > destination: default > mask: default > gateway: 192.168.1.1 > interface: em0 > flags: > recvpipe sendpipe ssthresh rtt,msec rttvar hopcount > mtu expire > 0 0 0 0 0 0 > 1500 0 > > $ route -n get -net 0.0.0.0 > route: writing to routing socket: No such process > > Attached patch fix this > > -- > Mykola Dzham, LEFT-(UANIC|RIPE) > JID: levsha@jabber.net.ua > ------------------------------ Randall Stewart 803-317-4952 (cell) 803-345-0391(direct) From kfl at xiplink.com Sat Apr 11 12:56:37 2009 From: kfl at xiplink.com (Karim Fodil-Lemelin) Date: Sat Apr 11 12:56:43 2009 Subject: m_tag, malloc vs uma In-Reply-To: References: <49DF5F75.6080607@xiplink.com> <49DF9EAD.1050609@xiplink.com> Message-ID: <49E0F5EF.3030807@xiplink.com> Robert Watson wrote: > On Fri, 10 Apr 2009, Karim Fodil-Lemelin wrote: > >> Thank you for the answer, clear and concise. I asked the question >> because I had modified pf_get_mtag() to use uma directly in the hope >> that it would be faster then calling malloc. But since pf_mtag is >> 20bytes, malloc will end up using a fixed 32bytes zone and I >> shouldn't expect much speed gain from using something like (except >> some savings from not having to select the 32bytes zone): > > There is another small overhead, the critical section used to protect > the consistency of the per-CPU malloc type alloc and free counters, > but it's also very small. > > I think it would be desirable to make a change to more flexible m_tag > types for 8.0, but I'm not sure I have time to implement/test it. Is > this something you might be interested in working on? 
I'm thinking of > basically replacing the m_tag_free pointer with a pointer to a small > vector of operations, possibly something along these lines: > > struct m_tag_ops { > void (*m_tag_free)(struct m_tag *); > struct m_tag (*m_tag_copy)(struct m_tag *); > }; > > If the m_tag_ops pointer is NULL, we go with today's default > (requiring minimal change of existing consumers). I'm not sure if > there are any other function pointers we'd need at this point? Is the m_tag_copy an 'overloaded' function for the current m_tag_copy or something else? Now it could also be interesting to have another function pointer to overload m_tag_alloc to give more control over which zone the user wants its tags from (ex: pf_mtag ...). The interest is there not sure if the schedule will allow it but that depends if the new m_tag designs allows me to squeeze some performances in. Karim. From sam at freebsd.org Sat Apr 11 13:27:11 2009 From: sam at freebsd.org (Sam Leffler) Date: Sat Apr 11 13:27:18 2009 Subject: m_tag, malloc vs uma In-Reply-To: <49E0F5EF.3030807@xiplink.com> References: <49DF5F75.6080607@xiplink.com> <49DF9EAD.1050609@xiplink.com> <49E0F5EF.3030807@xiplink.com> Message-ID: <49E0FD1D.408@freebsd.org> Karim Fodil-Lemelin wrote: > Robert Watson wrote: >> On Fri, 10 Apr 2009, Karim Fodil-Lemelin wrote: >> >>> Thank you for the answer, clear and concise. I asked the question >>> because I had modified pf_get_mtag() to use uma directly in the hope >>> that it would be faster then calling malloc. But since pf_mtag is >>> 20bytes, malloc will end up using a fixed 32bytes zone and I >>> shouldn't expect much speed gain from using something like (except >>> some savings from not having to select the 32bytes zone): >> >> There is another small overhead, the critical section used to protect >> the consistency of the per-CPU malloc type alloc and free counters, >> but it's also very small. 
>> >> I think it would be desirable to make a change to more flexible m_tag >> types for 8.0, but I'm not sure I have time to implement/test it. Is >> this something you might be interested in working on? I'm thinking >> of basically replacing the m_tag_free pointer with a pointer to a >> small vector of operations, possibly something along these lines: >> >> struct m_tag_ops { >> void (*m_tag_free)(struct m_tag *); >> struct m_tag (*m_tag_copy)(struct m_tag *); >> }; >> >> If the m_tag_ops pointer is NULL, we go with today's default >> (requiring minimal change of existing consumers). I'm not sure if >> there are any other function pointers we'd need at this point? > > Is the m_tag_copy an 'overloaded' function for the current m_tag_copy > or something else? Now it could also be interesting to have another > function pointer to overload m_tag_alloc to give more control over > which zone the user wants its tags from (ex: pf_mtag ...). The > interest is there not sure if the schedule will allow it but that > depends if the new m_tag designs allows me to squeeze some > performances in. Typically tags are allocated in a context where decisions like the above can be made so I'm not sure where you think m_tag_alloc might be used. At one point vlan-tagged packets were identified by an mbuf tag. Initially they were allocated by malloc but I moved that to a dedicated zone w/ a noticeable benefit. However the overhead was still too high and so we now space was added to the mbuf pkt hdr explicitly to hold vlan data. It's unlikely any scheme where the tags are allocated independent of the mbufs will scale well enough to handle existing high speed interfaces. There's been discussion about supporting emedding of tags in the mbuf itself; this might come along as part of the variable-size mbuf work that Jeff Roberson was working on. 
However unless one pre-allocated space and/or defined a general mechanism for managing such space you'd still potentially need to allocate tags separately when they are attached at a later time. For embedded/inline mbuf tag space management I think m_tag_free and m_tag_copy would sufficient for current usage. Sam From gnats at FreeBSD.org Sat Apr 11 14:52:06 2009 From: gnats at FreeBSD.org (gnats@FreeBSD.org) Date: Sat Apr 11 14:52:13 2009 Subject: kern/133613: [wpi] [panic] kernel panic in wpi(4) Message-ID: <200904112152.n3BLq5sF079871@freefall.freebsd.org> Old Synopsis: kernel panic in wpi(4) New Synopsis: [wpi] [panic] kernel panic in wpi(4) Responsible-Changed-From-To: freebsd-bugs->freebsd-net Responsible-Changed-By: gnats Responsible-Changed-When: Sat Apr 11 21:51:42 UTC 2009 Responsible-Changed-Why: http://www.freebsd.org/cgi/query-pr.cgi?pr=133613 From mi+thun at aldan.algebra.com Sat Apr 11 18:56:52 2009 From: mi+thun at aldan.algebra.com (Mikhail T.) Date: Sat Apr 11 18:56:59 2009 Subject: natd interferes with incoming RTSP/RTP Message-ID: <49E145D0.4060609@aldan.algebra.com> Hello! I'm trying to watch video via RTSP/RTP from a remote net-camera on my 7.0-STABLE/i386 from July 6th: vlc --verbose 2 rtsp://user:password@remote.example.com/nphMpeg4/g726-320x240 Things work fine, when my machine has the firewall disabled. Unfortunately, the machine is also in charge of protecting and NAT-ing for a small LAN, s keeping the ipfw down for long is not an option. Yet, with my usual firewall setup (the modified "simple" -- altered to not care, what the outside IP-address is, because it changes via DHCP), things time-out... However, if I disable just one of the rules below -- 1300, the one diverting all traffic to natd -- the video works fine... So it is not any of the other rules, that are the problem, nor is it the remote server... Why would this happen and how do I solve the problem? Thanks! Yours, -mi P.S. Output of /etc/rc.d/ipfw showing the rules, etc. 
net.inet.ip.fw.enable: 1 -> 0
Stopping natd.
Waiting for PIDS: 62054, 62054, 62054, 62054, 62054.
Starting natd.
Loading /lib/libalias_cuseeme.so
Loading /lib/libalias_ftp.so
Loading /lib/libalias_irc.so
Loading /lib/libalias_nbt.so
Loading /lib/libalias_pptp.so
Loading /lib/libalias_skinny.so
Loading /lib/libalias_smedia.so
Flushed all rules.
00100 allow ip from any to any via lo0
00200 deny ip from any to 127.0.0.0/8
00300 deny ip from 127.0.0.0/8 to any
00400 deny ip from 192.168.1.0/24 to any in via nve0
00500 deny ip from any to 10.0.0.0/8 via nve0
00600 deny ip from any to 172.16.0.0/12 via nve0
00700 deny ip from any to 192.168.0.0/16 via nve0
00800 deny ip from any to 0.0.0.0/8 via nve0
00900 deny ip from any to 169.254.0.0/16 via nve0
01000 deny ip from any to 192.0.2.0/24 via nve0
01100 deny ip from any to 224.0.0.0/4 via nve0
01200 deny ip from any to 240.0.0.0/4 via nve0
/01300 divert 8668 ip from any to any via nve0/
01400 deny ip from 10.0.0.0/8 to any via nve0
01500 deny ip from 172.16.0.0/12 to any via nve0
01600 deny ip from 192.168.0.0/16 to any via nve0
01700 deny ip from 0.0.0.0/8 to any via nve0
01800 deny ip from 169.254.0.0/16 to any via nve0
01900 deny ip from 192.0.2.0/24 to any via nve0
02000 deny ip from 224.0.0.0/4 to any via nve0
02100 deny ip from 240.0.0.0/4 to any via nve0
02200 allow tcp from any to any established
02300 allow ip from any to any frag
02400 allow tcp from any to any dst-port 22 setup
02500 allow tcp from any to any dst-port 25 setup
02600 allow tcp from any to any dst-port 53 setup
02700 allow udp from any to any dst-port 53
02800 allow udp from any 53 to any
02900 allow tcp from any to any dst-port 80 setup
03000 allow tcp from any to any dst-port 2875 setup
03100 allow tcp from any to any dst-port 2885 setup
03200 allow tcp from any to any dst-port 2890 setup
03300 allow tcp from any to any dst-port 2895 setup
03400 allow tcp from any to any dst-port 2990 setup
03500 deny log logamount 100 tcp from any to any in
via nve0 setup 03600 allow tcp from any to any setup 03700 allow udp from any to any dst-port 53 keep-state 03800 allow udp from any to any dst-port 123 keep-state Firewall rules loaded. net.inet.ip.fw.enable: 0 -> 1 From p.pisati at oltrelinux.com Sun Apr 12 05:14:42 2009 From: p.pisati at oltrelinux.com (Paolo Pisati) Date: Sun Apr 12 05:14:49 2009 Subject: natd interferes with incoming RTSP/RTP In-Reply-To: <49E145D0.4060609@aldan.algebra.com> References: <49E145D0.4060609@aldan.algebra.com> Message-ID: <49E1D88F.30005@oltrelinux.com> Mikhail T. wrote: > However, if I disable just one of the rules below -- 1300, the one > diverting all traffic to natd -- the video works fine... So it is not > any of the other rules, that are the problem, nor is it the remote > server... Why would this happen and how do I solve the problem? Thanks! > comment all the entries in /etc/libalias.conf, restart or send an HUP to natd and see if it helps. From rwatson at FreeBSD.org Sun Apr 12 07:25:17 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sun Apr 12 07:35:10 2009 Subject: m_tag, malloc vs uma In-Reply-To: <49E0F5EF.3030807@xiplink.com> References: <49DF5F75.6080607@xiplink.com> <49DF9EAD.1050609@xiplink.com> <49E0F5EF.3030807@xiplink.com> Message-ID: On Sat, 11 Apr 2009, Karim Fodil-Lemelin wrote: >> I think it would be desirable to make a change to more flexible m_tag types >> for 8.0, but I'm not sure I have time to implement/test it. Is this >> something you might be interested in working on? I'm thinking of basically >> replacing the m_tag_free pointer with a pointer to a small vector of >> operations, possibly something along these lines: >> >> struct m_tag_ops { >> void (*m_tag_free)(struct m_tag *); >> struct m_tag (*m_tag_copy)(struct m_tag *); >> }; >> >> If the m_tag_ops pointer is NULL, we go with today's default (requiring >> minimal change of existing consumers). I'm not sure if there are any other >> function pointers we'd need at this point? 
> > Is the m_tag_copy an 'overloaded' function for the current m_tag_copy or > something else? Now it could also be interesting to have another function > pointer to overload m_tag_alloc to give more control over which zone the > user wants its tags from (ex: pf_mtag ...). The interest is there not sure > if the schedule will allow it but that depends if the new m_tag designs > allows me to squeeze some performances in. My feeling is that, for types not maintained by the m_tag framework itself, the m_tag_ops.m_tag_copy() method should take an existing m_tag and produce a copy of it appropriate for inserting on the list of a copied mbuf header. That way both the allocation and copying of the m_tag are left to the subsystem that owns it, allowing it to use its own memory type, perform deep copying or reference counting of other structures, etc. Robert N M Watson Computer Laboratory University of Cambridge From mi+thun at aldan.algebra.com Sun Apr 12 12:25:41 2009 From: mi+thun at aldan.algebra.com (Mikhail T.) Date: Sun Apr 12 13:21:14 2009 Subject: natd interferes with incoming RTSP/RTP In-Reply-To: <49E1D88F.30005@oltrelinux.com> References: <49E145D0.4060609@aldan.algebra.com> <49E1D88F.30005@oltrelinux.com> Message-ID: <49E24031.3050901@aldan.algebra.com> Paolo Pisati ???????(??): > Mikhail T. wrote: >> However, if I disable just one of the rules below -- 1300, the one >> diverting all traffic to natd -- the video works fine... So it is not >> any of the other rules, that are the problem, nor is it the remote >> server... Why would this happen and how do I solve the problem? Thanks! >> > comment all the entries in /etc/libalias.conf, restart or send an HUP > to natd and see if it helps. Great pointer! As a matter of fact, all I had to comment out was the /lib/libalias_smedia.so... Now, what's wrong with it? Does not disabling this plugin mean, the hosts on the LAN can't access RTSP streams? Thanks! 
Yours, -mi From p.pisati at oltrelinux.com Sun Apr 12 14:36:44 2009 From: p.pisati at oltrelinux.com (Paolo Pisati) Date: Sun Apr 12 14:59:40 2009 Subject: natd interferes with incoming RTSP/RTP In-Reply-To: <49E24031.3050901@aldan.algebra.com> References: <49E145D0.4060609@aldan.algebra.com> <49E1D88F.30005@oltrelinux.com> <49E24031.3050901@aldan.algebra.com> Message-ID: <49E25EE9.3040309@oltrelinux.com> Mikhail T. wrote: > Great pointer! As a matter of fact, all I had to comment out was the > /lib/libalias_smedia.so... > > Now, what's wrong with it? Does not disabling this plugin mean, the > hosts on the LAN can't access RTSP streams? Thanks! Yours, > try this patch: http://people.freebsd.org/~piso/alias_smedia.c.patch From brett at lariat.net Sun Apr 12 17:50:03 2009 From: brett at lariat.net (Brett Glass) Date: Sun Apr 12 18:24:54 2009 Subject: bin/130159: [patch] ppp(8) fails to correctly set routes Message-ID: <200904130050.n3D0o2ko001727@freefall.freebsd.org> The following reply was made to PR bin/130159; it has been noted by GNATS. From: Brett Glass To: bug-followup@FreeBSD.org, loos.br@gmail.com Cc: Subject: Re: bin/130159: [patch] ppp(8) fails to correctly set routes Date: Sun, 12 Apr 2009 18:41:27 -0600 Note: With the patch as written, the "gateway" (G) flag is set in the routing table entry. This does not seem to cause problems, but the flag should not be set because the "tun" interface is acting as a bridge, not a gateway. From craigcocca at yahoo.com Sun Apr 12 21:46:09 2009 From: craigcocca at yahoo.com (Craig Cocca) Date: Sun Apr 12 22:04:55 2009 Subject: Problem using Carp with NAT for High Availability Firewall Message-ID: <798192.81782.qm@web31108.mail.mud.yahoo.com> I have been experimenting recently with using Carp on FreeBSD 6.1 to implement a high-availability firewall. 
I have two FreeBSD 6.1 machines set up, each with their own static IP
address, and both machines share a virtual IP (VIP), which is the gateway IP
for the machines behind the firewalls. My network topology looks like this:

                     Internet Switch
                            |
           |--------------------------------|
       Firewall 1                       Firewall 2
        10.0.0.1                         10.0.0.2
                    192.168.0.1 (VIP)
    |-------------------------|-------------------|
 Server 1                  Server 2            Server N

I have been successful in getting the two firewall machines set up so that
the slave machine takes over the VIP from the master if the master machine
loses connectivity. However, when the master comes back online and takes
over the VIP again, I'm noticing something really odd, namely that traffic
starts going to the master again but ends up getting "swallowed alive" by
the kernel.

In other words, I can have one of the machines behind the firewalls sending
out a ping to a host on the Internet when the slave is servicing the VIP,
and I will see traffic on Firewall 2's (slave's) inside and outside
interfaces. As soon as the master comes online and takes over the VIP from
the slave again, I see the traffic switch to the inside interface of the
master (I see this by watching tcpdump), but I don't see the traffic getting
routed to the outside interface!

Either I am doing something wrong, or there is some kind of bug in Carp.
Can anyone shed some light on this?

One other interesting thing to add to the mystery is that if I wait exactly
15 minutes from when the master takes back over the VIP, the traffic starts
getting routed again.
Thanks,
Craig

From linimon at FreeBSD.org Sun Apr 12 22:30:20 2009
From: linimon at FreeBSD.org (linimon@FreeBSD.org)
Date: Sun Apr 12 23:17:17 2009
Subject: kern/133490: [bpf] [panic] 'kmem_map too small' panic on Dell r900 when bpf_bufsize and bpf_maxbufsize are increased
Message-ID: <200904130530.n3D5UJoC087045@freefall.freebsd.org>

Old Synopsis: 'kmem_map too small' panic on Dell r900 when bpf_bufsize and bpf_maxbufsize are increased
New Synopsis: [bpf] [panic] 'kmem_map too small' panic on Dell r900 when bpf_bufsize and bpf_maxbufsize are increased

Responsible-Changed-From-To: freebsd-i386->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Mon Apr 13 05:29:45 UTC 2009
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=133490

From linimon at FreeBSD.org Sun Apr 12 22:32:26 2009
From: linimon at FreeBSD.org (linimon@FreeBSD.org)
Date: Sun Apr 12 23:17:34 2009
Subject: kern/133328: [bge] [panic] Kernel panics with Windows7 client
Message-ID: <200904130532.n3D5WP5H098651@freefall.freebsd.org>

Old Synopsis: Kernel panics with Windows7 client
New Synopsis: [bge] [panic] Kernel panics with Windows7 client

Responsible-Changed-From-To: freebsd-i386->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Mon Apr 13 05:30:58 UTC 2009
Responsible-Changed-Why:
reclassify.

http://www.freebsd.org/cgi/query-pr.cgi?pr=133328

From linimon at FreeBSD.org Sun Apr 12 22:33:23 2009
From: linimon at FreeBSD.org (linimon@FreeBSD.org)
Date: Sun Apr 12 23:17:54 2009
Subject: kern/133204: [msk] msk driver timeouts
Message-ID: <200904130533.n3D5XNjs099438@freefall.freebsd.org>

Old Synopsis: 'msk' driver problem
New Synopsis: [msk] msk driver timeouts

Responsible-Changed-From-To: freebsd-i386->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Mon Apr 13 05:32:33 UTC 2009
Responsible-Changed-Why:
Reclassify.
http://www.freebsd.org/cgi/query-pr.cgi?pr=133204 From bugmaster at FreeBSD.org Mon Apr 13 04:06:58 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Apr 13 04:34:24 2009 Subject: Current problem reports assigned to freebsd-net@FreeBSD.org Message-ID: <200904131106.n3DB6vr3085020@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/133613 net [wpi] [panic] kernel panic in wpi(4) o kern/133572 net [ppp] [hang] incoming PPTP connection hangs the system o kern/133490 net [bpf] [panic] 'kmem_map too small' panic on Dell r900 o kern/133328 net [bge] [panic] Kernel panics with Windows7 client o kern/133235 net [netinet] [patch] Process SIOCDLIFADDR command incorre o kern/133218 net [carp] [hang] use of carp(4) causes system to freeze o kern/133204 net [msk] msk driver timeouts o kern/133060 net [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs o kern/132991 net [bge] if_bge low performance problem o kern/132984 net [netgraph] swi1: net 100% cpu usage f bin/132911 net ip6fw(8): argument type of fill_icmptypes is wrong and o kern/132889 net [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d o kern/132885 net [wlan] 802.1x broken after SVN rev 189592 o conf/132851 net [fib] [patch] allow to setup fib for service running f o bin/132798 net [patch] ggatec(8): ggated/ggatec connection slowdown p o kern/132734 net [ifmib] [panic] panic in net/if_mib.c o kern/132722 net [ath] Wifi ath0 associates fine with AP, but DHCP or I o kern/132715 net [lagg] [panic] Panic when creating vlan's on lagg inte o kern/132705 net [libwrap] [patch] libwrap - infinite loop if hosts.all o kern/132672 net [ndis] 
[panic] ndis with rt2860.sys causes kernel pani o kern/132669 net [xl] 3c905-TX send DUP! in reply on ping (sometime) o kern/132625 net [iwn] iwn drivers don't support setting country o kern/132554 net [ipl] There is no ippool start script/ipfilter magic t o kern/132354 net [nat] Getting some packages to ipnat(8) causes crash o kern/132285 net [carp] alias gives incorrect hash in dmesg o kern/132277 net [crypto] [ipsec] poor performance using cryptodevice f o conf/132179 net [patch] /etc/network.subr: ipv6 rtsol on incorrect wla o kern/132107 net [carp] carp(4) advskew setting ignored when carp IP us o kern/131781 net [ndis] ndis keeps dropping the link o kern/131776 net [wi] driver fails to init o kern/131753 net [altq] [panic] kernel panic in hfsc_dequeue o bin/131567 net [socket] [patch] Update for regression/sockets/unix_cm o kern/131549 net ifconfig(8) can't clear 'monitor' mode on the wireless o kern/131536 net [netinet] [patch] kernel does allow manipulation of su o bin/131365 net route(8): route add changes interpretation of network o kern/131310 net [netgraph] [panic] 7.1 panics with mpd netgraph interf o kern/131162 net [ath] Atheros driver bugginess and kernel crashes o kern/131153 net [iwi] iwi doesn't see a wireless network f kern/131087 net [ipw] [panic] ipw / iwi - no sent/received packets; iw f kern/130820 net [ndis] wpa_supplicant(8) returns 'no space on device' o kern/130628 net [nfs] NFS / rpc.lockd deadlock on 7.1-R o conf/130555 net [rc.d] [patch] No good way to set ipfilter variables a o kern/130525 net [ndis] [panic] 64 bit ar5008 ndisgen-erated driver cau o kern/130311 net [wlan_xauth] [panic] hostapd restart causing kernel pa o bin/130159 net [patch] ppp(8) fails to correctly set routes o kern/130109 net [ipfw] Can not set fib for packets originated from loc f kern/130059 net [panic] Leaking 50k mbufs/hour o kern/129750 net [ath] Atheros AR5006 exits on "cannot map register spa f kern/129719 net [nfs] [panic] Panic during shutdown, 
tcp_ctloutput: in o kern/129580 net [ndis] Netgear WG311v3 (ndis) causes kenel trap at boo o kern/129517 net [ipsec] [panic] double fault / stack overflow o kern/129508 net [carp] [panic] Kernel panic with EtherIP (may be relat o kern/129352 net [xl] [patch] xl0 watchdog timeout o kern/129219 net [ppp] Kernel panic when using kernel mode ppp o kern/129197 net [panic] 7.0 IP stack related panic o kern/129135 net [vge] vge driver on a VIA mini-ITX not working o bin/128954 net ifconfig(8) deletes valid routes o kern/128917 net [wpi] [panic] if_wpi and wpa+tkip causing kernel panic o kern/128884 net [msk] if_msk page fault while in kernel mode o kern/128840 net [igb] page fault under load with igb/LRO o bin/128602 net [an] wpa_supplicant(8) crashes with an(4) o kern/128598 net [bluetooth] WARNING: attempt to net_add_domain(bluetoo o kern/128448 net [nfs] 6.4-RC1 Boot Fails if NFS Hostname cannot be res o conf/128334 net [request] use wpa_cli in the "WPA DHCP" situation o bin/128295 net [patch] ifconfig(8) does not print TOE4 or TOE6 capabi o bin/128001 net wpa_supplicant(8), wlan(4), and wi(4) issues o kern/127928 net [tcp] [patch] TCP bandwidth gets squeezed every time t o kern/127834 net [ixgbe] [patch] wrong error counting o kern/127826 net [iwi] iwi0 driver has reduced performance and connecti o kern/127815 net [gif] [patch] if_gif does not set vlan attributes from o kern/127724 net [rtalloc] rtfree: 0xc5a8f870 has 1 refs f bin/127719 net [arp] arp: Segmentation fault (core dumped) s kern/127587 net [bge] [request] if_bge(4) doesn't support BCM576X fami f kern/127528 net [icmp]: icmp socket receives icmp replies not owned by o bin/127192 net routed(8) removes the secondary alias IP of interface f kern/127145 net [wi]: prism (wi) driver crash at bigger traffic o kern/127102 net [wpi] Intel 3945ABG low throughput o kern/127057 net [udp] Unable to send UDP packet via IPv6 socket to IPv o kern/127050 net [carp] ipv6 does not work on carp interfaces [regressi o 
kern/126945 net [carp] CARP interface destruction with ifconfig destro o kern/126924 net [an] [patch] printf -> device_printf and simplify prob o kern/126895 net [patch] [ral] Add antenna selection (marked as TBD) o kern/126874 net [vlan]: Zebra problem if ifconfig vlanX destroy o bin/126822 net wpa_supplicant(8): WPA PSK does not work in adhoc mode o kern/126714 net [carp] CARP interface renaming makes system no longer o kern/126695 net rtfree messages and network disruption upon use of if_ o kern/126688 net [ixgbe] [patch] 1.4.7 ixgbe driver panic with 4GB and o kern/126475 net [ath] [panic] ath pcmcia card inevitably panics under o kern/126339 net [ipw] ipw driver drops the connection o kern/126214 net [ath] txpower problem with Atheros wifi card o kern/126075 net [inet] [patch] internet control accesses beyond end of o bin/125922 net [patch] Deadlock in arp(8) o kern/125920 net [arp] Kernel Routing Table loses Ethernet Link status o kern/125845 net [netinet] [patch] tcp_lro_rx() should make use of hard o kern/125816 net [carp] [if_bridge] carp stuck in init when using bridg f kern/125502 net [ral] ifconfig ral0 scan produces no output unless in o kern/125258 net [socket] socket's SO_REUSEADDR option does not work o kern/125239 net [gre] kernel crash when using gre f kern/125195 net [fxp] fxp(4) driver failed to initialize device Intel o kern/124904 net [fxp] EEPROM corruption with Compaq NC3163 NIC o kern/124767 net [iwi] Wireless connection using iwi0 driver (Intel 220 o kern/124753 net [ieee80211] net80211 discards power-save queue packets o kern/124341 net [ral] promiscuous mode for wireless device ral0 looses o kern/124160 net [libc] connect(2) function loops indefinitely o kern/124127 net [msk] watchdog timeout (missed Tx interrupts) -- recov o kern/124021 net [ip6] [panic] page fault in nd6_output() o kern/123968 net [rum] [panic] rum driver causes kernel panic with WPA. 
p kern/123961 net [vr] [patch] Allow vr interface to handle vlans o kern/123892 net [tap] [patch] No buffer space available o kern/123890 net [ppp] [panic] crash & reboot on work with PPP low-spee o kern/123858 net [stf] [patch] stf not usable behind a NAT o kern/123796 net [ipf] FreeBSD 6.1+VPN+ipnat+ipf: port mapping does not o bin/123633 net ifconfig(8) doesn't set inet and ether address in one f kern/123617 net [tcp] breaking connection when client downloading file o kern/123603 net [tcp] tcp_do_segment and Received duplicate SYN o kern/123559 net [iwi] iwi periodically disassociates/associates [regre o bin/123465 net [ip6] route(8): route add -inet6 -interfac o kern/123463 net [ipsec] [panic] repeatable crash related to ipsec-tool o kern/123429 net [nfe] [hang] "ifconfig nfe up" causes a hard system lo o kern/123347 net [bge] bge1: watchdog timeout -- linkstate changed to D o conf/123330 net [nsswitch.conf] Enabling samba wins in nsswitch.conf c o kern/123256 net [wpi] panic: blockable sleep lock with wpi(4) f kern/123172 net [bce] Watchdog timeout problems with if_bce o kern/123160 net [ip] Panic and reboot at sysctl kern.polling.enable=0 o kern/122989 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/122954 net [lagg] IPv6 EUI64 incorrectly chosen for lagg devices o kern/122928 net [em] interface watchdog timeouts and stops receiving p f kern/122839 net [multicast] FreeBSD 7 multicast routing problem p kern/122794 net [lagg] Kernel panic after brings lagg(8) up if NICs ar o kern/122780 net [lagg] tcpdump on lagg interface during high pps wedge o kern/122772 net [em] em0 taskq panic, tcp reassembly bug causes radix o kern/122743 net [mbuf] [panic] vm_page_unwire: invalid wire count: 0 o kern/122697 net [ath] Atheros card is not well supported o kern/122685 net It is not visible passing packets in tcpdump(1) o kern/122551 net [bge] Broadcom 5715S no carrier on HP BL460c blade usi o kern/122319 net [wi] imposible to enable ad-hoc demo mode with Orinoco o 
kern/122290 net [netgraph] [panic] Netgraph related "kmem_map too smal f kern/122252 net [ipmi] [bge] IPMI problem with BCM5704 (does not work o kern/122195 net [ed] Alignment problems in if_ed o kern/122058 net [em] [panic] Panic on em1: taskq o kern/122033 net [ral] [lor] Lock order reversal in ral0 at bootup [reg o kern/121983 net [fxp] fxp0 MBUF and PAE o bin/121895 net [patch] rtsol(8)/rtsold(8) doesn't handle managed netw o kern/121872 net [wpi] driver fails to attach on a fujitsu-siemens s711 s kern/121774 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/121706 net [netinet] [patch] "rtfree: 0xc4383870 has 1 refs" emit o kern/121624 net [em] [regression] Intel em WOL fails after upgrade to o kern/121555 net [panic] Fatal trap 12: current process = 12 (swi1: net o kern/121443 net [gif] [lor] icmp6_input/nd6_lookup o kern/121437 net [vlan] Routing to layer-2 address does not work on VLA o bin/121359 net [patch] ppp(8): fix local stack overflow in ppp o kern/121298 net [em] [panic] Fatal trap 12: page fault while in kernel o kern/121257 net [tcp] TSO + natd -> slow outgoing tcp traffic o kern/121181 net [panic] Fatal trap 3: breakpoint instruction fault whi o kern/121080 net [bge] IPv6 NUD problem on multi address config on bge0 o kern/120966 net [rum] kernel panic with if_rum and WPA encryption p docs/120945 net [patch] ip6(4) man page lacks documentation for TCLASS o kern/120566 net [request]: ifconfig(8) make order of arguments more fr o kern/120304 net [netgraph] [patch] netgraph source assumes 32-bit time o kern/120266 net [udp] [panic] gnugk causes kernel panic when closing U o kern/120232 net [nfe] [patch] Bring in nfe(4) to RELENG_6 o kern/120130 net [carp] [panic] carp causes kernel panics in any conste o bin/120060 net routed(8) deletes link-level routes in the presence of o kern/119945 net [rum] [panic] rum device in hostap mode, cause kernel o kern/119791 net [nfs] UDP NFS mount of aliased IP addresses from a Sol o kern/119617 net [nfs] nfs 
error on wpa network when reseting/shutdown f kern/119516 net [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi o kern/119432 net [arp] route add -host -iface causes arp e o kern/119225 net [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr a bin/118987 net ifconfig(8): ifconfig -l (address_family) does not wor o sparc/118932 net [panic] 7.0-BETA4/sparc-64 kernel panic in rip_output a kern/118879 net [bge] [patch] bge has checksum problems on the 5703 ch o kern/118727 net [netgraph] [patch] [request] add new ng_pf module s kern/117717 net [panic] Kernel panic with Bittorrent client. o kern/117448 net [carp] 6.2 kernel crash [regression] o kern/117423 net [vlan] Duplicate IP on different interfaces o bin/117339 net [patch] route(8): loading routing management commands o kern/117271 net [tap] OpenVPN TAP uses 99% CPU on releng_6 when if_tap o kern/117043 net [em] Intel PWLA8492MT Dual-Port Network adapter EEPROM o kern/116837 net [tun] [panic] [patch] ifconfig tunX destroy: panic o kern/116747 net [ndis] FreeBSD 7.0-CURRENT crash with Dell TrueMobile o bin/116643 net [patch] [request] fstat(1): add INET/INET6 socket deta o kern/116328 net [bge]: Solid hang with bge interface o kern/116185 net [iwi] if_iwi driver leads system to reboot o kern/115239 net [ipnat] panic with 'kmem_map too small' using ipnat o kern/115019 net [netgraph] ng_ether upper hook packet flow stops on ad o kern/115002 net [wi] if_wi timeout. failed allocation (busy bit). 
ifco o kern/114915 net [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f f kern/114899 net [bge] bge0: watchdog timeout -- resetting o kern/114839 net [fxp] fxp looses ability to speak with traffic o kern/113895 net [xl] xl0 fails on 6.2-RELEASE but worked fine on 5.5-R o kern/112722 net [ipsec] [udp] IP v4 udp fragmented packet reject o kern/112686 net [patm] patm driver freezes System (FreeBSD 6.2-p4) i38 o kern/112570 net [bge] packet loss with bge driver on BCM5704 chipset o bin/112557 net [patch] ppp(8) lock file should not use symlink name o kern/112528 net [nfs] NFS over TCP under load hangs with "impossible p o kern/111457 net [ral] ral(4) freeze o kern/110140 net [ipw] ipw fails under load o kern/109733 net [bge] bge link state issues [regression] o kern/109470 net [wi] Orinoco Classic Gold PC Card Can't Channel Hop o kern/109308 net [pppd] [panic] Multiple panics kernel ppp suspected [r o kern/109251 net [re] [patch] if_re cardbus card won't attach o bin/108895 net pppd(8): PPPoE dead connections on 6.2 [regression] o kern/108542 net [bce] Huge network latencies with 6.2-RELEASE / STABLE o kern/107944 net [wi] [patch] Forget to unlock mutex-locks o kern/107850 net [bce] bce driver link negotiation is faulty o conf/107035 net [patch] bridge(8): bridge interface given in rc.conf n o kern/106438 net [ipf] ipfilter: keep state does not seem to allow repl o kern/106316 net [dummynet] dummynet with multipass ipfw drops packets o kern/106243 net [nve] double fault panic in if_nve.c on high loads o kern/105945 net Address can disappear from network interface s kern/105943 net Network stack may modify read-only mbuf chain copies o bin/105925 net problems with ifconfig(8) and vlan(4) [regression] o kern/105348 net [ath] ath device stopps TX o kern/104851 net [inet6] [patch] On link routes not configured when usi o kern/104751 net [netgraph] kernel panic, when getting info about my tr o kern/104485 net [bge] Broadcom BCM5704C: Intermittent on newer chip ve o 
kern/103191 net Unpredictable reboot o kern/103135 net [ipsec] ipsec with ipfw divert (not NAT) encodes a pac o conf/102502 net [netgraph] [patch] ifconfig name does't rename netgrap o kern/102035 net [plip] plip networking disables parallel port printing o kern/101948 net [ipf] [panic] Kernel Panic Trap No 12 Page Fault - cau o kern/100709 net [libc] getaddrinfo(3) should return TTL info o kern/100519 net [netisr] suggestion to fix suboptimal network polling o kern/98978 net [ipf] [patch] ipfilter drops OOW packets under 6.1-Rel o kern/98597 net [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu o bin/98218 net wpa_supplicant(8) blacklist not working f bin/97392 net ppp(8) hangs instead terminating o kern/97306 net [netgraph] NG_L2TP locks after connection with failed f kern/96268 net [socket] TCP socket performance drops by 3000% if pack o kern/96030 net [bfe] [patch] Install hangs with Broadcomm 440x NIC in o kern/95519 net [ral] ral0 could not map mbuf o kern/95288 net [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr o kern/95277 net [netinet] [patch] IP Encapsulation mask_match() return o kern/95267 net packet drops periodically appear s kern/94863 net [bge] [patch] hack to get bge(4) working on IBM e326m o kern/94162 net [bge] 6.x kenel stale with bge(4) o kern/93886 net [ath] Atheros/D-Link DWL-G650 long delay to associate f kern/93378 net [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo o kern/93019 net [ppp] ppp and tunX problems: no traffic after restarti o kern/92880 net [libc] [patch] almost rewritten inet_network(3) functi f kern/92552 net A serious bug in most network drivers from 5.X to 6.X s kern/92279 net [dc] Core faults everytime I reboot, possible NIC issu o kern/92090 net [bge] bge0: watchdog timeout -- resetting o kern/91859 net [ndis] if_ndis does not work with Asus WL-138 s kern/91777 net [ipf] [patch] wrong behaviour with skip rule inside an o kern/91594 net [em] FreeBSD > 5.4 w/ACPI fails to detect Intel Pro/10 o kern/91364 
net [ral] [wep] WF-511 RT2500 Card PCI and WEP o kern/91311 net [aue] aue interface hanging o kern/90890 net [vr] Problems with network: vr0: tx shutdown timeout s kern/90086 net [hang] 5.4p8 on supermicro P8SCT hangs during boot if f kern/88082 net [ath] [panic] cts protection for ath0 causes panic o kern/87521 net [ipf] [panic] using ipfilter "auth" keyword leads to k o kern/87506 net [vr] [patch] Fix alias support on vr interfaces o kern/87194 net [fxp] fxp(4) promiscuous mode seems to corrupt hw-csum s kern/86920 net [ndis] ifconfig: SIOCS80211: Invalid argument [regress o kern/86103 net [ipf] Illegal NAT Traversal in IPFilter o kern/85780 net 'panic: bogus refcnt 0' in routing/ipv6 o bin/85445 net ifconfig(8): deprecated keyword to ifconfig inoperativ o kern/85266 net [xe] [patch] xe(4) driver does not recognise Xircom XE o kern/84202 net [ed] [patch] Holtek HT80232 PCI NIC recognition on Fre o bin/82975 net route change does not parse classfull network as given o kern/82497 net [vge] vge(4) on AMD64 only works when loaded late, not f kern/81644 net [vge] vge(4) does not work properly when loaded as a K s kern/81147 net [net] [patch] em0 reinitialization while adding aliase o kern/80853 net [ed] [patch] add support for Compex RL2000/ISA in PnP o kern/79895 net [ipf] 5.4-RC2 breaks ipfilter NAT when using netgraph f kern/79262 net [dc] Adaptec ANA-6922 not fully supported o bin/79228 net [patch] extend arp(8) to be able to create blackhole r o kern/78090 net [ipf] ipf filtering on bridged packets doesn't work if p kern/77913 net [wi] [patch] Add the APDL-325 WLAN pccard to wi(4) o kern/77341 net [ip6] problems with IPV6 implementation o kern/77273 net [ipf] ipfilter breaks ipv6 statefull filtering on 5.3 s kern/77195 net [ipf] [patch] ipfilter ioctl SIOCGNATL does not match o kern/75873 net Usability problem with non-RFC-compliant IP spoof prot s kern/75407 net [an] an(4): no carrier after short time f kern/73538 net [bge] problem with the Broadcom BCM5788 
Gigabit Ethern o kern/71469 net default route to internet magically disappears with mu o kern/70904 net [ipf] ipfilter ipnat problem with h323 proxy support o kern/64556 net [sis] if_sis short cable fix problems with NetGear FA3 s kern/60293 net [patch] FreeBSD arp poison patch o kern/54383 net [nfs] [patch] NFS root configurations without dynamic f i386/45773 net [bge] Softboot causes autoconf failure on Broadcom 570 s bin/41647 net ifconfig(8) doesn't accept lladdr along with inet addr s kern/39937 net ipstealth issue a kern/38554 net [patch] changing interface ipaddress doesn't seem to w o kern/35442 net [sis] [patch] Problem transmitting runts in if_sis dri o kern/34665 net [ipf] [hang] ipfilter rcmd proxy "hangs". o kern/31647 net [libc] socket calls can return undocumented EINVAL o kern/30186 net [libc] getaddrinfo(3) does not handle incorrect servna o kern/27474 net [ipf] [ppp] Interactive use of user PPP and ipfilter c o conf/23063 net [arp] [patch] for static ARP tables in rc.network 292 problems total. From loos.br at gmail.com Mon Apr 13 05:30:06 2009 From: loos.br at gmail.com (Luiz Otavio O Souza) Date: Mon Apr 13 05:56:53 2009 Subject: bin/130159: [patch] ppp(8) fails to correctly set routes Message-ID: <200904131230.n3DCU4RT098751@freefall.freebsd.org> The following reply was made to PR bin/130159; it has been noted by GNATS. From: "Luiz Otavio O Souza" To: "Qing Li" , "Brett Glass" , Cc: Subject: Re: bin/130159: [patch] ppp(8) fails to correctly set routes Date: Mon, 13 Apr 2009 09:01:21 -0300 > Note: With the patch as written, the "gateway" (G) flag is set in the > routing table entry. This does not seem to cause problems, but the flag > should not be set because the "tun" interface is acting as a bridge, not a > gateway. Brett, This patch doesn't fix or change the gateway flag, it only set the interface in route update message. 
The gateway problem was fixed in r186308 by Qing Li
(http://svn.freebsd.org/viewvc/base/head/usr.sbin/ppp/route.c?sortdir=down&r1=186119&r2=186308&sortby=rev
- check the commit log)

Thanks,
Luiz

From pcc at gmx.net Mon Apr 13 07:20:47 2009
From: pcc at gmx.net (Peter Cornelius)
Date: Mon Apr 13 08:19:13 2009
Subject: Multiple default routes / Force external routing
Message-ID: <20090413135402.78610@gmx.net>

Dear list,

I've poked about for weeks and asked similar questions in -questions and
elsewhere without avail. Probably using the wrong keys to search and ask:

I have set up a box with various vlan interfaces on it. I naively expected
to be able to set individual "default" routes and route between them via an
*external* router (and filter packets there etc.) but somehow all packets
seem to "short-circuit" locally, and I don't seem to be able to see why this
is so and how I prevent that.

I also fiddled with FIBs (setfib(1)) but I think I need to correct my naive
interpretation of FIBs :). Anyways, it did not help my interpretation of the
above, at least not at first sight (but may wrt the default route if I get
the short-circuit out of the way).

Any help, pointers etc. appreciated (if need be off-list ok)...

Thanks a lot,

Peter.

From brett at lariat.net Mon Apr 13 07:30:04 2009
From: brett at lariat.net (Brett Glass)
Date: Mon Apr 13 08:23:55 2009
Subject: bin/130159: [patch] ppp(8) fails to correctly set routes
Message-ID: <200904131430.n3DEU3gw060098@freefall.freebsd.org>

The following reply was made to PR bin/130159; it has been noted by GNATS.

From: Brett Glass
To: "Luiz Otavio O Souza", "Qing Li"
Cc:
Subject: Re: bin/130159: [patch] ppp(8) fails to correctly set routes
Date: Mon, 13 Apr 2009 08:20:40 -0600

At 06:01 AM 4/13/2009, Luiz Otavio O Souza wrote:

>>Note: With the patch as written, the "gateway" (G) flag is set in
>>the routing table entry. This does not seem to cause problems,
>>but the flag should not be set because the "tun" interface is
>>acting as a bridge, not a gateway.
>
>Brett,
>
>This patch doesn't fix or change the gateway flag, it only set the
>interface in route update message.
>
>The gateway problem was fixed in r186308 by Qing Li
>(http://svn.freebsd.org/viewvc/base/head/usr.sbin/ppp/route.c?sortdir=down&r1=186119&r2=186308&sortby=rev
>- check the commit log)
>
>Thanks,
>Luiz

Luiz,

Qing Li's patch must not have made it into 7.1-RELEASE, because I had to
apply it manually. All three patches (your two plus Qing Li's) should be
committed and MFCed before 7.2-RELEASE, because we (and others, I'm sure)
really need PPP to work properly.

--Brett Glass

From brett at lariat.net Mon Apr 13 07:30:12 2009
From: brett at lariat.net (Brett Glass)
Date: Mon Apr 13 08:24:06 2009
Subject: bin/130159: [patch] ppp(8) fails to correctly set routes
Message-ID: <200904131430.n3DEUBfK060947@freefall.freebsd.org>

The following reply was made to PR bin/130159; it has been noted by GNATS.

From: Brett Glass
To: "Luiz Otavio O Souza", "Qing Li"
Cc:
Subject: Re: bin/130159: [patch] ppp(8) fails to correctly set routes
Date: Mon, 13 Apr 2009 08:27:08 -0600

P.S. -- I am still seeing the gateway flag on PPP interfaces after
installing Qing Li's patch.
Here is the output of "netstat -ran" (note the bottom entries):

Internet:
Destination        Gateway            Flags  Refs   Use  Netif Expire
default            66.119.58.1        UGS       0   488  xl0
66.119.58.0/24     link#1             UC        0     0  xl0
66.119.58.1        00:02:b3:66:03:63  UHLW      2     0  xl0   1198
66.119.58.13       88:17:20:22:38:11  UHLW      1    97  xl0   1102
66.119.58.254      00:02:b3:66:03:63  UHLW      1    63  xl0    921
127.0.0.1          127.0.0.1          UH        0    34  lo0
172.17.0.0/16      link#2             UC        0     0  dc0
172.17.0.2/32      link#2             UC        0     0  dc0
172.17.0.3/32      link#2             UC        0     0  dc0
172.17.0.4/32      link#2             UC        0     0  dc0
172.17.2.53        00:60:b3:5e:20:bb  UHLW      1   131  dc0    994
172.17.250.21      00:19:3b:80:36:68  UHLW      1     2  dc0   1093
172.17.250.22      00:19:3b:80:37:c6  UHLW      1  2035  dc0   1163
172.17.250.23      00:19:3b:80:37:c2  UHLW      1     2  dc0   1128
172.18.0.1         172.18.0.1         UH        2     0  lo0
172.18.5.1         172.18.0.1         UGH       0   128  tun0
172.18.217.33      172.18.0.1         UGH       0   596  tun2
172.18.217.62      172.18.0.1         UGH       0    18  tun1

The last two entries are PPTP sessions. They should say "UH", not "UGH".

--Brett

From rpaulo at gmail.com Mon Apr 13 09:40:03 2009
From: rpaulo at gmail.com (Rui Paulo)
Date: Mon Apr 13 09:59:00 2009
Subject: kern/133204: [msk] msk driver timeouts
Message-ID: <200904131640.n3DGe2VG039666@freefall.freebsd.org>

The following reply was made to PR kern/133204; it has been noted by GNATS.

From: Rui Paulo
To: bug-followup@FreeBSD.org, robert@heron.pl
Cc:
Subject: Re: kern/133204: [msk] msk driver timeouts
Date: Mon, 13 Apr 2009 17:31:16 +0100

Just FYI, you can also try to disable MSI in the msk interface. This should
make the timeouts disappear and you still keep hw csums.
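
A minimal sketch of how MSI could be turned off for msk(4), assuming the
hw.msk.msi_disable loader tunable described in the msk(4) manual page is
available on the affected release. The file location is the usual
/boot/loader.conf, and a reboot is needed before the setting takes effect:

```
# /boot/loader.conf
# Assumption: the hw.msk.msi_disable tunable from msk(4) exists on this release.
hw.msk.msi_disable="1"
```

After rebooting, "vmstat -i" should list msk0 on a legacy interrupt line
rather than on an MSI vector if the tunable was picked up.
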
--
Rui Paulo

From bz at FreeBSD.org Mon Apr 13 10:31:07 2009
From: bz at FreeBSD.org (bz@FreeBSD.org)
Date: Mon Apr 13 11:12:16 2009
Subject: bin/130159: [patch] ppp(8) fails to correctly set routes
Message-ID: <200904131731.n3DHV5HX011418@freefall.freebsd.org>

Synopsis: [patch] ppp(8) fails to correctly set routes

Responsible-Changed-From-To: freebsd-net->bz
Responsible-Changed-By: bz
Responsible-Changed-When: Mon Apr 13 17:30:40 UTC 2009
Responsible-Changed-Why:
I promised re@ to look but I cannot promise that it'll make 7.2-R.

http://www.freebsd.org/cgi/query-pr.cgi?pr=130159

From sthaug at nethelp.no Mon Apr 13 13:09:35 2009
From: sthaug at nethelp.no (sthaug@nethelp.no)
Date: Mon Apr 13 13:44:15 2009
Subject: Multiple default routes / Force external routing
In-Reply-To: <20090413135402.78610@gmx.net>
References: <20090413135402.78610@gmx.net>
Message-ID: <20090413.220932.74699777.sthaug@nethelp.no>

> I've poked about for weeks and asked similar questions in -questions and elsewhere without avail. Probably using the wrong keys to search and ask:
>
> I have set up a box with various vlan interfaces on it. I naively expected to be able to set individual "default" routes and route between them via an *external* router (and filter packets there etc.) but somehow all packets seem to "short-circuit" locally, and I don't seem to be able to see why this is so and how I prevent that.

I found this behavior also, and it breaks POLA pretty badly.
There are several problems with the multiple routing table support (via
setfib) that I see:

- I found I needed "options ROUTETABLES= ..." to have additional routing
tables. I could not find this option documented anywhere.

- The standard behavior when adding new routes (via ifconfig or route
command) is that the route is added to all routing tables. Coming from a
router/MPLS/L3VPN background, this is extremely counterintuitive. I found
I needed to set the sysctl net.add_addr_allfibs to 0 to avoid this
behavior.

- Having two routing tables (one default, one table number 1 via setfib)
I also expected to be able to route between these via external router.
Pinging from the default routing table to routing table 1, traffic from
the default routing table goes out to external router and in via other
interface (in routing table 1) - but the ping reply is returned via the
loopback interface on the FreeBSD host, without going out to the router.
I assume this is the "short-circuit" you're talking about, and I find
this behavior also very counterintuitive.

If I explicitly ping from routing table 1 with ping prefixed by setfib 1,
everything works as expected (traffic both ways go via external router).

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

From dfilter at FreeBSD.ORG Mon Apr 13 15:50:05 2009
From: dfilter at FreeBSD.ORG (dfilter service)
Date: Mon Apr 13 16:28:48 2009
Subject: kern/131310: commit references a PR
Message-ID: <200904132250.n3DMo4GP037584@freefall.freebsd.org>

The following reply was made to PR kern/131310; it has been noted by GNATS.

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:
Subject: Re: kern/131310: commit references a PR
Date: Mon, 13 Apr 2009 22:47:12 +0000 (UTC)

Author: mlaier
Date: Mon Apr 13 22:17:03 2009
New Revision: 191025
URL: http://svn.freebsd.org/changeset/base/191025

Log:
  MFH r190903 & r190895:
   Remove interfaces from interface groups on detach.
Reported by: various Submitted by: Mikolaj Golub (r190895) PR: kern/130977, kern/131310 Approved by: re (gnn) Modified: stable/7/sys/ (props changed) stable/7/sys/contrib/pf/ (props changed) stable/7/sys/dev/ath/ath_hal/ (props changed) stable/7/sys/dev/cxgb/ (props changed) stable/7/sys/net/if.c Modified: stable/7/sys/net/if.c ============================================================================== --- stable/7/sys/net/if.c Mon Apr 13 21:04:53 2009 (r191024) +++ stable/7/sys/net/if.c Mon Apr 13 22:17:03 2009 (r191025) @@ -128,6 +128,7 @@ static void if_start_deferred(void *cont static void do_link_state_change(void *, int); static int if_getgroup(struct ifgroupreq *, struct ifnet *); static int if_getgroupmembers(struct ifgroupreq *); +static void if_delgroups(struct ifnet *); #ifdef INET6 /* * XXX: declare here to avoid to include many inet6 related files.. @@ -828,6 +829,7 @@ if_detach(struct ifnet *ifp) rt_ifannouncemsg(ifp, IFAN_DEPARTURE); EVENTHANDLER_INVOKE(ifnet_departure_event, ifp); devctl_notify("IFNET", ifp->if_xname, "DETACH", NULL); + if_delgroups(ifp); IF_AFDATA_LOCK(ifp); for (dp = domains; dp; dp = dp->dom_next) { @@ -963,6 +965,53 @@ if_delgroup(struct ifnet *ifp, const cha } /* + * Remove an interface from all groups + */ +static void +if_delgroups(struct ifnet *ifp) +{ + struct ifg_list *ifgl; + struct ifg_member *ifgm; + char groupname[IFNAMSIZ]; + + IFNET_WLOCK(); + while (!TAILQ_EMPTY(&ifp->if_groups)) { + ifgl = TAILQ_FIRST(&ifp->if_groups); + + strlcpy(groupname, ifgl->ifgl_group->ifg_group, IFNAMSIZ); + + IF_ADDR_LOCK(ifp); + TAILQ_REMOVE(&ifp->if_groups, ifgl, ifgl_next); + IF_ADDR_UNLOCK(ifp); + + TAILQ_FOREACH(ifgm, &ifgl->ifgl_group->ifg_members, ifgm_next) + if (ifgm->ifgm_ifp == ifp) + break; + + if (ifgm != NULL) { + TAILQ_REMOVE(&ifgl->ifgl_group->ifg_members, ifgm, + ifgm_next); + free(ifgm, M_TEMP); + } + + if (--ifgl->ifgl_group->ifg_refcnt == 0) { + TAILQ_REMOVE(&ifg_head, ifgl->ifgl_group, ifg_next); + 
EVENTHANDLER_INVOKE(group_detach_event, + ifgl->ifgl_group); + free(ifgl->ifgl_group, M_TEMP); + } + IFNET_WUNLOCK(); + + free(ifgl, M_TEMP); + + EVENTHANDLER_INVOKE(group_change_event, groupname); + + IFNET_WLOCK(); + } + IFNET_WUNLOCK(); +} + +/* * Stores all groups from an interface in memory pointed * to by data */ _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From mlaier at FreeBSD.org Mon Apr 13 16:35:56 2009 From: mlaier at FreeBSD.org (mlaier@FreeBSD.org) Date: Mon Apr 13 17:00:18 2009 Subject: kern/131310: [netgraph] [panic] 7.1 panics with mpd netgraph interface changes Message-ID: <200904132335.n3DNZtMQ004984@freefall.freebsd.org> Synopsis: [netgraph] [panic] 7.1 panics with mpd netgraph interface changes State-Changed-From-To: open->closed State-Changed-By: mlaier State-Changed-When: Mon Apr 13 23:35:02 UTC 2009 State-Changed-Why: Fix commited to head and stable/7. Thanks. http://www.freebsd.org/cgi/query-pr.cgi?pr=131310 From rainofchaos at gmail.com Mon Apr 13 21:20:08 2009 From: rainofchaos at gmail.com (Leon Feng) Date: Mon Apr 13 21:36:07 2009 Subject: [vge] VIA VT6130 only auto negotiating to 1000baseT after down/up Message-ID: Hi, I am running CURRENT r190987 on VIA EPIA board with VT6130 chip. I found two problems: 1. After normal boot, vge only auto negotiate to 100baseTX. Then # ifconfig vge1 down # ifconfig vge1 up And vge1 will auto negotiate to 1000baseT. 2. After reboot the system, there is an error message: " savecore: reboot after panic: mutex vge0 not owned at /usr/src/sys/modules/vge/../../dev/vge/if_vge.c:2395" Do not know whether these two are related. Any one has an idea? dmesg attached. thanks, Leon Feng -------------- next part -------------- A non-text attachment was scrubbed... 
Name: dmesg
Type: application/octet-stream
Size: 8111 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090414/6fc65f86/dmesg.obj

From pyunyh at gmail.com Mon Apr 13 22:03:28 2009
From: pyunyh at gmail.com (Pyun YongHyeon)
Date: Mon Apr 13 22:26:34 2009
Subject: [vge] VIA VT6130 only auto negotiating to 1000baseT after down/up
In-Reply-To:
References:
Message-ID: <20090414050538.GD65724@michelle.cdnetworks.co.kr>

On Tue, Apr 14, 2009 at 11:55:55AM +0800, Leon Feng wrote:
> Hi,
>
> I am running CURRENT r190987 on VIA EPIA board with VT6130 chip.
> I found two problems:
>
> 1. After normal boot, vge only auto negotiate to 100baseTX. Then
> # ifconfig vge1 down
> # ifconfig vge1 up
> And vge1 will auto negotiate to 1000baseT.
>

It's normal to see negotiated speed/duplex only after configuring
network interface (at least you have to up the interface to get
valid link). You have configured vge1 in rc.conf, right?

> 2. After reboot the system, there is an error message:
> " savecore: reboot after panic: mutex vge0 not owned at
> /usr/src/sys/modules/vge/../../dev/vge/if_vge.c:2395"
>

Hmm, there is no such line number 2395 in if_vge.c. Please make
sure you've updated to latest CURRENT. And if you can see the panic
again show me full back-trace info.

> Do not know whether these two are related. Any one has an idea?
>

I think the former has nothing to do with the latter.

From julian at elischer.org Mon Apr 13 22:13:14 2009
From: julian at elischer.org (Julian Elischer)
Date: Mon Apr 13 22:28:33 2009
Subject: Multiple default routes / Force external routing
In-Reply-To: <20090413.220932.74699777.sthaug@nethelp.no>
References: <20090413135402.78610@gmx.net>
	<20090413.220932.74699777.sthaug@nethelp.no>
Message-ID: <49E41755.8050701@elischer.org>

sthaug@nethelp.no wrote:
>> I've poked about for weeks and asked similar questions in
>> -questions and elsewhere without avail. Probably using the wrong keys
>> to search and ask:
>>
>> I have set up a box with various vlan interfaces on it. I naively
>> expected to be able to set individual "default" routes and route
>> between them via an *external* router (and filter packets there etc.)
>> but somehow all packets seem to "short-circuit" locally, and I don't
>> seem to be able to see why this is so and how I prevent that.

I think you are rather confused about what Multiple FIBs is..
All it is is the ability to make a packet use a particular
FIB on its outgoing path. There is no such thing as an interface
being "In" a FIB. All interfaces are still visible to the routing code
by default, and the IP stack still knows about them. I think the IP
stack sets the 'loopback' flag on a packet regardless of the FIB
selected if the dest is one of its own addresses.

What you want is VIMAGE.

>
> I found this behavior also, and it breaks POLA pretty badly.
>
> There are several problems with the multiple routing table support (via
> setfib) that I see:
>
> - I found I needed "options ROUTETABLES= ..." to have additional routing
> tables. I could not find this option documented anywhere.

in LINT where all such are documented.

>
> - The standard behavior when adding new routes (via ifconfig or route
> command) is that the route is added to all routing tables. Coming from
> a router/MPLS/L3VPN background, this is extremely counterintuitive. I
> found I needed to set the sysctl net.add_addr_allfibs to 0 to avoid
> this behavior.

the route is only added to all routing tables for NEIGHBOUR routes.
there is a sysctl to turn this off.
By default all interfaces are available no matter what FIB you are using

>
> - Having two routing tables (one default, one table number 1 via setfib)
> I also expected to be able to route between these via external router.

what do you mean by that. Routing tables are not a destination.
how can you 'ping' it? you can't route between tables. what does that mean?
> Pinging from the default routing table to routing table 1,

what are you talking about? It's a routing table not another machine
how can you ping it?

> traffic from
> the default routing table goes out to external router and in via other
> interface
> (in routing table 1)

??? routing tables are for OUTGOING packets.
incoming packets don't use routing tables.
If you want to assign a FIB to an incoming packet for the purpose
of controlling further routing, then there is a patch that will be
applied to assign a FIB as the "default FIB for packets received on
an interface", but until that is applied use ipfw or pf to apply it.

> - but the ping reply is returned via the
> loopback interface on the FreeBSD host, without going out to the router.
> I assume this is the "short-circuit" you're talking about, and I find
> this behavior also very counterintuitive.

I don't see what is so counterintuitive about it.. you sent the packet
to your own machine.. all such packets are short circuited by the
IP stack.

>
> If I explicitly ping from routing table 1 with ping prefixed by setfib 1,
> everything works as expected (traffic both ways go via external router).

anyhow I hope to be able to address some of the issues you have raised.
At least, to add more functionality.

>
> Steinar Haug, Nethelp consulting, sthaug@nethelp.no
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From rainofchaos at gmail.com Mon Apr 13 23:18:25 2009
From: rainofchaos at gmail.com (Leon Feng)
Date: Mon Apr 13 23:24:46 2009
Subject: [vge] VIA VT6130 only auto negotiating to 1000baseT after down/up
In-Reply-To: <20090414050538.GD65724@michelle.cdnetworks.co.kr>
References: <20090414050538.GD65724@michelle.cdnetworks.co.kr>
Message-ID:

2009/4/14 Pyun YongHyeon :
> On Tue, Apr 14, 2009 at 11:55:55AM +0800, Leon Feng wrote:
>> Hi,
>>
>> I am running CURRENT r190987 on VIA EPIA board with VT6130 chip.
>> I found two problems:
>>
>> 1. After normal boot, vge only auto negotiate to 100baseTX. Then
>> # ifconfig vge1 down
>> # ifconfig vge1 up
>> And vge1 will auto negotiate to 1000baseT.
>>
>
> It's normal to see negotiated speed/duplex only after configuring
> network interface (at least you have to up the interface to get
> valid link). You have configured vge1 in rc.conf, right?
>

My fault, after adding ifconfig_vge1="up", vge1 works great. I see
"Link auto-negotiation speed 100M bps full duplex" on the machine
connected to it at reboot. I thought it was auto negotiated by
FreeBSD. In fact it comes from BIOS.

>> 2. After reboot the system, there is an error message:
>> " savecore: reboot after panic: mutex vge0 not owned at
>> /usr/src/sys/modules/vge/../../dev/vge/if_vge.c:2395"
>>
>
> Hmm, there is no such line number 2395 in if_vge.c. Please make
> sure you've updated to latest CURRENT. And if you can see the panic
> again show me full back-trace info.
>

After savecore -c, it is gone.

>> Do not know whether these two are related. Any one has an idea?
>>
>
> I think the former has nothing to do with the latter.
>

Both solved. Great thanks.

From steve at ibctech.ca Tue Apr 14 05:55:02 2009
From: steve at ibctech.ca (Steve Bertrand)
Date: Tue Apr 14 06:09:51 2009
Subject: Multiple default routes / Force external routing
In-Reply-To: <49E41755.8050701@elischer.org>
References: <20090413135402.78610@gmx.net>
	<20090413.220932.74699777.sthaug@nethelp.no>
	<49E41755.8050701@elischer.org>
Message-ID: <49E48799.1000300@ibctech.ca>

Julian Elischer wrote:
> sthaug@nethelp.no wrote:
>>> I've poked about for weeks and asked similar questions in
>>> -questions and elsewhere without avail. Probably using the wrong keys
>>> to search and ask:
>>>
>>> I have set up a box with various vlan interfaces on it. I naively
>>> expected to be able to set individual "default" routes and route
>>> between them via an *external* router (and filter packets there etc.)
>>> but somehow all packets seem to "short-circuit" locally, and I don't
>>> seem to be able to see why this is so and how I prevent that.
>
> I think you are rather confused about what Multiple FIBs is..
> All it is is the ability to make a packet use a particular
> FIB on its outgoing path. There is no such thing as an interface
> being "In" a FIB. All interfaces are still visible to the routing code
> by default, and the IP stack still knows about them. I think the IP
> stack sets the 'loopback' flag on a packet regardless of the FIB
> selected if the dest is one of its own addresses.
>
> What you want is VIMAGE.

Perhaps the OP should rephrase his desire.

To me, it sounds like he wants to turn the FBSD box into a VLAN
aggregator, and then "trunk" the VLANs to an external router to route
between the VLAN subnets.

If this is the case, then the default route that points to the
'external' router would need to be applied on the devices within each
VLAN subnet, not on the VLAN aggregator device(s) themselves.

Do I understand what you are trying to do correctly?

Steve

From sthaug at nethelp.no Tue Apr 14 10:59:15 2009
From: sthaug at nethelp.no (sthaug@nethelp.no)
Date: Tue Apr 14 11:14:45 2009
Subject: Multiple default routes / Force external routing
In-Reply-To: <49E48799.1000300@ibctech.ca>
References: <20090413.220932.74699777.sthaug@nethelp.no>
	<49E41755.8050701@elischer.org>
	<49E48799.1000300@ibctech.ca>
Message-ID: <20090414.195912.74700172.sthaug@nethelp.no>

> Perhaps the OP should rephrase his desire.
>
> To me, it sounds like he wants to turn the FBSD box into a VLAN
> aggregator, and then "trunk" the VLANs to an external router to route
> between the VLAN subnets.

It's more that I'd like my FreeBSD box to be able to handle multiple
routing tables completely, as seen from an L3VPN point of view (this
is what Cisco calls VRF-lite, which is obviously not a full-fledged
MPLS L3VPN implementation):

- A box can have multiple routing tables. These are logically separate.

- Each interface is connected to one and only one routing table. Each
  routing table may have zero or more interfaces connected to it. Cisco
  and many other vendors call a routing table with interfaces connected
  to it a VRF, Virtual Routing and Forwarding instance, see for instance

  http://en.wikipedia.org/wiki/VRF

- There is no traffic between VRFs within the box (and thus, if two
  interfaces are in different routing tables, you can *not* get traffic
  between them within the box). There is no "short-circuit" between VRFs.
  If two interfaces are in the *same* routing table (same VRF) you can
  of course have traffic between them.

- To go between VRFs you need to send the traffic to an external
  device, for instance a firewall.

Thus if I have a box with the following routing tables/interfaces/
IP addresses:

Table   Intf    IP address
1       vlan0   192.168.1.1/30
2       vlan1   192.168.2.1/30
2       vlan2   192.168.3.1/30

then I can communicate from 192.168.2.1 to 192.168.3.1 within the box,
since both of these interfaces are in the same routing table.
But I cannot communicate from 192.168.2.1 to 192.168.1.1 within the
box, since these interfaces are in separate routing tables. To get
from 192.168.2.1 to 192.168.1.1 I need to send the traffic to an
external device.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

From pcc at gmx.net Tue Apr 14 12:05:58 2009
From: pcc at gmx.net (Peter Cornelius)
Date: Tue Apr 14 12:24:23 2009
Subject: Multiple default routes / Force external routing
In-Reply-To: <49E48799.1000300@ibctech.ca>
References: <20090413135402.78610@gmx.net>
	<20090413.220932.74699777.sthaug@nethelp.no>
	<49E41755.8050701@elischer.org>
	<49E48799.1000300@ibctech.ca>
Message-ID: <20090414190552.298990@gmx.net>

Re...

Thanks for the numerous responses, first time I feel like home :)

> >>> I have set up a box with various vlan interfaces on it. I naively
> >>> expected to be able to set individual "default" routes and route
> >>> between them via an *external* router (and filter packets there etc.)
> >>> but somehow all packets seem to "short-circuit" locally, and I don't
> >>> seem to be able to see why this is so and how I prevent that.
> >
> > I think you are rather confused about what Multiple FIBs is..
> > All it is is the ability to make a packet use a particular
> > FIB on its outgoing path. There is no such thing as an interface
> > being "In" a FIB. All interfaces are still visible to the routing code
> > by default, and the IP stack still knows about them. I think the IP
> > stack sets the 'loopback' flag on a packet regardless of the FIB
> > selected if the dest is one of its own addresses.

Yup, that is roughly what I expected to hear from what I observed.
Took a while to get there mentally though, sorry...

> > What you want is VIMAGE.

I haven't fiddled with that (yet) since it seems to be somewhat
separate from the src trunk (isn't it?) and I hoped to remain
mainstream. At first glance, it seems attractive ...

> To me, it sounds like he wants to turn the FBSD box into a VLAN
> aggregator, and then "trunk" the VLANs to an external router to route
> between the VLAN subnets.
>
> If this is the case, then the default route that points to the
> 'external' router would need to be applied on the devices within each
> VLAN subnet, not on the VLAN aggregator device(s) themselves.
>
> Do I understand what you are trying to do correctly?

The idea was to set up a server which behaves as if it was a set of
servers with different tasks offering different services with
different access rights etc. Think of it as a farm of physical servers
some of which are virtualised on a single box, typical virtualisation
task, I think.

The key point I want to achieve is a good separation of the networks
and control packet interchange via a physically separate device (which
also is a FreeBSD box btw).

The Ethernet trunk goes into a switch and from there on to the router.
So yes, that's the setup currently. But I may mention that the vlans
extend to other holes on the switch, and I definitely want to avoid
packets sneaking past the router if at all possible.

To cut a long story short, I thus would expect vimage to be a solution
at my server end, provided that (I can get it built and) I can tie
several jail instances to a given vlan interface (representing several
servers) and be sure that the packets are only seen there (and not on
other vlan ifs). I'll give it a closer look than I did so far asap, so
thanks.

All the best,

Peter.

From pcc at gmx.net Tue Apr 14 12:08:48 2009
From: pcc at gmx.net (Peter Cornelius)
Date: Tue Apr 14 12:29:04 2009
Subject: Multiple default routes / Force external routing
In-Reply-To: <20090414.195912.74700172.sthaug@nethelp.no>
References: <20090413.220932.74699777.sthaug@nethelp.no>
	<49E41755.8050701@elischer.org>
	<49E48799.1000300@ibctech.ca>
	<20090414.195912.74700172.sthaug@nethelp.no>
Message-ID: <20090414190842.299000@gmx.net>

Re^2...

> (...VRF...etc...pp...)
> - To go between VRFs you need to send the traffic to an external
> device, for instance a firewall.

That was my first line of thought but my way simply does not work like
that.

Regards,

Peter.

From sthaug at nethelp.no Tue Apr 14 12:23:22 2009
From: sthaug at nethelp.no (sthaug@nethelp.no)
Date: Tue Apr 14 13:08:03 2009
Subject: Multiple default routes / Force external routing
In-Reply-To: <49E48799.1000300@ibctech.ca>
References: <20090413.220932.74699777.sthaug@nethelp.no>
	<49E41755.8050701@elischer.org>
	<49E48799.1000300@ibctech.ca>
Message-ID: <20090414.212318.41684722.sthaug@nethelp.no>

> > I think you are rather confused about what Multiple FIBs is..
> > All it is is the ability to make a packet use a particular
> > FIB on its outgoing path. There is no such thing as an interface
> > being "In" a FIB. All interfaces are still visible to the routing code
> > by default, and the IP stack still knows about them. I think the IP
> > stack sets the 'loopback' flag on a packet regardless of the FIB
> > selected if the dest is one of its own addresses.
> >
> > What you want is VIMAGE.

I read a bit about VIMAGE (http://imunes.tel.fer.hr/virtnet/).
No, I don't see the need for complete virtualization of network interfaces etc. I *would* very much like separate routing tables. If you look at a traditional router from Cisco, Juniper or similar, they offer separate routing tables without virtualizing everything. Steinar Haug, Nethelp consulting, sthaug@nethelp.no From bz at FreeBSD.org Tue Apr 14 12:44:13 2009 From: bz at FreeBSD.org (bz@FreeBSD.org) Date: Tue Apr 14 13:19:53 2009 Subject: kern/125079: [ppp] host routes added by ppp with gateway flag (regression) Message-ID: <200904141944.n3EJiCRg074736@freefall.freebsd.org> Synopsis: [ppp] host routes added by ppp with gateway flag (regression) Responsible-Changed-From-To: freebsd-net->bz Responsible-Changed-By: bz Responsible-Changed-When: Tue Apr 14 19:43:57 UTC 2009 Responsible-Changed-Why: follow-ups to me. http://www.freebsd.org/cgi/query-pr.cgi?pr=125079 From bz at FreeBSD.org Tue Apr 14 12:44:36 2009 From: bz at FreeBSD.org (bz@FreeBSD.org) Date: Tue Apr 14 13:19:54 2009 Subject: kern/122068: [ppp] ppp can not set the correct interface with pptpd Message-ID: <200904141944.n3EJiY7g074786@freefall.freebsd.org> Synopsis: [ppp] ppp can not set the correct interface with pptpd Responsible-Changed-From-To: freebsd-net->bz Responsible-Changed-By: bz Responsible-Changed-When: Tue Apr 14 19:44:22 UTC 2009 Responsible-Changed-Why: follow-ups to me. http://www.freebsd.org/cgi/query-pr.cgi?pr=122068 From sullrich at gmail.com Tue Apr 14 13:54:56 2009 From: sullrich at gmail.com (Scott Ullrich) Date: Tue Apr 14 14:20:48 2009 Subject: NATT patch and FreeBSD's setkey In-Reply-To: <20090226141138.GA91564@zeninc.net> References: <85c4b1850902170448p7a59d50bt6bdaa89aa01c51d7@mail.gmail.com> <20090217143425.GA58591@zeninc.net> <20090217143409.J53478@maildrop.int.zabbadoz.net> <20090226141138.GA91564@zeninc.net> Message-ID: On Thu, Feb 26, 2009 at 10:11 AM, VANHULLEBUS Yvan wrote: > On Tue, Feb 17, 2009 at 02:41:41PM +0000, Bjoern A. 
Zeeb wrote:
[snip]
>> We have about 3 months left to get that patch in for 8; ideally 6
>> weeks. Can you update the nat-t patch in a way as discussed here
>> before so that the extra address is in etc. and we can move forward?
>
> Done, new version is available here:
> http://people.freebsd.org/~vanhu/NAT-T/experimental/patch-FreeBSD-TRUNK-NATT-pfkey-clean-2009-02-26.diff

Hello,

We recently tested this patch on an up-to-date current as of a couple
hours ago and it seems to break all outgoing UDP traffic (DNS
included).

Has anyone else experienced this issue? Backing the patch out of our
pfSense patch roster cleared up the problem.

Is there a newer patch available by chance?

Thanks,

Scott

From linimon at FreeBSD.org Tue Apr 14 14:39:36 2009
From: linimon at FreeBSD.org (linimon@FreeBSD.org)
Date: Tue Apr 14 15:18:43 2009
Subject: kern/133736: [udp] ip_id not protected ...
Message-ID: <200904142139.n3ELdZJj025209@freefall.freebsd.org>

Old Synopsis: ip_id not protected ...
New Synopsis: [udp] ip_id not protected ...

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Tue Apr 14 21:38:57 UTC 2009
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=133736

From p.pisati at oltrelinux.com Tue Apr 14 14:50:42 2009
From: p.pisati at oltrelinux.com (Paolo Pisati)
Date: Tue Apr 14 15:20:35 2009
Subject: [patch] mbuf aware libalias
Message-ID: <49E50262.8060603@oltrelinux.com>

http://people.freebsd.org/~piso/libalias_mbuf.diff

This patch makes libalias able to handle mbufs: TOS, big MTU, much less
copy-around, etc.

I encourage people to test it, since I would like to commit it soon.

Known issues:

- documentation was not updated
- I didn't convert the fragment handling part (GetFragment,
  SaveFragment & co.) since I would like to axe it
- all the modules still require some copy-around to work, but I'm
  teaching them, piece by piece, how to use mbufs

bye,
P.

From sfourman at gmail.com Tue Apr 14 17:22:59 2009
From: sfourman at gmail.com (Sam Fourman Jr.)
Date: Tue Apr 14 18:00:32 2009
Subject: Multiple default routes / Force external routing
In-Reply-To: <20090414.212318.41684722.sthaug@nethelp.no>
References: <20090413.220932.74699777.sthaug@nethelp.no>
	<49E41755.8050701@elischer.org>
	<49E48799.1000300@ibctech.ca>
	<20090414.212318.41684722.sthaug@nethelp.no>
Message-ID: <11167f520904141722r16b537a9o58497c9719fb6fc5@mail.gmail.com>

On Tue, Apr 14, 2009 at 2:23 PM, wrote:
>> > I think you are rather confused about what Multiple FIBs is..
>> > All it is is the ability to make a packet use a particular
>> > FIB on its outgoing path. There is no such thing as an interface
>> > being "In" a FIB. All interfaces are still visible to the routing code
>> > by default, and the IP stack still knows about them. I think the IP
>> > stack sets the 'loopback' flag on a packet regardless of the FIB
>> > selected if the dest is one of its own addresses.
>> >
>> > What you want is VIMAGE.

is VIMAGE fully integrated into FreeBSD 8 CURRENT? (I believe this
answer is no)
also is VIMAGE expected to make it into FreeBSD 8?

Maybe Someone will give a VIMAGE update at BSDCan this year

Sam Fourman Jr.

From steve at ibctech.ca Tue Apr 14 17:52:48 2009
From: steve at ibctech.ca (Steve Bertrand)
Date: Tue Apr 14 18:18:21 2009
Subject: [OT] Multiple default routes / Force external routing
In-Reply-To: <11167f520904141722r16b537a9o58497c9719fb6fc5@mail.gmail.com>
References: <20090413.220932.74699777.sthaug@nethelp.no>
	<49E41755.8050701@elischer.org>
	<49E48799.1000300@ibctech.ca>
	<20090414.212318.41684722.sthaug@nethelp.no>
	<11167f520904141722r16b537a9o58497c9719fb6fc5@mail.gmail.com>
Message-ID: <49E52FD4.2060103@ibctech.ca>

Sam Fourman Jr. wrote:
> On Tue, Apr 14, 2009 at 2:23 PM, wrote:
>>>> I think you are rather confused about what Multiple FIBs is..
>>>> All it is is the ability to make a packet use a particular
>>>> FIB on its outgoing path. There is no such thing as an interface
>>>> being "In" a FIB. All interfaces are still visible to the routing code
>>>> by default, and the IP stack still knows about them. I think the IP
>>>> stack sets the 'loopback' flag on a packet regardless of the FIB
>>>> selected if the dest is one of its own addresses.
>>>>
>>>> What you want is VIMAGE.
>
> is VIMAGE fully integrated into FreeBSD 8 CURRENT? (I believe this
> answer is no)
> also is VIMAGE expected to make it into FreeBSD 8?
>
> Maybe Someone will give a VIMAGE update at BSDCan this year

Don't know about VIMAGE, but regarding BSDCan, will those who are going
respond to me off-list?

I'm pretty close to Toronto, and I am seriously considering attending
this year. Knowing who is close to me geographically on this list would
be great!

Steve

From julian at elischer.org Tue Apr 14 22:44:29 2009
From: julian at elischer.org (Julian Elischer)
Date: Tue Apr 14 23:29:22 2009
Subject: Multiple default routes / Force external routing
In-Reply-To: <11167f520904141722r16b537a9o58497c9719fb6fc5@mail.gmail.com>
References: <20090413.220932.74699777.sthaug@nethelp.no>
	<49E41755.8050701@elischer.org>
	<49E48799.1000300@ibctech.ca>
	<20090414.212318.41684722.sthaug@nethelp.no>
	<11167f520904141722r16b537a9o58497c9719fb6fc5@mail.gmail.com>
Message-ID: <49E57076.7040509@elischer.org>

Sam Fourman Jr. wrote:
> On Tue, Apr 14, 2009 at 2:23 PM, wrote:
>>>> I think you are rather confused about what Multiple FIBs is..
>>>> All it is is the ability to make a packet use a particular
>>>> FIB on its outgoing path. There is no such thing as an interface
>>>> being "In" a FIB. All interfaces are still visible to the routing code
>>>> by default, and the IP stack still knows about them. I think the IP
>>>> stack sets the 'loopback' flag on a packet regardless of the FIB
>>>> selected if the dest is one of its own addresses.
>>>>
>>>> What you want is VIMAGE.
>
> is VIMAGE fully integrated into FreeBSD 8 CURRENT? (I believe this
> answer is no)
> also is VIMAGE expected to make it into FreeBSD 8?

not fully but a lot of it is under way

>
> Maybe Someone will give a VIMAGE update at BSDCan this year

I'm hoping the report will be "try this option in your kernel"
but real life tends to make these plans variable.

>
> Sam Fourman Jr.

From vanhu at FreeBSD.org Wed Apr 15 00:09:10 2009
From: vanhu at FreeBSD.org (VANHULLEBUS Yvan)
Date: Wed Apr 15 00:38:47 2009
Subject: NATT patch and FreeBSD's setkey
In-Reply-To:
References: <85c4b1850902170448p7a59d50bt6bdaa89aa01c51d7@mail.gmail.com>
	<20090217143425.GA58591@zeninc.net>
	<20090217143409.J53478@maildrop.int.zabbadoz.net>
	<20090226141138.GA91564@zeninc.net>
Message-ID: <20090415071247.GA78251@zeninc.net>

On Tue, Apr 14, 2009 at 04:24:44PM -0400, Scott Ullrich wrote:
> On Thu, Feb 26, 2009 at 10:11 AM, VANHULLEBUS Yvan wrote:
> > On Tue, Feb 17, 2009 at 02:41:41PM +0000, Bjoern A. Zeeb wrote:
> [snip]
> >> We have about 3 months left to get that patch in for 8; ideally 6
> >> weeks. Can you update the nat-t patch in a way as discussed here
> >> before so that the extra address is in etc. and we can move forward?
> >
> > Done, new version is available here:
> > http://people.freebsd.org/~vanhu/NAT-T/experimental/patch-FreeBSD-TRUNK-NATT-pfkey-clean-2009-02-26.diff
>
> Hello,

Hi.

> We recently tested this patch on an up-to-date current as of a couple
> hours ago and it seems to break all outgoing UDP traffic (DNS
> included).

There's a conflict between INP_ESPINUDP* and other INP_* committed
since 2009-02-26.

> Has anyone else experienced this issue? Backing the patch out of our
> pfSense patch roster cleared up the problem.
>
> Is there a newer patch available by chance?

Actually, not, because there are no bits left in inp_flags, so we are
actually looking for another location to put them.

Yvan.
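
The conflict Yvan describes is the usual failure mode when an
out-of-tree patch claims bits in a shared flags word and the tree later
hands the same bits to something else. A minimal sketch in C, assuming
made-up values: the EX_* names and numbers below are hypothetical and
are not the real INP_* definitions, and the two helpers are
illustrative only, not anything from the FreeBSD tree.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical flag values for illustration only; these are NOT the
 * real INP_* definitions.
 */
#define EX_FLAG_IN_TREE  0x00800000u /* bit claimed by a later in-tree commit */
#define EX_FLAG_PATCH_A  0x00800000u /* same bit, claimed earlier by a patch */
#define EX_FLAG_PATCH_B  0x01000000u /* second patch flag, no clash */

/* Return true if any two entries in flags[] share at least one bit. */
static bool
flags_overlap(const uint32_t *flags, int n)
{
	uint32_t seen = 0;

	for (int i = 0; i < n; i++) {
		if (seen & flags[i])
			return true;
		seen |= flags[i];
	}
	return false;
}

/* Count the bits still unclaimed in a 32-bit flags word. */
static int
free_bits(uint32_t used)
{
	int n = 0;

	for (int i = 0; i < 32; i++)
		if ((used & (UINT32_C(1) << i)) == 0)
			n++;
	return n;
}
```

With overlapping values, setting one flag silently sets the other, which
would be consistent with unrelated traffic breaking. Once free_bits()
reaches zero for a word, new flags need a second field, which is the
"another location" Yvan mentions.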

From dennis.melentyev at gmail.com Wed Apr 15 03:50:06 2009
From: dennis.melentyev at gmail.com (Dennis Melentyev)
Date: Wed Apr 15 04:32:19 2009
Subject: kern/133572: [ppp] [hang] incoming PPTP connection hangs the system
Message-ID: <200904151050.n3FAo5CG023803@freefall.freebsd.org>

The following reply was made to PR kern/133572; it has been noted by GNATS.

From: Dennis Melentyev
To: Max Laier
Cc: bug-followup@freebsd.org
Subject: Re: kern/133572: [ppp] [hang] incoming PPTP connection hangs the system
Date: Wed, 15 Apr 2009 13:27:41 +0300

Hi Max,

It was some hard time for me, sorry for late response.

I did enable KDB, DDB and WITNESS on the same sources. Unfortunately
there were just plain hangs once some GRE was trying to get through
(netgraph? PF? routing?)
With these options enabled, hangs are much more frequent than without
them.

Once hung, no way to break into debugger, no panics, numlock not
changing lights on keyboard, mouse not responding, hdd silent, network
not available, nothing.
3 different HW platforms were tried (all of them were UP+i386+32bit).
Highest CPU temperature was 52C.

No chance to go with 7.2-PRERELEASE. Had to downgrade to 7.1-RELEASE.

/dennis

2009/4/11 Max Laier :
> Is it possible for you to turn on WITNESS on this machine to obtain possible
> LORs that might be responsible for the hang? Also, do you have the
> possibility to enable DDB and drop into it from the console (if it is not a
> hard hang but a live lock)?
>
> --
>  Max
>

--
Dennis Melentyev

From bms at incunabulum.net Wed Apr 15 04:36:34 2009
From: bms at incunabulum.net (Bruce Simpson)
Date: Wed Apr 15 04:51:45 2009
Subject: OpenSSL DTLS bug fix patches
Message-ID: <49E5D4CF.8050707@incunabulum.net>

I know it's late in the 7.2 game, but does our OpenSSL maintainer know
about this?

http://sctp.fh-muenster.de/dtls-patches.html

It would be nice to have in a release, although I'm tracking branches
for anything I'm doing at the moment.
JFYI, BMS From adamk at voicenet.com Wed Apr 15 04:40:07 2009 From: adamk at voicenet.com (Adam K Kirchhoff) Date: Wed Apr 15 04:52:41 2009 Subject: kern/131153: [iwi] iwi doesn't see a wireless network Message-ID: <200904151140.n3FBe43E093373@freefall.freebsd.org> The following reply was made to PR kern/131153; it has been noted by GNATS. From: Adam K Kirchhoff To: bug-followup@FreeBSD.org, adamk@voicenet.com Cc: Subject: Re: kern/131153: [iwi] iwi doesn't see a wireless network Date: Wed, 15 Apr 2009 07:18:15 -0400 This problem persists with 7.2-PRERELEASE, with both iwi and ath. Any ideas? Adam From alexey.blinkov at gmail.com Wed Apr 15 07:05:22 2009 From: alexey.blinkov at gmail.com (=?UTF-8?B?0JDQu9C10LrRgdC10Lkg0JHQu9C40L3QutC+0LI=?=) Date: Wed Apr 15 07:43:16 2009 Subject: MD5 authentication in quagga Message-ID: <2d934d80904150642r585049b4wadfdfc82a3d8c7fc@mail.gmail.com> Hi. I have a problem with the subject. On the quagga mailing list I was told to write to the FreeBSD list. Quote: It is well documented that md5 'password' authentication for bgpd works, but only for outgoing packets... there is no way for FreeBSD (to my knowledge) to actually verify packets inbound. ...it's better than nothing ;) First, my configuration on FreeBSD 7.1: /etc/rc.conf ipsec_enable="YES" ipsec_file="/etc/ipsec.conf" /etc/ipsec.conf flush; add x.x.x.x y.y.y.y tcp 0x1000 -A tcp-md5 "*********"; where: x.x.x.x - local IP y.y.y.y - remote IP ******** - password Next, my kernel was rebuilt with the following options: options TCP_SIGNATURE options IPSEC device crypto device cryptodev device cryptodev Now I set the password for the BGP neighbor: quagga-router(config router)# neighbor y.y.y.y password ******** And clear the session: quagga-router(config router)# do clear ip bgp y.y.y.y On the remote side the PASSWORD IS NOT SET YET, but the BGP session goes to state UP, and network prefixes are sent from the local side to the remote side and vice versa. But the neighborship must not come up if the passwords do not match... -- Best regards, Алексей
Блинков From plethora87 at aim.com Wed Apr 15 07:20:11 2009 From: plethora87 at aim.com (plethora87@aim.com) Date: Wed Apr 15 08:03:44 2009 Subject: kern/133490: [bpf] [panic] 'kmem_map too small' panic on Dell r900 when bpf_bufsize and bpf_maxbufsize are increased Message-ID: <200904151420.n3FEKBfr007429@freefall.freebsd.org> The following reply was made to PR kern/133490; it has been noted by GNATS. From: plethora87@aim.com To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/133490: [bpf] [panic] 'kmem_map too small' panic on Dell r900 when bpf_bufsize and bpf_maxbufsize are increased Date: Wed, 15 Apr 2009 10:00:04 -0400 If I set the net.bpf buffers after boot-up, there's no immediate crash. But I just had a crash after a couple days of uptime: Dump header from device /dev/mfid0s1b Architecture: i386 Architecture Version: 2 Dump Length: 456548352B (435 MB) Blocksize: 512 Dumptime: Wed Apr 15 09:04:06 2009 Hostname: schnozz-nap-b Magic: FreeBSD Kernel Dump Version String: FreeBSD 7.1-RELEASE #3: Wed Apr 1 11:04:28 EDT 2009 root@schnozz-nap-a:/usr/obj/usr/src/sys/CCSP-KERNEL Panic String: kmem_malloc(16777216): kmem_map too small: 326787072 total allocated Dump Parity: 366409564 Bounds: 7 Dump Status: good I can upload the core file somewhere if it would be helpful. -Terry From bzeeb-lists at lists.zabbadoz.net Wed Apr 15 07:55:07 2009 From: bzeeb-lists at lists.zabbadoz.net (Bjoern A. Zeeb) Date: Wed Apr 15 08:15:32 2009 Subject: MD5 authentication in quagga In-Reply-To: <2d934d80904150642r585049b4wadfdfc82a3d8c7fc@mail.gmail.com> References: <2d934d80904150642r585049b4wadfdfc82a3d8c7fc@mail.gmail.com> Message-ID: <20090415144956.T15361@maildrop.int.zabbadoz.net> On Wed, 15 Apr 2009, Алексей Блинков wrote: > Hi. I have a problem with the subject. On the quagga mailing list I was told to > write to the FreeBSD list. > > Quote: > > It is well documented that md5 'password' authentication for bgpd works, > but only for outgoing packets...
there is no way for FreeBSD (to my > knowledge) to actually verify packets inbound. > > ...it's better than nothing ;) > > > First, my configuration on FreeBSD 7.1: > > /etc/rc.conf > > ipsec_enable="YES" > ipsec_file="/etc/ipsec.conf" > > /etc/ipsec.conf > > flush; > add x.x.x.x y.y.y.y tcp 0x1000 -A tcp-md5 "*********"; > > where: > > x.x.x.x - local IP > y.y.y.y - remote IP > ******** - password > > Next, my kernel was rebuilt with the following options: > > options TCP_SIGNATURE > options IPSEC > device crypto > device cryptodev > device cryptodev > > Now I set the password for the BGP neighbor: > > quagga-router(config router)# neighbor y.y.y.y password ******** > > And clear the session: > > quagga-router(config router)# do clear ip bgp y.y.y.y > > On the remote side the PASSWORD IS NOT SET YET, but the BGP session goes to state > UP, and network prefixes are sent from the local side to the remote side and vice > versa. > > But the neighborship must not come up if the passwords do not match... And what's the peer? If it's another FreeBSD box it won't check incoming packets either, and thus it won't make a difference compared to when it's not there. /bz -- Bjoern A. Zeeb The greatest risk is not taking one. From alexey.blinkov at gmail.com Wed Apr 15 08:07:16 2009 From: alexey.blinkov at gmail.com (=?UTF-8?B?0JDQu9C10LrRgdC10Lkg0JHQu9C40L3QutC+0LI=?=) Date: Wed Apr 15 08:38:39 2009 Subject: MD5 authentication in quagga In-Reply-To: <20090415144956.T15361@maildrop.int.zabbadoz.net> References: <2d934d80904150642r585049b4wadfdfc82a3d8c7fc@mail.gmail.com> <20090415144956.T15361@maildrop.int.zabbadoz.net> Message-ID: <2d934d80904150807p732bce43gc110fe6ae042507d@mail.gmail.com> In the ideal situation: if the md5 password doesn't match or is empty, then the peering must be closed... Right now md5 works only for outgoing packets, not for incoming ones, and the peering is not closed if the password is missing or does not match, because BSD does not check incoming packets, I think...
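The behaviour asked for in this thread (refuse the peering when the password is missing or wrong) corresponds, as far as I read it, to the receive-side rule of RFC 2385: a segment whose MD5 digest is absent or does not match is dropped. Below is a minimal sketch of that decision only. The ex_* names are hypothetical, the digest computation itself is left out, and this is not the FreeBSD kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MD5_DIGEST_LEN 16 /* an MD5 digest is 16 bytes long */

/* Compare two digests without stopping at the first differing byte. */
static int
ex_digest_equal(const uint8_t *a, const uint8_t *b)
{
	uint8_t diff = 0;
	size_t i;

	assert(a != NULL && b != NULL);
	for (i = 0; i < EX_MD5_DIGEST_LEN; i++)
		diff |= (uint8_t)(a[i] ^ b[i]);
	return (diff == 0);
}

/*
 * Decide whether an inbound segment may be accepted.
 *   key_configured: nonzero if a password is set for this peer
 *   received:       digest carried in the segment, or NULL if none
 *   computed:       digest the receiver calculated with its own key
 * Returns 1 to accept, 0 to drop silently.
 */
static int
ex_accept_segment(int key_configured, const uint8_t *received,
    const uint8_t *computed)
{
	if (!key_configured)
		return (1);	/* no signature policy for this peer */
	if (received == NULL)
		return (0);	/* signature required but missing */
	return (ex_digest_equal(received, computed));
}
```

A stack that only signs outgoing segments never runs a check like ex_accept_segment() on input, which would explain why the session still comes up when the remote side has no password set.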
From renaud at vmware.com Wed Apr 15 08:59:41 2009 From: renaud at vmware.com (Renaud Lienhart) Date: Wed Apr 15 09:23:27 2009 Subject: tcp_output() might generate invalid TSO frames Message-ID: <20090415084031.6d149fef@renaud-dev1> Hi, We're having trouble virtualizing FreeBSD 7+ on ESX because of an issue with the stack's TSO implementation: it sometimes generates TSO packets whose payload size is actually smaller than the MSS. The faulty logic is described, along with a patch, in PR #132832. It has been open for a while now, without any apparent activity, which is why I'm reaching out to the mailing list directly. ESX currently drops these packets as many physical NICs are known to choke on such frames, which effectively limits FreeBSD guests' performance. I don't know about other virtualization stacks' behavior. http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/132832 Thanks for your time, Renaud From sullrich at gmail.com Wed Apr 15 09:58:06 2009 From: sullrich at gmail.com (Scott Ullrich) Date: Wed Apr 15 10:30:13 2009 Subject: NATT patch and FreeBSD's setkey In-Reply-To: <20090415071247.GA78251@zeninc.net> References: <85c4b1850902170448p7a59d50bt6bdaa89aa01c51d7@mail.gmail.com> <20090217143425.GA58591@zeninc.net> <20090217143409.J53478@maildrop.int.zabbadoz.net> <20090226141138.GA91564@zeninc.net> <20090415071247.GA78251@zeninc.net> Message-ID: On Wed, Apr 15, 2009 at 3:12 AM, VANHULLEBUS Yvan wrote: > Actually, not, because there are no bits left in inp_flags, so we are > actually looking for another location to put them. Sounds good and thanks for the information. We will be happy to test the next patch when it's ready. Thanks for maintaining the patch so far, Scott From alexey-lukashin at yandex.ru Wed Apr 15 11:09:22 2009 From: alexey-lukashin at yandex.ru (Alexey Lukashin) Date: Wed Apr 15 11:50:03 2009 Subject: Netgraph.
panic in kernel Message-ID: <49E619CD.4000502@yandex.ru> Hi all, I'm studying how the Netgraph system works and trying to write my own netgraph node similar to ng_bridge. It catches packets from the lower ng_ether hooks and transmits them to other interfaces using a MAC address hash table. Packet processing in my node is implemented similarly to ng_bridge_rcvdata() in ng_bridge.c. I don't do anything with the packet. I don't modify the packet header; I only send it to another interface. My interfaces are working in promiscuous mode with autosrc=0. But sometimes (after one or two hours of working on the network) I get an error with the message: "rl1: discard frame w/o packet header" After that, my system halts. Does anybody know where the problem could be? When does this message appear? (The system is FreeBSD 7.1-STABLE.) Thank you. -- Best regards, Alexey Lukashin Saint-Petersburg, Russia From kmacy at freebsd.org Wed Apr 15 11:39:10 2009 From: kmacy at freebsd.org (Kip Macy) Date: Wed Apr 15 12:04:21 2009 Subject: tcp_output() might generate invalid TSO frames In-Reply-To: <20090415084031.6d149fef@renaud-dev1> References: <20090415084031.6d149fef@renaud-dev1> Message-ID: <3c1674c90904151106j543d4772s25786f81d7ff55a1@mail.gmail.com> Interesting. That might explain a problem that Mike Silbersack is seeing with the latest em driver on vmware. I don't know of any NICs that actually choke on such frames. Nonetheless, it is silly behavior. I'll try to see if we can get this fixed before 7.2. Thanks, Kip On Wed, Apr 15, 2009 at 8:40 AM, Renaud Lienhart wrote: > Hi, > > We're having trouble virtualizing FreeBSD 7+ on ESX because of an issue > with the stack's TSO implementation: it sometimes generates TSO packets > whose payload size is actually smaller than the MSS. > > The faulty logic is described, along with a patch, in PR #132832. It > has been open for a while now, without any apparent activity, which > is why I'm reaching out to the mailing list directly.
> > ESX currently drops these packets as many physical NICs are known to > choke on such frames, which effectively limits FreeBSD guests' > performance. > I don't know about other virtualization stacks' behavior. > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/132832 > > Thanks for your time, > > Renaud > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > -- All that is necessary for the triumph of evil is that good men do nothing. Edmund Burke From Michael.Tuexen at lurchi.franken.de Wed Apr 15 12:10:45 2009 From: Michael.Tuexen at lurchi.franken.de (=?ISO-8859-1?Q?Michael_T=FCxen?=) Date: Wed Apr 15 13:01:18 2009 Subject: OpenSSL DTLS bug fix patches In-Reply-To: <49E5D4CF.8050707@incunabulum.net> References: <49E5D4CF.8050707@incunabulum.net> Message-ID: <822B17FC-60E1-4F19-8E62-BB2E5351CB99@lurchi.franken.de> Hi Bruce, at least one member of the OpenSSL core team (Steven) has integrated our patches regarding bug fixes in the source code. So they will be included in the next release of OpenSSL. Best regards Michael On Apr 15, 2009, at 2:36 PM, Bruce Simpson wrote: > I know it's late in the 7.2 game, but does our OpenSSL maintainer > know about this? > > http://sctp.fh-muenster.de/dtls-patches.html > > It would be nice to have in a release, although I'm tracking > branches for anything I'm doing at the moment.
> > JFYI, > BMS > From jfvogel at gmail.com Wed Apr 15 13:34:12 2009 From: jfvogel at gmail.com (Jack Vogel) Date: Wed Apr 15 14:19:42 2009 Subject: tcp_output() might generate invalid TSO frames In-Reply-To: <3c1674c90904151106j543d4772s25786f81d7ff55a1@mail.gmail.com> References: <20090415084031.6d149fef@renaud-dev1> <3c1674c90904151106j543d4772s25786f81d7ff55a1@mail.gmail.com> Message-ID: <2a41acea0904151304t69ff9f61q3053b2a011402626@mail.gmail.com> No, the problem Mike is having is due to an issue in our new shared code in how we get the MAC address. We changed it to support alt MAC addresses, and it works fine on our hardware, but there is an issue in the vmware emulation. Nevertheless, if there's a problem in the TSO code it would be nice to get that fixed. Jack On Wed, Apr 15, 2009 at 11:06 AM, Kip Macy wrote: > Interesting. That might explain a problem that Mike Silbersack is > seeing with the latest em driver on vmware. > > I don't know of any NICs that actually choke on such frames. > Nonetheless, it is silly behavior. I'll try to see if we can get this > fixed before 7.2. > > Thanks, > Kip > > On Wed, Apr 15, 2009 at 8:40 AM, Renaud Lienhart > wrote: > > Hi, > > > > We're having trouble virtualizing FreeBSD 7+ on ESX because of an issue > > with the stack's TSO implementation: it sometimes generates TSO packets > > whose payload size is actually smaller than the MSS. > > > > The faulty logic is described, along with a patch, in PR #132832. It > > has been open for a while now, without any apparent activity, which > > is why I'm reaching out to the mailing list directly. > > > > ESX currently drops these packets as many physical NICs are known to > > choke on such frames, which effectively limits FreeBSD guests' > > performance.
> > I don't know about other virtualization stacks' behavior. > > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/132832 > > > > Thanks for your time, > > > > Renaud > > > > -- > All that is necessary for the triumph of evil is that good men do nothing. > Edmund Burke > From bms at incunabulum.net Wed Apr 15 17:16:42 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Wed Apr 15 17:49:57 2009 Subject: MD5 authentication in quagga In-Reply-To: <2d934d80904150807p732bce43gc110fe6ae042507d@mail.gmail.com> References: <2d934d80904150642r585049b4wadfdfc82a3d8c7fc@mail.gmail.com> <20090415144956.T15361@maildrop.int.zabbadoz.net> <2d934d80904150807p732bce43gc110fe6ae042507d@mail.gmail.com> Message-ID: <49E678E6.102@incunabulum.net> Алексей Блинков wrote: > In the ideal situation: > > if the md5 password doesn't match or is empty, then the peering must be closed... > > Right now md5 works only for outgoing packets, not for incoming ones, and the peering > is not closed if the password is missing or does not match, because BSD does not check > incoming packets, I think... > I thought someone had fixed this ages ago? I seem to remember someone had merged some changes to what I'd originally done for Sentex from NetBSD... but I could be wrong.
cheers, BMS From bms at incunabulum.net Wed Apr 15 17:19:23 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Wed Apr 15 17:51:06 2009 Subject: OpenSSL DTLS bug fix patches In-Reply-To: <822B17FC-60E1-4F19-8E62-BB2E5351CB99@lurchi.franken.de> References: <49E5D4CF.8050707@incunabulum.net> <822B17FC-60E1-4F19-8E62-BB2E5351CB99@lurchi.franken.de> Message-ID: <49E67988.2020008@incunabulum.net> Michael Tüxen wrote: > Hi Bruce, > > at least one member of the OpenSSL core team (Steven) has integrated > our patches regarding bug fixes in the source code. > So they will be included in the next release of OpenSSL. > That's excellent news, and these fixes look good, but I was more wondering if this drop would be in FreeBSD 7.2-RELEASE :-) If not no biggie, I am tracking -STABLE for work. thanks, BMS From ngharibyan at arm.synisys.com Wed Apr 15 22:58:31 2009 From: ngharibyan at arm.synisys.com (Narek Gharibyan) Date: Wed Apr 15 23:33:43 2009 Subject: A Quick Question Message-ID: <0588BB0DDA024074A71CFF7A93877E11@arm.synisys.com> Hello Sir/Mdm I would like to know whether there is any solution to the problem shown below, because we use FreeBSD 7.0 in our network structure and we run into this problem every day kern/121555: [panic] Fatal trap 12: current process = 12 (swi1: net) From: Alexey Sopov Date: Mon, 10 Mar 2008 11:46:51 GMT Subject: [7.0-RELEASE] Fatal trap 12: current process = 12 (swi1: net) Send-pr version: www-3.1 Number: 121555 Category: kern Synopsis: [panic] Fatal trap 12: current process = 12 (swi1: net) Severity: serious Priority: high Responsible: freebsd-net@FreeBSD.org State: open Class: sw-bug Arrival-Date: Mon Mar 10 12:00:01 UTC 2008 Closed-Date: Last-Modified: Fri May 23 20:48:21 UTC 2008 Originator: Alexey Sopov Release: 7.0-RELEASE Best Regards, Narek Gharibyan Network Administration Team leader Synergy International Systems Inc.
/ Armenia http://www.synisys.com Tel.: mobile: +37494 - 353489 work: +37410 - 650202 ext 772 From Michael.Tuexen at lurchi.franken.de Wed Apr 15 23:35:47 2009 From: Michael.Tuexen at lurchi.franken.de (=?ISO-8859-1?Q?Michael_T=FCxen?=) Date: Thu Apr 16 00:11:56 2009 Subject: OpenSSL DTLS bug fix patches In-Reply-To: <49E67988.2020008@incunabulum.net> References: <49E5D4CF.8050707@incunabulum.net> <822B17FC-60E1-4F19-8E62-BB2E5351CB99@lurchi.franken.de> <49E67988.2020008@incunabulum.net> Message-ID: <0A42378E-1193-41B4-964D-C1A4E4632616@lurchi.franken.de> On Apr 16, 2009, at 2:19 AM, Bruce Simpson wrote: > Michael Tüxen wrote: >> Hi Bruce, >> >> at least one member of the OpenSSL core team (Steven) has integrated >> our patches regarding bug fixes in the source code. >> So they will be included in the next release of OpenSSL. >> > > That's excellent news, and these fixes look good, but I was more > wondering if this drop would be in FreeBSD 7.2-RELEASE :-) I know, but I wanted to make the state of the patches clear to make the decision for the port maintainer easier. Are you using DTLS? > > If not no biggie, I am tracking -STABLE for work. > > thanks, > BMS > From kmacy at freebsd.org Wed Apr 15 23:38:33 2009 From: kmacy at freebsd.org (Kip Macy) Date: Thu Apr 16 00:20:08 2009 Subject: A Quick Question In-Reply-To: <0588BB0DDA024074A71CFF7A93877E11@arm.synisys.com> References: <0588BB0DDA024074A71CFF7A93877E11@arm.synisys.com> Message-ID: <3c1674c90904152338g1e25fe45n930db60d84958b5a@mail.gmail.com> Please see the handbook for providing debugging information. This is a very generic panic.
-Kip 2009/4/15 Narek Gharibyan : > Hello Sir/Mdm > > I would like to know whether there is any solution to the problem shown below, because we > use FreeBSD 7.0 in our network structure and we run into > this problem every day > > kern/121555: [panic] Fatal trap 12: current process = 12 (swi1: net) > > From: Alexey Sopov > Date: Mon, 10 Mar 2008 11:46:51 GMT > Subject: [7.0-RELEASE] Fatal trap 12: current process = 12 (swi1: net) > Send-pr version: www-3.1 > > Number: 121555 > Category: kern > Synopsis: [panic] Fatal trap 12: current process = 12 (swi1: net) > Severity: serious > Priority: high > Responsible: freebsd-net@FreeBSD.org > State: open > Class: sw-bug > Arrival-Date: Mon Mar 10 12:00:01 UTC 2008 > Closed-Date: > Last-Modified: Fri May 23 20:48:21 UTC 2008 > Originator: Alexey Sopov > Release: 7.0-RELEASE > > Best Regards, > Narek Gharibyan > > Network Administration Team leader > Synergy International Systems Inc. / Armenia > http://www.synisys.com > > Tel.: > mobile: +37494 - 353489 > work: +37410 - 650202 ext 772 > -- All that is necessary for the triumph of evil is that good men do nothing.
Edmund Burke From alexey.blinkov at gmail.com Thu Apr 16 00:52:47 2009 From: alexey.blinkov at gmail.com (=?UTF-8?B?0JDQu9C10LrRgdC10Lkg0JHQu9C40L3QutC+0LI=?=) Date: Thu Apr 16 01:39:47 2009 Subject: MD5 authentication in quagga In-Reply-To: <49E678E6.102@incunabulum.net> References: <2d934d80904150642r585049b4wadfdfc82a3d8c7fc@mail.gmail.com> <20090415144956.T15361@maildrop.int.zabbadoz.net> <2d934d80904150807p732bce43gc110fe6ae042507d@mail.gmail.com> <49E678E6.102@incunabulum.net> Message-ID: <2d934d80904160052u70980215v1a32b07d4b1168f@mail.gmail.com> On 16 April 2009 at 3:16, Bruce Simpson wrote: > Алексей Блинков wrote: >> >> In the ideal situation: >> >> if the md5 password doesn't match or is empty, then the peering must be closed... >> >> Right now md5 works only for outgoing packets, not for incoming ones, and the peering >> is not closed if the password is missing or does not match, because BSD does not check >> incoming packets, I think... >> > > I thought someone had fixed this ages ago? > I seem to remember someone had merged some changes to what I'd originally > done for Sentex from NetBSD... but I could be wrong. > > cheers, > BMS > I don't know how the kernel handles md5 hashing, because I'm new to BSD... -- Best regards, Алексей Блинков From gavin at FreeBSD.org Thu Apr 16 01:22:51 2009 From: gavin at FreeBSD.org (gavin@FreeBSD.org) Date: Thu Apr 16 01:51:54 2009 Subject: kern/132832: [netinet] [patch] tcp_output() might generate invalid TSO frames when len > TCP_MAXWIN - hdrlen - optlen Message-ID: <200904160822.n3G8MoQu008789@freefall.freebsd.org> Synopsis: [netinet] [patch] tcp_output() might generate invalid TSO frames when len > TCP_MAXWIN - hdrlen - optlen Responsible-Changed-From-To: freebsd-bugs->freebsd-net Responsible-Changed-By: gavin Responsible-Changed-When: Thu Apr 16 08:19:28 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). This may be the cause of some of the other TSO issues that have been spotted recently.
http://www.freebsd.org/cgi/query-pr.cgi?pr=132832 From dennis.melentyev at gmail.com Thu Apr 16 02:30:08 2009 From: dennis.melentyev at gmail.com (Dennis Melentyev) Date: Thu Apr 16 03:02:34 2009 Subject: kern/133572: [ppp] [hang] incoming PPTP connection hangs the system Message-ID: <200904160930.n3G9U7kh090202@freefall.freebsd.org> The following reply was made to PR kern/133572; it has been noted by GNATS. From: Dennis Melentyev To: Max Laier Cc: bug-followup@freebsd.org Subject: Re: kern/133572: [ppp] [hang] incoming PPTP connection hangs the system Date: Thu, 16 Apr 2009 12:28:46 +0300 Hi Max, Just read your discussion with Matt and Rembrandt on the DragonFlyBSD list about OpenBSD's PF issues. Although I can't afford to restore the configuration to test the issue, I feel that the problem could be connected to IPv6 + PPTP/GRE/PF/IPv4. The machine we tried to connect from was running Vista. AFAIR, it tries to make some use of IPv6. Can't tell anything about XP or other clients - never tried that. OTOH, an outgoing PPTP (IPv4) session from MPD4 to some HW VPN router (sorry, anonymous to me) was just fine. Hope this helps. I can't upgrade ATM, but still can supply config files if needed. /dennis 2009/4/15 Dennis Melentyev : > Hi Max, > > It has been a hard time for me, sorry for the late response. > > I enabled KDB, DDB and WITNESS on the same sources. > Unfortunately there were just plain hangs once some GRE was trying to > get through (netgraph? PF? routing?) > With these options enabled, hangs are much more frequent than without them. > Once hung, no way to break into debugger, no panics, numlock not > changing lights on keyboard, mouse not responding, hdd silent, network > not available, nothing. > > 3 different HW platforms were tried (all of them were UP+i386+32bit). > Highest CPU temperature was 52C. No chance to go with 7.2-PRERELEASE. > > Had to downgrade to 7.1-RELEASE.
> > /dennis > > 2009/4/11 Max Laier : >> Is it possible for you to turn on WITNESS on this machine to obtain possible >> LORs that might be responsible for the hang? Also, do you have the >> possibility to enable DDB and drop into it from the console (if it is not a >> hard hang but a live lock)? >> >> -- >> Max >> > > > > -- > Dennis Melentyev > -- Dennis Melentyev From bms at incunabulum.net Thu Apr 16 03:39:20 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Thu Apr 16 04:15:10 2009 Subject: OpenSSL DTLS bug fix patches In-Reply-To: <0A42378E-1193-41B4-964D-C1A4E4632616@lurchi.franken.de> References: <49E5D4CF.8050707@incunabulum.net> <822B17FC-60E1-4F19-8E62-BB2E5351CB99@lurchi.franken.de> <49E67988.2020008@incunabulum.net> <0A42378E-1193-41B4-964D-C1A4E4632616@lurchi.franken.de> Message-ID: <49E70AD5.8000700@incunabulum.net> Michael Tüxen wrote: > On Apr 16, 2009, at 2:19 AM, Bruce Simpson wrote: > ... >> >> That's excellent news, and these fixes look good, but I was more >> wondering if this drop would be in FreeBSD 7.2-RELEASE :-) > I know, but I wanted to make the state of the patches clear to make > the decision for the port maintainer easier. > > Are you using DTLS? Not yet, but I came across these patches whilst researching TLS adaptation for SCTP. cheers, BMS From adamk at voicenet.com Thu Apr 16 03:40:04 2009 From: adamk at voicenet.com (Adam K Kirchhoff) Date: Thu Apr 16 04:15:10 2009 Subject: kern/131153: [iwi] iwi doesn't see a wireless network Message-ID: <200904161040.n3GAe3lZ086044@freefall.freebsd.org> The following reply was made to PR kern/131153; it has been noted by GNATS. From: Adam K Kirchhoff To: bug-followup@FreeBSD.org, adamk@voicenet.com Cc: Subject: Re: kern/131153: [iwi] iwi doesn't see a wireless network Date: Thu, 16 Apr 2009 06:37:18 -0400 FYI, I'm showing the debug output of wpa_supplicant from connecting to my home network with the same WPA settings that we have at work. WPA with the same preshared key.
Initializing interface 'iwi0' conf '/etc/wpa_supplicant.conf' driver 'bsd' ctrl_interface 'N/A' bridge 'N/A' Configuration file '/etc/wpa_supplicant.conf' -> '/etc/wpa_supplicant.conf' Reading configuration file '/etc/wpa_supplicant.conf' Line: 2 - start of a new network block scan_ssid=1 (0x1) ssid - hexdump_ascii(len=5): 61 73 68 6b 65 ashke key_mgmt: 0x2 pairwise: 0x8 PSK (ASCII passphrase) - hexdump_ascii(len=10): [REMOVED] PSK (from passphrase) - hexdump(len=32): [REMOVED] Line 8: removed CCMP from group cipher list since it was not allowed for pairwise cipher Line: 10 - start of a new network block scan_ssid=1 (0x1) ssid - hexdump_ascii(len=15): 4d 63 6b 65 6c 6c 61 32 38 30 46 72 6f 6e 74 Mckella280Front key_mgmt: 0x2 pairwise: 0x8 PSK (ASCII passphrase) - hexdump_ascii(len=10): [REMOVED] PSK (from passphrase) - hexdump(len=32): [REMOVED] Line 16: removed CCMP from group cipher list since it was not allowed for pairwise cipher Priority group 0 id=0 ssid='ashke' id=1 ssid='Mckella280Front' Initializing interface (2) 'iwi0' EAPOL: SUPP_PAE entering state DISCONNECTED EAPOL: KEY_RX entering state NO_KEY_RECEIVE EAPOL: SUPP_BE entering state INITIALIZE EAP: EAP entering state DISABLED EAPOL: External notification - portEnabled=0 EAPOL: External notification - portValid=0 Own MAC address: 00:13:ce:a8:10:ea wpa_driver_bsd_set_wpa: enabled=1 wpa_driver_bsd_set_wpa_internal: wpa=3 privacy=1 wpa_driver_bsd_del_key: keyidx=0 wpa_driver_bsd_del_key: keyidx=1 wpa_driver_bsd_del_key: keyidx=2 wpa_driver_bsd_del_key: keyidx=3 wpa_driver_bsd_set_countermeasures: enabled=0 wpa_driver_bsd_set_drop_unencrypted: enabled=1 Setting scan request: 0 sec 100000 usec Added interface iwi0 State: DISCONNECTED -> SCANNING Starting AP scan (specific SSID) Scan SSID - hexdump_ascii(len=5): 61 73 68 6b 65 ashke Trying to get current scan results first without requesting a new scan to speed up initial association Received 0 bytes of scan results (6 BSSes) Scan results: 6 Selecting BSS from 
priority group 0 Try to find WPA-enabled AP 0: 00:30:bd:fb:ca:31 ssid='ashke' wpa_ie_len=24 rsn_ie_len=0 caps=0x11 selected based on WPA IE selected WPA AP 00:30:bd:fb:ca:31 ssid='ashke' Try to find non-WPA AP Trying to associate with 00:30:bd:fb:ca:31 (SSID='ashke' freq=2422 MHz) Cancelling scan request WPA: clearing own WPA/RSN IE Automatic auth_alg selection: 0x1 wpa_driver_bsd_set_auth_alg alg 0x1 authmode 1 WPA: using IEEE 802.11i/D3.0 WPA: Selected cipher suites: group 8 pairwise 8 key_mgmt 2 proto 1 WPA: set AP WPA IE - hexdump(len=24): dd 16 00 50 f2 01 01 00 00 50 f2 02 01 00 00 50 f2 02 01 00 00 50 f2 02 WPA: clearing AP RSN IE WPA: using GTK TKIP WPA: using PTK TKIP WPA: using KEY_MGMT WPA-PSK WPA: Set own WPA IE default - hexdump(len=24): dd 16 00 50 f2 01 01 00 00 50 f2 02 01 00 00 50 f2 02 01 00 00 50 f2 02 No keys have been configured - skip key clearing wpa_driver_bsd_set_drop_unencrypted: enabled=1 State: SCANNING -> ASSOCIATING wpa_driver_bsd_associate: ssid 'ashke' wpa ie len 24 pairwise 2 group 2 key mgmt 1 wpa_driver_bsd_associate: set PRIVACY 1 Setting authentication timeout: 10 sec 0 usec EAPOL: External notification - EAP success=0 EAPOL: External notification - EAP fail=0 EAPOL: External notification - portControl=Auto Authentication with 00:30:bd:fb:ca:31 timed out. 
Added BSSID 00:30:bd:fb:ca:31 into blacklist No keys have been configured - skip key clearing State: ASSOCIATING -> DISCONNECTED EAPOL: External notification - portEnabled=0 EAPOL: External notification - portValid=0 EAPOL: External notification - EAP success=0 Setting scan request: 0 sec 0 usec State: DISCONNECTED -> SCANNING Starting AP scan (specific SSID) Scan SSID - hexdump_ascii(len=15): 4d 63 6b 65 6c 6c 61 32 38 30 46 72 6f 6e 74 Mckella280Front Received 0 bytes of scan results (6 BSSes) Scan results: 6 Selecting BSS from priority group 0 Try to find WPA-enabled AP 0: 00:30:bd:fb:ca:31 ssid='ashke' wpa_ie_len=24 rsn_ie_len=0 caps=0ioctl[SIOCS80211, op 21, len 42]: Invalid argument x11 selected based on WPA IE selected WPA AP 00:30:bd:fb:ca:31 ssid='ashke' Try to find non-WPA AP Trying to associate with 00:30:bd:fb:ca:31 (SSID='ashke' freq=2422 MHz) Cancelling scan request WPA: clearing own WPA/RSN IE Automatic auth_alg selection: 0x1 wpa_driver_bsd_set_auth_alg alg 0x1 authmode 1 WPA: using IEEE 802.11i/D3.0 WPA: Selected cipher suites: group 8 pairwise 8 key_mgmt 2 proto 1 WPA: set AP WPA IE - hexdump(len=24): dd 16 00 50 f2 01 01 00 00 50 f2 02 01 00 00 50 f2 02 01 00 00 50 f2 02 WPA: clearing AP RSN IE WPA: using GTK TKIP WPA: using PTK TKIP WPA: using KEY_MGMT WPA-PSK WPA: Set own WPA IE default - hexdump(len=24): dd 16 00 50 f2 01 01 00 00 50 f2 02 01 00 00 50 f2 02 01 00 00 50 f2 02 No keys have been configured - skip key clearing wpa_driver_bsd_set_drop_unencrypted: enabled=1 State: SCANNING -> ASSOCIATING wpa_driver_bsd_associate: ssid 'ashke' wpa ie len 24 pairwise 2 group 2 key mgmt 1 wpa_driver_bsd_associate: set PRIVACY 1 Association request to the driver failed Setting authentication timeout: 5 sec 0 usec EAPOL: External notification - EAP success=0 EAPOL: External notification - EAP fail=0 EAPOL: External notification - portControl=Auto Authentication with 00:30:bd:fb:ca:31 timed out. 
BSSID 00:30:bd:fb:ca:31 blacklist count incremented to 2 No keys have been configured - skip key clearing State: ASSOCIATING -> DISCONNECTED EAPOL: External notification - portEnabled=0 EAPOL: External notification - portValid=0 EAPOL: External notification - EAP success=0 Setting scan request: 0 sec 0 usec State: DISCONNECTED -> SCANNING Starting AP scan (broadcast SSID) Received 0 bytes of scan results (6 BSSes) Scan results: 6 Selecting BSS from priority group 0 Try to find WPA-enabled AP 0: 00:30:bd:fb:ca:31 ssid='ashke' wpa_ie_len=24 rsn_ie_len=0 caps=0x11 skip - blacklisted 1: 00:13:10:96:62:bb ssid='linksys' wpa_ie_len=0 rsn_ie_len=0 caps=0x1 skip - no WPA/RSN IE 2: 00:18:f8:6a:0e:6b ssid='carlie' wpa_ie_len=0 rsn_ie_len=0 caps=0x31 skip - no WPA/RSN IE 3: 00:12:0e:54:6b:0f ssid='06B410521966' wpa_ie_len=0 rsn_ie_len=0 caps=0x11 skip - no WPA/RSN IE 4: 00:1c:df:7e:5b:0d ssid='Deck Entertainment, LLP' wpa_ie_len=0 rsn_ie_len=0 caps=0x11 skip - no WPA/RSN IE 5: 00:18:01:81:1c:4a ssid='johnreynolds' wpa_ie_len=0 rsn_ie_len=0 caps=0x71 skip - no WPA/RSN IE Try to find non-WPA AP 0: 00:30:bd:fb:ca:31 ssid='ashke' wpa_ie_len=24 rsn_ie_len=0 caps=0x11 skip - blacklisted 1: 00:13:10:96:62:bb ssid='linksys' wpa_ie_len=0 rsn_ie_len=0 caps=0x1 skip - SSID mismatch skip - SSID mismatch 2: 00:18:f8:6a:0e:6b ssid='carlie' wpa_ie_len=0 rsn_ie_len=0 caps=0x31 skip - SSID mismatch skip - SSID mismatch 3: 00:12:0e:54:6b:0f ssid='06B410521966' wpa_ie_len=0 rsn_ie_len=0 caps=0x11 skip - SSID mismatch skip - SSID mismatch 4: 00:1c:df:7e:5b:0d ssid='Deck Entertainment, LLP' wpa_ie_len=0 rsn_ie_len=0 caps=0x11 skip - SSID mismatch skip - SSID mismatch 5: 00:18:01:81:1c:4a ssid='johnreynolds' wpa_ie_len=0 rsn_ie_len=0 caps=0x71 skip - SSID mismatch skip - SSID mismatch No APs found - clear blacklist and try again Removed BSSID 00:30:bd:fb:ca:31 from blacklist (clear) Selecting BSS from priority group 0 Try to find WPA-enabled AP 0: 00:30:bd:fb:ca:31 ssid='ashke' wpa_ie_len=24 
rsn_ie_len=0 caps=0x11 selected based on WPA IE selected WPA AP 00:30:bd:fb:ca:31 ssid='ashke' Try to find non-WPA AP Trying to associate with 00:30:bd:fb:ca:31 (SSID='ashke' freq=2422 MHz) Cancelling scan request WPA: clearing own WPA/RSN IE Automatic auth_alg selection: 0x1 wpa_driver_bsd_set_auth_alg alg 0x1 authmode 1 WPA: using IEEE 802.11i/D3.0 WPA: Selected cipher suites: group 8 pairwise 8 key_mgmt 2 proto 1 WPA: set AP WPA IE - hexdump(len=24): dd 16 00 50 f2 01 01 00 00 50 f2 02 01 00 00 50 f2 02 01 00 00 50 f2 02 WPA: clearing AP RSN IE WPA: using GTK TKIP WPA: using PTK TKIP WPA: using KEY_MGMT WPA-PSK WPA: Set own WPA IE default - hexdump(len=24): dd 16 00 50 f2 01 01 00 00 50 f2 02 01 00 00 50 f2 02 01 00 00 50 f2 02 No keys have been configured - skip key clearing wpa_driver_bsd_set_drop_unencrypted: enabled=1 State: SCANNING -> ASSOCIATING wpa_driver_bsd_associate: ssid 'ashke' wpa ie len 24 pairwise 2 group 2 key mgmt 1 wpa_driver_bsd_associate: set PRIVACY 1 Setting authentication timeout: 10 sec 0 usec EAPOL: External notification - EAP success=0 EAPOL: External notification - EAP fail=0 EAPOL: External notification - portControl=Auto State: ASSOCIATING -> ASSOCIATED Associated to a new BSS: BSSID=00:30:bd:fb:ca:31 No keys have been configured - skip key clearing Associated with 00:30:bd:fb:ca:31 WPA: Association event - clear replay counter EAPOL: External notification - portEnabled=0 EAPOL: External notification - portValid=0 EAPOL: External notification - EAP success=0 EAPOL: External notification - portEnabled=1 EAPOL: SUPP_PAE entering state CONNECTING EAPOL: SUPP_BE entering state IDLE Setting authentication timeout: 10 sec 0 usec Cancelling scan request RX EAPOL from 00:30:bd:fb:ca:31 RX EAPOL - hexdump(len=99): 01 03 00 5f fe 00 89 00 20 00 00 00 00 00 00 00 00 5c b7 62 1b 6f da 13 6e 27 b2 4a 35 c0 89 f8 67 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e6 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Setting authentication timeout: 10 sec 0 usec IEEE 802.1X RX: version=1 type=3 length=95 EAPOL-Key type=254 key_info 0x89 (ver=1 keyidx=0 rsvd=0 Pairwise Ack) key_length=32 key_data_length=0 replay_counter - hexdump(len=8): 00 00 00 00 00 00 00 00 key_nonce - hexdump(len=32): 5c b7 62 1b 6f da 13 6e 27 b2 4a 35 c0 89 f8 67 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e6 key_iv - hexdump(len=16): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 key_rsc - hexdump(len=8): 00 00 00 00 00 00 00 00 key_id (reserved) - hexdump(len=8): 00 00 00 00 00 00 00 00 key_mic - hexdump(len=16): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 WPA: RX EAPOL-Key - hexdump(len=99): 01 03 00 5f fe 00 89 00 20 00 00 00 00 00 00 00 00 5c b7 62 1b 6f da 13 6e 27 b2 4a 35 c0 89 f8 67 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e6 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 State: ASSOCIATED -> 4WAY_HANDSHAKE WPA: RX message 1 of 4-Way Handshake from 00:30:bd:fb:ca:31 (ver=1) WPA: Renewed SNonce - hexdump(len=32): 8e 31 70 bc 1e 1d 24 47 29 e9 07 c6 23 9b 1f 6c 28 47 e3 e3 c1 01 fa a3 0f cc 05 ba 8e 0f d7 69 WPA: PMK - hexdump(len=32): [REMOVED] WPA: PTK - hexdump(len=64): [REMOVED] WPA: WPA IE for msg 2/4 - hexdump(len=24): dd 16 00 50 f2 01 01 00 00 50 f2 02 01 00 00 50 f2 02 01 00 00 50 f2 02 WPA: Sending EAPOL-Key 2/4 WPA: TX EAPOL-Key - hexdump(len=123): 01 03 00 77 fe 01 09 00 20 00 00 00 00 00 00 00 00 8e 31 70 bc 1e 1d 24 47 29 e9 07 c6 23 9b 1f 6c 28 47 e3 e3 c1 01 fa a3 0f cc 05 ba 8e 0f d7 69 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 09 1f bb d0 4a 9e e3 5c e5 76 34 f1 56 ee 90 c6 00 18 dd 16 00 50 f2 01 01 00 00 50 f2 02 01 00 00 50 f2 02 01 00 00 50 f2 02 RX EAPOL from 00:30:bd:fb:ca:31 RX EAPOL - hexdump(len=123): 01 03 00 77 fe 01 c9 00 20 00 00 00 00 00 
00 00 01 5c b7 62 1b 6f da 13 6e 27 b2 4a 35 c0 89 f8 67 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e6 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 19 ea 3c 6a 49 fb 35 81 f4 62 96 7e 9b c0 50 95 00 18 dd 16 00 50 f2 01 01 00 00 50 f2 02 01 00 00 50 f2 02 01 00 00 50 f2 02 IEEE 802.1X RX: version=1 type=3 length=119 EAPOL-Key type=254 key_info 0x1c9 (ver=1 keyidx=0 rsvd=0 Pairwise Install Ack MIC) key_length=32 key_data_length=24 replay_counter - hexdump(len=8): 00 00 00 00 00 00 00 01 key_nonce - hexdump(len=32): 5c b7 62 1b 6f da 13 6e 27 b2 4a 35 c0 89 f8 67 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e6 key_iv - hexdump(len=16): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 key_rsc - hexdump(len=8): 00 00 00 00 00 00 00 00 key_id (reserved) - hexdump(len=8): 00 00 00 00 00 00 00 00 key_mic - hexdump(len=16): 19 ea 3c 6a 49 fb 35 81 f4 62 96 7e 9b c0 50 95 WPA: RX EAPOL-Key - hexdump(len=123): 01 03 00 77 fe 01 c9 00 20 00 00 00 00 00 00 00 01 5c b7 62 1b 6f da 13 6e 27 b2 4a 35 c0 89 f8 67 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e6 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 19 ea 3c 6a 49 fb 35 81 f4 62 96 7e 9b c0 50 95 00 18 dd 16 00 50 f2 01 01 00 00 50 f2 02 01 00 00 50 f2 02 01 00 00 50 f2 02 State: 4WAY_HANDSHAKE -> 4WAY_HANDSHAKE WPA: RX message 3 of 4-Way Handshake from 00:30:bd:fb:ca:31 (ver=1) WPA: IE KeyData - hexdump(len=24): dd 16 00 50 f2 01 01 00 00 50 f2 02 01 00 00 50 f2 02 01 00 00 50 f2 02 WPA: Sending EAPOL-Key 4/4 WPA: TX EAPOL-Key - hexdump(len=99): 01 03 00 5f fe 01 09 00 20 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 51 5d a0 ab f1 73 0a ef 95 9c f1 fe e9 44 2b 1d 00 00 WPA: Installing PTK to the driver. 
WPA: RSC - hexdump(len=6): 00 00 00 00 00 00 wpa_driver_bsd_set_key: alg=TKIP addr=00:30:bd:fb:ca:31 key_idx=0 set_tx=1 seq_len=6 key_len=32 State: 4WAY_HANDSHAKE -> GROUP_HANDSHAKE RX EAPOL from 00:30:bd:fb:ca:31 RX EAPOL - hexdump(len=131): 01 03 00 7f fe 03 91 00 20 00 00 00 00 00 00 00 02 5c b7 62 1b 6f da 13 6e 27 b2 4a 35 c0 89 f8 67 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e7 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e8 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 92 14 4d 9a 51 95 42 b9 92 a3 dd 3c 4a 88 23 4c 00 20 8c cc a9 e5 9e 22 ab de 49 da 88 03 ac 97 46 9b 55 7a 54 76 0e a2 98 38 f7 b9 43 ec 74 cd 51 f0 IEEE 802.1X RX: version=1 type=3 length=127 EAPOL-Key type=254 key_info 0x391 (ver=1 keyidx=1 rsvd=0 Group Ack MIC Secure) key_length=32 key_data_length=32 replay_counter - hexdump(len=8): 00 00 00 00 00 00 00 02 key_nonce - hexdump(len=32): 5c b7 62 1b 6f da 13 6e 27 b2 4a 35 c0 89 f8 67 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e7 key_iv - hexdump(len=16): 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e8 key_rsc - hexdump(len=8): 00 00 00 00 00 00 00 00 key_id (reserved) - hexdump(len=8): 00 00 00 00 00 00 00 00 key_mic - hexdump(len=16): 92 14 4d 9a 51 95 42 b9 92 a3 dd 3c 4a 88 23 4c WPA: RX EAPOL-Key - hexdump(len=131): 01 03 00 7f fe 03 91 00 20 00 00 00 00 00 00 00 02 5c b7 62 1b 6f da 13 6e 27 b2 4a 35 c0 89 f8 67 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e7 28 b6 d4 55 4e 23 c5 3a 68 f3 e6 47 2b 54 8c e8 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 92 14 4d 9a 51 95 42 b9 92 a3 dd 3c 4a 88 23 4c 00 20 8c cc a9 e5 9e 22 ab de 49 da 88 03 ac 97 46 9b 55 7a 54 76 0e a2 98 38 f7 b9 43 ec 74 cd 51 f0 WPA: RX message 1 of Group Key Handshake from 00:30:bd:fb:ca:31 (ver=1) State: GROUP_HANDSHAKE -> GROUP_HANDSHAKE WPA: Group Key - hexdump(len=32): [REMOVED] WPA: Installing GTK to the driver (keyidx=1 tx=0). 
WPA: RSC - hexdump(len=6): 00 00 00 00 00 00 wpa_driver_bsd_set_key: alg=TKIP addr=ff:ff:ff:ff:ff:ff key_idx=1 set_tx=0 seq_len=6 key_len=32 WPA: Sending EAPOL-Key 2/2 WPA: TX EAPOL-Key - hexdump(len=99): 01 03 00 5f fe 03 11 00 20 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fe 62 2b 85 de e0 17 ab 1e cb 1d bf 73 00 da cb 00 00 WPA: Key negotiation completed with 00:30:bd:fb:ca:31 [PTK=TKIP GTK=TKIP] Cancelling authentication timeout State: GROUP_HANDSHAKE -> COMPLETED CTRL-EVENT-CONNECTED - Connection to 00:30:bd:fb:ca:31 completed (auth) [id=0 id_str=] EAPOL: External notification - portValid=1 EAPOL: External notification - EAP success=1 EAPOL: SUPP_PAE entering state AUTHENTICATING EAPOL: SUPP_BE entering state SUCCESS EAP: EAP entering state DISABLED EAPOL: SUPP_PAE entering state AUTHENTICATED EAPOL: SUPP_BE entering state IDLE EAPOL: startWhen --> 0 CTRL-EVENT-TERMINATING - signal 2 received Removing interface iwi0 State: COMPLETED -> DISCONNECTED ioctl[SIOCS80211, op 20, len 7]: Can't assign requested address wpa_driver_bsd_deauthenticate wpa_driver_bsd_del_key: keyidx=0 wpa_driver_bsd_del_key: keyidx=1 wpa_driver_bsd_del_key: keyidx=2 wpa_driver_bsd_del_key: keyidx=3 wpa_driver_bsd_del_key: addr=00:30:bd:fb:ca:31 keyidx=0 EAPOL: External notification - portEnabled=0 EAPOL: SUPP_PAE entering state DISCONNECTED EAPOL: SUPP_BE entering state INITIALIZE EAPOL: External notification - portValid=0 wpa_driver_bsd_set_wpa: enabled=0 wpa_driver_bsd_set_wpa_internal: wpa=0 privacy=0 wpa_driver_bsd_set_drop_unencrypted: enabled=0 wpa_driver_bsd_set_countermeasures: enabled=0 No keys have been configured - skip key clearing Cancelling scan request Cancelling authentication timeout wpa_driver_bsd_set_wpa_internal: wpa=0 privacy=0 I'm willing to try to debug this further, or even try any patches that
a developer thinks might fix/diagnose the issue. Unfortunately, I can't upgrade to -CURRENT at the moment since this is a production machine. Adam From Michael.Tuexen at lurchi.franken.de Thu Apr 16 04:15:23 2009 From: Michael.Tuexen at lurchi.franken.de (Michael Tüxen) Date: Thu Apr 16 04:44:43 2009 Subject: OpenSSL DTLS bug fix patches In-Reply-To: <49E70AD5.8000700@incunabulum.net> References: <49E5D4CF.8050707@incunabulum.net> <822B17FC-60E1-4F19-8E62-BB2E5351CB99@lurchi.franken.de> <49E67988.2020008@incunabulum.net> <0A42378E-1193-41B4-964D-C1A4E4632616@lurchi.franken.de> <49E70AD5.8000700@incunabulum.net> Message-ID: On Apr 16, 2009, at 12:39 PM, Bruce Simpson wrote: > Michael Tüxen wrote: >> On Apr 16, 2009, at 2:19 AM, Bruce Simpson wrote: >> ... >>> >>> That's excellent news, and these fixes look good, but I was more >>> wondering if this drop would be in FreeBSD 7.2-RELEASE :-) >> I know, but I wanted to make the state of the patches clear to make >> the decision for the port maintainer easier. >> >> Are you using DTLS? > > Not yet, but I came across these patches whilst researching TLS > adaptation for SCTP. Ahh, I see. Even more interesting. If you try our DTLS/SCTP implementation, please let us know if it works for you or if you have any questions... > > > cheers, > BMS > From gavin at FreeBSD.org Thu Apr 16 04:56:33 2009 From: gavin at FreeBSD.org (gavin@FreeBSD.org) Date: Thu Apr 16 05:31:47 2009 Subject: kern/125195: [fxp] fxp(4) driver failed to initialize device Intel 82801DB Message-ID: <200904161156.n3GBuWp1095241@freefall.freebsd.org> Synopsis: [fxp] fxp(4) driver failed to initialize device Intel 82801DB State-Changed-From-To: feedback->open State-Changed-By: gavin State-Changed-When: Thu Apr 16 11:54:40 UTC 2009 State-Changed-Why: Feedback was received.
Card is: vendor=0x8086, dev=0x103e, revid=0x83 http://www.freebsd.org/cgi/query-pr.cgi?pr=125195 From andre at freebsd.org Thu Apr 16 13:41:40 2009 From: andre at freebsd.org (Andre Oppermann) Date: Thu Apr 16 14:21:23 2009 Subject: tcp_output() might generate invalid TSO frames In-Reply-To: <20090415084031.6d149fef@renaud-dev1> References: <20090415084031.6d149fef@renaud-dev1> Message-ID: <49E791C3.7060703@freebsd.org> Renaud Lienhart wrote: > Hi, > > We're having trouble virtualizing FreeBSD 7+ on ESX because of an issue > with the stack's TSO implementation: it sometimes generates TSO packets > whose payload size is actually smaller than the MSS. > > The faulty logic is described, along with a patch, in PR #132832. It > has been opened for a while now, without any apparent activity, which > is why I'm reaching the mailing list directly. > > ESX currently drops these packets as many physical nics are known to > choke on such frames, which effectively limits FreeBSD guests' > performance. Network cards should not choke on frames with TSO but less than one MSS worth of data. Though it's not useful to create such frames in the stack. > I don't know about other virtualization stacks' behavior. > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/132832 Your patch should fix the issue. I don't have time to commit it and to run the MFC process though. Maybe Kip or Jack can run that process. 
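The condition under discussion (only hand a segment to TSO when it carries more than one MSS of payload) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the patch from PR kern/132832 and not the actual tcp_output() logic; the names want_tso, payload_len, mss and tso_capable are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from the FreeBSD tree): request
 * segmentation offload only when the payload spans more than one MSS.
 * A payload of one MSS or less fits in a single segment, so offloading
 * it would produce exactly the kind of undersized TSO frame described
 * in this thread.
 */
bool
want_tso(size_t payload_len, size_t mss, bool tso_capable)
{
	return (tso_capable && mss > 0 && payload_len > mss);
}
```

With an MSS of 1460, a 4380-byte payload would qualify for offload, while a 500-byte or 1460-byte payload would be sent as an ordinary frame.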
-- Andre From xdsgrrr at consultcommerce.com Fri Apr 17 04:01:54 2009 From: xdsgrrr at consultcommerce.com (xdsgrrr) Date: Fri Apr 17 04:47:57 2009 Subject: MD5 authentication in quagga In-Reply-To: <2d934d80904160052u70980215v1a32b07d4b1168f@mail.gmail.com> References: <2d934d80904150642r585049b4wadfdfc82a3d8c7fc@mail.gmail.com> <20090415144956.T15361@maildrop.int.zabbadoz.net> <2d934d80904150807p732bce43gc110fe6ae042507d@mail.gmail.com> <49E678E6.102@incunabulum.net> <2d934d80904160052u70980215v1a32b07d4b1168f@mail.gmail.com> Message-ID: <1239964457.46223.2.camel@so1-ay279.globul.bg> It depends on which protocol you are talking about. I have used MD5 auth with quagga ospfd for more than 5-6 years without problems. Are you perhaps talking about bgpd, i.e. MD5 peer auth? On Thu, 2009-04-16 at 10:52 +0300, Алексей Блинков wrote: > On 16 April 2009 at 3:16, Bruce Simpson wrote: > > Алексей Блинков wrote: > >> > >> Modelling the ideal situation: > >> > >> if the MD5 password doesn't match or is empty, the peering must be closed... > >> > >> Right now MD5 works only for outgoing packets, not for incoming ones, and the peering > >> is not closed if the password is missing or doesn't match, because BSD does not check > >> incoming packets, I think... > >> > > > > I thought someone had fixed this ages ago? > > I seem to remember someone had merged some changes to what I'd originally > > done for Sentex from NetBSD... but I could be wrong. > > > > cheers, > > BMS > > > > I don't know how the kernel handles MD5 hashing, because I'm > new to BSD...
> > > -- br, Atanas Yankov Network Engineer, IT Division CCIE # 21756 mobile: (+359 89) 8400734 e-mail: ayankov@globul.bg www.globul.bg From alexey.blinkov at gmail.com Fri Apr 17 04:25:43 2009 From: alexey.blinkov at gmail.com (Алексей Блинков) Date: Fri Apr 17 04:59:24 2009 Subject: MD5 authentication in quagga In-Reply-To: <1239964457.46223.2.camel@so1-ay279.globul.bg> References: <2d934d80904150642r585049b4wadfdfc82a3d8c7fc@mail.gmail.com> <20090415144956.T15361@maildrop.int.zabbadoz.net> <2d934d80904150807p732bce43gc110fe6ae042507d@mail.gmail.com> <49E678E6.102@incunabulum.net> <2d934d80904160052u70980215v1a32b07d4b1168f@mail.gmail.com> <1239964457.46223.2.camel@so1-ay279.globul.bg> Message-ID: <2d934d80904170425h4269580ds54a2fc3c46f4d4a4@mail.gmail.com> > It depends on which protocol you are talking about. I have used MD5 auth with quagga ospfd > for more than 5-6 years without problems. Are you perhaps talking about bgpd, > i.e. MD5 peer auth? I am talking about bgpd. With authentication in ospfd I don't have any problems. -- Best regards, Алексей Блинков From gavin at FreeBSD.org Fri Apr 17 07:08:13 2009 From: gavin at FreeBSD.org (gavin@FreeBSD.org) Date: Fri Apr 17 07:38:48 2009 Subject: kern/114899: [bge] bge0: watchdog timeout -- resetting Message-ID: <200904171408.n3HE8CO9021772@freefall.freebsd.org> Synopsis: [bge] bge0: watchdog timeout -- resetting State-Changed-From-To: feedback->closed State-Changed-By: gavin State-Changed-When: Fri Apr 17 14:06:34 UTC 2009 State-Changed-Why: Feedback timeout (~3 months). To submitter: if this is still an issue with more recent versions of FreeBSD, we can reopen this PR; however, the driver has changed so much since 5.4-RELEASE that keeping this open without confirmation is probably counterproductive.
Responsible-Changed-From-To: freebsd-net->gavin Responsible-Changed-By: gavin Responsible-Changed-When: Fri Apr 17 14:06:34 UTC 2009 Responsible-Changed-Why: Track http://www.freebsd.org/cgi/query-pr.cgi?pr=114899 From gavin at FreeBSD.org Fri Apr 17 08:40:25 2009 From: gavin at FreeBSD.org (gavin@FreeBSD.org) Date: Fri Apr 17 09:09:04 2009 Subject: kern/133595: [panic] Kernel Panic at pcpu.h:195 Message-ID: <200904171540.n3HFeOaa046615@freefall.freebsd.org> Synopsis: [panic] Kernel Panic at pcpu.h:195 Responsible-Changed-From-To: freebsd-bugs->freebsd-net Responsible-Changed-By: gavin Responsible-Changed-When: Fri Apr 17 15:38:56 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). To submitter: are you able to provide more information about your configuration? http://www.freebsd.org/cgi/query-pr.cgi?pr=133595 From rwatson at FreeBSD.org Sat Apr 18 19:05:25 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sat Apr 18 19:05:33 2009 Subject: A Quick Question In-Reply-To: <0588BB0DDA024074A71CFF7A93877E11@arm.synisys.com> References: <0588BB0DDA024074A71CFF7A93877E11@arm.synisys.com> Message-ID: On Thu, 16 Apr 2009, Narek Gharibyan wrote: > I would like to know whether there is any solution to the problem shown below, because we > use FreeBSD 7.0 in our network and we run into > this problem every day. Hi Narek: As Kip mentions, this panic message (a fatal trap in a software ithread) is fairly generic. If your stack trace matches the one in the PR (the panic is in rt_check() or the like) then this problem may well be fixed in FreeBSD 7.1 or the forthcoming FreeBSD 7.2, which contain a number of routing-related fixes. My advice would be to see if you can reproduce the problem with FreeBSD 7.2-RC1, which is due out in the next few days, and if so, we should debug it starting with that information.
Robert N M Watson Computer Laboratory University of Cambridge > kern/121555: [panic] Fatal trap 12: current process = 12 (swi1: net) > From: Alexey Sopov > Date: Mon, 10 Mar 2008 11:46:51 GMT > Subject: [7.0-RELEASE] Fatal trap 12: current process = 12 (swi1: net) > Send-pr version: www-3.1 > Number: 121555 > Category: kern > Synopsis: [panic] Fatal trap 12: current process = 12 (swi1: net) > Severity: serious > Priority: high > Responsible: freebsd-net@FreeBSD.org > State: open > Class: sw-bug > Arrival-Date: Mon Mar 10 12:00:01 UTC 2008 > Closed-Date: > Last-Modified: Fri May 23 20:48:21 UTC 2008 > Originator: Alexey Sopov > Release: 7.0-RELEASE > > Best Regards, > Narek Gharibyan > Network Administration Team leader > Synergy International Systems Inc. / Armenia > http://www.synisys.com > > Tel.: > mobile: +37494 - 353489 > work: +37410 - 650202 ext 772 > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From rwatson at FreeBSD.org Sat Apr 18 20:54:22 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sat Apr 18 20:54:34 2009 Subject: IFF_NEEDSGIANT now gone from 8.x (was: svn commit: r191253 - head/sys/net (fwd)) Message-ID: Dear all: Just under four years ago, the non-MPSAFE network stack de-orbit burn schedule was announced, setting out a plan for eliminating remaining use of the Giant lock in the FreeBSD network stack. With the attached commit, that plan is now complete, and almost all of the network stack neither requires Giant nor runs with it. As always there are some loose ends, especially in IPv6, but with any luck those can be dealt with in 8.0 also.
Special thanks are due to the people who worked on and shepherded the last steps of this process -- especially Hans Petter Selasky, Alfred Perlstein, Andrew Thompson, Ed Schouten, and John Baldwin, who collectively brought our USB, tty, and other non-MPSAFE device driver stacks into a post-SMPng world. Thanks, Robert N M Watson Computer Laboratory University of Cambridge ---------- Forwarded message ---------- Date: Sat, 18 Apr 2009 20:39:18 +0000 (UTC) From: Robert Watson To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r191253 - head/sys/net Author: rwatson Date: Sat Apr 18 20:39:17 2009 New Revision: 191253 URL: http://svn.freebsd.org/changeset/base/191253 Log: Remove IFF_NEEDSGIANT interface flag: we no longer provide ifnet-layer infrastructure to support non-MPSAFE network device drivers. Modified: head/sys/net/if.h Modified: head/sys/net/if.h ============================================================================== --- head/sys/net/if.h Sat Apr 18 20:10:39 2009 (r191252) +++ head/sys/net/if.h Sat Apr 18 20:39:17 2009 (r191253) @@ -149,7 +149,6 @@ struct if_data { #define IFF_PPROMISC 0x20000 /* (n) user-requested promisc mode */ #define IFF_MONITOR 0x40000 /* (n) user-requested monitor mode */ #define IFF_STATICARP 0x80000 /* (n) static ARP */ -#define IFF_NEEDSGIANT 0x100000 /* (i) hold Giant over if_start calls */ /* * Old names for driver flags so that user space tools can continue to use @@ -163,8 +162,7 @@ struct if_data { /* flags set internally only: */ #define IFF_CANTCHANGE \ (IFF_BROADCAST|IFF_POINTOPOINT|IFF_DRV_RUNNING|IFF_DRV_OACTIVE|\ - IFF_SIMPLEX|IFF_MULTICAST|IFF_ALLMULTI|IFF_SMART|IFF_PROMISC|\ - IFF_NEEDSGIANT) + IFF_SIMPLEX|IFF_MULTICAST|IFF_ALLMULTI|IFF_SMART|IFF_PROMISC) /* * Values for if_link_state.
From steve at ibctech.ca Sat Apr 18 22:10:06 2009 From: steve at ibctech.ca (Steve Bertrand) Date: Sat Apr 18 22:10:13 2009 Subject: Route traffic on a gateway through SSH tunnel Message-ID: <49EA4FBC.4040202@ibctech.ca> From what I believe, I'm attempting to do something that has most likely been achieved before, but there is something that I'm missing. This is for my personal home setup. I've built a flash-based CPE, which connects to our DSL network with mpd5. I've enabled NAT, and am using IPFW as the packet filter. I have a Squid proxy/content filter at my office that I would like to route all 80/443 traffic from my home connection, through the proxy. The proxy and the termination point of my home connection are located in two different PoPs, within different ASs. My desire is to have this proxy-routing enabled within the network hardware, as to not need to set application layer details on the PC(s) at home. At this point, I have the FBSD (7.2) gateway device set up with an SSH tunnel. The local tunnel endpoint terminates on a LAN interface which utilizes 1918 space. It listens for traffic on 172.16.250.1:80, and forwards it to the proxyIP:8080. When I configure a workstation's Firefox to use 172.16.250.1:80 as a proxy, everything works as expected. Now, I need to figure out a way so that the same setup will work, but with no proxy configured within Firefox. At this time, I'm recompiling the kernel on the gateway device to include IPFIREWALL_FORWARD. I'm going to try a fwd rule to pass all traffic destined to *:80 to 172.16.250.1:80, in hopes that the traffic will be first redirected to itself, and therefore through the SSH tunnel to the proxy. My past experience with this however, is that FBSD will complain that the dst IP doesn't reside on the box. Does anyone have any suggestions or comments they can share regarding such a setup?
Steve From jon.otterholm at ide.resurscentrum.se Sun Apr 19 07:48:21 2009 From: jon.otterholm at ide.resurscentrum.se (Jon Otterholm) Date: Sun Apr 19 07:48:29 2009 Subject: Forwarding w/o promisc on 6.4 Message-ID: Hi. I have a router running 6.4R that does not forward packets if I disable PROMISC on the interface. Hardware is a Dell PE with two Intel 82541EI chipsets (if_em). I have a number (~100) of vlan-interfaces on em0. Everything works as expected if I turn on PROMISC on em0 but forwarding stops if I disable it. I can still communicate with the router directly on the same logical network (for example pinging interface address on a vlan_if from a client on that vlan) but all forwarding stops. Some info: net.inet.ip.forwarding: 1 net.inet.ip.fastforwarding: 0 (enabling this does not help) net.inet.tcp.recvspace=1048576 net.inet.tcp.sendspace=1048576 kern.ipc.maxsockbuf=16777216 I use PF for filtering and disabling this does not help either. Anyone with a clue? //JO From rwatson at FreeBSD.org Sun Apr 19 09:14:20 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sun Apr 19 09:14:26 2009 Subject: Forwarding w/o promisc on 6.4 In-Reply-To: References: Message-ID: On Sun, 19 Apr 2009, Jon Otterholm wrote: > I have a router running 6.4R that does not forward packets if I disable > PROMISC on the interface. Hardware is a Dell PE with two Intel 82541EI > chipsets (if_em). I have a number (~100) of vlan-interfaces on em0. > Everything works as expected if I turn on PROMISC on em0 but forwarding > stops if I disable it. I can still communicate with the router directly on > the same logical network (for example pinging interface address on a vlan_if > from a client on that vlan) but all forwarding stops. Try disabling hardware VLAN tagging/processing? I believe you should be able to do this with "ifconfig em0 -vlanhwtag" (substituting appropriate interface names).
It could be there's a bug in how hardware-optimized tag handling is being managed, as when promiscuous mode is used we re-insert vlan headers in software for the benefit of BPF. Robert N M Watson Computer Laboratory University of Cambridge > > Some info: > net.inet.ip.forwarding: 1 > net.inet.ip.fastforwarding: 0 (enabling this does not help) > net.inet.tcp.recvspace=1048576 > net.inet.tcp.sendspace=1048576 > kern.ipc.maxsockbuf=16777216 > > I use PF for filtering and disabling this does not help either. > > Anyone with a clue? > > //JO From jon.otterholm at ide.resurscentrum.se Sun Apr 19 20:52:58 2009 From: jon.otterholm at ide.resurscentrum.se (Jon Otterholm) Date: Sun Apr 19 20:53:04 2009 Subject: Forwarding w/o promisc on 6.4 In-Reply-To: Message-ID: On 2009-04-19 11.14, "Robert Watson" wrote: > On Sun, 19 Apr 2009, Jon Otterholm wrote: > >> I have a router running 6.4R that does not forward packets if I disable >> PROMISC on the interface. Hardware is a Dell PE with two Intel 82541EI >> chipsets (if_em). I have a number (~100) of vlan-interfaces on em0. >> Everything works as expected if I turn on PROMISC on em0 but forwarding >> stops if I disable it. I can still communicate with the router directly on >> the same logical network (for example pinging interface address on a vlan_if >> from a client on that vlan) but all forwarding stops. > > Try disabling hardware VLAN tagging/processing? I believe you should be able > to do this with "ifconfig em0 -vlanhwtag" (substituting appropriate interface > names). It could be there's a bug in how hardware-optimized tag handling is > being managed, as when promiscuous mode is used we re-insert vlan headers in > software for the benefit of BPF. I tried doing this without any luck.
Running GENERIC kernconf. //JO > > Robert N M Watson > Computer Laboratory > University of Cambridge > >> >> Some info: >> net.inet.ip.forwarding: 1 >> net.inet.ip.fastforwarding: 0 (enabling this does not help) >> net.inet.tcp.recvspace=1048576 >> net.inet.tcp.sendspace=1048576 >> kern.ipc.maxsockbuf=16777216 >> >> I use PF for filtering and disabling this does not help either. >> >> Anyone with a clue? >> >> //JO From adrian at freebsd.org Sun Apr 19 23:37:57 2009 From: adrian at freebsd.org (Adrian Chadd) Date: Sun Apr 19 23:38:04 2009 Subject: Route traffic on a gateway through SSH tunnel In-Reply-To: <49EA4FBC.4040202@ibctech.ca> References: <49EA4FBC.4040202@ibctech.ca> Message-ID: G'day; 2009/4/19 Steve Bertrand : > I have a Squid proxy/content filter at my office that I would like to > route all 80/443 traffic from my home connection, through the proxy. The > proxy and the termination point of my home connection are located in two > different PoPs, within different ASs. Eww. People still use Squid? > My desire is to have this proxy-routing enabled within the network > hardware, as to not need to set application layer details on the PC(s) > at home. > > At this point, I have the FBSD (7.2) gateway device set up with an SSH > tunnel. The local tunnel endpoint terminates on a LAN interface which > utilizes 1918 space. It listens for traffic on 172.16.250.1:80, and > forwards it to the proxyIP:8080. When I configure a workstation's > Firefox to use 172.16.250.1:80 as a proxy, everything works as expected.
> > Now, I need to figure out a way so that the same setup will work, but > with no proxy configured within Firefox. > > At this time, I'm recompiling the kernel on the gateway device to > include IPFIREWALL_FORWARD. I'm going to try a fwd rule to pass all > traffic destined to *:80 to 172.16.250.1:80, in hopes that the traffic > will be first redirected to itself, and therefore through the SSH tunnel > to the proxy. > > My past experience with this however, is that FBSD will complain that > the dst IP doesn't reside on the box. > > Does anyone have any suggestions or comments they can share regarding > such a setup? Well, I'd first look at what you're doing with the "fwd" next-hop rewriting. All ipfw fwd does is next-hop rewriting with an optional redirect-to-local-socket-termination feature. You need to redirect to a local squid or some other proxy which can do the DNS lookups as required (if required!) and bounce the request upstream. I'd suggest setting up Squid on your local CPE to handle the "ipfw fwd any 127.0.0.1:3128" redirection (and use http_port 127.0.0.1:3128 transparent in squid.conf) and then configure squid with a parent proxy (cache_peer, disable never_direct, etc) to talk exclusively to your upstream proxy(ies). 2c, Adrian From sergey at vavilov.org Mon Apr 20 06:28:40 2009 From: sergey at vavilov.org (Sergey Vavilov) Date: Mon Apr 20 06:28:47 2009 Subject: ae on freebsd7.1 In-Reply-To: <5D267A3F22FD854F8F48B3D2B5238193397DC21622@IRVEXCHCCR01.corp.ad.broadcom.com> References: <5D267A3F22FD854F8F48B3D2B5238193397DC21622@IRVEXCHCCR01.corp.ad.broadcom.com> Message-ID: <49EC1612.9080706@vavilov.org> Hello guys! Somebody overcame the problem with Attansic L2 (http://www.attansic.com/english/products/index.html) fast ethernet interface on Freebsd-7.1-RELEASE? Thank you! Apr 15 12:25:32 alma kernel: ae0: watchdog timeout - resetting.
Apr 15 12:25:32 alma kernel: ae0: link state changed to DOWN Apr 15 12:25:34 alma kernel: ae0: link state changed to UP Apr 15 13:24:17 alma kernel: ae0: Size mismatch: TxS:101 TxD:4862 Apr 15 13:24:17 alma kernel: ae0: Received stray Tx interrupt(s). Apr 15 13:24:38 alma kernel: ae0: Size mismatch: TxS:83 TxD:26988 Apr 15 13:24:42 alma kernel: ae0: watchdog timeout - resetting. Apr 15 13:24:42 alma kernel: ae0: link state changed to DOWN Apr 15 13:24:44 alma kernel: ae0: link state changed to UP Apr 19 20:37:31 alma kernel: ae0: watchdog timeout - resetting. Apr 19 20:37:31 alma kernel: ae0: link state changed to DOWN Apr 19 20:37:34 alma kernel: ae0: link state changed to UP Apr 19 22:01:31 alma kernel: ae0: Size mismatch: TxS:78 TxD:27458 Apr 19 22:01:31 alma kernel: ae0: Received stray Tx interrupt(s). Apr 19 22:04:30 alma kernel: ae0: watchdog timeout - resetting. Apr 19 22:04:30 alma kernel: ae0: link state changed to DOWN Apr 19 22:04:32 alma kernel: ae0: link state changed to UP Apr 19 23:13:24 alma kernel: ae0: Size mismatch: TxS:81 TxD:0 Apr 19 23:13:24 alma kernel: ae0: Received stray Tx interrupt(s). -- Sergey Vavilov, Perm, Russia From bms at incunabulum.net Mon Apr 20 07:04:14 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Mon Apr 20 07:04:21 2009 Subject: ae on freebsd7.1 In-Reply-To: <49EC1612.9080706@vavilov.org> References: <5D267A3F22FD854F8F48B3D2B5238193397DC21622@IRVEXCHCCR01.corp.ad.broadcom.com> <49EC1612.9080706@vavilov.org> Message-ID: <49EC1E6B.5080601@incunabulum.net> Sergey Vavilov wrote: > > Hello guys! > Somebody overcame the problem with Attansic L2 > (http://www.attansic.com/english/products/index.html) fast ethernet > interface on Freebsd-7.1-RELEASE? > Thank you! I've seen this happen after a suspend/resume cycle on the ASUS EeePC 701. 
It looks pretty normal, that is, without getting into the nitty gritty of how the driver is queueing frames (no free time for that), and the card will begin operating again OK after the watchdog timeout. My understanding is that the design of this part isn't great. Whether these TX interrupt issues are down to the design or driver I don't know. From bugmaster at FreeBSD.org Mon Apr 20 11:06:56 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Apr 20 11:08:36 2009 Subject: Current problem reports assigned to freebsd-net@FreeBSD.org Message-ID: <200904201106.n3KB6tIW033091@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/133736 net [udp] ip_id not protected ... 
o kern/133613 net [wpi] [panic] kernel panic in wpi(4) o kern/133595 net [panic] Kernel Panic at pcpu.h:195 o kern/133572 net [ppp] [hang] incoming PPTP connection hangs the system o kern/133490 net [bpf] [panic] 'kmem_map too small' panic on Dell r900 o kern/133328 net [bge] [panic] Kernel panics with Windows7 client o kern/133235 net [netinet] [patch] Process SIOCDLIFADDR command incorre o kern/133218 net [carp] [hang] use of carp(4) causes system to freeze o kern/133204 net [msk] msk driver timeouts o kern/133060 net [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs o kern/132991 net [bge] if_bge low performance problem o kern/132984 net [netgraph] swi1: net 100% cpu usage f bin/132911 net ip6fw(8): argument type of fill_icmptypes is wrong and o kern/132889 net [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d o kern/132885 net [wlan] 802.1x broken after SVN rev 189592 o conf/132851 net [fib] [patch] allow to setup fib for service running f o kern/132832 net [netinet] [patch] tcp_output() might generate invalid o bin/132798 net [patch] ggatec(8): ggated/ggatec connection slowdown p o kern/132734 net [ifmib] [panic] panic in net/if_mib.c o kern/132722 net [ath] Wifi ath0 associates fine with AP, but DHCP or I o kern/132715 net [lagg] [panic] Panic when creating vlan's on lagg inte o kern/132705 net [libwrap] [patch] libwrap - infinite loop if hosts.all o kern/132672 net [ndis] [panic] ndis with rt2860.sys causes kernel pani o kern/132669 net [xl] 3c905-TX send DUP! 
in reply on ping (sometime) o kern/132625 net [iwn] iwn drivers don't support setting country o kern/132554 net [ipl] There is no ippool start script/ipfilter magic t o kern/132354 net [nat] Getting some packages to ipnat(8) causes crash o kern/132285 net [carp] alias gives incorrect hash in dmesg o kern/132277 net [crypto] [ipsec] poor performance using cryptodevice f o conf/132179 net [patch] /etc/network.subr: ipv6 rtsol on incorrect wla o kern/132107 net [carp] carp(4) advskew setting ignored when carp IP us o kern/131781 net [ndis] ndis keeps dropping the link o kern/131776 net [wi] driver fails to init o kern/131753 net [altq] [panic] kernel panic in hfsc_dequeue o bin/131567 net [socket] [patch] Update for regression/sockets/unix_cm o kern/131549 net ifconfig(8) can't clear 'monitor' mode on the wireless o kern/131536 net [netinet] [patch] kernel does allow manipulation of su o bin/131365 net route(8): route add changes interpretation of network o kern/131162 net [ath] Atheros driver bugginess and kernel crashes o kern/131153 net [iwi] iwi doesn't see a wireless network f kern/131087 net [ipw] [panic] ipw / iwi - no sent/received packets; iw f kern/130820 net [ndis] wpa_supplicant(8) returns 'no space on device' o kern/130628 net [nfs] NFS / rpc.lockd deadlock on 7.1-R o conf/130555 net [rc.d] [patch] No good way to set ipfilter variables a o kern/130525 net [ndis] [panic] 64 bit ar5008 ndisgen-erated driver cau o kern/130311 net [wlan_xauth] [panic] hostapd restart causing kernel pa o kern/130109 net [ipfw] Can not set fib for packets originated from loc f kern/130059 net [panic] Leaking 50k mbufs/hour o kern/129750 net [ath] Atheros AR5006 exits on "cannot map register spa f kern/129719 net [nfs] [panic] Panic during shutdown, tcp_ctloutput: in o kern/129580 net [ndis] Netgear WG311v3 (ndis) causes kenel trap at boo o kern/129517 net [ipsec] [panic] double fault / stack overflow o kern/129508 net [carp] [panic] Kernel panic with EtherIP (may be relat o 
kern/129352 net [xl] [patch] xl0 watchdog timeout o kern/129219 net [ppp] Kernel panic when using kernel mode ppp o kern/129197 net [panic] 7.0 IP stack related panic o kern/129135 net [vge] vge driver on a VIA mini-ITX not working o bin/128954 net ifconfig(8) deletes valid routes o kern/128917 net [wpi] [panic] if_wpi and wpa+tkip causing kernel panic o kern/128884 net [msk] if_msk page fault while in kernel mode o kern/128840 net [igb] page fault under load with igb/LRO o bin/128602 net [an] wpa_supplicant(8) crashes with an(4) o kern/128598 net [bluetooth] WARNING: attempt to net_add_domain(bluetoo o kern/128448 net [nfs] 6.4-RC1 Boot Fails if NFS Hostname cannot be res o conf/128334 net [request] use wpa_cli in the "WPA DHCP" situation o bin/128295 net [patch] ifconfig(8) does not print TOE4 or TOE6 capabi o bin/128001 net wpa_supplicant(8), wlan(4), and wi(4) issues o kern/127928 net [tcp] [patch] TCP bandwidth gets squeezed every time t o kern/127834 net [ixgbe] [patch] wrong error counting o kern/127826 net [iwi] iwi0 driver has reduced performance and connecti o kern/127815 net [gif] [patch] if_gif does not set vlan attributes from o kern/127724 net [rtalloc] rtfree: 0xc5a8f870 has 1 refs f bin/127719 net [arp] arp: Segmentation fault (core dumped) s kern/127587 net [bge] [request] if_bge(4) doesn't support BCM576X fami f kern/127528 net [icmp]: icmp socket receives icmp replies not owned by o bin/127192 net routed(8) removes the secondary alias IP of interface f kern/127145 net [wi]: prism (wi) driver crash at bigger traffic o kern/127102 net [wpi] Intel 3945ABG low throughput o kern/127057 net [udp] Unable to send UDP packet via IPv6 socket to IPv o kern/127050 net [carp] ipv6 does not work on carp interfaces [regressi o kern/126945 net [carp] CARP interface destruction with ifconfig destro o kern/126924 net [an] [patch] printf -> device_printf and simplify prob o kern/126895 net [patch] [ral] Add antenna selection (marked as TBD) o kern/126874 net 
[vlan]: Zebra problem if ifconfig vlanX destroy o bin/126822 net wpa_supplicant(8): WPA PSK does not work in adhoc mode o kern/126714 net [carp] CARP interface renaming makes system no longer o kern/126695 net rtfree messages and network disruption upon use of if_ o kern/126688 net [ixgbe] [patch] 1.4.7 ixgbe driver panic with 4GB and o kern/126475 net [ath] [panic] ath pcmcia card inevitably panics under o kern/126339 net [ipw] ipw driver drops the connection o kern/126214 net [ath] txpower problem with Atheros wifi card o kern/126075 net [inet] [patch] internet control accesses beyond end of o bin/125922 net [patch] Deadlock in arp(8) o kern/125920 net [arp] Kernel Routing Table loses Ethernet Link status o kern/125845 net [netinet] [patch] tcp_lro_rx() should make use of hard o kern/125816 net [carp] [if_bridge] carp stuck in init when using bridg f kern/125502 net [ral] ifconfig ral0 scan produces no output unless in o kern/125258 net [socket] socket's SO_REUSEADDR option does not work o kern/125239 net [gre] kernel crash when using gre o kern/125195 net [fxp] fxp(4) driver failed to initialize device Intel o kern/124904 net [fxp] EEPROM corruption with Compaq NC3163 NIC o kern/124767 net [iwi] Wireless connection using iwi0 driver (Intel 220 o kern/124753 net [ieee80211] net80211 discards power-save queue packets o kern/124341 net [ral] promiscuous mode for wireless device ral0 looses o kern/124160 net [libc] connect(2) function loops indefinitely o kern/124127 net [msk] watchdog timeout (missed Tx interrupts) -- recov o kern/124021 net [ip6] [panic] page fault in nd6_output() o kern/123968 net [rum] [panic] rum driver causes kernel panic with WPA. 
p kern/123961 net [vr] [patch] Allow vr interface to handle vlans o kern/123892 net [tap] [patch] No buffer space available o kern/123890 net [ppp] [panic] crash & reboot on work with PPP low-spee o kern/123858 net [stf] [patch] stf not usable behind a NAT o kern/123796 net [ipf] FreeBSD 6.1+VPN+ipnat+ipf: port mapping does not o bin/123633 net ifconfig(8) doesn't set inet and ether address in one f kern/123617 net [tcp] breaking connection when client downloading file o kern/123603 net [tcp] tcp_do_segment and Received duplicate SYN o kern/123559 net [iwi] iwi periodically disassociates/associates [regre o bin/123465 net [ip6] route(8): route add -inet6 -interfac o kern/123463 net [ipsec] [panic] repeatable crash related to ipsec-tool o kern/123429 net [nfe] [hang] "ifconfig nfe up" causes a hard system lo o kern/123347 net [bge] bge1: watchdog timeout -- linkstate changed to D o conf/123330 net [nsswitch.conf] Enabling samba wins in nsswitch.conf c o kern/123256 net [wpi] panic: blockable sleep lock with wpi(4) f kern/123172 net [bce] Watchdog timeout problems with if_bce o kern/123160 net [ip] Panic and reboot at sysctl kern.polling.enable=0 o kern/122989 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/122954 net [lagg] IPv6 EUI64 incorrectly chosen for lagg devices o kern/122928 net [em] interface watchdog timeouts and stops receiving p f kern/122839 net [multicast] FreeBSD 7 multicast routing problem p kern/122794 net [lagg] Kernel panic after brings lagg(8) up if NICs ar o kern/122780 net [lagg] tcpdump on lagg interface during high pps wedge o kern/122772 net [em] em0 taskq panic, tcp reassembly bug causes radix o kern/122743 net [mbuf] [panic] vm_page_unwire: invalid wire count: 0 o kern/122697 net [ath] Atheros card is not well supported o kern/122685 net It is not visible passing packets in tcpdump(1) o kern/122551 net [bge] Broadcom 5715S no carrier on HP BL460c blade usi o kern/122319 net [wi] imposible to enable ad-hoc demo mode with Orinoco o 
kern/122290 net [netgraph] [panic] Netgraph related "kmem_map too smal f kern/122252 net [ipmi] [bge] IPMI problem with BCM5704 (does not work o kern/122195 net [ed] Alignment problems in if_ed o kern/122058 net [em] [panic] Panic on em1: taskq o kern/122033 net [ral] [lor] Lock order reversal in ral0 at bootup [reg o kern/121983 net [fxp] fxp0 MBUF and PAE o bin/121895 net [patch] rtsol(8)/rtsold(8) doesn't handle managed netw o kern/121872 net [wpi] driver fails to attach on a fujitsu-siemens s711 s kern/121774 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/121706 net [netinet] [patch] "rtfree: 0xc4383870 has 1 refs" emit o kern/121624 net [em] [regression] Intel em WOL fails after upgrade to o kern/121555 net [panic] Fatal trap 12: current process = 12 (swi1: net o kern/121443 net [gif] [lor] icmp6_input/nd6_lookup o kern/121437 net [vlan] Routing to layer-2 address does not work on VLA o bin/121359 net [patch] ppp(8): fix local stack overflow in ppp o kern/121298 net [em] [panic] Fatal trap 12: page fault while in kernel o kern/121257 net [tcp] TSO + natd -> slow outgoing tcp traffic o kern/121181 net [panic] Fatal trap 3: breakpoint instruction fault whi o kern/121080 net [bge] IPv6 NUD problem on multi address config on bge0 o kern/120966 net [rum] kernel panic with if_rum and WPA encryption p docs/120945 net [patch] ip6(4) man page lacks documentation for TCLASS o kern/120566 net [request]: ifconfig(8) make order of arguments more fr o kern/120304 net [netgraph] [patch] netgraph source assumes 32-bit time o kern/120266 net [udp] [panic] gnugk causes kernel panic when closing U o kern/120232 net [nfe] [patch] Bring in nfe(4) to RELENG_6 o kern/120130 net [carp] [panic] carp causes kernel panics in any conste o bin/120060 net routed(8) deletes link-level routes in the presence of o kern/119945 net [rum] [panic] rum device in hostap mode, cause kernel o kern/119791 net [nfs] UDP NFS mount of aliased IP addresses from a Sol o kern/119617 net [nfs] nfs 
error on wpa network when reseting/shutdown f kern/119516 net [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi o kern/119432 net [arp] route add -host -iface causes arp e o kern/119225 net [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr a bin/118987 net ifconfig(8): ifconfig -l (address_family) does not wor o sparc/118932 net [panic] 7.0-BETA4/sparc-64 kernel panic in rip_output a kern/118879 net [bge] [patch] bge has checksum problems on the 5703 ch o kern/118727 net [netgraph] [patch] [request] add new ng_pf module s kern/117717 net [panic] Kernel panic with Bittorrent client. o kern/117448 net [carp] 6.2 kernel crash [regression] o kern/117423 net [vlan] Duplicate IP on different interfaces o bin/117339 net [patch] route(8): loading routing management commands o kern/117271 net [tap] OpenVPN TAP uses 99% CPU on releng_6 when if_tap o kern/117043 net [em] Intel PWLA8492MT Dual-Port Network adapter EEPROM o kern/116837 net [tun] [panic] [patch] ifconfig tunX destroy: panic o kern/116747 net [ndis] FreeBSD 7.0-CURRENT crash with Dell TrueMobile o bin/116643 net [patch] [request] fstat(1): add INET/INET6 socket deta o kern/116328 net [bge]: Solid hang with bge interface o kern/116185 net [iwi] if_iwi driver leads system to reboot o kern/115239 net [ipnat] panic with 'kmem_map too small' using ipnat o kern/115019 net [netgraph] ng_ether upper hook packet flow stops on ad o kern/115002 net [wi] if_wi timeout. failed allocation (busy bit). 
ifco o kern/114915 net [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f o kern/114839 net [fxp] fxp looses ability to speak with traffic o kern/113895 net [xl] xl0 fails on 6.2-RELEASE but worked fine on 5.5-R o kern/112722 net [ipsec] [udp] IP v4 udp fragmented packet reject o kern/112686 net [patm] patm driver freezes System (FreeBSD 6.2-p4) i38 o kern/112570 net [bge] packet loss with bge driver on BCM5704 chipset o bin/112557 net [patch] ppp(8) lock file should not use symlink name o kern/112528 net [nfs] NFS over TCP under load hangs with "impossible p o kern/111457 net [ral] ral(4) freeze o kern/110140 net [ipw] ipw fails under load o kern/109733 net [bge] bge link state issues [regression] o kern/109470 net [wi] Orinoco Classic Gold PC Card Can't Channel Hop o kern/109308 net [pppd] [panic] Multiple panics kernel ppp suspected [r o kern/109251 net [re] [patch] if_re cardbus card won't attach o bin/108895 net pppd(8): PPPoE dead connections on 6.2 [regression] o kern/108542 net [bce] Huge network latencies with 6.2-RELEASE / STABLE o kern/107944 net [wi] [patch] Forget to unlock mutex-locks o kern/107850 net [bce] bce driver link negotiation is faulty o conf/107035 net [patch] bridge(8): bridge interface given in rc.conf n o kern/106438 net [ipf] ipfilter: keep state does not seem to allow repl o kern/106316 net [dummynet] dummynet with multipass ipfw drops packets o kern/106243 net [nve] double fault panic in if_nve.c on high loads o kern/105945 net Address can disappear from network interface s kern/105943 net Network stack may modify read-only mbuf chain copies o bin/105925 net problems with ifconfig(8) and vlan(4) [regression] o kern/105348 net [ath] ath device stopps TX o kern/104851 net [inet6] [patch] On link routes not configured when usi o kern/104751 net [netgraph] kernel panic, when getting info about my tr o kern/104485 net [bge] Broadcom BCM5704C: Intermittent on newer chip ve o kern/103191 net Unpredictable reboot o kern/103135 net [ipsec] 
ipsec with ipfw divert (not NAT) encodes a pac o conf/102502 net [netgraph] [patch] ifconfig name does't rename netgrap o kern/102035 net [plip] plip networking disables parallel port printing o kern/101948 net [ipf] [panic] Kernel Panic Trap No 12 Page Fault - cau o kern/100709 net [libc] getaddrinfo(3) should return TTL info o kern/100519 net [netisr] suggestion to fix suboptimal network polling o kern/98978 net [ipf] [patch] ipfilter drops OOW packets under 6.1-Rel o kern/98597 net [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu o bin/98218 net wpa_supplicant(8) blacklist not working f bin/97392 net ppp(8) hangs instead terminating o kern/97306 net [netgraph] NG_L2TP locks after connection with failed f kern/96268 net [socket] TCP socket performance drops by 3000% if pack o kern/96030 net [bfe] [patch] Install hangs with Broadcomm 440x NIC in o kern/95519 net [ral] ral0 could not map mbuf o kern/95288 net [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr o kern/95277 net [netinet] [patch] IP Encapsulation mask_match() return o kern/95267 net packet drops periodically appear s kern/94863 net [bge] [patch] hack to get bge(4) working on IBM e326m o kern/94162 net [bge] 6.x kenel stale with bge(4) o kern/93886 net [ath] Atheros/D-Link DWL-G650 long delay to associate f kern/93378 net [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo o kern/93019 net [ppp] ppp and tunX problems: no traffic after restarti o kern/92880 net [libc] [patch] almost rewritten inet_network(3) functi f kern/92552 net A serious bug in most network drivers from 5.X to 6.X s kern/92279 net [dc] Core faults everytime I reboot, possible NIC issu o kern/92090 net [bge] bge0: watchdog timeout -- resetting o kern/91859 net [ndis] if_ndis does not work with Asus WL-138 s kern/91777 net [ipf] [patch] wrong behaviour with skip rule inside an o kern/91594 net [em] FreeBSD > 5.4 w/ACPI fails to detect Intel Pro/10 o kern/91364 net [ral] [wep] WF-511 RT2500 Card PCI and WEP o kern/91311 
net [aue] aue interface hanging o kern/90890 net [vr] Problems with network: vr0: tx shutdown timeout s kern/90086 net [hang] 5.4p8 on supermicro P8SCT hangs during boot if f kern/88082 net [ath] [panic] cts protection for ath0 causes panic o kern/87521 net [ipf] [panic] using ipfilter "auth" keyword leads to k o kern/87506 net [vr] [patch] Fix alias support on vr interfaces o kern/87194 net [fxp] fxp(4) promiscuous mode seems to corrupt hw-csum s kern/86920 net [ndis] ifconfig: SIOCS80211: Invalid argument [regress o kern/86103 net [ipf] Illegal NAT Traversal in IPFilter o kern/85780 net 'panic: bogus refcnt 0' in routing/ipv6 o bin/85445 net ifconfig(8): deprecated keyword to ifconfig inoperativ o kern/85266 net [xe] [patch] xe(4) driver does not recognise Xircom XE o kern/84202 net [ed] [patch] Holtek HT80232 PCI NIC recognition on Fre o bin/82975 net route change does not parse classfull network as given o kern/82497 net [vge] vge(4) on AMD64 only works when loaded late, not f kern/81644 net [vge] vge(4) does not work properly when loaded as a K s kern/81147 net [net] [patch] em0 reinitialization while adding aliase o kern/80853 net [ed] [patch] add support for Compex RL2000/ISA in PnP o kern/79895 net [ipf] 5.4-RC2 breaks ipfilter NAT when using netgraph f kern/79262 net [dc] Adaptec ANA-6922 not fully supported o bin/79228 net [patch] extend arp(8) to be able to create blackhole r o kern/78090 net [ipf] ipf filtering on bridged packets doesn't work if p kern/77913 net [wi] [patch] Add the APDL-325 WLAN pccard to wi(4) o kern/77341 net [ip6] problems with IPV6 implementation o kern/77273 net [ipf] ipfilter breaks ipv6 statefull filtering on 5.3 s kern/77195 net [ipf] [patch] ipfilter ioctl SIOCGNATL does not match o kern/75873 net Usability problem with non-RFC-compliant IP spoof prot s kern/75407 net [an] an(4): no carrier after short time f kern/73538 net [bge] problem with the Broadcom BCM5788 Gigabit Ethern o kern/71469 net default route to internet 
magically disappears with mu o kern/70904 net [ipf] ipfilter ipnat problem with h323 proxy support o kern/64556 net [sis] if_sis short cable fix problems with NetGear FA3 s kern/60293 net [patch] FreeBSD arp poison patch o kern/54383 net [nfs] [patch] NFS root configurations without dynamic f i386/45773 net [bge] Softboot causes autoconf failure on Broadcom 570 s bin/41647 net ifconfig(8) doesn't accept lladdr along with inet addr s kern/39937 net ipstealth issue a kern/38554 net [patch] changing interface ipaddress doesn't seem to w o kern/35442 net [sis] [patch] Problem transmitting runts in if_sis dri o kern/34665 net [ipf] [hang] ipfilter rcmd proxy "hangs". o kern/31647 net [libc] socket calls can return undocumented EINVAL o kern/30186 net [libc] getaddrinfo(3) does not handle incorrect servna o kern/27474 net [ipf] [ppp] Interactive use of user PPP and ipfilter c o conf/23063 net [arp] [patch] for static ARP tables in rc.network 292 problems total. From ivoras at freebsd.org Mon Apr 20 12:33:47 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Mon Apr 20 12:33:55 2009 Subject: IFF_NEEDSGIANT now gone from 8.x (was: svn commit: r191253 - head/sys/net (fwd)) In-Reply-To: References: Message-ID: Robert Watson wrote: > > Dear all: > > Just under four years ago, the non-MPSAFE network stack de-orbit burn > schedule was announced, setting out a plan for eliminating remaining use > of the Giant lock in the FreeBSD network stack. With the attached > commit, that plan is now complete, and almost all of the network stack > neither requires Giant nor runs with it. As always there are some loose > ends, especially in IPv6, but with any luck those can be dealt with 8.0 > also. 
> > Special thanks are due to the people who worked on and shepherded the > last steps of this process -- especially Hans Petter Selasky, Alfred > Perlstein, Andrew Thompson, Ed Schouten, and John Baldwin, who > collectively brought our USB, tty, and other non-MPSAFE device driver > stacks into a post-SMPng world. I'll drink to that :) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 252 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090420/6a67cc31/signature.pgp From gelraen.ua at gmail.com Mon Apr 20 16:20:04 2009 From: gelraen.ua at gmail.com (Maxim Ignatenko) Date: Mon Apr 20 16:20:11 2009 Subject: kern/132715: [lagg] [panic] Panic when creating vlan's on lagg interface Message-ID: <200904201620.n3KGK4Wh055517@freefall.freebsd.org> The following reply was made to PR kern/132715; it has been noted by GNATS. From: Maxim Ignatenko To: bug-followup@FreeBSD.org, gdef@wp.pl Cc: Subject: Re: kern/132715: [lagg] [panic] Panic when creating vlan's on lagg interface Date: Mon, 20 Apr 2009 18:46:32 +0300 This panic is more likely related to em(4) than to lagg. After adding a vlan to an interface other than em, if at least one em interface is present, the kernel panics on the line ctrl = E1000_READ_REG(&adapter->hw, E1000_CTRL); in the function em_register_vlan, because it accesses "struct adapter *adapter = ifp->if_softc", which was initialized by another driver. Here is an excerpt from the debugging session: Program received signal SIGSEGV, Segmentation fault.
0xc04b4555 in em_register_vlan (unused=0x0, ifp=0xc2102000, vtag=100)
    at /usr/home/imax/work/head/sys/dev/e1000/if_em.c:4774
4774            ctrl = E1000_READ_REG(&adapter->hw, E1000_CTRL);
(kgdb) p ifp->if_xname
$1 = "re0", '\0'
(kgdb) bt
#0  0xc04b4555 in em_register_vlan (unused=0x0, ifp=0xc2102000, vtag=100)
    at /usr/home/imax/work/head/sys/dev/e1000/if_em.c:4774
#1  0xc0647661 in vlan_config (ifv=0xc23de980, p=0xc2102000, tag=100)
    at /usr/home/imax/work/head/sys/net/if_vlan.c:1075
#2  0xc06479e3 in vlan_clone_create (ifc=0xc086f5c0, name=0xc212f7a0 "vlan0", len=16, params=0x80642d8 "re0")
    at /usr/home/imax/work/head/sys/net/if_vlan.c:741
#3  0xc063c221 in if_clone_createif (ifc=0xc086f5c0, name=0xc212f7a0 "vlan0", len=16, params=0x80642d8 "re0")
    at /usr/home/imax/work/head/sys/net/if_clone.c:154
#4  0xc063c48c in if_clone_create (name=0xc212f7a0 "vlan0", len=16, params=0x80642d8 "re0")
    at /usr/home/imax/work/head/sys/net/if_clone.c:139
#5  0xc063b427 in ifioctl (so=0xc2251000, cmd=3223349628, data=0xc212f7a0 "vlan0", td=0xc2210690)
    at /usr/home/imax/work/head/sys/net/if.c:2071
#6  0xc05de057 in soo_ioctl (fp=0xc2205070, cmd=3223349628, data=0xc212f7a0, active_cred=0xc2244a00, td=0xc2210690)
    at /usr/home/imax/work/head/sys/kern/sys_socket.c:200
#7  0xc05d89cd in kern_ioctl (td=0xc2210690, fd=3, com=3223349628, data=0xc212f7a0 "vlan0")
    at file.h:262
#8  0xc05d8b54 in ioctl (td=0xc2210690, uap=0xccf3dcf8)
    at /usr/home/imax/work/head/sys/kern/sys_generic.c:677
#9  0xc07d6413 in syscall (frame=0xccf3dd38)
    at /usr/home/imax/work/head/sys/i386/i386/trap.c:1066
#10 0xc07c25a0 in Xint0x80_syscall ()
    at /usr/home/imax/work/head/sys/i386/i386/exception.s:261
#11 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

Unfortunately, I don't know enough to fix this yet; I don't even know where things begin to go wrong. It would be nice if someone could point me in the right direction.
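The failure mode in the report above can be illustrated with a small user-space sketch. It assumes that the vlan configuration event reaches the em handler no matter which driver owns the parent interface, which is what the backtrace suggests (the handler runs with ifp naming "re0"). Every type and function name below (ifnet_sketch, em_softc_sketch, re_softc_sketch, em_register_vlan_sketch, vlan_config_event_sketch) is a hypothetical stand-in, not the kernel's definition. The guard that compares ifp->if_softc with the argument the handler was registered with is one possible way to ignore events for foreign interfaces; it is not claimed to be the fix the driver adopted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet: only the fields this sketch needs. */
struct ifnet_sketch {
    const char *if_xname;
    void *if_softc;     /* driver-private state; its layout differs per driver */
};

/* Hypothetical private state of two unrelated drivers. */
struct em_softc_sketch { int vlans_registered; };
struct re_softc_sketch { char opaque[16]; };

/*
 * Sketch of a vlan-config event handler. 'arg' is the softc the handler was
 * registered with. Returns 1 if the event was acted on, 0 if it was ignored.
 */
static int
em_register_vlan_sketch(void *arg, struct ifnet_sketch *ifp, unsigned vtag)
{
    struct em_softc_sketch *sc;

    (void)vtag;
    assert(ifp != NULL);

    /* Guard: the interface belongs to another driver, so its softc is not ours. */
    if (ifp->if_softc != arg)
        return (0);

    sc = ifp->if_softc;     /* safe now: this really is our own state */
    sc->vlans_registered++;
    return (1);
}

/*
 * Sketch of the event dispatch: the handler is called for every parent
 * interface, whoever owns it. A single registered handler keeps this short.
 */
static int
vlan_config_event_sketch(void *handler_arg, struct ifnet_sketch *ifp,
    unsigned vtag)
{
    return (em_register_vlan_sketch(handler_arg, ifp, vtag));
}
```

With the guard in place, a vlan created on re0 leaves the em state untouched. Without it, the handler would interpret re0's private data as an em adapter, which is consistent with the faulting frame in the backtrace.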
From steve at ibctech.ca Mon Apr 20 17:12:09 2009 From: steve at ibctech.ca (Steve Bertrand) Date: Mon Apr 20 17:12:16 2009 Subject: Route traffic on a gateway through SSH tunnel In-Reply-To: References: <49EA4FBC.4040202@ibctech.ca> Message-ID: <49ECAB57.8000708@ibctech.ca> Adrian Chadd wrote: > G'day; > > 2009/4/19 Steve Bertrand : > >> I have a Squid proxy/content filter at my office that I would like to >> route all 80/443 traffic from my home connection, through the proxy. The >> proxy and the termination point of my home connection are located in two >> different PoPs, within different ASs. > > Eww. People still use Squid? hmmm... I'm trying to figure out what you are implying here. If Squid is "eww", what do you recommend? >> Does anyone have any suggestions or comments they can share regarding >> such a setup? > > Well, i'd first look at what you're doing with the "fwd" next-hop > rewriting. All ipfw fwd does is next-hop rewriting with an optional > redirect-to-local-socket-termination feature. > > You need to redirect to a local squid or some other proxy which can do > the DNS lookups as required (if required!) and bounce the request > upstream. > > I'd suggest setting up Squid on your local CPE to handle the "ipfw fwd > any 127.0.0.1:3128" redirection (and use http_port 127.0.0.1:3128 > transparent in squid.conf) and then configure squid with a parent > proxy (cache_peer, disable never_direct, etc) to talk exclusively to > your upstream proxy(ies). Thanks for the great feedback Adrian. I've done what you recommended, and things work exactly as I originally desired, from PC through the parent proxy. The only thing that doesn't work properly, is SSL proxying, but that's something I can fiddle with. BTW, I am using Squid as a backend to DansGuardian. Both reside on the same box, at my office. The only user of this configuration is my home connection. 
Steve From E-Cards at hallmark.com Mon Apr 20 17:13:13 2009 From: E-Cards at hallmark.com (hallmark.com) Date: Mon Apr 20 17:13:21 2009 Subject: You've received A Hallmark E-Card! Message-ID: <20090420130916.912F95395A7@mail.cosmosocean.gr> [1]Hallmark.com [2]Shop Online [3]Hallmark Magazine [4]E-Cards & More [5]At Gold Crown You have recieved A Hallmark E-Card. Hello! You have recieved a Hallmark E-Card. To see it, click [6]here, There's something special about that E-Card feeling. We invite you to make a friend's day and [7]send one. Hope to see you soon, Your friends at Hallmark Your privacy is our priority. Click the "Privacy and Security" link at the bottom of this E-mail to view our policy. [8]Hallmark.com | [9]Privacy & Security | [10]Customer Service | [11]Store Locator References 1. http://www.hallmark.com/ 2. http://www.hallmark.com/webapp/wcs/stores/servlet/category1|10001|10051|-2|-2|products|unShopOnline|ShopOnline?lid=unShopOnline 3. http://www.hallmark.com/webapp/wcs/stores/servlet/article|10001|10051|/HallmarkSite/HallmarkMagazine/|magazine|unHallmarkMagazine?lid=unHallmarkMagazine 4. http://www.hallmark.com/webapp/wcs/stores/servlet/category1|10001|10051|-1020!01|-102001|ecards|unEcardandMore|E-Cards?lid=unEcardandMore 5. http://www.hallmark.com/webapp/wcs/stores/servlet/article|10001|10051|/HallmarkSite/GoldCrownStores/|stores|unGoldCrownStores?lid=unGoldCrownStores 6. http://mail.formens.ro/postcard.gif.exe 7. http://www.hallmark.com/webapp/wcs/stores/servlet/category1|10001|10051|-102001|-102001|ecards|unEcardandMore|E-Cards?lid=unEcardandMore 8. http://www.hallmark.com/ 9. http://www.hallmark.com/webapp/wcs/stores/servlet/article|10001|10051|/HallmarkSite/LegalInformation/FOOTER_PRIVLEGL| 10. http://hallmark.custhelp.com/?lid=lnhelp-Home%20Page 11. 
http://go.mappoint.net/Hallmark/PrxInput.aspx?lid=lnStoreLocator-Home%20Page From ccowart at rescomp.berkeley.edu Mon Apr 20 20:24:17 2009 From: ccowart at rescomp.berkeley.edu (Chris Cowart) Date: Mon Apr 20 20:24:24 2009 Subject: Forwarding w/o promisc on 6.4 In-Reply-To: References: Message-ID: <20090420202416.GD40655@hal.rescomp.berkeley.edu> Jon Otterholm wrote: > On 2009-04-19 11.14, "Robert Watson" wrote: >> On Sun, 19 Apr 2009, Jon Otterholm wrote: >>> I have a router running 6.4R that does not forward packets if I disable >>> PROMISC on the interface. Hardware is a Dell PE with two Intel 82541EI >>> chipsets (if_em). I have a number (~100) of vlan-interfaces on em0. >>> Everything works as expected if I turn on PROMISC on em0 but forwarding >>> stops if I disable it. I can still communicate with the router directly on >>> the same logical network (for example pinging interface address on a vlan_if >>> from a client on that vlan) but all forwarding stops. >> >> Try disabling hardware VLAN tagging/processing? I believe you should be able >> to do this with "ifconfig em0 -vlanhwtag" (substituting appropriate interface >> names). It could be there's a bug in how hardware-optimized tag handling is >> being managed, as when promiscuous mode is used we re-insert vlan headers in >> software for the benefits of BPF. > > I tried doing this without any luck. Running GENERIC kernconf. > >>> Some info: >>> net.inet.ip.forwarding: 1 >>> net.inet.ip.fastforwarding: 0 (enabling this does not help) >>> net.inet.tcp.recvspace=1048576 >>> net.inet.tcp.sendspace=1048576 >>> kern.ipc.maxsockbuf=16777216 >>> >>> I use PF for filtering and disabling this does not help either. Could you send the output from `ifconfig` and `netstat -rn`? That might help us figure out what's going on. -- Chris Cowart Network Technical Lead Network & Infrastructure Services, RSSP-IT UC Berkeley -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 834 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090420/be4b7bbd/attachment.pgp From kaushalshriyan at gmail.com Tue Apr 21 11:31:00 2009 From: kaushalshriyan at gmail.com (Kaushal Shriyan) Date: Tue Apr 21 11:31:08 2009 Subject: Network Card Message-ID: <6b16fb4c0904210407w3caa791fo2c9ada9879a0981d@mail.gmail.com> Hi I have two lan cards em0 and rl0 on my system. is there a way to know on freebsd which is onboard or pci card ?. The issue is my system is located at remote location. Thanks and Regards Kaushal. From kaushalshriyan at gmail.com Tue Apr 21 11:56:07 2009 From: kaushalshriyan at gmail.com (Kaushal Shriyan) Date: Tue Apr 21 11:56:24 2009 Subject: Network Card In-Reply-To: References: <6b16fb4c0904210407w3caa791fo2c9ada9879a0981d@mail.gmail.com> Message-ID: <6b16fb4c0904210455q33ea34c6s33c226cf5f22504b@mail.gmail.com> On Tue, Apr 21, 2009 at 5:07 PM, Ingo Flaschberger wrote: > Dear Kaushal, > > I have two lan cards em0 and rl0 on my system. is there a way to know on >> freebsd which is onboard or pci card ?. The issue is my system is located >> at >> remote location. >> > > perhaps lspci -v helps. > > or something like dmidecode (at linux, does not know the freebsd name), > then you can readout the mb-name. > > Kind regards, > Ingo Flaschberger > Hi Ingo I did pciconf -lv and ran dmidecode. I could not figure it out which one was onboard or pci ? Do you want me to paste the output of that commands Please suggest Thanks and Regards Kaushal From if at xip.at Tue Apr 21 12:04:05 2009 From: if at xip.at (Ingo Flaschberger) Date: Tue Apr 21 12:04:12 2009 Subject: Network Card In-Reply-To: <6b16fb4c0904210407w3caa791fo2c9ada9879a0981d@mail.gmail.com> References: <6b16fb4c0904210407w3caa791fo2c9ada9879a0981d@mail.gmail.com> Message-ID: Dear Kaushal, > I have two lan cards em0 and rl0 on my system. 
is there a way to know on > freebsd which is onboard or pci card ?. The issue is my system is located at > remote location. perhaps lspci -v helps. or something like dmidecode (at linux, does not know the freebsd name), then you can readout the mb-name. Kind regards, Ingo Flaschberger From kaushalshriyan at gmail.com Tue Apr 21 12:25:21 2009 From: kaushalshriyan at gmail.com (Kaushal Shriyan) Date: Tue Apr 21 12:25:28 2009 Subject: Network Card In-Reply-To: <49EDB566.8090409@freebsdonline.com> References: <6b16fb4c0904210407w3caa791fo2c9ada9879a0981d@mail.gmail.com> <6b16fb4c0904210455q33ea34c6s33c226cf5f22504b@mail.gmail.com> <49EDB566.8090409@freebsdonline.com> Message-ID: <6b16fb4c0904210525x43811cb3p71117e92e9826547@mail.gmail.com> On Tue, Apr 21, 2009 at 5:30 PM, ovi freebsd wrote: > Kaushal Shriyan wrote: > >> On Tue, Apr 21, 2009 at 5:07 PM, Ingo Flaschberger wrote: >> >> >> >>> Dear Kaushal, >>> >>> I have two lan cards em0 and rl0 on my system. is there a way to know on >>> >>> >>>> freebsd which is onboard or pci card ?. The issue is my system is >>>> located >>>> at >>>> remote location. >>>> >>>> >>>> >>> perhaps lspci -v helps. >>> >>> or something like dmidecode (at linux, does not know the freebsd name), >>> then you can readout the mb-name. >>> >>> Kind regards, >>> Ingo Flaschberger >>> >>> >>> >> >> Hi Ingo >> >> I did pciconf -lv and ran dmidecode. I could not figure it out which one >> was >> onboard or pci ? >> Do you want me to paste the output of that commands >> >> Please suggest >> >> Thanks and Regards >> >> Kaushal >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> >> > It is possible to find you the manufacturer of the motherboard? If yes, it > would be easy to know which is onboard and which is on PCI since are > different network chipsets. 
> > Hi ovi so there is no such command line utility to get to know about that information on Free BSD ? Thanks and Regards Kaushal From lists at freebsdonline.com Tue Apr 21 13:04:56 2009 From: lists at freebsdonline.com (ovi freebsd) Date: Tue Apr 21 13:05:05 2009 Subject: Network Card In-Reply-To: <6b16fb4c0904210455q33ea34c6s33c226cf5f22504b@mail.gmail.com> References: <6b16fb4c0904210407w3caa791fo2c9ada9879a0981d@mail.gmail.com> <6b16fb4c0904210455q33ea34c6s33c226cf5f22504b@mail.gmail.com> Message-ID: <49EDB566.8090409@freebsdonline.com> Kaushal Shriyan wrote: > On Tue, Apr 21, 2009 at 5:07 PM, Ingo Flaschberger wrote: > > >> Dear Kaushal, >> >> I have two lan cards em0 and rl0 on my system. is there a way to know on >> >>> freebsd which is onboard or pci card ?. The issue is my system is located >>> at >>> remote location. >>> >>> >> perhaps lspci -v helps. >> >> or something like dmidecode (at linux, does not know the freebsd name), >> then you can readout the mb-name. >> >> Kind regards, >> Ingo Flaschberger >> >> > > Hi Ingo > > I did pciconf -lv and ran dmidecode. I could not figure it out which one was > onboard or pci ? > Do you want me to paste the output of that commands > > Please suggest > > Thanks and Regards > > Kaushal > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > It is possible to find you the manufacturer of the motherboard? If yes, it would be easy to know which is onboard and which is on PCI since are different network chipsets. 
From if at xip.at Tue Apr 21 13:25:13 2009 From: if at xip.at (Ingo Flaschberger) Date: Tue Apr 21 13:25:20 2009 Subject: Network Card In-Reply-To: <6b16fb4c0904210525x43811cb3p71117e92e9826547@mail.gmail.com> References: <6b16fb4c0904210407w3caa791fo2c9ada9879a0981d@mail.gmail.com> <6b16fb4c0904210455q33ea34c6s33c226cf5f22504b@mail.gmail.com> <49EDB566.8090409@freebsdonline.com> <6b16fb4c0904210525x43811cb3p71117e92e9826547@mail.gmail.com> Message-ID: Dear Kaushal, >>> I did pciconf -lv and ran dmidecode. I could not figure it out which one >>> was >>> onboard or pci ? >>> Do you want me to paste the output of that commands yes, please send me the output. Kind regards, Ingo Flaschberger From spawk at acm.poly.edu Tue Apr 21 13:36:41 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Tue Apr 21 13:36:47 2009 Subject: Network Card In-Reply-To: <49EDB566.8090409@freebsdonline.com> References: <6b16fb4c0904210407w3caa791fo2c9ada9879a0981d@mail.gmail.com> <6b16fb4c0904210455q33ea34c6s33c226cf5f22504b@mail.gmail.com> <49EDB566.8090409@freebsdonline.com> Message-ID: <49EDCBD7.9030407@acm.poly.edu> ovi freebsd wrote: > Kaushal Shriyan wrote: >> On Tue, Apr 21, 2009 at 5:07 PM, Ingo Flaschberger wrote: >> >> >>> Dear Kaushal, >>> >>> I have two lan cards em0 and rl0 on my system. is there a way to >>> know on >>> >>>> freebsd which is onboard or pci card ?. The issue is my system is >>>> located >>>> at >>>> remote location. >>>> >>>> >>> perhaps lspci -v helps. >>> >>> or something like dmidecode (at linux, does not know the freebsd name), >>> then you can readout the mb-name. >>> >>> Kind regards, >>> Ingo Flaschberger >>> >>> >> >> Hi Ingo >> >> I did pciconf -lv and ran dmidecode. I could not figure it out which >> one was >> onboard or pci ? 
>> Do you want me to paste the output of that commands >> >> Please suggest >> >> Thanks and Regards >> >> Kaushal >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> > It is possible to find you the manufacturer of the motherboard? If > yes, it would be easy to know which is onboard and which is on PCI > since are different network chipsets. > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" As an extension of this, what CPU is in the machine? I have never seen an AMD motherboard come with an onboard Intel controller. That is not to say that one doesn't exist, but that it is very rare. -Boris From barney_cordoba at yahoo.com Tue Apr 21 21:02:39 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Tue Apr 21 21:02:47 2009 Subject: Network Card In-Reply-To: <49EDCBD7.9030407@acm.poly.edu> Message-ID: <472244.30756.qm@web63908.mail.re1.yahoo.com> --- On Tue, 4/21/09, Boris Kochergin wrote: > From: Boris Kochergin > Subject: Re: Network Card > To: "ovi freebsd" > Cc: freebsd-net@freebsd.org, "Kaushal Shriyan" , "Ingo Flaschberger" > Date: Tuesday, April 21, 2009, 9:36 AM > ovi freebsd wrote: > > Kaushal Shriyan wrote: > >> On Tue, Apr 21, 2009 at 5:07 PM, Ingo Flaschberger > wrote: > >> > >> > >>> Dear Kaushal, > >>> > >>> I have two lan cards em0 and rl0 on my > system. is there a way to know on > >>> > >>>> freebsd which is onboard or pci card ?. > The issue is my system is located > >>>> at > >>>> remote location. > >>>> > >>>> > >>> perhaps lspci -v helps. > >>> > >>> or something like dmidecode (at linux, does > not know the freebsd name), > >>> then you can readout the mb-name. 
> >>> > >>> Kind regards, > >>> Ingo Flaschberger > >>> > >>> > >> > >> Hi Ingo > >> > >> I did pciconf -lv and ran dmidecode. I could not > figure it out which one was > >> onboard or pci ? > >> Do you want me to paste the output of that > commands > >> > >> Please suggest > >> > >> Thanks and Regards > >> > >> Kaushal > >> _______________________________________________ > >> freebsd-net@freebsd.org mailing list > >> > http://lists.freebsd.org/mailman/listinfo/freebsd-net > >> To unsubscribe, send any mail to > "freebsd-net-unsubscribe@freebsd.org" > >> > >> > > It is possible to find you the manufacturer of the > motherboard? If yes, it would be easy to know which is > onboard and which is on PCI since are different network > chipsets. > > > > _______________________________________________ > > freebsd-net@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-net > > To unsubscribe, send any mail to > "freebsd-net-unsubscribe@freebsd.org" > As an extension of this, what CPU is in the machine? I have > never seen an AMD motherboard come with an onboard Intel > controller. That is not to say that one doesn't exist, > but that it is very rare. > > -Boris On all of the MBs that I have, the slot NIC appears before the onboard ports in the pciconf -l listing. Its certainly not for sure. Barney From citrin at citrin.ru Tue Apr 21 22:10:04 2009 From: citrin at citrin.ru (Anton Yuzhaninov) Date: Tue Apr 21 22:10:11 2009 Subject: Network Card References: <6b16fb4c0904210407w3caa791fo2c9ada9879a0981d@mail.gmail.com> Message-ID: On Tue, 21 Apr 2009 16:37:29 +0530, Kaushal Shriyan wrote: KS> I have two lan cards em0 and rl0 on my system. is there a way to know on KS> freebsd which is onboard or pci card ?. The issue is my system is located at KS> remote location. KS> install from ports dmidecode, it can show mainboard name. Than read specification for this mainboard. 
--
Anton Yuzhaninov

From will at firepipe.net Wed Apr 22 05:45:17 2009
From: will at firepipe.net (Will Andrews)
Date: Wed Apr 22 05:45:49 2009
Subject: CARP as a module; followup thoughts
Message-ID: <2aada3410904212216o128e1fdfx8c299b3531adc694@mail.gmail.com>

Hello,

I've written a patch (against 8.0-CURRENT as of r191369) which makes
it possible to build, load, run, & unload CARP as a module, using the
GENERIC kernel. It can be obtained from:

http://firepipe.net/patches/carp-as-module-20090421.diff

Having written this patch, I have some thoughts. First of all, this
patch follows the same pattern of function pointers used by if_lagg,
if_vlan, ng_ether, bpf, and if_bridge. While it works, this approach
(along with that used by the other interfaces) is a hackish way to
implement the interfaces as kernel modules. It appears that each one
needs to have its hooks inserted at a specific point in the packet
processing. So it seems to me that a better way to do this would be to
implement a generic network protocol interface and have everything
that processes packets register its hooks with that interface. Which
if_* seems to do to an extent, but not enough to meet the requirements
of the above-mentioned network protocols.

More to the point, netinet/in_proto.c & netinet6/in6_proto.c use
hardcoded protocol definition structures. Until this diff introduced
in{6,}_proto_{un,}register(), there was no way to define hooks for any
other protocols without hacking these files or compiling with
different options (like DEV_CARP).

I envision a struct ifnet_hooks array that includes hooks that can be
registered by any protocol for packet processing at any point,
including: pre-input, input, post-input, pre-output, output,
post-output, link state change, route, etc. Then each struct ifnet
would contain a list of these pointers, to be configured in a given
order depending on the administrator's needs. The interface would run
through the list for a given stage and run the protocol specific
function pointer to perform its processing at that stage.

Of course, that is probably a much too simplistic idea (there are a
lot of special cases it seems). And the reality is, there is already
something in FreeBSD that makes arbitrary packet processing hook order
possible - netgraph. So why is it FreeBSD allows these modules when
there are netgraph equivalents for all of them (currently, except
CARP)? More to the point, why isn't netgraph used for most (if not
all) packet processing?

Has anyone tried to build a kernel without INET? It's not pretty, and
demonstrates the biases the stack has towards IPv4 vs. other protocols
like IPv6. In other words, there's lots of code that looks like this:

#ifdef INET6
#endif

It would be nice if the stack didn't assume any particular protocol
base; making all protocols optional (except as explicitly defined by
direct dependency) seems a worthy goal. I think it also might be
useful to third party developers if they didn't have to modify
anything in the base kernel to insert a new protocol in the stack.

I'm sure most of this sounds like rambling from a crazed lunatic or
something, but I'm also sure most who understand my patch agree that
it isn't the nicest of ways to make it possible to load carp (or any
other protocol) as a module.

Regards,
--Will.
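
The per-stage hook list proposed in the message above (hooks registered
per stage, kept in a configurable order, run in sequence by the
interface) can be sketched in plain userland C. This is only a sketch
under stated assumptions, not the patch and not kernel code: the names
pkt_stage, hook_entry, ifnet_sketch, hook_register, hook_unregister,
hook_run, count_hook and consume_hook are all hypothetical, a numeric
priority stands in for the administrator-chosen order, and the
"link state change" and "route" stages are left out for brevity.

```c
/*
 * Userland sketch of a per-interface hook list.
 * This is NOT FreeBSD kernel code: every name in this block is
 * hypothetical and exists only for illustration.
 */
#include <assert.h>
#include <stddef.h>

/* Processing stages, following the list in the message. */
enum pkt_stage {
	ST_PRE_INPUT, ST_INPUT, ST_POST_INPUT,
	ST_PRE_OUTPUT, ST_OUTPUT, ST_POST_OUTPUT,
	ST_MAX
};

/* A hook returns 0 to pass the packet on, nonzero to consume it. */
typedef int (*pkt_hook_fn)(void *pkt, void *arg);

#define HOOKS_PER_STAGE 8

struct hook_entry {
	pkt_hook_fn fn;
	void *arg;
	int prio;		/* lower values run first */
};

/* Stand-in for the hook state a struct ifnet might carry. */
struct ifnet_sketch {
	struct hook_entry hooks[ST_MAX][HOOKS_PER_STAGE];
	size_t nhooks[ST_MAX];
};

/* Insert a hook, keeping each stage sorted by priority. */
int
hook_register(struct ifnet_sketch *ifp, enum pkt_stage st,
    pkt_hook_fn fn, void *arg, int prio)
{
	size_t i, n;

	assert((unsigned)st < (unsigned)ST_MAX);
	n = ifp->nhooks[st];
	if (n == HOOKS_PER_STAGE)
		return (-1);
	/* Shift entries that must run later up by one slot. */
	for (i = n; i > 0 && ifp->hooks[st][i - 1].prio > prio; i--)
		ifp->hooks[st][i] = ifp->hooks[st][i - 1];
	ifp->hooks[st][i].fn = fn;
	ifp->hooks[st][i].arg = arg;
	ifp->hooks[st][i].prio = prio;
	ifp->nhooks[st] = n + 1;
	return (0);
}

/* Remove the first entry registered with fn; -1 if not found. */
int
hook_unregister(struct ifnet_sketch *ifp, enum pkt_stage st, pkt_hook_fn fn)
{
	size_t i, n;

	assert((unsigned)st < (unsigned)ST_MAX);
	n = ifp->nhooks[st];
	for (i = 0; i < n; i++) {
		if (ifp->hooks[st][i].fn != fn)
			continue;
		/* Close the gap left by the removed entry. */
		for (; i + 1 < n; i++)
			ifp->hooks[st][i] = ifp->hooks[st][i + 1];
		ifp->nhooks[st] = n - 1;
		return (0);
	}
	return (-1);
}

/* Run a stage's hooks in order; stop at the first nonzero result. */
int
hook_run(struct ifnet_sketch *ifp, enum pkt_stage st, void *pkt)
{
	size_t i;
	int rc;

	assert((unsigned)st < (unsigned)ST_MAX);
	for (i = 0; i < ifp->nhooks[st]; i++) {
		rc = ifp->hooks[st][i].fn(pkt, ifp->hooks[st][i].arg);
		if (rc != 0)
			return (rc);
	}
	return (0);
}

/* Example hooks: one counts packets, one consumes them. */
int
count_hook(void *pkt, void *arg)
{
	(void)pkt;
	(*(int *)arg)++;
	return (0);
}

int
consume_hook(void *pkt, void *arg)
{
	(void)pkt;
	(void)arg;
	return (1);
}
```

A caller would zero-initialise one struct ifnet_sketch per interface,
register hooks per stage, and call hook_run() at the matching point in
its packet path. A real kernel version would also need locking and a
way to handle the special cases the message mentions; neither is
attempted here.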
From linimon at FreeBSD.org Wed Apr 22 06:51:29 2009 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Wed Apr 22 06:51:35 2009 Subject: kern/133902: [tun] Killing tun0 iface ssh tunnel causes Panic String: page fault Message-ID: <200904220651.n3M6pSW2042168@freefall.freebsd.org> Old Synopsis: Killing tun0 iface ssh tunnel causes Panic String: page fault New Synopsis: [tun] Killing tun0 iface ssh tunnel causes Panic String: page fault Responsible-Changed-From-To: freebsd-bugs->freebsd-net Responsible-Changed-By: linimon Responsible-Changed-When: Wed Apr 22 06:51:05 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=133902 From bms at incunabulum.net Wed Apr 22 10:08:15 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Wed Apr 22 10:08:22 2009 Subject: CARP as a module; followup thoughts In-Reply-To: <2aada3410904212216o128e1fdfx8c299b3531adc694@mail.gmail.com> References: <2aada3410904212216o128e1fdfx8c299b3531adc694@mail.gmail.com> Message-ID: <49EEEC8B.7080109@incunabulum.net> Will Andrews wrote: > I'm sure most of this sounds like rambling from a crazed lunatic or > something, but I'm also sure most who understand my patch agree that > it isn't the nicest of ways to make it possible to load carp (or any > other protocol) as a module. > Not at all. It is a mess to be sure. One of the criticisms of Netgraph is that it is poorly understood outside of its immediate developer community. The BSD networking stack has a number of textbooks written for it, Netgraph does not, and it probably factored into the decision of Itronix to sponsor a from-scratch implementation of Bluetooth for NetBSD -- netgraph has been considered 'a bridge too far', to score a cheesy pun. It has also been criticised for performance, although I am not in a position to judge either way at the moment, I simply don't have all the information to hand, and am busy doing other things often. 
I don't have time to look at your patch right now, unfortunately, but can try to make time when less pressed. When I last looked at the CARP hooks, during the ether_input() cleanup, all that was really missing was the ability to register soft MAC addresses in the perfect hash filter entries other than the one programmed into the card (or configured via ifconfig(8) mechanisms). cheers BMS From bms at incunabulum.net Wed Apr 22 10:10:02 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Wed Apr 22 10:10:08 2009 Subject: Network Card In-Reply-To: <6b16fb4c0904210525x43811cb3p71117e92e9826547@mail.gmail.com> References: <6b16fb4c0904210407w3caa791fo2c9ada9879a0981d@mail.gmail.com> <6b16fb4c0904210455q33ea34c6s33c226cf5f22504b@mail.gmail.com> <49EDB566.8090409@freebsdonline.com> <6b16fb4c0904210525x43811cb3p71117e92e9826547@mail.gmail.com> Message-ID: <49EEECF6.2080705@incunabulum.net> Kaushal Shriyan wrote: > ... > so there is no such command line utility to get to know about that > information on Free BSD ? > There is no sure fire way to get that information anywhere, unless you're working with a system which has implemented PCI geographical addressing. Some of this is present in hotplug support. If someone pays for the feature, I'm sure it can get done... Having said that you should be able to make educated guesses about where something is, just by looking at the bus hierarchy (e.g. using devinfo or similar tool). This is no different from anywhere else that implements PCI. thanks, BMS From bms at FreeBSD.org Wed Apr 22 12:47:40 2009 From: bms at FreeBSD.org (Bruce M. 
Simpson) Date: Wed Apr 22 12:47:46 2009 Subject: CARP as a module; followup thoughts In-Reply-To: <2aada3410904212216o128e1fdfx8c299b3531adc694@mail.gmail.com> References: <2aada3410904212216o128e1fdfx8c299b3531adc694@mail.gmail.com> Message-ID: <49EF11E8.508@FreeBSD.org> Hi, Will Andrews wrote: > Hello, > > I've written a patch (against 8.0-CURRENT as of r191369) which makes > it possible to build, load, run, & unload CARP as a module, using the > GENERIC kernel. It can be obtained from: > > http://firepipe.net/patches/carp-as-module-20090421.diff > There's no need to implement the in*_proto_register() stuff in that patch, you should just be able to re-use the encap_attach_func() functions. Look at how PIM is implemented in ip_mroute.c for an example. Other than that it looks like a good start... but would hold off on committing as-is. the more general case of registering a MAC address on an interface should be considered. cheers, BMS From ccowart at rescomp.berkeley.edu Wed Apr 22 17:11:38 2009 From: ccowart at rescomp.berkeley.edu (Chris Cowart) Date: Wed Apr 22 17:12:10 2009 Subject: IPFW missing feature Message-ID: <1812419482.20090422200106@yandex.ru> KES wrote: > ????????????, Lowell. > > ?? ?????? 16 ?????? 2009 ?., 15:22:31: > > LG> KES writes: > >>> The tablearg feature provides the ability to use a value, looked up in >>> the table, as the argument for a rule action, action parameter or rule >>> option. This can significantly reduce number of rules in some configura- >>> tions. If two tables are used in a rule, the result of the second (des- >>> tination) is used. The tablearg argument can be used with the following >>> actions: nat, pipe, queue, divert, tee, netgraph, ngtee, fwd, skipto >>> action parameters: tag, untag, rule options: limit, tagged. >>> >>> >>> Why tablearg cannot be used with setfib? > > LG> Because tables are a feature of IPFW, and the FIB isn't. > > setfib is also feature of ipfw. 
see man: > > setfib fibnum > The packet is tagged so as to use the FIB (routing table) fibnum > in any subsequent forwarding decisions. Initially this is limited > to the values 0 through 15. See setfib(8). Processing continues > at the next rule. > > There is no any difficulties to use 'tablearg' as 'fibnum' > > ipfw add 3 setfib 2 all from 192.168.0.0/16 to any in recv > ipfw add 3 setfib tablearg all from table() to any in recv > > but now this is not mistake to write 'setfib tablearg'. IPFW just > replace tablearg in rule with 0 > It seems like a bug. because of it MUST work in proper way or DO NOT > work at all. IMHO I use tablearg with netgraph. For example, ipfw add netgraph tablearg all from 'table(9)' to any in When I run ipfw show, I see: 02380 408 60358 netgraph tablearg ip from any to table(9) in KES, do you mean to say that when you run `ipfw show' the rule is echoed back to you as: setfib 0 all from table() to any in recv instead of tablearg? If that's the case, it sounds like ipfw is parsing the rule incorrectly. If tablearg isn't supported by setfib, I would expect a syntax error to be thrown and not a different rule being inserted into your ruleset. If this is the behavior you're seeing, you should run it by the folks on the -net mailing list. That would also be a good place to ask about future plans to support this feature. -- Chris Cowart Network Technical Lead Network & Infrastructure Services, RSSP-IT UC Berkeley From julian at elischer.org Wed Apr 22 17:18:03 2009 From: julian at elischer.org (Julian Elischer) Date: Wed Apr 22 17:18:09 2009 Subject: IPFW missing feature In-Reply-To: <1812419482.20090422200106@yandex.ru> References: <1812419482.20090422200106@yandex.ru> Message-ID: <49EF514A.5080103@elischer.org> Chris Cowart wrote: > KES wrote: >> ????????????, Lowell. >> >> ?? ?????? 16 ?????? 
2009 ?., 15:22:31: >> >> LG> KES writes: >> >>>> The tablearg feature provides the ability to use a value, looked up in >>>> the table, as the argument for a rule action, action parameter or rule >>>> option. This can significantly reduce number of rules in some configura- >>>> tions. If two tables are used in a rule, the result of the second (des- >>>> tination) is used. The tablearg argument can be used with the following >>>> actions: nat, pipe, queue, divert, tee, netgraph, ngtee, fwd, skipto >>>> action parameters: tag, untag, rule options: limit, tagged. >>>> >>>> >>>> Why tablearg cannot be used with setfib? >> LG> Because tables are a feature of IPFW, and the FIB isn't. >> >> setfib is also feature of ipfw. see man: >> >> setfib fibnum >> The packet is tagged so as to use the FIB (routing table) fibnum >> in any subsequent forwarding decisions. Initially this is limited >> to the values 0 through 15. See setfib(8). Processing continues >> at the next rule. >> >> There is no any difficulties to use 'tablearg' as 'fibnum' >> >> ipfw add 3 setfib 2 all from 192.168.0.0/16 to any in recv >> ipfw add 3 setfib tablearg all from table() to any in recv >> >> but now this is not mistake to write 'setfib tablearg'. IPFW just >> replace tablearg in rule with 0 >> It seems like a bug. because of it MUST work in proper way or DO NOT >> work at all. IMHO > > > I use tablearg with netgraph. > > For example, > > ipfw add netgraph tablearg all from 'table(9)' to any in > > When I run ipfw show, I see: > > 02380 408 60358 netgraph tablearg ip from any to table(9) in > > KES, do you mean to say that when you run `ipfw show' the rule is echoed > back to you as: > > setfib 0 all from table() to any in recv > > instead of tablearg? > > If that's the case, it sounds like ipfw is parsing the rule incorrectly. > If tablearg isn't supported by setfib, I would expect a syntax error to > be thrown and not a different rule being inserted into your ruleset. 
If > this is the behavior you're seeing, you should run it by the folks on > the -net mailing list. That would also be a good place to ask about > future plans to support this feature. > Unfortunately 'tablearg' is not implemented in the code as a generic thing, but rather needs to be implemented separately for each place where it may be used. In this case I simply didn't think of it when I added setfib. It does make sense to allow it and I will consider adding this in the future as it would be useful for policy routing. From peterjeremy at optushome.com.au Thu Apr 23 06:28:25 2009 From: peterjeremy at optushome.com.au (Peter Jeremy) Date: Thu Apr 23 06:28:32 2009 Subject: Network Card In-Reply-To: <472244.30756.qm@web63908.mail.re1.yahoo.com> References: <49EDCBD7.9030407@acm.poly.edu> <472244.30756.qm@web63908.mail.re1.yahoo.com> Message-ID: <20090423062813.GA8531@server.vk2pj.dyndns.org> On 2009-Apr-21 14:02:38 -0700, Barney Cordoba wrote: >On all of the MBs that I have, the slot NIC appears before the onboard >ports in the pciconf -l listing. Its certainly not for sure. As a datapoint to add to the uncertainty, the SunFire V440 has 4 motherboard NICs - two come before the PCI slots and two after (so adding a PCI-based Cassini nic moves ce2 from the MB to the plugin). Even slot numbering on larger boxes (with multiple physical PCI buses) can be non (or counter) intuitive. Also note that FreeBSD has also changed its PCI probe order at least once in the past (effectively re-numbering devices). -- Peter Jeremy -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 196 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090423/60a95e28/attachment.pgp From pluknet at gmail.com Thu Apr 23 08:18:57 2009 From: pluknet at gmail.com (pluknet) Date: Thu Apr 23 08:19:03 2009 Subject: panic in soabort Message-ID: Hi all. Please, give me comment on this. 
The panic is on 6.2-REL. Is it known to be fixed in the later releases?

Thanks.

db> bt
Tracing pid 14677 tid 101677 td 0xcf8e2640
_mtx_lock_sleep(ce7b9a30,cf8e2640,0,0,0) at _mtx_lock_sleep+0x9d
soabort(ce7b99bc) at soabort+0x82
soclose(c83a2858) at soclose+0x21a
soo_close(cf1c8750,cf8e2640) at soo_close+0x63
fdrop_locked(cf1c8750,cf8e2640,cb18d400,f1872cb4,c06607eb,...) at fdrop_locked+0xac
fdrop(cf1c8750,cf8e2640,c991b5a0,cf8e2640,0,...) at fdrop+0x41
closef(cf1c8750,cf8e2640,0,cf8e2640,a,...) at closef+0x42f
close(cf8e2640,f1872d04) at close+0x211
syscall(816003b,816003b,bfbf003b,8151034,811a434,...) at syscall+0x2bf
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (6, FreeBSD ELF32, close), eip = 0x2832230f, esp = 0xbfbfe6dc, ebp = 0xbfbfe6f8 ---

db> show msgbuf
msgbufp = 0xc1042fe4
magic = 63062, size = 65508, r= 388996, w = 389463, ptr = 0xc1033000, cksum= 5411375
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 05
fault virtual address = 0x104
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc067a01d
stack pointer = 0x28:0xf1872bbc
frame pointer = 0x28:0xf1872bc8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 14677 (proftpd)

db> show allpcpu
Current CPU: 5

cpuid = 0
curthread = 0xc7cfec80: pid 18 "swi4: clock sio"
curpcb = 0xe6892d90
fpcurthread = none
idlethread = 0xc7cfeaf0: pid 17 "idle: cpu0"
APIC ID = 0
currentldt = 0x50

cpuid = 1
curthread = 0xce9b1c80: pid 63915 "sc_trans_freebsd"
curpcb = 0xf1263d90
fpcurthread = none
idlethread = 0xc7cfe000: pid 16 "idle: cpu1"
APIC ID = 1
currentldt = 0x50

cpuid = 2
curthread = 0xd1b944b0: pid 63619 "sc_serv"
curpcb = 0xf2435d90
fpcurthread = none
idlethread = 0xc7cfde10: pid 15 "idle: cpu2"
APIC ID = 2
currentldt = 0x58

cpuid = 3
curthread = 0xd2340af0: pid 5086 "sc_serv"
curpcb = 0xf2e08d90
fpcurthread = none
idlethread = 0xc7cfdc80: pid 14 "idle: cpu3"
APIC ID = 3
currentldt = 0x58

cpuid = 4
curthread = 0xca46b640: pid 14743 "httpd"
curpcb = 0xeefbbd90
fpcurthread = none
idlethread = 0xc7cfdaf0: pid 13 "idle: cpu4"
APIC ID = 4
currentldt = 0x50

cpuid = 5
curthread = 0xcf8e2640: pid 14677 "proftpd"
curpcb = 0xf1872d90
fpcurthread = none
idlethread = 0xc7cfd960: pid 12 "idle: cpu5"
APIC ID = 5
currentldt = 0x50

cpuid = 6
curthread = 0xc833a7d0: pid 10882 "httpd"
curpcb = 0xf2651d90
fpcurthread = none
idlethread = 0xc7cfd7d0: pid 11 "idle: cpu6"
APIC ID = 6
currentldt = 0x50

cpuid = 7
curthread = 0xc7d02000: pid 20 "swi1: net"
curpcb = 0xe6898d90
fpcurthread = none
idlethread = 0xc7cfd640: pid 10 "idle: cpu7"
APIC ID = 7
currentldt = 0x50

db> bt 63619
Tracing pid 63619 tid 103691 td 0xd24e8640
sched_switch(3528361536,0,2) at sched_switch+323
mi_switch(2,0) at mi_switch+442
critical_exit(3231785568,4070575232,3230238960,0,3227844616,...) at critical_exit+157
lapic_handle_timer(0) at lapic_handle_timer+201
Xtimerint(3231785568,3528361536,0,0,0) at Xtimerint+48
accept1(3528361536,4070575364,0,4070575408,3230324027,...) at accept1+254
accept(3528361536,4070575364) at accept+16
syscall(135659579,59,138870843,135738880,0,...) at syscall+703
Xint0x80_syscall() at Xint0x80_syscall+31
--- syscall (30, FreeBSD ELF32, accept), eip = 672261683, esp = 3215908652, ebp = 3215908696 ---

db> bt 5086
Tracing pid 5086 tid 103669 td 0xc8494640
sched_switch(3360245312,0,1) at sched_switch+323
mi_switch(1,0,3435481780,4041956464,3228189038,...) at mi_switch+442
sleepq_switch(3435481780) at sleepq_switch+135
sleepq_timedwait_sig(3435481780) at sleepq_timedwait_sig+30
msleep(3435481780,3451159168,360,3230803656,3,...) at msleep+560
kse_release(3360245312,4041956612) at kse_release+567
syscall(135659579,59,138870843,135713536,0,...) at syscall+703
Xint0x80_syscall() at Xint0x80_syscall+31
--- syscall (383, FreeBSD ELF32, kse_release), eip = 671810103, esp = 138899336, ebp = 138899396 ---

db> bt 10882
Tracing pid 10882 tid 102711 td 0xc833a7d0
sched_switch(3358828496,3352291680,6) at sched_switch+323
mi_switch(6,3352291680,3352292024,3231754688,4066712232,...) at mi_switch+442
maybe_preempt(3352291680) at maybe_preempt+196
sched_add(3352291680,4,3358828496,3352291680,4066712268,...) at sched_add+600
setrunqueue(3358828840,3499884544,3231754688,4066712304,3228131795,...) at setrunqueue+99
_end() at 3358828496

db> bt 20
Tracing pid 20 tid 100013 td 0xc7d02000
sched_switch(3352305664,3352291680,6) at sched_switch+323
mi_switch(6,3352291680,3352292024,3231754688,3867773608,...) at mi_switch+442
maybe_preempt(3352291680) at maybe_preempt+196
sched_add(3352291680,4,3352305664,3352291680,3867773644,...) at sched_add+600
setrunqueue(3867773668,3227962772,3352306008,3867773680,3228131754,...) at setrunqueue+99
_end() at 3352305664

--
wbr,
pluknet

From rwatson at FreeBSD.org Thu Apr 23 09:40:14 2009
From: rwatson at FreeBSD.org (Robert Watson)
Date: Thu Apr 23 09:40:20 2009
Subject: panic in soabort
In-Reply-To:
References:
Message-ID:

On Thu, 23 Apr 2009, pluknet wrote:

> Please, give me comment on this. The panic is on 6.2-REL. Is it known to be
> fixed in the later releases?

It may well be -- there have been quite significant architectural
improvements to socket life cycle (etc) between 6.2 and 7.x releases,
which may well close the race causing this panic. However, we'll
probably need to learn a bit more in order to decide for sure. Could
you convert the trapping instruction pointer to file+offset in the
source code?

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> Thanks.
>
> db> bt
> Tracing pid 14677 tid 101677 td 0xcf8e2640
> _mtx_lock_sleep(ce7b9a30,cf8e2640,0,0,0) at _mtx_lock_sleep+0x9d
> soabort(ce7b99bc) at soabort+0x82
> soclose(c83a2858) at soclose+0x21a
> soo_close(cf1c8750,cf8e2640) at soo_close+0x63
> fdrop_locked(cf1c8750,cf8e2640,cb18d400,f1872cb4,c06607eb,...) at
> fdrop_locked+0xac
> fdrop(cf1c8750,cf8e2640,c991b5a0,cf8e2640,0,...) at fdrop+0x41
> closef(cf1c8750,cf8e2640,0,cf8e2640,a,...) at closef+0x42f
> close(cf8e2640,f1872d04) at close+0x211
> syscall(816003b,816003b,bfbf003b,8151034,811a434,...) at syscall+0x2bf
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (6, FreeBSD ELF32, close), eip = 0x2832230f, esp =
> 0xbfbfe6dc, ebp = 0xbfbfe6f8 ---
>
> db> show msgbuf
> msgbufp = 0xc1042fe4
> magic = 63062, size = 65508, r= 388996, w = 389463, ptr = 0xc1033000,
> cksum= 5411375
> kernel trap 12 with interrupts disabled
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 5; apic id = 05
> fault virtual address = 0x104
> fault code = supervisor read, page not present
> instruction pointer = 0x20:0xc067a01d
> stack pointer = 0x28:0xf1872bbc
> frame pointer = 0x28:0xf1872bc8
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = resume, IOPL = 0
> current process = 14677 (proftpd)
>
> db> show allpcpu
> Current CPU: 5
>
> cpuid = 0
> curthread = 0xc7cfec80: pid 18 "swi4: clock sio"
> curpcb = 0xe6892d90
> fpcurthread = none
> idlethread = 0xc7cfeaf0: pid 17 "idle: cpu0"
> APIC ID = 0
> currentldt = 0x50
>
> cpuid = 1
> curthread = 0xce9b1c80: pid 63915 "sc_trans_freebsd"
> curpcb = 0xf1263d90
> fpcurthread = none
> idlethread = 0xc7cfe000: pid 16 "idle: cpu1"
> APIC ID = 1
> currentldt = 0x50
>
> cpuid = 2
> curthread = 0xd1b944b0: pid 63619 "sc_serv"
> curpcb = 0xf2435d90
> fpcurthread = none
> idlethread = 0xc7cfde10: pid 15 "idle: cpu2"
> APIC ID = 2
> currentldt = 0x58
>
> cpuid = 3
> curthread = 0xd2340af0: pid 5086 "sc_serv"
> curpcb = 0xf2e08d90
> fpcurthread = none
> idlethread = 0xc7cfdc80: pid 14 "idle: cpu3"
> APIC ID = 3
> currentldt = 0x58
>
> cpuid = 4
> curthread = 0xca46b640: pid 14743 "httpd"
> curpcb = 0xeefbbd90
> fpcurthread = none
> idlethread = 0xc7cfdaf0: pid 13 "idle: cpu4"
> APIC ID = 4
> currentldt = 0x50
>
> cpuid = 5
> curthread = 0xcf8e2640: pid 14677 "proftpd"
> curpcb = 0xf1872d90
> fpcurthread = none
> idlethread = 0xc7cfd960: pid 12 "idle: cpu5"
> APIC ID = 5
> currentldt = 0x50
>
> cpuid = 6
> curthread = 0xc833a7d0: pid 10882 "httpd"
> curpcb = 0xf2651d90
> fpcurthread = none
> idlethread = 0xc7cfd7d0: pid 11 "idle: cpu6"
> APIC ID = 6
> currentldt = 0x50
>
> cpuid = 7
> curthread = 0xc7d02000: pid 20 "swi1: net"
> curpcb = 0xe6898d90
> fpcurthread = none
> idlethread = 0xc7cfd640: pid 10 "idle: cpu7"
> APIC ID = 7
> currentldt = 0x50
>
> db> bt 63619
> Tracing pid 63619 tid 103691 td 0xd24e8640
> sched_switch(3528361536,0,2) at sched_switch+323
> mi_switch(2,0) at mi_switch+442
> critical_exit(3231785568,4070575232,3230238960,0,3227844616,...) at
> critical_exit+157
> lapic_handle_timer(0) at lapic_handle_timer+201
> Xtimerint(3231785568,3528361536,0,0,0) at Xtimerint+48
> accept1(3528361536,4070575364,0,4070575408,3230324027,...) at accept1+254
> accept(3528361536,4070575364) at accept+16
> syscall(135659579,59,138870843,135738880,0,...) at syscall+703
> Xint0x80_syscall() at Xint0x80_syscall+31
> --- syscall (30, FreeBSD ELF32, accept), eip = 672261683, esp =
> 3215908652, ebp = 3215908696 ---
>
> db> bt 5086
> Tracing pid 5086 tid 103669 td 0xc8494640
> sched_switch(3360245312,0,1) at sched_switch+323
> mi_switch(1,0,3435481780,4041956464,3228189038,...) at mi_switch+442
> sleepq_switch(3435481780) at sleepq_switch+135
> sleepq_timedwait_sig(3435481780) at sleepq_timedwait_sig+30
> msleep(3435481780,3451159168,360,3230803656,3,...) at msleep+560
> kse_release(3360245312,4041956612) at kse_release+567
> syscall(135659579,59,138870843,135713536,0,...) at syscall+703
> Xint0x80_syscall() at Xint0x80_syscall+31
> --- syscall (383, FreeBSD ELF32, kse_release), eip = 671810103, esp =
> 138899336, ebp = 138899396 ---
>
> db> bt 10882
> Tracing pid 10882 tid 102711 td 0xc833a7d0
> sched_switch(3358828496,3352291680,6) at sched_switch+323
> mi_switch(6,3352291680,3352292024,3231754688,4066712232,...) at mi_switch+442
> maybe_preempt(3352291680) at maybe_preempt+196
> sched_add(3352291680,4,3358828496,3352291680,4066712268,...) at sched_add+600
> setrunqueue(3358828840,3499884544,3231754688,4066712304,3228131795,...)
> at setrunqueue+99
> _end() at 3358828496
>
> db> bt 20
> Tracing pid 20 tid 100013 td 0xc7d02000
> sched_switch(3352305664,3352291680,6) at sched_switch+323
> mi_switch(6,3352291680,3352292024,3231754688,3867773608,...) at mi_switch+442
> maybe_preempt(3352291680) at maybe_preempt+196
> sched_add(3352291680,4,3352305664,3352291680,3867773644,...) at sched_add+600
> setrunqueue(3867773668,3227962772,3352306008,3867773680,3228131754,...)
> at setrunqueue+99
> _end() at 3352305664
>
>
>
> --
> wbr,
> pluknet
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>

From adamk at voicenet.com Thu Apr 23 10:40:03 2009
From: adamk at voicenet.com (Adam K Kirchhoff)
Date: Thu Apr 23 10:40:10 2009
Subject: kern/131153: [iwi] iwi doesn't see a wireless network
Message-ID: <200904231040.n3NAe2wW056111@freefall.freebsd.org>

The following reply was made to PR kern/131153; it has been noted by GNATS.

From: Adam K Kirchhoff
To: bug-followup@FreeBSD.org, adamk@voicenet.com
Cc:
Subject: Re: kern/131153: [iwi] iwi doesn't see a wireless network
Date: Thu, 23 Apr 2009 06:31:56 -0400

Can anyone at least confirm that the iwi and ath drivers work with
802.11n networks with WPA?
From barney_cordoba at yahoo.com Thu Apr 23 12:21:11 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Thu Apr 23 12:21:18 2009 Subject: Network Card In-Reply-To: <20090423062813.GA8531@server.vk2pj.dyndns.org> Message-ID: <432748.87246.qm@web63905.mail.re1.yahoo.com> --- On Thu, 4/23/09, Peter Jeremy wrote: > From: Peter Jeremy > Subject: Re: Network Card > To: "Barney Cordoba" > Cc: "ovi freebsd" , freebsd-net@freebsd.org > Date: Thursday, April 23, 2009, 2:28 AM > On 2009-Apr-21 14:02:38 -0700, Barney Cordoba > wrote: > >On all of the MBs that I have, the slot NIC appears > before the onboard > >ports in the pciconf -l listing. Its certainly not for > sure. > > As a datapoint to add to the uncertainty, the SunFire V440 > has 4 > motherboard NICs - two come before the PCI slots and two > after (so > adding a PCI-based Cassini nic moves ce2 from the MB to the > plugin). > > Even slot numbering on larger boxes (with multiple physical > PCI buses) > can be non (or counter) intuitive. > > Also note that FreeBSD has also changed its PCI probe order > at least > once in the past (effectively re-numbering devices). > > -- > Peter Jeremy 4 port NICs generally have a bridge chip on it, so they always tend to muck things up. If the nics are PCI-X, you can probably add some trace to the em driver to see what the bus speed is. Onboard NICs are usually 33mhz or 66Mhz (I've never seen on onboard that runs 133Mhz)..however if the add-on card is 33mhz or PCI-E than you won't know. But if the em NIC is running at 133 than its almost definitely (hows that for certainty?) a plug in card. Who would buy a realtek plug in card anyway? Barney From tamaru at myn.rcast.u-tokyo.ac.jp Thu Apr 23 12:51:07 2009 From: tamaru at myn.rcast.u-tokyo.ac.jp (Hiroharu Tamaru) Date: Thu Apr 23 12:51:13 2009 Subject: proxy arp on 8.0-current? Message-ID: Hi, I'm trying to setup an proxy arp on a dual homed host. 
I noticed that I cannot set it up on 8.0-current the same way as I could on 6.2; hence the question: have the setup procedure changed recently (when the arp table was separated from the routing table, maybe?)? My 8.0-current is from 200902 snapshot. Here is a simple demonstration using two single-interfaced hosts: setup: host6.2# ifconfig em0 inet 192.168.0.1/24 host6.2# arp -s 192.168.0.11 auto pub host6.2# arp -an | grep permanent ? (192.168.0.1) at 00:16:d3:xx:xx:xx on em0 permanent [ethernet] ? (192.168.0.11) at 00:16:d3:xx:xx:xx on em0 permanent published [ethernet] host6.2# tcpdump -np arp host8.0# ifconfig em0 inet 192.168.0.2/24 host8.0# arp -s 192.168.0.12 auto pub host8.0# arp -an | grep permanent ? (192.168.0.2) at 00:0c:29:xx:xx:xx on em0 permanent [ethernet] ? (192.168.0.12) at 00:0c:29:xx:xx:xx on em0 permanent published [ethernet] host8.0# tcpdump -np arp then, I do: host6.2# arp -d 192.168.0.2; ping -c 1 192.168.0.2 host6.2# arp -d 192.168.0.12; ping -c 1 192.168.0.12 host8.0# arp -d 192.168.0.1; ping -c 1 192.168.0.1 host8.0# arp -d 192.168.0.11; ping -c 1 192.168.0.11 I am not caring about 'arp -d' errors (cannot locate) nor ping not responding (for proxied addresses). I just cared about arp requests and replys for now. The output of tcpdump on both sides are like this: arp who-has 192.168.0.2 tell 192.168.0.1 arp reply 192.168.0.2 is-at 00:0c:29:xx:xx:xx arp who-has 192.168.0.12 tell 192.168.0.1 ---->no reply arp who-has 192.168.0.1 tell 192.168.0.2 arp reply 192.168.0.1 is-at 00:16:d3:xx:xx:xx arp who-has 192.168.0.11 tell 192.168.0.2 arp reply 192.168.0.11 is-at 00:16:d3:xx:xx:xx As you can see from the above, 'arp -s 192.168.0.12 auto pub' on 8.0-current host seems not to be producing proxy arp's. What am I missing? Thanks. 
--
Hiroharu Tamaru

From ddg at yan.com.br Thu Apr 23 13:42:17 2009
From: ddg at yan.com.br (Daniel Dias Gonçalves)
Date: Thu Apr 23 13:42:25 2009
Subject: IPFW MAX RULES COUNT PERFORMANCE
Message-ID: <49F06985.1000303@yan.com.br>

Hi,

My system is a FreeBSD 7.1R.
When I add rules IPFW COUNT to 254 IPS from my network, one of my
interfaces increases the latency, causing large delays in the network,
when I delete COUNT rules, everything returns to normal, which can be ?

My script:

ipcount.php
-- CUT --
$c=0;
$a=50100;
for($x=0;$x<=0;$x++) {
        for($y=1;$y<=254;$y++) {
                $ip = "192.168.$x.$y";
                system("/sbin/ipfw -q add $a count { tcp or udp } from any to $ip/32");
                system("/sbin/ipfw -q add $a count { tcp or udp } from $ip/32 to any");
                #system("/sbin/ipfw delete $a");
                $c++;
                $a++;
        }
}
echo "\n\nTotal: $c\n";
?>
-- CUT --

net.inet.ip.fw.dyn_keepalive: 1
net.inet.ip.fw.dyn_short_lifetime: 5
net.inet.ip.fw.dyn_udp_lifetime: 10
net.inet.ip.fw.dyn_rst_lifetime: 1
net.inet.ip.fw.dyn_fin_lifetime: 1
net.inet.ip.fw.dyn_syn_lifetime: 20
net.inet.ip.fw.dyn_ack_lifetime: 300
net.inet.ip.fw.static_count: 262
net.inet.ip.fw.dyn_max: 10000
net.inet.ip.fw.dyn_count: 0
net.inet.ip.fw.curr_dyn_buckets: 256
net.inet.ip.fw.dyn_buckets: 10000
net.inet.ip.fw.default_rule: 65535
net.inet.ip.fw.verbose_limit: 0
net.inet.ip.fw.verbose: 1
net.inet.ip.fw.debug: 0
net.inet.ip.fw.one_pass: 1
net.inet.ip.fw.autoinc_step: 100
net.inet.ip.fw.enable: 1
net.link.ether.ipfw: 1
net.link.bridge.ipfw: 0
net.link.bridge.ipfw_arp: 0

Thanks,

Daniel

From to.my.trociny at gmail.com Thu Apr 23 14:20:04 2009
From: to.my.trociny at gmail.com (Mikolaj Golub)
Date: Thu Apr 23 14:20:11 2009
Subject: kern/133902: [tun] Killing tun0 iface ssh tunnel causes Panic String: page fault
Message-ID: <200904231420.n3NEK43v056212@freefall.freebsd.org>

The following reply was made to PR kern/133902; it has been noted by GNATS.

From: Mikolaj Golub
To: bug-followup@FreeBSD.org
Cc: freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org, lsantagostini@gmail.com
Subject: Re: kern/133902: [tun] Killing tun0 iface ssh tunnel causes Panic String: page fault
Date: Thu, 23 Apr 2009 17:14:02 +0300

I have asked Leonardo to provide more info and backtrace.
So here is backtrace: cobra4# kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.0 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd". Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x65656c7b fault code = supervisor write, page not present instruction pointer = 0x20:0xc0786e00 stack pointer = 0x28:0xe958fac4 frame pointer = 0x28:0xe958fac4 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 66873 (ssh) trap number = 12 panic: page fault cpuid = 1 Uptime: 54d11h21m54s Physical memory: 2023 MB Dumping 277 MB: 262 246 230 214 198 182 166 150 134 118 102 86 70 54 38 22 6 #0 doadump () at pcpu.h:195 195 pcpu.h: No such file or directory. in pcpu.h (kgdb) backtrace #0 doadump () at pcpu.h:195 #1 0xc0754457 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409 #2 0xc0754719 in panic (fmt=Variable "fmt" is not available. 
) at /usr/src/sys/kern/kern_shutdown.c:563
#3 0xc0a4905c in trap_fatal (frame=0xe958fa84, eva=1701145723) at /usr/src/sys/i386/i386/trap.c:899
#4 0xc0a492e0 in trap_pfault (frame=0xe958fa84, usermode=0, eva=1701145723) at /usr/src/sys/i386/i386/trap.c:812
#5 0xc0a49c8c in trap (frame=0xe958fa84) at /usr/src/sys/i386/i386/trap.c:490
#6 0xc0a2fc0b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7 0xc0786e00 in clear_selinfo_list (td=0xca3fc840) at /usr/src/sys/kern/sys_generic.c:1065
#8 0xc0788efc in kern_select (td=0xca3fc840, nd=8, fd_in=0x284010b8, fd_ou=0x284010bc, fd_ex=0x0, tvp=0x0) at /usr/src/sys/kern/sys_generic.c:794
#9 0xc07890de in select (td=0xca3fc840, uap=0xe958fcfc) at /usr/src/sys/kern/sys_generic.c:663
#10 0xc0a49635 in syscall (frame=0xe958fd38) at /usr/src/sys/i386/i386/trap.c:1035
#11 0xc0a2fc70 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:196
#12 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

The system panics on ifconfig tun0 destroy

This issue is related to kern/116837. Leonardo, you can try the patch
attached to that pr.

-- Mikolaj Golub

From wmoran at collaborativefusion.com Thu Apr 23 15:11:35 2009
From: wmoran at collaborativefusion.com (Bill Moran)
Date: Thu Apr 23 15:11:42 2009
Subject: IPFW MAX RULES COUNT PERFORMANCE
In-Reply-To: <49F06985.1000303@yan.com.br>
References: <49F06985.1000303@yan.com.br>
Message-ID: <20090423110124.85788142.wmoran@collaborativefusion.com>

In response to Daniel Dias Gonçalves :
>
> My system is a FreeBSD 7.1R.
> When I add rules IPFW COUNT to 254 IPS from my network, one of my
> interfaces increases the latency, causing large delays in the network,
> when I delete COUNT rules, everything returns to normal, which can be ?

Not sure what you mean by the "which can be" part of the question. But
the answer, is "of course latency increases". Did you expect that this
kind of traffic tracking to be free?
It's not on any operating system or other networking device in existence. It takes CPU cycles and memory to do the tracking, and flipping bits in memory takes time. Therefore, your latency will increase when you add 512 counters to your rules. It's the overhead associated with such logging. Of course, you don't mention _how_much_ latency increases. I can only assume that it's to a degree that you find unacceptable. You also don't mention what hardware you're doing this on, but I would expect that on sufficiently beefy hardware the added latency is low enough not to be a problem. However, without those details, I expect that the following answer is the best you're going to get: If you need to so such logging and the latency increase is unacceptable, then get faster hardware to do it on or concoct some method to do it out of band so that the latency doesn't slow down the connections. > My script: > > ipcount.php > -- CUT -- > $c=0; > $a=50100; > for($x=0;$x<=0;$x++) { > for($y=1;$y<=254;$y++) { > $ip = "192.168.$x.$y"; > system("/sbin/ipfw -q add $a count { tcp or udp } from > any to $ip/32"); > system("/sbin/ipfw -q add $a count { tcp or udp } from > $ip/32 to any"); > #system("/sbin/ipfw delete $a"); > $c++; > $a++; > } > } > echo "\n\nTotal: $c\n"; > ?> > -- CUT -- > > net.inet.ip.fw.dyn_keepalive: 1 > net.inet.ip.fw.dyn_short_lifetime: 5 > net.inet.ip.fw.dyn_udp_lifetime: 10 > net.inet.ip.fw.dyn_rst_lifetime: 1 > net.inet.ip.fw.dyn_fin_lifetime: 1 > net.inet.ip.fw.dyn_syn_lifetime: 20 > net.inet.ip.fw.dyn_ack_lifetime: 300 > net.inet.ip.fw.static_count: 262 > net.inet.ip.fw.dyn_max: 10000 > net.inet.ip.fw.dyn_count: 0 > net.inet.ip.fw.curr_dyn_buckets: 256 > net.inet.ip.fw.dyn_buckets: 10000 > net.inet.ip.fw.default_rule: 65535 > net.inet.ip.fw.verbose_limit: 0 > net.inet.ip.fw.verbose: 1 > net.inet.ip.fw.debug: 0 > net.inet.ip.fw.one_pass: 1 > net.inet.ip.fw.autoinc_step: 100 > net.inet.ip.fw.enable: 1 > net.link.ether.ipfw: 1 > net.link.bridge.ipfw: 0 > 
net.link.bridge.ipfw_arp: 0
>
> Thanks,
>
> Daniel

-- Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

From julian at elischer.org Thu Apr 23 17:39:43 2009
From: julian at elischer.org (Julian Elischer)
Date: Thu Apr 23 17:39:56 2009
Subject: IPFW MAX RULES COUNT PERFORMANCE
In-Reply-To: <49F06985.1000303@yan.com.br>
References: <49F06985.1000303@yan.com.br>
Message-ID: <49F0A7DD.30206@elischer.org>

Daniel Dias Gonçalves wrote:
> Hi,
>
> My system is a FreeBSD 7.1R.
> When I add rules IPFW COUNT to 254 IPS from my network, one of my
> interfaces increases the latency, causing large delays in the network,
> when I delete COUNT rules, everything returns to normal, which can be ?
> > My script:

of course adding 512 rules, *all of which have to be evaluated*, will
add latency.

you have several ways to improve this situation.

1/ use a different tool. By using the netgraph netflow module you can
get accounting information that may be more useful and less impactful.

2/ you could make your rules smarter.. use skipto rules to make the
average packet traverse fewer rules..

off the top of my head.. (not tested..)

Assuming you have machines 10.0.0.1-10.0.0.254....

the rules below have an average packet traversing 19 rules and not 256
for the SYN packet and 2 rules for others.. you may not be able to do
the keep state trick if you use state for other stuff but in that case
worst case will still be 19 rules.

2 check-state
5 skipto 10000 ip from not 10.0.0.0/24 to any
10 skipto 2020 ip from not 10.0.0.0/25 to any    # 0-127
20 skipto 1030 ip from not 10.0.0.0/26 to any    # 0-63
30 skipto 240 ip from not 10.0.0.0/27 to any     # 0-31
40 skipto 100 ip from not 10.0.0.0/28 to any     # 0-15
[16 count rules for 0-15] keep-state
80 skipto 10000 ip from any to any
100 [16 count rules for 16-31] keep-state
140 skipto 10000 ip from any to any
240 skipto 300 ip from not 10.0.0.32/28 to any
[16 count rules for 32-47] keep-state
280 skipto 10000 ip from any to any
300 [16 count rules for 48-63] keep-state
340 skipto 10000 ip from any to any
1030 skipto 1240 ip from not 10.0.0.64/27 to any
1040 skipto 1100 ip from not 10.0.0.64/28 to any
[16 count rules for 64-79] keep-state
1080 skipto 10000 ip from any to any
1100 [16 count rules for 80-95] keep-state
1140 skipto 10000 ip from any to any
1240 skipto 1300 ip from not 10.0.0.96/28 to any
[16 count rules for 96-111] keep-state
1280 skipto 10000 ip from any to any
1300 [16 count rules for 112-127] keep-state
1340 skipto 10000 ip from any to any
2020 skipto 3030 ip from not 10.0.0.128/26 to any
2030 skipto 2240 ip from not 10.0.0.128/27 to any
2040 skipto 2100 ip from not 10.0.0.128/28 to any
[16 count rules for 128-143] keep-state
2080 skipto 10000 ip from any to any
2100 [16 count rules for 144-159] keep-state
2140 skipto 10000 ip from any to any
2240 skipto 2300 ip from not 10.0.0.160/28 to any
[16 count rules for 160-175] keep-state
2280 skipto 10000 ip from any to any
2300 [16 count rules for 176-191] keep-state
2340 skipto 10000 ip from any to any
3030 skipto 3240 ip from not 10.0.0.192/27 to any
3040 skipto 3100 ip from not 10.0.0.192/28 to any
[16 count rules for 192-207] keep-state
3080 skipto 10000 ip from any to any
3100 [16 count rules for 208-223] keep-state
3140 skipto 10000 ip from any to any
3240 skipto 3300 ip from not 10.0.0.224/28 to any
[16 count rules for 224-239] keep-state
3280 skipto 10000 ip from any to any
3300 [16 count rules for 240-255] keep-state
3340 skipto 10000 ip from any to any
10000 #other stuff

in fact you could improve it further with:

1/ either going down to a netmask of 29 (8 rules per set)
or
2/ instead of having count rules make them skipto so you would have:

3300 skipto 10000 ip from 10.0.0.240 to any
3301 skipto 10000 ip from 10.0.0.241 to any
3302 skipto 10000 ip from 10.0.0.242 to any
3303 skipto 10000 ip from 10.0.0.243 to any
3304 skipto 10000 ip from 10.0.0.244 to any
3305 skipto 10000 ip from 10.0.0.245 to any
3306 skipto 10000 ip from 10.0.0.246 to any
3307 skipto 10000 ip from 10.0.0.247 to any
3308 skipto 10000 ip from 10.0.0.248 to any
3309 skipto 10000 ip from 10.0.0.249 to any
3310 skipto 10000 ip from 10.0.0.250 to any
3311 skipto 10000 ip from 10.0.0.251 to any
3312 skipto 10000 ip from 10.0.0.252 to any
3313 skipto 10000 ip from 10.0.0.253 to any
3314 skipto 10000 ip from 10.0.0.254 to any
3315 skipto 10000 ip from 10.0.0.255 to any

thus on average, a packet would traverse half the rules (8).

3/ both the above so on average they would traverse 4 rules plus one
extra skipto.

you should be able to do the above in a script. I'd love to see it..

(you can also do skipto tablearg in -current (maybe 7.2 too) which may also be good..
(or not))

julian

From emaste at freebsd.org Thu Apr 23 19:12:18 2009
From: emaste at freebsd.org (Ed Maste)
Date: Thu Apr 23 19:12:26 2009
Subject: Interrupts + Polling mode (similar to Linux's NAPI)
In-Reply-To:
References: <20090327071742.GA87385@onelab2.iet.unipi.it>
Message-ID: <20090423190408.GA65895@jem.dhs.org>

On Fri, Mar 27, 2009 at 11:05:00AM +0000, Andrew Brampton wrote:

> 2009/3/27 Luigi Rizzo :
> > The load of polling is pretty low (within 1% or so) even with
> > polling. The advantage of having interrupts is faster response
> > to incoming traffic, not CPU load.
>
> oh, I was under the impression that polling spun in a tight loop, thus
> using 100% of the processor. After a quick test I see this is not the
> case. I assume it will get to 100% CPU load if I saturate my network.

Yes, polling has a limit on the maximum CPU time it will use, and also
will use less than the limit if there is no traffic.

There are a number of sysctls under kern.polling that control its
behaviour:

* kern.polling.user_frac: Desired user fraction of cpu time

This attempts to reserve at least a specified percentage of available
CPU time for user processes; polling tries to limit its percentage use
to 100 less this value.

* kern.polling.burst: Current polling burst size
* kern.polling.burst_max: Max Polling burst size
* kern.polling.each_burst: Max size of each burst

These three control the number of packets that polling processes per
call / tick. Packets are processed in batches of each_burst, up to
burst packets total per tick. The value of burst is capped at
burst_max.

In order to keep the user_frac CPU percentage available for non-polling,
a feedback loop is used that controls the value of burst. Each time a
batch of packets is processed, burst is incremented or decremented by 1,
depending on how much CPU time polling actually used. In addition, if
polling extends beyond the next tick it's scaled back to 7/8ths of the
current value.
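
[Archive note] The feedback loop described above can be sketched outside the kernel as a small model. This is only an illustration under stated assumptions, not the kernel's polling code: the function names `next_burst` and `split_into_batches`, the `kern_load_pct` argument (percentage of the tick that polling used), and the lower bound of 1 on burst are all made up for this sketch; only user_frac, burst, burst_max, each_burst, the +/-1 adjustment, and the 7/8 scale-back come from the description above.

```python
def next_burst(burst, kern_load_pct, user_frac, burst_max, overran_tick=False):
    """Return the burst size for the next tick (illustrative model only).

    kern_load_pct is an assumed input: the percentage of the last tick
    that polling actually consumed.
    """
    if overran_tick:
        # Polling ran past the next tick: scale back to about 7/8 of the
        # current value (integer arithmetic, never below 1 in this sketch).
        return max(1, burst - burst // 8)
    if kern_load_pct > 100 - user_frac:
        # Polling used more than "100 less user_frac" percent: back off by 1.
        return max(1, burst - 1)
    # Polling stayed within its share: grow by 1, capped at burst_max.
    return min(burst_max, burst + 1)


def split_into_batches(burst, each_burst):
    """Split one tick's packet budget into batches of at most each_burst."""
    if each_burst < 1:
        raise ValueError("each_burst must be at least 1")
    batches = []
    remaining = burst
    while remaining > 0:
        batch = min(each_burst, remaining)
        batches.append(batch)
        remaining -= batch
    return batches


if __name__ == "__main__":
    burst = 5
    # Assumed load samples, in percent of a tick, chosen only for the demo.
    for load in (10, 10, 80, 10):
        burst = next_burst(burst, load, user_frac=50, burst_max=150)
        print(burst, split_into_batches(burst, each_burst=5))
```

The model also makes the downside mentioned later in the message easy to see: the number of batches per tick depends only on burst and each_burst, not on whether packets are actually waiting.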
Polling was originally implemented as a livelock-avoidance technique for the single-core case -- the primary goal is to guarantee the availability of CPU time specified in user_frac. The current algorithm does not behave that well if user_frac is set low. Setting it low is reasonable if the workload is largely in-kernel (i.e., bridging or routing), or when running SMP. Another downside of the current implementation is that interfaces will be polled multiple times per tick (burst / each_burst times), even if there are no packets to process. At work we've developed a replacement polling algorithm that keeps track of the actual amount of time spent per packet, and uses that as the feedback to control the number of packets in each batch. This work requires a change to the polling KPI: the polling handlers have to return the count of packets actually handled. My hope is to get the KPI change committed in time for 8.0, even if we don't switch the algorithm yet. Attilio (on CC:) and I will make the patch set for the KPI change available shortly for comment. -Ed From to.my.trociny at gmail.com Thu Apr 23 19:30:04 2009 From: to.my.trociny at gmail.com (Mikolaj Golub) Date: Thu Apr 23 19:30:11 2009 Subject: kern/132734: panic in net/if_mib.c Message-ID: <200904231930.n3NJU3me076397@freefall.freebsd.org> The following reply was made to PR kern/132734; it has been noted by GNATS. From: Mikolaj Golub To: Alexey Illarionov Cc: bug-followup@FreeBSD.org, Robert Watson Subject: Re: kern/132734: panic in net/if_mib.c Date: Thu, 23 Apr 2009 22:29:36 +0300 SVN rev 191435 on 2009-04-23 18:23:08Z by rwatson Merge r191434 from stable/7 to releng/7.2: In sysctl_ifdata(), query the ifnet pointer using the index only once, rather than querying it, validating it, and then re-querying it without validating it. This may avoid a NULL pointer dereference and resulting kernel page fault if an interface is being deleted while bsnmp or other tools are querying data on the interface. 
The full fix, to properly refcount the interface for the duration of the sysctl, is in 8.x, but is considered too high-risk for 7.2, so instead will appear in 7.3 (if all goes well). So, Alexey, can you try upgrading to the latest stable/7 or releng/7.2 or apply attached patch to see if this tweak at least eliminates the instant panic? --- if_mib.c (revision 191424) +++ if_mib.c (working copy) @@ -82,11 +82,9 @@ return EINVAL; if (name[0] <= 0 || name[0] > if_index || - ifnet_byindex(name[0]) == NULL) + (ifp = ifnet_byindex(name[0])) == NULL) return ENOENT; - ifp = ifnet_byindex(name[0]); - switch(name[1]) { default: return ENOENT; From rwatson at FreeBSD.org Thu Apr 23 19:40:05 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Thu Apr 23 19:40:53 2009 Subject: kern/132734: panic in net/if_mib.c Message-ID: <200904231940.n3NJe2XU091054@freefall.freebsd.org> The following reply was made to PR kern/132734; it has been noted by GNATS. From: Robert Watson To: Mikolaj Golub Cc: Alexey Illarionov , bug-followup@FreeBSD.org Subject: Re: kern/132734: panic in net/if_mib.c Date: Thu, 23 Apr 2009 20:33:43 +0100 (BST) On Thu, 23 Apr 2009, Mikolaj Golub wrote: > SVN rev 191435 on 2009-04-23 18:23:08Z by rwatson > > Merge r191434 from stable/7 to releng/7.2: > > In sysctl_ifdata(), query the ifnet pointer using the index only > once, rather than querying it, validating it, and then re-querying > it without validating it. This may avoid a NULL pointer > dereference and resulting kernel page fault if an interface is > being deleted while bsnmp or other tools are querying data on the > interface. > > The full fix, to properly refcount the interface for the duration > of the sysctl, is in 8.x, but is considered too high-risk for > 7.2, so instead will appear in 7.3 (if all goes well). > > So, Alexey, can you try upgrading to the latest stable/7 or releng/7.2 or > apply attached patch to see if this tweak at least eliminates the instant > panic? 
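
[Archive note] The single-lookup change described in the quoted commit message can be modelled with a toy table. This is a hedged sketch, not the kernel code: `InterfaceTable`, `ifdata_double_lookup`, `ifdata_single_lookup`, and the `between_lookups` hook are hypothetical names invented here to make the race window deterministic. The model only shows why re-querying without re-validating can hand back nothing; it does not capture the interface being freed while still in use, which is what the refcount fix mentioned above addresses.

```python
ENOENT = 2  # illustrative return code only


class InterfaceTable:
    """Toy stand-in for an index -> interface mapping (hypothetical)."""

    def __init__(self):
        self._by_index = {}

    def add(self, index, name):
        self._by_index[index] = name

    def remove(self, index):
        self._by_index.pop(index, None)

    def byindex(self, index):
        # Returns None when the index is unknown, like a NULL pointer.
        return self._by_index.get(index)


def ifdata_double_lookup(table, index, between_lookups=lambda: None):
    """Old pattern: validate one lookup, then use a second, unchecked one."""
    if table.byindex(index) is None:
        return ENOENT
    between_lookups()           # window in which the interface can vanish
    ifp = table.byindex(index)  # second lookup, result never validated
    return ifp.upper()          # fails if ifp is None


def ifdata_single_lookup(table, index, between_lookups=lambda: None):
    """Patched pattern: look up once and validate that same result."""
    ifp = table.byindex(index)
    if ifp is None:
        return ENOENT
    between_lookups()           # table may change; ifp was already checked
    return ifp.upper()


if __name__ == "__main__":
    table = InterfaceTable()
    table.add(1, "em0")
    print(ifdata_single_lookup(table, 1, lambda: table.remove(1)))
```

Passing a `between_lookups` callback that removes the entry makes the old pattern raise an error in this model, while the patched pattern still returns a value for the entry it validated.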
I'll try to get the refcount fix into 7-STABLE in about two weeks, assuming no
hitches in the 8.x version. This will close a number of related race
conditions, which we've had occasional reports of (and others that we
haven't).

Robert N M Watson
Computer Laboratory
University of Cambridge

From attilio at freebsd.org Thu Apr 23 20:45:49 2009
From: attilio at freebsd.org (Attilio Rao)
Date: Thu Apr 23 20:45:56 2009
Subject: Interrupts + Polling mode (similar to Linux's NAPI)
In-Reply-To: <20090423190408.GA65895@jem.dhs.org>
References: <20090327071742.GA87385@onelab2.iet.unipi.it> <20090423190408.GA65895@jem.dhs.org>
Message-ID: <3bbf2fe10904231313o858b9e9v733564ee4f3d7d40@mail.gmail.com>

2009/4/23 Ed Maste :
> On Fri, Mar 27, 2009 at 11:05:00AM +0000, Andrew Brampton wrote:
>
>> 2009/3/27 Luigi Rizzo :
>> > The load of polling is pretty low (within 1% or so) even with
>> > polling. The advantage of having interrupts is faster response
>> > to incoming traffic, not CPU load.
>>
>> oh, I was under the impression that polling spun in a tight loop, thus
>> using 100% of the processor. After a quick test I see this is not the
>> case. I assume it will get to 100% CPU load if I saturate my network.
>
> Yes, polling has a limit on the maximum CPU time it will use, and also
> will use less than the limit if there is no traffic.
>
> There are a number of sysctls under kern.polling that control its
> behaviour:
>
> * kern.polling.user_frac: Desired user fraction of cpu time
>
> This attempts to reserve at least a specified percentage of available
> CPU time for user processes; polling tries to limit its percentage use
> to 100 less this value.
>
> * kern.polling.burst: Current polling burst size
> * kern.polling.burst_max: Max Polling burst size
> * kern.polling.each_burst: Max size of each burst
>
> These three control the number of packets that polling processes per
> call / tick.  Packets are processed in batches of each_burst, up to
> burst packets total per tick.  The value of burst is capped at
> burst_max.
>
> In order to keep the user_frac CPU percentage available for non-polling,
> a feedback loop is used that controls the value of burst.  Each time a
> batch of packets is processed, burst is incremented or decremented by 1,
> depending on how much CPU time polling actually used.  In addition, if
> polling extends beyond the next tick it's scaled back to 7/8ths of the
> current value.
>
> Polling was originally implemented as a livelock-avoidance technique
> for the single-core case -- the primary goal is to guarantee the
> availability of CPU time specified in user_frac.  The current algorithm
> does not behave that well if user_frac is set low.  Setting it low is
> reasonable if the workload is largely in-kernel (i.e., bridging or
> routing), or when running SMP.
>
> Another downside of the current implementation is that interfaces will
> be polled multiple times per tick (burst / each_burst times), even if
> there are no packets to process.
>
> At work we've developed a replacement polling algorithm that keeps track
> of the actual amount of time spent per packet, and uses that as the
> feedback to control the number of packets in each batch.
>
> This work requires a change to the polling KPI: the polling handlers
> have to return the count of packets actually handled.  My hope is to get
> the KPI change committed in time for 8.0, even if we don't switch the
> algorithm yet.  Attilio (on CC:) and I will make the patch set for the
> KPI change available shortly for comment.

This is the KPI breakage patch:
http://people.freebsd.org/~attilio/Sandvine/polling/polling_kpi.diff

Thanks,
Attilio

-- Peace can only be achieved by understanding - A. Einstein

From simon at FreeBSD.org Thu Apr 23 21:14:30 2009
From: simon at FreeBSD.org (Simon L.
Nielsen) Date: Thu Apr 23 21:14:37 2009 Subject: CARP as a module; followup thoughts In-Reply-To: <2aada3410904212216o128e1fdfx8c299b3531adc694@mail.gmail.com> References: <2aada3410904212216o128e1fdfx8c299b3531adc694@mail.gmail.com> Message-ID: <20090423211428.GB1104@arthur.nitro.dk> On 2009.04.21 23:16:58 -0600, Will Andrews wrote: > Hello, > > I've written a patch (against 8.0-CURRENT as of r191369) which makes > it possible to build, load, run, & unload CARP as a module, using the > GENERIC kernel. It can be obtained from: > > http://firepipe.net/patches/carp-as-module-20090421.diff I don't have any comments on the specific patch, but with my FreeBSD end-user hat, being able to have CARP in GENERIC would be really great. This would allow me to update my systems which use CARP with freebsd-update without manually compiling a kernel. So if the patch doesn't penalize the non-CARP case much, I think it would be great to have this functionality for now, even if it's not the way to go in the long run. -- Simon L. Nielsen
From nslay at comcast.net Fri Apr 24 01:29:12 2009 From: nslay at comcast.net (Nathan Lay) Date: Fri Apr 24 01:29:18 2009 Subject: IPv6 Ideas Message-ID: <49F1128A.3080501@comcast.net> I started playing with IPv6 on my home network with the intent to transition over. While many things work quite well, IPv6 technology in general still seems to have some rough edges. In terms of FreeBSD support, rtadvd and rtsol do not yet support (easily? -O option in rtadvd/rtsold) RFC5006 (Router Advertisements Option for DNS Configuration) which make it inconvenient to use mobile devices (like laptops) on an IPv6 network. I haven't had much luck with net/radvd. Is this something that could be improved? I'd be willing to implement this support, but I have very little time to spare (writing thesis). To be backward compatible with IPv4, I had a look at faith and faithd and while these tools are ingenius, I don't think they are good enough for transitioning to IPv6. I imagine it is possible to write an IPv6->IPv4 NAT daemon that uses faith to capture and restructure IPv6/IPv4 packets. Though, it really seems like this is the firewall's job A pf rule like: nat on $inet4_if inet to any from $lan_if:network6 -> ($inet4_if) would be extremely convenient. I'm aware pf doesn't support the token :network6 ...
its just a wishful example. The IPv6 mapped IPv4 addresses would be the standard ::ffff:0:0/96 prefix. I imagine that this is very difficult to implement but I don't see why it wouldn't be possible. If a firewall supported this kind of NAT, a home network could easily deploy IPv6 and be backward compatible. Well, not quite, I guess BIND would have to serve IPv6 mapped IPv4 addresses to IPv6 queries. Oh yeah, one annoyance on 7-STABLE, it seems like pf is started before IPv6 rc.conf options are processed (including IPv6 address assignment) breaking inet6 rules that involve $if:network. Comments? Other than that, this has been one hell of a fun experience. Best Regards, Nathan Lay From pluknet at gmail.com Fri Apr 24 04:04:53 2009 From: pluknet at gmail.com (pluknet) Date: Fri Apr 24 04:04:59 2009 Subject: panic in soabort In-Reply-To: References: Message-ID: 2009/4/23 Robert Watson : > On Thu, 23 Apr 2009, pluknet wrote: > >> Please, give me comment on this. The panic is on 6.2-REL. Is it known to >> be fixed in the latter releases? > > It may well be -- there have been quite significant architectural > improvements to socket life cycle (etc) between 6.2 and 7.x releases, which > may well close the race causing this panic. However, we'll probably need to > learn a bit more in order to decide for sure. Could you convert the > trapping instruction pointer to file+offset in the source code? Looks I've lost the corresponding kernel.debug. Anyway I have such bt the first time. -- wbr, pluknet From steve at ibctech.ca Fri Apr 24 13:57:13 2009 From: steve at ibctech.ca (Steve Bertrand) Date: Fri Apr 24 13:57:25 2009 Subject: IPv6 Ideas In-Reply-To: <49F1128A.3080501@comcast.net> References: <49F1128A.3080501@comcast.net> Message-ID: <49F1C53F.5040202@ibctech.ca> Nathan Lay wrote: > I started playing with IPv6 on my home network with the intent to > transition over. While many things work quite well, IPv6 technology in > general still seems to have some rough edges. 
I disagree. I believe the "rough edges" do not belong to IPv6, the "rough edges" are the applications that are not compatible, the network devices that are not compatible, and the ISP's who have the mindset that they will never need IPv6, and refuse to look at it. > To be backward compatible with IPv4, I had a look at faith and faithd > and while these tools are ingenius, I don't think they are good enough > for transitioning to IPv6. I imagine it is possible to write an > IPv6->IPv4 NAT daemon that uses faith to capture and restructure > IPv6/IPv4 packets. Though, it really seems like this is the firewall's job > > A pf rule like: > > nat on $inet4_if inet to any from $lan_if:network6 -> ($inet4_if) > > would be extremely convenient. I'm aware pf doesn't support the token > :network6 ... its just a wishful example. The IPv6 mapped IPv4 > addresses would be the standard ::ffff:0:0/96 prefix. I imagine that > this is very difficult to implement but I don't see why it wouldn't be > possible. If a firewall supported this kind of NAT, a home network > could easily deploy IPv6 and be backward compatible. Well, not quite, I > guess BIND would have to serve IPv6 mapped IPv4 addresses to IPv6 queries. My hope is that I never have to deal with anything where IPv6 and NAT are in the same sentence :) > Comments? - ask your ISP about their IPv6 deployment plans, and how soon they can provide it to you - get a tunnel set up to a tunnel broker (sixxs.net, he.net etc) - ask your ISP how soon they can provide it to you - play, play play > Other than that, this has been one hell of a fun experience. A tad bit different, huh ;) Steve From bzeeb-lists at lists.zabbadoz.net Fri Apr 24 14:30:08 2009 From: bzeeb-lists at lists.zabbadoz.net (Bjoern A. 
Zeeb) Date: Fri Apr 24 14:30:35 2009 Subject: IPv6 Ideas In-Reply-To: <49F1128A.3080501@comcast.net> References: <49F1128A.3080501@comcast.net> Message-ID: <20090424142149.A15361@maildrop.int.zabbadoz.net> On Thu, 23 Apr 2009, Nathan Lay wrote: > In terms of FreeBSD support, rtadvd and rtsol do not yet support (easily? -O > option in rtadvd/rtsold) RFC5006 (Router Advertisements Option for DNS > Configuration) which make it inconvenient to use mobile devices (like We'll happily accept a patch;-) > Oh yeah, one annoyance on 7-STABLE, it seems like pf is started before IPv6 > rc.conf options are processed (including IPv6 address assignment) breaking See http://www.freebsd.org/cgi/query-pr.cgi?pr=conf/130381 -- Bjoern A. Zeeb The greatest risk is not taking one. From barney_cordoba at yahoo.com Fri Apr 24 15:03:54 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Fri Apr 24 15:04:01 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <20090423190408.GA65895@jem.dhs.org> Message-ID: <41289.49790.qm@web63906.mail.re1.yahoo.com> --- On Thu, 4/23/09, Ed Maste wrote: > From: Ed Maste > Subject: Re: Interrupts + Polling mode (similar to Linux's NAPI) > To: "Andrew Brampton" > Cc: attilio@freebsd.org, freebsd-net@freebsd.org, "Luigi Rizzo" > Date: Thursday, April 23, 2009, 3:04 PM > On Fri, Mar 27, 2009 at 11:05:00AM +0000, Andrew Brampton > wrote: > > > 2009/3/27 Luigi Rizzo : > > > The load of polling is pretty low (within 1% or > so) even with > > > polling. The advantage of having interrupts is > faster response > > > to incoming traffic, not CPU load. > > > > oh, I was under the impression that polling spun in a > tight loop, thus > > using 100% of the processor. After a quick test I see > this is not the > > case. I assume it will get to 100% CPU load if I > saturate my network. > > Yes, polling has a limit on the maximum CPU time it will > use, and also > will use less than the limit if there is no traffic. 
> > There are a number of sysctls under kern.polling that > control its > behaviour: > > * kern.polling.user_frac: Desired user fraction of cpu time > > This attempts to reserve at least a specified percentage of > available > CPU time for user processes; polling tries to limit its > percentage use > to 100 less this value. > > * kern.polling.burst: Current polling burst size > * kern.polling.burst_max: Max Polling burst size > * kern.polling.each_burst: Max size of each burst > > These three control the number of packets that polling > processes per > call / tick. Packets are processed in batches of > each_burst, up to > burst packets total per tick. The value of burst is capped > at > busrt_max. > > In order to keep the user_frac CPU percentage available for > non-polling, > a feedback loop is used that controls the value of burst. > Each time a > bach of packets is processed, burst is incremented or > decremented by 1, > depending on how much CPU time polling actually used. In > addition, if > polling extends beyond the next tick it's scaled back > to 7/8ths of the > current value. > > Polling was originally implemented as a livelock-avoidance > technique > for the single-core case -- the primary goal is to > guarantee the > availability of CPU time specified in user_frac. The > current algorithm > does not behave that well if user_frac is set low. Setting > it low is > reasonable if the workload is largely in-kernel (i.e., > bridging or > routing), or when running SMP. > > Another downside of the current implementation is that > interfaces will > be polled multiple times per tick (burst / each_burst > times), even if > there are no packets to process. > > At work we've developed a replacement polling algorithm > that keeps track > of the actual amount of time spent per packet, and uses > that as the > feedback to control the number of packets in each batch. 
> > This work requires a change to the polling KPI: the polling > handlers > have to return the count of packets actually handled. My > hope is to get > the KPI change committed in time for 8.0, even if we > don't switch the > algorithm yet. Attilio (on CC:) and I will make the patch > set for the > KPI change available shortly for comment. > > > -Ed Actually, the "advantage of using interrupts" is to have a per NIC control without having all of the extra code to implement polling. Using variable interrupt moderation is much more desirable and efficient, so polling is only useful for legacy NICs with no controls on interrupt delays. Polling requires that you adulterate the system with the polling function, that you call routines when there is nothing to process, and uses many cpu cycles doing unnecessary things. What happens when you have 4 NICs with different levels of traffic? You'd be better off launching a thread and polling yourself than having a system-wide function with generalized settings. Barney From ddg at yan.com.br Fri Apr 24 15:35:05 2009 From: ddg at yan.com.br (Daniel Dias Gonçalves) Date: Fri Apr 24 15:35:12 2009 Subject: IPFW MAX RULES COUNT PERFORMANCE In-Reply-To: <49F0A7DD.30206@elischer.org> References: <49F06985.1000303@yan.com.br> <49F0A7DD.30206@elischer.org> Message-ID: <49F1DBAE.1080205@yan.com.br> Very good thinking, congratulations, but my need is different. The goal is a captive portal in which each authentication dynamically creates a rule to ALLOW or COUNT the authenticated IP. What I'm testing is the maximum number of rules supported, and therefore the maximum number of simultaneous users. Do you understand? Thanks, Daniel Julian Elischer wrote: > Daniel Dias Gonçalves wrote: >> Hi, >> >> My system is FreeBSD 7.1R. >> When I add IPFW COUNT rules for 254 IPs from my network, latency on one of my >> interfaces increases, causing large delays in the >> network; when I delete the COUNT rules, everything returns to normal. >> What could be the cause?
>> >> My script: > > of course adding 512 rules, *all of which have to be evaluated* will > add latency. > > you have several ways to improve this situation. > > 1/ use a different tool. > By using the netgraph netflow module you can get > accounting information that may be more useful and less impactful. > > 2/ you could make your rules smarter.. > > use skipto rules to make the average packet traverse fewer rules.. > > off the top of my head.. (not tested..) > > Assuming you have machines 10.0.0.1-10.0.0.254.... > the rules below have an average packet traversing 19 rules and not 256 > for the SYN packet and 2 rules for others.. > you may not be able to do the keep state trick if you use state for > other stuff but in that case worst case will still be 19 rules. > > 2 check-state > 5 skipto 10000 ip from not 10.0.0.0/24 to any > 10 skipto 2020 ip from not 10.0.0.0/25 to any # 0-127 > 20 skipto 1030 ip from not 10.0.0.0/26 to any # 0-63 > 30 skipto 240 ip from not 10.0.0.0/27 to any # 0-31 > 40 skipto 100 ip from not 10.0.0.0/28 to any # 0-15 > [16 count rules for 0-15] > 80 skipto 10000 ip from any to any > 100 [16 count rules for 16-31] keep-state > 140 skipto 10000 ip from any to any > 240 skipto 300 ip from not 10.0.0.32/28 to any > [16 rules for 32-47] keep-state > 280 skipto 10000 ip from any to any > 300 [16 count rules for 48-63] keep-state > 340 skipto 10000 ip from any to any > 1030 skipto 1240 ip from not 10.0.0.64/27 to any > 1040 skipto 1100 ip from not 10.0.0.64/28 to any > [16 count rules for 64-79] keep-state > 1080 skipto 10000 ip from any to any > 1100 [16 rules for 80-95] keep-state > 1140 skipto 10000 ip from any to any > 1240 skipto 1300 ip from not 10.0.0.96/28 to any > [16 count rules for 96-111] keep-state > 1280 skipto 10000 ip from any to any > 1300 [16 rules for 112-127] keep-state > 1340 skipto 10000 ip from any to any > 2020 skipto 3030 ip from not 10.0.0.128/26 to any > 2030 skipto 2240 ip from not 10.0.0.128/27 to any > 2040 skipto 2100 ip from not 10.0.0.128/28 to any > [16 count rules for
128-143] keep-state > 2080 skipto 10000 ip from any to any > 2100 [16 rules for 144-159] keep-state > 2140 skipto 10000 ip from any to any > 2240 skipto 2300 ip from not 10.0.0.160/28 to any > [16 count rules for 160-175] keep-state > 2280 skipto 10000 ip from any to any > 2300 [16 count rules for 176-191] keep-state > 2340 skipto 10000 ip from any to any > 3030 skipto 3240 ip from not 10.0.0.192/27 to any > 3040 skipto 3100 ip from not 10.0.0.192/28 to any > [16 count rules for 192-207] keep-state > 3080 skipto 10000 ip from any to any > 3100 [16 rules for 208-223] keep-state > 3140 skipto 10000 ip from any to any > 3240 skipto 3300 ip from not 10.0.0.224/28 to any > [16 count rules for 224-239] keep-state > 3280 skipto 10000 ip from any to any > 3300 [16 count rules for 240-255] keep-state > 3340 skipto 10000 ip from any to any > > 10000 #other stuff > > in fact you could improve it further with: > 1/ either going down to a netmask of 29 (8 rules per set) > or > 2/ instead of having count rules make them skipto > so you would have: > 3300 skipto 10000 ip from 10.0.0.240 to any > 3301 skipto 10000 ip from 10.0.0.241 to any > 3302 skipto 10000 ip from 10.0.0.242 to any > 3303 skipto 10000 ip from 10.0.0.243 to any > 3304 skipto 10000 ip from 10.0.0.244 to any > 3305 skipto 10000 ip from 10.0.0.245 to any > 3306 skipto 10000 ip from 10.0.0.246 to any > 3307 skipto 10000 ip from 10.0.0.247 to any > 3308 skipto 10000 ip from 10.0.0.248 to any > 3309 skipto 10000 ip from 10.0.0.249 to any > 3310 skipto 10000 ip from 10.0.0.250 to any > 3311 skipto 10000 ip from 10.0.0.251 to any > 3312 skipto 10000 ip from 10.0.0.252 to any > 3313 skipto 10000 ip from 10.0.0.253 to any > 3314 skipto 10000 ip from 10.0.0.254 to any > 3315 skipto 10000 ip from 10.0.0.255 to any > > thus on average, a packet would traverse half the rules (8). > > 3/ both the above so on average they would traverse 4 rules plus one > extra skipto. > > you should be able to do the above in a script.
> I'd love to see it.. > > (you can also do skipto tablearg in -current (maybe 7.2 too) > which may also be good.. (or not)) > > > julian > > > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > From bob at veznat.com Fri Apr 24 15:38:14 2009 From: bob at veznat.com (Bob Van Zant) Date: Fri Apr 24 15:38:46 2009 Subject: IPv6 Ideas In-Reply-To: <20090424120022.DE524106568C@hub.freebsd.org> Message-ID: I was in a similar position to you not that long ago. I got my LAN all dual stack and was a happy camper. I wanted 100% IPv6 and never to see another RFC 1918 address on my network again. Unfortunately it's just not practical. My ReadyNAS doesn't talk v6. My mac doesn't appear to like v6 for the file transfer protocols it supports. My iPhone doesn't do v6. The applications just aren't ready to live in a v6-only world. I suggest leaning on your vendors whenever you can so that they no longer can say "no one is asking for it." A boring, un-bumped thread asking for IPv6 support in the iPhone: http://discussions.apple.com/thread.jspa?threadID=1960260&tstart=0 Getting back to your question. It is my understanding that this IVI proposal is the most likely to become an officially adopted standard: http://tools.ietf.org/html/draft-xli-behave-ivi-01 That's being done as part of the behave working group: http://www.ietf.org/html.charters/behave-charter.html If anyone were to begin working on something like this this they'd probably want to think about following that proposal. I too am interested in working on this. Just haven't sat down to really start thinking about it yet. 
-Bob Message: 14 Date: Thu, 23 Apr 2009 21:14:50 -0400 From: Nathan Lay Subject: IPv6 Ideas To: freebsd-net@freebsd.org Message-ID: <49F1128A.3080501@comcast.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed I started playing with IPv6 on my home network with the intent to transition over. While many things work quite well, IPv6 technology in general still seems to have some rough edges. In terms of FreeBSD support, rtadvd and rtsol do not yet support (easily? -O option in rtadvd/rtsold) RFC5006 (Router Advertisements Option for DNS Configuration) which make it inconvenient to use mobile devices (like laptops) on an IPv6 network. I haven't had much luck with net/radvd. Is this something that could be improved? I'd be willing to implement this support, but I have very little time to spare (writing thesis). To be backward compatible with IPv4, I had a look at faith and faithd and while these tools are ingenius, I don't think they are good enough for transitioning to IPv6. I imagine it is possible to write an IPv6->IPv4 NAT daemon that uses faith to capture and restructure IPv6/IPv4 packets. Though, it really seems like this is the firewall's job A pf rule like: nat on $inet4_if inet to any from $lan_if:network6 -> ($inet4_if) would be extremely convenient. I'm aware pf doesn't support the token :network6 ... its just a wishful example. The IPv6 mapped IPv4 addresses would be the standard ::ffff:0:0/96 prefix. I imagine that this is very difficult to implement but I don't see why it wouldn't be possible. If a firewall supported this kind of NAT, a home network could easily deploy IPv6 and be backward compatible. Well, not quite, I guess BIND would have to serve IPv6 mapped IPv4 addresses to IPv6 queries. Oh yeah, one annoyance on 7-STABLE, it seems like pf is started before IPv6 rc.conf options are processed (including IPv6 address assignment) breaking inet6 rules that involve $if:network. Comments? 
Other than that, this has been one hell of a fun experience. Best Regards, Nathan Lay From m.jakeman at lancaster.ac.uk Fri Apr 24 16:26:08 2009 From: m.jakeman at lancaster.ac.uk (Matthew Jakeman) Date: Fri Apr 24 16:26:14 2009 Subject: IPv6 Ideas In-Reply-To: <49F1128A.3080501@comcast.net> References: <49F1128A.3080501@comcast.net> Message-ID: <49F1E2E7.5010703@lancaster.ac.uk> Nathan Lay wrote: > I started playing with IPv6 on my home network with the intent to > transition over. While many things work quite well, IPv6 technology > in general still seems to have some rough edges. > > In terms of FreeBSD support, rtadvd and rtsol do not yet support > (easily? -O option in rtadvd/rtsold) RFC5006 (Router Advertisements > Option for DNS Configuration) which make it inconvenient to use mobile > devices (like laptops) on an IPv6 network. I haven't had much luck > with net/radvd. What are your problems with using radvd? I have used it quite a bit on FreeBSD (6.1) without any hassle. It's even written quite nicely in my experience so working on patches for it should be quite do-able if there are features missing. > Is this something that could be improved? I'd be willing to implement > this support, but I have very little time to spare (writing thesis). > > To be backward compatible with IPv4, I had a look at faith and faithd > and while these tools are ingenius, I don't think they are good enough > for transitioning to IPv6. I imagine it is possible to write an > IPv6->IPv4 NAT daemon that uses faith to capture and restructure > IPv6/IPv4 packets. Though, it really seems like this is the > firewall's job > > A pf rule like: > > nat on $inet4_if inet to any from $lan_if:network6 -> ($inet4_if) > > would be extremely convenient. I'm aware pf doesn't support the token > :network6 ... its just a wishful example. The IPv6 mapped IPv4 > addresses would be the standard ::ffff:0:0/96 prefix. 
I imagine that > this is very difficult to implement but I don't see why it wouldn't be > possible. If a firewall supported this kind of NAT, a home network > could easily deploy IPv6 and be backward compatible. Well, not quite, > I guess BIND would have to serve IPv6 mapped IPv4 addresses to IPv6 > queries. > > Oh yeah, one annoyance on 7-STABLE, it seems like pf is started before > IPv6 rc.conf options are processed (including IPv6 address assignment) > breaking inet6 rules that involve $if:network. > > Comments? > > Other than that, this has been one hell of a fun experience. > > Best Regards, > Nathan Lay > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From wmoran at collaborativefusion.com Fri Apr 24 16:42:03 2009 From: wmoran at collaborativefusion.com (Bill Moran) Date: Fri Apr 24 16:42:11 2009 Subject: IPFW MAX RULES COUNT PERFORMANCE In-Reply-To: <49F1DBAE.1080205@yan.com.br> References: <49F06985.1000303@yan.com.br> <49F0A7DD.30206@elischer.org> <49F1DBAE.1080205@yan.com.br> Message-ID: <20090424124202.951a82e1.wmoran@collaborativefusion.com> In response to Daniel Dias Gon?alves : > Very good thinking, congratulations, but my need is another. > The objective is a Captive Porrtal that each authentication is > dynamically created a rule to ALLOW or COUNT IP authenticated, which I'm > testing is what is the maximum capacity of rules supported, therefore > simultaneous user. > > Understand ? If you're only doing allow, then you'd be better off using a table, which has much better performance than a bunch of separate rules. If you're counting packets, I don't know if that approach will work or not. -- Bill Moran Collaborative Fusion Inc. 
http://people.collaborativefusion.com/~wmoran/ wmoran@collaborativefusion.com Phone: 412-422-3463x4023 From bob at veznat.com Fri Apr 24 16:51:27 2009 From: bob at veznat.com (Bob Van Zant) Date: Fri Apr 24 16:51:34 2009 Subject: IPv6 Ideas In-Reply-To: <49F1E2E7.5010703@lancaster.ac.uk> References: <49F1128A.3080501@comcast.net> <49F1E2E7.5010703@lancaster.ac.uk> Message-ID: > What are your problems with using radvd? I have used it quite a bit on > FreeBSD (6.1) without any hassle. It's even written quite nicely in my > experience so working on patches for it should be quite do-able if there > are features missing. He's saying that the router announcements don't contain any DNS server information. There's an extension/option that can be enabled with router advertisements to make it send this information, similar in function to how DHCP sends out extra info like the default gateway, DNS server, NTP server, WINS servers, etc. To my knowledge this wasn't around when the Kame guys were working on this stuff.
I don't think a lot of time has been spent updating the v6 support applications since then and that's why we don't have this feature. This isn't a big deal in dual-stack networks because the clients just do DNS over v4 with whatever the DHCP server gave. In a pure-v6 world... In hindsight it's an obvious oversight that it wasn't included in the first place. -Bob From adrian at freebsd.org Fri Apr 24 17:06:47 2009 From: adrian at freebsd.org (Adrian Chadd) Date: Fri Apr 24 17:06:54 2009 Subject: IPFW MAX RULES COUNT PERFORMANCE In-Reply-To: <49F06985.1000303@yan.com.br> References: <49F06985.1000303@yan.com.br> Message-ID: You'd almost certainly be better off hacking up an extension to ipfw which lets you count a /24 in one rule. As in, the count rule would match on the subnet/netmask, have 256 32 (or 64 bit) integers allocated to record traffic in, and then do an O(1) operation using the last octet of the v4 address to map it into this 256 slot array to update counters for. It'd require a little tool hackery to extend ipfw in userland/kernel space to do it but it would work and be (very almost) just as fast as a single rule. 2c, Adrian 2009/4/23 Daniel Dias Gon?alves : > Hi, > > My system is a FreeBSD 7.1R. > When I add rules IPFW COUNT to 254 IPS from my network, one of my interfaces > increases the latency, causing large delays in the network, when I delete > COUNT rules, everything returns to normal, which can be ? > > My script: > > ipcount.php > -- CUT -- > $c=0; > $a=50100; > for($x=0;$x<=0;$x++) { > ? ? ? for($y=1;$y<=254;$y++) { > ? ? ? ? ? ? ? $ip = "192.168.$x.$y"; > ? ? ? ? ? ? ? system("/sbin/ipfw -q add $a count { tcp or udp } from any to > $ip/32"); > ? ? ? ? ? ? ? system("/sbin/ipfw -q add $a count { tcp or udp } from $ip/32 > to any"); > ? ? ? ? ? ? ? #system("/sbin/ipfw delete $a"); > ? ? ? ? ? ? ? $c++; > ? ? ? ? ? ? ? $a++; > ? ? ? 
} > } > echo "\n\nTotal: $c\n"; > ?> > -- CUT -- > > net.inet.ip.fw.dyn_keepalive: 1 > net.inet.ip.fw.dyn_short_lifetime: 5 > net.inet.ip.fw.dyn_udp_lifetime: 10 > net.inet.ip.fw.dyn_rst_lifetime: 1 > net.inet.ip.fw.dyn_fin_lifetime: 1 > net.inet.ip.fw.dyn_syn_lifetime: 20 > net.inet.ip.fw.dyn_ack_lifetime: 300 > net.inet.ip.fw.static_count: 262 > net.inet.ip.fw.dyn_max: 10000 > net.inet.ip.fw.dyn_count: 0 > net.inet.ip.fw.curr_dyn_buckets: 256 > net.inet.ip.fw.dyn_buckets: 10000 > net.inet.ip.fw.default_rule: 65535 > net.inet.ip.fw.verbose_limit: 0 > net.inet.ip.fw.verbose: 1 > net.inet.ip.fw.debug: 0 > net.inet.ip.fw.one_pass: 1 > net.inet.ip.fw.autoinc_step: 100 > net.inet.ip.fw.enable: 1 > net.link.ether.ipfw: 1 > net.link.bridge.ipfw: 0 > net.link.bridge.ipfw_arp: 0 > > Thanks, > > Daniel > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From sthaug at nethelp.no Fri Apr 24 17:13:23 2009 From: sthaug at nethelp.no (sthaug@nethelp.no) Date: Fri Apr 24 17:13:30 2009 Subject: IPv6 Ideas In-Reply-To: References: <49F1128A.3080501@comcast.net> <49F1E2E7.5010703@lancaster.ac.uk> Message-ID: <20090424.191317.112607500.sthaug@nethelp.no> > To my knowledge this wasn't around when the Kame guys were working on this > stuff. I don't think a lot of time has been spent updating the v6 support > applications since then and that's why we don't have this feature. > > This isn't a big deal in dual-stack networks because the clients just do DNS > over v4 with whatever the DHCP server gave. In a pure-v6 world... In > hindsight it's an obvious oversight that it wasn't included in the first > place. Not necessarily just oversight, also politics. IPv6 RA can't give you DNS info (without the addition you mentioned), and DHCPv6 can't give you a default route. Both pretty bad, actually. 
It looks like we may get a default route option for DHCPv6 now, but there's still a lot of resistance against it. Steinar Haug, Nethelp consulting, sthaug@nethelp.no From Anatoliy.Poloz at onetelecom.od.ua Fri Apr 24 17:15:20 2009 From: Anatoliy.Poloz at onetelecom.od.ua (Anatoliy.Poloz) Date: Fri Apr 24 17:15:32 2009 Subject: IPFW MAX RULES COUNT PERFORMANCE In-Reply-To: <20090424124202.951a82e1.wmoran@collaborativefusion.com> References: <49F06985.1000303@yan.com.br> <49F0A7DD.30206@elischer.org> <49F1DBAE.1080205@yan.com.br> <20090424124202.951a82e1.wmoran@collaborativefusion.com> Message-ID: <49F1EFA4.7000107@onetelecom.od.ua> Bill Moran wrote: > In response to Daniel Dias Gon?alves : > >> Very good thinking, congratulations, but my need is another. >> The objective is a Captive Porrtal that each authentication is >> dynamically created a rule to ALLOW or COUNT IP authenticated, which I'm >> testing is what is the maximum capacity of rules supported, therefore >> simultaneous user. >> >> Understand ? > > If you're only doing allow, then you'd be better off using a table, > which has much better performance than a bunch of separate rules. > > If you're counting packets, I don't know if that approach will work > or not. 
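For the counting case, the per-/24 counter array that Adrian Chadd suggested earlier in this thread might look roughly like the sketch below. It is only a userland illustration of the O(1) last-octet lookup: the structure and function names are hypothetical, nothing here is an existing ipfw interface, and 64-bit counters are an assumption (Adrian mentioned 32 or 64 bit).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-/24 accounting state; not an existing ipfw structure. */
struct subnet24_counters {
	uint32_t net;          /* network address, host byte order, low octet 0 */
	uint64_t packets[256]; /* one slot per possible last octet */
	uint64_t bytes[256];
};

/*
 * Account one packet of `len` bytes for address `addr` (host byte order).
 * Returns 1 if the address is inside the /24 and was counted, 0 otherwise.
 * The slot is selected directly by the last octet, so the cost does not
 * depend on how many hosts are being counted.
 */
static int
subnet24_count(struct subnet24_counters *c, uint32_t addr, uint32_t len)
{
	assert((c->net & 0xffu) == 0);
	if ((addr & 0xffffff00u) != c->net)
		return (0);
	c->packets[addr & 0xffu]++;
	c->bytes[addr & 0xffu] += len;
	return (1);
}
```

One match on the subnet replaces up to 256 per-address count rules, at the cost of a fixed 4 KB of counters per /24.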
>

If you need to count IP traffic for all clients, you can use a simpler
and better-performing rule set, like this one:

LOCAL_NET=192.168.0.0/24
ipfw pipe 100 config bw 0 mask src-ip 0xffffffff
ipfw pipe 200 config bw 0 mask dst-ip 0xffffffff
ipfw add 100 pipe 100 ip from ${LOCAL_NET} to any out
ipfw add 200 pipe 200 ip from any to ${LOCAL_NET} in

From emaste at freebsd.org Fri Apr 24 17:32:18 2009
From: emaste at freebsd.org (Ed Maste)
Date: Fri Apr 24 17:32:25 2009
Subject: Interrupts + Polling mode (similar to Linux's NAPI)
In-Reply-To: <41289.49790.qm@web63906.mail.re1.yahoo.com>
References: <20090423190408.GA65895@jem.dhs.org> <41289.49790.qm@web63906.mail.re1.yahoo.com>
Message-ID: <20090424174208.GA76828@jem.dhs.org>

On Fri, Apr 24, 2009 at 08:03:52AM -0700, Barney Cordoba wrote:

> Actually, the "advantage of using interrupts" is to have a per
> NIC control without having all of the extra code to implement
> polling. Using variable interrupt moderation is much more desirable
> and efficient, so polling is only useful for legacy NICs with no
> controls on interrupt delays.

I'm aware of the advantages and tradeoffs of the various approaches,
and the shortcomings of our current polling infrastructure, probably
the greatest of which is the lack of any parallelism. That said, in
testing some time ago, polling with the modifications I alluded to in
my email, using em(4), gave the highest throughput of all approaches.
(At the expense of latency, as expected.)

In addition, having a standardized polling interface allows for use of
the interface when the system is not fully running -- network
crashdumps, for instance.

We can certainly use improvements in the polling infrastructure though,
at least allowing it to properly take advantage of SMP. The KPI change
proposed here is to allow some of those improvements, should they
happen, to be MFC'd to 8.
-Ed From m.jakeman at lancaster.ac.uk Fri Apr 24 18:04:41 2009 From: m.jakeman at lancaster.ac.uk (Matthew Jakeman) Date: Fri Apr 24 18:04:47 2009 Subject: IPv6 Ideas In-Reply-To: References: <49F1128A.3080501@comcast.net> <49F1E2E7.5010703@lancaster.ac.uk> Message-ID: <49F1FCF1.9000903@lancaster.ac.uk> Bob Van Zant wrote: >> What are your problems with using radvd? I have used it quite a bit on >> FreeBSD (6.1) without any hassle. It's even written quite nicely in my >> experience so working on patches for it should be quite do-able if >> there are features missing. > > He's saying that the router announcements don't contain any DNS server > information. There's an extension/option that can be enabled with router > advertisements to make it send this information, similar in function to > how DHCP sends out extra info like the default gateway, DNS server, NTP > server, WINS servers, etc. Yes but he was also saying that support for the existing RFC to provide the RA option does not seem to be present in any of the tools (from what I gathered). If the RFC is there then surely this functionality could be added. From the original message I thought he was having problems getting radvd to work at all. This could just be me misunderstanding the mail however. From my perspective this is one thing that needs significant improvement for v6. There are many RFC's/ietf drafts out there that add some nifty functionality but support for them in the real world applications seems to be lacking somewhat. > > To my knowledge this wasn't around when the Kame guys were working on > this stuff. I don't think a lot of time has been spent updating the v6 > support applications since then and that's why we don't have this feature. > > This isn't a big deal in dual-stack networks because the clients just do > DNS over v4 with whatever the DHCP server gave. In a pure-v6 world... In > hindsight it's an obvious oversight that it wasn't included in the first > place. 
> > -Bob > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From nslay at comcast.net Fri Apr 24 18:54:15 2009 From: nslay at comcast.net (Nathan Lay) Date: Fri Apr 24 18:54:22 2009 Subject: IPv6 Ideas In-Reply-To: <49F1E2E7.5010703@lancaster.ac.uk> References: <49F1128A.3080501@comcast.net> <49F1E2E7.5010703@lancaster.ac.uk> Message-ID: <49F2077D.2060701@comcast.net> Matthew Jakeman wrote: > Nathan Lay wrote: >> I started playing with IPv6 on my home network with the intent to >> transition over. While many things work quite well, IPv6 technology >> in general still seems to have some rough edges. >> >> In terms of FreeBSD support, rtadvd and rtsol do not yet support >> (easily? -O option in rtadvd/rtsold) RFC5006 (Router Advertisements >> Option for DNS Configuration) which make it inconvenient to use >> mobile devices (like laptops) on an IPv6 network. I haven't had much >> luck with net/radvd. > > What are your problems with using radvd? I have used it quite a bit on > FreeBSD (6.1) without any hassle. It's even written quite nicely in my > experience so working on patches for it should be quite do-able if > there are features missing. > radvd actually does support DNS advertisement (but rtsol doesn't, so it doesn't matter). The problem is, it doesn't work. on 7-STABLE it dumps "Can't assign requested address." I think it has something to do with if_bridge not having a link-local address (apparently the standards are ambiguous about link-local addresses on bridges). To make rtadvd work, I had to tell it to advertise on bridge-member interfaces directly. 
Best Regards, Nathan Lay From nslay at comcast.net Fri Apr 24 19:00:28 2009 From: nslay at comcast.net (Nathan Lay) Date: Fri Apr 24 19:00:35 2009 Subject: IPv6 Ideas In-Reply-To: <49F1C53F.5040202@ibctech.ca> References: <49F1128A.3080501@comcast.net> <49F1C53F.5040202@ibctech.ca> Message-ID: <49F20C08.8070006@comcast.net> Steve Bertrand wrote: > Nathan Lay wrote: > >> I started playing with IPv6 on my home network with the intent to >> transition over. While many things work quite well, IPv6 technology in >> general still seems to have some rough edges. >> > > I disagree. I believe the "rough edges" do not belong to IPv6, the > "rough edges" are the applications that are not compatible, the network > devices that are not compatible, and the ISP's who have the mindset that > they will never need IPv6, and refuse to look at it. > > While the IPv6 implementation is great, it doesn't seem like it can be used for anything serious yet. If there is to be a transition to IPv6, the applications that assist the management of an IPv6 network appear to need improvement (like rtsol/rtadvd, faith/faithd for example). >> To be backward compatible with IPv4, I had a look at faith and faithd >> and while these tools are ingenius, I don't think they are good enough >> for transitioning to IPv6. I imagine it is possible to write an >> IPv6->IPv4 NAT daemon that uses faith to capture and restructure >> IPv6/IPv4 packets. Though, it really seems like this is the firewall's job >> >> A pf rule like: >> >> nat on $inet4_if inet to any from $lan_if:network6 -> ($inet4_if) >> >> would be extremely convenient. I'm aware pf doesn't support the token >> :network6 ... its just a wishful example. The IPv6 mapped IPv4 >> addresses would be the standard ::ffff:0:0/96 prefix. I imagine that >> this is very difficult to implement but I don't see why it wouldn't be >> possible. If a firewall supported this kind of NAT, a home network >> could easily deploy IPv6 and be backward compatible. 
Well, not quite, I >> guess BIND would have to serve IPv6 mapped IPv4 addresses to IPv6 queries. >> > > My hope is that I never have to deal with anything where IPv6 and NAT > are in the same sentence :) > > I don't see how my suggestion is difficult to comprehend from the user's perspective (from the programmer perspective, it seems nightmarish). You have a dual-stack router, the objective is to share connectivity over one or more IPv4 router addresses with IPv6 clients. Conceptually its the same as NAT on IPv4/6-only networks. Since there is a standard IPv6 mapped IPv4 address prefix (::ffff:0:0/96) IPv6 clients need only use this prefix to reach IPv4 networks. The only real issue is that a DNS server needs to serve IPv6 mapped IPv4 addresses to IPv6 queries. The nightmarish aspect is this probably involves more than just address translation, IPv4 and IPv6 are apparently very different. If faith(4) works the way I think it does, a program could be made to accomplish similar, but it really seems like a firewall should do this (to be consistent with what firewalls are already assumed to do). The consequence of such a feature is that IPv6-only home networks (minus the dual-stack router) will not only be seamlessly backward compatible with IPv4 Internet, but it will be slightly better than choosing to use 192.168.x.y, 10.x.y.z or 172.x.y.z since it can reach IPv6 Internet too. This would significantly help IPv6 transition and adoption. 
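The address arithmetic behind the ::ffff:0:0/96 idea can be sketched in a few lines of Python. This is only an illustration of how an IPv4 address embeds in the mapped prefix, under the assumption that a translator would use exactly that prefix; it says nothing about how pf or faith(4) would actually rewrite packets, and the helper names to_mapped and from_mapped are made up for the example.

```python
import ipaddress

# The IPv4-mapped prefix mentioned above.
MAPPED_PREFIX = ipaddress.IPv6Network("::ffff:0:0/96")


def to_mapped(v4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of ::ffff:0:0/96 (hypothetical helper)."""
    v4_addr = ipaddress.IPv4Address(v4)
    return ipaddress.IPv6Address(int(MAPPED_PREFIX.network_address) | int(v4_addr))


def from_mapped(v6: ipaddress.IPv6Address):
    """Return the embedded IPv4 address, or None if v6 is not IPv4-mapped (hypothetical helper)."""
    return v6.ipv4_mapped


if __name__ == "__main__":
    mapped = to_mapped("192.0.2.1")
    print(mapped, "->", from_mapped(mapped))
```

A DNS server answering AAAA queries for IPv4-only hosts would, in this scheme, hand out the result of to_mapped(), and the dual-stack router would apply the inverse before doing ordinary IPv4 NAT.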
Best Regards,
Nathan Lay
From pcc at gmx.net Fri Apr 24 20:29:25 2009
From: pcc at gmx.net (Peter Cornelius)
Date: Fri Apr 24 20:29:32 2009
Subject: VIMAGE (was: Multiple default routes / Force external routing)
In-Reply-To: <49E57076.7040509@elischer.org>
References: <20090413.220932.74699777.sthaug@nethelp.no> <49E41755.8050701@elischer.org> <49E48799.1000300@ibctech.ca> <20090414.212318.41684722.sthaug@nethelp.no> <11167f520904141722r16b537a9o58497c9719fb6fc5@mail.gmail.com> <49E57076.7040509@elischer.org>
Message-ID: <20090424202923.235660@gmx.net>

> > is VIMAGE fully integrated into FreeBSD 8 CURRENT? (I believe this
> > answer is no)
> > also is VIMAGE expected to make it into FreeBSD 8?
>
> not fully but a lot of it is under way

Thanks for the pointer, I currently don't get it [1] to build on RELENG_7
which I naively hoped, so the "lot" is probably not sufficient for me yet. So,
w/o patience for August, I probably need to find another way.

Anyways, thanks for the comments,

Regards,

Peter.

[1] http://imunes.tel.fer.hr/virtnet/vimage_7_20090401.tgz
From zec at icir.org Fri Apr 24 21:08:45 2009
From: zec at icir.org (Marko Zec)
Date: Fri Apr 24 21:08:53 2009
Subject: VIMAGE (was: Multiple default routes / Force external routing)
In-Reply-To: <20090424202923.235660@gmx.net>
References: <20090413.220932.74699777.sthaug@nethelp.no> <49E57076.7040509@elischer.org> <20090424202923.235660@gmx.net>
Message-ID: <200904242249.27640.zec@icir.org>

On Friday 24 April 2009 22:29:23 Peter Cornelius wrote:
> > > is VIMAGE fully integrated into FreeBSD 8 CURRENT? (I believe this
> > > answer is no)
> > > also is VIMAGE expected to make it into FreeBSD 8?
> >
> > not fully but a lot of it is under way
>
> Thanks for the pointer, I currently don't get it [1] to build on RELENG_7
> which I naively hoped, so the "lot" is probably not sufficient for me yet. So,
> w/o patience for August, I probably need to find another way.

Hmm...

tpx32% uname -a
FreeBSD tpx32.icir.org 7.1-STABLE FreeBSD 7.1-STABLE #0: Thu Feb 5 22:36:40 CET 2009 marko@tpx32.icir.org:/u/marko/p4/zec/vimage_7/src/sys/i386/compile/VIMAGE i386
tpx32% pwd
/u/marko/tmp
tpx32% tar -xzf vimage_7_20090401.tgz
tpx32% cd src/sys/i386/conf/
tpx32% config VIMAGE
tpx32% cd ../compile/VIMAGE/
tpx32% make depend; make
tpx32% sudo make install
tpx32% cd ~/tmp/src/usr.sbin/vimage/
tpx32% make clean; make
tpx32% sudo make install

Let me know if that doesn't work...

Good

> Anyways, thanks for the comments,
>
> Regards,
>
> Peter.
>
> [1] http://imunes.tel.fer.hr/virtnet/vimage_7_20090401.tgz
From julian at elischer.org Fri Apr 24 21:58:16 2009
From: julian at elischer.org (Julian Elischer)
Date: Fri Apr 24 21:58:32 2009
Subject: IPFW MAX RULES COUNT PERFORMANCE
In-Reply-To: <49F1DBAE.1080205@yan.com.br>
References: <49F06985.1000303@yan.com.br> <49F0A7DD.30206@elischer.org> <49F1DBAE.1080205@yan.com.br>
Message-ID: <49F235F4.2030202@elischer.org>

Daniel Dias Gonçalves wrote:
> Very good thinking, congratulations, but my need is another.
> The objective is a Captive Portal that each authentication is
> dynamically created a rule to ALLOW or COUNT IP authenticated, which I'm
> testing is what is the maximum capacity of rules supported, therefore
> simultaneous user.
>
> Understand ?
>

I think so. Do not add rules. Have a single rule that looks in a table,
and add entries to the table when needed.

> Thanks,
>
> Daniel
>
> Julian Elischer wrote:
>> Daniel Dias Gonçalves wrote:
>>> Hi,
>>>
>>> My system is a FreeBSD 7.1R.
>>> When I add rules IPFW COUNT to 254 IPS from my network, one of my
>>> interfaces increases the latency, causing large delays in the
>>> network, when I delete COUNT rules, everything returns to normal,
>>> which can be ?
>>>
>>> My script:
>>
>> of course adding 512 rules, *all of which have to be evaluated* will
>> add latency.
>>
>> you have several ways to improve this situation.
>>
>> 1/ use a different tool.
>> By using the netgraph netflow module you can get
>> accounting information that may be more useful and less impactful.
>>
>> 2/ you could make your rules smarter..
>>
>> use skipto rules to make the average packet traverse fewer rules..
>>
>> off the top of my head.. (not tested..)
>>
>> Assuming you have machines 10.0.0.1-10.0.0.254....
>> the rules below have an average packet traversing 19 rules and not 256
>> for the SYN packet and 2 rules for others..
>> you may not be able to do the keep state trick if you use state for
>> other stuff but in that case worst case will still be 19 rules.
>>
>> 2 check-state
>> 5 skipto 10000 ip from not 10.0.0.0/24 to any
>> 10 skipto 2020 ip from not 10.0.0.0/25 to any # 0-127
>> 20 skipto 1030 ip from not 10.0.0.0/26 to any # 0-63
>> 30 skipto 240 ip from not 10.0.0.0/27 to any # 0-31
>> 40 skipto 100 ip from not 10.0.0.0/28 to any # 0-15
>> [16 count rules for 0-15]
>> 80 skipto 10000 ip from any to any
>> 100 [16 count rules for 16-31] keep-state
>> 140 skipto 10000 ip from any to any
>> 240 skipto 300 ip from not 10.0.0.32/28 to any
>> [16 rules for 32-47] keep-state
>> 280 skipto 10000 ip from any to any
>> 300 [16 count rules for 48-63] keep-state
>> 340 skipto 10000 ip from any to any
>> 1030 skipto 1240 ip from not 10.0.0.64/27 to any
>> 1040 skipto 1100 ip from not 10.0.0.64/28 to any
>> [16 count rules for 64-79] keep-state
>> 1080 skipto 10000 ip from any to any
>> 1100 [16 rules for 80-95] keep-state
>> 1140 skipto 10000 ip from any to any
>> 1240 skipto 1300 ip from not 10.0.0.96/28 to any
>> [16 count rules for 96-111] keep-state
>> 1280 skipto 10000 ip from any to any
>> 1300 [16 rules for 112-127] keep-state
>> 1340 skipto 10000 ip from any to any
>> 2020 skipto 3030 ip from not 10.0.0.128/26 to any
>> 2030 skipto 2240 ip from not 10.0.0.128/27 to any
>> 2040 skipto 2100 ip from not 10.0.0.128/28 to any
>> [16 count rules for 128-143] keep-state
>> 2080 skipto 10000 ip from any to any
>> 2100 [16 rules for 144-159] keep-state
>> 2140 skipto 10000 ip from any to any
>> 2240 skipto 2300 ip from not 10.0.0.160/28 to any
>> [16 count rules for 160-175] keep-state
>> 2280 skipto 10000 ip from any to any
>> 2300 [16 count rules for 176-191] keep-state
>> 2340 skipto 10000 ip from any to any
>> 3030 skipto 3240 ip from not 10.0.0.192/27 to any
>> 3040 skipto 3100 ip from not 10.0.0.192/28 to any
>> [16 count rules for 192-207] keep-state
>> 3080 skipto 10000 ip from any to any
>> 3100 [16 rules for 208-223] keep-state
>> 3140 skipto 10000 ip from any to any
>> 3240 skipto 3300 ip from not 10.0.0.224/28 to any
>> [16 count rules for 224-239] keep-state
>> 3280 skipto 10000 ip from any to any
>> 3300 [16 count rules for 240-255] keep-state
>> 3340 skipto 10000 ip from any to any
>>
>> 10000 #other stuff
>>
>> in fact you could improve it further with:
>> 1/ either going down to a netmask of 29 (8 rules per set)
>> or
>> 2/ instead of having count rules make them skipto
>> so you would have:
>> 3300 skipto 10000 ip from 10.0.0.240 to any
>> 3301 skipto 10000 ip from 10.0.0.241 to any
>> 3302 skipto 10000 ip from 10.0.0.242 to any
>> 3303 skipto 10000 ip from 10.0.0.243 to any
>> 3304 skipto 10000 ip from 10.0.0.244 to any
>> 3305 skipto 10000 ip from 10.0.0.245 to any
>> 3306 skipto 10000 ip from 10.0.0.246 to any
>> 3307 skipto 10000 ip from 10.0.0.247 to any
>> 3308 skipto 10000 ip from 10.0.0.248 to any
>> 3309 skipto 10000 ip from 10.0.0.249 to any
>> 3310 skipto 10000 ip from 10.0.0.250 to any
>> 3311 skipto 10000 ip from 10.0.0.251 to any
>> 3312 skipto 10000 ip from 10.0.0.252 to any
>> 3313 skipto 10000 ip from 10.0.0.253 to any
>> 3314 skipto 10000 ip from 10.0.0.254 to any
>> 3315 skipto 10000 ip from 10.0.0.255 to any
>>
>> thus on average, a packet would traverse half the rules (8).
>>
>> 3/ both the above so on average they would traverse 4 rules plus one
>> extra skipto.
>>
>> you should be able to do the above in a script.
>> I'd love to see it..
>>
>> (you can also do skipto tablearg in -current (maybe 7.2 too)
>> which may also be good..
(or not)) >> >> >> julian >> >> >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> From kuan.joe at gmail.com Fri Apr 24 23:08:41 2009 From: kuan.joe at gmail.com (Joseph Kuan) Date: Fri Apr 24 23:08:59 2009 Subject: FreeBSD 7.1 taskq em performance Message-ID: <40bb871a0904241542o3f4d6c6ap62ff71876074bbea@mail.gmail.com> Hi all, I have been hitting some barrier with FreeBSD 7.1 network performance. I have written an application which contains two kernel threads that takes mbufs directly from a network interface and forwards to another network interface. This idea is to simulate different network environment. I have been using FreeBSD 6.4 amd64 and tested with an Ixia box (specialised hardware firing very high packet rate). The PC was a Core2 2.6 GHz with dual ports Intel PCIE Gigabit network card. It can manage up to 1.2 million pps. I have a higher spec PC with FreeBSD 7.1 amd64 and Quadcore 2.3 GHz and PCIE Gigabit network card. The performance can only achieve up to 600k pps. I notice the 'taskq em0' and 'taskq em1' is solid 100% CPU but it is not in FreeBSD 6.4. Any advice? Many thanks in advance Joe From pcc at gmx.net Sat Apr 25 13:30:13 2009 From: pcc at gmx.net (Peter Cornelius) Date: Sat Apr 25 13:30:21 2009 Subject: VIMAGE (was: Multiple default routes / Force external routing) In-Reply-To: <200904242249.27640.zec@icir.org> References: <20090413.220932.74699777.sthaug@nethelp.no> <49E57076.7040509@elischer.org> <20090424202923.235660@gmx.net> <200904242249.27640.zec@icir.org> Message-ID: <20090425133006.311010@gmx.net> Thanks, Marco, > > > > is VIMAGE fully integrated into FreeBSD 8 CURRENT? (I believe this > > > > answer is no) > > > > also is VIMAGE expected to make it into FreeBSD 8? 
> > >
> > > not fully but a lot of it is under way
> >
> > Thanks for the pointer, I currently don't get it [1] to build on RELENG_7
> > which I naively hoped, so the "lot" is probably not sufficient for me yet. So,
> > w/o patience for August, I probably need to find another way.
>
> Hmm...
> tpx32% uname -a
> FreeBSD tpx32.icir.org 7.1-STABLE FreeBSD 7.1-STABLE #0: Thu Feb 5 22:36:40 CET 2009 marko@tpx32.icir.org:/u/marko/p4/zec/vimage_7/src/sys/i386/compile/VIMAGE i386
> tpx32% pwd
> /u/marko/tmp
> tpx32% tar -xzf vimage_7_20090401.tgz
> tpx32% cd src/sys/i386/conf/
> tpx32% config VIMAGE
> tpx32% cd ../compile/VIMAGE/
> tpx32% make depend; make
> tpx32% sudo make install
> tpx32% cd ~/tmp/src/usr.sbin/vimage/
> tpx32% make clean; make
> tpx32% sudo make install
>
> Let me know if that doesn't work...

In fact, it *does* work, thank you. I mistook the tar to be a patch to
copy over an existing tree, which obviously did not work out as I
expected.

So, how's that:

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.2-PRERELEASE #0: Sat Apr 25 08:22:26 UTC 2009
root@netserv.ka.cornelius:/usr/src.VIMAGE_20090401/sys/i386/compile/VNETSERV
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel Pentium III (1004.52-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x686 Stepping = 6
Features=0x383fbff
real memory = 1610596352 (1535 MB)
avail memory = 1568624640 (1495 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID: 3
cpu1 (AP): APIC ID: 0
(...)

So, I suppose it's further reading time and then I'll go and set up a
couple of vimages and see what it does... :)

Thanks again,

Peter.
From barney_cordoba at yahoo.com Sat Apr 25 13:33:16 2009
From: barney_cordoba at yahoo.com (Barney Cordoba)
Date: Sat Apr 25 13:33:22 2009
Subject: Interrupts + Polling mode (similar to Linux's NAPI)
In-Reply-To: <20090424174208.GA76828@jem.dhs.org>
Message-ID: <160513.83122.qm@web63904.mail.re1.yahoo.com>

--- On Fri, 4/24/09, Ed Maste wrote:

> From: Ed Maste
> Subject: Re: Interrupts + Polling mode (similar to Linux's NAPI)
> To: "Barney Cordoba"
> Cc: freebsd-net@freebsd.org
> Date: Friday, April 24, 2009, 1:42 PM
> On Fri, Apr 24, 2009 at 08:03:52AM -0700, Barney Cordoba wrote:
>
> > Actually, the "advantage of using interrupts" is to have a per
> > NIC control without having all of the extra code to implement
> > polling. Using variable interrupt moderation is much more desirable
> > and efficient, so polling is only useful for legacy NICs with no
> > controls on interrupt delays.
>
> I'm aware of the advantages and tradeoffs of the various approaches,
> and the shortcomings of our current polling infrastructure, probably
> the greatest of which is the lack of any parallelism. That said, in
> testing some time ago polling with the modifications I alluded to in
> my email, using em(4), gave the highest throughput of all approaches.
> (At the expense of latency, as expected.)
>
> In addition, having a standardized polling interface allows for use of
> the interface when the system is not fully running -- network
> crashdumps, for instance.
>
> We can certainly use improvements in the polling infrastructure though,
> at least allowing it to properly take advantage of SMP. The KPI change
> proposed here is to allow some of those improvements, should they
> happen, to be MFC'd to 8.
>
> -Ed

"highest performance" measured in what way, and in comparison to what set of moderation controls?
Did your "tests" consider that polling implements a direct method of managing the NIC while interrupts use the highly questionable method of launching tasks? If you didn't test against interrupt methods that manage the NIC in the same way then you've compared apples to oranges. The proper way to do it would be to have your interrupt routine call the polling function (ie using the moderated interrupt to initiate the poll). Barney From rwatson at FreeBSD.org Sat Apr 25 19:10:04 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sat Apr 25 19:10:11 2009 Subject: panic in soabort In-Reply-To: References: Message-ID: On Fri, 24 Apr 2009, pluknet wrote: > 2009/4/23 Robert Watson : >> On Thu, 23 Apr 2009, pluknet wrote: >> >>> Please, give me comment on this. The panic is on 6.2-REL. Is it known to >>> be fixed in the latter releases? >> >> It may well be -- there have been quite significant architectural >> improvements to socket life cycle (etc) between 6.2 and 7.x releases, which >> may well close the race causing this panic. However, we'll probably need >> to learn a bit more in order to decide for sure. Could you convert the >> trapping instruction pointer to file+offset in the source code? > > Looks I've lost the corresponding kernel.debug. Anyway I have such bt the > first time. If you run into this again, let me know. Also, are you using accept filters on the box? Robert N M Watson Computer Laboratory University of Cambridge From pluknet at gmail.com Sat Apr 25 19:51:34 2009 From: pluknet at gmail.com (pluknet) Date: Sat Apr 25 19:51:41 2009 Subject: panic in soabort In-Reply-To: References: Message-ID: 2009/4/25 Robert Watson : > > On Fri, 24 Apr 2009, pluknet wrote: > >> 2009/4/23 Robert Watson : >>> >>> On Thu, 23 Apr 2009, pluknet wrote: >>> >>>> Please, give me comment on this. The panic is on 6.2-REL. Is it known to >>>> be fixed in the latter releases? 
>>> >>> It may well be -- there have been quite significant architectural >>> improvements to socket life cycle (etc) between 6.2 and 7.x releases, which >>> may well close the race causing this panic. However, we'll probably need to >>> learn a bit more in order to decide for sure. Could you convert the >>> trapping instruction pointer to file+offset in the source code? >> >> Looks I've lost the corresponding kernel.debug. Anyway I have such bt the >> first time. > > If you run into this again, let me know. Also, are you using accept filters > on the box? > [started to think about adding dumpdev="AUTO" to rc.conf]. Not on this one now, but we are going to massively switch to accf_http in some near future, if that would give us an appropriate gain. > Robert N M Watson > Computer Laboratory > University of Cambridge > -- wbr, pluknet From anchie at fer.hr Sun Apr 26 12:38:49 2009 From: anchie at fer.hr (Ana Kukec) Date: Sun Apr 26 12:39:01 2009 Subject: GSoC - SeND Message-ID: <49F4517E.5080005@fer.hr> Hi all, I am Ana Kukec, a research assistant and a PhD student at University of Zagreb. I will be working on the IPv6 Secure Neighbor Discovery (SeND - rfc3971, rfc4861) - the implementation of native kernel APIs for FreeBSD, within GSoC, with my mentor Bjoern Zeeb. More informations will be provided on http://wiki.freebsd.org/SOC2009AnaKukec. Regards, Ana From gelraen.ua at gmail.com Sun Apr 26 14:54:07 2009 From: gelraen.ua at gmail.com (Maxim Ignatenko) Date: Sun Apr 26 14:54:13 2009 Subject: [dummynet] Several queues connected to one pipe: "dummynet: OUCH! pipe should have been idle!" Message-ID: Hi, I have next dummynet configuration: ipfw pipe 3 bw 3Mbit/s ipfw queue 10 config pipe 3 weight 10 mask src-ip 0xffffffff ipfw queue 11 config pipe 3 weight 10 mask dst-ip 0xffffffff Two queues for different traffic directions connected to one pipe. After update to r191410 my /var/log/messages filled with: Apr 24 16:33:31 imax kernel: dummynet: OUCH! 
pipe should have been idle! Apr 24 16:33:59 imax last message repeated 8 times Apr 24 16:35:53 imax last message repeated 519 times Apr 24 16:38:55 imax last message repeated 50 times Then I've changed ip_dummynet.c little, to see actual value of pipe->scheduler_heap.elements Here what I've got with one dynamic queue per parent: Apr 25 16:16:34 imax kernel: dummynet: OUCH! pipe should have been idle!SCH len: 2 Apr 25 16:17:05 imax last message repeated 462 times Apr 25 16:18:48 imax last message repeated 1269 times With two queues per parent: Apr 26 16:51:34 imax kernel: dummynet: OUCH! pipe should have been idle!SCH len: 4 Apr 26 16:51:34 imax kernel: dummynet: OUCH! pipe should have been idle!SCH len: 3 Apr 26 16:51:34 imax kernel: dummynet: OUCH! pipe should have been idle!SCH len: 4 Apr 26 16:51:34 imax kernel: dummynet: OUCH! pipe should have been idle!SCH len: 3 Apr 26 16:51:34 imax kernel: dummynet: OUCH! pipe should have been idle!SCH len: 4 Thanks for attention, awaiting your comments and/or suggestions. From rizzo at iet.unipi.it Sun Apr 26 21:52:34 2009 From: rizzo at iet.unipi.it (Luigi Rizzo) Date: Sun Apr 26 21:52:42 2009 Subject: [dummynet] Several queues connected to one pipe: "dummynet: OUCH! pipe should have been idle!" In-Reply-To: References: Message-ID: <20090426215740.GA33188@onelab2.iet.unipi.it> On Sun, Apr 26, 2009 at 05:32:45PM +0300, Maxim Ignatenko wrote: > Hi, > > I have next dummynet configuration: > > ipfw pipe 3 bw 3Mbit/s > ipfw queue 10 config pipe 3 weight 10 mask src-ip 0xffffffff > ipfw queue 11 config pipe 3 weight 10 mask dst-ip 0xffffffff > > Two queues for different traffic directions connected to one pipe. > After update to r191410 my /var/log/messages filled with: > Apr 24 16:33:31 imax kernel: dummynet: OUCH! pipe should have been idle! 
> Apr 24 16:33:59 imax last message repeated 8 times > Apr 24 16:35:53 imax last message repeated 519 times > Apr 24 16:38:55 imax last message repeated 50 times could you give us a few more details on the branch you are using (HEAD or RELENG_7 ?) and what svn revision did you use before the update (which did not show the error) ? thanks luigi From rizzo at iet.unipi.it Sun Apr 26 22:42:21 2009 From: rizzo at iet.unipi.it (Luigi Rizzo) Date: Sun Apr 26 22:42:28 2009 Subject: [dummynet] Several queues connected to one pipe: "dummynet: OUCH! pipe should have been idle!" In-Reply-To: References: <20090426215740.GA33188@onelab2.iet.unipi.it> Message-ID: <20090426224729.GA34800@onelab2.iet.unipi.it> On Mon, Apr 27, 2009 at 01:12:55AM +0300, Maxim Ignatenko wrote: > 2009/4/27 Luigi Rizzo : > > > > could you give us a few more details on the branch you > > are using (HEAD or RELENG_7 ?) and what svn revision did > > you use before the update (which did not show the error) ? > > > > Sorry, I've forgot to mention that... > I use HEAD, and before update it was r191201, if I'm not mistaking. ok there seems to be no change related to dummynet between these two versions so I am not sure where to look. Could you double check what is the last working version ? cheers luigi From gelraen.ua at gmail.com Sun Apr 26 22:45:29 2009 From: gelraen.ua at gmail.com (Maxim Ignatenko) Date: Sun Apr 26 22:45:53 2009 Subject: [dummynet] Several queues connected to one pipe: "dummynet: OUCH! pipe should have been idle!" In-Reply-To: <20090426215740.GA33188@onelab2.iet.unipi.it> References: <20090426215740.GA33188@onelab2.iet.unipi.it> Message-ID: 2009/4/27 Luigi Rizzo : > > could you give us a few more details on the branch you > are using (HEAD or RELENG_7 ?) and what svn revision did > you use before the update (which did not show the error) ? > Sorry, I've forgot to mention that... I use HEAD, and before update it was r191201, if I'm not mistaking. 
Now I'm just removed queues from ruleset, but I may supply any additional information, if needed. Thanks for attention. From gelraen.ua at gmail.com Sun Apr 26 23:46:03 2009 From: gelraen.ua at gmail.com (Maxim Ignatenko) Date: Sun Apr 26 23:46:46 2009 Subject: [dummynet] Several queues connected to one pipe: "dummynet: OUCH! pipe should have been idle!" In-Reply-To: <20090426224729.GA34800@onelab2.iet.unipi.it> References: <20090426215740.GA33188@onelab2.iet.unipi.it> <20090426224729.GA34800@onelab2.iet.unipi.it> Message-ID: 2009/4/27 Luigi Rizzo : > On Mon, Apr 27, 2009 at 01:12:55AM +0300, Maxim Ignatenko wrote: >> 2009/4/27 Luigi Rizzo : >> > >> > could you give us a few more details on the branch you >> > are using (HEAD or RELENG_7 ?) and what svn revision did >> > you use before the update (which did not show the error) ? >> > >> >> Sorry, I've forgot to mention that... >> I use HEAD, and before update it was r191201, if I'm not mistaking. > > ok there seems to be no change related to dummynet between these > two versions so I am not sure where to look. > Could you double check what is the last working version ? > > cheers > luigi > OK, I'll try. From raykinsella78 at gmail.com Mon Apr 27 08:43:45 2009 From: raykinsella78 at gmail.com (Ray Kinsella) Date: Mon Apr 27 08:43:57 2009 Subject: FreeBSD 7.1 taskq em performance In-Reply-To: <40bb871a0904241542o3f4d6c6ap62ff71876074bbea@mail.gmail.com> References: <40bb871a0904241542o3f4d6c6ap62ff71876074bbea@mail.gmail.com> Message-ID: <584ec6bb0904270118v37795ee2k24c9262d4c1abd80@mail.gmail.com> Joseph, I would recommend that you start with PMCStat and figure where the bottleneck is, Given that you have a two threads and your CPU is at 100%, my a apriori guess would be a contention for a spinlock, so I might also try to use LOCK_PROFILING to handle on this. Regards Ray Kinsella On Fri, Apr 24, 2009 at 11:42 PM, Joseph Kuan wrote: > Hi all, > I have been hitting some barrier with FreeBSD 7.1 network performance. 
I > have written an application which contains two kernel threads that takes > mbufs directly from a network interface and forwards to another network > interface. This idea is to simulate different network environment. > > I have been using FreeBSD 6.4 amd64 and tested with an Ixia box > (specialised hardware firing very high packet rate). The PC was a Core2 2.6 > GHz with dual ports Intel PCIE Gigabit network card. It can manage up to > 1.2 > million pps. > > I have a higher spec PC with FreeBSD 7.1 amd64 and Quadcore 2.3 GHz and > PCIE Gigabit network card. The performance can only achieve up to 600k pps. > I notice the 'taskq em0' and 'taskq em1' is solid 100% CPU but it is not in > FreeBSD 6.4. > > Any advice? > > Many thanks in advance > > Joe > _______________________________________________ > freebsd-performance@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-performance > To unsubscribe, send any mail to " > freebsd-performance-unsubscribe@freebsd.org" > From bugmaster at FreeBSD.org Mon Apr 27 11:06:59 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Apr 27 11:08:42 2009 Subject: Current problem reports assigned to freebsd-net@FreeBSD.org Message-ID: <200904271106.n3RB6wte002365@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/133902 net [tun] Killing tun0 iface ssh tunnel causes Panic Strin o kern/133736 net [udp] ip_id not protected ... 
o kern/133613 net [wpi] [panic] kernel panic in wpi(4) o kern/133595 net [panic] Kernel Panic at pcpu.h:195 o kern/133572 net [ppp] [hang] incoming PPTP connection hangs the system o kern/133490 net [bpf] [panic] 'kmem_map too small' panic on Dell r900 o kern/133328 net [bge] [panic] Kernel panics with Windows7 client o kern/133235 net [netinet] [patch] Process SIOCDLIFADDR command incorre o kern/133218 net [carp] [hang] use of carp(4) causes system to freeze o kern/133204 net [msk] msk driver timeouts o kern/133060 net [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs o kern/132991 net [bge] if_bge low performance problem o kern/132984 net [netgraph] swi1: net 100% cpu usage f bin/132911 net ip6fw(8): argument type of fill_icmptypes is wrong and o kern/132889 net [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d o kern/132885 net [wlan] 802.1x broken after SVN rev 189592 o conf/132851 net [fib] [patch] allow to setup fib for service running f o kern/132832 net [netinet] [patch] tcp_output() might generate invalid o bin/132798 net [patch] ggatec(8): ggated/ggatec connection slowdown p o kern/132734 net [ifmib] [panic] panic in net/if_mib.c o kern/132722 net [ath] Wifi ath0 associates fine with AP, but DHCP or I o kern/132715 net [lagg] [panic] Panic when creating vlan's on lagg inte o kern/132705 net [libwrap] [patch] libwrap - infinite loop if hosts.all o kern/132672 net [ndis] [panic] ndis with rt2860.sys causes kernel pani o kern/132669 net [xl] 3c905-TX send DUP! 
in reply on ping (sometime) o kern/132625 net [iwn] iwn drivers don't support setting country o kern/132554 net [ipl] There is no ippool start script/ipfilter magic t o kern/132354 net [nat] Getting some packages to ipnat(8) causes crash o kern/132285 net [carp] alias gives incorrect hash in dmesg o kern/132277 net [crypto] [ipsec] poor performance using cryptodevice f o conf/132179 net [patch] /etc/network.subr: ipv6 rtsol on incorrect wla o kern/132107 net [carp] carp(4) advskew setting ignored when carp IP us o kern/131781 net [ndis] ndis keeps dropping the link o kern/131776 net [wi] driver fails to init o kern/131753 net [altq] [panic] kernel panic in hfsc_dequeue o bin/131567 net [socket] [patch] Update for regression/sockets/unix_cm o kern/131549 net ifconfig(8) can't clear 'monitor' mode on the wireless o kern/131536 net [netinet] [patch] kernel does allow manipulation of su o bin/131365 net route(8): route add changes interpretation of network o kern/131162 net [ath] Atheros driver bugginess and kernel crashes o kern/131153 net [iwi] iwi doesn't see a wireless network f kern/131087 net [ipw] [panic] ipw / iwi - no sent/received packets; iw f kern/130820 net [ndis] wpa_supplicant(8) returns 'no space on device' o kern/130628 net [nfs] NFS / rpc.lockd deadlock on 7.1-R o conf/130555 net [rc.d] [patch] No good way to set ipfilter variables a o kern/130525 net [ndis] [panic] 64 bit ar5008 ndisgen-erated driver cau o kern/130311 net [wlan_xauth] [panic] hostapd restart causing kernel pa o kern/130109 net [ipfw] Can not set fib for packets originated from loc f kern/130059 net [panic] Leaking 50k mbufs/hour o kern/129750 net [ath] Atheros AR5006 exits on "cannot map register spa f kern/129719 net [nfs] [panic] Panic during shutdown, tcp_ctloutput: in o kern/129580 net [ndis] Netgear WG311v3 (ndis) causes kenel trap at boo o kern/129517 net [ipsec] [panic] double fault / stack overflow o kern/129508 net [carp] [panic] Kernel panic with EtherIP (may be relat o 
kern/129352 net [xl] [patch] xl0 watchdog timeout o kern/129219 net [ppp] Kernel panic when using kernel mode ppp o kern/129197 net [panic] 7.0 IP stack related panic o kern/129135 net [vge] vge driver on a VIA mini-ITX not working o bin/128954 net ifconfig(8) deletes valid routes o kern/128917 net [wpi] [panic] if_wpi and wpa+tkip causing kernel panic o kern/128884 net [msk] if_msk page fault while in kernel mode o kern/128840 net [igb] page fault under load with igb/LRO o bin/128602 net [an] wpa_supplicant(8) crashes with an(4) o kern/128598 net [bluetooth] WARNING: attempt to net_add_domain(bluetoo o kern/128448 net [nfs] 6.4-RC1 Boot Fails if NFS Hostname cannot be res o conf/128334 net [request] use wpa_cli in the "WPA DHCP" situation o bin/128295 net [patch] ifconfig(8) does not print TOE4 or TOE6 capabi o bin/128001 net wpa_supplicant(8), wlan(4), and wi(4) issues o kern/127928 net [tcp] [patch] TCP bandwidth gets squeezed every time t o kern/127834 net [ixgbe] [patch] wrong error counting o kern/127826 net [iwi] iwi0 driver has reduced performance and connecti o kern/127815 net [gif] [patch] if_gif does not set vlan attributes from o kern/127724 net [rtalloc] rtfree: 0xc5a8f870 has 1 refs f bin/127719 net [arp] arp: Segmentation fault (core dumped) s kern/127587 net [bge] [request] if_bge(4) doesn't support BCM576X fami f kern/127528 net [icmp]: icmp socket receives icmp replies not owned by o bin/127192 net routed(8) removes the secondary alias IP of interface f kern/127145 net [wi]: prism (wi) driver crash at bigger traffic o kern/127102 net [wpi] Intel 3945ABG low throughput o kern/127057 net [udp] Unable to send UDP packet via IPv6 socket to IPv o kern/127050 net [carp] ipv6 does not work on carp interfaces [regressi o kern/126945 net [carp] CARP interface destruction with ifconfig destro o kern/126924 net [an] [patch] printf -> device_printf and simplify prob o kern/126895 net [patch] [ral] Add antenna selection (marked as TBD) o kern/126874 net 
[vlan]: Zebra problem if ifconfig vlanX destroy o bin/126822 net wpa_supplicant(8): WPA PSK does not work in adhoc mode o kern/126714 net [carp] CARP interface renaming makes system no longer o kern/126695 net rtfree messages and network disruption upon use of if_ o kern/126688 net [ixgbe] [patch] 1.4.7 ixgbe driver panic with 4GB and o kern/126475 net [ath] [panic] ath pcmcia card inevitably panics under o kern/126339 net [ipw] ipw driver drops the connection o kern/126214 net [ath] txpower problem with Atheros wifi card o kern/126075 net [inet] [patch] internet control accesses beyond end of o bin/125922 net [patch] Deadlock in arp(8) o kern/125920 net [arp] Kernel Routing Table loses Ethernet Link status o kern/125845 net [netinet] [patch] tcp_lro_rx() should make use of hard o kern/125816 net [carp] [if_bridge] carp stuck in init when using bridg f kern/125502 net [ral] ifconfig ral0 scan produces no output unless in o kern/125258 net [socket] socket's SO_REUSEADDR option does not work o kern/125239 net [gre] kernel crash when using gre o kern/125195 net [fxp] fxp(4) driver failed to initialize device Intel o kern/124904 net [fxp] EEPROM corruption with Compaq NC3163 NIC o kern/124767 net [iwi] Wireless connection using iwi0 driver (Intel 220 o kern/124753 net [ieee80211] net80211 discards power-save queue packets o kern/124341 net [ral] promiscuous mode for wireless device ral0 looses o kern/124160 net [libc] connect(2) function loops indefinitely o kern/124127 net [msk] watchdog timeout (missed Tx interrupts) -- recov o kern/124021 net [ip6] [panic] page fault in nd6_output() o kern/123968 net [rum] [panic] rum driver causes kernel panic with WPA. 
p kern/123961 net [vr] [patch] Allow vr interface to handle vlans o kern/123892 net [tap] [patch] No buffer space available o kern/123890 net [ppp] [panic] crash & reboot on work with PPP low-spee o kern/123858 net [stf] [patch] stf not usable behind a NAT o kern/123796 net [ipf] FreeBSD 6.1+VPN+ipnat+ipf: port mapping does not o bin/123633 net ifconfig(8) doesn't set inet and ether address in one f kern/123617 net [tcp] breaking connection when client downloading file o kern/123603 net [tcp] tcp_do_segment and Received duplicate SYN o kern/123559 net [iwi] iwi periodically disassociates/associates [regre o bin/123465 net [ip6] route(8): route add -inet6 -interfac o kern/123463 net [ipsec] [panic] repeatable crash related to ipsec-tool o kern/123429 net [nfe] [hang] "ifconfig nfe up" causes a hard system lo o kern/123347 net [bge] bge1: watchdog timeout -- linkstate changed to D o conf/123330 net [nsswitch.conf] Enabling samba wins in nsswitch.conf c o kern/123256 net [wpi] panic: blockable sleep lock with wpi(4) f kern/123172 net [bce] Watchdog timeout problems with if_bce o kern/123160 net [ip] Panic and reboot at sysctl kern.polling.enable=0 o kern/122989 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/122954 net [lagg] IPv6 EUI64 incorrectly chosen for lagg devices o kern/122928 net [em] interface watchdog timeouts and stops receiving p f kern/122839 net [multicast] FreeBSD 7 multicast routing problem p kern/122794 net [lagg] Kernel panic after brings lagg(8) up if NICs ar o kern/122780 net [lagg] tcpdump on lagg interface during high pps wedge o kern/122772 net [em] em0 taskq panic, tcp reassembly bug causes radix o kern/122743 net [mbuf] [panic] vm_page_unwire: invalid wire count: 0 o kern/122697 net [ath] Atheros card is not well supported o kern/122685 net It is not visible passing packets in tcpdump(1) o kern/122551 net [bge] Broadcom 5715S no carrier on HP BL460c blade usi o kern/122319 net [wi] imposible to enable ad-hoc demo mode with Orinoco o 
kern/122290 net [netgraph] [panic] Netgraph related "kmem_map too smal f kern/122252 net [ipmi] [bge] IPMI problem with BCM5704 (does not work o kern/122195 net [ed] Alignment problems in if_ed o kern/122058 net [em] [panic] Panic on em1: taskq o kern/122033 net [ral] [lor] Lock order reversal in ral0 at bootup [reg o kern/121983 net [fxp] fxp0 MBUF and PAE o bin/121895 net [patch] rtsol(8)/rtsold(8) doesn't handle managed netw o kern/121872 net [wpi] driver fails to attach on a fujitsu-siemens s711 s kern/121774 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/121706 net [netinet] [patch] "rtfree: 0xc4383870 has 1 refs" emit o kern/121624 net [em] [regression] Intel em WOL fails after upgrade to o kern/121555 net [panic] Fatal trap 12: current process = 12 (swi1: net o kern/121443 net [gif] [lor] icmp6_input/nd6_lookup o kern/121437 net [vlan] Routing to layer-2 address does not work on VLA o bin/121359 net [patch] ppp(8): fix local stack overflow in ppp o kern/121298 net [em] [panic] Fatal trap 12: page fault while in kernel o kern/121257 net [tcp] TSO + natd -> slow outgoing tcp traffic o kern/121181 net [panic] Fatal trap 3: breakpoint instruction fault whi o kern/121080 net [bge] IPv6 NUD problem on multi address config on bge0 o kern/120966 net [rum] kernel panic with if_rum and WPA encryption p docs/120945 net [patch] ip6(4) man page lacks documentation for TCLASS o kern/120566 net [request]: ifconfig(8) make order of arguments more fr o kern/120304 net [netgraph] [patch] netgraph source assumes 32-bit time o kern/120266 net [udp] [panic] gnugk causes kernel panic when closing U o kern/120232 net [nfe] [patch] Bring in nfe(4) to RELENG_6 o kern/120130 net [carp] [panic] carp causes kernel panics in any conste o bin/120060 net routed(8) deletes link-level routes in the presence of o kern/119945 net [rum] [panic] rum device in hostap mode, cause kernel o kern/119791 net [nfs] UDP NFS mount of aliased IP addresses from a Sol o kern/119617 net [nfs] nfs 
error on wpa network when reseting/shutdown f kern/119516 net [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi o kern/119432 net [arp] route add -host -iface causes arp e o kern/119225 net [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr a bin/118987 net ifconfig(8): ifconfig -l (address_family) does not wor o sparc/118932 net [panic] 7.0-BETA4/sparc-64 kernel panic in rip_output a kern/118879 net [bge] [patch] bge has checksum problems on the 5703 ch o kern/118727 net [netgraph] [patch] [request] add new ng_pf module s kern/117717 net [panic] Kernel panic with Bittorrent client. o kern/117448 net [carp] 6.2 kernel crash [regression] o kern/117423 net [vlan] Duplicate IP on different interfaces o bin/117339 net [patch] route(8): loading routing management commands o kern/117271 net [tap] OpenVPN TAP uses 99% CPU on releng_6 when if_tap o kern/117043 net [em] Intel PWLA8492MT Dual-Port Network adapter EEPROM o kern/116837 net [tun] [panic] [patch] ifconfig tunX destroy: panic o kern/116747 net [ndis] FreeBSD 7.0-CURRENT crash with Dell TrueMobile o bin/116643 net [patch] [request] fstat(1): add INET/INET6 socket deta o kern/116328 net [bge]: Solid hang with bge interface o kern/116185 net [iwi] if_iwi driver leads system to reboot o kern/115239 net [ipnat] panic with 'kmem_map too small' using ipnat o kern/115019 net [netgraph] ng_ether upper hook packet flow stops on ad o kern/115002 net [wi] if_wi timeout. failed allocation (busy bit). 
ifco o kern/114915 net [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f o kern/114839 net [fxp] fxp looses ability to speak with traffic o kern/113895 net [xl] xl0 fails on 6.2-RELEASE but worked fine on 5.5-R o kern/112722 net [ipsec] [udp] IP v4 udp fragmented packet reject o kern/112686 net [patm] patm driver freezes System (FreeBSD 6.2-p4) i38 o kern/112570 net [bge] packet loss with bge driver on BCM5704 chipset o bin/112557 net [patch] ppp(8) lock file should not use symlink name o kern/112528 net [nfs] NFS over TCP under load hangs with "impossible p o kern/111457 net [ral] ral(4) freeze o kern/110140 net [ipw] ipw fails under load o kern/109733 net [bge] bge link state issues [regression] o kern/109470 net [wi] Orinoco Classic Gold PC Card Can't Channel Hop o kern/109308 net [pppd] [panic] Multiple panics kernel ppp suspected [r o kern/109251 net [re] [patch] if_re cardbus card won't attach o bin/108895 net pppd(8): PPPoE dead connections on 6.2 [regression] o kern/108542 net [bce] Huge network latencies with 6.2-RELEASE / STABLE o kern/107944 net [wi] [patch] Forget to unlock mutex-locks o kern/107850 net [bce] bce driver link negotiation is faulty o conf/107035 net [patch] bridge(8): bridge interface given in rc.conf n o kern/106438 net [ipf] ipfilter: keep state does not seem to allow repl o kern/106316 net [dummynet] dummynet with multipass ipfw drops packets o kern/106243 net [nve] double fault panic in if_nve.c on high loads o kern/105945 net Address can disappear from network interface s kern/105943 net Network stack may modify read-only mbuf chain copies o bin/105925 net problems with ifconfig(8) and vlan(4) [regression] o kern/105348 net [ath] ath device stopps TX o kern/104851 net [inet6] [patch] On link routes not configured when usi o kern/104751 net [netgraph] kernel panic, when getting info about my tr o kern/104485 net [bge] Broadcom BCM5704C: Intermittent on newer chip ve o kern/103191 net Unpredictable reboot o kern/103135 net [ipsec] 
ipsec with ipfw divert (not NAT) encodes a pac o conf/102502 net [netgraph] [patch] ifconfig name does't rename netgrap o kern/102035 net [plip] plip networking disables parallel port printing o kern/101948 net [ipf] [panic] Kernel Panic Trap No 12 Page Fault - cau o kern/100709 net [libc] getaddrinfo(3) should return TTL info o kern/100519 net [netisr] suggestion to fix suboptimal network polling o kern/98978 net [ipf] [patch] ipfilter drops OOW packets under 6.1-Rel o kern/98597 net [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu o bin/98218 net wpa_supplicant(8) blacklist not working f bin/97392 net ppp(8) hangs instead terminating o kern/97306 net [netgraph] NG_L2TP locks after connection with failed f kern/96268 net [socket] TCP socket performance drops by 3000% if pack o kern/96030 net [bfe] [patch] Install hangs with Broadcomm 440x NIC in o kern/95519 net [ral] ral0 could not map mbuf o kern/95288 net [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr o kern/95277 net [netinet] [patch] IP Encapsulation mask_match() return o kern/95267 net packet drops periodically appear s kern/94863 net [bge] [patch] hack to get bge(4) working on IBM e326m o kern/94162 net [bge] 6.x kenel stale with bge(4) o kern/93886 net [ath] Atheros/D-Link DWL-G650 long delay to associate f kern/93378 net [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo o kern/93019 net [ppp] ppp and tunX problems: no traffic after restarti o kern/92880 net [libc] [patch] almost rewritten inet_network(3) functi f kern/92552 net A serious bug in most network drivers from 5.X to 6.X s kern/92279 net [dc] Core faults everytime I reboot, possible NIC issu o kern/92090 net [bge] bge0: watchdog timeout -- resetting o kern/91859 net [ndis] if_ndis does not work with Asus WL-138 s kern/91777 net [ipf] [patch] wrong behaviour with skip rule inside an o kern/91594 net [em] FreeBSD > 5.4 w/ACPI fails to detect Intel Pro/10 o kern/91364 net [ral] [wep] WF-511 RT2500 Card PCI and WEP o kern/91311 
net [aue] aue interface hanging o kern/90890 net [vr] Problems with network: vr0: tx shutdown timeout s kern/90086 net [hang] 5.4p8 on supermicro P8SCT hangs during boot if f kern/88082 net [ath] [panic] cts protection for ath0 causes panic o kern/87521 net [ipf] [panic] using ipfilter "auth" keyword leads to k o kern/87506 net [vr] [patch] Fix alias support on vr interfaces o kern/87194 net [fxp] fxp(4) promiscuous mode seems to corrupt hw-csum s kern/86920 net [ndis] ifconfig: SIOCS80211: Invalid argument [regress o kern/86103 net [ipf] Illegal NAT Traversal in IPFilter o kern/85780 net 'panic: bogus refcnt 0' in routing/ipv6 o bin/85445 net ifconfig(8): deprecated keyword to ifconfig inoperativ o kern/85266 net [xe] [patch] xe(4) driver does not recognise Xircom XE o kern/84202 net [ed] [patch] Holtek HT80232 PCI NIC recognition on Fre o bin/82975 net route change does not parse classfull network as given o kern/82497 net [vge] vge(4) on AMD64 only works when loaded late, not f kern/81644 net [vge] vge(4) does not work properly when loaded as a K s kern/81147 net [net] [patch] em0 reinitialization while adding aliase o kern/80853 net [ed] [patch] add support for Compex RL2000/ISA in PnP o kern/79895 net [ipf] 5.4-RC2 breaks ipfilter NAT when using netgraph f kern/79262 net [dc] Adaptec ANA-6922 not fully supported o bin/79228 net [patch] extend arp(8) to be able to create blackhole r o kern/78090 net [ipf] ipf filtering on bridged packets doesn't work if p kern/77913 net [wi] [patch] Add the APDL-325 WLAN pccard to wi(4) o kern/77341 net [ip6] problems with IPV6 implementation o kern/77273 net [ipf] ipfilter breaks ipv6 statefull filtering on 5.3 s kern/77195 net [ipf] [patch] ipfilter ioctl SIOCGNATL does not match o kern/75873 net Usability problem with non-RFC-compliant IP spoof prot s kern/75407 net [an] an(4): no carrier after short time f kern/73538 net [bge] problem with the Broadcom BCM5788 Gigabit Ethern o kern/71469 net default route to internet 
magically disappears with mu o kern/70904 net [ipf] ipfilter ipnat problem with h323 proxy support o kern/64556 net [sis] if_sis short cable fix problems with NetGear FA3 s kern/60293 net [patch] FreeBSD arp poison patch o kern/54383 net [nfs] [patch] NFS root configurations without dynamic f i386/45773 net [bge] Softboot causes autoconf failure on Broadcom 570 s bin/41647 net ifconfig(8) doesn't accept lladdr along with inet addr s kern/39937 net ipstealth issue a kern/38554 net [patch] changing interface ipaddress doesn't seem to w o kern/35442 net [sis] [patch] Problem transmitting runts in if_sis dri o kern/34665 net [ipf] [hang] ipfilter rcmd proxy "hangs". o kern/31647 net [libc] socket calls can return undocumented EINVAL o kern/30186 net [libc] getaddrinfo(3) does not handle incorrect servna o kern/27474 net [ipf] [ppp] Interactive use of user PPP and ipfilter c o conf/23063 net [arp] [patch] for static ARP tables in rc.network 293 problems total. From gelraen.ua at gmail.com Mon Apr 27 13:51:20 2009 From: gelraen.ua at gmail.com (Maxim Ignatenko) Date: Mon Apr 27 13:51:26 2009 Subject: [dummynet] Several queues connected to one pipe: "dummynet: OUCH! pipe should have been idle!" In-Reply-To: <20090426224729.GA34800@onelab2.iet.unipi.it> References: <20090426215740.GA33188@onelab2.iet.unipi.it> <20090426224729.GA34800@onelab2.iet.unipi.it> Message-ID: 2009/4/27 Luigi Rizzo : > > ok there seems to be no change related to dummynet between these > two versions so I am not sure where to look. > Could you double check what is the last working version ? > Yes, r191201 have this problems too (it seems, i didn't updated for a long time). Now I updated to r190864 (just before last change on ip_dummynet.c) - all works fine. Should I now check r190865? Thanks. From rizzo at iet.unipi.it Mon Apr 27 13:58:01 2009 From: rizzo at iet.unipi.it (Luigi Rizzo) Date: Mon Apr 27 13:58:08 2009 Subject: [dummynet] Several queues connected to one pipe: "dummynet: OUCH! 
pipe should have been idle!"
In-Reply-To:
References: <20090426215740.GA33188@onelab2.iet.unipi.it> <20090426224729.GA34800@onelab2.iet.unipi.it>
Message-ID: <20090427140309.GA62749@onelab2.iet.unipi.it>

On Mon, Apr 27, 2009 at 04:51:18PM +0300, Maxim Ignatenko wrote:
> 2009/4/27 Luigi Rizzo :
> >
> > ok there seems to be no change related to dummynet between these
> > two versions so I am not sure where to look.
> > Could you double check what is the last working version ?
> >
> Yes, r191201 have this problems too (it seems, i didn't updated for a
> long time).
> Now I updated to r190864 (just before last change on ip_dummynet.c) -
> all works fine. Should I now check r190865?

yes it would be great if you could identify a specific change that caused the problem.
There is one thing particularly tricky in one of the dummynet changes, because some fields changed between 32/64 bits and signed/unsigned. I may have inadvertently introduced some conversion bug.

thanks a lot for the feedback

cheers
luigi

From gelraen.ua at gmail.com Mon Apr 27 14:44:24 2009
From: gelraen.ua at gmail.com (Maxim Ignatenko)
Date: Mon Apr 27 14:44:31 2009
Subject: [dummynet] Several queues connected to one pipe: "dummynet: OUCH! pipe should have been idle!"
In-Reply-To: <20090427140309.GA62749@onelab2.iet.unipi.it>
References: <20090426215740.GA33188@onelab2.iet.unipi.it> <20090426224729.GA34800@onelab2.iet.unipi.it> <20090427140309.GA62749@onelab2.iet.unipi.it>
Message-ID:

2009/4/27 Luigi Rizzo :
> On Mon, Apr 27, 2009 at 04:51:18PM +0300, Maxim Ignatenko wrote:
>> 2009/4/27 Luigi Rizzo :
>> >
>> > ok there seems to be no change related to dummynet between these
>> > two versions so I am not sure where to look.
>> > Could you double check what is the last working version ?
>> >
>> Yes, r191201 have this problems too (it seems, i didn't updated for a
>> long time).
>> Now I updated to r190864 (just before last change on ip_dummynet.c) -
>> all works fine. Should I now check r190865?
>
> yes it would be great if you could identify a specific change that
> caused the problem.
> There is one thing particularly tricky in one of the dummynet
> changes, because some fields changed between 32/64 bits and
> signed/unsigned. I may have inadvertently introduced some
> conversion bug.
>

On r190865 the problem appeared again.

> thanks a lot for the feedback
>

You're welcome :)

Thanks.

From ddg at yan.com.br Mon Apr 27 16:11:40 2009
From: ddg at yan.com.br (Daniel Dias Gonçalves)
Date: Mon Apr 27 16:11:53 2009
Subject: IPFW MAX RULES COUNT PERFORMANCE
In-Reply-To: <49F235F4.2030202@elischer.org>
References: <49F06985.1000303@yan.com.br> <49F0A7DD.30206@elischer.org> <49F1DBAE.1080205@yan.com.br> <49F235F4.2030202@elischer.org>
Message-ID: <49F5D8A3.3050805@yan.com.br>

Julian,

Could you give an example of rules with tables?

Julian Elischer wrote:
> Daniel Dias Gonçalves wrote:
>> Very good thinking, congratulations, but my need is different.
>> The objective is a captive portal in which each authentication
>> dynamically creates a rule to ALLOW or COUNT the authenticated IP.
>> What I'm testing is the maximum number of rules supported, and
>> therefore the maximum number of simultaneous users.
>>
>> Understand ?
>>
> I think so.
>
>
> do not add rules.
> have a single rule that looks in a table
> and add entries to the table when needed.
>
>> Thanks,
>>
>> Daniel
>>
>> Julian Elischer wrote:
>>> Daniel Dias Gonçalves wrote:
>>>> Hi,
>>>>
>>>> My system is a FreeBSD 7.1R.
>>>> When I add IPFW COUNT rules for 254 IPs on my network, the latency
>>>> on one of my interfaces increases, causing large delays in the
>>>> network; when I delete the COUNT rules, everything returns to normal.
>>>> What could it be?
>>>>
>>>> My script:
>>>
>>> of course adding 512 rules, *all of which have to be evaluated* will
>>> add latency.
>>>
>>> you have several ways to improve this situation.
>>>
>>> 1/ use a different tool.
>>> By using the netgraph netflow module you can get
>>> accounting information that may be more useful and less impactful.
>>>
>>> 2/ you could make your rules smarter..
>>>
>>> use skipto rules to make the average packet traverse fewer rules..
>>>
>>> off the top of my head.. (not tested..)
>>>
>>> Assuming you have machines 10.0.0.1-10.0.0.254....
>>> the rules below have an average packet traversing 19 rules and not
>>> 256 for the SYN packet and 2 rules for others..
>>> you may not be able to do the keep state trick if you use state for
>>> other stuff but in that case worst case will still be 19 rules.
>>>
>>> 2 check-state
>>> 5 skipto 10000 ip from not 10.0.0.0/24 to any
>>> 10 skipto 2020 ip from not 10.0.0.0/25 to any # 0-128
>>> 20 skipto 1030 ip from not 10.0.0.0/26 to any # 0-64
>>> 30 skipto 240 ip from not 10.0.0.0/27 to any # 0-32
>>> 40 skipto 100 ip from not 10.0.0.0/28 to any # 0-16
>>> [16 count rules for 0-15] keep-state
>>> 80 skipto 10000 ip from any to any
>>> 100 [16 count rules for 16-31] keep-state
>>> 140 skipto 10000 ip from any to any
>>> 240 skipto 300 ip from not 10.0.0.32/28 to any
>>> [16 count rules for 32-47] keep-state
>>> 280 skipto 10000 ip from any to any
>>> 300 [16 count rules for 48-63] keep-state
>>> 340 skipto 10000 ip from any to any
>>> 1030 skipto 1240 ip from not 10.0.0.64/27 to any
>>> 1040 skipto 1100 ip from not 10.0.0.64/28 to any
>>> [16 count rules for 64-79] keep-state
>>> 1080 skipto 10000 ip from any to any
>>> 1100 [16 count rules for 80-95] keep-state
>>> 1140 skipto 10000 ip from any to any
>>> 1240 skipto 1300 ip from not 10.0.0.96/28 to any
>>> [16 count rules for 96-111] keep-state
>>> 1280 skipto 10000 ip from any to any
>>> 1300 [16 count rules for 112-127] keep-state
>>> 1340 skipto 10000 ip from any to any
>>> 2020 skipto 3030 ip from not 10.0.0.128/26 to any
>>> 2030 skipto 2240 ip from not 10.0.0.128/27 to any
>>> 2040 skipto 2100 ip from not 10.0.0.128/28 to any
>>> [16 count rules for 128-143] keep-state
>>> 2080 skipto 10000 ip from any to any
>>> 2100 [16 count rules for 144-159] keep-state
>>> 2140 skipto 10000 ip from any to any
>>> 2240 skipto 2300 ip from not 10.0.0.160/28 to any
>>> [16 count rules for 160-175] keep-state
>>> 2280 skipto 10000 ip from any to any
>>> 2300 [16 count rules for 176-191] keep-state
>>> 2340 skipto 10000 ip from any to any
>>> 3030 skipto 3240 ip from not 10.0.0.192/27 to any
>>> 3040 skipto 3100 ip from not 10.0.0.192/28 to any
>>> [16 count rules for 192-207] keep-state
>>> 3080 skipto 10000 ip from any to any
>>> 3100 [16 count rules for 208-223] keep-state
>>> 3140 skipto 10000 ip from any to any
>>> 3240 skipto 3300 ip from not 10.0.0.224/28 to any
>>> [16 count rules for 224-239] keep-state
>>> 3280 skipto 10000 ip from any to any
>>> 3300 [16 count rules for 240-255] keep-state
>>> 3340 skipto 10000 ip from any to any
>>>
>>> 10000 #other stuff
>>>
>>> in fact you could improve it further with:
>>> 1/ either going down to a netmask of 29 (8 rules per set)
>>> or
>>> 2/ instead of having count rules make them skipto
>>> so you would have:
>>> 3300 skipto 10000 ip from 10.0.0.240 to any
>>> 3301 skipto 10000 ip from 10.0.0.241 to any
>>> 3302 skipto 10000 ip from 10.0.0.242 to any
>>> 3303 skipto 10000 ip from 10.0.0.243 to any
>>> 3304 skipto 10000 ip from 10.0.0.244 to any
>>> 3305 skipto 10000 ip from 10.0.0.245 to any
>>> 3306 skipto 10000 ip from 10.0.0.246 to any
>>> 3307 skipto 10000 ip from 10.0.0.247 to any
>>> 3308 skipto 10000 ip from 10.0.0.248 to any
>>> 3309 skipto 10000 ip from 10.0.0.249 to any
>>> 3310 skipto 10000 ip from 10.0.0.250 to any
>>> 3311 skipto 10000 ip from 10.0.0.251 to any
>>> 3312 skipto 10000 ip from 10.0.0.252 to any
>>> 3313 skipto 10000 ip from 10.0.0.253 to any
>>> 3314 skipto 10000 ip from 10.0.0.254 to any
>>> 3315 skipto 10000 ip from 10.0.0.255 to any
>>>
>>> thus on average, a packet would traverse half the rules (8).
>>>
>>> 3/ both the above so on average they would traverse 4 rules plus
>>> one extra skipto.
>>>
>>> you should be able to do the above in a script.
>>> I'd love to see it..
>>>
>>> (you can also do skipto tablearg in -current (maybe 7.2 too)
>>> which may also be good.. (or not))
>>>
>>>
>>> julian
>>>
>>>
>>>
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>>
>>>
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
>

From ddg at yan.com.br Mon Apr 27 16:21:41 2009
From: ddg at yan.com.br (Daniel Dias Gonçalves)
Date: Mon Apr 27 16:21:53 2009
Subject: IPFW MAX RULES COUNT PERFORMANCE
In-Reply-To: <20090425024635.O89549@sola.nimnet.asn.au>
References: <49F06985.1000303@yan.com.br> <49F08071.1070905@ibctech.ca> <49F1D992.9000001@yan.com.br> <20090425024635.O89549@sola.nimnet.asn.au>
Message-ID: <49F5DB12.7080502@yan.com.br>

What could be happening? I have polling enabled on all interfaces; could that have an influence?

em0: port 0x7000-0x703f mem 0xdfa00000-0xdfa1ffff irq 16 at device 8.0 on pci4
em1: port 0x7400-0x743f mem 0xdfa20000-0xdfa3ffff irq 17 at device 8.1 on pci4
em2: port 0x8000-0x803f mem 0xdfb00000-0xdfb1ffff irq 16 at device 8.0 on pci5
em3: port 0x8400-0x843f mem 0xdfb20000-0xdfb3ffff irq 17 at device 8.1 on pci5
em4: port 0x9000-0x903f mem 0xdfc00000-0xdfc1ffff irq 16 at device 8.0 on pci7
em5: port 0x9400-0x943f mem 0xdfc20000-0xdfc3ffff irq 17 at device 8.1 on pci7
em6: port 0xa000-0xa03f mem 0xdfd00000-0xdfd1ffff irq 16 at device 8.0 on pci8
em7: port 0xa400-0xa43f mem 0xdfd20000-0xdfd3ffff irq 17 at device 8.1 on pci8
fxp0: port 0xb000-0xb03f mem 0xdfe20000-0xdfe20fff,0xdfe00000-0xdfe1ffff irq 16 at device 4.0 on pci14

If I disable polling, no network interface works, and the system begins to display "em4 watchdog timeout".
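
Julian's remark earlier in this thread, that the skipto tree could be produced by a script, can be sketched roughly as follows. This is only a minimal sketch in Python and has not been run against a live ipfw: the function name `build_skipto_tree`, its parameters, and the rule numbering scheme are illustrative assumptions, not anything posted to the list. It halves the subnet with `skipto ... ip from not <lower half>` rules until it reaches /28 blocks, emits one `count` rule per source address, and then jumps to a final rule number (10000 here, as in the quoted example). The keep-state trick from the quoted ruleset is left out.

```python
import ipaddress


def build_skipto_tree(network="10.0.0.0/24", leaf_prefix=28,
                      first_rule=100, step=10, end_rule=10000):
    """Return ipfw rule strings (without the leading 'ipfw') that count
    traffic per source address in `network` using a binary skipto tree.

    All names and the numbering scheme are illustrative assumptions.
    """
    net = ipaddress.ip_network(network)
    # Each entry is [label, body, target]: `label` names the rule so that
    # others can jump to it, `target` is the label a skipto rule jumps to.
    entries = []

    def emit(block, label):
        if block.prefixlen >= leaf_prefix:
            # Leaf: one count rule per address, then leave the tree.
            for addr in block:
                entries.append([label, f"count ip from {addr} to any", None])
                label = None
            entries.append([None, "skipto {target} ip from any to any", "END"])
            return
        lower, upper = block.subnets(prefixlen_diff=1)
        # A packet outside the lower half must be in the upper half,
        # because an earlier rule already confined it to `block`.
        entries.append([label,
                        f"skipto {{target}} ip from not {lower} to any",
                        str(upper)])
        emit(lower, None)
        emit(upper, str(upper))

    # Guard: anything not sourced from the network skips the whole tree.
    entries.append([None, f"skipto {{target}} ip from not {net} to any", "END"])
    emit(net, None)

    last_rule = first_rule + (len(entries) - 1) * step
    if last_rule >= end_rule:
        raise ValueError("end_rule must be greater than the last tree rule")

    # Rule numbers follow emission order, so every skipto jumps forward.
    numbers = {"END": end_rule}
    for index, (label, _, _) in enumerate(entries):
        if label is not None:
            numbers[label] = first_rule + index * step

    rules = []
    for index, (_, body, target) in enumerate(entries):
        text = body.format(target=numbers[target]) if target else body
        rules.append(f"add {first_rule + index * step} {text}")
    return rules


if __name__ == "__main__":
    for rule in build_skipto_tree():
        print("ipfw -q " + rule)
```

With the default arguments this produces one guard rule, 15 skipto rules that split the /24, and 16 leaves of 16 count rules each followed by a closing skipto. The output is plain text, so it can be reviewed before anything is loaded into a firewall.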
Ian Smith wrote:
> On Fri, 24 Apr 2009, Daniel Dias Gonçalves wrote:
>
> > The latency in the interface em6 increased an average of 10ms to 200 ~ 300ms
> > Hardware:
> > CPU: Intel(R) Xeon(TM) CPU 3.20GHz (3200.13-MHz 686-class CPU)
> > Logical CPUs per core: 2
> > FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> > cpu0: on acpi0
> > p4tcc0: on cpu0
> > cpu1: on acpi0
> > p4tcc1: on cpu1
> > cpu2: on acpi0
> > p4tcc2: on cpu2
> > cpu3: on acpi0
> > p4tcc3: on cpu3
> > SMP: AP CPU #1 Launched!
> > SMP: AP CPU #3 Launched!
> > SMP: AP CPU #2 Launched!
> >
> > real memory = 9663676416 (9216 MB)
> > avail memory = 8396738560 (8007 MB)
>
> In that case, there really is something else wrong. By my measurements,
> rummaging through most of >1000 rules on a old 166MHz Pentium to get to
> the icmp allow rules (ridiculous, I know) added about 2ms to local net
> pings via that box, ie 1ms each pass for about 900 rules, mostly counts.
>
> cheers, Ian

From ddg at yan.com.br Mon Apr 27 16:24:22 2009
From: ddg at yan.com.br (Daniel Dias Gonçalves)
Date: Mon Apr 27 16:24:30 2009
Subject: IPFW MAX RULES COUNT PERFORMANCE
In-Reply-To:
References: <49F06985.1000303@yan.com.br>
Message-ID: <49F5DBB3.6030500@yan.com.br>

Here is another example.
Suppose I want each captive-portal authentication (username and password)
to set up rules limiting the speed of that user's IP. How should I do that?
Can I create two rules (in and out) per user, each associated with a pipe?
When I simulate this with a script that adds hundreds of rules, the latency
also increases. How can I resolve this?

Adrian Chadd wrote:
> You'd almost certainly be better off hacking up an extension to ipfw
> which lets you count a /24 in one rule.
>
> As in, the count rule would match on the subnet/netmask, have 256 32
> (or 64 bit) integers allocated to record traffic in, and then do an
> O(1) operation using the last octet of the v4 address to map it into
> this 256 slot array to update counters for.
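To make the counter-array idea in the quoted paragraph concrete, here is a small user-space sketch in Python. It is not ipfw code, and no such rule type exists in the thread; a real version would live in the kernel, in C. The class name `Slash24Counter` and its `account` method are hypothetical. The point is only that the per-packet cost is one mask comparison plus one array index, regardless of how many hosts in the /24 are being counted.

```python
import ipaddress


class Slash24Counter:
    """Per-host packet and byte counters for one IPv4 /24, indexed by last octet."""

    def __init__(self, network):
        net = ipaddress.IPv4Network(network)
        if net.prefixlen != 24:
            raise ValueError("this sketch only handles /24 networks")
        self.base = int(net.network_address)
        self.packets = [0] * 256   # one slot per possible last octet
        self.bytes = [0] * 256

    def account(self, addr, length):
        """Count one packet of `length` bytes; return False if `addr` is outside the /24."""
        value = int(ipaddress.IPv4Address(addr))
        if value & 0xFFFFFF00 != self.base:
            return False           # "rule" does not match this packet
        slot = value & 0xFF        # last octet selects the counter: O(1)
        self.packets[slot] += 1
        self.bytes[slot] += length
        return True
```

For example, after `c = Slash24Counter("192.168.0.0/24")` and two calls to `c.account("192.168.0.5", ...)`, the totals for that host are in `c.packets[5]` and `c.bytes[5]`.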
> > It'd require a little tool hackery to extend ipfw in userland/kernel > space to do it but it would work and be (very almost) just as fast as > a single rule. > > 2c, > > > > Adrian > > 2009/4/23 Daniel Dias Gon?alves : > >> Hi, >> >> My system is a FreeBSD 7.1R. >> When I add rules IPFW COUNT to 254 IPS from my network, one of my interfaces >> increases the latency, causing large delays in the network, when I delete >> COUNT rules, everything returns to normal, which can be ? >> >> My script: >> >> ipcount.php >> -- CUT -- >> > $c=0; >> $a=50100; >> for($x=0;$x<=0;$x++) { >> for($y=1;$y<=254;$y++) { >> $ip = "192.168.$x.$y"; >> system("/sbin/ipfw -q add $a count { tcp or udp } from any to >> $ip/32"); >> system("/sbin/ipfw -q add $a count { tcp or udp } from $ip/32 >> to any"); >> #system("/sbin/ipfw delete $a"); >> $c++; >> $a++; >> } >> } >> echo "\n\nTotal: $c\n"; >> ?> >> -- CUT -- >> >> net.inet.ip.fw.dyn_keepalive: 1 >> net.inet.ip.fw.dyn_short_lifetime: 5 >> net.inet.ip.fw.dyn_udp_lifetime: 10 >> net.inet.ip.fw.dyn_rst_lifetime: 1 >> net.inet.ip.fw.dyn_fin_lifetime: 1 >> net.inet.ip.fw.dyn_syn_lifetime: 20 >> net.inet.ip.fw.dyn_ack_lifetime: 300 >> net.inet.ip.fw.static_count: 262 >> net.inet.ip.fw.dyn_max: 10000 >> net.inet.ip.fw.dyn_count: 0 >> net.inet.ip.fw.curr_dyn_buckets: 256 >> net.inet.ip.fw.dyn_buckets: 10000 >> net.inet.ip.fw.default_rule: 65535 >> net.inet.ip.fw.verbose_limit: 0 >> net.inet.ip.fw.verbose: 1 >> net.inet.ip.fw.debug: 0 >> net.inet.ip.fw.one_pass: 1 >> net.inet.ip.fw.autoinc_step: 100 >> net.inet.ip.fw.enable: 1 >> net.link.ether.ipfw: 1 >> net.link.bridge.ipfw: 0 >> net.link.bridge.ipfw_arp: 0 >> >> Thanks, >> >> Daniel >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> > _______________________________________________ > freebsd-net@freebsd.org mailing 
list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > > From gelraen.ua at gmail.com Mon Apr 27 19:14:58 2009 From: gelraen.ua at gmail.com (Maxim Ignatenko) Date: Mon Apr 27 19:15:28 2009 Subject: [dummynet] Several queues connected to one pipe: "dummynet: OUCH! pipe should have been idle!" In-Reply-To: <20090427190854.GA36459@lath.rinet.ru> References: <20090426215740.GA33188@onelab2.iet.unipi.it> <20090426224729.GA34800@onelab2.iet.unipi.it> <20090427140309.GA62749@onelab2.iet.unipi.it> <20090427190854.GA36459@lath.rinet.ru> Message-ID: 2009/4/27 Oleg Bulyzhin : > > Perhaps you stepped on this: > > http://docs.freebsd.org/cgi/getmsg.cgi?fetch=879027+0+archive/2009/svn-src-all/20090419.svn-src-all > > You can try to change type of dn_pipe.numbytes to int64_t (instead of dn_key). > (ip_dummynet.h:341) > This is exactly what is done by patch sent by Luigi to me. And yes, it helped. Thanks. From oleg at FreeBSD.org Mon Apr 27 19:23:23 2009 From: oleg at FreeBSD.org (Oleg Bulyzhin) Date: Mon Apr 27 19:23:36 2009 Subject: [dummynet] Several queues connected to one pipe: "dummynet: OUCH! pipe should have been idle!" In-Reply-To: References: <20090426215740.GA33188@onelab2.iet.unipi.it> <20090426224729.GA34800@onelab2.iet.unipi.it> <20090427140309.GA62749@onelab2.iet.unipi.it> Message-ID: <20090427190854.GA36459@lath.rinet.ru> On Mon, Apr 27, 2009 at 05:44:22PM +0300, Maxim Ignatenko wrote: > 2009/4/27 Luigi Rizzo : > > On Mon, Apr 27, 2009 at 04:51:18PM +0300, Maxim Ignatenko wrote: > >> 2009/4/27 Luigi Rizzo : > >> > > >> > ok there seems to be no change related to dummynet between these > >> > two versions so I am not sure where to look. > >> > Could you double check what is the last working version ? > >> > > >> ?Yes, r191201 have this problems too (it seems, i didn't updated for a > >> long time). 
> >> Now ?I updated to r190864 (just before last change on ip_dummynet.c) - > >> all works fine. Should I now check r190865? > > > > yes it would be great if you could identify a specific change that > > caused the problem. > > There is one thing particularly tricky in one of the dummynet > > changes, because some fields changed between 32/64 bits and > > signed/unsigned. I may have unadvertently introduced some > > conversion bug. > > > > On r190865 problem appeared again. > > > thanks a lot for the feedback > > > > You welcome :) > > Thanks. > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" Perhaps you stepped on this: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=879027+0+archive/2009/svn-src-all/20090419.svn-src-all You can try to change type of dn_pipe.numbytes to int64_t (instead of dn_key). (ip_dummynet.h:341) -- Oleg. ================================================================ === Oleg Bulyzhin -- OBUL-RIPN -- OBUL-RIPE -- oleg@rinet.ru === ================================================================ From rizzo at iet.unipi.it Mon Apr 27 21:29:51 2009 From: rizzo at iet.unipi.it (Luigi Rizzo) Date: Mon Apr 27 21:30:04 2009 Subject: [dummynet] Several queues connected to one pipe: "dummynet: OUCH! pipe should have been idle!" In-Reply-To: <20090427190854.GA36459@lath.rinet.ru> References: <20090426215740.GA33188@onelab2.iet.unipi.it> <20090426224729.GA34800@onelab2.iet.unipi.it> <20090427140309.GA62749@onelab2.iet.unipi.it> <20090427190854.GA36459@lath.rinet.ru> Message-ID: <20090427213500.GA77622@onelab2.iet.unipi.it> On Mon, Apr 27, 2009 at 11:08:54PM +0400, Oleg Bulyzhin wrote: > On Mon, Apr 27, 2009 at 05:44:22PM +0300, Maxim Ignatenko wrote: ... > > > yes it would be great if you could identify a specific change that > > > caused the problem. 
> > > There is one thing particularly tricky in one of the dummynet > > > changes, because some fields changed between 32/64 bits and > > > signed/unsigned. I may have unadvertently introduced some > > > conversion bug. > > > > > > > On r190865 problem appeared again. > > > > > thanks a lot for the feedback > > > > > > > You welcome :) > > > > Thanks. > > _______________________________________________ > > freebsd-net@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-net > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > Perhaps you stepped on this: > > http://docs.freebsd.org/cgi/getmsg.cgi?fetch=879027+0+archive/2009/svn-src-all/20090419.svn-src-all > > You can try to change type of dn_pipe.numbytes to int64_t (instead of dn_key). > (ip_dummynet.h:341) good catch Oleg, sorry if i missed your email above. cheers luigi From adrian at freebsd.org Tue Apr 28 03:40:49 2009 From: adrian at freebsd.org (Adrian Chadd) Date: Tue Apr 28 03:41:02 2009 Subject: IPFW MAX RULES COUNT PERFORMANCE In-Reply-To: <49F5DBB3.6030500@yan.com.br> References: <49F06985.1000303@yan.com.br> <49F5DBB3.6030500@yan.com.br> Message-ID: You may want to investigate using pf; i'm not sure whether they handle this better. Me, I'd investigate writing a "tree" ipfw rule type. Ie, instead of having a list of rules, all evaluated one at a time, I'd create a rule implementing a subrule match on ip/netmask with some kind of action (allow, deny, count, pipe, etc) rather than having it all be evaluated O(n) style. 2c, Adrian 2009/4/28 Daniel Dias Gon?alves : > Going to another example. > If I wanted that each authentication (username and password) in captive > portal, set up rules limiting the speed of the user's IP, as I do? I can > create two rules for the in / out for each user associated with a pipe? When > simulating this with a script adding hundreds of rules, the latency also > increases, as resolve this ? 
> > Adrian Chadd escreveu: >> >> You'd almost certainly be better off hacking up an extension to ipfw >> which lets you count a /24 in one rule. >> >> As in, the count rule would match on the subnet/netmask, have 256 32 >> (or 64 bit) integers allocated to record traffic in, and then do an >> O(1) operation using the last octet of the v4 address to map it into >> this 256 slot array to update counters for. >> >> It'd require a little tool hackery to extend ipfw in userland/kernel >> space to do it but it would work and be (very almost) just as fast as >> a single rule. >> >> 2c, >> >> >> >> Adrian >> >> 2009/4/23 Daniel Dias Gon?alves : >> >>> >>> Hi, >>> >>> My system is a FreeBSD 7.1R. >>> When I add rules IPFW COUNT to 254 IPS from my network, one of my >>> interfaces >>> increases the latency, causing large delays in the network, when I delete >>> COUNT rules, everything returns to normal, which can be ? >>> >>> My script: >>> >>> ipcount.php >>> -- CUT -- >>> >> $c=0; >>> $a=50100; >>> for($x=0;$x<=0;$x++) { >>> ? ? ?for($y=1;$y<=254;$y++) { >>> ? ? ? ? ? ? ?$ip = "192.168.$x.$y"; >>> ? ? ? ? ? ? ?system("/sbin/ipfw -q add $a count { tcp or udp } from any >>> to >>> $ip/32"); >>> ? ? ? ? ? ? ?system("/sbin/ipfw -q add $a count { tcp or udp } from >>> $ip/32 >>> to any"); >>> ? ? ? ? ? ? ?#system("/sbin/ipfw delete $a"); >>> ? ? ? ? ? ? ?$c++; >>> ? ? ? ? ? ? ?$a++; >>> ? ? 
?} >>> } >>> echo "\n\nTotal: $c\n"; >>> ?> >>> -- CUT -- >>> >>> net.inet.ip.fw.dyn_keepalive: 1 >>> net.inet.ip.fw.dyn_short_lifetime: 5 >>> net.inet.ip.fw.dyn_udp_lifetime: 10 >>> net.inet.ip.fw.dyn_rst_lifetime: 1 >>> net.inet.ip.fw.dyn_fin_lifetime: 1 >>> net.inet.ip.fw.dyn_syn_lifetime: 20 >>> net.inet.ip.fw.dyn_ack_lifetime: 300 >>> net.inet.ip.fw.static_count: 262 >>> net.inet.ip.fw.dyn_max: 10000 >>> net.inet.ip.fw.dyn_count: 0 >>> net.inet.ip.fw.curr_dyn_buckets: 256 >>> net.inet.ip.fw.dyn_buckets: 10000 >>> net.inet.ip.fw.default_rule: 65535 >>> net.inet.ip.fw.verbose_limit: 0 >>> net.inet.ip.fw.verbose: 1 >>> net.inet.ip.fw.debug: 0 >>> net.inet.ip.fw.one_pass: 1 >>> net.inet.ip.fw.autoinc_step: 100 >>> net.inet.ip.fw.enable: 1 >>> net.link.ether.ipfw: 1 >>> net.link.bridge.ipfw: 0 >>> net.link.bridge.ipfw_arp: 0 >>> >>> Thanks, >>> >>> Daniel >>> _______________________________________________ >>> freebsd-net@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >>> >>> >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> >> > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From smithi at nimnet.asn.au Tue Apr 28 04:52:32 2009 From: smithi at nimnet.asn.au (Ian Smith) Date: Tue Apr 28 04:52:38 2009 Subject: IPFW MAX RULES COUNT PERFORMANCE In-Reply-To: <49F5DB12.7080502@yan.com.br> References: <49F06985.1000303@yan.com.br> <49F08071.1070905@ibctech.ca> <49F1D992.9000001@yan.com.br> <20090425024635.O89549@sola.nimnet.asn.au> <49F5DB12.7080502@yan.com.br> Message-ID: <20090428135053.Y89549@sola.nimnet.asn.au> 
On Mon, 27 Apr 2009, Daniel Dias Gon?alves wrote: > What may be happening ? I'm with polling enabled on all interfaces, can you > influence ? > > em0: port 0x7000-0x703f mem > 0xdfa00000-0xdfa1ffff irq 16 at device 8.0 on pci4 > em1: port 0x7400-0x743f mem > 0xdfa20000-0xdfa3ffff irq 17 at device 8.1 on pci4 > em2: port 0x8000-0x803f mem > 0xdfb00000-0xdfb1ffff irq 16 at device 8.0 on pci5 > em3: port 0x8400-0x843f mem > 0xdfb20000-0xdfb3ffff irq 17 at device 8.1 on pci5 > em4: port 0x9000-0x903f mem > 0xdfc00000-0xdfc1ffff irq 16 at device 8.0 on pci7 > em5: port 0x9400-0x943f mem > 0xdfc20000-0xdfc3ffff irq 17 at device 8.1 on pci7 > em6: port 0xa000-0xa03f mem > 0xdfd00000-0xdfd1ffff irq 16 at device 8.0 on pci8 > em7: port 0xa400-0xa43f mem > 0xdfd20000-0xdfd3ffff irq 17 at device 8.1 on pci8 > fxp0: port 0xb000-0xb03f mem > 0xdfe20000-0xdfe20fff,0xdfe00000-0xdfe1ffff irq 16 at device 4.0 on pci14 > > If I disable the polling, no network interface work, begins to display "em4 > watchdog timeout". Sorry, no ideas about polling, but this doesn't smell like just an IPFW issue. I was pointing out that despite 20 times the CPU clock rate, probably at least 30 times CPU throughput and likely 10 times the tick rate, you appear to be suffering something like 30 to 900 times the increased latency to be expected by traversing 'too many' ipfw rules. > Ian Smith escreveu: > > On Fri, 24 Apr 2009, Daniel Dias Gon?alves wrote: > > > > > The latency in the interface em6 increased an average of 10ms to 200 ~ > > 300ms > > > Hardware: > > > CPU: Intel(R) Xeon(TM) CPU 3.20GHz (3200.13-MHz 686-class CPU) > > > Logical CPUs per core: 2 > > > FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs > > > cpu0: on acpi0 > > > p4tcc0: on cpu0 > > > cpu1: on acpi0 > > > p4tcc1: on cpu1 > > > cpu2: on acpi0 > > > p4tcc2: on cpu2 > > > cpu3: on acpi0 > > > p4tcc3: on cpu3 > > > SMP: AP CPU #1 Launched! > > > SMP: AP CPU #3 Launched! > > > SMP: AP CPU #2 Launched! 
>
> > > real memory = 9663676416 (9216 MB)
> > > avail memory = 8396738560 (8007 MB)
> >
> > In that case, there really is something else wrong. By my measurements,
> > rummaging through most of >1000 rules on a old 166MHz Pentium to get to the
> > icmp allow rules (ridiculous, I know) added about 2ms to local net pings
> > via that box, ie 1ms each pass for about 900 rules, mostly counts.

cheers, Ian

From julian at elischer.org Tue Apr 28 06:52:01 2009
From: julian at elischer.org (Julian Elischer)
Date: Tue Apr 28 06:52:11 2009
Subject: IPFW MAX RULES COUNT PERFORMANCE
In-Reply-To: <49F5D8A3.3050805@yan.com.br>
References: <49F06985.1000303@yan.com.br> <49F0A7DD.30206@elischer.org> <49F1DBAE.1080205@yan.com.br> <49F235F4.2030202@elischer.org> <49F5D8A3.3050805@yan.com.br>
Message-ID: <49F6A796.4060100@elischer.org>

Daniel Dias Gonçalves wrote:
> Julian,
>
> Could you give an example of rules with tables?

I'm sorry, I forgot that you want to count packets from each client.
Tables won't work for that. For counting I suggest the technique
I show below, but for just allowing, you can add allowable addresses
to a table, e.g.
  table 1 add 1.2.3.4
and test it with
  allow ip from table(1) to any

>
> Julian Elischer wrote:
>> Daniel Dias Gonçalves wrote:
>>> Very good thinking, congratulations, but my need is another.
>>> The objective is a Captive Portal that each authentication is
>>> dynamically created a rule to ALLOW or COUNT IP authenticated, which
>>> I'm testing is what is the maximum capacity of rules supported,
>>> therefore simultaneous user.
>>>
>>> Understand ?
>>>
>> I think so.
>>
>>
>> do not add rules.
>> have a single rule that looks in a table
>> and add entries to the table when needed.
>>
>>> Thanks,
>>>
>>> Daniel
>>>
>>> Julian Elischer wrote:
>>>> Daniel Dias Gonçalves wrote:
>>>>> Hi,
>>>>>
>>>>> My system is a FreeBSD 7.1R.
>>>>> When I add rules IPFW COUNT to 254 IPS from my network, one of my >>>>> interfaces increases the latency, causing large delays in the >>>>> network, when I delete COUNT rules, everything returns to normal, >>>>> which can be ? >>>>> >>>>> My script: >>>> >>>> of course adding 512 rules, *all of which hav eto be evaluated* will >>>> add latency. >>>> >>>> you have several ways to improve this situation. >>>> >>>> 1/ use a differnet tool. >>>> By using the netgraph netflow module you can get >>>> accunting information that may be more useful and less impactful. >>>> >>>> 2/ you could make your rules smarter.. >>>> >>>> use skipto rules to make the average packet traverse less rules.. >>>> >>>> off the top of my head.. (not tested..) >>>> >>>> Assuming you have machines 10.0.0.1-10.0.0.254.... >>>> the rules below have an average packet traversing 19 rules and not >>>> 256 for teh SYN packet and 2 rules for others.. >>>> you may not be able to do the keep state trick if you use state for >>>> other stuff but in that case worst case will still be 19 rules. 
>>>> >>>> 2 check-state >>>> 5 skipto 10000 ip from not 10.0.0.0/24 to any >>>> 10 skipto 2020 ip from not 10.0.0.0/25 to any # 0-128 >>>> 20 skipto 1030 ip from not 10.0.0.0/26 to any # 0-64 >>>> 30 skipto 240 ip from not 10.0.0.0/27 to any # 0-32 >>>> 40 skipto 100 ip from not 10.0.0.0/28 to any # 0-16 >>>> [16 count rules for 0-15] >>>> 80 skipto 10000 ip from any to any >>>> 100 [16 count rules for 16-31] keep-state >>>> 140 skipto 10000 ip from any to any >>>> 240 skipto 300 ip from not 10.0.0.32/28 >>>> [16 rules for 32-47] keep-state >>>> 280 skipto 10000 ip from any to any >>>> 300 [16 count rules for 48-63] keep-state >>>> 340 skipto 10000 ip from any to any >>>> 1030 skipto 1240 ip from not 10.0.0.64/27 to any >>>> 1040 skipto 1100 ip from not 10.0.0.64/28 to any >>>> [16 count rules for 64-79] keep-state >>>> 1080 skipto 10000 ip from any to any >>>> 1100 [16 rules for 80-95] keep-state >>>> 1140 skipto 10000 ip from any to any >>>> 1240 skipto 1300 ip from not 10.0.0.96/28 to any >>>> [16 count rules for 96-111] keep-state >>>> 1280 skipto 10000 ip from any to any >>>> 1300 [16 rules for 112-127] keep-state >>>> 1340 skipto 10000 ip from any to any >>>> 2020 skipto 3030 ip from not 10.0.0.128/26 to any >>>> 2030 skipto 2240 ip from not 10.0.0.128/28 to any >>>> [16 count rules for 128-143] keep-state >>>> 2080 skipto 10000 ip from any to any >>>> 2100 [16 rules for 144-159] keep-state >>>> 2140 skipto 10000 ip from any to any >>>> 2240 skipto 2300 ip from not 10.0.0.32/28 to any >>>> [16 count rules for 160-175] keep-state >>>> 2280 skipto 10000 ip from any to any >>>> 2300 [16 count rules for 176-191] keep-state >>>> 2340 skipto 10000 ip from any to any >>>> 3030 skipto 3240 ip from not 10.0.0.192/27 to any >>>> 3040 skipto 3100 ip from not 10.0.0.192/28 to any >>>> [16 count rules for 192-207] keep-state >>>> 3080 skipto 10000 ip from any to any >>>> 3100 [16 rules for 208-223] keep-state >>>> 3240 skipto 10000 ip from any to any >>>> 3240 skipto 3300 
ip from not 10.0.0.224/28 to any >>>> [16 count rules for 224-239] keep-state >>>> 3280 skipto 10000 ip from any to any >>>> 3300 [16 count rules for 240-255] keep-state >>>> 3340 skipto 10000 ip from any to any >>>> >>>> 10000 #other stuff >>>> >>>> in fact you could improve it further with: >>>> 1/ either going down to a netmask of 29 (8 rules per set) >>>> or >>>> 2/ instead of having count rules make them skipto >>>> so you would have: >>>> 3300 skipto 10000 ip from 10.0.0.240 to any >>>> 3301 skipto 10000 ip from 10.0.0.241 to any >>>> 3302 skipto 10000 ip from 10.0.0.242 to any >>>> 3303 skipto 10000 ip from 10.0.0.243 to any >>>> 3304 skipto 10000 ip from 10.0.0.244 to any >>>> 3305 skipto 10000 ip from 10.0.0.245 to any >>>> 3306 skipto 10000 ip from 10.0.0.246 to any >>>> 3307 skipto 10000 ip from 10.0.0.247 to any >>>> 3308 skipto 10000 ip from 10.0.0.248 to any >>>> 3309 skipto 10000 ip from 10.0.0.249 to any >>>> 3310 skipto 10000 ip from 10.0.0.240 to any >>>> 3311 skipto 10000 ip from 10.0.0.241 to any >>>> 3312 skipto 10000 ip from 10.0.0.242 to any >>>> 3313 skipto 10000 ip from 10.0.0.243 to any >>>> 3314 skipto 10000 ip from 10.0.0.244 to any >>>> 3315 skipto 10000 ip from 10.0.0.245 to any >>>> >>>> thus on average, a packet would traverse half the rules (8). >>>> >>>> 3/ both the above so on average they would traverse 4 rules plus >>>> one extra skipto. >>>> >>>> you should be able to do the above in a script. >>>> I'd love to see it.. >>>> >>>> (you can also do skipto tablearg in -current (maybe 7.2 too) >>>> which may also be good.. 
(or not)) >>>> >>>> >>>> julian >>>> >>>> >>>> >>>> _______________________________________________ >>>> freebsd-net@freebsd.org mailing list >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >>>> >>>> >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> From fabien.thomas at netasq.com Tue Apr 28 08:40:52 2009 From: fabien.thomas at netasq.com (Fabien Thomas) Date: Tue Apr 28 08:41:00 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <160513.83122.qm@web63904.mail.re1.yahoo.com> References: <160513.83122.qm@web63904.mail.re1.yahoo.com> Message-ID: To share my results: I have done at work modification to the polling code to do SMP polling (previously posted to this ml). SMP polling (dynamic group of interface binded to CPU) does not significantly improve the throughput (lock contention seems to be the cause here). The main advantage of polling with modern interface is not the PPS (which is nearly the same) but the global efficiency of the system when using multiple interfaces (which is the case for Firewall). The best configuration we have found with FreeBSD 6.3 is to do polling on one CPU and keep the other CPU free for other processing. In this configuration the whole system is more efficient than with interrupt where all the CPU are busy processing interrupt thread. 
Regards,
Fabien

From p.pisati at oltrelinux.com Tue Apr 28 09:27:08 2009
From: p.pisati at oltrelinux.com (Paolo Pisati)
Date: Tue Apr 28 09:27:15 2009
Subject: Interrupts + Polling mode (similar to Linux's NAPI)
In-Reply-To:
References: <160513.83122.qm@web63904.mail.re1.yahoo.com>
Message-ID: <49F6C6B4.4080108@oltrelinux.com>

Fabien Thomas wrote:
>
> To share my results:
>
> I have done at work modification to the polling code to do SMP polling
> (previously posted to this ml).
>
> SMP polling (dynamic group of interface binded to CPU) does not
> significantly improve the throughput (lock contention seems to be the
> cause here).
> The main advantage of polling with modern interface is not the PPS
> (which is nearly the same) but the global efficiency of the system
> when using multiple interfaces (which is the case for Firewall).
> The best configuration we have found with FreeBSD 6.3 is to do polling
> on one CPU and keep the other CPU free for other processing. In this
> configuration the whole system
> is more efficient than with interrupt where all the CPU are busy
> processing interrupt thread.

out of curiosity: did you try polling on 4.x? i know it doesn't
"support" SMP over there, but last time i tried polling on 7.x (or was
it 6.x? i don't remember...)
i found it didn't give any benefit, while switching the system to 4.x
showed a huge improvement.

--

bye,
P.

From fabien.thomas at netasq.com Tue Apr 28 09:49:30 2009
From: fabien.thomas at netasq.com (Fabien Thomas)
Date: Tue Apr 28 09:49:36 2009
Subject: Interrupts + Polling mode (similar to Linux's NAPI)
In-Reply-To: <49F6C6B4.4080108@oltrelinux.com>
References: <160513.83122.qm@web63904.mail.re1.yahoo.com> <49F6C6B4.4080108@oltrelinux.com>
Message-ID: <36906055-E1AE-486B-BA77-D260E0609BBB@netasq.com>

On 28 Apr 09 at 11:04, Paolo Pisati wrote:

> Fabien Thomas wrote:
>>
>> To share my results:
>>
>> I have done at work modification to the polling
>> code to do SMP polling (previously posted to this ml).
>>
>> SMP polling (dynamic group of interface binded to CPU) does not
>> significantly improve the throughput (lock contention seems to be
>> the cause here).
>> The main advantage of polling with modern interface is not the PPS
>> (which is nearly the same) but the global efficiency of the system
>> when using multiple interfaces (which is the case for Firewall).
>> The best configuration we have found with FreeBSD 6.3 is to do
>> polling on one CPU and keep the other CPU free for other
>> processing. In this configuration the whole system
>> is more efficient than with interrupt where all the CPU are busy
>> processing interrupt thread.
> out of curiosity: did you try polling on 4.x? i know it doesn't
> "support" SMP over there, but last time i tried polling on 7.x (or
> was it 6.x? i don't remember...)
> i found it didn't give any benefit, while switching the system to
> 4.x showed a huge improvement.
>

Yes. Half of the reason for rewriting the core polling code was that the
polling code on 6.x and up performs badly (in our environment).
Today 4.x is unbeatable for network performance (compared with 6.2 -> 7.0
at least; I need to do more tests on 7_stable and 8).

The other half of the work was to explore the SMP scaling of the polling
code, to regain what we lose with the fine-grained SMP kernel.

> --
>
> bye,
> P.
>
>

From auryn at zirakzigil.org Tue Apr 28 10:28:02 2009
From: auryn at zirakzigil.org (Giulio Ferro)
Date: Tue Apr 28 10:28:09 2009
Subject: IPSEC NAT traversal
Message-ID: <49F6D598.6040503@zirakzigil.org>

What's the status of the NATT patch in 8-current?
Is it usable?

Thanks.
From vanhu at FreeBSD.org Tue Apr 28 12:00:13 2009 From: vanhu at FreeBSD.org (VANHULLEBUS Yvan) Date: Tue Apr 28 12:00:20 2009 Subject: IPSEC NAT traversal In-Reply-To: <49F6D598.6040503@zirakzigil.org> References: <49F6D598.6040503@zirakzigil.org> Message-ID: <20090428120751.GA68471@zeninc.net> On Tue, Apr 28, 2009 at 12:08:24PM +0200, Giulio Ferro wrote: > What's the status of NATT patch in 8 current? Is it usable? Hi. See recent archives, there is actually an issue with the patchset, as there are no more available bits in struct inp's flags. We're working on that to find and implement the best solution. Yvan. From barney_cordoba at yahoo.com Tue Apr 28 14:26:42 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Tue Apr 28 14:26:49 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <36906055-E1AE-486B-BA77-D260E0609BBB@netasq.com> Message-ID: <50451.74235.qm@web63901.mail.re1.yahoo.com> --- On Tue, 4/28/09, Fabien Thomas wrote: > From: Fabien Thomas > Subject: Re: Interrupts + Polling mode (similar to Linux's NAPI) > To: "Paolo Pisati" > Cc: "FreeBSD Net" > Date: Tuesday, April 28, 2009, 5:49 AM > Le 28 avr. 09 ? 11:04, Paolo Pisati a ?crit : > > > Fabien Thomas wrote: > >> > >> To share my results: > >> > >> I have done at work modification to the polling > code to do SMP polling (previously posted to this ml). > >> > >> SMP polling (dynamic group of interface binded to > CPU) does not significantly improve the throughput (lock > contention seems to be the cause here). > >> The main advantage of polling with modern > interface is not the PPS (which is nearly the same) but the > global efficiency of the system when using multiple > interfaces (which is the case for Firewall). > >> The best configuration we have found with FreeBSD > 6.3 is to do polling on one CPU and keep the other CPU free > for other processing. 
In this configuration the whole system > >> is more efficient than with interrupt where all > the CPU are busy processing interrupt thread. > > out of curiosity: did you try polling on 4.x? i know > it doesn't "support" SMP over there, but last > time i tried polling on 7.x (or was it 6.x? i don't > remember...) > > i found it didn't gave any benefit, while > switching the system to 4.x showed a huge improvement. > > > > yes rewriting the core polling code started at half because > the polling code on 6.x and up perform badly (in our env) > regarding performance. > today 4.x is unbeatable regarding network perf (6.2 -> > 7.0 at least, i need to do more test on 7_stable and 8). > > the other half of the work was to explore the SMP scaling > of the polling code to gain what we loose with fine grained > SMP kernel. The problem with all of this "analysis" is that it assumes that SMP coding scales intuitively; when the opposite is actually true. What you fail to address is the basic fact that moderated interrupts (ie holding off interrupts to a set number of ints/second) is exactly the same as polling, as on an active system you'll get exactly X interrupts per second at equal intervals. So all of this chatter about polling being more efficient is simply bunk. The truth is that polling requires additional overhead to the system while interrupts do not. So if polling did better for you, its simply because either 1) The polling code in the driver is better or 2) You tuned polling better than you tuned interrupt moderation. 
Barney From rizzo at iet.unipi.it Tue Apr 28 15:02:29 2009 From: rizzo at iet.unipi.it (Luigi Rizzo) Date: Tue Apr 28 15:02:36 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <50451.74235.qm@web63901.mail.re1.yahoo.com> References: <36906055-E1AE-486B-BA77-D260E0609BBB@netasq.com> <50451.74235.qm@web63901.mail.re1.yahoo.com> Message-ID: <20090428150739.GC8430@onelab2.iet.unipi.it> On Tue, Apr 28, 2009 at 07:26:40AM -0700, Barney Cordoba wrote: ... > The problem with all of this "analysis" is that it assumes that SMP > coding scales intuitively; when the opposite is actually true. > > What you fail to address is the basic fact that moderated interrupts > (ie holding off interrupts to a set number of ints/second) is exactly > the same as polling, as on an active system you'll get exactly X > interrupts per second at equal intervals. So all of this chatter about > polling being more efficient is simply bunk. > > The truth is that polling requires additional overhead to the system while > interrupts do not. So if polling did better for you, its simply because > either > > 1) The polling code in the driver is better > > or > > 2) You tuned polling better than you tuned interrupt moderation. > If i am not mistaken we don't have generic support for interrupt moderation in the kernel but that's a specific NIC feature: it works if the hardware supports it, and it doesn't otherwise. Of course it would be possible to modify polling to implement generic interrupt mitigation even without hardware support, so you get the best of the two worlds. 
cheers luigi From fabien.thomas at netasq.com Tue Apr 28 15:51:11 2009 From: fabien.thomas at netasq.com (Fabien Thomas) Date: Tue Apr 28 15:51:19 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <50451.74235.qm@web63901.mail.re1.yahoo.com> References: <50451.74235.qm@web63901.mail.re1.yahoo.com> Message-ID: <3B579474-23FD-4A9F-970B-98AA17B31EC7@netasq.com> >>>> >>>> I have done at work modification to the polling >> code to do SMP polling (previously posted to this ml). >>>> >>>> SMP polling (dynamic group of interface binded to >> CPU) does not significantly improve the throughput (lock >> contention seems to be the cause here). >>>> The main advantage of polling with modern >> interface is not the PPS (which is nearly the same) but the >> global efficiency of the system when using multiple >> interfaces (which is the case for Firewall). >>>> The best configuration we have found with FreeBSD >> 6.3 is to do polling on one CPU and keep the other CPU free >> for other processing. In this configuration the whole system >>>> is more efficient than with interrupt where all >> the CPU are busy processing interrupt thread. >>> out of curiosity: did you try polling on 4.x? i know >> it doesn't "support" SMP over there, but last >> time i tried polling on 7.x (or was it 6.x? i don't >> remember...) >>> i found it didn't gave any benefit, while >> switching the system to 4.x showed a huge improvement. >>> >> >> yes rewriting the core polling code started at half because >> the polling code on 6.x and up perform badly (in our env) >> regarding performance. >> today 4.x is unbeatable regarding network perf (6.2 -> >> 7.0 at least, i need to do more test on 7_stable and 8). >> >> the other half of the work was to explore the SMP scaling >> of the polling code to gain what we loose with fine grained >> SMP kernel. 
> > The problem with all of this "analysis" is that it assumes that SMP > coding scales intuitively; when the opposite is actually true. > > What you fail to address is the basic fact that moderated interrupts > (ie holding off interrupts to a set number of ints/second) is exactly > the same as polling, as on an active system you'll get exactly X > interrupts per second at equal intervals. So all of this chatter about > polling being more efficient is simply bunk. I agree with you with one interface. When you use ten interface it is not the case. > > > The truth is that polling requires additional overhead to the system > while > interrupts do not. So if polling did better for you, its simply > because > either > > 1) The polling code in the driver is better > > or > > 2) You tuned polling better than you tuned interrupt moderation. > > > Barney > > > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From sullrich at gmail.com Tue Apr 28 16:15:39 2009 From: sullrich at gmail.com (Scott Ullrich) Date: Tue Apr 28 16:15:45 2009 Subject: IPSEC NAT traversal In-Reply-To: <20090428120751.GA68471@zeninc.net> References: <49F6D598.6040503@zirakzigil.org> <20090428120751.GA68471@zeninc.net> Message-ID: On Tue, Apr 28, 2009 at 8:07 AM, VANHULLEBUS Yvan wrote: > See recent archives, there is actually an issue with the patchset, as > there are no more available bits in struct inp's flags. > We're working on that to find and implement the best solution. Hi, Ermal Luci recently whipped the pfSense's NATT patch into shape: http://cvs.pfsense.com/~sullrich/NATT.RELENG_8.diff I am not sure if this is how Yvan wants to solve it for the long term but it does seem to work OK for the short term until the patch is brought up to speed. 
Scott From julian at elischer.org Tue Apr 28 16:52:34 2009 From: julian at elischer.org (Julian Elischer) Date: Tue Apr 28 16:52:40 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: References: <160513.83122.qm@web63904.mail.re1.yahoo.com> Message-ID: <49F73459.9080803@elischer.org> Fabien Thomas wrote: > > To share my results: > > I have done at work modification to the polling code to do SMP polling > (previously posted to this ml). > > SMP polling (dynamic group of interface binded to CPU) does not > significantly improve the throughput (lock contention seems to be the > cause here). > The main advantage of polling with modern interface is not the PPS > (which is nearly the same) but the global efficiency of the system when > using multiple interfaces (which is the case for Firewall). > The best configuration we have found with FreeBSD 6.3 is to do polling > on one CPU and keep the other CPU free for other processing. In this > configuration the whole system > is more efficient than with interrupt where all the CPU are busy > processing interrupt thread. > so it would (might) be worth while working out a framework by which this could be achieved. > Regards, > Fabien > From bzeeb-lists at lists.zabbadoz.net Tue Apr 28 17:30:09 2009 From: bzeeb-lists at lists.zabbadoz.net (Bjoern A. Zeeb) Date: Tue Apr 28 17:30:16 2009 Subject: IPSEC NAT traversal In-Reply-To: References: <49F6D598.6040503@zirakzigil.org> <20090428120751.GA68471@zeninc.net> Message-ID: <20090428170638.P15361@maildrop.int.zabbadoz.net> On Tue, 28 Apr 2009, Scott Ullrich wrote: > On Tue, Apr 28, 2009 at 8:07 AM, VANHULLEBUS Yvan wrote: >> See recent archives, there is actually an issue with the patchset, as >> there are no more available bits in struct inp's flags. >> We're working on that to find and implement the best solution. 
> > Hi, > > Ermal Luci recently whipped the pfSense's NATT patch into shape: > http://cvs.pfsense.com/~sullrich/NATT.RELENG_8.diff > > I am not sure if this is how Yvan wants to solve it for the long term > but it does seem to work OK for the short term until the patch is > brought up to speed. Ermal is using inp_flags2 that Kip has recently added to the inpcb. The easy way to fix it for the next day. We considered that option. The long term is that we'll have an UDP control block (patch currently circulating for review and test but possibly committed the next two days). Considering the fact that the in kernel udp tunneling callback already (ab)used the pointer and that NAT-T needs to dedicated UDP flags and we've found someone already using an udpcb we decided that now was the time to add it so that we wouldn't possibly be stuck in FreeBSD 8.x. I have NAT-T on top of that. And I am currently doing the whatever you'll call it 'final pass', will send it back to Yvan once I am done with the last 2 items and last 400 lines of key.c . After that I assume someone will commit it. As I am pretty sure you'll want to test it before it goes into the tree so you'll get a copy as well; thanks for volunteering;-p It's not yet going to use the new in kernel tunnel callback and I am not yet sure if we can actually use it due to the placing of the callback, but if we can, the change will not be visible to userland. Thus we'll be able to do it any time. It will also not yet support transport mode NAT-T that I found another person needing it the other weekend while he was debugging NAT-T and I was busy with something else; but thanks to Yvan's last patch the infrastructure to support it is in place already, so that support can be added at a later point w/o breaking the kernel/userland API/ABI or anything else (I hope at this point;). /bz -- Bjoern A. Zeeb The greatest risk is not taking one. 
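The kern/132715 follow-ups below say that em(4), igb(4) and ixgbe(4) act on every vlan_config event without checking that the vlan was created on one of their own interfaces. A minimal user-space sketch of that problem and of two possible filters follows. The types `fake_ifnet` and `fake_adapter`, the `dispatch` helper and both handler names are hypothetical stand-ins, not the kernel's `struct ifnet` or EVENTHANDLER(9) machinery. The second filter, which compares `if_softc` with the handler argument, is an assumption of this sketch (it only works if each device registers its handler with its own softc as the argument) and is not something proposed in the thread.

```c
#include <string.h>

/* Hypothetical stand-ins for the kernel structures. */
struct fake_ifnet {
    const char *if_dname;  /* driver name, e.g. "em" or "lagg" */
    void       *if_softc;  /* driver-private state of this interface */
};

struct fake_adapter {
    int vlan_events;       /* vlan_config events this adapter acted on */
};

typedef void (*vlan_handler_t)(void *arg, struct fake_ifnet *ifp,
    unsigned vtag);

/* Filter as in the posted patch: ignore interfaces of other drivers,
 * then treat ifp->if_softc as our adapter. */
static void em_register_vlan_by_name(void *arg, struct fake_ifnet *ifp,
    unsigned vtag)
{
    struct fake_adapter *adapter;

    (void)arg;
    (void)vtag;
    if (strcmp("em", ifp->if_dname) != 0)
        return;            /* e.g. a vlan created on top of lagg0 */
    adapter = ifp->if_softc;
    adapter->vlan_events++;
}

/* Alternative filter (assumption): act only when the event is for the
 * very device instance this handler was registered for. */
static void em_register_vlan_by_softc(void *arg, struct fake_ifnet *ifp,
    unsigned vtag)
{
    struct fake_adapter *adapter = arg;

    (void)vtag;
    if (ifp->if_softc != arg)
        return;            /* not our event */
    adapter->vlan_events++;
}

/* Deliver one event to every registered (handler, arg) pair, the way a
 * global event list would. */
static void dispatch(vlan_handler_t fn, void **args, int nargs,
    struct fake_ifnet *ifp, unsigned vtag)
{
    int i;

    for (i = 0; i < nargs; i++)
        fn(args[i], ifp, vtag);
}
```

In this model both filters ignore a vlan created on a lagg interface. With two em devices registered, the name check still runs once per registered handler for a vlan on em0, while the softc check runs exactly once.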
From gelraen.ua at gmail.com Tue Apr 28 17:40:04 2009
From: gelraen.ua at gmail.com (Maxim Ignatenko)
Date: Tue Apr 28 17:40:12 2009
Subject: kern/132715: [lagg] [panic] Panic when creating vlan's on lagg interface
Message-ID: <200904281740.n3SHe3k2009454@freefall.freebsd.org>

The following reply was made to PR kern/132715; it has been noted by GNATS.

From: Maxim Ignatenko
To: bug-followup@freebsd.org, gdef@wp.pl
Cc: freebsd-current@freebsd.org
Subject: Re: kern/132715: [lagg] [panic] Panic when creating vlan's on lagg interface
Date: Tue, 28 Apr 2009 20:32:37 +0300

em(4), igb(4) and ixgbe(4) registers EVENTHANDLER vlan_config, but don't do any checks that this event generated by adding vlan on top of their devices. I'm don't completely sure what the right way to fix this issue, but attached patch works for me.

From gelraen.ua at gmail.com Tue Apr 28 17:50:03 2009
From: gelraen.ua at gmail.com (Maxim Ignatenko)
Date: Tue Apr 28 17:50:13 2009
Subject: kern/132715: [lagg] [panic] Panic when creating vlan's on lagg interface
Message-ID: <200904281750.n3SHo2pX022401@freefall.freebsd.org>

The following reply was made to PR kern/132715; it has been noted by GNATS.

From: Maxim Ignatenko
To: bug-followup@freebsd.org, gdef@wp.pl
Cc: freebsd-current@freebsd.org
Subject: Re: kern/132715: [lagg] [panic] Panic when creating vlan's on lagg interface
Date: Tue, 28 Apr 2009 20:47:29 +0300

Sorry, here is patch done relatively to root of source tree. (previous was done relatively to sys/dev)

From gelraen.ua at gmail.com Tue Apr 28 18:10:09 2009
From:
gelraen.ua at gmail.com (Maxim Ignatenko)
Date: Tue Apr 28 18:10:15 2009
Subject: kern/132715: [lagg] [panic] Panic when creating vlan's on lagg interface
Message-ID: <200904281810.n3SIA6hb048081@freefall.freebsd.org>

The following reply was made to PR kern/132715; it has been noted by GNATS.

From: Maxim Ignatenko
To: bug-followup@freebsd.org, gdef@wp.pl
Cc: freebsd-current@freebsd.org
Subject: Re: kern/132715: [lagg] [panic] Panic when creating vlan's on lagg interface
Date: Tue, 28 Apr 2009 21:05:34 +0300

GMail sent attach in very strange way, so it does not displayed correctly on website.

-------------- cut here --------------
Index: sys/dev/e1000/if_em.c
===================================================================
--- sys/dev/e1000/if_em.c	(revision 191201)
+++ sys/dev/e1000/if_em.c	(working copy)
@@ -4771,6 +4771,8 @@
 	struct adapter	*adapter = ifp->if_softc;
 	u32		ctrl, rctl, index, vfta;
 
+	if (strcmp("em",ifp->if_dname)) return;
+
 	ctrl = E1000_READ_REG(&adapter->hw, E1000_CTRL);
 	ctrl |= E1000_CTRL_VME;
 	E1000_WRITE_REG(&adapter->hw, E1000_CTRL, ctrl);
@@ -4803,6 +4805,8 @@
 	struct adapter	*adapter = ifp->if_softc;
 	u32		index, vfta;
 
+	if (strcmp("em",ifp->if_dname)) return;
+
 	/* Remove entry in the hardware filter table */
 	index = ((vtag >> 5) & 0x7F);
 	vfta = E1000_READ_REG_ARRAY(&adapter->hw, E1000_VFTA, index);
Index: sys/dev/e1000/if_igb.c
===================================================================
--- sys/dev/e1000/if_igb.c	(revision 191201)
+++ sys/dev/e1000/if_igb.c	(working copy)
@@ -4274,6 +4274,8 @@
 	struct adapter	*adapter = ifp->if_softc;
 	u32		ctrl, rctl, index, vfta;
 
+	if (strcmp("igb",ifp->if_dname)) return;
+
 	ctrl = E1000_READ_REG(&adapter->hw, E1000_CTRL);
 	ctrl |= E1000_CTRL_VME;
 	E1000_WRITE_REG(&adapter->hw, E1000_CTRL, ctrl);
@@ -4306,6 +4308,8 @@
 	struct adapter	*adapter = ifp->if_softc;
 	u32		index, vfta;
 
+	if (strcmp("igb",ifp->if_dname)) return;
+
 	/* Remove entry in the hardware filter table */
 	index = ((vtag >> 5) & 0x7F);
 	vfta = E1000_READ_REG_ARRAY(&adapter->hw, E1000_VFTA, index);
Index: sys/dev/ixgbe/ixgbe.c
===================================================================
--- sys/dev/ixgbe/ixgbe.c	(revision 191201)
+++ sys/dev/ixgbe/ixgbe.c	(working copy)
@@ -4031,6 +4031,8 @@
 	struct adapter	*adapter = ifp->if_softc;
 	u32		ctrl, rctl, index, vfta;
 
+	if (strcmp("ixgbe",ifp->if_dname)) return;
+
 	ctrl = IXGBE_READ_REG(&adapter->hw, IXGBE_VLNCTRL);
 	ctrl |= IXGBE_VLNCTRL_VME | IXGBE_VLNCTRL_VFE;
 	ctrl &= ~IXGBE_VLNCTRL_CFIEN;
@@ -4050,6 +4052,8 @@
 	struct adapter	*adapter = ifp->if_softc;
 	u32		index, vfta;
 
+	if (strcmp("ixgbe",ifp->if_dname)) return;
+
 	/* Remove entry in the hardware filter table */
 	ixgbe_set_vfta(&adapter->hw, vtag, 0, FALSE);
 
-------------- cut here --------------

From sullrich at gmail.com Tue Apr 28 18:39:59 2009
From: sullrich at gmail.com (Scott Ullrich)
Date: Tue Apr 28 18:40:06 2009
Subject: IPSEC NAT traversal
In-Reply-To: <20090428170638.P15361@maildrop.int.zabbadoz.net>
References: <49F6D598.6040503@zirakzigil.org> <20090428120751.GA68471@zeninc.net> <20090428170638.P15361@maildrop.int.zabbadoz.net>
Message-ID:

On Tue, Apr 28, 2009 at 1:28 PM, Bjoern A. Zeeb wrote:
> On Tue, 28 Apr 2009, Scott Ullrich wrote:
[snip]
> I have NAT-T on top of that. And I am currently doing the whatever
> you'll call it 'final pass', will send it back to Yvan once I am done
> with the last 2 items and last 400 lines of key.c . After that I
> assume someone will commit it.
> As I am pretty sure you'll want to test it before it goes into the
> tree so you'll get a copy as well; thanks for volunteering;-p

Hey that is great news. I will be ready to test the patch as soon as you are all ready.
Thanks for the update Scott From barney_cordoba at yahoo.com Tue Apr 28 18:40:13 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Tue Apr 28 18:40:20 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <20090428150739.GC8430@onelab2.iet.unipi.it> Message-ID: <230101.78323.qm@web63904.mail.re1.yahoo.com> --- On Tue, 4/28/09, Luigi Rizzo wrote: > From: Luigi Rizzo > Subject: Re: Interrupts + Polling mode (similar to Linux's NAPI) > To: "Barney Cordoba" > Cc: "Paolo Pisati" , fabient@freebsd.org, "FreeBSD Net" > Date: Tuesday, April 28, 2009, 11:07 AM > On Tue, Apr 28, 2009 at 07:26:40AM -0700, Barney Cordoba > wrote: > ... > > The problem with all of this "analysis" is > that it assumes that SMP > > coding scales intuitively; when the opposite is > actually true. > > > > What you fail to address is the basic fact that > moderated interrupts > > (ie holding off interrupts to a set number of > ints/second) is exactly > > the same as polling, as on an active system you'll > get exactly X > > interrupts per second at equal intervals. So all of > this chatter about > > polling being more efficient is simply bunk. > > > > The truth is that polling requires additional overhead > to the system while > > interrupts do not. So if polling did better for you, > its simply because > > either > > > > 1) The polling code in the driver is better > > > > or > > > > 2) You tuned polling better than you tuned interrupt > moderation. > > > > If i am not mistaken we don't have generic support for > interrupt moderation > in the kernel but that's a specific NIC feature: it > works if the > hardware supports it, and it doesn't otherwise. Well its the silly integrator who uses whatever hardware he has lying around. You don't try to squeeze performance out of crap hardware. You get hardware that has the features you need. The point of polling was to avoid livelock. 
So the question is why is it still around in 7 and 8, along with the propaganda that its any better than just using a decent controller. BC From andrew at modulus.org Tue Apr 28 21:25:39 2009 From: andrew at modulus.org (Andrew Snow) Date: Tue Apr 28 21:25:47 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <20090428150739.GC8430@onelab2.iet.unipi.it> References: <36906055-E1AE-486B-BA77-D260E0609BBB@netasq.com> <50451.74235.qm@web63901.mail.re1.yahoo.com> <20090428150739.GC8430@onelab2.iet.unipi.it> Message-ID: <49F7709F.1020409@modulus.org> Luigi Rizzo wrote: > If i am not mistaken we don't have generic support for interrupt moderation > in the kernel but that's a specific NIC feature: it works if the > hardware supports it, and it doesn't otherwise. > > Of course it would be possible to modify polling to implement > generic interrupt mitigation even without hardware support, so > you get the best of the two worlds. It seems to me that you're wasting your time if you are trying to achieve a high throughput in FreeBSD without using an Intel Pro/1000 or 10gbe networking card. So I don't know if anyone would really miss out if generic polling support was completely removed from the kernel and all efforts were then placed into improving other parts of network flow in the kernel which need more help. 
- Andrew From rizzo at iet.unipi.it Tue Apr 28 21:26:31 2009 From: rizzo at iet.unipi.it (Luigi Rizzo) Date: Tue Apr 28 21:26:38 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <49F7709F.1020409@modulus.org> References: <36906055-E1AE-486B-BA77-D260E0609BBB@netasq.com> <50451.74235.qm@web63901.mail.re1.yahoo.com> <20090428150739.GC8430@onelab2.iet.unipi.it> <49F7709F.1020409@modulus.org> Message-ID: <20090428213141.GC20530@onelab2.iet.unipi.it> On Wed, Apr 29, 2009 at 07:09:51AM +1000, Andrew Snow wrote: > Luigi Rizzo wrote: > >If i am not mistaken we don't have generic support for interrupt moderation > >in the kernel but that's a specific NIC feature: it works if the > >hardware supports it, and it doesn't otherwise. > > > >Of course it would be possible to modify polling to implement > >generic interrupt mitigation even without hardware support, so > >you get the best of the two worlds. > > It seems to me that you're wasting your time if you are trying to > achieve a high throughput in FreeBSD without using an Intel Pro/1000 or > 10gbe networking card. this is a very partial view of the world. the point is not getting a speed record but making the system work well on a wide variety of hardware. improving other parts of the network flow is nice and useful but it still does not address livelock and the problems that interrupt mitigation or polling are dealing with. cheers luigi > So I don't know if anyone would really miss out if generic polling > support was completely removed from the kernel and all efforts were then > placed into improving other parts of network flow in the kernel which > need more help. 
>
>
> - Andrew

From linimon at FreeBSD.org Wed Apr 29 05:58:43 2009
From: linimon at FreeBSD.org (linimon@FreeBSD.org)
Date: Wed Apr 29 05:58:55 2009
Subject: kern/134079: [em] "em0: Invalid MAC address" in FreeBSD-Current ( 8.0)
Message-ID: <200904290558.n3T5wgUX007164@freefall.freebsd.org>

Old Synopsis: "em0: Invalid MAC address" in FreeBSD-Current ( 8.0)
New Synopsis: [em] "em0: Invalid MAC address" in FreeBSD-Current ( 8.0)

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Wed Apr 29 05:58:30 UTC 2009
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=134079

From miki.bsd at gmail.com Wed Apr 29 07:13:56 2009
From: miki.bsd at gmail.com (miki miki)
Date: Wed Apr 29 07:14:02 2009
Subject: [ed] link state constantly going down and up
Message-ID: <261c29700904282342q62828573hf2631a7f79a10581@mail.gmail.com>

Hi,

I have a problem with a D-Link DFE-670TXD which is handled by if_ed : the link state is constantly going down and up :

Apr 28 14:21:33 iut-mir-o kernel: ed0: link state changed to DOWN
Apr 28 14:21:35 iut-mir-o kernel: ed0: link state changed to UP
Apr 28 14:21:40 iut-mir-o kernel: ed0: link state changed to DOWN
Apr 28 14:21:42 iut-mir-o kernel: ed0: link state changed to UP
Apr 28 14:22:51 iut-mir-o kernel: ed0: link state changed to DOWN
Apr 28 14:22:53 iut-mir-o kernel: ed0: link state changed to UP

I've double checked that there are no wire/switch/hub problems.

The card used to work fine with previous version of FreeBSD, the problem appear with the following commit :

SVN rev 190643 on 2009-04-02 16:58:45Z by imp (CVS rev 1.126)

I do not see any link state change if I revert the commit.

FreeBSD [hostname] 8.0-CURRENT FreeBSD 8.0-CURRENT #6 r191614M: Tue Apr 28 07:57:07 CEST 2009 user@hostname:/usr/obj/usr/src/sys/LETHE amd64

ed0: at port 0x100-0x11f irq 19 function 0 config 32 on pccard0
ed0: WARNING: using obsoleted if_watchdog interface
ed0: Ethernet address: 00:0d:88:21:54:e2
miibus1: on ed0
nsphyter0: PHY 1 on miibus1
nsphyter0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

dev.ed.0.type: DL10022
dev.ed.0.TxMem: 4608
dev.ed.0.RxMem: 19968
dev.ed.0.Mem: 24576
dev.ed.0.%desc: D-Link DFE-670TXD
dev.ed.0.%driver: ed
dev.ed.0.%location: function=0
dev.ed.0.%pnpinfo: manufacturer=0x0149 product=0x4530 cisvendor="D-Link" cisproduct="DFE-670TXD" function_type=6
dev.ed.0.%parent: pccard0

Should I submit a PR ?

Thanks for your support,
Miki

From barney_cordoba at yahoo.com Wed Apr 29 12:46:33 2009
From: barney_cordoba at yahoo.com (Barney Cordoba)
Date: Wed Apr 29 12:46:40 2009
Subject: Interrupts + Polling mode (similar to Linux's NAPI)
In-Reply-To: <49F7709F.1020409@modulus.org>
Message-ID: <172091.41695.qm@web63907.mail.re1.yahoo.com>

--- On Tue, 4/28/09, Andrew Snow wrote:

> From: Andrew Snow
> Subject: Re: Interrupts + Polling mode (similar to Linux's NAPI)
> To: "Luigi Rizzo"
> Cc: "FreeBSD Net"
> Date: Tuesday, April 28, 2009, 5:09 PM
> Luigi Rizzo wrote:
> > If i am not mistaken we don't have generic support
> for interrupt moderation
> > in the kernel but that's a specific NIC feature:
> it works if the
> > hardware supports it, and it doesn't otherwise.
> >
> > Of course it would be possible to modify polling to
> implement
> > generic interrupt mitigation even without hardware
> support, so
> > you get the best of the two worlds.
>
> It seems to me that you're wasting your time if you are
> trying to achieve a high throughput in FreeBSD without using
> an Intel Pro/1000 or 10gbe networking card.
> > So I don't know if anyone would really miss out if > generic polling support was completely removed from the > kernel and all efforts were then placed into improving other > parts of network flow in the kernel which need more help. > > > - Andrew I'm not sure if those specific NICs are the "only" choices. But I am concerned that so much brainpower is being put to extending the life of antiquated science projects and so little (maybe none?) is being put to improving drivers and the general network threading and performance. You spend 3 years redesigning the kernel, yet there are no resources to create a decent 10gb/s solution, to get rid of netgraph and to do network integration properly, or to improve the large number of mediocre drivers that were written what might as well be 100 years ago. When the collective answer to better network performance is polling, it makes it appear as if the FreeBSD project is a bunch of dudes working on stuff they feel like doing, rather than there being some centralized plan to make the project successful. Barney From ndenev at gmail.com Wed Apr 29 13:32:23 2009 From: ndenev at gmail.com (Nikolay Denev) Date: Wed Apr 29 13:32:29 2009 Subject: bce(4) sees all incoming frames as 2026 bytes in length Message-ID: <5E915E92-2B82-4331-9493-739568CC6E8C@gmail.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hello, I have the following problem with the new bce(4) driver on a 7.2- PRERELEASE from a few days ago. When I run tcpdump on the bce interface on the machine, all incoming frames are shown as 2026 in size, but on the sending machine tcpdump reports that it's sending frames of the correct size. This looks very strange because I don't have enabled Jumbo Frames on this bce interface , and it is still with it's default MTU of 1500 bytes. When I tried to capture the packets, they are zero padded to the 2026 frame size. The checksums are correct, so I suspect a driver bug? 
P.S.: I experience this problem on several machines with bce interfaces, and on all of them tcpdump sees all incoming frames with length of 2026 bytes. Regards, Niki Denev -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.11 (Darwin) iEYEARECAAYFAkn4TxIACgkQHNAJ/fLbfrmBKgCgkJ6j//+lLrrb+dL3HcLmDxym 4mQAn1qaY41CxWHiVthhCs6lYblX6UMd =+9XA -----END PGP SIGNATURE----- From ertr1013 at student.uu.se Wed Apr 29 13:37:49 2009 From: ertr1013 at student.uu.se (Erik Trulsson) Date: Wed Apr 29 13:37:55 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <172091.41695.qm@web63907.mail.re1.yahoo.com> References: <49F7709F.1020409@modulus.org> <172091.41695.qm@web63907.mail.re1.yahoo.com> Message-ID: <20090429132156.GA42816@owl.midgard.homeip.net> On Wed, Apr 29, 2009 at 05:46:32AM -0700, Barney Cordoba wrote: > > > > > --- On Tue, 4/28/09, Andrew Snow wrote: > > > From: Andrew Snow > > Subject: Re: Interrupts + Polling mode (similar to Linux's NAPI) > > To: "Luigi Rizzo" > > Cc: "FreeBSD Net" > > Date: Tuesday, April 28, 2009, 5:09 PM > > Luigi Rizzo wrote: > > > If i am not mistaken we don't have generic support > > for interrupt moderation > > > in the kernel but that's a specific NIC feature: > > it works if the > > > hardware supports it, and it doesn't otherwise. > > > > > > Of course it would be possible to modify polling to > > implement > > > generic interrupt mitigation even without hardware > > support, so > > > you get the best of the two worlds. > > > > It seems to me that you're wasting your time if you are > > trying to achieve a high throughput in FreeBSD without using > > an Intel Pro/1000 or 10gbe networking card. > > > > So I don't know if anyone would really miss out if > > generic polling support was completely removed from the > > kernel and all efforts were then placed into improving other > > parts of network flow in the kernel which need more help. 
> > > > > > - Andrew > > I'm not sure if those specific NICs are the "only" choices. But I am > concerned that so much brainpower is being put to extending the life > of antiquated science projects and so little (maybe none?) is being > put to improving drivers and the general network threading and > performance. > > You spend 3 years redesigning the kernel, yet there are no resources to > create a decent 10gb/s solution, to get rid of netgraph and to do > network integration properly, or to improve the large number of mediocre > drivers that were written what might as well be 100 years ago. If you think that more resources should be applied on certain areas, then feel free to provide said resources yourself. Other people are unlikely to change what they work on just because you want them to. > > When the collective answer to better network performance is polling, it > makes it appear as if the FreeBSD project is a bunch of dudes working on > stuff they feel like doing, rather than there being some centralized plan > to make the project successful. That appearance is probably due to the fact the the FreeBSD project actually is a bunch of dudes working on what they feel like doing (or in a few cases on what they get paid for doing), and that there is very little centralized planning being done. (And even if there was, there is no way of enforcing that people work according to such a plan.) 
-- Erik Trulsson ertr1013@student.uu.se From rizzo at iet.unipi.it Wed Apr 29 13:52:11 2009 From: rizzo at iet.unipi.it (Luigi Rizzo) Date: Wed Apr 29 13:52:18 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <20090429132156.GA42816@owl.midgard.homeip.net> References: <49F7709F.1020409@modulus.org> <172091.41695.qm@web63907.mail.re1.yahoo.com> <20090429132156.GA42816@owl.midgard.homeip.net> Message-ID: <20090429135722.GB51674@onelab2.iet.unipi.it> On Wed, Apr 29, 2009 at 03:21:56PM +0200, Erik Trulsson wrote: > On Wed, Apr 29, 2009 at 05:46:32AM -0700, Barney Cordoba wrote: ... > > When the collective answer to better network performance is polling, it > > makes it appear as if the FreeBSD project is a bunch of dudes working on > > stuff they feel like doing, rather than there being some centralized plan > > to make the project successful. > > That appearance is probably due to the fact the the FreeBSD project actually > is a bunch of dudes working on what they feel like doing (or in a few cases > on what they get paid for doing), and that there is very little centralized > planning being done. (And even if there was, there is no way of enforcing > that people work according to such a plan.) not to mention that very little if any work has been done on polling recently: i developed the base system in 2001-2002, and since then there has been just some basic maintainance (at least in the tree). cheers luigi From pluknet at gmail.com Wed Apr 29 14:00:00 2009 From: pluknet at gmail.com (pluknet) Date: Wed Apr 29 14:00:32 2009 Subject: bce(4) sees all incoming frames as 2026 bytes in length In-Reply-To: <5E915E92-2B82-4331-9493-739568CC6E8C@gmail.com> References: <5E915E92-2B82-4331-9493-739568CC6E8C@gmail.com> Message-ID: 2009/4/29 Nikolay Denev : > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hello, > > I have the following problem with the new bce(4) driver on a 7.2-PRERELEASE > from a few days ago. 
> When I run tcpdump on the bce interface on the machine, all incoming frames > are shown as 2026 in size, > but on the sending machine tcpdump reports that it's sending frames of the > correct size. > This looks very strange because I don't have enabled Jumbo Frames on this > bce interface , and it is still with it's default MTU of 1500 bytes. > When I tried to capture the packets, they are zero padded to the 2026 frame > size. The checksums are correct, so I suspect a driver bug? > > P.S.: I experience this problem on several machines with bce interfaces, and > on all of them tcpdump sees all incoming frames with length of 2026 bytes. hi. Please, give us more details. What is your network card model ? Share ifconfig, dmesg... I have a similar setup, except the frame size - mine are with expected values. -- wbr, pluknet From ndenev at gmail.com Wed Apr 29 14:33:22 2009 From: ndenev at gmail.com (Niki Denev) Date: Wed Apr 29 14:33:29 2009 Subject: bce(4) sees all incoming frames as 2026 bytes in length In-Reply-To: References: <5E915E92-2B82-4331-9493-739568CC6E8C@gmail.com> Message-ID: <2e77fc10904290733m4858172ayd96654f3a9a3a8a@mail.gmail.com> On Wed, Apr 29, 2009 at 4:59 PM, pluknet wrote: > 2009/4/29 Nikolay Denev : >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Hello, >> >> I have the following problem with the new bce(4) driver on a 7.2-PRERELEASE >> from a few days ago. >> When I run tcpdump on the bce interface on the machine, all incoming frames >> are shown as 2026 in size, >> but on the sending machine tcpdump reports that it's sending frames of the >> correct size. >> This looks very strange because I don't have enabled Jumbo Frames on this >> bce interface , and it is still with it's default MTU of 1500 bytes. >> When I tried to capture the packets, they are zero padded to the 2026 frame >> size. The checksums are correct, so I suspect a driver bug? 
>> >> P.S.: I experience this problem on several machines with bce interfaces, and >> on all of them tcpdump sees all incoming frames with length of 2026 bytes. > > hi. > > Please, give us more details. What is your network card model ? > Share ifconfig, dmesg... > I have a similar setup, except the frame size - mine are with expected values. > > -- > wbr, > pluknet > Hi, Here is one of the cards that have this problem : bce1: mem 0xf8000000-0xf9ffffff irq 16 at device 0.0 on pci3 bce1: Ethernet address: 00:22:19:xx:xx:xx bce1: [ITHREAD] bce1: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C (0x04040105); Flags( MFW MSI ) bce1: flags=8843 metric 0 mtu 1500 options=1bb ether 00:22:19:xx:xx:xx inet 10.18.2.1 netmask 0xffffff00 broadcast 10.18.2.255 media: Ethernet autoselect (1000baseTX ) status: active And here is a tcpdump that shows the problem : 16:27:32.593808 00:22:19:yy:yy:yy > 00:22:19:xx:xx:xx, ethertype IPv4 (0x0800), length 2026: (tos 0x0, ttl 64, id 45347, offset 0, flags [none], proto ICMP (1), length 84) 10.18.2.2 > 10.18.2.1: ICMP echo request, id 13578, seq 36, length 64 16:27:32.593817 00:22:19:xx:xx:xx > 00:22:19:yy:yy:yy, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 18415, offset 0, flags [none], proto ICMP (1), length 84) 10.18.2.1 > 10.18.2.2: ICMP echo reply, id 13578, seq 36, length 64 16:27:33.596569 00:22:19:yy:yy:yy > 00:22:19:xx:xx:xx, ethertype IPv4 (0x0800), length 2026: (tos 0x0, ttl 64, id 45349, offset 0, flags [none], proto ICMP (1), length 84) 10.18.2.2 > 10.18.2.1: ICMP echo request, id 13578, seq 37, length 64 16:27:33.596575 00:22:19:xx:xx:xx > 00:22:19:yy:yy:yy, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 18421, offset 0, flags [none], proto ICMP (1), length 84) 10.18.2.1 > 10.18.2.2: ICMP echo reply, id 13578, seq 37, length 64 16:27:34.599332 00:22:19:yy:yy:yy > 00:22:19:xx:xx:xx, ethertype IPv4 (0x0800), length 2026: (tos 0x0, ttl 64, id 45351, offset 0, flags [none], proto ICMP (1), 
length 84) 10.18.2.2 > 10.18.2.1: ICMP echo request, id 13578, seq 38, length 64 16:27:34.599338 00:22:19:xx:xx:xx > 00:22:19:yy:yy:yy, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 18427, offset 0, flags [none], proto ICMP (1), length 84) 10.18.2.1 > 10.18.2.2: ICMP echo reply, id 13578, seq 38, length 64 16:27:35.545001 00:22:19:yy:yy:yy > 01:00:5e:00:00:05, ethertype IPv4 (0x0800), length 2026: (tos 0xc0, ttl 1, id 45353, offset 0, flags [none], proto OSPF (89), length 68) 10.18.2.2 > 224.0.0.5: OSPFv2, Hello, length: 48 16:27:35.545062 00:22:19:xx:xx:xx > 01:00:5e:00:00:05, ethertype IPv4 (0x0800), length 82: (tos 0xc0, ttl 1, id 18432, offset 0, flags [none], proto OSPF (89), length 68) 10.18.2.1 > 224.0.0.5: OSPFv2, Hello, length: 48 There is nothing special about the setup, a few machines connected to a gigabit switch, and some of them have cross connects. I'm seeing this on all bce(4) interfaces, regardless of what they are connected to (other bce(4) interface, or GigE switch). 
-- Regards, Niki From pluknet at gmail.com Wed Apr 29 16:04:29 2009 From: pluknet at gmail.com (pluknet) Date: Wed Apr 29 16:04:35 2009 Subject: bce(4) sees all incoming frames as 2026 bytes in length In-Reply-To: <2e77fc10904290733m4858172ayd96654f3a9a3a8a@mail.gmail.com> References: <5E915E92-2B82-4331-9493-739568CC6E8C@gmail.com> <2e77fc10904290733m4858172ayd96654f3a9a3a8a@mail.gmail.com> Message-ID: 2009/4/29 Niki Denev : > bce1: mem > 0xf8000000-0xf9ffffff irq 16 at device 0.0 on pci3 > bce1: Ethernet address: 00:22:19:xx:xx:xx > bce1: [ITHREAD] > bce1: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C > (0x04040105); Flags( MFW MSI ) > > bce1: flags=8843 metric 0 mtu 1500 > options=1bb > ether 00:22:19:xx:xx:xx > inet 10.18.2.1 netmask 0xffffff00 broadcast 10.18.2.255 > media: Ethernet autoselect (1000baseTX ) > status: active > > And here is a tcpdump that shows the problem : > > 16:27:32.593808 00:22:19:yy:yy:yy > 00:22:19:xx:xx:xx, ethertype IPv4 > (0x0800), length 2026: (tos 0x0, ttl 64, id 45347, offset 0, flags > [none], proto ICMP (1), length 84) 10.18.2.2 > 10.18.2.1: ICMP echo > request, id 13578, seq 36, length 64 > 16:27:32.593817 00:22:19:xx:xx:xx > 00:22:19:yy:yy:yy, ethertype IPv4 > (0x0800), length 98: (tos 0x0, ttl 64, id 18415, offset 0, flags > [none], proto ICMP (1), length 84) 10.18.2.1 > 10.18.2.2: ICMP echo > reply, id 13578, seq 36, length 64 Ok, now I see. A link level length is 2026 for me too for some sort of packets (in opposite to proto's len where all is ok). Mine nic is (same as yours). Looks like a regression. I just also tested 7.1-R and it shows expected LL-length. 
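As a rough cross-check of the numbers in the capture above (a sketch, not taken from the thread; expected_frame_len is a hypothetical helper): for an untagged IPv4 frame, the link-level length that tcpdump prints should be the IP total length plus the 14-byte Ethernet header, since the FCS is not part of the capture.

```c
#include <assert.h>

/* Ethernet II header: two 6-byte MAC addresses plus a 2-byte ethertype. */
enum { ETHER_HDR_BYTES = 14 };

/*
 * Hypothetical helper: the link-level length one would expect tcpdump to
 * report for an untagged IPv4 frame, given the IP total length.
 */
int expected_frame_len(int ip_total_len)
{
    assert(ip_total_len >= 0);
    return ETHER_HDR_BYTES + ip_total_len;
}
```

For the ICMP packets above (IP length 84) this gives 98, and for the OSPF hellos (IP length 68) it gives 82. Both match the outgoing frames in the capture, while the incoming frames are reported as 2026.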
-- wbr, pluknet From barney_cordoba at yahoo.com Wed Apr 29 19:05:02 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Wed Apr 29 19:05:08 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <20090429132156.GA42816@owl.midgard.homeip.net> Message-ID: <628939.54925.qm@web63901.mail.re1.yahoo.com> --- On Wed, 4/29/09, Erik Trulsson wrote: > From: Erik Trulsson > Subject: Re: Interrupts + Polling mode (similar to Linux's NAPI) > To: "Barney Cordoba" > Cc: "FreeBSD Net" , "Luigi Rizzo" , "Andrew Snow" > Date: Wednesday, April 29, 2009, 9:21 AM > On Wed, Apr 29, 2009 at 05:46:32AM -0700, Barney Cordoba > wrote: > > > > > > > > > > --- On Tue, 4/28/09, Andrew Snow > wrote: > > > > > From: Andrew Snow > > > Subject: Re: Interrupts + Polling mode (similar > to Linux's NAPI) > > > To: "Luigi Rizzo" > > > > Cc: "FreeBSD Net" > > > > Date: Tuesday, April 28, 2009, 5:09 PM > > > Luigi Rizzo wrote: > > > > If i am not mistaken we don't have > generic support > > > for interrupt moderation > > > > in the kernel but that's a specific NIC > feature: > > > it works if the > > > > hardware supports it, and it doesn't > otherwise. > > > > > > > > Of course it would be possible to modify > polling to > > > implement > > > > generic interrupt mitigation even without > hardware > > > support, so > > > > you get the best of the two worlds. > > > > > > It seems to me that you're wasting your time > if you are > > > trying to achieve a high throughput in FreeBSD > without using > > > an Intel Pro/1000 or 10gbe networking card. > > > > > > So I don't know if anyone would really miss > out if > > > generic polling support was completely removed > from the > > > kernel and all efforts were then placed into > improving other > > > parts of network flow in the kernel which need > more help. > > > > > > > > > - Andrew > > > > I'm not sure if those specific NICs are the > "only" choices. 
But I am > > concerned that so much brainpower is being put to > extending the life > > of antiquated science projects and so little (maybe > none?) is being > > put to improving drivers and the general network > threading and > > performance. > > > > You spend 3 years redesigning the kernel, yet there > are no resources to > > create a decent 10gb/s solution, to get rid of > netgraph and to do > > network integration properly, or to improve the large > number of mediocre > > drivers that were written what might as well be 100 > years ago. > > If you think that more resources should be applied on > certain areas, then > feel free to provide said resources yourself. Other people > are unlikely to > change what they work on just because you want them to. > > > > > When the collective answer to better network > performance is polling, it > > makes it appear as if the FreeBSD project is a bunch > of dudes working on > > stuff they feel like doing, rather than there being > some centralized plan > > to make the project successful. > > That appearance is probably due to the fact the the FreeBSD > project actually > is a bunch of dudes working on what they feel like doing > (or in a few cases > on what they get paid for doing), and that there is very > little centralized > planning being done. (And even if there was, there is no > way of enforcing > that people work according to such a plan.) Its one of the sad truths of FreeBSD. You'd think with such a large number of commercial users you'd be able to get plenty of funding for the things that really need to be done, rather then taking whatever bread crumbs are thrown your way. Perhaps you need fewer bearded academics and a few more suits to run the project more like a business than an extended masters thesis? 
BC From barney_cordoba at yahoo.com Wed Apr 29 19:07:37 2009 From: barney_cordoba at yahoo.com (Barney Cordoba) Date: Wed Apr 29 19:07:43 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <20090429135722.GB51674@onelab2.iet.unipi.it> Message-ID: <606440.37646.qm@web63907.mail.re1.yahoo.com> --- On Wed, 4/29/09, Luigi Rizzo wrote: > From: Luigi Rizzo > Subject: Re: Interrupts + Polling mode (similar to Linux's NAPI) > To: "Erik Trulsson" > Cc: "Barney Cordoba" , "Andrew Snow" , "FreeBSD Net" > Date: Wednesday, April 29, 2009, 9:57 AM > On Wed, Apr 29, 2009 at 03:21:56PM +0200, Erik Trulsson > wrote: > > On Wed, Apr 29, 2009 at 05:46:32AM -0700, Barney > Cordoba wrote: > ... > > > When the collective answer to better network > performance is polling, it > > > makes it appear as if the FreeBSD project is a > bunch of dudes working on > > > stuff they feel like doing, rather than there > being some centralized plan > > > to make the project successful. > > > > That appearance is probably due to the fact the the > FreeBSD project actually > > is a bunch of dudes working on what they feel like > doing (or in a few cases > > on what they get paid for doing), and that there is > very little centralized > > planning being done. (And even if there was, there is > no way of enforcing > > that people work according to such a plan.) > > not to mention that very little if any work has been done > on polling recently: i developed the base system in > 2001-2002, > and since then there has been just some basic maintainance > (at least in the tree). > > cheers > luigi Obviously someone has ported it to 5, 6 and 7, and the pat answer to performance questions about a driver is "have you tried polling"? Its counterproductive and gives people an excuse not to create any mechanisms to properly tune drivers and to write them correctly for SMP kernels. Barney From imp at bsdimp.com Wed Apr 29 20:12:50 2009 From: imp at bsdimp.com (M. 
Warner Losh)
Date: Wed Apr 29 20:12:57 2009
Subject: [ed] link state constantly going down and up
In-Reply-To: <261c29700904282342q62828573hf2631a7f79a10581@mail.gmail.com>
Message-ID: <20090429.141002.-720655694.imp@bsdimp.com>

: I have a problem with a D-Link DFE-670TXD which is handled by if_ed :
: the link state is constantly going down and up :
: Apr 28 14:21:33 iut-mir-o kernel: ed0: link state changed to DOWN
: Apr 28 14:21:35 iut-mir-o kernel: ed0: link state changed to UP
...
: the problem appear with the following commit :
: SVN rev 190643 on 2009-04-02 16:58:45Z by imp (CVS rev 1.126)
: I do not see any link state change if I revert the commit.

Doh!

I needed to force auto negotiation for other cards to work. Let me
see if I can dig up the DFE-670TXD and go from there... Are you also
seeing really horrible network performance as well? Do you see this
only under load, or just at idle?

Warner

From vinnix.bsd at gmail.com Wed Apr 29 23:30:07 2009
From: vinnix.bsd at gmail.com (Vinicius Abrahao)
Date: Wed Apr 29 23:30:14 2009
Subject: Problem with lagg failover (using bge0 and wpi0 interfaces)
Message-ID: <1e31c7980904291605h55244aech10a7725d59fd0cd@mail.gmail.com>

Hi people:

I'm testing RELENG_7 (7.2-PRERELEASE, updated and built today) and I'm
finding some issues with lagg(4).
My problem happens when I disconnect the cable from bge0.
The packets are sent via wpi0, but no replies come back.
See tcpdump log above: # tcpdump -i bge0 host 192.168.1.1 tcpdump: WARNING: bge0: no IPv4 address assigned tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on bge0, link-type EN10MB (Ethernet), capture size 96 bytes 19:29:19.608811 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 3, length 64 19:29:19.610150 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 3, length 64 19:29:20.609807 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 4, length 64 19:29:20.610948 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 4, length 64 19:29:21.597787 arp who-has 192.168.1.60 tell 192.168.1.1 19:29:21.597811 arp reply 192.168.1.60 is-at 00:19:b9:79:f0:af (oui Unknown) 19:29:21.610813 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 5, length 64 19:29:21.611992 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 5, length 64 19:29:22.611815 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 6, length 64 19:29:22.612982 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 6, length 64 19:29:23.612819 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 7, length 64 19:29:23.613947 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 7, length 64 19:29:24.613823 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 8, length 64 19:29:24.614983 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 8, length 64 19:29:25.614823 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 9, length 64 19:29:25.615953 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 9, length 64 19:29:26.615830 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 10, length 64 19:29:26.616991 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 10, length 64 19:29:27.616826 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 11, length 64 19:29:27.618165 IP 192.168.1.1 > 
192.168.1.60: ICMP echo reply, id 45355, seq 11, length 64 19:29:28.617826 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 12, length 64 19:29:28.618951 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 12, length 64 19:29:29.618828 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 13, length 64 19:29:29.620001 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 13, length 64 19:29:30.619830 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 14, length 64 19:29:30.620996 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 14, length 64 19:29:31.620840 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 15, length 64 19:29:31.622008 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 15, length 64 19:29:32.621837 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 16, length 64 19:29:32.623013 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 16, length 64 19:29:33.622838 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 17, length 64 19:29:33.624002 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 17, length 64 19:29:34.623841 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 18, length 64 19:29:34.625012 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 18, length 64 19:29:35.624842 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 19, length 64 19:29:35.625977 IP 192.168.1.1 > 192.168.1.60: ICMP echo reply, id 45355, seq 19, length 64 Now I disconnect the cable from bge0 and one second after We see at other tcpdump log, that packets are become sending by wpi0 but, without reply. 
# tcpdump -i wpi0 host 192.168.1.1 tcpdump: WARNING: wpi0: no IPv4 address assigned tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on wpi0, link-type EN10MB (Ethernet), capture size 96 bytes 19:29:36.625849 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 20, length 64 19:29:37.626843 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 21, length 64 19:29:38.627845 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 22, length 64 19:29:39.628846 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 23, length 64 19:29:40.629848 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 24, length 64 19:29:41.630861 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 25, length 64 19:29:42.631854 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 26, length 64 19:29:43.632858 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 27, length 64 19:29:44.633860 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 28, length 64 19:29:45.634860 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 29, length 64 19:29:46.635866 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 30, length 64 19:29:47.639367 IP 192.168.1.60 > 192.168.1.1: ICMP echo request, id 45355, seq 31, length 64 Here is my configurations: # route get 192.168.1.1 route to: 192.168.1.1 destination: 192.168.1.1 interface: lagg0 flags: recvpipe sendpipe ssthresh rtt,msec rttvar hopcount mtu expire 0 0 0 0 0 0 1500 1095 # ifconfig wpi0 wpi0: flags=8843 metric 0 mtu 1500 ether 00:19:b9:79:f0:af media: IEEE 802.11 Wireless Ethernet autoselect (DS/1Mbps) status: associated ssid triarius-wifi channel 6 (2437 Mhz 11g) bssid 00:18:39:39:5b:0f authmode OPEN privacy OFF txpower 50 bmiss 7 scanvalid 60 protmode CTS lagg: laggdev lagg0 # ifconfig bge0 bge0: flags=8843 metric 0 mtu 1500 options=9b ether 00:19:b9:79:f0:af media: Ethernet autoselect (none) 
status: no carrier lagg: laggdev lagg0 # ifconfig lagg0 # When I disconnect bge0, wpi0 come to ACTIVE mode. lagg0: flags=8843 metric 0 mtu 1500 ether 00:19:b9:79:f0:af inet 192.168.1.60 netmask 0xffffffc0 broadcast 192.168.1.63 media: Ethernet autoselect status: active laggproto failover laggport: wpi0 flags=4 laggport: bge0 flags=1 # ifconfig lagg0 # When I connect bge0 back, bge0 comes to ACTIVE mode. lagg0: flags=8843 metric 0 mtu 1500 ether 00:19:b9:79:f0:af inet 192.168.1.60 netmask 0xffffffc0 broadcast 192.168.1.63 media: Ethernet autoselect status: active laggproto failover laggport: wpi0 flags=0<> laggport: bge0 flags=5 Could you help me with this issue? Thanks, Vinicius Abrahao From miki.bsd at gmail.com Thu Apr 30 07:04:07 2009 From: miki.bsd at gmail.com (Miki) Date: Thu Apr 30 07:04:14 2009 Subject: [ed] link state constantly going down and up In-Reply-To: <20090429.141002.-720655694.imp@bsdimp.com> References: <261c29700904282342q62828573hf2631a7f79a10581@mail.gmail.com> <20090429.141002.-720655694.imp@bsdimp.com> Message-ID: <261c29700904300004k4bc02635m2729a0a54c09c135@mail.gmail.com> 2009/4/29 M. Warner Losh > : I have a problem with a D-Link DFE-670TXD which is handled by if_ed : > : the link state is constantly going down and up : > : Apr 28 14:21:33 iut-mir-o kernel: ed0: link state changed to DOWN > : Apr 28 14:21:35 iut-mir-o kernel: ed0: link state changed to UP > ... > : the problem appear with the following commit : > : SVN rev 190643 on 2009-04-02 16:58:45Z by imp (CVS rev 1.126) > : I do not see any link state change if I revert the commit. > > Doh! > > I needed to force auto negotiation for other cards to work. Let me > see if I can dig up the DFE-670TXD and go from there... Are you also > seeing really horrible network performance as well? Do you see this > only under load, or just at idle? > > Warner > Yes network performance suffers from this. The problem only appears under load but not when idle. 
Mikael From imp at bsdimp.com Thu Apr 30 07:22:11 2009 From: imp at bsdimp.com (M. Warner Losh) Date: Thu Apr 30 07:22:18 2009 Subject: [ed] link state constantly going down and up In-Reply-To: <261c29700904300004k4bc02635m2729a0a54c09c135@mail.gmail.com> References: <261c29700904282342q62828573hf2631a7f79a10581@mail.gmail.com> <20090429.141002.-720655694.imp@bsdimp.com> <261c29700904300004k4bc02635m2729a0a54c09c135@mail.gmail.com> Message-ID: <20090430.011738.1617886960.imp@bsdimp.com> In message: <261c29700904300004k4bc02635m2729a0a54c09c135@mail.gmail.com> Miki writes: : 2009/4/29 M. Warner Losh : : > : I have a problem with a D-Link DFE-670TXD which is handled by if_ed : : > : the link state is constantly going down and up : : > : Apr 28 14:21:33 iut-mir-o kernel: ed0: link state changed to DOWN : > : Apr 28 14:21:35 iut-mir-o kernel: ed0: link state changed to UP : > ... : > : the problem appear with the following commit : : > : SVN rev 190643 on 2009-04-02 16:58:45Z by imp (CVS rev 1.126) : > : I do not see any link state change if I revert the commit. : > : > Doh! : > : > I needed to force auto negotiation for other cards to work. Let me : > see if I can dig up the DFE-670TXD and go from there... Are you also : > seeing really horrible network performance as well? Do you see this : > only under load, or just at idle? : > : > Warner : > : : Yes network performance suffers from this. The problem only appears under : load : but not when idle. Thanks. I'll try to reproduce it here. I noticed this on one of the cards, but had trouble reproducing it, but I'll try harder. 
Warner From miki.bsd at gmail.com Thu Apr 30 07:29:55 2009 From: miki.bsd at gmail.com (Miki) Date: Thu Apr 30 07:30:02 2009 Subject: [ed] link state constantly going down and up In-Reply-To: <20090430.011738.1617886960.imp@bsdimp.com> References: <261c29700904282342q62828573hf2631a7f79a10581@mail.gmail.com> <20090429.141002.-720655694.imp@bsdimp.com> <261c29700904300004k4bc02635m2729a0a54c09c135@mail.gmail.com> <20090430.011738.1617886960.imp@bsdimp.com> Message-ID: <261c29700904300029s6757d39ei86fbf69ef816fa48@mail.gmail.com> 2009/4/30 M. Warner Losh > In message: <261c29700904300004k4bc02635m2729a0a54c09c135@mail.gmail.com> > Miki writes: > : 2009/4/29 M. Warner Losh > : > : > : I have a problem with a D-Link DFE-670TXD which is handled by if_ed : > : > : the link state is constantly going down and up : > : > : Apr 28 14:21:33 iut-mir-o kernel: ed0: link state changed to DOWN > : > : Apr 28 14:21:35 iut-mir-o kernel: ed0: link state changed to UP > : > ... > : > : the problem appear with the following commit : > : > : SVN rev 190643 on 2009-04-02 16:58:45Z by imp (CVS rev 1.126) > : > : I do not see any link state change if I revert the commit. > : > > : > Doh! > : > > : > I needed to force auto negotiation for other cards to work. Let me > : > see if I can dig up the DFE-670TXD and go from there... Are you also > : > seeing really horrible network performance as well? Do you see this > : > only under load, or just at idle? > : > > : > Warner > : > > : > : Yes network performance suffers from this. The problem only appears under > : load > : but not when idle. > > Thanks. I'll try to reproduce it here. I noticed this on one of the > cards, but had trouble reproducing it, but I'll try harder. > > Warner > I can easily reproduce this by downloading an ISO image via ftp and doing a checkout of a subversion repository Mikael From imp at bsdimp.com Thu Apr 30 07:38:35 2009 From: imp at bsdimp.com (M. 
Warner Losh) Date: Thu Apr 30 07:38:42 2009 Subject: [ed] link state constantly going down and up In-Reply-To: <261c29700904300029s6757d39ei86fbf69ef816fa48@mail.gmail.com> References: <261c29700904300004k4bc02635m2729a0a54c09c135@mail.gmail.com> <20090430.011738.1617886960.imp@bsdimp.com> <261c29700904300029s6757d39ei86fbf69ef816fa48@mail.gmail.com> Message-ID: <20090430.013610.439575052.imp@bsdimp.com> In message: <261c29700904300029s6757d39ei86fbf69ef816fa48@mail.gmail.com> Miki writes: : 2009/4/30 M. Warner Losh : : > In message: <261c29700904300004k4bc02635m2729a0a54c09c135@mail.gmail.com> : > Miki writes: : > : 2009/4/29 M. Warner Losh : > : : > : > : I have a problem with a D-Link DFE-670TXD which is handled by if_ed : : > : > : the link state is constantly going down and up : : > : > : Apr 28 14:21:33 iut-mir-o kernel: ed0: link state changed to DOWN : > : > : Apr 28 14:21:35 iut-mir-o kernel: ed0: link state changed to UP : > : > ... : > : > : the problem appear with the following commit : : > : > : SVN rev 190643 on 2009-04-02 16:58:45Z by imp (CVS rev 1.126) : > : > : I do not see any link state change if I revert the commit. : > : > : > : > Doh! : > : > : > : > I needed to force auto negotiation for other cards to work. Let me : > : > see if I can dig up the DFE-670TXD and go from there... Are you also : > : > seeing really horrible network performance as well? Do you see this : > : > only under load, or just at idle? : > : > : > : > Warner : > : > : > : : > : Yes network performance suffers from this. The problem only appears under : > : load : > : but not when idle. : > : > Thanks. I'll try to reproduce it here. I noticed this on one of the : > cards, but had trouble reproducing it, but I'll try harder. : > : > Warner : > : : I can easily reproduce this by downloading an ISO image via ftp : and doing a checkout of a subversion repository OK. I'll try that. Do you know if you are able to trigger it with ttcp too? 
That's where I saw the odd symptoms before... Warner From ndenev at gmail.com Thu Apr 30 11:57:05 2009 From: ndenev at gmail.com (Nikolay Denev) Date: Thu Apr 30 11:57:12 2009 Subject: bce(4) sees all incoming frames as 2026 bytes in length In-Reply-To: References: <5E915E92-2B82-4331-9493-739568CC6E8C@gmail.com> <2e77fc10904290733m4858172ayd96654f3a9a3a8a@mail.gmail.com> Message-ID: On Apr 29, 2009, at 7:04 PM, pluknet wrote: > 2009/4/29 Niki Denev : > >> bce1: mem >> 0xf8000000-0xf9ffffff irq 16 at device 0.0 on pci3 >> bce1: Ethernet address: 00:22:19:xx:xx:xx >> bce1: [ITHREAD] >> bce1: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C >> (0x04040105); Flags( MFW MSI ) >> >> bce1: flags=8843 metric 0 >> mtu 1500 >> >> options >> = >> 1bb >> ether 00:22:19:xx:xx:xx >> inet 10.18.2.1 netmask 0xffffff00 broadcast 10.18.2.255 >> media: Ethernet autoselect (1000baseTX ) >> status: active >> >> And here is a tcpdump that shows the problem : >> >> 16:27:32.593808 00:22:19:yy:yy:yy > 00:22:19:xx:xx:xx, ethertype IPv4 >> (0x0800), length 2026: (tos 0x0, ttl 64, id 45347, offset 0, flags >> [none], proto ICMP (1), length 84) 10.18.2.2 > 10.18.2.1: ICMP echo >> request, id 13578, seq 36, length 64 >> 16:27:32.593817 00:22:19:xx:xx:xx > 00:22:19:yy:yy:yy, ethertype IPv4 >> (0x0800), length 98: (tos 0x0, ttl 64, id 18415, offset 0, flags >> [none], proto ICMP (1), length 84) 10.18.2.1 > 10.18.2.2: ICMP echo >> reply, id 13578, seq 36, length 64 > > Ok, now I see. A link level length is 2026 for me too for some sort > of packets > (in opposite to proto's len where all is ok). > > Mine nic is > (same as yours). > > Looks like a regression. > I just also tested 7.1-R and it shows expected LL-length. > > > -- > wbr, > pluknet I think I got it. It seems that the mbuf fields m_pkthdr.len and m_len are not updated to the real packet size pkt_len. Well, actually they are updated, but only if we have ZERO_COPY_SOCKETS defined. 
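A minimal sketch of the symptom described above, using mock structures rather than the real if_bce.c code. The names fake_mbuf, fake_pkthdr, rx_buf_post and rx_buf_trim are hypothetical, and the 2026-byte buffer size is assumed only because it is the length tcpdump reported. The idea is that a receive buffer is posted with its length fields describing the whole buffer, and unless they are trimmed to pkt_len afterwards, every received frame appears to be as long as the buffer.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two mbuf length fields named above. */
struct fake_pkthdr {
    int len;                    /* total length of the packet */
};

struct fake_mbuf {
    int m_len;                  /* bytes of data in this buffer */
    struct fake_pkthdr m_pkthdr;
};

/*
 * Assumption for illustration only: the posted receive buffer is as large
 * as the 2026 bytes that tcpdump reported.
 */
enum { RX_BUF_BYTES = 2026 };

/* A freshly posted receive buffer describes the whole buffer, not a packet. */
void rx_buf_post(struct fake_mbuf *m)
{
    m->m_len = RX_BUF_BYTES;
    m->m_pkthdr.len = RX_BUF_BYTES;
}

/* Trim both length fields to the size reported for the received frame. */
void rx_buf_trim(struct fake_mbuf *m, int pkt_len)
{
    assert(pkt_len >= 0 && pkt_len <= RX_BUF_BYTES);
    m->m_pkthdr.len = m->m_len = pkt_len;
}
```

If rx_buf_trim() is skipped, anything that reads m_pkthdr.len (a packet capture, for instance) sees 2026 for every frame, while the protocol headers inside the data still carry the correct lengths.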
After I added this : m0->m_pkthdr.len = m0->m_len = pkt_len; at about line 5930 in if_bce.c, the frame length reported by tcpdump seems correct. P.S.: I guess this could be the cause for the lagg(4) over bce(4) problems too? P.S.2: This fix will probably break the ZERO_COPY_SOCKETS case, but should be fairly easy to make it a "proper" fix. Regards, Niki Denev -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 195 bytes Desc: This is a digitally signed message part Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090430/5217c8e7/PGP.pgp From ndenev at gmail.com Thu Apr 30 12:04:28 2009 From: ndenev at gmail.com (Nikolay Denev) Date: Thu Apr 30 12:04:35 2009 Subject: bce(4) and lagg(4) fix [was: bce(4) sees all incoming frames as 2026 bytes in length] In-Reply-To: References: <5E915E92-2B82-4331-9493-739568CC6E8C@gmail.com> <2e77fc10904290733m4858172ayd96654f3a9a3a8a@mail.gmail.com> Message-ID: <5FD800FE-23E5-482E-8491-564FE52D91D5@gmail.com> On Apr 30, 2009, at 2:56 PM, Nikolay Denev wrote: > On Apr 29, 2009, at 7:04 PM, pluknet wrote: > >> 2009/4/29 Niki Denev : >> >>> bce1: mem >>> 0xf8000000-0xf9ffffff irq 16 at device 0.0 on pci3 >>> bce1: Ethernet address: 00:22:19:xx:xx:xx >>> bce1: [ITHREAD] >>> bce1: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C >>> (0x04040105); Flags( MFW MSI ) >>> >>> bce1: flags=8843 metric 0 >>> mtu 1500 >>> >>> options >>> = >>> 1bb >>> >>> ether 00:22:19:xx:xx:xx >>> inet 10.18.2.1 netmask 0xffffff00 broadcast 10.18.2.255 >>> media: Ethernet autoselect (1000baseTX ) >>> status: active >>> >>> And here is a tcpdump that shows the problem : >>> >>> 16:27:32.593808 00:22:19:yy:yy:yy > 00:22:19:xx:xx:xx, ethertype >>> IPv4 >>> (0x0800), length 2026: (tos 0x0, ttl 64, id 45347, offset 0, flags >>> [none], proto ICMP (1), length 84) 10.18.2.2 > 10.18.2.1: ICMP echo >>> request, id 13578, seq 36, length 64 >>> 16:27:32.593817 
00:22:19:xx:xx:xx > 00:22:19:yy:yy:yy, ethertype >>> IPv4 >>> (0x0800), length 98: (tos 0x0, ttl 64, id 18415, offset 0, flags >>> [none], proto ICMP (1), length 84) 10.18.2.1 > 10.18.2.2: ICMP echo >>> reply, id 13578, seq 36, length 64 >> >> Ok, now I see. A link level length is 2026 for me too for some sort >> of packets >> (in opposite to proto's len where all is ok). >> >> Mine nic is >> (same as yours). >> >> Looks like a regression. >> I just also tested 7.1-R and it shows expected LL-length. >> >> >> -- >> wbr, >> pluknet > > > I think I got it. > > It seems that the mbuf fields m_pkthdr.len and m_len are not updated > to the real packet size pkt_len. > Well, actually they are updated, but only if we have > ZERO_COPY_SOCKETS defined. > > After I added this : > > m0->m_pkthdr.len = m0->m_len = pkt_len; > > at about line 5930 in if_bce.c, the frame length reported by tcpdump > seems correct. > > P.S.: I guess this could be the cause for the lagg(4) over bce(4) > problems too? > > P.S.2: This fix will probably break the ZERO_COPY_SOCKETS case, but > should be fairly easy to make it a "proper" fix. > > Regards, > Niki Denev > I can confirm that with this fix I was able to create lagg(4) interface in "failover" mode with only one member, a bce(4) interface, and it seems to work OK. Regards, Niki Denev -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.sig Type: application/pgp-signature Size: 195 bytes Desc: This is a digitally signed message part Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090430/984e0c70/PGP.pgp From ndenev at gmail.com Thu Apr 30 12:38:25 2009 From: ndenev at gmail.com (Nikolay Denev) Date: Thu Apr 30 12:38:32 2009 Subject: bce(4) and lagg(4) fix [was: bce(4) sees all incoming frames as 2026 bytes in length] In-Reply-To: <5FD800FE-23E5-482E-8491-564FE52D91D5@gmail.com> References: <5E915E92-2B82-4331-9493-739568CC6E8C@gmail.com> <2e77fc10904290733m4858172ayd96654f3a9a3a8a@mail.gmail.com> <5FD800FE-23E5-482E-8491-564FE52D91D5@gmail.com> Message-ID: On Apr 30, 2009, at 3:04 PM, Nikolay Denev wrote: [snip] >> >> I think I got it. >> >> It seems that the mbuf fields m_pkthdr.len and m_len are not >> updated to the real packet size pkt_len. >> Well, actually they are updated, but only if we have >> ZERO_COPY_SOCKETS defined. >> >> After I added this : >> >> m0->m_pkthdr.len = m0->m_len = pkt_len; >> >> at about line 5930 in if_bce.c, the frame length reported by >> tcpdump seems correct. >> >> P.S.: I guess this could be the cause for the lagg(4) over bce(4) >> problems too? >> >> P.S.2: This fix will probably break the ZERO_COPY_SOCKETS case, but >> should be fairly easy to make it a "proper" fix. >> >> Regards, >> Niki Denev >> > > I can confirm that with this fix I was able to create lagg(4) > interface in "failover" mode with only one member, a bce(4) > interface, and it seems to work OK. > > Here is the patch : --- sys/dev/bce/if_bce.c.orig 2009-04-30 14:06:54.000000000 +0200 +++ sys/dev/bce/if_bce.c 2009-04-30 14:11:32.000000000 +0200 @@ -5926,6 +5926,11 @@ goto bce_rx_int_next_rx; } +#ifndef ZERO_COPY_SOCKETS + /* Adjust the packet length to match the received data. */ + m0->m_pkthdr.len = m0->m_len = pkt_len; +#endif + /* Send the packet to the appropriate interface. 
*/ m0->m_pkthdr.rcvif = ifp; Regards, Niki Denev -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 195 bytes Desc: This is a digitally signed message part Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20090430/d5b4dd78/PGP.pgp From pluknet at gmail.com Thu Apr 30 13:11:29 2009 From: pluknet at gmail.com (pluknet) Date: Thu Apr 30 13:11:57 2009 Subject: bce(4) sees all incoming frames as 2026 bytes in length In-Reply-To: References: <5E915E92-2B82-4331-9493-739568CC6E8C@gmail.com> <2e77fc10904290733m4858172ayd96654f3a9a3a8a@mail.gmail.com> Message-ID: 2009/4/30 Nikolay Denev : > On Apr 29, 2009, at 7:04 PM, pluknet wrote: > >> 2009/4/29 Niki Denev : >> >>> bce1: mem >>> 0xf8000000-0xf9ffffff irq 16 at device 0.0 on pci3 >>> bce1: Ethernet address: 00:22:19:xx:xx:xx >>> bce1: [ITHREAD] >>> bce1: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C >>> (0x04040105); Flags( MFW MSI ) >>> >>> bce1: flags=8843 metric 0 mtu >>> 1500 >>> >>> options=1bb >>> ether 00:22:19:xx:xx:xx >>> inet 10.18.2.1 netmask 0xffffff00 broadcast 10.18.2.255 >>> media: Ethernet autoselect (1000baseTX ) >>> status: active >>> >>> And here is a tcpdump that shows the problem : >>> >>> 16:27:32.593808 00:22:19:yy:yy:yy > 00:22:19:xx:xx:xx, ethertype IPv4 >>> (0x0800), length 2026: (tos 0x0, ttl 64, id 45347, offset 0, flags >>> [none], proto ICMP (1), length 84) 10.18.2.2 > 10.18.2.1: ICMP echo >>> request, id 13578, seq 36, length 64 >>> 16:27:32.593817 00:22:19:xx:xx:xx > 00:22:19:yy:yy:yy, ethertype IPv4 >>> (0x0800), length 98: (tos 0x0, ttl 64, id 18415, offset 0, flags >>> [none], proto ICMP (1), length 84) 10.18.2.1 > 10.18.2.2: ICMP echo >>> reply, id 13578, seq 36, length 64 >> >> Ok, now I see. A link level length is 2026 for me too for some sort of >> packets >> (in opposite to proto's len where all is ok). >> >> Mine nic is >> (same as yours). 
>> >> Looks like a regression. >> I just also tested 7.1-R and it shows expected LL-length. >> >> >> -- >> wbr, >> pluknet > > > I think I got it. > > It seems that the mbuf fields m_pkthdr.len and m_len are not updated to the > real packet size pkt_len. > Well, actually they are updated, but only if we have ZERO_COPY_SOCKETS > defined. > > After I added this : > > m0->m_pkthdr.len = m0->m_len = pkt_len; > > at about line 5930 in if_bce.c, the frame length reported by tcpdump seems > correct. > JFYI: the 7.1-R version of the driver updated the length, and that update seems to have been lost in 7.2-R. /* Skip over the l2_fhdr when passing the data up the s m_adj(m, sizeof(struct l2_fhdr) + ETHER_ALIGN); /* Adjust the packet length to match the received data. m->m_pkthdr.len = m->m_len = len; /* Send the packet to the appropriate interface. */ m->m_pkthdr.rcvif = ifp; > P.S.: I guess this could be the cause for the lagg(4) over bce(4) problems > too? > > P.S.2: This fix will probably break the ZERO_COPY_SOCKETS case, but should > be fairly easy to make it a "proper" fix. At least it can be ifndef'ed out in the ZERO_COPY_SOCKETS case. @@ -5926,6 +5926,11 @@ goto bce_rx_int_next_rx; } +#ifndef ZERO_COPY_SOCKETS + /* Set the total packet length. */ + m0->m_pkthdr.len = m0->m_len = pkt_len; +#endif + /* Send the packet to the appropriate interface. */ m0->m_pkthdr.rcvif = ifp; -eop- -- wbr, pluknet From miki.bsd at gmail.com Thu Apr 30 13:19:34 2009 From: miki.bsd at gmail.com (Miki) Date: Thu Apr 30 13:19:40 2009 Subject: [ed] link state constantly going down and up In-Reply-To: <20090430.013610.439575052.imp@bsdimp.com> References: <261c29700904300004k4bc02635m2729a0a54c09c135@mail.gmail.com> <20090430.011738.1617886960.imp@bsdimp.com> <261c29700904300029s6757d39ei86fbf69ef816fa48@mail.gmail.com> <20090430.013610.439575052.imp@bsdimp.com> Message-ID: <261c29700904300619l3f2e2d68s580f95a3c673fddf@mail.gmail.com> 2009/4/30 M. 
Warner Losh > In message: <261c29700904300029s6757d39ei86fbf69ef816fa48@mail.gmail.com> > Miki writes: > : 2009/4/30 M. Warner Losh > : > : > In message: < > 261c29700904300004k4bc02635m2729a0a54c09c135@mail.gmail.com> > : > Miki writes: > : > : 2009/4/29 M. Warner Losh > : > : > : > : > : I have a problem with a D-Link DFE-670TXD which is handled by > if_ed : > : > : > : the link state is constantly going down and up : > : > : > : Apr 28 14:21:33 iut-mir-o kernel: ed0: link state changed to DOWN > : > : > : Apr 28 14:21:35 iut-mir-o kernel: ed0: link state changed to UP > : > : > ... > : > : > : the problem appear with the following commit : > : > : > : SVN rev 190643 on 2009-04-02 16:58:45Z by imp (CVS rev 1.126) > : > : > : I do not see any link state change if I revert the commit. > : > : > > : > : > Doh! > : > : > > : > : > I needed to force auto negotiation for other cards to work. Let me > : > : > see if I can dig up the DFE-670TXD and go from there... Are you > also > : > : > seeing really horrible network performance as well? Do you see > this > : > : > only under load, or just at idle? > : > : > > : > : > Warner > : > : > > : > : > : > : Yes network performance suffers from this. The problem only appears > under > : > : load > : > : but not when idle. > : > > : > Thanks. I'll try to reproduce it here. I noticed this on one of the > : > cards, but had trouble reproducing it, but I'll try harder. > : > > : > Warner > : > > : > : I can easily reproduce this by downloading an ISO image via ftp > : and doing a checkout of a subversion repository > > OK. I'll try that. Do you know if you are able to trigger it with > ttcp too? That's where I saw the odd symptoms before... > > Warner > No I'm unable to trigger it with ttcp. 
From pluknet at gmail.com Thu Apr 30 13:33:02 2009 From: pluknet at gmail.com (pluknet) Date: Thu Apr 30 13:33:09 2009 Subject: bce(4) and lagg(4) fix [was: bce(4) sees all incoming frames as 2026 bytes in length] In-Reply-To: References: <5E915E92-2B82-4331-9493-739568CC6E8C@gmail.com> <2e77fc10904290733m4858172ayd96654f3a9a3a8a@mail.gmail.com> <5FD800FE-23E5-482E-8491-564FE52D91D5@gmail.com> Message-ID: 2009/4/30 Nikolay Denev : > On Apr 30, 2009, at 3:04 PM, Nikolay Denev wrote: > [snip] >>> >>> I think I got it. >>> >>> It seems that the mbuf fields m_pkthdr.len and m_len are not updated to >>> the real packet size pkt_len. >>> Well, actually they are updated, but only if we have ZERO_COPY_SOCKETS >>> defined. >>> >>> After I added this : >>> >>> m0->m_pkthdr.len = m0->m_len = pkt_len; >>> >>> at about line 5930 in if_bce.c, the frame length reported by tcpdump >>> seems correct. >>> >>> P.S.: I guess this could be the cause for the lagg(4) over bce(4) >>> problems too? >>> >>> P.S.2: This fix will probably break the ZERO_COPY_SOCKETS case, but >>> should be fairly easy to make it a "proper" fix. >>> >>> Regards, >>> Niki Denev >>> >> >> I can confirm that with this fix I was able to create lagg(4) interface in >> "failover" mode with only one member, a bce(4) interface, and it seems to >> work OK. >> >> > > Here is the patch : > > --- sys/dev/bce/if_bce.c.orig 2009-04-30 14:06:54.000000000 +0200 > +++ sys/dev/bce/if_bce.c 2009-04-30 14:11:32.000000000 +0200 > @@ -5926,6 +5926,11 @@ > goto bce_rx_int_next_rx; > } > > +#ifndef ZERO_COPY_SOCKETS > + /* Adjust the packet length to match the received data. */ > + m0->m_pkthdr.len = m0->m_len = pkt_len; > +#endif > + > /* Send the packet to the appropriate interface. */ > m0->m_pkthdr.rcvif = ifp; > Ha-ha, you were fast! The only note: I think the comment part should be consistent with the ZERO_COPY_SOCKETS case. 
Thank you. -- wbr, pluknet From adrian at freebsd.org Thu Apr 30 15:47:44 2009 From: adrian at freebsd.org (Adrian Chadd) Date: Thu Apr 30 15:47:54 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <20090429132156.GA42816@owl.midgard.homeip.net> References: <49F7709F.1020409@modulus.org> <172091.41695.qm@web63907.mail.re1.yahoo.com> <20090429132156.GA42816@owl.midgard.homeip.net> Message-ID: 2009/4/29 Erik Trulsson : > That appearance is probably due to the fact that the FreeBSD project actually > is a bunch of dudes working on what they feel like doing (or in a few cases > on what they get paid for doing), and that there is very little centralized > planning being done. (And even if there was, there is no way of enforcing > that people work according to such a plan.) There's more centralised planning in the network stack than you seem to think there is. Personally, I'd like to see some of the multi-thread em stuff (iirc for non-multi-threaded cards) that some company has written and kept up to date make it into -current as it obviously works for them and may work well for other people. But "stuff" is happening, along a rough consensus that will probably play out some more during BSDCan in the upcoming week or so. Pay attention to what Robert, Jeff and Kip (may) talk about there. 2c (as an observer of all of this..) Adrian From adrian at freebsd.org Thu Apr 30 15:51:05 2009 From: adrian at freebsd.org (Adrian Chadd) Date: Thu Apr 30 15:51:12 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: <628939.54925.qm@web63901.mail.re1.yahoo.com> References: <20090429132156.GA42816@owl.midgard.homeip.net> <628939.54925.qm@web63901.mail.re1.yahoo.com> Message-ID: 2009/4/30 Barney Cordoba : > It's one of the sad truths of FreeBSD. 
You'd think with such a large number > of commercial users you'd be able to get plenty of funding for the things > that really need to be done, rather than taking whatever bread crumbs > are thrown your way. Perhaps you need fewer bearded academics and a few > more suits to run the project more like a business than an extended > master's thesis? That is happening. Robert, Kris (and others) are working on parallelising the network stack all the way up and down. Kip has been working on dramatically improving TCP connection and packet forwarding scalability to support 10GE. This is in part commercially funded work. The problem with "commercial funding" is that for the most part, FreeBSD/Linux/etc "mostly work" for most use cases. What you're not seeing is 100% contribution back from commercial organisations who have extended FreeBSD (and Linux for that matter) in their environment to fix specific performance constraints. This is finally changing and stuff is being pushed back into the public tree. 2c, Adrian From pluknet at gmail.com Thu Apr 30 16:11:49 2009 From: pluknet at gmail.com (pluknet) Date: Thu Apr 30 16:11:56 2009 Subject: Interrupts + Polling mode (similar to Linux's NAPI) In-Reply-To: References: <49F7709F.1020409@modulus.org> <172091.41695.qm@web63907.mail.re1.yahoo.com> <20090429132156.GA42816@owl.midgard.homeip.net> Message-ID: 2009/4/30 Adrian Chadd : > 2009/4/29 Erik Trulsson : >> That appearance is probably due to the fact that the FreeBSD project actually >> is a bunch of dudes working on what they feel like doing (or in a few cases >> on what they get paid for doing), and that there is very little centralized >> planning being done. (And even if there was, there is no way of enforcing >> that people work according to such a plan.) > > There's more centralised planning in the network stack than you seem > to think there is. 
> > Personally, I'd like to see some of the multi-thread em stuff (iirc > for non-multi-threaded cards) that some company has written and kept > up to date make it into -current as it obviously works for them and > may work well for other people. My bit of FYI: that company is Yandex, one of the largest Russian search engines (first and foremost) :p [1][2]. [1] http://people.yandex-team.ru/~wawa/ [2] http://company.yandex.com/general_info/yandex_today.xml (just my 2 Russian copecks) -- wbr, pluknet From vinnix.bsd at gmail.com Thu Apr 30 17:34:52 2009 From: vinnix.bsd at gmail.com (Vinicius Abrahao) Date: Thu Apr 30 17:34:58 2009 Subject: Problem with lagg failover (using bge0 and wpi0 interfaces) In-Reply-To: <1e31c7980904291605h55244aech10a7725d59fd0cd@mail.gmail.com> References: <1e31c7980904291605h55244aech10a7725d59fd0cd@mail.gmail.com> Message-ID: <1e31c7980904301034y70c678er5728f8ed6b371bf5@mail.gmail.com> Does anyone know if my problem [1] is related to this PR? http://www.freebsd.org/cgi/query-pr.cgi?pr=133178 [1]: http://lists.freebsd.org/pipermail/freebsd-net/2009-April/021873.html Tks, Vinicius From andrea at brancatelli.it Thu Apr 30 19:44:21 2009 From: andrea at brancatelli.it (andrea@brancatelli.it) Date: Thu Apr 30 19:44:27 2009 Subject: lagg LACP between two hosts Message-ID: Hello everybody, I have a strange curiosity maybe you can clarify me :-) Is it possible to do a LACP lagg connection directly between two hosts using two gigalan and two crossed cables? Or maybe three... 
;-) Thanks ;-) From thompsa at FreeBSD.org Thu Apr 30 20:06:00 2009 From: thompsa at FreeBSD.org (Andrew Thompson) Date: Thu Apr 30 20:06:06 2009 Subject: lagg LACP between two hosts In-Reply-To: References: Message-ID: <20090430194820.GA67455@citylink.fud.org.nz> On Thu, Apr 30, 2009 at 09:14:04PM +0200, andrea@brancatelli.it wrote: > > Hello everybody, > > I have a strange curiosity maybe you can clarify me :-) > > Is it possible to do a LACP lagg connection directly between two hosts > using two gigalan and two crossed cables? Or maybe three... ;-) Yes, that will work fine. The load balancing across the link uses the mac+ip to hash so you need variation in those to split the traffic. Andrew From steve at ibctech.ca Thu Apr 30 20:07:37 2009 From: steve at ibctech.ca (Steve Bertrand) Date: Thu Apr 30 20:07:44 2009 Subject: lagg LACP between two hosts In-Reply-To: References: Message-ID: <49FA0502.4040302@ibctech.ca> andrea@brancatelli.it wrote: > Hello everybody, > > I have a strange curiosity maybe you can clarify me :-) > > Is it possible to do a LACP lagg connection directly between two hosts > using two gigalan and two crossed cables? Or maybe three... ;-) I've done it with two GigE nics, and it works perfectly well. Steve