Bad routing performance on 500MHz Geode LX with CURRENT, ipfw and mpd5

Eugene Grosbein egrosbein at rdtc.ru
Fri Aug 31 05:19:23 UTC 2012


On 01.09.2012 01:07, YongHyeon PYUN wrote:

> It would be interesting to know whether there is any difference
> before/after the taskq change made in r235334.  I was told that the
> taskq conversion for vr(4) resulted in better performance, but I
> think a taskq may add more burden on slow hardware.
> The pre-r235334 interrupt handler has its own issue: it does not
> exit the interrupt handler while any interrupts are still pending,
> so it will consume most of its CPU cycles in the interrupt handler
> under extreme network load.  If pre-r235334 shows a better result,
> you can probably implement interrupt mitigation by using the
> VT6102/VT6105's timer interrupt.  I guess some frames would be lost
> with interrupt mitigation under high network load, but other parts
> of the kernel would have more of a chance to run important tasks.
> Anyway, vr(4) controllers are not among the best choices for slow
> machines, due to their DMA alignment limitation and driver-assisted
> padding requirement.
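
For anyone who has not looked at r235334: below is a rough sketch of the two
interrupt schemes being compared. This is NOT the actual if_vr.c code; the
register and softc names (VR_ISR, vr_rxeof() and so on) merely stand in for
the real driver internals.

/*
 * Pre-r235334 style: an ithread handler that keeps re-reading the
 * interrupt status register and does not return until it reads zero.
 * On a slow CPU under sustained load the ISR may never read back
 * zero, so the ithread can monopolize the CPU.
 */
static void
vr_intr_ithread(void *arg)
{
	struct vr_softc *sc = arg;
	uint16_t status;

	VR_LOCK(sc);
	while ((status = CSR_READ_2(sc, VR_ISR)) != 0) {
		CSR_WRITE_2(sc, VR_ISR, status);	/* ack */
		if (status & VR_ISR_RX_OK)
			vr_rxeof(sc);
		if (status & VR_ISR_TX_OK)
			vr_txeof(sc);
	}
	VR_UNLOCK(sc);
}

/*
 * r235334 style: a fast interrupt filter masks further interrupts
 * and defers the work to a taskqueue(9) task.
 */
static int
vr_intr_filter(void *arg)
{
	struct vr_softc *sc = arg;

	if (CSR_READ_2(sc, VR_ISR) == 0)
		return (FILTER_STRAY);
	CSR_WRITE_2(sc, VR_IMR, 0);		/* mask interrupts */
	taskqueue_enqueue(sc->vr_tq, &sc->vr_inttask);
	return (FILTER_HANDLED);
}

static void
vr_int_task(void *arg, int pending __unused)
{
	struct vr_softc *sc = arg;
	uint16_t status;

	VR_LOCK(sc);
	status = CSR_READ_2(sc, VR_ISR);
	CSR_WRITE_2(sc, VR_ISR, status);	/* ack */
	if (status & VR_ISR_RX_OK)
		vr_rxeof(sc);
	if (status & VR_ISR_TX_OK)
		vr_txeof(sc);
	CSR_WRITE_2(sc, VR_IMR, VR_INTRS);	/* unmask */
	VR_UNLOCK(sc);
}

The taskqueue version lets other kernel work preempt RX/TX processing, but
every enqueue/wakeup costs cycles the old loop did not pay, which may be
exactly the extra burden on slow hardware mentioned above.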

I also have an AMD Geode LX8-based system with two on-board vr(4)
interfaces. I've just tried it with the vr(4) driver from HEAD,
built as a module for my 8.3-STABLE/i386.

It builds just fine with one minor change (8.3's mii(4) does not provide
the PHY_RESET() method used in HEAD, so the call is reverted to
mii_phy_reset()):

--- if_vr.c.orig        2012-08-29 23:36:05.000000000 +0700
+++ if_vr.c     2012-08-29 22:51:01.000000000 +0700
@@ -2176,7 +2176,7 @@
        VR_LOCK(sc);
        mii = device_get_softc(sc->vr_miibus);
        LIST_FOREACH(miisc, &mii->mii_phys, mii_list)
-               PHY_RESET(miisc);
+               mii_phy_reset(miisc);
        sc->vr_flags &= ~(VR_F_LINK | VR_F_TXPAUSE);
        error = mii_mediachg(mii);
        VR_UNLOCK(sc);

dmesg says:

vr0: <VIA VT6105 Rhine III 10/100BaseTX> port 0xe000-0xe0ff mem 0xef024000-0xef0240ff irq 10 at device 12.0 on pci0
vr0: Quirks: 0x0
vr0: Revision: 0x86
miibus0: <MII bus> on vr0
ukphy0: <Generic IEEE 802.3u media interface> PHY 1 on miibus0
ukphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr0: Ethernet address: 00:10:f3:13:72:c6
vr0: [ITHREAD]
vr1: <VIA VT6105 Rhine III 10/100BaseTX> port 0xe400-0xe4ff mem 0xef025000-0xef0250ff irq 11 at device 13.0 on pci0
vr1: Quirks: 0x0
vr1: Revision: 0x86
miibus1: <MII bus> on vr1
ukphy1: <Generic IEEE 802.3u media interface> PHY 1 on miibus1
ukphy1:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr1: Ethernet address: 00:10:f3:13:72:c7
vr1: [ITHREAD]

This is an industrial Nexcom NICE-3120-LX8 fanless PC system used as a home router;
the only miniPCI expansion slot is occupied by an ath(4) WiFi card.
http://www.orbitmicro.com/global/system-4423.html

I have to say that the HEAD driver runs MUCH worse. With the stock 8.3 driver
I get the same 3.35 MByte/s single-stream HTTP transfer through this system,
but the load average is only 1.7 and userland stays pretty responsive. top(1) shows:

last pid: 29696;  load averages:  1.70,  1.08,  0.88        up 2+00:11:31  22:21:46
94 processes:  2 running, 78 sleeping, 14 waiting
CPU:  7.7% user,  0.0% nice,  0.0% system, 15.4% interrupt, 76.9% idle
Mem: 51M Active, 671M Inact, 188M Wired, 18M Cache, 110M Buf, 60M Free
Swap:

  PID USERNAME     PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   11 root         -68    -     0K   112K WAIT   235:11 51.56% intr{irq11: vr1}
   10 root         171 ki31     0K     8K RUN     24.4H 31.15% idle
   11 root         -44    -     0K   112K WAIT     0:51  9.38% intr{swi1: netisr 0}
   11 root         -68    -     0K   112K WAIT     0:30  6.40% intr{irq10: vr0}
29688 root          44    0  3628K  1708K RUN      0:00  0.10% top

With the HEAD driver, for the same test the load average spikes to 8 and higher,
and it takes up to 10 seconds for userland applications like the shell or
screen(1) to respond to physical console input:

last pid:  1335;  load averages:  8.27,  4.05,  2.04        up 0+00:14:21  23:31:18
97 processes:  2 running, 83 sleeping, 12 waiting
CPU:  0.1% user,  0.0% nice, 55.7% system, 43.6% interrupt,  0.6% idle
Mem: 40M Active, 21M Inact, 175M Wired, 2512K Cache, 109M Buf, 749M Free
Swap:

  PID USERNAME     PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   12 root         -16    -     0K     8K sleep    1:12 44.87% ng_queue
   11 root         -28    -     0K    96K WAIT     1:45 35.60% intr{swi5: +}
   11 root         -44    -     0K    96K WAIT     1:03 18.80% intr{swi1: netisr 0}
   10 root         171 ki31     0K     8K RUN      6:34  0.39% idle
   13 root         -16    -     0K     8K -        0:07  0.10% yarrow

That's with direct NETISR dispatch mode; indirect (deferred) mode only makes it
worse (the load average is higher for both drivers: up to 4.5 for the old one
and up to 9+ for the new one).
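
For context, "direct" vs. "indirect" here means the netisr(9) dispatch policy
(toggled by the net.isr.direct sysctl on 8.x, if I recall the name correctly).
A minimal sketch in KPI terms, not the actual ether input code; the function
name is made up for illustration:

#include <sys/param.h>
#include <sys/mbuf.h>
#include <net/netisr.h>

static void
dispatch_sketch(struct mbuf *m, int direct)
{
	if (direct) {
		/* Direct: the caller (e.g. the vr1 ithread) runs the
		 * IP input path itself; no handoff, but the interrupt
		 * thread runs longer per packet. */
		netisr_dispatch(NETISR_IP, m);
	} else {
		/* Deferred ("indirect"): the mbuf is queued and the
		 * "swi1: netisr 0" kthread picks it up later; the
		 * extra queueing and context switch per packet is
		 * what pushes the load average up here. */
		netisr_queue(NETISR_IP, m);
	}
}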

I ran the tests with the same custom kernel, loading/unloading the old/new
drivers as modules without rebooting. The scheduler is the default SCHED_ULE.
Another note: I run mpd/PPPoE/ng0 over vr1, and the HTTP transfers went through
ng0, so every packet also traverses netgraph; that is why ng_queue shows up in
the HEAD-driver top(1) output above.

Eugene Grosbein

