FreeBSD IP Forwarding performance (question, and some info) [7-stable, current, em, smp]

Paul paul at gtcomm.net
Wed Jul 2 09:47:07 UTC 2008


The ipfw rule was simply: allow ip from any to any :)
This is the 64-bit install I'm testing now. I tested a 32-bit install on
another machine, but it only has a bge NIC and wasn't performing as well,
so I'll reinstall 32-bit on this 2212 and test, then drop in the 2222
(3 GHz) and test again.
I still don't like the huge hit that ipfw and lagg take :/
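
For a rough sense of how big the ipfw hit is, the figures in the quoted message further down (about 750 kpps without ipfw and about 570 kpps with one rule, in UP mode on the 2 GHz Opteron 2212) can be turned into a per-packet cost. This is only back-of-the-envelope arithmetic: it assumes a single core is the bottleneck in both runs and that the clock is exactly 2.0 GHz. The function name is made up for illustration.

```python
def per_packet_overhead(pps_without, pps_with, cpu_hz):
    """Return (extra seconds, extra cycles) spent per packet.

    Assumption: one core is the bottleneck in both runs, so 1/pps is
    the time spent on each forwarded packet.
    """
    extra_seconds = 1.0 / pps_with - 1.0 / pps_without
    return extra_seconds, extra_seconds * cpu_hz


# Figures from the quoted message: ~750 kpps plain, ~570 kpps with one rule.
seconds, cycles = per_packet_overhead(750_000, 570_000, 2.0e9)
print(f"extra time per packet: {seconds * 1e9:.0f} ns (~{cycles:.0f} cycles)")
```

Under those assumptions the single rule costs roughly 0.4 microseconds per packet. That is an estimate, not a measurement of ipfw itself.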

** I tried polling in UP mode and got some VERY interesting results.
The CPU is 44% idle (idle polling isn't on), but I'm getting errors! It's
doing 530 kpps with ipfw loaded, which without polling uses 100% CPU, yet
now it says the CPU is 44% idle? That makes no sense. If it was idle, why
am I getting errors? Before, I only got errors when the em taskq was eating 100% CPU.
Turning idle polling on or off makes no difference.
user_frac is set to 5.
last pid:  1598;  load averages:  0.01,  0.16,  0.43    up 0+00:34:41  04:04:43
66 processes:  2 running, 46 sleeping, 18 waiting
CPU:  0.0% user,  0.0% nice,  7.3% system, 46.5% interrupt, 46.2% idle
Mem: 8064K Active, 6808K Inact, 43M Wired, 92K Cache, 9264K Buf, 1923M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   10 root     171 ki31     0K    16K RUN     10:10 88.87% idle
 1598 root      45    0  8084K  2052K RUN      0:00  1.12% top
   11 root     -32    -     0K    16K WAIT     0:02  0.24% swi4: clock sio
   13 root     -44    -     0K    16K WAIT    14:13  0.15% swi1: net
 1329 root      44    0 33732K  4572K select   0:00  0.05% sshd

          input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    541186 68741   33107504          1     0          0     0
    540036 70611   33044632          1     0        178     0
    540470 66493   33043148          1     0        178     0
    541903 67981   33125414          1     0        178     0
    541238 84979   33105898          1     0        178     0
    541338 74067   33115984          2     0        356     0
    539116 49286   32991516          2     0        220     0
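
To put a number on the errors, the samples above can be summed into a drop fraction. This is a minimal sketch and rests on one loud assumption: that the errs column counts frames the NIC could not hand to the stack and that those frames are not included in the packets column, so offered load = packets + errs. The helper name is hypothetical.

```python
def drop_fraction(rows):
    """rows: iterable of (packets, errs), one pair per one-second sample.

    Assumption: errs are NOT included in packets, so the offered load
    is packets + errs.
    """
    packets = sum(p for p, _ in rows)
    errs = sum(e for _, e in rows)
    offered = packets + errs
    return errs / offered if offered else 0.0


# Input columns copied from the samples above (ipfw loaded, polling on).
samples = [
    (541186, 68741),
    (540036, 70611),
    (540470, 66493),
    (541903, 67981),
    (541238, 84979),
    (541338, 74067),
    (539116, 49286),
]
print(f"dropped: {drop_fraction(samples):.1%} of offered packets")
```

If the assumption holds, the offered load works out to about 610 kpps, which is in line with the 600-626 kpps that gets through once ipfw is unloaded.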


kldunload ipfw.......

          input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    600589     0   36751064          1     0        226     0
    606294     0   37102868          2     0        220     0
    616802     0   37733866          1     0        178     0
    623017     0   38117436          1     0        178     0
    624800     0   38225470          1     0        178     0
    626791     0   38347426          1     0        178     0

last pid:  1605;  load averages:  0.00,  0.13,  0.40    up 0+00:35:30  04:05:32
66 processes:  2 running, 46 sleeping, 18 waiting
CPU:  0.0% user,  0.0% nice,  7.1% system, 36.0% interrupt, 56.9% idle
Mem: 8064K Active, 6812K Inact, 43M Wired, 92K Cache, 9264K Buf, 1923M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   10 root     171 ki31     0K    16K RUN     10:16 95.36% idle
   13 root     -44    -     0K    16K WAIT    14:53  0.24% swi1: net
   36 root     -68    -     0K    16K -        1:03  0.10% em3 taskq
 1605 root      44    0  8084K  2052K RUN      0:00  0.10% top
   11 root     -32    -     0K    16K WAIT     0:02  0.05% swi4: clock sio



add some more PPS......
          input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    749015 169684   46438936          1     0         42     0
    749176 184574   46448916          1     0        178     0
    759576 188462   47093716          1     0        178     0
    762904 182854   47300052          1     0        178     0
    798039 147509   49478422          1     0        178     0
    759528 194297   47090740          1     0        178     0
    746849 195935   46304642          1     0        178     0
    747566 186703   46349096          1     0        178     0
    750011 181630   46500702          2


last pid:  1607;  load averages:  0.19,  0.17,  0.40    up 0+00:36:18  04:06:20
66 processes:  2 running, 46 sleeping, 18 waiting
CPU:  0.0% user,  0.0% nice, 12.5% system, 45.4% interrupt, 42.1% idle
Mem: 8068K Active, 6808K Inact, 43M Wired, 92K Cache, 9264K Buf, 1923M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   10 root     171 ki31     0K    16K RUN     10:21 85.64% idle
   36 root     -68    -     0K    16K -        1:07  3.61% em3 taskq
 1607 root      44    0  8084K  2052K RUN      0:00  0.93% top
   13 root     -44    -     0K    16K WAIT    15:32  0.20% swi1: net
   11 root     -32    -     0K    16K WAIT     0:02  0.05% swi4: clock sio



So either my maximum without polling is close to 800 kpps, but pushing that
starts locking me out of doing anything, or
my maximum is 750 kpps with polling and the console stays very responsive?
How on EARTH can my CPU be 42% idle with polling while I'm getting all
these errors?
The whole thing makes no sense; something is bugged somewhere.
HZ=2000 for this test (512/512 descriptors)
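
One way to look at the HZ and descriptor settings is to work out how many packets arrive between two polls and how many poll intervals an empty RX ring could absorb. This is a sketch of the arithmetic only, not a diagnosis: it assumes one poll per clock tick, a steady arrival rate of about 750 kpps, and that the whole ring is available for buffering. The function names are hypothetical, and the 1024 in the second row is my reading of the descriptor setting used in the HZ=4000 run.

```python
def packets_per_tick(pps, hz):
    """Average number of packets arriving between two polls (one poll per tick)."""
    return pps / hz


def ticks_until_ring_full(rx_descriptors, pps, hz):
    """How many poll intervals an empty RX ring can absorb before it overflows."""
    return rx_descriptors / packets_per_tick(pps, hz)


# (HZ, RX descriptors) for the polling runs in this message, at a nominal 750 kpps.
# The 1024 entry is an assumption about the HZ=4000 run.
for hz, ring in [(2000, 512), (4000, 1024), (10000, 512)]:
    per_tick = packets_per_tick(750_000, hz)
    slack = ticks_until_ring_full(ring, 750_000, hz)
    print(f"HZ={hz}: {per_tick:.0f} pkts/tick, ring absorbs {slack:.1f} ticks")
```

At HZ=2000 that is 375 packets per tick against a 512-entry ring, so there is little slack if a poll is late. Any per-poll packet limit below the per-tick figure would also let the ring overflow while the CPU still shows idle time; whether that is what is happening here is not established.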

If I lower HZ to 100, I can get a little over 800 kpps without polling.
-------- Going to reboot with HZ=4000 and 1024 rx/tx descriptors
..........
about the same..
           input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    720833 244835   44691662          1     0        178     0
    744746 215689   46174256          1     0        178     0
    744943 194252   46186470          1     0        178     0
    743685 199487   46108486          2     0        356     0
    743715 209263   46110346          2     0        356     0

last pid:  1426;  load averages:  0.22,  0.65,  0.40    up 0+00:07:17  04:16:43
66 processes:  2 running, 46 sleeping, 18 waiting
CPU:  0.4% user,  0.0% nice, 12.8% system, 44.2% interrupt, 42.6% idle
Mem: 8052K Active, 6192K Inact, 46M Wired, 96K Cache, 8944K Buf, 1921M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   10 root     171 ki31     0K    16K RUN      0:49 82.52% idle
   36 root     -68    -     0K    16K -        0:31  6.84% em3 taskq
 1426 root      45    0  8084K  2052K RUN      0:00  1.32% top
   13 root     -44    -     0K    16K WAIT     3:07  0.59% swi1: net
   11 root     -32    -     0K    16K WAIT     0:00  0.05% swi4: clock sio

------ reboot with 2048/2048 descriptors
NOTE: without polling, 128, 256, and 512 descriptors give the best performance
for some strange reason, maybe cache hits.
This is worse:
            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    646290 269912   40080528          0     0          0     0
    672548 250198   41687440          1     0        178     0
    674856 247162   41841076          1     0        178     0
    665062 248851   41233848          1     0        178     0
    671764 253300   41649372         


bah..
------- HZ=10000, 512/512: CPU still will not go below 42% idle
700-720 kpps..
Actually got 40% CPU idle, lol


Oh well. Tomorrow, hopefully, the 2222 test and the 32-bit test, then I'm done
for a while. :P

Paul






Ingo Flaschberger wrote:
> Dear Paul,
>
>> SMP DISABLED on my Opteron 2212  (ULE, Preemption on)
>> Yields ~750kpps in em0 and out em1  (one direction)
>> I am miffed why this yields more pps than
>> a) with all 4 cpus running and b) 4 cpus with lagg load balanced over 
>> 3 incoming connections so 3 taskq threads
>
> because less locking, less synchronisation, ....
>
>> I would be willing to set up test equipment (several servers plugged 
>> into a switch) with ipkvm and power port access
>> if someone or a group of people want to figure out ways to improve 
>> the routing process, ipfw, and lagg.
>>
>> Maximum PPS with one ipfw rule on UP:
>> tops out about 570Kpps.. almost 200kpps lower ? (frown)
>
> can you post the rule here?
>
>> I'm going to drop in a 3ghz opteron instead of the 2ghz 2212 that's 
>> in here and see how that scales, using UP same kernel etc I have now.
>
> really, please try 32bit and 1 cpu.
>
> Kind regards,
>     Ingo Flaschberger


