FreeBSD 10G forwarding performance @Intel
Luigi Rizzo
rizzo at iet.unipi.it
Fri Jul 6 05:52:05 UTC 2012
On Thu, Jul 05, 2012 at 05:40:37PM +0400, Alexander V. Chernikov wrote:
> On 04.07.2012 19:48, Luigi Rizzo wrote:
...
> Traffic stats with as many counters as possible eliminated
> (the ixgbe code could update the rx/tx packet counters once per
> rx_process_limit, which is 100 by default):
>
>             input         (ix0)            output
>    packets   errs idrops    bytes    packets   errs    bytes  colls
>       2.8M      0      0     186M       2.8M      0     186M      0
>       2.8M      0      0     187M       2.8M      0     186M      0
>
> Also, netstat seems to use 1024 as the divisor (no HN_DIVISOR_1000 is
> passed in if.c to show_stat), so the real frame count from the Ixia side
> is much closer to 3 Mpps (~2.961600 Mpps).
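
A quick check of the divisor arithmetic quoted above, as a minimal sketch.
It assumes the "M" column is simply the raw count divided by divisor^2; the
helper name scale_mega is hypothetical and is not netstat's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scale a raw packet count to "mega" units with the given divisor
 * (1024 for binary scaling, 1000 for decimal scaling).
 * Hypothetical helper, for illustration only.
 */
double
scale_mega(uint64_t count, unsigned int divisor)
{
	assert(divisor == 1000 || divisor == 1024);
	return ((double)count / ((double)divisor * (double)divisor));
}
```

With the 2,961,600 pps figure from the Ixia side, scale_mega(2961600, 1024)
is about 2.82, which fits the 2.8M shown by netstat, while
scale_mega(2961600, 1000) gives 2.9616.
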
...
> IPFW contention:
> Same setup as shown above, same traffic level
>
> 17:48 [0] test15# ipfw show
> 00100 0 0 allow ip from any to any
> 65535 0 0 deny ip from any to any
>
> net.inet.ip.fw.enable: 0 -> 1
>             input         (ix0)            output
>    packets   errs idrops    bytes    packets   errs    bytes  colls
>       2.1M   734k      0     187M       2.1M      0     139M      0
>       2.1M   736k      0     187M       2.1M      0     139M      0
>       2.1M   737k      0     187M       2.1M      0      89M      0
>       2.1M   735k      0     187M       2.1M      0     189M      0
> net.inet.ip.fw.update_counters: 1 -> 0
>       2.3M   636k      0     187M       2.3M      0     148M      0
>       2.5M   343k      0     187M       2.5M      0     164M      0
>       2.5M   351k      0     187M       2.5M      0     164M      0
>       2.5M   345k      0     187M       2.5M      0     164M      0
...
> It seems that the ipfw counters suffer from this problem, too.
> Unfortunately, there is no DPCPU allocator in our kernel.
> I'm planning to make a very simple per-cpu counters patch:
>
> allocate 65k*(u64_bytes+u64_packets) of memory for each CPU at each vnet
> instance init and make ipfw use it as the counter backend.
>
> There is a problem with several rules residing in a single entry. This can
> (probably) be worked around by using fast counters for the first such
> rule (or by not using fast counters for such rules at all).
>
> What do you think about this?
The approach discussed a few years ago (at least the one I took away from
the discussion) was that the counter field in a rule should hold the
index of a per-cpu counter associated with the rule. So CTR_INC(rule->ctr)
becomes something like pcpu->ipfw_ctrs[rule->ctr]++ .
When you create a new rule you also grab one free index from ipfw_ctrs[],
and the same should go for dummynet counters.

The alternative would be to allocate a set of per-core counters
within the rule itself, but that costs 64 bytes (a cache line) per core
per rule to avoid cache contention.
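
To make the index-based scheme concrete, here is a minimal userland sketch.
It is not ipfw code: NCPU, NCTRS, the struct layouts and the ctr_* helper
names are assumptions made up for illustration, and a plain array stands in
for real per-cpu storage.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU     4            /* assumed CPU count, for the sketch only */
#define NCTRS    1024         /* assumed size of the counter pool */
#define CTR_NONE UINT32_MAX   /* returned when the pool is exhausted */

struct ctr_pair {
	uint64_t packets;
	uint64_t bytes;
};

/* One counter array per CPU; a kernel would keep this in per-cpu storage. */
struct pcpu_ctrs {
	struct ctr_pair ipfw_ctrs[NCTRS];
};

static struct pcpu_ctrs pcpu[NCPU];
static uint8_t ctr_used[NCTRS];

/* The rule holds only an index, not the counters themselves. */
struct rule {
	uint32_t ctr;
};

/* Grab one free index at rule creation and zero its slot on every CPU. */
uint32_t
ctr_alloc(void)
{
	for (uint32_t i = 0; i < NCTRS; i++) {
		if (ctr_used[i])
			continue;
		ctr_used[i] = 1;
		for (int cpu = 0; cpu < NCPU; cpu++) {
			pcpu[cpu].ipfw_ctrs[i].packets = 0;
			pcpu[cpu].ipfw_ctrs[i].bytes = 0;
		}
		return (i);
	}
	return (CTR_NONE);
}

/* Return the index to the pool when the rule is deleted. */
void
ctr_free(uint32_t idx)
{
	assert(idx < NCTRS && ctr_used[idx]);
	ctr_used[idx] = 0;
}

/* Hot path: the running CPU updates only its own array. */
void
ctr_inc(int cpu, const struct rule *r, uint64_t len)
{
	assert(cpu >= 0 && cpu < NCPU && r->ctr < NCTRS);
	pcpu[cpu].ipfw_ctrs[r->ctr].packets++;
	pcpu[cpu].ipfw_ctrs[r->ctr].bytes += len;
}

/* Slow path (something like "ipfw show"): sum the slot over all CPUs. */
struct ctr_pair
ctr_read(const struct rule *r)
{
	struct ctr_pair sum = { 0, 0 };

	assert(r->ctr < NCTRS);
	for (int cpu = 0; cpu < NCPU; cpu++) {
		sum.packets += pcpu[cpu].ipfw_ctrs[r->ctr].packets;
		sum.bytes += pcpu[cpu].ipfw_ctrs[r->ctr].bytes;
	}
	return (sum);
}
```

In this sketch a rule costs one 16-byte slot per CPU, and the slots that
different CPUs update for the same rule sit in different arrays, so they do
not share a cache line. The read side is unsynchronized here, so a real
implementation would have to decide how exact the totals need to be.
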
cheers
luigi