HZ=100: not necessarily better?
Scott Long
scottl at samsco.org
Sun Jun 18 00:17:37 UTC 2006
Danial Thom wrote:
>
> --- Robert Watson <rwatson at FreeBSD.org> wrote:
>
>
>>Scott asked me if I could take a look at the impact of changing HZ for
>>some simple TCP performance tests.  I ran the first couple, and got some
>>results that were surprising, so I thought I'd post about them and ask
>>people who are interested if they could do some investigation also.  The
>>short of it is that we had speculated that the increased CPU overhead of
>>a higher HZ would be significant when it came to performance
>>measurement, but in fact, I measure improved performance under high HTTP
>>load with a higher HZ.  This was, of course, the reason we first looked
>>at increasing HZ: improving timer granularity helps improve the
>>performance of network protocols, such as TCP.  Recent popular opinion
>>has swung in the opposite direction, that higher HZ overhead outweighs
>>this benefit, and I think we should be cautious and do a lot more
>>investigating before assuming that is true.
>>
>>Simple performance results below.  Two boxes on a gig-e network with
>>if_em ethernet cards, one running a simple web server hosting 100 byte
>>pages, and the other downloading them in parallel (netrate/http and
>>netrate/httpd).  The performance difference is marginal, but at least in
>>the SMP case, likely more than a measurement error or cache alignment
>>fluke.  Results are transactions/second sustained over a 30 second test
>>-- bigger is better; box is a dual xeon p4 with HTT; 'vendor.*' are the
>>default 7-CURRENT HZ setting (1000) and 'hz.*' are the HZ=100 versions
>>of the same kernels.  Regardless, there wasn't an obvious performance
>>improvement by reducing HZ from 1000 to 100.  Results may vary, use
>>only as directed.
>>
>>What we might want to explore is using a programmable timer to set up
>>high precision timeouts, such as TCP timers, while keeping base
>>statistics profiling and context switching at 100hz.  I think phk has
>>previously proposed doing this with the HPET timer.
>>
>>I'll run some more diverse tests today, such as raw bandwidth tests,
>>pps on UDP, and so on, and see where things sit.  The reduced overhead
>>should be measurable in cases where the test is CPU-bound and there's
>>no clear benefit to more accurate timing, such as with TCP, but it
>>would be good to confirm that.
>>
>>Robert N M Watson
>>Computer Laboratory
>>University of Cambridge
>>
>>
>>peppercorn:~/tmp/netperf/hz> ministat *SMP
>>x hz.SMP
>>+ vendor.SMP
>>
>
>>[ministat distribution plot garbled by line wrapping in the archive:
>> the ten hz.SMP (x) samples cluster clearly below the ten
>> vendor.SMP (+) samples, with no overlap between the two groups]
>
>>     N      Min      Max   Median      Avg     Stddev
>> x  10    13715    13793    13750  13751.1  29.319883
>> +  10    13813    13970    13921  13906.5  47.551726
>> Difference at 95.0% confidence
>>         155.4 +/- 37.1159
>>         1.13009% +/- 0.269913%
>>         (Student's t, pooled s = 39.502)
>>
>>peppercorn:~/tmp/netperf/hz> ministat *UP
>>x hz.UP
>>+ vendor.UP
>>
>
>>[ministat distribution plot garbled by line wrapping in the archive:
>> the hz.UP (x) and vendor.UP (+) samples overlap substantially,
>> consistent with the much smaller difference reported below]
>
>>     N      Min      Max   Median      Avg     Stddev
>> x  10    14067    14178    14116  14121.2  31.279386
>> +  10    14141    14257    14170  14175.9  33.248058
>> Difference at 95.0% confidence
>>         54.7 +/- 30.329
>>         0.387361% +/- 0.214776%
>>         (Student's t, pooled s = 32.2787)
>>
>>_______________________________________________
>>freebsd-performance at freebsd.org mailing list
>>
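Robert's point about timer granularity can be made concrete: a tick-driven callout can only fire on a tick boundary, so a requested timeout rounds up to whole ticks. The sketch below is purely illustrative (it is not FreeBSD's callout code), but it shows why short TCP timers behave better at HZ=1000 than at HZ=100:

```python
import math

def timeout_ticks(timeout_ms: float, hz: int) -> int:
    """Round a requested timeout up to whole ticks, as a tick-driven
    callout wheel must; every timeout takes at least one tick."""
    tick_ms = 1000.0 / hz
    return max(1, math.ceil(timeout_ms / tick_ms))

def actual_delay_ms(timeout_ms: float, hz: int) -> float:
    """Best-case delay actually delivered for a requested timeout."""
    return timeout_ticks(timeout_ms, hz) * 1000.0 / hz

# A 25 ms timer is delivered exactly at HZ=1000, but stretches to the
# next 10 ms tick boundary (30 ms) at HZ=100.
print(actual_delay_ms(25, 1000))  # 25.0
print(actual_delay_ms(25, 100))   # 30.0
```

A dedicated programmable timer such as the HPET sidesteps this quantization entirely, which is the attraction of the hybrid approach Robert describes.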
>
>
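For readers unfamiliar with ministat's output: the interval it reports is a pooled two-sample Student's t confidence interval on the difference of means. A rough sketch reproducing the SMP numbers above (the t critical value for 18 degrees of freedom is hard-coded here; a stats library such as scipy would normally supply it):

```python
import math

def pooled_t_interval(n1, mean1, sd1, n2, mean2, sd2, t_crit):
    """Two-sample pooled Student's t confidence interval for the
    difference of means, in the form ministat reports it."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    diff = mean2 - mean1
    margin = t_crit * pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    return diff, margin, pooled_sd

# SMP numbers from the quoted run; 2.1009 approximates t(0.975, df=18).
diff, margin, sp = pooled_t_interval(10, 13751.1, 29.319883,
                                     10, 13906.5, 47.551726, 2.1009)
# diff ~ 155.4, margin ~ 37.1, sp ~ 39.5 -- matching ministat's report
```

Since the margin (37.1) is smaller than the difference (155.4), the SMP result is statistically significant at 95% confidence, which is exactly what the non-overlapping distributions suggest.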
> And what was the cost in CPU load to get the extra couple of bytes of
> throughput?
>
> Machines have to do other things too. That is the entire point of SMP
> processing. Of course increasing the granularity of your clocks will
> cause you to process clock-reliant events more quickly, so you might
> see more "throughput", but there is a cost. Weighing (and measuring)
> those costs is more important than what a single benchmark does.
>
> At some point you're going to have to figure out that there's a reason
> that every time anyone other than you tests FreeBSD it completely pigs
> out. Squeezing out some extra bytes in netperf isn't "performance".
> Performance is everything that a system can do. If you're eating 10%
> more CPU to get a few more bytes in netperf, you haven't increased the
> performance of the system.
>
> You need to do things like run 2 benchmarks at once. What happens to
> the "performance" of one benchmark when you increase the "performance"
> of the other? Run a database benchmark while you're running a network
> benchmark, or while you're passing a controlled stream of traffic
> through the box.
>
> I just finished a couple of simple tests and find that 6.1 has not
> improved at all since 5.3 in basic interrupt processing and context
> switching performance (which is the basic building block for all
> system performance). Bridging 140K pps (a full 100Mb/s load) uses 33%
> of the cpu(s) in FreeBSD 6.1, and 17% in DragonFly 1.5.3, on a
> dual-core 1.8GHz Opteron system. (I finally got vmstat to work
> properly after getting rid of your stupid 2 second timeout in the MAC
> learning table.) I'll be doing some MySQL benchmarks next week while
> passing a controlled stream through the system. But since I know that
> the controlled stream eats up twice as much CPU on FreeBSD, I already
> know much of the answer, since FreeBSD will have much less CPU left
> over to work with.
>
> It's unfortunate that you seem to be tuning for one thing while
> completely unaware of all of the other things you're breaking in the
> process. The Linux camp understands that in order to scale well they
> have to sacrifice some network performance. Sadly they've gone too far
> and now the OS is no longer suitable as a high-end network appliance.
> I'm not sure what Matt understands because he never answers any
> questions, but his results are so far quite impressive. One thing for
> certain is that it's not all about how many packets you can hammer out
> your socket interface (nor has it ever been). It's about improving the
> efficiency of the system on an overall basis. That's what SMP
> processing is all about, and you're never going to get where you want
> to be using netperf as your guide.
>
> I'd also love to see the results of the exact same test with only 1
> CPU enabled, to see how well you scale generally. I'm astounded that
> no one ever seems to post 1 vs 2 CPU performance, which is the entire
> point of SMP.
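The 1- vs 2-CPU comparison asked for here is conventionally summarised as speedup and parallel efficiency. A minimal sketch of the metric (the throughput figures in the example are made up purely for illustration):

```python
def speedup(throughput_1cpu: float, throughput_ncpu: float) -> float:
    """Throughput-based speedup: values above 1.0 mean SMP helped."""
    return throughput_ncpu / throughput_1cpu

def efficiency(throughput_1cpu: float, throughput_ncpu: float, ncpus: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup(throughput_1cpu, throughput_ncpu) / ncpus

# Hypothetical example: 14000 trans/s on 1 CPU vs 22400 trans/s on 2 CPUs
# would be a 1.6x speedup, i.e. 80% scaling efficiency.
```

Publishing these two numbers alongside raw benchmark results would answer the scaling question directly.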
>
>
> DT
>
>
You have some valid points, but they get lost in your overly abrasive
tone. Several of us have watched your behaviour on the DFly lists, and
I dearly hope that it doesn't overflow onto our lists. It would be a
shame to lose your insight and input.

Scott