HZ=100: not necessarily better?

Sun Jun 18 15:31:11 UTC 2006

--- Robert Watson <rwatson at FreeBSD.org> wrote:

> 
> On Sat, 17 Jun 2006, Danial Thom wrote:
> 
> > At some point you're going to have to figure out that there's a reason that
> > every time anyone other than you tests FreeBSD it completely pigs out.
> > Squeezing out some extra bytes in netperf isn't "performance". Performance is
> > everything that a system can do. If you're eating 10% more cpu to get a few
> > more bytes in netperf, you haven't increased the performance of the system.
> 
> This test wasn't netperf, it was a 32-process web server and a 32-process
> client, doing sendfile on UFS-backed data files.  It was definitely a potted
> benchmark, in that it omits some of the behaviors of web servers (dynamic
> content, significantly variable data set, etc), but is intended to be more
> than a simple micro-benchmark involving two sockets and packet blasting.
> Specifically, it was intended to validate whether or not there were
> immediately observable changes in TCP behavior based on adjusting HZ under
> load.  The answer was a qualified yes: there was a small but noticeable
> negative effect on high load web serving in the test environment by reducing
> HZ, likely due to reduced timer accuracy.  Specifically: simply frobbing HZ
> isn't a strategy that necessarily results in a performance improvement.
> 
> > You need to do things like run 2 benchmarks at once. What happens to the
> > "performance" of one benchmark when you increase the "performance" of the
> > other? Run a database benchmark while you're running a network benchmark, or
> > while you're passing a controlled stream of traffic through the box.
> 
> The point of this exercise was to demonstrate the complexity of the issue of
> adjusting HZ, and to suggest that simply changing the value in the further
> absence of evidence could have negative effects, and that we might want to
> investigate a more mature middle ground, such as a modified timer
> architecture.  I'm sorry if that conclusion wasn't clear from my e-mail.
> 
> > I'd also love to see the results of the exact same test with only 1 cpu
> > enabled, to see how well you scale generally. I'm astounded that no-one ever
> > seems to post 1 vs 2 cpu performance, which is the entire point of SMP.
> 
> Single CPU results were included in my e-mail.  There are actually a couple of
> other variations of interest you want to measure in more general benchmarking
> exercises:
> 
> - Kernel compiled without any SMP support.  Specifically, without lock
>    prefixes on atomic instructions.
>
> - Kernel compiled with SMP support, but with use of additional CPUs disabled.
>
> - Kernel compiled with SMP support, and with varying numbers of CPUs enabled.
> 
> The first two cases are important, because they help identify the difference
> between the general overhead of compiling in locked instructions (and related
> issues), and the overheads associated with contention, caches, inter-CPU IPI
> traffic, scheduling, etc.  By failing to compare the top two cases, it might
> be easy to conclude that a performance improvement is due to the additional
> cost of atomic instructions, whereas in reality it may be the result of a
> poor scheduling decision, or of data unnecessarily cache missing in both CPUs
> rather than one because processing of the data is split poorly over available
> CPUs.

Of course there is a UP test, and now I see that
UP wins again. It would be interesting to see
some sort of test run at lower contention levels.
I'd think that UP would gain an advantage as
resources become scarce, as more switching and
locking would be required while waiting. As
contention for sockets or kernel-level resources
grows, SMP would be less and less efficient with
the added overhead.

It seems to me that the decision of what the
default value of HZ should be for a general
purpose OS should take into account what the
majority of users are doing. *Most* people aren't
running fully loaded web servers. The argument
shouldn't be "can you get better performance at
the high end with a different setting", it should
be "what's the most efficient setting for general
use". That's what "GENERIC" is all about.
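
To put rough numbers on the tradeoff, here is a minimal sketch of
what HZ means for timer granularity. It assumes a simplified model
in which a timeout can only fire on a tick boundary; the 25 ms
timeout and the function names are made up for illustration and
are not taken from the test above.

```python
import math

def tick_ms(hz):
    """Length of one clock tick in milliseconds for a given HZ."""
    return 1000.0 / hz

def rounded_timeout_ms(requested_ms, hz):
    """Round a requested timeout up to a whole number of ticks.

    Simplified model (an assumption): timeouts fire only on tick
    boundaries, so the effective delay is a multiple of the tick.
    """
    ticks = math.ceil(requested_ms * hz / 1000.0)
    return ticks * 1000.0 / hz

if __name__ == "__main__":
    # Hypothetical 25 ms timeout; HZ is also the number of clock
    # interrupts taken per second, loaded or idle.
    for hz in (100, 1000):
        print(hz, tick_ms(hz), rounded_timeout_ms(25, hz))
```

Under that model the cost of the finer granularity is ten times as
many clock interrupts per second, whether or not the box has any
work that benefits from it.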

I tried to impress upon Matt (without any response
at all of course) that raising ITR to 10000 for
the em driver doesn't make sense, because
virtually no-one in their camp is pushing enough
traffic to make that setting worthwhile. The
possible fact that the performance is better when
pushing 70K pps should make it a tuning note and
not a default setting. If you're not on a gigabit
network a setting of 10K makes no sense at all.
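
The arithmetic behind that can be sketched as follows, assuming the
ITR setting acts purely as a cap on interrupts per second (a
simplification of what the hardware does). The 5,000 pps figure is
a hypothetical light load, and the function name is made up for
illustration.

```python
def packets_per_interrupt(pps, max_interrupts_per_sec):
    """Average packets handled per interrupt under a rate cap.

    Assumption: below the cap every packet can raise its own
    interrupt, so the cap only batches packets once pps exceeds it.
    """
    if pps <= max_interrupts_per_sec:
        return 1.0
    return pps / max_interrupts_per_sec

if __name__ == "__main__":
    # 70,000 pps is the rate mentioned above; 5,000 pps is hypothetical.
    for pps in (5_000, 70_000):
        print(pps, packets_per_interrupt(pps, 10_000))
```

In this model a cap of 10000 does no batching at all until the box
is pushing more than 10,000 packets per second.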

DT
