Re: Checksum/copy

From: Jeff Roberson <jroberson_at_chesapeake.net>
Date: Sat, 29 Mar 2003 15:36:52 -0500 (EST)
On Sat, 29 Mar 2003, Bruce Evans wrote:

> On Fri, 28 Mar 2003, Dag-Erling Smørgrav wrote:
>
> > Bruce Evans <bde_at_zeta.org.au> writes:
> > > Instead of fixing the comparison and any other logic bugs, I rewrote the
> > > function using orl instead of scasl, and simpler logic (ignore the changes
> > > for the previous function in the same hunk).
> >
> > Could you please commit this?  Nothing uses it, so it won't break
> > anything, but it'll make testing and benchmarking easier for
> > interested parties.
>
> Er, it is used (in pmap.c).
>
> I recently learned that makeworld is an interesting benchmark for zeroing
> pages (all %times on a system with 1 Athlon and 1GB of memory):
> - makeworld spends at least 3-5% of its time zeroing pages
> - turning off vm.idlezero_enable increases makeworld time by 1-2%
>   and moves the place where the zeroing is done significantly.  With
>   vm.idlezero_enable off, most of the idle zeroing is done in process
>   context and gets charged to makeworld; otherwise it is done in the
>   pagezero task and gets charged to that.  Most but not all of the time
>   spent in the pagezero task is "free", and we lose the 1-2% by
>   doing all zeroing in process context.
> - SCHED_ULE breaks scheduling of idleprio processes.  This results in
>   pagezero being too active.  It costs 1-2% instead of saving 1-2%.
>

Thanks for the analysis.  I know my +nice values are screwy right now.
It's actually a pretty interesting problem.  Perhaps you'll have some
insight.

The basic issue is that threads on run queues in ule must be given a
slice.  And with a slice, ignoring interactive threads, they are run once
in every n thread selections, where n is the number of running threads.
That means two things:

1)  +nice threads always get a chance to run.
2)  Their %cpu is relative to the sum of all slices of all running
threads.
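
A minimal sketch of points 1 and 2, assuming a plain round-robin model in
which every runnable thread runs for its full slice once per round.  The
function name, thread names, and slice values are made up for
illustration; this is not the ule code.

```python
from collections import Counter


def simulate_round_robin(slices, rounds):
    """Return each thread's share of CPU under a simple round-robin model.

    slices maps a (hypothetical) thread name to its slice length in ticks.
    Every runnable thread is selected once per round and uses its whole slice.
    """
    ticks = Counter()
    for _ in range(rounds):
        for name, slice_ticks in slices.items():
            ticks[name] += slice_ticks
    total = sum(ticks.values())
    return {name: ticks[name] / total for name in slices}


# Assumed slice values: a nice 0 thread with 10 ticks, a +nice thread with 1.
shares = simulate_round_robin({"nice0": 10, "nice19": 1}, rounds=1000)
```

In this model the +nice thread's share is its slice divided by the sum of
all slices, so it only drops to zero if its slice does.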

#2 is sort of what you want, except that the slice value never reaches
zero.  In sched_4bsd if you have a nice priority that is 20 away from the
lowest priority processes you never get a chance to run.  I'm not sure if
this scales all the way across.  I know a nice 0 will always prevent a
nice 20 from running, but will a nice -20 prevent a nice 0 from running?
I believe so.  With nice +19 and a nice 0 the nice +19 gets approx 2% cpu.
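
For a sense of scale, here is the arithmetic for reproducing that approx 2%
figure under the proportional-slice model from point 2.  The helper name is
hypothetical, and it assumes the share is exactly slice / (sum of slices).

```python
def slice_for_share(other_slices_total, share):
    """Slice s such that s / (s + other_slices_total) == share."""
    if not 0 <= share < 1:
        raise ValueError("share must be in [0, 1)")
    return other_slices_total * share / (1 - share)


# Against a single thread with an assumed slice of 100 ticks, a 2% share
# needs a slice of roughly 2 ticks, i.e. a ratio of about 1:49.
s = slice_for_share(100, 0.02)
```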

So, in ule, I need a way to approximate this.  The real problem is that
the drop off point where a process gets 0 cpu time is artificial.  The
algorithm doesn't work linearly down to 0 as it does in sched_4bsd.  I
need to make slice assignments relative to all other processes in the
system.  This seems like it may break the O(1) properties of the
scheduler.
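
One possible way to make the slice relative to the rest of the system
without walking every thread is sketched below.  It is only an
illustration under stated assumptions: the class and function names, the
slice size, and the 20-step window are all invented here, and nice values
are assumed to range from -20 to 20.  It keeps a count of runnable threads
per nice value, so finding the lowest nice in use scans a fixed 41 entries
no matter how many threads exist.

```python
NICE_MIN, NICE_MAX = -20, 20   # assumed nice range
NICE_WINDOW = 20               # assumed distance at which the slice hits zero
SLICE_MAX = 10                 # assumed maximum slice, in ticks


class NiceBins:
    """Count of runnable threads at each nice value (hypothetical helper)."""

    def __init__(self):
        self.counts = [0] * (NICE_MAX - NICE_MIN + 1)

    def add(self, nice):
        self.counts[nice - NICE_MIN] += 1

    def remove(self, nice):
        self.counts[nice - NICE_MIN] -= 1

    def lowest(self):
        """Lowest nice value with a runnable thread, or None if empty."""
        for i, count in enumerate(self.counts):
            if count:
                return i + NICE_MIN
        return None


def assign_slice(bins, nice):
    """Scale the slice down linearly with distance from the lowest nice."""
    base = bins.lowest()
    if base is None:
        base = nice
    distance = max(0, nice - base)
    if distance >= NICE_WINDOW:
        return 0               # too far away: no CPU time at all
    return max(1, SLICE_MAX * (NICE_WINDOW - distance) // NICE_WINDOW)


bins = NiceBins()
bins.add(0)
bins.add(19)
bins.add(20)
slices = {n: assign_slice(bins, n) for n in (0, 19, 20)}
```

What to do with a thread whose slice comes out as zero is left open here,
since threads on the run queues must be given a slice.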

I'm just now thinking that I could assign the slice using the run queues
to find out how this thread relates to others in the system.  This all
sounds rather complicated.  I'm hoping that I'm missing some simple
elegant solution that someone may know of.

Any takers?  Comments on nice or slice selection?

Cheers,
Jeff
Received on Sat Mar 29 2003 - 12:37:16 UTC