Apparent strange disk behaviour in 6.0

Sat Jul 30 22:28:03 GMT 2005

In message <20050730221215.GA757 at uk.tiscali.com>, Brian Candler writes:
>On Sat, Jul 30, 2005 at 08:37:17PM +0200, Poul-Henning Kamp wrote:
>> In message <20050730171536.GA740 at uk.tiscali.com>, Brian Candler writes:
>> >On Sat, Jul 30, 2005 at 03:29:27AM -0700, Julian Elischer wrote:
>> >> 
>> >> The snapshot below is typical when doing tar from one drive to another..
>> >> (tar c -C /disk1 f- .|tar x -C /disk2 -f - )
>> >> 
>> >> dT: 1.052  flag_I 1000000us  sizeof 240  i -1
>> >>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>> >>     0    405    405   1057    0.2      0      0    0.0      0      0    0.0  9.8| ad0
>> >>     0    405    405   1057    0.3      0      0    0.0      0      0    0.0 11.0| ad0s2
>> >>     0    866      3     46    0.4    863   8459    0.7      0      0    0.0 63.8| da0
>> >>    25    866      3     46    0.5    863   8459    0.8      0      0    0.0 66.1| da0s1
>> >>     0    405    405   1057    0.3      0      0    0.0      0      0    0.0 12.1| ad0s2f
>> >>   195    866      3     46    0.5    863   8459    0.8      0      0    0.0 68.1| da0s1d
>...
>> >But if really is only 12.1% busy (which the 0.3 ms/r implies),
>> 
>> "busy %" numbers is *NOT* a valid measure of disk throughput, please do
>> not pay attention to such numbers!
>
>It seems to me that
>     reads/sec * milliseconds/read  =  milliseconds spent reading per second

This is only true if there can be only one outstanding request at
a time.  That is not a valid assumption.  Modern (SCSI) disks have
tagged queueing for instance.

You could have:

	Time	Event
	0.0	Issue request #0
	0.1	Issue request #1
	1.0	Request #0 comes back (service time = 1 second)
	1.1	Request #1 comes back (service time = 1 second)

	The average service time is 1 second per request.

	The throughput is 2 requests per second.

Using your formula, the drive has spent 2000 milliseconds per second
handling requests.

This is the good old bandwidth vs. delay discussion again...

>and that the "busy %" is <milliseconds per second spent with one or more

Maybe it would be more clear if you realize that, the definition is:

	%busy = 100 - %idle

>I'm not saying geom is counting wrongly;

I am 100% sure it counts correctly (well, actually, the %busy is not taking
outstanding requests properly into account, it would cost a lot of work and
math to do so.)

>I am just agreeing with the OP that
>the underlying reason for this poor utilisation is worth investigating.

If the two processes are not deliberately sleeping (which I doubt),
it has to be a scheduler bug.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.