(in)appropriate uses for MAXBSIZE

Rick Macklem rmacklem at uoguelph.ca
Sat Apr 17 02:11:38 UTC 2010



On Fri, 16 Apr 2010, Bruce Evans wrote:

>
> Do you have benchmarks?  A kernel build (without -j) is a good test.
> Due to include bloat and include nesting bloat, a kernel build opens
> and closes the same small include files hundreds or thousands of times
> each, with O(10^5) includes altogether, so an RPC to read attributes
> on each open costs a lot of latency.  nfs on a LAN does well to take
> only 10% longer than a local file system, and after disabling
> close/open consistency takes only about half as much longer, by reducing
> the number of RPCs by about a factor of 2.  The difference should be
> even more noticeable on a WAN.  Building with -j reduces the extra
> length by not stalling the whole build waiting for each RPC.  I probably
> needed it to take only 10% longer.
>
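(Aside: on FreeBSD, close/open consistency can be turned off per mount
with the "nocto" option to mount_nfs, so something like

 	mount -t nfs -o nocto server:/usr/src /usr/src

should show the RPC reduction Bruce describes; the server path here is
just an example.)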
Well, I certainly wouldn't call these benchmarks, but here are the numbers
I currently see. (The two machines involved are VERY slow by today's
hardware standards. One is an 800MHz PIII and the other is a 4-5 year
old cheap laptop with something like a 1.5GHz Celeron CPU.)

The results for something like the Connectathon test suite's read/write
test can be highly variable, depending upon the hardware setup, etc. (I 
suspect that is at least partially based on when the writes get flushed 
during the test run. One thing that I'd like to do someday is hold a
read/shared lock on the buffer cache block while a write-back to the
server is happening. It is currently write/exclusive locked, but it
doesn't need to be once the data has been copied into the buffer.)
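
To make that concrete, here is a minimal user-space sketch of the
locking change I have in mind, using pthread rwlocks. This is not the
actual buffer-cache code; the buffer fields and the RPC function are
made up for illustration, and a real implementation would want an
atomic downgrade rather than the release/reacquire below:

#include <pthread.h>
#include <string.h>

struct nfsbuf {
	pthread_rwlock_t	b_lock;		/* protects b_data/b_len */
	char			b_data[8192];
	size_t			b_len;
};

/* Stand-in for the write RPC to the server (hypothetical). */
void
nfs_writerpc(const char *data, size_t len)
{
	(void)data; (void)len;	/* the slow network I/O would go here */
}

void
buf_dirty_and_flush(struct nfsbuf *bp, const char *src, size_t len)
{
	/* Exclusive only while the buffer contents change. */
	pthread_rwlock_wrlock(&bp->b_lock);
	memcpy(bp->b_data, src, len);
	bp->b_len = len;
	pthread_rwlock_unlock(&bp->b_lock);

	/*
	 * Shared while the now-stable data goes out on the wire, so
	 * readers aren't blocked for the duration of the RPC.  pthread
	 * rwlocks can't downgrade atomically, so another writer could
	 * slip in between the two calls; the in-kernel version would
	 * use a lock with a true downgrade operation.
	 */
	pthread_rwlock_rdlock(&bp->b_lock);
	nfs_writerpc(bp->b_data, bp->b_len);
	pthread_rwlock_unlock(&bp->b_lock);
}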

For the laptop as client:
without delegations:
./test5: read and write
 	wrote 1048576 byte file 10 times in 7.27 seconds (1442019 bytes/sec)
 	read 1048576 byte file 10 times in 0.04 seconds (238101682 bytes/sec)
 	./test5 ok.
with delegations:
./test5: read and write
 	wrote 1048576 byte file 10 times in 1.64 seconds (6358890 bytes/sec)
 	read 1048576 byte file 10 times in 0.70 seconds (14802158 bytes/sec)
 	./test5 ok.
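
(For scale: each run moves 10 x 1048576 = 10485760 bytes, so the first
line above works out to 10485760 / 7.27 ≈ 1.44Mbytes/sec, which is how
the reported rates are derived.)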

but for the PIII as the client (why does the no-delegations case run so
much better here than it did on the laptop?):
without delegations:
./test5: read and write
 	wrote 1048576 byte file 10 times in 1.75 seconds (5961944 bytes/sec)
 	read 1048576 byte file 10 times in 0.08 seconds (131844940 bytes/sec)
 	./test5 ok.
with delegations:
./test5: read and write
 	wrote 1048576 byte file 10 times in 1.39 seconds (7526450 bytes/sec)
 	read 1048576 byte file 10 times in 0.67 seconds (15540698 bytes/sec)
 	./test5 ok.

Now, a kernel build with the PIII as client (times in seconds):
without delegations:
Real	User	System
6859	4635	1158
with delegations:
Real	User	System
6491	4634	1105
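
(Delegations save 6859 - 6491 = 368 seconds of real time, i.e. about
368 / 6859 ≈ 5%.)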

As you can see, there isn't that much improvement when delegations are
enabled. Part of the problem here is that, for an 800MHz PIII, the
build is CPU bound ("vmstat 5" shows 0->10% idle during the build), so
the speed of the I/O over NFS won't have a lot of effect on it. This
would be more interesting if the client had a much faster CPU.

Not benchmarks, but might give you some idea. (The 2 machines are
running off the same small $50 home router.)

Someday, I'd like to implement aggressive client side caching to a
disk in the client and do a performance evaluation (including
introducing network latency) to see how it all does. I'm getting
close to where I can do that. Maybe this summer.
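
(For the latency part, I'd probably just use dummynet on the client,
something like:

 	kldload dummynet
 	ipfw pipe 1 config delay 50ms
 	ipfw add pipe 1 ip from any to any via em0

where the 50ms delay and the em0 interface name are just examples.)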

Have fun with it, rick


