[PATCH] Convert the VFS cache lock to an rmlock
Ryan Stone
rysto32 at gmail.com
Fri Mar 13 15:23:08 UTC 2015
On Thu, Mar 12, 2015 at 1:36 PM, Mateusz Guzik <mjguzik at gmail.com> wrote:
> Workloads like buildworld and the like (i.e. a lot of forks + execs) run
> into very severe contention in vm, which is orders of magnitude bigger
> than anything else.
>
> As such your result seems quite suspicious.
>
You're right, I did mess up the testing somewhere (I have no idea how). As
you suggested, I switched to using a separate partition for the objdir, and
ran each build with a freshly newfsed filesystem. I scripted it to be sure
that I was following the same procedure with each run:
# Build known-working commit from head
git checkout 09be0092bd3285dd33e99bcab593981060e99058 || exit 1

for i in `jot 5`
do
	# Create a fresh fs for objdir
	sudo umount -f /usr/obj 2> /dev/null
	sudo newfs -U -j -L OBJ $objdev || exit 1
	sudo mount $objdev /usr/obj || exit 1
	sudo chmod a+rwx /usr/obj || exit 1

	# Ensure disk cache contains all source files
	git status > /dev/null

	/usr/bin/time -a -o $logfile make -s -j$(sysctl -n hw.ncpu) buildworld buildkernel
done
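
For reference, each run appends one line to $logfile. Assuming time(1)'s default output format (elapsed, user and system seconds, each followed by its label), column 1 is wall-clock time and column 5 is system time, which is why ministat is pointed at column 1 for the wall-clock comparison. A rough sketch of pulling a column out by hand; col_avg is a made-up helper and the two sample lines are invented purely to show the shape of the log:

```shell
#!/bin/sh
# col_avg COLUMN < logfile
# Mean of one whitespace-separated column (hypothetical helper).
col_avg() {
	awk -v c="$1" '{ sum += $c; n++ } END { if (n) printf "%.2f\n", sum / n }'
}

# Two invented lines in the assumed time(1) layout; not real measurements
sample='  100.00 real  900.00 user  50.00 sys
  102.00 real  904.00 user  54.00 sys'

printf '%s\n' "$sample" | col_avg 1	# wall-clock ("real") column
printf '%s\n' "$sample" | col_avg 5	# system-time ("sys") column
```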
I tested on the original 12-core machine, as well as a 2 package x 8 core x
2 HTT (32 logical cores) machine that a co-worker was able to lend me.
Unfortunately, the results now show a performance decrease: about 0.5% on
the 12-core machine and almost 5% on the 32-core machine:
$ ministat -w 74 -C 1 12core/*
x 12core/orig.log
+ 12core/rmlock.log
+--------------------------------------------------------------------------+
|x xx x x + + + + +|
| |_________A__________| |_______________A___M__________||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5       2478.81       2487.74       2483.45      2483.652     3.2495646
+   5       2489.64       2501.67       2498.26      2496.832     4.7394694
Difference at 95.0% confidence
	13.18 +/- 5.92622
	0.53067% +/- 0.238609%
	(Student's t, pooled s = 4.06339)
$ ministat -w 74 -C 1 32core/*
x 32core/orig.log
+ 32core/rmlock.log
+--------------------------------------------------------------------------+
|x x + |
|x x x + ++ +|
||__AM| |_______AM_____| |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5       1067.97       1072.86       1071.29      1070.314     2.2238997
+   5       1111.22       1129.05        1122.3      1121.324     6.4046569
Difference at 95.0% confidence
	51.01 +/- 6.99181
	4.76589% +/- 0.653249%
	(Student's t, pooled s = 4.79403)
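
For anyone who wants to check the arithmetic, the interval ministat reports can be reproduced from the summary rows alone. This is only a sketch, assuming an equal-variance pooled standard deviation and the two-sided 95% Student's t critical value for 8 degrees of freedom (2.306); pooled_diff is a made-up helper, not something ministat provides:

```shell
#!/bin/sh
# pooled_diff N1 AVG1 SD1 N2 AVG2 SD2 TCRIT
# Prints: difference of means, CI half-width, difference as a % of AVG1.
# (Hypothetical helper for illustration only.)
pooled_diff() {
	awk -v n1="$1" -v a1="$2" -v s1="$3" \
	    -v n2="$4" -v a2="$5" -v s2="$6" -v t="$7" 'BEGIN {
		# Pooled standard deviation over both samples
		sp = sqrt(((n1 - 1) * s1 * s1 + (n2 - 1) * s2 * s2) / (n1 + n2 - 2))
		# Half-width of the confidence interval on the difference of means
		hw = t * sp * sqrt(1 / n1 + 1 / n2)
		d = a2 - a1
		printf "%.2f %.2f %.2f\n", d, hw, 100 * d / a1
	}'
}

# 32-core wall-clock rows from the table above; 2.306 is t(0.975, df = 8)
pooled_diff 5 1070.314 2.2238997 5 1121.324 6.4046569 2.306
```

With the 32-core rows this prints 51.01 6.99 4.77, which agrees with the ministat figures after rounding to two decimals.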
The difference is due to a significant increase in system time, shown
below for the 32-core machine. Write locks on an rmlock are extremely
expensive (they involve an smp_rendezvous), and the cost likely scales
with the number of cores:
x 32core/orig.log
+ 32core/rmlock.log
+--------------------------------------------------------------------------+
|xxx x + +++ +|
||_MA__| |____MA______| |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5       5616.63        5715.7        5641.5       5661.72     48.511545
+   5       6502.51       6781.84        6596.5       6612.39     103.06568
Difference at 95.0% confidence
	950.67 +/- 117.474
	16.7912% +/- 2.07489%
	(Student's t, pooled s = 80.5478)
At this point I'm pretty much at an impasse. The real-time behaviour is
critical to me, but a 5% performance degradation isn't likely to be
acceptable to many people. I'll see what I can do with this.