Very inconsistent (read) speed on UFS2

Lev Serebryakov lev at serebryakov.spb.ru
Wed Aug 31 11:37:30 UTC 2011


Hello, Jeremy.
You wrote on 31 August 2011, 14:12:11:

> This benchmark data is more or less unhelpful due to the fact that there
> are writes occurring during the middle of your reads.  There's another
  Yep :(

> spun-off portion of this thread that is discussing how you're
> benchmarking these things (specifically some code you wrote?).  I don't
> know what else to say in this regard.  It would really help if you could
> use something like bonnie++ and make sure the filesystem is not being
> used by ANYTHING during your benchmarks.
  OK, I'll try bonnie++. My code is really as simple as it could be:

fd = open(fileName, O_RDONLY | O_DIRECT);
gettimeofday(&start, NULL);
/* s_BufferSize is 128KiB */
while ((rd = read(fd, s_Buffer, s_BufferSize)) > 0)
   size += rd;
gettimeofday(&end, NULL);
close(fd);
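
A self-contained sketch of the same measurement, with error handling and
the throughput calculation added. The function names, the extra_flags
parameter and the 4096-byte buffer alignment are assumptions made for
illustration; they are not taken from the original benchmark program.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

/* Elapsed wall-clock seconds between two gettimeofday() samples. */
double elapsed_seconds(struct timeval start, struct timeval end)
{
    return (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_usec - start.tv_usec) / 1e6;
}

/* Throughput in KiB/s; 0 when no measurable time has elapsed. */
double kib_per_second(long long bytes, double seconds)
{
    return seconds > 0.0 ? (double)bytes / 1024.0 / seconds : 0.0;
}

/*
 * Read `path` sequentially in buf_size chunks and return KiB/s,
 * or -1.0 on any error. extra_flags (hypothetical parameter) is OR-ed
 * into O_RDONLY, so the caller decides whether to request O_DIRECT.
 */
double time_sequential_read(const char *path, size_t buf_size, int extra_flags)
{
    struct timeval start, end;
    void *buf = NULL;
    long long total = 0;
    ssize_t rd;
    int fd;

    assert(path != NULL && buf_size > 0);

    /* Direct I/O may need an aligned buffer; 4096 is an assumed alignment. */
    if (posix_memalign(&buf, 4096, buf_size) != 0)
        return -1.0;

    fd = open(path, O_RDONLY | extra_flags);
    if (fd < 0) {
        free(buf);
        return -1.0;
    }

    gettimeofday(&start, NULL);
    while ((rd = read(fd, buf, buf_size)) > 0)
        total += rd;
    gettimeofday(&end, NULL);

    close(fd);
    free(buf);

    if (rd < 0)
        return -1.0;
    return kib_per_second(total, elapsed_seconds(start, end));
}
```

Calling time_sequential_read(fileName, 128 * 1024, O_DIRECT) would
correspond to the loop above, on systems where O_DIRECT is available.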


> Anyway, the data is interesting because from an aggregate total
> perspective, you're hitting some arbitrary limit on all of your devices
> which almost indicates memory bus throttling or something along those
> lines; CPU time?  I really don't know.  Aggregate write speeds
> respectively:

> 43138.8 + 43138.8 + 43044.7 + 43232.9 + 43138.8 == 215694.0 KByte/sec
> 10515.9 + 10547.2 + 10703.7 + 10484.6 + 10265.5 ==  52516.9 KByte/sec
> 56583.1 + 56677.2 + 56489.0 + 56614.5 + 56739.9 == 283103.7 KByte/sec
> 41001.3 + 40969.9 + 40844.5 + 41001.3 + 40875.9 == 204692.9 KByte/sec
> 15660.2 + 15503.6 + 15566.2 + 15785.5 + 15566.2 ==  78081.7 KByte/sec

> The totals are "all over the place", but what interests me the most is
> that the total aggregate never exceeds an amount that's slightly under
> 300MBytes/sec..  That number has some relevance if, say, you're using a
> port multiplier (5 devices aggregated across one SATA300 port).
  No. All drives are on separate ports of the ICH9R chipset controller.
And, yes, a sustained and constant 300MiB/s is my dream :) Keywords:
sustained and constant.
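
The aggregate arithmetic quoted above can be re-checked with a few lines
of C. This is only a sketch: the function names are hypothetical, and it
assumes 1 MByte = 1024 KByte and a nominal 300 MByte/s payload rate for a
single SATA300 link.

```c
#include <stddef.h>

/* Sum per-device rates given in KByte/s. */
double aggregate_kbytes(const double *rates, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += rates[i];
    return sum;
}

/*
 * Fraction of a nominal link budget that an aggregate rate would use.
 * Assumes 1 MByte = 1024 KByte; link_mbytes_per_sec is e.g. 300.0.
 */
double link_utilisation(double kbytes_per_sec, double link_mbytes_per_sec)
{
    return kbytes_per_sec / 1024.0 / link_mbytes_per_sec;
}
```

Fed with the highest of the five samples (283103.7 KByte/sec), this gives
a utilisation a little over 0.92 of one such link, which is the "slightly
under 300MBytes/sec" ceiling mentioned in the quoted text.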

> Despite these being WD20EARS drives (4 platters, ugh!), these individual
  As far as I understand, 4 platters are slightly better than 3 platters
for linear access, as the drive reads more data without head movement,
but worse for random access.

> devices should be able to push 75-90MBytes/sec writes, and slightly
> higher reads.
   Reads are about 110MiB/s at the beginning of the drive.

> Here's an idea: can you stop using the filesystem for a bit and instead
> do raw dd's from all of the /dev/adaX entries to /dev/null
> simultaneously (pick something like bs=64k or bs=256k), then run your
> iostats?  I'm basically trying to figure out if the bad speeds are
> actually the devices themselves or if it's the geom_raid5 stuff.  You
> get where I'm going with this.
  Not a problem! The FS is unmounted, and after that:

# for d in 1 2 3 4 5 ; do dd if=/dev/ada$d of=/dev/null bs=64k & done
# iostat -c 999999 -dx ada1 ada2 ada3 ada4 ada5
device     r/s   w/s    kr/s    kw/s wait svc_t  %b
ada1     1849.1   0.0 118343.7     0.0    1   0.5  93
ada2     1920.3   0.0 122900.2     0.0    0   0.5  94
ada3     1874.5   0.0 119966.6     0.0    1   0.5  94
ada4     1794.5   0.0 114848.4     0.0    1   0.5  94
ada5     1893.0   0.0 121152.5     0.0    1   0.5  93

  This is very typical data: the speed goes up and down slightly on all
 HDDs, with no visibly fastest or slowest drive.
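
To total the kr/s column from output like the above, a small parser is
enough. The sketch below assumes the exact column order shown in the
header (device, r/s, w/s, kr/s, ...); the function names are hypothetical.

```c
#include <stdio.h>

/*
 * Parse one `iostat -dx` data row in the layout
 *   device r/s w/s kr/s kw/s wait svc_t %b
 * Returns 1 and stores kr/s on success, 0 for header or malformed rows.
 */
int parse_kr_per_sec(const char *line, double *kr)
{
    char dev[32];
    double rs, ws, krs;

    /* The header fails here because "r/s" is not a number. */
    if (sscanf(line, "%31s %lf %lf %lf", dev, &rs, &ws, &krs) != 4)
        return 0;
    *kr = krs;
    return 1;
}

/* Sum kr/s over all parseable rows; unparseable lines are skipped. */
double total_read_kbytes(const char *const *lines, int n)
{
    double sum = 0.0, kr;

    for (int i = 0; i < n; i++)
        if (parse_kr_per_sec(lines[i], &kr))
            sum += kr;
    return sum;
}
```

For the five rows above the total is 597211.4 KByte/sec, roughly twice
the highest aggregate quoted earlier in this message.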

> If 5 simultaneous dds reading from the drives is very fast (way faster
> than the above) and there aren't sporadic drops in performance which
> aren't caused by writes (hence my "stop using the filesystem" comment),
> then I think we've narrowed down where the issue lies -- not the drives.
   Yep. It seems to be exactly like this.

> The dd method I describe should absolutely not induce writes, hence my
> recommendation.  If writes are seen during the dd's, then either the
> filesystem is mounted and FreeBSD is doing something "interesting" on a
> filesystem or vfs level, or your system is actually an izbushka.....

> Maybe softupdates are somehow responsible?  Not sure.
  I have one idea about geom_raid5 writes... I need to check it.

-- 
// Black Lion AKA Lev Serebryakov <lev at serebryakov.spb.ru>



More information about the freebsd-fs mailing list