read two files simultaneously

Junsuk Shin junsukshin at gmail.com
Sat Feb 21 11:46:45 PST 2009


Hello,

I need to read two files simultaneously, so I simply interleave
read(2) calls on the two files. The problem is that the performance
varies dramatically depending on the file size, and I'm wondering what
causes this.

The test application does the following:

open 2 files
  - the two files are the same size
  - since I read each file only once, bypass the cache with O_DIRECT
read 16 Kbytes of file1, then read 16 Kbytes of file2, and so on

The simplified code looks like this:

fd1 = open(file1, O_RDONLY | O_DIRECT);
fd2 = open(file2, O_RDONLY | O_DIRECT);

for(...) {
    /* read 16K of file1 */
    while(...) {
        count = read(fd1,...);
        ....
    }
    /* read 16K of file2 */
    while(...) {
        count = read(fd2,...);
        ....
    }
}
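
For anyone who wants to reproduce this, here is a fuller sketch of the same loop. It is not the exact test program: the names read_chunk and read_interleaved, the 4096-byte buffer alignment, and the extra_flags parameter (pass O_DIRECT for the uncached case, 0 for buffered reads) are assumptions made for illustration. It also assumes the file sizes are a multiple of 16 Kbytes, since under O_DIRECT a retry after a short read may be rejected when the offset is no longer aligned.

```c
#define _GNU_SOURCE          /* exposes O_DIRECT on Linux; not needed on FreeBSD */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK (16 * 1024)    /* 16 Kbytes per turn, as in the test above */
#define ALIGN 4096           /* assumed buffer alignment for O_DIRECT */

/* Read up to CHUNK bytes from fd, retrying on short reads.
 * Returns the number of bytes read (0 at EOF) or -1 on error. */
static ssize_t read_chunk(int fd, char *buf)
{
    size_t done = 0;
    while (done < CHUNK) {
        ssize_t count = read(fd, buf + done, CHUNK - done);
        if (count < 0)
            return -1;
        if (count == 0)      /* EOF */
            break;
        done += (size_t)count;
    }
    return (ssize_t)done;
}

/* Alternate 16 Kbyte reads between two files until both reach EOF.
 * Returns the total number of bytes read from both files, or -1 on error. */
long long read_interleaved(const char *file1, const char *file2, int extra_flags)
{
    long long total = -1;
    void *mem = NULL;
    int fd1 = open(file1, O_RDONLY | extra_flags);
    int fd2 = open(file2, O_RDONLY | extra_flags);

    assert(CHUNK % ALIGN == 0);  /* transfer size should be a multiple of the alignment */

    if (fd1 >= 0 && fd2 >= 0 && posix_memalign(&mem, ALIGN, CHUNK) == 0) {
        char *buf = mem;
        ssize_t n1, n2;
        total = 0;
        do {
            n1 = read_chunk(fd1, buf);   /* read 16K of file1 */
            n2 = read_chunk(fd2, buf);   /* read 16K of file2 */
            if (n1 < 0 || n2 < 0) {
                total = -1;
                break;
            }
            total += n1 + n2;
        } while (n1 > 0 || n2 > 0);
        free(mem);
    }
    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return total;
}
```

Calling read_interleaved(file1, file2, O_DIRECT) from a small main() and timing it with time(1) should give numbers comparable to the ones below.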

When I tested with two 100M files, it took 3.17 seconds (about 31MB/s
per file, 62MB/s in total).
However, when I tested with two 700M files, it took 162 seconds (about
4.5MB/s per file, 9MB/s in total).
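
The rates in parentheses come from dividing the size of one file by the elapsed time. A minimal sketch of that arithmetic, assuming "M" means 10^6 bytes (with 2^20 bytes the results are about 5% higher); the function name per_file_mbps is only illustrative:

```c
#include <assert.h>

/* Throughput per file in MB/s, given the size of one file in MB and the
 * wall-clock time in seconds needed to read both files interleaved.
 * The total rate for both files is twice this value. */
double per_file_mbps(double file_mb, double seconds)
{
    assert(seconds > 0);
    return file_mb / seconds;
}
```

per_file_mbps(100, 3.17) comes out at roughly 31.5 and per_file_mbps(700, 162) at roughly 4.3, which matches the rounded figures above.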

My guess is that the inode structure or the physical location of the
files on the HDD might be related to this. But if I read only one file,
the size doesn't matter: reading a single file (10M, 100M, or 700M)
consistently gives about 70MB/s. The weird behavior appears only when I
read two big files.

Seek time might be related to this, but the difference looks too large
for that alone. What is going on here?
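
To put the difference in per-read terms, the elapsed time can be divided by the number of 16 Kbyte reads. This is only a restatement of the measurements above, not an explanation; it assumes "M" means 10^6 bytes, and the name ms_per_read is made up for illustration.

```c
#include <assert.h>

#define CHUNK_BYTES 16384.0   /* 16 Kbytes per read, as in the test */

/* Average wall-clock milliseconds per 16 Kbyte read, given the size of
 * one file in bytes and the total time in seconds to read both files. */
double ms_per_read(double file_bytes, double seconds)
{
    assert(file_bytes > 0 && seconds >= 0);
    double reads = 2.0 * file_bytes / CHUNK_BYTES;   /* reads over both files */
    return seconds * 1000.0 / reads;
}
```

With the numbers above this gives roughly 0.26 ms per read for the 100M files and roughly 1.9 ms per read for the 700M files. For comparison, transferring 16 Kbytes at the 70MB/s single-file rate takes about 0.23 ms, so the 100M case is close to pure transfer time while the 700M case spends about 1.6 ms extra on every read.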

Thanks.

-- 
Junsuk


More information about the freebsd-questions mailing list