Some FreeBSD Performance Issues

Thu Nov 8 14:33:00 PST 2007

Hi All,

I recently ported my HLA (High Level Assembler) compiler to FreeBSD and,
along with it, the HLA Standard Library. I have a performance-related
question concerning file I/O.

It appears that character-at-a-time file I/O is *exceptionally* slow. Yes, I
realize that when processing large files I really ought to be doing
block/buffered I/O to get the best performance, but for certain library
routines I've written it's been far more convenient to do
character-at-a-time I/O rather than deal with all the buffering issues.  In
the past, while slower, this character-at-a-time paradigm has provided
reasonable, though not stellar, performance under Windows and Linux.
However, with the port to FreeBSD I'm seeing a three-orders-of-magnitude
performance loss.  Here's my little test program:

program t;
#include( "stdlib.hhf" )
//#include( "bsd.hhf" )

static
    f       :dword;
    buffer  :char[64*1024];

begin t;

    fileio.open( "socket.h", fileio.r );
    mov( eax, f );
#if( false )

    // Windows: 0.25 seconds
    // BSD: 5.2 seconds

    while( !fileio.eof( f )) do

        fileio.getc( f );
        //stdout.put( (type char al ));

    endwhile;

#elseif( false )

    // Windows: 0.0 seconds (below 1ms threshold)
    // BSD: 5.2 seconds

    forever

        fileio.read( f, buffer, 1 );
        breakif( eax <> 1 );
        //stdout.putc( buffer[0] );

    endfor;

#elseif( false )

    // BSD: 5.1 seconds

    forever

        bsd.read( f, buffer, 1 );
        breakif( @c );
        breakif( eax <> 1 );
        //stdout.putc( buffer[0] );

    endfor;

#else

    // BSD: 0.016 seconds

    bsd.read( f, buffer, 64*1024 );
    //stdout.write( buffer, eax );

#endif

    fileio.close( f );

end t;

(I selectively set one of the conditionals to true to run a different test;
yeah, this is HLA assembly code, but I suspect that most people who can read
C can *mostly* figure out what's going on here).

The "fileio.open" call is basically a bsd.open( "socket.h", bsd.O_RDONLY );
API call.  The socket.h file is about 19K long (it's from the FreeBSD
include file set). In particular, I would draw your attention to the first
two tests that do character-at-a-time I/O. The difference in performance
between Windows and FreeBSD is dramatic (note: Linux numbers are comparable
to Windows). Just to make sure that the library code wasn't doing something
incredibly stupid, the third test makes a direct FreeBSD API call to read
the data a byte at a time -- the results are comparable to the first two
tests. Finally, I read the whole file at once, just to make sure the problem
was character-at-a-time I/O (which obviously is the problem).  Naturally, at
one point I'd uncommented all the output statements to verify that I was
reading the entire file -- no problem there.
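For anyone who wants to reproduce this without HLA, here is a rough C
equivalent of the third and fourth tests. It is only a sketch, not the code
I timed above: the function names are made up for illustration, and it
assumes a POSIX system with clock_gettime(). Each call opens the file, reads
it either one byte per read(2) call or in 64K chunks, and returns the byte
count.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Read the file one byte per read(2) call, like the bsd.read(f, buffer, 1)
 * loop. Returns the number of bytes read, or -1 on error. */
static long read_bytewise(const char *path)
{
    char c;
    long total = 0;
    ssize_t n;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = read(fd, &c, 1)) == 1)
        total++;
    close(fd);
    return n < 0 ? -1 : total;
}

/* Read the file in 64K chunks, like the single large bsd.read() call.
 * Returns the number of bytes read, or -1 on error. */
static long read_blockwise(const char *path)
{
    static char buf[64 * 1024];
    long total = 0;
    ssize_t n;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    close(fd);
    return n < 0 ? -1 : total;
}

/* Run one reader and return the elapsed wall-clock time in seconds.
 * The reader's byte count (or -1) is stored in *bytes. */
static double time_reader(long (*reader)(const char *), const char *path,
                          long *bytes)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    *bytes = reader(path);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling time_reader( read_bytewise, "socket.h", &n ) and then
time_reader( read_blockwise, "socket.h", &n ) from a small main() should
show whether the gap is in the read system call itself rather than in
anything HLA-specific. The byte-wise version makes one system call per byte
(about 19,000 for this file); the block version makes two.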

Is this really the performance I can expect from FreeBSD when doing
character I/O this way? Is there some tuning parameter I can set to
change internal buffering or something?  From these numbers, if I had to
guess, I'd suspect that FreeBSD was re-reading the entire 4K (or whatever)
block from the file cache every time I read a single character. Can anyone
explain what's going on here?  I'm loath to change my fileio module to add
buffering as that will create some subtle semantic differences that could
break existing code (I do have an object-oriented file I/O class that I'm
going to use to implement buffered I/O; I would prefer to leave the fileio
module unbuffered, if possible).
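To make the buffering trade-off concrete, here is a minimal sketch in C (not
HLA, and not the actual fileio code) of a read-ahead layer over read(2). The
names rbuf, rbuf_init, and rbuf_getc are hypothetical, and the 4K buffer
size is an arbitrary assumption.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define RBUF_SIZE 4096          /* arbitrary read-ahead size */

typedef struct {
    int fd;                     /* underlying file descriptor */
    size_t pos;                 /* index of next unread byte in data */
    size_t len;                 /* number of valid bytes in data */
    unsigned char data[RBUF_SIZE];
} rbuf;

/* Attach an empty buffer to an already-open descriptor. */
static void rbuf_init(rbuf *b, int fd)
{
    assert(b != NULL);
    b->fd = fd;
    b->pos = 0;
    b->len = 0;
}

/* Return the next byte as 0..255, or -1 on end of file or error.
 * read(2) is called only when the buffer has been used up. */
static int rbuf_getc(rbuf *b)
{
    if (b->pos == b->len) {
        ssize_t n = read(b->fd, b->data, sizeof b->data);
        if (n <= 0)
            return -1;
        b->len = (size_t)n;
        b->pos = 0;
    }
    return b->data[b->pos++];
}
```

This also shows one of the semantic differences I'm worried about: after a
single rbuf_getc() the kernel's file offset has already moved past
everything that was read ahead, so any code that mixes buffered reads with
direct reads or seeks on the same descriptor will see a different position
than it does with the unbuffered version.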

And a more general question: if this is the way FreeBSD works, should
something be done about it?

Thanks,
Randy Hyde