Some FreeBSD performance Issues

Benjamin Lutz mail at
Mon Nov 12 05:48:12 PST 2007

Randall Hyde wrote:
> Hi All,
> I recently ported my HLA (High Level Assembler) compiler to FreeBSD and,
> along with it, the HLA Standard Library. I have a performance-related
> question concerning file I/O.
> It appears that character-at-a-time file I/O is *exceptionally* slow. Yes, I
> realize that when processing large files I really ought to be doing
> block/buffered I/O to get the best performance, but for certain library
> routines I've written it's been far more convenient to do
> character-at-a-time I/O rather than deal with all the buffering issues.  In
> the past, while slower, this character-at-a-time paradigm has provided
> reasonable, though not stellar, performance under Windows and Linux.
> However, with the port to FreeBSD I'm seeing a three-orders-of-magnitude
> performance loss.  Here's my little test program:
> program t;
> #include( "stdlib.hhf" )
> //#include( "bsd.hhf" )
> static
>     f       :dword;
>     buffer  :char[64*1024];
> begin t;
> "socket.h", fileio.r );
>     mov( eax, f );
> #if( false )
>     // Windows: 0.25 seconds
>     // BSD: 5.2 seconds
>     while( !fileio.eof( f )) do
>         fileio.getc( f );
>         //stdout.put( (type char al ));
>     endwhile;
> #elseif( false )
>     // Windows: 0.0 seconds (below 1ms threshold)
>     // BSD: 5.2 seconds
>     forever
> f, buffer, 1 );
>         breakif( eax <> 1 );
>         //stdout.putc( buffer[0] );
>     endfor;
> #elseif( false )
>     // BSD: 5.1 seconds
>     forever
> f, buffer, 1 );
>         breakif( @c );
>         breakif( eax <> 1 );
>         //stdout.putc( buffer[0] );
>     endfor;
> #else
>     // BSD: 0.016 seconds
> f, buffer, 64*1024 );
>     //stdout.write( buffer, eax );
> #endif
>     fileio.close( f );
> end t;
> (I selectively set one of the conditionals to true to run a different test;
> yeah, this is HLA assembly code, but I suspect that most people who can read
> C can *mostly* figure out what's going on here).
> The "" call is basically a "socket.h", bsd.O_RDONLY );
> API call.  The socket.h file is about 19K long (it's from the FreeBSD
> include file set). In particular, I would draw your attention to the first
> two tests that do character-at-a-time I/O. The difference in performance
> between Windows and FreeBSD is dramatic (note: Linux numbers are comparable
> to Windows). Just to make sure that the library code wasn't doing something
> incredibly stupid, the third test makes a direct FreeBSD API call to read
> the data a byte at a time -- the results are comparable to the first two
> tests. Finally, I read the whole file at once, just to make sure the problem
> was character-at-a-time I/O (which obviously is the problem).  Naturally, at
> one point I'd uncommented all the output statements to verify that I was
> reading the entire file -- no problem there.
> Is this really the performance I can expect from FreeBSD when doing
> character I/O this way? Is there some tuning parameter I can set to
> change internal buffering or something?  From these numbers, if I had to
> guess, I'd suspect that FreeBSD was re-reading the entire 4K (or whatever)
> block from the file cache every time I read a single character. Can anyone
> explain what's going on here?  I'm loath to change my fileio module to add
> buffering as that will create some subtle semantic differences that could
> break existing code (I do have an object-oriented file I/O class that I'm
> going to use to implement buffered I/O, I would prefer to leave the fileio
> module unbuffered, if possible).
> And a more general question: if this is the way FreeBSD works, should
> something be done about it?
> Thanks,
> Randy Hyde

Hello Randy,

First, let me out myself as a fan of yours. It was your book that got me
started on ASM and taught me a lot about computers and logic, plus it
provided some entertainment and mental sustenance in pretty boring
times, so thanks!

Now, as for your problem: I think I have to agree with the others in
this thread when they say that the problem likely isn't in FreeBSD. The
following C program, which uses the read(2) call to read socket.h
byte-by-byte, runs quickly (0.05 secs on my 2.1GHz system, measured with

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

int main(int argc, char** argv) {
        int f;
        char c;
        ssize_t result;

        f = open("/usr/include/sys/socket.h", O_RDONLY);
        if (f < 0) { perror("open"); exit(1); }

        do {
                result = read(f, &c, 1);
                if (result < 0) { perror("read"); exit(1); }
                //printf("%c", c);
        } while (result >= 1);

        return 0;
}

This should be roughly equivalent to your second and third code
fragments; it does one read system call per byte, with no buffering
involved. This leads me to believe that the slowdown occurs in your
wrapper, or maybe in process setup/teardown.

