cvs commit: src/sys/kern init_main.c kern_malloc.c md5c.c subr_autoconf.c subr_mbuf.c subr_prf.c tty_subr.c vfs_cluster.c vfs_subr.c

Peter Wemm peter at wemm.org
Tue Jul 22 18:34:44 PDT 2003


Garance A Drosihn wrote:
> At 5:32 PM -0700 7/22/03, Peter Wemm wrote:
> >
> >Take the i386 interrupt vector code.  That's an example where
> >it is massively inlined.  Having a non-inlined function that
> >does all the calculations and bit shifting is much smaller
> >in code size, but slower at runtime.
> 
> If I understand this discussion correctly, then the previous
> version of gcc (in freebsd-current) was NOT inlining these
> sections even though we thought it was.

In some cases, yes, that was happening.  Not the interrupt code,
though, because that's generated by hand in assembler.  Things like some
kobj and VOP_* wrappers were not being inlined as they should be.

What has been highlighted is that inline has been abused over time. The
argument is how and where to draw the line in cost vs benefit.  'inline' is
a hint to the compiler that you believe that the increased code size cost
is worth it.  The problem is that gcc was refusing some inlines because
its cost estimation ran before optimization and was way out of sync with
reality.  e.g.: a reference to curthread blows the estimate out of the
water, even though it accounts for only 1 or 2 instructions.
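
To make the hint-versus-force distinction concrete, here is a minimal
sketch, assuming a gcc that supports the always_inline and noinline
function attributes (GCC extensions, not standard C).  The rotl32_*
helpers are hypothetical names made up for illustration; they are not
from the FreeBSD tree.

```c
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only.
 *
 * Plain 'inline' is a hint: gcc applies its own size estimate and may
 * still emit an out-of-line call.
 */
static inline uint32_t
rotl32_hint(uint32_t x, unsigned n)
{
	n &= 31;
	return (n == 0 ? x : (x << n) | (x >> (32 - n)));
}

/*
 * GCC extension: always_inline asks gcc to inline the body at each
 * call site regardless of its cost estimate.
 */
static inline __attribute__((__always_inline__)) uint32_t
rotl32_forced(uint32_t x, unsigned n)
{
	n &= 31;
	return (n == 0 ? x : (x << n) | (x >> (32 - n)));
}

/*
 * GCC extension: noinline keeps one shared copy of the code, trading a
 * call per use for smaller text size.
 */
__attribute__((__noinline__)) uint32_t
rotl32_call(uint32_t x, unsigned n)
{
	n &= 31;
	return (n == 0 ? x : (x << n) | (x >> (32 - n)));
}
```

All three compute the same result; the only difference is where the code
ends up, which is exactly the size versus speed tradeoff being argued
about.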

The other problem is that many of the original measurements were done years
ago by folks who are no longer with us (eg: John Dyson) on hardware that
is no longer representative of current hardware.  On the other hand, many
of us still use that older hardware, and so tuning there is a much bigger
issue.  eg: my home desktop predates John Dyson leaving.

Meanwhile, -Werror is still disabled.  We have been sneaking in potentially
fatal errors.  eg: kern_umtx.c has what looks like a legitimate problem to
me, but it's lost in the inline noise.

> Might we expect some
> performance improvements now that we know to force gcc to
> inline the functions?

That's an interesting question, isn't it?  Somebody had previously measured
a 5% slowdown as a result of not inlining the VOP_* function wrappers.
I wonder if this is part of the 4.x -> 5.x slowdown that hasn't been
resolved yet.  I don't know how many VOP_* calls are or are not presently
being inlined though.
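
For anyone unfamiliar with what such a wrapper looks like, the following
is a simplified sketch of the general shape: a small function that packs
its arguments and dispatches through a per-vnode operations table.  Every
name here (the sk_ prefix, SK_VOP_GETSIZE, the struct layouts) is
hypothetical and chosen for illustration; it is not the real generated
VOP_* code.

```c
#include <stddef.h>

struct sk_vnode;

/* Hypothetical argument bundle handed to the filesystem's operation. */
struct sk_getsize_args {
	struct sk_vnode	*a_vp;
	long		*a_sizep;
};

/* Hypothetical per-filesystem operations table. */
struct sk_vnodeops {
	int	(*vop_getsize)(struct sk_getsize_args *ap);
};

struct sk_vnode {
	const struct sk_vnodeops *v_op;
	long			  v_size;
};

/*
 * The wrapper: pack the arguments, then make one indirect call.  If the
 * compiler declines to inline this, every use pays for a direct call to
 * the wrapper on top of the indirect call it performs.
 */
static inline int
SK_VOP_GETSIZE(struct sk_vnode *vp, long *sizep)
{
	struct sk_getsize_args a;

	a.a_vp = vp;
	a.a_sizep = sizep;
	return (vp->v_op->vop_getsize(&a));
}

/* A toy filesystem implementation of the operation. */
static int
sk_mem_getsize(struct sk_getsize_args *ap)
{
	*ap->a_sizep = ap->a_vp->v_size;
	return (0);
}

static const struct sk_vnodeops sk_mem_vnodeops = {
	sk_mem_getsize
};
```

The wrapper body is only a few stores and an indirect call, which is why
an out-of-line copy adds call overhead without saving much code.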

Cheers,
-Peter
--
Peter Wemm - peter at wemm.org; peter at FreeBSD.org; peter at yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5


