cvs commit: src/sys/sparc64/include in_cksum.h
christoph.mallon at gmx.de
Fri Jun 27 22:57:31 UTC 2008
Marius Strobl wrote:
> I wasn't aware that the clobber list allows to explicitly specify
> the condition codes, thanks for the hint. Though it unfortunately
> took me longer than two days to verify its effect on the generated
> code; sparc64 could still have been one of the archs where "cc" has
> no effect. Besides I don't think using "__volatile" for this is
> that wrong, given that the sparc64 code generated by using "cc"
> and "__volatile" is nearly identical and given that at least i386
> relies on "__volatile" telling GCC that the inline assembler uses
> the condition codes since quite some time. So the condition codes
> are probably part of what GCC treats as "important side-effects".
If this is true and GCC handles the eflags on x86 correctly only when
__volatile is used, but not when "cc" is marked as clobbered, then this
is clearly a bug.
> Regarding the MFC, they don't happen automatically and the change
> was not wrong in general so there was no need to hurry :)
I still think using __volatile only works by accident. volatile on an
assembler block mostly means "this asm statement has an effect, even
though the register specification suggests otherwise, so do not optimise
it away" (i.e. no CSE, do not remove it if the result is unused, etc.).
On a related note: Is inline assembler really necessary here? For
example couldn't in_addword() be written as
static __inline u_short
in_addword(u_short const sum, u_short const b)
{
	u_int const t = sum + b;
	return t + (t >> 16);
}
This should at least produce equally good code and because the compiler
has more knowledge about it than an assembler block, it potentially
leads to better code. I have no SPARC compiler at hand, though.
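For anyone who wants to convince themselves on an ordinary host that the
plain C version does the end-around carry correctly, here is a minimal
sketch. It is not the committed code: uint16_t/uint32_t stand in for
u_short/u_int, and ref_addword() and count_mismatches() are names I made
up for the check.

```c
#include <stdint.h>

/* Plain C candidate; fixed-width types stand in for u_short/u_int. */
static inline uint16_t
in_addword(uint16_t const sum, uint16_t const b)
{
	/* The sum is at most 0x1fffe, so bit 16 is the carry. */
	uint32_t const t = (uint32_t)sum + b;

	/* Fold the carry back in; the cast truncates to 16 bits. */
	return (uint16_t)(t + (t >> 16));
}

/* Hypothetical reference: add, then fold until no carry is left. */
static uint16_t
ref_addword(uint16_t sum, uint16_t b)
{
	uint32_t t = (uint32_t)sum + b;

	while (t >> 16)
		t = (t & 0xffff) + (t >> 16);
	return (uint16_t)t;
}

/* Compare both versions on a sampled grid; returns the mismatch count. */
static unsigned long
count_mismatches(void)
{
	unsigned long bad = 0;
	uint32_t s, b;

	for (s = 0; s <= 0xffff; s++) {
		/* Every sum, every 251st addend, plus the maximum addend. */
		for (b = 0; b <= 0xffff; b += 251)
			bad += in_addword((uint16_t)s, (uint16_t)b) !=
			    ref_addword((uint16_t)s, (uint16_t)b);
		bad += in_addword((uint16_t)s, 0xffff) !=
		    ref_addword((uint16_t)s, 0xffff);
	}
	return bad;
}
```

The grid is sampled only to keep the run short; looping over all addends
works the same way.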
In fact the in/out specification for this asm block looks rather bad:
"=&r" (__ret), "=&r" (__tmp) : "r" (sum), "r" (b) : "cc");
The "&"-modifiers (do not use the same registers as for any input
operand value) force the compiler to use 4 (!) registers in total for
this asm block. It could be done with 2 registers if a proper in/out
specification were used. At the very least the in/out specification can
be improved, but I suspect using plain C is the better choice.
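To illustrate what I mean by a proper in/out specification, here is a
sketch on x86, since I cannot test SPARC. It is only an analogue of the
sparc64 code, and add32_carry() is a made-up name. "+r" marks the first
operand as both input and output, so the block needs only two registers,
and "cc" states that the flags are clobbered. No __volatile is needed,
because the effect of the block is fully described by its operands. On
other architectures the sketch falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit add with end-around carry. */
static inline uint32_t
add32_carry(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	/*
	 * %0 is read and written ("+r"), %1 is only read ("r"):
	 * two registers in total.  "cc" tells GCC the flags change.
	 */
	__asm__("addl %1, %0\n\t"	/* a += b, sets the carry flag */
	    "adcl $0, %0"		/* a += carry */
	    : "+r" (a)
	    : "r" (b)
	    : "cc");
	return a;
#else
	/* Portable fallback: do the same thing in plain C. */
	uint64_t const t = (uint64_t)a + b;

	return (uint32_t)(t + (t >> 32));
#endif
}
```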