C99: Suggestions for style(9)
Roman Divacky
rdivacky at freebsd.org
Tue Apr 28 12:32:03 UTC 2009
I like the part about using as many variables as possible because
of documentation and performance enhancements. I tend to like
the other changes as well..
On Sun, Apr 26, 2009 at 09:02:36AM +0200, Christoph Mallon wrote:
> Hi hackers@,
>
> as some of you may have noticed, several years ago a new millenium
> started and a decade ago there was a new C standard. HEAD recently
> switched to C99 as default (actually gnu99, but that's rather close). So
> it's finally time to re-examine wether style(9) needs to be accomodated.
> The diff with the proposed changes is attached. Below are the changes
> with some further explanations. They are in the order as they appear in
> style(9).
> Maybe using all of these changes is a bit too big change at once, but
> I'd like some of them applied to style(9). So, please consider the
> proposed changes individually and not a as a all-or-nothing package.
>
> >-Do not put declarations
> >-inside blocks unless the routine is unusually complicated.
> >+Prefer declaring loop iterators in the for-statement to limit their scope.
> > .Bd -literal
> >- for (; cnt < 15; cnt++) {
> >+ for (int cnt = 0; cnt < 15; cnt++) {
> [...]
> >-When declaring variables in functions declare them sorted by size,
> >-then in alphabetical order; multiple ones per line are okay.
> >+When declaring variables in functions declare them sorted in alphabetical
> >order;
> >+prefer one declaration per line, especially when pointer declarators or
> >+initializations are involved.
> > If a line overflows reuse the type keyword.
> > .Pp
> >-Be careful to not obfuscate the code by initializing variables in
> >-the declarations.
> >-Use this feature only thoughtfully.
> >-DO NOT use function calls in initializers.
> >+Prefer initializing variables right at their declaration.
> >+When the value is not supposed to change in the function, make the
> >variable
> >+const.
> >+This makes it easier for a reader to identify the where the value of a
> >variable
> >+originates.
> >+Do not reuse the same variable in a different context, declare a new
> >variable.
> > .Bd -literal
> >- struct foo one, *two;
> >- double three;
> >- int *four, five;
> >- char *six, seven, eight, nine, ten, eleven, twelve;
> >+ struct foo *bar;
> >+ struct foo baz = { 42, "qux" };
> >
> >- four = myfunction();
> >+ FILE *const f = fopen("name", "r");
> >+ if (f == NULL)
> >+ err("Failed to open file");
> >+ /* We can safely assume that f is not changed anymore, even if the
> >+ * function is a hundred lines long */
>
> This change is to reduce the scope of local variables. For reasons why
> this does not cost anything in terms of stack space, please see the last
> change, which adds explanations about local variables.
> Smaller scopes and distinct variables for distinct contexts make it
> easier for readers of the code to identify the def-use-chains of
> variables, because there are less places with assignments to a variable
> and the scope is smaller. Also, when a variable is initialised right at
> its declaration and is not supposed to change, you can emphasize this by
> making it const.
> I looked at older discussions regarding this topic and identified
> several concerns, which were raised. I'll address them below:
>
> - It does not add anything to the language.
> Well, yes, just like if, do, for, goto, the comma operator, compound
> assignment (+=, ...), ++/-- and many other things. All you need to be
> Turing complete is lambda calculus - there hardly can be less syntax.
> But, like all mentioned constructs, it makes the code more concise.
>
> - It leads to sloppy code. You could reuse another variable or should
> think again whether you really need this variable.
> Reusing variables in different contexts is error prone and bad for
> maintainability. You could reuse a variable, which is not as dead as you
> thought. More local variables cost nothing, especially no stack space,
> see the last proposed changed below. It is good to use more local
> variables, because it gives the reader a name, which carries
> information. Also it makes it easier for a reader (and the compiler!) to
> identify, which expressions are the same. You could repeat
> foo->long_name->even_longer_name[2 * i + 1] five times or just declare a
> local variable and cache the value to make it easier to read.
> It also enables the compiler to produce better warnings: If you declare
> a new variable, it is not initialised and if you do not initialise it on
> all paths before a use, the compiler will warn you. But if you reuse an
> older variable, it is already initialised by its former uses and so you
> get no warning, but operate on garbage values (the old value from the
> previous use).
> So it does not lead to slopyy code, but better code, which is easier to
> comprehend, the compiler can better help you to prevent bugs, and as a
> side effect the compiler can produce better code (aliasing is a major
> problem for common subexpression elimination).
>
> - You can determine the stack usage with all the variable declarations
> at the top.
> This is not true. There is no relation between the declared variables
> and the stack used. Especially, more stack than suggested by the
> declarations can be used due to various optimisations - less space can
> be used, too, of course.
>
> - It is harder to find the declaration of a variable.
> Every editor, which is more advanced than ed, has facilities to jump to
> the declaration of a variable (e.g. in vim it's "gd"). And most often
> this is not necessary, because the declaration is just a few lines above
> instead of all the way up at the top of the function, so no jumping
> around is required at all.
>
> - You could accidently shadow a variable.
> All modern compilers have warnings for this. FreeBSD uses -Wshadow for
> GCC (and all relevant compilers understand this very option).
>
>
> >-Values in
> >-.Ic return
> >-statements should be enclosed in parentheses.
> >-.Pp
>
> return with parentheses:
> Removed, because it does not improve maintainability in any way. There
> is no source for confusion here, so the rule even contradicts the rule,
> which states not to use redundant parentheses. Maybe, decades ago it was
> just a workaround for a broken compiler, which does not exist anymore.
>
>
> >-Old-style function declarations look like this:
> >-.Bd -literal
> >-static char *
> >-function(a1, a2, fl, a4)
> >- int a1, a2; /* Declare ints, too, don't default them. */
> >- float fl; /* Beware double vs. float prototype differences. */
> >- int a4; /* List in order declared. */
> >-{
> >-.Ed
> >-.Pp
> >-Use ANSI function declarations unless you explicitly need K&R
> >compatibility.
> >+Use ANSI function declarations.
>
> Old-style function declarations:
> Removed, because there is no need to use them anymore. The C99 standard
> considers them obsolescent[1], too. All headers use prototype
> declarations, there's no reason to handle the function definitions
> different. Also they are a source of bugs - or did you know that the
> following is wrong:
> void wrong(char /* sic! */);
> void wrong(x) char x; {}
> And this is correct[2]:
> void right(int /* sic! */);
> void right(x) char x; {}
> (It's a bug in GCC that it does not complain about the former.)
>
>
> >-
> >-static void
> >-usage()
> >-{
> >- /* Insert an empty line if the function has no local variables. */
>
> Empty line when there are no local variables:
> Removed, because it does not improve readability and it actually hinders
> readability for very short functions, e.g. static inline functions,
> which just consist of one or two lines of code without any local
> variables except for parameters. Further, it makes even less sense with
> the changed recommendations for declarations.
>
>
> >+.Sh LOCAL VARIABLES
> >+Unaliased local variables are for free on modern compilers.
> >+They do not unconditionally use stack space.
> >+In fact there is no relation between their number and the used stack
> >space.
> >+A local variable is guaranteed to be unaliased, if its address has not
> >been
> >+taken (e.g. &i).
> >+Do not use the same local variable in different context just because it
> >happens
> >+that the same type is needed.
> >+It does not improve the generated code in any way, but it makes it harder
> >for a
> >+reader to determine all definitions and uses which belong together.
> >+Further, a new variable can be given a meaningful name (even if it is very
> >+short), which automatically serves as documentation and so improves
> >+comprehensibility.
> >+Especially avoid re-using variables, whose address has been taken:
> >+.Bd -literal
> >+ int i;
> >+ foo(&i);
> >+ printf("%d\\n", i);
> >+ for (i = 0; i != 10; ++i) {
> >+ /* BAD: i will be needlessly moved to and from memory in
> >+ * the loop, because its address was taken before. This
> >+ * results in larger and slower code.
> >+ * Declare a new variable for the loop instead. */
> >+ printf("%d\\n", i);
> >+ }
> >+.Ed
> >+
>
> Last, but definitely not least, I added this paragraph about the use of
> local variables. This is to clarify, how today's compilers handle
> unaliased local variables. There is absolutely no need to reuse them in
> different contexts. Trying to optimise stack usage in this way is
> misguided and is a source of bugs, when a reused variable is not as dead
> as thought. For more reasons, please read the quoted diff.
> Maxim, you requested this paragraph: Does this addition suit you?
>
>
> Hopefully at least some of these suggestions are considered improvements
> and are applied to style(9).
>
> Regards
> Christoph
>
>
> [1] ISO/IEC 9899:1999 (E) ?6.11.6 and ?6.11.7
> [2] ISO/IEC 9899:1999 (E) ?6.7.5.3:15
More information about the freebsd-hackers
mailing list