SEDIMENT

SEDIMENT coding style

This document describes the coding style for the SEDIMENT project. It is derived from the OpenSSL coding style.

This guide is not distributed as part of SEDIMENT itself.

Coding style is all about readability and maintainability using commonly available tools.

Chapter 1: Indentation

Indentation is four space characters. Do not use the tab character.

Pre-processor directives are not indented:


        #if
        #define
        #else
        #define
        #endif
    
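As a slightly fuller sketch, the fragment below keeps every directive in the first column even where the surrounding C code is indented. The names SED_DEBUG, SED_LOG_LEVEL and sed_log_level() are hypothetical and chosen only for illustration.

```c
/* SED_DEBUG and SED_LOG_LEVEL are hypothetical names used for illustration. */
#ifdef SED_DEBUG
#define SED_LOG_LEVEL 2
#else
#define SED_LOG_LEVEL 0
#endif

int sed_log_level(void)
{
    /* Directives stay in the first column even inside an indented body. */
#if SED_LOG_LEVEL > 0
    return SED_LOG_LEVEL;
#else
    return 0;
#endif
}
```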

Chapter 2: Breaking long lines and strings

Don’t put multiple statements, or assignments, on a single line. This example is wrong:


        if (condition) do_this();
        do_something_everytime();
    

The limit on the length of lines is 120 columns. Statements longer than 120 columns must be broken into sensible chunks, unless exceeding 120 columns significantly increases readability and does not hide information. Descendants are always substantially shorter than the parent and are placed substantially to the right. The same applies to function headers with a long argument list. Never break user-visible strings, however, because that breaks the ability to grep for them.
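
A minimal sketch of these rules, assuming a hypothetical function sed_format_report(): the long function header is broken with the continuation placed well to the right, while the user-visible format string is kept on a single line so that it can still be found with grep.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical function; the long argument list is broken and the continuation sits to the right. */
int sed_format_report(char *out, size_t out_len, const char *device_name,
                      unsigned int firmware_version, unsigned long uptime_seconds)
{
    if (out == NULL || device_name == NULL)
        return -1;

    /* The user-visible string stays on one line so that grep can still find it. */
    return snprintf(out, out_len, "device %s running firmware %u has been up for %lu seconds",
                    device_name, firmware_version, uptime_seconds);
}
```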

Chapter 3: Placing Braces and Spaces

The other issue that always comes up in C styling is the placement of braces. Unlike the indent size, there are few technical reasons to choose one placement strategy over the other, but the preferred way, following Kernighan and Ritchie, is to put the opening brace last on the line, and the closing brace first:


        if (x is true) {
            we do y
        }
    

This applies to all non-function statement blocks (if, switch, for, while, do):


        switch (suffix) {
        case 'G':
        case 'g':
            mem <<= 30;
            break;
        case 'M':
        case 'm':
            mem <<= 20;
            break;
        case 'K':
        case 'k':
            mem <<= 10;
            /* fall through */
        default:
            break;
        }
    

Note, from the above example, that the way to indent a switch statement is to align the switch and its subordinate case labels in the same column instead of double-indenting the case bodies.

There is one special case, however. Functions have the opening brace at the beginning of the next line:


        int function(int x)
        {
            body of function
        }
    

Note that the closing brace is on a line of its own, except in the cases where it is followed by a continuation of the same statement, such as a while in a do-statement, like this:


        do {
            ...
        } while (condition);
    

An else in an if-statement should start on a new line.


        if (x == y) {
            ...
        }
        else if (x > y) {
            ...
        }
        else {
            ...
        }
    

Do not unnecessarily use braces around a single statement:


        if (condition)
            action();
    

and


        if (condition)
            do_this();
        else
            do_that();
    

If one of the branches is a compound statement, then use braces on both parts:


        if (condition) {
            do_this();
            do_that();
        }
        else {
            otherwise();
        }
    

Nested compound statements should often have braces for clarity, particularly to avoid the dangling-else problem:


        if (condition) {
            do_this();
            if (anothertest)
                do_that();
        }
        else {
            otherwise();
        }
    

Chapter 3.1: Spaces

SEDIMENT style for use of spaces depends (mostly) on whether the name is a function or keyword. Use a space after most keywords:

 
        if, switch, case, for, do, while, return
    

Do not use a space after sizeof, typeof, alignof, or __attribute__. They look somewhat like functions and should have parentheses in SEDIMENT, although they are not required by the language. For sizeof, use a variable when at all possible, to ensure that type changes are properly reflected:

 
        SOMETYPE *p = malloc(sizeof(*p) * num_of_elements);
    

Do not add spaces around the inside of parenthesized expressions. This example is wrong:

 
        s = sizeof( struct file );
    

When declaring pointer data or a function that returns a pointer type, the asterisk goes next to the data or function name, and not the type:

 
        char *openssl_banner;
        unsigned long long memparse(char *ptr, char **retptr);
        char *match_strdup(substring_t *s);
    

Use one space on either side of binary and ternary operators, such as this partial list:


        =  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :  +=
    

Put a space after commas and after semicolons in for statements, but not in for (;;).

Do not put a space after unary operators:


        &  *  +  -  ~  !  defined
    

Do not put a space before the postfix increment and decrement unary operators or after the prefix increment and decrement unary operators:


        foo++
        --bar
    

Do not put a space around the . and -> structure member operators:


        foo.bar
        foo->bar
    

Do not use multiple consecutive spaces except in comments, for indentation, and for multi-line alignment of definitions, e.g.:


        #define FOO_INVALID  -1   /* invalid or inconsistent arguments */
        #define FOO_INTERNAL 0    /* Internal error, most likely malloc */
        #define FOO_OK       1    /* success */
        #define FOO_GREAT    100  /* some specific outcome */
    

Do not leave trailing whitespace at the ends of lines. Some editors with smart indentation will insert whitespace at the beginning of new lines as appropriate, so you can start typing the next line of code right away. They may not remove that whitespace if you leave a blank line, however, and you end up with lines containing trailing, or nothing but, whitespace.

Git will warn you about patches that introduce trailing whitespace, and can optionally strip the trailing whitespace; however, if applying a series of patches, this may make later patches in the series fail by changing their context lines.

Avoid empty lines at the beginning or at the end of a file.

Avoid multiple empty lines in a row.


Chapter 4: Naming

Local variable names should be short, and to the point. If you have some random integer loop counter, it should probably be called i or j.

Avoid single-letter names when they can be visually confusing, such as I and O. Avoid other single-letter names unless they are telling in the given context. For instance, m for modulus and s for SSL pointers are fine.

Use simple variable names like tmp and name as long as they are non-ambiguous in the given context.

If you are afraid that someone might mix up your local variable names, perhaps the function is too long; see the chapter on functions.

Global variables (to be used only if you REALLY need them) need to have descriptive names, as do global functions. If you have a function that counts the number of active users, you should call that count_active_users() or similar; you should NOT call it cntusr().

Do not encode the type into a name (so-called Hungarian notation, e.g., int iAge).

Align names to terms and wording used in standards and RFCs.

Make sure that names do not contain spelling errors.


Chapter 5: Functions

Ideally, functions should be short and sweet, and do just one thing and do that well. A rule of thumb is that they should fit on one or two screenfuls of text (25 lines as we all know).

The maximum length of a function is often inversely proportional to the complexity and indentation level of that function. So, if you have a conceptually simple function that is just one long (but simple) switch statement, where you have to do lots of small things for a lot of different cases, it’s okay to have a longer function.

If you have a complex function, however, consider using helper functions with descriptive names. You can ask the compiler to in-line them if you think it’s performance-critical, and it will probably do a better job of it than you would have done.

Another measure of complexity is the number of local variables. If a function has more than five to ten, consider splitting it into smaller pieces. A human brain can generally easily keep track of about seven different things; anything more and it gets confused. Often things which are simple and clear now are much less obvious two weeks from now, or to someone else. An exception is command-line applications that support many options.

In source files, separate functions with one blank line.

In function prototypes, include parameter names with their data types. Although this is not required by the C language, it is preferred in SEDIMENT because it is a simple way to add valuable information for the reader. The name in the prototype declaration should match the name in the function definition.

Separate local variable declarations and subsequent statements by an empty line.

Do not mix local variable declarations and statements.


Chapter 5.1: Checking function arguments

A public function should verify that its arguments are sensible. This includes, but is not limited to, verifying that:

- non-optional pointer arguments are not NULL, and
- numeric arguments are within expected ranges.

Where an argument is not sensible, an error should be returned.
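
As an illustrative sketch only, a public function might check its arguments as follows. The function sed_set_endpoint(), the constant SED_MAX_PORT, and the convention of returning 0 on error and 1 on success are assumptions made for this example and are not prescribed by this guide.

```c
#include <stddef.h>

/* Hypothetical upper bound for the numeric argument. */
#define SED_MAX_PORT 65535

/* Hypothetical public function: returns 1 on success and 0 if an argument is not sensible. */
int sed_set_endpoint(char *dst, size_t dst_len, const char *host, int port)
{
    size_t i;

    /* Non-optional pointer arguments must not be NULL. */
    if (dst == NULL || host == NULL)
        return 0;
    /* Numeric arguments must be within their expected ranges. */
    if (dst_len == 0 || port < 1 || port > SED_MAX_PORT)
        return 0;

    /* Copy host into dst, truncating if needed and always terminating the string. */
    for (i = 0; i + 1 < dst_len && host[i] != '\0'; i++)
        dst[i] = host[i];
    dst[i] = '\0';
    return 1;
}
```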


Chapter 5.2: Extending existing functions

From time to time it is necessary to extend an existing function. Typically this will mean adding additional arguments, but it may also include removal of some.

Where an extended function is needed, the original function should be kept and a new version created with the same name and an _ex suffix. For example, the RAND_bytes function has an extended form called RAND_bytes_ex.

Where an extended version of a function already exists and a second extended version needs to be created, it should have an _ex2 suffix, and so on for further extensions.

When an extended version of a function is created, the order of existing parameters from the original function should be retained. However, new parameters may be inserted at any point (they do not have to be at the end), and parameters that are no longer required may be removed.
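
The following sketch shows one way this could look, assuming a hypothetical function sed_sum() that later gains a seed parameter. The original function is kept and simply forwards to the new sed_sum_ex(); the existing parameters keep their relative order even though the new parameter is inserted at the front.

```c
#include <stddef.h>

/* Hypothetical extended version: the new seed parameter is inserted before the existing ones. */
int sed_sum_ex(unsigned int seed, const unsigned char *buf, size_t len, unsigned int *out)
{
    unsigned int sum = seed;
    size_t i;

    if (buf == NULL || out == NULL)
        return 0;

    for (i = 0; i < len; i++)
        sum = sum * 31 + buf[i];
    *out = sum;
    return 1;
}

/* Original function, kept so that existing callers continue to work. */
int sed_sum(const unsigned char *buf, size_t len, unsigned int *out)
{
    return sed_sum_ex(0, buf, len, out);
}
```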


Chapter 6: Centralized exiting of functions

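The goto statement comes in handy when a function exits from multiple locations and some common work such as cleanup has to be done. If there is no cleanup needed then just return directly. A single exit path keeps the cleanup in one place, reduces nesting, and avoids errors caused by failing to update every exit point when the code is modified.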

For example:


        int fun(int a)
        {
            int result = 0;
            char *buffer = OPENSSL_malloc(SIZE);
    
            if (buffer == NULL)
                return -1;
    
            if (condition1) {
                while (loop1) {
                    ...
                }
                result = 1;
                goto out;
            }
            ...
        out:
            OPENSSL_free(buffer);
            return result;
        }
    
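A self-contained variant of the same pattern is sketched below. It uses the standard malloc() and free() instead of the OpenSSL allocators, and the function sed_check_tag() is hypothetical. Failures before the allocation return directly; every path after it leaves through the single out label.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if line starts with "SED" plus at least one more character,
 * 0 if it does not, and -1 on error. */
int sed_check_tag(const char *line)
{
    int result = 0;
    size_t len;
    char *copy;

    /* Nothing to clean up yet, so return directly. */
    if (line == NULL)
        return -1;

    len = strlen(line);
    copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, line, len + 1);

    if (len < 4)
        goto out;
    copy[3] = '\0';
    if (strcmp(copy, "SED") != 0)
        goto out;

    result = 1;
out:
    free(copy);
    return result;
}
```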

Chapter 7: Commenting

Place comments above or to the right of the code they refer to. Comments referring to the code line after should be indented equally to that code line.

Comments are good, but there is also a danger of over-commenting. NEVER try to explain HOW your code works in a comment. It is much better to write the code so that it is obvious, and it’s a waste of time to explain badly written code. You want your comments to tell WHAT your code does, not HOW.

The preferred style for long (multi-line) comments is:


        /*
         * This is the preferred style for multi-line
         * comments in the SEDIMENT source code.
         * Please use it consistently.
         *
         * Description:  A column of asterisks on the left side,
         * with beginning and ending almost-blank lines.
         */
    

It’s also important to comment data, whether they are basic types or derived types. To this end, use just one data declaration per line (no commas for multiple data declarations). This leaves you room for a small comment on each item, explaining its use.
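
For instance, a structure might be declared as in the sketch below, with one member per line and a short comment on each. The type struct sed_buffer and the function sed_buffer_free_space() are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical growable byte buffer. */
struct sed_buffer {
    unsigned char *data; /* start of the allocated storage */
    size_t capacity;     /* number of bytes allocated at data */
    size_t used;         /* number of bytes currently holding valid content */
};

/* Returns the number of bytes still free in buf, or 0 if buf is NULL. */
size_t sed_buffer_free_space(const struct sed_buffer *buf)
{
    if (buf == NULL)
        return 0;
    return buf->capacity - buf->used;
}
```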


Chapter 8: Macros and Enums

Names of macros defining constants and labels in enums are in uppercase:


        #define CONSTANT 0x12345
    

Enums are preferred when defining several related constants. Note, however, that enum arguments to public functions are not permitted.

Macro names should be in uppercase, but macros resembling functions may be written in lower case. Generally, inline functions are preferable to macros resembling functions.

Macros with multiple statements should be enclosed in a do-while block:


        #define macrofun(a, b, c)   \
            do {                    \
                if ((a) == 5)       \
                    do_this(b, c);  \
            } while (0)
    
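The reason for the do-while wrapper is that the macro call together with its trailing semicolon then behaves as a single statement. The sketch below, using the hypothetical macro SED_RESET_PAIR and function sed_reset_or_flag(), would not compile if the macro body were a bare brace block, because the semicolon after the call would detach the else from the if.

```c
/* SED_RESET_PAIR is a hypothetical multi-statement macro. */
#define SED_RESET_PAIR(a, b) \
    do {                     \
        (a) = 0;             \
        (b) = 0;             \
    } while (0)

int sed_reset_or_flag(int reset, int x, int y)
{
    /* The macro call plus its semicolon forms one statement, so the else still pairs with the if. */
    if (reset)
        SED_RESET_PAIR(x, y);
    else
        x = -1;

    return x + y;
}
```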

Do not write macros that affect control flow:


        #define FOO(x)                 \
            do {                       \
                if (blah(x) < 0)       \
                    return -EBUGGERED; \
            } while (0)
    

Do not write macros that depend on having a local variable with a magic name:


        #define FOO(val) bar(index, val)
    

It is confusing to the reader and is prone to breakage from seemingly innocent changes.

Do not write macros that are l-values:


        FOO(x) = y
    

This will cause problems if, e.g., FOO becomes an inline function.

Be careful of precedence. Macros defining an expression must enclose the expression in parentheses unless the expression is a literal or a function application:


        #define SOME_LITERAL 0x4000
        #define CONSTEXP (SOME_LITERAL | 3)
        #define CONSTFUN foo(0, CONSTEXP)
    

Beware of similar issues with macros using parameters. Put parentheses around uses of macro arguments unless they are passed on as-is to a further macro or function. For example,


        #define MACRO(a, b) ((a) * func(a, b))
    
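To illustrate why the parentheses matter, the sketch below contrasts an unparenthesized macro with a parenthesized one. Both macros and both functions are hypothetical; SED_DOUBLE_UNSAFE is a counter-example and should not be imitated.

```c
/* Counter-example: the argument is used without parentheses. */
#define SED_DOUBLE_UNSAFE(x) (x * 2)
/* Preferred form: the argument is parenthesized. */
#define SED_DOUBLE(x) ((x) * 2)

int sed_double_unsafe_demo(void)
{
    /* Expands to (1 + 2 * 2), so the multiplication binds before the addition. */
    return SED_DOUBLE_UNSAFE(1 + 2);
}

int sed_double_demo(void)
{
    /* Expands to ((1 + 2) * 2), as intended. */
    return SED_DOUBLE(1 + 2);
}
```

With these definitions, sed_double_unsafe_demo() yields 5 while sed_double_demo() yields 6.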

The GNU cpp manual deals with macros exhaustively.


Chapter 9: Editor modelines

Some editors can interpret configuration information embedded in source files, indicated with special markers. For example, emacs interprets lines marked like this:


        -*- mode: c -*-
    

Or like this:


        /*
        Local Variables:
        compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
        End:
        */
    

Vim interprets markers that look like this:


        /* vim:set sw=8 noet */
    

Do not include any of these in source files. People have their own personal editor configurations, and your source files should not override them. This includes markers for indentation and mode configuration. People may use their own custom mode, or may have some other magic method for making indentation work correctly.


Chapter 10: Expressions

Avoid needless parentheses as far as reasonable. For example, do not write


        if ((p == NULL) && (!f(((2 * x) + y) == (z++))))
    

but


        if (p == NULL && !f(2 * x + y == z++))
    

For clarity, always put parentheses when mixing the logical && and || operators, mixing comparison operators like <= and ==, or mixing bitwise operators like & and |. For example,


        if ((a && b) || c)
        if ((a <= b) == ((c >= d) != (e < f)))
        x = (a & b) ^ (c | d)
    

In comparisons with constants (including NULL and other constant macros) place the constant on the right-hand side of the comparison operator. For example,


        while (i++ < 10 && p != NULL)
    

Do not use implicit checks for numbers (not) being 0 or pointers (not) being NULL. For example, do not write


        if (i)
        if (!(x & MASK))
        if (!strcmp(a, "FOO"))
        if (!(p = BN_new()))
    

but do this instead:


        if (i != 0)
        if ((x & MASK) == 0)
        if (strcmp(a, "FOO") == 0)
        if ((p = BN_new()) == NULL)
    

Boolean values shall be used directly as usual, e.g.,


        if (check(x) && !success(y))
    

Note: Many functions can return 0 or a negative value on error and the Boolean forms need to be used with care.
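
The sketch below shows the kind of mistake this note warns about, assuming a hypothetical function sed_verify() that returns 1 for valid input, 0 for invalid input, and -1 on an internal error. Treating its result as a Boolean would accept the error case, so the caller compares explicitly.

```c
/* Hypothetical verifier: 1 = valid, 0 = invalid, -1 = internal error. */
int sed_verify(int input)
{
    if (input < 0)
        return -1;
    return input % 2 == 0;
}

/* Returns 1 only if verification succeeded. */
int sed_accept(int input)
{
    /* Writing if (sed_verify(input)) here would also accept the -1 error value. */
    if (sed_verify(input) > 0)
        return 1;
    return 0;
}
```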

If you need to break an expression into multiple lines, make the line break before an operator, not after. It is preferred that such a line break is made before an operator of as low a priority as possible.

When appearing at the beginning of a line, operators can, but do not have to, get an extra indentation (+ 4 characters). For example,


        if (long_condition_expression_1
                && condition_expression_2) {
            statement_1;
            statement_2;
        }
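
As a further sketch, the hypothetical function sed_in_window() below breaks a long condition before each && operator, which is the lowest-priority operator in the expression, rather than in the middle of a comparison or subtraction.

```c
/* Hypothetical check: returns 1 if checking is enabled and now lies within [start, start + length). */
int sed_in_window(long now, long start, long length, int enabled)
{
    return enabled
        && now >= start
        && now - start < length;
}
```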