Introduction
As anyone who programs in either C or C++ knows, constant, macro, type, struct (or class), and function declarations comprising an API are put into header files typically having a .h (or sometimes .hpp for C++) filename extension.
Omitted from many explanations, however, is exactly how code within header files should be organized, including the order in which to include other header files. Getting this right helps maximize compilation speed and overall maintainability.
C++20 added modules, but that’s a story for another time. Given the enormous amount of C and pre-C++20 code that exists, header files are going to continue to be around for a while.
Include Guards
A basic header file looks like:
// foo.h
#ifndef FOO_H
#define FOO_H
// ... declarations ...
#endif /* FOO_H */
That is, all the declarations should be within an include guard: a #ifndef X, #define X, #endif /* X */ sequence where X is a unique name within a codebase and derived from the file’s name.
The point of an include guard is that if a particular header file is included more than once in the same translation unit, you won’t get redefinition errors from the compiler: the preprocessor omits everything within the guard if the guard macro has already been defined.
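As a minimal sketch of what the guard buys you, the listing below simulates what the compiler sees after a header has been included twice. The header name point.h, the guard name POINT_H, and the point_sum function are hypothetical and exist only for illustration.

```c
// Simulates a translation unit after a hypothetical "point.h" has been
// #included twice (the header's text is simply pasted in two times).

// ---- first inclusion of point.h ----
#ifndef POINT_H
#define POINT_H
struct point { int x, y; };
#endif /* POINT_H */

// ---- second inclusion of point.h ----
#ifndef POINT_H               // POINT_H is already defined, so everything
#define POINT_H               // up to the matching #endif is skipped
struct point { int x, y; };   // without the guard: a redefinition error
#endif /* POINT_H */

// Uses the single surviving definition of struct point.
int point_sum( struct point p ) {
  return p.x + p.y;
}
```

Removing the two #ifndef/#define/#endif sequences while keeping both struct definitions should make the compiler reject the file.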
The nomenclature for include guard names doesn’t matter: just pick a method that’s very unlikely to collide with names used in either system or third-party headers — and be consistent.
The comment after the #endif isn’t necessary, of course, but it’s a good idea always to repeat the condition used in the #ifndef (or #ifdef or #if) for readability.
Then in all .c (or .cpp) files that use the header, simply #include it:
// foo.c
#include "foo.h"
// ... definitions ...
Unfortunately, this is typically where many header file explanations stop. There’s more to creating and using header files effectively than that.
Self-Sufficient Headers
Before proceeding, I want to define what it means for a header file to be self-sufficient:
- A self-sufficient header is one where if it were included by itself into a .c (or .cpp) file, that file would compile without errors (specifically, without “undeclared” errors).
For example, a trivial program like:
#include "foo.h"
int main() {
}
will compile without errors only if foo.h is self-sufficient.
Including Other Headers in a Header
Typically, a header file will need to include other header files because its declarations make use of declarations in those other header files.
Within a header file:
- Include other local headers first, if any, followed by system headers, if any.
For example:
// color.h
#ifndef cdecl_color_H
#define cdecl_color_H
#include "config.h" // Correct: #include local headers ...
#include "strbuf.h"
#include "util.h"
#include <stdio.h> // ... before system headers.
// ...
#endif /* cdecl_color_H */
The local headers (the ones enclosed in ""), if any, come first, followed by system headers (the ones enclosed in <>), if any.
Why? Because this helps ensure that every header file is self-sufficient. For example, if you put system headers first:
#include <stdio.h> // WRONG: #include of system headers ...
#include "strbuf.h" // ... before local headers.
// ...
then it’s possible for declarations in strbuf.h to use declarations in stdio.h (e.g., FILE) by “accident” without strbuf.h itself including stdio.h.
This will continue to work for as long as that include stays in place, but if at some point you no longer need stdio.h in color.h and so delete the #include <stdio.h>, then you’ll get “undeclared” errors in strbuf.h. Up until then, you’ll never have noticed that strbuf.h isn’t self-sufficient.
Once you notice, it’s easily fixed, but it’s better to avoid the problem in the first place by always including local headers before system headers.
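The fix can be sketched as follows: strbuf.h includes stdio.h itself because its own declarations use FILE. The contents shown here (the struct members, the strbuf_puts function, and the guard name) are hypothetical and only illustrate the idea; they are not the contents of any real strbuf.h.

```c
// strbuf.h (hypothetical contents, for illustration only)
#ifndef cdecl_strbuf_H
#define cdecl_strbuf_H

#include <stddef.h>   // for size_t
#include <stdio.h>    // for FILE: this header uses it, so this header includes it

struct strbuf {
  char   *str;        // pointer to the characters, or NULL if empty
  size_t  len;        // length of str, not counting the terminating null
};

// Writes the buffer's contents to out. Returns a non-negative value on
// success or EOF on error (the same convention as fputs).
static inline int strbuf_puts( struct strbuf const *sbuf, FILE *out ) {
  return fputs( sbuf->str != NULL ? sbuf->str : "", out );
}

#endif /* cdecl_strbuf_H */
```

With this in place, color.h could drop its own #include <stdio.h> without breaking strbuf.h.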
Forward-Declare Instead of Include
Within a C header file:
- If you use a struct or union type declared in another header only via pointers, forward-declare the type instead of including the other header.
For example, if your header print.h uses the passwd struct declared in the standard header pwd.h, but only via pointer (and you need nothing else from pwd.h), forward-declare passwd rather than include pwd.h:
// print.h
struct passwd; // instead of: #include <pwd.h>
void print_passwd( struct passwd *pw );
Why? It saves the time of the preprocessor having to open pwd.h and the compiler having to parse the entire file when all you need is a single declaration. For large C or C++ codebases, the time adds up.
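To show where the full definition is still needed, here is a minimal sketch of both sides in one listing: the header gets by with the forward declaration, while the .c file that dereferences the pointer includes pwd.h. The guard name, the body of print_passwd, and the choice of fields to print are assumptions for illustration; pwd.h is a POSIX header, so this assumes a POSIX system.

```c
// ---- print.h ----
#ifndef PRINT_H
#define PRINT_H

struct passwd;                            // incomplete type: enough for a pointer
void print_passwd( struct passwd *pw );

#endif /* PRINT_H */

// ---- print.c (would start with #include "print.h") ----
#include <pwd.h>                          // full definition: needed to access members
#include <stdio.h>

void print_passwd( struct passwd *pw ) {
  // pw_name and pw_uid are standard POSIX members of struct passwd.
  printf( "%s (uid %lu)\n", pw->pw_name, (unsigned long)pw->pw_uid );
}
```

Only files that look inside the struct pay the cost of including pwd.h; files that merely pass the pointer along do not.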
The equivalent guideline for C++ is similar, but includes class and references:
- If you use a struct, union, or class type declared in another header only via either pointers or references, forward-declare the type instead of including the other header.
Include Everything Necessary
Within a header file:
- You must include every other header (or forward declaration) that the header needs to be self-sufficient.
Never force users of your header to have to include some other header(s) before yours in order to compile without errors.
BSD-derived operating systems have historically tended to violate this guideline. The rationale for doing so is that it’s an alternate way to help maximize compilation speed. It does this by forcing you to be a human include guard.
For example:
#include <sys/types.h>
#include <pwd.h> // needs <sys/types.h>
#include <unistd.h> // needs <sys/types.h> too
Rather than pwd.h and unistd.h each doing #include <sys/types.h>, both rely on you to do the include yourself.
How does that help? It eliminates the step whereby the preprocessor has to open sys/types.h, read the file, encounter the include guard, and omit the rest of the contents if the guard has been seen before (as would be the case for unistd.h).
So while it does help, the price is that users have to remember to include files manually, which can leave behind unnecessary includes that slow down compilation. For example, if at some point you removed the includes for pwd.h and unistd.h, sys/types.h might no longer be needed, but you might forget to remove it.
Like many other things in computer science, it’s a trade-off. BSD-derived operating systems have been moving away from this practice and making headers self-sufficient.
Subdirectories
Large codebases often partition code into subdirectories where each subdirectory contains a set of related files. For #include "...", the preprocessor typically looks in the directory of the including file and then along the include path, but not in subdirectories of those directories.
To use headers in subdirectories, there are two choices:
- Use the subdirectory name between the quotes; or:
- Tell the compiler to look in the subdirectory also.
An example of the first would be:
#include "subsystem/out_q.h"
Doing the second is compiler-specific, but for Unix-based compilers such as gcc and clang, you typically would add a command-line option of the form -Isubsystem that adds subsystem to the compiler’s include path.
Either approach is fine, but if you do the second, header file names must be unique throughout the codebase. If two headers in different subdirectories were to have the same name, then including either by its bare name would get whichever one is earlier in the compiler’s include path.
Differences in Case Only
Another thing not to do is:
- Do not have files whose names differ only by case.
E.g., do not have out_q.h and Out_Q.h. Why not?
- It’s easy to include the wrong one.
- On case-insensitive but case-preserving filesystems (such as APFS and HFS+ in their default macOS configurations), such names are considered to refer to the same file.
For the second issue, it might mean that even if you include out_q.h, you might end up including Out_Q.h if it comes first in the include path.
Never Use ../
One thing you must never do is:
- Never use ../ in an include path.
For example, never do anything like:
#include "../subsystem/out_q.h" // Never use ../ in a path!
Why not?
- It might be the case that the build process for your codebase uses symbolic links and .. might end up being relative to the resolved path, not the original path, so the directory you end up including from might not be the one you think it is. This can lead to hard-to-diagnose bugs.
- If your codebase was well architected, code should not have circular dependencies, and the include path will have been set appropriately to prevent that.
For the second, if you were to try to include subsystem/out_q.h and got “no such file,” it means you’re not supposed to be including that file from the file you’re working on because it would create a circular dependency. Using ../ just to make your code compile subverts this intentional restriction.
Circular dependencies are often bad because, in C++, they can contribute to the static initialization order fiasco.
Including Headers in .c or .cpp Files
For a .c (or .cpp) file, all the guidelines for including headers in a header file also apply, but with one tweak for including local headers:
- For a given .c (or .cpp) file, e.g., foo.c, include its corresponding header, foo.h, first.
Why? Two reasons:
- It ensures foo.h is self-sufficient.
- It ensures the declarations of functions in foo.h match their definitions in foo.c.
The second one isn’t relevant in C++ since function names are “mangled” to include their signatures (hence a mismatch results in an “undefined” error at link-time). But in C it matters because calling a function via a mismatched signature results in undefined behavior.
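To illustrate the second reason in C, here is a minimal sketch that pastes a header and its .c file into one listing. The function foo_scale and its signature are hypothetical. Because the declaration is visible when the definition is compiled, the compiler can check one against the other.

```c
// ---- contents of foo.h (hypothetical) ----
#ifndef FOO_H
#define FOO_H

long foo_scale( long value, int factor );

#endif /* FOO_H */

// ---- contents of foo.c, which would start with #include "foo.h" ----

// The declaration above is in scope here, so the compiler checks this
// definition against it. Changing the return type to int here, for
// example, would be rejected as conflicting with the declaration.
long foo_scale( long value, int factor ) {
  return value * factor;
}
```

If foo.c did not include foo.h, the same mismatch could compile silently and callers using the header's declaration would invoke undefined behavior.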
Conclusion
Proper header file etiquette helps to maximize compilation speed and overall maintainability. To summarize:
- A self-sufficient header is one where if it were included by itself into a .c (or .cpp) file, that file would compile without errors (specifically, without “undeclared” errors).
- In a header file, use include guards.
- In a header file, include other local headers first, if any, followed by system headers, if any.
- If you use a struct or union (or class in C++) type declared in another header only via pointers (or references in C++), forward-declare the type instead of including the other header.
- For a header, you must include every other header (or forward-declaration) it needs to be self-sufficient.
- Do not have files whose names differ only by case.
- Never use ../ in an include path.
- For a given .c (or .cpp) file, e.g., foo.c, include its corresponding header, foo.h, first.
Include responsibly.