Introduction
When Dennis Ritchie created C, he made int (a signed integer type) be the default type. The size (number of bits) of an int was deliberately not specified. Even when C was standardized, all that was guaranteed was a minimum size. The rationale was that the size of int should be the “natural” word size for an integer on a given CPU.
If you needed only smaller signed integers and wanted to save a bit of space, Ritchie gave us short; or, if you needed bigger integers, he gave us long. (C99 gave us even bigger integers with long long.) If you only needed unsigned integers, you could include unsigned in a declaration. C99 also gave us specific-sized signed integer type aliases (e.g., int32_t) and unsigned type aliases (e.g., uint32_t).
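To make the "only a minimum size is guaranteed" point concrete, here is a minimal sketch that assumes a C11 (or later) compiler. The static assertions restate the minimum ranges the C standard requires; the helper int_bits() is a hypothetical name introduced only for illustration.

```c
#include <assert.h>   /* static_assert (C11), assert */
#include <limits.h>   /* CHAR_BIT, SHRT_MAX, INT_MAX, LONG_MAX, LLONG_MAX */

/* The standard guarantees only these minimum ranges, not exact sizes. */
static_assert( SHRT_MAX  >= 32767,                 "short has at least 16 bits" );
static_assert( INT_MAX   >= 32767,                 "int has at least 16 bits" );
static_assert( LONG_MAX  >= 2147483647L,           "long has at least 32 bits" );
static_assert( LLONG_MAX >= 9223372036854775807LL, "long long has at least 64 bits" );

/* Hypothetical helper: number of bits in the storage of an int
 * on the platform this is compiled for. */
unsigned int_bits( void ) {
  return (unsigned)(sizeof(int) * CHAR_BIT);
}
```

On many current desktop platforms int_bits() would return 32, but nothing in the standard promises that.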
However, in programming, negative integers (and thus a signed integer type) aren’t needed most of the time. The length of strings, count of objects, size of objects, size of files, etc., are all unsigned integers. Specific-sized type aliases are needed even less often than signed integers.
Yet I’ve seen a lot of code that uses integer types inappropriately. Such code can convey either underspecified or misleading information to readers (including yourself in several months’ time). It’s best to choose the right integer type for the right purpose.
Guidelines
Here are my guidelines for choosing an integer type:
- When representing a count of bytes in memory, use the size_t standard type alias.
This is the type used by both the C and C++ standard libraries, e.g., by memcpy(), strlen(), std::string::size(), etc., so there’s plenty of precedent.
- When representing either the size of or a position within a file on disk, use the off_t POSIX type alias.
If you’re dealing with very large files, on some platforms, you may need to compile with -D_FILE_OFFSET_BITS=64 to get a 64-bit version of off_t.
- When representing a count of objects in memory, use size_t also.
This is also the type used by both the C and C++ standard libraries, e.g., by fread() and fwrite().
- Only if you need to represent a value contained within a specific number of bits or you need to conform to a specific API, use one of the int8_t, int16_t, int32_t, or int64_t type aliases for signed types; or one of the uint8_t, uint16_t, uint32_t, or uint64_t type aliases for unsigned types.
The only time you typically need a fixed-size integer is when you “externalize” a value, e.g., write it to disk or send it over a socket.
Using a fixed-size integer when you don’t actually need a specific number of bits conveys wrong information to the reader.
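As an illustration of the guidelines so far, here is a minimal sketch of "externalizing" a value: a uint32_t is written to a byte buffer in big-endian order, while buffer sizes and byte counts use size_t. The names put_u32_be(), get_u32_be(), and U32_WIRE_SIZE are hypothetical, and big-endian is just an assumed wire format.

```c
#include <assert.h>
#include <stddef.h>   /* size_t, NULL */
#include <stdint.h>   /* uint8_t, uint32_t */

enum { U32_WIRE_SIZE = 4 };   /* bytes occupied by a uint32_t on the wire */

/* Writes v into buf in big-endian byte order.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t put_u32_be( uint8_t *buf, size_t buf_size, uint32_t v ) {
  assert( buf != NULL );
  if ( buf_size < U32_WIRE_SIZE )
    return 0;
  buf[0] = (uint8_t)(v >> 24);
  buf[1] = (uint8_t)(v >> 16);
  buf[2] = (uint8_t)(v >>  8);
  buf[3] = (uint8_t)v;
  return U32_WIRE_SIZE;
}

/* Reads a big-endian uint32_t from the first 4 bytes of buf. */
uint32_t get_u32_be( uint8_t const *buf ) {
  assert( buf != NULL );
  return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16)
       | ((uint32_t)buf[2] <<  8) |  (uint32_t)buf[3];
}
```

The fixed-size type appears only at the boundary where the exact number of bits matters; everything that counts bytes is a size_t.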
Furthermore:
- When representing an integer value that must be large enough to hold a pointer, use either the standard intptr_t or uintptr_t type alias.
- Only if you need negative values, use one of short, int, long, or long long, with int being preferred unless you need either smaller or larger values.
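For the pointer-sized case above, a minimal sketch might look like the following. It assumes the implementation provides uintptr_t (the standard makes it optional) and that pointers convert to plain numeric addresses, which holds on mainstream platforms but is implementation-defined. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uintptr_t */

/* Converts a pointer to an integer and back again; the standard guarantees
 * the result compares equal to the original pointer. */
void* round_trip( void *p ) {
  uintptr_t const bits = (uintptr_t)p;
  return (void*)bits;
}

/* Returns true only if p is aligned to the given power-of-2 alignment.
 * Assumes the integer value of a pointer is its address. */
bool is_aligned( void const *p, size_t alignment ) {
  assert( alignment != 0 && (alignment & (alignment - 1)) == 0 );
  return ((uintptr_t)p & (alignment - 1)) == 0;
}
```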
Lastly:
- Otherwise use one of unsigned short, unsigned, unsigned long, or unsigned long long, similarly with unsigned being preferred unless you need either smaller or larger values.
That is, unless you’re dealing with one of the listed cases above, default to using unsigned types.
Conclusion
Choosing the right integer type conveys correct information to the reader and can eliminate run-time checks.
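As a small sketch of how an unsigned type can eliminate a run-time check, compare two hypothetical array accessors. With a signed index, both a lower and an upper bound must be checked; with size_t, a negative index cannot be represented, so only the upper bound remains.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */

/* Signed index: must check both bounds. */
int get_signed( int const a[], int n, int i ) {
  assert( i >= 0 && i < n );
  return a[i];
}

/* Unsigned index: the lower bound holds by construction. */
int get_unsigned( int const a[], size_t n, size_t i ) {
  assert( i < n );
  return a[i];
}
```

The unsigned signature also tells the reader up front that neither the count nor the index is ever negative.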
Epilogue
Originally, and up until C99, int was the implicit type, that is, if you didn’t specify any type at all, it was understood to be int. For example:
power( x, n )           /* x and n are int; returns int */
{
  int p;
  for ( p = 1; n > 0; --n )
    p *= x;
  return p;
}
defines a function that has int parameters and returns int, yet int isn’t used in the declaration.
Function prototypes were back-ported from C++ to C89, yet the original “K&R style” function definitions were still allowed all the way up until C23. The ANSI C committee is a conservative bunch.
Even weirder, pre-C99 also allowed int to be implicit in declarations such as:
i; // int i
*p; // int *p
*a[4]; // int *a[4]
*f(); // int *f()
Fortunately, such declarations have long since been illegal.