Introduction
Among other things, C11 added the _Generic keyword that enables compile-time selection of an expression based on the type of its argument.
Personally, I think _Generic is too, well, generic of a name. It should have been called something like _Typeswitch.
The motivating use-case is the ability for library authors to provide a veneer of C++ function overloading in C.
Motivating Example
The motivating example is the ability of the standard C math library in math.h to provide specialized functions for different floating-point types while exposing only a single function name in the API. For example, the library provides these three functions:
double cbrt ( double ); // cube root of double
float cbrtf( float ); // ... of float
long double cbrtl( long double ); // ... of long double
While you certainly can use those functions as they are, it would be nice if you could always use just cbrt and have the compiler select the right function automatically based on the type of its argument:
double d;
float f;
long double l;
double rv_d = cbrt( d ); // calls cbrt()
float rv_f = cbrt( f ); // calls cbrtf()
long double rv_l = cbrt( l ); // calls cbrtl()
To make this work in C++, you’d simply overload the functions; to make this work in C, the library’s type-generic header tgmath.h can define a macro using _Generic:
#define cbrt(N) \
_Generic( (N), \
float : cbrtf, \
long double: cbrtl, \
default : cbrt \
)( (N) )
The _Generic keyword is part of the C language proper, not part of the preprocessor. However, the only way you can practically use _Generic is via a preprocessor macro.
For the above example, the ( (N) ) at the end is just calling whichever function _Generic selected and passing N as its argument.
_Generic works as follows:
- It takes a single controlling expression followed by an association list of one or more type/expression pairs.
- If the type (not the value) of the controlling expression matches a particular type (before the :) in the association list, then the result of the _Generic expression is the expression for that type (after the :).
- There can be at most one occurrence of any particular type. (Remember that a typedef type is not a distinct type.)
- Optionally, one “type” may instead be default that will match only if no other type does.
- The controlling expression is not evaluated; only its type is considered. Likewise, expressions that are not selected are never evaluated.
- Hence, the selection done by _Generic is strictly compile-time and has zero run-time overhead.
Additionally, when comparing types:
- Top-level const, volatile, restrict, and _Atomic qualifiers are discarded. (For example, an expression of type int const will match int.)
- Array-to-pointer and function-to-pointer conversions happen as usual.
- However, no other conversions occur, including the usual arithmetic conversions. (For example, short is not promoted to int.)
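The matching rules above can be checked directly with a small sketch. The TYPE_ID macro, the T_* enumerators, and the example_* objects are hypothetical names invented for this illustration, and the checks assume a compiler that applies lvalue conversion to the controlling expression, as current versions of gcc and clang do.

```c
/* Hypothetical type codes; not part of any library. */
enum { T_OTHER, T_SHORT, T_INT, T_CHAR_PTR };

/* Maps the type of X to one of the codes above. */
#define TYPE_ID(X)          \
  _Generic( (X),            \
    short  : T_SHORT,       \
    int    : T_INT,         \
    char*  : T_CHAR_PTR,    \
    default: T_OTHER        \
  )

/* Example objects used only as (unevaluated) controlling expressions. */
int const example_ci = 0;
short     example_s  = 0;
char      example_buf[8];
long      example_l  = 0;

/* Top-level qualifiers are discarded: int const matches int. */
_Static_assert( TYPE_ID( example_ci ) == T_INT, "int const matches int" );

/* A short is matched as short; it is not promoted to int ... */
_Static_assert( TYPE_ID( example_s ) == T_SHORT, "short stays short" );

/* ... although an arithmetic expression involving it has type int. */
_Static_assert( TYPE_ID( example_s + 1 ) == T_INT, "short + int is int" );

/* Arrays decay to pointers to their first element. */
_Static_assert( TYPE_ID( example_buf ) == T_CHAR_PTR, "char[8] matches char*" );

/* No other conversion is tried: long does not match int. */
_Static_assert( TYPE_ID( example_l ) == T_OTHER, "long matches only default" );
```

Because every check is a _Static_assert, a compiler that disagrees with any of the rules would reject the file rather than fail at run time.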
For the above example, if you’re wondering how function names like cbrtf, cbrtl, and cbrt are expressions, remember that the name of a function is a shorthand for a pointer to itself. That is, for any function f, f is equivalent to &f. Hence for the above example, the function names are expressions whose value is their own pointer.
Also for the above example, a keen observer might notice that the name of the macro cbrt is the same as the default function cbrt and wonder why that doesn’t cause an infinite preprocessor macro expansion loop. It doesn’t because the preprocessor will not expand a macro that references itself. Hence, if the resulting expression of _Generic is cbrt, that will not be expanded again.
Additionally for the above example, the reason default is used rather than double is that you want cbrt (the macro) when called with any other type, say int, to call cbrt (the function), and the int will be converted to double because the function’s prototype declares a double parameter.
Lastly for the above example, the resulting pointer-to-function followed by ( actually calls the function because, for any pointer-to-function pf, pf() is a shorthand for (*pf)().
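The same pattern works for your own functions. The following is a minimal sketch, not anything from the standard library: abs_i, abs_l, abs_d, and the ABS macro are hypothetical names chosen for this illustration.

```c
/* Hypothetical per-type implementations. */
static inline int    abs_i( int n )    { return n < 0 ? -n : n; }
static inline long   abs_l( long n )   { return n < 0 ? -n : n; }
static inline double abs_d( double n ) { return n < 0 ? -n : n; }

/* Select a function by the type of N, then call it with N. */
#define ABS(N)              \
  _Generic( (N),            \
    int    : abs_i,         \
    long   : abs_l,         \
    default: abs_d          \
  )( (N) )

/*
 * The type of the whole expression is the return type of the selected
 * function, so it reveals which function was chosen.
 */
_Static_assert( _Generic( ABS( -1 ),    int   : 1, default: 0 ), "int selects abs_i" );
_Static_assert( _Generic( ABS( -1L ),   long  : 1, default: 0 ), "long selects abs_l" );
_Static_assert( _Generic( ABS( -1.0f ), double: 1, default: 0 ), "float falls to abs_d" );
```

As with cbrt, the default case catches every other type (here including float) and relies on the prototype of abs_d to convert the argument to double.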
_Generic solves its motivating example (though in a clunky manner), but it turns out that _Generic is quite powerful and is capable of solving many other problems.
A printf Example
When using printf, it’s sometimes difficult to remember the correct format specifier for particular types. Using _Generic, you can implement a helper macro:
#define PRINTF_FORMAT(T) \
_Generic( (T), \
_Bool : "%d", \
char : "%c", \
signed char : "%hhd", \
unsigned char : "%hhu", \
short : "%hd", \
int : "%d", \
long : "%ld", \
long long : "%lld", \
unsigned short : "%hu", \
unsigned int : "%u", \
unsigned long : "%lu", \
unsigned long long: "%llu", \
float : "%f", \
double : "%f", \
long double : "%Lf", \
char* : "%s", \
char const* : "%s", \
wchar_t* : "%ls", \
wchar_t const* : "%ls", \
void* : "%p", \
void const* : "%p" \
)
#define PRINT(X) printf( PRINTF_FORMAT( (X) ), (X) )
PRINT(42); // printf( "%d", 42 )
PRINT(-273.15); // printf( "%f", -273.15 )
PRINT("hello"); // printf( "%s", "hello" )
One problem with this macro is that it won’t work for any other kind of pointer. A way to fix this is:
#define PRINTF_FORMAT(T) \
_Generic( (T), \
/* ... */ \
char* : "%s", \
char const* : "%s", \
wchar_t* : "%ls", \
wchar_t const* : "%ls", \
default : PTR_FORMAT(T) \
)
#define PTR_FORMAT(P) \
_Generic( TO_VOID_PTR_EXPR( (P) ), \
void const*: "%p", \
void* : "%p" \
)
#define TO_VOID_PTR_EXPR(P) (1 ? (P) : (void*)(P))
That is, in PRINTF_FORMAT, change the void* and void const* cases to a default case that calls PTR_FORMAT(T) to handle the pointer cases.
The macro TO_VOID_PTR_EXPR seems strange since 1 always evaluates to true so the result is always (P). That may seem pointless, but we want the side-effect of the ?: operator which is:
- If either of the if-true or if-false expressions of ?: is void*, then the type of the result shall also be void* (plus const if either is const).
Since we’ve explicitly cast P to void* for the if-false expression, that forces the type of the result also to be void* (or void const*) regardless of the pointer type.
Note that a simple (void*)(P) (cast to void*) by itself won’t work because that would cast any type to void*. We want the type of the result to be void* only if P is a pointer.
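The effect of the ?: trick on the resulting type can be verified with a short sketch. The struct point type and the example_p and example_cp objects are hypothetical and exist only to supply pointer types that are not already void*.

```c
#define TO_VOID_PTR_EXPR(P)   (1 ? (P) : (void*)(P))

/* Hypothetical type and objects used only for their pointer types. */
struct point { int x, y; };
struct point       *example_p;
struct point const *example_cp;

/* Without the trick, a struct point* is not a void*. */
_Static_assert(
  _Generic( example_p, void*: 0, default: 1 ),
  "struct point* does not match void*"
);

/* With it, a pointer to non-const becomes void* ... */
_Static_assert(
  _Generic( TO_VOID_PTR_EXPR( example_p ), void*: 1, void const*: 0 ),
  "pointer to non-const yields void*"
);

/* ... and a pointer to const becomes void const*. */
_Static_assert(
  _Generic( TO_VOID_PTR_EXPR( example_cp ), void*: 0, void const*: 1 ),
  "pointer to const yields void const*"
);
```

Passing a non-pointer such as an int variable to TO_VOID_PTR_EXPR is rejected by the compiler because ?: cannot combine an integer and a pointer, which is the behavior the macro relies on.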
A Typename Example
You can implement a macro similar to PRINTF_FORMAT to get the name of a type:
#define TYPENAME(T) \
_Generic( (T), \
_Bool : "_Bool", \
char : "char", \
/* ... */ \
void const* : "void const*", \
default : "other" \
)
size_t s = 0;
printf( "Real type of size_t is %s\n", TYPENAME(s) );
_Generic with size_t
The type size_t is a standard type commonly used to represent the size in bytes of an object or to index into an array. Typically, size_t is an implementation-defined typedef for one of unsigned int, unsigned long, or unsigned long long.
Because size_t is a typedef, you can’t list it as a distinct type in a _Generic association list (assuming you include other unsigned types) because it would be a duplicate type. But what if you really want to treat size_t differently? The trick (as with many other problems in software) requires an extra level of indirection, specifically by checking for size_t first and using default for all other types:
#define F(X) \
_Generic( (X), \
size_t : f_size_t, \
default: F_NOT_SIZE_T(X) \
)( (X) )
#define F_NOT_SIZE_T(X) \
_Generic( (X), \
/* ... */ \
unsigned char : f_uc, \
unsigned short : f_us, \
unsigned int : f_ui, \
unsigned long : f_ul, \
unsigned long long: f_ull, \
/* ... */ \
)
The caveat, of course, is that whatever type size_t is typedef’d to will never be selected independently from F_NOT_SIZE_T.
The same trick can be used for any typedef’d type, e.g., uintmax_t, uintptr_t, etc., or your own typedefs.
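A compact way to see the two-level selection in action is to map types to integer codes instead of functions. This is only a sketch: KIND, KIND_NOT_SIZE_T, and the K_* enumerators are hypothetical names, and the second check assumes size_t is not a typedef for unsigned char, which holds on mainstream platforms.

```c
#include <stddef.h>   /* size_t */

/* Hypothetical codes standing in for the f_* functions. */
enum { K_SIZE_T, K_UCHAR, K_USHORT, K_UINT, K_ULONG, K_ULLONG, K_OTHER };

/* First level: check for size_t before anything else. */
#define KIND(X)                         \
  _Generic( (X),                        \
    size_t : K_SIZE_T,                  \
    default: KIND_NOT_SIZE_T( (X) )     \
  )

/* Second level: every other unsigned type. */
#define KIND_NOT_SIZE_T(X)              \
  _Generic( (X),                        \
    unsigned char     : K_UCHAR,        \
    unsigned short    : K_USHORT,       \
    unsigned int      : K_UINT,         \
    unsigned long     : K_ULONG,        \
    unsigned long long: K_ULLONG,       \
    default           : K_OTHER         \
  )

_Static_assert( KIND( (size_t)0 )        == K_SIZE_T, "size_t is caught first" );
_Static_assert( KIND( sizeof(int) )      == K_SIZE_T, "sizeof yields size_t" );
_Static_assert( KIND( (unsigned char)0 ) == K_UCHAR,  "falls through to level two" );
_Static_assert( KIND( 0 )                == K_OTHER,  "int matches neither level" );
```

On a platform where size_t is unsigned long, KIND( 0UL ) also yields K_SIZE_T rather than K_ULONG, which is the caveat described above.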
const Overloading
Consider a singly linked list:
typedef struct slist slist;
struct slist {
slist *next;
void *data;
};
and a function to scan the list looking for a node for which a given predicate function returns true:
typedef _Bool (*slist_pred_fn)( void const* );
slist* slist_find( slist *start, slist_pred_fn pred );
A problem you can encounter is if you try to pass a const slist:
void f( slist const *list ) {
// ...
slist const *found = slist_find( list, &my_pred );
// ...
}
That will generate a “discards const” warning because you’re passing a const slist to a function that takes a non-const slist.
In C++, this would be an error.
While you could ignore the warning, it’s always best to write warning-free code. But how can it be fixed? You could cast away the const, but that’s ugly. In C++, you could add an overload of slist_find() that takes slist const*; in C, you’d have to write a distinctly named function:
inline
slist const* const_slist_find( slist const *start,
slist_pred_fn pred ) {
return slist_find( (slist*)start, pred );
}
and call that instead for a const slist. While it works, it’s also ugly. However, _Generic can be used to hide the ugliness:
#define slist_find(LIST,PRED) \
_Generic( (LIST), \
slist* : slist_find, \
slist const*: const_slist_find \
)( (LIST), (PRED) )
Now, you can always call slist_find() and it will “just work” for either a const or non-const slist.
As with the earlier example, this works because the preprocessor will not expand a macro that references itself.
Assuming the above declarations are in a .h file and the actual implementation of slist_find() is in a .c file (which includes the .h file), then you’ll run into the problem where the slist_find() (at this point, a macro) in the definition will get expanded by the preprocessor resulting in syntax errors. There are a few different fixes for this:
1. #undef slist_find just prior to the definition (but then if it’s also used later in the .c file, you won’t get the benefit of const overloading).
2. Use a different name for the non-const function, e.g., nonconst_slist_find.
3. Use extra parentheses like (slist_find) in the definition.
An example of #3 is:
// slist.c
slist* (slist_find)( slist *start, slist_pred_fn pred ) {
// ...
}
This fix works for two reasons:
- In C, any declaration of the form T x (declare x of type T) can have extra parentheses added like T (x) without changing its meaning.
- The preprocessor will expand a function-like macro only if it’s followed by (.
Since the slist_find in the definition is followed by ) and not (, the preprocessor will not expand it.
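Putting the header and the implementation into one file gives a self-contained sketch of const overloading with the parenthesized definition. The loop body of slist_find is an assumption (the original elides it), and const_slist_find is made static inline here only to keep the example to a single translation unit.

```c
#include <stddef.h>   /* NULL */

typedef struct slist slist;
struct slist {
  slist *next;
  void  *data;
};

typedef _Bool (*slist_pred_fn)( void const* );

/* Declared before the macro exists, so no special care is needed yet. */
slist* slist_find( slist *start, slist_pred_fn pred );

/* const variant: forwards to the non-const function. */
static inline slist const*
const_slist_find( slist const *start, slist_pred_fn pred ) {
  return slist_find( (slist*)start, pred );
}

#define slist_find(LIST,PRED)         \
  _Generic( (LIST),                   \
    slist*      : slist_find,         \
    slist const*: const_slist_find    \
  )( (LIST), (PRED) )

/* The parentheses around the name keep the macro from expanding here. */
slist* (slist_find)( slist *start, slist_pred_fn pred ) {
  /* Assumed implementation: return the first node the predicate accepts. */
  for ( ; start != NULL; start = start->next ) {
    if ( pred( start->data ) )
      return start;
  }
  return NULL;
}

/* The result type follows the const-ness of the list argument. */
_Static_assert(
  _Generic( slist_find( (slist*)0, (slist_pred_fn)0 ),
            slist*: 1, default: 0 ),
  "non-const list yields slist*"
);
_Static_assert(
  _Generic( slist_find( (slist const*)0, (slist_pred_fn)0 ),
            slist const*: 1, default: 0 ),
  "const list yields slist const*"
);
```

The two checks never call the function; they only inspect the type of the call expression, which is enough to show which overload the macro picked.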
Static if
We can also use _Generic to implement a “static if,” that is an if that’s evaluated at compile-time (similar to C++’s if constexpr):
#define STATIC_IF(EXPR,THEN,ELSE) \
_Generic( &(char[1 + !!(EXPR)]){0}, \
char (*)[2]: (THEN), \
char (*)[1]: (ELSE) \
)
This works by:
- Converting EXPR to either 0 or 1 via !!.
- Creating a compound literal array having one element plus a second element only if EXPR is true.
- Taking the compound array’s address via & at which point its type is either “pointer to array 2 of char” (i.e., char(*)[2] if true) or “pointer to array 1 of char” (i.e., char(*)[1] if false).
- If the type is char(*)[2], the result is THEN; else:
- If the type is char(*)[1], the result is ELSE.
Reminder: in C, a “pointer to array N of T” (for some size N of some type T) is not the same as the “pointer to T” that results from the name of an array regardless of its size “decaying” into a pointer to its first element (e.g., array A being a shorthand for &A[0]). Pointers to arrays of different sizes are distinct types.
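A few compile-time checks illustrate how STATIC_IF behaves; this is a sketch assuming a compiler that accepts _Generic inside _Static_assert, as gcc and clang do. Unlike the ordinary ?: operator, the two branches need not have compatible types, and the result has the type of whichever branch is chosen.

```c
#define STATIC_IF(EXPR,THEN,ELSE)       \
  _Generic( &(char[1 + !!(EXPR)]){0},   \
    char (*)[2]: (THEN),                \
    char (*)[1]: (ELSE)                 \
  )

/* A true condition selects THEN; the result has THEN's type (double). */
_Static_assert(
  _Generic( STATIC_IF( 1, 1.0, "no" ), double: 1, default: 0 ),
  "true selects the double branch"
);

/* A false condition selects ELSE; the string literal decays to char*. */
_Static_assert(
  _Generic( STATIC_IF( 0, 1.0, "no" ), char*: 1, default: 0 ),
  "false selects the string branch"
);

/* Any nonzero value counts as true because of the !!. */
_Static_assert( STATIC_IF( 42, 1, 0 ) == 1, "nonzero is true" );

/* The condition may be any integer constant expression. */
_Static_assert(
  STATIC_IF( sizeof(long) >= sizeof(int), 1, 0 ) == 1,
  "long is at least as wide as int"
);
```

Note that EXPR must be an integer constant expression; otherwise the array size in the compound literal would not be constant.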
We can also build on TO_VOID_PTR_EXPR to make IS_PTR_TO_CONST_EXPR:
#define IS_PTR_TO_CONST_EXPR(P) \
_Generic( TO_VOID_PTR_EXPR( (P) ), \
void const* : 1, \
default : 0 \
)
Given those macros, we can write a generalized macro that can const overload any function:
#define CONST_OVERLOAD(FN, PTR, ...) \
STATIC_IF( IS_PTR_TO_CONST_EXPR(PTR), \
const_ ## FN, \
(FN) \
)( (PTR) __VA_OPT__(,) __VA_ARGS__ )
#define slist_find(LIST,PRED) \
CONST_OVERLOAD( slist_find, (LIST), (PRED) )
Some Type Traits
Using _Generic, we can define macros similar to C++’s type traits functions. Note that some macros take expressions and others take types. Having one or the other (or sometimes both) is handy.
Get whether EXPR is a C string:
#define IS_C_STR_EXPR(EXPR) \
_Generic( (EXPR), \
char* : 1, \
char const* : 1, \
default : 0 \
)
Get whether TYPE is a signed or unsigned integral type:
#define IS_SIGNED_TYPE(TYPE) !IS_UNSIGNED_TYPE(TYPE)
#define IS_UNSIGNED_TYPE(TYPE) ((TYPE)-1 > 0)
Note that IS_SIGNED_TYPE() should not be ((TYPE)-1 < 0) because some compilers will give an “expression is always false” warning for unsigned types.
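As a quick usage sketch, these two macros make it possible to state signedness assumptions about typedef’d types at compile time. The extra parentheses around the body of IS_SIGNED_TYPE are a defensive addition of this sketch, not part of the original definition.

```c
#include <stddef.h>   /* size_t, ptrdiff_t */

#define IS_UNSIGNED_TYPE(TYPE)  ((TYPE)-1 > 0)
#define IS_SIGNED_TYPE(TYPE)    (!IS_UNSIGNED_TYPE(TYPE))

/* Casting -1 to an unsigned type wraps around to its maximum value. */
_Static_assert( IS_UNSIGNED_TYPE( size_t ),        "size_t is unsigned" );
_Static_assert( IS_UNSIGNED_TYPE( unsigned char ), "unsigned char is unsigned" );

/* Casting -1 to a signed type leaves it negative. */
_Static_assert( IS_SIGNED_TYPE( ptrdiff_t ),       "ptrdiff_t is signed" );
_Static_assert( IS_SIGNED_TYPE( signed char ),     "signed char is signed" );
```

Plain char is deliberately left out of the checks because its signedness is implementation defined.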
Get whether EXPR is of a signed, unsigned, or any integral type:
#define IS_SIGNED_EXPR(EXPR) \
_Generic( (EXPR), \
char : IS_SIGNED_TYPE(char), \
signed char: 1, \
short : 1, \
int : 1, \
long : 1, \
long long : 1, \
default : 0 \
)
#define IS_UNSIGNED_EXPR(EXPR) \
_Generic( (EXPR), \
_Bool : 1, \
char : IS_UNSIGNED_TYPE(char), \
unsigned char : 1, \
unsigned short : 1, \
unsigned int : 1, \
unsigned long : 1, \
unsigned long long: 1, \
default : 0 \
)
#define IS_INTEGRAL_EXPR(EXPR) \
(IS_SIGNED_EXPR(EXPR) || IS_UNSIGNED_EXPR(EXPR))
As a reminder, in C, it’s implementation defined whether char is signed or unsigned.
Get whether EXPR is of a floating-point type:
#define IS_FLOATING_POINT_EXPR(EXPR) \
_Generic( (EXPR), \
float : 1, \
double : 1, \
long double : 1, \
float _Complex : 1, \
double _Complex : 1, \
long double _Complex: 1, \
default : 0 \
)
Get whether EXPR is of any arithmetic type:
#define IS_ARITHMETIC_EXPR(EXPR) \
(IS_INTEGRAL_EXPR(EXPR) || IS_FLOATING_POINT_EXPR(EXPR))
As of C23 and typeof, get whether A is of an array (as opposed to a pointer) type:
#define IS_ARRAY_EXPR(A) \
_Generic( &(A), \
typeof(*(A)) (*)[]: 1, \
default : 0 \
)
This works because if A is actually an array:
1. The &(A) yields “pointer to array of type T.”
2. The A (inside typeof) “decays” into a pointer to its first element yielding “pointer to T,” i.e., T*.
3. The *A dereferences T* yielding the element type T.
4. Finally, T (*)[] yields “pointer to array of type T” which matches 1 above and _Generic returns 1 (true).
If A isn’t an array, e.g., a pointer, then none of the above works and _Generic matches the default case and returns 0 (false).
If you’re using a version of C prior to C23, both gcc and clang support typeof (or __typeof__) as an extension.
Get whether P is of a pointer (as opposed to an array) expression:
#define IS_POINTER_EXPR(P) \
_Generic( &(typeof((P))){0}, \
typeof(*(P)) ** : 1, \
default : 0 \
)
This works similarly to STATIC_IF and IS_ARRAY_EXPR. The reason the &(typeof((P))){0} is necessary instead of simply &(P) is for the case where you take the address of an object via & to yield a pointer rather than pass a pointer directly, e.g.:
#define MEM_ZERO(P) do { \
static_assert( IS_POINTER_EXPR(P), #P " must be a pointer" ); \
memset( (P), 0, sizeof( *(P) ) ); \
} while (0)
struct S { /* ... */ };
struct S s;
MEM_ZERO( &s );
If &(P) were used, passing &s (an rvalue) would result in &(&s) which is illegal. However, using &(typeof((P))){0} results in a compound literal of type pointer to S and compound literals are lvalues that you can take the address of.
Get whether T and U are the same type (similar to C++’s std::is_same):
#define IS_SAME_TYPE(T,U) \
_Generic( *(T*)0, \
typeof_unqual(U): 1, \
default : 0 \
)
The *(T*)0 is needed to convert T (a type) into an expression required by _Generic. (Reminder: the expression isn’t evaluated so it doesn’t matter that it’s dereferencing a null pointer.)
The typeof_unqual(U) is necessary to remove qualifiers, otherwise it would never match if U had qualifiers. (Reminder: _Generic discards qualifiers from the type of the controlling expression.)
Cast an integral expression to an unsigned type of the same size (similar to C++’s std::make_unsigned):
#define TO_UNSIGNED_EXPR(N) \
STATIC_IF( sizeof(N) == sizeof(char), \
(unsigned char)(N), \
STATIC_IF( sizeof(N) == sizeof(short), \
(unsigned short)(N), \
STATIC_IF( sizeof(N) == sizeof(int), \
(unsigned int)(N), \
STATIC_IF( sizeof(N) == sizeof(long), \
(unsigned long)(N), \
(unsigned long long)(N) ) ) ) )
TO_SIGNED_EXPR() would be similar.
As of C23 and typeof, get the underlying type of an enum (similar to C++’s std::underlying_type):
#define UNDERLYING_TYPE(ENUM_TYPE) \
typeof( STATIC_IF( IS_SIGNED_TYPE(ENUM_TYPE), \
TO_SIGNED_EXPR( (ENUM_TYPE)0 ), \
TO_UNSIGNED_EXPR( (ENUM_TYPE)0 ) ) )
No SFINAE (Substitution Failure is not an Error)
Consider a string buffer type:
struct strbuf {
char *str;
size_t len;
size_t cap;
};
typedef struct strbuf strbuf_t;
Suppose you want to implement a macro STRLEN() that will get the length of either an ordinary C string or a strbuf_t. You might write something like:
#define STRLEN(S) \
_Generic( (S), \
char const* : strlen((S)), \
strbuf_t* : (S)->len \
)
That is, if the type of S is:
- char const*, call strlen(S); or:
- strbuf_t*, return (S)->len.
That seems fairly straightforward. There’s just one problem: it won’t compile. Instead, you’ll get:
- strlen(S): warning: incompatible pointer types passing strbuf_t* to a parameter of type const char*.
- (S)->len: error: type const char is not a structure or union.
The problem with _Generic is that all expressions must be valid, even the expressions that are not selected. Specifically for this example:
- You can’t call strlen() on a strbuf_t*; and:
- You can’t refer to ->len on a char const*.
In C++ with SFINAE, something that isn’t valid when substituted is not an error: it’s simply ignored; unfortunately, not so in C.
The way to fix this is to make every _Generic expression similar. In this case, we can add a function:
static inline size_t strbuf_len( strbuf_t const *sbuf ) {
return sbuf->len;
}
Then rewrite STRLEN():
#define STRLEN(S) \
_Generic( (S), \
char const* : strlen, \
strbuf_t* : strbuf_len \
)( (S) )
This works because each expression is the name of a function to call and each is passed a single pointer of the type it expects. Note that it’s necessary to put the argument S outside the _Generic: if it were inside, then one function call would always be passing the wrong type.
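Assembled into one file, the first fix might look like the following minimal sketch. The total_len function is a hypothetical helper added only to show both uses of STRLEN side by side.

```c
#include <stddef.h>   /* size_t */
#include <string.h>   /* strlen */

struct strbuf {
  char   *str;
  size_t  len;
  size_t  cap;
};
typedef struct strbuf strbuf_t;

static inline size_t strbuf_len( strbuf_t const *sbuf ) {
  return sbuf->len;
}

/* Select the function first; pass the argument outside the _Generic. */
#define STRLEN(S)               \
  _Generic( (S),                \
    char const* : strlen,       \
    strbuf_t*   : strbuf_len    \
  )( (S) )

/* Hypothetical helper: both argument types compile without diagnostics. */
size_t total_len( char const *s, strbuf_t *sbuf ) {
  return STRLEN( s ) + STRLEN( sbuf );
}
```

Note that, as written, only char const* and strbuf_t* are accepted; a plain char* or a strbuf_t const* would need associations of their own.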
If for whatever reason you don’t want to add an inline function, there is an alternative fix:
#define STRLEN(S) \
_Generic( (S), \
char const* : strlen( (char const*)(S) ), \
strbuf_t* : ONLY_IF_T(strbuf_t*, (S))->len \
)
#define ONLY_IF_T(T,X) \
_Generic( (X), \
T : (X), \
default : ((T)only_if_t()) \
)
void* only_if_t( void );
Similar to an earlier example, this fix works by using an extra level of indirection. For STRLEN, if the type of S is char const*:
- Then for the char const* case, the call to strlen() is already the right type. (The cast to char const* may seem unnecessary, but wait.)
- But the strbuf_t* case must still be valid, so it calls ONLY_IF_T.
ONLY_IF_T treats the expression X as type T, but only if it really is of type T, in this case strbuf_t*:
- If the type of X is actually of type strbuf_t*, then the result is just X.
- However, if the type of X is any other type, it casts the result of only_if_t() to strbuf_t*.
The result of all this is that the strbuf_t* case of STRLEN compiles because it treats the type of S as if it were of type strbuf_t*, so the reference to ->len is valid.
But wait! Since we’re currently considering the case where the type of S is char const*, then we’re effectively casting that pointer to strbuf_t* and attempting to refer to ->len that will crash since the pointer doesn’t point to a strbuf_t. Or rather, it would crash if that line of code were actually ever executed, but it never is.
Why not? Remember: we’re currently considering the case where the type of S is char const* which means the char const* case in STRLEN will be selected. All the contortions for the strbuf_t* case are only to make the expression valid for the compiler’s sake. Once the expression passes a validity check, it’ll be discarded anyway.
A keen observer might be wondering what the only_if_t() function is for and what its definition is. Its declaration exists only to be something that can be cast to T. It’s intentionally not defined for two reasons:
- It’s never called since it only ever ends up in cases that the _Generic macros above discard anyway.
- If you were to make a mistake in the implementation of the macros resulting in the function actually being called, you’ll get an “undefined symbol” error at link-time to make you aware of your mistake.
For completeness, let’s consider the other case for STRLEN where the type of S is strbuf_t*:
- For the char const* case, the call to strlen() passes a strbuf_t* (the wrong type), which is why that cast to char const* is there: to make the code valid. This case will be discarded anyway, so it doesn’t matter.
- For the strbuf_t* case, the call to ONLY_IF_T will simply return S.
This fix is more convoluted than the initial fix (such is life in C without SFINAE), but, depending on what you’re doing, might be a better fit.
Conclusion
_Generic allows you to implement a veneer of function overloading (including const overloading) in C and do a limited form of compile-time type introspection.