3.10. Macro Pitfalls
In this section we describe some special rules that apply to macros and
macro expansion, and point out certain cases in which the rules have
counter-intuitive consequences that you must watch out for.
3.10.1. Misnesting
When a macro is called with arguments, the arguments are substituted
into the macro body and the result is checked, together with the rest of
the input file, for more macro calls. It is possible to piece together
a macro call coming partially from the macro body and partially from the
arguments. For example,
#define twice(x) (2*(x))
#define call_with_1(x) x(1)
call_with_1 (twice)
==> twice(1)
==> (2*(1))
Macro definitions do not have to have balanced parentheses. By writing
an unbalanced open parenthesis in a macro body, it is possible to create
a macro call that begins inside the macro body but ends outside of it.
For example,
#define strange(file) fprintf (file, "%s %d",
…
strange(stderr) p, 35)
==> fprintf (stderr, "%s %d", p, 35)
The ability to piece together a macro call can be useful, but the use of
unbalanced open parentheses in a macro body is just confusing, and
should be avoided.
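The first example can be checked in a small program. The following is only a sketch: the wrapper function misnesting_demo is a hypothetical name introduced for illustration and is not part of the text above.

```c
#define twice(x) (2*(x))
#define call_with_1(x) x(1)

/* Hypothetical wrapper, introduced only for illustration.  */
static int misnesting_demo(void)
{
    /* The name `twice' comes from the argument; the `(1)' comes from
       the body of call_with_1.  After substitution the two pieces
       form the call twice(1), which the rescan expands to (2*(1)).  */
    return call_with_1 (twice);
}
```

Calling misnesting_demo () yields 2, the value of (2*(1)).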
3.10.2. Operator Precedence Problems
You may have noticed that in most of the macro definition examples shown
above, each occurrence of a macro argument name had parentheses around
it. In addition, another pair of parentheses usually surrounds the
entire macro definition. Here is why it is best to write macros that
way.
Suppose you define a macro as follows,
#define ceil_div(x, y) (x + y - 1) / y
whose purpose is to divide, rounding up. (One use for this operation is
to compute how many int objects are needed to hold a certain
number of char objects.) Then suppose it is used as follows:
a = ceil_div (b & c, sizeof (int));
==> a = (b & c + sizeof (int) - 1) / sizeof (int);
This does not do what is intended. The operator-precedence rules of
C make it equivalent to this:
a = (b & (c + sizeof (int) - 1)) / sizeof (int);
What we want is this:
a = ((b & c) + sizeof (int) - 1) / sizeof (int);
Defining the macro as
#define ceil_div(x, y) ((x) + (y) - 1) / (y)
provides the desired result.
Unintended grouping can result in another way. Consider sizeof
ceil_div(1, 2). That has the appearance of a C expression that would
compute the size of the type of ceil_div (1, 2), but in fact it
means something very different. Here is what it expands to:
sizeof ((1) + (2) - 1) / (2)
This would take the size of an integer and divide it by two. The
precedence rules have put the division outside the sizeof when it
was intended to be inside.
Parentheses around the entire macro definition prevent such problems.
Here, then, is the recommended way to define ceil_div:
#define ceil_div(x, y) (((x) + (y) - 1) / (y))
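To make both grouping problems concrete, the sketch below puts the three variants side by side. The names ceil_div_bad and ceil_div_inner and the four helper functions are hypothetical, chosen only so that all variants can coexist in one file. A plain division stands in for sizeof in the second pair, because the size of int differs between machines.

```c
/* Hypothetical names for the three definitions discussed above.  */
#define ceil_div_bad(x, y)    (x + y - 1) / y          /* no parentheses */
#define ceil_div_inner(x, y)  ((x) + (y) - 1) / (y)    /* arguments only */
#define ceil_div(x, y)        (((x) + (y) - 1) / (y))  /* recommended    */

/* Expands to (b & c + 4 - 1) / 4, which groups as b & (c + 3).  */
static int args_bad(int b, int c)  { return ceil_div_bad (b & c, 4); }

/* Expands to (((b & c) + (4) - 1) / (4)), as intended.  */
static int args_good(int b, int c) { return ceil_div (b & c, 4); }

/* Expands to 12 / ((5) + (2) - 1) / (2), that is (12 / 6) / 2.  */
static int outer_bad(void)  { return 12 / ceil_div_inner (5, 2); }

/* Expands to 12 / (((5) + (2) - 1) / (2)), that is 12 / 3.  */
static int outer_good(void) { return 12 / ceil_div (5, 2); }
```

With b and c both equal to 1, args_bad returns 0 where args_good returns 1. Likewise outer_bad returns 1 where outer_good returns 4. A compiler may warn about the missing parentheses in the unsafe variants.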
3.10.3. Swallowing the Semicolon
Often it is desirable to define a macro that expands into a compound
statement. Consider, for example, the following macro, which advances a
pointer (the argument p says where to find it) across whitespace
characters:
#define SKIP_SPACES(p, limit)  \
{ char *lim = (limit);         \
  while (p < lim) {            \
    if (*p++ != ' ') {         \
      p--; break; }}}
Here backslash-newline is used to split the macro definition, which must
be a single logical line, so that it resembles the way such code would
be laid out if not part of a macro definition.
A call to this macro might be SKIP_SPACES (p, lim). Strictly
speaking, the call expands to a compound statement, which is a complete
statement with no need for a semicolon to end it. However, since it
looks like a function call, it minimizes confusion if you can use it
like a function call, writing a semicolon afterward, as in
SKIP_SPACES (p, lim);
This can cause trouble before else statements, because the
semicolon is actually a null statement. Suppose you write
if (*p != 0)
SKIP_SPACES (p, lim);
else …
The presence of two statements--the compound statement and a null
statement--in between the if condition and the else
makes invalid C code.
The definition of the macro SKIP_SPACES can be altered to solve
this problem, using a do … while statement. Here is how:
#define SKIP_SPACES(p, limit)     \
do { char *lim = (limit);         \
     while (p < lim) {            \
       if (*p++ != ' ') {         \
         p--; break; }}}          \
while (0)
Now SKIP_SPACES (p, lim); expands into
do {…} while (0);
which is one statement. The loop executes exactly once; most compilers
generate no extra code for it.
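As a sketch of the corrected macro in use, the hypothetical function first_nonspace below places the call between if and else with a trailing semicolon. The function name, its return convention, and the parameter name end are assumptions made for this illustration.

```c
#define SKIP_SPACES(p, limit)     \
do { char *lim = (limit);         \
     while (p < lim) {            \
       if (*p++ != ' ') {         \
         p--; break; }}}          \
while (0)

/* Hypothetical helper: returns the offset of the first non-space
   character in buf, or -1 if buf is empty.  The limit is called
   `end' rather than `lim' so that it cannot collide with the local
   variable the macro declares.  */
static int first_nonspace(char *buf, char *end)
{
    char *p = buf;

    if (*p != 0)
        SKIP_SPACES (p, end);   /* one statement; the else still pairs up */
    else
        return -1;
    return (int) (p - buf);
}
```

With the brace-only definition the same function would not compile, because the semicolon after the call would separate the if from its else.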
3.10.4. Duplication of Side Effects
Many C programs define a macro min, for "minimum", like this:
#define min(X, Y) ((X) < (Y) ? (X) : (Y))
When you use this macro with an argument containing a side effect,
as shown here,
next = min (x + y, foo (z));
it expands as follows:
next = ((x + y) < (foo (z)) ? (x + y) : (foo (z)));
where x + y has been substituted for X and foo (z)
for Y.
The function foo is used only once in the statement as it appears
in the program, but the expression foo (z) has been substituted
twice into the macro expansion. As a result, foo might be called
two times when the statement is executed. If it has side effects or if
it takes a long time to compute, the results might not be what you
intended. We say that min is an unsafe macro.
The best solution to this problem is to define min in a way that
computes the value of foo (z) only once. The C language offers
no standard way to do this, but it can be done with GNU extensions as
follows:
#define min(X, Y)                \
({ typeof (X) x_ = (X);          \
   typeof (Y) y_ = (Y);          \
   (x_ < y_) ? x_ : y_; })
The ({ … }) notation produces a compound statement that
acts as an expression. Its value is the value of its last statement.
This permits us to define local variables and assign each argument to
one. The local variables have underscores after their names to reduce
the risk of conflict with an identifier of wider scope (it is impossible
to avoid this entirely). Now each argument is evaluated exactly once.
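The sketch below counts calls to confirm that each argument is evaluated once. It assumes a compiler that supports GNU statement expressions, such as GCC. It spells the keyword __typeof__, which GCC also accepts when strict ISO C is requested. The counter calls, the body of foo, and the function gnu_min_demo are hypothetical.

```c
static int calls;   /* hypothetical counter: number of calls to foo */

static int foo(int z) { calls++; return z; }

/* GNU C only: relies on statement expressions and __typeof__.  */
#define min(X, Y)                   \
  ({ __typeof__ (X) x_ = (X);       \
     __typeof__ (Y) y_ = (Y);       \
     (x_ < y_) ? x_ : y_; })

static int gnu_min_demo(int x, int y, int z)
{
    calls = 0;
    /* foo (z) is evaluated once, when y_ is initialized.  */
    return min (x + y, foo (z));
}
```

After gnu_min_demo (5, 5, 3) returns 3, the counter calls holds 1.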
If you do not wish to use GNU C extensions, the only solution is to be
careful when using the macro min. For example, you can
calculate the value of foo (z), save it in a variable, and use
that variable in min:
#define min(X, Y) ((X) < (Y) ? (X) : (Y))
…
{
  int tem = foo (z);
  next = min (x + y, tem);
}
(where we assume that foo returns type int).
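The difference can be observed by counting calls. In this sketch the counter calls, the body of foo, and the functions unsafe_use and careful_use are hypothetical and serve only to make the duplicated evaluation visible.

```c
#define min(X, Y)  ((X) < (Y) ? (X) : (Y))

static int calls;   /* hypothetical counter: number of calls to foo */

static int foo(int z) { calls++; return z; }

/* foo (z) appears twice in the expansion: once in the comparison and
   once more if it is selected as the result.  */
static int unsafe_use(int x, int y, int z)
{
    calls = 0;
    return min (x + y, foo (z));
}

/* The workaround: evaluate foo (z) once and pass the saved value.  */
static int careful_use(int x, int y, int z)
{
    int tem;

    calls = 0;
    tem = foo (z);
    return min (x + y, tem);
}
```

Both functions return 3 for the arguments (5, 5, 3), but unsafe_use leaves calls at 2 while careful_use leaves it at 1. For (1, 1, 3) the first operand is smaller, so even unsafe_use calls foo only once. This is why the text says that foo "might be" called two times.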
3.10.5. Self-Referential Macros
A self-referential macro is one whose name appears in its
definition. Recall that all macro definitions are rescanned for more
macros to replace. If the self-reference were considered a use of the
macro, it would produce an infinitely large expansion. To prevent this,
the self-reference is not considered a macro call. It is passed into
the preprocessor output unchanged. Consider an example:
#define foo (4 + foo)
where foo is also a variable in your program.
Following the ordinary rules, each reference to foo will expand
into (4 + foo); then this will be rescanned and will expand into
(4 + (4 + foo)); and so on until the computer runs out of memory.
The self-reference rule cuts this process short after one step, at
(4 + foo). Therefore, this macro definition has the possibly
useful effect of causing the program to add 4 to the value of foo
wherever foo is referred to.
In most cases, it is a bad idea to take advantage of this feature. A
person reading the program who sees that foo is a variable will
not expect that it is a macro as well. The reader will come across the
identifier foo in the program and think its value should be that
of the variable foo, whereas in fact the value is four greater.
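A minimal sketch of this situation follows. The initial value 10 and the function read_foo are assumptions made for illustration. The variable is declared before the macro is defined, so the declaration itself is not affected.

```c
static int foo = 10;       /* the variable; declared before the macro */

#define foo (4 + foo)      /* the inner foo is not expanded again */

/* Hypothetical accessor, for illustration only.  */
static int read_foo(void)
{
    return foo;            /* expands to (4 + foo): reads the variable */
}
```

Here read_foo () returns 14, which is four greater than the value stored in the variable.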
One common, useful use of self-reference is to create a macro which
expands to itself. If you write
#define EPERM EPERM
then the macro EPERM expands to EPERM. Effectively, it is
left alone by the preprocessor whenever it's used in running text. You
can tell that it's a macro with #ifdef. You might do this if you
want to define numeric constants with an enum, but have
#ifdef be true for each constant.
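The enum technique might look like the following sketch. The names MYERR_PERM, MYERR_NOENT, my_error, and perm_code are hypothetical and do not come from any real header.

```c
/* Hypothetical error codes, defined as enumeration constants.  */
enum my_error { MYERR_PERM = 1, MYERR_NOENT = 2 };

/* Each macro expands to itself, so uses in running text still refer
   to the enumeration constants, yet #ifdef can detect the names.  */
#define MYERR_PERM  MYERR_PERM
#define MYERR_NOENT MYERR_NOENT

static int perm_code(void)
{
#ifdef MYERR_PERM
    return MYERR_PERM;     /* taken: the macro is defined */
#else
    return -1;             /* would be taken without the #define */
#endif
}
```

Because the #ifdef test succeeds, perm_code () returns the enumeration value 1.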
If a macro x expands to use a macro y, and the expansion of
y refers to the macro x, that is an indirect
self-reference of x. x is not expanded in this case
either. Thus, if we have
#define x (4 + y)
#define y (2 * x)
then x and y expand as follows:
x    ==> (4 + y)
     ==> (4 + (2 * x))

y    ==> (2 * x)
     ==> (2 * (4 + y))
Each macro is expanded when it appears in the definition of the other
macro, but not when it indirectly appears in its own definition.
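To see these expansions produce values, the sketch below assumes that x and y are also variables, declared before the macros and given the arbitrary values 1 and 10. The accessor functions read_x and read_y are hypothetical.

```c
static int x = 1, y = 10;  /* variables; declared before the macros */

#define x (4 + y)
#define y (2 * x)

/* x ==> (4 + y) ==> (4 + (2 * x)); the final x is the variable.  */
static int read_x(void) { return x; }

/* y ==> (2 * x) ==> (2 * (4 + y)); the final y is the variable.  */
static int read_y(void) { return y; }

/* Single-letter macros are hazardous, so remove them again.  */
#undef x
#undef y
```

With these values read_x () computes 4 + 2 * 1 = 6 and read_y () computes 2 * (4 + 10) = 28.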
3.10.6. Argument Prescan
Macro arguments are completely macro-expanded before they are
substituted into a macro body, unless they are stringified or pasted
with other tokens. After substitution, the entire macro body, including
the substituted arguments, is scanned again for macros to be expanded.
The result is that the arguments are scanned twice to expand
macro calls in them.
Most of the time, this has no effect. If the argument contained any
macro calls, they are expanded during the first scan. The result
therefore contains no macro calls, so the second scan does not change
it. If the argument were substituted as given, with no prescan, the
single remaining scan would find the same macro calls and produce the
same results.
You might expect the double scan to change the results when a
self-referential macro is used in an argument of another macro
(see Section 3.10.5, Self-Referential Macros): the self-referential macro would be
expanded once in the first scan, and a second time in the second scan.
However, this is not what happens. The self-references that do not
expand in the first scan are marked so that they will not expand in the
second scan either.
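This can be checked with a pass-through macro. In the sketch below the macro identity, the function through_argument, and the value 10 are hypothetical.

```c
static int foo = 10;       /* the variable; declared before the macro */

#define foo (4 + foo)
#define identity(x) x      /* hypothetical pass-through macro */

static int through_argument(void)
{
    /* The prescan expands the argument foo to (4 + foo).  The inner
       foo is marked, so the second scan does not expand it again.  */
    return identity (foo);
}
```

The result is 14, the value of (4 + foo). A second expansion would have produced (4 + (4 + foo)), which is 18.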
You might wonder, "Why mention the prescan, if it makes no difference?
And why not skip it and make the preprocessor faster?" The answer is
that the prescan does make a difference in three special cases:
Nested calls to a macro.
We say that nested calls to a macro occur when a macro's argument
contains a call to that very macro. For example, if f is a macro
that expects one argument, f (f (1)) is a nested pair of calls to
f. The desired expansion is made by expanding f (1) and
substituting that into the definition of f. The prescan causes
the expected result to happen. Without the prescan, f (1) itself
would be substituted as an argument, and the inner use of f would
appear during the main scan as an indirect self-reference and would not
be expanded.
Macros that call other macros that stringify or concatenate.
If an argument is stringified or concatenated, the prescan does not
occur. If you want to expand a macro, then stringify or
concatenate its expansion, you can do that by causing one macro to call
another macro that does the stringification or concatenation. For
instance, if you have
#define AFTERX(x) X_ ## x
#define XAFTERX(x) AFTERX(x)
#define TABLESIZE 1024
#define BUFSIZE TABLESIZE
then AFTERX(BUFSIZE) expands to X_BUFSIZE, and
XAFTERX(BUFSIZE) expands to X_1024. (Not to
X_TABLESIZE. Prescan always does a complete expansion.)
Macros used in arguments, whose expansions contain unshielded commas.
This can cause a macro expanded on the second scan to be called with the
wrong number of arguments. Here is an example:
#define foo a,b
#define bar(x) lose(x)
#define lose(x) (1 + (x))
We would like bar(foo) to turn into (1 + (foo)), which
would then turn into (1 + (a,b)). Instead, bar(foo)
expands into lose(a,b), and you get an error because lose
requires a single argument. In this case, the problem is easily solved
by the same parentheses that ought to be used to prevent misnesting of
arithmetic operations:
#define foo (a,b)
or
#define bar(x) lose((x))
The extra pair of parentheses prevents the comma in foo's
definition from being interpreted as an argument separator.
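The three cases can be tried together in one file. This is a sketch only: the definition of f, the variables X_BUFSIZE and X_1024, and all four helper functions are hypothetical, and the comma case uses the second of the two fixes shown above.

```c
/* Case 1: nested calls.  f is a hypothetical one-argument macro.  */
#define f(x) ((x) + 1)

static int nested(void) { return f (f (1)); }  /* ((((1) + 1)) + 1) */

/* Case 2: expanding an argument before pasting it.  */
#define AFTERX(x) X_ ## x
#define XAFTERX(x) AFTERX(x)
#define TABLESIZE 1024
#define BUFSIZE TABLESIZE

static int X_BUFSIZE = 1;  /* hypothetical variables whose names */
static int X_1024 = 2;     /* match the two possible results     */

static int pasted_directly(void)        { return AFTERX (BUFSIZE); }
static int pasted_after_expansion(void) { return XAFTERX (BUFSIZE); }

/* Case 3: an unshielded comma, repaired by the extra parentheses
   in bar.  */
#define foo a,b
#define bar(x) lose((x))
#define lose(x) (1 + (x))

/* bar (foo) ==> lose((a,b)) ==> (1 + ((a,b))); the comma operator
   discards a and yields b.  */
static int comma_case(int a, int b) { return bar (foo); }
```

Here nested () returns 3, pasted_directly () reads X_BUFSIZE and returns 1, pasted_after_expansion () reads X_1024 and returns 2, and comma_case (3, 5) returns 6. A compiler may warn that the left operand of the comma has no effect.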
3.10.7. Newlines in Arguments
The invocation of a function-like macro can extend over many logical
lines. However, in the present implementation, the entire expansion
comes out on one line. Thus line numbers emitted by the compiler or
debugger refer to the line the invocation started on, which might be
different from the line containing the argument causing the problem.
Here is an example illustrating this:
#define ignore_second_arg(a,b,c) a; c

ignore_second_arg (foo (),
                   ignored (),
                   syntax error);
The syntax error triggered by the tokens syntax error results in
an error message citing line three--the line of ignore_second_arg--
even though the problematic code comes from line five.
We consider this a bug, and intend to fix it in the near future.