An Introduction to GCC - Source-level optimization

6.1 Source-level optimization

The first form of optimization used by GCC occurs at the source-code level, and does not require any knowledge of the machine instructions. There are many source-level optimization techniques--this section describes two common types: common subexpression elimination and function inlining.

6.1.1 Common subexpression elimination

One method of source-level optimization which is easy to understand involves computing an expression in the source code with fewer instructions, by reusing already-computed results. For example, the following assignment:

x = cos(v)*(1+sin(u/2)) + sin(w)*(1-sin(u/2))

can be rewritten with a temporary variable t to eliminate an unnecessary extra evaluation of the term sin(u/2):

t = sin(u/2)
x = cos(v)*(1+t) + sin(w)*(1-t)

This rewriting is called common subexpression elimination (CSE), and is performed automatically when optimization is turned on.⁽¹⁹⁾ Common subexpression elimination is powerful, because it simultaneously increases the speed and reduces the size of the code.

6.1.2 Function inlining

Another type of source-level optimization, called function inlining, increases the efficiency of frequently-called functions.

Whenever a function is used, a certain amount of extra time is required for the CPU to carry out the call: it must store the function arguments in the appropriate registers and memory locations, jump to the start of the function (bringing the appropriate virtual memory pages into physical memory or the CPU cache if necessary), begin executing the code, and then return to the original point of execution when the function call is complete. This additional work is referred to as function-call overhead. Function inlining eliminates this overhead by replacing calls to a function by the code of the function itself (known as placing the code in-line).

In most cases, function-call overhead is a negligible fraction of the total run-time of a program. It can become significant only when there are functions which contain relatively few instructions, and these functions account for a substantial fraction of the run-time--in this case the overhead then becomes a large proportion of the total run-time.

Inlining is always favorable if there is only one point of invocation of a function. It is also unconditionally better if the invocation of a function requires more instructions (memory) than moving the body of the function in-line. This is a common situation for simple accessor functions in C++, which can benefit greatly from inlining. Moreover, inlining may facilitate further optimizations, such as common subexpression elimination, by merging several separate functions into a single large function.

The following function sq(x) is a typical example of a function that would benefit from being inlined. It computes x^2, the square of its argument x:

double
sq (double x)
{
  return x * x;
}

This function is small, so the overhead of calling it is comparable to the time taken to execute the single multiplication carried out by the function itself. If this function is used inside a loop, such as the one below, then the function-call overhead would become substantial:

for (i = 0; i < 1000000; i++)
  {
     sum += sq (i + 0.5);
  }

Optimization with inlining replaces the inner loop of the program with the body of the function, giving the following code:

for (i = 0; i < 1000000; i++)
  {
    double t = (i + 0.5);  /* temporary variable */
    sum += t * t;
  }

Eliminating the function call and performing the multiplication in-line allows the loop to run with maximum efficiency.

GCC selects functions for inlining using a number of heuristics, such as the function being suitably small. As an optimization, inlining is carried out only within each object file. The inline keyword can be used to request explicitly that a specific function should be inlined wherever possible, including its use in other files.⁽²⁰⁾ The GCC Reference Manual "Using GCC" provides full details of the inline keyword, and its use with the static and extern qualifiers to control the linkage of explicitly inlined functions (see section Further reading).

<<< previous

table of contents

next >>>