Chapter 41. OProfile
OProfile is a low-overhead, system-wide performance monitoring tool. It
uses the performance monitoring hardware on the processor to retrieve
information about the kernel and executables on the system, such as when
memory is referenced, the number of L2 cache requests, and the number of
hardware interrupts received. On a Red Hat Enterprise Linux system, the
oprofile RPM package must be installed to use this
tool.
Many processors include dedicated performance monitoring hardware. This
hardware makes it possible to detect when certain events happen (such as
the requested data not being in cache). The hardware normally takes the
form of one or more counters that are incremented
each time an event takes place. When the counter value rolls over, an
interrupt is generated, making it possible to control the amount of
detail (and therefore overhead) produced by performance monitoring.
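
The relationship between counter rollover and the amount of detail collected can be sketched with a small simulation. This is only an illustrative model, not OProfile code: the function `simulate_sampling`, the parameter `events_per_sample`, and the locations `func_a` and `func_b` are hypothetical names chosen for the example.

```python
def simulate_sampling(event_stream, events_per_sample):
    """Model a hardware counter that interrupts each time it rolls over.

    event_stream: iterable of code locations, one entry per hardware event.
    events_per_sample: hypothetical number of events between rollovers.
    Returns the list of locations recorded at each simulated interrupt.
    """
    if events_per_sample <= 0:
        raise ValueError("events_per_sample must be positive")
    counter = 0
    samples = []
    for location in event_stream:
        counter += 1                      # the counter increments per event
        if counter == events_per_sample:  # rollover: an interrupt fires
            samples.append(location)      # record where the event occurred
            counter = 0                   # counter restarts after rollover
    return samples


# Hypothetical workload: 750 events in func_a followed by 250 in func_b.
events = ["func_a"] * 750 + ["func_b"] * 250

coarse = simulate_sampling(events, 100)  # few interrupts, low overhead
fine = simulate_sampling(events, 10)     # more interrupts, more detail

print(len(coarse), len(fine))
```

In this model, sampling every 10 events produces ten times as many samples (and ten times as many interrupts) as sampling every 100 events, which is the trade-off between detail and overhead described above.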
OProfile uses this hardware (or a timer-based substitute in cases where
performance monitoring hardware is not present) to collect
samples of performance-related data each time a
counter generates an interrupt. These samples are periodically written
out to disk; the data they contain can later be used to generate
reports on system-level and application-level performance.
OProfile is a useful tool, but be aware of some limitations when using it:
Use of shared libraries: Samples for code in shared libraries are not
attributed to the particular application unless the --separate=library
option is used.

Performance monitoring samples are inexact: When a performance
monitoring register triggers a sample, the interrupt is not precise in
the way a divide-by-zero exception is. Due to the out-of-order
execution of instructions by the processor, the sample may be recorded
on a nearby instruction.

opreport does not associate samples for inline functions properly:
opreport uses a simple address range mechanism to determine which
function an address is in. Inline function samples are not attributed
to the inline function but rather to the function the inline function
was inserted into.

OProfile accumulates data from multiple runs: OProfile is a system-wide
profiler and expects processes to start up and shut down multiple
times. Thus, samples from multiple runs accumulate. Use the command
opcontrol --reset to clear out the samples from previous runs.

Non-CPU-limited performance problems: OProfile is oriented to finding
problems with CPU-limited processes. OProfile does not identify
processes that are asleep because they are waiting on locks or for some
other event to occur (for example, an I/O device to finish an
operation).
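
The inline function limitation follows directly from looking up addresses by range. The sketch below is a simplified model of such a lookup and makes no claim about how opreport is actually implemented; the symbol table, its addresses, and the function names are all hypothetical.

```python
import bisect

# Hypothetical symbol table: (start_address, function_name), sorted by
# start address. Each function is assumed to extend to the next start.
SYMBOLS = [
    (0x1000, "main"),
    (0x1400, "process_request"),
    (0x1900, "cleanup"),
]
STARTS = [start for start, _name in SYMBOLS]


def function_for(address):
    """Return the function whose address range contains the address.

    Returns None if the address lies below the first known symbol.
    """
    # Find the last symbol whose start address is <= the sampled address.
    index = bisect.bisect_right(STARTS, address) - 1
    if index < 0:
        return None
    return SYMBOLS[index][1]


# Suppose the compiler expanded an inline helper inside process_request,
# so that the helper's instructions sit at 0x1480. The inline helper has
# no entry of its own in the table, so the sample is credited to the
# enclosing function.
print(function_for(0x1480))
```

Because the lookup only knows the ranges of out-of-line functions, a sample taken in the inlined code is reported under `process_request` in this model, matching the behavior described in the list above.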
Table 41-1 provides a brief overview of the
tools provided with the oprofile package.
Command | Description |
---|---|
op_help | Displays available events for the system's processor along with a brief description of each. |
op_import | Converts sample database files from a foreign binary format to the native format for the system. Only use this command when analyzing a sample database from a different architecture. |
opannotate | Creates annotated source for an executable if the application was compiled with debugging symbols. Refer to Section 41.5.3 Using opannotate for details. |
opcontrol | Configures what data is collected. Refer to Section 41.2 Configuring OProfile for details. |
opreport | Retrieves profile data. Refer to Section 41.5.1 Using opreport for details. |
oprofiled | Runs as a daemon to periodically write sample data to disk. |
Table 41-1. OProfile Commands