Chapter 41. OProfile
OProfile is a low-overhead, system-wide performance monitoring
tool. It uses the performance monitoring hardware on the processor
to retrieve information about the kernel and executables on the
system, such as when memory is referenced, the number of L2 cache
requests, and the number of hardware interrupts received. On a Red
Hat Enterprise Linux system, the oprofile
RPM package must be installed to use this tool.
Many processors include dedicated performance monitoring
hardware. This hardware makes it possible to detect when certain
events happen (such as the requested data not being in cache). The
hardware normally takes the form of one or more counters that are
incremented each time an event takes place. When the counter value
rolls over, an interrupt is generated. Because the number of events
needed to reach the rollover can be adjusted, it is possible to
control the amount of detail (and therefore overhead) produced by
performance monitoring.
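The relationship between the rollover point and overhead can be illustrated with a small simulation. This is only a toy model, not OProfile code: the 8-bit counter width and the function name count_interrupts are assumptions made for the example, and real hardware counters are much wider.

```python
# Toy model of overflow-based sampling. Not OProfile code.
COUNTER_BITS = 8                      # assumed width; real counters are wider
COUNTER_MAX = (1 << COUNTER_BITS) - 1


def count_interrupts(total_events, events_per_sample):
    """Return how many rollover interrupts fire for total_events events."""
    if not 1 <= events_per_sample <= COUNTER_MAX + 1:
        raise ValueError("events_per_sample does not fit in the counter")
    # Preload the counter so it rolls over after events_per_sample increments.
    preload = COUNTER_MAX + 1 - events_per_sample
    counter = preload
    interrupts = 0
    for _ in range(total_events):
        counter += 1                  # one monitored event occurred
        if counter > COUNTER_MAX:     # rollover: an interrupt is generated
            interrupts += 1
            counter = preload         # the handler re-arms the counter
    return interrupts


if __name__ == "__main__":
    # Fewer events per sample means more interrupts: more detail, more overhead.
    for period in (10, 50, 200):
        print(period, count_interrupts(1000, period))
```

In this model, shrinking the number of events per sample raises the interrupt count proportionally, which is the trade-off between detail and overhead described above.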
OProfile uses this hardware (or a timer-based substitute in
cases where performance monitoring hardware is not present) to
collect samples of performance-related
data each time a counter generates an interrupt. These samples are
periodically written out to disk; later, the data in these samples
can be used to generate reports on system-level and
application-level performance.
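As a rough sketch of how collected samples might become a report, the following example counts samples per binary image (a system-level view) and per symbol within one image (an application-level view). The sample data, the tuple layout, and the function names are all hypothetical; they do not reflect the actual OProfile sample format or the opreport implementation.

```python
from collections import Counter

# Hypothetical samples: one (image, symbol) pair per counter interrupt.
SAMPLES = [
    ("vmlinux", "schedule"),
    ("app", "parse"),
    ("app", "parse"),
    ("app", "render"),
    ("libc.so", "memcpy"),
]


def summarize(samples, key):
    """Return (name, count, percent) rows, largest count first."""
    counts = Counter(key(s) for s in samples)
    total = sum(counts.values())
    return [(name, n, 100.0 * n / total) for name, n in counts.most_common()]


def per_image(samples):
    """System-level view: how many samples landed in each image."""
    return summarize(samples, key=lambda s: s[0])


def per_symbol(samples, image):
    """Application-level view: samples per symbol inside one image."""
    return summarize([s for s in samples if s[0] == image], key=lambda s: s[1])


if __name__ == "__main__":
    for row in per_image(SAMPLES):
        print("%-10s %3d %6.2f%%" % row)
    for row in per_symbol(SAMPLES, "app"):
        print("%-10s %3d %6.2f%%" % row)
```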
OProfile is a useful tool, but be aware of some limitations when
using it:
- Use of shared libraries: Samples for code in shared libraries are
  not attributed to the particular application unless the
  --separate=library option is used.
- Performance monitoring samples are inexact: When a performance
  monitoring register triggers a sample, the interrupt is not
  delivered as precisely as, for example, a divide-by-zero
  exception. Because the processor executes instructions out of
  order, the sample may be recorded on a nearby instruction.
- opreport does not associate samples for inline functions
  properly: opreport uses a simple address range mechanism to
  determine which function an address is in. Inline function samples
  are not attributed to the inline function but rather to the
  function the inline function was inserted into.
- OProfile accumulates data from multiple runs: OProfile is a
  system-wide profiler and expects processes to start up and shut
  down multiple times. Thus, samples from multiple runs accumulate.
  Use the command opcontrol --reset to clear out the samples from
  previous runs.
- Non-CPU-limited performance problems: OProfile is oriented to
  finding problems with CPU-limited processes. OProfile does not
  identify processes that are asleep because they are waiting on
  locks or for some other event to occur (for example, an I/O device
  to finish an operation).
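The inline function limitation follows directly from an address range lookup. The sketch below shows one way such a lookup could work; it is an illustration under assumed data, not the opreport implementation. The symbol table, the addresses, and the function names (including the inlined helper) are hypothetical.

```python
import bisect

# Hypothetical symbol table: (start_address, function_name), sorted by start.
# Assume helper() was inlined into main(), so its instructions live inside
# main's range and helper has no entry of its own.
SYMBOLS = [
    (0x1000, "main"),
    (0x1400, "compute"),
    (0x1800, "cleanup"),
]
STARTS = [start for start, _ in SYMBOLS]


def function_for(address):
    """Return the function whose range contains address, or None.

    Each function is assumed to extend up to the next symbol's start;
    the last one is treated as open-ended to keep the sketch short.
    """
    i = bisect.bisect_right(STARTS, address) - 1
    return SYMBOLS[i][1] if i >= 0 else None


if __name__ == "__main__":
    # A sample taken in the inlined copy of helper() is still reported
    # against main, because only address ranges are consulted.
    print(function_for(0x1234))
```

Because the lookup sees only start addresses, a sample that falls in inlined code can only ever be credited to the enclosing function.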
Table 41-1
provides a brief overview of the tools provided with the oprofile package.
Command | Description
op_help | Displays available events for the system's processor along with a brief description of each.
op_import | Converts sample database files from a foreign binary format to the native format for the system. Only use this tool when analyzing a sample database from a different architecture.
opannotate | Creates annotated source for an executable if the application was compiled with debugging symbols. Refer to Section 41.5.3 Using opannotate for details.
opcontrol | Configures what data is collected. Refer to Section 41.2 Configuring OProfile for details.
opreport | Retrieves profile data. Refer to Section 41.5.1 Using opreport for details.
oprofiled | Runs as a daemon to periodically write sample data to disk.
Table 41-1. OProfile Commands