Earlier we discussed the standard option -b which chooses among
different installed compilers for completely different target
machines, such as VAX vs. 68000 vs. 80386.
In addition, each of these target machine types can have its own
special options, starting with -m, to choose among various
hardware models or configurations--for example, 68010 vs 68020,
floating coprocessor or none. A single installed version of the
compiler can compile for any model or configuration, according to the
options specified.
Some configurations of the compiler also support additional special
options, usually for compatibility with other compilers on the same
platform.
These options are defined by the macro TARGET_SWITCHES in the
machine description. The default for the options is also defined by
that macro, which enables you to change the defaults.
4.17.1. IBM RS/6000 and PowerPC Options
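These -m options are defined for the IBM RS/6000 and PowerPC:
-mpower, -mno-power, -mpower2, -mno-power2, -mpowerpc, -mno-powerpc,
-mpowerpc-gpopt, -mno-powerpc-gpopt, -mpowerpc-gfxopt, -mno-powerpc-gfxopt,
-mpowerpc64, -mno-powerpc64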
GCC supports two related instruction set architectures for the
RS/6000 and PowerPC. The POWER instruction set comprises those
instructions supported by the rios chip set used in the original
RS/6000 systems, and the PowerPC instruction set is the
architecture of the Motorola MPC5xx, MPC6xx, and MPC8xx microprocessors and
the IBM 4xx microprocessors.
Neither architecture is a subset of the other. However, there is a
large common subset of instructions supported by both. An MQ
register is included in processors supporting the POWER architecture.
You use these options to specify which instructions are available on the
processor you are using. The default value of these options is
determined when configuring GCC. Specifying
-mcpu=cpu_type overrides the specification of these
options. We recommend you use the -mcpu=cpu_type option
rather than the options listed above.
The -mpower option allows GCC to generate instructions that
are found only in the POWER architecture and to use the MQ register.
Specifying -mpower2 implies -mpower and also allows GCC
to generate instructions that are present in the POWER2 architecture but
not the original POWER architecture.
The -mpowerpc option allows GCC to generate instructions that
are found only in the 32-bit subset of the PowerPC architecture.
Specifying -mpowerpc-gpopt implies -mpowerpc and also allows
GCC to use the optional PowerPC architecture instructions in the
General Purpose group, including floating-point square root. Specifying
-mpowerpc-gfxopt implies -mpowerpc and also allows GCC to
use the optional PowerPC architecture instructions in the Graphics
group, including floating-point select.
The -mpowerpc64 option allows GCC to generate the additional
64-bit instructions that are found in the full PowerPC64 architecture
and to treat GPRs as 64-bit, doubleword quantities. GCC defaults to
-mno-powerpc64.
If you specify both -mno-power and -mno-powerpc, GCC
will use only the instructions in the common subset of both
architectures plus some special AIX common-mode calls, and will not use
the MQ register. Specifying both -mpower and -mpowerpc
permits GCC to use any instruction from either architecture and to
allow use of the MQ register; specify this for the Motorola MPC601.
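The four combinations of -mpower and -mpowerpc described above can be summarized in a small model. This is only a sketch of the rules as stated in this section, not of how GCC implements them; the function name and the dictionary keys are hypothetical, and the special AIX common-mode calls are left out.

```python
def allowed_features(power: bool, powerpc: bool) -> dict:
    """Model the -mpower / -mpowerpc combinations described in the text."""
    return {
        # Instructions shared by both architectures are always usable.
        "common_subset": True,
        # -mpower permits POWER-only instructions and the MQ register.
        "power_only": power,
        "mq_register": power,
        # -mpowerpc permits instructions found only in 32-bit PowerPC.
        "powerpc_only": powerpc,
    }


# -mno-power -mno-powerpc: common subset only, MQ register unused.
print(allowed_features(power=False, powerpc=False))
# -mpower -mpowerpc: the combination the text suggests for the MPC601.
print(allowed_features(power=True, powerpc=True))
```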
-mnew-mnemonics, -mold-mnemonics
Select which mnemonics to use in the generated assembler code. With
-mnew-mnemonics, GCC uses the assembler mnemonics defined for
the PowerPC architecture. With -mold-mnemonics it uses the
assembler mnemonics defined for the POWER architecture. Instructions
defined in only one architecture have only one mnemonic; GCC uses that
mnemonic irrespective of which of these options is specified.
GCC defaults to the mnemonics appropriate for the architecture in
use. Specifying -mcpu=cpu_type sometimes overrides the
value of these options. Unless you are building a cross-compiler, you
should normally not specify either -mnew-mnemonics or
-mold-mnemonics, but should instead accept the default.
-mcpu=cpu_type
Set architecture type, register usage, choice of mnemonics, and
instruction scheduling parameters for machine type cpu_type.
Supported values for cpu_type are 401, 403,
405, 405fp, 440, 440fp, 505,
601, 602, 603, 603e, 604,
604e, 620, 630, 740, 7400,
7450, 750, 801, 821, 823,
860, 970, common, ec603e, G3,
G4, G5, power, power2, power3,
power4, power5, powerpc, powerpc64,
rios, rios1, rios2, rsc, and rs64a.
-mcpu=common selects a completely generic processor. Code
generated under this option will run on any POWER or PowerPC processor.
GCC will use only the instructions in the common subset of both
architectures, and will not use the MQ register. GCC assumes a generic
processor model for scheduling purposes.
-mcpu=power, -mcpu=power2, -mcpu=powerpc, and
-mcpu=powerpc64 specify generic POWER, POWER2, pure 32-bit
PowerPC (i.e., not MPC601), and 64-bit PowerPC architecture machine
types, with an appropriate, generic processor model assumed for
scheduling purposes.
The other options specify a specific processor. Code generated under
those options will run best on that processor, and may not run at all on
others.
The -mcpu options automatically enable or disable the
following options: -maltivec, -mhard-float,
-mmfcrf, -mmultiple, -mnew-mnemonics,
-mpower, -mpower2, -mpowerpc64,
-mpowerpc-gpopt, -mpowerpc-gfxopt,
-mstring. The particular options set for any particular CPU
will vary between compiler versions, depending on what setting seems
to produce optimal code for that CPU; it doesn't necessarily reflect
the actual hardware's capabilities. If you wish to set an individual
option to a particular value, you may specify it after the
-mcpu option, like -mcpu=970 -mno-altivec.
On AIX, the -maltivec and -mpowerpc64 options are
not enabled or disabled by the -mcpu option at present, since
AIX does not have full support for these options. You may still
enable or disable them individually if you're sure it'll work in your
environment.
-mtune=cpu_type
Set the instruction scheduling parameters for machine type
cpu_type, but do not set the architecture type, register usage, or
choice of mnemonics, as -mcpu=cpu_type would. The same
values for cpu_type are used for -mtune as for
-mcpu. If both are specified, the code generated will use the
architecture, registers, and mnemonics set by -mcpu, but the
scheduling parameters set by -mtune.
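The interaction between -mcpu and -mtune can be sketched as follows. This is a simplified model of the rule stated above, under the assumption that register usage and mnemonics follow the architecture setting; the function name and the configured_default parameter (standing in for whatever default was chosen when GCC was configured) are hypothetical.

```python
def effective_settings(mcpu=None, mtune=None, configured_default="common"):
    """Return which cpu_type governs the architecture and the scheduling."""
    # -mcpu sets the architecture; without it the configured default applies.
    architecture = mcpu if mcpu is not None else configured_default
    # -mtune overrides only the scheduling parameters.
    scheduling = mtune if mtune is not None else architecture
    return {"architecture": architecture, "scheduling": scheduling}


# Instructions chosen for the 750, scheduling tuned for the 7400.
print(effective_settings(mcpu="750", mtune="7400"))
```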
-maltivec, -mno-altivec
These switches enable or disable the use of built-in functions that
allow access to the AltiVec instruction set. You may also need to set
-mabi=altivec to adjust the current ABI with AltiVec ABI
enhancements.
-mabi=spe
Extend the current ABI with SPE ABI extensions. This does not change
the default ABI, instead it adds the SPE ABI extensions to the current
ABI.
-mabi=no-spe
Disable Booke SPE ABI extensions for the current ABI.
-misel=yes/no, -misel
This switch enables or disables the generation of ISEL instructions.
-mspe=yes/no, -mspe
This switch enables or disables the generation of SPE SIMD
instructions.
-mfloat-gprs=yes/no, -mfloat-gprs
This switch enables or disables the generation of floating point
operations on the general purpose registers for architectures that
support it. This option is currently only available on the MPC8540.
-mfull-toc, -mno-fp-in-toc, -mno-sum-in-toc, -mminimal-toc
Modify generation of the TOC (Table Of Contents), which is created for
every executable file. The -mfull-toc option is selected by
default. In that case, GCC will allocate at least one TOC entry for
each unique non-automatic variable reference in your program. GCC
will also place floating-point constants in the TOC. However, only
16,384 entries are available in the TOC.
If you receive a linker error message saying that you have overflowed
the available TOC space, you can reduce the amount of TOC space used
with the -mno-fp-in-toc and -mno-sum-in-toc options.
-mno-fp-in-toc prevents GCC from putting floating-point
constants in the TOC and -mno-sum-in-toc forces GCC to
generate code to calculate the sum of an address and a constant at
run-time instead of putting that sum into the TOC. You may specify one
or both of these options. Each causes GCC to produce very slightly
slower and larger code in exchange for conserving TOC space.
If you still run out of space in the TOC even when you specify both of
these options, specify -mminimal-toc instead. This option causes
GCC to make only one TOC entry for every file. When you specify this
option, GCC will produce code that is slower and larger but which
uses extremely little TOC space. You may wish to use this option
only on files that contain less frequently executed code.
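To see how the TOC options trade off against the 16,384-entry limit, the following rough estimate may help. It is a sketch only: the per-file counts are hypothetical inputs you would have to supply yourself, the function name is invented for illustration, and the real number of entries GCC allocates can be higher ("at least one" entry per variable reference).

```python
TOC_LIMIT = 16384  # entries available in the TOC, as stated in the text


def estimate_toc_entries(files, fp_in_toc=True, sum_in_toc=True, minimal_toc=False):
    """Roughly estimate TOC entries for a program made of several files."""
    if minimal_toc:
        # -mminimal-toc: only one TOC entry for every file.
        return len(files)
    total = 0
    for f in files:
        total += f["variables"]          # unique non-automatic variable references
        if fp_in_toc:
            total += f["fp_constants"]   # dropped by -mno-fp-in-toc
        if sum_in_toc:
            total += f["address_sums"]   # dropped by -mno-sum-in-toc
    return total


files = [
    {"variables": 100, "fp_constants": 20, "address_sums": 5},
    {"variables": 50, "fp_constants": 0, "address_sums": 10},
]
print(estimate_toc_entries(files) <= TOC_LIMIT)
```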
-maix64, -maix32
Enable 64-bit AIX ABI and calling convention: 64-bit pointers, 64-bit
long type, and the infrastructure needed to support them.
Specifying -maix64 implies -mpowerpc64 and
-mpowerpc, while -maix32 disables the 64-bit ABI and
implies -mno-powerpc64. GCC defaults to -maix32.
-mxl-call, -mno-xl-call
On AIX, pass floating-point arguments to prototyped functions beyond the
register save area (RSA) on the stack in addition to argument FPRs. The
AIX calling convention was extended but not initially documented to
handle an obscure K&R C case of calling a function that takes the
address of its arguments with fewer arguments than declared. AIX XL
compilers access floating point arguments which do not fit in the
RSA from the stack when a subroutine is compiled without
optimization. Because always storing floating-point arguments on the
stack is inefficient and rarely needed, this option is not enabled by
default and only is necessary when calling subroutines compiled by AIX
XL compilers without optimization.
-mpe
Support IBM RS/6000 SP Parallel Environment (PE). Link an
application written to use message passing with special startup code to
enable the application to run. The system must have PE installed in the
standard location (/usr/lpp/ppe.poe/), or the specs file
must be overridden with the -specs= option to specify the
appropriate directory location. The Parallel Environment does not
support threads, so the -mpe option and the -pthread
option are incompatible.
-malign-natural, -malign-power
On Darwin and 64-bit PowerPC GNU/Linux, the option
-malign-natural overrides the ABI-defined alignment of larger
types, such as floating-point doubles, and aligns them on their natural
size-based boundary.
The option -malign-power instructs GCC to follow the ABI-specified
alignment rules. GCC defaults to the standard alignment defined in the ABI.
-msoft-float, -mhard-float
Generate code that does not use (uses) the floating-point register set.
Software floating point emulation is provided if you use the
-msoft-float option, and pass the option to GCC when linking.
-mmultiple, -mno-multiple
Generate code that uses (does not use) the load multiple word
instructions and the store multiple word instructions. These
instructions are generated by default on POWER systems, and not
generated on PowerPC systems. Do not use -mmultiple on little
endian PowerPC systems, since those instructions do not work when the
processor is in little endian mode. The exceptions are PPC740 and
PPC750, which permit use of these instructions in little endian mode.
-mstring, -mno-string
Generate code that uses (does not use) the load string instructions
and the store string word instructions to save multiple registers and
do small block moves. These instructions are generated by default on
POWER systems, and not generated on PowerPC systems. Do not use
-mstring on little endian PowerPC systems, since those
instructions do not work when the processor is in little endian mode.
The exceptions are PPC740 and PPC750, which permit use of these
instructions in little endian mode.
-mupdate, -mno-update
Generate code that uses (does not use) the load or store instructions
that update the base register to the address of the calculated memory
location. These instructions are generated by default. If you use
-mno-update, there is a small window between the time that the
stack pointer is updated and the address of the previous frame is
stored, which means code that walks the stack frame across interrupts or
signals may get corrupted data.
-mfused-madd, -mno-fused-madd
Generate code that uses (does not use) the floating point multiply and
accumulate instructions. These instructions are generated by default if
hardware floating is used.
-mno-bit-align, -mbit-align
On System V.4 and embedded PowerPC systems do not (do) force structures
and unions that contain bit-fields to be aligned to the base type of the
bit-field.
For example, by default a structure containing nothing but 8
unsigned bit-fields of length 1 would be aligned to a 4 byte
boundary and have a size of 4 bytes. By using -mno-bit-align,
the structure would be aligned to a 1 byte boundary and be one byte in
size.
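The effect of the base type on layout can be observed from Python with the standard ctypes module, which follows the host platform's default C layout rules. This is an illustration under that assumption: ctypes cannot apply -mno-bit-align itself, so the second structure uses a one-byte base type merely to show the one-byte layout the text describes. The class names are hypothetical.

```python
import ctypes


class WordFlags(ctypes.Structure):
    # Eight 1-bit fields whose base type is unsigned int.
    _fields_ = [(f"bit{i}", ctypes.c_uint, 1) for i in range(8)]


class ByteFlags(ctypes.Structure):
    # The same eight 1-bit fields, but with a one-byte base type.
    _fields_ = [(f"bit{i}", ctypes.c_ubyte, 1) for i in range(8)]


for cls in (WordFlags, ByteFlags):
    print(cls.__name__, ctypes.sizeof(cls), ctypes.alignment(cls))
```

With the default alignment rules, the first structure takes the size and alignment of its base type, while the second occupies a single byte.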
-mno-strict-align, -mstrict-align
On System V.4 and embedded PowerPC systems do not (do) assume that
unaligned memory references will be handled by the system.
-mrelocatable, -mno-relocatable
On embedded PowerPC systems generate code that allows (does not allow)
the program to be relocated to a different address at runtime. If you
use -mrelocatable on any module, all objects linked together must
be compiled with -mrelocatable or -mrelocatable-lib.
-mrelocatable-lib, -mno-relocatable-lib
On embedded PowerPC systems generate code that allows (does not allow)
the program to be relocated to a different address at runtime. Modules
compiled with -mrelocatable-lib can be linked with either modules
compiled without -mrelocatable and -mrelocatable-lib or
with modules compiled with the -mrelocatable options.
-mno-toc, -mtoc
On System V.4 and embedded PowerPC systems do not (do) assume that
register 2 contains a pointer to a global area pointing to the addresses
used in the program.
-mlittle, -mlittle-endian
On System V.4 and embedded PowerPC systems compile code for the
processor in little endian mode. The -mlittle-endian option is
the same as -mlittle.
-mbig, -mbig-endian
On System V.4 and embedded PowerPC systems compile code for the
processor in big endian mode. The -mbig-endian option is
the same as -mbig.
-mdynamic-no-pic
On Darwin systems, compile code so that it is not
relocatable, but that its external references are relocatable. The
resulting code is suitable for applications, but not shared
libraries.
-mprioritize-restricted-insns=priority
This option controls the priority that is assigned to
dispatch-slot restricted instructions during the second scheduling
pass. The argument priority takes the value 0/1/2 to assign
no/highest/second-highest priority to dispatch slot restricted
instructions.
-msched-costly-dep=dependence_type
This option controls which dependences are considered costly
by the target during instruction scheduling. The argument
dependence_type takes one of the following values:
no: no dependence is costly,
all: all dependences are costly,
true_store_to_load: a true dependence from store to load is costly,
store_to_load: any dependence from store to load is costly,
number: any dependence whose latency >= number is costly.
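The five choices for dependence_type amount to a simple predicate. The sketch below restates the list above in code; the function name and the way a dependence is described (a store-to-load flag, a true-dependence flag, and a latency) are assumptions made for illustration and do not reflect GCC's internal data structures.

```python
def is_costly(option, store_to_load, true_dependence, latency):
    """Decide whether a dependence counts as costly under -msched-costly-dep."""
    if option == "no":
        return False                      # no dependence is costly
    if option == "all":
        return True                       # all dependences are costly
    if option == "true_store_to_load":
        return store_to_load and true_dependence
    if option == "store_to_load":
        return store_to_load
    # Otherwise the option is a number: compare against the latency.
    return latency >= int(option)


print(is_costly("store_to_load", store_to_load=True, true_dependence=False, latency=1))
print(is_costly("3", store_to_load=False, true_dependence=True, latency=2))
```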
-minsert-sched-nops=scheme
This option controls which nop insertion scheme will be used during
the second scheduling pass. The argument scheme takes one of the
following values:
no: Don't insert nops.
pad: Pad with nops any dispatch group which has vacant issue slots,
according to the scheduler's grouping.
regroup_exact: Insert nops to force costly dependent insns into
separate groups. Insert exactly as many nops as needed to force an insn
to a new group, according to the estimated processor grouping.
number: Insert nops to force costly dependent insns into
separate groups. Insert number nops to force an insn to a new group.
-mcall-sysv
On System V.4 and embedded PowerPC systems compile code using calling
conventions that adhere to the March 1995 draft of the System V
Application Binary Interface, PowerPC processor supplement. This is the
default unless you configured GCC using powerpc-*-eabiaix.
-mcall-sysv-eabi
Specify both -mcall-sysv and -meabi options.
-mcall-sysv-noeabi
Specify both -mcall-sysv and -mno-eabi options.
-mcall-linux
On System V.4 and embedded PowerPC systems compile code for the
Linux-based GNU system.
-maix-struct-return
Return all structures in memory (as specified by the AIX ABI).
-msvr4-struct-return
Return structures smaller than 8 bytes in registers (as specified by the
SVR4 ABI).
-mabi=altivec
Extend the current ABI with AltiVec ABI extensions. This does not
change the default ABI, instead it adds the AltiVec ABI extensions to
the current ABI.
-mabi=no-altivec
Disable AltiVec ABI extensions for the current ABI.
-mprototype, -mno-prototype
On System V.4 and embedded PowerPC systems assume that all calls to
variable argument functions are properly prototyped. Otherwise, the
compiler must insert an instruction before every non prototyped call to
set or clear bit 6 of the condition code register (CR) to
indicate whether floating point values were passed in the floating point
registers in case the function takes variable arguments. With
-mprototype, only calls to prototyped variable argument functions
will set or clear the bit.
-msim
On embedded PowerPC systems, assume that the startup module is called
sim-crt0.o and that the standard C libraries are libsim.a and
libc.a. This is the default for powerpc-*-eabisim
configurations.
-mmvme
On embedded PowerPC systems, assume that the startup module is called
crt0.o and the standard C libraries are libmvme.a and
libc.a.
-mads
On embedded PowerPC systems, assume that the startup module is called
crt0.o and the standard C libraries are libads.a and
libc.a.
-myellowknife
On embedded PowerPC systems, assume that the startup module is called
crt0.o and the standard C libraries are libyk.a and
libc.a.
-mvxworks
On System V.4 and embedded PowerPC systems, specify that you are
compiling for a VxWorks system.
-mwindiss
Specify that you are compiling for the WindISS simulation environment.
-memb
On embedded PowerPC systems, set the PPC_EMB bit in the ELF flags
header to indicate that eabi extended relocations are used.
-meabi, -mno-eabi
On System V.4 and embedded PowerPC systems do (do not) adhere to the
Embedded Applications Binary Interface (eabi) which is a set of
modifications to the System V.4 specifications. Selecting -meabi
means that the stack is aligned to an 8 byte boundary, a function
__eabi is called from main to set up the eabi
environment, and the -msdata option can use both r2 and
r13 to point to two separate small data areas. Selecting
-mno-eabi means that the stack is aligned to a 16 byte boundary,
no initialization function is called from main, and the
-msdata option will only use r13 to point to a single
small data area. The -meabi option is on by default if you
configured GCC using one of the powerpc*-*-eabi* options.
-msdata=eabi
On System V.4 and embedded PowerPC systems, put small initialized
const global and static data in the .sdata2 section, which
is pointed to by register r2. Put small initialized
non-const global and static data in the .sdata section,
which is pointed to by register r13. Put small uninitialized
global and static data in the .sbss section, which is adjacent to
the .sdata section. The -msdata=eabi option is
incompatible with the -mrelocatable option. The
-msdata=eabi option also sets the -memb option.
-msdata=sysv
On System V.4 and embedded PowerPC systems, put small global and static
data in the .sdata section, which is pointed to by register
r13. Put small uninitialized global and static data in the
.sbss section, which is adjacent to the .sdata section.
The -msdata=sysv option is incompatible with the
-mrelocatable option.
-msdata=default, -msdata
On System V.4 and embedded PowerPC systems, if -meabi is used,
compile code the same as -msdata=eabi, otherwise compile code the
same as -msdata=sysv.
-msdata-data
On System V.4 and embedded PowerPC systems, put small global and static
data in the .sdata section. Put small uninitialized global and
static data in the .sbss section. Do not use register r13
to address small data, however. This is the default behavior unless
other -msdata options are used.
-msdata=none, -mno-sdata
On embedded PowerPC systems, put all initialized global and static data
in the .data section, and all uninitialized data in the
.bss section.
-G num
On embedded PowerPC systems, put global and static items less than or
equal to num bytes into the small data or bss sections instead of
the normal data or bss section. By default, num is 8. The
-G num switch is also passed to the linker.
All modules should be compiled with the same -G num value.
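The -msdata variants and the -G num threshold together decide whether an object lands in a small data section. The following sketch restates those rules; the function name and parameters are hypothetical, the mode strings mirror the option spellings ("data" stands for -msdata-data and "none" for -msdata=none), and objects that do not qualify are simply reported as None rather than assigned to a particular normal section.

```python
def small_section(size, is_const, is_initialized, g_num=8, msdata="data", meabi=False):
    """Return the small-data section an object would use, or None."""
    if msdata not in ("eabi", "sysv", "default", "data", "none"):
        raise ValueError(f"unknown -msdata mode: {msdata}")
    if msdata == "none" or size > g_num:
        return None            # left to the normal data or bss sections
    if msdata == "default":
        # -msdata=default follows -meabi.
        msdata = "eabi" if meabi else "sysv"
    if not is_initialized:
        return ".sbss"
    if msdata == "eabi" and is_const:
        return ".sdata2"       # pointed to by r2
    return ".sdata"            # pointed to by r13 (not used for "data")


print(small_section(4, is_const=True, is_initialized=True, msdata="eabi"))
print(small_section(16, is_const=False, is_initialized=True, msdata="sysv"))
```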
-mregnames, -mno-regnames
On System V.4 and embedded PowerPC systems do (do not) emit register
names in the assembly language output using symbolic forms.
-mlongcall, -mno-longcall
Default to making all function calls via pointers, so that functions
which reside further than 32 megabytes (33,554,432 bytes) from the
current location can be called. This setting can be overridden by the
shortcall function attribute, or by #pragma longcall(0).
Some linkers are capable of detecting out-of-range calls and generating
glue code on the fly. On these systems, long calls are unnecessary and
generate slower code. As of this writing, the AIX linker can do this,
as can the GNU linker for PowerPC/64. It is planned to add this feature
to the GNU linker for 32-bit PowerPC systems as well.
On Mach-O (Darwin) systems, this option directs the compiler to emit
the glue for every direct call, and the Darwin linker decides whether
to use or discard it.
In the future, we may cause GCC to ignore all longcall specifications
when the linker is known to generate glue.
-pthread
Adds support for multithreading with the pthreads library.
This option sets flags for both the preprocessor and linker.
4.17.2. Darwin Options
These options are defined for all architectures running the Darwin operating
system. They are useful for compatibility with other Mac OS compilers.
-all_load
Loads all members of static archive libraries.
See man ld(1) for more information.
-arch_errors_fatal
Cause the errors having to do with files that have the wrong architecture
to be fatal.
-bind_at_load
Causes the output file to be marked such that the dynamic linker will
bind all undefined references when the file is loaded or launched.
-bundle
Produce a Mach-O bundle format file.
See man ld(1) for more information.
-bundle_loader executable
This specifies the executable that will be loading the build
output file being linked. See man ld(1) for more information.
These options are available for the Darwin linker. The Darwin linker man page
describes them in detail.
4.17.3. Intel 386 and AMD x86-64 Options
These -m options are defined for the i386 and x86-64 family of
computers:
-mtune=cpu-type
Tune to cpu-type everything applicable about the generated code, except
for the ABI and the set of available instructions. The choices for
cpu-type are:
i386
Intel's original i386 CPU.
i486
Intel's i486 CPU. (No scheduling is implemented for this chip.)
i586, pentium
Intel Pentium CPU with no MMX support.
pentium-mmx
Intel PentiumMMX CPU based on Pentium core with MMX instruction set support.
i686, pentiumpro
Intel PentiumPro CPU.
pentium2
Intel Pentium2 CPU based on PentiumPro core with MMX instruction set support.
pentium3, pentium3m
Intel Pentium3 CPU based on PentiumPro core with MMX and SSE instruction set
support.
pentium-m
Low power version of Intel Pentium3 CPU with MMX, SSE and SSE2 instruction set
support. Used by Centrino notebooks.
pentium4, pentium4m
Intel Pentium4 CPU with MMX, SSE and SSE2 instruction set support.
prescott
Improved version of Intel Pentium4 CPU with MMX, SSE, SSE2 and SSE3 instruction
set support.
nocona
Improved version of Intel Pentium4 CPU with 64-bit extensions, MMX, SSE,
SSE2 and SSE3 instruction set support.
k6
AMD K6 CPU with MMX instruction set support.
k6-2, k6-3
Improved versions of AMD K6 CPU with MMX and 3dNOW! instruction set support.
athlon, athlon-tbird
AMD Athlon CPU with MMX, 3dNOW!, enhanced 3dNOW! and SSE prefetch instructions
support.
athlon-4, athlon-xp, athlon-mp
Improved AMD Athlon CPU with MMX, 3dNOW!, enhanced 3dNOW! and full SSE
instruction set support.
k8, opteron, athlon64, athlon-fx
AMD K8 core based CPUs with x86-64 instruction set support. (This supersets
MMX, SSE, SSE2, 3dNOW!, enhanced 3dNOW! and 64-bit instruction set extensions.)
winchip-c6
IDT Winchip C6 CPU, handled in the same way as i486 with additional MMX instruction
set support.
winchip2
IDT Winchip2 CPU, handled in the same way as i486 with additional MMX and 3dNOW!
instruction set support.
c3
Via C3 CPU with MMX and 3dNOW! instruction set support. (No scheduling is
implemented for this chip.)
c3-2
Via C3-2 CPU with MMX and SSE instruction set support. (No scheduling is
implemented for this chip.)
While picking a specific cpu-type will schedule things appropriately
for that particular chip, the compiler will not generate any code that
does not run on the i386 without the -march=cpu-type option
being used.
-march=cpu-type
Generate instructions for the machine type cpu-type. The choices
for cpu-type are the same as for -mtune. Moreover,
specifying -march=cpu-type implies -mtune=cpu-type.
-mcpu=cpu-type
A deprecated synonym for -mtune.
-m386, -m486, -mpentium, -mpentiumpro
These options are synonyms for -mtune=i386, -mtune=i486,
-mtune=pentium, and -mtune=pentiumpro respectively.
These synonyms are deprecated.
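Build scripts that still carry the deprecated i386 spellings can be updated mechanically. The sketch below rewrites an argument list according to the synonyms listed above; the function and table names are hypothetical, and the -mcpu rewrite applies only to the i386 and x86-64 options described in this subsection (on the RS/6000 and PowerPC, -mcpu has a different meaning).

```python
DEPRECATED_SYNONYMS = {
    "-m386": "-mtune=i386",
    "-m486": "-mtune=i486",
    "-mpentium": "-mtune=pentium",
    "-mpentiumpro": "-mtune=pentiumpro",
}


def modernize_x86_args(args):
    """Replace deprecated i386 tuning options with their -mtune spelling."""
    result = []
    for arg in args:
        if arg in DEPRECATED_SYNONYMS:
            result.append(DEPRECATED_SYNONYMS[arg])
        elif arg.startswith("-mcpu="):
            # On i386/x86-64, -mcpu is a deprecated synonym for -mtune.
            result.append("-mtune=" + arg[len("-mcpu="):])
        else:
            result.append(arg)
    return result


print(modernize_x86_args(["-O2", "-m486", "-mcpu=pentium3"]))
```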
-mfpmath=unit
Generate floating point arithmetic for the selected unit unit. The choices
for unit are:
387
Use the standard 387 floating point coprocessor present on the majority of chips and
emulated otherwise. Code compiled with this option will run almost everywhere.
The temporary results are computed in 80-bit precision instead of the precision
specified by the type, resulting in slightly different results compared to most
other chips. See -ffloat-store for a more detailed description.
This is the default choice for i386 compiler.
sse
Use scalar floating point instructions present in the SSE instruction set.
This instruction set is supported by Pentium3 and newer chips, and in the AMD line
by Athlon-4, Athlon-xp and Athlon-mp chips. The earlier version of the SSE
instruction set supports only single precision arithmetic, thus double and
extended precision arithmetic is still done using the 387. A later version, present
only in Pentium4 and the future AMD x86-64 chips, supports double precision
arithmetic too.
For the i386 compiler, you need to use -march=cpu-type, -msse or
-msse2 switches to enable SSE extensions and make this option
effective. For the x86-64 compiler, these extensions are enabled by default.
The resulting code should be considerably faster in the majority of cases and avoid
the numerical instability problems of 387 code, but may break some existing
code that expects temporaries to be 80-bit.
This is the default choice for the x86-64 compiler.
sse,387
Attempt to utilize both instruction sets at once. This effectively doubles the
number of available registers and, on chips with separate execution units for
387 and SSE, the execution resources too. Use this option with care, as it is
still experimental, because the GCC register allocator does not model separate
functional units well, resulting in unstable performance.
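Which unit ends up doing the arithmetic under -mfpmath=387 and -mfpmath=sse depends on the operand precision and on which SSE version is enabled. The sketch below models only those two choices (the experimental sse,387 mode is left out because the text does not say how work is divided); the function name and parameters are hypothetical.

```python
def unit_for(fpmath, precision, has_sse2):
    """Return the unit used for arithmetic of the given precision."""
    if fpmath == "387":
        return "387"
    if fpmath == "sse":
        if precision == "single":
            return "sse"
        if precision == "double" and has_sse2:
            return "sse"       # the later SSE version handles double too
        return "387"           # double without SSE2, and extended precision
    raise ValueError(f"unsupported -mfpmath value: {fpmath}")


print(unit_for("sse", "double", has_sse2=False))
print(unit_for("sse", "double", has_sse2=True))
```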
-masm=dialect
Output asm instructions using selected dialect. Supported choices are
intel or att (the default one).
-mieee-fp, -mno-ieee-fp
Control whether or not the compiler uses IEEE floating point
comparisons. These handle correctly the case where the result of a
comparison is unordered.
-msoft-float
Generate output containing library calls for floating point.
Warning: the requisite libraries are not part of GCC.
Normally the facilities of the machine's usual C compiler are used, but
this can't be done directly in cross-compilation. You must make your
own arrangements to provide suitable library functions for
cross-compilation.
On machines where a function returns floating point results in the 80387
register stack, some floating point opcodes may be emitted even if
-msoft-float is used.
-mno-fp-ret-in-387
Do not use the FPU registers for return values of functions.
The usual calling convention has functions return values of types
float and double in an FPU register, even if there
is no FPU. The idea is that the operating system should emulate
an FPU.
The option -mno-fp-ret-in-387 causes such values to be returned
in ordinary CPU registers instead.
-mno-fancy-math-387
Some 387 emulators do not support the sin, cos and
sqrt instructions for the 387. Specify this option to avoid
generating those instructions. This option is the default on FreeBSD,
OpenBSD and NetBSD. This option is overridden when -march
indicates that the target cpu will always have an FPU and so the
instruction will not need emulation. As of revision 2.6.1, these
instructions are not generated unless you also use the
-funsafe-math-optimizations switch.
-malign-double, -mno-align-double
Control whether GCC aligns double, long double, and
long long variables on a two word boundary or a one word
boundary. Aligning double variables on a two word boundary will
produce code that runs somewhat faster on a Pentium at the
expense of more memory.
Warning: if you use the -malign-double switch,
structures containing the above types will be aligned differently than
the published application binary interface specifications for the 386
and will not be binary compatible with structures in code compiled
without that switch.
-m96bit-long-double, -m128bit-long-double
These switches control the size of long double type. The i386
application binary interface specifies the size to be 96 bits,
so -m96bit-long-double is the default in 32 bit mode.
Modern architectures (Pentium and newer) would prefer long double
to be aligned to an 8 or 16 byte boundary. In arrays or structures
conforming to the ABI, this would not be possible. So specifying
-m128bit-long-double will align long double
to a 16 byte boundary by padding the long double with an additional
32 bit zero.
In the x86-64 compiler, -m128bit-long-double is the default choice as
its ABI specifies that long double is to be aligned on 16 byte boundary.
Notice that neither of these options enables any extra precision over the x87
standard of 80 bits for a long double.
Warning: if you override the default value for your target ABI, the
structures and arrays containing long double variables will change
their size, and the calling convention for functions taking
long double will be modified. Hence they will not be binary
compatible with arrays or structures in code compiled without that switch.
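The arithmetic behind the two sizes can be worked through quickly. The sketch below computes how much of each storage size is padding around the 80-bit x87 value and how an array's size changes between the two options; the helper names are hypothetical. The final line prints the size of long double on the machine running the script, which depends on that platform's ABI and so is not shown here.

```python
import ctypes

X87_EXTENDED_BITS = 80  # precision of the x87 format, per the text


def padding_bits(storage_bits):
    """Bits of padding when long double occupies storage_bits."""
    return storage_bits - X87_EXTENDED_BITS


def array_bytes(count, storage_bits):
    """Size in bytes of an array of count long double elements."""
    return count * storage_bits // 8


# 16 bits of padding at 96 bits; 32 more at 128 bits.
print(padding_bits(96), padding_bits(128))
# The same 10-element array under each option.
print(array_bytes(10, 96), array_bytes(10, 128))
# Size of long double under the host platform's ABI.
print(ctypes.sizeof(ctypes.c_longdouble))
```

The difference in array size is why objects built with different settings are not binary compatible.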
-msvr3-shlib, -mno-svr3-shlib
Control whether GCC places uninitialized local variables into the
bss or data segments. -msvr3-shlib places them
into bss. These options are meaningful only on System V Release 3.
-mrtd
Use a different function-calling convention, in which functions that
take a fixed number of arguments return with the ret num
instruction, which pops their arguments while returning. This saves one
instruction in the caller since there is no need to pop the arguments
there.
You can specify that an individual function is called with this calling
sequence with the function attribute stdcall. You can also
override the -mrtd option by using the function attribute
cdecl. See Section 6.25 Declaring Attributes of Functions.
Warning: this calling convention is incompatible with the one
normally used on Unix, so you cannot use it if you need to call
libraries compiled with the Unix compiler.
Also, you must provide function prototypes for all functions that
take variable numbers of arguments (including printf);
otherwise incorrect code will be generated for calls to those
functions.
In addition, seriously incorrect code will result if you call a
function with too many arguments. (Normally, extra arguments are
harmlessly ignored.)
-mregparm=num
Control how many registers are used to pass integer arguments. By
default, no registers are used to pass arguments, and at most 3
registers can be used. You can control this behavior for a specific
function by using the function attribute regparm.
See Section 6.25 Declaring Attributes of Functions.
Warning: if you use this switch, and
num is nonzero, then you must build all modules with the same
value, including any libraries. This includes the system libraries and
startup modules.
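To limit register passing to a single function rather than a whole build,
the regparm attribute can be used. The following is a minimal sketch; the
macro IN_REGS and the function clamp are hypothetical names, and the
attribute is applied only on 32-bit x86.

```c
/* regparm is meaningful only on 32-bit x86; elsewhere the macro is
   empty (an assumption made to keep the sketch portable). */
#if defined(__GNUC__) && defined(__i386__)
#define IN_REGS(n) __attribute__((regparm(n)))
#else
#define IN_REGS(n)
#endif

/* Up to three integer arguments are passed in registers. */
IN_REGS(3) int clamp(int v, int lo, int hi)
{
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}
```

Because the attribute is part of the function's declaration, every caller
that sees the declaration uses the same convention, which avoids the
whole-program rebuild that the command-line switch requires.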
-mpreferred-stack-boundary=num
Attempt to keep the stack boundary aligned to a 2 raised to num
byte boundary. If -mpreferred-stack-boundary is not specified,
the default is 4 (16 bytes or 128 bits), except when optimizing for code
size (-Os), in which case the default is the minimum correct
alignment (4 bytes for x86, and 8 bytes for x86-64).
On Pentium and PentiumPro, double and long double values
should be aligned to an 8 byte boundary (see -malign-double) or
suffer significant run time performance penalties. On Pentium III, the
Streaming SIMD Extension (SSE) data type __m128 suffers similar
penalties if it is not 16 byte aligned.
To ensure proper alignment of these values on the stack, the stack boundary
must be as aligned as that required by any value stored on the stack.
Further, every function must be generated such that it keeps the stack
aligned. Thus calling a function compiled with a higher preferred
stack boundary from a function compiled with a lower preferred stack
boundary will most likely misalign the stack. It is recommended that
libraries that use callbacks always use the default setting.
This extra alignment does consume extra stack space, and generally
increases code size. Code that is sensitive to stack space usage, such
as embedded systems and operating system kernels, may want to reduce the
preferred alignment to -mpreferred-stack-boundary=2.
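The relationship between num and the resulting boundary is simple
arithmetic. The following is a minimal sketch with hypothetical helper
names; it models the calculation only and does not inspect the real stack.

```c
#include <stdint.h>

/* Boundary in bytes for a given -mpreferred-stack-boundary value:
   2 raised to num. */
unsigned long boundary_bytes(unsigned num)
{
    return 1UL << num;
}

/* Round an address down to that boundary, the direction in which a
   downward-growing stack would be adjusted. */
uintptr_t align_down(uintptr_t addr, unsigned num)
{
    return addr & ~((uintptr_t)boundary_bytes(num) - 1);
}
```

For example, num = 4 gives a 16 byte boundary and num = 2 gives a 4 byte
boundary.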
-mmmx, -mno-mmx, -msse, -mno-sse, -msse2, -mno-sse2, -msse3, -mno-sse3, -m3dnow, -mno-3dnow
These switches enable or disable the use of built-in functions that allow
direct access to the MMX, SSE, SSE2, SSE3 and 3DNow! extensions of the
instruction set.
To have SSE/SSE2 instructions generated automatically from floating-point
code, see -mfpmath=sse.
-mpush-args, -mno-push-args
Use PUSH operations to store outgoing parameters. This method is shorter
and usually as fast as the method using SUB/MOV operations, and is enabled
by default. In some cases disabling it may improve performance because of
improved scheduling and reduced dependencies.
-maccumulate-outgoing-args
If enabled, the maximum amount of space required for outgoing arguments will be
computed in the function prologue. This is faster on most modern CPUs
because of reduced dependencies, improved scheduling and reduced stack usage
when preferred stack boundary is not equal to 2. The drawback is a notable
increase in code size. This switch implies -mno-push-args.
-mthreads
Support thread-safe exception handling on Mingw32. Code that relies
on thread-safe exception handling must compile and link all code with the
-mthreads option. When compiling, -mthreads defines
-D_MT; when linking, it links in a special thread helper library
-lmingwthrd which cleans up per thread exception handling data.
-mno-align-stringops
Do not align destination of inlined string operations. This switch reduces
code size and improves performance in case the destination is already aligned,
but GCC doesn't know about it.
-minline-all-stringops
By default GCC inlines string operations only when the destination is known
to be aligned to at least a 4 byte boundary. This option enables more
inlining and increases code size, but may improve performance of code that
depends on fast memcpy, strlen and memset for short lengths.
-momit-leaf-frame-pointer
Don't keep the frame pointer in a register for leaf functions. This
avoids the instructions to save, set up and restore frame pointers and
makes an extra register available in leaf functions. The option
-fomit-frame-pointer removes the frame pointer for all functions,
which might make debugging harder.
-mtls-direct-seg-refs, -mno-tls-direct-seg-refs
Controls whether TLS variables may be accessed with offsets from the
TLS segment register (%gs for 32-bit, %fs for 64-bit),
or whether the thread base pointer must be added. Whether or not this
is legal depends on the operating system, and whether it maps the
segment to cover the entire TLS area.
For systems that use GNU libc, the default is on.
These -m switches are supported in addition to the above
on AMD x86-64 processors in 64-bit environments.
-m32, -m64
Generate code for a 32-bit or 64-bit environment.
The 32-bit environment sets int, long and pointer to 32 bits and
generates code that runs on any i386 system.
The 64-bit environment sets int to 32 bits and long and pointer
to 64 bits and generates code for AMD's x86-64 architecture.
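The type widths described above can be checked from within a program. The
following is a minimal sketch; the function names are hypothetical. Building
the same file with -m32 and with -m64 and comparing the results shows the
difference.

```c
#include <limits.h>
#include <stddef.h>

/* Width of int in bits; 32 in both environments described above. */
size_t int_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}

/* Width of long in bits; 32 with -m32, 64 with -m64. */
size_t long_bits(void)
{
    return sizeof(long) * CHAR_BIT;
}

/* Width of a data pointer in bits; matches long in both environments. */
size_t pointer_bits(void)
{
    return sizeof(void *) * CHAR_BIT;
}
```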
-mno-red-zone
Do not use a so-called red zone for x86-64 code. The red zone is mandated
by the x86-64 ABI; it is a 128-byte area beyond the location of the
stack pointer that will not be modified by signal or interrupt handlers
and therefore can be used for temporary data without adjusting the stack
pointer. The flag -mno-red-zone disables this red zone.
-mcmodel=small
Generate code for the small code model: the program and its symbols must
be linked in the lower 2 GB of the address space. Pointers are 64 bits.
Programs can be statically or dynamically linked. This is the default
code model.
-mcmodel=kernel
Generate code for the kernel code model. The kernel runs in the
negative 2 GB of the address space.
This model has to be used for Linux kernel code.
-mcmodel=medium
Generate code for the medium model: The program is linked in the lower 2
GB of the address space but symbols can be located anywhere in the
address space. Programs can be statically or dynamically linked, but
building shared libraries is not supported with the medium model.
-mcmodel=large
Generate code for the large model: This model makes no assumptions
about addresses and sizes of sections. Currently GCC does not implement
this model.
4.17.4. IA-64 Options
These are the -m options defined for the Intel IA-64 architecture.
-mbig-endian
Generate code for a big endian target. This is the default for HP-UX.
-mlittle-endian
Generate code for a little endian target. This is the default for AIX5
and GNU/Linux.
-mgnu-as, -mno-gnu-as
Generate (or don't) code for the GNU assembler. This is the default.
-mgnu-ld, -mno-gnu-ld
Generate (or don't) code for the GNU linker. This is the default.
-mno-pic
Generate code that does not use a global pointer register. The result
is not position independent code, and violates the IA-64 ABI.
-mvolatile-asm-stop, -mno-volatile-asm-stop
Generate (or don't) a stop bit immediately before and after volatile asm
statements.
-mb-step
Generate code that works around Itanium B step errata.
-mregister-names, -mno-register-names
Generate (or don't) in, loc, and out register names for
the stacked registers. This may make assembler output more readable.
-mno-sdata, -msdata
Disable (or enable) optimizations that use the small data section. This may
be useful for working around optimizer bugs.
-mconstant-gp
Generate code that uses a single constant global pointer value. This is
useful when compiling kernel code.
-mauto-pic
Generate code that is self-relocatable. This implies -mconstant-gp.
This is useful when compiling firmware code.
-minline-float-divide-min-latency
Generate code for inline divides of floating point values
using the minimum latency algorithm.
-minline-float-divide-max-throughput
Generate code for inline divides of floating point values
using the maximum throughput algorithm.
-minline-int-divide-min-latency
Generate code for inline divides of integer values
using the minimum latency algorithm.
-minline-int-divide-max-throughput
Generate code for inline divides of integer values
using the maximum throughput algorithm.
-mno-dwarf2-asm, -mdwarf2-asm
Don't (or do) generate assembler code for the DWARF2 line number debugging
info. This may be useful when not using the GNU assembler.
-mfixed-range=register-range
Generate code treating the given register range as fixed registers.
A fixed register is one that the register allocator cannot use. This is
useful when compiling kernel code. A register range is specified as
two registers separated by a dash. Multiple register ranges can be
specified separated by a comma.
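The register-range syntax can be illustrated with a small checker. This is
only a sketch of the textual format described above, not GCC's own parser:
count_register_ranges is a hypothetical helper, and it does not verify that
the register names themselves are valid.

```c
#include <string.h>

/* Count the ranges in a specification such as "f12-f15,f32-f127".
   Returns -1 if an entry is not of the form "first-last" or if the
   list ends with a comma. */
int count_register_ranges(const char *spec)
{
    int count = 0;
    const char *p = spec;

    while (*p != '\0') {
        const char *comma = strchr(p, ',');
        size_t len = comma ? (size_t)(comma - p) : strlen(p);
        const char *dash = memchr(p, '-', len);

        /* A dash must be present with a name on each side of it. */
        if (dash == NULL || dash == p || dash == p + len - 1)
            return -1;
        count++;

        if (comma == NULL)
            break;
        p = comma + 1;
        if (*p == '\0')
            return -1; /* trailing comma */
    }
    return count;
}
```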
-mearly-stop-bits, -mno-early-stop-bits
Allow stop bits to be placed earlier than immediately preceding the
instruction that triggered the stop bit. This can improve instruction
scheduling, but does not always do so.
4.17.5. S/390 and zSeries Options
These are the -m options defined for the S/390 and zSeries architecture.
-mhard-float, -msoft-float
Use (do not use) the hardware floating-point instructions and registers
for floating-point operations. When -msoft-float is specified,
functions in libgcc.a will be used to perform floating-point
operations. When -mhard-float is specified, the compiler
generates IEEE floating-point instructions. This is the default.
-mbackchain, -mno-backchain, -mkernel-backchain
In order to provide a backchain the address of the caller's frame
is stored within the callee's stack frame.
A backchain may be needed to allow debugging using tools that do not understand
DWARF-2 call frame information.
For -mno-backchain no backchain is maintained at all, which is the
default.
If one of the other options is present the backchain pointer is placed either
on top of the stack frame (-mkernel-backchain) or on
the bottom (-mbackchain).
Besides the different backchain location, -mkernel-backchain
also changes the stack frame layout, breaking the ABI. This option
is intended for code that internally needs a backchain but has
to get by with a limited stack size, e.g. the Linux kernel.
Internal unwinding code not using DWARF-2 info has to be able to locate the
return address of a function. This is made easier by the fact that
the return address of a function is placed two words below the backchain
pointer.
-msmall-exec, -mno-small-exec
Generate (or do not generate) code using the bras instruction
to do subroutine calls.
This only works reliably if the total executable size does not
exceed 64k. The default is to use the basr instruction instead,
which does not have this limitation.
-m64, -m31
When -m31 is specified, generate code compliant with the
GNU/Linux for S/390 ABI. When -m64 is specified, generate
code compliant with the GNU/Linux for zSeries ABI. This allows GCC in
particular to generate 64-bit instructions. For the s390
targets, the default is -m31, while the s390x
targets default to -m64.
-mzarch, -mesa
When -mzarch is specified, generate code using the
instructions available on z/Architecture.
When -mesa is specified, generate code using the
instructions available on ESA/390. Note that -mesa is
not possible with -m64.
When generating code compliant with the GNU/Linux for S/390 ABI,
the default is -mesa. When generating code compliant
with the GNU/Linux for zSeries ABI, the default is -mzarch.
-mmvcle, -mno-mvcle
Generate (or do not generate) code using the mvcle instruction
to perform block moves. When -mno-mvcle is specified,
use a mvc loop instead. This is the default.
-mdebug, -mno-debug
Print (or do not print) additional debug information when compiling.
The default is to not print debug information.
-march=cpu-type
Generate code that will run on cpu-type, which is the name of a system
representing a certain processor type. Possible values for
cpu-type are g5, g6, z900, and z990.
When generating code using the instructions available on z/Architecture,
the default is -march=z900. Otherwise, the default is
-march=g5.
-mtune=cpu-type
Tune to cpu-type everything applicable about the generated code,
except for the ABI and the set of available instructions.
The list of cpu-type values is the same as for -march.
The default is the value used for -march.
-mfused-madd, -mno-fused-madd
Generate code that uses (does not use) the floating point multiply and
accumulate instructions. These instructions are generated by default if
hardware floating point is used.
-mwarn-framesize=framesize
Emit a warning if the current function exceeds the given frame size. Because
this is a compile-time check, a warning does not necessarily indicate a real
problem when the program runs. It is intended to identify functions that are
most likely to cause a stack overflow. It is useful in environments with
limited stack size, e.g. the Linux kernel.
-mwarn-dynamicstack
Emit a warning if the function calls alloca or uses dynamically
sized arrays. This is generally a bad idea with a limited stack size.
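As an illustration of the kind of code this warning targets, here is a
minimal sketch with hypothetical function names and a hypothetical
MAX_ITEMS limit. The first function uses a dynamically sized array, so its
frame size depends on a run-time value. The second uses a fixed upper bound,
so its frame size is known at compile time.

```c
#include <stddef.h>

/* Frame size depends on n: this is the pattern the warning flags. */
long sum_squares_vla(size_t n)
{
    long scratch[n > 0 ? n : 1];
    long total = 0;

    for (size_t i = 0; i < n; i++)
        scratch[i] = (long)(i * i);
    for (size_t i = 0; i < n; i++)
        total += scratch[i];
    return total;
}

/* Hypothetical fixed limit; the frame size is a compile-time constant. */
#define MAX_ITEMS 64

/* Returns -1 if n exceeds the fixed limit. */
long sum_squares_fixed(size_t n)
{
    long scratch[MAX_ITEMS];
    long total = 0;

    if (n > MAX_ITEMS)
        return -1;
    for (size_t i = 0; i < n; i++)
        scratch[i] = (long)(i * i);
    for (size_t i = 0; i < n; i++)
        total += scratch[i];
    return total;
}
```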
-mstack-size=stack-size, -mstack-guard=stack-guard
These arguments always have to be used in conjunction. If they are present, the
s390 back end emits additional instructions in the function prologue which
trigger a trap if the stack size is stack-guard bytes above the stack-size
(remember that the stack on s390 grows downward). These options are intended to
help debug stack overflow problems. The additionally emitted code
causes only little overhead and hence can also be used in production-like
systems without greater performance degradation. The given values have to be
exact powers of 2 and stack-size has to be greater than stack-guard.
In order to be efficient, the extra code makes the assumption that the stack
starts at an address aligned to the value given by stack-size. So don't expect
this to work correctly with an 8k stack size and an initial stack pointer like
0xffffefff.
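The constraints on the two values can be written down as simple checks. The
following is a minimal sketch with hypothetical function names; it treats
"aligned" as "an exact multiple of stack-size" and only models the stated
rules, not the code the back end emits.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if v is an exact power of 2. */
static bool is_power_of_two(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Both values must be powers of 2, and stack-size must be greater
   than stack-guard. */
bool stack_limits_valid(uint64_t stack_size, uint64_t stack_guard)
{
    return is_power_of_two(stack_size)
        && is_power_of_two(stack_guard)
        && stack_size > stack_guard;
}

/* The extra code assumes the stack starts at an address that is a
   multiple of stack-size. */
bool stack_start_aligned(uint64_t start, uint64_t stack_size)
{
    return is_power_of_two(stack_size)
        && (start & (stack_size - 1)) == 0;
}
```

With an 8k stack size, a starting address of 0xffffefff fails the alignment
check, which matches the example given above.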