To understand why FreeBSD uses the elf(5) format, you
must first know a little about the three currently “dominant” executable
formats for UNIX®:
-
a.out(5)
The oldest and “classic” UNIX object
format. It uses a short and compact header with a magic number at the beginning that is
often used to characterize the format (see a.out(5) for more
details). It contains three loaded segments (.text, .data, and .bss), plus a symbol table
and a string table.
-
COFF
The SVR3 object format. The header now comprises a section table, so you can have more
than just .text, .data, and .bss sections.
-
elf(5)
The successor to COFF, featuring multiple sections
and support for either 32-bit or 64-bit values. One major drawback: ELF was designed on the assumption that there would be
only one ABI per system architecture. That assumption is quite incorrect; it does not
hold even in the commercial SYSV world, which has at least three ABIs: SVR4, Solaris,
and SCO.
FreeBSD works around this problem to some extent by providing a utility that brands a known ELF executable with information about the ABI it complies
with. See the manual page for brandelf(1) for more
information.
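As a rough illustration of how a leading magic number and a few identification bytes characterize a format, the following Python sketch inspects the first 16 bytes of a file. The function names describe_elf_ident and describe_file are hypothetical, chosen only for this example. The sketch assumes the layout of the generic ELF identification array (the magic bytes 0x7f 'E' 'L' 'F', a class byte at offset 4, and an OS/ABI byte at offset 7) and assumes that the brand is recorded in that OS/ABI byte; the table of ABI names is deliberately incomplete.

```python
# Minimal sketch: classify a file from its ELF identification bytes.
# Offsets and values follow the generic ELF e_ident layout; the name
# tables below are partial and are for illustration only.

ELF_MAGIC = b"\x7fELF"   # first four bytes of every ELF file
EI_CLASS = 4             # offset of the 32-bit/64-bit class byte
EI_OSABI = 7             # offset of the OS/ABI identification byte

CLASS_NAMES = {1: "32-bit", 2: "64-bit"}
OSABI_NAMES = {
    0: "System V / unspecified",
    3: "Linux",
    6: "Solaris",
    9: "FreeBSD",
}


def describe_elf_ident(header):
    """Return (class, osabi) for an ELF header, or None if it is not ELF."""
    if len(header) < 16 or header[:4] != ELF_MAGIC:
        return None
    elf_class = CLASS_NAMES.get(header[EI_CLASS], "unknown")
    osabi = OSABI_NAMES.get(header[EI_OSABI], "other (%d)" % header[EI_OSABI])
    return elf_class, osabi


def describe_file(path):
    """Read the first 16 bytes of a file and describe them (hypothetical helper)."""
    with open(path, "rb") as f:
        return describe_elf_ident(f.read(16))
```

Calling describe_file on a binary returns a pair such as the word size and the ABI name, or None for a file that does not start with the ELF magic number (an a.out binary, for instance). The OS/ABI byte is the kind of per-binary information that lets one kernel tell apart ELF executables built for different ABIs.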
FreeBSD comes from the “classic” camp and used the a.out(5) format, a
technology tried and proven through many generations of BSD releases, until the beginning
of the 3.X branch. Though it was possible to build and run native ELF binaries (and kernels) on a FreeBSD system for some time
before that, FreeBSD initially resisted the “push” to switch to ELF as the default format. Why? Well, when the Linux camp made
their painful transition to ELF, it was not so much to
flee the a.out executable format as to escape their inflexible
jump-table based shared library mechanism, which made the construction of shared
libraries very difficult for vendors and developers alike. Since the ELF tools available offered a solution to the shared library
problem and were generally seen as “the way forward” anyway, the migration
cost was accepted as necessary and the transition made. FreeBSD's shared library
mechanism is based more closely on Sun's SunOS™
style shared library mechanism and, as such, is very easy to use, so FreeBSD did not face the same pressure to switch.
So, why are there so many different formats?
Back in the dim, dark past, there was simple hardware. This simple hardware supported
a simple, small system. a.out was completely adequate for the
job of representing binaries on this simple system (a PDP-11). As people ported UNIX from this simple system, they retained the a.out format because it was sufficient for the early ports of UNIX to architectures like the Motorola 68k, VAXen, etc.
Then some bright hardware engineer decided that if he could force software to do some
sleazy tricks, then he would be able to shave a few gates off the design and allow his
CPU core to run faster. Although a.out was made to work with this new kind of hardware (known
these days as RISC), it was ill-suited to it, so many formats were developed to get
better performance from this hardware than the limited, simple a.out
format could offer. Things like COFF, ECOFF, and a few obscure others were invented and their
limitations explored before things seemed to settle on ELF.
In addition, program sizes were getting huge and disks (and physical memory) were
still relatively small, so the concept of a shared library was born. The VM system also
became more sophisticated. While each of these advancements was implemented using the a.out format, its usefulness was stretched more and more with each
new feature. In addition, people wanted to dynamically load things at run time, or to
junk parts of their program after the init code had run to save core memory and swap
space. Languages became more sophisticated and people wanted code called automatically
before main. Lots of hacks were done to the a.out format to
allow all of these things to happen, and they basically worked for a time. Eventually, a.out could not handle all these problems without an ever
increasing overhead in code and complexity. While ELF
solved many of these problems, it would have been painful to switch from a system that
basically worked. So ELF had to wait until it was more
painful to remain with a.out than it was to migrate to ELF.
However, as time passed, the tools from which FreeBSD derived its build tools
(the assembler and loader especially) evolved in two parallel trees. The FreeBSD tree
added shared libraries and fixed some bugs. The GNU folks who originally wrote these
programs rewrote them and added simpler support for building cross compilers, plugging in
different formats at will, and so on. Many people who wanted to build cross compilers
targeting FreeBSD were out of luck, because the older sources that FreeBSD had for as and ld were not up to the task. The
new GNU tool chain (binutils) does support cross compiling,
ELF, shared libraries, C++ extensions, etc. In
addition, many vendors are releasing ELF binaries, and
it is good for FreeBSD to be able to run them.
ELF is more expressive than a.out and allows more extensibility in the base system. The
ELF tools are better maintained, and offer cross
compilation support, which is important to many people. ELF may be a little slower than a.out, but the difference is difficult to measure. There are also
numerous details that differ between the two in how they map pages, handle init
code, etc. None of these are very important, but they are differences. In time, support
for a.out will be moved out of the GENERIC kernel, and eventually removed from the kernel once the
need to run legacy a.out programs is past.