|
15.1.3 C Endianness
When a number longer than a single byte is stored in memory, it must be
stored in some particular format. Modern systems do this by storing the
number byte by byte such that the bytes can simply be concatenated into
the final number. However, the order of storage varies: some systems
store the least significant byte at the lowest address in memory, while
some store the most significant byte there. These are referred to as
little-endian and big-endian systems,
respectively.(32)
This difference means that portable code may not make any assumptions
about the order of storage of a number. For example, code like this
will act differently on different systems:
| /* Example of non-portable code; don't do this */
int i = 4;
char c = *(char *) &i;
|
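The same technique can be used deliberately to find out which byte
order the host uses. The following is a minimal sketch, not something
portable code should depend on; the name host_is_little_endian is
hypothetical, and the sketch assumes that unsigned int is wider than
one byte.

```c
/* Hypothetical helper: return 1 if the host stores the least
   significant byte of an unsigned int at the lowest address,
   and 0 otherwise.  */
static int
host_is_little_endian (void)
{
  unsigned int probe = 1;

  /* Reading an object through an unsigned char pointer is always
     permitted; the first byte is 1 only on a little-endian host.  */
  return *(unsigned char *) &probe == 1;
}
```

A check like this only describes the machine the program runs on. It
does not help when deciding how to lay out data that other systems
will read.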
Although that was a contrived example, real problems arise when writing
numeric data in a file or across a network connection. If the file or
network connection may be read on a different type of system, numeric
data must be written in a format which can be unambiguously recovered.
It is not portable to simply do something like
| /* Example of non-portable code; don't do this */
write (fd, &i, sizeof i);
| This example is non-portable both because of endianness and because it
assumes that the size of the type of i is the same on both
systems.
Instead, do something like this:
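| int j;
unsigned char buf[4]; /* unsigned, so byte values above 127 are safe */
for (j = 0; j < 4; ++j)
  /* Shift as unsigned: right-shifting a negative int is not portable */
  buf[j] = ((unsigned int) i >> (j * 8)) & 0xff;
write (fd, buf, 4); /* In real code, check the return value */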
| This unambiguously writes out a little-endian 4-byte value. The code
will work on any system, and the result can be read unambiguously on any
system.
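The reader has to reverse the same steps. The following sketch pairs
an encoder with a matching decoder; the names put_le32 and get_le32 are
hypothetical, and unsigned long is used only because the C standard
guarantees that it holds at least 32 bits.

```c
/* Hypothetical helper: store the low 32 bits of VALUE in BUF,
   least significant byte first.  */
static void
put_le32 (unsigned char buf[4], unsigned long value)
{
  int j;

  for (j = 0; j < 4; ++j)
    buf[j] = (unsigned char) ((value >> (j * 8)) & 0xff);
}

/* Hypothetical helper: rebuild the value from 4 little-endian bytes.
   The result is the same whatever the byte order of the host.  */
static unsigned long
get_le32 (const unsigned char buf[4])
{
  unsigned long value = 0;
  int j;

  for (j = 0; j < 4; ++j)
    value |= (unsigned long) buf[j] << (j * 8);
  return value;
}
```

Because both functions work only with shifts and masks on values, not
with the in-memory layout of an integer, neither depends on the
endianness of the machine that runs it.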
Another approach to handling endianness is to use the hton
and ntoh families of functions available on most systems. These
functions convert between network endianness and host endianness.
Network endianness is big-endian; it has that name because the standard
TCP/IP network protocols use big-endian ordering.
These functions come in two sizes: htonl and ntohl operate
on 4-byte quantities, and htons and ntohs operate on
2-byte quantities. The hton functions convert host endianness to
network endianness. The ntoh functions convert network
endianness to host endianness. On big-endian systems, these functions
simply return their arguments; on little-endian systems, they return
their arguments after swapping the bytes.
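The effect of the conversion can be seen by looking at the bytes of
the result in memory. This is a small sketch that assumes a POSIX
system providing <arpa/inet.h> and a C99 <stdint.h>; the helper name
network_bytes_u16 is hypothetical.

```c
#include <arpa/inet.h>  /* htons (POSIX) */
#include <stdint.h>     /* uint16_t (C99) */
#include <string.h>     /* memcpy */

/* Hypothetical helper: convert VALUE to network order and copy the
   resulting in-memory bytes to OUT.  On any host, OUT[0] receives
   the most significant byte.  */
static void
network_bytes_u16 (uint16_t value, unsigned char out[2])
{
  uint16_t net = htons (value);

  memcpy (out, &net, sizeof net);
}
```

Passing 0x1234 should leave 0x12 in out[0] and 0x34 in out[1] on both
little-endian and big-endian hosts, and applying ntohs to the result of
htons returns the original value.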
Although these functions are used in a lot of existing code, they can be
difficult to use in highly portable code, because they require knowing
the exact size of your data types. If you know that the type int
is exactly 4 bytes long, then it is possible to write code like the
following:
| int j;
j = htonl (i);
write (fd, &j, 4);
| However, if int is not exactly 4 bytes long, this example will
not work correctly on all systems.
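One way to reduce the dependence on the size of int is to use a type
whose width is fixed. The sketch below assumes that the system
provides the C99 type uint32_t as well as the POSIX htonl and write
functions, which is not guaranteed on older systems; the name
write_u32_network is hypothetical.

```c
#include <arpa/inet.h>  /* htonl (POSIX) */
#include <stdint.h>     /* uint32_t (C99, assumed to be available) */
#include <unistd.h>     /* write (POSIX) */

/* Hypothetical helper: write VALUE to FD as exactly 4 bytes in
   network (big-endian) order.  Returns 0 on success, or -1 on an
   error or a short write.  */
static int
write_u32_network (int fd, uint32_t value)
{
  uint32_t net = htonl (value);

  return write (fd, &net, sizeof net) == (ssize_t) sizeof net ? 0 : -1;
}
```

A caller would convert its int to uint32_t before the call and must
still make sure that the value fits in 32 bits.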
|