Autoconf, Automake, and Libtool User Guide: C Endianness

15.1.3 C Endianness

When a number longer than a single byte is stored in memory, it must be stored in some particular format. Modern systems do this by storing the number byte by byte such that the bytes can simply be concatenated into the final number. However, the order of storage varies: some systems store the least significant byte at the lowest address in memory, while some store the most significant byte there. These are referred to as little-endian and big-endian systems, respectively.(32)

This difference means that portable code may not make any assumptions about the order of storage of a number. For example, code like this will act differently on different systems:

/* Example of non-portable code; don't do this */ int i = 4; char c = *(char *) i;

Although that was a contrived example, real problems arise when writing numeric data in a file or across a network connection. If the file or network connection may be read on a different type of system, numeric data must be written in a format which can be unambiguously recovered. It is not portable to simply do something like

/* Example of non-portable code; don't do this */ write (fd, &i, sizeof i);
This example is non-portable both because of endianness and because it assumes that the size of the type of i are the same on both systems.

Instead, do something like this:

int j; char buf[4]; for (j = 0; j < 4; ++j) buf[j] = (i >> (j * 8)) & 0xff; write (fd, buf, 4); /* In real code, check the return value */
This unambiguously writes out a little endian 4 byte value. The code will work on any system, and the result can be read unambiguously on any system.

Another approach to handling endianness is to use the htons and ntohs functions available on most systems. These functions convert between network endianness and host endianness. Network endianness is big-endian; it has that name because the standard TCP/IP network protocols use big-endian ordering.

These functions come in two sizes: htonl and ntohl operate on 4-byte quantities, and htons and ntohs operate on 2-byte quantities. The hton functions convert host endianness to network endianness. The ntoh functions convert network endianness to host endianness. On big-endian systems, these functions simply return their arguments; on little-endian systems, they return their arguments after swapping the bytes.

Although these functions are used in a lot of existing code, they can be difficult to use in highly portable code, because they require knowing the exact size of your data types. If you know that the type int is exactly 4 bytes long, then it is possible to write code like the following:

int j; j = htonl (i); write (fd, &j, 4);
However, if int is not exactly 4 bytes long, this example will not work correctly on all systems.

This document was generated by Gary V. Vaughan on February, 8 2006 using texi2html