GNU C Library (libc) Programming Guide

16.1 Socket Concepts

When you create a socket, you must specify the style of communication you want to use and the type of protocol that should implement it. The communication style of a socket defines the user-level semantics of sending and receiving data on the socket. Choosing a communication style specifies the answers to questions such as these:

What are the units of data transmission? Some communication styles regard the data as a sequence of bytes with no larger structure; others group the bytes into records (which are known in this context as packets).
Can data be lost during normal operation? Some communication styles guarantee that all the data sent arrives in the order it was sent (barring system or network crashes); other styles occasionally lose data as a normal part of operation, and may sometimes deliver packets more than once or in the wrong order.
Designing a program to use unreliable communication styles usually involves taking precautions to detect lost or misordered packets and to retransmit data as needed.
Is communication entirely with one partner? Some communication styles are like a telephone call—you make a connection with one remote socket and then exchange data freely. Other styles are like mailing letters—you specify a destination address for each message you send.

You must also choose a namespace for naming the socket. A socket name (“address”) is meaningful only in the context of a particular namespace. In fact, even the data type to use for a socket name may depend on the namespace. Namespaces are also called “domains”, but we avoid that word as it can be confused with other usage of the same term. Each namespace has a symbolic name that starts with `PF_'. A corresponding symbolic name starting with `AF_' designates the address format for that namespace.

Finally you must choose the protocol to carry out the communication. The protocol determines what low-level mechanism is used to transmit and receive data. Each protocol is valid for a particular namespace and communication style; a namespace is sometimes called a protocol family because of this, which is why the namespace names start with `PF_'.

The rules of a protocol apply to the data passing between two programs, perhaps on different computers; most of these rules are handled by the operating system and you need not know about them. What you do need to know about protocols is this:

In order to have communication between two sockets, they must specify the same protocol.
Each protocol is meaningful with particular style/namespace combinations and cannot be used with inappropriate combinations. For example, the TCP protocol fits only the byte stream style of communication and the Internet namespace.
For each combination of style and namespace there is a default protocol, which you can request by specifying 0 as the protocol number. And that's what you should normally do—use the default.

Throughout the following description at various places variables/parameters to denote sizes are required. And here the trouble starts. In the first implementations the type of these variables was simply int. On most machines at that time an int was 32 bits wide, which created a de facto standard requiring 32-bit variables. This is important since references to variables of this type are passed to the kernel.

Then the POSIX people came and unified the interface with the words "all size values are of type size_t". On 64-bit machines size_t is 64 bits wide, so pointers to variables were no longer possible.

The Unix98 specification provides a solution by introducing a type socklen_t. This type is used in all of the cases that POSIX changed to use size_t. The only requirement of this type is that it be an unsigned type of at least 32 bits. Therefore, implementations which require that references to 32-bit variables be passed can be as happy as implementations which use 64-bit values.