Input and output. Talking to the user. Why your printer is a file.
In order for a program to do anything useful, it usually must do some
kind of input and output, whether input from the keyboard and output to
the screen, or input from and output to the computer's hard disk. While
the C language itself does not provide much in the way of input and
output functions, the GNU C Library contains so many facilities for
input and output that a whole book could be written about them. In this
chapter, we will focus on the basics. For more information on the
functions described in this chapter, and many more, we urge you to
consult the table of contents of the GNU C Library reference manual.
Most objects from which you can receive input and to which you can send
output on a GNU system are considered to be files -- not only are files
on your hard disk (such as object code files, C source code files, and
ordinary ASCII text files) considered to be files, but also such
peripherals as your printer, your keyboard, and your computer monitor.
When you write a C program that prompts the user for input from the
keyboard, your program is reading from, or accepting input from,
the keyboard, in much the same way that it would read a text string from
a text file. Similarly, when your C program displays a text string on
the user's monitor, it is writing to, or sending output to, the
terminal, just as though it were writing a text string to a text file.
In fact, in many cases you'll be using the very same functions to read
text from the keyboard and from text files, and to write text to the
terminal and to text files.
This curious fact will be explored later in the chapter. For now it is
sufficient to say that when a GNU system treats your computer's
peripherals as files, they are known as devices, and each one has its
own name, called a device name or pseudo-device name. On a GNU
system, the printer might be called /dev/lp0 (for "device line
printer zero") and the first floppy drive might be called
/dev/fd0 (for "device floppy drive zero"). (Why zero in both
cases? Most objects in the GNU environment are counted by starting with
zero, rather than one -- just as arrays in C are zero-based.)
The advantage of treating devices as files is that it is often not
necessary to know how a particular device works, only that it is
connected to the computer and can be written to or read from. For
example, C programs often get their input from the keyboard, which C
refers to with the stream name stdin (for "standard input"), and
C programs often send their output to the monitor's text display,
referred to as stdout (for "standard output"). In some cases, stdin and
stdout may refer to things other than the keyboard and monitor;
for example, the user may be redirecting the output from your program to
a text file with the shell's > redirection operator in GNU/Linux. The beauty of
the way the standard input/output library handles things is that your
program will work just the same.
Before you can read from or write to a file, you must first connect to
it, or open it, usually with either the fopen function, which
returns a stream, or the open function, which returns a file
descriptor. You can open a file for reading, writing, or both. You can
also open a file for appending, that is, writing data after the
current end of the file.
Files are made known to functions not by their file names, except in a
few cases, but by identifiers called "streams" or "file
descriptors". For example, fprintf uses a stream as an
identifier, not the name of the file. So does fclose:
fprintf (my_stream, "Just a little hello from fprintf.\n");
close_error = fclose (my_stream);
On the other hand, fopen takes a name, and returns a stream:
my_stream = fopen (my_filename, "w");
This is how you map from names to streams or file descriptors: you open
the file (for reading, writing, or both, or for appending), and the
value returned from the open or fopen function is the
appropriate file descriptor or stream.
You can operate on a file either at a high level or at a low level.
Operating on a file at a high level means that you are using the file at
a high level of abstraction. (See the Introduction to refresh your
memory about the distinction between high and low levels of
abstraction.) Using high-level functions is usually safer and more
convenient than using low-level functions, so we will mostly concern
ourselves with high-level functions in this chapter, although we will
touch on some low-level functions toward the end.
A high-level connection opened to a file is called a stream. A
low-level connection to a file is called a file descriptor.
Streams and file descriptors have different data types, as we shall see.
You must pass either a stream or a file descriptor to most input/output
functions, to tell them which file they are operating on. Certain
functions (usually high-level ones) expect to be passed streams, while
others (usually low-level ones) expect file descriptors. A few
functions will accept a simple filename instead of a stream or file
descriptor, but generally these are only the functions that initialize
streams or file descriptors in the first place.
You may consider it a nuisance to have to use a stream or a file
descriptor to access your file when a simple file name would seem to
suffice, but these two mechanisms allow a level of abstraction to exist
between your code and your files. Remember the "black box" analogy we
explored at the beginning of the book. By using the data in files only
through streams or file descriptors, we are guaranteed the ability to
write a rich variety of functions that can exploit the behavior of these
two "black box" abstractions.
Interestingly enough, although streams are considered to be for
"high-level" input/output, and file descriptors for "low-level" I/O,
and GNU systems support both, streams are the more portable of the
two. You can expect any system with an ISO C implementation to
support streams, but non-GNU systems may not support file descriptors at
all, or may only implement a subset of the GNU functions that operate on
file descriptors. Most of the file descriptor functions in the GNU
library are included in the POSIX.1 standard, however.
Once you have finished your input and output operations on the file, you
must terminate your connection to it. This is called closing the
file. Once you have closed a file, you cannot read from or write to it
anymore until you open it again.
In summary, to use a file, a program must go through the following routine:
Open the file for reading, writing, or both.
Read from or write to the file as appropriate, using file-handling
functions provided by the GNU C Library.