Section 10.1
Streams, Readers, and Writers
Without the ability to interact with
the rest of the world, a program would be useless. The interaction
of a program with the rest of the world is referred to as
input/output or I/O. Historically,
one of the hardest parts of programming language design has
been coming up with good facilities for doing input and output.
A computer can be connected to many different types of input
and output devices. If a programming language had to deal with
each type of device as a special case, the complexity would
be overwhelming. One of the major achievements in the history
of programming has been to come up with good abstractions
for representing I/O devices. In Java, the I/O abstractions
are called streams. This section is
an introduction to streams, but it is not meant to cover them
in full detail. See the official Java documentation for more
information.
When dealing with input/output, you have to keep in mind
that there are two broad categories of data: machine-formatted
data and human-readable data. Machine-formatted data is represented
in the same way that data is represented inside the computer,
that is, as strings of zeros and ones. Human-readable data
is in the form of characters. When you read a number
such as 3.141592654, you are reading a sequence of characters
and interpreting them as a number. The same number would
be represented in the computer as a bit-string that you
would find unrecognizable.
To deal with the two broad categories of data representation, Java
has two broad categories of streams: byte
streams for machine-formatted data and character
streams for human-readable data. There are many predefined
classes that represent streams of each type.
Every object that outputs data to a byte stream belongs to
one of the subclasses of the abstract class OutputStream.
Objects that read data from a byte stream belong to subclasses
of InputStream. If you write numbers to an OutputStream,
you won't be able to read the resulting data yourself. But the
data can be read back into the computer with an InputStream.
The writing and reading of the data will be very efficient, since
there is no translation involved: the bits that are used to
represent the data inside the computer are simply copied to and
from the streams.
For reading and writing human-readable character data, the main
classes are Reader and Writer. All character
stream classes are subclasses of one of these. If a number is to be
written to a Writer stream, the computer must translate it into
a human-readable sequence of characters that represents that number.
Reading a number from a Reader stream into a numeric variable
also involves a translation, from a character sequence into the
appropriate bit string. (Even if the data you are working with consists
of characters in the first place, such as words from a text editor,
there might still be some translation. Characters are stored in
the computer as 16-bit Unicode values. For people who use Western
alphabets, character data is generally stored in files in
ASCII code, a 7-bit code that is stored using only 8 bits per character. The Reader
and Writer classes take care of this translation, and can also handle
non-Western alphabets in countries that use them.)
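The translation between characters and bytes can be made visible with a small sketch. It assumes the standard OutputStreamWriter class, which is not discussed in this section, as the bridge between a Writer and a byte stream. The names EncodingDemo and encode are made up for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Writes text through a Writer that uses the given encoding and
    // returns the bytes that actually reach the underlying byte stream.
    static byte[] encode(String text, Charset encoding) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Writer writer = new OutputStreamWriter(bytes, encoding);
        writer.write(text);
        writer.close();  // flushes the characters through to the byte stream
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        int asciiLength = encode("Java", StandardCharsets.US_ASCII).length;
        int unicodeLength = encode("Java", StandardCharsets.UTF_16BE).length;
        System.out.println(asciiLength + " bytes in ASCII");
        System.out.println(unicodeLength + " bytes in 16-bit Unicode");
    }
}
```

The same four characters occupy one byte each in ASCII and two bytes each in the 16-bit encoding, which is the kind of conversion a character stream performs behind the scenes.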
It's usually easy to decide whether to use byte streams or
character streams. If you want the data to be human-readable,
use character streams. Otherwise, use byte streams. I should
note that Java 1.0 did not have character streams, and that for
ASCII-encoded character data, byte streams are largely
interchangeable with character streams. In fact, the
standard input and output streams, System.in
and System.out, are byte streams rather than
character streams. However, as of Java 1.1, you should
use Readers and Writers rather than
InputStreams and OutputStreams when
working with character data.
The standard stream classes discussed in this section
are defined in the package
java.io, along with several supporting classes.
You must import the classes from this package
if you want to use them in your program. That means putting the
directive "import java.io.*;" at the beginning of your
source file. Streams are not used in Java's graphical user interface,
which has its own form of I/O. But they are necessary for working with
files and for doing communication over a network.
They can also be used for communication between
two concurrently running threads,
and there are stream classes for reading and writing
data stored in the computer's memory.
The beauty of the stream abstraction is that it is as easy
to write data to a file or to send data over a network as it is
to print information on the screen.
The basic I/O classes Reader, Writer, InputStream,
and OutputStream provide only very primitive I/O operations.
For example, the InputStream class declares the instance method
public int read() throws IOException
for reading one byte of data (a number in the range 0 to 255) from
an input stream. If the end of the input stream is encountered,
the read() method will return the value -1 instead. If some
error occurs during the input attempt, an IOException is thrown.
Since IOException is an exception class that requires
mandatory exception-handling, this means that you can't use the
read() method except inside a try statement or
in a subroutine that is itself declared with a "throws
IOException" clause. (Exceptions and try...catch
statements were covered in Chapter 9.)
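As a minimal sketch of how read() is typically used, the following program reads bytes until the end-of-stream value -1 appears. It assumes the standard in-memory class ByteArrayInputStream as a stand-in for a file or network source. The names ByteSum and sumBytes are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ByteSum {

    // Reads bytes one at a time until read() reports end-of-stream (-1)
    // and returns the sum of the byte values, each in the range 0 to 255.
    static int sumBytes(InputStream in) throws IOException {
        int sum = 0;
        int b = in.read();
        while (b != -1) {
            sum += b;
            b = in.read();
        }
        return sum;
    }

    public static void main(String[] args) {
        // An in-memory stream stands in for a file or network source.
        InputStream source = new ByteArrayInputStream(new byte[] { 1, 2, (byte) 250 });
        try {
            System.out.println("Sum of bytes: " + sumBytes(source));
        }
        catch (IOException e) {
            System.out.println("Input error: " + e.getMessage());
        }
    }
}
```

Note that sumBytes() is declared with a "throws IOException" clause, while main() handles the exception in a try statement. These are the two options that the compiler allows.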
The InputStream class also defines methods for reading several
bytes of data in one step into an array of bytes. However,
InputStream provides no convenient methods for reading other
types of data, such as int or double, from a stream.
This is not a problem because you'll never use an object of
type InputStream itself. Instead, you'll use subclasses
of InputStream that add more convenient input methods to
InputStream's rather primitive capabilities.
Similarly, the OutputStream class defines a primitive output
method for writing one byte of data to an output stream, the method
public void write(int b) throws IOException
but again, in practice, you will almost always use higher-level
output operations defined in some subclass of OutputStream.
The Reader and Writer classes provide very
similar low-level read and write operations.
But in these character-oriented classes, the I/O operations read
and write char values rather than bytes.
In practice, you will use subclasses of Reader and
Writer, as discussed below.
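The character-oriented operations can be sketched in the same style as the byte-oriented ones. This example assumes the standard classes StringReader and StringWriter, which read from and write to strings in memory. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

public class CharCopy {

    // Copies characters one at a time from a Reader to a Writer,
    // converting each to upper case, and returns the number copied.
    static int copyUpperCase(Reader in, Writer out) throws IOException {
        int count = 0;
        int c = in.read();  // a char value returned as an int, or -1 at the end
        while (c != -1) {
            out.write(Character.toUpperCase((char) c));
            count++;
            c = in.read();
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        StringWriter result = new StringWriter();
        int n = copyUpperCase(new StringReader("hello"), result);
        System.out.println(n + " chars copied: " + result);
    }
}
```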
One of the neat things about Java's I/O package is that it lets you
add capabilities to a stream by "wrapping" it in another stream
object that provides those capabilities. The wrapper object
is also a stream, so you can read from or write
to it -- but you can do so using fancier operations than those
available for basic streams.
For example, PrintWriter is a subclass of
Writer that provides convenient methods
for outputting human-readable character representations
of all of Java's basic data types. If you have an object belonging
to the Writer class, or any of its subclasses, and you would
like to use PrintWriter methods to output data to that Writer,
all you have to do is wrap the Writer in a
PrintWriter object. You do this by constructing a
new PrintWriter object, using the Writer
as input to the constructor. For example, if charSink
is of type Writer, then you could say
PrintWriter printableCharSink = new PrintWriter(charSink);
When you output data to printableCharSink,
using PrintWriter's advanced data output methods,
that data will go to exactly the same place as data written
directly to charSink. You've just provided a better
interface to the same output stream. For example, this allows you to use
PrintWriter methods to send data to a file or over
a network connection.
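Here is a small sketch of the wrapping technique in action. It assumes a StringWriter, a standard Writer that collects its output in memory, in the role of charSink. The names WrapDemo and printSample are made up for the example.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;

public class WrapDemo {

    // Wraps the given Writer in a PrintWriter and uses the PrintWriter's
    // print methods to send several kinds of values to it.
    static void printSample(Writer charSink) {
        PrintWriter printableCharSink = new PrintWriter(charSink);
        printableCharSink.print("x = ");
        printableCharSink.print(42);
        printableCharSink.print(", ok = ");
        printableCharSink.print(true);
        printableCharSink.flush();  // make sure everything reaches charSink
    }

    public static void main(String[] args) {
        StringWriter charSink = new StringWriter();  // collects characters in memory
        printSample(charSink);
        System.out.println(charSink.toString());
    }
}
```

Because printSample() accepts any Writer, the same subroutine would work unchanged with a Writer that sends its characters to a file or a network connection.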
For the record, the output methods of the PrintWriter
class include:
public void print(String s) // Methods for outputting
public void print(char c) // standard data types
public void print(int i) // to the stream, in
public void print(long l) // human-readable form.
public void print(float f)
public void print(double d)
public void print(boolean b)
public void println() // Output a carriage return to the stream.
public void println(String s) // These methods are identical
public void println(char c) // to the previous set,
public void println(int i) // except that the output
public void println(long l) // value is followed by
public void println(float f) // a carriage return.
public void println(double d)
public void println(boolean b)
Note that none of these methods will ever throw an
IOException. Instead, the PrintWriter class
includes the method
public boolean checkError()
which will return true if any error has been encountered while
writing to the stream. The PrintWriter class catches
any IOExceptions internally, and sets the value of
an internal error flag if one occurs. The checkError()
method can be used to check the error flag. This allows you
to use PrintWriter methods without worrying about catching
exceptions. On the other hand, to write a fully robust program,
you should call checkError() to test for possible errors
every time you use a PrintWriter method.
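The error flag can be seen at work in the following sketch. FailingWriter is a hypothetical Writer, written only for this illustration, whose every write operation throws an IOException. It simulates an output destination that has stopped working.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;

public class CheckErrorDemo {

    // A hypothetical Writer whose every write fails, used to simulate
    // a broken output destination.
    static class FailingWriter extends Writer {
        @Override
        public void write(char[] buf, int off, int len) throws IOException {
            throw new IOException("simulated write failure");
        }
        @Override
        public void flush() { }
        @Override
        public void close() { }
    }

    // Prints a line to the given Writer through a PrintWriter and reports
    // whether the PrintWriter recorded an error.
    static boolean printFails(Writer sink) {
        PrintWriter out = new PrintWriter(sink);
        out.println("some output");   // no exception reaches the caller
        return out.checkError();      // flushes the stream, then returns the error flag
    }

    public static void main(String[] args) {
        System.out.println("Broken sink reports error:  " + printFails(new FailingWriter()));
        System.out.println("Working sink reports error: " + printFails(new StringWriter()));
    }
}
```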
When you use PrintWriter methods to output data to a
stream, the data is converted into the sequence of characters
that represents the data in human-readable form. But suppose you
want to output the data in byte-oriented, machine-formatted form instead.
The java.io package includes a byte-stream class,
DataOutputStream, that can be used for writing data values
to streams in internal, binary-number format. DataOutputStream bears the
same relationship to OutputStream that PrintWriter
bears to Writer. That is, whereas OutputStream only
has methods for outputting bytes, DataOutputStream has
methods writeDouble(double x) for outputting values of type
double, writeInt(int x) for outputting
values of type int,
and so on. Furthermore, you can wrap any OutputStream
in a DataOutputStream so that you can use the higher-level
output methods on it. For example, if byteSink
is of type OutputStream, you could say
DataOutputStream dataSink = new DataOutputStream(byteSink);
to wrap byteSink in a DataOutputStream, dataSink.
For input of machine-formatted data, such as that created by writing
to a DataOutputStream, java.io provides
the class DataInputStream. You can wrap any InputStream in
a DataInputStream object to provide it with the ability to
read data of various types from the byte-stream.
The methods in the DataInputStream class
for reading binary data are called readDouble(), readInt(),
and so on. Data written by a DataOutputStream
is guaranteed to be in a format that can be read by a DataInputStream.
This is true even if the data stream is created on one
type of computer and read on another type of computer. The cross-platform
compatibility of binary data is a major aspect of Java's
platform independence.
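A round trip through the two data stream classes might look like the following sketch. It assumes the standard in-memory classes ByteArrayOutputStream and ByteArrayInputStream as the underlying byte streams. The names DataRoundTrip, encode, and decode are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class DataRoundTrip {

    // Writes an int and a double in binary form and returns the bytes.
    static byte[] encode(int n, double x) throws IOException {
        ByteArrayOutputStream byteSink = new ByteArrayOutputStream();
        DataOutputStream dataSink = new DataOutputStream(byteSink);
        dataSink.writeInt(n);
        dataSink.writeDouble(x);
        dataSink.flush();
        return byteSink.toByteArray();
    }

    // Reads the two values back, in the same order in which they were
    // written, and formats them as text.
    static String decode(byte[] data) throws IOException {
        DataInputStream dataSource = new DataInputStream(new ByteArrayInputStream(data));
        int n = dataSource.readInt();
        double x = dataSource.readDouble();
        return n + " " + x;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = encode(17, 3.5);
        System.out.println(data.length + " bytes written");
        System.out.println("Read back: " + decode(data));
    }
}
```

The values must be read in the same order and with the same types that were used to write them, since the binary data carries no record of what each group of bytes means.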
Still, the fact remains that much I/O is done in the form of human-readable
characters. In view of this, it is surprising that Java does not
provide a standard character input class that can read character data
in a manner that is reasonably symmetrical with the character output
capabilities of PrintWriter. Fortunately, Java's
object-oriented nature makes it possible to write such a class and
then use it in exactly the same way as if it were a standard part
of the language.
Following this model, I have written a class called TextReader that allows
convenient input of data that was written in human-readable character format. The
source code for this
class is available if you want to read it.
A TextReader can be used as a wrapper for an existing input
stream. The constructor
public TextReader(Reader dataSource)
creates an object that can be used to read
data from the given Reader, dataSource, using
the convenient input methods of the TextReader class.
The methods in my TextReader class are similar to the
static input methods in my TextIO class, except that
TextReaders can be used to read from any input stream,
whereas TextIO can only be used to read from the
standard input stream, System.in.
Instance methods in the TextReader class include:
public char peek() // Look at the next character in the stream,
// without removing it from the stream. If
// the characters in the stream have all
// been read, then the character '\0' is
// returned. If the next character in the
// stream is a carriage return, then a '\n'
// is returned.
public char getAnyChar() // Reads the next character from the
// stream. It can be a whitespace
// character. If all the characters
// in the stream have been read, an
// error occurs.
public void skipWhiteSpace() // Read and discard whitespace
// characters (space, return, tab),
// until a non-whitespace character
// is seen.
public boolean eoln() // Discards spaces or tabs in the stream,
// then tests whether the next char is
// the end of the current line (or the
// end of the data in the stream).
public boolean eof() // Discards any whitespace characters, then
// returns true if all the characters
// in the stream have been read.
public char getChar() // These routines read values of the
public byte getByte() // specified types. In each case,
public short getShort() // the computer skips any whitespace
public int getInt() // characters before trying to read a
public long getLong() // value of the specified type.
public float getFloat() // An error occurs if a value of the
public double getDouble() // correct type is not found. For
public String getWord() // the getWord() routine, a word is
public boolean getBoolean() // considered to be any string of
// non-blank characters. For
// getBoolean(), the input can be any
// of the strings "true", "false", "t",
// "f", "yes", "no", "y", "n", "1",
// or "0", ignoring case.
public String getAlpha() // This is similar to getWord(), except
// that it returns a string consisting
// of letters only. It is also special
// in that it skips over any non-letters
// before reading a word, rather than
// just skipping over white space.
public String getln() // Reads characters up to the end of the
// current line of input. Then reads
// and discards the carriage return.
// Note that this routine does NOT skip
// leading whitespace characters, and
// that the value returned might be the
// empty string.
public char getlnChar() // These routines are provided as a
public byte getlnByte() // convenience. They are equivalent
public short getlnShort() // to the above routines, except that
public int getlnInt() // after successfully reading a value
public long getlnLong() // of the specified type, the computer
public float getlnFloat() // reads and discards any remaining
public double getlnDouble() // characters on the same line.
public String getlnString()
public boolean getlnBoolean()
public String getlnAlpha()
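To make the idea of a character input wrapper concrete, here is a much smaller class written in the same spirit. It is not the TextReader class and does not reproduce its source code or its error handling. MiniTextReader is a hypothetical class that supports only two operations, and it reports problems with standard unchecked exceptions.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class MiniTextReader {

    private final Reader in;  // the wrapped character stream

    public MiniTextReader(Reader dataSource) {
        in = dataSource;
    }

    // Skips whitespace, then returns the next run of non-whitespace
    // characters. The whitespace character that ends the word, if any,
    // is read and discarded. Input errors and running out of data are
    // reported as unchecked exceptions.
    public String getWord() {
        try {
            int c = in.read();
            while (c != -1 && Character.isWhitespace((char) c))
                c = in.read();
            if (c == -1)
                throw new IllegalStateException("No more data in the stream.");
            StringBuilder word = new StringBuilder();
            while (c != -1 && !Character.isWhitespace((char) c)) {
                word.append((char) c);
                c = in.read();
            }
            return word.toString();
        }
        catch (IOException e) {
            throw new IllegalStateException("Input error: " + e.getMessage());
        }
    }

    // Reads the next word and translates it from characters into an int.
    // Throws a NumberFormatException if the word is not a legal integer.
    public int getInt() {
        return Integer.parseInt(getWord());
    }

    public static void main(String[] args) {
        MiniTextReader data = new MiniTextReader(new StringReader("  42 apples\n17"));
        int count = data.getInt();
        String item = data.getWord();
        int next = data.getInt();
        System.out.println(count + " " + item + ", then " + next);
    }
}
```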
For convenience, I also make it possible to wrap an InputStream
in a TextReader object, in the same way that it is possible to
wrap a Reader object in a TextReader.
For example, since System.in is of type InputStream,
you could say:
TextReader in = new TextReader(System.in);
The TextReader, in, could then be used in much
the same way as the TextIO class. For example,
you could use in.getInt() to read an integer from
standard input or use in.getBoolean() to read a boolean value.
The only difference would be that the TextReader does not
handle errors in the input in the same way as TextIO.
In an exactly symmetrical way, you can
wrap an OutputStream in a PrintWriter if you
want to write character data to the stream.
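As a brief sketch of wrapping a byte stream for character output, the following program prints a number through a PrintWriter and counts the bytes that arrive at the underlying stream. It assumes the in-memory class ByteArrayOutputStream as the byte stream, and it assumes that the platform's default character encoding stores each digit in a single byte, which is the usual case. The names PrintToBytes and printNumber are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

public class PrintToBytes {

    // Wraps a byte stream in a PrintWriter, prints a number in
    // human-readable form, and returns how many bytes arrived
    // at the byte stream.
    static int printNumber(int n) {
        ByteArrayOutputStream byteSink = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(byteSink);
        out.print(n);
        out.flush();  // push the characters through to the byte stream
        return byteSink.size();
    }

    public static void main(String[] args) {
        System.out.println("Bytes used for 12345: " + printNumber(12345));
    }
}
```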
There remains the question of what happens when an error occurs while
one of the input routines in the TextReader class
is being executed. Whoever designed the PrintWriter
class decided not to throw exceptions when errors occur. When I
designed TextReader, I decided to give you a choice.
By default, a routine that encounters an error will
throw an exception belonging to the class TextReader.Error.
This is a static nested class declared inside the TextReader class.
(For information on nested classes, see Section 5.6.)
TextReader.Error is a subclass of the RuntimeException class.
You can catch the error in a try...catch statement and handle it,
if you want. Recall that the compiler does not force you to
use try and catch to
deal with RuntimeExceptions. However, if one
occurs and is not caught, it will crash your program.
If you prefer not to work with exceptions at all, you can
turn off this behavior by calling the TextReader
instance method
public void checkIO(boolean throwExceptions)
with its parameter set to false. In that case, when an error
occurs during input, no exception will be thrown. Instead, the value
of an internal error flag will be set, and the program will continue.
If you use this option, it is your responsibility to check for errors
after each input operation. You can do this with the instance method
public boolean checkError()
This method returns true if the most recent input operation on the
TextReader produced an error, and it returns false
if that operation completed successfully. It is probably easier to write
robust programs by catching and handling exceptions than by continually
checking for possible errors. With both options available, you can
experiment with both styles of error-handling and see which one you prefer.
The classes PrintWriter, TextReader, DataInputStream,
and DataOutputStream allow you to easily input and output all of Java's
primitive data types. But what happens when you want to read and write
objects? Traditionally, you would have to come up with some way of
encoding your object as a sequence of data values belonging to
the primitive types, which can then be output as bytes or characters.
This is called serializing
the object. On input, you have to read the serialized data
and somehow reconstitute a copy of the original object. For complex
objects, this can all be a major chore. However, you can get Java
to do all the work for you by using the classes ObjectInputStream
and ObjectOutputStream. These are subclasses of InputStream
and OutputStream that can be used for reading and writing
serialized objects.
ObjectInputStream and
ObjectOutputStream are wrapper classes that can be wrapped
around arbitrary InputStreams and OutputStreams.
This makes it possible to do object input and output on any byte-stream.
The methods for object I/O are readObject(), in ObjectInputStream,
and writeObject(Object obj), in ObjectOutputStream.
Both of these methods can throw IOExceptions, and readObject()
can also throw a ClassNotFoundException. Note
that readObject() returns a value of type Object,
which generally has to be type-cast to a more useful type.
ObjectInputStream and ObjectOutputStream only work
with objects that implement an interface named Serializable.
Furthermore, all of the instance variables in the object must be
serializable. However, there is little work involved in making an
object serializable, since the Serializable interface does not
declare any methods. It exists only as a marker, to tell the system
that objects of the class are meant to be writable and readable.
You only need to add the words "implements Serializable"
to your class definitions. Many of Java's standard classes are
already declared to be serializable, including all the component
classes in Swing and in the AWT. This means, in particular, that GUI components
can be written to ObjectOutputStreams and read from
ObjectInputStreams.
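The following sketch writes an object to a byte stream and reads a copy of it back. Point is a hypothetical class invented for the illustration, and the in-memory classes ByteArrayOutputStream and ByteArrayInputStream are assumed as the underlying byte streams. The names ObjectRoundTrip, save, and load are likewise made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ObjectRoundTrip {

    // A hypothetical serializable class. Its instance variables,
    // a String and an int, are themselves serializable.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        String label;
        int x;
        Point(String label, int x) {
            this.label = label;
            this.x = x;
        }
    }

    // Serializes an object and returns the resulting bytes.
    static byte[] save(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(obj);
        out.close();
        return bytes.toByteArray();
    }

    // Reconstructs an object from bytes produced by save().
    static Object load(byte[] data) throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data));
        return in.readObject();
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        Point original = new Point("origin", 0);
        Point copy = (Point) load(save(original));  // type-cast from Object
        System.out.println(copy.label + " " + copy.x);
        System.out.println("Same object? " + (copy == original));
    }
}
```

The object that comes back is a new object with the same contents as the original, not a reference to the original itself.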