Section 8.2
Programming with Arrays
ARRAYS ARE THE MOST BASIC AND THE
MOST IMPORTANT type of data structure, and techniques for
processing arrays are among the most important programming
techniques you can learn. Two fundamental array processing
techniques -- searching and sorting -- will be covered in
Section 4. This section introduces
some of the basic ideas of array processing in general.
In many cases, processing an array means applying the
same operation to each item in the array. This is commonly
done with a for loop. A loop for processing all
the items in an array A has the form:
// do any necessary initialization
for (int i = 0; i < A.length; i++) {
. . . // process A[i]
}
Suppose, for example, that A is an array of type
double[]. Suppose that the goal is to add up
all the numbers in the array. An informal algorithm for doing this
would be:
Start with 0;
Add A[0]; (process the first item in A)
Add A[1]; (process the second item in A)
.
.
.
Add A[ A.length - 1 ]; (process the last item in A)
Putting the obvious repetition into a loop and giving a name
to the sum, this becomes:
double sum; // The sum of the numbers in A.
sum = 0; // Start with 0.
for (int i = 0; i < A.length; i++)
sum += A[i]; // add A[i] to the sum, for
// i = 0, 1, ..., A.length - 1
Note that the continuation condition, "i < A.length",
implies that the last value of i that is actually processed
is A.length-1, which is the index of the final item in the
array. It's important to use "<" here,
not "<=", since "<="
would give an array index out of bounds error.
Eventually, you should just about be able to write loops similar
to this one in your sleep. I will give a few more simple examples.
Here is a loop that will count the number of items in the array
A which are less than zero:
int count; // For counting the items.
count = 0; // Start with 0 items counted.
for (int i = 0; i < A.length; i++) {
if (A[i] < 0.0) // if this item is less than zero...
count++; // ...then count it
}
// At this point, the value of count is the number
// of items that have passed the test of being < 0
Replace the test "A[i] < 0.0", if you want
to count the number of items in an array that satisfy some other property.
Here is a variation on the same theme. Suppose you want to count
the number of times that an item in the array A is equal
to the item that follows it. The item that follows A[i]
in the array is A[i+1], so the test in this case
is "if (A[i] == A[i+1])". But there is a catch: This test
cannot be applied when A[i] is the last item in the
array, since then there is no such item as A[i+1].
The result of trying to apply the test in this case would be
an array index out of bounds error. This just means that we have to
stop one item short of the final item:
int count = 0;
for (int i = 0; i < A.length - 1; i++) {
if (A[i] == A[i+1])
count++;
}
Another typical problem is to find the largest number in A.
The strategy is to go through the array, keeping track of the
largest number found so far. We'll store the largest number found
so far in a variable called max. As we look through the array,
whenever we find a number larger than the current value of max,
we change the value of max to that larger value. After the whole
array has been processed, max is the largest item in the
array overall. The only question is, what should the original
value of max be? One possibility is to start with max
equal to A[0], and then to look through the rest of the
array, starting from A[1], for larger items:
double max = A[0];
for (int i = 1; i < A.length; i++) {
if (A[i] > max)
max = A[i];
}
// at this point, max is the largest item in A
(There is one subtle problem here. It's possible in Java for
an array to have length zero. In that case, A[0] doesn't
exist, and the reference to A[0] in the first line
gives an array index out of bounds error. However, zero-length arrays
are normally something that you want to avoid in real problems. Anyway,
what would it mean to ask for the largest item in an array that
contains no items at all?)
As a final example of basic array operations, consider the
problem of copying an array. To make a copy of our sample array
A, it is not sufficient to say
double[] B = A;
since this does not create a new array object. All it does
is declare a new array variable and make it refer to the same
object to which A refers. (So that, for example,
a change to A[i] will automatically change B[i]
as well.) To make a new array that is a copy of A,
it is necessary to make a new array object and to copy each of
the individual items from A into the new array:
double[] B = new double[A.length]; // Make a new array object,
// the same size as A.
for (int i = 0; i < A.length; i++)
B[i] = A[i]; // Copy each item from A to B.
Copying values from one array to another is such a common operation
that Java has a predefined subroutine to do it. The subroutine,
System.arraycopy(), is a static member subroutine in the
standard System class. Its declaration has the form
public static void arraycopy(Object sourceArray, int sourceStartIndex,
Object destArray, int destStartIndex, int count)
where sourceArray and destArray can be arrays
with any base type. Values are copied from sourceArray
to destArray. The count tells how many elements to
copy. Values are taken from sourceArray starting at position
sourceStartIndex and are stored in destArray starting
at position destStartIndex. For example, to make a copy
of the array, A, using this subroutine, you would say:
double B = new double[A.length];
System.arraycopy( A, 0, B, 0, A.length );
An array type, such as double[], is a full-fledged Java
type, so it can be used in all the ways that any other Java type
can be used. In particular, it can be used as the type of a formal
parameter in a subroutine. It can even be the return type of a
function. For example, it might be useful to have a function that
makes a copy of an array of doubles:
double[] copy( double[] source ) {
// Create and return a copy of the array, source.
// If source is null, return null.
if ( source == null )
return null;
double[] cpy; // A copy of the source array.
cpy = new double[source.length];
System.arraycopy( source, 0, cpy, 0, source.length );
return cpy;
}
The main() routine of a program has a parameter of
type String[]. You've seen this used since all the way
back in Chapter 2, but I haven't
really been able to explain it until now. The parameter to the
main() routine is an array of Strings. When the
system calls the main() routine, the strings in this
array are the command-line parameters.
When using a command-line interface, the user types a command to
tell the system to execute a program. The user can include
extra input in this command, beyond the name of the program.
This extra input becomes the command-line parameters. For
example, if the name of the class that contains the
main() routine is myProg, then the
user can type "java myProg" to execute the program. In this case,
there are no command-line parameters. But if the user types
the command "java myProg one two three", then the command-line
parameters are the strings "one", "two", and "three". The system
puts these strings into an array of Strings and passes that
array as a parameter to the main() routine. Here, for
example, is a short program that simply prints out any command line
parameters entered by the user:
public class CLDemo {
public static void main(String[] args) {
System.out.println("You entered " + args.length
+ " command-line parameters.");
if (args.length > 0) {
System.out.println("They were:");
for (int i = 0; i < args.length; i++)
System.out.println(" " + args[i]);
}
} // end main()
} // end class CLDemo
Note that the parameter, args, is never null when main()
is called by the system, but it might be an array of length zero.
In practice, command-line parameters are often the names of files to
be processed by the program. I will give some examples of this in
Chapter 10, when I discuss file processing.
So far, all my examples of array processing have used
sequential access. That is, the
elements of the array were processed one after the other in the
sequence in which they occur in the array. But one of the big
advantages of arrays is that they allow random
access. That is, every element of the array is equally
accessible at any given time.
As an example, let's look at a well-known problem called the
birthday problem: Suppose that there are N people in
a room. What's the chance that there are two people in the room who
have the same birthday? (That is, they were born on the same day
in the same month, but not necessarily in the same year.) Most people
severely underestimate the probability. We actually
look at a different version of the problem: Suppose you choose people
at random and check their birthdays. How many people will you
check before you find one who has the same birthday as someone you've
already checked? Of course, the answer in a particular case depends
on random factors, but we can simulate the experiment with a computer
program and run the program several times to get an idea of how many
people need to be checked on average.
To simulate the experiment, we need to keep track of each birthday
that we find. There are 365 different possible birthdays. (We'll
ignore leap years.) For each possible birthday, we need to know,
has this birthday already been used? The answer is a boolean
value, true or false. To hold this data, we can use an array
of 365 boolean values:
boolean[] used;
used = new boolean[365];
The days of the year are numbered from 0 to 364. The value
of used[i] is true if someone has been selected whose
birthday is day number i. Initially, all the values
in the array, used, are false. When we select someone
whose birthday is day number i, we first check whether
used[i] is true. If so, then this is the second person
with that birthday. We are done. If used[i] is false,
we set used[i] to be true to record the fact that we've
encountered someone with that birthday, and we go on to the
next person. Here is a subroutine that carries out the
simulated experiment (Of course, in the subroutine, there are
no simulated people, only simulated birthdays):
static void birthdayProblem() {
// Simulate choosing people at random and checking the
// day of the year they were born on. If the birthday
// is the same as one that was seen previously, stop,
// and output the number of people who were checked.
boolean[] used; // For recording the possible birthdays
// that have been seen so far. A value
// of true in used[i] means that a person
// whose birthday is the i-th day of the
// year has been found.
int count; // The number of people who have been checked.
used = new boolean[365]; // Initially, all entries are false.
count = 0;
while (true) {
// Select a birthday at random, from 0 to 364.
// If the birthday has already been used, quit.
// Otherwise, record the birthday as used.
int birthday; // The selected birthday.
birthday = (int)(Math.random()*365);
count++;
if ( used[birthday] )
break;
used[birthday] = true;
}
TextIO.putln("A duplicate birthday was found after "
+ count + " tries.");
} // end birthdayProblem()
This subroutine makes essential use of the fact that every element
in a newly created array of booleans is set to be false.
If we wanted to reuse the same array in a second simulation, we would
have to reset all the elements in it to be false with a
for loop
for (int i = 0; i < 365; i++)
used[i] = false;
Here is an applet that will run the simulation as many times
as you like. Are you surprised at how few people have to be chosen,
in general?
One of the examples in Section 6.4 was
an applet that shows multiple copies of a message in random positions,
colors, and fonts. When the user clicks on the applet, the positions,
colors, and fonts are changed to new random values. Like several other
examples from that chapter, the applet had a flaw: It didn't have any
way of storing the data that would be necessary to redraw itself.
Chapter 7 introduced off-screen canvases as a solution to this problem,
but off-screen canvases are not a good solution in every case.
Arrays provide us with an alternative solution. Here's a new
version of the applet. This version uses an array to store the
position, font, and color of each string. When the applet is
painted, this information is used to draw the strings, so it will
redraw itself correctly when it is covered and then uncovered. When
you click on the applet, the array is filled with new random values
and the applet is repainted.
In this applet, the number of copies of the message is given by a
named constant, MESSAGE_COUNT. One way to store the
position, color, and font of MESSAGE_COUNT strings would
be to use four arrays:
int[] x = new int[MESSAGE_COUNT];
int[] y = new int[MESSAGE_COUNT];
Color[] color = new Color[MESSAGE_COUNT];
Font[] font = new Font[MESSAGE_COUNT];
These arrays would be filled with random values. In the paintComponent()
method, the i-th copy of the string would be drawn at
the point (x[i],y[i]). Its color would be given by
color[i]. And it would be drawn in the font font[i].
This would be accomplished by the paintComponent() method
public void paintComponent(Graphics g) {
super.paintComponent(); // (Fill with background color.)
for (int i = 0; i < MESSAGE_COUNT; i++) {
g.setColor( color[i] );
g.setFont( font[i] );
g.drawString( message, x[i], y[i] );
}
}
This approach is said to use parallel
arrays. The data for a given copy of the message is spread out
across several arrays. If you think of the arrays as laid out in
parallel columns -- array x in the first column, array y
in the second, array color in the third, and array font
in the fourth -- then the data for the i-th string
can be found along the the i-th row. There is nothing wrong with
using parallel arrays in this simple example, but it does go against
the object-oriented philosophy of keeping related data in one object.
If we follow this rule, then we don't have to imagine
the relationship among the data because
all the data for one copy of the message is physically in one place.
So, when I wrote the applet, I made a simple class to represent all
the data that is needed for one copy of message:
class StringData {
// Data for one copy of the message.
int x,y; // Position of the message.
Color color; // Color of the message.
Font font; // Font used for the message.
}
(This class is actually defined as a static nested class in
the main applet class.)
To store the data for multiple copies of the message, I use
an array of type StringData[]. The array is declared
as an instance variable, with the name data:
StringData[] data;
Of course, the value of data is null until an actual
array is created and assigned to it. This is done in the init()
method of the applet with the statement
data = new StringData[MESSAGE_COUNT];
Just after this array is created, the value of each element in
the array is null. We want to store data in objects
of type StringData, but no such objects exist yet.
All we have is an array of variables that are capable of referring
to such objects. I decided to create the objects in the applet's
init method. (It could be done in other places -- just so long
as we avoid trying to use to an object that doesn't exist. This
is important: Remember that a newly created array whose base
type is an object type is always filled
with null elements. There are no objects in the array
until you put them there.)
The objects are created with the for loop
for (int i = 0; i < MESSAGE_COUNT; i++)
data[i] = new StringData();
Now, the idea is to store data for the i-th copy of
the message in the variables data[i].x, data[i].y,
data[i].color, and data[i].font. (Make sure that you
understand the notation here: data[i] refers to an object.
That object contains instance variables. The notation data[i].x
tells the computer: "Find your way to the object that is referred to
by data[i]. Then go to the instance variable named x
in that object." Variable names can get even more complicated than
this.) Using the array, data, the paintComponent() method for
the applet becomes
public void paintComponent(Graphics g) {
super.paintComponent(g); // (Fill with background color.)
for (int i = 0; i < MESSAGE_COUNT; i++) {
g.setColor( data[i].color );
g.setFont( data[i].font );
g.drawString( message, data[i].x, data[i].y );
}
}
There is still the matter of filling the array, data, with
random values. If you are interested, you can look at the source
code for the applet, RandomStringsWithArray.java.
The applet actually uses one other array. The font for a given copy
of the message is chosen at random from a set of five possible fonts.
In the original version of the applet, there were five variables of
type Font to represent the fonts. The variables were named
font1, font2, font3, font4,
and font5. To select one of these fonts at random,
a switch statement could be used:
Font randomFont; // One of the 5 fonts, chosen at random.
int rand; // A random integer in the range 0 to 4.
rand = (int)(Math.random() * 5);
switch (rand) {
case 0:
randomFont = font1;
break;
case 1:
randomFont = font2;
break;
case 2:
randomFont = font3;
break;
case 3:
randomFont = font4;
break;
case 4:
randomFont = font5;
break;
}
In the new version of the applet, the five fonts are stored in an array, which is
named fonts. This array is declared as an instance variable
Font[] fonts;
The array is created and filled with fonts in the init()
method:
fonts = new Font[5]; // Array to store five fonts.
fonts[0] = new Font("Serif", Font.BOLD, 14);
fonts[1] = new Font("SansSerif", Font.BOLD + Font.ITALIC, 24);
fonts[2] = new Font("Monospaced", Font.PLAIN, 20);
fonts[3] = new Font("Dialog", Font.PLAIN, 30);
fonts[4] = new Font("Serif", Font.ITALIC, 36);
This makes it much easier to select one of the fonts at random.
It can be done with the statements
Font randomFont; // One of the 5 fonts, chosen at random.
int fontIndex; // A random number in the range 0 to 4.
fontIndex = (int)(Math.random() * 5);
randomFont = fonts[ fontIndex ];
The switch statement has been replaced by a single
line of code. This is a very typical application of arrays.
Here is another example of the same sort of thing. Months are
often stored as numbers 1, 2, 3, ..., 12. Sometimes, however,
these numbers have to be translated into the names January, February,
..., December. The translation can be done with an array. The
array could be declared and initialized as
static String[] monthName = { "January", "February", "March",
"April", "May", "June",
"July", "August", "September",
"October", "November", "December" };
If mth is a variable that holds one of the
integers 1 through 12, then monthName[mth-1] is
the name of the corresponding month. We need the "-1" because
months are numbered starting from 1, while array elements are numbered
starting from 0. Simple array indexing
does the translation for us!