Chapter 11

Arrays of Objects

11.1 Composition

By now we have seen several examples of composition (the ability to combine language features in a variety of arrangements). One of the first examples we saw was using a method invocation as part of an expression. Another example is the nested structure of statements: you can put an if statement within a while loop, or within another if statement, etc.

Having seen this pattern, and having learned about arrays and objects, you should not be surprised to learn that you can have arrays of objects. In fact, you can also have objects that contain arrays (as instance variables); you can have arrays that contain arrays; you can have objects that contain objects, and so on.

In the next two chapters we will look at some examples of these combinations, using Card objects as an example.

11.2 Card objects

If you are not familiar with common playing cards, now would be a good time to get a deck, or else this chapter might not make much sense. There are 52 cards in a deck, each of which belongs to one of four suits and one of 13 ranks. The suits are Spades, Hearts, Diamonds and Clubs (in descending order in Bridge). The ranks are Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen and King. Depending on what game you are playing, the rank of the Ace may be higher than King or lower than 2.

If we want to define a new object to represent a playing card, it is pretty obvious what the instance variables should be: rank and suit. It is not as obvious what type the instance variables should be. One possibility is Strings, containing things like "Spade" for suits and "Queen" for ranks. One problem with this implementation is that it would not be easy to compare cards to see which had higher rank or suit.

An alternative is to use integers to encode the ranks and suits. By "encode," I do not mean what some people think, which is to encrypt, or translate into a secret code. What a computer scientist means by "encode" is something like "define a mapping between a sequence of numbers and the things I want to represent." For example,

Spades	`->`	3
Hearts	`->`	2
Diamonds	`->`	1
Clubs	`->`	0

The symbol -> is mathematical notation for "maps to." The obvious feature of this mapping is that the suits map to integers in order, so we can compare suits by comparing integers. The mapping for ranks is fairly obvious; each of the numerical ranks maps to the corresponding integer, and for face cards:

Jack	`->`	11
Queen	`->`	12
King	`->`	13

The reason I am using mathematical notation for these mappings is that they are not part of the Java program. They are part of the program design, but they never appear explicitly in the code. The class definition for the Card type looks like this:

class Card
{
int suit, rank;

public Card () {
this.suit = 0; this.rank = 0;
}

public Card (int suit, int rank) {
this.suit = suit; this.rank = rank;
}
}

As usual, I am providing two constructors, one of which takes a parameter for each instance variable and the other of which takes no parameters.

To create an object that represents the 3 of Clubs, we would use the new command:

Card threeOfClubs = new Card (0, 3);

The first argument, 0 represents the suit Clubs.

11.3 The printCard method

When you create a new class, the first step is usually to declare the instance variables and write constructors. The second step is often to write the standard methods that every object should have, including one that prints the object, and one or two that compare objects. I will start with printCard.

In order to print Card objects in a way that humans can read easily, we want to map the integer codes onto words. A natural way to do that is with an array of Strings. You can create an array of Strings the same way you create an array of primitive types:

String[] suits = new String [4];

Then we can set the values of the elements of the array.

suits[0] = "Clubs";
suits[1] = "Diamonds";
suits[2] = "Hearts";
suits[3] = "Spades";

Creating an array and initializing the elements is such a common operation that Java provides a special syntax for it:

String[] suits = { "Clubs", "Diamonds", "Hearts", "Spades" };

The effect of this statement is identical to that of the separate declaration, allocation, and assignment. A state diagram of this array might look like:

The elements of the array are references to the Strings, rather than Strings themselves. This is true of all arrays of objects, as I will discuss in more detail later. For now, all we need is another array of Strings to decode the ranks:

String[] ranks = { "narf", "Ace", "2", "3", "4", "5", "6",
"7", "8", "9", "10", "Jack", "Queen", "King" };

The reason for the "narf" is to act as a place-keeper for the zeroeth element of the array, which will never be used. The only valid ranks are 1--13. This wasted entry is not necessary, of course. We could have started at 0, as usual, but it is best to encode 2 as 2, and 3 as 3, etc.

Using these arrays, we can select the appropriate Strings by using the suit and rank as indices. In the method printCard,

public static void printCard (Card c) {
String[] suits = { "Clubs", "Diamonds", "Hearts", "Spades" };
String[] ranks = { "narf", "Ace", "2", "3", "4", "5", "6",
"7", "8", "9", "10", "Jack", "Queen", "King" };

System.out.println (ranks[c.rank] + " of " + suits[c.suit]);
}

the expression suits[c.suit] means "use the instance variable suit from the object c as an index into the array named suits, and select the appropriate string." The output of this code

Card card = new Card (1, 11);
printCard (card);

is Jack of Diamonds.

11.4 The sameCard method

The word "same" is one of those things that occur in natural language that seem perfectly clear until you give it some thought, and then you realize there is more to it than you expected.

For example, if I say "Chris and I have the same car," I mean that his car and mine are the same make and model, but they are two different cars. If I say "Chris and I have the same mother," I mean that his mother and mine are one and the same. So the idea of "sameness" is different depending on the context.

When you talk about objects, there is a similar ambiguity. For example, if two Cards are the same, does that mean they contain the same data (rank and suit), or they are actually the same Card object?

To see if two references refer to the same object, we can use the == operator. For example:

Card card1 = new Card (1, 11);
Card card2 = card1;

if (card1 == card2) {
System.out.println ("card1 and card2 are the same object.");
}

This type of equality is called shallow equality because it only compares the references, not the contents of the objects.

To compare the contents of the objects---deep equality---it is common to write a method with a name like sameCard.

public static boolean sameCard (Card c1, Card c2) {
return (c1.suit == c2.suit && c1.rank == c2.rank);
}

Now if we create two different objects that contain the same data, we can use sameCard to see if they represent the same card:

Card card1 = new Card (1, 11);
Card card2 = new Card (1, 11);

if (sameCard (card1, card2)) {
System.out.println ("card1 and card2 are the same card.");
}

In this case, card1 and card2 are two different objects that contain the same data

so the condition is true. What does the state diagram look like when card1 == card2 is true?

In Section 7.11 I said that you should never use the == operator on Strings because it does not do what you expect. Instead of comparing the contents of the String (deep equality), it checks whether the two Strings are the same object (shallow equality).

11.5 The compareCard method

For primitive types, there are conditional operators that compare values and determine when one is greater or less than another. These operators (< and > and the others) don't work for object types. For Strings there is a built-in compare method. For Cards we have to write our own, which we will call compareCard. Later, we will use this method to sort a deck of cards.

Some sets are completely ordered, which means that you can compare any two elements and tell which is bigger. For example, the integers and the floating-point numbers are totally ordered. Some sets are unordered, which means that there is no meaningful way to say that one element is bigger than another. For example, the fruits are unordered, which is why we cannot compare apples and oranges. In Java, the boolean type is unordered; we cannot say that true is greater than false.

The set of playing cards is partially ordered, which means that sometimes we can compare cards and sometimes not. For example, I know that the 3 of Clubs is higher than the 2 of Clubs, and the 3 of Diamonds is higher than the 3 of Clubs. But which is better, the 3 of Clubs or the 2 of Diamonds? One has a higher rank, but the other has a higher suit.

In order to make cards comparable, we have to decide which is more important, rank or suit. To be honest, the choice is completely arbitrary. For the sake of choosing, I will say that suit is more important, because when you buy a new deck of cards, it comes sorted with all the Clubs together, followed by all the Diamonds, and so on.

With that decided, we can write compareCard. It will take two Cards as parameters and return 1 if the first card wins, -1 if the second card wins, and 0 if they tie (indicating deep equality). It is sometimes confusing to keep those return values straight, but they are pretty standard for comparison methods.

First we compare the suits:

if (c1.suit > c2.suit) return 1;
if (c1.suit < c2.suit) return -1;

If neither statement is true, then the suits must be equal, and we have to compare ranks:

if (c1.rank > c2.rank) return 1;
if (c1.rank < c2.rank) return -1;

If neither of these is true, the ranks must be equal, so we return 0. In this ordering, aces will appear lower than deuces (2s).

As an exercise, fix it so that aces are ranked higher than Kings, and encapsulate this code in a method.

11.6 Arrays of cards

The reason I chose Cards as the objects for this chapter is that there is an obvious use for an array of cards---a deck. Here is some code that creates a new deck of 52 cards:

Card[] deck = new Card [52];

Here is the state diagram for this object:

The important thing to see here is that the array contains only references to objects; it does not contain any Card objects. The values of the array elements are initialized to null. You can access the elements of the array in the usual way:

if (deck[3] == null) {
System.out.println ("No cards yet!");
}

But if you try to access the instance variables of the non-existent Cards, you will get a NullPointerException.

deck[2].rank; // NullPointerException

Nevertheless, that is the correct syntax for accessing the rank of the "twoeth" card in the deck (really the third---we started at zero, remember?). This is another example of composition, the combination of the syntax for accessing an element of an array and an instance variable of an object.

The easiest way to populate the deck with Card objects is to write a nested loop:

int index = 0;
for (int suit = 0; suit <= 3; suit++) {
for (int rank = 1; rank <= 13; rank++) {
deck[index] = new Card (suit, rank);
index++;
}
}

The outer loop enumerates the suits, from 0 to 3. For each suit, the inner loop enumerates the ranks, from 1 to 13. Since the outer loop iterates 4 times, and the inner loop iterates 13 times, the total number of times the body is executed is 52 (13 times 4).

I used the variable index to keep track of where in the deck the next card should go. The following state diagram shows what the deck looks like after the first two cards have been allocated:

As an exercise, encapsulate this deck-building code in a method called buildDeck that takes no parameters and that returns a fully-populated array of Cards.

11.7 The printDeck method

Whenever you are working with arrays, it is convenient to have a method that will print the contents of the array. We have seen the pattern for traversing an array several times, so the following method should be familiar:

public static void printDeck (Card[] deck) {
for (int i=0; i<deck.length; i++) {
printCard (deck[i]);
}
}

Since deck has type Card[], an element of deck has type Card. Therefore, deck[i] is a legal argument for printCard.

11.8 Searching

The next method I want to write is findCard, which searches through an array of Cards to see whether it contains a certain card. It may not be obvious why this method would be useful, but it gives me a chance to demonstrate two ways to go searching for things, a linear search and a bisection search.

Linear search is the more obvious of the two; it involves traversing the deck and comparing each card to the one we are looking for. If we find it we return the index where the card appears. If it is not in the deck, we return -1.

public static int findCard (Card[] deck, Card card) {
for (int i = 0; i< deck.length; i++) {
if (sameCard (deck[i], card)) return i;
}
return -1;
}

The arguments of findCard are named card and deck. It might seem odd to have a variable with the same name as a type (the card variable has type Card). This is legal and common, although it can sometimes make code hard to read. In this case, though, I think it works.

The method returns as soon as it discovers the card, which means that we do not have to traverse the entire deck if we find the card we are looking for. If the loop terminates without finding the card, we know the card is not in the deck and return -1.

If the cards in the deck are not in order, there is no way to search that is faster than this. We have to look at every card, since otherwise there is no way to be certain the card we want is not there.

But when you look for a word in a dictionary, you don't search linearly through every word. The reason is that the words are in alphabetical order. As a result, you probably use an algorithm that is similar to a bisection search:

Start in the middle somewhere.
Choose a word on the page and compare it to the word you are looking for.
If you found the word you are looking for, stop.
If the word you are looking for comes after the word on the page, flip to somewhere later in the dictionary and go to step 2.
If the word you are looking for comes before the word on the page, flip to somewhere earlier in the dictionary and go to step 2.

If you ever get to the point where there are two adjacent words on the page and your word comes between them, you can conclude that your word is not in the dictionary. The only alternative is that your word has been misfiled somewhere, but that contradicts our assumption that the words are in alphabetical order.

In the case of a deck of cards, if we know that the cards are in order, we can write a version of findCard that is much faster. The best way to write a bisection search is with a recursive method. That's because bisection is naturally recursive.

The trick is to write a method called findBisect that takes two indices as parameters, low and high, indicating the segment of the array that should be searched (including both low and high).

To search the array, choose an index between low and high (call it mid) and compare it to the card you are looking for.
If you found it, stop.
If the card at mid is higher than your card, search in the range from low to mid-1.
If the card at mid is lower than your card, search in the range from mid+1 to high.

Steps 3 and 4 look suspiciously like recursive invocations. Here's what this all looks like translated into Java code:

public static int findBisect (Card[] deck, Card card, int low, int high) {
int mid = (high + low) / 2;
int comp = compareCard (deck[mid], card);

if (comp == 0) {
return mid;
} else if (comp > 0) {
return findBisect (deck, card, low, mid-1);
} else {
return findBisect (deck, card, mid+1, high);
}
}

Rather than call compareCard three times, I called it once and stored the result.

Although this code contains the kernel of a bisection search, it is still missing a piece. As it is currently written, if the card is not in the deck, it will recurse forever. We need a way to detect this condition and deal with it properly (by returning -1).

The easiest way to tell that your card is not in the deck is if there are no cards in the deck, which is the case if high is less than low. Well, there are still cards in the deck, of course, but what I mean is that there are no cards in the segment of the deck indicated by low and high.

With that line added, the method works correctly:

public static int findBisect
(Card[] deck, Card card, int low, int high) {
System.out.println (low + ", " + high);

if (high < low) return -1;

int mid = (high + low) / 2;
int comp = deck[mid].compareCard (card);

if (comp == 0) {
return mid;
} else if (comp > 0) {
return findBisect (deck, card, low, mid-1);
} else {
return findBisect (deck, card, mid+1, high);
}
}

I added a print statement at the beginning so I could watch the sequence of recursive calls and convince myself that it would eventually reach the base case. I tried out the following code:

Card card1 = new Card (1, 11);
System.out.println (findBisect (deck, card1, 0, 51));

And got the following output:

0, 51
0, 24
13, 24
19, 24
22, 24
23

Then I made up a card that is not in the deck (the 15 of Diamonds), and tried to find it. I got the following:

0, 51
0, 24
13, 24
13, 17
13, 14
13, 12
-1

These tests don't prove that this program is correct. In fact, no amount of testing can prove that a program is correct. On the other hand, by looking at a few cases and examining the code, you might be able to convince yourself.

The number of recursive calls is fairly small, typically 6 or 7. That means we only had to invoke compareCard 6 or 7 times, compared to up to 52 times if we did a linear search. In general, bisection is much faster than a linear search, especially for large arrays.

Two common errors in recusive programs are forgetting to include a base case and writing the recursive call so that the base case is never reached. Either error will cause an infinite recursion, in which case Java will (eventually) throw a StackOverflowException.

11.9 Decks and subdecks

Looking at the interface to findBisect

public static int findBisect
(Card[] deck, Card card, int low, int high)

it might make sense to treat three of the parameters, deck, low and high, as a single parameter that specifies a subdeck. We took a similar view in Section 4.8 when we were talking about bounding boxes. In that case I referred to x, y, width and height as if they were a single parameter, a bounding box.

This kind of thing is quite common, and I sometimes think of it as an abstract parameter. What I mean by "abstract," is something that is not literally part of the program text, but which describes the function of the program at a higher level.

For example, when you invoke a method and pass an array and the bounds low and high, there is nothing that prevents the invoked method from accessing parts of the array that are out of bounds. So you are not literally sending a subset of the deck; you are really sending the whole deck. But as long as the recipient plays by the rules, it makes sense to think of it, abstractly, as a subdeck.

There is one other example of this kind of abstraction that you might have noticed in Section 9.7, when I referred to an "empty" data structure. The reason I put "empty" in quotation marks was to suggest that it is not literally accurate. All variables have values all the time. When you create them, they are given default values. So there is no such thing as an empty object.

But if the program guarantees that the current value of a variable is never read before it is written, then the current value is irrelevant. Abstractly, it makes sense to think of such a variable as "empty."

This kind of thinking, in which a program comes to take on meaning beyond what is literally encoded, is a very important part of thinking like a computer scientist. Sometimes, the word "abstract" gets used so often and in so many contexts that it comes to lose its meaning. Nevertheless, abstraction is a central idea in computer science (as well as many other fields).

A more general definition of "abstraction" is "The process of modeling a complex system with a simplified description in order to suppress unnecessary details while capturing relevant behavior."

11.10 Glossary

encode: To represent one set of values using another set of values, by constructing a mapping between them.
shallow equality: Equality of references. Two references that point to the same object.
deep equality: Equality of values. Two references that point to objects that have the same value.
abstract parameter: A set of parameters that act together as a single parameter.
abstraction: The process of interpreting a program (or anything else) at a higher level than what is literally represented by the code.