
From Day 01:
Word games seem to be very popular...
The fundamental component of a word game is a dictionary. You need a dictionary to validate words entered by the user.
You can think of a dictionary as being one of:
Happily, List, Set, and SortedSet
are all sub-interfaces of Collection and Collection
has a method named contains that returns true if
the collection contains a specified element. Implementing
a dictionary is as simple as creating a Collection.
You can find open source lists of English words fairly easily on the Internet (here). I have concatenated several of the lists together to form a dictionary of common English words that you can download here. The dictionary file has one word per line.
create a collection for the dictionary open the dictionary file for each line of the file read the word add the word to the dictionary
import java.util.Collection;
import java.util.ArrayList;
import java.util.Scanner;
import java.io.*;
public class Dictionary
{
public static void main(String[] args) throws IOException
{
PrintStream output = System.out;
Scanner input = new Scanner(System.in);
Collection<String> dictionary = new ArrayList<String>();
final String DICT_NAME = "dictionary.txt";
Scanner dictionaryInput = new Scanner(new File(DICT_NAME));
while (dictionaryInput.hasNextLine())
{
String word = dictionaryInput.nextLine().trim();
dictionary.add(word);
}
output.printf("Dictionary has %d words%n", dictionary.size());
}
}
Looking up a word in the dictionary is as easy as invoking
the contains method from the Collection
interface:
prompt the user for a word
while there is still input
read the word
if the word is in the dictionary
output "in the dictionary"
else
output "not in the dictionary"
prompt the user for a word
output.print("Enter a word to look up: ");
while (input.hasNextLine())
{
String word = input.nextLine().trim().toLowerCase();
if (dictionary.contains(word))
{
output.printf("%s is in the dictionary%n%n", word);
}
else
{
output.printf("%s is not in the dictionary%n%n", word);
}
output.print("Enter a word to look up: ");
}
See Dictionary.java.
contains
As a client, you should not care how a method is implemented.
However, you might speculate that contains must
search the entire List to determine if an element
is in the collection. In fact, the ArrayList
version of contains has complexity O(N).
get and set have O(1) complexity
for ArrayList.
add to the end of the list has O(1) amortized
complexity for ArrayList (adding N elements
to the end of a list has complexity O(N)).
add to a particular location inside the list
has O(N) complexity (because elements need to be moved
down the list to make room for the new element).
We can avoid the O(N) search for a word in the dictionary
by sorting the list and using Collections.binarySearch.
After reading in the list of words:
Collections.sort(dictionary);
To look up a word:
if (Collections.binarySearch(dictionary, word) >= 0)
{
output.printf("%s is in the dictionary%n%n", word);
}
else
{
output.printf("%s is not in the dictionary%n%n", word);
}
Set
From the Java Tutorial:
A Set is a Collection that cannot
contain duplicate elements.
It models the mathematical set abstraction.
The Set interface contains only methods
inherited from Collection and adds the
restriction that duplicate elements are prohibited.
Set
public interface Set<E> extends Collection<E> {
// Basic operations
int size();
boolean isEmpty();
boolean contains(Object element);
boolean add(E element); //optional
boolean remove(Object element); //optional
Iterator<E> iterator();
// Bulk operations
boolean containsAll(Collection<?> c);
boolean addAll(Collection<? extends E> c); //optional
boolean removeAll(Collection<?> c); //optional
boolean retainAll(Collection<?> c); //optional
void clear(); //optional
// Array Operations
Object[] toArray();
<T> T[] toArray(T[] a);
}
Set
Notice that Set has no get method.
You must use iteration (traversal) to retrieve elements
from a Set.
SetHashSet
add, remove, containsTreeSet
add, remove, containsComparable
Because our dictionary was declared to be of type
Collection<String> we can simply change
the actual type to any Set:
Collection<String> dictionary = new HashSet<String>();
or
Collection<String> dictionary = new TreeSet<String>();
We can write a single program to test all of the collections we are interested in; the program asks the user for a choice of collection and we create a collection of appropriate type:
Collection<String> dictionary = null;
// Type of collection
output.println("Choose a collection");
output.println("1. ArrayList");
output.println("2. LinkedList");
output.println("3. TreeSet");
output.println("4. HashSet");
int collectionType = input.nextInt();
if (collectionType == 1)
{
dictionary = new ArrayList<String>();
}
else if (collectionType == 2)
{
dictionary = new LinkedList<String>();
}
else if (collectionType == 3)
{
dictionary = new TreeSet<String>();
}
else
{
dictionary = new HashSet<String>();
}
See DictionaryAny.java.
| Number of words searched for |
Relative average time | ||||
|---|---|---|---|---|---|
| ArrayList | LinkedList | ArrayList with binary search |
TreeSet | HashSet | |
| 100 | 1,817 | 2,891 | 7 | 6 | 1 |
| 200 | 3,554 | 5,803 | 13 | 12 | 2 |
| 400 | 7,062 | 11,344 | 24 | 26 | 4 |
| 800 | 14,597 | 22,209 | 46 | 49 | 8 |
| 1,600 | 28,004 | 43,998 | 94 | 96 | 15 |
ArrayList has
similar search efficiency to a sorted set
Suppose you have a Collection and you want to find all of the
elements that occur one or more times. You can use a Set:
// cseAcct is a Collection of CSE accounts // Collection<String> cseAcct = ... Set<String> uniqueAcct = new HashSet<String>(cseAcct);
If you want the elements to be sorted, use a TreeSet instead:
Set<String> uniqueAcct = new TreeSet<String>(cseAcct);
In both examples, a constructor is used that creates a set and populates it using the elements from the collection.
Suppose you have a Collection and you want to find all of the duplicated
elements. You can use a pair of Sets:
// cseAcct is a Collection of CSE accounts
// Collection<String> cseAcct = ...
Set<String> uniqueAcct = new HashSet<String>();
Set<String> duplicateAcct = new HashSet<String>();
for (String acct : cseAcct)
{
if (!uniqueAcct.add(acct))
{
duplicateAcct.add(acct);
}
}
See Duplicates.java
Suppose you have a set containing the accounts of students who completed assignment 1 and a second set containing the accounts of students who completed assignment 2. To find the set of students who completed either assignments, you compute the union of the two sets.
s1.addAll(s2) : transforms s1 into the union of
s1 and s2.
// s1 contains the accounts of students who completed assignment 1 // s2 contains the accounts of students who completed assignment 2 Set<String> eitherAssignment = new HashSet<String>(s1); eitherAssignment.retainAll(s2);
Suppose you have a set containing the accounts of students who completed assignment 1 and a second set containing the accounts of students who completed assignment 2. To find the set of students who completed both assignments, you compute the intersection of the two sets.
s1.retainAll(s2) : transforms s1 into the intersection of
s1 and s2.
// s1 contains the accounts of students who completed assignment 1 // s2 contains the accounts of students who completed assignment 2 Set<String> bothAssignments = new HashSet<String>(s1); bothAssignments.retainAll(s2);
Suppose you have a set containing the accounts of students who completed assignment 1 and a second set containing the accounts of students who completed assignment 2. To find the set of students who completed only assignment 1, you compute the set difference:
s1.removeAll(s2) : transforms s1 into the set difference of
s1 and s2.
// s1 contains the accounts of students who completed assignment 1 // s2 contains the accounts of students who completed assignment 2 Set<String> onlyAssignment1 = new HashSet<String>(s1); onlyAssignment1.removeAll(s2);