From Day 01:
Word games seem to be very popular...
The fundamental component of a word game is a dictionary. You need a dictionary to validate words entered by the user.
You can think of a dictionary as being one of:
Happily, List
, Set
, and SortedSet
are all sub-interfaces of Collection
and Collection
has a method named contains
that returns true
if
the collection contains a specified element. Implementing
a dictionary is as simple as creating a Collection
.
You can find open source lists of English words fairly easily on the Internet (here). I have concatenated several of the lists together to form a dictionary of common English words that you can download here. The dictionary file has one word per line.
create a collection for the dictionary open the dictionary file for each line of the file read the word add the word to the dictionary
import java.util.Collection; import java.util.ArrayList; import java.util.Scanner; import java.io.*; public class Dictionary { public static void main(String[] args) throws IOException { PrintStream output = System.out; Scanner input = new Scanner(System.in); Collection<String> dictionary = new ArrayList<String>(); final String DICT_NAME = "dictionary.txt"; Scanner dictionaryInput = new Scanner(new File(DICT_NAME)); while (dictionaryInput.hasNextLine()) { String word = dictionaryInput.nextLine().trim(); dictionary.add(word); } output.printf("Dictionary has %d words%n", dictionary.size()); } }
Looking up a word in the dictionary is as easy as invoking
the contains
method from the Collection
interface:
prompt the user for a word while there is still input read the word if the word is in the dictionary output "in the dictionary" else output "not in the dictionary" prompt the user for a word
output.print("Enter a word to look up: ");
while (input.hasNextLine())
{
String word = input.nextLine().trim().toLowerCase();
if (dictionary.contains(word))
{
output.printf("%s is in the dictionary%n%n", word);
}
else
{
output.printf("%s is not in the dictionary%n%n", word);
}
output.print("Enter a word to look up: ");
}
See Dictionary.java.
contains
As a client, you should not care how a method is implemented.
However, you might speculate that contains
must
search the entire List
to determine if an element
is in the collection. In fact, the ArrayList
version of contains
has complexity O(N).
get
and set
have O(1) complexity
for ArrayList
.
add
to the end of the list has O(1) amortized
complexity for ArrayList
(adding N elements
to the end of a list has complexity O(N)).
add
to a particular location inside the list
has O(N) complexity (because elements need to be moved
down the list to make room for the new element).
We can avoid the O(N) search for a word in the dictionary
by sorting the list and using Collections.binarySearch
.
After reading in the list of words:
Collections.sort(dictionary);
To look up a word:
if (Collections.binarySearch(dictionary, word) >= 0) { output.printf("%s is in the dictionary%n%n", word); } else { output.printf("%s is not in the dictionary%n%n", word); }
Set
From the Java Tutorial:
A Set
is a Collection
that cannot
contain duplicate elements.
It models the mathematical set abstraction.
The Set
interface contains only methods
inherited from Collection
and adds the
restriction that duplicate elements are prohibited.
Set
public interface Set<E> extends Collection<E> { // Basic operations int size(); boolean isEmpty(); boolean contains(Object element); boolean add(E element); //optional boolean remove(Object element); //optional Iterator<E> iterator(); // Bulk operations boolean containsAll(Collection<?> c); boolean addAll(Collection<? extends E> c); //optional boolean removeAll(Collection<?> c); //optional boolean retainAll(Collection<?> c); //optional void clear(); //optional // Array Operations Object[] toArray(); <T> T[] toArray(T[] a); }
Set
Notice that Set
has no get
method.
You must use iteration (traversal) to retrieve elements
from a Set
.
Set
HashSet
add
, remove
, contains
TreeSet
add
, remove
, contains
Comparable
Because our dictionary was declared to be of type
Collection<String>
we can simply change
the actual type to any Set
:
Collection<String> dictionary = new HashSet<String>();
or
Collection<String> dictionary = new TreeSet<String>();
We can write a single program to test all of the collections we are interested in; the program asks the user for a choice of collection and we create a collection of appropriate type:
Collection<String> dictionary = null; // Type of collection output.println("Choose a collection"); output.println("1. ArrayList"); output.println("2. LinkedList"); output.println("3. TreeSet"); output.println("4. HashSet"); int collectionType = input.nextInt(); if (collectionType == 1) { dictionary = new ArrayList<String>(); } else if (collectionType == 2) { dictionary = new LinkedList<String>(); } else if (collectionType == 3) { dictionary = new TreeSet<String>(); } else { dictionary = new HashSet<String>(); }
See DictionaryAny.java.
Number of words searched for |
Relative average time | ||||
---|---|---|---|---|---|
ArrayList | LinkedList | ArrayList with binary search |
TreeSet | HashSet | |
100 | 1,817 | 2,891 | 7 | 6 | 1 |
200 | 3,554 | 5,803 | 13 | 12 | 2 |
400 | 7,062 | 11,344 | 24 | 26 | 4 |
800 | 14,597 | 22,209 | 46 | 49 | 8 |
1,600 | 28,004 | 43,998 | 94 | 96 | 15 |
ArrayList
has
similar search efficiency to a sorted set