slanted W3C logo

Day 29 — Word Games

Word Games

From Day 01:

Word games seem to be very popular...

The fundamental component of a word game is a dictionary. You need a dictionary to validate words entered by the user.

Creating a Dictionary

You can think of a dictionary as being one of:

Happily, List, Set, and SortedSet are all sub-interfaces of Collection and Collection has a method named contains that returns true if the collection contains a specified element. Implementing a dictionary is as simple as creating a Collection.

Creating a Dictionary

You can find open source lists of English words fairly easily on the Internet (here). I have concatenated several of the lists together to form a dictionary of common English words that you can download here. The dictionary file has one word per line.

create a collection for the dictionary
open the dictionary file
for each line of the file
   read the word
   add the word to the dictionary

Creating a Dictionary

import java.util.Collection;
import java.util.ArrayList;
import java.util.Scanner;
import java.io.*;

public class Dictionary
{
   public static void main(String[] args) throws IOException
   {
      PrintStream output = System.out;
      Scanner input = new Scanner(System.in);
      
      Collection<String> dictionary = new ArrayList<String>();
      
      final String DICT_NAME = "dictionary.txt";
      Scanner dictionaryInput = new Scanner(new File(DICT_NAME));
      
      while (dictionaryInput.hasNextLine())
      {
         String word = dictionaryInput.nextLine().trim();
         dictionary.add(word);
      }
      
      output.printf("Dictionary has %d words%n", dictionary.size());
   }
}

Looking up Words in the Dictionary

Looking up a word in the dictionary is as easy as invoking the contains method from the Collection interface:

prompt the user for a word
while there is still input
   read the word
   if the word is in the dictionary
      output "in the dictionary"
   else
      output "not in the dictionary"
   prompt the user for a word
output.print("Enter a word to look up: ");
while (input.hasNextLine())
{
   String word = input.nextLine().trim().toLowerCase();
   if (dictionary.contains(word))
   {
      output.printf("%s is in the dictionary%n%n", word);
   }
   else
   {
      output.printf("%s is not in the dictionary%n%n", word);
   }
   output.print("Enter a word to look up: ");
}

See Dictionary.java.

Efficiency of contains

As a client, you should not care how a method is implemented. However, you might speculate that contains must search the entire List to determine if an element is in the collection. In fact, the ArrayList version of contains has complexity O(N).

get and set have O(1) complexity for ArrayList.

add to the end of the list has O(1) amortized complexity for ArrayList (adding N elements to the end of a list has complexity O(N)).

add to a particular location inside the list has O(N) complexity (because elements need to be moved down the list to make room for the new element).

Sorting the List

We can avoid the O(N) search for a word in the dictionary by sorting the list and using Collections.binarySearch.

After reading in the list of words:

Collections.sort(dictionary);

To look up a word:

if (Collections.binarySearch(dictionary, word) >= 0)
{
   output.printf("%s is in the dictionary%n%n", word);
}
else
{
   output.printf("%s is not in the dictionary%n%n", word);
}

Set

From the Java Tutorial:

A Set is a Collection that cannot contain duplicate elements. It models the mathematical set abstraction. The Set interface contains only methods inherited from Collection and adds the restriction that duplicate elements are prohibited.

Set

public interface Set<E> extends Collection<E> {
   // Basic operations
   int size();
   boolean isEmpty();
   boolean contains(Object element);
   boolean add(E element);         //optional
   boolean remove(Object element); //optional
   Iterator<E> iterator();

   // Bulk operations
   boolean containsAll(Collection<?> c);
   boolean addAll(Collection<? extends E> c); //optional
   boolean removeAll(Collection<?> c);        //optional
   boolean retainAll(Collection<?> c);        //optional
   void clear();                              //optional

   // Array Operations
   Object[] toArray();
   <T> T[] toArray(T[] a);
}

Set

Notice that Set has no get method. You must use iteration (traversal) to retrieve elements from a Set.

Classes Implementing Set

Dictionary Revisited

Because our dictionary was declared to be of type Collection<String> we can simply change the actual type to any Set:

Collection<String> dictionary = new HashSet<String>();

or

Collection<String> dictionary = new TreeSet<String>();

One Program to Test them All

We can write a single program to test all of the collections we are interested in; the program asks the user for a choice of collection and we create a collection of appropriate type:

Collection<String> dictionary = null;

// Type of collection
output.println("Choose a collection");
output.println("1. ArrayList");
output.println("2. LinkedList");
output.println("3. TreeSet");
output.println("4. HashSet");

int collectionType = input.nextInt();
if (collectionType == 1)
{
   dictionary = new ArrayList<String>();
}
else if (collectionType == 2)
{
   dictionary = new LinkedList<String>();
}
else if (collectionType == 3)
{
   dictionary = new TreeSet<String>();
}
else
{
   dictionary = new HashSet<String>();
}

See DictionaryAny.java.

Results for Several Collections

Number of words
searched for
Relative average time
ArrayList LinkedList ArrayList with
binary search
TreeSet HashSet
100 1,817 2,891 7 6 1
200 3,554 5,803 13 12 2
400 7,062 11,344 24 26 4
800 14,597 22,209 46 49 8
1,600 28,004 43,998 94 96 15