Data Structures and Algorithms -- CSci 230
Chapter 5 -- Hashing
Class Outline
Presumably, you have already studied hashing in CS II, so our coverage
of hashing will be brief. The quiz on hashing will be merged into the
quiz on trees, creating a double-weighted quiz.
- In hashing, a key is mapped to a location in a table,
and the key and an associated entry are then stored at that location.
- A primary application is in building symbol tables for
compilers.
- The crucial issues are the mapping, or hash function, and
how to handle collisions, i.e. cases where multiple keys map to the
same table location.
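The mapping and the notion of a collision described above can be sketched in a few lines. This is only an illustration: the names HashIndex and Collides and the multiplier 31 are assumptions made for the example, not the hash function used in class or in the textbook.

```cpp
#include <cstddef>
#include <string>

// Hypothetical polynomial string hash (an assumption for illustration,
// not the function from the course). Maps a key to an index in
// [0, table_size). table_size must be nonzero.
std::size_t HashIndex(const std::string& key, std::size_t table_size) {
    unsigned long long value = 0;
    for (unsigned char c : key) {
        value = value * 31 + c;  // unsigned overflow wraps around, which is harmless here
    }
    return static_cast<std::size_t>(value % table_size);
}

// Two distinct keys collide when they map to the same table location.
bool Collides(const std::string& a, const std::string& b,
              std::size_t table_size) {
    return a != b && HashIndex(a, table_size) == HashIndex(b, table_size);
}
```

With this particular function the keys "Aa" and "BB" both produce the value 2112 before the modulus is taken, so they collide for every table size. Any hash function that maps a large key space into a finite table must produce some collisions of this kind.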
Hash Functions
Handling Collisions
- Handling collisions is usually done via separate
chaining, where each table entry points to a (distinct) linked list.
- Earlier, when conserving memory was more of an issue,
closed hashing techniques (also called open addressing) were heavily
studied. In closed hashing, collisions are resolved by searching for an
empty location in the table itself.
- The attached code gives one implementation of separate
chaining. It differs from the implementation given in the textbook
in several ways. Most importantly, it provides a simple mechanism for
dynamically changing the size of the table!
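For readers without the attached code at hand, the following is a minimal sketch of separate chaining with a resizable table. It is not the attached implementation or the textbook's: the class name ChainedTable, the use of std::hash, the int value type, and the policy of doubling the table when the load factor would exceed 1 are all assumptions chosen for brevity.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of separate chaining; not the attached course code.
class ChainedTable {
    using Chain = std::list<std::pair<std::string, int>>;

public:
    explicit ChainedTable(std::size_t buckets = 8)
        : table_(buckets ? buckets : 1) {}

    // Inserts key/value; returns false if the key is already present.
    bool Insert(const std::string& key, int value) {
        if (Find(key) != nullptr) return false;
        // Assumed policy: double the table when the load factor would exceed 1.
        if (size_ + 1 > table_.size()) Resize(2 * table_.size());
        table_[Index(key, table_.size())].emplace_back(key, value);
        ++size_;
        return true;
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* Find(const std::string& key) const {
        for (const auto& entry : table_[Index(key, table_.size())])
            if (entry.first == key) return &entry.second;
        return nullptr;
    }

    // Removes the key; returns false if it was not present.
    bool Remove(const std::string& key) {
        Chain& chain = table_[Index(key, table_.size())];
        for (auto it = chain.begin(); it != chain.end(); ++it) {
            if (it->first == key) {
                chain.erase(it);
                --size_;
                return true;
            }
        }
        return false;
    }

    // Rebuilds the table with a new number of chains, rehashing every entry.
    void Resize(std::size_t new_buckets) {
        if (new_buckets == 0) new_buckets = 1;
        std::vector<Chain> fresh(new_buckets);
        for (auto& chain : table_)
            for (auto& entry : chain)
                fresh[Index(entry.first, new_buckets)].push_back(std::move(entry));
        table_.swap(fresh);
    }

    std::size_t Size() const { return size_; }
    std::size_t BucketCount() const { return table_.size(); }

private:
    // Maps a key to a chain index in [0, buckets).
    static std::size_t Index(const std::string& key, std::size_t buckets) {
        return std::hash<std::string>{}(key) % buckets;
    }

    std::vector<Chain> table_;  // one linked list per table location
    std::size_t size_ = 0;      // number of stored entries
};
```

Because every entry must be rehashed, a resize costs time proportional to the number of entries. Doubling the table keeps that cost small when averaged over many insertions. Other growth policies are possible, and the attached code may well use a different one.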
In-class Exercises
- 1. Consider computational efficiency issues:
  - (a) What are the average-case and worst-case times for inserting,
    finding and removing an entry from a hash table containing n
    entries?
  - (b) Are these worst-case times likely to occur?
  - (c) How do these compare to those of balanced trees?
- 2. What operations can be done on trees that cannot be done on
  hash tables?
- 3. How would you modify the public interface and the implementation
  of the HashTable class to allow multiple instances of the same
  key?
Review Problems
- 1. Consider the following hashing function,
  similar to hash functions discussed in class.

      unsigned int
      Hash( const string & key, const int h_size )
      {
        unsigned int value = 0;
        for ( string::size_type i=0; i<key.length(); i++ )
          value = (value + key[i]) << 3;   // multiply by 8
        return value % h_size;
      }

  Is this a good hash function when h_size = 128? Why or why not?
- 2. Suppose you need a data structure to support several types of
  operations on a set of strings: insert a string, find a string, delete
  a string, and print the strings in order. Consider the possibilities
  of using either an unbalanced binary search tree, an AVL tree, or a
  hash table with separate chaining to represent the strings. Even
  under these assumptions, which data structure is best will depend on
  the data and on the relative frequencies of insert, find, delete and
  print operations.
  - (a) Under what conditions should you choose a hash table? Why?
  - (b) Under what conditions should you choose an unbalanced binary
    search tree? Why?
  - (c) Under what conditions should you choose an AVL tree? Why?
Charles Stewart
10/8/1998