9.1. Hashing   

We are interested in building and searching a symbol table because this is what a compiler must do. With a var declaration, we will get symbols (variable names) and put them and other information such as type, address and scope into a table. Later we will use that table of names to check to see if all variables used have been declared: i.e. we search the symbol table to see if it contains the name of each variable used.

Since this happens all the time for compilers (searching for variables, constants , keywords, procedure and function names) we want a very fast algorithm. Hashing is the fastest method.


static sometype i,j,k; Build symbol table
{
i = 12; Use
j = k; symbol
z = 57.0; table
:
}

Hashing is very simple:

we build the symbol table in a special way by putting the
symbols we encounter into specific positions as dictated by
some function.
(When we built the binary search tree we also built it in a
special manner; this gave it its speed in searching.)
Since we know where we put them, we can easily find them at
a later stage when searching the table. If the item is not
where we would have put it when building the symbol table,
then the item was not put into the table to start with.
As simple as that.

Finding an item in a table should take only one or two comparisons on average no matter how big the table is. The basic idea of the method is simple: rather than looking at the whole table to find where something is, we apply a function to the key to tell us where to go in the table immediately. We index into the table (in this case an array) using some function of the key.