This repo shows one way of achieving the Hashing/Hash Tables using open-addressing and quadratic probing.
Instructions will be as to compile and run on the NJIT AFS servers' Linux.
g++ -std=c++11 hashing.cpp -o hashing
./hashing fileOption1.txt
2 OPTION 1: Hashing ( 160 points) We are asking you to implement a Lexicon structure to store arbitrarily long strings of ASCII chars (i.e.words). Lexicon
L
uses a Hash TableT
structure along with an ArrayA
ofNUL
separated words. In our case words are going to be English character words only (upper-case or lower case). TableT
will be organized as a hash-table using collision-resolution by open-addressing as specified in class. You are going to use quadratic probing forh(k,i)and keep the choice of the quadratic function simple:i**2
so thath(k,i) = (h′(k)+i**2)modN
. The keys that you will hash are going to be English words. Thus functionh′(k)
is also going to be kept simple: the sum of the ASCII/Unicode values of the characters minus 2 modN
, whereN
is the number of slots ofT
. Thus"alex"
is mapped to97+108+101+120−2 modN
whateverN
is. In the example below, forN=11
,h(alex,0) =6
. TableT
however won't store keyk
in it. This is because the keys are of arbitrary length. Instead,T
will store pointers/references in the form of an index to another arrayA
. The second table, arrayA
will be a character array and will store the words maintained inT
separated by NUL values\0
. This is not 2 characters a backslash followed by a zero. It is 1B (ASCII), 2B (UNICODE) whose all bits are set to 0, the NUL value. If you don't know what B is, it is a byte; read Handout 3. An insertion operation affectsT
andA
. A wordw
is hashed, an available slot inT
is computed and let that slot bet
. InT[t]
we store an index to tableA
. This index is the first location that stores the first character ofw
. The ending location is the\0
followingw
inA
. New words that do not exist (never inserted, or inserted but subsequently deleted) are appended inA
. Thus originally you need to be wise enough in choosing the appropriate size ofA
. If at some point you run-out of space, you increaseT
and thus this increasesA
as well. DoublingT
i.e.N
, is an option. This causes problems that you also need to attend to. A deletion will modifyT
as needed but will not erasew
fromA
. Let it be there. SoA
might get dirty (i.e. it contains garbage) after several deletions. If several operations later you end up insertingw
after deleting it previously, you do it the insertion way and you reinsertw
, even if a dirty copy of it might still be around. You DO NOT DO a linear search to find out if it exists arleady inA
; it is inefficient. There is not much to say for a search. You need to support few more operations:Create
,Empty
/Full
/Batch
with the last of those checking for an empty or full table/array and a mechanism to perform multiple operations in batch form.T
and its contents i.e. index values toA
. In addition it prints nicely (linear-wise in oneline) the contents ofA
. (For a\0
you will do the SEMI obvious: print a backslash but not its zero). The intent ofA
for deleted words. It prints stars for every character of a deleted word instead. (An alternative is that during deletion each such character has already been turned into a star.) FunctionCreate
createsT
withN
slots, andA
with15N
chars and initializesA
to spaces. We call a class that supports and realizesA
andT
alexicon
:L
is one instance of a lexicon. Your code should thus implement as functions minimally the functions mentioned above:Create
,Empty
,Full
,Batch
,Insert
,Delete
,Search
. Testing utilizes aBatch
function with argument a filename that is going to be provided as a commandline argument. That file consists of multiple lines containing one command per line. An example file is provided in Section B of the course web-page related to the example below. Each command is a numeric equivalent of the function named earlier plus one more (for comment). Command 10 isInsert
, Command 11 isDeletion
, and Command 12 isSearch
. Command 13 isCreate
. Command 15 isComment
: the rest of the line marked by a 15 is ignored. Command 14 forCreate
has an argument which is the value ofN
. Each one of 10, 11, and 12 has an argument which is a string.