Assignment 5
Due: 30 Nov 2011
Educational objectives:
- Primary objectives: Implement an efficient data structure for inserting words into a dictionary and searching to see if a word is in a dictionary.
- Secondary objectives: Empirically compare the performance of different data structures.
Statement of work: Implement a hash function for strings that performs well for the dictionary application when used with the STL tr1/unordered_set container, implement an efficient data structure for the dictionary application, and compare the performance of your data structure against the STL set and the STL tr1/unordered_set with both the default hash function and your hash function. You will be graded on the performance and correctness of your code, so please enable compiler optimization flags (for example, -O2 with g++) in your makefile.
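As an illustration of what a hash function object for strings can look like, here is a minimal sketch. It uses the FNV-1a hash, which is only one possible choice and not necessarily the best one for this dictionary. The sketch assumes a C++11 compiler and the std::unordered_set header; the alias MyHashSet is a hypothetical name introduced for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>

// Sketch of a hash function object: 64-bit FNV-1a over the bytes of the string.
// This is an illustrative choice, not a prescribed solution.
struct MyHash {
    std::size_t operator()(const std::string& s) const {
        std::uint64_t h = 0xcbf29ce484222325ULL;   // FNV-1a 64-bit offset basis
        for (unsigned char c : s) {
            h ^= c;                                // mix in one byte
            h *= 0x100000001b3ULL;                 // FNV 64-bit prime
        }
        return static_cast<std::size_t>(h);
    }
};

// Hypothetical alias: an unordered_set of strings that uses MyHash
// in place of the default hash function.
typedef std::unordered_set<std::string, MyHash> MyHashSet;
```

With the TR1 headers the container would instead be std::tr1::unordered_set from <tr1/unordered_set>; the hash object is passed as the second template argument in the same way.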
Deliverables:
- Turn in a makefile and all header (*.h) and cpp (*.cpp) files that are needed to build your software, as described in www.cs.fsu.edu/~asriniva/courses/DS11/HWinstructions.html. Turn in your development log too, which should be a plain ASCII text file called LOG.txt in your project directory.
Requirements:
- Create a subdirectory called proj5.
- You will need to have a makefile in this directory. In addition, all the header and cpp files needed to build your software must be present here, as well as the LOG.txt file.
- You should create the following additional files.
- MyHash.h: This should implement a hash function object that maps an STL string to an integer type.
- MyDS.h/MyDS.cpp: These files should provide the interface and implementation for your data structure. You are free to implement any data structure you wish, including designing your own data structure or combining multiple data structures. This data structure should store STL strings. It should be in a class called MyDS and implement at least the following member functions: (i) default constructor, (ii) void push(const string &), (iii) bool search(const string &), and (iv) destructor, which perform the operations expected from their names. You should not use any STL container other than strings to implement your data structure.
- compare.cpp: This program will be compiled to create an executable called compare-containers, and the executable will be run as follows.
./compare-containers Filename
where Filename is the name of a file containing words separated by whitespace. Each word consists of a string of lower case letters. This code should store all the lower case words in the dictionary available in /usr/share/dict/words on linprog in four different containers: (i) STL set, (ii) STL unordered_set with the default hash function, (iii) STL unordered_set with your hash function from MyHash.h, and (iv) MyDS. It will then check if each word in Filename is present in the standard dictionary using each of the four containers. For each word, it will output
Answer Container
where Answer = Y if the word is present and N if it is not, and Container = set/hash/myhash/myds. For example:
Y set
Y hash
Y myhash
Y myds
Y set
Y hash
Y myhash
Y myds
N set
N hash
N myhash
N myds
After handling all the words in the input file, your code should output the time taken for storing the entire dictionary, the minimum search time for a word, the maximum search time for a word, and the average search time, for each container. For example:
set: store dictionary 10.1 s, search: min 0.01 s, max 1.0 s, mean 0.05 s
hash: store dictionary 2.2 s, search: min 0.02 s, max 1.5 s, mean 0.04 s
myhash: store dictionary 9.1 s, search: min 0.01 s, max 0.09 s, mean 0.03 s
myds: store dictionary 1.1 s, search: min 0.001 s, max 0.02 s, mean 0.003 s
- result.txt: This is an ASCII text file. It should first describe your data structure (MyDS). It should then discuss the relative performances of the four data structures.
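To make the required MyDS interface concrete, here is a minimal sketch of a class with the four required members. It assumes a trie over the letters a to z, which is only one of many possible designs and is not claimed to be the fastest. The helper names Node, destroy, and lowercase_only are hypothetical, and the sketch assumes a C++11 compiler. It uses no STL container other than string.

```cpp
#include <string>

// Sketch of MyDS as a trie: each node has one child slot per lower case letter.
class MyDS {
private:
    struct Node {
        Node* child[26];
        bool is_word;                       // true if a stored word ends here
        Node() : is_word(false) {
            for (int i = 0; i < 26; ++i) child[i] = nullptr;
        }
    };

    // Recursively free a subtree.
    static void destroy(Node* n) {
        if (!n) return;
        for (int i = 0; i < 26; ++i) destroy(n->child[i]);
        delete n;
    }

    // Assumption of this sketch: only words made of 'a'..'z' are stored.
    static bool lowercase_only(const std::string& w) {
        for (char c : w)
            if (c < 'a' || c > 'z') return false;
        return true;
    }

    Node* root_;

public:
    MyDS() : root_(new Node()) {}
    ~MyDS() { destroy(root_); }

    // Copying is disabled in this sketch to avoid double deletion.
    MyDS(const MyDS&) = delete;
    MyDS& operator=(const MyDS&) = delete;

    // Insert a word; words with other characters are ignored.
    void push(const std::string& word) {
        if (!lowercase_only(word)) return;
        Node* n = root_;
        for (char c : word) {
            Node*& next = n->child[c - 'a'];
            if (!next) next = new Node();
            n = next;
        }
        n->is_word = true;
    }

    // Return true if the word was pushed earlier. Declared const here;
    // it can still be called exactly as the specification describes.
    bool search(const std::string& word) const {
        if (!lowercase_only(word)) return false;
        const Node* n = root_;
        for (char c : word) {
            n = n->child[c - 'a'];
            if (!n) return false;
        }
        return n->is_word;
    }
};
```

A trie of this kind makes search time proportional to the word length, but it spends 26 pointers per node, so memory use and cache behavior are worth measuring before settling on it.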
Note:
- We will test your MyDS class on a piece of code that we will write. So it is important for this class to be exactly as specified.
Bonus points (5):
You may get up to 5 additional points if your code is correct and the fastest in the class. You may get up to 2 bonus points if MyDS works correctly and is faster than the two STL containers (for the purpose of bonus points, the speed of the STL containers will be measured by a piece of code that we write).
Copyright: Ashok Srinivasan, Florida State University.
Last modified: 30 Nov 2011