Project 6: RabinKarp

Monte Carlo & Las Vegas Substring Search

Educational Objectives: After completing this assignment, the student should be able to accomplish the following:

Operational Objectives: Design and implement the class template RabinKarp<Alphabet>, which acts as a function object on input strings, returning the location of the first match or the length of the input string when no match is found.

You may work in teams of 2 to 4 people. The team should compose a brief summary of work that explains each member's responsibilities and work products. Each team member should also submit the project individually. Please make certain that the submissions from all members of a team are identical.

Change Log: This document is released for use and comment. Details may be changed or augmented based on commentary posted in the Project 6 Topics forum. Any substantive changes will be logged here.

Deliverables: Files:

rabinkarp.h    # definition & implementation of template< A > class RabinKarp
alphabet.h     # definition & implementation of class Alphabet
rkdriver.cpp   # driver program 
makefile       # builds all executables in project, including tests
readme.txt     # overview of team & project

Procedural Requirements

  1. The official development | testing | assessment environment is g++47 -std=c++11 -Wall -Wextra on the linprog machines. Code should compile without error or warning.

  2. Each member of a team submits all team deliverables.

  3. Deliverables submitted should be identical across all team members.

  4. The team makeup is listed in the file header documentation of each submitted file (see C++ Style link for standards).

  5. File readme.txt explains how the software was developed, what responsibilities each team member had, how it was tested, and how it is expected to be operated.

  6. Copy the file LIB/proj6/submit.sh into your project directory and change its permissions to executable. Edit the file "deliverables.RK" to the specific files in your project. Submit the project by executing the script: submit.sh deliverables.RK

    Warning: Submit scripts do not work on the program and linprog servers. Use shell.cs.fsu.edu to submit projects. If you do not receive the second confirmation with the contents of your project, there has been a malfunction.

Code Requirements and Specifications

  1. Class RabinKarp<char>

    1. The class should have a method void Init(const char* p) that stores a copy of p, where p is a null-terminated C-string. [p is the pattern to be matched in incoming strings.] A 1-argument constructor should accomplish the same result.
    2. The class should be a function class, so that instances of the class are function objects. In other words, if rk is an instance of the class [an object] initialized with pattern p and s is a C-string of length n [a null-terminated array of char] then rk(s,ensure) returns the position of the first match of p in s, or n if there is no match. "Match" is subject to either the Monte Carlo or Las Vegas rule, depending on the value of ensure.

  2. Monte Carlo Rule. A reported match is correct with high probability; a hash collision can occasionally produce a false match. The runtime is guaranteed O(n + k), where n is incoming string length and k is the pattern length.

  3. Las Vegas Rule. The pattern match is guaranteed. The runtime is O(n + k) with high probability, where n is incoming string length and k is the pattern length.

  4. Evaluation operator prototype. Recall that a function class is one for which operator() is overloaded as a class operator. The prototype for this operator-method in RabinKarp should be:

    size_t operator() ( const char* s , bool ensure = 0 ) const;
    

    The bool ensure signals whether the Monte Carlo [ ensure = 0 ] or Las Vegas [ ensure = 1 ] rule is used. The default rule is Monte Carlo.

  5. Probability. There should be a method float Probability() const that returns the probability estimate of success under either rule.

  6. Generalizations. Two directions for generalization should be discussed in readme.txt. First, briefly discuss how the design might accommodate alphabets other than char. Second, discuss how the design could be generalized (or perhaps revamped entirely) to accommodate multi-dimensional patterns of characters. The 2-dimensional case can be used for this discussion.
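
To make the interface in items 1 through 4 concrete, the following is a minimal sketch for the char case only, not a reference solution. The class name RabinKarpSketch, the constants kRadix and kModulus, the helper Hash, and the treatment of an empty pattern (reported as a match at position 0) are all assumptions made for illustration; the specification above does not fix any of them. Probability() is omitted. The sketch uses a rolling polynomial hash, and the ensure flag decides whether a hash hit is accepted as is (Monte Carlo) or verified character by character (Las Vegas).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Assumed parameters, chosen only for illustration.
const std::uint64_t kRadix   = 256;         // one "digit" per char value
const std::uint64_t kModulus = 1000000007;  // a prime small enough to avoid overflow

class RabinKarpSketch  // hypothetical name; not the required class template
{
public:
  RabinKarpSketch() : pattern_(), patternHash_(0), highPower_(1) {}
  explicit RabinKarpSketch(const char* p) { Init(p); }

  // Store a copy of the pattern and precompute its hash and radix^(k-1) mod q.
  void Init(const char* p)
  {
    pattern_     = p;
    patternHash_ = Hash(pattern_.c_str(), pattern_.size());
    highPower_   = 1;
    for (std::size_t i = 1; i < pattern_.size(); ++i)
      highPower_ = (highPower_ * kRadix) % kModulus;
  }

  // Returns the position of the first match, or strlen(s) if none is found.
  std::size_t operator()(const char* s, bool ensure = 0) const
  {
    const std::size_t n = std::strlen(s);
    const std::size_t k = pattern_.size();
    if (k == 0) return 0;  // assumption: the empty pattern matches at 0
    if (k > n)  return n;
    std::uint64_t h = Hash(s, k);  // hash of the window s[0..k)
    for (std::size_t i = 0; ; ++i)
    {
      // Monte Carlo: trust the hash. Las Vegas: verify the window.
      if (h == patternHash_ &&
          (!ensure || std::memcmp(s + i, pattern_.data(), k) == 0))
        return i;
      if (i + k == n) break;  // the last window has been examined
      // Roll the window: drop s[i], append s[i + k].
      const std::uint64_t out = static_cast<unsigned char>(s[i]);
      const std::uint64_t in  = static_cast<unsigned char>(s[i + k]);
      h = (h + kModulus - (out * highPower_) % kModulus) % kModulus;
      h = (h * kRadix + in) % kModulus;
    }
    return n;
  }

private:
  // Polynomial hash of the first len characters of s, reduced mod kModulus.
  static std::uint64_t Hash(const char* s, std::size_t len)
  {
    std::uint64_t h = 0;
    for (std::size_t i = 0; i < len; ++i)
      h = (h * kRadix + static_cast<unsigned char>(s[i])) % kModulus;
    return h;
  }

  std::string   pattern_;
  std::uint64_t patternHash_;
  std::uint64_t highPower_;
};
```

With rk constructed from the pattern "cad", rk("abracadabra") uses the Monte Carlo rule and rk("abracadabra", true) uses the Las Vegas rule. A fixed modulus keeps the sketch short; a design that chooses the modulus at random would be one way to support a meaningful probability estimate, but that choice is left open here.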

Hints