Predict the Next Play in an NFL Game
Due: 6 Nov 2013
Educational objectives: Experience implementing a self-adjusting BST, developing tests to verify the correctness of your program, and solving problems using the above class.
Statement of work: (i) Implement a self-adjusting BST class that self-organizes as specified below and (ii) implement a simple NFL play prediction program, which is a modification of that in assignment 3, but using your self-adjusting BST to store the plays.
Deliverables: Turn in a makefile and all header (*.h) and cpp (*.cpp) files that are needed to build your software. Turn in your development log too, which should be a plain ASCII text file called LOG.txt in your project directory. Also turn in an ASCII file, testing.txt, describing how you tested your code. This file should describe any remaining run time errors in your program. You will lose points for errors that we discover which were not identified by you in the above file. You will submit all of these as described in www.cs.fsu.edu/~asriniva/courses/DS13/HWinstructions.html.
Requirements:
- Create a subdirectory called proj4.
- You will need to have a makefile in this directory. In addition, all the header and cpp files needed to build your software must be present here, as well as the LOG.txt and testing.txt files.
- You should implement appropriate classes for the software. Your code should be well designed and object oriented. You should have a member function in each significant class to help you perform automated tests on your code, similar to those in our solution to assignment 1.
- You should implement a test.cpp file that performs automated tests on your code, similar to the one in our solution to assignment 1. Your makefile should compile this to an executable named TestBST.
- You should implement a self-adjusting BST class that self-organizes as described below. You must use this class to store plays. Your implementation may be specific to this requirement, instead of being a generic templated class. You are free to choose the specific features you wish to implement, but they should be reasonable. For example, you will certainly need to implement a method that lets you insert plays into the BST.
- Your software's main task is as follows. A user will give it a list of .csv files. Each file contains a list of plays in each NFL game for one year. The user will then execute a sequence of queries to predict the next play that is likely. You may imagine, for instance, that the coach for the defense will use this to predict what the offense will do next in order to prepare for it. The queries may be of three types. A summary query will predict the probability of different types of plays. A list query will list plays that the offense executed in similar game situations.
- The software is run by the user on the command line, as follows: Analyze Year-List, where Year-List is a non-empty list of valid years separated by whitespace. The following years are valid: 2007, 2008, 2009, 2010, 2011, and 2012. This instructs the software to analyze data for the specified years. Data for the year n is present in the file n.csv. Each line of this file contains information on a particular play, except the first line, which gives field/column headings. The fields in a line are separated by commas. The relevant fields for us are the following: 2. quarter, 3. minutes remaining in the game, 5. team name for offense, 6. team name for defense, 7. down, 8. yards to go for the next down, 9. starting location for that down, 10. description of the play. Some of the fields may be empty.
The description is a string, which we will use to determine the type of play. We give below play types of interest to us and how they are identified, based on words in the description.
- Deep pass right: presence of the words 'deep', 'pass', and 'right' in the description.
- Deep pass left: presence of the words 'deep', 'pass', and 'left' in the description.
- Deep pass middle: presence of the words 'deep', 'pass', and 'middle' in the description.
- Short pass right: presence of the words 'short', 'pass', and 'right' in the description.
- Short pass left: presence of the words 'short', 'pass', and 'left' in the description.
- Short pass middle: presence of the words 'short', 'pass', and 'middle' in the description.
- Run to the right: presence of the word 'right' in the description, but not 'pass'.
- Run to the left: presence of the word 'left' in the description, but not 'pass'.
- Run to the middle: presence of the word 'middle' in the description, but not 'pass'.
- Field goal attempt: presence of the words 'field' and 'goal' in the description.
- Punt: presence of the word 'punts' in the description.
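The rules above can be sketched as a small classification function. This is only a minimal illustration, not a required design: the names PlayType, contains, and classify are hypothetical, matching is done by case-sensitive substring search, and the order in which the rules are tried is an assumption, since the rules do not say which type wins when a description matches more than one of them.

```cpp
#include <string>

// Play types from the list above; Other covers descriptions that match no rule.
enum class PlayType {
    DeepPassRight, DeepPassLeft, DeepPassMiddle,
    ShortPassRight, ShortPassLeft, ShortPassMiddle,
    RunRight, RunLeft, RunMiddle,
    FieldGoal, Punt, Other
};

// True if `word` occurs anywhere in `text`.
// Assumption: a case-sensitive substring match is good enough.
inline bool contains(const std::string& text, const std::string& word) {
    return text.find(word) != std::string::npos;
}

// Hypothetical helper: maps a play description to a PlayType.
// Assumption: the rules are tried in the order pass, field goal, punt, run.
PlayType classify(const std::string& desc) {
    const bool pass = contains(desc, "pass");
    const bool right = contains(desc, "right");
    const bool left = contains(desc, "left");
    const bool middle = contains(desc, "middle");

    if (pass && contains(desc, "deep")) {
        if (right) return PlayType::DeepPassRight;
        if (left) return PlayType::DeepPassLeft;
        if (middle) return PlayType::DeepPassMiddle;
    }
    if (pass && contains(desc, "short")) {
        if (right) return PlayType::ShortPassRight;
        if (left) return PlayType::ShortPassLeft;
        if (middle) return PlayType::ShortPassMiddle;
    }
    if (contains(desc, "field") && contains(desc, "goal")) return PlayType::FieldGoal;
    if (contains(desc, "punts")) return PlayType::Punt;
    if (!pass) {
        // Runs are identified by a direction word without 'pass'.
        if (right) return PlayType::RunRight;
        if (left) return PlayType::RunLeft;
        if (middle) return PlayType::RunMiddle;
    }
    return PlayType::Other;  // not a play type of interest
}
```

Plays classified as Other would presumably not be stored, since they are not relevant to any query.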
- The software first reads each file specified through the command line and stores relevant information in the self-adjusting BST. You should not store plays or information that are not relevant to our program. The OFF field is used to compare plays, using the string class's compare function, with a negative result indicating <. (Recall that a comparison is required to insert objects into a BST.) If these fields are identical for two records, then the one from the earlier year is considered smaller. If these too are identical, then the one occurring earlier in the input file is considered smaller. The software then waits for a series of user inputs from stdin, and responds to each user input as described below.
Possible user actions and required software response:
summary OFF DOWN TOGO YDLINE: The fields in capitals specify the offense team, down, yards to go, and location in the field. The software should identify all plays executed in a similar situation and give the percentage of times each type of play was executed. For a play to be considered similar, it should be by the same team in offense and the same down. In addition, the yards to go should be within one yard of the above and the field position should be within 10% of the above. If no similar play exists, then output No similar play exists to standard output (not to standard error).
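The similarity test and the percentage computation might look roughly like the following sketch. Everything named here (Play, isSimilar, summarize, kNumTypes) is hypothetical. Two assumptions are made loudly: "within 10%" is read as |ydline - YDLINE| <= 0.1 * YDLINE (another reading would be 10 yards on a 100-yard field), and the plays are passed in a vector for illustration, whereas the assignment requires them to come from the BST. Counts are kept in a plain vector indexed by play type, since the STL map and set classes are not allowed.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

const int kNumTypes = 11;  // the eleven play types of interest

// Hypothetical record holding only the fields a summary query needs.
struct Play {
    std::string off;  // offense team
    int down;
    int togo;
    int ydline;
    int type;         // index in [0, kNumTypes)
};

// Same offense and down, yards to go within one yard, and field position
// within 10% of YDLINE (assumed interpretation, see the text above).
bool isSimilar(const Play& p, const std::string& off, int down, int togo, int ydline) {
    return p.off == off && p.down == down &&
           std::abs(p.togo - togo) <= 1 &&
           10 * std::abs(p.ydline - ydline) <= ydline;  // integer form of <= 0.1 * YDLINE
}

// Returns the percentage of similar plays per type, or an empty vector
// if no play is similar (the caller would print "No similar play exists").
std::vector<double> summarize(const std::vector<Play>& plays, const std::string& off,
                              int down, int togo, int ydline) {
    std::vector<int> counts(kNumTypes, 0);
    int total = 0;
    for (const Play& p : plays) {
        if (isSimilar(p, off, down, togo, ydline)) {
            ++counts[p.type];
            ++total;
        }
    }
    std::vector<double> percent;
    if (total == 0) return percent;
    for (int c : counts) percent.push_back(100.0 * c / total);
    return percent;
}
```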
list n MIN OFF DEF DOWN TOGO YDLINE: The additional fields here denote the minutes remaining and defense team respectively. The software outputs the n most relevant similar plays. If fewer than n plays are similar, then all the similar plays are output. Plays are considered similar as defined above. Relevance is a floating point number defined as: -(|MIN-min|*5/3 + |TOGO-togo| + |YDLINE-ydline|). In addition, if the defense teams are identical, add 100 to the relevance. The fields marked in lower case denote corresponding fields in the play database. Each line of the output will give the type of play followed by min off def down togo ydline of the play, followed by its relevance. The plays are output in decreasing order of relevance (that is, most relevant play first). If multiple plays have the same relevance value, then the one that is smaller according to the BST comparison criterion is less relevant. If no play is similar, then output No similar play to standard output.
This command call also causes self-reorganization of the BST. The most relevant play that has been output due to the list command at least three times prior to the current command is moved to the root through rotations. For example, if Play 1 is the most relevant but has been output only once before and Play 2 is the second most relevant and has been output 5 times earlier, then Play 2 will be moved to the root. If none of the n most relevant plays has been output at least 3 prior times, then no self-reorganization takes place.
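As a hedged sketch of the ranking described above, the code below computes the relevance, orders plays with the tie-break, and picks the play to move to the root. All names (Play, Query, Scored, lessThan, relevance, topN, pickForRoot) and the fields year, seq, and timesListed are hypothetical. Note that the factor 5/3 has to be evaluated in floating point; with integer operands, 5/3 would evaluate to 1 in C++.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical play record.
struct Play {
    std::string off, def;
    int year;         // year of the input file
    int seq;          // position of the play within its input file
    int minutes, down, togo, ydline;
    int timesListed;  // how often earlier list commands output this play
};

// BST ordering: offense name, then year, then position in the file.
bool lessThan(const Play& a, const Play& b) {
    const int c = a.off.compare(b.off);
    if (c != 0) return c < 0;
    if (a.year != b.year) return a.year < b.year;
    return a.seq < b.seq;
}

// Upper-case query fields that enter the relevance formula.
struct Query { int minutes; std::string def; int togo; int ydline; };

double relevance(const Play& p, const Query& q) {
    double r = -(std::abs(q.minutes - p.minutes) * 5.0 / 3.0 +
                 std::abs(q.togo - p.togo) + std::abs(q.ydline - p.ydline));
    if (p.def == q.def) r += 100.0;  // bonus for an identical defense
    return r;
}

struct Scored { const Play* play; double rel; };

// Decreasing relevance; on ties the BST-smaller play is less relevant.
bool moreRelevant(const Scored& a, const Scored& b) {
    if (a.rel != b.rel) return a.rel > b.rel;
    return lessThan(*b.play, *a.play);
}

// Returns at most n entries, most relevant first.
std::vector<Scored> topN(std::vector<Scored> scored, std::size_t n) {
    std::sort(scored.begin(), scored.end(), moreRelevant);
    if (scored.size() > n) scored.resize(n);
    return scored;
}

// Most relevant output play that was listed at least 3 times before the
// current command, or nullptr if no reorganization should take place.
const Play* pickForRoot(const std::vector<Scored>& top) {
    for (const Scored& s : top)
        if (s.play->timesListed >= 3) return s.play;
    return nullptr;
}
```

In this sketch the caller would increment timesListed for each output play only after pickForRoot has run, so that the count reflects prior commands only.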
print n: The first n plays in a level-order traversal of the BST are output. The format is the same as that for list, but without the relevance.
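A level-order traversal is also a convenient way to check the effect of the rotations triggered by list. The sketch below is a simplification under stated assumptions: nodes hold an int key as a stand-in for a play record, all names (Node, insert, moveToRoot, levelOrder, destroy) are hypothetical, and the node is moved to the root by plain single rotations on the way back up the search path. The assignment only says "through rotations", so other rotation schemes may be equally acceptable.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

struct Node {
    int key;  // stand-in for a play; a real node would hold a play record
    Node* left;
    Node* right;
    explicit Node(int k) : key(k), left(nullptr), right(nullptr) {}
};

// Ordinary BST insertion; keys are assumed to be distinct.
Node* insert(Node* root, int key) {
    if (!root) return new Node(key);
    if (key < root->key) root->left = insert(root->left, key);
    else root->right = insert(root->right, key);
    return root;
}

// Single rotations; each returns the new root of the rotated subtree.
Node* rotateRight(Node* n) { Node* l = n->left; n->left = l->right; l->right = n; return l; }
Node* rotateLeft(Node* n) { Node* r = n->right; n->right = r->left; r->left = n; return r; }

// Moves the node holding `key` to the root, one rotation per level.
// If the key is absent, the tree is left unchanged.
Node* moveToRoot(Node* root, int key) {
    if (!root || root->key == key) return root;
    if (key < root->key) {
        root->left = moveToRoot(root->left, key);
        return (root->left && root->left->key == key) ? rotateRight(root) : root;
    }
    root->right = moveToRoot(root->right, key);
    return (root->right && root->right->key == key) ? rotateLeft(root) : root;
}

// Returns the first n keys of a level-order (breadth-first) traversal.
std::vector<int> levelOrder(const Node* root, std::size_t n) {
    std::vector<int> out;
    std::queue<const Node*> pending;
    if (root) pending.push(root);
    while (!pending.empty() && out.size() < n) {
        const Node* cur = pending.front();
        pending.pop();
        out.push_back(cur->key);
        if (cur->left) pending.push(cur->left);
        if (cur->right) pending.push(cur->right);
    }
    return out;
}

// Frees every node of the tree.
void destroy(Node* root) {
    if (!root) return;
    destroy(root->left);
    destroy(root->right);
    delete root;
}
```

For example, inserting 50, 30, 70, 20, 40 in that order and then moving 40 to the root changes the level-order sequence from 50, 30, 70, 20, 40 to 40, 30, 50, 20, 70, while the in-order sequence stays sorted.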
x: Quit the program.
- Output Invalid command to standard output for any other command.
A sample executable will not be provided. You need to develop good test cases to verify the correctness of your program. The .csv files are already available under the NFLData subdirectory of proj1.
Notes:
1. Your program should not have any output other than those specified above.
2. You should not use the STL map or set classes. You may use the string class, STL algorithms, and functionals.
3. Your program should be reasonably efficient.
Copyright: Ashok Srinivasan, Florida State University.
Last modified: 16 Oct 2013