Educational Objectives: After successfully completing this assignment, the student should be able to accomplish the following:
Operational Objectives: Create a project that computes the mean and median of a sequence of integers received via standard input.
Deliverables: Files: stats.h, stats.cpp, main.cpp, log.txt. Note that with the supplied makefile, these files constitute a self-contained project.
Assessment Rubric: The following will be used as a guide when assessing the assignment:
build test.x                           [0..4]:   x
test.x  < data1.in [supplied main()]   [0..4]:   x
test.x  < data2.in [supplied main()]   [0..4]:   x
build stats.x                          [0..4]:   x
stats.x < data1.in [student main()]    [0..4]:   x
stats.x < data2.in [student main()]    [0..4]:   x
code quality                         [-20..6]:  xx  # note negative points awarded during assessment
dated submission deduction      [(2) pts per]: (xx) # note negative points awarded during assessment
                                                --
total                                 [0..30]:  xx
Please self-evaluate your work as part of the development process.
Given a finite collection of n numbers:
Note that to find the median of a collection of data, it is convenient to first sort the data, that is, put the data in increasing (or non-decreasing) order. Then the median is just the middle datum in the sorted sequence (or the average of the two middle data, if there are an even number).
One of the simplest sort algorithms is called Selection Sort, which operates on an array of elements and has a computation which can be described in one sentence: For each element of the array, find the smallest element with equal or higher index in the array and swap these two elements. Here is a "pseudocode" description of the algorithm:
for i in [0...n)        // for each element of array A
  k = i                 // find the smallest element following it
  for j in [i+1...n)
    if A[j] < A[k]
      k = j
    endif
  endfor
  // now A[k] is the smallest element following A[i]
  swap the values in A[i] and A[k]
endfor
(You could test whether A[k] < A[i] before the swap, but it is not clear this would speed up the process - swapping may be faster than testing.)
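The pseudocode above can be sketched in C++ roughly as follows. This is a minimal illustration, not a reference solution: the names Swap and Sort follow the prototypes required by this assignment, while the assert precondition and the use of std::size_t from <cstddef> are assumptions added for the sketch.

```cpp
#include <cassert>
#include <cstddef>

// Interchanges the values of x and y.
void Swap(int& x, int& y)
{
  int tmp = x;
  x = y;
  y = tmp;
}

// Selection Sort: for each index i, find the smallest element with
// index >= i and swap it into position i.
void Sort(int* array, std::size_t size)
{
  assert(array != nullptr || size == 0);  // assumed precondition
  for (std::size_t i = 0; i < size; ++i)
  {
    std::size_t k = i;  // index of the smallest element seen so far
    for (std::size_t j = i + 1; j < size; ++j)
    {
      if (array[j] < array[k])
        k = j;
    }
    // array[k] is now the smallest element in array[i..size)
    Swap(array[i], array[k]);
  }
}
```

When k equals i the swap exchanges an element with itself, which is harmless with the temporary-variable version of Swap shown here.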
Begin a log file named log.txt. This should be an ASCII text file in cop3330/proj1 with the following header:
log.txt # log file for stats project
<date file created>
<your name>
<your CS username>
This file should document all work done by date and time, including all testing and test results.
Create and work within a separate subdirectory cop3330/proj1. Review the COP 3330 rules found in Introduction/Work Rules.
Copy these files
LIB/proj1/makefile
LIB/proj1/deliverables.sh
LIB/scripts/submit.sh
from the course distribution library into your project directory.
Create three more files
stats.h
stats.cpp
main.cpp
complying with the Technical Requirements and Specifications stated below.
Turn in four files stats.h, stats.cpp, main.cpp, and makefile using the submit.sh submit script.
Warning: Submit scripts do not work on the program and linprog servers. Use shell.cs.fsu.edu to submit projects. If you do not receive the second confirmation with the contents of your project, there has been a malfunction.
After submission, take Quiz 1 in Blackboard. This quiz covers these areas:
Note that the quiz may be taken several times. The highest of the grades will be recorded and count as 20 points (40 percent of the assignment).
The project should compile error- and warning-free on linprog with the command make stats.x.
The number of integers input by the user is not known in advance, except that it will not exceed 100. Numbers are input through standard input, either from keyboard or file re-direct. The program should read numbers until a non-digit or end-of-file is encountered or 100 numbers have been read.
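One way to structure such an input loop is sketched below. ReadData and maxSize are hypothetical names introduced only for this illustration; they are not required by the assignment. The sketch relies on the standard behavior of operator>> on an input stream: extraction fails, and the loop stops, when the next token is not an integer or when end-of-file is reached.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>  // only needed to exercise ReadData on an in-memory string

const std::size_t maxSize = 100;  // hypothetical name for the input limit

// Hypothetical helper: reads integers from in into array until extraction
// fails (non-integer token or end-of-file) or capacity values are stored.
// Returns the number of values stored.
std::size_t ReadData(std::istream& in, int* array, std::size_t capacity)
{
  assert(array != nullptr || capacity == 0);  // assumed precondition
  std::size_t size = 0;
  int value;
  // The capacity test comes first so that no value beyond the limit
  // is consumed from the stream.
  while (size < capacity && (in >> value))
  {
    array[size] = value;
    ++size;
  }
  return size;
}
```

Calling ReadData(std::cin, data, maxSize) reads from the keyboard or from a redirected file. Passing a std::istringstream instead makes the helper easy to check without typing input.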
Once the input numbers have been read, the program should calculate the mean and median and then report these values to standard output.
The source code should be structured as follows:
float Mean   (const int* array, size_t size); // calculates mean of data in array
float Median (int* array, size_t size);       // calculates median of data in array
void  Swap   (int& x, int& y);                // interchanges values of x and y
void  Sort   (int* array, size_t size);       // sorts the data in array
The source code should be organized as follows:
The Sort() function should implement the Selection Sort algorithm.
When in doubt, your program should behave like the distributed executable example stats_i.x in area51. Identical behavior is not required, but the general I/O behavior should be the same. In particular, the data input loop should not be interrupted by prompts for a next datum, because this would make file redirect cumbersome. Just ask for the data one time, then read until a non-digit or end of file is encountered.
Sample executables are distributed in [LIB]/area51. These are named stats_i.x and stats_s.x. The suffixes indicate which of the two architectures the executable is compiled on: *_i.x runs on Intel/Linux and *_s.x runs on Sun/Unix.
To run a sample executable, follow these steps:
(1) Decide which architecture you want to use. The program machines are 32-bit Sun architecture running Sun's version of Unix, and the linprog machines are Intel 64-bit architecture running Linux.
(2) Copy the appropriate executable into your space where you want to run it. For example, if you are logged in to program enter the command "cp [LIB]/area51/stats_s.x .".
(3) Change permissions to executable: "chmod 700 stats_s.x".
(4) Execute by entering the name of the executable. If you want to run it on a data file "data1", use input redirect as in: "stats_s.x < data1". If you want the output to go to another file, use output redirect: "stats_s.x < data1 > data1.out".
A working makefile is distributed and may be used in the submission - provided that you have read and understood the makefile, so that when a makefile is required in the future you will know how to create one.
Test files can be created using the program ranint.cpp, which is distributed as part of the assignment and is compiled by the supplied makefile. To create random data files for testing, first build ranint.x with the command
make ranint.x
and then execute. Note that the program expects 3 command line arguments - (1) file name, (2) upper bound on size of integers, and (3) number of elements to generate. It will remind you if you forget. Here are examples:
~/3330/proj1>ranint.x
** required arguments:
     1: filename
     2: upper bound on absolute size ('0' means no upper bound)
     3: count of items
** try again
~/3330/proj1>
(Forgot to give arguments.)
~/3330/proj1>ranint.x d1 99 51
Results stored in file d1
  range: -99 .. 98
  count: 51
~/3330/proj1>ranint.x d2 99 52
Results stored in file d2
  range: -99 .. 98
  count: 52
~/3330/proj1>
The less-than character in the command:
stats.x < data1
is a Unix/Linux operation that redirects the contents of data1
into standard input for stats.x. Using > redirects program output. For
example, the command:
stats.x < data1 > data1.out
sends the contents of data1 to standard input and then sends the
program output into the file data1.out. These are very handy operations
for testing programs.
It is sometimes simpler to develop the code in a single file (such as project.cpp) that can be edited in one window and test-compiled with a single command (such as g++ -Wall -Wextra -ostats.x project.cpp) and split the file up into the deliverables after the initial round of testing and debugging.
Note that the array in which input is stored is passed to the functions as a pointer. In the case of Mean(), this pointer is const, indicating that the elements of the array may not be changed by the call. However in the case of Median(), the array element values are allowed to change. These values are in fact changed by the call to Sort().
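The role of the const pointer can be illustrated with a minimal sketch of Mean(). The signature follows the prototype given in this assignment. The non-empty precondition and the use of a double accumulator are assumptions made for the sketch, not stated requirements.

```cpp
#include <cassert>
#include <cstddef>

// Calculates the mean of the data in array.
// Because array is declared const int*, a statement such as
//   array[0] = 0;
// inside this function would be rejected by the compiler.
float Mean(const int* array, std::size_t size)
{
  assert(size > 0);  // assumed precondition: the mean of no data is undefined
  double sum = 0.0;  // accumulate in double to limit rounding error
  for (std::size_t i = 0; i < size; ++i)
    sum += array[i];
  return static_cast<float>(sum / size);
}
```

A caller may pass an ordinary int array to Mean(); the const qualifier only restricts what the function itself may do with the elements.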
The function Sort() operates on the array input as a pointer. When the function returns, the values of the array should be in increasing order.
The Selection Sort algorithm requires a nested pair of loops (one inside the other).
Sorting the data is essential to calculating the median: once the array is sorted, the middle value (or the middle two values) is found at the middle index (or the middle two indices) of the array.
The middle index of an array of n elements, when n is odd, is [(n-1)/2]. The middle two indices, when n is even, are [n/2 - 1] and [n/2].
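The index arithmetic above can be sketched as follows. To keep the example short and self-contained, std::sort from <algorithm> stands in for the Sort() function this project requires; in the actual deliverable the call would be to your own Selection Sort. The non-empty precondition is also an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Calculates the median of the data in array; the array is sorted in place.
float Median(int* array, std::size_t size)
{
  assert(size > 0);  // assumed precondition: no median of an empty collection
  std::sort(array, array + size);  // stand-in for the project's Sort()
  if (size % 2 == 1)
  {
    // odd count: the single middle element is at index (n-1)/2
    return static_cast<float>(array[(size - 1) / 2]);
  }
  // even count: average the elements at indices n/2 - 1 and n/2;
  // size >= 2 here, so size / 2 - 1 cannot wrap around
  return (static_cast<float>(array[size / 2 - 1]) +
          static_cast<float>(array[size / 2])) / 2.0f;
}
```

For the data 4 1 3 2 the sorted order is 1 2 3 4, the middle indices are 1 and 2, and the result is (2 + 3) / 2 = 2.5.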
Be careful when subtracting 1 from an unsigned integer type such as size_t.
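The hazard can be seen in a small sketch. Predecessor, DemonstrateWrap, and LowerMiddleIndex are hypothetical names used only for illustration. The sketch relies on the fact that arithmetic on unsigned types wraps around, so subtracting 1 from zero yields the largest representable value rather than -1.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Returns n - 1 computed in size_t arithmetic.
// For n == 0 the result wraps around to the largest size_t value.
std::size_t Predecessor(std::size_t n)
{
  return n - 1;
}

// Demonstrates the wrap-around: 0 - 1 is not -1 for an unsigned type.
void DemonstrateWrap()
{
  assert(Predecessor(0) == std::numeric_limits<std::size_t>::max());
}

// Guarded computation of the lower middle index n/2 - 1.
// Returns false, leaving index untouched, when size is too small.
bool LowerMiddleIndex(std::size_t size, std::size_t& index)
{
  if (size < 2)
    return false;  // size / 2 - 1 would wrap around for size 0 or 1
  index = size / 2 - 1;
  return true;
}
```

The same wrap-around means a loop condition such as i >= 0 is always true when i is a size_t, so test the size before subtracting rather than checking for a negative result afterward.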
Look at the code examples in Chapter 2 of the lecture notes to find simple ways to structure your main I/O loop.