Project 1: Stats

Finding the mean and median of numerical data

Educational Objectives: After successfully completing this assignment, the student should be able to accomplish the following:

Operational Objectives: Create a project that computes the mean and median of a sequence of integers received via standard input.

Deliverables: Files: stats.h, stats.cpp, main.cpp, log.txt. Note that with the supplied makefile, these files constitute a self-contained project.

Assessment Rubric: The following will be used as a guide when assessing the assignment:

build test.x                          [0..4]:   x
test.x < data1.in  [supplied main()]  [0..4]:   x
test.x < data2.in  [supplied main()]  [0..4]:   x
build stats.x                         [0..4]:   x
stats.x < data1.in [student main()]   [0..4]:   x
stats.x < data2.in [student main()]   [0..4]:   x
code quality                        [-20..6]:  xx  # note negative points awarded during assessment
dated submission deduction [(2) pts per]:     (xx) # note negative points awarded during assessment
                                               --
total                                [0..30]:  xx

Please self-evaluate your work as part of the development process.

Background

Given a finite collection of n numbers:

  1. The mean is the sum of the numbers divided by n, and
  2. The median is the middle value (in case n is odd) or the average of the two middle values (in case n is even).

Note that to find the median of a collection of data, it is convenient to first sort the data, that is, put the data in increasing (or non-decreasing) order. Then the median is just the middle datum in the sorted sequence (or the average of the two middle data, if n is even).
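
As an illustration of the two definitions, here is a minimal sketch. It assumes std::vector and std::sort are acceptable for demonstration purposes only; the names SketchMean and SketchMedian are hypothetical and are not the Mean() and Median() functions this project requires (those take a pointer and a size, and must sort with Selection Sort). Returning 0 for an empty collection is also an assumption made here, not something the assignment specifies.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: mean of the values in data.
float SketchMean(const std::vector<int>& data)
{
  if (data.empty()) return 0.0f;  // assumption: report 0 for empty input
  long sum = 0;
  for (std::size_t i = 0; i < data.size(); ++i)
    sum += data[i];
  // Cast before dividing; dividing two integers would truncate the result.
  return static_cast<float>(sum) / static_cast<float>(data.size());
}

// Hypothetical helper: median of the values in data.
// The vector is taken by value, so the caller's copy keeps its order.
float SketchMedian(std::vector<int> data)
{
  if (data.empty()) return 0.0f;  // assumption: report 0 for empty input
  std::sort(data.begin(), data.end());
  const std::size_t n = data.size();
  if (n % 2 == 1)
    return static_cast<float>(data[n / 2]);  // odd n: the middle value
  // even n: average of the two middle values
  return (static_cast<float>(data[n / 2 - 1]) + static_cast<float>(data[n / 2])) / 2.0f;
}
```

For the data 4 1 3 2, the sum is 10 and n is 4, so the mean is 2.5; the sorted order is 1 2 3 4, so the median is (2 + 3) / 2 = 2.5. For the data 5 1 3, the sorted order is 1 3 5 and the median is the middle value 3.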

One of the simplest sort algorithms is called Selection Sort, which operates on an array of elements and has a computation which can be described in one sentence: For each element of the array, find the smallest element with equal or higher index in the array and swap these two elements. Here is a "pseudocode" description of the algorithm:

for i in [0...n)       // for each element of array A
  k = i                // find the smallest element at index i or higher
  for j in [i+1...n)
    if A[j] < A[k]
      k = j
    endif
  endfor               // now A[k] is the smallest element among A[i], ..., A[n-1]
  swap the values in A[i] and A[k]
endfor

(You could test whether A[k] < A[i] before the swap, but it is not clear this would speed up the process - swapping may be faster than testing.)
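
The pseudocode above translates almost line for line into C++. The following is a sketch only: the name SelectionSortSketch is hypothetical, and std::swap stands in for the Swap() function that the project requires you to write yourself.

```cpp
#include <cstddef>
#include <utility>  // std::swap

// Hypothetical illustration of Selection Sort on an array of n ints.
void SelectionSortSketch(int* a, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
  {
    std::size_t k = i;  // index of the smallest element seen so far
    for (std::size_t j = i + 1; j < n; ++j)
    {
      if (a[j] < a[k])
        k = j;
    }
    // a[k] is now the smallest element among a[i], ..., a[n-1]
    std::swap(a[i], a[k]);  // unconditional swap, as in the pseudocode
  }
}
```

After pass i of the outer loop, the first i + 1 positions hold the i + 1 smallest values in sorted order, so the whole array is sorted once the loop finishes. When k equals i the swap exchanges an element with itself, which is harmless.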

Procedural Requirements:

  1. Begin a log file named log.txt. This should be an ASCII text file in cop3330/proj1 with the following header:

    log.txt # log file for Stats project
    <date file created>
    <your name>
    <your CS username>
    

    This file should document all work done by date and time, including all testing and test results.

  2. Create and work within a separate subdirectory cop3330/proj1. Review the COP 3330 rules found in Introduction/Work Rules.

  3. Copy these files

    LIB/proj1/makefile
    LIB/proj1/deliverables.sh
    LIB/scripts/submit.sh
    

    from the course distribution library into your project directory.

  4. Create three more files

    stats.h
    stats.cpp
    main.cpp
    

    complying with the Technical Requirements and Specifications stated below.

  5. Turn in four files stats.h, stats.cpp, main.cpp, and makefile using the submit.sh submit script.

    Warning: Submit scripts do not work on the program and linprog servers. Use shell.cs.fsu.edu to submit projects. If you do not receive the second confirmation with the contents of your project, there has been a malfunction.

  6. After submission, take Quiz 1 in Blackboard. This quiz covers these areas:

    1. Casting; integer and floating point arithmetic.
    2. Function calls
    3. Loops
    4. This assignment
    5. Course Syllabus

    Note that the quiz may be taken several times. The highest grade will be recorded and counts as 20 points (40 percent of the assignment).

Technical Requirements and Specifications

  1. The project should compile error- and warning-free on linprog with the command make stats.x.

  2. The number of integers input by the user is not known in advance, except that it will not exceed 100. Numbers are input through standard input, either from keyboard or file re-direct. The program should read numbers until a non-digit or end-of-file is encountered or 100 numbers have been read.

  3. Once the input numbers have been read, the program should calculate the mean and median and then report these values to standard output.

  4. The source code should be structured as follows:

    1. Implement separate functions with the following prototypes:
      float Mean   (const int* array, size_t size); // calculates mean of data in array
      float Median (int* array, size_t size);       // calculates median of data in array
      void  Swap   (int& x, int& y);                // interchanges values of x and y
      void  Sort   (int* array, size_t size);       // sorts the data in array
      
    2. I/O is handled by function main(); no other functions should do any I/O
    3. Function main() calls Mean() and Median()
    4. Function Median() calls Sort()
    5. Function Sort() calls Swap()

  5. The source code should be organized as follows:

    1. Prototypes for Mean, Median, Sort, and Swap should be in file stats.h
    2. Implementations for Mean, Median, Sort, and Swap should be in file stats.cpp
    3. Function main should be in file main.cpp

  6. The Sort() function should implement the Selection Sort algorithm.

  7. When in doubt, your program should behave like the distributed executable example stats_i.x in area51. Identical behavior is not required, but the general I/O behavior should be the same. In particular, the data input loop should not be interrupted by prompts for a next datum, since that would make file redirect cumbersome. Just ask for the data one time, then read until a non-digit or end of file is encountered.
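
The input behavior described in requirements 2 and 7 can be sketched as a single loop that stops on the first failed extraction or when the array is full. The helper below is hypothetical (ReadDataSketch is not part of the required design) and is written as a separate function taking a stream only so that the loop can be shown and tried in isolation; in the project itself all I/O belongs in main(), reading from std::cin.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>  // only needed for the string-based usage described below

// Hypothetical helper: read up to capacity ints from in into data and
// return how many were stored. No prompt is issued inside the loop.
std::size_t ReadDataSketch(std::istream& in, int* data, std::size_t capacity)
{
  std::size_t count = 0;
  int value;
  // The capacity test comes first, so no extra value is consumed once the
  // array is full. The extraction fails, ending the loop, on a non-digit
  // or at end of file.
  while (count < capacity && (in >> value))
  {
    data[count] = value;
    ++count;
  }
  return count;
}
```

For example, given an array of 100 ints and a std::istringstream holding "3 1 2 x 5", the call ReadDataSketch(in, data, 100) stores 3, 1, 2 and returns 3, because extraction fails at the x. The same loop works unchanged with keyboard input or file redirect when the stream is std::cin.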

Hints