Due Date: Wednesday, February 16, 11:59:59PM
Value: 50 points
(5 points for following CMSC 210 Coding Standards,
45 points for overall design, functionality, and completeness)
Collaboration: For Assignment 1, collaboration is not allowed; you must work individually. You may still see your TA and come to office hours for help, but you may not work with any other CMSC 210 students. You may post questions on Discord, but you may not post code.
Github Assignment Invite Link: https://classroom.github.com/a/4wCtxJyn
Github Classroom: https://github.com/umbc-cmsc-210-spring-2022
Your assignment file must be called assignment1.py.
This assignment deals with some basic descriptive statistics: specifically, the mean, median, and standard deviation. To compute the standard deviation, you calculate the difference between each value and the mean (the deviation) and square it. Then you find the average of all of these squared deviations (the variance) and take the square root of it.
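The steps described above can be sketched as follows. This is only one possible arrangement, not a required design: the function names compute_mean and compute_std_dev and the sample list are made up for illustration. It divides by the number of values, exactly as the description says.

```python
import math


def compute_mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def compute_std_dev(values):
    """Return the standard deviation using the steps described above."""
    mean = compute_mean(values)
    # Square the difference between each value and the mean.
    squared_deviations = [(value - mean) ** 2 for value in values]
    # Average the squared deviations, then take the square root.
    variance = sum(squared_deviations) / len(values)
    return math.sqrt(variance)


# Hypothetical data: the mean is 5 and the squared deviations sum to 32.
print(compute_std_dev([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```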
The data that you will be working with is real. It is the weight and brain data from 28 animal species.
Your program will read in one filename supplied at the command line and produce a variety of statistics based on the data present.
You are provided with test data files containing final CMSC 201 grades as integer values, all of which are greater than or equal to zero.
The files are what are called comma-separated value, or .csv, files.
That is, they consist of values that are separated from each other, or delimited, by commas. These files will be included in your git template.
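Because the values are delimited by commas, a file like this can be read with nothing more than open() and the string split() method. The sketch below is a minimal illustration under stated assumptions: the function names, the has_header parameter, and the sample rows are all hypothetical, and it assumes a header row followed by data rows. Plain splitting on commas does not handle quoted fields that contain commas.

```python
def parse_csv_lines(lines, has_header=True):
    """Split comma-delimited lines into rows of stripped string fields."""
    rows = []
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            rows.append([field.strip() for field in line.split(",")])
    if has_header:
        return rows[1:]  # drop the row of column names
    return rows


def read_csv_file(filename, has_header=True):
    """Open the named file and return its rows as lists of strings."""
    with open(filename) as data_file:
        return parse_csv_lines(data_file.readlines(), has_header)


# Hypothetical sample rows, not taken from the real data files.
sample = ["name,first,second\n", "alpha,10,20\n", "beta,30,40\n"]
rows = parse_csv_lines(sample)
third_column = [float(row[2]) for row in rows]
print(third_column)  # [20.0, 40.0]
```

Every field comes back as a string, so numeric columns need to be converted with int() or float() before any statistics are computed.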
You will need to import the Python pre-defined math library for this assignment in order to use the sqrt() (square root) function.
You may also import the Python statistics library if you wish. You can use the mean(), median(), and stdev() functions to check the correctness of your own functions (i.e. that they get the same results).
But remove the calls to these functions and the import statistics statement before turning in your assignment.
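A temporary check of this kind might look like the sketch below. The compute_mean function and the data list are hypothetical stand-ins for your own code. One thing to watch for: statistics.stdev() computes the sample standard deviation (it divides by n - 1), while statistics.pstdev() computes the population standard deviation (it divides by n), so the two give different results on the same data. Which one the sample output uses is not stated here, so compare against the sample output as well.

```python
import math
import statistics  # temporary: remove before turning in


def compute_mean(values):
    """Hypothetical stand-in for your own mean function."""
    return sum(values) / len(values)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up test values

# Compare floating-point results with a tolerance instead of ==.
print(math.isclose(compute_mean(data), statistics.mean(data)))

# The two library functions differ on the same data.
print("population:", statistics.pstdev(data))
print("sample:    ", statistics.stdev(data))
```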
For this assignment, you may not import any other Python libraries.
For this assignment, you may assume that the user will enter valid filenames into prompts for input.
If the user enters a different type of data than you asked for (for example, a file that isn't present), your program may crash. This is acceptable.
The sample output is available as a separate file in your project template. The format of your output does not have to exactly match the sample output, but it should be similar and neat.
You are not required to turn a design in for this assignment. However, you will benefit greatly from taking the time to do a proper one!
In CMSC 201, we discussed top-down design. That is, you begin with your main function’s design and break it down into smaller and smaller pieces (functions). This is a good approach. Before coding, take the time to thoroughly think through the program logic.
Take an incremental approach to the implementation and testing of your program. That is, do not use the “big bang” approach of implementing all or large parts of your program before you test it. Test as you go!
A top-down approach can also be taken when implementing and testing your program. (Some people prefer bottom-up, but top-down is recommended.) The bulleted items below should be of help.
- Begin by implementing the main() function. You’ll need to create function “stubs” for any functions that main() calls. Have the stubs simply print a message such as, “Function X called.” You may also need return statements – just return any constant of the appropriate type.
- Run main() to make sure that there are no syntax errors (of course!) and that its logic works as expected. Are all function calls working?
You may find that you need to adjust your program design as you implement. That’s natural. However, if you find yourself making major adjustments, you need to go back and rethink your overall design. Don’t worry – it happens!
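As an illustration of the stub approach, a first version of the program might look something like the sketch below. The function names, the placeholder filename, and the constants returned are all hypothetical; your own design will likely differ.

```python
def read_data(filename):
    """Stub: will eventually read the values from the data file."""
    print("Function read_data called.")
    return [1, 2, 3]  # any constant list will do for now


def compute_mean(values):
    """Stub: will eventually compute the mean of the values."""
    print("Function compute_mean called.")
    return 0.0  # any constant of the appropriate type


def main():
    # The logic of main() can be tested before the real functions exist.
    values = read_data("placeholder.csv")
    mean = compute_mean(values)
    print("Mean:", mean)


if __name__ == "__main__":
    main()
```

Once main() runs cleanly, each stub can be replaced with its real implementation and tested one at a time.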
If you find that your algorithm for a statistical function requires you to sort the list of data, the function must first copy the data to a temporary list before sorting. That way, when the function returns, the original list sent into the function has not been corrupted.
If you do find that you need to sort, you may use the Python list method sort() or the built-in function sorted().
Look at your Python references (discussed in Lecture 02) for how to use them.
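For example, a median function could sort a copy of the data as sketched below. The function name compute_median and the sample list are hypothetical. Note that sorted() already returns a new list, so the original is left alone, whereas the list method sort() sorts in place and would need an explicit copy first (for example, temp = list(values) followed by temp.sort()).

```python
def compute_median(values):
    """Return the median without modifying the list passed in."""
    ordered = sorted(values)  # sorted() builds a new, sorted list
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]  # odd count: the single middle value
    # Even count: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


data = [7, 1, 5, 3]  # made-up values
print(compute_median(data))  # 4.0
print(data)  # [7, 1, 5, 3] (original order preserved)
```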
You may also need to use min() and/or max(), which is allowed.
As we have not yet discussed Program Assumptions (part of the file header comment), you may use the following exactly as written in your file header comment. Make sure that you read through the assumptions; they impact how you design your program.
Program Assumptions:
- Each data file will be a comma-separated file with a single record of integer values >= 0
- Each data file will contain > 2 values
- Each file's data forms a normal distribution (i.e. unimodal).
- The user will always enter at least one filename
- The user will always enter a valid filename (i.e. the file exists)
- The file will always have three columns
- The first row will always be the names of the columns
We will cover the use of Github in class and provide walk-throughs for submitting assignments.
In the meantime, you can start to develop locally. The source data can be downloaded from the Github project for the assignment.
Your computed values will be compared to the sample output OR the output of the statistics library for grading.
Following coding standards | 10 points |
Correctly reading the csv file | 20 points |
Correctly computing the median | 20 points |
Correctly computing the mean | 20 points |
Correctly computing the standard deviation | 30 points |