Assignment title: Information
ENGCOMP 1012 Fall 2016 Assignment 5 Page 1 of 6
COMP 1012 Fall 2016 Assignment 5
Due Date: Friday, December 9, 2016, 11:59 PM
Material Covered
• numpy 2D arrays
• matplotlib
• file processing
Notes:
• When programming, follow the posted programming standards to avoid losing marks. A program called
CheckStandardsV2.py is provided to enable you to check that you are following standards.
• You will hand in two script files for this assignment. Name your script files as follows:
A5Q1.py. For example, LiJaneA5Q1.py is a valid name for a student named
Jane Li. If you wish to add a version number to your file, you may add it to the end of the file name. For
example, SmithRobA5Q1V2.py is a valid name for Rob Smith's script file.
• Submit output in similarly-named output files: e.g., A5Q1Output.png. To
generate a text file, open a new empty tab in Spyder and save it under the name above. Make sure you
choose Text files from the Save as type: list before you save the file! Then copy and paste the output
of your program from the console window to this file, and save it.
• You must complete the checklist for a Blanket Honesty Declaration in UM Learn to have your
assignment counted. This one honesty declaration applies to all assignments in COMP 1012.
• To submit the assignment follow the instructions on the course website carefully (link:
http://www.cs.umanitoba.ca/~comp1012/Documents/Handin_UMLearn.pdf). You will
upload both script file and output via a dropbox on the course website. There will be a period of several
days before the due date when you can submit your assignment. Do not be late! If you try to submit your
assignment within 48 hours after the deadline, it will be accepted, but you will receive a penalty (of 25%
per day late).
Group Work
NOTE: This assignment allows you to work with others as a group.
If you do the assignment all on your own and hand it in, you can get full marks if you achieve about
70% on the assignment. There will be no bonuses for doing better than that.
If you do the work with a friend, then each of you must hand in your own copy of the assignment.
They do not have to be the same, but they can be, except for the inner author identification. If the
solution you hand in is completely correct, you can get full marks, and similarly for your friend.
If you do the work with two others, then each of you must hand in your own copy of the assignment.
Even if the solution you hand in is completely correct, you can get at maximum about 3.5 out of 4
marks. If you are part of a group of four, then each can get a maximum of 3 out of 4 marks. If you are
part of a group of six, then each can get a maximum of about 2.5 out of 4 marks. If you are part of a
group of nine, then each can get a maximum of 2 out of 4 marks. The more people you work with,
the smaller is your possible mark. In general, the maximum mark is min (4, 6/ �) when n people
work as a group.
If you work as part of a group containing James Dean, Gwen Worobec and Calvin Wong as well as
yourself, you must insert lines like the following into your code's initial doc string to avoid a charge COMP 1012 Fall 2016 Assignment 5 Page 2 of 6
of academic dishonesty. The names deanj12, worobg and wongc23 are the UMnetIDs of your
associates.
@with deanj12
@with worobg
@with wongc23
They in turn should name you and one another in their submitted assignments.
If you do the assignment by yourself, put this line into your initial doc string:
@with nobody
Description
In this question, you will extend the work of Assignment 4, Question 1 to be able to approximate
data with a higher order polynomial.
In particular, you will be working with a set of (�, �) pairs, as before, representing a relationship
between an independent variable (�) and a dependent variable (�). You will be given a collection of
such pairs (�!, �!), … , (�!, �!). However, instead of trying to calculate the line of best fit � � =
�� + �, we will be trying to calculate a model polynomial
� � = �!
!
!!!
�!
Thus, we are trying to calculate coefficients �!, �!, … , �! for a degree-m model polynomial, instead
of just coefficients � and �, as in Assignment 4. For example, if � = 2, then the model polynomial
we are trying to calculate is: �! + �!� + �!�!.
For example, the figure below shows a data set (blue dots), and two model polynomials: a linear
polynomial (blue line) and a quadratic (degree two) polynomial (green curve). As you can see,
neither approximates the data set well.
(1)COMP 1012 Fall 2016 Assignment 5 Page 3 of 6
The derivation of a formula for the coefficients of the model polynomial is done in the same way as
for the linear case in Assignment 4 (by taking derivatives and solving), but the result is expressed as
a formula in terms of matrices and vectors:
� = �!� !! �! �
Here, there are three quantities and three operators:
• The output of the formula is � is the vector (�!, �!, … , �!) of coefficients we are solving
for, where � is the degree of the polynomial. This represents the polynomial in equation (1)
that we are looking for.
• � is the vector of data (�!, �!, … , �!), where � is the number of data points.
• � is a special 2D matrix of values called the Vandermonde matrix. It is built using the �!, and
is described below.
• �! is the transpose of �, obtained by flipping � along the major diagonal.
• The power of -1 represents inverting a matrix. The matrix inverse of a matrix A (as in Lab 8)
is the matrix B such that AB is the identity matrix.
• Multiplications in the formula are matrix/vector multiplications, obtained using the dot
product in numpy.
Numpy Operations
To facilitate computing the vector �, three new numpy operations will be useful:
1. numpy.linalg.inv(A): given a square matrix A, this function returns its inverse.
2. numpy.transpose(A): given a 2D matrix A, this function returns its transpose.
3. numpy.vander(xs,m,increasing=True): this function creates 2D Vandermonde
matrices. This is described in detail below.
Vandermonde Matrix
The Vandermonde matrix is used to determine the coefficients �! of the model polynomial, and is
represented by � in formula (2). The Vandermonde matrix is defined using the degree of the model
polynomial and the values �!. (Note that the �! are incorporated in the formula (2) directly.)
For values (�!, �!, . . . , �!) and an integer value �, the Vandermonde matrix � is given by
� =
1 �! �!!
1 �! �!!
1 �! �!!
�!! ⋯ �!!
�!! ⋯ �!!
�!! ⋯ �!!
⋮ ⋮ ⋮
1 �
! �!!
⋮ ⋱ ⋮
�
!
!
⋯ �
!
!
Note that if � is this Vandermonde matrix, and � = (�!, �!, … , �!), then �� is a vector where the ith entry is �(�!), where � is the model polynomial in formula (1) and � is the degree of �. Thus, the
vector � we are calculating is the best solution to � = ��.
To calculate the Vandermonde matrix, use the built-in numpy function numpy.vander(). The
function takes three parameters:
1. The first parameter is a numpy array containing the values �!.
2. The second parameter how many columns the matrix should have. Note that this value
should be one more than the degree of the polynomial that you are trying to fit to the
data.
(2)COMP 1012 Fall 2016 Assignment 5 Page 4 of 6
3. For the third parameter, you should specify "increasing=True". If you don't you will get the
mirror image of the Vandermonde matrix above.
Functions
Write a program that works with two numpy arrays containing the x and y values, and calculates
the coefficients of the model polynmomial. In particular, you should write the following functions:
• loadData(fileName): Load data from the file given by the string variable fileName. The
file is assumed to be in the same directory as the script, and the contents of the file satisfy
the format described in the "File Format" section below (that is, your program can crash if
the file doesn't exist or isn't in the right format). The function should return a 2-tuple
containing x and y values, stored in numpy arrays. This function may have a loop to read the
file.
• calculateModelPoly(xs,ys,deg): return a numpy array that represents the coefficients
model polynomial of degree deg for the data xs and ys. The parameters xs and ys are
numpy arrays. This function cannot contain any loops.
• evaluatePoly(poly,xs): this function takes two parameters: a polynomial poly, given as a
numpy array of coefficients, and xs, a sequence of x values that the polynomial should be
evaluated at, which is a numpy array. The function should return another numpy array of
the same length as xs that contains, in position i, the value of the polynomial poly
evaluated at xi (the i-th entry of xs). This function may have one loop. You can use Horner's
method or standard polynomial evaluation.
• plotModelPolys(xs,ys,start,end,filename): this function takes five parameters: two
numpy arrays xs and ys of the same length, two integers, start and end and a filename
as a string. This function should compute the model polynomials of all degrees d between
start and end (inclusive) and plot them in a matplotlib plot, as well as a scatter plot of the
xs and ys. The function should save the plots as a png in the filename provided, according
to the requirements given below in the section "Model Polynomial Plots". The function
should return nothing. This function may have one loop over the different degrees from
start to end.
• showMenu(): Display a menu to the user allowing them the choose how to generate and
process the data. See below for more details. This function may have loops to validate input.
• showTermination(): show the three termination lines (author, date, "end of processing").
You will not get marks for this function, but it must be present.
For each function, use assertions to verify that numpy arrays have the correct type and that arrays
have the same length when necessary.
Model Polynomial Plots
The function plotModelPolys should ensure the following requirements are met for the plot:
1. All data points should be displayed.
2. The plotted range of the model polynomials should be from the minimum value of the xs to
the maximum value of the xs.
3. The polynomials should be evaluated at 1000 points in the interval.
4. Each polynomial should be coloured with a different colour.
5. Ensure that the output figure has appropriate labels on the x and y axes, as well as a
reasonable title.COMP 1012 Fall 2016 Assignment 5 Page 5 of 6
Menu
Present the user with the following menu:
Select one of the following options:
1. Load a data dataset.
2. Display model polynomials for the dataset.
3. Quit.
As with Assignment 4, there is an order to the commands: if a user selects 2 without 1, then an
error should be shown. Additionally, the menu should behave as follows:
• Every time option 1 is selected, the user should be prompted for a file name.
• Every time option 2 is selected, the user should be prompted for start and end values for
the degree (as used in plotModelPolys), as well as an output filename. This should result in
a figure being saved to the output file with all model polynomials from the start to end
degrees (inclusive).
• If a user input an option that is not 1-3, or a value that is not an integer, an error should be
reported. The menu should be repeatedly displayed until the user selects 3.
Sample Output
An example run of the program is given below. It uses
Select one of the following options:
1. Load a data dataset.
2. Display model polynomials for the dataset.
3. Quit.
Enter your selection: 1
Enter the filename: data1.csv
Data loaded
Select one of the following options:
1. Load a data dataset.
2. Display model polynomials for the dataset.
3. Quit.
Enter your selection: 2
Enter starting degree: 1
Enter ending degree: 2
Enter the output filename: data1.png
Select one of the following options:
1. Load a data dataset. COMP 1012 Fall 2016 Assignment 5 Page 6 of 6
2. Display model polynomials for the dataset.
3. Quit.
Enter your selection: 3
Programmed by the Instructors
Date: Mon Nov 21 16:19:32 2016
End of Processing
Input File Format
The input files are guaranteed to have the following format. The first line of the file contains a single
number N, which is the number of data points. The next N lines contain two integers per line,
separated by a single comma. Each line represents a single data point of the form x,y.
MatplotLib Drawing Hints
Here are two drawing-specific hints for Matplotlib:
• mpl.figure(figsize=(20,10)) will change the size of the figure. You may need to adjust the
size of your figure so that the legend does not overlap with the data.
• mpl.ioff() will ensure that the figure is not drawn in the window, when saving a file.
Hand-in
Submit your script file and your output (txt file) showing the results for all input files provided. Use
appropriate values for the degrees based on the data sets. You should also submit all of the
images generated by the script as png files. Make at least one error in your user input.