MIS770, T1 2017 Assignment Two 1 | P a g e
FACULTY OF BUSINESS AND LAW
Department of Information Systems and Business Analytics
MIS770 – Foundation Skills in Data Analysis
Assignment 2 – Automotive CO2 Emissions Analysis
Particulars
• Marks: 30%
• Words: 2,000
• Submission: Online to the MIS770 assignment two drop box in CloudDeakin
Email submissions will not be accepted
• Note: This assignment is to be completed individually
Assurance of Learning
This assignment assesses the following Graduate Learning Outcomes and related Unit Learning
Outcomes:
Graduate Learning Outcome (GLO)
• GLO4: Critical thinking: evaluating information using critical and analytical thinking and judgment
Unit Learning Outcomes (ULO)
• ULO2: Manipulate and summarise data that accurately represents real world problems
• ULO3: Interpret and appraise statistical output to assist in real‐world decision making
Overview
The purpose of this assignment is to investigate a dataset which will enable you to answer questions
posed in a Memorandum (see Memorandum section below). In order to answer the memorandum
questions, you’ll need to analyse a given dataset, interpret the results, and then draw appropriate
conclusions.
The aims of the assignment are to:
• provide you with some examples of the application of data analysis
• test your understanding of the material in the relevant topics
• test your ability to analyse and interpret your results
• test your ability to effectively communicate the results of your analysis to others
Before attempting the assignment, make sure you have studied the materials well. At a minimum,
please read the relevant sections of the prescribed textbook and review the materials provided in
Modules 1, 2 and 3 (simple regression).
Scenario
You play the role of Mira Hetnal in the Ministry of Transport’s Research and Analysis Department and
you have been asked to respond to a Memorandum from Selina Wang, the Chief Data Analyst. To
assist you in answering Selina’s questions, she has provided you with a dataset called
Motor_Vehicles.xlsx.
For the purposes of this assignment, the dataset relates to a random sample of new Canadian Motor
Vehicles whose CO2 Emissions were tested during 2015.
The specific questions Selina has for Mira are in the following memorandum.
Memorandum
Memorandum
Date: 4th January, 2017
To: Mira Hetnal, Research and Analysis Department
From: Selina Wang, Chief Data Analyst
Subject: Analysis of Automotive CO2 Emissions Data
Dear Mira,
Can you please carry out an analysis of the recent Automotive CO2 Emissions Data (contained in the
file Motor_Vehicles.xlsx) and prepare a Memorandum reply to me containing answers to the following
questions. In your Memorandum, please use plain language as I will provide your reply directly to
people who do not necessarily understand statistical jargon.
My specific questions are:
Q1. An Overall View of CO2 Emissions
Can you provide me with an overall summary of the variable CO2 Emissions just by itself?
Q2. Relationships with CO2 Emissions
Does there appear to be any relationship between the CO2 Emissions and the type of Fuel used?
Q3. Confidence Intervals
(a) Can you estimate the level of CO2 Emissions for all 4 Cylinder, 6 Cylinder and 8 Cylinder
vehicles? Does there appear to be any difference?
(b) Also, can you estimate the proportion of all vehicles that have 4 Cylinders, 6 Cylinders and 8
Cylinders?
Q4. Hypothesis Tests
Last month a national newspaper published an article stating the Federal Government was
investigating a proposal to restrict CO2 Emissions for new vehicles to no more than 350 grams per
kilometre. The same article suggested that this would remove at least 5% of the largest polluting
vehicles off the road. Are you able to confirm if the sample data we have (i.e. Motor_Vehicles.xlsx)
supports this claim?
Q5. Simple Regression
(a) I don’t know a lot about vehicle CO2 Emissions, but I would think that the larger an engine was,
the greater the CO2 Emissions. Would you therefore create a regression model that shows
how well the size of an engine (in litres) explains the variation in CO2 Emissions?
(b) Would you then use your model to predict the CO2 Emissions of a vehicle that had an engine
size of 1000 cc (i.e. 1 litre)? Do you have any concerns about this prediction?
Q6. Appropriate Sample Size
Finally, I am concerned that the sample size of 1082 cars that we have in this study is far too large and
we could easily get the same results with a much smaller sample. Therefore, if we wanted to undertake
a future study, what would an appropriate sample size be if we wanted to:
(a) estimate the proportion of vehicles whose CO2 Emissions were no more than 350 grams per
kilometre to within 3% with a high level of confidence, and
(b) accurately estimate the overall combined fuel consumption (i.e. Fuel_Both) to within 0.5
L/100 km.
Basically, how many cars would we need to include in next year’s survey to satisfy both requirements?
Regards, Selina
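As background to Q6, the two sample-size requirements can be sketched with the standard large-sample formulas n = z²p(1−p)/e² for a proportion and n = (zσ/e)² for a mean. The Python below is only an illustration under stated assumptions: a 95% confidence level, a conservative guess of p = 0.5 when no prior proportion is known, and a purely hypothetical standard deviation for Fuel_Both (the real value would have to be estimated from the dataset). The function names are illustrative and are not part of the assignment materials.

```python
import math
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided standard normal critical value, e.g. about 1.96 for 95%."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_proportion(margin: float, confidence: float = 0.95,
                           p_guess: float = 0.5) -> int:
    """Smallest n that estimates a proportion to within +/- margin.

    p_guess = 0.5 is the most conservative choice (it gives the largest n).
    """
    z = z_critical(confidence)
    return math.ceil(z ** 2 * p_guess * (1 - p_guess) / margin ** 2)


def sample_size_mean(margin: float, sigma: float,
                     confidence: float = 0.95) -> int:
    """Smallest n that estimates a mean to within +/- margin, given sigma."""
    z = z_critical(confidence)
    return math.ceil((z * sigma / margin) ** 2)


# Requirement (a): proportion to within 3 percentage points.
n_prop = sample_size_proportion(margin=0.03)

# Requirement (b): mean fuel consumption to within 0.5 L/100 km.
# sigma = 3.0 is a HYPOTHETICAL value used only to make the sketch runnable.
n_mean = sample_size_mean(margin=0.5, sigma=3.0)

# Both requirements are met only by the larger of the two sample sizes.
n_required = max(n_prop, n_mean)
print(n_prop, n_mean, n_required)
```

Because one survey has to satisfy both requirements, the larger of the two values governs. Sample sizes are always rounded up, never to the nearest integer.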
Memorandum Requirements
• Your Memorandum should be no longer than 2000 words and there is no need to include a
Table of Contents, Charts and Tables, or Appendices in the Memorandum. The
Charts/Graphics and Tables you create are only to be placed in the Data Analysis file, i.e. the
Excel spreadsheet
• Suggested Word formatting for the Memorandum: Single‐line spacing; no smaller than 10‐
point font; page margins approx. 25 mm; and good use of white space
• Your Memorandum must have a cover sheet containing your particulars and Unit details
• Set out the Memorandum in the same order as in the originating Memorandum from Selina,
with each section (question) clearly marked
• Use plain language and keep your explanations succinct. Avoid the use of technical or
statistical jargon. As a guide to the meaning of “Plain Language”, imagine you are explaining
your findings to a person without any statistical training (e.g. someone who has not studied
this unit). What type of language would you use in this case?
• Marks will be lost if you use unexplained technical terms, irrelevant material, or have poor
presentation/organisation
• All Microsoft Excel output associated with each question in the Memorandum is to be placed
in the corresponding tab in the file Motor_Vehicles.xlsx
Data Analysis Instructions/Guidelines
To prepare a reply to Selina’s Memorandum, you will need to examine and analyse the dataset
Motor_Vehicles.xlsx thoroughly.
Selina has asked several questions and your Data Analysis output (i.e. your charts/tables/graphs)
should be structured such that you answer each question on the separate tab/worksheet provided in
your Excel document. There are also two extra tabs in Motor_Vehicles.xlsx called CI and HT and you
can use the various templates contained in these tabs in your “Confidence Interval” and “Hypothesis”
answers.
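For orientation, the kind of calculation a confidence interval template performs can be sketched in a few lines of Python. This is a minimal sketch, not a reproduction of the templates in the CI tab: it uses the large-sample normal approximation (a template may well use the t distribution for means, which matters for small samples), and all input numbers are hypothetical rather than taken from Motor_Vehicles.xlsx.

```python
import math
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided standard normal critical value for the given confidence."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def mean_ci(sample_mean: float, sample_sd: float, n: int,
            confidence: float = 0.95) -> tuple[float, float]:
    """Large-sample confidence interval for a population mean."""
    half_width = z_critical(confidence) * sample_sd / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


def proportion_ci(successes: int, n: int,
                  confidence: float = 0.95) -> tuple[float, float]:
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    half_width = z_critical(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# HYPOTHETICAL summary statistics, for illustration only.
mean_interval = mean_ci(sample_mean=250.0, sample_sd=60.0, n=400)
prop_interval = proportion_ci(successes=400, n=1000)
print(mean_interval, prop_interval)
```

When comparing groups (for example, vehicles with different numbers of cylinders), intervals that do not overlap suggest a real difference between the groups, whereas overlapping intervals are less conclusive.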
In order to effectively answer the questions, your Data Analysis output needs to be appropriate.
Accordingly, you’ll need to establish which of the following techniques are applicable for any given
question:
• Summary Measures (e.g. Descriptive Statistics, incl. outlier detection)
• Comparative Summary Measures (i.e. Descriptive Statistics for multiple values of a variable)
• Suitable tables (such as a Frequency Distribution) and charts or graphics (such as Histograms,
Box Plots, Pie Charts, Bar/Column Charts) that will illustrate more clearly other important
features of a variable
• Cross Tabulations (sometimes called Contingency Tables), used to establish the relationships
(dependencies) between two variables (see Additional Materials under Topic 3 – Creating
Cross Tabulations in Excel using Pivot Tables)
• Confidence Intervals. You can assume that a 95% confidence level is appropriate. We use
Confidence Intervals when we have no idea about the population parameter we are
investigating. Additionally, we would use Confidence Intervals if we are asked for an estimate.
You can use the relevant Excel templates provided in the dataset and copy them to the
applicable question tab
• Hypothesis Tests. You can assume that a 5% level of significance is appropriate. We use
Hypothesis Tests when we are testing a Claim, a Theory or a Standard. You can use the
relevant Excel templates provided in the dataset and copy them to the applicable question
tab
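To make the hypothesis test item concrete, the sketch below shows a one-sample z-test for a proportion at the 5% level of significance. It is an illustration only: the counts are hypothetical, the function name is not part of the assignment materials, and the direction of the alternative hypothesis is passed in as a parameter because choosing and justifying it is part of the analysis.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes: int, n: int, p0: float,
                          alternative: str = "less") -> tuple[float, float]:
    """Return (z statistic, p-value) for H0: p = p0.

    alternative is "less", "greater" or "two-sided" and names H1.
    The standard error uses p0, as is usual for this test.
    """
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    cdf = NormalDist().cdf(z)
    if alternative == "less":
        p_value = cdf
    elif alternative == "greater":
        p_value = 1 - cdf
    elif alternative == "two-sided":
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p_value


# HYPOTHETICAL counts: 40 of 1000 sampled vehicles exceed some limit,
# tested against a claimed proportion of 5%.
z_stat, p_value = one_proportion_z_test(successes=40, n=1000, p0=0.05,
                                        alternative="less")
reject_null = p_value < 0.05  # 5% level of significance
print(round(z_stat, 3), round(p_value, 4), reject_null)
```

A p-value below 0.05 means the sample is unlikely under the null hypothesis and the null hypothesis is rejected; otherwise the sample does not provide enough evidence to reject it.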
Note: There is an Appendix at the end of each Chapter of the Prescribed Textbook which describes the
basic Excel steps associated with that Topic. Chapters 1 to 9 are applicable for this assessment.
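Simple regression, which the memorandum asks for in Q5, is not in the list above. The least-squares calculation can be sketched as follows. The toy data are hypothetical and are not drawn from Motor_Vehicles.xlsx, and the helper name fit_line is illustrative. The sketch also flags when a prediction falls outside the range of observed engine sizes, which is the usual concern with extrapolation.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float, float]:
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # share of variation in y explained
    return slope, intercept, r_squared


# HYPOTHETICAL toy data: engine size in litres, CO2 emissions in g/km.
engine_litres = [1.6, 2.0, 2.5, 3.0, 3.5, 5.0]
co2_g_per_km = [182.0, 198.0, 226.0, 251.0, 273.0, 350.0]

slope, intercept, r_squared = fit_line(engine_litres, co2_g_per_km)

# Predict for a new engine size and check whether it lies outside
# the range of engine sizes the model was fitted on.
new_size = 1.0
predicted = intercept + slope * new_size
is_extrapolation = not (min(engine_litres) <= new_size <= max(engine_litres))
print(round(slope, 2), round(intercept, 2), round(r_squared, 4),
      round(predicted, 1), is_extrapolation)
```

A fitted line is only supported by the data within the observed range of the explanatory variable, so a prediction flagged as extrapolation should be reported with caution.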
Submission
Your completed assignment should be submitted in two separate files:
• Data Analysis (Part A): An Excel document (i.e. the Motor_Vehicles.xlsx file) containing
separate tabs/worksheets with charts/tables/graphs for each question. Please note that all
interpretations should be presented in your “Memorandum” and the Excel document should
only contain your intermediate analysis and final output
• Memorandum (Part B): A Word document of no more than 2000 words which is not to contain
any charts/tables/graphs
Please name your Word document Motor_Vehicles_yourstudentid.docx and the Data Analysis file
Motor_Vehicles_yourstudentid.xlsx.
Note: The Cloud Unit site is the ONLY method of submission acceptable.
Marking Rubric
Performance levels: Poor, Needs Improvement, Satisfactory, Good, Very Good, Excellent

Part A: Data Analysis (Marks: 12)
This part relates to the various visualisations in the form of charts, tables & graphs etc. created by
Mira which formed the basis of her response to Selina.
• Poor (0 points; 0 – 3.5 Marks): Uses irrelevant or inappropriate techniques to analyse the data, or
the Data Analysis and visualisation tools were used to analyse the data but in an incomplete or
inaccurate manner. A very poor presentation of the analysis, or the analysis does not follow
principles of good graphical display.
• Needs Improvement (4 points; 4 – 5.5 Marks): Uses some appropriate data analysis and
visualisation tools to analyse the data but there are many errors in the analysis. The presentation
of the analysis needs improvement.
• Satisfactory (6 points; 6 – 6.5 Marks): Uses appropriate data analysis and visualisation tools to
analyse the data but there are several errors in the analysis. The presentation of the analysis is
satisfactory.
• Good (7 points; 7 – 8 Marks): Uses appropriate data analysis and visualisation tools to analyse
the data but there are some errors in the analysis. The presentation of the analysis is of a
respectable standard.
• Very Good (8.5 points; 8.5 – 9 Marks): Comprehensive analysis of the data using appropriate
techniques, but there are some minor errors in the analysis. Uses data visualisations to
understand the patterns in data. The analysis is well organised and follows principles of good
graphical display.
• Excellent (12 points; 9.5 – 12 Marks): Skilful and comprehensive analysis of data using many
different techniques. Uses data visualisations to produce novel insights. An excellent presentation
of the analysis.

Part B: Memorandum (Marks: 12)
This part is the written response by Mira to the questions posed by Selina.
• Poor (0 points; 0 – 3.5 Marks): Does not communicate any of the main findings of the analysis in
an accurate and/or useful way, or the interpretation and communication of findings is at a basic
level. The written communication is unprofessional or difficult to follow and contains numerous
errors.
• Needs Improvement (4 points; 4 – 5.5 Marks): Explains some of the main findings of the analysis
accurately which only enables the reader to draw a few reasonable conclusions. The written
communication is not very easy to follow and/or it contains too many errors.
• Satisfactory (6 points; 6 – 6.5 Marks): Explains most of the main findings of the analysis
accurately and enables the reader to draw several reasonable conclusions. The written
communication is clear and easy to follow but it contains minor errors.
• Good (7 points; 7 – 8 Marks): Explains nearly all of the main findings of the analysis accurately
and enables the reader to draw mostly reasonable conclusions. The written communication is
clear and easy to follow and generally free of errors.
• Very Good (8.5 points; 8.5 – 9 Marks): Provides detailed and accurate descriptions of the most
important features of the analysis along with appropriately qualified conclusions. The written
communication is professional, easy to follow and has a good structure.
• Excellent (12 points; 9.5 – 12 Marks): Provides outstanding descriptions and conclusions that are
carefully considered and insightful. The written communication is very professional, logical and
easy to follow.

Overall Assignment Presentation (Marks: 6)
• Poor (0 points; 0 – 1.5 Marks): No attempt has been made to follow the assignment
Requirements/Instructions/Guidelines. Poorly presented.
• Needs Improvement (2 points; 2 – 2.5 Marks): Little attempt has been made to follow the
assignment Requirements/Instructions/Guidelines. Unsatisfactorily presented.
• Satisfactory (3 points; 3 Marks): Most of the assignment Requirements/Instructions/Guidelines
have been followed. Satisfactorily presented.
• Good (3.5 points; 3.5 – 4 Marks): Majority of the assignment Requirements/Instructions/Guidelines
have been followed. Good presentation.
• Very Good (4.5 points; 4.5 – 5 Marks): All of the assignment Requirements/Instructions/Guidelines
have been followed. Very good presentation.
• Excellent (6 points; 5.5 – 6 Marks): All of the assignment Requirements/Instructions/Guidelines
have been dealt with meticulously. Faultless presentation.