Assignment title: Information
Group Project: Sydney Weather Data
300700 Statistical Decision Making, Autumn 2016
Due: Friday of Week 13 (20 May)
This Group Project is about weather data for Sydney obtained from the Bureau of Meteorology.
The data set weatherData provided with this project in CSV
format contains weather observations for Sydney for the last
14 months. See the first line of the CSV file for an explanation
of the variables.
1 Mean minimum and mean maximum daily temperature in March
Using the weather records for March 2015 and March 2016,
compute 99% confidence intervals for the mean minimum
daily temperature and the mean maximum daily temperature
for the month of March
(a) [1.5 marks] using quantiles of a bootstrap distribution;
and
(b) [0.5 marks] using a t-test in R (t.test).
(c) [1 mark] Compare and interpret your results from (a)
and (b).
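As an illustration of the two approaches named in (a) and (b), the following is a minimal sketch only. It uses simulated temperatures rather than the real weatherData file; the vector name max_temp, the sample size of 62 days and the 10000 bootstrap replicates are assumptions made for this example.

```r
# Simulated stand-in for the daily maximum temperatures of two Marches
# (62 days). These values are NOT taken from weatherData.
set.seed(1)
max_temp <- rnorm(62, mean = 26, sd = 3)

# (a) Bootstrap: resample the days with replacement and recompute the mean.
n_boot <- 10000
boot_means <- replicate(n_boot, mean(sample(max_temp, replace = TRUE)))

# A 99% percentile interval leaves 0.5% in each tail.
boot_ci <- quantile(boot_means, probs = c(0.005, 0.995))

# (b) t-based 99% interval, as reported by t.test.
t_ci <- t.test(max_temp, conf.level = 0.99)$conf.int

print(boot_ci)
print(t_ci)
```

The same steps would apply to the minimum temperatures. With real data, missing values would need to be handled first (for example with na.omit).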
2 Days with rain in March
Using the weather records for March 2015 and March 2016,
compute a 99% confidence interval for the proportion of days
with rain for the month of March
(a) [1.5 marks] using quantiles of a bootstrap distribution;
and
(b) [0.5 marks] using a χ²-test in R (prop.test).
(c) [1 mark] Compare and interpret your results from (a)
and (b).
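A sketch of how such intervals for a proportion might be computed is given below. The data are simulated, and the logical vector rain_day (TRUE when rain was recorded on a day) is a hypothetical name; how "a day with rain" is defined from the rainfall variable is a decision this example does not settle.

```r
# Simulated stand-in: TRUE marks a day with rain (62 days, two Marches).
# These values are NOT taken from weatherData.
set.seed(2)
rain_day <- rbinom(62, size = 1, prob = 0.4) == 1

# (a) Bootstrap: resample days with replacement; the mean of a logical
# vector is the proportion of TRUE values.
n_boot <- 10000
boot_props <- replicate(n_boot, mean(sample(rain_day, replace = TRUE)))
boot_ci <- quantile(boot_props, probs = c(0.005, 0.995))

# (b) Interval reported by prop.test (number of rainy days out of all days).
prop_ci <- prop.test(sum(rain_day), length(rain_day), conf.level = 0.99)$conf.int

print(boot_ci)
print(prop_ci)
```

Note that prop.test applies a continuity correction by default, which may explain part of any difference between the two intervals.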
3 Wind speeds at 9am and at 3pm
[3 marks] Using the entire data set, we want to test whether
there is evidence for an association between the wind speed
at 9am and the wind speed at 3pm.
Compute the p-value of the test by constructing an appropriate randomisation distribution, and interpret your results,
using a significance level of 1%.
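One possible way to build a randomisation distribution for an association between two numeric variables is to shuffle one of them and recompute the correlation each time. The sketch below assumes the correlation coefficient as the test statistic and a two-sided alternative; it runs on simulated data, and the names wind_9am and wind_3pm are hypothetical.

```r
# Simulated stand-in for paired wind speeds (NOT taken from weatherData).
set.seed(3)
n <- 400
wind_9am <- rgamma(n, shape = 4, rate = 0.3)
wind_3pm <- 0.5 * wind_9am + rgamma(n, shape = 4, rate = 0.3)

# Observed test statistic: sample correlation.
obs_cor <- cor(wind_9am, wind_3pm)

# Under the null hypothesis of no association the pairing is arbitrary,
# so shuffling one variable gives a draw from the null distribution.
n_perm <- 10000
perm_cor <- replicate(n_perm, cor(wind_9am, sample(wind_3pm)))

# Two-sided p-value; adding 1 to numerator and denominator is one common
# convention that avoids reporting a p-value of exactly zero.
p_value <- (sum(abs(perm_cor) >= abs(obs_cor)) + 1) / (n_perm + 1)
print(p_value)
```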
4 Rainfall and evaporation
Using the entire data set, we want to test whether there is
evidence for the mean daily rainfall being less than the mean
daily evaporation.
(a) [1.5 marks] Compute the p-value of the test by constructing an appropriate randomisation distribution.
(b) [0.5 marks] Compute the p-value using a t-test in R
(t.test).
(c) [1 mark] Compare and interpret your results from (a)
and (b), using a significance level of 1%.
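The sketch below shows one way the comparison could be set up, assuming the two measurements are treated as paired by day and the alternative is one-sided (mean rainfall less than mean evaporation). Treating the data as paired is a design choice made for this example, not something the task prescribes. The data are simulated and the names rainfall and evaporation are hypothetical.

```r
# Simulated stand-in for daily rainfall and evaporation in mm
# (NOT taken from weatherData).
set.seed(4)
n <- 400
rainfall <- ifelse(runif(n) < 0.35, rexp(n, rate = 0.1), 0)
evaporation <- rgamma(n, shape = 5, rate = 1)

# Daily differences; the observed statistic is their mean.
d <- rainfall - evaporation
obs_mean <- mean(d)

# (a) Under the null hypothesis the sign of each daily difference is
# arbitrary, so randomly flipping signs gives the randomisation distribution.
n_perm <- 10000
perm_means <- replicate(
  n_perm,
  mean(d * sample(c(-1, 1), length(d), replace = TRUE))
)

# One-sided p-value: how often is a randomised mean as low as the observed one?
p_rand <- (sum(perm_means <= obs_mean) + 1) / (n_perm + 1)

# (b) Matching paired, one-sided t-test.
p_t <- t.test(rainfall, evaporation, paired = TRUE, alternative = "less")$p.value

print(c(randomisation = p_rand, t_test = p_t))
```

With real data, days on which either value is missing would need to be removed before the differences are formed.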
5 Wind directions at 9am and at 3pm
Using the entire data set, we want to test whether there is
evidence for an association between the wind direction at 9am
and the wind direction at 3pm.
(a) [1.5 marks] Compute the p-value of the test by constructing an appropriate randomisation distribution.
(b) [1.5 marks] Compute the p-value using a χ²-test in R
(chisq.test) and compare it to the result from (a). Interpret your results, using a significance level of 1%.
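For two categorical variables, one possible randomisation scheme is to shuffle one variable and recompute the χ² statistic of the resulting contingency table. The sketch below assumes that statistic, uses only four simulated wind directions, and uses hypothetical names (dir_9am, dir_3pm, chisq_stat).

```r
# Simulated stand-in for wind directions (NOT taken from weatherData).
set.seed(5)
dirs <- c("N", "E", "S", "W")
n <- 400
dir_9am <- sample(dirs, n, replace = TRUE)
# The 3pm direction repeats the 9am direction about half of the time.
dir_3pm <- ifelse(runif(n) < 0.5, dir_9am, sample(dirs, n, replace = TRUE))

# Chi-squared statistic of the two-way table; warnings about small
# expected counts are suppressed because only the statistic is used.
chisq_stat <- function(x, y) {
  unname(suppressWarnings(chisq.test(table(x, y))$statistic))
}

obs_stat <- chisq_stat(dir_9am, dir_3pm)

# (a) Shuffle one variable to break any association, then recompute.
n_perm <- 2000
perm_stat <- replicate(n_perm, chisq_stat(dir_9am, sample(dir_3pm)))
p_rand <- (sum(perm_stat >= obs_stat) + 1) / (n_perm + 1)

# (b) p-value from the chi-squared approximation.
p_chisq <- chisq.test(table(dir_9am, dir_3pm))$p.value

print(c(randomisation = p_rand, chisq_test = p_chisq))
```

With many direction categories, some cells of the real table may have small expected counts, which is one situation in which the two p-values could differ.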
6 Direction and speed of maximum wind gust
Using the entire data set, we want to test whether there is
evidence for an association between the direction of the maximum daily wind gust and the speed of the maximum daily
wind gust, that is, whether the mean speed of the maximum
wind gust depends on the direction of the maximum wind
gust.
(a) [2 marks] Compute the p-value of the test by constructing
an appropriate randomisation distribution.
(b) [1 mark] Compute the p-value using ANOVA in R (aov)
and compare it to the result from (a). Interpret your
results, using a significance level of 1%.
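A randomisation analogue of ANOVA can be sketched by shuffling the speeds across the direction groups and recomputing the F statistic each time. The example below assumes the F statistic as the test statistic; the data are simulated, and the names gust_dir, gust_speed and f_stat are hypothetical.

```r
# Simulated stand-in for gust direction and speed (NOT taken from weatherData).
set.seed(6)
dirs <- c("N", "E", "S", "W")
n <- 400
gust_dir <- factor(sample(dirs, n, replace = TRUE))
# Westerly gusts are made faster on average so that an effect exists.
gust_speed <- rnorm(n, mean = 40, sd = 10) + ifelse(gust_dir == "W", 8, 0)

# F statistic of a one-way ANOVA of speed on direction.
f_stat <- function(speed, direction) {
  summary(aov(speed ~ direction))[[1]][["F value"]][1]
}

obs_f <- f_stat(gust_speed, gust_dir)

# (a) Shuffling the speeds breaks any link between speed and direction.
n_perm <- 2000
perm_f <- replicate(n_perm, f_stat(sample(gust_speed), gust_dir))
p_rand <- (sum(perm_f >= obs_f) + 1) / (n_perm + 1)

# (b) p-value from the F distribution, as reported by aov.
p_aov <- summary(aov(gust_speed ~ gust_dir))[[1]][["Pr(>F)"]][1]

print(c(randomisation = p_rand, anova = p_aov))
```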
By including this statement, all authors of this work declare that:
• We hold a copy of this assignment if the original is lost or damaged.
• We hereby certify that no part of this assignment has been copied from any other student's work or from
any other source except where due acknowledgement is made in the assignment.
• No part of the assignment has been written for us by any other person except where collaboration has been
authorised by the unit coordinator.
• We are aware that this work may be reproduced and submitted to plagiarism detection software programs
for the purpose of detecting possible plagiarism; this software may retain a copy on its database for future
plagiarism checking.
• We hereby certify that no part of this assignment or product has been submitted by any of us in another
(previous or current) assessment, except where appropriately referenced, and with prior permission from
the unit coordinator for this unit.
• We hereby certify that we have read and understand what the University considers to be academic misconduct, and that we are aware of the penalties that may be imposed for academic misconduct.
Name | Student Number | Contribution (%)
Figure 1: Statement to be included on the first page of each submission.
Submission
One report is to be submitted by each group by the due date,
containing the description and results from performing the
tasks above.
The report is to be written in R Markdown, using
the template available on the unit's vUWS site.
After editing the R Markdown file, knit it to HTML (not
Word!) using RStudio; see http://rmarkdown.rstudio.com
for full details on RStudio and R Markdown. Your names
and student numbers as well as the contribution by each
student must be entered in the template before you can knit
the R Markdown file.
Convert the resulting HTML file to PDF; you can use a web
browser (with a suitable plugin or a virtual printer), OpenOffice, MS Word or online tools for this step. (If you have LaTeX
installed (https://www.latex-project.org), you can also
knit your R Markdown file directly to PDF.)
After checking that the PDF file is formatted correctly,
submit it using the link in the Group Project tab on the unit's
vUWS site. Do not submit any format other than PDF.
If you do not submit a PDF file, it may not be possible to
mark your report at all, or you may lose marks due to bad
formatting of code, plots or text.
The first page of your report must contain the declaration
shown in Figure 1; you make this declaration by submitting
your report. Do not remove this declaration! A marker
has the right not to mark your report if the above
declaration is not included in the report.
Marking Criteria and Standards
The Group Project will contribute a maximum of 20 marks
towards your final mark.
The value of each of the tasks is indicated. Marks are
awarded according to the following criteria:
• choice of correct method for sampling, bootstrapping,
randomisation and/or analysis;
• correctness and clarity of R code; and
• correctness and clarity of analysis or interpretation.
In addition, 2 marks are awarded for the overall quality and
presentation of the report.
Remember that the marker will only see what you have
written. Therefore, comment your R code and clearly explain
all decisions made, as well as the analysis and your interpretation of the results. (Don't expect the marker to spend ages
trying to figure out what you might have meant to say!)
The formatting of your report may affect its readability
and the clarity of your explanations, and hence contributes
to your mark.