Group Project: Sydney Weather Data

300700 Statistical Decision Making, Autumn 2016
Due: Friday of Week 13 (20 May)

This Group Project is about weather data for Sydney obtained from the Bureau of Meteorology. The data set weatherData provided with this project in csv format contains weather observations for Sydney for the last 14 months. See the first line of the csv file for an explanation of the variables.

1 Mean minimum and mean maximum daily temperature in March

Using the weather records for March 2015 and March 2016, compute 99% confidence intervals for the mean minimum daily temperature and the mean maximum daily temperature for the month of March

(a) [1.5 marks] using quantiles of a bootstrap distribution; and
(b) [0.5 marks] using a t-test in R (t.test).
(c) [1 mark] Compare and interpret your results from (a) and (b).

2 Days with rain in March

Using the weather records for March 2015 and March 2016, compute a 99% confidence interval for the proportion of days with rain for the month of March

(a) [1.5 marks] using quantiles of a bootstrap distribution; and
(b) [0.5 marks] using a χ²-test in R (prop.test).
(c) [1 mark] Compare and interpret your results from (a) and (b).

3 Wind speeds at 9am and at 3pm

[3 marks] Using the entire data set, we want to test whether there is evidence for an association between the wind speed at 9am and the wind speed at 3pm. Compute the p-value of the test by constructing an appropriate randomisation distribution, and interpret your results, using a significance level of 1%.

4 Rainfall and evaporation

Using the entire data set, we want to test whether there is evidence for the mean daily rainfall being less than the mean daily evaporation.

(a) [1.5 marks] Compute the p-value of the test by constructing an appropriate randomisation distribution.
(b) [0.5 marks] Compute the p-value using a t-test in R (t.test).
(c) [1 mark] Compare and interpret your results from (a) and (b), using a significance level of 1%.
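
As a rough illustration of what "using quantiles of a bootstrap distribution" can mean in tasks 1 and 2, the following R sketch computes a percentile bootstrap confidence interval for a mean and compares it with the interval from t.test. It is only a sketch under stated assumptions: the data are simulated rather than taken from weatherData, and the names x and boot_ci_mean as well as the number of resamples are hypothetical choices, not part of the project specification.

```r
# Percentile bootstrap confidence interval for a mean (illustrative sketch).
# ASSUMPTION: x is simulated; it is NOT the weatherData set from the project.
set.seed(1)
x <- rnorm(62, mean = 25, sd = 3)  # 62 values, e.g. two 31-day months

# boot_ci_mean is a hypothetical helper name chosen for this example.
boot_ci_mean <- function(x, level = 0.99, n_boot = 10000) {
  x <- x[!is.na(x)]  # drop missing observations
  # Resample with replacement n_boot times; keep the mean of each resample
  boot_means <- replicate(n_boot, mean(sample(x, length(x), replace = TRUE)))
  alpha <- 1 - level
  # Interval endpoints are the alpha/2 and 1 - alpha/2 quantiles
  quantile(boot_means, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

ci_boot <- boot_ci_mean(x)                     # bootstrap percentile interval
ci_t <- t.test(x, conf.level = 0.99)$conf.int  # classical t interval

print(ci_boot)
print(ci_t)
```

For a proportion, the same resampling idea could be applied to a 0/1 indicator vector, since the mean of such a vector is the sample proportion.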
5 Wind directions at 9am and at 3pm

Using the entire data set, we want to test whether there is evidence for an association between the wind direction at 9am and the wind direction at 3pm.

(a) [1.5 marks] Compute the p-value of the test by constructing an appropriate randomisation distribution.
(b) [1.5 marks] Compute the p-value using a χ²-test in R (chisq.test) and compare it to the result from (a). Interpret your results, using a significance level of 1%.

6 Direction and speed of maximum wind gust

Using the entire data set, we want to test whether there is evidence for an association between the direction of the maximum daily wind gust and the speed of the maximum daily wind gust, that is, whether the mean speed of the maximum wind gust depends on the direction of the maximum wind gust.

(a) [2 marks] Compute the p-value of the test by constructing an appropriate randomisation distribution.
(b) [1 mark] Compute the p-value using ANOVA in R (aov) and compare it to the result from (a). Interpret your results, using a significance level of 1%.

By including this statement, all authors of this work declare that:

• We hold a copy of this assignment if the original is lost or damaged.
• We hereby certify that no part of this assignment has been copied from any other student's work or from any other source except where due acknowledgement is made in the assignment.
• No part of the assignment has been written for us by any other person except where collaboration has been authorised by the unit coordinator.
• We are aware that this work may be reproduced and submitted to plagiarism detection software programs for the purpose of detecting possible plagiarism; this software may retain a copy on its database for future plagiarism checking.
• We hereby certify that no part of this assignment or product has been submitted by any of us in another (previous or current) assessment, except where appropriately referenced, and with prior permission from the unit coordinator for this unit.
• We hereby certify that we have read and understand what the University considers to be academic misconduct, and that we are aware of the penalties that may be imposed for academic misconduct.

Name    Student Number    Contribution (%)

Figure 1: Statement to be included on the first page of each submission.

Submission

One report is to be submitted by each group by the due date, containing the description and results from performing the tasks above. The report is to be written in R Markdown, using the template available on the unit's vUWS site.

After editing the R Markdown file, knit it to HTML (not Word!) using R Studio; see http://rmarkdown.rstudio.com for full details on R Studio and R Markdown. Your names and student numbers as well as the contribution by each student must be entered in the template before you can knit the R Markdown file.

Convert the resulting HTML file to PDF; you can use a web browser (with a suitable plugin or a virtual printer), OpenOffice, MS Word or online tools for this step. (If you have LaTeX installed (https://www.latex-project.org), you can also knit your R Markdown file directly to PDF.)

After checking that the PDF file is formatted correctly, submit it using the link in the Group Project tab on the unit's vUWS site. Do not submit any format other than PDF. If you do not submit a PDF file, it may not be possible to mark your report at all, or you may lose marks due to bad formatting of code, plots or text.

The first page of your report must contain the declaration shown in Figure 1; you make this declaration by submitting your report. Do not remove this declaration! A marker has the right not to mark your report if the above declaration is not included in the report.

Marking Criteria and Standards

The Group Project will contribute a maximum of 20 marks towards your final mark. The value of each of the tasks is indicated. Marks are awarded according to the following criteria:

• choice of correct method for sampling, bootstrapping, randomisation and / or analysis;
• correctness and clarity of R code; and
• correctness and clarity of analysis or interpretation.

In addition, 2 marks are awarded for the overall quality and presentation of the report.

Remember that the marker will only see what you have written; therefore, comment your R code and clearly explain all decisions made, as well as the analysis and your interpretation of the results. (Don't expect the marker to spend ages trying to figure out what you might have meant to say!) The formatting of your report may affect its readability and the clarity of your explanations, and hence contributes to your mark.
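
To give a sense of what commented R code with explicitly stated decisions might look like for a randomisation test of the kind the tasks ask for, here is a minimal sketch of a test for association between two numeric variables. It rests on assumptions that are not part of the project: the data are simulated rather than taken from weatherData, the test statistic is the sample correlation, and the names a, b, perm_test_cor and n_perm are hypothetical. Other statistics or resampling schemes may be equally or more appropriate for a given task.

```r
# Randomisation test for association between two numeric variables
# (illustrative sketch).
# ASSUMPTION: a and b are simulated; they are NOT columns of weatherData.
set.seed(2)
n <- 200
a <- rnorm(n)
b <- 0.5 * a + rnorm(n)  # b is built to be associated with a

# perm_test_cor is a hypothetical helper name chosen for this example.
perm_test_cor <- function(a, b, n_perm = 5000) {
  # Decision: use only rows where both values are observed
  keep <- complete.cases(a, b)
  a <- a[keep]
  b <- b[keep]

  # Decision: the sample correlation is the test statistic
  observed <- cor(a, b)

  # Under H0 (no association) the pairing of a and b is arbitrary,
  # so shuffling b generates the randomisation distribution
  null_cors <- replicate(n_perm, cor(a, sample(b)))

  # Two-sided p-value; adding 1 to numerator and denominator counts the
  # observed arrangement itself, so the p-value can never be exactly 0
  p_value <- (sum(abs(null_cors) >= abs(observed)) + 1) / (n_perm + 1)

  list(observed = observed, p_value = p_value)
}

res <- perm_test_cor(a, b)
print(res)
```

With a significance level of 1%, the null hypothesis of no association would be rejected in this sketch when res$p_value is below 0.01.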