Assignment title: Information


Required:
A. Calculate the descriptive statistics from the data and display them in a table. Be sure to comment on the central tendency, variability and shape for Price, Bedrooms, Land Area and Distance to Melbourne. How would you interpret the mean of dummy variables such as "2014"?
B. Draw a graph that displays the distribution of the distance of the homes to Melbourne's CBD. Be sure to comment on the distribution.
C. Create a box-and-whisker plot for the distribution of the prices and describe the shape. Is there evidence of outliers in the data?
D. There is a growing belief that homes in the inner suburbs are increasingly unaffordable for young first-time home buyers. What is the likelihood that a home less than 15 km from the CBD will sell for more than $500,000? Is the price of homes statistically independent of the distance to the CBD? Use a contingency table.
E. Estimate the 99% confidence interval for the population mean number of bedrooms.
F. One of the implied benefits of living further away from the CBD is the increased land area for raising children. Test the claim, at the 1% level of significance, that the land area of homes further than 15 km from the CBD is larger than the 510 square metres of the typical lot in the inner suburbs.
G. Run a multiple linear regression using the data and show the output from Excel. Exclude the dummy variable "2010" from the regression results.
H. Is the coefficient estimate for distance to Melbourne statistically different from zero at the 5% level of significance? Set up the correct hypothesis test using the results found in the table in Part (G), using both the critical value and the p-value approach. Interpret the coefficient estimate of the slope.
I. Interpret the remaining slope coefficient estimates. Discuss whether the signs are what you would expect and explain your reasoning.
J. Interpret the value of the Adjusted R². Is there a large difference between the R² and the Adjusted R²? If so, what might explain it?
K. Is the overall model statistically significant at the 5% level of significance? Use the p-value approach.
L. Based on the results of the regression, what other factors would have influenced the price of homes? Provide a couple of possible examples and indicate their predicted relationship with the price if they were included.
M. Predict the average price of a 3-bedroom home that is 10 km from the city, with 300 square metres of land, that sold in 2014, if it is appropriate to do so. Show the predicted regression equation. (1 Mark)
N. Do the results suggest that the data satisfy the assumptions of a linear regression: Linearity, Normality of the Errors, and Homoscedasticity of Errors? Show this using scatter diagrams, normal probability plots and/or histograms, and explain.
O. Would these results tell us anything about the affordability of these properties for non-investor purchasers? If not, describe how you would construct a sample of households that are looking to occupy a home.
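
As an illustration of the contingency-table reasoning asked for in Part D, the following is a minimal sketch in Python rather than Excel. The eight (price, distance) pairs are made up purely for illustration and are not the assignment data; the function name contingency_table and the labels "near", "far", "high" and "low" are likewise assumptions. The sketch counts homes in each cell of a 2x2 table, then compares the conditional probability P(price > $500,000 | distance < 15 km) with the marginal probability P(price > $500,000). If the two differ noticeably, the sample suggests price and distance are not independent.

```python
# Hypothetical (price in dollars, distance to CBD in km) pairs.
# These values are invented for illustration; they are NOT the assignment data.
homes = [
    (650_000, 8.0),
    (720_000, 12.5),
    (480_000, 14.0),
    (530_000, 22.0),
    (410_000, 30.0),
    (390_000, 18.5),
    (810_000, 5.0),
    (450_000, 25.0),
]


def contingency_table(rows, price_cut=500_000, dist_cut=15.0):
    """Count homes in each (near/far, high/low price) cell of a 2x2 table."""
    table = {
        ("near", "high"): 0,
        ("near", "low"): 0,
        ("far", "high"): 0,
        ("far", "low"): 0,
    }
    for price, dist in rows:
        where = "near" if dist < dist_cut else "far"
        level = "high" if price > price_cut else "low"
        table[(where, level)] += 1
    return table


table = contingency_table(homes)
n = sum(table.values())
near_total = table[("near", "high")] + table[("near", "low")]
high_total = table[("near", "high")] + table[("far", "high")]

# Conditional probability: P(price > cut | distance < cut)
p_high_given_near = table[("near", "high")] / near_total
# Marginal probability: P(price > cut)
p_high = high_total / n

print(f"P(high | near) = {p_high_given_near:.2f}")
print(f"P(high)        = {p_high:.2f}")
print("Same in this sample" if p_high_given_near == p_high
      else "Differ in this sample, so independence is doubtful")
```

The same counts can be produced in Excel with a PivotTable or COUNTIFS; the comparison of the conditional and marginal probabilities is the part that answers the independence question.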