Assignment title: Information
ENGG255-Data processing Assignment 2
Note: all the data provided for this assignment are dummy data, supplied
for statistical analysis purposes only.
Question 1: A group of students was classified by personality
(introvert or extrovert) and by music preference (Reflective and
Complex, Upbeat and Conventional, Energetic and Rhythmic, or Intense and
Rebellious) to see whether there is an association
(relationship) between personality and music preference. Data were collected
from 400 students and are presented in the 2 (rows) x 4 (columns) contingency
table below:
                            Music Preferences
Personality   Reflective    Upbeat and      Energetic      Intense and   Totals
              and Complex   Conventional    and Rhythmic   Rebellious
Introvert     20            6               30             44            100
Extrovert     180           34              50             36            300
Totals        200           40              80             80            400
Suitable null and alternative hypotheses might be:
• H0: Music preference is not associated with personality, and
• H1: Music preference is associated with personality
1. Plot the data presented in the contingency table. (10%)
2. What is the best way to calculate the correlation between these two
variables? And why? (10%)
3. Calculate the correlation between these two variables and interpret the
result. (20%)
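For illustration only: one common way to measure association between two categorical variables in a contingency table is the chi-square test of independence, with Cramér's V as an effect size. The sketch below assumes that approach, which the assignment does not prescribe. It uses plain Python, and all variable names are hypothetical.

```python
from math import sqrt

# Observed counts from the contingency table (rows: Introvert, Extrovert).
observed = [
    [20, 6, 30, 44],
    [180, 34, 50, 36],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under H0 for each cell: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (observed - expected)^2 / expected.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
n_rows, n_cols = len(observed), len(observed[0])
dof = (n_rows - 1) * (n_cols - 1)

# Cramér's V rescales chi-square to a 0..1 strength-of-association measure.
cramers_v = sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))

print(f"chi2 = {chi2:.1f}, dof = {dof}, Cramér's V = {cramers_v:.3f}")
```

The statistic would then be compared with the chi-square critical value for the computed degrees of freedom at the chosen significance level to decide between H0 and H1.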
Question 2: The following dataset shows just 5 employee records from an
organisation. The first column shows the employee identifier; the remaining
columns show each employee's salary, age and years of experience.
Employee id   Salary   Age   Experience
1             25000    24    4
2             40000    27    5
3             55000    32    7
4             27000    25    5
5             53000    30    5
1. Illustrate this information on one plot that shows the employees' profiles
across the three given dimensions (salary, age, experience). (10%)
2. Create a normalised dataset using min-max normalisation to the range
(0, 1), based on the following information: (15%)
Max salary = 55001
Min Salary = 24999
Max age = 33
Min age = 23
Max experience = 8
Min experience = 3
3. Plot the employees' profiles based on the normalised data. (5%)
4. Explain and interpret the difference between the plots. How does
normalisation cause this change? (10%)
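As a minimal sketch of the min-max step in Question 2, each value x can be rescaled as (x - min) / (max - min), using the minimum and maximum stated for its attribute. The function and variable names below are hypothetical, and the code is one possible layout rather than a required solution.

```python
# Employee records from the table: (id, salary, age, experience).
employees = [
    (1, 25000, 24, 4),
    (2, 40000, 27, 5),
    (3, 55000, 32, 7),
    (4, 27000, 25, 5),
    (5, 53000, 30, 5),
]

# (min, max) for each attribute, as stated in the assignment.
bounds = {
    "salary": (24999, 55001),
    "age": (23, 33),
    "experience": (3, 8),
}


def min_max(value, lo, hi):
    """Rescale value linearly so that lo maps to 0 and hi maps to 1."""
    return (value - lo) / (hi - lo)


normalised = [
    {
        "id": emp_id,
        "salary": min_max(salary, *bounds["salary"]),
        "age": min_max(age, *bounds["age"]),
        "experience": min_max(experience, *bounds["experience"]),
    }
    for emp_id, salary, age, experience in employees
]

for row in normalised:
    print(row)
```

Because the stated bounds lie just outside the observed values, every normalised value falls strictly between 0 and 1.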
Question 3: The following table shows the monthly mean minimum temperature
(in degrees Celsius, °C) of a city in Australia for three consecutive years.
        Mean of Minimum Temperature (°C)
Month   2010   2011   2012
Jan     17.7   15.6   18
Feb     17.7   16     15.5
Mar     13.5   14.5   12.9
Apr     12.3   11.8   12.5
May     9.5    9.8    10
Jun     7.3    7.9    8.1
Jul     7.6    7.1    6.4
Aug     7.2    7.8    7.6
Sep     10.1   9.2    8.7
Oct     10     10.6   9
Nov     11.9   12.6   12.2
Dec     13.7   14.1   13.9
1. Propose a visualisation method to compare the mean minimum
temperature month by month across the three given years. (10%)
2. Aggregate the information to provide a table and a visualisation of the
temperature on a seasonal basis. (15%)
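One possible way to carry out the seasonal aggregation is sketched below. It assumes the Australian meteorological seasons (Summer: Dec to Feb, Autumn: Mar to May, Winter: Jun to Aug, Spring: Sep to Nov). As a simplification, it groups December with January and February of the same calendar year. Both choices are assumptions, not requirements of the assignment, and the names are hypothetical.

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Monthly mean minimum temperature (°C) per year, copied from the table.
temps = {
    2010: [17.7, 17.7, 13.5, 12.3, 9.5, 7.3, 7.6, 7.2, 10.1, 10, 11.9, 13.7],
    2011: [15.6, 16, 14.5, 11.8, 9.8, 7.9, 7.1, 7.8, 9.2, 10.6, 12.6, 14.1],
    2012: [18, 15.5, 12.9, 12.5, 10, 8.1, 6.4, 7.6, 8.7, 9, 12.2, 13.9],
}

# Assumed season definitions (southern hemisphere, meteorological).
seasons = {
    "Summer": ["Dec", "Jan", "Feb"],
    "Autumn": ["Mar", "Apr", "May"],
    "Winter": ["Jun", "Jul", "Aug"],
    "Spring": ["Sep", "Oct", "Nov"],
}

# Seasonal value = average of the three monthly values within the same year.
seasonal = {
    season: {
        year: sum(values[months.index(m)] for m in members) / len(members)
        for year, values in temps.items()
    }
    for season, members in seasons.items()
}

# Print a small table: one row per season, one column per year.
print("Season  " + "  ".join(str(year) for year in temps))
for season, by_year in seasonal.items():
    print(f"{season:<7} " + "  ".join(f"{v:4.1f}" for v in by_year.values()))
```

The resulting per-season values could then be shown in a grouped bar chart or line chart, with one series per year.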
Question 4: Use Google or the university database (articles or books in any
field) to find a practical example of using covariance or correlation between
variables. Write a report of at most 300 words explaining the
following: the practical scenario (5%), why the authors used the correlation or
covariance method for their analysis (10%), and how using these parameters
helped in interpreting the results (10%).