Assignment title: Information
Question 1 [Points 20]
a. [Points 8] Using a large training data set for a community of professionals, a linear equation
was developed to predict salary from years of experience. It is given as: salary (in $K) = 30 +
2.5*Experience (in years). For the three professionals below, determine the predicted
salary of each and the mean squared error (MSE) over the three.
Professional# Experience (years) Actual Salary ($K)
1 2 38
2 4 38
3 6 50
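The calculation asked for in part (a) can be sketched as follows. This is a minimal illustration, assuming the usual definition of MSE as the average of squared differences between actual and predicted values; the function names are illustrative and not part of the assignment.

```python
# Linear model from the question: salary ($K) = 30 + 2.5 * experience (years).
def predict_salary(experience_years):
    return 30 + 2.5 * experience_years

def mean_squared_error(actual, predicted):
    # Average of the squared residuals.
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

experience = [2, 4, 6]          # years, from the table above
actual_salary = [38, 38, 50]    # $K, from the table above
predicted_salary = [predict_salary(x) for x in experience]
mse = mean_squared_error(actual_salary, predicted_salary)
print(predicted_salary, mse)
```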
b. [Points 6] Given the following regression tree, predict the salary for the following three
professionals:
Professional# Name Education Experience (years)
1 John BS 10
2 Jane PhD 5
3 Jim HS 25
EQ1 Salary = 15 + 2*Experience
EQ2 Salary = 25 + 3*Experience
EQ3 Salary = 35 + 4*Experience
EQ4 Salary = 50 + 3*Experience
EQ5 Salary = 60 + 4*Experience
EQ6 Salary = 70 + 5*Experience
Regression tree:
Education
  HS: Experience
    <5: EQ1
    5-10: EQ2
    >10: EQ3
  BS: Experience
    <8: EQ4
    >7: EQ5
  MS/PhD: EQ6
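One way to apply a regression tree of this kind is to pick a leaf equation from the attribute tests and then evaluate it. The sketch below assumes the tree reads as follows: HS splits on Experience into <5 (EQ1), 5-10 (EQ2), and >10 (EQ3); BS splits on Experience into <8 (EQ4) and >7 (EQ5); MS/PhD goes straight to EQ6. That layout is a reconstruction from the flattened figure, so treat it as an assumption.

```python
# Leaf equations as (intercept, slope) pairs, taken from the question.
EQUATIONS = {
    "EQ1": (15, 2), "EQ2": (25, 3), "EQ3": (35, 4),
    "EQ4": (50, 3), "EQ5": (60, 4), "EQ6": (70, 5),
}

def choose_leaf(education, experience):
    # Assumed tree layout; see the note above.
    if education in ("MS", "PhD"):
        return "EQ6"
    if education == "HS":
        if experience < 5:
            return "EQ1"
        return "EQ2" if experience <= 10 else "EQ3"
    if education == "BS":
        return "EQ4" if experience < 8 else "EQ5"
    raise ValueError(f"unknown education level: {education}")

def predict(education, experience):
    intercept, slope = EQUATIONS[choose_leaf(education, experience)]
    return intercept + slope * experience

for name, education, experience in [("John", "BS", 10), ("Jane", "PhD", 5), ("Jim", "HS", 25)]:
    print(name, choose_leaf(education, experience), predict(education, experience))
```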
c. [Points 6] Given the following rules with exceptions, predict the salary of the following three
persons:
Professional# Name Education Experience (years)
1 John BS 10
2 Jane PhD 5
3 Jim HS 25
Default: salary = 40 + 3*experience
  Except if education >= BS and experience < 8
    then salary = 60 + 4*experience
    Except if education = PhD
      then salary = 80 + 5*experience
  else if experience > 10
    then salary = 50 + 5*experience
    except if education >= BS
      then salary = 70 + 7*experience;
Question 2 [Points 20]
a. [Points 8] Given the following data, derive and show a table similar to table 4.2 (for
Naïve Bayes). Using this table, predict the acceptability probabilities for the instance with
Price=L, Capacity=7, and Safety = M.
Instance# Price Capacity Safety Acceptability
1 L 4 High Good
2 H 2 High Bad
3 M 4 Medium Good
4 L 7 High Good
5 L 7 Low Bad
6 H 7 High Bad
7 M 2 Medium Good
8 H 4 High Good
9 L 2 Medium Good
10 M 7 Low Bad
Price         Capacity      Safety        Acceptability
   Bad Good      Bad Good      Bad Good   Bad Good
L             2             L
M             4             M
H             7             H

L             2             L
M             4             M
H             7             H
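The counting behind a Naive Bayes table can be sketched in a few lines. The example below is a minimal illustration that uses raw frequency ratios with no smoothing, treats Capacity as a nominal attribute, and reads "Safety = M" as Medium; all three are assumptions, and the helper name is illustrative.

```python
from collections import Counter, defaultdict

# Training rows from the question: (Price, Capacity, Safety, Acceptability).
ROWS = [
    ("L", 4, "High", "Good"), ("H", 2, "High", "Bad"),
    ("M", 4, "Medium", "Good"), ("L", 7, "High", "Good"),
    ("L", 7, "Low", "Bad"), ("H", 7, "High", "Bad"),
    ("M", 2, "Medium", "Good"), ("H", 4, "High", "Good"),
    ("L", 2, "Medium", "Good"), ("M", 7, "Low", "Bad"),
]

def naive_bayes(rows, query):
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    for row in rows:
        for i, value in enumerate(row[:-1]):
            value_counts[(i, row[-1])][value] += 1
    scores = {}
    for label, n in class_counts.items():
        score = n / len(rows)  # class prior
        for i, value in enumerate(query):
            score *= value_counts[(i, label)][value] / n  # conditional frequency
        scores[label] = score
    total = sum(scores.values())  # assumes at least one class has a nonzero score
    return {label: s / total for label, s in scores.items()}

probs = naive_bayes(ROWS, ("L", 7, "Medium"))
print(probs)
```

Without smoothing, a single zero count drives a class score to zero, which is worth keeping in mind when reading the result.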
b. [Points 6] For the above training data (with price, capacity, safety, and acceptability),
determine the root of the decision tree. Show your work.
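A common way to choose the root is to compare the information gain of each attribute. The sketch below assumes that criterion (the question does not name one) and treats Capacity as nominal; the function names are illustrative.

```python
from collections import Counter
from math import log2

# Training rows from the question: (Price, Capacity, Safety, Acceptability).
ROWS = [
    ("L", 4, "High", "Good"), ("H", 2, "High", "Bad"),
    ("M", 4, "Medium", "Good"), ("L", 7, "High", "Good"),
    ("L", 7, "Low", "Bad"), ("H", 7, "High", "Bad"),
    ("M", 2, "Medium", "Good"), ("H", 4, "High", "Good"),
    ("L", 2, "Medium", "Good"), ("M", 7, "Low", "Bad"),
]
ATTRIBUTES = ["Price", "Capacity", "Safety"]

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, index):
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in set(row[index] for row in rows):
        subset = [row[-1] for row in rows if row[index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)  # weighted subset entropy
    return entropy(labels) - remainder

gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
root = max(gains, key=gains.get)
print(gains, root)
```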
c. [Points 6] Given the following set of training instances, find the nearest training instance
for the unknown instance , and predict its health.
Show your work. (Hint: Use Euclidean distance with normalized attributes)
Inst# Age Height Weight Health
1 35 5.6 175 Excellent
2 55 6.0 150 Good
3 50 5.8 200 Okay
4 65 5.5 175 Good
5 45 5.6 190 Okay
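The hint can be sketched as min-max normalization followed by a Euclidean nearest-neighbor search. The unknown instance is not given in the text above, so the query used here (Age 40, Height 5.7, Weight 180) is purely hypothetical, and normalizing with the training-set ranges is an assumption.

```python
from math import dist

# Training instances from the question: (Age, Height, Weight, Health).
TRAIN = [
    (35, 5.6, 175, "Excellent"), (55, 6.0, 150, "Good"),
    (50, 5.8, 200, "Okay"), (65, 5.5, 175, "Good"),
    (45, 5.6, 190, "Okay"),
]

columns = list(zip(*[row[:3] for row in TRAIN]))
LOWS = [min(c) for c in columns]
HIGHS = [max(c) for c in columns]

def normalize(values):
    # Scale each attribute to [0, 1] using the training-set range.
    return tuple((v - lo) / (hi - lo) for v, lo, hi in zip(values, LOWS, HIGHS))

def nearest(query):
    q = normalize(query)
    distances = [(dist(q, normalize(row[:3])), i + 1, row[3]) for i, row in enumerate(TRAIN)]
    return min(distances)  # (distance, instance number, health)

hypothetical_query = (40, 5.7, 180)  # hypothetical, NOT from the assignment
distance, instance_number, health = nearest(hypothetical_query)
print(instance_number, health, round(distance, 4))
```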
Question 3 [Points 20]
a. [Points 6] For the following three test instances, the predicted probabilities and the actual
outcome (Health) were given. For each instance, determine the quadratic loss function
(QLF) and informational loss function (ILF).
                            Predicted Prob.
Inst# Age Height Weight  Okay  Good  Excellent  Actual
1     35  5.6    175     0.4   0.4   0.2        Okay
2     55  6.0    150     0.3   0.2   0.5        Good
3     50  5.8    200     0.1   0.6   0.3        Excellent
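Both loss functions can be sketched per instance as below. The example assumes the three predicted probabilities are listed in the order Okay, Good, Excellent, and uses the common definitions: quadratic loss is the sum over classes of (p_j - a_j)^2, where a_j is 1 for the actual class and 0 otherwise, and informational loss is -log2 of the probability assigned to the actual class.

```python
from math import log2

CLASSES = ["Okay", "Good", "Excellent"]  # assumed column order

def quadratic_loss(probs, actual):
    return sum((p - (1.0 if c == actual else 0.0)) ** 2 for c, p in zip(CLASSES, probs))

def informational_loss(probs, actual):
    return -log2(probs[CLASSES.index(actual)])

# (predicted probabilities, actual class) for the three test instances.
instances = [
    ([0.4, 0.4, 0.2], "Okay"),
    ([0.3, 0.2, 0.5], "Good"),
    ([0.1, 0.6, 0.3], "Excellent"),
]
losses = [(quadratic_loss(p, a), informational_loss(p, a)) for p, a in instances]
for number, (qlf, ilf) in enumerate(losses, start=1):
    print(number, round(qlf, 4), round(ilf, 4))
```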
b. [Points 5] To test the efficacy of a new medical test to diagnose a disease, a group of
2500 volunteers were given the test. Of these, only 1500 had the disease. The test
identified 1750 volunteers as having the disease. Of these, only 1000 really had the
disease. From this, determine the sensitivity and the specificity of the diagnostic test.
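The confusion-matrix bookkeeping for this kind of question can be sketched as follows, assuming the standard definitions: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The variable names are illustrative.

```python
total, diseased = 2500, 1500            # from the question
test_positive, true_positive = 1750, 1000

false_negative = diseased - true_positive       # diseased but test-negative
healthy = total - diseased                      # volunteers without the disease
false_positive = test_positive - true_positive  # healthy but test-positive
true_negative = healthy - false_positive        # healthy and test-negative

sensitivity = true_positive / (true_positive + false_negative)
specificity = true_negative / (true_negative + false_positive)
print(sensitivity, specificity)
```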
c. [Points 5] Given the following data collected from a survey of customers regarding a new
product, determine the ROC curve expressed as a table.
Customer#  Predicted Prob (Yes)  Actual
1          0.85                  Yes
2          0.50                  No
3          0.95                  No
4          0.99                  Yes
5          0.45                  Yes
6          0.97                  No
7          0.80                  Yes
8          0.6                   No
9          0.75                  Yes
10         0.7                   No
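An ROC table can be built by sorting the customers by predicted probability and accumulating true and false positives as the threshold is lowered. The sketch below assumes that construction, with one row per customer and "Yes" as the positive class; the function name is illustrative.

```python
def roc_points(scored):
    positives = sum(1 for _, actual in scored if actual == "Yes")
    negatives = len(scored) - positives
    tp = fp = 0
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    for _, actual in sorted(scored, key=lambda s: s[0], reverse=True):
        if actual == "Yes":
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

# (predicted probability of Yes, actual outcome) for customers 1..10.
SURVEY = [
    (0.85, "Yes"), (0.50, "No"), (0.95, "No"), (0.99, "Yes"), (0.45, "Yes"),
    (0.97, "No"), (0.80, "Yes"), (0.6, "No"), (0.75, "Yes"), (0.7, "No"),
]
points = roc_points(SURVEY)
for fpr, tpr in points:
    print(fpr, tpr)
```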
d. [Points 4] Given the following data with actual outcome and predicted outcome,
determine the mean-absolute error and relative-absolute error.
Instance# Actual Predicted
1 2.5 3.0
2 4.0 4.3
3 3.5 2.5
4 5.0 3.0
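Both error measures can be sketched directly from their usual definitions: mean absolute error is the average of |predicted - actual|, and relative absolute error divides the total absolute error by the total absolute deviation of the actual values from their mean. Using the mean of these four actual values as the baseline is an assumption.

```python
actual = [2.5, 4.0, 3.5, 5.0]
predicted = [3.0, 4.3, 2.5, 3.0]

absolute_errors = [abs(p - a) for p, a in zip(predicted, actual)]
mean_absolute_error = sum(absolute_errors) / len(actual)

mean_actual = sum(actual) / len(actual)
baseline_errors = [abs(a - mean_actual) for a in actual]  # errors of always predicting the mean
relative_absolute_error = sum(absolute_errors) / sum(baseline_errors)

print(mean_absolute_error, relative_absolute_error)
```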
Question 4 [Points 20]
a. [Points 6] Given the following two clusters (C1 and C2), determine the distance between
the two clusters using (i) Single-linkage method (ii) Centroid-linkage method. (Hint:
Normalize the attributes)
Instance# Age GPA Salary($K) Cluster#
1 25 3.8 50 C1
2 30 3.6 65 C1
3 23 4.0 40 C1
4 35 3.0 70 C2
5 55 3.3 90 C2
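The two linkage distances can be sketched as below. The example assumes min-max normalization over all five instances and Euclidean distance; single linkage takes the smallest distance between any pair drawn from the two clusters, and centroid linkage takes the distance between the cluster means.

```python
from math import dist

# (Age, GPA, Salary in $K, Cluster) from the question.
ROWS = [
    (25, 3.8, 50, "C1"), (30, 3.6, 65, "C1"), (23, 4.0, 40, "C1"),
    (35, 3.0, 70, "C2"), (55, 3.3, 90, "C2"),
]

def min_max_normalize(rows):
    columns = list(zip(*[r[:3] for r in rows]))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [tuple((v - lo) / (hi - lo) for v, lo, hi in zip(r[:3], lows, highs)) for r in rows]

def centroid(cluster):
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))

points = min_max_normalize(ROWS)
c1 = [p for p, r in zip(points, ROWS) if r[3] == "C1"]
c2 = [p for p, r in zip(points, ROWS) if r[3] == "C2"]

single_link = min(dist(a, b) for a in c1 for b in c2)
centroid_link = dist(centroid(c1), centroid(c2))
print(round(single_link, 4), round(centroid_link, 4))
```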
b. [Points 6] Given the following Bayesian network, determine the probability that the
performance of the unknown candidate is predicted to be Good, Average, or Poor.
Unknown candidate:
Performance (prior)
Good=0.25
Average=0.45
Poor=0.3
Experience (given Performance)
Performance  Low  Medium  High
Good         0.2  0.3     0.5
Average      0.4  0.4     0.2
Poor         0.6  0.3     0.1
GPA (given Performance)
Performance  A     B     C or D
Good         0.35  0.35  0.3
Average      0.3   0.4   0.3
Poor         0.1   0.3   0.6
Salary (given Performance and Experience)
Performance  Experience  Low  Medium  High
Good         Low         0.4  0.4     0.2
Good         Medium      0.2  0.4     0.4
Good         High        0.1  0.3     0.6
Average      Low         0.5  0.4     0.1
Average      Medium      0.3  0.5     0.2
Average      High        0.2  0.4     0.4
Poor         Low         0.7  0.2     0.1
Poor         Medium      0.5  0.3     0.2
Poor         High        0.4  0.3     0.3
c. [Points 8] Given the following data and the rule Color=Y and Size=S and Act=S and
Age=A => Inflated=Yes
(i) Determine the support and accuracy of the rule.
(ii) Prune the rule so its support is at least 3 and accuracy is 75%.
Show your work.
Instance# Color Size Act Age Inflated (outcome)
1 Y S S A Yes
2 Y S S C Yes
3 Y S D A Yes
4 Y S D C Yes
5 Y L S A Yes
6 Y L S C No
7 Y L D A No
8 Y L D C No
9 P S S A Yes
10 P S S C No
11 P S D A No
12 P S D C No
13 P L S A Yes
14 P L S C No
15 P L D A No
16 P L D C No
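Rule evaluation can be sketched as counting the instances a rule covers and how many of those it predicts correctly. The example below assumes support means the number of covered instances predicted correctly and accuracy means correct divided by covered; textbooks differ on these terms, so check the course definition. The helper name is illustrative.

```python
COLUMNS = ["Color", "Size", "Act", "Age", "Inflated"]
DATA = [
    "Y S S A Yes", "Y S S C Yes", "Y S D A Yes", "Y S D C Yes",
    "Y L S A Yes", "Y L S C No", "Y L D A No", "Y L D C No",
    "P S S A Yes", "P S S C No", "P S D A No", "P S D C No",
    "P L S A Yes", "P L S C No", "P L D A No", "P L D C No",
]
ROWS = [dict(zip(COLUMNS, line.split())) for line in DATA]

def evaluate(conditions, outcome="Yes"):
    # Instances matching every condition in the antecedent.
    covered = [r for r in ROWS if all(r[k] == v for k, v in conditions.items())]
    correct = [r for r in covered if r["Inflated"] == outcome]
    accuracy = len(correct) / len(covered) if covered else 0.0
    return len(covered), len(correct), accuracy

full_rule = {"Color": "Y", "Size": "S", "Act": "S", "Age": "A"}
print(evaluate(full_rule))
```

Pruning candidates can be scored the same way by calling `evaluate` with one or more conditions removed from the dictionary.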
Question 5 [Points 20]
a. [Points 6] Given the following seven instances of numeric data for an attribute along
with the corresponding outcome, suggest the first point that could be used to divide
the range using the entropy method. Show your work. (Each instance is shown as a pair of
attribute value and the outcome (T or F)).
<80, T>, <20, T>, <50, F>, <65, F>, <40, F>, <25, T>, <110, T>
b. [Points 6] Given the following code for the output Grade (A, B, C, D, F), determine
(i) The Hamming distance of the code (ii) How many errors can this code correct?
Class Class vector
A 11100111
B 00011000
C 10101010
D 01010101
F 00110011
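The Hamming distance of a code is the smallest number of differing bit positions over all pairs of code words, and a code with minimum distance d can correct floor((d - 1) / 2) errors. A minimal sketch of that calculation for the class vectors above:

```python
from itertools import combinations

CODE = {
    "A": "11100111",
    "B": "00011000",
    "C": "10101010",
    "D": "01010101",
    "F": "00110011",
}

def hamming(a, b):
    # Number of positions at which two equal-length bit strings differ.
    return sum(x != y for x, y in zip(a, b))

pairwise = {(p, q): hamming(CODE[p], CODE[q]) for p, q in combinations(CODE, 2)}
min_distance = min(pairwise.values())
correctable_errors = (min_distance - 1) // 2
print(pairwise, min_distance, correctable_errors)
```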
c. [Points 4] Show one way to transform the following two attributes: Gender (M, F)
and GPA (A, B, C, D, F) into a single attribute. Justify.
d. [Points 4] A dataset with instances containing 8 attributes has been transformed using
Principal Component Analysis (PCA). The results are as follows. Determine the
principal components that we need to choose so as to capture a variance of at least
97%.
Component Variance
1 61%
2 20%
3 12%
4 2%
5 1.5%
6 1.3%
7 1.2%
8 1.0%
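Selecting components by cumulative variance can be sketched as a running sum over the components in decreasing order of variance, stopping once the target is reached. The function name below is illustrative.

```python
# Variance explained by each component (percent), in the order given above.
variances = [61, 20, 12, 2, 1.5, 1.3, 1.2, 1.0]

def components_needed(variances, target):
    cumulative = 0.0
    for count, variance in enumerate(variances, start=1):
        cumulative += variance
        if cumulative >= target:
            return count, cumulative
    return None  # target not reachable with the given components

result = components_needed(variances, 97)
print(result)
```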
Question 1 [Points 20] An employer collected data on employees' background and
performance, and chose three different models to represent this information. Using each of
the three models, predict the estimated salary (in $K) for two new applicants with the following
qualifications: (i) Education=BS, GPA=3.5, Experience=2 years (ii) Education=HS,
GPA=2.5, Experience=10 years.
(a) Linear model: Salary = Education*5 + GPA*10 + Experience*5, where Education is coded as
HS=1, BS=3, MS=4, PhD=6.
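Applying the linear model in (a) amounts to mapping the education level to its numeric code and evaluating the weighted sum. A minimal sketch, with illustrative names:

```python
# Education codes given in the question.
EDUCATION_CODE = {"HS": 1, "BS": 3, "MS": 4, "PhD": 6}

def linear_salary(education, gpa, experience_years):
    # Salary ($K) = Education*5 + GPA*10 + Experience*5
    return EDUCATION_CODE[education] * 5 + gpa * 10 + experience_years * 5

applicants = [("BS", 3.5, 2), ("HS", 2.5, 10)]
salaries = [linear_salary(*applicant) for applicant in applicants]
print(salaries)
```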
(b) Model tree: