Assignment title: Information
Q1 Bayesian Networks: Metastatic cancer is a possible cause of brain tumors and is also an
explanation for increased total serum calcium. In turn, either of these could explain a patient
falling into a coma. Severe headache is also associated with brain tumors. A BN representation
of this metastatic cancer example is shown below (Figure 1). All the nodes are Booleans. Given
that a patient has a severe headache, has a brain tumor, is not in a coma, and does not have symptoms
of increased serum calcium, determine the probability that the patient has metastatic cancer.
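
One way to set up the Q1 calculation is sketched below. It assumes the network structure described in the text (metastatic cancer M is the parent of brain tumor B and serum calcium S; B and S are the parents of coma C; B is the parent of headache H). The conditional probability values live in Figure 1, which is not reproduced here, so the numbers passed in the example call are placeholders, and the function and parameter names are illustrative only.

```python
def posterior_metastatic(p_m, p_b_given_m, p_b_given_not_m,
                         p_s_given_m, p_s_given_not_m):
    """P(M=true | B=true, S=false, C=false, H=true) under the assumed structure.

    The factors P(C=false | B, S) and P(H=true | B) are identical for both
    values of M, so they cancel when the posterior is normalized.
    """
    with_m = p_m * p_b_given_m * (1 - p_s_given_m)
    without_m = (1 - p_m) * p_b_given_not_m * (1 - p_s_given_not_m)
    return with_m / (with_m + without_m)


# Placeholder CPT entries (NOT read from Figure 1); substitute the real values.
example = posterior_metastatic(p_m=0.2,
                               p_b_given_m=0.2, p_b_given_not_m=0.05,
                               p_s_given_m=0.8, p_s_given_not_m=0.2)
print(example)
```

Because B and S are both observed, the coma and headache evidence drops out of the ratio; only the prior on M and the two child tables of M are needed.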
Q2 Given the following classification rule on weather data, prune it so that it does not overfit.
The goal is to obtain a good rule whose support is at least 3 and accuracy is 50% or more. The
current rule has a support of 1 and accuracy of 100%. Show your work.
Outlook=sunny and temp=cool and humidity=normal and windy=false ==> Play = Yes
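
A small helper for checking the support and accuracy of candidate rules is sketched below. The assignment does not reproduce the data, so this assumes the standard 14-instance nominal weather dataset; the helper name `evaluate` and the field names are hypothetical. Support is counted here as the number of instances that match the rule's conditions, which is consistent with the stated support of 1 for the current rule.

```python
# Assumed data: the standard 14-instance nominal weather dataset
# (outlook, temp, humidity, windy, play). Not given in the assignment text.
FIELDS = ("outlook", "temp", "humidity", "windy", "play")
WEATHER = [
    ("sunny", "hot", "high", False, "no"),
    ("sunny", "hot", "high", True, "no"),
    ("overcast", "hot", "high", False, "yes"),
    ("rainy", "mild", "high", False, "yes"),
    ("rainy", "cool", "normal", False, "yes"),
    ("rainy", "cool", "normal", True, "no"),
    ("overcast", "cool", "normal", True, "yes"),
    ("sunny", "mild", "high", False, "no"),
    ("sunny", "cool", "normal", False, "yes"),
    ("rainy", "mild", "normal", False, "yes"),
    ("sunny", "mild", "normal", True, "yes"),
    ("overcast", "mild", "high", True, "yes"),
    ("overcast", "hot", "normal", False, "yes"),
    ("rainy", "mild", "high", True, "no"),
]


def evaluate(rule, target="yes", data=WEATHER):
    """Return (support, accuracy) for a rule given as {attribute: value}."""
    rows = [dict(zip(FIELDS, r)) for r in data]
    covered = [r for r in rows if all(r[a] == v for a, v in rule.items())]
    correct = sum(r["play"] == target for r in covered)
    accuracy = correct / len(covered) if covered else 0.0
    return len(covered), accuracy


full_rule = {"outlook": "sunny", "temp": "cool",
             "humidity": "normal", "windy": False}
print(evaluate(full_rule))
```

Calling `evaluate` on copies of the rule with one condition removed at a time shows how support and accuracy change as the rule is generalized.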
What to submit? Submit a pdf file with your answers via Blackboard. Your output should
look like this:
Name Course HW#
Q1 Work and results for Q1
Q2 Work and results for Q2
Q1. Given the following data, show ways to discretize age based on (i) equal-width binning (4
bins), (ii) equal-frequency binning (4 bins), and (iii) entropy-based discretization. Salary is
the outcome class.
Age Experience Education Salary
45 20 MS High
65 40 BS Medium
25 5 HS Low
35 10 BS High
27 5 BS High
22 0 BS Low
30 3 MS Medium
66 40 MS Medium
50 25 BS Medium
37 15 BS High
33 10 MS Medium
40 15 MS High
23 5 HS Low
24 2 ES Medium
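
The three discretization schemes in Q1 can be sketched as follows, using the Age and Salary columns from the table above. This is a minimal illustration, not a reference solution: the function names are hypothetical, the equal-frequency split places any leftover values in the earliest bins (other conventions exist), and the entropy-based part finds only the first binary cut point, which would then be applied recursively to each side.

```python
from collections import Counter
from math import log2

# (age, salary) pairs copied from the table above
DATA = [(45, "High"), (65, "Medium"), (25, "Low"), (35, "High"),
        (27, "High"), (22, "Low"), (30, "Medium"), (66, "Medium"),
        (50, "Medium"), (37, "High"), (33, "Medium"), (40, "High"),
        (23, "Low"), (24, "Medium")]
AGES = [age for age, _ in DATA]


def equal_width_edges(values, k):
    """Bin edges that cut [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(k + 1)]


def equal_frequency_bins(values, k):
    """Split the sorted values into k bins whose sizes differ by at most 1."""
    ordered = sorted(values)
    size, extra = divmod(len(ordered), k)
    bins, start = [], 0
    for i in range(k):
        end = start + size + (1 if i < extra else 0)
        bins.append(ordered[start:end])
        start = end
    return bins


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())


def best_entropy_split(pairs):
    """Return (threshold, weighted entropy) of the best single cut point."""
    ordered = sorted(pairs)
    best = None
    for i in range(1, len(ordered)):
        if ordered[i - 1][0] == ordered[i][0]:
            continue  # no cut point between equal values
        left = [label for _, label in ordered[:i]]
        right = [label for _, label in ordered[i:]]
        weighted = (len(left) * entropy(left)
                    + len(right) * entropy(right)) / len(ordered)
        threshold = (ordered[i - 1][0] + ordered[i][0]) / 2
        if best is None or weighted < best[1]:
            best = (threshold, weighted)
    return best


print(equal_width_edges(AGES, 4))
print(equal_frequency_bins(AGES, 4))
print(best_entropy_split(DATA))
```

Candidate cut points are taken as midpoints between consecutive distinct ages, and the cut with the lowest weighted class entropy (equivalently, the highest information gain) is kept.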
Q2. Transform salary into binary variables using the standard method, the error-correcting code
method, and nested dichotomies.
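
The three encodings in Q2 can be sketched as below. Several choices here are assumptions rather than part of the assignment: the class ordering Low, Medium, High; reading the "standard method" as one 0/1 indicator per class value; using an exhaustive code (one column per nontrivial two-way split of the classes) as the error-correcting code; and the particular dichotomy tree {Low} vs {Medium, High}, then {Medium} vs {High}. All function names are hypothetical.

```python
from itertools import product

CLASSES = ["Low", "Medium", "High"]  # assumed ordering of the Salary values


def one_per_class(label, classes=CLASSES):
    """One 0/1 indicator per class value."""
    return tuple(int(label == c) for c in classes)


def exhaustive_code(classes=CLASSES):
    """Code words built from every nontrivial two-way split of the classes.

    A split and its complement count once, so each column is fixed to have
    a 1 for the first class, and the all-ones column is skipped.
    """
    k = len(classes)
    columns = [(1,) + rest
               for rest in product((0, 1), repeat=k - 1)
               if not all(rest)]
    return {c: tuple(col[i] for col in columns)
            for i, c in enumerate(classes)}


def nested_dichotomy(label):
    """Assumed tree: {Low} vs {Medium, High}, then {Medium} vs {High}.

    None marks a split that does not apply to this label.
    """
    if label == "Low":
        return (0, None)
    return (1, int(label == "High"))


for c in CLASSES:
    print(c, one_per_class(c), exhaustive_code()[c], nested_dichotomy(c))
```

With only three classes the exhaustive code has three columns, so its code words differ in just two positions and cannot correct an error; the method becomes more useful as the number of classes grows.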