Assignment title: Information


Answer the following questions:

1. True/False: For logical propositions P and Q and a valid rule P → Q, if we know that Q is true, then we can infer that P is true. Briefly (in 1-2 sentences) justify your answer. [5]

2. For the English statements "Everyone owns at least one car. If a person has a hot temper and the person owns a red car, the car is dangerous.", answer the following questions:
   (a) Write logical expressions using propositional calculus with the same meaning as the above statements. [5]
   (b) Write logical expressions using predicate calculus with the same meaning as the above statements. [5]
   (c) Write Prolog rules corresponding to the predicate calculus expressions in (b). [5]

3. Given the training data set below, which consists of positive (Yes) and negative (No) examples with three attributes for learning the concept of mammal, what is the entropy of this data set? Show your process of computing the entropy. [10]

   Mammal  Breath  Skin  Food
   Yes     Nose    Hair  Grass
   No      Nose    Bare  Meat
   Yes     Gills   Hair  Grass
   No      Nose    Hair  Meat

4. Consider the following training data set with two attributes, X and Y, and a class column, Output, and answer the following questions. Note: Id is a record ID used only for convenience; it is not a meaningful attribute of the data set.

   Id  Output  X  Y
   r1  1       5  8
   r2  0       4  5
   r3  1       8  3
   r4  0       5  6
   r5  1       1  8

   Initial weight vector W = [0.1, -0.2, 0.3], Bias = 1, Learning rate = 1
   Activation function f(net): if ∑ w_i x_i >= 0, then 1, else 0

   (a) When the Perceptron discussed in class is used, what weight vector will be used for the second record, r2? Show your computation process for the weight vector. [10]
   (b) We want to cluster this training data set using K-means to verify the validity of the expected output values, so we ignore the Output column of each record for clustering. The value of K is 2, and the dissimilarity is measured by |Xi - Xj| + |Yi - Yj|, where i and j are record Ids. Initial centroids are chosen sequentially from the first record. What will be the centroid value and the record Ids assigned to each cluster, C1 and C2, right after the second round of assignment is completed? Show your clustering process for each round of assignment, including the centroid value and the record Ids assigned to each cluster. [10]
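
The clustering procedure described in question 4(b) can be checked with a short script. The following is only a minimal sketch under several assumptions that the question does not state explicitly: "chosen sequentially from the first record" is read as taking the first K records (r1 and r2) as initial centroids; each centroid is updated to the coordinate-wise mean of its members; and a record that is equally far from both centroids goes to the lower-numbered cluster. The function names (`manhattan`, `assign`, `update`, `kmeans_rounds`) are hypothetical and chosen for illustration.

```python
from statistics import mean

# Records from question 4; the Output column is ignored for clustering.
records = {"r1": (5, 8), "r2": (4, 5), "r3": (8, 3), "r4": (5, 6), "r5": (1, 8)}


def manhattan(p, q):
    """Dissimilarity |Xi - Xj| + |Yi - Yj| as defined in the question."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def assign(points, centroids):
    """Assign each record to its nearest centroid.

    Assumption: ties go to the lower-numbered cluster, because min()
    returns the first minimum it encounters.
    """
    clusters = [[] for _ in centroids]
    for rid, p in points.items():
        nearest = min(range(len(centroids)),
                      key=lambda k: manhattan(p, centroids[k]))
        clusters[nearest].append(rid)
    return clusters


def update(points, clusters, old_centroids):
    """Recompute each centroid as the coordinate-wise mean of its members.

    Assumption: an empty cluster keeps its previous centroid.
    """
    new_centroids = []
    for members, old in zip(clusters, old_centroids):
        if members:
            new_centroids.append((mean(points[r][0] for r in members),
                                  mean(points[r][1] for r in members)))
        else:
            new_centroids.append(old)
    return new_centroids


def kmeans_rounds(points, k=2, rounds=2):
    """Run a fixed number of assignment rounds.

    Returns a history of (centroids used, clusters) per round and the
    centroids recomputed after the last round.
    Assumption: the first k records serve as the initial centroids.
    """
    centroids = list(points.values())[:k]
    history = []
    for _ in range(rounds):
        clusters = assign(points, centroids)
        history.append((centroids, clusters))
        centroids = update(points, clusters, centroids)
    return history, centroids


if __name__ == "__main__":
    history, final = kmeans_rounds(records)
    for n, (cents, clusters) in enumerate(history, 1):
        print(f"round {n}: centroids used = {cents}, clusters = {clusters}")
    print("centroids after the last update:", final)
```

A different tie-breaking rule or a different reading of the initial centroid choice may lead to different clusters, so whichever convention was used in class takes precedence over the ones assumed here.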