Homework 5

Problem 1 (30pts)

Derive weights for the sequences ACTA, ACTT, CGTT, and AGAT using the Thompson, Higgins, and Gibson method.

Use the outline below (a-d) to solve this problem

a) compute pairwise distances between sequences

b) apply the UPGMA method to join the sequences and, subsequently, the clusters

c) build phylogenetic tree

d) derive sequence weights
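
As a cross-check for a hand solution, steps a-d can be sketched in R. This is only a minimal sketch under assumptions the problem does not state: the distance is the raw count of mismatched positions, UPGMA is taken from base R's hclust with average linkage, node heights are half the merge distance, and the weights are left unnormalized. The labels S1 to S4 are names chosen here for illustration.

```r
# Sequences from Problem 1 (labels S1..S4 are illustrative)
seqs <- c(S1 = "ACTA", S2 = "ACTT", S3 = "CGTT", S4 = "AGAT")
n <- length(seqs)

# (a) Pairwise distances: number of mismatched positions (assumed metric)
chars <- strsplit(seqs, "")
d <- matrix(0, n, n, dimnames = list(names(seqs), names(seqs)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- sum(chars[[i]] != chars[[j]])
  }
}

# (b, c) UPGMA is average-linkage clustering; hclust reports the merge
# distance, so each internal node sits at half that value
tree <- hclust(as.dist(d), method = "average")
node_height <- tree$height / 2

# (d) Weights: every branch length is shared equally among the leaves
# below that branch, then summed from each leaf up to the root
weights <- setNames(numeric(n), names(seqs))
members <- vector("list", n - 1)  # leaf indices under each internal node
for (k in seq_len(n - 1)) {
  leaves_k <- integer(0)
  for (child in tree$merge[k, ]) {
    if (child < 0) {              # negative entry: a single sequence
      leaves <- -child
      child_height <- 0
    } else {                      # positive entry: an earlier cluster
      leaves <- members[[child]]
      child_height <- node_height[child]
    }
    branch <- node_height[k] - child_height
    weights[leaves] <- weights[leaves] + branch / length(leaves)
    leaves_k <- c(leaves_k, leaves)
  }
  members[[k]] <- leaves_k
}

print(d)
print(weights)
```

Under these assumptions the sketch gives 0.875 for ACTA and ACTT and 1.125 for CGTT and AGAT. A different distance measure (for example, percent identity) or a normalization step would change the numbers but not the procedure.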

Problem 2 (10pts)

We assumed the additive property when constructing the UPGMA tree in Problem 1. What is the limitation of this assumption (if any)?

Problem 3 (20pts)

The protein sequence of bacterial species "B3" was used as a BLAST query against the Swiss-Prot protein database. The query returned significant hits to four other bacterial proteins (B1, B2, B4, B5) and to one protein in the human genome (H). No other mammalian species has been shown to have a protein similar to B3. Phylogenetic tree construction by several methods resulted in the tree shown below. Explain the presence of this gene in humans.

[Figure: phylogenetic tree with leaves, in order, B1, H, B2, B3, B4, B5]

Problem 4 (10pts)

Describe the technical and theoretical challenges associated with building phylogenetic trees.

Problem 5 (10pts)

Compare and contrast the parsimony, maximum likelihood, UPGMA, and neighbor-joining methods.

Problem 6 (20pts)

Create a multiple sequence alignment and a phylogenetic tree in R using ape and ClustalW by following the steps below:

1. Install ClustalW (the version for your OS) on your computer from http://www.clustal.org/clustal2/

2. Open R (all of the following steps will be implemented in R)

3. Set a working directory

4. Install the "ape" package from your R session by typing: install.packages("ape")

5. Load the "ape" package by typing: library("ape")

6. Read from GenBank, by accession number, the sequences you downloaded for Homework 2; this step is mainly for practice, since you have already downloaded these sequences.

7. Save the result from step 6 as a file

8. Run ClustalW by typing:

system(paste('"path_to_YOUR_clustalw/clustalw2.exe" new.fas'))

9. Read the alignment file (*.aln); it should be in your working directory

10. Create a phylogenetic tree using the neighbor-joining method

11. Plot the tree

Submit working R code in a separate file.
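
One possible way to string steps 6-10 together is sketched below. It is not the required solution: the function name build_nj_tree, its arguments, and the accession placeholders in the usage lines are hypothetical, and the sketch assumes that "ape" is installed, that GenBank is reachable, that the sequences are DNA, and that ClustalW writes its alignment next to the input file with an .aln extension. It calls ClustalW through system2 instead of system(paste(...)) so that R handles the quoting of the path.

```r
# Hypothetical helper covering steps 6-10; requires the "ape" package,
# internet access, and a local ClustalW executable.
build_nj_tree <- function(accessions, clustalw_path, fasta_file = "new.fas") {
  # Step 6: download the sequences from GenBank by accession number
  seqs <- ape::read.GenBank(accessions)

  # Step 7: save them as a FASTA file in the working directory
  ape::write.dna(seqs, file = fasta_file, format = "fasta")

  # Step 8: run ClustalW on the FASTA file
  system2(clustalw_path, args = fasta_file)

  # Step 9: read the alignment (assumed to be <input name>.aln)
  aln_file <- sub("\\.[^.]*$", ".aln", fasta_file)
  aln <- ape::read.dna(aln_file, format = "clustal")

  # Step 10: neighbor-joining tree from pairwise DNA distances
  ape::nj(ape::dist.dna(aln))
}

# Usage (placeholders only; substitute your own directory, path and
# Homework 2 accession numbers, at least three of them):
# setwd("path_to_YOUR_working_directory")                     # step 3
# tree <- build_nj_tree(c("ACCESSION_1", "ACCESSION_2", "ACCESSION_3"),
#                       "path_to_YOUR_clustalw/clustalw2.exe")
# plot(tree)                                                  # step 11
```

Wrapping the steps in a function is only a convenience; running the same calls one at a time at the R prompt follows the numbered steps more literally.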