Assignment title: Information
Homework 5
Problem 1 (30pts)
Derive weights for the sequences
ACTA
ACTT
CGTT
AGAT
using the Thompson, Higgins, and Gibson method.
Use the outline below (a-d) to solve this problem:
a) compute pairwise distances between the sequences
b) apply the UPGMA method to join the sequences and, subsequently, the clusters
c) build the phylogenetic tree
d) derive the sequence weights
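The outline (a-d) can be sketched in base R as follows. This is only an illustration under stated assumptions, not the required solution: the distance is taken to be the raw number of mismatched positions (the problem does not fix a distance measure), the labels S1 to S4 are hypothetical names for the four sequences, UPGMA is run through hclust() with average linkage, and each sequence weight is the sum, along the path from leaf to root, of each branch length divided by the number of sequences sharing that branch.

```r
# Hypothetical labels S1..S4 for the four sequences of Problem 1.
seqs <- c(S1 = "ACTA", S2 = "ACTT", S3 = "CGTT", S4 = "AGAT")

# a) Pairwise distances (assumption: raw count of mismatched positions).
chars <- do.call(rbind, strsplit(seqs, ""))
n <- nrow(chars)
d <- matrix(0, n, n, dimnames = list(names(seqs), names(seqs)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- sum(chars[i, ] != chars[j, ])
  }
}

# b), c) UPGMA is average-linkage clustering; hc$height holds the
# inter-cluster distance at each join, so node height = height / 2.
hc <- hclust(as.dist(d), method = "average")

# d) Walk the joins in order; every branch contributes
#    (branch length) / (number of leaves below it) to each of those leaves.
members <- vector("list", n - 1)
w <- setNames(numeric(n), names(seqs))
for (k in seq_len(n - 1)) {
  node_h <- hc$height[k] / 2
  leaves_k <- integer(0)
  for (child in hc$merge[k, ]) {
    if (child < 0) {            # negative entry: a single sequence
      leaves <- -child
      child_h <- 0
    } else {                    # positive entry: a cluster formed earlier
      leaves <- members[[child]]
      child_h <- hc$height[child] / 2
    }
    w[leaves] <- w[leaves] + (node_h - child_h) / length(leaves)
    leaves_k <- c(leaves_k, leaves)
  }
  members[[k]] <- leaves_k
}

w_norm <- w / sum(w)            # weights rescaled to sum to 1
print(d)
print(w)
print(w_norm)
```

Calling plot(hc) draws the corresponding dendrogram. A different distance measure (for example, the fraction of mismatched positions) would rescale the branch lengths and the raw weights but leave the normalized weights unchanged.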
Problem 2 (10pts)
We assumed the additive property when we constructed the UPGMA tree in Problem 1.
What is the limitation of this assumption (if any)?
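As an aid for thinking about this question, additivity of a distance matrix can be tested with the four-point condition: for any four sequences, the two largest of the three pairwise sums must be equal. The sketch below is a minimal illustration, assuming the distances are the mismatch counts between the four Problem 1 sequences (an assumed measure, not one the assignment prescribes); the function name four_point is hypothetical.

```r
# Mismatch counts between ACTA, ACTT, CGTT, AGAT (assumed distance measure).
d <- matrix(c(0, 1, 3, 3,
              1, 0, 2, 2,
              3, 2, 0, 2,
              3, 2, 2, 0), nrow = 4, byrow = TRUE)

# Four-point condition: the two largest of the three pairwise sums are equal.
four_point <- function(d, i, j, k, l, tol = 1e-9) {
  sums <- sort(c(d[i, j] + d[k, l],
                 d[i, k] + d[j, l],
                 d[i, l] + d[j, k]))
  abs(sums[3] - sums[2]) <= tol
}

is_additive <- four_point(d, 1, 2, 3, 4)
print(is_additive)
```

With more than four sequences the same check would have to hold for every quartet, for example by looping over the columns of combn(n, 4).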
Problem 3 (20pts)
The protein sequence of bacterial species "B3" was used as a BLAST query against the
Swiss-Prot protein database. The query returned significant hits to four other bacterial
proteins (B1, B2, B4, B5) and to one protein in the human genome (H). No other mammalian
species was found to have a protein similar to B3. Phylogenetic tree construction by
several methods resulted in the tree shown below. Explain the presence of this gene in
humans.
[Figure: phylogenetic tree; leaf labels in order: B1, H, B2, B3, B4, B5]
Problem 4 (10pts)
Describe the technical and theoretical challenges associated with building phylogenetic trees.
Problem 5 (10pts)
Compare and contrast the parsimony, maximum likelihood, UPGMA, and neighbor-joining
methods.
Problem 6 (20pts)
Create a multiple sequence alignment and a phylogenetic tree in R using ape and ClustalW by
following the steps below:
1. Install ClustalW (the version for your OS) on your computer from
http://www.clustal.org/clustal2/
2. Open R (all of the following steps will be carried out in R).
3. Set a working directory
4. Install the "ape" package from your R session by typing:
install.packages("ape")
5. Load the "ape" package by typing:
library("ape")
6. Read the sequences you downloaded for Homework 2 from GenBank by their accession
numbers; this step is mainly for practice, since you have already
downloaded these sequences.
7. Save the result from step 6 as a file (step 8 expects it to be named new.fas).
8. Run clustalw by typing:
system(paste('"path_to_YOUR_clustalw/clustalw2.exe" new.fas'))
9. Read the alignment file (*.aln); it should be in your working directory.
10. Create a phylogenetic tree using the neighbor-joining method.
11. Plot the tree.
Submit your working R code in a separate file.
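Steps 6 to 11 can be sketched in R as shown below. This is a hedged outline rather than a finished solution. The accession strings are placeholders to be replaced with your own Homework 2 accession numbers, the executable path is the placeholder from step 8, and run_pipeline is a hypothetical helper name. It also assumes the sequences are nucleotide sequences (read.GenBank retrieves nucleotide records) and that ClustalW writes its alignment next to the input file with an .aln extension.

```r
# Placeholders: replace with your own values.
accessions   <- c("ACCESSION_1", "ACCESSION_2", "ACCESSION_3")  # from Homework 2
clustalw_exe <- "path_to_YOUR_clustalw/clustalw2.exe"           # as in step 8
fasta_file   <- "new.fas"

# Assumption: ClustalW names its output after the input file (new.aln).
aln_file <- sub("\\.[^.]*$", ".aln", fasta_file)

# Step 8: command line, with the executable path quoted in case it has spaces.
clustalw_cmd <- sprintf('"%s" %s', clustalw_exe, fasta_file)

run_pipeline <- function() {
  library(ape)
  seqs <- read.GenBank(accessions)                      # step 6
  write.dna(seqs, file = fasta_file, format = "fasta")  # step 7
  system(clustalw_cmd)                                  # step 8
  aln <- read.dna(aln_file, format = "clustal")         # step 9
  tree <- nj(dist.dna(aln))                             # step 10
  plot(tree)                                            # step 11
  invisible(tree)
}

# Needs network access, real accession numbers, and ClustalW installed:
# run_pipeline()
```

dist.dna() uses the K80 substitution model by default; another model can be chosen through its model argument.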