Assignment title: Information
Complete the following exercises and submit them by the due date. Submit one Word document with all
your code pasted into it (no output screenshots needed), along with all the .pl files (preferably
in a single zip file).
1) Write a program that tests whether a sequence contains an AT repeat. For this
problem, we'll define a repeat as 3 or more consecutive occurrences of AT (ATATAT, ATATATAT, etc.). Store a
sequence as a string (make one up or get one from NCBI) to test your code. Print a
message telling the user whether or not an AT repeat exists in the sequence. Use at least one
subroutine.
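One possible starting point for exercise 1 is sketched below. It is only an illustration, not the required solution: the subroutine name has_at_repeat and the test sequence are made up, and the match is assumed to be case-insensitive so that lowercase sequences also work.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the sequence contains three or more
# back-to-back copies of AT (ATATAT...), and 0 otherwise.
# Assumption: matching ignores case, so "atatat" also counts.
sub has_at_repeat {
    my ($seq) = @_;
    return $seq =~ /(?:AT){3,}/i ? 1 : 0;
}

# Made-up test sequence; replace it with one of your own or one from NCBI.
my $sequence = 'GGCATATATATCCG';

if (has_at_repeat($sequence)) {
    print "An AT repeat was found in the sequence.\n";
} else {
    print "No AT repeat was found in the sequence.\n";
}
```

The quantifier {3,} applied to the group (?:AT) is what encodes "3 or more occurrences in a row"; a plain /ATATAT/ would give the same yes/no answer, but the grouped form makes the threshold easy to change.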
2) EMBL (European Molecular Biology Laboratory) and EBI (European Bioinformatics
Institute) are basically the European equivalent of NCBI here in the US. They hold the same data
but store it in a different format. Your task is to parse an EMBL record (see
file attached) just like we did for GenBank records above.
EMBL's records are actually easier to parse! Parse out only the sequence ID (from the ID line),
the description (DE lines), and the sequence. Use at least one subroutine.
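A minimal sketch of one way to approach exercise 2 follows. It assumes the usual EMBL flat-file layout: an ID line whose first field is the identifier, one or more DE lines, an SQ line that starts the sequence block, and a // line that ends the record. The subroutine name parse_embl and the tiny record embedded in the script are invented for illustration; your program should read the attached file instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes the text of one EMBL record and returns
# a list of (ID, description, sequence).
sub parse_embl {
    my ($text) = @_;
    my ($id, $description, $sequence) = ('', '', '');
    my $in_sequence = 0;

    for my $line (split /\n/, $text) {
        if ($line =~ /^ID\s+([^;\s]+)/) {
            $id = $1;                           # first field of the ID line
        } elsif ($line =~ /^DE\s+(.*\S)/) {
            # DE can span several lines; join them with single spaces.
            $description .= ($description eq '' ? '' : ' ') . $1;
        } elsif ($line =~ /^SQ/) {
            $in_sequence = 1;                   # sequence lines follow
        } elsif ($line =~ m{^//}) {
            $in_sequence = 0;                   # end of record
        } elsif ($in_sequence) {
            # Drop the spaces and the trailing position count, keep letters.
            (my $bases = $line) =~ s/[^A-Za-z]//g;
            $sequence .= $bases;
        }
    }
    return ($id, $description, $sequence);
}

# Made-up record used only to exercise the subroutine.
my $record = <<'END_RECORD';
ID   EX000001; SV 1; linear; mRNA; STD; PLN; 24 BP.
XX
DE   Hypothetical example record,
DE   made up for illustration.
XX
SQ   Sequence 24 BP; 6 A; 6 C; 6 G; 6 T; 0 other;
     acgtacgtac gtacgtacgt acgt                                    24
//
END_RECORD

my ($id, $description, $sequence) = parse_embl($record);
print "ID: $id\n";
print "Description: $description\n";
print "Sequence: $sequence\n";
```

Because every EMBL line begins with a two-letter code, a single pass with one regular expression per line type is enough; only the sequence block needs a flag, since its lines have no code of their own.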