Assignment #3: Genome and Comparative Genomics
1.
Go to the taxonomy browser in Entrez. Can you use this to find the taxonomic position of Pyrococcus and Aeropyrum? To which kingdoms do they belong?
2.
Go to the Entrez genome section. Select the genome of Aeropyrum pernix.
Explore the different genome views (click and explore the different options).
Select Taxtable or TAXA.
How many ORFs do you find whose most similar sequence is a eukaryotic one?
What could be the reason for this? (For a clue, check out the second-to-last pink dot: click on it.)
Find out what it might encode. (Hint: BLAST / PSI-BLAST.) What might be the function of the protein encoded by this ORF in evolutionary terms?
Are you surprised to find a homologue of this ORF encoded in some inteins?
3.
Go to Robert L. Charlebois’ genome and bioinformatics site. Using the program gene clusters, try to find out whether the vacuolar/archaeal ATPase subunits are arranged in an operon in Borrelia, Methanococcus and Archaeoglobus. In which order are these subunits arranged in the operon?
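The idea behind asking whether subunits are "arranged in an operon" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the gene clusters program on the site works: the `Orf` record, the `find_clusters` function, the 300 bp gap cutoff and the toy coordinates and gene names are all invented for the example and do not describe any real genome. The sketch treats a run of neighbouring genes on the same strand, separated by short intergenic gaps, as a candidate operon and reports the genes of interest in transcription order.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    name: str    # annotated gene or subunit name (hypothetical here)
    start: int   # leftmost coordinate in bp
    end: int     # rightmost coordinate in bp
    strand: str  # "+" or "-"


def find_clusters(orfs, wanted, max_gap=300):
    """Return candidate operons as lists of wanted gene names.

    A candidate operon is a run of neighbouring ORFs on the same strand
    whose intergenic gaps are all at most max_gap bp (an assumed cutoff).
    Only runs holding two or more wanted genes are reported, with names
    listed in transcription order.
    """
    runs, current = [], []
    for orf in sorted(orfs, key=lambda o: o.start):
        if current:
            prev = current[-1]
            # A strand switch or a long gap ends the current run.
            if orf.strand != prev.strand or orf.start - prev.end > max_gap:
                runs.append(current)
                current = []
        current.append(orf)
    if current:
        runs.append(current)

    clusters = []
    for run in runs:
        hits = [o.name for o in run if o.name in wanted]
        if len(hits) >= 2:
            # Minus-strand genes are transcribed right to left.
            if run[0].strand == "-":
                hits.reverse()
            clusters.append(hits)
    return clusters


# Invented toy annotation; names and positions are placeholders only.
toy_genome = [
    Orf("orfX", 100, 900, "+"),
    Orf("atpase_E", 1000, 1600, "+"),
    Orf("atpase_A", 1650, 3400, "+"),
    Orf("atpase_B", 3420, 4800, "+"),
    Orf("orfY", 9000, 9900, "-"),
    Orf("atpase_D", 20000, 20600, "-"),
]
subunits = {"atpase_E", "atpase_A", "atpase_B", "atpase_D"}
clusters = find_clusters(toy_genome, subunits)
print(clusters)
```

In the toy data three subunit genes sit next to each other on the plus strand and a fourth lies far away on the minus strand, so only the first three are reported as a cluster. Gene proximity alone suggests, but does not prove, co-transcription.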
4.
Use the same approach for Aeropyrum. Is there an operon for the vacuolar/archaeal ATPase in Aeropyrum? Does it contain all the subunits? (One way to address this question is the NCBI genome ORF listing.) Devise a strategy to search for the “missing” ORFs, and use your strategy to locate at least one additional ORF.
5.
Using Aquifex aeolicus as one of the genomes in the program gene clusters, and another bacterial genome of your choice, try to determine whether the other organism has an F-ATPase operon.
6.
Choose at least one other program from the genome and bioinformatics site and explore its potential use. Summarize your experience in one or two sentences.
E.g.:
·
Gene families (groups of paralogous genes) in Thermotoga maritima: are there any that consist exclusively of unidentified ORFs? Which one is the largest?
·
Plot of cumulative strand bias by position. Choose G-C over G+C. Any idea what it might mean?
·
Comparative inventories of sequenced genomes. Select two closely related organisms and ask which ORFs are present in one genome but not in the other. Did you find any surprises?
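For the strand bias example, "G-C over G+C" can be read as the ratio (G-C)/(G+C). The sketch below shows one plausible way to compute a cumulative version of that ratio along a sequence. It is an assumption about the calculation, not a description of the plotting program on the site: the function name `cumulative_gc_skew`, the non-overlapping windows and the toy sequence are all invented for illustration.

```python
def cumulative_gc_skew(seq, window=1000):
    """Running sum of (G-C)/(G+C) over consecutive, non-overlapping windows.

    Windows that contain neither G nor C contribute a skew of 0.
    """
    seq = seq.upper()
    total = 0.0
    cumulative = []
    for i in range(0, len(seq), window):
        chunk = seq[i:i + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        total += skew
        cumulative.append(total)
    return cumulative


# Invented toy sequence: a G-rich stretch followed by a C-rich stretch.
toy_seq = "G" * 30 + "A" * 10 + "C" * 30 + "T" * 10
curve = cumulative_gc_skew(toy_seq, window=10)
print(curve)
```

With the toy sequence the curve rises while G outnumbers C and falls once C outnumbers G, so its turning point marks where the bias switches. Plotting the returned values against window position gives the kind of curve the question refers to.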