Assignment #3: Genome and Comparative Genomics

 

1. Go to the taxonomy browser in Entrez. Use it to find the taxonomic position of Pyrococcus and Aeropyrum.
To which kingdoms do they belong?
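
The same lookup can also be scripted rather than clicked through. The sketch below is only an illustration, assuming the NCBI E-utilities endpoints `esearch.fcgi` and `efetch.fcgi` with `db=taxonomy`, and assuming the returned XML carries the lineage in a `<Lineage>` element with names separated by semicolons. The helper names and the sample XML are made up for this example; the sample lineage is a placeholder, not real data.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (assumed; check the current NCBI documentation).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(name):
    """Build a URL that searches the taxonomy database for a taxon name."""
    return f"{EUTILS}/esearch.fcgi?" + urlencode({"db": "taxonomy", "term": name})


def efetch_url(tax_id):
    """Build a URL that fetches the taxonomy record for a numeric taxon id."""
    query = urlencode({"db": "taxonomy", "id": tax_id, "retmode": "xml"})
    return f"{EUTILS}/efetch.fcgi?{query}"


def lineage_from_xml(xml_text):
    """Split the text of the first <Lineage> element into a list of taxon names."""
    root = ET.fromstring(xml_text)
    lineage = root.findtext(".//Lineage") or ""
    return [name.strip() for name in lineage.split(";") if name.strip()]


# Invented placeholder record, shaped like the XML this sketch expects.
SAMPLE_XML = """
<TaxaSet>
  <Taxon>
    <ScientificName>Example genus</ScientificName>
    <Lineage>top level; middle level; lower level</Lineage>
  </Taxon>
</TaxaSet>
"""

if __name__ == "__main__":
    print(esearch_url("Pyrococcus"))
    print(lineage_from_xml(SAMPLE_XML))
```

Opening the printed URL in a browser, reading off the taxon id, and then requesting the `efetch_url` for that id should return the record whose lineage answers the question.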
   

 

2. Go to the Entrez genome section.
Select the genome of Aeropyrum pernix.

Explore the different genome views (click and explore the different options).
 
Select Taxtable or TAXA. 
 
How many ORFs do you find whose most similar sequence is a eukaryotic one? 
What could be the reason for this? (For a clue, click on the second-to-last pink dot.)
Find out what it might encode. (Hint: BLAST / PSI-BLAST.) What might be the function of the protein encoded by this ORF in evolutionary terms?
Are you surprised to find a homologue to this ORF encoded in some inteins?  
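
If you copy the taxonomic table into a file, the count of ORFs by best-hit domain can be double-checked with a few lines of code. This is a minimal sketch under the assumption that each ORF has been reduced to a pair of (identifier, domain of its most similar sequence); the identifiers and domains below are invented and are not taken from the Aeropyrum pernix genome.

```python
from collections import Counter

# Hypothetical data: (ORF identifier, domain of the most similar database sequence).
best_hits = [
    ("orf_001", "Archaea"),
    ("orf_002", "Archaea"),
    ("orf_003", "Bacteria"),
    ("orf_004", "Eukaryota"),
    ("orf_005", "Archaea"),
    ("orf_006", "Eukaryota"),
]


def tally_best_hit_domains(hits):
    """Count how many ORFs have their best hit in each domain."""
    return Counter(domain for _, domain in hits)


def orfs_with_best_hit_in(hits, domain):
    """List the ORFs whose best hit falls in the given domain."""
    return [orf for orf, hit_domain in hits if hit_domain == domain]


if __name__ == "__main__":
    print(tally_best_hit_domains(best_hits))
    print(orfs_with_best_hit_in(best_hits, "Eukaryota"))
```

The short list returned by `orfs_with_best_hit_in` is the set of candidates worth inspecting individually with BLAST or PSI-BLAST.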
 
 

3. Go to Robert L. Charlebois's genome and bioinformatics site. Using the program gene clusters, try to find out whether the vacuolar/archaeal ATPase subunits are arranged in an operon in Borrelia, Methanococcus and Archaeoglobus.

In which order are these subunits arranged in the operon?
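
To make the idea of "arranged in an operon" concrete, here is a rough sketch of one way to test it from gene coordinates. It is not the method used by the gene clusters program; it simply groups neighbouring genes that lie on the same strand and are separated by a small gap, which is a necessary but not sufficient condition for an operon. The `Gene` record, the 300 bp gap threshold and all coordinates are assumptions made for illustration.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # leftmost coordinate on the chromosome
    end: int     # rightmost coordinate on the chromosome
    strand: str  # "+" or "-"


def cluster_genes(genes, max_gap=300):
    """Group genes into runs of same-strand neighbours separated by at most max_gap bp."""
    clusters = []
    for gene in sorted(genes, key=lambda g: g.start):
        last = clusters[-1][-1] if clusters else None
        if last and gene.strand == last.strand and gene.start - last.end <= max_gap:
            clusters[-1].append(gene)
        else:
            clusters.append([gene])
    return clusters


def transcription_order(cluster):
    """Return gene names in the order they would be transcribed."""
    names = [g.name for g in cluster]
    return names if cluster[0].strand == "+" else names[::-1]


# Invented coordinates: three subunit genes close together on the minus strand.
genes = [
    Gene("other1", 100, 900, "+"),
    Gene("sub3", 5000, 5600, "-"),
    Gene("sub2", 5650, 7400, "-"),
    Gene("sub1", 7420, 8800, "-"),
    Gene("other2", 12000, 13000, "-"),
]

if __name__ == "__main__":
    for cluster in cluster_genes(genes):
        print(transcription_order(cluster))
```

Note that genes on the minus strand are listed in reverse coordinate order, because transcription runs in the opposite direction there; keep this in mind when comparing subunit order between genomes.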

 

4. Use the same approach for Aeropyrum. Is there an operon for the vacuolar/archaeal ATPase in Aeropyrum? Does it contain all the subunits? (One way to address this question is to consult the NCBI genome ORF listing.) Devise a strategy to search for the “missing” ORFs, and use your strategy to locate at least one additional ORF.

 

5. Using Aquifex aeolicus as one of the genomes in the program gene clusters, and another bacterial genome of your choice, try to determine whether the other organism has an F-ATPase operon.

 

6. Choose at least one other program from the genome and bioinformatics site and explore its potential use. Summarize your experience in one or two sentences.
For example:

·       Gene families (groups of paralogous genes) in Thermotoga maritima. Are there any that consist exclusively of unidentified ORFs? Which one is the largest?

·       Plot of cumulative strand bias by position. Choose (G-C)/(G+C). What might the resulting pattern mean?

·       Comparative inventories of sequenced genomes. Select two closely related organisms and ask which ORFs are present in one genome but not in the other.  Did you find any surprises?
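
For the cumulative strand bias plot listed above, it can help to see how such a curve is computed. The sketch below assumes the bias is (G-C)/(G+C) measured in consecutive, non-overlapping windows and then summed along the sequence; the window size and the function name are choices made for this example, and the site's program may compute the plot differently.

```python
def cumulative_gc_skew(sequence, window=1000):
    """Return the running sum of (G-C)/(G+C), one value per window along the sequence."""
    seq = sequence.upper()
    running_total = 0.0
    curve = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # A window without any G or C contributes no bias.
        skew = (g - c) / (g + c) if (g + c) else 0.0
        running_total += skew
        curve.append(running_total)
    return curve


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by a C-rich half.
    toy = "GGGA" * 5 + "CCCA" * 5
    print(cumulative_gc_skew(toy, window=4))
```

Plotting the returned values against window position gives the curve; when you interpret the plot on the site, pay attention to where the curve changes direction.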