Assignment for Class 17

Your name:
Your email address:


  1. (20 minutes) Gene Identification Exercise A (Prokaryotic genomic DNA)

    For the following sequences from a prokaryote (the archaeon Thermoplasma acidophilum), identify possible Open Reading Frames (ORFs) using the ORF finder at the NCBI:
    http://www.ncbi.nlm.nih.gov/gorf/gorf.html

    • Cut and paste the Thermoplasma acidophilum genomic DNA sequence fragment (T_acido.fa) into the sequence window of the ORF finder and press <OrfFind>.
      The web page that is returned shows a symbolic representation of the 6 possible translation frames. Potential ORFs are indicated in green.

    • Select one ORF by clicking on it in the symbolic representation.

    • Use the BLAST link at the top of the page to run a BLAST search for the selected ORF and for one additional open reading frame. Check the graphical overview button in the parameter box on the format page before requesting the results.

    • Based on your own and your neighbors' analyses: A) Which ORFs are likely to form an operon?  B) Which is the coding strand?  C) Is it the same for all ORFs?

    • You can also select an ORF to work on by clicking on the green boxes in the table on the right. The first number gives the reading frame; the entries are ranked by length. Select the longest ORF in frame +2 and do a BLAST search. Check the graphical overview box on the format page. D) Do you notice anything strange?

    • E) What happens when you press the accept button below the graphic representation?
       
    • F) Glimmer is a program that aims to find real ORFs based on compositional analyses. A web version is here. Copy and paste the Thermoplasma acidophilum genomic DNA sequence fragment (T_acido.fa) into the sequence window, then select linear and bacterial/archaeal. Did Glimmer rank as most probable the ORFs that you identified as being part of the ATPase operon?
       
    • G) As an aside, Glimmer and many similar programs do really well at identifying real ORFs (98% of the real ORFs are identified). What other parameter would you like to know in order to judge the overall success rate of the program?
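
    The six-frame scan behind the symbolic representation in Exercise 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code NCBI's ORF finder actually runs: it treats ATG as the only start codon, uses the standard stop codons TAA/TAG/TGA, reports coordinates 0-based on whichever strand was scanned, and the names find_orfs and min_codons are made up for this example.

```python
# Minimal six-frame ORF scan (illustrative sketch; not NCBI ORF finder's code).
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}
START = "ATG"  # assumption: ATG is the only start codon considered


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, offset, min_codons):
    """Yield (start, end) of ATG...stop ORFs in one reading frame of seq."""
    start = None
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == START:
            start = i  # first ATG after the previous stop opens an ORF
        elif start is not None and codon in STOPS:
            if (i - start) // 3 >= min_codons:
                yield start, i + 3  # end index includes the stop codon
            start = None


def find_orfs(seq, min_codons=30):
    """Return (frame, start, end, length_nt) tuples, longest ORF first.

    Frames +1..+3 are read on the given strand, -1..-3 on its reverse
    complement; start/end are 0-based on the strand that was scanned.
    """
    seq = seq.upper()
    results = []
    for sign, strand in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            for start, end in orfs_in_frame(strand, offset, min_codons):
                results.append((sign * (offset + 1), start, end, end - start))
    return sorted(results, key=lambda r: r[3], reverse=True)
```

    Calling find_orfs on the contents of T_acido.fa (sequence lines only, header removed) gives a length-ranked list comparable in spirit to the table of green boxes. Expect differences from the web tool, which may use other start codons and length cutoffs.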




  2. (15 minutes) Genome Alignments and Synteny

    So far, you have learned how to search, compare and manipulate single genes using bioinformatic tools. However, it is also possible to align complete prokaryotic and eukaryotic genomes. Many freely available tools perform different kinds of genome comparison, but for today's class we will concentrate on a new service developed by the Joint Genome Institute (JGI). The Integrated Microbial Genomes (IMG) server allows you to search, select and compare portions of fully sequenced or partially completed genome sequences.

    Go to the IMG server (link). In this exercise, we will compare portions of genomes from different bacterial species to verify if they all have the same order for the genes encoding the ATP synthase subunits. First, click on the find genes tool bar. In the Keyword window, type in ATP synthase and select Thermotoga maritima in the organism list. The resulting search displays all subunits that are part of the complete ATP synthase (except the Flagellum-specific ATP synthase which is part of the bacterial flagellum assembly).

    In the Gene object ID column, click on the number corresponding to the beta chain. This will lead you to a page containing information about that particular ORF. On this page, click on "show all gene information". In the Evidence for function Prediction box, you can see a graphical representation of the genome region where the ATP synthase beta chain is located (red ORF). You can put your mouse cursor over an ORF to display the identity of the gene. (You need to wait until the page is completely loaded!) Look at the ORFs located around the beta chain gene. Where do you think the operon containing the ATP synthase begins and ends?


    Now, click on "Show ortholog neighborhood regions in user-selected organisms" (since we haven't selected any other organism, the next page will display a default list of different species). Is the subunit order of the ATP synthase conserved in the other bacterial species? List a few of the differences you can find.

  3. (15 minutes) The NCBI provides a facility for pairwise genome comparison. This is similar to dotlet, but the units of comparison are complete Open Reading Frames. A circle is placed in the plot if a gene "x" from genome A has its top-scoring BLAST hit in genome B (gene "y"), AND if gene "x" is also the top-scoring hit of gene "y". Go to the NCBI GenePlot page.

    • First compare the two Leptospira interrogans serovars from the scroll-down lists (serovar Copenhageni and serovar Lai) and generate the Genome Plot by pressing the 'Compare' button. What can you conclude about the genome differences by looking at the left window?


    • Select two closely related species and generate the Genome Plot (if the two species you chose do not give any interesting results, retry with other species). Which species did you compare? What interesting feature did you find when comparing those two genomes? Does the plot change when you switch the axes? Try to use the zoom-in feature in the right window to figure out what an interesting sequence encodes.
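
    The rule that decides where a circle is placed in the plot in Exercise 3 (gene "x" and gene "y" are each other's top-scoring hit) is often called a reciprocal best hit. A minimal Python sketch follows. It assumes the BLAST results are already available as dictionaries mapping (query, subject) pairs to scores. The function names and the input layout are hypothetical and are not part of the NCBI tool.

```python
def best_hits(scores):
    """Map each query gene to its top-scoring subject gene.

    scores: dict mapping (query, subject) -> BLAST score
            (assumed input layout, chosen for this illustration).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (x, y) pairs where x's best hit is y and y's best hit is x."""
    a_best = best_hits(a_vs_b)  # genome A genes searched against genome B
    b_best = best_hits(b_vs_a)  # genome B genes searched against genome A
    return sorted((x, y) for x, y in a_best.items() if b_best.get(y) == x)
```

    Each returned pair corresponds to one circle; plotting the genome position of x against the position of y would give a picture of the same kind as the Genome Plot. Genes whose best hit is not reciprocated (for example, one of two paralogs that both match the same gene in the other genome) produce no circle.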
