Gene Identification Exercise

Background:

The problems encountered in finding open reading frames (ORFs) and coding sequences (CDS) provide a nice example of a first-principles approach failing:

In higher eukaryotes the coding sequence is often interrupted by introns. Genes are transcribed into RNA; spliceosomes then remove the introns from the RNA and join the exons back together. In Arabidopsis the splice-site consensus is as follows (from www.arabidopsis.org):

The "|" denotes the exon-intron boundary. Each column represents a nucleotide position relative to the exon-intron boundary. Each row gives the number of times that nucleotide was observed at each position in the dataset.

5' consensus

                          --- intron --- >
A   281  338  588   97  |   11   21  635  545  201  226  307
C   230  347  128   45  |    4    8   47  136   87  147  173
G   174  158   98  740  |  933   29  107   52  474  116  112
T   285  127  156   88  |   22  912  181  237  208  481  378
               A     G       G   T

3' consensus
                    <  --- intron ---
A   174  183  172  291   77  912   27  |  235  249  274  272
C   123  117  107   61  600   14    4  |   90  142  167  182
G   194  161  101  360   29   27  931  |  529  174  215  270
T   479  509  590  258  264   17    8  |  116  405  314  246
                    G    C     A    G      G    T
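The count tables can be turned into a simple scoring scheme. The following is a minimal Python sketch, not the method used by any of the programs in this exercise: it copies the 5' counts from the table above, derives a consensus, and scores 11-nucleotide windows with a log-odds score. The function names, the 0.6 consensus cutoff, the uniform background of 0.25 per nucleotide, and the pseudocount of 1 are all assumptions made for illustration (the Arabidopsis genome is AT-rich, so a uniform background is a simplification).

```python
import math

# 5' splice-site counts copied from the table above:
# four exon columns, then seven intron columns.
DONOR_COUNTS = {
    "A": [281, 338, 588, 97, 11, 21, 635, 545, 201, 226, 307],
    "C": [230, 347, 128, 45, 4, 8, 47, 136, 87, 147, 173],
    "G": [174, 158, 98, 740, 933, 29, 107, 52, 474, 116, 112],
    "T": [285, 127, 156, 88, 22, 912, 181, 237, 208, 481, 378],
}
EXON_COLUMNS = 4   # columns to the left of "|"
WIDTH = 11         # total number of columns


def consensus(min_fraction=0.6):
    """Most frequent base per column, or 'n' if it falls below min_fraction."""
    out = []
    for col in range(WIDTH):
        total = sum(DONOR_COUNTS[b][col] for b in "ACGT")
        best = max("ACGT", key=lambda b: DONOR_COUNTS[b][col])
        out.append(best if DONOR_COUNTS[best][col] / total >= min_fraction else "n")
    return "".join(out[:EXON_COLUMNS]) + "|" + "".join(out[EXON_COLUMNS:])


def log_odds(window, background=0.25, pseudocount=1):
    """Sum of log2(observed frequency / background) over an 11-nt window.

    Assumes the window is upper-case and contains only A, C, G, T.
    """
    score = 0.0
    for col, base in enumerate(window):
        total = sum(DONOR_COUNTS[b][col] for b in "ACGT") + 4 * pseudocount
        freq = (DONOR_COUNTS[base][col] + pseudocount) / total
        score += math.log2(freq / background)
    return score


def scan_donor_sites(seq, min_score):
    """Return (index of first intron base, score) for GT windows scoring >= min_score."""
    hits = []
    for i in range(len(seq) - WIDTH + 1):
        window = seq[i:i + WIDTH]
        if window[EXON_COLUMNS:EXON_COLUMNS + 2] != "GT":
            continue  # only consider windows with GT at the start of the intron
        score = log_odds(window)
        if score >= min_score:
            hits.append((i + EXON_COLUMNS, score))
    return hits


print(consensus())  # nnAG|GTAnnnn
```

Only a handful of positions are strongly conserved, and a GT dinucleotide is expected about once every 16 positions by chance, so the choice of `min_score` trades missed sites against false positives.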

Given the many introns known in Arabidopsis, and the fact that many of the spliceosomal RNAs have been sequenced, one might expect to be able to recognize with high reliability which parts of a given sequence are coding. The following exercises will demonstrate that this is not the case.

Your name:
Your email address:

Gene Identification Exercise 1 (Prokaryotic genomic DNA)

For the following sequence from a prokaryote (the archaeon Thermoplasma acidophilum), identify possible open reading frames using ORF-Finder at the NCBI.

Click on one of the detected ORFs; it should turn pink. Use the BLAST link at the top of the ORF-Finder results page to run searches for at least two open reading frames. Before requesting the results, check the parameter box on the BLAST format page and select the graphical overview option.
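To see what an ORF search does in principle, here is a minimal Python sketch of a six-frame scan. It is an illustration under stated assumptions and not the algorithm used by ORF-Finder: the function names, the ATG-only start codon, the 100-codon minimum length, and the frame labels are hypothetical choices, and archaea can also use alternative start codons such as GTG or TTG.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(raw):
    """Drop FASTA header lines, then remove everything except A/C/G/T.

    This strips the spaces between the blocks of ten bases. Ambiguous
    bases such as N would be removed too, which would shift the frame.
    """
    lines = [ln for ln in raw.splitlines() if not ln.lstrip().startswith(">")]
    return re.sub(r"[^ACGTacgt]", "", "".join(lines)).upper()


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=100, starts=("ATG",)):
    """Return (frame, start, end, codons) tuples, longest ORF first.

    start and end are 0-based offsets on the strand being read; end is
    exclusive and includes the stop codon. codons excludes the stop.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            start = None
            for i in range(offset, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in starts:
                    start = i  # first start codon after the previous stop
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((f"{strand}{offset + 1}", start, i + 3, n_codons))
                    start = None
    return sorted(orfs, key=lambda orf: -orf[3])


# Toy example: one short ORF in the third forward frame.
toy = clean(">toy\nccatggctgc tgctgctgct taag")
print(find_orfs(toy, min_codons=3))  # [('+3', 2, 23, 6)]
```

Adjacent ORFs reported on the same strand are the natural candidates to compare when thinking about operons.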

  1. Based on your and your neighbors' analyses, which ORFs are likely to form an operon?

    Which is the coding strand?

    Is it the same for all ORFs?



  2. Select the longest ORF in frame +2 and do a BLAST search. Check the graphical overview box on the BLAST format page. Do you notice anything strange?

    >Thermoplasma acidophilum genomic DNA
    aactacagta gatgctcttt tagctattgc ggcttcgatc tcaatcgccg gcggccttat aggtaccggt atggcacagc agggaatagg agccgctggt atgggtataa tcgcggagaa
    acctgagaag ttcggccagg ttctgttctt ctttgttata cctgagacgc tctgggtcat aggtcttgct ctgggtatca tactgctgct ccatataatc tgatctctat gtccctcgaa
    gaagtgctga aggatatcga acgcgacaag gaagaaaaga agaaagagat agcagatgct gcgtccaggg agacggcaaa gatagagaag gagagggaag aaaagatcca
    gattctgcag agggaatacg agaacaggat gagggaagag ggcagcaggc tgtacaattc cataatcgac aaggcgaatg tggaggccag gaacatcgtg aggatgaggg ttcaggagat
    cctggaccag tatggcgcaa aggccgatga actgataaag aatctggcca aaacaaagga atacgatgat gtcctgaaga agatgatcga ggtatccaga aaggccctcg ggccagactg
    cattgtgaag gtgaacacag ccgataaggg ccgcatctct gatggaaaca taaagttcga ggatatagat ccgtatgggg gcgtactggc aacctccaga gacggaaaga tagaactcga
    tctgagaata tcaagcataa ggcgcgatat tcttgaacgg ttcaaggtgc ggctttattc aatgatagag gattgagatg gattcaactt acatagggtc ctatggcagg ctccgtgttt
    atcagacaga gtttttcagc aggcagcaaa tagatcagat gctgtcaatg actgatccaa aggatgtgtc tgcctttctc tacaacggtc cttacaggga agattatgac agcctgtccg
    cggtcttcaa ggatcctgat ctgaccgaga tggcaataaa caggcacatg gtcaggaaca atcgcctggt gctttttgcc atcccgcctc tggcgaagaa tgccgtggtg gcttatctca
    gcaaatggga tatagagaat atcaagaccg tcatatccgc aaaattcctg gggcacggga taagggaagc tgagcctttc cttgtgagct ttcgcgacat accgcttggt ataatgagcg
    gaacactcac caacgaggat tacaggaaca tgataaatct gcccaacata gaggctatac tcaattacct tgcaaggttc ggatacggta cgtacatgat gcagttcctg gaagattaca
    ggaagaccgg tgatatatca ccgatgctct attcgctgga tcgctattac tacatgaatc tgctgtcggc cctgaagtat tacaggggcg acgaggcgcc tgttctcaat tatgtgcgat
    cggacataga tcgccagaac atagtgacta tgctgaaggg aaaggtgctt aagataccgt ttgaaaggat gtcctcaggc ataatcgatg ggggtaacat aggcgttaac cggttgcggg
    agatctactc atcccaggac gccgtttccg ttgcggatgc tctgaagcag tactacgatc tcgaagagcc aaagaagaaa tacatggaaa caggcgatct ctaccatttt gatatagcga
    tccgcaacat aattgccaga aggttccttg acaccatgtc catgctgcct ctgtcgctgg acagcctgtt ctacttcata ctgaggagcg agattgaaag gcacaaccta aggacaatat
    atttgtccaa ggtgaacggc atgccaaggg agatcacgga aagcctgctg ataacggaga tgatgtgatt ggagagctgc ataactgtca taggcgaaag ggacgttgtc ctcggattca
    ggctcctcgg tattcagcac accataatag ctgagggcaa ggatcttctg aagaagtttc tggaggtctt tcagaaccct cagtgcaata taataatcgt ttccgaaaat gtgaagaaca
    tgatggataa aaggacgctg agaagcgtgg agatctcgtc aaagccgctg gtagtcttca taccccttcc aggcgtaagg gaggaggaga ccatagagga gatggcgaag aggatcctcg
    gtattgatat tggaaatgtt tgaggtgaat caatgggaaa gataatcaga atttcaggtc cagtagtcgt ggctgaagat gttgaagacg ccaagatgta cgatgttgtc aaggtcggag
    agatgggcct catcggtgag ataataaaga ttgaggggaa cagatcgacc atacaggtct atgaggatac tgcaggcata aggcctgacg aaaaggttga gaacaccagg aggccgctgt
    cggtggagct cggcccaggc atactcaaat cgatatacga tggaatacag aggccactgg atgtgatcaa gatcacttct ggagatttca tagctcgcgg tctgaaccca cccgcacttg
    acaggcagaa gaagtgggag tttgttcccg ctgtaaaaaa aggagagacg gtctttcctg gccagatact cggtaccgtg caggaaacct cgctgataac ccacaggata atggttcccg
    agggtatttc aggaaaggtg acgatgatcg ccgatgggga gcacagggtt gaggatgtga tagcgacggt atcaggaaat ggcaagagct acgatattca gatgatgaca acgtggcccg
    tcaggaaggc gaggagggtg cagaggaaac tgcctccaga gatcccgctg gtaacgggac agagggtaat agatgcgctt ttccccgtgg cgaagggcgg aactgccgcc gtacccgggc
    cattcggaag tggaaaatgt gtgtctggcg atacaccggt acttctggat gccggcgaga ggaggatagg cgacctgttc atggaggcca ttcaggacca aaagaacgcg gtcgaaatag
    gccagaacga agagatagtc cggctccatg atccgctgcg catatattcc atggtcggtt ctgaaatagt cgaaagcgtc tctcacgcca tatatcacgg aaagagcaat gccattgtaa
    ccgttaggac ggagaatgga agagaggtca gggtgacacc tgtccacaaa ctctttgtta aaattggaaa ctctgtaatc gagaggccag cctcagaggt gaatgagggc gatgaaatag
    catgcgcaag cgtaagtgag aacggtgatt cccaaaccgt caccacaacg ctggtattga cattcgatag agtggtatca aaggaaatgc atagcggcgt attcgatgtc tacgatctga
    tggttccgga ttatggatac aacttcatag gcggaaatgg cctcatagtc cttcacaaca ccgtgataca acaccagctg gcaaaatgga gcgatgcaaa catagttgtt tacataggct
    gtggcgagcg cggaaatgag atgactgaaa tactcaccac cttcccggag ctgaaagatc ctaacacggg ccagccgctg atggacagga ctgtccttat agccaacact tctaatatgc
    ccgtggcagc aagagaggcg agcatataca caggtataac gatagcggag tactacaggg acatgggata cgacgttgcc ctgatggcag acagcacatc acgctgggcg gaggcactca
    gggagatctc aggcaggctg gaggagatgc cgggagaaga gggatatcct gcctatctgg gtagaagggt ttcagaattc tacgagagat ccggaagggc gaggctcgta tcgccggatg
    agaggtacgg atcaataacg gttatcggtg ctgtatcacc gccgggagga gacatatccg agccggtatc gcagaacacc ctgcgtgtaa caagggtatt ctgggctctg gatgccgccc
    tggccaacag gaggcatttt ccatcgataa actggctcaa cagctattcg ctttacaccg aggatctgag atcctggtac gataagaacg tatcatccga atggtctgct ctaagggaaa
    gagcgatgga aatactgcag cgggagagcg agctccagga ggtcgcacag ctcgttggat acgatgccat gcctgaaaaa gagaaatcaa tactggacgt tgccaggata ataagggagg
    acttcctgca gcagagcgcg ttcgacgaga tcgatgctta ctgctccctg aaaaagcagt acctcatgct gaaggcaata atggagatcg atacctatca gaacaaggcg ctcgactccg
    gcgcaacaat ggataacctg gcttctcttg cagttaggga gaaactctcg aggatgaaga tagtgccaga ggcgcaggta gaatcctatt acaatgatct tgttgaggag atccacaagg
    agtatggaaa tttcattggt gagaaaaatg ccgaagctag cctataaatc tgtttcacaa ataagtggcc cactgctctt cgttgagaac gtgccaaatg ccgcttacaa cgagatggtt
    gacatcgaac ttgagaacgg ggaaaccagg caggggcagg ttctggacac caggaagggc ctcgccatag tgcagatatt cggtgcaaca accggtatag gcactcaggg aaccactgtt
    aaattcaggg gagagaccgc caggcttcct atatctgagg acatgctggg cagggtattc aatggcattg gcgagcccat agacggtggc cctgagataa tagcaaagga gagaatggag
    atcaccagca acgccataaa cccttattca agggaggaac cttccgaatt catagaaacc ggaatttcgg caatagacgg aatgaatacg cttgttaggg gccagaagct gcccatattc
    tccggttccg ggctgccgca caaccagctt gccgctcaga tagcaaggca ggcaaaggtt ctggattcct cagagaattt cgcggttgtc ttcggtgcaa tgggcataac gagcgaggag
    gctaattatt tcacgaacca gttcagggaa actggtgcgc tatcaagatc ggtcatgttc cttaacctct cttcggatcc gtccatggag aggatcatcc tgcccaggat agcactcaca
    actgcagagt acctggcatt ccagaagggc atgcacatac tcgtaatatt gacggatatg acgaactact gtgaggccct tcgtgagata tctgccgcca gggaggaggt tccgggaaga
    aggggctacc caggatacat gtacacggat ctgagcacca tatacgagag ggcaggaaag ctgaagggaa acaatggatc cataacgcag atccccatac tcaccatgcc aggcgacgat
    ataacgcatc ccgtgccgga tctcacaggc tacataaccg agggccagat agtgatttca agagatctca acagaaagga catgtatcca ggcatagacg tgctcctctc cctctcaagg
    ctgatgaacc agggcatagg gaagggaagg acaagggagg atcatagggg cctggcggat cagctttacg ctgcatacgc ttcaggaaag gatctgagat cactgactgc aatcgttggt
    gaggaggccc tcagccagaa cgacagaaag tatcttcact ttgcagacac ctttgagtca aggtacatca agcaggggtt cttcgaggat cgctcaatag aggatacgct tggcctggga
    tgggatcttc ttgctgatct tcctgttcag gacatgaaga gggttaagcc tgatcacatc cagaagtatg gcagatggaa gaaggagtga acatggacat acgaccgaca aggatagaac
    tgatacgcac caggaggaga ataaggcttg caaagaaggg ccttgacctt ctgaagatga agaggtccgc ccttatatac gaattcctgc agataagcag aaccataagg ggcatgaagg
    agaacctcag aaaggaggtt gttgaagccc tgaacatcat aaaggtggcc agcgtcctgg aggggtccct tgcactggag cgcatagcga acatgtcaag cgattccagg ataaatgtca
    actccagaaa tgtcatgggc gtaaatatac ccacccttga ggtctcatac aacctgtcca tattatcgga cgtttaccgt acagtgtctg tcccggttgc catagatgat tccatacgca
    ggtttcagaa gctgttctac gatctcattc tgatagtgga aaaggagaac tctctgcgca acctgctgat ggagatagac agaacaaaga gaaggagcaa cgctatagag aatatactga
    tacccaggct tgagtatcag gcgaagatga taaagatgac cctggatgag agggagaggg ataccttcac cacgcttaaa accataaaga agaagataga ggctgagaat gattagaaac
    ccgtggtttg atatcggaac gaggaagtac gtgaagaatg tcgatataac cagggcgaag gatccgaagc tgatcaggaa gttcataatc ataaggaacc tgatcatgct gttcaatgtg
    gctgttgcag cgctaatact ggtgctggtt tggagctgat gcgtatggat gatgttgaga ctatcaggat tatcaaggaa aaggaaacaa gtgcagatga ggagatcaat cagttcaagg
    aggaacagga aaagatcata aaagaagcca gggagaagga agcgcttgat ttggaaaaga ccgaggatga actgaaatcc agatatcagg agtatctgga atcgagaaga aaggaggctg
    aggagaaagc atcggaaata atagataatg caaagcaaag ggcatctgcg ataaatcttg acataaagga gaaggatctg cagaagatgg tcctggaaat aataatgaag tatctagagg
    agtaatatgc tgagaccagt taagatggag aagatcagga tcatagcccc gtattcctac agggatcctg tcatatccgc ccttcatgac ctgggcgtca tgcagataga ggagatgagg
    gaagatgttg acaggcttct gtctcctgcc aaagcttcgg aacaggcaaa aaccgttatg gattacctgc agaagttccg aggatacgag aacatacttc cgaagaggcc agtgagaaca
    agagccaaat tcacctctct tgcagatatc ctcaacgagg catccaagat aaacatagac gatgatatac gcatagctgt gaacagggaa aacgacattg cagcagccat gaaggatatc
    gatagcaggc tttctgcgct tgaatacatg aagggatatg atttcgacgt atccatattc aacgggaagc acttcgagtc ttacataata cctgataaga atgtggatat caaggcgttc
    tccagcctga acgcagaaat tgtgccgctg aagaatgcat tcataataac cgtggctgag gacagaacac aggatctcag caggatcgcc aattcgattg gagcaaggct cattcacatt
    ccagatctca agggaaagcc tgatgatgta atagctatgc tcaatgacga aagggcaaag ctggatcagg caatgcagga gataagaaag caccttggcg atctttccga taaatattat
    gagaagatag cccagatcag ggaagccctg gagatcgagg caaagaagat agatgtggag gataaattaa aaggaactga gtacacattt gccgtggagg ggtggatacc atcagattcg
    ttcggcagag tgagcgatgc catcaacaga gttactggga acagctgcat aataagcaca gtgaagacca acgagatgcc gccaaccctg ctcagaaatc ccaggaggat ctcgcttttc
    gaattcttca tcaaattcta ttcgcttcca gagggtacgg agtatgatcc tacgctcata ttcgcactgg tctttcccgt attcttcggg ttgatggtcg gtgattgggg ctacgggctg
    gccatcctgc tgatctctct tttcataata caccgcgttg atcatccacc ggcaaagagt cacataccca gagtcataag cagatttgtt ctgatgataa tgtcgccgca atccctgaag
    acgctggcaa aggccctgat tccgtcatcc atagtagcaa tcatagctgg cttacttttc aatgaattct tcggattcgc tatcctgcca ttcaccgttt tccatgtgta cgcggttctt
    ccgaagctga tgctgatcgc cggatacata ggccttggca tggtggtatt cggcttcata ctcggattca ttgaggattt gtggatgaag gatgtcaagg gagccatgga tagactcgga
    tggcttttct ttgcggttgg aatcgcaacc atagggctta acctgataca ccacgatctg acgttcagcg taagtaccgg gatatcgaat ctgattgcag ttgcactgat agttatcggc
    atacctctga tagccatcaa ggagaagtcg cagggattca tagagatgcc ttccataata agccacatac tctcatatct caggcttgtg ggaatactga tagccagcgt cgtcatcgct
    gagataatag acctggtatt catgaagagc atagtttcgc attccatcgg gcttgccatc gccggtgttg tcatactgat attcgggcag atgttcaact taatacttgc agtattcgag
    ccaggaatac agggagcaag gctgatatac gtggagttct tctcaaagtt ctaccacgga aacggaagaa tgttcaggcc attcaggagc cagagaaaat acaccgagga tggcctcgat
    tttgataagg ctagataaac gtttaagcat ggaaacgata caaagatgat gataccttct ttttcccctt gtgaaggcga tgtgaaatga acattggagt tcttggcttt cagggagatg
    tgcaggaaca catggatatg ctgaaaaaat tatccagaaa gaacagagac cttacattaa cccacgtaaa aagggttatc gatctggaac acgtagatgc gctcataata cctggaggag
    aaagtacgac tatatacaag cttactctgg aatacggcct ttacgacgcc atagtgaaga gatctgccga aggtatgccg attatggcca catgcgccgg cctgatactc gtatcgaaga
    atacaaatga tgaaagggtc agaggtatgg gcctactgga tgtgaccata agaaggaatg cctatggaag acaggtcatg tccttcgaaa cggacataga aataaatgga atcggcatgt
    ttccggccgt attcataagg gctccggtaa tagaggattc tggaaaaacc gaggttcttg gtacgctgga tggaaagccc gttatcgtca aacaggggaa tgtgataggg atgacatttc
    atccagagct caccggcgat acaaggctgc atgaatactt cataaacatg gtgaggggga gaggggggta catttccact gcagatgtga aaaggtgatg gtatgaggac tgtactatat
    gatgagcatg caaaactgaa cgcaaagttc accgaattca atggatggga tatgcccctt tactacagga gcataatcga agagcatatg gccgtcagga agcatgttgg catatttgat
    gtatcccata tgggcgacgt gacggtaagc ggaaaggatg cttcggcctt ccttgaccac atgtttccaa cgaaggtaag caatctgaag aatggagaat gcgtttacac agccttcctg
    aacgacagcg ggctgatgat agatgacacg atagtttata ggatgggcga agattcgtac ttcttcgttc caaatgcggg gacaacggaa aagatataca gatgggtgtc tgatcactcc
    gcaggataca gcgtaaagat agagaacgta tctaacagga tatcaagcat agcccttcag ggccctgaat ctgaagaagt gctgaatgaa cttggatttt catatcctgg atacttcaag
    tttcaatacg tttcaggaaa gtacatgaat gcaataacag gtaaagatca aattattata tcaggaacag gttacaccgg tgagaaagga gtagaattca taataccgaa cgaacacgct
    gttgaactct ggaagaaact gctggaagcg ataaacaaaa gaaatgggct tccggctggc ctcggtgcca gagataccct tcgaatggaa aagggtatgc tgctctcagg ccatgatttc
    aatgaagaca gagatccata tgaagcttca gtatcattca tcgtcaacaa cgatgaagat tttgtaggaa agaaaaatct tgagatcaga agaagatctg atcatgagat attcagggga
    ttcgtgcttt ctgacgggat tccaagaaat ggcaatccaa taaaagcagg cgggaagagg gttggaaccg tcaccagcgg aacaatttct ccagtactca ataagggcat agctcttgga
    tacatagata aagcgtattc aaaagaaaat acggaagtta tgatagagat aagatccgta gatcacaagg ctgtcgttac aaagcctagg attgtgaaat gatgttgcag tggaaataca
    cccctccccc tctatattaa ttttttattt aatataatta attcatataa tctgcttcat tatctcgcct atggccacct taacggcctg atcgctgtta tatctgttct tccaccccgt
    tgccttgagc ttctttatat ccattcttgc gtatttcaca tcgcctggcc agcctctgcc catgtaccct cctttgcgca ctatccttgt atccctgagc cccatggcct ctatgacgta
    ctttgctatc gtatccacgt ttgttacatc gtcatttcca aggttgaaca cttccgttcc gctgatcctg tcatagatgt agaacatgct tccaacgcag tccgtgacgt gcatgtacga
    tttcgcctgc gttccatccc cgagtacttc cagttctttg ctgttctttt ttagcttgtt tataaaatcg aatataacgc catgcgtgga attctttccc actatgtttg cgaacctgaa
    gatcttggcg ttgattccat aatagtgcga gtatgatgat atgaaagcct cagccgagag tttggaggcg ccgtatgagg atattgggag gaggggcccg tagtcttcag gcgtgggcat
    aacctttgcc tctccgtata tcgttgaact ggaggcgaac agtatgtctt tgacatcttt cttcctcatc atctcaagca cgttgacagt ccctatgacg tttgatctca gatctatcgt
    cggatccacg gatccgttcc tcacatcgga atcagcagcg agatggacga caagatcgta atctccaggc gtaacggatt ccgttatgtc ttcctttatg aacctgaagt tcttttttcc
    catgaacggt tttatgtatc tgtcatccat tatgctcagg ttatctataa ccgtgacatc gttgtcctca agaagcattt ccaccatatt cgatcctatg aaacctgcgc ccccagttat
    catcacatgt tttccattca tatactggta atatttgatg taataatagt tgttctcatg gccgatctgg gtttttatcc ttcagtcatt gttaaatgaa gcatgaaaaa tatctatgta
    aaaagtatta aaaaactacg taatttcagg tagatgcgca aataattttt acggccacgc catgcttacc ttcatgtcac agaagatcca ttgcataata tgcggatcta taatttattc
    tggattatac tgttccgatt gcctttcaga gatccagaga tcccgcacca ttgatgacga atcattcgaa gattggctcg cccgaacaag ggaatcttca aatgtcaaac cagacgataa
    ggaatgcatg
    
    


Gene Identification Exercise 2 (Eukaryotic genomic DNA)

The sequence below is from the Arabidopsis thaliana genome.

  1. Use GENSCAN at http://genes.mit.edu/GENSCAN.html or at http://genome.dkfz-heidelberg.de/cgi-bin/GENSCAN/genscan.cgi to predict the exons and introns encoded on this piece of genomic DNA. Save the predicted peptide, and inspect the graphic output.

  2. Use GENEMACHINE: enter your email address and the sequence below, and have the results sent to your email account. (The turnaround for GENEMACHINE is a couple of hours. If you are in a hurry, you can find the response here; right-click and save it to your computer.)

  3. After saving the results to your computer, open SEQUIN. This is a program provided by the NCBI for submitting data to the databank in ASN.1 format. The GENEMACHINE output is written in the same format, so you should be able to open the results in SEQUIN as an existing record.

  4. In SEQUIN, explore the different format options for viewing the GENEMACHINE results.

  5. From the results, does it appear that the GENSCAN program worked correctly?



  6. Go to the Biologist's workbench at http://workbench.sdsc.edu/, and set up an account for yourself. This is not the most user-friendly place, but (a) one can get used to it, (b) it is free, and (c) you can access your sequences and analyses wherever you have access to a browser. Import the Arabidopsis genomic sequence given below, and translate all three forward reading frames. The nicest output for browsing is generated by the restriction map program (named TACG in NUCLEIC ACID TOOLS); select all three forward reading frames. You can also calculate an optimal pairwise alignment between gi:2266990 (protein) and the predicted sequence, or between the genomic DNA and the cDNA (gi:2266989).
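For readers who want to check the three-frame translation outside the workbench, the following is a minimal Python sketch. It assumes the standard genetic code, and the function names are hypothetical; it is not the workbench's or TACG's implementation. Codons containing anything other than A, C, G, or T are shown as "X", and stop codons as "*".

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate from the first base; an incomplete trailing codon is ignored."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def three_frame_translation(seq):
    """Translate the three forward reading frames of a DNA sequence."""
    return {f"+{frame + 1}": translate(seq[frame:]) for frame in range(3)}


print(three_frame_translation("ATGGCCTAA"))
# {'+1': 'MA*', '+2': 'WP', '+3': 'GL'}
```

In genomic DNA with introns, successive exons of the same gene can fall into different forward frames, so no single frame is expected to give the complete protein.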

>51028.t00050, Chromosome 1, pre-processing

AGTTTTTGAATCTCTGATTGCTGAGAAAATGCCGGCGTTTTACGGAGGAAAGCTTACGAC

CTTCGAGGACGATGAGAAGGAGAGCGAGTATGGTTACGTTCGTAAGGTATTATCCTGTTT

CGTTCGATCTGGTTTCAATTTGTTTTTTTTTCTGTTTTGCGATGTTAGTTTTTGGTGATG

GATAGATGAAATAGTTGATCTGCTTACCAGGTAAGATTGGTGGGATAGCTAGATTTGATC

TGATAGTTATCAGTGATTGAATCGGTTGATTCCGCGTTGGTTCAGTAACCTCGTCTTTGA

ATTTCTGATCTGATCTGATAGTTCTCAGTGATTTGACATTTTCTTTTATGGGATGCAGGT

TTCAGGTCCTGTTGTTGTTGCCGATGGTATGGCCGGTGCTGCTATGTATGAGTTAGTGCG

TGTTGGTCATGATAATTTGATTGGTGAAATCATCCGTCTTGAAGGAGATTCTGCCACCAT

CCAAGGTTTGTTTCTTCTATTGTGCTTCCTAGTGTAATTTACTTTACGGCATCTTATGTG

ACCTCTGTCGAAGTAAGATATCTTAACTGATATTTGGCAACTTCCTTTTGATCAGTTTAC

GAGGAAACAGCTGGATTGACAGTTAATGATCCCGTTCTTCGAACACACAAGGTTCGCGAG

TTATTTATCTTGGTTTTTTCTAGTGTTGTTCATCTGCAGCTAACATATAATTTGTCCTGA

ATTTACTACAGCCACTTTCTGTGGAGCTCGGGCCAGGAATATTGGGAAATATCTTTGATG

GAATTCAGGTTCAGTTGGATTTATAATCTTGCTAGACATGATTTTTTTTACTTTTATGAT

TCGTTTTATGTGGCTTCTTACGATTCTTTGGTTTCATTTCTTTAAATGTCACAGAGGCCT

TTGAAGACTATTGCAAGAATATCCGGTGATGTGTACATTCCTCGGGGTGTGTCTGTTCCA

GCTCTTGACAAAGATTGTCTTTGGGAGTTCCAGCCCAATAAATTTGGTAATGTGGTTTAC

TCCATATGCCTGTCTATGGAAGTGTTCATTTGGTTTTAATCTTGATGGTCAATTGAATTC

GTTTTGTTTGCAGTCGAGGGAGACACAATAACTGGTGGTGACTTGTATGCTGTAAGTTTA

TTGGTCTCCTCTTTAATCTGCTTTTGACAAGGGAATCTATTTACACAGTTACCGTGGTGT

TTCCCTTGTTTACACTGGGAATAGTTTTTTCTGAAAGTCAAATTAAACTTTGGAATGCAG

ACTGTCTTTGAGAACACTTTGATGAATCACCTCGTTGCCCTTCCTCCGGATGCCATGGGG

AAGATCACTTACATTGCTCCAGCTGGTCAATATTCGCTTAAGGTTTGACTTTAAGTTTCC

CTCAAACAGTTATGAATAAATACGTTTCAAACTTTTTCTTCCTTGATTTCTTTGAATTCA

ACGTTTGAGTTAATATATGGCTAACTTGATCAATTGGTAATCACTTCCTGTTGTAGATCA

TGTTTGGCTTGTTGCTAATAATTGTTTGTCGGTGATTTTCATTTCTCAGGATACCGTGAT

AGAGCTTGAATTCCAGGGGATCAAGAAATCTTACACCATGCTTCAGGTTTGCATGTATCT

TTAATCTTCCTACTTGCAAACGTAAATTTTAAGCTATTTGGTTCACTCTGTTAAATTGGT

TTGGTTGATATATGTCAGAGCTGGCCTGTACGTACGCCTAGGCCAGTTGCATCAAAGCTT

GCTGCCGATACTCCTCTACTTACGGGGCAGGTGATTACTCGATTAATTCTTCTTACAGTG

GTGATAGTCATTTGAATACATGTGTTGCTGATTGCTTTCTTTTCCTGTTGTCAGCGTGTT

CTTGATGCCCTTTTCCCTTCTGTTCTTGGTGGAACCTGTGCCATTCCTGGTGCTTTTGGC

TGTGGGAAAACTGTTATCAGTCAGGCACTTTCCAAGGTACCTTGTGACACTCTCTGGTTT

TGTTCCATTTAATTACTGGATAGATTGAATTTCCAAAGCTAACTTTTTCTTATTTACATA

GTACTCCAACTCTGATGCTGTTGTGTATGTTGGTTGTGGAGAGAGAGGAAATGAAATGGC

TGAGGTATATCTCTTCTCATTCTAAATTTGCATATTGTTCATACAAATCGGACATTTGAT

CTGATTGTTTCTCATAAATTAGGTTCTTATGGACTTCCCACAATTGACAATGACGTTGCC

TGATGGCCGTGAGGAATCTGTCATGAAACGTACCACACTTGTTGCTAACACCTCTAACAT

GCCTGTGGCTGCTCGTGAAGCCTCAATTTACACAGGTAATGTTCAGGCACACAGATTTAA

TAGTTATTGATGAATCCCATTGCCTATGCTCATTTTTTTTTTTTTTTTTTAATGTGAATT

CCAGGAATCACAATCGCTGAATATTTTAGAGATATGGGCTACAATGTTAGTATGATGGCA

GACTCAACTTCCCGTTGGGCAGAAGCATTAAGAGAAATTTCAGGACGGCTGGTAATCTTA

TGCGTTTCACTTTTGCTATATGGATGTTCGTGTTGTCCTCATCTCACTTTTCTTTTTCTC

AGTTTATTGACACCTATTTTGCTTTGTTTTATAGGCTGAAATGCCTGCTGACAGTGGATA

TCCAGCCTATCTAGCAGCACGTTTAGCATCTTTCTATGAACGTGCTGGTAAAGTAAAATG

TCTTGGTGGACCAGAACGTAACGGAAGTGTTACAATTGTTGGTGCAGTTTCGCCTCCTGG

AGGAGACTTTTCAGATCCTGTGACTTCAGCAACCCTTAGTATTGTGCAGGTGATTATTTG

GTTCATGTCTGCTTCCCTATCTTCCATTGTAGATTACATAGTCGTATATGTTGGTTGAGA

TGAACCAGATGGTGTTTAGTTTTAGATCTGCCGCAGACTCGTATATTTAAGCATTTTTTT

TCTCCACTTTGAAATGCTTACTCTTCCATTCTGGTTGTTTCTCTTTTCTTCTGCAGGTCT

TCTGGGGTTTGGACAAAAAGCTTGCCCAGAGAAAACATTTTCCCTCTGTTAATTGGTTGA

TTTCTTACTCAAAGTATTCAACGGTATGCTTAAATATTCTCGGTTCAAACTTGTCTTGGT

TTACTATCTAGAAATCTTGTATATAAAACGCTGCTTTTTGTTTTAGGCACTGGAATCTTT

CTATGAGAAGTTCGATCCAGATTTCATCAACATCAGGACAAAGGCCAGAGAGGTGTTGCA

GAGGGAAGACGATCTTAATGAAATTGTCCAGGTATGTATCACTTATCCTTGTATAAGTAT

CTATTGTGGTGACCAATGAACTCTTGTCTCAGCAACCCTAATACATTTTGAAGGGGTTGA

ACGATAATCTTGGCATGTAAACTTGACTTGAGTTATAGAAGGAAAACAGTGCTAGCACGT

TATTCTTTTCGAAAGGAACTTATTTGACCCACACATTGCTTTTTGTGTGCAGCTTGTAGG

AAAAGATGCGCTAGCAGAAGGGGACAAAATCACATTGGAAACAGCTAAGCTATTGAGGGA

AGATTACCTTGCTCAAAACGCGTTTACACCGTAAGATTTGTTGGCTCCCTTCGTTTTGGT

TTAGTACTCTCTCTTTCTCTCTCAACGGGTTATTCACTCTTGAACCTTTTGGATGAATTT

TTTGACAGATATGACAAATTCTGTCCTTTCTACAAGTCCGTGTGGATGATGCGTAACATT

ATCCATTTCTACAACCTAGCCAACCAGGTAAATAAGATGAGATTTATACATACTATGCTA

AGTGGGGATTAAGGTCAATTGGTTTGTCTAGGTAAAAACCCATTAATTGTTTTGGATACA

CAGGCGGTTGAGAGAGCAGCTGGAATGGACGGTCAAAAGATTACCTACACTCTTATCAAG

CATCGCTTAGGAGATCTTTTCTACCGTTTAGTGTAAGCAAACGACTTGCTTCTCCTCGAT

TTCTCTATGACTCTGTTACATAGCGCTCTAATAAAATGGTCTGAAACGGAATTATGGGAA

CTACAGGTCTCAGAAGTTCGAAGACCCAGCAGAAGGGGAGGATACACTGGTGGAAAAATT

CAAGAAATTGTACGACGATCTCAATGCTGGATTCCGTGCTTTGGAAGATGAAACTCGGTA

AGCTGTCGAGTCTCCACCGCAAGTAAAAAAAATCCACAGAATTGGGTTGTTTTTGGAGAA

AGAGGGTTTCATTCATGGTCTCTTTCTTGTGTTTTTGAACCAACAACTATCATAGTGGTC

GGTATTTTATTTATCGGTTTGGTCGATCGATTGAGTTTTAGCTCTGTGAGCGTCATGATT

CTCCGGCTGTGCTGTGCTGTGTAATATGTTTGATTCGTTGTTTTCATGTTTTTATTTCGG

TGGTAATAAGGTACAGCCAATGTGAGTCATATATTTGATTTGATGTACCCTCTCAATTCA

ATAAGTTAATTTTATGTCCAAAAACATATTGGGGATACCGTTATTTTTCTCATAATAAAT

ACCATCATTTT
                                                       
