CLASS 8. Dotlet Exercise

INSTRUCTIONS:

For each exercise, provide the search query used and keep your answers brief. Email me the answers by Sunday 11:59PM AST at the latest.

Use "CLASS 8 EXERCISE" as the message subject, and type your answers directly into the email body (i.e., no document attachments please). Make sure that the first line of your message is your NAME.

The Swiss Institute of Bioinformatics provides a Java applet that performs interactive dot plots. It is called Dotlet. The main use of dot plots is to detect domains, duplications, insertions, deletions, and, if you work at the DNA level, inversions (excellent illustrations of the use of dot plots are given on the examples page).

  1. Comparing the yeast ATPase catalytic subunit with the yeast HO endonuclease (sex change enzyme). Go to the applet and input the sequences: Sce_VMA.fa, SceHO.fa, vma1Neurospora.fa and Sce_intein.fa. (When you input sequences, make sure you paste the sequence only, without the sequence description line. Give each sequence a name that allows you to recognize which sequence is which [e.g. Yeast_vma1, YeastHO, Neurospora_vma1, Yeast_intein].)

    Select the Neurospora A-subunit (vma1Neurospora.fa) and the yeast subunit with intein (Sce_VMA.fa). Select a window size between 9 and 15 and click "compute". The program will compare every window of the chosen size in one sequence to all the possible windows in the other sequence. On the right you see a histogram that shows how often window pairs with the indicated score occurred. The sliding bars below and above the histogram let you select the colors with which matches are depicted.

    If you click on the dot plot panel, the alignment window at the bottom aligns the two sequences accordingly. You can fine-tune the alignment using the sliding bars above and below the alignment window (or using the arrow keys).

    Which sequence positions (from ... to ...) in the yeast sequence represent the intein?
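The window comparison described in exercise 1 can be illustrated with a minimal Python sketch. It is not Dotlet's implementation: Dotlet scores windows with a substitution matrix, whereas this sketch assumes a simple identity score (+1 per identical residue). The function names and the two toy sequences are made up for illustration. The second toy sequence carries an insertion ("WWW"), so the diagonal of perfect matches is interrupted and shifted, which is the same kind of pattern an intein produces.

```python
from collections import Counter

def window_score(a, b):
    """Score two equal-length windows: +1 per identical residue (assumed toy scoring)."""
    return sum(1 for x, y in zip(a, b) if x == y)

def dot_plot(seq1, seq2, window):
    """grid[i][j] = score of the window starting at i in seq1 vs. the window starting at j in seq2."""
    rows = len(seq1) - window + 1
    cols = len(seq2) - window + 1
    return [[window_score(seq1[i:i + window], seq2[j:j + window])
             for j in range(cols)]
            for i in range(rows)]

def score_histogram(grid):
    """Count how often each window-pair score occurs (the histogram next to the plot)."""
    return Counter(score for row in grid for score in row)

def render(grid, threshold):
    """Draw the plot as text: '*' where the score reaches the threshold, '.' elsewhere."""
    return ["".join("*" if s >= threshold else "." for s in row) for row in grid]

# Hypothetical toy sequences: seq2 is seq1 with "WWW" inserted after "ACDEF".
seq1 = "ACDEFGHIK"
seq2 = "ACDEFWWWGHIK"
grid = dot_plot(seq1, seq2, window=3)
histogram = score_histogram(grid)
picture = render(grid, threshold=3)

for line in picture:
    print(line)
```

Raising or lowering `threshold` plays the role of the sliding bars next to the histogram: a strict threshold keeps only the strongest diagonals, a lenient one adds background noise.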

  2. If you compare the HO endonuclease (SceHO.fa) to the intein (Sce_intein.fa), does the full-length intein sequence match something in the HO endonuclease? Is there a part of the HO endonuclease sequence that might correspond to an extein?

  3. Comparison of a nucleotide sequence containing introns with the protein sequence it encodes.
    Another application of dot plots is to analyze and visualize the intron/exon structure of genes. In Dotlet, if you use a nucleotide sequence for the first sequence and a protein sequence for the second, the program will compare the translation in all three frames to the protein sequence. Load the following two sequences into Dotlet:

    A) The genomic sequence from Arabidopsis thaliana containing the gene encoding the vacuolar ATPase (arab.fa). The given sequence is the reverse complement of a sequence that is part of chromosome 1.

    B) The protein sequence as translated from the cDNA sequence, as given in GI 3334404.

    How many exons are in the gene?
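The three-frame translation mentioned in exercise 3 can be sketched in a few lines of Python. This is an illustrative sketch, not Dotlet's code: it assumes the standard genetic code, marks stop codons with `*` and unrecognized codons with `X`, and uses a short made-up DNA string. The helper names are hypothetical. A `reverse_complement` helper is included because the Arabidopsis sequence is supplied as a reverse complement of the chromosome.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def reverse_complement(dna):
    """Complement each base, then reverse the strand."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna, frame):
    """Translate starting at offset `frame` (0, 1 or 2); '*' = stop, 'X' = unknown codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def three_frames(dna):
    """Translations of the given strand in all three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]

# Hypothetical toy sequence, not taken from arab.fa.
toy_dna = "ATGGCCAAATAA"
frames = three_frames(toy_dna)
print(frames)  # ['MAK*', 'WPN', 'GQI']
```

Because an intron can have any length, the exon that follows it may line up with a different one of these three translations than the exon before it; each exon is therefore checked against all three frames.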

  4. Are neighboring exon sequences always in the same reading frame? (Use the mouse pointer to place the blue cross-hairs on the diagonal and then use the arrow keys until one of the three frames matches the protein sequence.) Try this for a couple of exons.

  5. Repetitive proteins in Dotlet
    Using Dotlet, load GI 15668394 and GI 19887539 (again, omit the labels from the sequences, but give them names so you can recognize them :)).
    Compare the Methanocaldococcus protein against itself. Do you see any repetitive units? How many? Does the choice of scoring matrix make a difference?
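When a sequence is compared against itself, the main diagonal is always a perfect match, and internal repeats appear as additional diagonals parallel to it. The Python sketch below illustrates this on a made-up sequence by counting, for each offset from the main diagonal, how many windows are identical to the window that many residues further along. It assumes exact identity matching rather than a scoring matrix, and the function name and toy sequence are hypothetical.

```python
def offset_matches(seq, window):
    """Map each offset d > 0 to the number of start positions i where the
    window at i is identical to the window at i + d (exact matches only)."""
    counts = {}
    last_start = len(seq) - window
    for d in range(1, last_start + 1):
        n = sum(1 for i in range(last_start - d + 1)
                if seq[i:i + window] == seq[i + d:i + d + window])
        if n:
            counts[d] = n
    return counts

# Hypothetical toy protein: a 6-residue unit repeated three times.
toy_protein = "MKTAYL" * 3
repeat_offsets = offset_matches(toy_protein, window=4)
print(repeat_offsets)  # {6: 9, 12: 3}
```

In this toy case the off-diagonal lines sit at offsets 6 and 12, i.e. at multiples of the repeat length. With real proteins the repeats are rarely identical, which is why the scoring matrix and threshold can change what is visible.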

  6. Compare the Methanopyrus sequence against the one from Methanocaldococcus. How many equivalents to the single repeat unit in Methanocaldococcus do you find?

  7. How many repeats do you identify when you compare the Methanopyrus sequence against itself?

  8. Compare the two sequences using Pairwise Blast. Which BLAST program should you use? What is the effect of turning the low-complexity filter on or off?