!!!! Wednesday's class will meet in the Kresge library (TLS, 2nd floor) !!!!
A few comments on the exercises from class 3 are here.
Today's topic is the use of genome data. In Exercises 1 and 2 we will first use a query sequence (an intein; for more info on inteins go here), PSI-BLAST, and the so-called non-redundant database to calculate a position-specific scoring matrix (PSSM) (= Exercise 1). We will then use this matrix to search different genomes for sequences that may be inteins, and use blink and blast searches with the sequences that the PSSM retrieved to verify whether each target sequence indeed represents an intein. In a third exercise we will screen a bacterial genome for genes that are candidates for interdomain or interphylum horizontal gene transfer.
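To give a feel for what a position-specific scoring matrix is, here is a minimal Python sketch that builds a log-odds matrix from a small gap-free alignment and slides it along a target sequence. It is a toy under stated assumptions: the alignment and target are made up, the background frequencies are uniform, and the pseudocount is a flat 1.0. PSI-BLAST's actual procedure is more elaborate (it iterates over database hits and weights sequences), and none of that is reproduced here. The names `build_pssm` and `best_window` are hypothetical helpers for this illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(aligned_seqs, pseudocount=1.0):
    """Log-odds scores (in bits) per alignment column.

    Assumes equal-length, gap-free sequences and a uniform background;
    both are simplifications made for this sketch.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned_seqs) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_seqs)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm

def best_window(pssm, sequence):
    """Return (start, score) of the highest-scoring ungapped window."""
    width = len(pssm)
    scored = [
        (sum(col[aa] for col, aa in zip(pssm, sequence[i:i + width])), i)
        for i in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, score

# Made-up alignment of a short motif and a made-up target sequence.
toy_alignment = ["CGSHN", "CGTHN", "CASHN"]
pssm = build_pssm(toy_alignment)
start, score = best_window(pssm, "MKLCGSHNAAV")
print(start, round(score, 2))
```

Because each position has its own column of scores, a residue that is conserved at one position of the motif is rewarded only there, which is what distinguishes a PSSM search from a search with a single query sequence and a general substitution matrix.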
Short lecture and demonstrations:

The NCBI provides several different interfaces to browse through and analyze genomes. For example, in the Borrelia genome, if you click on “complete genome”, you get a graphical representation; further clicks move you down through several levels to the nucleotide sequence and the encoded amino acid sequence. If you click on an ORF, you retrieve its sequence followed by the output of a blast search of this sequence against the nr database. The graphical representation shows you which part of the ORF generated the match. If you click on the number that represents the score, a new window opens with the alignment (again with nice graphics included). If you click on the number, a window with the matching sequence in gb-format opens up. If the ORF is part of a cluster of putatively orthologous genes, you can get information on the cluster by clicking on the COG number.

From the Borrelia genome page, you can go to tables listing all ORFs, or to taxtable, which provides an interesting nearest-neighbor coloring of the genome. It is noteworthy that many of the pink dots are endonucleases. There are also many transporters among the odd-colored genes.

In an attempt to capture some phylogenetic information in blast comparisons, Olendzenski et al. pioneered an approach that uses multiple reference genomes to screen for putatively horizontally transferred genes (see Fig. 4). A similar approach, but using only two instead of three reference genomes, is implemented in the TAX PLOT program at the NCBI's genome page (see the examples given in question 3 below!). You pick one genome to analyze and two reference genomes. The program returns a plot in which every ORF in the selected genome is a point in a coordinate system, where the two coordinates are the ORF's highest alignment scores against the two reference genomes:
[Figure: TAX PLOT output. The selected genome was from Borrelia burgdorferi; the list of selected genes follows the plot.]
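As an aside on how such a plot is put together, the following Python sketch computes the two coordinates for each ORF from a table of blast hits and then lists the ORFs that lie far from the diagonal. Everything in it is an assumption made for illustration: the ORF and genome names (`orf1`, `refA`, `refB`) and the scores are invented, the function names are hypothetical, and the ratio cutoff in `off_diagonal` is an arbitrary choice, not something TAX PLOT itself applies.

```python
def taxplot_points(orf_ids, hits, ref_x, ref_y):
    """Map each ORF to (best score vs ref_x, best score vs ref_y).

    hits is an iterable of (orf_id, genome_name, alignment_score).
    ORFs without a hit in a reference genome get 0.0 on that axis.
    """
    axis = {ref_x: 0, ref_y: 1}
    points = {orf: [0.0, 0.0] for orf in orf_ids}
    for orf, genome, score in hits:
        if orf in points and genome in axis:
            i = axis[genome]
            points[orf][i] = max(points[orf][i], score)
    return {orf: tuple(xy) for orf, xy in points.items()}

def off_diagonal(points, min_ratio=2.0):
    """ORFs whose better score is at least min_ratio times the other.

    The cutoff is an illustrative assumption for this sketch.
    """
    flagged = []
    for orf, (x, y) in points.items():
        high, low = max(x, y), min(x, y)
        if high > 0 and high >= min_ratio * low:
            flagged.append(orf)
    return flagged

# Invented example data: four ORFs, two reference genomes.
orf_ids = ["orf1", "orf2", "orf3", "orf4"]
hits = [
    ("orf1", "refA", 250.0),
    ("orf1", "refB", 240.0),
    ("orf1", "refA", 90.0),   # weaker second hit, ignored by max()
    ("orf2", "refA", 40.0),
    ("orf2", "refB", 310.0),
    ("orf3", "refB", 55.0),
]
points = taxplot_points(orf_ids, hits, "refA", "refB")
candidates = off_diagonal(points)
print(points)
print(candidates)
```

ORFs that score about equally well against both reference genomes fall near the diagonal. ORFs far from it match one reference much better than the other, and these are the ones worth a closer look as possible transfer candidates; a skewed score alone is not evidence of transfer.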
Assignment #3:
[Links from this page open in a separate window]