!!! NEXT CLASS (MONDAY, FEB 4th) WILL MEET IN THE COMPUTER LAB !!!
BLAST, PSI-BLAST, and Homology
PRSS Example.

There are many other alignment programs. BLAST is widely used and is offered through the NCBI (go here for more info). It can also perform pairwise comparisons (go here for an example). To force the program to report an alignment, increase the E-value threshold.

Rules of thumb:

If you can demonstrate significant similarity using either randomization or an unweighted BLAST search, your sequences are homologous (i.e., related by common ancestry). Convergent evolution has not been shown to lead to sequence similarities detectable by these means (see above; this might not be true for scores in PSI-BLAST).

If the actual alignment score is more than three standard deviations (of the scores for the randomized sequences) better than the mean for the randomized sequences, the two sequences are homologous (i.e., related by common ancestry). PRSS and many other programs use more accurate distributions to describe the distribution of random hits. The expectation value for the alignment score of the actual sequences is based on these statistics.

Usually, E-values (from a BLAST search or from randomization) smaller than 10^-4 are convincing.
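The three-standard-deviation rule can be tried out with a small randomization test. The following is only a minimal sketch, not the PRSS implementation: it assumes a plain Smith-Waterman local alignment with a simple match/mismatch score and a linear gap penalty, whereas real programs use substitution matrices, affine gap penalties, and a fitted score distribution. The function names and the scoring parameters are hypothetical and chosen for illustration.

```python
import random
import statistics


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score (linear gap penalty)."""
    prev = [0] * (len(b) + 1)          # previous row of the scoring matrix
    best = 0
    for ch_a in a:
        curr = [0]                     # column 0 is always 0
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def randomization_z_value(seq1, seq2, n_shuffles=200, seed=0):
    """Standard deviations by which the real score beats shuffled scores."""
    rng = random.Random(seed)          # fixed seed keeps the run repeatable
    actual = local_alignment_score(seq1, seq2)
    residues = list(seq2)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)          # same composition, random order
        shuffled_scores.append(local_alignment_score(seq1, "".join(residues)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.stdev(shuffled_scores)
    if sd == 0:                        # degenerate case: all shuffles score alike
        return float("inf") if actual > mean else 0.0
    return (actual - mean) / sd
```

Shuffling keeps the length and amino acid composition of the second sequence, so the randomized scores estimate what two unrelated sequences of this composition would score by chance. A returned value above 3 corresponds to the rule of thumb stated above.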
E-values give the expected number of matches with an alignment score this good or better; P-values give the probability of finding a match of this quality or better.
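The two quantities can be converted into each other. The sketch below assumes that the number of chance matches follows a Poisson distribution, which is the usual model behind BLAST statistics but is not stated on this page; under that assumption P = 1 - exp(-E). The function name is hypothetical.

```python
import math


def p_value_from_e_value(e_value):
    """P = 1 - exp(-E): probability of at least one chance match.

    Assumes the number of chance matches is Poisson distributed with mean E.
    math.expm1 keeps the result accurate when E is very small.
    """
    return -math.expm1(-e_value)


# Print the conversion for a range of E-values.
for e in (1e-4, 0.01, 1.0, 10.0):
    print(e, p_value_from_e_value(e))
```

Running the loop shows that P stays very close to E while E is small and levels off just below 1 as E grows.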
P-values lie in [0,1]; E-values lie in [0,infinity). For small values, E and P are approximately equal.

Z-values give the distance between the actual alignment score and the mean of the scores for the randomized sequences, expressed as multiples of the standard deviation calculated for the randomized scores. For example, a z-value of 3 means that the actual alignment score is 3 standard deviations better than the average for the randomized sequences. Z-values > 3 are usually considered suggestive of homology; z-values > 5 are considered a sufficient demonstration (but see the caveats below). A somewhat readable description of E, P, HSP and other values is here.

BUT:

Examples:

Jim Knox (MCB-UConn) has studied many proteins involved in bacterial cell wall biosynthesis and in antibiotic binding, synthesis, or destruction. Many of these proteins have identical 3-D structures and can therefore be assumed to be homologous; however, the above tests fail to detect these homologies. (For example, enzymes with GRASP nucleotide binding sites are depicted here.)

DNA replication involves many different enzymes.
Some of these proteins do the same thing in bacteria, archaea, and eukaryotes and have similar 3-D structures (e.g., the sliding clamp: E. coli dnaN and eukaryotic PCNA; see Edgell and Doolittle, Cell 89, 995-998), but again, the above tests fail to detect homology.

BLAST and PSI-BLAST

Run a BLAST trial with this sequence.

A tutorial for standard BLAST searches is here.

An easy way to force the program to report less significant matches is to increase the expect value on the advanced BLAST page.

The NCBI page describes PSI-BLAST as follows: The results of a normal BLAST search are aligned, and a pattern of conserved residues is extracted from the alignment. This pattern is used for the next iteration. An important parameter to adjust is the E-value threshold that determines which matches are included in the alignment and pattern extraction.
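To make the role of the inclusion threshold concrete, here is a much simplified sketch of the pattern extraction step. It is not the PSI-BLAST algorithm: the real program weights sequences, adds pseudocounts, and converts frequencies into log-odds scores. The function name, the input format (pre-aligned sequences of equal length, each paired with its E-value), and the example hits are all hypothetical.

```python
from collections import Counter


def build_profile(hits, inclusion_threshold):
    """Per-column residue frequencies from hits that pass the threshold.

    hits: list of (aligned_sequence, e_value) pairs; gaps are written as '-'.
    Returns one dict per alignment column mapping residue -> frequency.
    """
    included = [seq for seq, e_value in hits if e_value <= inclusion_threshold]
    if not included:
        return []
    length = len(included[0])
    if any(len(seq) != length for seq in included):
        raise ValueError("aligned sequences must all have the same length")
    profile = []
    for col in range(length):
        # Count residues in this column, ignoring gap characters.
        counts = Counter(seq[col] for seq in included if seq[col] != "-")
        total = sum(counts.values())
        profile.append({res: n / total for res, n in counts.items()} if total else {})
    return profile


# Made-up example: two strong hits and one insignificant one.
example_hits = [("ACDE", 1e-10), ("ACDF", 1e-6), ("GGGG", 5.0)]
strict = build_profile(example_hits, inclusion_threshold=1e-3)
loose = build_profile(example_hits, inclusion_threshold=10.0)
print(strict[0], loose[0])
```

With the strict threshold, only the two strong hits shape the pattern; with the loose threshold, the insignificant hit is included as well and dilutes the conserved first column. This is why a threshold that is too permissive can pull unrelated sequences into later iterations.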
!!! NEXT CLASS (MONDAY, FEB 4th) WILL MEET IN THE COMPUTER LAB !!!

Assignment #2: [Links from this page open in a separate window]
START WORKING ON YOUR STUDENT PROJECT. If you still have time left after that, you may continue with the questions below.