Trees with CLUSTALX
Besides aligning sequences, Clustal also includes programs to calculate distance trees. The trees generated by clustalw certainly have their limitations; however, if one is aware of these limitations, the program is extremely useful for initial exploration. Trees are calculated from a corrected or uncorrected distance matrix using the neighbor joining method. This method does not use an optimization procedure but a much faster algorithmic approach. Several parameters that you can choose in clustalw influence tree building.
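To make the difference between an uncorrected and a corrected distance concrete, here is a minimal Python sketch. The function names are made up for this illustration, and the Jukes-Cantor formula is used only because it is the simplest correction for nucleotide data; it is not claimed to be the correction that clustalw applies.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected distance: fraction of compared positions that differ.

    Positions where either sequence has a gap are skipped (an assumption
    of this sketch, not necessarily the program's behavior).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differences / compared


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for nucleotides (illustrative choice).

    The correction estimates multiple substitutions at the same site and is
    undefined once the observed difference reaches 0.75.
    """
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    p = p_distance("ACGTACGT", "ACGTACGA")
    print("uncorrected:", p)
    print("corrected:  ", jukes_cantor(p))
```

The corrected distance is always at least as large as the uncorrected one, and the gap between the two grows as the sequences become more divergent.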
Clustalw also provides options for bootstrapping:
Bootstrapping - how to assess reliability of partitions given in a tree.
Baron Karl Friedrich Hieronymus von Münchhausen
Bootstrapping is one of the most popular ways to assess the reliability
of branches. The term bootstrapping
goes back to Baron Münchhausen, who is said to have pulled himself out of
a swamp by his shoelaces. Briefly, positions of the aligned sequences are
randomly sampled from the multiple sequence alignment with replacement.
The sampled positions are assembled into new data sets, the
so-called bootstrapped samples, each with the same number of positions as
the original alignment. Because sampling is with replacement, each
position has about a 63% chance (1 - 1/e for long alignments) of making it
into a particular bootstrapped sample. If a grouping has a lot of support,
it will be supported by at least some positions in each of the bootstrapped
samples, and all the bootstrapped samples will yield this grouping.
Bootstrapping can be applied to all methods of phylogenetic reconstruction.
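The resampling step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used by clustalw or seqboot: the alignment is assumed to be a dictionary mapping sequence names to aligned strings, and the names bootstrap_alignment and inclusion_probability are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    alignment: dict mapping sequence name -> aligned sequence string.
    rng: a random.Random instance, so that runs can be repeated.
    Returns the resampled alignment and the list of sampled column indices.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    n_sites = lengths.pop()
    # Draw as many columns as the original alignment has, with replacement.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    resampled = {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }
    return resampled, columns


def inclusion_probability(n_sites):
    """Chance that a given column appears at least once in one sample."""
    return 1.0 - (1.0 - 1.0 / n_sites) ** n_sites


if __name__ == "__main__":
    toy = {"seqA": "ACGTACGTAC", "seqB": "ACGTTCGTAA", "seqC": "ACCTTCGTAA"}
    sample, cols = bootstrap_alignment(toy, random.Random(1))
    for name, seq in sample.items():
        print(name, seq)
    print("columns used:", sorted(set(cols)))
    print("inclusion probability, 1000 sites:", inclusion_probability(1000))
```

The same column indices are applied to every sequence, so each sampled position remains a complete alignment column. The inclusion probability approaches 1 - 1/e, roughly 0.632, as the alignment gets longer, which is where the figure of about 63% comes from.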
Bootstrapping has become very popular for assessing the reliability of reconstructed phylogenies. Its advantages are that it can be applied to different methods of phylogenetic reconstruction, and that it assigns a probability-like number to every possible partition of the dataset (= branch in the resulting tree). Its disadvantages are that the support for individual groups decreases as you add more sequences to the dataset, and that it only measures how much support for a partition is in your data given a method of analysis. If the method of reconstruction falls victim to a bias or an artifact, this will be reproduced in every one of the bootstrapped samples, and it will result in high bootstrap support values.
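How the probability-like number for a partition is obtained can be sketched as a simple count over the trees built from the bootstrapped samples. The sketch below assumes that tree building has already been done elsewhere and that each tree has been reduced to a set of groups, each group written as a frozenset of sequence names in a consistent form; both the data layout and the function name bootstrap_support are assumptions made for this illustration.

```python
from collections import Counter


def bootstrap_support(replicate_groups):
    """Percentage of bootstrapped samples whose tree contains each group.

    replicate_groups: one entry per bootstrapped sample; each entry is an
    iterable of frozensets, one per group (partition) found in that tree.
    """
    counts = Counter()
    for groups in replicate_groups:
        # set() ensures a group is counted at most once per replicate.
        counts.update(set(groups))
    n_replicates = len(replicate_groups)
    return {
        group: 100.0 * count / n_replicates
        for group, count in counts.items()
    }


if __name__ == "__main__":
    # Four made-up replicate trees on the sequences A, B, C, D.
    replicates = [
        [frozenset({"A", "B"}), frozenset({"C", "D"})],
        [frozenset({"A", "B"})],
        [frozenset({"A", "C"})],
        [frozenset({"A", "B"}), frozenset({"C", "D"})],
    ]
    support = bootstrap_support(replicates)
    for group, percent in sorted(support.items(), key=lambda item: -item[1]):
        print(sorted(group), percent)
```

A count like this only reflects how often the chosen method recovers a group from resampled data. If the method is biased, the biased group is recovered in every replicate and the count is high even though the group is wrong.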
Creating a bootstrapped sample
Joe Felsenstein describes the bootstrap procedure in his manual to the seqboot program (part of the PHYLIP package) as follows:
The sample input and output of the seqboot program illustrate the generation of the bootstrapped samples:
TEST DATA SET
CONTENTS OF OUTPUT FILE (if Replicates are set to 10 and seed to 4333)
Problems with clustalw: The input order of the sequences is not
randomized when the bootstrapped samples are analyzed; therefore,
even if your data contain no phylogenetic information at all, you can get 100% bootstrap values.
LOOK AT YOUR ALIGNMENTS CAREFULLY! - or "From junk comes junk!" If you
have very different branch lengths, even if you have a "molecular clock"
running, long branches tend to attract each other.
TREEVIEW
To view trees generated by clustalw, you can use treeview from Rod Page.
The program should already be installed on your computers, and it is
extremely user-friendly. The trees it displays can be copied and pasted
into Microsoft Word, and the labels can be rearranged or modified
after double-clicking on the imported image.
Assignment #5
Discussion of Results