Trees with CLUSTALW
Neighbor joining is a fast
algorithmic approach to tree building. By
default, ClustalW deletes gaps only from the pairwise
comparisons, and not globally (global deletion means that a position with a gap in
any one of the sequences is ignored in all of the other sequences as well). Pairwise gap removal
distorts the distances if different parts of the protein experience substitutions
at different rates, because each pair of sequences is then compared over a
different set of positions. It is usually a good idea
to delete gaps globally
(exception: parsimony analyses, where the gaps might be encoded as missing
data).
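The difference between the two gap treatments can be sketched in a few lines of Python. This is only an illustration on a made-up toy alignment; the function names are hypothetical and this is not ClustalW's own code.

```python
def p_distance(a, b):
    """Fraction of differing sites among columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def delete_gaps_globally(alignment):
    """Drop every column that has a gap in at least one sequence."""
    keep = [i for i in range(len(alignment[0]))
            if all(seq[i] != "-" for seq in alignment)]
    return ["".join(seq[i] for i in keep) for seq in alignment]


# Toy alignment, made up for illustration.
alignment = [
    "ACGTACGT",
    "ACGTTCGA",
    "AC--ACGA",
]

# Pairwise deletion: sequences 1 and 2 are compared over all 8 columns.
pairwise = p_distance(alignment[0], alignment[1])

# Global deletion: the two columns gapped in sequence 3 are removed everywhere.
trimmed = delete_gaps_globally(alignment)
global_ = p_distance(trimmed[0], trimmed[1])

print(pairwise, global_)
```

With pairwise deletion the first two sequences are compared over more positions than either is with the third; after global deletion every pair is compared over the same columns.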
You should select the
option to correct
for multiple substitutions.
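To see what such a correction does, here is a minimal sketch of Kimura's empirical formula for protein distances, K = -ln(1 - D - D^2/5), where D is the observed fraction of differing sites. That this is the formula the program applies to proteins is stated here as an assumption; check the program's documentation for the details, including how very large distances are handled.

```python
import math


def kimura_protein_distance(p):
    """Kimura's empirical correction: K = -ln(1 - p - p^2 / 5).

    p is the observed fraction of differing sites (0 <= p < about 0.85).
    """
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0:
        raise ValueError("observed distance too large for this formula")
    return -math.log(inner)


# The corrected distance grows faster than the observed one,
# because multiple substitutions at one site are seen as a single difference.
for p in (0.05, 0.2, 0.5):
    print(f"observed {p:.2f} -> corrected {kimura_protein_distance(p):.3f}")
```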
If you want to see how
changes in the alignment parameters affect the alignment, you need to select
the option to remove
all gaps before the alignment. Otherwise, the gaps are left
in the sequences, and the sequences, including the gaps, are simply realigned using
the new set of parameters (which usually doesn’t change much).
What is the difference
between an algorithmic tree building method and one that finds an optimal tree?
Trees should be considered
unrooted unless they are rooted by assuming a molecular clock or by using an
outgroup.
What information is
contained in a tree? Which trees are
different? Examples
Bootstrapping and the Baron
Münchhausen
Principle behind
bootstrapping
Blackboard example on how
to generate a bootstrapped sample
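The blackboard procedure can also be sketched in Python: a bootstrapped sample is built by drawing alignment columns at random, with replacement, until the sample has as many columns as the original alignment. The toy alignment and the function name are made up for illustration.

```python
import random


def bootstrap_sample(alignment, rng):
    """Resample alignment columns with replacement; the length stays the same."""
    n_sites = len(alignment[0])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in columns) for seq in alignment]


# Toy alignment, made up for illustration.
alignment = ["ACGTACGT", "ACGTTCGA", "ACGAACGA"]

rng = random.Random(1)  # fixed seed so the example is repeatable
replicate = bootstrap_sample(alignment, rng)

for seq in replicate:
    print(seq)
```

Each column of the replicate is a complete column of the original alignment, so the sequences stay aligned; some original columns appear several times and others not at all. A tree is then built from each replicate.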
CAVEAT: the neighbor
joining tree and the bootstrap consensus tree are not necessarily identical.
TREEVIEW
Which operations in the tree editor actually change the
topology of the unrooted tree?
Ladderize, swap branches, exchange branches, re-root, collapse branch or clade,
remove branch, move branch…
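One way to check whether an operation changed the unrooted topology is to compare the bipartitions (splits) of the set of leaves that the internal branches define. The following is a minimal sketch, assuming trees are written as nested Python tuples; this representation is chosen for illustration only and is not TREEVIEW's file format.

```python
def leaves(tree):
    """Set of leaf names below a node (a leaf is a string, a clade is a tuple)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of the leaves defined by the internal branches."""
    all_leaves = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        other = all_leaves - clade
        # Skip the root and branches leading to a single leaf.
        if len(clade) > 1 and len(other) > 1:
            found.add(frozenset([clade, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found


original = ((("A", "B"), "C"), ("D", "E"))
swapped = (("D", "E"), ("C", ("B", "A")))        # branches swapped around nodes
rerooted = ("A", ("B", ("C", ("D", "E"))))       # root placed on the branch to A
moved = (("A", "B"), (("C", "D"), "E"))          # C moved next to D
collapsed = ("A", "B", "C", ("D", "E"))          # clade (A, B) collapsed

print(splits(original) == splits(swapped))    # same unrooted tree
print(splits(original) == splits(rerooted))   # same unrooted tree
print(splits(original) == splits(moved))      # different tree
```

In this toy example, swapping branches and re-rooting leave the set of splits unchanged, whereas moving a branch or collapsing a clade changes it.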
ALIGNMENT of DIVERGENT
SEQUENCES:
a) using ClustalW’s profile alignment
option, e.g. to align different types of prolyl-tRNA synthetases (profile 1 here, profile 2 here)
b) using structural information
THE SWISS PROTEIN DATA
BANK VIEWER
There are several
programs that allow the inspection and manipulation of 3-D structural protein
data. In this course we will use the Swiss
Protein Data Bank Viewer and, to a lesser extent, Chime, an add-on to
Netscape (in the classroom it seems to work only with Netscape) that allows
viewing 3-D structures. If it is
installed correctly, it automatically opens Netscape and displays a 3-D image of
the structure when you double-click on a *.pdb file. Chime (if you want to install it on your home PC, click here)
is great in that it allows a “comic”-like representation of the structure
(yellow arrows for beta sheets, red spirals for alpha helices). You can also retrieve pdb files from the
NCBI, or from the Protein Data Bank
at Rutgers University.
· If Chime is correctly
installed, it should start up and load the structure of lysozyme + substrate
if you click here;
the file is called 1HEW.pdb. Here
should be the bovine mitochondrial F-ATPase.
Save both files locally as 1HEW.pdb and 1bmf.pdb, respectively.
· After the file is
loaded into Netscape, point at the structure and right-click (on a Mac, keep the button
pressed down)
· Explore different
display options; check out the help page.
· Color with secondary structure
· Display as “cartoons”
While Chime is
great for a first orientation and for creating publishable figures, it is
limited in the manipulations you can perform.
SPDBV is an
excellent choice, also because it provides an interface to the Swiss Protein
Data Bank modeling software, and it allows you to align proteins based on their
structure. SPDBV is available at ExPASy.
There are several
excellent on-line tutorials available for learning to use spdbv:
A basic tutorial is at
http://www.usm.maine.edu/~rhodes/SPVTut/index.html
And a course on structure, spdbv, and
modeling is at
http://www.expasy.ch/swissmod/course/course-index.htm
The
exercises in the following sections are taken, with slight modifications, from
Gale Rhodes's basic tutorial; many of them parallel exercises in that
tutorial.
Demonstration
using two histones, H2b and H3, from
the nucleosome.
Open the nucleosome pdb file in spdbv
Check the pdb file
Select one histone only and save the selected residues only. Do the same with the second histone.
Load two histones separately
do magic fit,
improve fit,
color RMS,
open alignment window,
open layer info,
move cursor through aligned sequences
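To make the "select one chain" and "color RMS" steps less of a black box, here is a minimal Python sketch that pulls the C-alpha coordinates of one chain out of PDB ATOM records and computes the root mean square deviation (RMSD) between two coordinate sets. It assumes that the two chains have already been superimposed (as after a fit) and that residues are paired simply by position; it is not how spdbv computes its values. The toy records and the helper `atom_line` are made up for illustration.

```python
import math


def atom_line(serial, name, res, chain, resseq, x, y, z):
    """Build a toy ATOM record in the fixed PDB column layout (example helper only)."""
    return (f"ATOM  {serial:>5} {name:^4} {res:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def ca_coordinates(pdb_lines, chain_id):
    """Return (x, y, z) of every C-alpha atom in the given chain."""
    coords = []
    for line in pdb_lines:
        if (len(line) >= 54 and line.startswith("ATOM")
                and line[12:16].strip() == "CA"   # atom name, columns 13-16
                and line[21] == chain_id):        # chain identifier, column 22
            coords.append((float(line[30:38]),    # x, columns 31-38
                           float(line[38:46]),    # y, columns 39-46
                           float(line[46:54])))   # z, columns 47-54
    return coords


def rmsd(coords_a, coords_b):
    """RMSD between two equally long, already superimposed coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Toy records: chain B is chain A shifted by 1 Angstrom along x.
pdb_lines = [
    atom_line(1, "N", "ALA", "A", 1, 0.0, 1.0, 0.0),
    atom_line(2, "CA", "ALA", "A", 1, 0.0, 0.0, 0.0),
    atom_line(3, "CA", "GLY", "A", 2, 3.8, 0.0, 0.0),
    atom_line(4, "CA", "SER", "A", 3, 7.6, 0.0, 0.0),
    atom_line(5, "CA", "ALA", "B", 1, 1.0, 0.0, 0.0),
    atom_line(6, "CA", "GLY", "B", 2, 4.8, 0.0, 0.0),
    atom_line(7, "CA", "SER", "B", 3, 8.6, 0.0, 0.0),
]

chain_a = ca_coordinates(pdb_lines, "A")
chain_b = ca_coordinates(pdb_lines, "B")
deviation = rmsd(chain_a, chain_b)
print(f"RMSD over {len(chain_a)} C-alpha atoms: {deviation:.2f}")
```

With a real file, the lines read from the saved pdb file would take the place of the toy records; the chain identifiers of the two histones have to be looked up in the file itself.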