topic#5

Trees with CLUSTALW

Neighbor joining is a fast algorithmic approach to tree building. The default option in Clustal is to delete gaps only in the pairwise comparisons, not globally (global deletion means that a position with a gap in any one of the sequences is ignored in all of the sequences). Pairwise gap removal distorts the distances if different parts of the protein experience substitutions at different frequencies, because each pair of sequences is then compared over a different set of positions. Usually it is a good idea to delete gaps globally (the exception is parsimony analysis, where gaps can be encoded as missing data).
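The difference between the two gap treatments can be sketched in a few lines of Python. This is only an illustration, not Clustal's own code: the toy alignment and the function names are made up, and the distance is the simple uncorrected fraction of differing positions.

```python
def global_gap_free_columns(alignment):
    """Return the indices of columns that contain no gap in ANY sequence."""
    length = len(alignment[0])
    return [i for i in range(length) if all(seq[i] != "-" for seq in alignment)]


def p_distance(seq_a, seq_b, columns):
    """Fraction of differing residues over the given columns.

    Columns where either of the two sequences has a gap are skipped,
    which is what pairwise gap deletion amounts to.
    """
    compared = [(seq_a[i], seq_b[i]) for i in columns
                if seq_a[i] != "-" and seq_b[i] != "-"]
    if not compared:
        raise ValueError("no gap-free positions to compare")
    differences = sum(1 for a, b in compared if a != b)
    return differences / len(compared)


# Hypothetical toy alignment; the third sequence has a gap in columns 3-5.
alignment = [
    "MKTAYIAKQR",
    "MKTAYLAKQR",
    "MKT---SRQR",
]

all_columns = range(len(alignment[0]))
global_columns = global_gap_free_columns(alignment)

# Pairwise deletion: sequences 1 and 2 are compared over all 10 columns.
pairwise = p_distance(alignment[0], alignment[1], all_columns)
# Global deletion: the gapped columns are ignored for every pair.
globally = p_distance(alignment[0], alignment[1], global_columns)

print("columns kept with global deletion:", global_columns)
print("distance seq1-seq2, pairwise deletion:", pairwise)
print("distance seq1-seq2, global deletion:", globally)
```

In this toy case the only difference between the first two sequences lies in a column where the third sequence has a gap, so the two treatments give different distances for the same pair.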

You should select the option to correct for multiple substitutions.
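To see what such a correction does, here is a small sketch of two widely used formulas: the Jukes-Cantor correction for nucleotides and Kimura's empirical correction for proteins. Whether these are exactly the formulas your Clustal version applies is an assumption; the point is only that the corrected distance grows faster than the observed fraction of differences.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed nucleotide difference p."""
    if not 0 <= p < 0.75:
        raise ValueError("Jukes-Cantor is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_protein(p):
    """Kimura's empirical correction for an observed protein difference p."""
    inner = 1 - p - 0.2 * p * p
    if p < 0 or inner <= 0:
        raise ValueError("observed difference is outside the usable range")
    return -math.log(inner)


for observed in (0.05, 0.30, 0.60):
    print(f"observed {observed:.2f}  "
          f"Jukes-Cantor {jukes_cantor(observed):.3f}  "
          f"Kimura (protein) {kimura_protein(observed):.3f}")
```

For small observed differences the correction is negligible; for strongly diverged sequences it becomes large, because many sites have changed more than once.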

If you want to see how changes in the alignment parameters affect the alignment, you need to select the option to remove all gaps before the alignment. Otherwise, the gaps are left in the sequences, and the sequences, including the gaps, are simply realigned using the new set of parameters (which usually doesn’t change much).

What is the difference between an algorithmic tree building method and one that finds an optimal tree?

Trees should be considered unrooted unless they are rooted by assuming a molecular clock or by using an outgroup.

What information is contained in a tree? Which trees are different? Examples
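One way to make "which trees are different" concrete is to describe an unrooted tree by its set of splits (each internal branch divides the taxa into two groups). The sketch below is a minimal illustration: trees are written as nested Python tuples, the taxon names are made up, and two trees are taken to have the same topology if they have the same splits.

```python
def leaves(node):
    """Return the set of taxon names below a node (a name or a nested tuple)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def splits(tree):
    """Return the non-trivial splits of the tree, viewed as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    reference taxon, so the result does not depend on where the tree
    happens to be rooted or on the order in which branches are drawn.
    """
    all_taxa = leaves(tree)
    reference = min(all_taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        side = all_taxa - clade if reference in clade else clade
        # Skip trivial splits (a single taxon against all the others).
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


tree_a = ((("A", "B"), "C"), ("D", "E"))
swapped = (("E", "D"), ("C", ("B", "A")))      # same tree, branches swapped
rerooted = ("A", ("B", ("C", ("D", "E"))))     # same tree, rooted on A
different = ((("A", "C"), "B"), ("D", "E"))    # B and C have changed places

print(splits(tree_a) == splits(swapped))
print(splits(tree_a) == splits(rerooted))
print(splits(tree_a) == splits(different))
```

Drawing the branches in a different order or placing the root elsewhere leaves the set of splits untouched; only a change in who groups with whom produces a different unrooted tree.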

Bootstrapping and the Baron Münchhausen

Principle behind bootstrapping

Blackboard example on how to generate a bootstrapped sample
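The blackboard example can also be written as a short Python sketch. It assumes the usual procedure of drawing alignment columns at random with replacement until the sample has as many columns as the original; the toy alignment and the function name are made up for illustration.

```python
import random


def bootstrap_sample(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(alignment[0])
    # Each draw picks one column index; the same column may be drawn repeatedly,
    # and some columns will not be drawn at all.
    picks = [rng.randrange(length) for _ in range(length)]
    resampled = ["".join(seq[i] for i in picks) for seq in alignment]
    return resampled, picks


# Hypothetical toy alignment of three sequences.
alignment = ["ACGTAC", "ACGTTC", "ATGTAC"]
rng = random.Random(42)  # fixed seed so the run can be repeated

sample, picks = bootstrap_sample(alignment, rng)
for original, resampled in zip(alignment, sample):
    print(original, "->", resampled)
print("columns drawn:", picks)
```

Note that whole columns are resampled, so the residues of one position always stay together across all sequences. A tree is then built from each bootstrapped sample.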

CAVEAT: the neighbor joining tree and the bootstrap consensus tree are not necessarily identical.

TREEVIEW

Which operations in the tree editor actually change the topology of the unrooted tree?
Ladderize, swap branches, exchange branches, re-root, collapse branch or clade, remove branch, move branch….

ALIGNMENT of DIVERGENT SEQUENCES:

a) using Clustal’s profile alignment option, e.g. to align different types of prolyl-tRNA synthetases (profile 1 here, profile 2 here)

b) using structural information

THE SWISS PROTEIN DATA BANK VIEWER

There are several programs that allow the inspection and manipulation of 3-D structural protein data. In this course we will use the Swiss Protein Data Bank Viewer and, to a lesser extent, Chime, an add-on to Netscape (in the classroom it seems to work only with Netscape) that allows viewing 3-D structures. If it is installed correctly, it automatically opens Netscape and displays a 3-D image of the structure when you double-click on a *.pdb file. Chime (if you want to install it on your home PC, click here) is great in that it allows a “comic”-like representation of the structure (yellow arrows for beta sheets, red spirals for alpha helices). You can also retrieve pdb files from the NCBI, or from the protein structure data bank at Rutgers University.

· If correctly installed, Chime should start up and load the structure of lysozyme + substrate if you click here; the file is called 1HEW.pdb. Here should be the bovine mitochondrial F-ATPase. Save both files locally as 1HEW.pdb and 1bmf.pdb, respectively.

· After the file is loaded into Netscape, point at the structure and right-click (on a Mac, keep the button pressed down)

· Explore different display options; check out the help page.

· Color with secondary structure

· Display as “cartoons”

While Chime is great for a first orientation and for creating publishable figures, it has some limitations with respect to the manipulations you can perform.

SPDBV is an excellent choice, also because it provides an interface to the Swiss Protein databank modeling software and allows you to align proteins based on their structure. SPDBV is available at ExPASy.

There are several excellent online tutorials available for learning to use SPDBV:

A basic tutorial is at
http://www.usm.maine.edu/~rhodes/SPVTut/index.html

And a course on structure, spdbv, and modeling is at
http://www.expasy.ch/swissmod/course/course-index.htm

The exercises in the following sections are taken, with slight modifications, from Gale Rhodes's basic tutorial; many of them parallel exercises in that tutorial.

Demonstration using two histones H2b and H3 from the nucleosome.

Open the nucleosome pdb file in spdbv
Check the pdb file
Select one histone only and save the selected residues only. Do the same with the second histone.
Load two histones separately
do magic fit,
improve fit,
color RMS,
open alignment window,
open layer info,
move cursor through aligned sequences
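
Two of the steps above, saving one chain only and judging a fit by its RMS deviation, can be sketched in Python. This is not how SPDBV works internally. It assumes the standard fixed-column layout of ATOM records in a pdb file, uses a tiny made-up structure with hypothetical chains "A" and "B" in place of the nucleosome file, and computes the RMSD of alpha carbons that are paired in order and already superimposed (the superposition itself, which magic fit performs, is not shown).

```python
import math


def chain_lines(pdb_lines, chain_id):
    """Keep only the ATOM records of one chain (chain ID is column 22)."""
    return [line for line in pdb_lines
            if line.startswith("ATOM") and line[21] == chain_id]


def ca_coordinates(pdb_lines):
    """Extract (x, y, z) of alpha carbons from ATOM records (columns 31-54)."""
    coords = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords.append((float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])))
    return coords


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long coordinate lists."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum((a - b) ** 2
                for point_a, point_b in zip(coords_a, coords_b)
                for a, b in zip(point_a, point_b))
    return math.sqrt(total / len(coords_a))


def atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build a made-up ATOM record in fixed-column layout (demo data only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Hypothetical two-chain structure: chain B is chain A shifted by 1 unit in y.
pdb_lines = [
    atom_line(1, " CA", "MET", "A", 1, 0.0, 0.0, 0.0),
    atom_line(2, " CA", "LYS", "A", 2, 4.0, 0.0, 0.0),
    atom_line(3, " CA", "THR", "A", 3, 8.0, 0.0, 0.0),
    atom_line(4, " CA", "MET", "B", 1, 0.0, 1.0, 0.0),
    atom_line(5, " CA", "LYS", "B", 2, 4.0, 1.0, 0.0),
    atom_line(6, " CA", "THR", "B", 3, 8.0, 1.0, 0.0),
]

chain_a = chain_lines(pdb_lines, "A")
chain_b = chain_lines(pdb_lines, "B")
deviation = rmsd(ca_coordinates(chain_a), ca_coordinates(chain_b))
print("atoms kept for chain A:", len(chain_a))
print("CA RMSD between chains A and B:", deviation)
```

With a real file, the lines would come from reading the pdb file, and the chain IDs of the two histones would have to be looked up in that file.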