MCB 372 - CLASS 12
Questions and comments regarding last Wednesday's
class
Sequences with gaps deleted are here,
same without prokaryotes are here.
Results from calculation of distance matrix are here
Neighbor joining tree from these distances w/o prokaryotes is here;
with prokaryotic sequences is here
Eight usertrees (somewhat modified according to justified expectation)
are here
The evaluation of the usertrees is here
(note 6, 7 and 8 are unresolved trees)
A test for the molecular clock using the ml ratio test is here.
Note that the topology in the usertree within the plants is probably
incorrect (several branches are close to 0).
Note that the current official version of puzzle does not allow you to
put the root in a user-defined location (the option is there, but
the root goes on the branch leading to the first species). The analysis
given here as an example was calculated with a version of puzzle provided
by one of the authors (Heiko
Schmidt).
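As a rough illustration of how a likelihood ratio test for the molecular clock is usually set up, here is a minimal Python sketch. It is not the calculation behind the linked results. It assumes the standard form of the test: the statistic is twice the difference in log-likelihood between the tree without and with the clock, compared against a chi-squared distribution with n - 2 degrees of freedom for n sequences on a fully resolved tree. The function names and the example log-likelihood values are made up for illustration.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared distribution.

    Uses the power series for the regularized lower incomplete gamma
    function; adequate for the moderate values typical of this test.
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a
    total = term
    for k in range(1, 10000):
        term *= z / (a + k)
        total += term
        if term < 1e-16 * total:
            break
    lower = math.exp(a * math.log(z) - z - math.lgamma(a)) * total
    return max(0.0, 1.0 - lower)

def clock_lrt(lnl_clock, lnl_no_clock, n_taxa):
    """Return (statistic, degrees of freedom, p-value) for the clock test.

    Assumption: both log-likelihoods come from the same topology and
    substitution model, and the tree is fully resolved.
    """
    statistic = 2.0 * (lnl_no_clock - lnl_clock)
    df = n_taxa - 2
    return statistic, df, chi2_sf(statistic, df)

# Hypothetical log-likelihoods for 7 sequences (not taken from the class data)
statistic, df, p_value = clock_lrt(-1010.0, -1000.0, 7)
print(statistic, df, p_value)
```

A small p-value would suggest rejecting the clock. Since several branches in the usertree are close to 0, the chi-squared approximation should be treated with some caution here.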
Student presentations #4 and #5:
Among Site
Rate Variation
Estimating number of substitutions
Application of ML mapping to comparative Genome
analyses
A recent article on the use of ml mapping in comparative genome analyses
is here. (Go
through Figs. 1, 2, 3, 4, and 7, and Tab. 4.)
Automation of Repetitive Tasks
SEALS demo (SSH client PuTTY is available
here or here).
Protein Data Bank at Research Collaboratory
for Structural Bioinformatics (RCSB)
The Protein Data Bank (PDB)
is a public collection of three-dimensional structures of macromolecular
complexes determined experimentally by X-ray crystallographers and NMR
spectroscopists.
There are three ways to search the PDB:
- using the PDB ID codes
- using SearchLite
- using SearchFields.
A PDB ID code is a unique four-character alphanumeric
code assigned to every structure in the databank. Each character in the
code is either a digit (0-9) or an uppercase letter (A-Z).
PDB ID codes are often reported in articles as a reference to the
structure of a biomolecule.
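The rule just described can be checked with a few lines of Python. This is a minimal sketch that only tests the shape of a string (four characters, each a digit or an uppercase letter); it does not check whether a structure with that code actually exists in the databank. The function name is made up for illustration.

```python
import re

# Pattern follows the rule stated above: exactly four characters,
# each a digit 0-9 or an uppercase letter A-Z.
PDB_ID_PATTERN = re.compile(r"[0-9A-Z]{4}")

def looks_like_pdb_id(text):
    """Return True if text has the shape of a PDB ID code."""
    return PDB_ID_PATTERN.fullmatch(text) is not None

# Example strings chosen only to exercise the rule
for candidate in ["4HHB", "4hhb", "4HHB1", "4H-B"]:
    print(candidate, looks_like_pdb_id(candidate))
```

Codes copied from an article in lowercase can be converted with `candidate.upper()` before checking.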
SearchLite allows you to search the databank
using keywords that describe the biomolecule you are interested in. The full
list of attributes, as well as examples of search queries, is given
here.
SearchFields
allows you to create sophisticated, customized queries. For example,
it allows you to use a primary sequence in FASTA format to search for structures.
The data in the Protein Data Bank are stored in a special format,
the so-called PDB format. If you are interested in the format specification,
you can click here.
This is the format that SPDBV (and other visualization programs) read.
It contains the position of every atom in a molecule,
as well as some auxiliary information such as citations and comments.
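To make the idea of "position of every atom" concrete, here is a minimal Python sketch that pulls a few fields out of the atom records of a PDB-format file. It relies on the fixed column layout of ATOM and HETATM records in the PDB format (atom name in columns 13-16, residue name in 18-20, chain in 22, residue number in 23-26, and x, y, z in 31-38, 39-46, 47-54). The function names are made up for illustration, and the sketch ignores everything else in the file.

```python
def parse_atom_line(line):
    """Extract a few fields from one ATOM or HETATM record.

    PDB records are fixed-width, so fields are cut out by column
    position (0-based Python slices) rather than split on whitespace.
    """
    return {
        "atom_name": line[12:16].strip(),
        "residue_name": line[17:20].strip(),
        "chain_id": line[21].strip(),
        "residue_number": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
    }

def read_atoms(lines):
    """Return parsed atom records, skipping all other record types."""
    return [
        parse_atom_line(line)
        for line in lines
        if line.startswith(("ATOM  ", "HETATM"))
    ]
```

With a downloaded file, `read_atoms(open(path))` would return one dictionary per atom, where `path` is wherever the file was saved. Other record types, such as the citations and comments, are skipped.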