PUZZLE 4.0.2
Type of analysis: tree reconstruction
Parameter estimation: approximate (faster)
Parameter estimation uses: neighbor-joining tree (for substitution process and rate variation)
Standard errors (S.E.) are obtained by the curvature method.
The upper and lower bounds of an approximate 95% confidence interval
for parameter or branch length x are x-1.96*S.E. and x+1.96*S.E.
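The interval rule above can be sketched as follows. The function name is hypothetical; the input values are the Gamma parameter alpha and the Thermotoga branch length with their S.E. as reported further down in this file.

```python
def confidence_interval(x, se, z=1.96):
    """Return (lower, upper) of the approximate 95% interval x -/+ z*S.E."""
    return x - z * se, x + z * se

# Estimates and standard errors copied from this report.
alpha_lo, alpha_hi = confidence_interval(2.02, 0.59)
branch_lo, branch_hi = confidence_interval(0.54203, 0.10122)

print(f"alpha:             {alpha_lo:.4f} .. {alpha_hi:.4f}")
print(f"Thermotoga branch: {branch_lo:.5f} .. {branch_hi:.5f}")
```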
SEQUENCE ALIGNMENT
Input data: 4 sequences with 363 amino acid sites
Number of constant sites: 21 (= 5.8% of all sites)
SUBSTITUTION PROCESS
Model of substitution: JTT (Jones et al. 1992)
Amino acid frequencies (estimated from data set):
pi(A) = 9.0%
pi(R) = 4.4%
pi(N) = 6.0%
pi(D) = 7.5%
pi(C) = 1.5%
pi(Q) = 3.1%
pi(E) = 6.9%
pi(G) = 5.9%
pi(H) = 0.9%
pi(I) = 6.4%
pi(L) = 7.7%
pi(K) = 6.5%
pi(M) = 1.6%
pi(F) = 4.0%
pi(P) = 4.9%
pi(S) = 6.2%
pi(T) = 3.6%
pi(W) = 3.0%
pi(Y) = 4.8%
pi(V) = 6.0%
RATE HETEROGENEITY
Model of rate heterogeneity: Gamma distributed rates
Gamma distribution parameter alpha (estimated from data set): 2.02 (S.E. 0.59)
Number of Gamma rate categories: 8
Rates and their respective probabilities used in the likelihood function:
 Category  Relative rate  Probability
  1         0.2052         0.1250
  2         0.4027         0.1250
  3         0.5778         0.1250
  4         0.7595         0.1250
  5         0.9651         0.1250
  6         1.2190         0.1250
  7         1.5777         0.1250
  8         2.2930         0.1250
Categories 1-8 approximate a continuous Gamma-distribution with expectation 1
and variance 0.50.
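A minimal check of these figures, assuming the quoted variance is that of the continuous distribution (1/alpha for a Gamma with mean 1). The eight category rates average to 1, while their own variance comes out somewhat below the continuous value. Variable names are illustrative only.

```python
# Rates and probabilities copied from the table in this report.
rates = [0.2052, 0.4027, 0.5778, 0.7595, 0.9651, 1.2190, 1.5777, 2.2930]
probs = [0.1250] * 8
alpha = 2.02

# Mean and variance of the discrete approximation.
mean = sum(p * r for p, r in zip(probs, rates))
discrete_var = sum(p * (r - mean) ** 2 for p, r in zip(probs, rates))

# Variance of a continuous Gamma distribution with shape alpha and mean 1.
continuous_var = 1.0 / alpha

print(f"mean of categories:     {mean:.4f}")
print(f"variance of categories: {discrete_var:.4f}")
print(f"1/alpha:                {continuous_var:.4f}")
```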
Combination of categories that contributes the most to the likelihood
(computation done without clock assumption assuming quartet-puzzling tree):
8 3 8 8 8 7 1 3 8 6 2 8 8 1 4 4 8 7 5 4 8 4 4 4 4 3 8 8 2 6
6 3 1 8 3 3 8 8 8 7 8 8 8 8 8 2 1 8 1 1 4 8 3 8 4 8 8 4 7 8
7 4 4 3 8 2 7 8 8 8 1 8 7 1 1 3 1 8 3 1 2 5 8 8 5 3 7 2 1 8
2 8 4 8 1 1 1 7 1 2 2 1 5 1 4 1 8 7 3 2 1 2 6 8 8 7 8 8 2 1
3 8 4 2 1 1 5 1 3 1 7 1 1 2 1 1 3 1 1 7 8 8 3 8 8 6 8 1 8 6
1 7 2 1 6 3 2 1 8 1 8 5 8 1 5 3 2 7 5 4 4 2 8 1 5 5 1 5 1 1
1 2 1 8 1 1 2 8 1 1 3 4 1 8 2 1 8 2 3 6 7 4 6 8 2 2 8 8 8 8
8 4 5 7 2 1 3 7 2 4 8 7 1 1 8 4 8 1 8 4 2 8 7 3 6 2 2 1 2 1
1 3 3 8 2 2 7 8 6 8 3 8 1 7 2 8 6 4 3 2 2 6 4 4 1 7 7 6 1 8
8 8 8 8 8 8 2 8 3 5 1 8 3 8 8 8 5 8 1 8 8 8 8 1 1 5 4 1 8 3
2 5 3 2 5 3 6 8 1 8 3 4 8 8 1 4 8 1 8 8 8 6 1 8 8 7 2 8 5 1
1 7 1 6 1 1 8 1 1 1 1 7 1 8 8 8 1 1 1 6 6 1 6 1 1 8 8 8 3 4
1 8 4
SEQUENCES IN INPUT ORDER
              5% chi-square test  p-value
 Thermotoga        passed          10.03%  [0]
 Methanococ        failed           0.02%  [0]
 Deinococcu        failed           0.00%  [0]
 Pseudomona        failed           3.72%  [0]
The chi-square test compares the amino acid composition of each sequence
to the frequency distribution assumed in the maximum likelihood model.
The number in square brackets indicates how often each sequence is
involved in one of the 0 completely unresolved quartets of the
quartet puzzling tree search.
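A sketch of a composition test of this kind, not the program's own code. The counts are hypothetical toy data with 4 residue classes instead of 20, since the per-sequence counts are not part of this report. The statistic is compared with the 5% critical value for 3 degrees of freedom (7.815).

```python
def chi_square_statistic(observed_counts, expected_freqs):
    """Pearson chi-square statistic of observed counts vs. expected frequencies."""
    n = sum(observed_counts)
    return sum((o - n * f) ** 2 / (n * f)
               for o, f in zip(observed_counts, expected_freqs))

# Hypothetical toy data: 4 classes, made-up counts for one sequence.
expected = [0.25, 0.25, 0.25, 0.25]   # frequencies assumed by the model
observed = [30, 20, 25, 25]           # counts in the sequence under test
CRITICAL_5PCT_DF3 = 7.815             # chi-square 95% quantile, 3 d.f.

stat = chi_square_statistic(observed, expected)
verdict = "passed" if stat < CRITICAL_5PCT_DF3 else "failed"
print(f"chi-square = {stat:.3f}: {verdict}")
```

With 20 amino acids the same comparison would use 19 degrees of freedom.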
IDENTICAL SEQUENCES
The sequences in each of the following groups are all identical. To speed
up computation please remove all but one of each group from the data set.
All sequences are unique.
MAXIMUM LIKELIHOOD DISTANCES
Maximum likelihood distances are computed using the selected model of
substitution and rate heterogeneity.
  4
Thermotoga  0.00000  1.07388  2.29357  5.26092
Methanococ  1.07388  0.00000  2.86917  4.56132
Deinococcu  2.29357  2.86917  0.00000  3.14771
Pseudomona  5.26092  4.56132  3.14771  0.00000
Average distance (over all possible pairs of sequences): 3.20110
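The average can be reproduced from the matrix above by taking each unordered pair once. A short sketch with the values copied from this report:

```python
from itertools import combinations

names = ["Thermotoga", "Methanococ", "Deinococcu", "Pseudomona"]
dist = [
    [0.00000, 1.07388, 2.29357, 5.26092],
    [1.07388, 0.00000, 2.86917, 4.56132],
    [2.29357, 2.86917, 0.00000, 3.14771],
    [5.26092, 4.56132, 3.14771, 0.00000],
]

# All unordered pairs (i < j); 6 pairs for 4 sequences.
pairs = list(combinations(range(len(names)), 2))
average = sum(dist[i][j] for i, j in pairs) / len(pairs)
print(f"{len(pairs)} pairs, average distance {average:.4f}")
```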
TREE SEARCH
Quartet puzzling is used to choose from the possible tree topologies
and to simultaneously infer support values for internal branches.
Number of puzzling steps: 1000
Analysed quartets: 1
Unresolved quartets: 0 (= 0.0%)
Quartet trees are based on exact maximum likelihood values
using the selected model of substitution and rate heterogeneity.
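The number of analysed quartets is the number of ways to choose 4 sequences from the data set, which is why a 4-sequence input gives exactly 1. A small sketch (the function name is hypothetical):

```python
from math import comb

def quartet_count(n_sequences):
    """Number of distinct 4-sequence subsets (quartets) in a data set."""
    return comb(n_sequences, 4)

for n in (4, 5, 10):
    print(n, "sequences ->", quartet_count(n), "quartets")
```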
QUARTET PUZZLING TREE
Support for the internal branches of the unrooted quartet puzzling
tree topology is shown in percent.
This quartet puzzling tree is completely resolved.
     :---Deinococcu
 :100:
 :   :---Pseudomona
 :
 :-------Methanococ
 :
 :-------Thermotoga
Quartet puzzling tree (in CLUSTAL W notation):
(Thermotoga,(Deinococcu,Pseudomona)100,Methanococ);
BIPARTITIONS
The following bipartitions occurred at least once in all intermediate
trees that have been generated in the 1000 puzzling steps:
Bipartitions included in the quartet puzzling tree:
(bipartition with sequences in input order : number of times seen)
**.. : 1000
Bipartitions not included in the quartet puzzling tree:
(bipartition with sequences in input order : number of times seen)
None (all bipartitions are included)
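The star/dot notation can be sketched as below. It is assumed here, from the single "**.." line in this report, that the side containing the first sequence in input order is written with stars. The function name is hypothetical.

```python
def split_pattern(names, group):
    """Write a bipartition as '*'/'.' characters in input order."""
    group = set(group)
    # A split is symmetric; flip it so the first sequence is on the '*' side
    # (assumed convention, see above).
    if names[0] not in group:
        group = set(names) - group
    return "".join("*" if n in group else "." for n in names)

names = ["Thermotoga", "Methanococ", "Deinococcu", "Pseudomona"]
pattern = split_pattern(names, {"Deinococcu", "Pseudomona"})

# Support in percent: times seen divided by number of puzzling steps.
support = 100 * 1000 / 1000
print(pattern, f"{support:.0f}%")
```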
MAXIMUM LIKELIHOOD BRANCH LENGTHS ON QUARTET PUZZLING TREE (NO CLOCK)
Branch lengths are computed using the selected model of
substitution and rate heterogeneity.
            :----3 Deinococcu
 :----------5
 :          :---------------4 Pseudomona
 :
 :---2 Methanococ
 :
 :---1 Thermotoga
         branch  length     S.E.     branch  length     S.E.
Thermotoga    1  0.54203  0.10122         5  1.69656  0.28262
Methanococ    2  0.55668  0.10357
Deinococcu    3  0.62888  0.22925   15 iterations until convergence
Pseudomona    4  2.85202  0.42824   log L: -3082.39
Quartet puzzling tree with maximum likelihood branch lengths
(in CLUSTAL W notation):
(Thermotoga:0.54203,(Deinococcu:0.62888,Pseudomona:2.85202)100:1.69656,
Methanococ:0.55668);
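The branch lengths can be read back from the tree string above with a simple pattern match. This is a minimal sketch, not a general parser for this notation: it only picks up the numbers that follow a colon.

```python
import re

# Tree string copied from this report (joined into one line).
newick = ("(Thermotoga:0.54203,(Deinococcu:0.62888,Pseudomona:2.85202)100:1.69656,"
          "Methanococ:0.55668);")

# Every branch length follows a ':'; the support value 100 follows ')' instead.
lengths = [float(x) for x in re.findall(r":([0-9.]+)", newick)]
tree_length = sum(lengths)

print(f"{len(lengths)} branches, total tree length {tree_length:.5f}")
```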
TIME STAMP
Date and time: Thu Jun 17 14:13:13 1999
Runtime: 129 seconds (= 2.1 minutes = 0.0 hours)