PUZZLE 4.0.2

Type of analysis: tree reconstruction
Parameter estimation: approximate (faster)
Parameter estimation uses: neighbor-joining tree (for substitution process and rate variation)

Standard errors (S.E.) are obtained by the curvature method.
The lower and upper bounds of an approximate 95% confidence interval
for parameter or branch length x are x-1.96*S.E. and x+1.96*S.E.


SEQUENCE ALIGNMENT

Input data: 19 sequences with 1147 amino acid sites
Number of constant sites: 2 (= 0.2% of all sites)


SUBSTITUTION PROCESS

Model of substitution: JTT (Jones et al. 1992)
Amino acid frequencies (estimated from data set):

 pi(A) =   6.0%
 pi(R) =   5.4%
 pi(N) =   4.6%
 pi(D) =   5.9%
 pi(C) =   1.3%
 pi(Q) =   3.0%
 pi(E) =   7.6%
 pi(G) =   6.4%
 pi(H) =   3.3%
 pi(I) =   8.0%
 pi(L) =   9.3%
 pi(K) =   6.5%
 pi(M) =   2.6%
 pi(F) =   3.6%
 pi(P) =   4.6%
 pi(S) =   5.6%
 pi(T) =   4.8%
 pi(W) =   0.6%
 pi(Y) =   3.4%
 pi(V) =   7.3%


RATE HETEROGENEITY

Model of rate heterogeneity: Gamma distributed rates
Gamma distribution parameter alpha (estimated from data set): 1.76 (S.E. 0.03)
Number of Gamma rate categories: 8

Rates and their respective probabilities used in the likelihood function:

 Category  Relative rate  Probability
  1         0.1769         0.1250
  2         0.3704         0.1250
  3         0.5484         0.1250
  4         0.7366         0.1250
  5         0.9526         0.1250
  6         1.2224         0.1250
  7         1.6077         0.1250
  8         2.3849         0.1250

Categories 1-8 approximate a continuous Gamma-distribution with expectation 1
and variance 0.57.

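
The figures in the RATE HETEROGENEITY section can be cross-checked with a few lines of Python. This is only a minimal sketch, not part of the PUZZLE output: the variable names are illustrative, and it assumes the usual parameterization in which a Gamma distribution with mean 1 and shape alpha has variance 1/alpha.

```python
# Figures copied from the RATE HETEROGENEITY section above.
alpha = 1.76       # Gamma shape parameter estimated from the data set
alpha_se = 0.03    # its standard error (curvature method)
rates = [0.1769, 0.3704, 0.5484, 0.7366, 0.9526, 1.2224, 1.6077, 2.3849]
prob = 0.1250      # every category has the same probability

# Approximate 95% confidence interval: x - 1.96*S.E. and x + 1.96*S.E.
ci_low = alpha - 1.96 * alpha_se
ci_high = alpha + 1.96 * alpha_se

# Expectation of the discretized rates; should be close to 1.
mean_rate = sum(r * prob for r in rates)

# Assumption: mean-1 Gamma with shape alpha, so variance = 1 / alpha.
gamma_variance = 1.0 / alpha

print(f"alpha 95% CI: {ci_low:.4f} .. {ci_high:.4f}")
print(f"mean category rate: {mean_rate:.4f}")
print(f"continuous Gamma variance: {gamma_variance:.2f}")
```

Under that assumption, 1/alpha rounds to the reported variance of 0.57, and the eight category rates average to approximately 1, as stated.
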
Combination of categories that contributes the most to the likelihood (computation done without clock assumption assuming quartet-puzzling tree): 8 8 8 8 8 8 8 8 8 8 7 8 7 8 1 8 7 1 3 8 8 5 8 6 8 8 1 8 8 3 7 4 8 8 5 3 1 1 1 1 3 8 1 3 1 3 4 1 1 3 8 8 6 8 8 1 7 8 3 4 8 5 1 1 4 5 3 1 1 1 4 3 1 7 1 5 8 4 1 8 8 1 3 8 3 8 6 8 1 8 5 1 1 2 5 6 1 8 2 4 1 2 8 4 2 8 8 6 6 1 1 5 2 2 4 7 2 3 2 6 2 2 3 3 8 4 8 2 8 1 8 2 8 7 8 7 8 4 3 2 5 8 5 1 1 3 2 5 2 2 5 5 2 8 8 3 1 8 8 4 8 8 4 7 8 3 6 8 7 3 2 7 5 8 6 8 7 4 7 6 8 8 8 8 8 8 8 8 7 3 5 3 7 7 3 1 2 7 4 1 1 1 1 1 2 2 8 2 3 4 5 2 4 4 2 1 3 1 2 1 3 2 3 3 8 8 2 8 7 8 7 4 7 2 7 5 3 8 7 4 6 7 8 7 6 6 6 8 8 8 8 2 1 2 2 2 1 1 1 1 1 1 1 1 5 1 3 2 1 5 1 4 4 7 5 6 3 6 2 1 2 2 4 1 4 2 1 3 2 1 4 7 4 2 1 6 1 7 3 8 8 8 8 8 8 8 8 8 8 8 8 8 1 8 4 8 8 5 3 2 4 7 6 6 8 8 1 8 3 8 3 8 8 7 7 6 7 8 5 3 4 6 6 2 2 5 5 3 3 4 4 2 4 6 3 4 3 4 4 4 7 8 8 3 3 6 4 2 3 6 5 4 5 2 2 1 1 1 1 1 1 1 1 1 2 5 5 5 4 4 7 3 3 8 3 2 2 2 1 1 1 3 1 5 5 7 2 1 2 2 6 6 1 5 5 6 7 7 5 1 5 4 1 4 1 1 3 1 1 1 2 2 4 8 3 5 6 1 4 5 1 5 5 2 2 5 4 2 7 4 6 1 5 6 2 3 8 3 1 7 1 1 1 2 2 1 1 1 2 1 1 1 1 1 1 1 1 3 2 4 1 1 2 2 3 7 6 5 5 8 6 8 7 8 7 8 8 4 2 1 1 3 2 3 1 3 2 1 8 3 1 1 3 3 3 2 3 3 6 2 5 3 1 8 8 1 8 8 5 8 4 2 7 7 4 7 8 7 2 8 1 1 1 3 2 5 8 2 8 7 2 6 5 7 8 5 2 8 4 7 4 4 5 8 4 8 8 8 8 2 5 1 2 2 1 1 1 1 1 2 4 2 1 5 2 3 5 3 2 2 8 3 6 2 2 4 4 1 7 2 2 2 1 1 1 8 8 8 1 2 1 2 1 1 1 1 2 8 3 3 6 2 6 6 5 1 7 7 8 6 7 5 5 7 3 8 6 2 7 4 3 3 8 1 8 5 4 4 4 1 1 2 1 3 1 5 5 3 8 3 5 4 3 1 3 7 6 8 5 3 2 2 2 4 1 3 5 3 1 1 1 5 4 5 4 6 4 2 2 6 4 1 8 4 8 7 8 8 8 4 7 8 1 2 8 8 8 8 8 8 8 7 8 7 8 4 8 4 3 7 2 5 4 2 6 1 5 4 7 6 7 4 8 2 8 4 5 6 6 8 4 7 8 6 1 2 8 8 8 8 8 5 5 8 8 2 4 8 8 8 8 8 5 7 1 5 8 8 4 8 8 8 8 8 4 8 8 8 8 6 8 8 1 8 5 4 8 8 8 4 8 8 5 8 8 6 2 5 6 6 2 7 2 2 6 4 7 5 6 8 7 5 2 8 8 3 1 6 5 7 8 7 2 3 5 3 5 7 7 8 6 7 1 8 6 6 8 8 8 8 8 7 5 8 8 8 5 7 8 6 4 8 2 5 8 8 3 8 4 8 8 8 8 8 8 4 4 8 8 5 4 8 8 8 8 3 8 2 6 8 8 8 8 7 7 7 3 2 8 8 5 8 4 5 8 4 8 4 1 1 8 8 8 8 4 7 8 8 5 8 3 3 6 5 8 1 6 5 5 6 5 4 3 6 8 4 5 
4 6 4 3 7 3 3 8 8 8 8 8 1 8 2 1 1 4 1 8 7 1 8 4 4 7 3 4 7 7 8 8 7 8 6 8 5 8 8 8 2 5 8 8 8 8 8 3 8 8 5 7 4 3 7 8 8 8 7 8 8 4 6 1 3 8 8 8 5 1 8 8 4 7 1 8 8 7 7 8 6 4 7 8 7 8 8 5 4 8 8 8 8 8 1 3 3 4 4 8 4 6 7 6 8 8 8 5 8 5 8 8 6 7 8 3 4 8 4 3 7 8 8 8 8 7 8 8 5 2 6 8 3 8 8 8 7 4 6 4 7 3 8 3 5 4 8 2 4 8 3 2 8 8 8 8 8 8 8 1 8 8 6 8 3 2 5 6 4 8 8 8 7 3 8 7 8 8 8 8 8 6 1 8 3 8 7 7 3 5 1 7 8 8 8 8 4 2 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8


SEQUENCES IN INPUT ORDER

              5% chi-square test  p-value
 Bostaurus        passed           90.95%  [63]
 Homosapien       passed           93.90%  [34]
 Schizosacc       passed           36.82%  [80]
 Saccharomy       passed           10.48%  [72]
 Methanobac       passed           45.54%  [91]
 Archaeoglo       passed           12.83%  [85]
 Methanococ       passed            8.89%  [95]
 Pyrococcus       passed            5.17%  [93]
 Methanoco2       failed            2.30%  [103]
 Methanoco3       failed            3.99%  [110]
 Mycobacter       failed            0.00%  [69]
 Streptomyc       failed            0.00%  [101]
 Methanoco4       passed           33.80%  [131]
 Archaeogl2       passed           17.05%  [95]
 Vibrio           failed            0.00%  [146]
 Deinococcu       failed            0.00%  [49]
 Arabidopsi       failed            0.02%  [53]
 Plasmodium       failed            0.00%  [51]
 Celegans         passed           13.74%  [107]

The chi-square test compares the amino acid composition of each sequence
to the frequency distribution assumed in the maximum likelihood model.
The number in square brackets indicates how often each sequence is
involved in one of the 407 completely unresolved quartets of the
quartet puzzling tree search.


IDENTICAL SEQUENCES

The sequences in each of the following groups are all identical. To speed
up computation please remove all but one of each group from the data set.

 Bostaurus, Homosapien.
 Methanoco2, Methanoco3.


MAXIMUM LIKELIHOOD DISTANCES

Maximum likelihood distances are computed using the selected model of
substitution and rate heterogeneity.

  19
Bostaurus   0.00000 0.00000 1.06832 1.24025 2.27909 2.28737 2.11931 2.20000 2.52014 2.46873 6.56759 7.17150 2.16030 2.49738 2.96659 3.18458 3.15693 3.62887 5.43784
Homosapien  0.00000 0.00000 0.79981 0.87647 2.20519 2.25772 2.04535 2.03709 2.63779 2.63779 8.99901 8.99842 2.10893 2.52844 2.95293 3.00191 2.31255 2.38500 3.72344
Schizosacc  1.06832 0.79981 0.00000 0.86339 2.56255 2.63683 2.51534 2.40217 2.64173 2.62234 6.80829 7.58547 2.35765 2.52630 3.29639 3.31838 3.46101 4.34871 5.62680
Saccharomy  1.24025 0.87647 0.86339 0.00000 2.49224 2.59020 2.38938 2.44679 2.88598 2.85496 6.20301 7.20246 2.47001 2.56814 3.21090 3.50099 3.55505 4.04576 5.65846
Methanobac  2.27909 2.20519 2.56255 2.49224 0.00000 0.75944 0.76478 0.79277 2.10803 2.06174 6.27932 6.21625 1.73936 1.97050 2.76421 2.96897 2.47138 3.62403 4.13420
Archaeoglo  2.28737 2.25772 2.63683 2.59020 0.75944 0.00000 0.82748 0.82131 2.19687 2.15071 6.00009 5.86490 1.72418 1.83559 2.53727 3.17501 2.57997 3.89773 4.13561
Methanococ  2.11931 2.04535 2.51534 2.38938 0.76478 0.82748 0.00000 0.70796 2.09786 2.04805 6.38599 5.93310 1.65489 1.99168 2.62055 3.37542 2.25286 3.65600 5.00997
Pyrococcus  2.20000 2.03709 2.40217 2.44679 0.79277 0.82131 0.70796 0.00000 2.04147 2.01292 5.90511 5.70331 1.80979 1.94678 2.54493 3.53909 2.49426 3.72305 4.39318
Methanoco2  2.52014 2.63779 2.64173 2.88598 2.10803 2.19687 2.09786 2.04147 0.00000 0.00000 6.66600 6.83701 1.94508 1.80009 2.92939 3.86657 3.01965 3.18120 5.03173
Methanoco3  2.46873 2.63779 2.62234 2.85496 2.06174 2.15071 2.04805 2.01292 0.00000 0.00000 6.63248 6.78281 1.94552 1.80009 2.86513 3.64301 2.99388 3.11031 5.03435
Mycobacter  6.56759 8.99901 6.80829 6.20301 6.27932 6.00009 6.38599 5.90511 6.66600 6.63248 0.00000 0.80315 7.65498 6.30747 6.45677 5.55164 7.63986 8.99860 8.23630
Streptomyc  7.17150 8.99842 7.58547 7.20246 6.21625 5.86490 5.93310 5.70331 6.83701 6.78281 0.80315 0.00000 6.76198 7.28391 6.37041 4.80918 6.84341 8.99854 8.99836
Methanoco4  2.16030 2.10893 2.35765 2.47001 1.73936 1.72418 1.65489 1.80979 1.94508 1.94552 7.65498 6.76198 0.00000 2.08080 2.59398 3.19669 2.38057 2.69297 4.60746
Archaeogl2  2.49738 2.52844 2.52630 2.56814 1.97050 1.83559 1.99168 1.94678 1.80009 1.80009 6.30747 7.28391 2.08080 0.00000 3.18890 3.07029 2.89379 3.29960 4.28877
Vibrio      2.96659 2.95293 3.29639 3.21090 2.76421 2.53727 2.62055 2.54493 2.92939 2.86513 6.45677 6.37041 2.59398 3.18890 0.00000 2.64659 3.15898 3.99341 4.76869
Deinococcu  3.18458 3.00191 3.31838 3.50099 2.96897 3.17501 3.37542 3.53909 3.86657 3.64301 5.55164 4.80918 3.19669 3.07029 2.64659 0.00000 3.28330 5.07860 3.93366
Arabidopsi  3.15693 2.31255 3.46101 3.55505 2.47138 2.57997 2.25286 2.49426 3.01965 2.99388 7.63986 6.84341 2.38057 2.89379 3.15898 3.28330 0.00000 3.20129 6.56634
Plasmodium  3.62887 2.38500 4.34871 4.04576 3.62403 3.89773 3.65600 3.72305 3.18120 3.11031 8.99860 8.99854 2.69297 3.29960 3.99341 5.07860 3.20129 0.00000 6.78028
Celegans    5.43784 3.72344 5.62680 5.65846 4.13420 4.13561 5.00997 4.39318 5.03173 5.03435 8.23630 8.99836 4.60746 4.28877 4.76869 3.93366 6.56634 6.78028 0.00000

Average distance (over all possible pairs of sequences): 3.59570


TREE SEARCH

Quartet puzzling is used to choose from the possible tree topologies
and to simultaneously infer support values for internal branches.

Number of puzzling steps: 1000
Analysed quartets: 3876
Unresolved quartets: 407 (= 10.5%)

Quartet trees are based on approximate maximum likelihood values
using the selected model of substitution and rate heterogeneity.


QUARTET PUZZLING TREE

Support for the internal branches of the unrooted quartet puzzling
tree topology is shown in percent.

This quartet puzzling tree is not completely resolved!

[Tree diagram: layout lost in extraction. The same topology and support values are given in the CLUSTAL W notation that follows.]

Quartet puzzling tree (in CLUSTAL W notation):

 (Bostaurus,((Schizosacc,Saccharomy)98,((Arabidopsi,Plasmodium)93,
 ((Methanobac,Archaeoglo)78,(Methanococ,Pyrococcus)57)90,
 (((Methanoco2,Methanoco3)68,Archaeogl2)60,Methanoco4)66,
 ((Mycobacter,Streptomyc)89,Vibrio,Deinococcu,Celegans)54)90)96,
 Homosapien);


BIPARTITIONS

The following bipartitions occurred at least once in all intermediate
trees that have been generated in the 1000 puzzling steps:

Bipartitions included in the quartet puzzling tree:
(bipartition with sequences in input order : number of times seen)

 **..****** *********  :  983
 **........ .........  :  961
 ********** ******..*  :  928
 ****...... .........  :  904
 ****....** *********  :  897
 ********** ..*******  :  890
 ****..**** *********  :  782
 ********.. *********  :  677
 ********.. **..*****  :  657
 ********.. ***.*****  :  598
 ******..** *********  :  569
 ********** ..**..**.  :  535

Bipartitions not included in the quartet puzzling tree:
(bipartition with sequences in input order : number of times seen)

 ****...... **..*****  :  489
 ********** ..**..***  :  452
 ****...... ......**.  :  417
 ********** ..***.***  :  366
 ********.. **.******  :  316
 ********** ****..***  :  306
 ********** ******...  :  297
 ********** ..***.**.  :  279
 ****...*** *********  :  251
 ********** ..**.....  :  222
 ********** *****.**.  :  213
 ****...... ......***  :  207
 ********.* **.******  :  169
 ****..*.** *********  :  139
 ****....** ..**..**.  :  137
 ****....** ..**.....  :  114
 ****.*.*** *********  :  114
 ********** ****..**.  :  114
 ****...... ***.*****  :  108
 ****....** **.******  :  94

(207 other less frequent bipartitions not shown)


MAXIMUM LIKELIHOOD BRANCH LENGTHS ON QUARTET PUZZLING TREE (NO CLOCK)

Branch lengths are computed using the selected model of
substitution and rate heterogeneity.

[Tree diagram with numbered branches: layout lost in extraction. Branches 1-19 are the terminal branches of the sequences in input order; branches 20-31 are internal and can be matched by length to the tree in CLUSTAL W notation.]

         branch  length     S.E.   branch  length     S.E.
Bostaurus     1  0.00001  0.00027      20  0.47939  0.06082
Homosapien    2  0.00001  0.00025      21  1.71494  0.20689
Schizosacc    3  0.35582  0.04444      22  0.11480  0.03236
Saccharomy    4  0.56501  0.05183      23  0.03333  0.02832
Methanobac    5  0.41570  0.04439      24  0.93263  0.12181
Archaeoglo    6  0.51813  0.04993      25  1.20281  0.14597
Methanococ    7  0.42233  0.04421      26  0.55865  0.11840
Pyrococcus    8  0.41725  0.04434      27  0.21394  0.08634
Methanoco2    9  0.00001  0.00021      28  6.86779  0.81944
Methanoco3   10  0.00001  0.00019      29  0.85213  0.18403
Mycobacter   11  0.46461  0.06682      30  1.61255  0.17120
Streptomyc   12  0.44379  0.06684      31  0.32713  0.05712
Methanoco4   13  1.15679  0.13775
Archaeogl2   14  1.26336  0.15090
Vibrio       15  1.96241  0.22762
Deinococcu   16  2.31174  0.35285
Arabidopsi   17  1.08861  0.16041
Plasmodium   18  2.21263  0.19688     35 iterations until convergence
Celegans     19  4.20156  0.41385     log L: -23686.76

WARNING --- at least one branch length is close to an internal boundary!

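
The support percentages on the quartet puzzling tree appear to be the bipartition counts from the BIPARTITIONS section divided by the number of puzzling steps. The sketch below checks that reading against the figures above. It is an illustration only: the variable names are made up here, and rounding half up is an assumption chosen because 535 out of 1000 is shown as 54.

```python
# Number of puzzling steps reported in the TREE SEARCH section.
steps = 1000

# Counts of the 12 bipartitions included in the quartet puzzling tree.
counts = [983, 961, 928, 904, 897, 890, 782, 677, 657, 598, 569, 535]

# Support labels as they appear in the CLUSTAL W notation of the tree.
tree_support = [98, 93, 78, 57, 90, 68, 60, 66, 89, 54, 90, 96]

# Percentage of steps, rounded half up using integer arithmetic
# (assumption: this is how the percentages were rounded).
computed = [(c * 100 + steps // 2) // steps for c in counts]

print(sorted(computed, reverse=True))
print("matches tree labels:", sorted(computed) == sorted(tree_support))
```

Integer arithmetic is used so that the half-way case (53.5%) does not depend on floating-point or banker's rounding behaviour.
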
Quartet puzzling tree with maximum likelihood branch lengths
(in CLUSTAL W notation):

 (Bostaurus:0.00001,((Schizosacc:0.35582,Saccharomy:0.56501)98:0.47939,
 ((Arabidopsi:1.08861,Plasmodium:2.21263)93:1.71494,((Methanobac:0.41570,
 Archaeoglo:0.51813)78:0.11480,(Methanococ:0.42233,Pyrococcus:0.41725)
 57:0.03333)90:0.93263,(((Methanoco2:0.00001,Methanoco3:0.00001)68:1.20281,
 Archaeogl2:1.26336)60:0.55865,Methanoco4:1.15679)66:0.21394,((Mycobacter:0.46461,
 Streptomyc:0.44379)89:6.86779,Vibrio:1.96241,Deinococcu:2.31174,
 Celegans:4.20156)54:0.85213)90:1.61255)96:0.32713,Homosapien:0.00001);


TIME STAMP

Date and time: Thu Jun 17 10:40:02 1999
Runtime: 10822 seconds (= 180.4 minutes = 3.0 hours)
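
The tree in CLUSTAL W notation above can be read programmatically to cross-check a few figures reported earlier in this file. The following is a rough sketch rather than a full tree parser: it only pulls out the terminal `Name:length` pairs with a regular expression, and the variable names are illustrative. The quartet count assumes every set of four sequences is analysed once, which agrees with the reported 3876.

```python
import math
import re

# Tree string copied from the CLUSTAL W notation above (line breaks removed).
newick = (
    "(Bostaurus:0.00001,((Schizosacc:0.35582,Saccharomy:0.56501)98:0.47939,"
    "((Arabidopsi:1.08861,Plasmodium:2.21263)93:1.71494,((Methanobac:0.41570,"
    "Archaeoglo:0.51813)78:0.11480,(Methanococ:0.42233,Pyrococcus:0.41725)"
    "57:0.03333)90:0.93263,(((Methanoco2:0.00001,Methanoco3:0.00001)68:1.20281,"
    "Archaeogl2:1.26336)60:0.55865,Methanoco4:1.15679)66:0.21394,((Mycobacter:0.46461,"
    "Streptomyc:0.44379)89:6.86779,Vibrio:1.96241,Deinococcu:2.31174,"
    "Celegans:4.20156)54:0.85213)90:1.61255)96:0.32713,Homosapien:0.00001);"
)

# Terminal branches look like "Name:length". Internal branches carry a numeric
# support label instead of a name, so requiring a leading letter skips them.
leaf_pattern = r"([A-Za-z][A-Za-z0-9]*):([0-9.]+)"
leaves = {name: float(length) for name, length in re.findall(leaf_pattern, newick)}

n_taxa = len(leaves)
n_quartets = math.comb(n_taxa, 4)       # all sets of four sequences
unresolved_pct = 100 * 407 / n_quartets  # 407 unresolved quartets reported
longest = max(leaves, key=leaves.get)

print(f"sequences: {n_taxa}")
print(f"possible quartets: {n_quartets}")
print(f"unresolved: {unresolved_pct:.1f}%")
print(f"longest terminal branch: {longest} ({leaves[longest]})")
```

For anything beyond this kind of spot check, a dedicated Newick parser would be a safer choice than a regular expression.
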