PUZZLE 4.0.2

Type of analysis: tree reconstruction
Parameter estimation: approximate (faster)
Parameter estimation uses: neighbor-joining tree
 (for substitution process and rate variation)

Standard errors (S.E.) are obtained by the curvature method.
The lower and upper bounds of an approximate 95% confidence
interval for parameter or branch length x are x-1.96*S.E.
and x+1.96*S.E.


SEQUENCE ALIGNMENT

Input data: 21 sequences with 394 amino acid sites
Number of constant sites: 9 (= 2.3% of all sites)


SUBSTITUTION PROCESS

Model of substitution: JTT (Jones et al. 1992)
Amino acid frequencies (estimated from data set):

 pi(A) =  10.8%
 pi(R) =   4.1%
 pi(N) =   4.4%
 pi(D) =   4.3%
 pi(C) =   1.3%
 pi(Q) =   2.7%
 pi(E) =   5.6%
 pi(G) =  10.3%
 pi(H) =   1.5%
 pi(I) =   6.8%
 pi(L) =   7.8%
 pi(K) =   6.5%
 pi(M) =   2.7%
 pi(F) =   4.5%
 pi(P) =   4.3%
 pi(S) =   6.1%
 pi(T) =   6.1%
 pi(W) =   1.0%
 pi(Y) =   1.8%
 pi(V) =   7.6%


RATE HETEROGENEITY

Model of rate heterogeneity: Gamma distributed rates
Gamma distribution parameter alpha (estimated from data set): 1.58 (S.E. 0.13)
Number of Gamma rate categories: 8

Rates and their respective probabilities used in the likelihood function:

 Category  Relative rate  Probability
  1         0.1497         0.1250
  2         0.3375         0.1250
  3         0.5172         0.1250
  4         0.7116         0.1250
  5         0.9380         0.1250
  6         1.2245         0.1250
  7         1.6383         0.1250
  8         2.4833         0.1250

Categories 1-8 approximate a continuous Gamma distribution with
expectation 1 and variance 0.63.
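
The two summary figures quoted for the rate categories (expectation 1, variance 0.63) can be checked by hand. The sketch below is an illustration only, not part of the PUZZLE output: it copies the eight relative rates and alpha from the table above and assumes the standard parameterization in which a Gamma distribution with shape alpha and mean 1 has variance 1/alpha. The function names are hypothetical.

```python
# Values copied from the RATE HETEROGENEITY section of this report.
ALPHA = 1.58
RATES = [0.1497, 0.3375, 0.5172, 0.7116, 0.9380, 1.2245, 1.6383, 2.4833]
PROBS = [0.1250] * len(RATES)


def discrete_mean(rates, probs):
    """Expectation of the discrete rate distribution."""
    return sum(r * p for r, p in zip(rates, probs))


def discrete_variance(rates, probs):
    """Variance of the discrete rate distribution."""
    mean = discrete_mean(rates, probs)
    return sum(p * (r - mean) ** 2 for r, p in zip(rates, probs))


def continuous_variance(alpha):
    """Variance of a Gamma distribution with shape alpha and mean 1
    (assumed parameterization: shape = rate = alpha)."""
    return 1.0 / alpha


if __name__ == "__main__":
    print(f"mean of the 8 categories:     {discrete_mean(RATES, PROBS):.4f}")
    print(f"variance of the 8 categories: {discrete_variance(RATES, PROBS):.4f}")
    print(f"continuous variance 1/alpha:  {continuous_variance(ALPHA):.4f}")
```

The reported variance of 0.63 matches 1/alpha for the continuous distribution. The variance of the eight categories themselves comes out smaller, as one would expect when a continuous distribution is summarized by a handful of category rates.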

Combination of categories that contributes the most to the likelihood
(computation done without clock assumption assuming quartet-puzzling tree):

 1 4 4 2 1 8 4 8 1 1 2 1 8 8 1 2 1 1 2 4 7 8 6 8 8 8 8 6 8 8
 8 8 8 8 8 7 3 8 8 7 5 8 8 3 8 8 8 6 7 6 8 6 8 3 8 8 8 2 8 6
 1 4 6 6 2 8 8 8 5 8 8 4 4 6 5 7 2 7 4 1 2 1 5 2 1 1 1 3 4 2
 3 2 8 4 4 1 4 7 5 2 6 5 2 6 6 1 4 2 6 5 2 3 7 4 5 7 4 6 4 3
 6 7 1 8 6 7 4 8 4 1 5 7 2 7 7 4 5 1 5 1 1 3 4 8 6 8 1 8 5 5
 2 7 7 5 7 6 5 4 1 8 8 5 2 3 3 2 1 3 1 1 1 8 8 1 8 4 8 2 2 8
 1 8 4 1 4 8 4 4 1 8 4 2 6 6 7 4 6 5 6 5 1 6 3 5 5 4 4 1 2 2
 2 1 3 4 1 2 6 7 4 5 5 3 6 7 6 7 4 3 6 8 7 7 5 2 5 1 2 2 1 2
 1 7 7 4 8 8 6 6 5 8 1 6 5 4 2 3 1 2 1 2 2 2 6 6 4 2 2 4 1 1
 7 1 7 3 1 6 7 2 2 1 1 1 5 1 3 1 1 6 2 6 1 5 4 3 7 7 4 4 8 7
 8 3 5 2 8 7 4 8 8 7 8 6 8 8 7 3 1 4 3 1 5 1 7 2 4 2 3 2 5 3
 3 5 2 3 3 1 3 7 2 1 4 4 3 1 1 6 4 2 8 3 1 1 1 7 5 5 7 8 2 1
 5 8 5 8 8 1 4 8 8 8 7 8 8 8 8 8 8 8 5 6 8 4 5 2 7 4 1 6 1 1
 8 1 8 1


SEQUENCES IN INPUT ORDER

              5% chi-square test  p-value
 Saccharomy        passed          83.68%   [43]
 Saccharom2        passed          63.59%   [54]
 Musmusculu        passed          68.94%   [37]
 Rattusnorv        passed          50.10%   [42]
 Deinococcu        failed           1.45%   [77]
 Bacillus4         passed          86.19%   [134]
 Rhodopseud        passed          10.38%   [93]
 Bacillus2         passed          23.19%   [79]
 Bacillus3         passed          17.61%   [89]
 Sphingomon        passed          47.30%   [64]
 Thermotoga        passed          17.46%   [96]
 Archaeoglo        passed          40.35%   [102]
 Chlamydia         passed          12.30%   [84]
 Homosapien        passed          93.50%   [46]
 Homosapie2        passed          93.48%   [43]
 Rattusnor2        passed          97.76%   [45]
 Rattusnor3        passed          97.99%   [48]
 Celegans          passed          92.24%   [39]
 Celegans2         passed          61.82%   [40]
 Celegans3         passed          57.69%   [38]
 Bacillus          passed          64.16%   [127]

The chi-square test compares the amino acid composition of each sequence
to the frequency distribution assumed in the maximum likelihood model.
The number in square brackets indicates how often each sequence is
involved in one of the 355 completely unresolved quartets of the
quartet puzzling tree search.


IDENTICAL SEQUENCES

The sequences in each of the following groups are all identical.
To speed up computation please remove all but one of each group
from the data set.

 All sequences are unique.


MAXIMUM LIKELIHOOD DISTANCES

Maximum likelihood distances are computed using the selected model of
substitution and rate heterogeneity.

  21
Saccharomy  0.00000  0.00383  1.64751  1.61490  1.51381  2.59169  3.19939  2.86749  2.93381  3.01664  2.69031  2.91680  2.57336  2.73165  2.71181  2.73373  2.66841  2.73536  2.60107  2.62023  2.09524
Saccharom2  0.00383  0.00000  1.60215  1.55322  1.34176  2.53202  3.29550  2.79520  2.86512  2.97771  2.73491  2.93652  2.74176  2.43481  2.41390  2.42595  2.36314  2.46666  2.37386  2.31791  1.96442
Musmusculu  1.64751  1.60215  0.00000  0.10430  1.80642  2.97521  3.61245  2.66386  2.68383  2.97031  2.84101  2.75554  2.40480  2.53710  2.50583  2.77056  2.70083  2.24327  2.30001  2.24178  2.00413
Rattusnorv  1.61490  1.55322  0.10430  0.00000  1.76093  2.86204  3.42025  2.69292  2.71334  3.05125  2.68334  2.75427  2.46597  2.60402  2.57173  2.78830  2.71797  2.38558  2.33068  2.23731  1.98334
Deinococcu  1.51381  1.34176  1.80642  1.76093  0.00000  2.56411  2.07153  2.32417  2.35507  1.89550  2.00318  2.98385  1.95272  2.90774  2.95436  2.81328  2.75466  2.63374  2.49664  2.27617  1.76011
Bacillus4   2.59169  2.53202  2.97521  2.86204  2.56411  0.00000  2.38644  2.29368  2.42257  2.44890  2.24959  2.29987  2.60464  3.18829  3.21688  3.28612  3.20236  2.99505  2.98508  2.92934  3.10666
Rhodopseud  3.19939  3.29550  3.61245  3.42025  2.07153  2.38644  0.00000  2.11450  2.15486  1.85669  2.23886  1.98698  1.92172  2.98110  3.05399  2.97669  2.90499  2.88416  2.70755  2.65945  2.28042
Bacillus2   2.86749  2.79520  2.66386  2.69292  2.32417  2.29368  2.11450  0.00000  0.07900  1.51278  1.72353  1.59265  2.04024  2.98416  2.95725  2.96731  2.89888  2.67414  2.62032  2.66397  1.91595
Bacillus3   2.93381  2.86512  2.68383  2.71334  2.35507  2.42257  2.15486  0.07900  0.00000  1.49536  1.80206  1.70391  1.99964  2.82122  2.79584  2.85050  2.78486  2.63466  2.63004  2.60589  1.90977
Sphingomon  3.01664  2.97771  2.97031  3.05125  1.89550  2.44890  1.85669  1.51278  1.49536  0.00000  2.08956  2.02474  2.03186  3.11938  3.17529  3.16650  3.08532  3.37708  3.09598  2.97137  2.50089
Thermotoga  2.69031  2.73491  2.84101  2.68334  2.00318  2.24959  2.23886  1.72353  1.80206  2.08956  0.00000  1.46215  1.89419  2.43167  2.41693  2.38699  2.34857  2.57757  2.38466  2.31630  2.09803
Archaeoglo  2.91680  2.93652  2.75554  2.75427  2.98385  2.29987  1.98698  1.59265  1.70391  2.02474  1.46215  0.00000  2.21496  2.87818  2.90117  3.03516  2.95446  2.60086  2.50883  2.44413  2.34809
Chlamydia   2.57336  2.74176  2.40480  2.46597  1.95272  2.60464  1.92172  2.04024  1.99964  2.03186  1.89419  2.21496  0.00000  2.42004  2.47831  2.54951  2.50281  2.90409  2.83464  2.75036  2.22492
Homosapien  2.73165  2.43481  2.53710  2.60402  2.90774  3.18829  2.98110  2.98416  2.82122  3.11938  2.43167  2.87818  2.42004  0.00000  0.01224  0.21796  0.20228  1.23955  1.14970  1.02075  1.49561
Homosapie2  2.71181  2.41390  2.50583  2.57173  2.95436  3.21688  3.05399  2.95725  2.79584  3.17529  2.41693  2.90117  2.47831  0.01224  0.00000  0.23621  0.22013  1.26754  1.17631  1.04488  1.54245
Rattusnor2  2.73373  2.42595  2.77056  2.78830  2.81328  3.28612  2.97669  2.96731  2.85050  3.16650  2.38699  3.03516  2.54951  0.21796  0.23621  0.00000  0.01215  1.33997  1.18604  1.12349  1.56493
Rattusnor3  2.66841  2.36314  2.70083  2.71797  2.75466  3.20236  2.90499  2.89888  2.78486  3.08532  2.34857  2.95446  2.50281  0.20228  0.22013  0.01215  0.00000  1.32248  1.17079  1.10895  1.55247
Celegans    2.73536  2.46666  2.24327  2.38558  2.63374  2.99505  2.88416  2.67414  2.63466  3.37708  2.57757  2.60086  2.90409  1.23955  1.26754  1.33997  1.32248  0.00000  0.16228  0.47713  1.62667
Celegans2   2.60107  2.37386  2.30001  2.33068  2.49664  2.98508  2.70755  2.62032  2.63004  3.09598  2.38466  2.50883  2.83464  1.14970  1.17631  1.18604  1.17079  0.16228  0.00000  0.42020  1.54840
Celegans3   2.62023  2.31791  2.24178  2.23731  2.27617  2.92934  2.65945  2.66397  2.60589  2.97137  2.31630  2.44413  2.75036  1.02075  1.04488  1.12349  1.10895  0.47713  0.42020  0.00000  1.40679
Bacillus    2.09524  1.96442  2.00413  1.98334  1.76011  3.10666  2.28042  1.91595  1.90977  2.50089  2.09803  2.34809  2.22492  1.49561  1.54245  1.56493  1.55247  1.62667  1.54840  1.40679  0.00000

Average distance (over all possible pairs of sequences): 2.28024


TREE SEARCH

Quartet puzzling is used to choose from the possible tree topologies
and to simultaneously infer support values for internal branches.

Number of puzzling steps: 1000
Analysed quartets: 5985
Unresolved quartets: 355 (= 5.9%)

Quartet trees are based on approximate maximum likelihood values
using the selected model of substitution and rate heterogeneity.


QUARTET PUZZLING TREE

Support for the internal branches of the unrooted quartet puzzling
tree topology is shown in percent.

This quartet puzzling tree is not completely resolved!


                            :---Musmusculu
        :-----------------99:
        :                   :---Rattusnorv
        :
        :                   :---Bacillus2
        :               :-94:
        :               :   :---Bacillus3
        :               :
        :               :   :---Thermotoga
        :               :-79:
    :-65:               :   :---Archaeoglo
    :   :   :---------92:
    :   :   :           :-------Bacillus4
    :   :   :           :
    :   :   :           :-------Rhodopseud
    :   :   :           :
    :   :   :           :-------Sphingomon
    :   :   :           :
    :   :   :           :-------Chlamydia
    :   :   :
    :   :-93:               :---Rattusnor2
    :       :           :100:
    :       :           :   :---Rattusnor3
:-83:       :       :-98:
:   :       :       :   :   :---Homosapien
:   :       :       :   :-99:
:   :       :   :-90:       :---Homosapie2
:   :       :   :   :
:   :       :   :   :       :---Celegans
:   :       :   :   :   :-98:
:   :       :-88:   :-94:   :---Celegans2
:   :           :       :
:   :           :       :-------Celegans3
:   :           :
:   :           :---------------Bacillus
:   :
:   :---------------------------Deinococcu
:
:-------------------------------Saccharom2
:
:-------------------------------Saccharomy


Quartet puzzling tree (in CLUSTAL W notation):

(Saccharomy,(((Musmusculu,Rattusnorv)99,(((Bacillus2,Bacillus3)94,
(Thermotoga,Archaeoglo)79,Bacillus4,Rhodopseud,Sphingomon,
Chlamydia)92,((((Rattusnor2,Rattusnor3)100,(Homosapien,Homosapie2)99)98,
((Celegans,Celegans2)98,Celegans3)94)90,Bacillus)88)93)65,
Deinococcu)83,Saccharom2);


BIPARTITIONS

The following bipartitions occurred at least once in all intermediate
trees that have been generated in the 1000 puzzling steps:

Bipartitions included in the quartet puzzling tree:
(bipartition with sequences in input order : number of times seen)

 ********** *****..*** *  :  997
 **..****** ********** *  :  990
 ********** ***..***** *  :  986
 ********** *******..* *  :  984
 ********** ***....*** *  :  979
 *******..* ********** *  :  937
 ********** *******... *  :  935
 *****..... .......... .  :  934
 *****..... ...******* *  :  924
 ********** ***....... *  :  903
 ********** ***....... .  :  882
 **........ .......... .  :  830
 ********** ..******** *  :  794
 **..*..... .......... .  :  648

Bipartitions not included in the quartet puzzling tree:
(bipartition with sequences in input order : number of times seen)

 *******... ********** *  :  414
 ******.*** **.******* *  :  363
 ****...... .......... .  :  336
 ******.**. ********** *  :  304
 *****..... ..******** *  :  247
 *****..*** ********** *  :  239
 *****.**** ..******** *  :  237
 *****..*** **.******* *  :  211
 *****..... **.******* *  :  200
 ******.**. **.******* *  :  197
 *******... ..******** *  :  195
 ******.... **.******* *  :  195
 ******.... ********** *  :  193
 *******..* ..******** *  :  145
 *.**.***** ********** *  :  128
 ******.... ...******* *  :  122
 *****..**. **.******* *  :  119
 *****..... ********** *  :  106
 *****..**. ********** *  :   85
 ********** *******... .  :   72

(128 other less frequent bipartitions not shown)


MAXIMUM LIKELIHOOD BRANCH LENGTHS ON QUARTET PUZZLING TREE (NO CLOCK)

Branch lengths are computed using the selected model of
substitution and rate heterogeneity.


                 :-3 Musmusculu
         :-------22
         :       :-4 Rattusnorv
       :-34
       : :                   :-8 Bacillus2
       : :           :-------23
       : :           :       :-9 Bacillus3
       : :           :
       : :           :   :-------11 Thermotoga
       : :           :---24
       : :           :   :------12 Archaeoglo
       : :           :
       : :           :------------6 Bacillus4
       : :    :------25
       : :    :      :-----------7 Rhodopseud
       : :    :      :
       : :    :      :---------10 Sphingomon
       : :    :      :
       : :    :      :-----------13 Chlamydia
       : :----33
       :      :               :-16 Rattusnor2
       :      :             :-26
       :      :             : :-17 Rattusnor3
       :      :        :----28
       :      :        :    : :-14 Homosapien
       :      :        :    :-27
       :      :        :      :-15 Homosapie2
       :      :   :----31
       :      :   :    :      :-18 Celegans
       :      :   :    :   :--29
       :      :   :    :   :  :-19 Celegans2
       :      :   :    :---30
       :      :   :        :--20 Celegans3
       :      :---32
       :          :------21 Bacillus
:------35
:      :------5 Deinococcu
:
:-2 Saccharom2
:
:-1 Saccharomy


         branch  length     S.E.   branch  length     S.E.
Saccharomy    1  0.00433  0.00435      22  0.86848  0.12095
Saccharom2    2  0.00001  0.00050      23  0.92292  0.12788
Musmusculu    3  0.06731  0.02027      24  0.28083  0.09287
Rattusnorv    4  0.04350  0.01885      25  0.69265  0.13091
Deinococcu    5  0.77094  0.11117      26  0.11921  0.02628
Bacillus4     6  1.61824  0.20014      27  0.08638  0.02479
Rhodopseud    7  1.43151  0.18193      28  0.47396  0.08033
Bacillus2     8  0.01931  0.01566      29  0.23457  0.04299
Bacillus3     9  0.06958  0.01980      30  0.40505  0.07578
Sphingomon   10  1.14542  0.15155      31  0.49808  0.10524
Thermotoga   11  0.85880  0.12502      32  0.30822  0.10297
Archaeoglo   12  0.77007  0.11660      33  0.45209  0.11756
Chlamydia    13  1.41970  0.17639      34  0.12532  0.07156
Homosapien   14  0.00001  0.00033      35  0.78912  0.11429
Homosapie2   15  0.01244  0.00624
Rattusnor2   16  0.01235  0.00620
Rattusnor3   17  0.00001  0.00035
Celegans     18  0.12114  0.02465
Celegans2    19  0.05270  0.01857
Celegans3    20  0.16470  0.03921     14 iterations until convergence
Bacillus     21  0.69463  0.11077     log L: -9237.04

WARNING --- at least one branch length is close to an internal boundary!


Quartet puzzling tree with maximum likelihood branch lengths
(in CLUSTAL W notation):

(Saccharomy:0.00433,(((Musmusculu:0.06731,Rattusnorv:0.04350)99:0.86848,
(((Bacillus2:0.01931,Bacillus3:0.06958)94:0.92292,(Thermotoga:0.85880,
Archaeoglo:0.77007)79:0.28083,Bacillus4:1.61824,Rhodopseud:1.43151,
Sphingomon:1.14542,Chlamydia:1.41970)92:0.69265,((((Rattusnor2:0.01235,
Rattusnor3:0.00001)100:0.11921,(Homosapien:0.00001,Homosapie2:0.01244)
99:0.08638)98:0.47396,((Celegans:0.12114,Celegans2:0.05270)98:0.23457,
Celegans3:0.16470)94:0.40505)90:0.49808,Bacillus:0.69463)88:0.30822)
93:0.45209)65:0.12532,Deinococcu:0.77094)83:0.78912,Saccharom2:0.00001);


TIME STAMP

Date and time: Tue Dec 14 17:06:27 1999
Runtime: 5016 seconds (= 83.6 minutes = 1.4 hours)
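
The report header gives the rule for an approximate 95% confidence interval (x-1.96*S.E. to x+1.96*S.E.). As a minimal sketch of applying it to the branch-length table, the Python below uses three (length, S.E.) pairs copied from that table. The function and dictionary names are hypothetical and not part of PUZZLE.

```python
Z_95 = 1.96  # multiplier quoted in the report header


def confidence_interval(value, se, z=Z_95):
    """Return (lower, upper) bounds of the approximate 95% interval."""
    return value - z * se, value + z * se


# (branch length, S.E.) pairs copied from the branch-length table.
BRANCHES = {
    "6 (Bacillus4)": (1.61824, 0.20014),
    "22 (internal)": (0.86848, 0.12095),
    "14 (Homosapien)": (0.00001, 0.00033),
}

if __name__ == "__main__":
    for name, (length, se) in BRANCHES.items():
        lower, upper = confidence_interval(length, se)
        print(f"branch {name}: {length:.5f}  [{lower:.5f}, {upper:.5f}]")
```

For very short branches such as branch 14 the lower bound falls below zero. A branch length cannot be negative, so a reader may reasonably truncate such intervals at 0. The report does not say whether this is related to its boundary warning.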