PUZZLE 4.0.2

Type of analysis: tree reconstruction
Parameter estimation: approximate (faster)
Parameter estimation uses: neighbor-joining tree (for substitution process and rate variation)

Standard errors (S.E.) are obtained by the curvature method.
The lower and upper bounds of an approximate 95% confidence interval
for parameter or branch length x are x-1.96*S.E. and x+1.96*S.E.


SEQUENCE ALIGNMENT

Input data: 19 sequences with 854 amino acid sites
Number of constant sites: 4 (= 0.5% of all sites)


SUBSTITUTION PROCESS

Model of substitution: JTT (Jones et al. 1992)
Amino acid frequencies (estimated from data set):

 pi(A) =  10.6%
 pi(R) =   4.9%
 pi(N) =   3.7%
 pi(D) =   5.0%
 pi(C) =   0.8%
 pi(Q) =   3.2%
 pi(E) =   5.9%
 pi(G) =   8.9%
 pi(H) =   3.0%
 pi(I) =   5.5%
 pi(L) =   9.7%
 pi(K) =   4.4%
 pi(M) =   1.5%
 pi(F) =   4.3%
 pi(P) =   5.0%
 pi(S) =   6.3%
 pi(T) =   6.4%
 pi(W) =   0.9%
 pi(Y) =   2.9%
 pi(V) =   7.1%


RATE HETEROGENEITY

Model of rate heterogeneity: uniform rate


SEQUENCES IN INPUT ORDER

             5% chi-square test   p-value
 Emericilla       passed           54.79%   [15]
 Schizosacc       passed           94.72%   [15]
 Candida1         passed            5.52%   [17]
 Candida2         failed            4.45%   [16]
 Deinococcu       failed            0.91%   [22]
 Leptospira       passed           49.12%   [15]
 Thermotoga       passed           36.27%   [18]
 Mycobacter       passed            5.32%   [29]
 Deincoccus       failed            0.30%   [20]
 Thermus          failed            0.00%   [23]
 Trichomona       failed            0.67%   [38]
 Arabadopsi       failed            0.34%   [35]
 Pyrococcus       failed            0.46%   [36]
 Bacillus         passed           43.67%   [18]
 Helicobact       passed           85.44%   [28]
 Herpetosip       passed           71.16%   [26]
 Haemophilu       passed           67.08%   [27]
 Pseudomona       passed            8.81%   [30]
 Bradyrhizo       passed            7.80%   [88]

The chi-square test compares the amino acid composition of each sequence
to the frequency distribution assumed in the maximum likelihood model.

The number in square brackets indicates how often each sequence is
involved in one of the 129 completely unresolved quartets of the
quartet puzzling tree search.


IDENTICAL SEQUENCES

The sequences in each of the following groups are all identical.
To speed up computation please remove all but one of each group from
the data set.

 All sequences are unique.


MAXIMUM LIKELIHOOD DISTANCES

Maximum likelihood distances are computed using the selected model of
substitution and rate heterogeneity.

  19
Emericilla  0.00000  0.42975  0.47799  0.47663  0.60928  0.72495  0.71093
            0.78271  1.14230  1.29890  1.37835  1.57266  1.70070  1.68444
            1.45158  1.41941  1.40595  1.33365  3.45763
Schizosacc  0.42975  0.00000  0.51559  0.51900  0.67351  0.73073  0.64579
            0.77824  1.26316  1.31886  1.32467  1.56230  1.66665  1.56984
            1.41309  1.33357  1.38001  1.19950  3.68339
Candida1    0.47799  0.51559  0.00000  0.00228  0.73585  0.83333  0.83590
            0.88652  1.34772  1.32364  1.34067  1.54508  1.70577  1.56181
            1.38815  1.44241  1.37197  1.33456  3.49079
Candida2    0.47663  0.51900  0.00228  0.00000  0.73508  0.83250  0.83595
            0.89053  1.34816  1.32805  1.34067  1.54508  1.70577  1.56181
            1.38815  1.44241  1.37197  1.33456  3.49079
Deinococcu  0.60928  0.67351  0.73585  0.73508  0.00000  0.81731  0.82514
            0.81383  1.13450  1.18044  1.47760  2.01241  1.59741  1.52544
            1.38398  1.39383  1.49380  1.28569  3.85026
Leptospira  0.72495  0.73073  0.83333  0.83250  0.81731  0.00000  0.64036
            0.68359  1.33107  1.30472  1.30825  1.58849  1.47488  1.55107
            1.29288  1.35790  1.43828  1.23972  3.22575
Thermotoga  0.71093  0.64579  0.83590  0.83595  0.82514  0.64036  0.00000
            0.71100  1.30360  1.34451  1.23632  1.56726  1.59511  1.47428
            1.39239  1.27291  1.20129  1.27252  3.79569
Mycobacter  0.78271  0.77824  0.88652  0.89053  0.81383  0.68359  0.71100
            0.00000  1.25897  1.44325  1.30084  1.44509  1.66950  1.42897
            1.24655  1.25357  1.31396  1.15240  3.40519
Deincoccus  1.14230  1.26316  1.34772  1.34816  1.13450  1.33107  1.30360
            1.25897  0.00000  1.06266  1.47359  1.55962  1.62605  1.48341
            1.46011  1.59615  1.67336  1.35443  3.83810
Thermus     1.29890  1.31886  1.32364  1.32805  1.18044  1.30472  1.34451
            1.44325  1.06266  0.00000  1.53789  1.80974  1.45336  1.50849
            1.62128  1.63275  1.57315  1.36093  3.67194
Trichomona  1.37835  1.32467  1.34067  1.34067  1.47760  1.30825  1.23632
            1.30084  1.47359  1.53789  0.00000  1.21351  1.24118  1.03596
            1.01934  1.08522  1.14925  1.28657  3.36106
Arabadopsi  1.57266  1.56230  1.54508  1.54508  2.01241  1.58849  1.56726
            1.44509  1.55962  1.80974  1.21351  0.00000  1.37515  1.29782
            1.26264  1.24158  1.37220  1.29031  4.48077
Pyrococcus  1.70070  1.66665  1.70577  1.70577  1.59741  1.47488  1.59511
            1.66950  1.62605  1.45336  1.24118  1.37515  0.00000  1.42670
            1.33838  1.42585  1.20768  1.44389  4.37924
Bacillus    1.68444  1.56984  1.56181  1.56181  1.52544  1.55107  1.47428
            1.42897  1.48341  1.50849  1.03596  1.29782  1.42670  0.00000
            0.81224  1.04470  1.00607  1.34671  3.15516
Helicobact  1.45158  1.41309  1.38815  1.38815  1.38398  1.29288  1.39239
            1.24655  1.46011  1.62128  1.01934  1.26264  1.33838  0.81224
            0.00000  0.75093  0.89759  1.29219  3.52051
Herpetosip  1.41941  1.33357  1.44241  1.44241  1.39383  1.35790  1.27291
            1.25357  1.59615  1.63275  1.08522  1.24158  1.42585  1.04470
            0.75093  0.00000  0.99639  1.16484  3.50540
Haemophilu  1.40595  1.38001  1.37197  1.37197  1.49380  1.43828  1.20129
            1.31396  1.67336  1.57315  1.14925  1.37220  1.20768  1.00607
            0.89759  0.99639  0.00000  1.29370  3.50255
Pseudomona  1.33365  1.19950  1.33456  1.33456  1.28569  1.23972  1.27252
            1.15240  1.35443  1.36093  1.28657  1.29031  1.44389  1.34671
            1.29219  1.16484  1.29370  0.00000  3.36523
Bradyrhizo  3.45763  3.68339  3.49079  3.49079  3.85026  3.22575  3.79569
            3.40519  3.83810  3.67194  3.36106  4.48077  4.37924  3.15516
            3.52051  3.50540  3.50255  3.36523  0.00000

Average distance (over all possible pairs of sequences):  1.49785


TREE SEARCH

Quartet puzzling is used to choose from the possible tree topologies
and to simultaneously infer support values for internal branches.

Number of puzzling steps: 1000
Analysed quartets: 3876
Unresolved quartets: 129 (= 3.3%)

Quartet trees are based on approximate maximum likelihood values
using the selected model of substitution and rate heterogeneity.


QUARTET PUZZLING TREE

Support for the internal branches of the unrooted quartet puzzling
tree topology is shown in percent.

This quartet puzzling tree is not completely resolved!

                            :---Candida1
    :--------------------100:
    :                       :---Candida2
    :
    :                       :---Deincoccus
    :           :---------99:
    :           :           :---Thermus
    :           :
    :           :           :---Arabadopsi
    :           :       :-69:
    :           :   :-53:   :---Pyrococcus
    :       :-94:   :   :
    :       :   :   :   :-------Trichomona
:-66:       :   :   :
:   :       :   :   :-----------Bacillus
:   :       :   :   :
:   :       :   :   :-----------Helicobact
:   :       :   :-89:
:   :       :       :-----------Herpetosip
:   :   :-90:       :
:   :   :   :       :-----------Haemophilu
:   :   :   :       :
:   :   :   :       :-----------Pseudomona
:   :   :   :       :
:   :   :   :       :-----------Bradyrhizo
:   :-83:   :
:       :   :               :---Leptospira
:       :   :           :-61:
:       :   :---------61:   :---Thermotoga
:       :               :
:       :               :-------Mycobacter
:       :
:       :-----------------------Deinococcu
:
:-------------------------------Schizosacc
:
:-------------------------------Emericilla


Quartet puzzling tree (in CLUSTAL W notation):

(Emericilla,((Candida1,Candida2)100,((((Deincoccus,Thermus)99,
(((Arabadopsi,Pyrococcus)69,Trichomona)53,Bacillus,Helicobact,
Herpetosip,Haemophilu,Pseudomona,Bradyrhizo)89)94,((Leptospira,
Thermotoga)61,Mycobacter)61)90,Deinococcu)83)66,Schizosacc);


BIPARTITIONS

The following bipartitions occurred at least once in all intermediate
trees that have been generated in the 1000 puzzling steps:

Bipartitions included in the quartet puzzling tree:
(bipartition with sequences in input order : number of times seen)

 **..****** *********  :  1000
 ********.. *********  :   985
 ********.. .........  :   937
 *****..... .........  :   903
 ********** .........  :   891
 ****...... .........  :   831
 ********** *..******  :   693
 **........ .........  :   661
 *****...** *********  :   614
 *****..*** *********  :   606
 ********** ...******  :   526

Bipartitions not included in the quartet puzzling tree:
(bipartition with sequences in input order : number of times seen)

 ********** ****..***  :   462
 ********** .......*.  :   449
 ********** .......**  :   441
 ********** *******..  :   415
 ********** ***....**  :   409
 ********** ***...***  :   347
 *******... .........  :   279
 ********** ***....*.  :   272
 ********** ***.****.  :   267
 ********** ***..****  :   261
 ********** ..*******  :   252
 *.**...... .........  :   200
 ******..** *********  :   196
 ********** ***...**.  :   150
 *****.*.** *********  :   143
 ********** ***.**.**  :   138
 ********** .**....*.  :   132
 ********** ****...**  :   126
 *...*..... .........  :   118
 ********** ........*  :   103

(185 other less frequent bipartitions not shown)


MAXIMUM LIKELIHOOD BRANCH LENGTHS ON QUARTET PUZZLING TREE (NO CLOCK)

Branch lengths are computed using the selected model of
substitution and rate heterogeneity.

     :-3 Candida1
  :--20
  :  :-4 Candida2
:-30
: :         :----9 Deincoccus
: :      :--21
: :      :  :----10 Thermus
: :   :--25
: :   :  :        :-----12 Arabadopsi
: :   :  :      :-22
: :   :  :      : :-----13 Pyrococcus
: :   :  :   :--23
: :   :  :   :  :---11 Trichomona
: :   :  :   :
: :   :      :----14 Bacillus
: :   :      :
: :   :      :---15 Helicobact
: :   :   ---24
: :   :      :---16 Herpetosip
: :   :      :
: :   :      :----17 Haemophilu
: :   :      :
: :   :      :-----18 Pseudomona
: :   :      :
: :   :      :---------------------19 Bradyrhizo
: : :-28
: : : :   :--6 Leptospira
: : : : :-26
: : : : : :--7 Thermotoga
: : : :-27
: : :   :---8 Mycobacter
: :-29
:   :---5 Deinococcu
:
:--2 Schizosacc
:
:--1 Emericilla

            branch   length     S.E.   branch   length     S.E.
Emericilla       1  0.20001  0.02713       20  0.30447  0.03385
Schizosacc       2  0.23313  0.02872       21  0.25449  0.04840
Candida1         3  0.00001  0.00030       22  0.14236  0.04436
Candida2         4  0.00228  0.00228       23  0.21059  0.04194
Deinococcu       5  0.40591  0.04096       24  0.46003  0.05652
Leptospira       6  0.32664  0.03675       25  0.23763  0.04241
Thermotoga       7  0.32959  0.03656       26  0.05109  0.01979
Mycobacter       8  0.39864  0.04047       27  0.08198  0.02460
Deincoccus       9  0.54827  0.05767       28  0.09974  0.02633
Thermus         10  0.54272  0.05757       29  0.09830  0.02404
Trichomona      11  0.49580  0.05256       30  0.04118  0.01609
Arabadopsi      12  0.71905  0.06826
Pyrococcus      13  0.70398  0.06988
Bacillus        14  0.60791  0.05493
Helicobact      15  0.47953  0.04843
Herpetosip      16  0.48263  0.05236
Haemophilu      17  0.58082  0.05489
Pseudomona      18  0.82382  0.06784     9 iterations until convergence
Bradyrhizo      19  3.36536  0.29293     log L: -16333.70

WARNING --- at least one branch length is close to an internal boundary!

Quartet puzzling tree with maximum likelihood branch lengths
(in CLUSTAL W notation):

(Emericilla:0.20001,((Candida1:0.00001,Candida2:0.00228)100:0.30447,
((((Deincoccus:0.54827,Thermus:0.54272)99:0.25449,(((Arabadopsi:0.71905,
Pyrococcus:0.70398)69:0.14236,Trichomona:0.49580)53:0.21059,Bacillus:0.60791,
Helicobact:0.47953,Herpetosip:0.48263,Haemophilu:0.58082,Pseudomona:0.82382,
Bradyrhizo:3.36536)89:0.46003)94:0.23763,((Leptospira:0.32664,Thermotoga:0.32959)
61:0.05109,Mycobacter:0.39864)61:0.08198)90:0.09974,Deinococcu:0.40591)
83:0.09830)66:0.04118,Schizosacc:0.23313);


TIME STAMP

Date and time: Thu Mar 30 19:17:17 2000
Runtime: 657 seconds (= 10.9 minutes = 0.2 hours)
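
NOTE ON CONFIDENCE INTERVALS

The header of this output gives the approximate 95% confidence interval
for a branch length x as x-1.96*S.E. to x+1.96*S.E. The following is a
minimal sketch of that arithmetic for a few branches copied from the
branch length table. It is not part of PUZZLE; the names Z_95, branches
and confidence_interval are hypothetical and chosen for illustration only.

```python
# Approximate 95% confidence interval as stated in the output header:
# lower = x - 1.96*S.E., upper = x + 1.96*S.E.
Z_95 = 1.96

# (label, branch length, S.E.) copied from the branch length table.
branches = [
    ("Emericilla", 0.20001, 0.02713),
    ("Candida1", 0.00001, 0.00030),
    ("Bradyrhizo", 3.36536, 0.29293),
    ("branch 30", 0.04118, 0.01609),
]


def confidence_interval(x, se, z=Z_95):
    """Return (lower, upper) bounds of the symmetric interval x +/- z*se."""
    return x - z * se, x + z * se


for name, length, se in branches:
    lower, upper = confidence_interval(length, se)
    # The symmetric approximation can produce a negative lower bound
    # for very short branches, although a branch length cannot be negative.
    note = "  (lower bound below zero)" if lower < 0 else ""
    print(f"{name:<10} {length:.5f}  [{lower:.5f}, {upper:.5f}]{note}")
```

For very short branches such as Candida1 the symmetric interval extends
below zero, so the lower bound is best read as zero in that case.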