PUZZLE 4.0.2

Type of analysis: tree reconstruction
Parameter estimation: approximate (faster)
Parameter estimation uses: neighbor-joining tree (for substitution process and rate variation)

Standard errors (S.E.) are obtained by the curvature method.
The upper and lower bounds of an approximate 95% confidence interval
for parameter or branch length x are x-1.96*S.E. and x+1.96*S.E.


SEQUENCE ALIGNMENT

Input data: 17 sequences with 644 amino acid sites
Number of constant sites: 35 (= 5.4% of all sites)


SUBSTITUTION PROCESS

Model of substitution: JTT (Jones et al. 1992)
Amino acid frequencies (estimated from data set):

 pi(A) =   8.5%
 pi(R) =   7.6%
 pi(N) =   3.1%
 pi(D) =   6.1%
 pi(C) =   1.0%
 pi(Q) =   3.2%
 pi(E) =   6.2%
 pi(G) =   8.6%
 pi(H) =   2.2%
 pi(I) =   4.8%
 pi(L) =   9.6%
 pi(K) =   4.8%
 pi(M) =   1.2%
 pi(F) =   3.4%
 pi(P) =   6.5%
 pi(S) =   5.6%
 pi(T) =   5.5%
 pi(W) =   1.4%
 pi(Y) =   2.0%
 pi(V) =   8.9%


RATE HETEROGENEITY

Model of rate heterogeneity: uniform rate


SEQUENCES IN INPUT ORDER

              5% chi-square test  p-value
 1Deinococc        passed          24.03%   [3]
 2Deinococc        passed          26.17%   [6]
 M.tubercul        passed          90.28%   [7]
 M.leprae          passed          38.75%   [6]
 Streptomyc        passed          28.84%   [11]
 2M.tubercu        failed           0.07%   [6]
 2M.leprae         passed          20.88%   [6]
 Arabidopsi        passed          11.00%   [8]
 1Bos              passed          69.29%   [9]
 2Bos              passed          95.68%   [5]
 Rattus            passed          42.92%   [5]
 Mus               passed          42.40%   [7]
 Homo              passed          32.34%   [12]
 Drosophila        passed          94.55%   [6]
 Schizosacc        failed           0.04%   [5]
 Caenorhabd        failed           0.00%   [8]
 Saccharomy        failed           0.00%   [10]

The chi-square test compares the amino acid composition of each sequence
to the frequency distribution assumed in the maximum likelihood model.
The number in square brackets indicates how often each sequence is
involved in one of the 30 completely unresolved quartets of the
quartet puzzling tree search.


IDENTICAL SEQUENCES

The sequences in each of the following groups are all identical. To speed
up computation please remove all but one of each group from the data set.

 1Deinococc, 2Deinococc.
 1Bos, 2Bos.

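The composition test reported under SEQUENCES IN INPUT ORDER can be illustrated with a short Python sketch. It is not PUZZLE's own code: it assumes a plain Pearson chi-square statistic with 19 degrees of freedom (20 amino acids minus one), and the names `composition_chi2` and `passes_5pct` are hypothetical. The frequencies are the pi(...) values printed above; because they are rounded and sum to 100.2%, the sketch renormalizes them. The alignment itself is not part of this output, so the example sequences are made up.

```python
from collections import Counter

# Amino acid frequencies (percent) as printed under SUBSTITUTION PROCESS.
FREQ_PERCENT = {
    "A": 8.5, "R": 7.6, "N": 3.1, "D": 6.1, "C": 1.0,
    "Q": 3.2, "E": 6.2, "G": 8.6, "H": 2.2, "I": 4.8,
    "L": 9.6, "K": 4.8, "M": 1.2, "F": 3.4, "P": 6.5,
    "S": 5.6, "T": 5.5, "W": 1.4, "Y": 2.0, "V": 8.9,
}

# Upper 5% point of the chi-square distribution with 19 degrees of freedom.
CHI2_CRIT_5PCT_DF19 = 30.144


def composition_chi2(sequence, freq_percent=FREQ_PERCENT):
    """Pearson chi-square statistic of a sequence against model frequencies."""
    total = sum(freq_percent.values())  # renormalize the rounded percentages
    counts = Counter(ch for ch in sequence.upper() if ch in freq_percent)
    n = sum(counts.values())  # gaps and ambiguity codes are ignored
    if n == 0:
        raise ValueError("sequence contains no standard amino acids")
    stat = 0.0
    for aa, pct in freq_percent.items():
        expected = n * pct / total
        stat += (counts.get(aa, 0) - expected) ** 2 / expected
    return stat


def passes_5pct(sequence):
    """True if the composition is not rejected at the 5% level."""
    return composition_chi2(sequence) <= CHI2_CRIT_5PCT_DF19


if __name__ == "__main__":
    # Made-up sequences: one matching the model frequencies, one poly-alanine.
    balanced = "".join(aa * round(p * 10) for aa, p in FREQ_PERCENT.items())
    for label, seq in (("balanced", balanced), ("poly-A", "A" * 100)):
        verdict = "passed" if passes_5pct(seq) else "failed"
        print(f"{label:10s} chi2 = {composition_chi2(seq):9.3f}  {verdict}")
```

A sequence whose composition matches the model gives a statistic near zero, while a strongly biased one exceeds the critical value and would be reported as "failed".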
MAXIMUM LIKELIHOOD DISTANCES

Maximum likelihood distances are computed using the selected model of
substitution and rate heterogeneity.

  17
1Deinococc  0.00000  0.00000  0.85783  0.92251  1.04810  1.07096  1.07740  1.20454  1.19296  1.19296  1.18647  1.20738  1.14436  1.36411  1.45136  1.56271  1.70706
2Deinococc  0.00000  0.00000  0.87562  0.93468  1.01076  1.09478  1.10229  1.23673  1.23653  1.16588  1.21654  1.22756  1.17434  1.36242  1.44965  1.58646  1.64275
M.tubercul  0.85783  0.87562  0.00000  0.19850  1.21259  1.00594  1.04868  1.20461  1.20895  1.20895  1.20159  1.20413  1.19795  1.23010  1.44601  1.65455  1.59250
M.leprae    0.92251  0.93468  0.19850  0.00000  1.24936  1.02036  1.11271  1.23232  1.25729  1.25729  1.27392  1.26066  1.24561  1.24009  1.48023  1.54261  1.55205
Streptomyc  1.04810  1.01076  1.21259  1.24936  0.00000  1.34342  1.39613  1.33848  1.45402  1.45402  1.43895  1.43875  1.45120  1.51781  1.67476  1.67282  1.80262
2M.tubercu  1.07096  1.09478  1.00594  1.02036  1.34342  0.00000  0.26628  1.30280  1.29345  1.19693  1.33622  1.34201  1.28012  1.57528  1.64276  1.76849  1.75902
2M.leprae   1.07740  1.10229  1.04868  1.11271  1.39613  0.26628  0.00000  1.33800  1.42335  1.32577  1.43526  1.45220  1.40050  1.55679  1.71081  1.72902  1.74331
Arabidopsi  1.20454  1.23673  1.20461  1.23232  1.33848  1.30280  1.33800  0.00000  1.03853  1.02082  1.07330  1.05185  1.03008  1.16881  1.37604  1.65216  1.72032
1Bos        1.19296  1.23653  1.20895  1.25729  1.45402  1.29345  1.42335  1.03853  0.00000  0.00000  0.14933  0.14269  0.13959  0.93946  1.24802  1.51532  1.54734
2Bos        1.19296  1.16588  1.20895  1.25729  1.45402  1.19693  1.32577  1.02082  0.00000  0.00000  0.12547  0.11859  0.12086  0.88580  1.22816  1.46445  1.52023
Rattus      1.18647  1.21654  1.20159  1.27392  1.43895  1.33622  1.43526  1.07330  0.14933  0.12547  0.00000  0.05545  0.14109  0.95309  1.23487  1.52590  1.60582
Mus         1.20738  1.22756  1.20413  1.26066  1.43875  1.34201  1.45220  1.05185  0.14269  0.11859  0.05545  0.00000  0.14073  0.93482  1.23827  1.52791  1.61618
Homo        1.14436  1.17434  1.19795  1.24561  1.45120  1.28012  1.40050  1.03008  0.13959  0.12086  0.14109  0.14073  0.00000  0.93215  1.21285  1.54304  1.52210
Drosophila  1.36411  1.36242  1.23010  1.24009  1.51781  1.57528  1.55679  1.16881  0.93946  0.88580  0.95309  0.93482  0.93215  0.00000  1.31865  1.52071  1.56850
Schizosacc  1.45136  1.44965  1.44601  1.48023  1.67476  1.64276  1.71081  1.37604  1.24802  1.22816  1.23487  1.23827  1.21285  1.31865  0.00000  1.53840  1.63565
Caenorhabd  1.56271  1.58646  1.65455  1.54261  1.67282  1.76849  1.72902  1.65216  1.51532  1.46445  1.52590  1.52791  1.54304  1.52071  1.53840  0.00000  1.55258
Saccharomy  1.70706  1.64275  1.59250  1.55205  1.80262  1.75902  1.74331  1.72032  1.54734  1.52023  1.60582  1.61618  1.52210  1.56850  1.63565  1.55258  0.00000

Average distance (over all possible pairs of sequences):  1.21547


TREE SEARCH

Quartet puzzling is used to choose from the possible tree topologies
and to simultaneously infer support values for internal branches.

Number of puzzling steps: 1000
Analysed quartets: 2380
Unresolved quartets: 30 (= 1.3%)

Quartet trees are based on approximate maximum likelihood values
using the selected model of substitution and rate heterogeneity.


QUARTET PUZZLING TREE

Support for the internal branches of the unrooted quartet puzzling
tree topology is shown in percent.

This quartet puzzling tree is not completely resolved!

                            :---1Bos
                        :-99:
                        :   :---2Bos
                    :-75:
                    :   :   :---Rattus
                :-92:   :-98:
                :   :       :---Mus
            :-88:   :
            :   :   :-----------Homo
            :   :
            :   :---------------Drosophila
            :
        :-98:               :---Caenorhabd
        :   :           :-76:
        :   :---------72:   :---Saccharomy
        :   :           :
        :   :           :-------Schizosacc
    :-98:   :
    :   :   :-------------------Arabidopsi
    :   :
    :   :                   :---2M.tubercu
    :   :               :-98:
:100:   :               :   :---2M.leprae
:   :   :-------------78:
:   :                   :   :---M.tubercul
:   :                   :-98:
:   :                       :---M.leprae
:   :
:   :---------------------------Streptomyc
:
:-------------------------------2Deinococc
:
:-------------------------------1Deinococc


Quartet puzzling tree (in CLUSTAL W notation):

(1Deinococc,(((((((1Bos,2Bos)99,(Rattus,Mus)98)75,Homo)92,
Drosophila)88,((Caenorhabd,Saccharomy)76,Schizosacc)72,Arabidopsi)98,
((2M.tubercu,2M.leprae)98,(M.tubercul,M.leprae)98)78)98,
Streptomyc)100,2Deinococc);


BIPARTITIONS

The following bipartitions occurred at least once in all intermediate
trees that have been generated in the 1000 puzzling steps:

Bipartitions included in the quartet puzzling tree:
(bipartition with sequences in input order : number of times seen)

 **........ .......  :  1000
 ********.. *******  :   994
 **..*..... .......  :   983
 ********** ..*****  :   979
 *****..*** *******  :   977
 *******... .......  :   975
 **..****** *******  :   975
 ********.. ...****  :   918
 ********.. ....***  :   882
 **..*..*** *******  :   783
 ********** *****..  :   757
 ********.. ..*****  :   747
 ********** ****...  :   723

Bipartitions not included in the quartet puzzling tree:
(bipartition with sequences in input order : number of times seen)

 *******... ....***  :   478
 ********.. .......  :   327
 ********** ****..*  :   237
 *******... .....**  :   232
 ********.. .....**  :   231
 ********** ...****  :   207
 *****..... .......  :   160
 *******... ...****  :   112
 *******.** **.****  :    46
 **..***... .......  :    46
 ***.*..*** *******  :    19
 **..**.*** *******  :    19
 ********.. **.****  :    18
 *******.** ...****  :    16
 *******... *******  :    16
 ****.**... .......  :    13
 *******... ..*****  :    12
 ********** .*.****  :    12
 ********** ***....  :    11
 *******.** ..*****  :    10

(35 other less frequent bipartitions not shown)


MAXIMUM LIKELIHOOD BRANCH LENGTHS ON QUARTET PUZZLING TREE (NO CLOCK)

Branch lengths are computed using the selected model of
substitution and rate heterogeneity.

                                        :-9 1Bos
                                    :--18
                                    :   :-10 2Bos
                                 :-20
                                 :  :  :-11 Rattus
                                 :  :-19
                                 :     :-12 Mus
                         :------21
                         :       :-13 Homo
                    :---22
                    :    :--------14 Drosophila
             :-----25
             :      :          :------------16 Caenorhabd
             :      :    :----23
             :      :    :     :------------17 Saccharomy
             :      :---24
             :      :    :----------15 Schizosacc
             :      :
             :      :----------8 Arabidopsi
        :---29
        :    :             :--6 2M.tubercu
        :    :   :--------26
        :    :   :         :---7 2M.leprae
        :    :--28
        :        :       :--3 M.tubercul
        :        :------27
        :                :--4 M.leprae
:------30
:       :-----------5 Streptomyc
:
:-2 2Deinococc
:
:-1 1Deinococc

         branch  length     S.E.   branch  length     S.E.
1Deinococc    1  0.00001  0.00025      18  0.07183  0.01337
2Deinococc    2  0.00001  0.00029      19  0.05693  0.01208
M.tubercul    3  0.07877  0.01855      20  0.03218  0.01154
M.leprae      4  0.12274  0.02071      21  0.39917  0.04566
Streptomyc    5  0.68103  0.05986      22  0.19519  0.03968
2M.tubercu    6  0.09336  0.01959      23  0.24487  0.05465
2M.leprae     7  0.17383  0.02254      24  0.19206  0.04681
Arabidopsi    8  0.61964  0.05810      25  0.32644  0.04661
1Bos          9  0.00001  0.00020      26  0.53333  0.05281
2Bos         10  0.00001  0.00020      27  0.37275  0.04365
Rattus       11  0.03105  0.00895      28  0.07758  0.02974
Mus          12  0.02701  0.00849      29  0.14459  0.03481
Homo         13  0.04028  0.01218      30  0.35317  0.04283
Drosophila   14  0.53506  0.05176
Schizosacc   15  0.66680  0.06271
Caenorhabd   16  0.79133  0.07493     13 iterations until convergence
Saccharomy   17  0.77667  0.07362     log L: -12223.31

WARNING --- at least one branch length is close to an internal boundary!

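As a worked example of the interval rule stated at the top of this output (x-1.96*S.E. and x+1.96*S.E.), the following Python sketch applies it to a few rows of the branch length table. The helper name `confidence_interval` and the selection of branches are illustrative choices, not part of PUZZLE; the lengths and standard errors are copied from the table above.

```python
Z_95 = 1.96  # multiplier for an approximate 95% interval, as stated in the output


def confidence_interval(x, se, z=Z_95):
    """Return (lower, upper) bounds x - z*S.E. and x + z*S.E."""
    half_width = z * se
    return x - half_width, x + half_width


# (branch length, S.E.) pairs copied from the branch length table.
BRANCHES = {
    "1Deinococc (1)": (0.00001, 0.00025),
    "Streptomyc (5)": (0.68103, 0.05986),
    "internal 20": (0.03218, 0.01154),
    "internal 21": (0.39917, 0.04566),
}

if __name__ == "__main__":
    for name, (length, se) in BRANCHES.items():
        lower, upper = confidence_interval(length, se)
        note = "  (lower bound below zero)" if lower < 0 else ""
        print(f"{name:15s} {length:.5f}  [{lower:.5f}, {upper:.5f}]{note}")
```

For the near-zero terminal branches the lower bound comes out negative, which is a reminder that the normal approximation is least informative for branch lengths sitting at the edge of the allowed range.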
Quartet puzzling tree with maximum likelihood branch lengths
(in CLUSTAL W notation):

(1Deinococc:0.00001,(((((((1Bos:0.00001,2Bos:0.00001)99:0.07183,(
Rattus:0.03105,Mus:0.02701)98:0.05693)75:0.03218,Homo:0.04028)92:0.39917,
Drosophila:0.53506)88:0.19519,((Caenorhabd:0.79133,Saccharomy:0.77667)
76:0.24487,Schizosacc:0.66680)72:0.19206,Arabidopsi:0.61964)98:0.32644,
((2M.tubercu:0.09336,2M.leprae:0.17383)98:0.53333,(M.tubercul:0.07877,
M.leprae:0.12274)98:0.37275)78:0.07758)98:0.14459,Streptomyc:0.68103)
100:0.35317,2Deinococc:0.00001);


TIME STAMP

Date and time: Thu Mar 30 19:52:15 2000
Runtime: 395 seconds (= 6.6 minutes = 0.1 hours)
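The tree string with branch lengths given above can be checked against the rest of the output with a few lines of Python. This is a minimal sketch using regular expressions rather than a full tree parser; the function names are hypothetical, and the approach assumes taxon names contain no parentheses, commas, or colons (true for this output). It also recomputes the number of quartets for 17 sequences, which should match the "Analysed quartets" figure.

```python
import math
import re

# Tree string copied from the output above (line breaks are not significant).
TREE = """
(1Deinococc:0.00001,(((((((1Bos:0.00001,2Bos:0.00001)99:0.07183,(
Rattus:0.03105,Mus:0.02701)98:0.05693)75:0.03218,Homo:0.04028)92:0.39917,
Drosophila:0.53506)88:0.19519,((Caenorhabd:0.79133,Saccharomy:0.77667)
76:0.24487,Schizosacc:0.66680)72:0.19206,Arabidopsi:0.61964)98:0.32644,
((2M.tubercu:0.09336,2M.leprae:0.17383)98:0.53333,(M.tubercul:0.07877,
M.leprae:0.12274)98:0.37275)78:0.07758)98:0.14459,Streptomyc:0.68103)
100:0.35317,2Deinococc:0.00001);
"""


def leaf_names(tree):
    """Names that directly follow '(' or ',' and precede ':' are leaves."""
    compact = "".join(tree.split())
    return re.findall(r"[(,]([^(),:]+):", compact)


def branch_lengths(tree):
    """Every number that follows a ':' is a branch length."""
    compact = "".join(tree.split())
    return [float(x) for x in re.findall(r":([0-9.]+)", compact)]


def quartet_count(n_sequences):
    """Number of distinct four-sequence subsets."""
    return math.comb(n_sequences, 4)


if __name__ == "__main__":
    leaves = leaf_names(TREE)
    lengths = branch_lengths(TREE)
    print("leaves:", len(leaves))
    print("branches:", len(lengths))
    print("total tree length:", round(sum(lengths), 5))
    print("quartets:", quartet_count(len(leaves)))
```

The leaf and branch counts should agree with the 17 sequences and the 30 numbered branches in the branch length table.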