Molecular Phylogenetics and Evolution
journal homepage: www.elsevier.com/locate/ympev

What are the roles of taxon sampling and model fit in tests of cyto-nuclear discordance using avian mitogenomic data?

Ryan A. Tamashiro (a), Noor D. White (b,c), Michael J. Braun (b,c), Brant C. Faircloth (d), Edward L. Braun (a,e), Rebecca T. Kimball (a,⁎)

(a) Department of Biology, University of Florida, Gainesville, FL 32611, USA
(b) Behavior, Ecology, Evolution, and Systematics Program, University of Maryland, College Park, MD 20742, USA
(c) Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, MRC 163, PO Box 37012, Washington, DC 20013, USA
(d) Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, USA
(e) Genetics Institute, University of Florida, Gainesville, FL 32611, USA

ABSTRACT

Conflicts between nuclear and mitochondrial phylogenies have led to uncertainty for some relationships within the tree of life. These conflicts have led some to question the value of mitochondrial DNA in phylogenetics now that genome-scale nuclear data can be readily obtained. However, since mitochondrial DNA is maternally inherited and does not recombine, its phylogeny should be closer to the species tree. Additionally, its rapid evolutionary rate may drive accumulation of mutations along short internodes where relevant information from nuclear loci may be limited. In this study, we examine the mitochondrial phylogeny of Cavitaves to elucidate its congruence with recently published nuclear phylogenies of this group of birds. Cavitaves includes the orders Trogoniformes (trogons), Bucerotiformes (hornbills), Coraciiformes (kingfishers and allies), and Piciformes (woodpeckers and allies). We hypothesized that sparse taxon sampling in previously published mitochondrial trees was responsible for apparent cyto-nuclear discordance.
To test this hypothesis, we assembled 27 additional Cavitaves mitogenomes and estimated phylogenies using seven different taxon sampling schemes ranging from five to 42 ingroup species. We also tested the role that partitioning and model choice played in the observed discordance. Our analyses demonstrated that improved taxon sampling could resolve many of the disagreements. Similarly, partitioning was valuable in improving congruence with the topology from nuclear phylogenies, though the model used to generate the mitochondrial phylogenies had less influence. Overall, our results suggest that the mitochondrial tree is trustworthy when partitioning is used with suitable taxon sampling.

1. Introduction

Resolving the tree of life has been a focus of scientists for years, and the avian tree of life has been especially problematic. Early molecular studies (e.g., Groth and Barrowclough, 1999; van Tuinen et al., 2000; Braun and Kimball, 2002) divided birds into three major groups (Palaeognathae, Galloanseres, and Neoaves). However, most avian lineages are members of Neoaves and diversification of that group appears to reflect an explosive radiation, the timing of which is still debated (Feduccia, 2003; Ksepka et al., 2014; Cracraft et al., 2015; Mitchell et al., 2015). The radiation of most bird orders at the base of Neoaves is relatively ancient, occurring at least 65 million years ago, and resolving the deep relationships in Neoaves has been challenging (Jarvis et al., 2014). Both Poe and Chubb (2004) and Suh (2016) suggest that it may not be possible to resolve the phylogeny of Neoaves (i.e., they hypothesize that the radiation was a hard polytomy). However, studies using large amounts of nuclear sequence data are beginning to resolve these deeper relationships (Hackett et al., 2008; Kimball et al., 2013; McCormack et al., 2013; Jarvis et al., 2014; Prum et al., 2015; Reddy et al., 2017).
https://doi.org/10.1016/j.ympev.2018.10.008
Received 2 April 2018; Received in revised form 11 September 2018; Accepted 9 October 2018; Available online 12 October 2018
⁎ Corresponding author. E-mail address: rkimball@ufl.edu (R.T. Kimball).
Molecular Phylogenetics and Evolution 130 (2019) 132–142
1055-7903/© 2018 Elsevier Inc. All rights reserved.

The recent availability of practical methods to generate large amounts of nuclear data using sequence capture (Faircloth et al., 2012; McCormack et al., 2013; Prum et al., 2015) or even whole genome sequencing (e.g., Jarvis et al., 2014) raises questions about the continued role of mitogenomes in phylogenetics and other evolutionary studies. We believe that mitogenomic data retain value for several reasons. The mitogenome has long been used as a source of information for phylogenetic studies in birds and other vertebrate groups (e.g., Mindell et al., 1999; Braun and Kimball, 2002; Pratt et al., 2009; Pacheco et al., 2011; Mahmood et al., 2014), it remains the only marker sampled for many species (Burleigh et al., 2015), and it is a key marker for some extinct species (e.g., Mitchell et al., 2014). Vertebrate mitogenomes provide a long non-recombining sequence (Berlin and Ellegren, 2001) that might allow accurate estimation of the mitochondrial gene tree, without any concerns regarding the impact of recombination (for discussion of the issue of recombination see Springer and Gatesy, 2016; Springer and Gatesy, 2018). The shorter coalescence time of mitochondrial sequences relative to the nuclear genome is expected to lead to a higher probability that the mitochondrial gene tree matches the species tree (Moore, 1995).
Finally, large numbers of off-target mitogenomic reads are often generated by sequence capture efforts (Meiklejohn et al., 2014; do Amaral et al., 2015; Wang et al., 2017) without additional cost or laboratory effort, so researchers can easily take advantage of the desirable properties of mitochondrial data to complement the nuclear data being targeted.

Despite those desirable properties, mitochondrial DNA also has some shortcomings. First, as a non-recombining sequence, the mitochondrial genome is considered a single genetic marker and is susceptible to the issues intrinsic to any individual gene tree (reviewed in Maddison, 1997; Rubinoff and Holland, 2005). Specifically, introgression and incomplete lineage sorting (ILS) are challenges for the inference of the species tree from mitogenomic data. These phenomena can lead to genuine cyto-nuclear discordance. Second, mitogenomes are fast-evolving sequences compared to nuclear DNA (Kumar, 1996), suggesting that analyses of mitochondrial DNA might be useful for examining closely related taxa but problematic for deeper divergences. Empirically, there are cases where estimates of deep avian phylogeny based on mitogenomic data conflict with the results of analyses of much larger nuclear datasets (e.g., Fig. 2 in Jarvis et al., 2014), suggesting either that cyto-nuclear discordance exists or that it is difficult to obtain accurate estimates of the mitochondrial tree. Those results suggest that it is important to understand the factors that drive observed incongruence; one important question is whether we can, in fact, obtain an accurate estimate of the mitochondrial tree at deep levels of divergence.

Several approaches that have been employed to improve phylogenetic estimation could also be used to resolve questions about cyto-nuclear discordance.
Increased taxon sampling has been shown to improve the accuracy of phylogenetic analyses in many studies (Hillis, 1996; Pollock et al., 2002; Zwickl and Hillis, 2002; Soltis et al., 2004). For that reason, prior studies using mitogenomes (Pacheco et al., 2011; Mahmood et al., 2014) have called for increased taxon sampling to help resolve different clades. A major benefit of increased taxon sampling is the ability to break up long branches, as it is well established that unrelated species can be grouped together artifactually, sometimes with high support, when they exist as long branches (long branch attraction; see Felsenstein, 1978; Hendy and Penny, 1989). The choice of taxa to break up branches should not be random; the best taxa maximally subdivide (i.e., bisect) long branches (Goldman, 1998; Poe and Swofford, 1999; Slack et al., 2007). Increased taxon sampling can also improve parameter estimation (reviewed in Cummings and Meyer, 2005), so adding taxa to well-established clades has the potential to further improve model-based phylogenetics.

Sparse taxon sampling represents only one of the challenges in modeling the evolution of mitogenomes. Vertebrate mitochondrial DNA accumulates substitutions much more rapidly than nuclear DNA, exhibits a high transition-transversion ratio, and has a very skewed base composition, especially for third codon positions (Kumar, 1996). For these reasons, increased model complexity often improves analyses of mitochondrial data and produces more accurate trees (e.g., Braun and Kimball, 2002). Likewise, Leavitt et al. (2013) demonstrated that partitioning the mitogenome increases accuracy across several maximum likelihood (ML) models. The variability in substitution rates among different regions of the mitogenome has a large impact on the effectiveness of partitioning (Duchene et al., 2011).
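The motivation for partitioning by codon position can be illustrated with a minimal sketch. It assumes an in-frame protein-coding sequence; the toy sequence and the function names are hypothetical and are not taken from the study. The code splits sites by codon position and reports base composition for each position, which is the kind of among-site heterogeneity that separate partitions are meant to accommodate.

```python
from collections import Counter


def split_by_codon_position(seq):
    """Return three strings holding the 1st, 2nd, and 3rd codon positions.

    Assumes `seq` is an in-frame coding sequence (hypothetical input).
    """
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return [seq[i:usable:3] for i in range(3)]


def base_composition(sites):
    """Return the frequency of each base among unambiguous A/C/G/T sites."""
    counts = Counter(b for b in sites.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGT"} if total else {}


# Toy in-frame sequence (illustrative only, not real data).
toy = "ATGGCCCTAACCATTCTA"
for pos, sites in enumerate(split_by_codon_position(toy), start=1):
    print(pos, sites, base_composition(sites))
```

On real mitochondrial genes, a comparison like this would be expected to show the most skewed composition at third positions, as described above; the toy sequence is far too short to demonstrate that.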
By utilizing these analytical methods along with improved taxon sampling, we believe that resolving questions about cyto-nuclear discordance and estimating deep divergences using mitochondrial data are possible.

One group within Neoaves where available estimates of the mitochondrial tree are incongruent with the nuclear topology is Cavitaves (Yuri et al., 2013). This clade includes four orders: Trogoniformes (trogons), Bucerotiformes (hoopoes, woodhoopoes, and hornbills), Coraciiformes (bee-eaters, rollers, and allies), and Piciformes (woodpeckers and allies), all of which are obligate cavity nesters. Modern phylogenies strongly place Cavitaves within the Telluraves (Yuri et al., 2013; also called the “core” landbirds; Jarvis et al., 2014). Phylogenies based on large-scale nuclear studies using several different marker types have resulted in a strongly-corroborated topology because those analyses consistently recover a specific relationship among these four orders (Fig. 1A, see Hackett et al., 2008; Kimball et al., 2013; Jarvis et al., 2014; Prum et al., 2015; Reddy et al., 2017). The sole exception among studies with genome-wide sampling of markers is one of two trees in McCormack et al. (2013): the tree based on 416 ultraconserved element (UCE) loci that excluded Trogoniformes was congruent with Fig. 1A while an incomplete data matrix of 1541 UCE loci united Bucerotiformes with Piciformes, in conflict with Fig. 1A. This conflict suggests that the data in McCormack et al. (2013) could not rigorously address relationships within Cavitaves, possibly due to insufficient taxon sampling. In contrast to studies that used nuclear loci, studies using complete mitogenomes have produced trees showing much greater topological variation within Cavitaves (Fig. 1B, C and D; Pratt et al., 2009, Pacheco et al., 2011, Mahmood et al., 2014), none of which are congruent with the nuclear topology (Fig. 1A).

Fig. 1. Prior estimates of Cavitaves phylogeny. (A) Most analyses using multiple nuclear genes (e.g., Hackett et al. 2008; Wang et al. 2012; Kimball et al. 2013; Jarvis et al. 2014; Prum et al. 2015; Reddy et al. 2017) recover this topology, often with high support. (B) The mitogenomic phylogeny from Pratt et al. (2009). (C) The mitogenomic phylogeny from Pacheco et al. (2011). (D) The mitogenomic phylogeny from Mahmood et al. (2014).

We considered two hypotheses that explain incongruence between the mitochondrial and nuclear trees. First, that incongruence may reflect biological history, where some gene trees differ from the species history due to ILS or introgression (Maddison, 1997; also see Ballard and Whitlock, 2004 for a review of these phenomena focused on vertebrate mitochondrial DNA). We call this the “genuine discordance” hypothesis. Second, the incongruence could reflect inaccurate estimation of the mitochondrial phylogeny, possibly due to sparse taxon sampling and/or poor fit between the model used and data analyzed (e.g., Meiklejohn et al., 2014). We call this the “inaccurate estimation” hypothesis. The inaccurate estimation hypothesis suggests that the true nuclear and mitochondrial trees are (largely) congruent, and therefore methods that have been shown to improve phylogenetic analysis in other studies should increase congruence between the mitochondrial tree and the best estimate of the species tree and, in doing so, falsify the genuine discordance hypothesis.

In this study, we explored the impact of taxon sampling and model fit on the mitochondrial phylogeny of Cavitaves to assess whether these factors affect the congruence between the mitochondrial gene tree and nuclear topologies (we consider the topology in Fig. 1A to be an accurate estimate of the nuclear topology based on congruence among Hackett et al., 2008; Kimball et al., 2013; Jarvis et al., 2014; Prum et al., 2015; Reddy et al., 2017, and Fig.
2B in McCormack et al., 2013). We extracted the five-species ingroup of Pratt et al. (2009) and, using published and unpublished data, expanded it to 42 ingroup species (including two representatives of one species), a substantial increase in taxon sampling. In total, our analysis used more than 600 kilobase pairs of mitochondrial DNA (if all species were considered) that comprise the protein-coding genes and ribosomal RNAs from the 43 ingroup taxa and seven outgroups. To examine the impact of taxon sampling on a finer scale, we compared our results using all taxa with trees we generated by: (1) varying the inclusion of key taxa and (2) excluding taxa to mimic the sparse Cavitaves taxon sampling of previous studies. To examine the impact of model fit, we used four different substitution models and conducted analyses with and without partitioning the data.

If we postulate that the inaccurate estimation hypothesis is correct, we might expect improved taxon sampling, better fitting models, and/or partitioning to eliminate the observed incongruence between the mitochondrial and nuclear topology. On the other hand, if the genuine discordance hypothesis is correct, adding taxa or altering the analytical approach should not increase congruence between the estimated mitochondrial tree and the nuclear topology.

2. Methods

2.1. Data collection

In this study, we sampled 19 of the 21 families within Cavitaves, only lacking Leptosomidae (cuckoo rollers) and Semnornithidae (toucan-barbets), which collectively include only three species. We generated mitogenome sequences for 26 new Cavitaves species, obtained a second Dryocopus pileatus mitogenome (this added 12 new sampled families; see Supplementary Tables S1 and S2), and added these sequences to published Cavitaves mitogenomes (Supplementary Table S2).
We also included seven outgroup taxa from the orders Accipitriformes (hawks, eagles, New World vultures, and allies), Falconiformes (falcons), and Strigiformes (owls); these raptorial taxa are the members of Telluraves with the shortest branches and are likely among the closest outgroups to Cavitaves (Hackett et al., 2008; Wang et al., 2012; Jarvis et al., 2014; Prum et al., 2015; Reddy et al., 2017).

2.2. Sequencing, assembly, and alignment

DNA was extracted using a phenol-chloroform protocol (Rosel and Block, 1996). DNA quality was assessed by agarose gel electrophoresis, and fluorometry was used to quantify the DNA. The DNA was then sheared via sonication to about 500 base pairs (bp) in length. Libraries were constructed using a Kapa Biosystems library preparation kit following the “Illumina TruSeq Library Prep for Target Enrichment” protocol (v1.9) available from ultraconserved.org. Eight to ten samples were pooled, and the individual pools were enriched for ultraconserved elements (UCEs) using a version of the original protocol for target enrichment with long oligonucleotide baits (Blumenstiel et al., 2010) that was slightly modified for the enrichment of UCEs (v1.4 from http://ultraconserved.org, Faircloth et al., 2012). The reduced stringency of the washing buffers and washing procedures described in this protocol increases the proportion of off-target reads in the resulting enriched pools, and when used with libraries prepared from tissues having high mitochondrial copy number (e.g., muscle), many of the off-target reads are from mitochondrial genomes. After enrichment, paired-end, 100 bp sequencing was conducted on Illumina platforms (HiScan and HiSeq2000).

We generated new mitogenome assemblies by read mapping in Geneious (Version 6.1.6, Biomatters Ltd., 2013).
We mapped the Illumina sequencing reads to two reference sequences, Dryocopus pileatus (NC_008546) and Halcyon pileata (NC_024198), because each reference sequence represents one of two different gene orders in avian mitochondrial DNA (Mindell et al., 1998). The quality of the mapping depended on the gene order of the focal species. In most cases it was straightforward to determine whether the reads aligned better to one or the other of the reference sequences, and we chose the better of the two alignments. We used an iterative approach, transferring the annotation to the consensus sequence and then remapping the raw reads to the consensus sequence generated from the previous step. As expected, the control region was often difficult to reconstruct in its entirety, likely due to its high nucleotide variability and the presence of repeats in some taxa (Simon et al., 1994). In most cases, we were able to close all gaps except the control region, which often did not assemble well. There was a long, repetitive non-coding insertion between tRNA-Pro and tRNA-Thr that limited assembly of the bee-eater (Merops nubicus and Nyctyornis amictus) sequences using 100 bp reads. Finally, we checked the annotations that were transferred from the reference genomes and finalized the annotation of the novel sequences.

We aligned our sequences using MUSCLE v. 3.8.31 (Edgar, 2004), imported the alignment into Mesquite (Version 3.10, Maddison and Maddison, 2016), and manually checked the annotation and alignment of all 13 protein-coding genes and the two rRNAs. Most species contained a frameshift in ND3, a common feature in avian mitogenomes (Mindell et al., 1999); the base associated with the frameshift was excluded from analysis. We also excluded the tRNAs, as well as the highly variable and difficult-to-align control region and intergenic regions, from analysis.

2.3.
Taxon sampling

Previous phylogenetic studies that included Cavitaves mitogenomes had sparse taxon sampling, with very limited sampling of families and, in some cases, lacking orders. We created datasets to replicate these taxon-poor studies to see if we could recreate the published results given our outgroups and analytical approaches. For example, Pratt et al. (2009) did not include any Bucerotiformes, and Mahmood et al. (2014) excluded Trogoniformes (they reported that they generated a phylogeny that included one trogon but decided to exclude it because they had only a single member of that order and its inclusion lowered the resolution of their trees). Given the very sparse taxon sampling of prior studies, we felt that adding a large number of taxa to break up branches within Cavitaves might increase congruence with the nuclear tree. We also chose one part of the tree in the taxon-rich datasets where we varied the taxon sample. This led us to vary the inclusion of the dollarbird (Eurystomus orientalis, a roller) and the bee-eaters (M. nubicus and N. amictus) from our “baseline” taxon sample.

Altogether, we analyzed seven different sets of ingroup taxa. The taxon-rich sets were: (1) the baseline set, which contained 40 ingroup taxa; (2) the “dollarbird addition set” which added the dollarbird to the baseline set, resulting in 41 ingroup taxa; (3) the “bee-eater addition set” which added the bee-eaters to the baseline set, resulting in 42 ingroup taxa; and (4) an “entire set” of 43 ingroup taxa, comprising the dollarbird, bee-eaters, and all taxa in the baseline set (see Supplementary Table S2). The taxon-poor sets were: (5) “Pratt”, which included the five Cavitaves in Pratt et al. (2009) and did not include Bucerotiformes; (6) “Pacheco”, which included the eight Cavitaves in Pacheco et al. (2011); (7) “Mahmood”, which included the seven Cavitaves in Mahmood et al.
(2014) and did not include Trogoniformes (see Supplementary Table S2). The same seven outgroup species were used with each ingroup taxon sample because maintaining the same outgroup species allowed us to focus on the effects of varying the ingroup taxon sample and directly compare the results to our other analyses.

2.4. Phylogenetic analyses and partitioning

Along with varying the taxon sampling, we performed partitioned and unpartitioned analyses on each taxon set. Partitioning divides the genome into subsets based on function or other criteria, then groups the subsets into partitions based on various similarities. Our sequences were initially partitioned into 41 subsets (separately by codon position for each protein-coding gene, and the two rRNAs). It is possible that using all 41 data subsets as partitions could overparameterize the model (Lanfear et al., 2012), so for the four taxon-rich datasets we used PartitionFinder (v. 1.1.1, Lanfear et al., 2014) with the rcluster algorithm to identify the best partitioning scheme, beginning with the 41 subsets listed above. The best partitioning schemes were selected using the AICc. The final numbers of partitions for our taxon-rich sampling schemes were 36 (baseline set), 34 (dollarbird addition set), 36 (bee-eater addition set), and 39 (entire set). For the three taxon-poor sets, we wanted to mimic the published analyses to better assess whether taxon sampling was the relevant variable, so we did not use PartitionFinder and simply used all 41 subsets as partitions.

Each taxon set and partitioning scheme was analyzed using four different analytical approaches. First, we used RAxML 7.2.7 (Stamatakis 2006) through the CIPRES portal (Miller et al., 2010) with the GTRGAMMA model and 500 bootstrap replicates. We conducted the three remaining analyses in IQ-TREE (Version 1.4.1, Nguyen et al., 2015), where we performed 1000 ultrafast bootstrap replicates (Minh et al., 2013) per analysis.
The second approach used the GTR+G4 model as implemented in IQ-TREE. For the third approach, we used IQ-TREE with the FreeRate model (Yang 1995). The final approach allowed IQ-TREE to choose the best fitting model (which we call “IQ-TREE Choice”). The FreeRate model was not considered in the set of models used by IQ-TREE Choice. In total, we performed 56 distinct analyses (eight for each taxon set).

Fig. 2. Partitioned analysis of our largest taxon sample of Cavitaves using RAxML results in an estimate of phylogeny congruent with previously published analyses based on the nuclear genome. Relationships among the orders, which are indicated to the right of the tree, are identical to those based on analyses of nuclear data (Fig. 1A). Relationships among families within each order are identical to those in Hackett et al. (2008), Prum et al. (2015), and Reddy et al. (2017), except for the trogons (see Table 1).

3. Results

Using off-target reads from sequence capture, we were able to assemble 27 mitogenomes with an average coverage of 207x (Supplementary Table S3). The average number of cleaned reads was 3,554,760, with an average of 0.86% of those reads mapping to the mitogenome (ranging from 0.23 to 2.30% of reads; Supplementary Table S3). Thus, while the percentage of reads that mapped to the mitogenome was relatively low, the total number of reads and read length allowed us to assemble complete (or nearly complete) mitogenomes without additional lab work.

Using the entire taxon set resulted in a phylogeny in which all of the orders were monophyletic (Fig. 2), and relationships among orders were congruent with the nuclear topology (Fig. 1A) with relatively high bootstrap support (Fig. 3, Table S4, Supplementary Tree Files). Bootstrap support values were low for interordinal relationships, likely reflecting the short branches separating the orders.
Relationships among families within each order (where sufficient taxon sampling existed in published studies) were largely identical to those in Hackett et al. (2008), Prum et al. (2015), and Reddy et al. (2017). The position of the bee-eaters differed from that of Hackett et al. (2008) but it was identical to that in Prum et al. (2015) and Reddy et al. (2017). Within kingfishers, relationships were congruent with a recent study using many nuclear loci (Andersen et al., 2018). Lastly, relationships among trogons differed in some of our analyses from that of Reddy et al. (2017), the only study with comparable taxon sampling.

[Fig. 3 graphic: a grid of bootstrap support values. Rows are the seven taxon sets (Entire Set, Bee-eater Addition, Dollarbird Addition, Baseline, Pratt Set, Pacheco Set, Mahmood Set), each with eight analysis types (RAxML, IQ-GTR+G4, IQ-FreeRate, and IQ-Choice, partitioned and unpartitioned). Columns are the clades Cavitaves (T+B+P+C), B+P+C, P+C, and C. Cells are coded by bootstrap support or marked as a non-monophyletic topology.]

Fig. 3. Bootstrap support for monophyly of specific clades in Cavitaves (T=Trogoniformes, B=Bucerotiformes, C=Coraciiformes, P=Piciformes). IQ refers to analyses conducted in IQ-TREE. Partitioned analyses are designated by (P) and unpartitioned analyses are designated by (U). Bootstrap support for certain nodes is not available (NA) because Pratt lacked Bucerotiformes and Mahmood lacked Trogoniformes. See Table S4 for more details.

3.1. Effect of taxon sampling

While analyses of the entire taxon set resulted in a topology congruent with the nuclear tree, our three taxon-poor sets showed much greater variability in topology. When analyzing the taxon-poor datasets, we had trouble achieving congruence with the nuclear tree or the published topologies (see Figs. 1 and 3, Supplementary Tree Files). Perhaps surprisingly, the Pratt taxon set (n=5 ingroup species) yielded the nuclear phylogeny in five of the eight analyses; the other three analyses placed Trogoniformes sister to Piciformes, identical to the Pratt et al. (2009) topology (Fig. 1). However, Pratt et al. (2009) only included three orders (it lacked Bucerotiformes), reducing the number of possible alternative topologies. Analyses of the Pacheco taxon set produced only two trees that were congruent with the nuclear topology; the remaining analyses placed Bucerotiformes sister to Piciformes, with Coraciiformes sister to Bucerotiformes+Piciformes and Trogoniformes as the most divergent order of Cavitaves (Supplementary Tree Files). Neither of those topologies mirrored the results reported by Pacheco et al. (2011). Similarly, we did not recover the nuclear topology from any analyses of the Mahmood taxon set and only recovered the Mahmood et al. (2014) topology in one of our eight analyses. The remaining analyses placed Bucerotiformes sister to Piciformes. Mahmood et al. (2014) only sampled three orders, lacking the more divergent Trogoniformes.
Thus, the internodes among orders are shorter for the Mahmood analyses than the Pratt analyses, increasing the probability of genuine discordance due to ILS (Pamilo and Nei, 1988). However, the shorter internal branches combined with long terminal branches also make the Mahmood taxon sample more likely to be affected by processes such as long branch attraction.

In contrast, when we consider the four taxon-rich datasets, we find that the details of taxon sampling have a much more limited impact. For the baseline set, which lacked dollarbird and bee-eaters, most of our mitochondrial gene trees were congruent with the consensus nuclear topology (Fig. 1A). The four ingroup orders were typically recovered as monophyletic, and the relationships among the four orders agreed with the most recent phylogenies based on nuclear data (Hackett et al., 2008; Wang et al., 2012; Jarvis et al., 2014; Prum et al., 2015; Reddy et al., 2017). The baseline set yielded the nuclear topology for higher-level relationships in all but one of our analyses (Fig. 3, Table S4, Supplementary Tree Files). However, the position of Geobiastes squamiger (the ground roller) was variable in analyses of the baseline set, sometimes resulting in non-monophyly of Coraciiformes, but ordinal relationships based on the mitogenome were always congruent with the nuclear tree if Geobiastes squamiger was excluded from consideration (Supplementary Tree Files). The only major anomaly we observed for the baseline taxon sample was non-monophyly of Cavitaves, where the falcon outgroup was placed sister to trogons (Fig. 3 and Table S4), when the FreeRate model was used without partitioning.

Adding taxa to the baseline taxon set had varying effects on the inferred relationships for the orders in Cavitaves.
In general, addition of the dollarbird appeared to be disadvantageous, since it resulted in the placement of Bucerotiformes (rather than Trogoniformes) as sister to all other Cavitaves in three of our estimates of phylogeny (Fig. 3, Table S4, Supplementary Tree Files); that rearrangement decreased congruence with the nuclear tree. Conversely, addition of the bee-eaters corrected the sole incongruence found in the baseline set, producing the nuclear topology across all analyses (Fig. 3, Table S4). The negative effects of adding the dollarbird were mitigated when the bee-eaters were added to create the entire set.

3.2. Effects of partitioning

Partitioning greatly improved the AICc score of each analysis, regardless of the other aspects of model choice or taxon set (see Supplementary Table S4). Among orders, partitioning the taxon-rich datasets led to more cases where the orders were monophyletic (there were 13 cases of non-monophyly using unpartitioned analyses, but only four when using partitioning; Fig. 3, Supplementary Table S4, Supplementary Tree Files). The dollarbird addition set was the only dataset in which partitioning did not consistently reduce the incongruence with the nuclear topology (Fig. 3, Supplementary Table S4, and Supplementary Tree Files). In the remaining taxon-rich datasets (1, 3, and 4), partitioning produced mitochondrial gene trees identical to the consensus nuclear tree at the ordinal level. With respect to the placement of the bee-eaters and dollarbird, partitioning appeared to have diminishing effects with the addition of taxa because the partitioned and unpartitioned analyses converged on a common topology and placed these taxa within a monophyletic Coraciiformes (Fig. 3, Supplementary Table S4, Supplementary Tree Files). For the taxon-poor datasets, however, monophyly occurred more often with unpartitioned than partitioned analyses (Fig.
3, Supplementary Table S4, Supplementary Tree Files), although AICc would suggest partitioning is better when comparing partitioned versus unpartitioned results for a specific dataset and analysis (Supplementary Table S4). In general, partitioning also increased the congruence with the nuclear tree within orders and/or led to modest increases in bootstrap support (e.g., Fig. 4, Supplementary Tree Files). This point is well illustrated by a major disagreement among the different analyses in the relationships among motmots, todies and kingfishers (Fig. 4, Supplementary Tree Files). In every partitioned analysis of the four taxon-rich datasets (1–4), regardless of model choice, the motmot (Momotus momota) was sister to the kingfisher clade, a relationship supported by nuclear studies (Hackett et al., 2008; Prum et al., 2015; Reddy et al., 2017). All unpartitioned analyses, however, placed the motmot sister to the tody (Todus angustirostris).

3.3. Effects of model choice

The final aspect of the analytical strategy that we examined involved varying the models of sequence evolution and the programs used to infer the phylogenies. We found that RAxML typically provided lower (i.e., more conservative) estimates of bootstrap support than the IQ-TREE analyses; this was expected because we used the ultrafast bootstrap in IQ-TREE and that approach produces higher estimates of bootstrap support (Minh et al., 2013). Among the three IQ-TREE models, IQ-TREE FreeRate was the best model, as measured by the AICc, for every taxon set except the bee-eater addition set, where IQ-TREE Choice provided the lowest AICc value (Supplementary Table S4). However, the FreeRate model often exhibited a greater degree of non-monophyly and lower bootstrap support than did the IQ-TREE Choice model across analyses (Fig. 3).
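To make the model-fit comparisons in this section concrete, the AICc can be computed from a log-likelihood, the number of free parameters, and a sample size. The sketch below is only an illustration under stated assumptions: the function name `aicc`, the use of alignment length as the sample size, and all numerical values are hypothetical and are not taken from Supplementary Table S4 or from IQ-TREE or RAxML output.

```python
def aicc(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Small-sample corrected Akaike information criterion (lower is better).

    n_sites is used as the sample size; treating alignment length as the
    sample size is an assumption made for this illustration.
    """
    if n_sites - n_params - 1 <= 0:
        raise ValueError("AICc is undefined unless n_sites > n_params + 1")
    aic = -2.0 * log_likelihood + 2.0 * n_params
    correction = (2.0 * n_params * (n_params + 1)) / (n_sites - n_params - 1)
    return aic + correction


# Hypothetical log-likelihoods and parameter counts, for illustration only.
candidates = {
    "unpartitioned": aicc(log_likelihood=-250000.0, n_params=90, n_sites=15000),
    "partitioned": aicc(log_likelihood=-245000.0, n_params=140, n_sites=15000),
}

# The preferred analysis is the one with the lowest AICc.
best = min(candidates, key=candidates.get)
print(best, {name: round(score, 1) for name, score in candidates.items()})
```

With these made-up inputs the partitioned analysis is preferred because its gain in likelihood outweighs the penalty for its additional parameters, which is the pattern described for the real analyses.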
In addition, the unpartitioned FreeRate analysis of the baseline dataset was the only one that failed to resolve monophyly of Cavitaves, as a falcon (Falco sparverius) was placed sister to Trogoniformes (Fig. 3, Supplementary Table S4, Supplementary Tree Files). Model choice also played a role in determining relationships within trogons. While Trogoniformes was well supported as a clade, we discovered two alternative topologies that differed on which trogon was sister to the others. Most of our taxon-rich analyses placed Pharomachrus auriceps as sister to the remaining trogons (19 of 32 analyses, Table 1, Supplementary Tree Files), which is supported by one previously published study using a combination of mtDNA and nuclear DNA (Moyle, 2005). Of the 19 analyses that found that relationship, most (15) were from two similar models, RAxML and IQ-TREE GTR+G4. However, the 13 remaining analyses produced the more commonly published topology with Apaloderma narina as sister to other trogons (nuclear DNA, Reddy et al., 2017; mtDNA, de los Monteros, 1998, 2000; combination of the two, Hosner et al., 2010), with IQ-TREE Choice having supported this relationship in seven of eight cases.

R.A. Tamashiro et al. Molecular Phylogenetics and Evolution 130 (2019) 132–142 137

3.4. “Rare” mitogenomic changes

We also observed two larger-scale mutational changes in the mitochondrial genome sequences. First, changes in the mitochondrial gene order, a type of rare genomic change (Gibb et al., 2007), have occurred several times within Cavitaves. Most of the species in our study shared a mitochondrial gene order with the chicken (Gallus gene order; Desjardins and Morais, 1990). However, an alternative gene order appears in up to three separate lineages. This alternative gene order is characterized by having a second control region between tRNA-Thr and tRNA-Phe (Mindell et al., 1998).
The largest group sharing the alternative gene order is Piciformes, excluding the jacamars (Galbulidae) and puffbirds (Bucconidae). Two of the hornbills (Rhabdotorrhinus waldeni and Penelopides panini) also have the alternative gene order (Pacheco et al., 2011). The last group that may share an alternative gene order is the bee-eaters. The long insertion that we could not assemble is located where the second control region is predicted to be in the alternative gene order. We were unable to determine whether the insertion represented a second control region or a product of poor assembly due to the limited read length of the sequence data used to assemble the mitogenomes of our focal taxa. Second, many birds have a single base pair insertion in ND3 that interrupts the reading frame (Mindell et al., 1999); this nucleotide is known to be present in chicken ND3 mRNA (Russell and Beckenbach, 2008), indicating that it is recoded by programmed frameshift rather than editing. The ND3 frameshift was present in almost all of the taxa we sequenced, but it was absent in two sister species within Bucerotiformes (Phoeniculus purpureus and Rhinopomastus cyanomelas) and one species of Piciformes (Megalaima virens), suggesting two independent losses.

4. Discussion

Our results demonstrated that both increased taxon sampling and partitioning of datasets are useful strategies for improving the resolution of deep relationships using mitochondrial data, as expected based on studies focused on other groups. Increased taxon sampling appeared to be especially important; the taxon-rich schemes resulted in estimates of the mitogenomic tree that were more congruent with the nuclear tree (e.g., Fig. 1A) than those obtained using the three taxon-poor sampling schemes. Even within the taxon-rich datasets, fewer analyses recovered monophyly of orders when a few key taxa (especially the bee-eaters) were not included (Fig. 3).
The combination of dense taxon sampling and partitioning led to trees that were more congruent with the nuclear topology, and our results suggested that model choice generally became less relevant as more taxa were added. However, simply adding taxa was not a panacea for increasing congruence with the nuclear tree; adding the bee-eaters generally increased congruence with the nuclear tree whereas adding the dollarbird was more problematic. Thus, improving taxon sampling should not be viewed as simply adding more taxa. Instead, the differential impact of the taxa that we varied in our taxon-rich datasets appeared to reflect the ways in which added taxa alter the number of long branches (see below).

                     Taxon set
Model                Dollarbird+Bee-eater  Bee-eater  Dollarbird  Baseline
                     Addition              Addition   Addition
Partitioned
  RAxML              A (63)                A (58)     A (64)      A (61)
  IQ-TREE GTR+G4     A (88)                A (74)     A (91)      A (95)
  IQ-TREE FreeRate   A (90)                A (93)     A (92)      A (91)
  IQ-TREE Choice     A (92)                A (91)     A (89)      A (90)
Unpartitioned
  RAxML              B (77)                B (80)     B (72)      B (69)
  IQ-TREE GTR+G4     B (73)                B (71)     B (63)      B (71)
  IQ-TREE FreeRate   B (74)                B (81)     B (57)      B (69)
  IQ-TREE Choice     B (72)                B (77)     B (68)      B (67)

Topologies (tree diagrams not reproduced): A, motmot sister to kingfishers; B, motmot sister to tody.

Fig. 4. Alternative relationships between todies (Todus angustirostris), motmots (Momotus momota), and kingfishers (Alcedinidae). Bootstrap values for node marked X in topology A or B are in parentheses.

Regardless of the details, the overall pattern indicated that the conflicts between the topologies generated using nuclear and mitochondrial data in previous studies appear largely to reflect inaccurate tree estimation, likely owing to the limited taxon samples analyzed in those studies.
Given that our analytical strategies, particularly our improved taxon sampling, eliminated many of the differences and that the congruence between the nuclear and mitogenomic trees increased with increasing taxon sampling, it is unlikely that genuine discordance between the mitochondrial and nuclear genomes led to the appearance of cyto-nuclear discordance in Cavitaves in previous studies. Instead, the better explanation for the incongruence observed in prior studies is the inaccurate estimation hypothesis.

4.1. Taxon sampling, partitioning, and model fit

Although we corroborated the inaccurate estimation hypothesis, this did not address the reason why taxon addition appeared to improve estimates of phylogeny. The most common explanation is that adding taxa results in the subdivision of long branches; it has long been appreciated that analyses of sequence data generated on trees with certain arrangements of long branches are problematic (Felsenstein, 1978; Hendy and Penny, 1989; Kim, 2000). Alternatively, adding taxa could improve the estimation of parameters in the models used for the ML analysis and, in doing so, indirectly improve the estimate of phylogeny. The first explanation is generally considered to be more likely and is much more extensively discussed in the literature (e.g., Hillis, 1996; Pollock et al., 2002; Zwickl and Hillis, 2002; Soltis et al., 2004), though both could act in concert. While adding large numbers of taxa clearly increased the congruence with the nuclear topology relative to the taxon-poor data sets, we also found that varying the inclusion of one or two species in our taxon-rich sets could alter the estimate of phylogeny in certain cases. While both the dollarbird and the bee-eaters broke up long branches in the tree, their effects differed. Including the bee-eaters, and thus breaking up the long branch leading to the ground roller, supported the monophyly of Coraciiformes and stabilized the overall topology (Fig.
3). However, addition of the dollarbird did not have the same effect. While the dollarbird also divided the long branch to the ground roller, it did so near the base with the result that it added a second long branch. In contrast, the bee-eaters added two taxa that subdivided the ground roller branch without adding new long branches. If we focus on a classic “Felsenstein zone” tree, then bisecting long branches is the recommended strategy when adding taxa (Poe and Swofford, 1999). Our results add to the literature in suggesting that some taxon additions are more advantageous than others and further suggest that taxa to include should be chosen judiciously. Partitioning proved to be a powerful tool that can enhance accuracy of phylogenetic estimation, as shown in other analyses of avian mitochondria (e.g., Powell et al., 2013; Meiklejohn et al., 2014; Wang et al., 2017). Our results demonstrated that partitioning improved congruence with the nuclear topology among our taxon-rich data sets but not with the taxon-poor sets. This may simply reflect that partitioning is less effective for such small data sets than for larger ones. Partitioning greatly improved model fit, according to the AICc, regardless of the type of model that was employed. Model choice had a greater impact with sparse taxon sampling and unpartitioned analyses, and the incongruence among trees generated using different models decreased as taxa were added and when the data were partitioned.

Table 1. Alternative resolutions of the most deeply branching Trogoniformes (Apaloderma narina or Pharomachrus auriceps) in analyses of taxon-rich mitogenomic datasets. The numbers are the bootstrap values that support excluding the earliest diverging trogoniform taxon from the other three. (P), partitioned; (U), unpartitioned.

                                                    Bootstrap support of earliest branching Trogoniformes
Taxon set                  Analysis type            Apaloderma narina    Pharomachrus auriceps
Entire Set                 RAxML (P)                –                    64
                           IQ-TREE GTR+G4 (P)       –                    87
                           IQ-TREE FreeRate (P)     –                    84
                           IQ-TREE Choice (P)       –                    68
                           RAxML (U)                –                    62
                           IQ-TREE GTR+G4 (U)       –                    77
                           IQ-TREE FreeRate (U)     55                   –
                           IQ-TREE Choice (U)       72                   –
Bee-eater Addition Set     RAxML (P)                –                    69
                           IQ-TREE GTR+G4 (P)       –                    75
                           IQ-TREE FreeRate (P)     –                    78
                           IQ-TREE Choice (P)       59                   –
                           RAxML (U)                –                    54
                           IQ-TREE GTR+G4 (U)       –                    74
                           IQ-TREE FreeRate (U)     65                   –
                           IQ-TREE Choice (U)       74                   –
Dollarbird Addition Set    RAxML (P)                –                    46
                           IQ-TREE GTR+G4 (P)       80                   –
                           IQ-TREE FreeRate (P)     84                   –
                           IQ-TREE Choice (P)       83                   –
                           RAxML (U)                –                    63
                           IQ-TREE GTR+G4 (U)       –                    80
                           IQ-TREE FreeRate (U)     –                    54
                           IQ-TREE Choice (U)       51                   –
Baseline Set               RAxML (P)                –                    56
                           IQ-TREE GTR+G4 (P)       –                    67
                           IQ-TREE FreeRate (P)     72                   –
                           IQ-TREE Choice (P)       72                   –
                           RAxML (U)                –                    55
                           IQ-TREE GTR+G4 (U)       –                    68
                           IQ-TREE FreeRate (U)     73                   –
                           IQ-TREE Choice (U)       84                   –

One question raised by our analytical approach, which used a variety of models, is why we did not focus exclusively on the tree generated using the best-fitting model. We did assess model fit using the AICc, and use of information theory criteria such as the AICc is a standard approach for model selection in phylogenetics (e.g., Posada and Buckley, 2004; Lanfear et al., 2014; but see Sanderson and Kim, 2000 for a discussion of concerns regarding these criteria). However, we believe it is desirable to assess the behavior of models in phylogenetic analyses using multiple criteria rather than simply applying standard information theoretic criteria like the AICc. For example, the IQ-TREE FreeRate model was the best model overall based on AICc (Supplementary Table S4). However, the FreeRate model was also more sensitive to partitioning and differences in taxon sampling than models with GAMMA-distributed rates, as seen in the reconstruction of the trogon clade (Table 1).
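The pattern in Table 1 can be summarized with a short tally. The sketch below transcribes the table by hand into a Python dictionary; the variable names and the data layout are introduced here purely for illustration and are not part of the original analysis pipeline. Each entry records which trogon was recovered as the earliest-branching lineage ("Ph" for Pharomachrus auriceps, "Ap" for Apaloderma narina).

```python
from collections import Counter

MODELS = ["RAxML", "GTR+G4", "FreeRate", "Choice"]

# Earliest-branching trogon per analysis, transcribed by hand from Table 1.
# Each list holds the four partitioned analyses followed by the four
# unpartitioned analyses, both in the order given by MODELS.
TABLE_1 = {
    "Entire": "Ph Ph Ph Ph  Ph Ph Ap Ap".split(),
    "Bee-eater addition": "Ph Ph Ph Ap  Ph Ph Ap Ap".split(),
    "Dollarbird addition": "Ph Ap Ap Ap  Ph Ph Ph Ap".split(),
    "Baseline": "Ph Ph Ap Ap  Ph Ph Ap Ap".split(),
}

by_taxon = Counter()
by_model = Counter()
for calls in TABLE_1.values():
    for i, call in enumerate(calls):
        by_taxon[call] += 1
        by_model[(MODELS[i % 4], call)] += 1

pharomachrus_total = by_taxon["Ph"]
apaloderma_total = by_taxon["Ap"]
raxml_gtr_pharomachrus = by_model[("RAxML", "Ph")] + by_model[("GTR+G4", "Ph")]
choice_apaloderma = by_model[("Choice", "Ap")]

# Number of taxon sets in which the partitioned and unpartitioned analyses
# of the same model disagree about the earliest-branching trogon.
flips = {
    model: sum(calls[i] != calls[i + 4] for calls in TABLE_1.values())
    for i, model in enumerate(MODELS)
}

print(pharomachrus_total, apaloderma_total, raxml_gtr_pharomachrus, choice_apaloderma)
print(flips)
```

The tally reproduces the counts given in Section 3.3 (19 and 13 of 32 analyses, 15 from RAxML and IQ-TREE GTR+G4, and seven of eight for IQ-TREE Choice). It also shows that the FreeRate result changes between partitioned and unpartitioned analyses in three of the four taxon sets, compared with none for RAxML and one each for GTR+G4 and Choice.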
Additionally, the unpartitioned FreeRate analysis of the baseline taxon set yielded Cavitaves non-monophyly (Fig. 3, Supplementary Table S4); that result is very unlikely to be correct. FreeRate models are poorly studied relative to models with GAMMA-distributed rates and these observations suggest that FreeRate models may behave poorly in parts of parameter space. Our use of multiple models highlighted these behaviors of the FreeRate models and suggests that continued use of models with GAMMA-distributed rates might be more appropriate. Ultimately, the fact that most of our analyses using large taxon samples are congruent with nuclear estimates of phylogeny suggests: (1) that the details of the model are not critical for this particular problem as long as there is extensive taxon sampling; and (2) that the mitogenomic tree is congruent with the nuclear tree, at least for the Cavitaves backbone. In fact, relationships among families based on the mitogenomic tree are the same as those in the nuclear trees from Prum et al. (2015) and Reddy et al. (2017), further corroborating the inaccurate estimation hypothesis as an explanation for the prior incongruent trees.

4.2. The value of mitochondria moving forward

The fact that our estimate of the mitochondrial tree appears to be congruent with the nuclear phylogeny actually leaves one question open: is it possible, at least in principle, to assess whether the mitogenomic tree is accurate if it is incongruent with the nuclear tree? Had our estimate of mitogenomic phylogeny conflicted with the nuclear topology, even after the addition of many taxa and use of improved models, one could turn to either of two distinct hypotheses: (1) we obtained an accurate mitogenomic tree and it provides evidence for genuine cyto-nuclear discordance; or (2) we obtained an inaccurate estimate of the mitogenomic tree.
An inaccurate mitogenomic tree could simply reflect stochastic error (Patel et al., 2013) or it could reflect estimation error due to unrecognized model misspecification (Richards et al., 2018). Analyses of individual mitochondrial gene regions appear to be associated with substantial stochastic error, but this does not appear to be as large a problem for the complete mitogenome (Meiklejohn et al., 2014), probably due to the larger number of sites. Thus, the second possibility (model misspecification) is likely to represent the best explanation for incongruence. Ultimately, it will be necessary to find methods that can reveal whether models used for analyses are likely to provide accurate estimates of phylogeny, an area that has vexed the phylogenetics community for many years (Sanderson and Kim, 2000; Steel, 2005). It seems likely that an appeal to the “biochemical realism” of evolutionary models will be an important component of any attempts to corroborate the hypothesis that genuine cyto-nuclear discordance exists. The observation that the results of our analyses improved congruence and thus falsified the genuine discordance hypothesis means we do not need to address this issue for Cavitaves mitogenomes. The future value of mitochondrial genomes for higher-level phylogenetics depends on the ability of the available analytical techniques to produce an accurate estimate of the mitogenomic tree. Along with practical reasons for using mitochondrial DNA, including that some fascinating extinct taxa may primarily or exclusively yield mitochondrial data (e.g., Mitchell et al., 2014), there are theoretical arguments that mitogenomic trees will typically be closer to the species tree than trees derived from individual nuclear genes. The mitochondrial genome is maternally inherited and therefore reflects a smaller effective population size than the nuclear genome, limiting the impact of ILS.
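The expected effect of the smaller effective population size can be sketched with the standard three-taxon coalescent result (cf. Pamilo and Nei, 1988), in which the probability that a gene tree disagrees with the species tree is (2/3)exp(-T), where T is the internode length in coalescent units. The example below is an illustration under simplifying assumptions that go beyond what was tested in this study: neutrality, an equal sex ratio, strictly maternal inheritance of mtDNA (so that its effective size is one quarter of the autosomal value), and hypothetical values for the internode duration and the population size. The function name and its parameters are likewise introduced only for this example.

```python
import math


def discordance_probability(generations: float, n_e: float,
                            copies_per_individual: float) -> float:
    """Probability that a three-taxon gene tree differs from the species tree.

    Uses the neutral coalescent result P = (2/3) * exp(-T), where T is the
    internode length in coalescent units (generations divided by the number
    of gene copies in the population).
    """
    gene_copies = copies_per_individual * n_e
    t_coalescent = generations / gene_copies
    return (2.0 / 3.0) * math.exp(-t_coalescent)


# Hypothetical internode: 100,000 generations, 100,000 breeding individuals.
INTERNODE_GENERATIONS = 100_000
POPULATION_SIZE = 100_000

# Diploid autosomal locus: two gene copies per individual.
p_nuclear = discordance_probability(INTERNODE_GENERATIONS, POPULATION_SIZE, 2.0)

# mtDNA: one transmitted copy per female, and females are assumed to make up
# half of the population, i.e. 0.5 copies per individual on average.
p_mito = discordance_probability(INTERNODE_GENERATIONS, POPULATION_SIZE, 0.5)

print(f"nuclear locus: {p_nuclear:.3f}, mitogenome: {p_mito:.3f}")
```

Under these assumed values, the same internode is four times longer in coalescent units for the mitogenome than for a nuclear locus, so the expected probability of discordance due to ILS is several-fold lower for the mitogenome.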
More recently, Hill (2017) suggested that the mitogenome might have a special role in reproductive isolation, reflecting the evolution of hybrid incompatibilities due to co-adaptation between mitochondrial proteins and nuclear-encoded but mitochondrially-localized proteins. The Hill (2017) hypothesis is based on a number of possible consequences of this co-adaptation (e.g., Hill and Johnson, 2013; Koch et al., 2017). If this hypothesis is correct, the mitogenomic tree is likely to be even closer to the species tree than expected based on the neutral multispecies coalescent, highlighting that estimating a mitogenomic tree can be valuable. Like other studies (e.g., Meiklejohn et al., 2014; do Amaral et al., 2015), we have demonstrated that mitogenomic sequences may be extracted from large-scale genomic sequencing with little effort. Given that we were able to reconcile many of the differences between the nuclear and mitochondrial tree with existing analytical techniques, we have demonstrated that mitochondrial data can produce trustworthy trees and provide future utility even in an era where obtaining whole genomes is increasingly feasible. Hopefully additional studies will also take advantage of the off-target mitochondrial reads to help build large datasets of both nuclear and mitogenomic data.

Acknowledgements

We thank Mike Andersen, an anonymous reviewer, and F. Raposo do Amaral for suggestions that improved this manuscript. We also thank the museums that provided the tissues we used (Supplementary Table S1).

Funding

This work was supported by the Smithsonian Institution’s Scholarly Studies Program, United States, to M.J.B. and E.L.B.; the Smithsonian’s Grand Challenges Program (Consortium for Understanding and Sustaining a Biodiverse Planet), United States, to M.J.B.; and the United States National Science Foundation (grant DEB-1655683 to E.L.B. and R.T.K. and grant DEB-1655624 to B.C.F.).

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ympev.2018.10.008.

References

Amaral, F.R., Neves, L.G., Resende, M.F.R. Jr, Mobili, F., Miyaki, C.Y., Pellegrino, K.C.M., Biondo, C., 2015. Ultraconserved elements sequencing as a low-cost source of complete mitochondrial genomes and microsatellite markers in non-model Amniotes. PLoS One 10, e0138446.
Andersen, M.J., McCullough, J.M., Mauck III, W.M., Smith, B.T., Moyle, R.G., 2018. A phylogeny of kingfishers reveals an Indomalayan origin and elevated rates of diversification on oceanic islands. J. Biogeogr. 45, 269–281.
Ballard, J.W., Whitlock, M.C., 2004. The incomplete natural history of mitochondria. Mol. Ecol. 13, 729–744.
Berlin, S., Ellegren, H., 2001. Evolutionary genetics - Clonal inheritance of avian mitochondrial DNA. Nature 413, 37–38.
Blumenstiel, B., Cibulskis, K., Fisher, S., DeFelice, M., Barry, A., Fennell, T., Abreu, J., Minie, B., Costello, M., Young, G., Maquire, J., Kernytsky, A., Melnikov, A., Rogov, P., Gnirke, A., Gabriel, S., 2010. Targeted exon sequencing by in-solution hybrid selection. Curr. Protocols Hum. Genet. 66, 18.4.1–18.4.24.
Braun, E.L., Kimball, R.T., 2002. Examining basal avian divergences with mitochondrial sequences: Model complexity, taxon sampling, and sequence length. Syst. Biol. 51, 614–625.
Burleigh, J.G., Kimball, R.T., Braun, E.L., 2015. Building the avian tree of life using a large-scale, sparse supermatrix. Mol. Phylogenet. Evol. 84, 53–63.
Cracraft, J., Houde, P., Ho, S.Y.W., Mindell, D.P., Fjeldsa, J., Lindow, B., Edwards, S.V., Rahbek, C., Mirarab, S., Warnow, T., Gilbert, M.T.P., Zhang, G., Braun, E.L., Jarvis, E.D., 2015. Response to Comment on “Whole-genome analyses resolve early branches in the tree of life of modern birds”. Science 349, 1460.
Cummings, M.P., Meyer, A., 2005.
Magic bullets and golden rules: Data sampling in molecular phylogenetics. Zoology 108, 329–336.
de los Monteros, A.E., 1998. Phylogenetic relationships among the trogons. Auk 115, 937–954.
de los Monteros, A.E., 2000. Higher-level phylogeny of Trogoniformes. Mol. Phylogenet. Evol. 14, 20–34.
Desjardins, P., Morais, R., 1990. Sequence and gene organization of the chicken mitochondrial genome. A novel gene order in higher vertebrates. J. Mol. Biol. 212, 599–634.
Duchene, S., Archer, F.I., Vilstrup, J., Caballero, S., Morin, P.A., 2011. Mitogenome phylogenetics: The impact of using single regions and partitioning schemes on topology, substitution rate and divergence time estimation. PLoS One 6.
Edgar, R., 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucl. Acids Res. 32, 1792–1797.
Faircloth, B.C., McCormack, J.E., Crawford, N.G., Harvey, M.G., Brumfield, R.T., Glenn, T.C., 2012. Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Syst. Biol. 61, 717–726.
Feduccia, A., 2003. 'Big bang' for tertiary birds? Trends Ecol. Evol. 18, 172–176.
Felsenstein, J., 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27, 401–410.
Gibb, G.C., Kardailsky, O., Kimball, R.T., Braun, E.L., Penny, D., 2007. Mitochondrial genomes and avian phylogeny: complex characters and resolvability without explosive radiations. Mol. Biol. Evol. 24, 269–280.
Goldman, N., 1998. Phylogenetic information and experimental design in molecular systematics. Proc. Royal Soc. B-Biol. Sci. 265, 1779–1786.
Groth, J.G., Barrowclough, G.F., 1999. Basal divergences in birds and the phylogenetic utility of the nuclear RAG-1 gene. Mol. Phylogenet. Evol. 12, 115–123.
Hackett, S.J., Kimball, R.T., Reddy, S., Bowie, R.C.K., Braun, E.L., Braun, M.J., Chojnowski, J.L., Cox, W.A., Han, K.L., Harshman, J., Huddleston, C.J., Marks, B.D., Miglia, K.J., Moore, W.S., Sheldon, F.H., Steadman, D.W., Witt, C.C., Yuri, T., 2008. A phylogenomic study of birds reveals their evolutionary history. Science 320, 1763–1768.
Hendy, M.D., Penny, D., 1989. A framework for the quantitative study of evolutionary trees. Syst. Zool. 38, 297–309.
Hill, G.E., 2017. The mitonuclear compatibility species concept. Auk 134, 393–409.
Hill, G.E., Johnson, J.D., 2013. The mitonuclear compatibility hypothesis of sexual selection. Proc. Royal Soc. B-Biol. Sci. 280, 20131314.
Hillis, D.M., 1996. Inferring complex phylogenies. Nature 383, 130–131.
Hosner, P.A., Sheldon, F.H., Lim, H.C., Moyle, R.G., 2010. Phylogeny and biogeography of the Asian trogons (Aves: Trogoniformes) inferred from nuclear and mitochondrial DNA sequences. Mol. Phylogenet. Evol. 57, 1219–1225.
Jarvis, E.D., Mirarab, S., Aberer, A.J., Li, B., Houde, P., Li, C., Ho, S.Y.W., Faircloth, B.C., Nabholz, B., Howard, J.T., Suh, A., Weber, C.C., da Fonseca, R.R., Li, J.W., Zhang, F., Li, H., Zhou, L., Narula, N., Liu, L., Ganapathy, G., Boussau, B., Bayzid, M.S., Zavidovych, V., Subramanian, S., Gabaldon, T., Capella-Gutierrez, S., Huerta-Cepas, J., Rekepalli, B., Munch, K., Schierup, M., Lindow, B., Warren, W.C., Ray, D., Green, R.E., Bruford, M.W., Zhan, X.J., Dixon, A., Li, S.B., Li, N., Huang, Y.H., Derryberry, E.P., Bertelsen, M.F., Sheldon, F.H., Brumfield, R.T., Mello, C.V., Lovell, P.V., Wirthlin, M., Schneider, M.P.C., Prosdocimi, F., Samaniego, J.A., Velazquez, A.M.V., Alfaro-Nunez, A., Campos, P.F., Petersen, B., Sicheritz-Ponten, T., Pas, A., Bailey, T., Scofield, P., Bunce, M., Lambert, D.M., Zhou, Q., Perelman, P., Driskell, A.C., Shapiro, B., Xiong, Z.J., Zeng, Y.L., Liu, S.P., Li, Z.Y., Liu, B.H., Wu, K., Xiao, J., Yinqi, X., Zheng, Q.M., Zhang, Y., Yang, H.M., Wang, J., Smeds, L.,
Rheindt, F.E., Braun, M., Fjeldsa, J., Orlando, L., Barker, F.K., Jonsson, K.A., Johnson, W., Koepfli, K.P., O'Brien, S., Haussler, D., Ryder, O.A., Rahbek, C., Willerslev, E., Graves, G.R., Glenn, T.C., McCormack, J., Burt, D., Ellegren, H., Alstrom, P., Edwards, S.V., Stamatakis, A., Mindell, D.P., Cracraft, J., Braun, E.L., Warnow, T., Jun, W., Gilbert, M.T.P., Zhang, G.J., 2014. Whole-genome analyses resolve early branches in the tree of life of modern birds. Science 346, 1320–1331.
Kim, J., 2000. Slicing hyperdimensional oranges: the geometry of phylogenetic estimation. Mol. Phylogenet. Evol. 17, 58–75.
Kimball, R.T., Wang, N., Heimer-McGinn, V., Ferguson, C., Braun, E.L., 2013. Identifying localized biases in large datasets: A case study using the avian tree of life. Mol. Phylogenet. Evol. 69, 1021–1032.
Koch, R.E., Josefson, C.C., Hill, G.E., 2017. Mitochondrial function, ornamentation, and immunocompetence. Biol. Rev. 92, 1459–1474.
Ksepka, D.T., Ware, J.L., Lamm, K.S., 2014. Flying rocks and flying clocks: disparity in fossil and molecular dates for birds. Proc. Royal Soc. B-Biol. Sci. 281, 20140677.
Kumar, S., 1996. Patterns of nucleotide substitution in mitochondrial protein coding genes of vertebrates. Genetics 143, 537–548.
Lanfear, R., Calcott, B., Ho, S.W.Y., Guindon, S., 2012. PartitionFinder: Combined selection of partitioning schemes and substitution models for phylogenetic analyses. Mol. Biol. Evol. 29, 1695–1701.
Lanfear, R., Calcott, B., Kainer, D., Mayer, C., Stamatakis, A., 2014. Selecting optimal partitioning schemes for phylogenomic datasets. BMC Evol. Biol. 14.
Leavitt, J.R., Hiatt, K.D., Whiting, M.F., Song, H.J., 2013. Searching for the optimal data partitioning strategy in mitochondrial phylogenomics: A phylogeny of Acridoidea (Insecta: Orthoptera: Caelifera) as a case study. Mol. Phylogenet. Evol. 67, 494–508.
Maddison, W.P., 1997. Gene trees in species trees. Syst. Biol. 46, 523–536.
Maddison, W.P., Maddison, D.R., 2016.
Mesquite: a modular system for evolutionary analysis. Version 3.40 http://mesquiteproject.org.
Mahmood, M.T., McLenachan, P.A., Gibb, G.C., Penny, D., 2014. Phylogenetic position of avian nocturnal and diurnal raptors. Genome Biol. Evol. 6, 326–332.
McCormack, J.E., Harvey, M.G., Faircloth, B.C., Crawford, N.G., Glenn, T.C., Brumfield, R.T., 2013. A Phylogeny of birds based on over 1,500 loci collected by target enrichment and high-throughput sequencing. PLoS One 8.
Meiklejohn, K.A., Danielson, M.J., Faircloth, B.C., Glenn, T.C., Braun, E.L., Kimball, R.T., 2014. Incongruence among different mitochondrial regions: A case study using complete mitogenomes. Mol. Phylogenet. Evol. 78, 314–323.
Miller, M.A., Pfeiffer, W., Schwartz, T., 2010. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. Proceedings of the Gateway Computing Environments Workshop (GCE), New Orleans, LA 1–8.
Mindell, D.P., Sorenson, M.D., Dimcheff, D.E., 1998. Multiple independent origins of mitochondrial gene order in birds. PNAS 95, 10693–10697.
Mindell, D.P., Sorenson, M.D., Dimcheff, D.E., Hasegawa, M., Ast, J.C., Yuri, T., 1999. Interordinal relationships of birds and other reptiles based on whole mitochondrial genomes. Syst. Biol. 48, 138–152.
Minh, B.Q., Nguyen, M.A., von Haeseler, A., 2013. Ultrafast approximation for phylogenetic bootstrap. Mol. Biol. Evol. 30, 1188–1195.
Mitchell, K.J., Llamas, B., Soubrier, J., Rawlence, N.J., Worthy, T.H., Wood, J., Lee, M.S.Y., Cooper, A., 2014. Ancient DNA reveals elephant birds and kiwi are sister taxa and clarifies ratite bird evolution. Science 344, 898–900.
Mitchell, K.J., Cooper, A., Phillips, M.J., 2015. Comment on “Whole-genome analyses resolve early branches in the tree of life of modern birds”. Science 349, 1460–U1452.
Moore, W.S., 1995. Inferring phylogenies from mtDNA variation: Mitochondrial-gene trees versus nuclear-gene trees. Evolution 49, 718–726.
Moyle, R.G., 2005.
Phylogeny and biogeographical history of Trogoniformes, a pantropical bird order. Biol. J. Linn. Soc. 84, 725–738.
Nguyen, L.T., Schmidt, H.A., von Haeseler, A., Minh, B.Q., 2015. IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274.
Pacheco, M.A., Battistuzzi, F.U., Lentino, M., Aguilar, R.F., Kumar, S., Escalante, A.A., 2011. Evolution of modern birds revealed by mitogenomics: Timing the radiation and origin of major orders. Mol. Biol. Evol. 28, 1927–1942.
Pamilo, P., Nei, M., 1988. Relationships between gene trees and species trees. Mol. Biol. Evol. 5, 568–583.
Patel, S., Kimball, R.T., Braun, E.L., 2013. Error in phylogenetic estimation for bushes in the Tree of Life. J. Phylogenet. Evol. Biol. 1, 110.
Poe, S., Chubb, A.L., 2004. Birds in a bush: Five genes indicate explosive evolution of avian orders. Evolution 58, 404–415.
Poe, S., Swofford, D.L., 1999. Taxon sampling revisited. Nature 398, 299–300.
Pollock, D.D., Zwickl, D.J., McGuire, J.A., Hillis, D.M., 2002. Increased taxon sampling is advantageous for phylogenetic inference. Syst. Biol. 51, 664–671.
Posada, D., Buckley, T.R., 2004. Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Syst. Biol. 53, 793–808.
Powell, A.F.L.A., Barker, F.K., Lanyon, S.M., 2013. Empirical evaluation of partitioning schemes for phylogenetic analyses of mitogenomic data: an avian case study. Mol. Phylogenet. Evol. 66, 69–79.
Pratt, R.C., Gibb, G.C., Morgan-Richards, M., Phillips, M.J., Hendy, M.D., Penny, D., 2009. Toward resolving deep Neoaves phylogeny: Data, signal enhancement, and priors. Mol. Biol. Evol. 26, 313–326.
Prum, R.O., Berv, J.S., Dornburg, A., Field, D.J., Townsend, J.P., Lemmon, E.M., Lemmon, A.R., 2015. A comprehensive phylogeny of birds (Aves) using targeted next-generation DNA sequencing. Nature 526, 569–U247.
Reddy, S., Kimball, R.T., Pandey, A., Hosner, P.A., Braun, M.J., Hackett, S.J., Han, K.L., Harshman, J., Huddleston, C.J., Kingston, S., Marks, B., Miglia, K.J., Moore, W.S., Sheldon, F.H., Witt, C.C., Yuri, T., Braun, E.L., 2017. Why do phylogenomic data sets yield conflicting trees? Data type influences the avian tree of life more than taxon sampling. Syst. Biol. 66, 857–879.
Richards, E.J., Brown, J.M., Barley, A.J., Chong, R.A., Thomson, R.C., 2018. Variation across mitochondrial gene trees provides evidence for systematic error: How much gene tree variation is biological? Syst. Biol. https://doi.org/10.1093/sysbio/syy013.
Rosel, P.E., Block, B.A., 1996. Mitochondrial control region variability and global population structure in the swordfish, Xiphias gladius. Mar. Biol. 125, 11–22.
Rubinoff, D., Holland, B.S., 2005. Between two extremes: mitochondrial DNA is neither the panacea nor the nemesis of phylogenetic and taxonomic inference. Syst. Biol. 54 (6), 952–961.
Russell, R.D., Beckenbach, A.T., 2008. Recoding of translation in turtle mitochondrial genomes: programmed frameshift mutations and evidence of a modified genetic code. J. Mol. Evol. 67, 682–695.
Sanderson, M.J., Kim, J., 2000. Parametric phylogenetics? Syst. Biol. 49, 817–829.
Simon, C., Frati, F., Beckenbach, A., Crespi, B., Liu, H., Flook, P., 1994. Evolution, weighting, and phylogenetic utility of mitochondrial gene-sequences and a compilation of conserved polymerase chain-reaction primers. Ann. Entomol. Soc. Am. 87, 651–701.
Slack, K.E., Delsuc, F., McLenachan, P.A., Arnason, U., Penny, D., 2007. Resolving the root of the avian mitogenomic tree by breaking up long branches. Mol. Phylogenet. Evol. 42, 1–13.
Soltis, D.E., Albert, V.A., Savolainen, V., Hilu, K., Qiu, Y.L., Chase, M.W., Farris, J.S., Stefanovic, S., Rice, D.W., Palmer, J.D., Soltis, P.S., 2004. Genome-scale data, angiosperm relationships, and 'ending incongruence': a cautionary tale in phylogenetics. Trends Plant Sci.
9, 477–483.
Springer, M.S., Gatesy, J., 2016. The gene tree delusion. Mol. Phylogenet. Evol. 94, 1–33.
Springer, M.S., Gatesy, J., 2018. Delimiting coalescence genes (C-Genes) in phylogenomic data sets. Genes 9, 123.
Stamatakis, A., 2006. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690.
Steel, M., 2005. Should phylogenetic models be trying to 'fit an elephant'? Trends Genet. 21, 307–309.
Suh, A., 2016. The phylogenomic forest of bird trees contains a hard polytomy at the root of Neoaves. Zoolog. Scr. 45, 50–62.
van Tuinen, M., Sibley, C.G., Hedges, S.B., 2000. The early history of modern birds inferred from DNA sequences of nuclear and mitochondrial ribosomal genes. Mol. Biol. Evol. 17, 451–457.
Wang, N., Braun, E.L., Kimball, R.T., 2012. Testing hypotheses about the sister group of the Passeriformes using an independent 30-locus data set. Mol. Biol. Evol. 29, 737–750.
Wang, N., Hosner, P.A., Liang, B., Braun, E.L., Kimball, R.T., 2017. Historical relationships of three enigmatic phasianid genera (Aves: Galliformes) inferred using phylogenomic and mitogenomic data. Mol. Phylogenet. Evol. 109, 217–225.
Yang, Z., 1995. A space-time process model for the evolution of DNA-sequences. Genetics 139, 993–1005.
Yuri, T., Kimball, R.T., Harshman, J., Bowie, R.C.K., Braun, M.J., Chojnowski, J.L., Han, K.-L., Hackett, S.J., Huddleston, C.J., Moore, W.S., Reddy, S., Sheldon, F.H., Steadman, D.W., Witt, C.C., Braun, E.L., 2013. Parsimony and model-based analyses of indels in avian nuclear genes reveal congruent and incongruent phylogenetic signals. Biology 2, 419–444.
Zwickl, D.J., Hillis, D.M., 2002. Increased taxon sampling greatly reduces phylogenetic error. Syst. Biol. 51, 588–598.