Why do phylogenomic data sets yield conflicting trees? Data type influences the avian tree of life more than taxon sampling

dc.contributor.author: Reddy, Sushma
dc.contributor.author: Kimball, Rebecca T.
dc.contributor.author: Pandey, Akanksha
dc.contributor.author: Hosner, Peter A.
dc.contributor.author: Braun, Michael J.
dc.contributor.author: Hackett, Shannon J.
dc.contributor.author: Han, Kin-Lan
dc.contributor.author: Harshman, John
dc.contributor.author: Huddleston, Christopher J.
dc.contributor.author: Kingston, Sarah
dc.contributor.author: Marks, Ben D.
dc.contributor.author: Miglia, Kathleen J.
dc.contributor.author: Moore, William S.
dc.contributor.author: Sheldon, Frederick H.
dc.contributor.author: Witt, Christopher C.
dc.contributor.author: Yuri, Tamaki
dc.contributor.author: Braun, Edward L.
dc.date.accessioned: 2017-06-01T09:02:08Z
dc.date.available: 2017-06-01T09:02:08Z
dc.date.issued: 2017
dc.description.abstract: Phylogenomics, the use of large-scale data matrices in phylogenetic analyses, has been viewed as the ultimate solution to the problem of resolving difficult nodes in the tree of life. However, it has become clear that analyses of these large genomic datasets can also result in conflicting estimates of phylogeny. Here we use the early divergences in Neoaves, the largest clade of extant birds, as a 'model system' to understand the basis for incongruence among phylogenomic trees. We were motivated by the observation that trees from two recent avian phylogenomic studies exhibit conflicts. Those studies used different strategies: 1) collecting many characters [~42 mega base pairs (Mbp) of sequence data] from 48 birds, sometimes including only one taxon for each major clade; and 2) collecting fewer characters (~0.4 Mbp) from 198 birds, selected to subdivide long branches. However, the studies also used different data types: the taxon-poor data matrix comprised 68% non-coding sequences, whereas coding exons dominated the taxon-rich data matrix. This difference raises the question of whether the primary reason for incongruence is the number of sites, the number of taxa, or the data type. To test among these alternative hypotheses, we assembled a novel, large-scale data matrix comprising 90% non-coding sequences from 235 bird species. Although increased taxon sampling appeared to have a positive impact on phylogenetic analyses, the most important variable was data type. Indeed, by analyzing different subsets of the taxa in our data matrix, we found that increased taxon sampling actually resulted in increased congruence with the tree from the previous taxon-poor study (which had a majority of non-coding data) instead of the taxon-rich study (which largely used coding data). We suggest that the observed differences in the estimates of topology for these studies reflect data-type effects due to violations of the models used in phylogenetic analyses, some of which may be difficult to detect. If incongruence among trees estimated using phylogenomic methods largely reflects problems with model fit, developing more 'biologically-realistic' models is likely to be critical for efforts to reconstruct the tree of life.
dc.format.extent: 857–879
dc.identifier: 1063-5157
dc.identifier.citation: Reddy, Sushma, Kimball, Rebecca T., Pandey, Akanksha, Hosner, Peter A., Braun, Michael J., Hackett, Shannon J., Han, Kin-Lan, Harshman, John, Huddleston, Christopher J., Kingston, Sarah, Marks, Ben D., Miglia, Kathleen J., Moore, William S., Sheldon, Frederick H., Witt, Christopher C., Yuri, Tamaki, and Braun, Edward L. 2017. "Why do phylogenomic data sets yield conflicting trees? Data type influences the avian tree of life more than taxon sampling." Systematic Biology, 66(5), 857–879. https://doi.org/10.1093/sysbio/syx041.
dc.identifier.issn: 1063-5157
dc.identifier.uri: https://hdl.handle.net/10088/32489
dc.publisher: Oxford University Press
dc.relation.ispartof: Systematic Biology 66 (5)
dc.title: Why do phylogenomic data sets yield conflicting trees? Data type influences the avian tree of life more than taxon sampling
dc.type: article
sro.description.unit: NMNH
sro.description.unit: NH-Vertebrate Zoology
sro.identifier.doi: 10.1093/sysbio/syx041
sro.identifier.itemID: 142763
sro.identifier.refworksID: 73170
sro.identifier.url: https://repository.si.edu/handle/10088/32489
sro.publicationPlace: Cambridge, England

Files

Original bundle

Name: 2017 (Reddy et al. ) copy.pdf
Size: 1.08 MB
Format: Adobe Portable Document Format