Tuesday, June 28, 2016

Rate variation and gene tree discordance

Several years ago, I published a post about variation in nucleotide substitution rates along lineages: "Is rate variation among lineages actually due to reticulation?"

In that post I suggested that a reticulate evolutionary history would likely be modelled as apparent rate variation if the data were forced into a tree model. That is, in a tree model, the sudden influx of new genomic material due to the reticulation event could only be modelled as a sudden change in substitution rate. Therefore, a lot of what tree-based phylogeneticists see as rate variation might actually be reticulation.

I have often wondered why this topic of pseudo-rate variation has not been investigated. It turns out that it now has been, to some extent. Fábio K. Mendes and Matthew W. Hahn (2016. Gene tree discordance causes apparent substitution rate variation. Systematic Biology 65: 711-721) have at least confirmed the idea in terms of gene tree discordance.

They note:
"Substitution rates are known to be variable among genes, chromosomes, species, and lineages due to multifarious biological processes. Here, we consider another source of substitution rate variation due to a technical bias associated with gene tree discordance. Discordance has been found to be rampant in genome-wide data sets, often due to incomplete lineage sorting (ILS). This apparent substitution rate variation is caused when substitutions that occur on discordant gene trees are analyzed in the context of a single, fixed species tree. Such substitutions have to be resolved by proposing multiple substitutions on the species tree."
All of this is true, and the authors demonstrate this using simulations. They show that the artificially increased level of apparent rate variation becomes more obvious with increasing levels of ILS, and on trees with larger numbers of taxa.
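
The core of the argument can be seen in a toy case. The sketch below is not the authors' simulation; it is a minimal illustration that uses Fitch parsimony as a stand-in for counting substitutions. The taxon names (A, B, C and an outgroup O), the two topologies, the single site pattern, and the function name fitch_changes are all made up for the example. One substitution on the branch leading to (B, C) in the discordant gene tree has to be explained by two substitutions once the same site is forced onto the species tree.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one site (Fitch parsimony).

    tree   -- nested 2-tuples of leaf names (hypothetical toy format)
    states -- dict mapping each leaf name to its character state
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its observed state
            return {states[node]}
        left_child, right_child = node     # internal node: exactly two children
        left = visit(left_child)
        right = visit(right_child)
        common = left & right
        if common:                         # children agree: no change needed
            return common
        changes += 1                       # children disagree: one change
        return left | right

    visit(tree)
    return changes


# Assumed topologies: the species tree groups A with B,
# but this particular gene tree groups B with C.
species_tree = ((("A", "B"), "C"), "O")
gene_tree = (("A", ("B", "C")), "O")

# One site where B and C share a derived state "1".
site = {"A": "0", "B": "1", "C": "1", "O": "0"}

on_gene_tree = fitch_changes(gene_tree, site)
on_species_tree = fitch_changes(species_tree, site)

print("changes needed on the gene tree:   ", on_gene_tree)
print("changes needed on the species tree:", on_species_tree)
```

Summed over many discordant sites, these extra inferred substitutions are what would show up as apparent rate variation along the species tree.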

Now, all that remains is to demonstrate the same thing when the gene tree discordance is due to reticulation rather than incomplete lineage sorting.