Impact of Gene Molecular Evolution on Phylogenetic Reconstruction: A Case Study in the Rosids (Superorder Rosanae, Angiosperms)
The substitution rate of genomic regions is among the most debated intrinsic features affecting phylogenetic informativeness. However, this variable is coupled with the rate of nonsynonymous substitutions, which reflects the nature and degree of selection acting on the genes. To address these variables empirically, we constructed four completely overlapping data sets of the plastid genes matK, atpB, and rbcL and the mitochondrial gene matR, using the rosid lineage (angiosperms) as a working platform. The genes differ in their combinations of overall rates of nucleotide and amino acid substitution. Tree robustness, homoplasy, accuracy relative to a reference tree, and phylogenetic informativeness were evaluated. The rapidly evolving/unconstrained matK fared best, whereas the remaining genes varied in their degree of contribution to rosid phylogenetics across the lineage's 108-million-year evolutionary history. Phylogenetic accuracy was low with the slowly evolving/unconstrained matR despite its having the least homoplasy. Third codon positions contributed the largest number of parsimony-informative sites and the highest resolution and informativeness, but the magnitude varied with the gene's mode of evolution. These findings are in clear contrast with the view that rapidly evolving regions and the third codon position inevitably have a negative impact on phylogenetic reconstruction at deep historical levels because of the accumulation of multiple hits and the resulting increase in homoplasy and saturation. Relaxed evolutionary constraint in rapidly evolving genes distributes substitutions across codon positions, an evolutionary mode expected to reduce the frequency of multiple hits. These findings should be tested at deeper evolutionary levels.