Ancient and modern colonization of North America by hemlock woolly adelgid, Adelges tsugae (Hemiptera: Adelgidae), an invasive insect from East Asia

Hemlock woolly adelgid, Adelges tsugae, is an invasive pest of hemlock trees (Tsuga) in eastern North America. We used 14 microsatellites and mitochondrial COI sequences to assess its worldwide genetic structure and reconstruct its colonization history. The resulting information about its life cycle, biogeography and host specialization could help predict invasion by insect herbivores. We identified eight endemic lineages of hemlock adelgids in central China, western China, Ulleung Island (South Korea), western North America, and two each in Taiwan and Japan, with the Japanese lineages specializing on different Tsuga species. Adelgid life cycles varied at local and continental scales with different sexual, obligately asexual and facultatively asexual lineages. Adelgids in western North America exhibited very high microsatellite heterozygosity, which suggests ancient asexuality. The earliest lineages diverged in Asia during Pleistocene glacial periods, as estimated using approximate Bayesian computation. Colonization of western North America was estimated to have occurred prior to the last glacial period by adelgids directly ancestral to those in southern Japan, perhaps carried by birds. The modern invasion from southern Japan to eastern North America caused an extreme genetic bottleneck with just two closely related clones detected throughout the introduced range. Both colonization events to North America involved host shifts to unrelated hemlock species. These results suggest that genetic diversity, host specialization and host phylogeny are not predictive of adelgid invasion. Monitoring non‐native sentinel host trees and focusing on invasion pathways might be more effective methods of preventing invasion than making predictions using species traits or evolutionary history.


Introduction
Invasions by non-native herbivorous insects continue to threaten forest ecosystems (Liebhold et al. 1995; Aukema et al. 2010) as a consequence of accelerating arrival rates with international trade (Levine & D'Antonio 2003; Lockwood et al. 2005; Brockerhoff et al. 2014; Blackburn et al. 2015). Fortunately, most introductions of non-native insects do not result in successful establishment (Williamson & Fitter 1996; Aukema et al. 2010), although a few become serious invasive pests. This division between innocuous and invasive introductions has been difficult to predict, so ongoing research seeks to identify species traits that correlate with invasive potential. Some candidate characteristics include high genetic diversity, broad host range and the ability to reproduce asexually (e.g. Kolar & Lodge 2001; Jeschke & Strayer 2006; Liebhold & Tobin 2008; Dlugosch et al. 2015). However, there is little evidence that any one trait can reliably predict invasiveness, especially for herbivorous insects (Williamson 1996; Sakai et al. 2001; National Research Council 2002; Hayes & Barry 2008).
An alternative to the trait-based approach is to examine the evolutionary history of herbivores and their host plants, as a context in which to evaluate the potential for invasion. Many invasive herbivores have resistant host plant species in their native range and susceptible congeneric host species in their introduced range (e.g. Granett et al. 2001; Rebek et al. 2008; Desurmont et al. 2011; Nielsen et al. 2011). This pattern led to the concept of 'defence free space', where invasion is enabled by a lack of effective defences in host plants outside a herbivore's native geographic range (Gandhi & Herms 2010). But not every herbivore finds susceptible host plants after being introduced. For example, 86% of the 455 species of non-native herbivores that feed on trees in the United States have no record of causing damage (Aukema et al. 2010). Therefore, a lack of evolutionary association is not by itself predictive of invasion, but perhaps other patterns in the evolutionary history of herbivores and their host plants could be informative. For example, a species might be more likely to invade a region with host plants that are phylogenetically closely related to those in its native range.
Hemlock woolly adelgid, Adelges tsugae Annand, is a non-native pest that severely impacts the two hemlock (Tsuga) species native to eastern North America: T. canadensis and T. caroliniana (Orwig & Foster 1998; Siderhurst et al. 2010; Brantley et al. 2013). The native geographic range of A. tsugae includes the distributions of the other nine hemlock species found in East Asia and western North America (Havill et al. 2014). Mitochondrial and nuclear DNA sequence data from previous studies indicate that the lineage of A. tsugae introduced to eastern North America originated from a population in southern Japan that specializes on T. sieboldii (Havill et al. 2006), but fine-scale, worldwide population genetic studies have not been completed for this species. Hemlock adelgids have a complex life cycle, with cyclical parthenogenesis and migration between alternate hosts (Fig. 1). Populations can be either holocyclic, with host alternation between hemlock and spruce (Picea) where there is a sexual generation, or anholocyclic, with no host alternation and only parthenogenetic reproduction on hemlock. The adelgid holocycle takes 2 years to complete, and the insects can overwinter on hemlock or spruce, making host alternation facultative.
In this study, we use data from nuclear and mitochondrial DNA to examine the patterns of genetic variation in A. tsugae throughout its native and introduced ranges to elucidate patterns of life cycle variation, historical biogeography and host association. We discuss our findings in the context of testing the ability to predict invasion by insect herbivores.

Insect sampling and genotyping
Samples of A. tsugae were collected between 2002 and 2015 from host trees growing in their native ranges (Fig. 2, Figs S1-S4 & Table S1, Supporting information). Sample sites were pooled across years and among localities <10 km apart. For samples on hemlock, we genotyped a single insect per tree, because asexual reproduction on hemlock makes it likely that individuals from the same tree will share the same genotype (Fig. 1). In contrast, on spruce, where each gall is founded by an individual with sexual parents, we genotyped multiple individuals per tree, each from a different gall.
Adelgids were genotyped with 14 microsatellite loci following previously published protocols (Abdoullaye et al. 2009). Only samples with complete 14-locus genotypes were used for analyses. For most samples, a 651-base pair portion of the mitochondrial cytochrome oxidase subunit I (COI) gene was sequenced using standard protocols (deWaard et al. 2008). Additional details for genotyping and sequencing can be found in Appendix S1, Supporting information.
GENODIVE v 2.0b25 (Meirmans & Van Tienderen 2004) was used to assign microsatellite genotypes to clonal multilocus lineages (MLLs; Arnaud-Haond et al. 2007), based on a threshold distance of eight steps between individuals, chosen by examining the frequency distribution of pairwise distances between genotypes (Fig. S5, Supporting information). Species with both sexual and asexual reproduction are expected to have a multimodal frequency distribution, with the first peak near zero representing clonal individuals and scoring errors, and subsequent peaks representing somatic mutations, sibling matings or population structure (Meirmans & Van Tienderen 2004).
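The threshold-based assignment described above can be illustrated with a small sketch. This is not GENODIVE's implementation: the stepwise distance, the single-linkage grouping, the inclusive treatment of the threshold and the three-locus genotypes are all assumptions made for illustration (the study used 14 loci).

```python
from itertools import combinations


def locus_distance(g1, g2):
    """Stepwise distance between two diploid genotypes at one locus.

    Each genotype is a pair of allele sizes in repeat units. Alleles are
    paired in whichever orientation gives the smaller total difference.
    """
    a, b = g1
    c, d = g2
    return min(abs(a - c) + abs(b - d), abs(a - d) + abs(b - c))


def genotype_distance(ind1, ind2):
    """Sum of per-locus stepwise distances across all loci."""
    return sum(locus_distance(x, y) for x, y in zip(ind1, ind2))


def assign_mlls(genotypes, threshold):
    """Group individuals whose pairwise distance is <= threshold.

    Uses single-linkage grouping via union-find and returns one integer
    label per individual, numbered in order of first appearance.
    """
    parent = list(range(len(genotypes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(genotypes)), 2):
        if genotype_distance(genotypes[i], genotypes[j]) <= threshold:
            parent[find(i)] = find(j)

    labels, seen = [], {}
    for i in range(len(genotypes)):
        labels.append(seen.setdefault(find(i), len(seen)))
    return labels


# Hypothetical three-locus genotypes, for illustration only.
samples = [
    ((10, 12), (20, 20), (15, 17)),
    ((10, 12), (20, 21), (15, 17)),  # 1 step from the first sample
    ((10, 22), (20, 20), (15, 17)),  # 10 steps from the first sample
    ((14, 18), (25, 27), (11, 19)),  # distant from all others
]
mll_labels = assign_mlls(samples, threshold=8)
print(mll_labels)
```

With a threshold of eight steps, the first two samples collapse into one lineage, while a genotype ten steps away is kept as a separate lineage.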

Microsatellite and mitochondrial clusters
To identify the number of different genetic clusters in the microsatellite data, discriminant analysis of principal components (DAPC; Jombart et al. 2010) was performed using ADEGENET v 1.3.9 (Jombart 2008) in R v 3.1.1 (R Core Team 2014). The evolutionary relationship among all unique COI haplotypes was reconstructed with MRBAYES v 3.1.2 (Ronquist & Huelsenbeck 2003). Sequences from five other adelgid species were used to root the tree. COI haplotype clustering was also examined by constructing a network based on the statistical parsimony method of Templeton et al. (1992) using the software TCS v 1.21 (Clement et al. 2000) with a 95% confidence limit. Additional details of clustering analyses are provided in Appendix S1, Supporting information.
Clonal diversity, molecular diversity, and reproductive mode
Clonal diversity was calculated for sites with 15 or more samples as the ratio of the number of MLLs to the number of individuals, and estimated using Simpson's diversity index calculated with GENODIVE. To detect signatures of sexual reproduction, the standardized index of multilocus linkage disequilibrium (r̄d; Agapow & Burt 2001) was calculated for sampling sites with 15 or more unique MLLs and tested for deviation from zero with 1000 random permutations, using POPPR v 1.1.2 (Kamvar et al. 2014). ARLEQUIN v 3.5.1.3 (Excoffier et al. 2005) was used to calculate the mean number of alleles per locus, expected and observed heterozygosities, and to test for differentiation (FST) among sampling sites with 15 or more samples using the infinite allele model and 1000 permutations. SMOGD v 1.2.5 (Crawford 2010) was used to calculate the pairwise harmonic mean of differentiation (Dest) among sampling sites (Jost 2008). This additional method of quantifying genetic differentiation is less sensitive than FST to underestimating differentiation when allelic diversity is high (Jost 2008). A one-tailed Wilcoxon sign-rank test, implemented in BOTTLENECK v 1.2.02 (Cornuet & Luikart 1996), was used to test for an excess of observed heterozygotes across all loci.
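The two clonal diversity measures can be sketched as follows. The function names and the MLL labels are hypothetical, and the sample-size-corrected form of Simpson's index used here is an assumption; GENODIVE's exact estimator may differ.

```python
from collections import Counter


def clonal_richness_ratio(mll_labels):
    """Number of distinct MLLs divided by the number of individuals."""
    return len(set(mll_labels)) / len(mll_labels)


def simpson_diversity(mll_labels):
    """Sample-size-corrected Simpson's index: 1 - sum n_i(n_i - 1) / (N(N - 1)).

    Equals 0 when every individual belongs to one MLL and 1 when every
    individual has a unique MLL.
    """
    n = len(mll_labels)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(mll_labels).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Hypothetical site with 15 individuals assigned to three MLLs.
site_labels = ["A"] * 12 + ["B"] * 2 + ["C"]
print(clonal_richness_ratio(site_labels))
print(simpson_diversity(site_labels))
```

The ratio only counts lineages, whereas Simpson's index also reflects how evenly individuals are spread among them, so a site dominated by one clone scores low on both.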

History of colonization from Asia to western North America
Four scenarios of colonization from Asia to western North America (Fig. 3) were tested using approximate Bayesian computation (ABC; Beaumont et al. 2002), implemented with DIYABC v 2.0 (Cornuet et al. 2014). The software assumes standard sexual reproduction (Cornuet et al. 2010), so a single individual per unique MLL that had both microsatellite and mitochondrial data was included to focus on the sexual generation in the adelgid life cycle. Confidence in model choice (fit to the observed data) and scenario choice (type I and type II error rates) were evaluated using the analyses provided by DIYABC. The divergence times in generations, output by DIYABC, were converted to years by dividing the number of generations by 2.25 based on the adelgid life cycle (Fig. 1). Additional details of the ABC analysis are provided in Appendix S1, Supporting information.
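The conversion from generations to years is a single division, sketched here. The function name is hypothetical and the input is an illustrative value, not an estimate taken from Table S2.

```python
# Conversion factor stated in the text (generations per year).
GENERATIONS_PER_YEAR = 2.25


def generations_to_years(generations, generations_per_year=GENERATIONS_PER_YEAR):
    """Convert a divergence time in generations to years."""
    if generations_per_year <= 0:
        raise ValueError("generations_per_year must be positive")
    return generations / generations_per_year


# Illustrative input only.
print(generations_to_years(51750))
```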

Clonal diversity
Microsatellites were genotyped for 748 individuals, from 133 sampling sites across the A. tsugae native and introduced ranges (Fig. 2, S1-S4, Supporting information). Genotypes were assigned to 442 MLLs (Table 1). All but one sample in eastern North America were members of the same MLL, which was also found in Nakahata (Osaka Prefecture) and Kobe (Hyogo Prefecture), Japan. A second eastern North American MLL, differing from the dominant MLL by ten dinucleotide repeats in one allele at one locus, was found in one sample from Manchester in eastern Massachusetts. Western North America had higher clonal diversity than eastern North America, with 13 MLLs. The most common and widespread MLL in western North America was shared by 43 individuals from 19 of the 25 sampling sites (Table S1, Supporting information).

Microsatellite and mitochondrial clusters
DAPC analysis of microsatellite genotypes separated the samples into k = 4 major clusters (Fig. 4A). These correspond to broad geographic structure and host use. Subsequent DAPC analysis of nested population structure in Cluster 4, which includes samples from several regions, split them into k = 6 subclusters (Fig. 4B), with MLLs from Taiwan and from western North America forming discrete clusters. Pairwise differentiation (FST and Dest) between sampling sites showed the same hierarchical pattern with higher values among the major clusters that were identified with DAPC than among sampling sites within these clusters (Table S3, Supporting information).
COI sequences were generated for 446 individuals (423 with microsatellite genotypes), resulting in 105 unique haplotypes (Table S1, Supporting information): 22 haplotypes in continental China, five in Taiwan, one in Ulleung Island (South Korea), 22 in western North America, 54 in Japan and one in both Japan and all eastern North American samples. The eastern North American haplotype was also found in samples collected in southern Japan on T. sieboldii in Kobe (Hyogo Prefecture) and Mt. Koya-san (Wakayama Prefecture) and on P. torano in Odai (Nara Prefecture) (Table S1, Supporting information).
The clustering pattern of COI haplotypes was consistent with that from DAPC analysis of microsatellite genotypes. In the COI phylogeny (Fig. 5; detailed in Fig. S8, Supporting information), the haplotype from central China (Guizhou) was in a basal position to all other samples in our study, followed by a clade with samples from western China. The Taiwan samples formed two clades that were basal to the Japanese, Ulleung Island, and North American samples. The haplotype network had five major clusters: the sample from central China (Guizhou), two clusters from Taiwan, a cluster from western China and a large cluster with the haplotypes from Japan, Ulleung Island, western North America and eastern North America (Fig. 6). This last cluster included a distinct subcluster of samples collected from T. diversifolia. The haplotypes from T. sieboldii and P. torano fell in two subclusters with a western North American haplotype between them.

Clonal diversity and life cycle
Major lineages and sampling sites varied in the amount of clonal reproduction (Table 1). Samples from eastern North America and western China were the least and the most diverse, respectively. Adelgids sampled at sites in western China, Taiwan and southern Japan on T. sieboldii had genetic signatures of sexual reproduction (r̄d) that were not statistically rejected, and/or collections with matching mitochondrial haplotypes from both hemlock and spruce, which also implies sexual reproduction (Table 2). In contrast, samples from Japan on T. diversifolia and from western North America did not show evidence of host alternation or sexual reproduction. Observed heterozygosity (Ho) was much higher than expected heterozygosity (He) in western North America: 0.942 vs. 0.591, respectively (Wilcoxon sign-rank test, P = 0.003).
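To make the comparison between Ho and He concrete, the sketch below computes both for one locus. The function names and genotypes are hypothetical, and the small-sample correction shown is Nei's; ARLEQUIN's exact estimator is not confirmed here. In a clonal lineage where every individual carries the same two different alleles, all individuals are heterozygous, yet the Hardy-Weinberg expectation from allele frequencies is only about one half.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at a locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes, unbiased=True):
    """Hardy-Weinberg expectation, 1 - sum(p_i^2), from allele frequencies.

    With unbiased=True, applies the small-sample correction 2n / (2n - 1),
    where 2n is the number of sampled allele copies.
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    total = len(alleles)
    he = 1.0 - sum((c / total) ** 2 for c in Counter(alleles).values())
    if unbiased:
        he *= total / (total - 1)
    return he


# Hypothetical clonal lineage: ten individuals, all heterozygous 10/12.
clone = [(10, 12)] * 10
print(observed_heterozygosity(clone))
print(expected_heterozygosity(clone))
```

An excess of Ho over He of this kind is the pattern the heterozygote-excess test is designed to detect.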

Colonization from Asia to western North America
ABC analysis showed that the best-fit scenario of colonization from Asia to western North America involved a split from a direct ancestor of the adelgid lineage on T. sieboldii and P. torano in Japan (Fig. 3; Scenario 1). A principal component plot for evaluating model choice showed that the summary statistics generated using posterior parameters were a good approximation of the observed data (Fig. S9, Supporting information). The mutation rates for the most likely scenario were estimated to be 6.23 × 10⁻⁸ COI sequence mutations per generation and 9.46 × 10⁻⁶ microsatellite mutations per generation. These values are within the ranges estimated for other insects (Zhang & Hewitt 2003; Papadopoulou et al. 2010). Posterior distributions of divergence times in generations are shown in Fig. S10 and listed in Table S2, Supporting information. Table 3 shows these estimates converted to years.

Discussion
We describe a range of diversity within hemlock adelgids that was unexpected given that they are all considered a single species. Previous studies of hemlock adelgid phylogeny using DNA sequence data suggested five distinct lineages of hemlock adelgids: in western China, Taiwan, western North America and two in Japan that specialize on each Japanese hemlock species, T. diversifolia and T. sieboldii (Havill et al. 2006).

Life cycle
At both the local and continental scales, A. tsugae includes a patchwork of sexual and asexual lineages. Sexual reproduction and host alternation to spruce (Picea) are strictly linked in adelgids, such that migration from hemlock to spruce is required for the sexual generation for all species with known life cycles. Our data confirm that sexual, host alternating hemlock adelgids use P. torano as a primary host in Japan (Inouye 1945; Sato 1999), and P. likiangensis and P. brachytyla in China (Foottit et al. 2009) (Table 1). The data also suggest that hemlock adelgids migrate to spruce in Taiwan because we find a signature of sexual reproduction in one of the lineages sampled there. Picea morrisonicola is the likely alternate host because we observed dried galls on this species that were similar in morphology to those induced by A. tsugae in western China and Japan (S. Shiyake and M. Sano, personal observation). In addition, Chen et al. (2014) identified galls on P. morrisonicola as belonging to A. tsugae.
Within the lineages that are capable of host alternation, some populations are facultatively asexual, simply because they are not in the vicinity of spruce hosts. For example, adelgids from Yakushima Island in southern Japan, where P. torano does not grow, exhibit low clonal diversity and no signature of sexual reproduction. On the other hand, P. torano is common at the Mt. Mishotai site where there is high adelgid clonal diversity and a   Table S1, Supporting information), this is an example of an extreme genetic bottleneck resulting from invasion. The second MLL differed by mutations in just one microsatellite allele, and it was found in just a single sample collected in Massachusetts, far from the suspected site of original introduction in Virginia (Stoetzel 2002). These mutations therefore likely occurred after establishment. The lack of sexual reproduction and the extreme bottleneck resulting from introduction might be expected to limit the ability to adapt to the new environment (Agashe et al. 2011). This limitation could have been offset by two factors. First, the ability to reproduce asexually may have made hemlock adelgids more likely to establish because they would not be constrained by Allee effects resulting from the difficulty of finding mates at low population densities (Liebhold & Tobin 2008). Tobin et al. (2013) provide support for this by showing that inoculation of eastern hemlock trees with a single A. tsugae individual can result in successful establishment. Second, clones introduced from a sexual lineage might have higher fitness in a novel environment than those from an obligately asexual lineage due to accumulation of deleterious mutations in the latter. 
This is consistent with the invasion of eastern North America having originated from a cyclically parthenogenetic population in southern Japan, and no evidence of invasion by asexual western North American adelgids, even though they can reproduce on eastern hemlocks planted in a western arboretum (Mausel 2005).
Other adelgid lineages might be obligately asexual because they no longer have the developmental ability to migrate to spruce. These include the lineage on T. diversifolia in Japan, the second lineage in Taiwan and the lineage in western North America. The adelgids on T. diversifolia do not have a genetic signature of sexual reproduction (Table 2), and they do not have a genetic match to samples collected on P. torano (Table 1) or on other Japanese spruce species (Foottit et al. 2009). We are also not aware of any winged migrants reported on T. diversifolia. For the second lineage in Taiwan (Fig. 4B, Subcluster 5), we had too few samples to test for a genetic signature of sexual reproduction, but as they were collected in the same stand of trees as the sexual lineage (Table S1, Supporting information) with no evidence of admixture (Fig. 4), it could represent an isolated asexual lineage. Finally, in western North America, no winged migrants or generations on spruce are known (Annand 1924), and we find no genetic evidence of sexual reproduction, which suggest that this lineage could have been asexual from the time it diverged from its Asian ancestor. We estimate that this occurred tens of thousands of years ago (Table 3). The high observed heterozygosity in this lineage (0.942) compared to others (Table 2) is also consistent with ancient asexuality, which could result in mutations accumulating independently on paired chromosomes in the absence of recombination (Birky 1996 be inferred from the rooted COI phylogeny (Fig. 5). However, the colonization from Asia to western North America is not fully resolved by the COI data, so ABC analysis was completed to test different scenarios for this event, and to estimate lineage divergence times.
The first split in A. tsugae is indicated by the basal position of the sample from central China (Guizhou) in the COI phylogeny. This sample was collected on the eastern end of the Yungui Plateau, an area of rugged topography, which has been implicated as a barrier to gene flow in other insects (e.g. Ye et al. 2014; Zhang et al. 2015). Cun & Wang (2015) showed that Chinese hemlocks (T. chinensis) in southeast China are genetically distinct from those farther west, with a divergence dating perhaps to the Middle Pleistocene [c. 680 thousand years ago (kya)], when glaciation divided western and eastern groups of trees. Hemlock adelgids could have experienced the same event leading to comparable divergence. This date is consistent with our estimates of the remaining divergence times occurring c. 23-323 kya (Table 3). Glacial cycles during this period caused repeated connection and separation among continental China, Taiwan and Japan (Voris 2000). These cycles are typically dated using marine oxygen isotope ratios and assigned to Marine Isotopic Stages (MIS), with odd numbers being warm interglacial periods and even numbers being glacial periods (Lisiecki & Raymo 2005). The divergence of adelgids in Taiwan from those in western China (c. 323 kya; Table 3) could be associated with the glacial cycle defined by MIS 9-10 (300-374 kya), and the divergence of adelgids in Japan from those in Taiwan (c. 294 kya; Table 3) with glacial cycle MIS 8-9 (243-337 kya).
Similarly, the split between the two Japanese hemlock adelgid lineages on T. sieboldii and T. diversifolia (c. 60 kya; Table 3) could be associated with the penultimate glacial period, MIS 4 (57-71 kya), during which Japan experienced extensive glacial advance (Sawagaki & Aoki 2011). Data to reconstruct the vegetation during this glacial period in Japan are scarce, but the situation could have been similar to the last glacial maximum (LGM; c. 18 kya), when the distributions of T. sieboldii and P. torano broadly overlapped, as they do now, while T. diversifolia was found only in isolated refugia in northern Honshu (Tsukada 1983). During this time, the adelgids feeding on T. diversifolia might have lost the ability to migrate to spruce, where the sexual generation occurs, thereby becoming isolated from the adelgids feeding on T. sieboldii and P. torano. The prolonged lack of gene flow between the lineages could have permitted them to each adapt to feeding on only one hemlock host species.
The ABC results suggested that western North America was colonized by a direct ancestor of the lineage that specializes on T. sieboldii in southern Japan (Fig. 3). This direct divergence suggests that the reticulate pattern in the haplotype network (Fig. 6) is the result of incomplete lineage sorting (where haplotypes have not had enough time to coalesce within a lineage), and not a result of a complex scenario of divergence, admixture and subsequent divergence, as was tested with Scenario 4 (Fig. 3). The divergence time for this colonization event is estimated at c. 23 kya (Table 3) in association with the most recent glacial period, MIS 2 (14-29 kya). Vegetation reconstructions for MIS 2 suggest that Tsuga was not continuous across the Bering land bridge when it connected Asia to western North America. The region was probably dominated by mesic shrub tundra, while hemlock had retreated south to the California coast (Roberts & Hamann 2015). There is also no signal for continuity across Beringia in the phylogenies of the host plants, as the divergences between Asian and western North American species are estimated to have occurred well before this time period [c. 42 million years ago (mya) for hemlock (Havill et al. 2008); c. 20-25 mya for spruce (Lockwood et al. 2013)].
only MLLs from the larger lineage from Tayuling, as shown with DAPC analysis. *P < 0.05, **P < 0.001.
Table 2 Hemlock adelgid molecular diversity indices and likelihood of sexual reproduction for sampling sites with 15 or more unique MLLs. Sexual reproduction is inferred by multilocus linkage disequilibrium values (r̄d) that are not significantly different than zero
Table 3 Divergence times of hemlock adelgid lineages from the most likely scenario of colonization to western North America (Fig. 3, Scenario 1). Time is converted from generations to years based on the adelgid life cycle
Several studies suggest that there could have been small stands of spruce on the Bering land bridge during or near the LGM (Brubaker et al. 2005; Anderson et al. 2006; Zazula et al. 2006), which could have facilitated the adelgid colonization, but there is no similar evidence for hemlocks (Roberts & Hamann 2015). Adelgid colonization would therefore have had to involve crawlers or winged migrants travelling between disconnected Asian and North American forests. The interglacial period MIS 3 (57-29 kya) is also within the 90% confidence limit of the date for this event (Table 3), and it might be a more suitable time for adelgid colonization than the LGM. This period is characterized by warm phases that lasted about 1-2 thousand years and oscillations in sea level that kept the Bering land bridge intermittently intact (Rabassa & Ponce 2013). During these warm phases, boreal forests in Asia migrated towards the Bering land bridge (Anderson & Lozhkin 2001; Brigham-Grette et al. 2003), and the presence of hemlock pollen in Hokkaido and Sakhalin during MIS 3 suggests that hemlock adelgids might have survived in northeast Asia (Igarashi & Zharov 2011; Leipe et al. 2015). Hemlock forest in western North America may also have been closer to the Bering land bridge than during the LGM, with pollen records in interior British Columbia and coastal Washington (Clague et al. 2003; Jiménez-Moreno et al. 2010).
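The association of divergence estimates with Marine Isotopic Stages can be checked with a simple interval lookup. The sketch below uses only the point estimates and MIS date ranges quoted in the text; the function and variable names are hypothetical, and the lookup ignores the confidence limits around each estimate.

```python
# MIS intervals in thousands of years ago (kya), as quoted in the text.
MIS_INTERVALS = {
    "MIS 2": (14, 29),
    "MIS 3": (29, 57),
    "MIS 4": (57, 71),
    "MIS 8-9": (243, 337),
    "MIS 9-10": (300, 374),
}

# Point estimates of divergence times (kya) quoted from Table 3 in the text.
DIVERGENCE_KYA = {
    "Taiwan from western China": 323,
    "Japan from Taiwan": 294,
    "T. sieboldii lineage from T. diversifolia lineage": 60,
    "western North America from southern Japan": 23,
}


def stages_containing(date_kya, intervals=MIS_INTERVALS):
    """Return the names of all intervals whose range includes the date."""
    return [name for name, (lo, hi) in intervals.items() if lo <= date_kya <= hi]


matches = {event: stages_containing(d) for event, d in DIVERGENCE_KYA.items()}
for event, stages in matches.items():
    print(f"{event}: {', '.join(stages)}")
```

Because the quoted ranges for MIS 8-9 and MIS 9-10 overlap, the 323 kya estimate falls within both, which illustrates why these associations are presented as tentative.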
As adelgids would have had to travel long distances across the Bering land bridge to colonize western North America, this might have been mediated by migrating birds passively transporting adelgid eggs or crawlers, rather than active dispersal of winged adelgid adults, which are not strong fliers. The furthest an adelgid species (Pineus pinifoliae) is known to travel and colonize new trees is about 30 miles (48 km), when convective winds carried them to high altitudes, and most migrations are much shorter than that (Lowe 1966). On the other hand, eggs and crawlers of A. tsugae are known to be dispersed by birds and can survive up to 14 days after being dislodged from a tree (McClure 1990). Moreover, the existence of recently diverged species of forest birds in western North America and Siberia (e.g. Zink et al. 1995; Alström et al. 2011) supports a scenario of trans-Beringian colonization with migrating birds around the last glacial period.
An alternative date of 6 mya for adelgid colonization from Asia to western North America was estimated using DNA sequence data. The climatic conditions at this time were also suitable for conifer forests around Beringia (Wolfe & Leopold 1967; Matthews 1980; Tiffney 1985), and hemlock pollen has been recorded in both northeastern Asia and western North America during this time (Shilo & Minyuk 2006; Andreev et al. 2014; LePage 2003). This older divergence time estimate depended on calibration with only a few fossils of insects in amber hypothesized to be adelgid relatives, making this result questionable. A more recent colonization date, similar to the one that we estimate in this study, seems more likely, given the lack of coalescence of COI sequences for adelgids in western North America and Japan (Fig. 6).
It should be noted, however, that there is also some uncertainty in our ABC results. The type I error rate (0.418; Table S2, Supporting information) was fairly high, and while Scenario 1 had a posterior probability (0.4739) that was 2.5-3× higher than the others (Fig. 3), marginal support for the other scenarios indicates that they cannot be completely ruled out. This uncertainty does not seem to be a consequence of ABC model choice because the PCA plot to assess the model shows the observed data set surrounded by a tight cluster of points generated from the posterior distribution of parameters (Fig. S9, Supporting information). Rather, the statistical error likely results from a combination of the low sample size of unique MLLs in the western North American lineage, and from two of the five lineages being obligately asexual, which violates the assumption of standard sexual reproduction in the analysis (Cornuet et al. 2010). Future ABC analyses that explicitly account for a cyclically parthenogenetic life cycle and molecular dating methods that use additional fossils for calibration might help to refine our conclusions.

Host specialization
The degree of host specialization varies among different hemlock adelgid lineages. The two Japanese lineages are each confined to just one hemlock species in their native range (Figs. 4-6), while the lineage in western China is a relative generalist, feeding on three hemlock and two spruce host species (Table 1). Moreover, phylogenetic relatedness of hemlock species does not make them more likely to be suitable hosts to the same adelgid lineage. We have identified at least four host shifts between unrelated hemlock species: the modern colonization from T. sieboldii to the eastern North American species (T. canadensis and T. caroliniana), and the ancient colonization from an ancestor of T. sieboldii to the western North American species [T. heterophylla and T. mertensiana (note, however, that we are not aware of confirmed adelgid specimens from T. mertensiana growing in a natural setting, making the host switch to this species uncertain)]. None of the four North American hemlock species are closely related to T. sieboldii (Havill et al. 2008). On the other hand, the adelgid lineage that specializes on T. sieboldii in Japan cannot survive on T. chinensis (Del Tredici & Kitajima 2004; Montgomery et al. 2009), even though T. sieboldii and T. chinensis are closely related (Havill et al. 2008; Holman 2014). A similar pattern has also been shown in other groups of insects where cospeciation is rare (de Vienne et al. 2013), and plant defences play a larger role in determining host suitability than phylogeny, at least when considering more recently diverged host species (e.g. Jaenike 1990; Desurmont et al. 2011).

Conclusions
Hemlock adelgids have traits that would seem to make them unlikely invaders. Host specialization in Japan did not prevent the modern invasion to new hemlock host species in eastern North America. In addition, this invasion, as well as the ancient colonization of western North America, involved host shifts to hemlock species unrelated to their natal hosts. Therefore, phylogenetic distance between native and non-native hemlock species cannot be used to predict invasion for hemlock adelgids. We might also have expected adaptation to the new environment to have been limited by an extreme genetic bottleneck in the introduced range. This was also not the case, although low genetic diversity could have been offset by asexual reproduction alleviating the difficulty in finding mates in the introduced range, and by increased fitness resulting from sexual reproduction in the source population.
Our data therefore show that the invasive potential of different adelgid lineages may be determined by complex interactions among life history traits, evolutionary history and the introduced environment, making prediction of invasion very difficult. A more reliable way to determine their potential to invade might be to experimentally expose different lineages to novel hosts, or to monitor non-native sentinel hosts growing in their native ranges (Britton et al. 2010; Roques et al. 2015).
As a lack of predictive patterns continues to be found for groups of insect herbivores, it might be more effective to focus on invasion pathways rather than on traits or evolutionary history to prevent invasions. Propagule pressure, or the rate of arrival of non-native species and individuals, is currently the most supported predictor of insect invasions (Lockwood et al. 2005; Brockerhoff et al. 2014; Blackburn et al. 2015). By analysing interception rates at points of entry as an estimate of propagule pressure, it is possible to determine high-risk pathways for introduction (Work et al. 2005; Liebhold et al. 2012). Targeted regulations to limit introduction of non-native insects via known pathways can then be effective, but require frequent re-evaluation based on updated interception data (Haack et al. 2014).
For adelgids, switching to unrelated host species within a genus might be common, while switching to host species in other genera is rare. For each group of herbivores that exhibit some level of host specificity, there could be a threshold divergence time beyond which host phylogeny begins to play a more dominant role in determining host suitability than other factors. Determining whether this could be predictive of invasion for insect herbivores would require more comprehensive analysis than is currently available, but considering the pace at which phylogenies for plants and insects are accumulating (Hinchliff et al. 2015), it may soon be possible.

Table S3
Genetic differentiation (FST; above diagonal) and harmonic mean of differentiation (Jost 2008) (Dest; below diagonal) between hemlock adelgid sites with ten or more MLLs.
(Fig. 4). A k value of six was chosen to describe the data.