Testing a global standard for quantifying species recovery and assessing conservation impact


Abstract
Recognizing the imperative to evaluate species recovery and conservation impact, in 2012 the International Union for Conservation of Nature (IUCN) called for development of a "Green List of Species" (now the IUCN Green Status of Species). A draft Green Status framework for assessing species' progress toward recovery, published in 2018, proposed 2 separate but interlinked components: a standardized method (i.e., measurement against benchmarks of species' viability, functionality, and preimpact distribution) to determine current species recovery status (herein species recovery score) and application of that method to estimate past and potential future impacts of conservation based on 4 metrics (conservation legacy, conservation dependence, conservation gain, and recovery potential). We tested the framework with 181 species representing diverse taxa, life histories, biomes, and IUCN Red List categories (extinction risk). Based on the observed distribution of species' recovery scores, we propose the following species recovery categories: fully recovered, slightly depleted, moderately depleted, largely depleted, critically depleted, extinct in the wild, and indeterminate. Fifty-nine percent of tested species were considered largely or critically depleted. Although there was a negative relationship between extinction risk and species recovery score, variation was considerable. Some species in lower risk categories were assessed as farther from recovery than those at higher risk. This emphasizes that species recovery is conceptually different from extinction risk and reinforces the utility of the IUCN Green Status of Species to more fully understand species conservation status. Although extinction risk did not predict conservation legacy, conservation dependence, or conservation gain, it was positively correlated with recovery potential. 
Only 1.7% of tested species were categorized as zero across all 4 of these conservation impact metrics, indicating that conservation has, or will, play a role in improving or maintaining species status for the vast majority of these species. Based on our results, we devised an updated assessment framework that introduces the option of using a dynamic baseline to assess future impacts of conservation over the short term to avoid misleading results which were generated in a small number of cases, and redefines short term as 10 years to better align with conservation planning. These changes are reflected in the IUCN Green Status of Species Standard.

INTRODUCTION
The aims of conservation include protection and restoration of natural systems and the recovery of species and their ecological functions. Until recently, there has been no standardized way of thinking about and measuring species recovery. In 2012, the International Union for Conservation of Nature (IUCN) began working to fill that gap by developing an IUCN Green List of Species, based on objective, transparent, and repeatable criteria for systematically assessing successful species conservation (WCC-2012-Res-041).
The IUCN Green List of Species was envisioned as a complement to the IUCN Red List of Threatened Species (IUCN 2020), which has become the global standard for assessment of species' extinction risk. The supporting information accompanying each IUCN Red List assessment is a valuable source of data about each species' status, trends, habitat, distribution, threats, and conservation, and the list has informed global species conservation efforts for more than 50 years (Rodrigues et al., 2006). The IUCN Green List of Species would add new information about species' recovery level, as well as the impact of conservation actions.
Following international consultations between 2012 and 2018, the Species Conservation Success Task Force proposed a framework for an IUCN Green List of Species (Akçakaya et al., 2018). The name has since changed to IUCN Green Status of Species to prevent the erroneous interpretation that species on a green list are no longer in need of conservation action. The goal of the IUCN Green Status of Species is to provide a standardized way of assessing a species' level of recovery and understanding the past and potential future importance of conservation in improving or maintaining recovery status (Figure 1).

A new way to communicate conservation's impact
A species moving to a lower category of extinction risk on the IUCN Red List due to conservation measures is a useful indicator of conservation impact (Butchart et al., 2006). However, many species may remain in a high threat category for long periods despite successful conservation efforts. For example, the Round Island bottle palm (Hyophorbe lagenicaulis) has been listed as critically endangered since 1998 (Page, 1998). Despite dedicated conservation, it still meets the criteria for critically endangered (V. Tatayah, personal communication). This does not mean that conservation has failed; it is highly likely the species would have gone extinct without conservation (Asmussen-Lange et al., 2011). By standardizing and generalizing the process of using prevented declines to assess conservation impact (e.g., Bolam et al., 2020; Hoffmann et al., 2015), the IUCN Green Status of Species can help improve understanding of what works, recognize the efforts of conservationists, and ensure continued donor and government support.
FIGURE 1 Simplified example of a provisional Green Status assessment with the Echo Parakeet (Psittacula eques). This assessment, and others conducted for this article, is provisional and should not be cited until its publication on the International Union for Conservation of Nature Red List website. Documentation of uncertainty about species state within a spatial unit in steps 3 and 4 is not shown. Details of the procedures followed and definitions of terms (e.g., viable, functional) are in Akçakaya et al. (2018) and Appendix S1 and Appendix S3 (data entry workbook used by assessors to generate the data analyzed here) of this article.
Furthermore, even when conservation action results in an improvement in IUCN Red List Category, it can be complicated to communicate these actions as a conservation success. For example, the downlisting of the giant panda (Ailuropoda melanoleuca) from endangered to vulnerable in 2016 was at first subject to controversy. Some feared that the category change could promote a simplistic narrative of success for a species that remained highly conservation dependent (Swaisgood et al., 2018). This reluctance to report conservation achievements is a problem because conservation science fights an uphill battle against a "culture of despair" (Swaisgood & Sheppard, 2010).

Conservation impact as communicated through Green Status
The IUCN Green Status of Species introduces a new way of thinking about conservation impact by defining conservation status in terms of progress toward species recovery: a fully recovered species is viable and ecologically functional throughout its indigenous range (Appendix S1; IUCN 2021). Full recovery is not realistic for many species; rather, it is used as a benchmark. An assessment reflects a species' current standing relative to this benchmark, as well as the past and expected future impact of conservation actions on this standing.
This ambitious definition of recovery should help combat "shifting baseline syndrome" (Pauly 1995; Papworth et al., 2009). Recognizing that humans have significantly altered natural systems over time, there are calls to use historical data, specifically species' distribution and status prior to major human impacts, as a recovery benchmark (Sanderson 2019; Stephenson et al., 2019). Some species with negligible extinction risk exist at levels far below their preimpact baseline.
Inclusion of ecological functionality is another ambitious aspect of the IUCN Green Status of Species' definition of recovery. Although maintaining species' viability and preventing extinctions caused by human activities is the first goal of species conservation, this should not preclude actions to maintain functionality (i.e., the set of interactions that contribute to ecological processes) and thus prevent "ecological extinctions" (Redford, 1992). The call to incorporate ecological function into conservation is not new (Redford & Feinsinger 2001; Soulé et al., 2003). The IUCN Green Status of Species is the first global framework to incorporate functionality in assessment of species recovery (Akçakaya et al., 2018).

Testing the Green Status of Species
We applied the Akçakaya et al. (2018) framework for a Green Status of Species to a sample of species. Our aims were to apply the assessment method to species across different taxonomic groups, systems, geographies, and IUCN Red List categories of extinction risk; identify changes necessary to make the framework universally applicable; examine new insights Green Status assessments can add to conservation and demonstrate that these assessments represent more than simply a red list in reverse; and propose meaningful categories of recovery status based on test data.

METHODS

Species selection and assessors
Between 2018 and 2020, we sent invitations to all IUCN Species Survival Commission Specialist Groups and IUCN Red List Authorities (RLA), the groups responsible for IUCN Red List assessments (n = 135 in 2018). We also recruited species experts with no formal affiliation with IUCN by creating a project website with joining instructions. The selection of species was at the discretion of the assessors. Other than our attempt to engage all specialist groups and RLAs, each of which focuses on a unique taxon or geographic region, we did not attempt a systematic or representative sampling of global diversity. Appendix S2 lists the assessors for each species. All assessors who wished to be included are authors of this article. To determine the geographic coverage of testing, species' countries of occurrence as reported in the IUCN Red List (IUCN 2020) were extracted using the package rredlist (Chamberlain, 2020).
We standardized the testing process by providing uniform materials, including a standardized assessment workbook containing all instructions, data entry, and documentation fields (Appendix S3). Participants engaged with a coordinator (M.G.) throughout the process to reduce the potential for misinterpretation of the framework.

Green Status of Species framework
Species were assessed using the Akçakaya et al. (2018) framework. In Figure 1, it is applied to an example species. The basis of an assessment is the estimation of 5 green scores, which represent species condition relative to the fully recovered state, from 0% (extinct or extinct in the wild) to 100% (fully recovered) (Figure 1 explains the green-score calculation). The green score at the time of assessment, based on observed or inferred information, is called the species recovery score. Green scores were also estimated based on scenarios exploring the past and expected future impact of conservation actions; these scenario-based green scores were used to calculate 4 conservation impact metrics (Figure 1): conservation legacy (impact of past conservation); conservation dependence (expected impact of halting all conservation in the short term, i.e., the longer of 10 years or 3 generations of the species); conservation gain (expected impact of continuing conservation in the short term); and recovery potential (maximum possible recovery within 100 years). Full definitions of these terms and a summary of the assessment procedure are in Appendix S1.
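The green score and the 4 impact metrics can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the assessment workbook's implementation. It assumes equally spaced state weights (here 0, 1, 2, 3), chosen because they reproduce the 0%, 33%, 67%, and 100% values that a single spatial unit can take. It also assumes each metric is a simple difference between a scenario green score and the current score. The function names, argument names, and example states are hypothetical.

```python
# Illustrative state weights (assumption): equally spaced, so that a single
# spatial unit yields 0%, 33%, 67%, or 100%.
STATE_WEIGHTS = {"absent": 0, "present": 1, "viable": 2, "functional": 3}


def green_score(states):
    """Return the green score (0-100) for a list of per-spatial-unit states."""
    if not states:
        raise ValueError("at least one spatial unit is required")
    max_weight = max(STATE_WEIGHTS.values())
    total = sum(STATE_WEIGHTS[s] for s in states)
    return 100.0 * total / (max_weight * len(states))


def impact_metrics(current, counterfactual_current, future_without,
                   future_with, long_term_potential):
    """Compute the 4 impact metrics as differences from the current score
    (assumed form; see Appendix S1 for the full definitions)."""
    return {
        "conservation_legacy": current - counterfactual_current,
        "conservation_dependence": current - future_without,
        "conservation_gain": future_with - current,
        "recovery_potential": long_term_potential - current,
    }


# Hypothetical species with 3 spatial units
current = green_score(["viable", "present", "absent"])
metrics = impact_metrics(
    current=current,
    counterfactual_current=green_score(["present", "absent", "absent"]),
    future_without=green_score(["present", "absent", "absent"]),
    future_with=green_score(["viable", "viable", "absent"]),
    long_term_potential=green_score(["functional", "viable", "viable"]),
)
```

In this hypothetical case all 4 metrics are positive: past conservation raised the score, halting conservation would lower it, and further gains are possible in both the short and the long term.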

Relationship between Green Status of Species outputs and IUCN Red List category
To investigate the potential of Green Status of Species assessments to provide novel conservation insights, we evaluated whether species recovery scores were predicted by IUCN Red List category by performing beta regression in the R package betareg (Cribari-Neto & Zeileis, 2010) in R version 4.0.0 (R Core Team 2020). We excluded species considered extinct in the wild (EW) on the IUCN Red List because by definition their species recovery score is 0. Our data set included 0s (pre-exclusion of EW) and 1s, but we did not use zero-one inflated beta regression because it assumes that the 0s and 1s are special cases generated under different processes than other data points (Buis, 2010), which was not true for our data set. To allow for regular beta regression (where the data set cannot contain 0s or 1s), we used the rescaling method recommended by Smithson and Verkuilen (2006): $y' = [y(N - 1) + 1/2]/N$, where N is the total sample size. No transformation was greater than adding or subtracting 0.003 from the original data point.
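The rescaling step can be sketched in Python as follows. This is an illustration only (the analysis itself was run in R), and the function name and example values are hypothetical. The transformation y' = [y(N - 1) + 1/2]/N moves each value toward 0.5 by at most 0.5/N, with the largest shift at the endpoints 0 and 1.

```python
def rescale_unit_interval(values):
    """Compress proportions in [0, 1] into the open interval (0, 1).

    Applies y' = (y * (N - 1) + 0.5) / N, where N is the sample size.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    if any(v < 0 or v > 1 for v in values):
        raise ValueError("values must lie in [0, 1]")
    return [(v * (n - 1) + 0.5) / n for v in values]


# Hypothetical recovery scores expressed as proportions
scores = [0.0, 0.33, 0.67, 1.0]
rescaled = rescale_unit_interval(scores)
```

With only 4 values the endpoints move noticeably (0 becomes 0.125). With a sample of well over 100 species the largest shift is a few thousandths, in line with the bound reported above.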
Model terms were evaluated using the function joint_tests (package emmeans [Lenth, 2020]), and the pseudo R² was obtained using the base R function summary. Pairwise comparison of estimated marginal means (a.k.a. least-squares means) was performed using the function cld (package multcomp [Hothorn et al., 2008]); estimated marginal means were compared rather than observed means to account for unbalanced sampling.
Unlike species recovery score, the 4 conservation impact metrics can take negative values. Because this more closely represents a continuous distribution, Welch's analysis of variance (ANOVA) (which does not assume equal variance between groups) was used to investigate the relationship between metric values and IUCN Red List categories at the time of assessment (function oneway.test, package onewaytests [Dag et al., 2018]). Pairwise comparisons were performed using the Games-Howell test (function oneway, package userfriendlyscience [Peters, 2018]). Though species considered EW on the IUCN Red List can obtain nonzero values for conservation gain and recovery potential, they were excluded from this analysis because of small sample size (n = 2).
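To show what Welch's ANOVA computes, the following Python sketch derives the test statistic and its degrees of freedom from scratch. It is not the R code used for the analysis, and the function name and example data are hypothetical. Each group mean is weighted by its sample size divided by its sample variance, so groups with unequal variances are handled without pooling. The p-value is omitted because it requires the F distribution, which is not in the Python standard library.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA.

    groups: list of lists of numbers; each group needs at least 2 values
    and a nonzero sample variance.
    """
    k = len(groups)
    if k < 2:
        raise ValueError("need at least 2 groups")
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Precision weights: group size divided by group sample variance
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_w

    # Weighted between-group variation
    between = sum(w * (m - weighted_mean) ** 2
                  for w, m in zip(weights, means)) / (k - 1)
    # Correction term that also determines the denominator df
    lam = sum((1 - w / total_w) ** 2 / (n - 1)
              for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Hypothetical metric values (percentage points) for 2 Red List categories
f_stat, df1, df2 = welch_anova([[1, 2, 3], [2, 4, 6]])
```

With 2 groups the statistic reduces to the square of Welch's t, which gives a convenient check on the implementation.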

Species recovery categories
The IUCN Red List of Threatened Species has demonstrated the value of categories in communicating conservation information (Betts et al., 2020). We, therefore, sought to create categories with which species recovery score percentages would be more easily interpreted. We proposed 7 IUCN species recovery categories: fully recovered (species recovery score 100%), slightly depleted (>80%), moderately depleted (>50%), largely depleted (>20%), critically depleted (>0%), extinct in the wild (0%), and indeterminate. Although the species recovery score required for inclusion in 2 of these categories is definitional (extinct in the wild, fully recovered), the thresholds between other categories are somewhat subjective. We examined the distribution of test data against these categories to check that the proposed categories were both meaningful (e.g., values in the category reflect the state suggested in the name) and useful (e.g., there are more than a negligible and less than an overwhelming proportion of species in each category). This mirrors the conceptual basis of IUCN Red List thresholds (Collen et al., 2016).
We included an indeterminate category for species with large uncertainty around the species recovery score. This uncertainty threshold was determined using visual examination of the data and is reported in "Results."
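The proposed thresholds can be expressed as a small Python function. This is an illustrative sketch, and the function and argument names are hypothetical. The default uncertainty cutoff of 40 percentage points is the value reported in the Results. Checking uncertainty before the score bands is an assumption of this sketch.

```python
def recovery_category(best, low=None, high=None, max_uncertainty=40.0):
    """Map a species recovery score (percent) to a proposed category.

    best: best estimate of the score; low/high: optional bounds used to
    measure uncertainty as (high - low).
    """
    if not 0 <= best <= 100:
        raise ValueError("score must be between 0 and 100")
    # Assumption: the uncertainty check takes precedence over the score bands
    if low is not None and high is not None and (high - low) > max_uncertainty:
        return "indeterminate"
    if best == 100:
        return "fully recovered"
    if best > 80:
        return "slightly depleted"
    if best > 50:
        return "moderately depleted"
    if best > 20:
        return "largely depleted"
    if best > 0:
        return "critically depleted"
    return "extinct in the wild"


# Hypothetical scores: a single-spatial-unit species in the "present" state,
# and a species whose bounds span more than the cutoff
single_unit = recovery_category(100 / 3)
uncertain = recovery_category(60, low=30, high=80)
```

Because each threshold is a strict lower bound, a score of exactly 50% falls in largely depleted rather than moderately depleted.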

Conservation impact metrics categories
Like the species recovery score, the 4 conservation impact metrics take percentage values (Figure 1). To aid in communication of these metrics, we defined the following categories: high, medium, low, zero, negative, and indeterminate. As with the species recovery score categories, breaks between these categories are delimited by threshold values. If the uncertainty associated with a metric value (maximum − minimum estimate) exceeded 40%, it was placed in the indeterminate category. Additionally, metrics could be assigned to the high category not only by surpassing a fixed threshold value, but also if they were high relative to the current value or if they represented prevention or reversal of extinction in the wild. See Appendix S4 for the rules used to assign metric categories.

Collection of feedback
In addition to the quantitative elements of assessments (species recovery score, conservation impact metrics, and categories), assessors provided qualitative feedback. Unstructured feedback took the form of comments made during one-on-one communication between the assessors and members of the task force. Structured feedback was collected using feedback fields in the workbook; assessors were invited to comment on the different stages of the assessment process.

RESULTS

Test assessors
Of 135 IUCN groups contacted, 52 contributed test assessments (38.5%). Taxonomic focus of assessors (Appendix S2) closely tracked the proportional representation of taxonomic focus within IUCN Specialist Groups and Red List Authorities overall (Appendix S5). Specialist Groups and RLAs were the predominant source of test assessments (78%), although independent experts made substantial contributions (22%). More than 200 people, working in 38 countries, volunteered as assessors to produce test assessments (Appendix S5) (mean [SD] = 2.2 assessors/species [2.7]) (Appendix S2).
FIGURE 2 Spatial distribution of taxa used to test the International Union for Conservation of Nature (IUCN) Green Status method (n = 181): (a) number of tested terrestrial and freshwater taxa (n = 118 and n = 37, respectively) by country whose ranges include that country (small islands are not visible at this scale) and (b) number of tested marine taxa (n = 26) by exclusive economic zone (EEZ) whose ranges include that EEZ (EEZs for Antarctica [200 NM], South Georgia, and Sandwich Islands are mapped). Taxa that spend part of their time in the ocean and part on land or in freshwater were mapped as marine taxa and are not represented in (a). In (a) and (b), taxa presence in countries or EEZs are based on geographic range reported in IUCN Red List (IUCN 2020). When a taxon's origin code in a country (as specified in its Red List account) was introduced, vagrant, or origin uncertain, or when its seasonality code was passage (RLTWG 2018), we did not map that country for that taxon.

Test species and biases
The framework was tested with 181 taxa (172 species, 7 subspecies, and 2 regional groupings) (Appendix S2). These taxa (hereafter species for simplicity) represent diverse taxonomic groups across plants, animals, and fungi (Appendix S5), geographic regions (Figure 2), and range sizes. Test species' extent of occurrence (IUCN Standards and Petitions Committee 2019) ranged from 0.04 km² to 298 million km² (Appendix S5). The IUCN has assessed extinction risk for 97% of tested species. Sixty-seven percent of species were in a threatened IUCN Red List category (vulnerable [VU], endangered [EN], or critically endangered [CR]), and all IUCN Red List categories were represented, except data deficient (DD). Terrestrial, freshwater, and marine species were represented (65%, 21%, and 14%, respectively). Because these species are not a representative sample of global biodiversity (biased toward terrestrial, threatened species), percentages here serve only to characterize the data set and cannot be extrapolated further. Nonetheless, the diversity of species tested allowed for identification of taxon- and life-history-specific challenges (see "Feedback").

Distribution of species recovery scores
The species recovery scores (Appendix S6) covered the range of all possible values (Figure 3a). The species were distributed among the proposed species recovery categories as follows (Figure 3a): fully recovered, 5%; slightly depleted, 7%; moderately depleted, 14%; largely depleted, 46%; critically depleted, 14%; and extinct in the wild, 2%. The spike in Figure 3a at the x-axis value of 33% results from the properties of the green score calculation (Figure 1). If a species had only 1 spatial unit (SU), the only values the green score could take would be 0% (species absent in SU), 33% (present), 67% (viable), or 100% (functional), and 95% of tested species with 1 SU are listed as threatened on the IUCN Red List (if a population in an SU is threatened, the state in the SU usually is present, with some exceptions [IUCN 2021]).
FIGURE 3 For 181 tested taxa, (a) distribution of species recovery scores (SRSs) and proposed species recovery category thresholds (EW, extinct in the wild; bins, increments of 5% exclusive of low values and inclusive of high values, except first bin [0%] and last bin [100%]; no shading, species for which the best estimate of SRS was in that bin but uncertainty around the best estimate was large enough for it to be categorized as indeterminate; spike at 33% due to properties of green score calculation [Figure 1]; see text) and (b) distribution of uncertainty (max − min) of reported SRSs (dashed vertical line, cutoff for placement of species in the indeterminate category).
Based on the distribution of uncertainty around species recovery scores (Figure 3b), we applied the indeterminate category when uncertainty was >40%; 12% of species were categorized thus.

Relationship between species recovery score and IUCN Red List category
The IUCN Red List category was a significant predictor of species recovery score (Figure 4). Species at higher extinction risk generally had a lower species recovery score (beta regression, F = 69.7, df = 6, p < 0.0001; pseudo R² = 0.45). Nonetheless, within a given IUCN Red List category, the range of species recovery scores was wide; standard deviation of species recovery scores within an IUCN Red List category ranged from 13% (CR) to 22% (VU) (calculated using observed values, not model values). Species recovery scores were not significantly different between some categories (Figure 4). It was not uncommon for a species in a nonthreatened IUCN Red List category (LC or NT) to have the same or lower species recovery score as a species in a threatened category (Figure 4).
FIGURE 4 (a) Relationship between species recovery score (SRS) and IUCN Red List extinction risk categories (LC, least concern; NT, near threatened; VU, vulnerable; EN, endangered; CR, critically endangered) excluding species extinct in the wild because their SRS is by definition 0% (box limits, first and third quartiles, respectively; horizontal lines, median; whiskers, smallest and largest values no farther than 1.5 interquartile range; points, values beyond interquartile range; numbers in boxes, sample size) and (b) estimated marginal means of SRS calculated from the beta regression model and used to compare groups with unequal samples (bars, 95% CI around estimated marginal mean; differing letters, significantly different with Tukey-adjusted p < 0.05).
FIGURE 5 For 181 tested taxa, (top row) conservation impact metric values relative to the taxon's International Union for Conservation of Nature extinction risk category (box plot elements defined in Figure 4 legend) and (bottom row) distribution of conservation impact metric categories by extinction risk category: (a) conservation legacy, (b) conservation dependence, (c) conservation gain, (d) recovery potential (LC, least concern; NT, near threatened; VU, vulnerable; EN, endangered; CR, critically endangered; EW, extinct in the wild).

Conservation impact metrics
There was no significant difference in the numeric values of conservation legacy among IUCN Red List categories (Welch's ANOVA, F = 1.42, p = 0.236) (Figure 5a). The assessments for more than half of test species showed a positive impact of past conservation, including more than half of the threatened (VU, EN, CR) species tested (Figure 5a). Twenty-eight percent of tested species overall showed high conservation legacy. Of tested species, the high category included 33 currently threatened species for which past conservation actions may have prevented extinction (i.e., best estimate is that extinction was prevented). For 10 species, no uncertainty in this result was reported (i.e., extinction prevented in lower bound, upper bound, and best estimates). The remaining species' conservation legacies were classified as indeterminate (17%) or zero (31%); no species was found to have a negative conservation legacy.
For 17 of the 56 species whose conservation legacy was categorized as zero, this classification was because no past conservation action had been taken. For the remaining species in this category, where conservation actions had taken place but there was no evidence that the current green score would be different if they had not, various reasons were reported (Table 1).
There was no significant difference in numeric values of conservation dependence among IUCN Red List categories (Welch's ANOVA, F = 0.789, p = 0.537) (Figure 5b). More than half (61%) of tested species had positive conservation dependence (Figure 5b). This was the conservation impact metric for which the largest number of species fell into the high category (67 species, 37%), indicating that continued conservation action is vital to prevent declines in status. For 39 of the 181 tested species, it was estimated that halting conservation actions could result in extinction within 3 generations; assessments of 7 species reported no uncertainty in this result (i.e., extinction prevented in lower bound, upper bound, and best estimates). Species recovery scores of these 39 species varied from 6% (critically depleted) to 67% (moderately depleted).
There was also no significant difference in numeric values of conservation gain between IUCN Red List categories (Welch's ANOVA, F = 1.14, p = 0.345) (Figure 5c). Just under half of tested species (48%) showed positive conservation gain (i.e., indicating opportunities exist to achieve better-than-current recovery status in the next 10 years or 3 generations if planned conservation actions take place). In contrast to conservation dependence, conservation gain was the metric with the lowest number of species in the high category: 14 species (8%).
Conservation gain was the metric for which the largest number of species fell into the negative category: 10 species (5.5%) (Figure 5c). Two species were categorized as having a negative conservation dependence (Figure 5b). The negative category was created to indicate that the species would be worse off if conservation continued (negative conservation gain) or that it would be better off if conservation stopped (negative conservation dependence). However, for the tested species, neither situation was detected. Rather, in these cases, conservation conferred a benefit, but species' status was expected to deteriorate even with conservation (see Discussion).

TABLE 1 Reported reasons species subject to conservation action have a conservation legacy score of 0 (n = 41 species)*
Reason | Species (%)
Action affected only a small part of the global species population (smaller than a spatial unit) | 46
Action had a positive effect, but the effect was not enough to change the species' status in its spatial unit or units (i.e., Green Status of Species method not sensitive enough to record relatively limited impact) | 39
Action occurred ex situ only | 22
Action did not address relevant issue or threat | 20
Action did not address most significant problem or threat | 17
Lack of evaluation of action or species monitoring | 17
Poor management or enforcement of action | 15
Action started but not completed | 12
Action completed, but duration of action was not long enough to have an impact | 12
Action started too recently to show an effect | 7
Action started too late to counteract threat | 2
*Often, >1 reason was applied to a species, so the percentages reflect the percentage of species for which the factor was reported.

The majority of tested species (70%) had positive recovery potential (Figure 5d), suggesting significant opportunities within the next 100 years for species recovery where the species is extant, for restoration to areas where it has been extirpated, or for expansion into expected additional range. For more than half of species, recovery potential was categorized as medium (40%) or high (20%), indicating that there is substantial space for ambitious recovery planning.
Recovery potential was the only metric for which numeric values were significantly different between IUCN Red List categories (Welch's ANOVA, F = 8.90, p < 0.0001) (Figure 5d). Although recovery potential values between 2 adjacent categories were never significantly different, significant contrasts (p < 0.05) between higher threat and lower threat IUCN Red List categories were observed in many cases, indicating that the higher the extinction risk, the higher the recovery potential tended to be (see Appendix S7 for full list of contrasts).
Relatively few species had zero recovery potential (10%). More than half of these species (10 of 18) were considered fully recovered (i.e., species recovery score = 100%). The other species in the zero recovery potential category included species for which zero recovery potential was reported as the most likely outcome, with uncertainty indicating that some recovery could be possible (5 of 18). Finally, some species had no uncertainty; they had experienced degradation and loss that assessors considered irreversible or assessors judged future degradation within the range unstoppable or immitigable (3 of 18).
Four percent of tested species were estimated to have negative recovery potential, which means that, under the most optimistic scenario within 100 years, the species is expected to have a lower green score than it does now (e.g., Antiguan racer [Alsophis antiguae]).
Finally, both extinct in the wild species tested (Franklin tree [Franklinia alatamaha] and Aylacostoma chloroticum, a freshwater snail) were considered to have a high recovery potential because individuals exist in ex situ collections and there is a good probability that within 100 years successful reintroductions to the wild could take place.

Feedback
Several areas for improvement of the method emerged multiple times from test assessors and workshop participants (summarized in Appendix S8). One major recommendation was to change the period for conservation gain and conservation dependence to 10 years, rather than 10 years or 3 generations. Another was that calculating conservation gain and conservation dependence relative to the species recovery score created the potential for false negative categorizations.

DISCUSSION
Our results showed that the IUCN Green Status of Species is applicable to a wide range of species and provides important and unique information about the status of biodiversity that complements the information provided by the IUCN Red List. It is not possible to predict Green Status outcomes based on a species' IUCN Red List category alone (Figures 4 and 5). Nonetheless, there was a significant relationship between the two. That over two-thirds of tested species were in threatened IUCN Red List categories likely explains why over half of tested species were categorized as largely depleted or critically depleted (Figure 3). However, 5 of 17 near threatened species and 4 of 33 least concern species were also considered largely depleted. By using species' preimpact distribution as a baseline and incorporating ecological functionality, the IUCN Green Status of Species provides a definition of recovery that can be considered linked to, but distinct from, extinction risk (Mace et al., 2008). To maximize synergy and benefits, IUCN has linked the 2 approaches, requiring an IUCN Red List assessment to exist (or to be conducted simultaneously) for species undergoing an IUCN Green Status of Species assessment (IUCN 2021).
The conservation impact metrics (conservation legacy, conservation dependence, conservation gain, and recovery potential) allowed for a nuanced examination of the effectiveness of past species conservation efforts and the potential for future conservation. These metrics put the species recovery score and IUCN Red List categories in context: the knowledge that a species is largely depleted (IUCN Green Status) and vulnerable (IUCN Red List) reads negatively, but combined with a high conservation legacy that prevented extinction, the story becomes one of success. The IUCN Green Status of Species' use of short-term and long-term milestones creates a vision of potential futures that can be incorporated in conservation planning to inform strategies to minimize losses and maximize potential gains. These metrics could help in the evaluation of effectiveness of conservation actions. For example, if a species' conservation legacy is zero, despite active efforts, it would be useful to determine why (Table 1).
By introducing a formal measure of conservation dependence, the IUCN Green Status of Species may provide a resolution to the controversies that sometimes accompany a species' downlisting to lower IUCN Red List categories. The IUCN Red List guidelines currently allow species that would otherwise be considered least concern to be placed in the near threatened category if they are thought to be "conservation dependent" (IUCN Standards and Petitions Committee 2019). The IUCN Green Status of Species creates a formal mechanism for quantifying conservation dependence and recognizes that species in any IUCN Red List category can be conservation dependent (Figure 5b).
Conservation gain highlights opportunities for recovery in the short term and could play an important role in incentivizing future conservation action. Achieving a high value was less common for conservation gain than for the other conservation impact metrics (Figure 5c). Although loss and degradation often happen relatively quickly, recovery can be a comparatively slow process (Novacek & Cleland, 2001), which may explain this result. Of the 14 species with high conservation gain, 10 were categorized as such not because conservation gain was intrinsically high, but because it was relatively high compared with the current species recovery score (Appendix S6). For example, the pale-headed brushfinch (Atlapetes pallidiceps) was placed in the high category despite an expected conservation gain of only 17%. This was because the species recovery score was only 8%, and the expected conservation gain of 17% therefore represented a substantial move toward recovery.
The ability of assessments to highlight near-term opportunities through conservation gain counters the necessarily long process of recognizing reduced extinction risk. For long-lived species, ceasing to meet the criteria for a given IUCN Red List category may take decades, which is far too long for policy makers or donors wanting to assess the impact of funding and policies. For this reason, it makes sense to change the definition of short term in the IUCN Green Status of Species assessment from 10 years or 3 generations to simply 10 years (IUCN 2021).
Finally, the recovery potential metric allows conservation planners to envision the maximum recovery that could be achieved if all opportunities for conservation action and innovation over the next 100 years were taken. Two things should be noted. First, it will not be realistic for most species to have a green score of 100% even after recovery potential is fulfilled, and this does not indicate conservation failure. Humans have converted large areas of the world, and climate change threatens the persistence or precludes the return of many species in parts of their indigenous range. Recovery potential merely seeks to estimate how much recovery is possible in the context of the modern world. The observation that species that are more highly threatened tended to have higher recovery potential (Figure 5d) provides encouragement that ambitious conservation actions could greatly improve their status. Second, achieving the recovery potential estimated in the assessment is not necessarily a conservation goal; rather, it can help guide conservation planning by indicating opportunities available for ambitious species recovery action.

Future directions
Although our tests covered a diversity of species, geographies, and biomes, we did not sample in a systematic or representative way (which is why we do not, e.g., report statistics by taxon). Our data set was biased toward threatened species (67%), so our figures of extinctions prevented by past conservation or likely to be prevented by future conservation cannot be generalized. Comprehensive evaluation of counterfactual status for all species within a taxon yielded lower rates of extinctions prevented (Butchart et al., 2006; Hoffmann et al., 2015). Although Green Status of Species assessments could eventually improve understanding of the global impact of conservation, the values reported here are not representative. Understanding the recovery status and trajectories of a systematic sample of the world's major species groups would provide valuable information for conservation planning. With reassessments over time, changes in species recovery scores could be used to track changes in recovery status.
The current status of 12% of tested species presented enough uncertainty that these species were placed in the species recovery category indeterminate (Figure 3). Uncertainty was even higher within the conservation impact metrics (Figure 5). Although categorization was possible for the majority of tested species, indeterminate values highlight knowledge gaps; in the case of the conservation impact metrics, these are gaps in understanding of the impacts of conservation actions. This uncertainty can be reduced in the short term by engaging larger groups of experts in the assessment process for a species and employing structured elicitation methods (e.g., Hemming et al., 2018) and, in the long term, by rigorously designing conservation interventions so that their impact can be evaluated (Baylis et al., 2016).
Our testing highlighted several potential areas for improvement of the method (Appendix S8), which have been incorporated in the IUCN Green Status of Species Standard (IUCN 2021). The challenges of identifying indigenous range and ecological functionality of a species have been discussed elsewhere (Akçakaya et al., 2020; Grace et al., 2019). However, one area highlighted for improvement is most relevant to the interpretation of the results presented here. For the tested species, conservation gain and conservation dependence were the difference between the current green score (i.e., species recovery score) and the green scores generated in the future-with-conservation and future-without-conservation scenarios, respectively (Figure 1). Using the species recovery score to calculate these metrics represents the use of a "static baseline" (Ferraro, 2009), where it is assumed that continued conservation action would result in a future green score greater than or equal to the species recovery score and discontinued conservation would result in a future green score less than or equal to the species recovery score. However, this is not necessarily the case, even with continued conservation, because the species' status may deteriorate in the future (if threats to a species multiply or amplify independently [Maron et al., 2015]). This explains the negative values for conservation gain and conservation dependence observed in testing (Figure 5b and c). To avoid giving the false impression that conservation action is predicted to have negative impacts on species' status, in the future assessors will have the option of using a "dynamic baseline" estimated based on a species' predicted trajectory (Ferraro, 2009).
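The static-baseline arithmetic described above can be illustrated with a minimal sketch. The function and parameter names are hypothetical, and the sign conventions are inferred from the text (negative conservation gain means a lower score with continued conservation; negative conservation dependence means a higher score without it). The optional `baseline` argument is only a placeholder for an assessor-supplied dynamic baseline; how that baseline is actually estimated is defined in the Standard (IUCN 2021) and is not reproduced here.

```python
from typing import Optional


def conservation_gain(future_with: float, current: float,
                      baseline: Optional[float] = None) -> float:
    """Future-with-conservation green score minus the baseline (all in %).

    baseline=None applies the static baseline, i.e., the current species
    recovery score, as in the tests reported here. A numeric baseline is a
    hypothetical stand-in for a dynamic baseline supplied by assessors.
    """
    reference = current if baseline is None else baseline
    return future_with - reference


def conservation_dependence(future_without: float, current: float,
                            baseline: Optional[float] = None) -> float:
    """Baseline minus the future-without-conservation green score (all in %)."""
    reference = current if baseline is None else baseline
    return reference - future_without


# Pale-headed brushfinch: a current score of 8% and a gain of 17% imply a
# future-with-conservation score of 25% under the static baseline.
brushfinch_gain = conservation_gain(future_with=25.0, current=8.0)

# Invented numbers (not from the data set): status declines even with
# conservation, so the static baseline yields a negative gain.
static_gain = conservation_gain(future_with=45.0, current=50.0)

# Same invented species measured against an assumed projected baseline of 38%.
dynamic_gain = conservation_gain(future_with=45.0, current=50.0, baseline=38.0)

print(brushfinch_gain, static_gain, dynamic_gain)
```

In the invented case, the static baseline reports a loss even though the species ends up better off than the assumed projection, which is the false negative that the dynamic baseline option is meant to avoid.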
Our results suggest that the IUCN Green Status of Species method is a practical and operational way to assess species recovery in a manner that usefully complements assessment of extinction risk. The IUCN Green Status of Species will continue to undergo development and refinement in the years to come, following a process similar to the IUCN Red List of Threatened Species, which has evolved over the decades as improvements were identified (Hilton-Taylor, 2014). This iterative process of improvements will ensure that the IUCN Green Status of Species develops as a robust and useful measure of species recovery and conservation success.