Estimating the establishment likelihood of pests

Understanding where a harmful invasive species might arrive is critical for biosecurity decision support, especially given the finite resources available for surveillance. A risk map should also weigh the likely harm that would be caused if a pest were to establish itself in different areas. Analysis of potential harm is not part of this workflow, so from here on we will focus on estimating the likelihood of a pest establishing itself (“establishment likelihood”).

Camac et al. 2020 provide a more comprehensive guide to producing scientifically robust models, and we suggest you review that report to deepen your understanding of both the methodology and the ways in which data and models might be improved. A more recent report, Camac et al. 2021, provides important updates to this methodology.

What is an establishment likelihood map?

An establishment likelihood map is a grid of equal-sized cells covering the area of interest, often the whole of Australia. The value in each grid cell is calculated by multiplying the values of up to three grids: (1) the abiotic suitability grid, where the value in each cell is usually the probability that conditions in that cell are not too hot, cold, dry or wet for the pest to survive; (2) the biotic suitability grid, where the value in each cell is an estimate of how likely it is that the biological material or host the pest requires will be found in that cell; and (3) the pest arrivals grid, where the value in each cell corresponds to the proportion of viable pest arrivals likely to occur in that cell.
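The cell-by-cell multiplication described above can be sketched as follows. This is only an illustration: the grid values are hypothetical, plain Python lists stand in for raster layers, and the helper name `multiply_grids` is not part of any Biosecurity Commons tool.

```python
# Hypothetical 2 x 3 grids. Real layers would be rasters that share
# the same coordinate reference system, extent and resolution.
abiotic = [[1.0, 0.5, 0.0],
           [0.8, 1.0, 0.2]]
biotic = [[0.2, 0.4, 0.9],
          [0.0, 0.5, 1.0]]
arrivals = [[0.10, 0.05, 0.30],
            [0.20, 0.02, 0.01]]


def multiply_grids(*grids):
    """Multiply any number of equally shaped grids cell by cell."""
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    result = [[1.0] * n_cols for _ in range(n_rows)]
    for grid in grids:
        for r in range(n_rows):
            for c in range(n_cols):
                result[r][c] *= grid[r][c]
    return result


# Establishment likelihood = abiotic x biotic x arrivals, per cell.
establishment = multiply_grids(abiotic, biotic, arrivals)
```

Because the grids are multiplied, a zero in any one grid (for example, no host material) gives an establishment likelihood of zero for that cell, whatever the other grids contain.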

In more detail:

Establishment likelihood = Abiotic suitability X Biotic Suitability X Total pest arrivals

Abiotic suitability = the probability that a species can survive and/or reproduce given the abiotic conditions.

Usually, abiotic suitability is defined by areas known to be too cold, hot, wet, or dry for the pest to survive. If the pest species is likely to do well in all the climatic conditions found within Australia, this step could be skipped. If the species would not be able to survive in areas that reach freezing temperatures but could survive in other climatic conditions, it might be appropriate to simply mask out those areas that freeze (probability of 0) and give everything else a probability of 1. Similarly, if there is high certainty that the suite of climatic conditions available to the species in its native range would define the locations used in Australia, an environmental envelope would capture those conditions, and the mean values could be given more weight if binary scoring was not of interest. If you want to prioritise capturing the variation in abiotic suitability, you may want to generate it with a species distribution model, using machine learning or statistical methods that produce a wider range of probabilities describing the suitable climatic niche.

Most people take a conservative approach to this step as they want to make sure not to rule out locations on a map where the establishment of a pest is possible. Keep in mind how many species of plants or animals are found outside their native climatic niches within botanic gardens or zoos throughout the globe. The key here is to identify those locations where a species will not survive, or is less likely to survive, and that really depends on the assumptions made based on your literature review.
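The simplest option mentioned above, masking out areas that freeze, might look like the sketch below. The temperature values, the 0 degree threshold and the function name `freeze_mask` are all assumptions made for illustration; a real threshold should come from your literature review.

```python
# Hypothetical minimum annual temperature (degrees Celsius) per grid cell.
min_temp_c = [[-3.0, 1.5, 6.0],
              [0.0, 4.2, -0.5]]


def freeze_mask(temperatures, threshold_c=0.0):
    """Return 0.0 where the minimum temperature reaches the threshold, else 1.0."""
    return [[0.0 if t <= threshold_c else 1.0 for t in row]
            for row in temperatures]


# Binary abiotic suitability: 0 where the cell freezes, 1 everywhere else.
abiotic_suitability = freeze_mask(min_temp_c)
```

A conservative analyst might choose a threshold colder than the published survival limit, so that marginal cells are kept rather than ruled out.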

Biotic suitability = the locations of hosts or host material, which can include the presence of hosts, food or habitat.

There are a variety of ways to calculate the value in each grid cell representing biotic suitability.  If the pest is a livestock disease, biotic suitability may simply be the number of livestock animals in each grid cell.  If the disease can be transmitted by an insect, the number of livestock animals might be multiplied by the probability that the insect would be found in that grid cell based on a species distribution model.

If a plant pest is likely to use any green vegetation as host material, a grid of vegetation greenness (NDVI) might be a good layer to use. Land use layers are also likely to be useful for identifying which grid cells are likely to have suitable host materials. The Australian Land Use and Management (ALUM) Classification, which maps everything from where different crops are found to natural area locations, is perhaps the most commonly used land-use layer for all of Australia.

A critical assumption when computing biological suitability relates to the likelihood that a pest will jump to an Australian host or habitat.  Often there is no data on how pests from other continents might interact with Australia’s unique ecology.

Probability of Pest Arrival = the estimated probability of one or more viable pest arrivals in each grid cell.

Most pests will have many ways in which they may enter or spread throughout Australia, e.g., tourists, mail, or shipping containers. Each of these different modes of entry is considered a pathway. While the biosecurity system aims to prevent entry of pests and other biosecurity hazards, these controls are not 100% effective. Pests that evade the biosecurity controls correspond to “leakage” events (see below and Camac et al. 2021). The probability of a pest arriving at a given location (i.e., a grid cell) in a given time period depends on: (1) the leakage rate (average number of leakage events per time period); (2) the probability that a leakage event will result in the pest’s viable establishment somewhere in susceptible habitat (e.g., in a location that contains host material); and (3) the spatial distribution of the probability of establishment for the relevant pathway (i.e., assuming a viable establishment occurs somewhere, what is the probability that it will occur in a particular cell, relative to other cells).

Leakage and viability rates have been estimated for a range of biosecurity hazards by the federal Department of Agriculture, Fisheries and Forestry, based on expert elicitation (Hemming et al. 2018) and building on previous efforts such as the department’s Risk-Return Resource Allocation (RRRA) model. The spatial distribution of arrivals for a pathway might be informed by the relative volume of containers arriving in each postcode (for a container pathway) or human population density (e.g., if the pathway is mail or residents returning from international travel), or may be a function of distance from points of entry such as ports (e.g., vessels pathway) or airports (e.g., tourists pathway).

Given the leakage and viability rates and the spatial arrival weights, the probability of a location encountering one or more arrivals is calculated for each relevant pathway. The resulting grids are then combined by calculating their union, producing a single grid that gives the overall probability of one or more viable pest arrivals in each grid cell.
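One way to make the calculation above concrete is sketched below. It rests on assumptions that are not stated in this guide: arrivals are treated as a Poisson process, the expected number of viable arrivals is shared among cells in proportion to the spatial weights, and pathways are treated as independent when taking the union. All rates, weights and function names are hypothetical.

```python
import math


def arrival_probability(leakage_rate, viability, weights):
    """Per-cell probability of one or more viable arrivals for one pathway.

    Assumes Poisson arrivals: the expected number of viable arrivals per
    time period (leakage_rate * viability) is split among cells in
    proportion to their spatial weight.
    """
    total_weight = sum(sum(row) for row in weights)
    expected_total = leakage_rate * viability
    return [[1.0 - math.exp(-expected_total * w / total_weight) for w in row]
            for row in weights]


def union(grids):
    """Combine probability grids as 1 - product of complements."""
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    combined = []
    for r in range(n_rows):
        combined_row = []
        for c in range(n_cols):
            prob_no_arrival = 1.0
            for grid in grids:
                prob_no_arrival *= 1.0 - grid[r][c]
            combined_row.append(1.0 - prob_no_arrival)
        combined.append(combined_row)
    return combined


# Hypothetical spatial weights for two pathways on a 2 x 2 grid.
container_weights = [[5.0, 1.0], [0.0, 4.0]]  # e.g. relative container volume
mail_weights = [[2.0, 2.0], [1.0, 5.0]]       # e.g. human population density

containers = arrival_probability(leakage_rate=10.0, viability=0.01,
                                 weights=container_weights)
mail = arrival_probability(leakage_rate=50.0, viability=0.001,
                           weights=mail_weights)
total_arrivals = union([containers, mail])
```

The union keeps every cell value between 0 and 1, which simple addition of the pathway grids would not guarantee.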


Assumptions & caveats

This workflow does not provide guidance on how to prioritise different pests or diseases, how to optimally deploy surveillance, or how to account for climate change. It does produce a map of establishment likelihood, but the accuracy of that map depends on the data used and the assumptions made while constructing the model. It is easy to produce results with the tools available on Biosecurity Commons (see below); however, the utility of those results depends entirely on you, the user. Results from these workflows are also relative to the data and modelling decisions made, so while they provide a map of relative establishment likelihood, values should not be compared quantitatively between species or modelling approaches. The different methods available here can give very different results. There is no single best method, so take care, evaluate your assumptions, and validate your model estimates whenever possible. A full description of the considerations involved in generating a useful establishment likelihood map is presented in recent reports (Camac et al. 2020, Camac et al. 2021).

Generally, if a species' biology and distribution are well understood, and the likely pathways of arrival are well understood, a biosecurity establishment likelihood map will reflect that understanding, with perhaps a few new locations identified that had not been considered. However, those assessing risks often have an incomplete understanding of the biological requirements of a given pest, or of how it might arrive. This workflow therefore provides a useful screening tool, which also produces maps of where establishment risk is relatively highest.

Remembering that:

Establishment likelihood = Abiotic suitability X Biotic Suitability X Total pest arrivals

This workflow does not provide an estimate of the uncertainty in each variable being multiplied. Because the variables are multiplied, their errors compound: to a first approximation, the relative error in the establishment likelihood estimate is the sum of the relative errors in each variable. Therefore, if you are highly uncertain about one of these estimates, that uncertainty carries through to your final estimate. It may be best to leave out variables you are highly uncertain about. Research into optimal ways to include uncertainty in results is ongoing and should be encouraged.
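As a rough illustration of how uncertainty carries through a product, the sketch below compares a first-order estimate (adding the relative errors) with the worst-case bounds obtained by pushing every variable to its upper or lower limit. The central estimates and the plus-or-minus relative errors are hypothetical values for a single grid cell, chosen only for this example.

```python
def product(values):
    """Multiply together all values in an iterable."""
    result = 1.0
    for v in values:
        result *= v
    return result


# Hypothetical central estimates for one grid cell.
estimates = {"abiotic": 0.8, "biotic": 0.5, "arrivals": 0.1}
# Hypothetical relative errors (0.30 means plus or minus 30%).
relative_errors = {"abiotic": 0.05, "biotic": 0.10, "arrivals": 0.30}

central = product(estimates.values())
upper = product(v * (1 + relative_errors[k]) for k, v in estimates.items())
lower = product(v * (1 - relative_errors[k]) for k, v in estimates.items())

# First-order approximation: relative errors of the factors add.
first_order_relative_error = sum(relative_errors.values())

# Worst-case relative deviations of the product from its central value.
worst_case_above = upper / central - 1
worst_case_below = 1 - lower / central
```

In this example the most uncertain variable (arrivals) accounts for most of the uncertainty in the product, which is why a single poorly known layer can dominate the reliability of the final map.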

Until uncertainty is captured quantitatively, it is vital to record the assumptions underlying an establishment likelihood map and to document the data used as well as how those data were derived. These steps are vital for communicating the uncertainty in the estimate, and also for improving future estimates. Measures of uncertainty are commonly reported when generating more complex species distribution models, but those reported levels of uncertainty can only be trusted once the model is tested with independent data and any underlying assumptions are thoroughly tested. The same applies to biological suitability and pest arrivals.

It is worth reiterating that establishment likelihood values are relative.

If the abiotic suitability is simply set to 1 for areas deemed to have suitable conditions and zero for unsuitable areas, then when this binary abiotic suitability is multiplied by biotic suitability or pest arrival probabilities, the resulting establishment likelihood values will be higher than they would have been if the abiotic suitability had been scaled with a range of values between zero and one. This is just one simple example of why strict comparisons between numeric values of establishment likelihood from different models or species should be avoided.
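A small numeric sketch of this effect is given below. The suitability values are hypothetical, and the rule used to binarise (any non-zero suitability counts as suitable) is an assumption made for the example.

```python
# Hypothetical continuous abiotic suitability for four grid cells,
# with the same biotic suitability in every cell.
abiotic_scaled = [0.9, 0.6, 0.3, 0.0]
biotic = [0.5, 0.5, 0.5, 0.5]

# Binary version: any non-zero suitability is treated as fully suitable.
abiotic_binary = [1.0 if a > 0 else 0.0 for a in abiotic_scaled]

likelihood_scaled = [a * b for a, b in zip(abiotic_scaled, biotic)]
likelihood_binary = [a * b for a, b in zip(abiotic_binary, biotic)]

# The binary map is never lower than the scaled map in any cell.
binary_never_lower = all(b >= s for b, s in
                         zip(likelihood_binary, likelihood_scaled))
```

Both maps rank the unsuitable cell last, but the binary version no longer distinguishes between the three suitable cells, so its values cannot be compared directly with those of the scaled version.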

Additional assessment of the potential financial, ecological, or societal harm of a potential establishment of an invasive species should be done separately.

Finally, there are often relatively high levels of uncertainty in these kinds of approaches, but these approaches provide an estimate of the relative likelihood of pest establishment which can be shared and improved upon.  The tools available here will improve the transparency, reproducibility and consistency of workflows while also speeding up the modelling process. Importantly, by sharing workflows others can improve on existing estimates rather than reinventing them.

Functions described further:

Conform: “Conforms the spatial configuration (CRS, extent, and resolution) of a spatial layer to that of another (template) layer.” It also gives the user the option to “binarize” or normalize the data.

Combine layers: “Combines multiple spatial layers via (optionally weighted) cell-wise multiplication, addition, or union (via complements), and optionally binarizes the output.”

Distribute Features: “Distributes feature polygon values across a template or mask raster spatial layer. The total value assigned to each polygon is divided across all corresponding raster cells when a template (single value plus NAs) raster is provided, or across non-zero cells when a mask (zero and non-zero values plus NAs) raster is provided, which is used as a binary mask (regardless of the variety of non-zero values).”

Aggregate Categories: “Aggregate the categories within a spatial layer to a coarser resolution, that of another (template) layer, based on the presence of each of the user-selected categories in each cell. The resulting aggregated layer cells may be binarized to one or zero, or indicate the proportion of selected categories present in the cell.”

Distance weight layer: “Calculates a distance-weighted spatial layer based on the proximity of each cell to a series of points (features), or via a pre-calculated distance layer. Each cell is weighted via a negative exponential function applied to the distance of the nearest point or pre-calculated distances.”  (also one way to generate a bias layer)
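To give a feel for what two of these functions do, the sketch below re-creates the ideas behind Distribute Features and Distance weight layer on tiny grids. It is not the Biosecurity Commons implementation: the function names, arguments and inputs are hypothetical, polygons are assumed to be already rasterised to zone labels, distances are measured in cell units, and the form exp(-d / beta) is one assumed parameterisation of a negative exponential.

```python
import math


def distribute_features(zone_ids, zone_totals, mask):
    """Share each polygon's total equally among its non-zero mask cells."""
    cell_counts = {}
    for zone_row, mask_row in zip(zone_ids, mask):
        for zone, m in zip(zone_row, mask_row):
            if zone is not None and m != 0:
                cell_counts[zone] = cell_counts.get(zone, 0) + 1
    return [[zone_totals[zone] / cell_counts[zone]
             if zone is not None and m != 0 else 0.0
             for zone, m in zip(zone_row, mask_row)]
            for zone_row, mask_row in zip(zone_ids, mask)]


def distance_weight(n_rows, n_cols, points, beta=1.0):
    """Weight each cell by exp(-d / beta), d = distance to the nearest point."""
    return [[math.exp(-min(math.hypot(r - pr, c - pc) for pr, pc in points)
                      / beta)
             for c in range(n_cols)]
            for r in range(n_rows)]


# Hypothetical inputs: two rasterised polygons ("A" and "B") and a mask
# whose non-zero cells are the only ones allowed to receive value.
zone_ids = [["A", "A", "B"],
            ["A", "B", "B"]]
mask = [[1, 0, 1],
        [1, 1, 0]]
distributed = distribute_features(zone_ids, {"A": 100.0, "B": 30.0}, mask)

# Weights that decay with distance from a single point of entry at (0, 0).
port_weights = distance_weight(2, 3, points=[(0, 0)], beta=2.0)
```

The distributed grid preserves each polygon's total (the cells of polygon A sum to 100), while the distance weights equal 1 at the point of entry and decline towards 0 further away.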


Literature cited:

Camac, J. S., Baumgartner, J., Robinson, A., & Elith, J. (2020). Developing pragmatic maps of establishment likelihood for plant pests. Technical Report for CEBRA project 170607.

Camac, J. S., Baumgartner, J. B., Hester, S., Subasinghe, R., & Collins, S. (2021). Using edmaps & Zonation to inform multi-pest early-detection surveillance designs. Technical Report for CEBRA project 20121001.

Elith, J. (2017). Predicting distributions of invasive species. In: Invasive species risk assessment and management (eds. Robinson AP, Walshe T, Burgman MA, Nunn M), pp. 93–129. Cambridge University Press.

Fourcade, Y., Besnard, A. G., & Secondi, J. (2017). Paintings predict the distribution of species, or the challenge of selecting environmental predictors and evaluation statistics. Global Ecology and Biogeography, 27, 245–256.

Hemming, V., Burgman, M. A., Hanea, A. M., McBride, M. F., & Wintle, B. C. (2018). A practical guide to structured expert elicitation using the IDEA protocol. Methods in Ecology and Evolution, 9(1), 169-180.

Phillips, S. J., Dudik, M., Elith, J., Graham, C. H., Lehmann, A., Leathwick, J., & Ferrier, S. (2009). Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data. Ecological Applications, 19, 181–197.

Syfert, M. M., Smith, M. J., & Coomes, D. A. (2013). The Effects of Sampling Bias and Model Complexity on the Predictive Performance of MaxEnt Species Distribution Models. PLoS ONE, 8, e55158.

Warton, D. I., & Shepherd, L. C. (2010). Poisson point process models solve the “pseudoabsence problem” for presence-only data in ecology. The Annals of Applied Statistics, 4, 1383–1402.