GBIFBC Toolkit Tutorial: Tracing the Bioclimatic and Genetic Distributions of Pawpaw (Asimina triloba)

Oct 25, 2024


Introduction

Pawpaws (Asimina triloba) and the broader Asimina genus represent a unique group of trees and shrubs native to North America. Belonging to the Annonaceae family—a group best known for tropical fruits like soursop and custard apples—pawpaws stand out as the only members of the family to thrive in temperate climates. While most Annonaceae species are confined to the tropics, A. triloba can be found growing across a broad geographic range, from the southeastern United States to as far north as the northeastern U.S. and southern Canada. This broad range includes both subtropical regions with mild winters and areas with long, harsh winters, making the pawpaw an intriguing subject of study from both a climatic and evolutionary perspective.

PawPaw Leaf Structure

PawPaw Fruits

In this tutorial, we will explore how climate variation within the pawpaw's range relates to its population structure, using GBIF collection data and the GBIF-Bioclim R toolkit. Specifically, we’ll investigate how climatic factors correlate with the genetic differentiation observed across its range, drawing on insights from Graham Edward Wyatt’s 2019 dissertation [1]. Wyatt’s research identified two major genetic groups in A. triloba populations, largely split by the Appalachian Mountains and the Tombigbee River in Alabama. These groups are hypothesized to have originated from separate refugia during the Last Glacial Maximum, one along the East Coast and the other along the Gulf Coast.


Sourcing Data from GBIF

To follow along with this tutorial, first download the GBIFBC.R script from the GitHub page. It contains a template for sourcing GBIF and WorldClim data, as well as the custom functions used throughout the tutorial. Load all the required packages listed at the beginning of the script and define the GBIFWC.R functions in your R environment. You can then proceed with the tutorial script, which also provides the template for sourcing data and the specific filtering and plotting steps executed below.

The first part of the GBIF-Bioclim script demonstrates how to source collection data from GBIF using the rgbif package.

To download data from GBIF, you'll need a GBIF account. You can supply your username and password (along with the email address that occ_download() also requires) either by storing them in your .Renviron file through usethis::edit_r_environ() or by passing them directly to occ_download(). To download the data for a specific species, first retrieve its “usageKey” (or “taxonKey”) with the name_backbone() function, passing the species name (in this case, Asimina triloba). Store the returned key in an object and pass it to occ_download() to start the download request. Once you have created the GBIF_spdata object, print it to find the download key for the dataset.

As the printed output indicates, use occ_download_wait() with the download key to check the download status. Once the download has finished, use occ_download_get() and occ_download_import() to import the data as an R data frame.
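The download steps can be sketched as follows. This is a minimal sketch rather than the code from the tutorial script: the wrapper name fetch_gbif_occurrences() is hypothetical, and it assumes your credentials are stored as GBIF_USER, GBIF_PWD, and GBIF_EMAIL in .Renviron, which rgbif reads by default. The wrapper only defines the workflow; nothing is downloaded until you call it.

```r
# Hypothetical wrapper around the rgbif download workflow described above.
# Assumes GBIF_USER, GBIF_PWD and GBIF_EMAIL are set in .Renviron
# (e.g. via usethis::edit_r_environ()). Calling it needs the rgbif
# package and network access.
fetch_gbif_occurrences <- function(species = "Asimina triloba") {
  # Look the species up in the GBIF backbone taxonomy to get its usageKey
  taxon_key <- rgbif::name_backbone(name = species)$usageKey

  # Queue a download request for all occurrences of that taxon
  GBIF_spdata <- rgbif::occ_download(
    rgbif::pred("taxonKey", taxon_key),
    format = "SIMPLE_CSV"
  )

  # Block until GBIF has finished preparing the download
  rgbif::occ_download_wait(GBIF_spdata)

  # Fetch the zip archive and read it into a data frame
  rgbif::occ_download_import(rgbif::occ_download_get(GBIF_spdata))
}
```

Calling fetch_gbif_occurrences() with no arguments would request the A. triloba records; large requests can take several minutes to prepare on GBIF's side.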

Sourcing WorldClim Raster Data

To obtain bioclimatic and elevation data from WorldClim, use the worldclim_global() function from the geodata R package. After retrieving the raster data, use the stack() function to combine the bioclimatic variables and elevation rasters into a raster stack object we are calling 'bioclimd'.
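A rough sketch of this step is shown below. The helper name build_bioclim_stack() and the cache directory argument are assumptions for illustration, and the 2.5 arc-minute resolution is inferred from the layer names used later in the tutorial. Because geodata returns terra SpatRaster objects, the sketch relies on the raster package's conversion from SpatRaster to a RasterStack; the tutorial script may handle this differently.

```r
# Hypothetical helper: download WorldClim layers and combine them into one
# RasterStack. Calling it needs the geodata, terra and raster packages plus
# network access. `path` is an assumed local cache directory.
build_bioclim_stack <- function(path = tempdir(), res = 2.5) {
  # The 19 bioclimatic variables
  bio <- geodata::worldclim_global(var = "bio", res = res, path = path)
  # Elevation at the same resolution
  elev <- geodata::worldclim_global(var = "elev", res = res, path = path)

  # c() joins the layers of the two SpatRaster objects;
  # raster::stack() then converts the result to a RasterStack
  raster::stack(c(bio, elev))
}
```

With this helper, bioclimd <- build_bioclim_stack("worldclim_data") would produce the stack used in the rest of the tutorial, with the directory name being a placeholder.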


Filtering Dataset and Setting Extent

Now that you've sourced both the WorldClim rasters and the collection data from GBIF, the next step is to filter the data. Since we're focusing on A. triloba’s native range, we want to analyze naturally occurring individuals across its distribution. First, use the provided GBIFilt_country() function to limit the data to the United States and remove any samples derived from iNaturalist. This avoids misidentified or cultivated observations and reduces the sample size, since there are more than enough curated herbarium vouchers for this species. Then, filter out any collections with decimal longitudes west of -97 or east of 0, which removes samples from the West Coast of the US and from Africa that lacked proper country code data.
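The filtering logic can be illustrated in base R on a toy table. This is a stand-in for what GBIFilt_country() and the longitude filter accomplish, not the toolkit's implementation. The column names are standard GBIF fields, the values are made up, and identifying iNaturalist records through institutionCode is an assumption.

```r
# Toy occurrence table using standard GBIF column names; values are made up.
occ <- data.frame(
  countryCode      = c("US", "US", "US", "CA", "US", "US"),
  institutionCode  = c("MO", "iNaturalist", "NY", "MO", "UC", "K"),
  decimalLongitude = c(-85.2, -83.1, -78.9, -80.0, -122.3, 12.5),
  decimalLatitude  = c(35.1, 39.9, 40.7, 43.0, 37.9, 5.2)
)

# Keep US records that do not come from iNaturalist
keep_source <- !is.na(occ$countryCode) & occ$countryCode == "US" &
  occ$institutionCode != "iNaturalist"

# Keep longitudes between -97 and 0 (drops the West Coast record and
# the record whose coordinates fall in Africa)
keep_lon <- !is.na(occ$decimalLongitude) &
  occ$decimalLongitude >= -97 & occ$decimalLongitude <= 0

GBIFilt_demo <- occ[keep_source & keep_lon, ]
nrow(GBIFilt_demo)
```

Only the two herbarium-style records inside the longitude window survive the filter.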

After filtering, the dataset is reduced from 30,596 raw accessions to a more manageable 4,701 samples for analysis. With this final set of samples, calculate an appropriate extent for the raster stack by using the provided calculate_extent() function, applying a 4-degree buffer around the sampling points, and using the output extent to crop the raster stack object we named 'bioclimd'.
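The idea behind the buffered extent can be sketched in a few lines of base R. The function buffered_extent() below is a hypothetical stand-in for calculate_extent(), whose actual signature is not shown here, and the coordinates are toy values.

```r
# Hypothetical stand-in for calculate_extent(): the bounding box of the
# sample points, expanded by `buffer` degrees on every side.
buffered_extent <- function(lon, lat, buffer = 4) {
  c(
    xmin = min(lon, na.rm = TRUE) - buffer,
    xmax = max(lon, na.rm = TRUE) + buffer,
    ymin = min(lat, na.rm = TRUE) - buffer,
    ymax = max(lat, na.rm = TRUE) + buffer
  )
}

ext_demo <- buffered_extent(
  lon = c(-96.5, -84.0, -71.2),
  lat = c(30.1, 38.5, 43.9)
)
ext_demo
```

A vector in this (xmin, xmax, ymin, ymax) order can be passed to raster::extent() and then to raster::crop() to trim a raster stack such as 'bioclimd'.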

Visualizing and Analyzing Data

Let's visualize our GBIF sampling data and raster extent by creating a SpatialPoints object from the geolocated samples, using the CRS derived from the raster stack. We will then plot the GBIF samples on top of the raster for bioclimatic variable 1 (BIO1: Annual Mean Temperature) as a preliminary visualization of our samples.

Our final filtered distribution looks like what we would expect for the known distribution of pawpaws, so let's use the raster_extract() function to extract the values from our 'bioclimd' raster stack for each sample in the filtered 'GBIFilt' object, storing them in a new data frame we'll call 'biovals'.
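For readers who want to see what the point creation and extraction involve, here is a minimal sketch using the sp and raster packages directly. The helper name extract_bioclim() is hypothetical and is not the toolkit's raster_extract(); it assumes the occurrence table uses the standard GBIF coordinate columns.

```r
# Hypothetical helper; calling it needs the sp and raster packages.
# `occ` must contain decimalLongitude and decimalLatitude columns;
# `stack` is a Raster* object such as the cropped 'bioclimd' stack.
extract_bioclim <- function(occ, stack) {
  occ <- as.data.frame(occ)
  coords <- as.matrix(occ[, c("decimalLongitude", "decimalLatitude")])

  # Build SpatialPoints in the same CRS as the raster stack
  pts <- sp::SpatialPoints(coords, proj4string = raster::crs(stack))

  # One row per sample, one column per raster layer
  vals <- raster::extract(stack, pts)
  cbind(occ, as.data.frame(vals))
}
```

Something like biovals <- extract_bioclim(GBIFilt, bioclimd) would then return the occurrence table with one extra column per raster layer.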



To test our hypothesis that western and eastern genetic groups may be experiencing different environmental conditions, let’s split the sample distribution by longitudinal intervals. Use the add_geo_labels() function to create labels for the GBIF samples, assigning intervals based on the ‘decimalLongitude’ column. We'll make a new data frame object called 'lonlabel', name the label column ‘long’, and set longitudinal intervals of 6.5 degrees for our western to eastern groups. Plot these intervals using the plotgeolabels() function, passing ‘long’ as the geolabel variable and plotting it on top of the BIO1 raster (i.e. layer = 'wc2.1_2.5m_bio1').

Our longitude intervals produced four distinct groups: Western (-97 to -90.5), Midwestern (-90.5 to -84), Mideastern (-84 to -77.5), and Eastern (-77.5 to -71.2). Notably, the Mideastern group spans both sides of the Appalachian Mountains, so we should expect a bimodal distribution in this group if climatic differences between eastern and western populations are substantial.
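The binning itself can be reproduced with base R's cut(). The function label_longitude() below is a hypothetical stand-in for add_geo_labels(), assuming the intervals start at the -97 filter boundary and run eastward in 6.5-degree steps. The longitudes are toy values.

```r
# Hypothetical stand-in for add_geo_labels(): bin longitudes into
# fixed-width intervals running west to east.
label_longitude <- function(lon, start = -97, width = 6.5,
                            labels = c("Western", "Midwestern",
                                       "Mideastern", "Eastern")) {
  # Break points: -97, -90.5, -84, -77.5, -71 with the defaults
  breaks <- start + width * seq(0, length(labels))
  # Stretch the last break if any sample lies further east
  last <- length(breaks)
  breaks[last] <- max(breaks[last], max(lon, na.rm = TRUE))
  cut(lon, breaks = breaks, labels = labels, include.lowest = TRUE)
}

lon_demo <- c(-96.0, -92.3, -88.2, -80.0, -77.4, -71.2)
long <- label_longitude(lon_demo)
table(long)
```

Each interval is closed on its eastern edge, so a sample sitting exactly on a break point is assigned to the group to its west.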


Assessing Bioclimatic Variables

As a preliminary analysis of the bioclimatic variables, we will use density plots to visualize the distribution of annual mean temperature (BIO1) across these longitudinal groups. To do this, use the plot_bioclim2() function, setting the BIO1 layer ('wc2.1_2.5m_bio1') as the value_column and ‘long’ as the color_column, and choosing a density plot for visualization.

Initial results indicate that the distribution of annual mean temperature largely overlaps across the different longitudinal groups, with the Mideastern group showing a slightly bimodal distribution, as predicted. Overall, the groups appear to form a single normal distribution. To refine our analysis, we will focus on bioclimatic variables that Dr. Wyatt’s research found to be informative in modeling A. triloba's distribution. Specifically, we will assess temperature variation using BIO8 (Mean Temperature of the Wettest Quarter) and precipitation patterns using BIO13 (Precipitation of the Wettest Month), BIO17 (Precipitation of the Driest Quarter), and BIO19 (Precipitation of the Coldest Quarter). We will use the plot_bioclim2() function again, following the same procedure as before but changing the value_column from 'wc2.1_2.5m_bio1' to the layer for the variable of interest (i.e. 'wc2.1_2.5m_bio19' for BIO19, 'wc2.1_2.5m_bio8' for BIO8, etc.).
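To show what a grouped density comparison does under the hood, here is a base R sketch on simulated values. It is not the plot_bioclim2() implementation: the data frame, the column name bio1, and the helper plot_group_density() are all made up for illustration.

```r
# Simulated stand-in for the 'biovals' table: one temperature-like value
# per sample plus its longitude group. Values are random, not real data.
set.seed(1)
groups <- c("Western", "Midwestern", "Mideastern", "Eastern")
biovals_demo <- data.frame(
  long = rep(groups, each = 40),
  bio1 = rnorm(160, mean = rep(c(15, 14, 13, 12), each = 40), sd = 2)
)

# One kernel density estimate per group, evaluated over a shared x-range
rng <- range(biovals_demo$bio1)
dens <- lapply(split(biovals_demo$bio1, biovals_demo$long),
               density, from = rng[1], to = rng[2])

# Hypothetical helper that overlays the per-group density curves
plot_group_density <- function(dens, xlab = "value") {
  xlim <- range(sapply(dens, function(d) range(d$x)))
  ymax <- max(sapply(dens, function(d) max(d$y)))
  plot(0, 0, type = "n", xlim = xlim, ylim = c(0, ymax),
       xlab = xlab, ylab = "Density")
  for (i in seq_along(dens)) lines(dens[[i]]$x, dens[[i]]$y, col = i)
  legend("topright", legend = names(dens), col = seq_along(dens), lty = 1)
}
```

Calling plot_group_density(dens, "Simulated annual mean temperature") draws the overlaid curves. Heavily overlapping curves correspond to the pattern described for BIO1, while a curve with two peaks would indicate a bimodal group.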

BIO8 - Mean Temperature of the Wettest Quarter

In examining BIO8 (Mean Temperature of the Wettest Quarter), we observe that most populations of A. triloba cluster toward the higher end of the temperature spectrum, indicating a preference for warmer conditions during the wettest periods of the year. This suggests that A. triloba populations are well adapted to environments where the warmest months coincide with high precipitation, likely providing optimal growing conditions for this deciduous species, which benefits from ample water during active growth phases.

Interestingly, we see greater variation among western populations, with some experiencing colder temperatures during their wettest quarter. This wider temperature range suggests that western populations, potentially influenced by more continental climates, may have adapted to cooler, wetter conditions, indicating some physiological flexibility. These populations may have evolved mechanisms to tolerate both the colder temperatures and the water availability fluctuations that characterize these regions.

BIO13 - Precipitation of the Wettest Month

When analyzing BIO13 (Precipitation of the Wettest Month), we see a more distinct separation between western and eastern populations. Western populations experience noticeably wetter conditions during their wettest month than their eastern counterparts. This pattern may reflect adaptations to more variable precipitation regimes, with western populations potentially benefiting from more intense but less frequent rainfall events, characteristic of continental climates.

These wetter conditions during the wettest month in western regions could lead to different root system adaptations (e.g. higher water utilization and storage efficiency, more clonal growth habit, or deeper roots), allowing A. triloba to persist in environments where water availability may be less consistent. In contrast, eastern populations may rely on more steady rainfall throughout the year, aligning with the more humid, Atlantic and Gulf Coast-influenced climates.


BIO17 - Precipitation of the Driest Quarter

For BIO17 (Precipitation of the Driest Quarter), we observe an inverse relationship to BIO13. Eastern populations of A. triloba experience higher rainfall during the driest part of the year than western populations. This indicates that eastern populations are less exposed to drought stress, likely benefiting from more consistent year-round moisture due to their proximity to oceanic and Gulf moisture sources.

In terms of plant physiology, the ability to persist through a less severe dry season could mean that eastern populations are less reliant on the deep-rooted water extraction mechanisms that may be more critical in the west. Consistent with our previous analysis, western populations likely face greater seasonal moisture scarcity, potentially favoring individuals with a more spreading clonal growth habit, deeper roots, or more water-efficient physiological traits to cope with prolonged dry spells.


BIO19 - Precipitation of the Coldest Quarter

BIO19 (Precipitation of the Coldest Quarter) reveals that western populations tend to receive less precipitation during the coldest months than eastern populations. This difference again points to the more stressful (i.e. drier and colder) continental climates of the western regions, which are typically less influenced by Atlantic or Gulf Coast weather systems. In contrast, the eastern populations may benefit from moisture retained in the colder months linked to mountain rainfall capture and coastal influences.

These climatic differences highlight the role of geographical features, such as the Appalachian Mountains, in shaping the precipitation patterns experienced by A. triloba. Western populations may have adapted to survive in areas with more pronounced winter dry spells, which could involve physiological adaptations such as enhanced cold tolerance or dormancy mechanisms that allow the plants to endure periods of lower moisture availability. Despite this, we still see substantial overlap across all longitudinal subgroupings. This overlap is probably explained by the additional influences of latitude and elevation in modulating precipitation patterns, regardless of longitude.

Conclusions

In this analysis of A. triloba, we observed trends differentiating western and eastern populations, roughly aligning with the major genetic groups identified in Dr. Wyatt’s research. Specifically, western populations tended to experience colder and drier conditions during key periods of the year, while eastern populations were exposed to more consistent moisture and milder temperatures. This suggests that the environmental differences between these regions may contribute to the genetic divergence observed across the species’ range.

Despite these regional distinctions, we also noted substantial overlap between groups. This overlap indicates that factors such as latitude and elevation may further influence population variation within each longitudinal group, possibly explaining some of the common clustering across regions. Such gradients could buffer or enhance genetic differentiation depending on the specific ecological conditions experienced by local populations.

Dr. Wyatt’s research proposed that the East/West genetic split is likely the result of distinct glacial refugia along the East Coast and Gulf Coast, followed by recolonization on either side of the Appalachians after the retreat of the glaciers. Our results suggest that these genetically distinct eastern and western groups may also have secondary adaptations to their different climatic conditions, which would reinforce their genetic divergence over time. This would be evident if specific genes or genomic regions are under selective pressure due to the environmental variation we observed. However, confirming this hypothesis would require more detailed analyses that integrate genomic data to determine whether particular genes are differentially adapting to environmental conditions across the species' range.

Additionally, Dr. Wyatt highlighted the significant role of indigenous cultivation and trade in shaping A. triloba's population structure. The movement of different genotypes by indigenous peoples likely contributed to the genetic variation we observe today, complicating the picture of purely environmental and natural genetic differentiation. This historical context emphasizes the importance of considering both natural and anthropogenic factors when interpreting the population structure and ecological adaptations of A. triloba.

Reference

1. Wyatt, G.E. (2019). Phylogeography and Population Genetics of Wild and Anthropogenic Populations of a Highly Clonal Tree Species, Asimina triloba (Annonaceae) (Doctoral dissertation, University of Georgia).



