
Modeling Bioclimatic Variation Across Latitude and Elevation in the Saguaro Cactus (Carnegiea gigantea)
0
30
0
Introduction
The Saguaro cactus (Carnegiea gigantea) is an iconic species native to the Sonoran Desert, with its towering arms symbolizing the American Southwest. However, its range extends across varied latitudes and elevations, influencing its survival and adaptation. A phylogenetic study by Sanderson et al. [1], utilizing whole genome sequence data, revealed that Saguaros likely originated in the southeastern portion of their range in Mexico, with populations in the northwestern range resulting from post-glacial migration. Albuquerque et al. [2] applied climate change models and projected a range contraction for the Saguaro over the coming decades, driven by changing conditions in annual precipitation and maximum temperatures during the warmest month.

Saguaros overlooking Tucson, AZ
In this tutorial, we use environmental data to examine how these bioclimatic conditions vary across the Saguaro's range, from lower, warmer elevations in the south to the higher, cooler elevations in the north. We will assess the relationships between climate variables, latitude, and elevation, applying linear models with random effects to detect any trends and potential signals of local adaptation across different environmental gradients. First we will assess relationships between environment and elevation and latitude separately and then use linear models with random effects from the lme4() R package to assess how climatic gradients change across elevations at different latitudinal intervals. Understanding these environmental gradients offers insight into the Saguaro’s adaptability and informs conservation strategies as climate conditions shift. Download the tutorial script via GitHub to follow along.

Map from Sanderson et al. [1] highlighting genetic variation across its range. Based on their research the species is hypothesized to have originated in south-western Mexico with subsequent expansion to the east and north.
Data Collection from GBIF and WorldClim
To start, we source distribution data for C. gigantea from the GBIF database, similar to our previous analysis with Asimina triloba. Replace the scientific name with Carnegiea gigantea to obtain the proper taxon usage key, allowing us to download relevant distribution data. We then source bioclimatic and elevation data from the WorldClim database, which provides raster data layers for various environmental variables. (see last tutorial for detailed steps on sourcing GBIF and WorldClim raster data.

Saguaro National Park
Filtering and Preparing the Dataset
Since we are interested in Saguaro's distribution across Mexico and the United States, we will use the GBIFilter_world() function to filter our samples so that geolocated samples across different countries are kept. Saguaro’s are a charismatic species known by many people for their massive stature and are not likely to be misidentified with other cacti, therefore and we will keep Inaturalist observations to maximize our dataset by setting the 'inat' option to ‘T’. The Saguaro observations include some cultivated specimens in the old world, so we will also filter these out using dplyr by only keeping samples below 0 longitude:

Now lets calculate our sampling extent with a 1 degree buffer to crop our rasters using the GBIFBC.R calculate_extent() function. Next we can visualize our sampling and extent by plotting a raster and making sample points for our Saguaro observations:


Looks like there's one erroneous collection that shows up off the coast of Baja California; lets extract the raster data into a new dataframe we’ll call ‘biovals’ and then use dplyr to filter any samples where the elevation raster value is ‘NA’ to remove this sample (i.e. any sample not falling on a raster grid with a value will result in ‘NA’ ) and remove any samples with negative elevation (lme4 models will return an error if using a variable with negative values):

Grouping Data by Latitude and Elevation
Now that we have curated our sampling we are ready to split up the distribution based on latitude and elevation groups to answer our questions about bioclimatic variables across these gradients. We’ll do this by using the add_geo_labels() to latitude and elevation interval groups to our ‘biovals’ dataframe, calling latitude groups by ‘lat’ and elevation groups by ‘elev’:

We used intervals of 2.5 degrees for latitude and 500m for elevation that we can visualize using the plotgeolabels() function by setting the ‘geolabel =’ option according to the ‘lat’ and ‘elev’ naming conventions we just applied and plotting them on top of the BIO1 worldclim raster layer (i.e. ‘wc2.1_2.5m_bio_1’):


Plotting Distributions and Testing Linear Relationships
It seems like more northern latitudes tend to host populations with higher elevations, let’s test this with a linear model to assess the relationship between latitude and elevation. The Johnston lab has provided a useful function called plotRegression() for plotting linear models with the ggplot2 package that includes summary statistics (https://sejohnston.com/2012/08/09/a-quick-and-easy-function-to-plot-lm-results-in-r/):


There does appear to be a somewhat positive relationship between latitude and elevation, but there is also substantial variation in the data. Let’s look at the two bioclimatic variables (BIO12 - annual precipitation, and BIO5 - Max temperature of warmest month) that were found to have the highest explanatory power for predicting the distribution of Saguaro based on research by Albuquerque et al. [2]. First let’s look at annual precipitation relative to our latitude and elevation groups using the plotbioclim2() and plotbioclim() functions and then test the relationship with a linear model:


There doesn’t seem to be a clear sub-structuring of annual precipitation across latitude groups, which is also confirmed by the linear model:

We have a fairly flat slope suggesting there isn’t a strong correlation between latitude and annual precipitation. Despite a statistically significant p-value, the low R² value suggests that latitude alone explains little of the variation in precipitation. Now let's look at annual precipitation relative to our elevation groups to see if we can detect a more distinct separation of groups, using the plotbioclim() function to plot mean and standard deviation on distribution peaks rather than on a corner legend when using plotbioclim2():

We can see that there is a much more distinct separation of groups along our elevation intervals, with the lowest elevation population showing the lowest annual precipitation values that increase as we go up in elevation, indicating that higher elevation populations receive a higher annual rainfall. Let’s test this relationship with a linear model:

We find a stronger positive slope and higher R² value compared to the latitude-based model, confirming that lower elevations receive less annual precipitation, while higher elevations see progressively more rainfall.
Now let’s explore how latitude and elevation are related to the max temperature of the warmest month (i.e. BIO5), using the same functions as before, but substituting bio12 for bio5:

Examining BIO5 across latitude groups reveals complex patterns. The highest latitude group, with only a few samples, records the lowest temperatures, whereas the second-highest latitude group shows the warmest temperatures. This distribution suggests that maximum temperature trends are not strictly uniform across latitude, possibly due to the influence of local factors like elevation. The highest latitude group (36-36.25) only consisted of a handful of samples, but they are clearly at the lower temperature end of the climatic range relative to the other groups. Let’s test this relationship with a linear model to see the degree of correlation between these variables:

We see a more significant slope than in the previous model for latitude with annual precipitation, showing an increase in temperature at higher latitudes. There's still substantial variation in the data, particularly in the latitudes between 32 and 34 that also show fairly low temperature values. This variation might be better explained by elevation gradients along our latitudinal groups. Let’s look at BIO5 distributions relative to our elevation groups using plotbioclim2() to see if we can detect a clearer relationship:

As in the previous case with precipitation, we see a much clearer separation of distributions along our elevation groups, with the lowest elevation group having the highest temperatures that seems to linearly decrease as the elevation increases, but let’s explicitly test this with a linear model to confirm the relationship:

Plotting BIO5 against elevation groups reveals a more distinct trend, with temperatures decreasing as elevation increases. A strong negative linear relationship suggests that higher elevation populations experience cooler temperatures during the warmest months, though there does appear to be more variation associated with this for samples below 500m. This cooling effect with elevation aligns with typical temperature-elevation relationships seen in many ecosystems and may reflect an adaptive pressure for Saguaro populations to tolerate colder temperatures at higher elevations.
Elevation seems to be the strongest geographic factor explaining temperature and precipitation gradients for Saguaro, but latitude also explained some of the variation. To explore the interplay between elevation and latitude relative to the climatic variation we will use a linear model with random effects as implemented in the lme4 R package. We’ll keep elevation as the main explanatory variable, given its consistent influence across the dataset. However, we introduce latitude groups as random effects to capture the additional variation in temperature and precipitation that cannot be fully explained by elevation alone. To do this we set up our linear model with the lmer() function as we did with the lm() function previously, but we add a random effect by adding a variable with the syntax (1 | random_effect_variable), which in our case will be the latitude groups we designated. We will also plot the model and visualize differences in the random effect groups by making a ggplot2 plot object we are calling 'ip' and setting the aes() color option to 'lat' for the latitude groups we made. Let’s start with precipitation:


We observe generally higher precipitation as samples occupy higher elevations, but the lowest latitudes show a much weaker correlation with a relatively flat . This suggests that elevation doesn't predict precipitation patterns as reliably at lower latitudes, but is also assocaited with narrower elevation gradients for the low latitude group, where all samples fall below 500m. The next three higher latitude groups show a consistent positive correlation between annual precipitation and elevation. The highest latitude group only consists of 5 samples and a much smaller latitude range due our interval cut-offs. This group suggests that samples at the highest latitudes are also at generally higher elevations relative to the other latitudes. Now let’s look at the maximum temperature of warmest month, BIO5 in our model:

In this case we can see that the lower latitudes have a range between ~35-40C for BIO5, whereas higher latitudes span a much broader range both above 40C and below 35C. Additionally the slopes for the two lower latitudes are relatively flat compared to the higher latitudes showing a clear negative correlation with temperatures as elevation increases. This pattern coincides with the idea that lower latitudes with warmer climates closer to the equator will buffer temperatures at higher elevations relative to higher latitudes where continental and seasonal climates expose populations to more extreme temperature fluctuations.
Potential for Local Adaptation in Saguaro
Our results suggest that both temperature and precipitation factors predictive of Saguaro’s distribution are highly correlated with elevation, particularly at the higher latitudes of the species’ range. The two lowest latitude groups still shared substantial environmental overlap with the higher latitudes, except for the higher end of the elevation range which had much higher maximum elevations at the three highest latitude groups. This suggests that any local adaptation would be occurring at the higher elevation populations at latitudes above 31 degrees. Sanderson et al. [1] found that the origin and diversity center of Saguaro occurs near the south east of its range in Mexico followed by post-glacial expansion of the species to the north west into the US. Taken together with our results, higher elevation northern populations may be undergoing adaptive evolution and divergence following their range expansion to deal with wetter and colder climates. Future studies would benefit from comparing these northern populations to the genetic and physiological traits of southern populations, offering insights into how Saguaros may continue adapting to shifting climate conditions.
References:
1. Michael J Sanderson, Alberto Búrquez, Dario Copetti, Michelle M McMahon, Yichao Zeng, Martin F Wojciechowski, Origin and Diversification of the Saguaro Cactus (Carnegiea gigantea): A Within-Species Phylogenomic Analysis, Systematic Biology, Volume 71, Issue 5, September 2022, Pages 1178–1194, https://doi.org/10.1093/sysbio/syac017
2. Albuquerque, F., Benito, B., Rodriguez, M.Á.M. and Gray, C., 2018. Potential changes in the distribution of Carnegiea gigantea under future scenarios. PeerJ, 6, p.e5623.