Statistical Methods Seminar Series

EFI and the ESA Statistical Ecology Section have hosted two rounds of this virtual seminar series that demonstrates a variety of quantitative methods applied within Ecology and Environmental Science in the R programming language. Attendees gain valuable insight into methods that they may or may not be familiar with from experts on a given topic.

We are in the third round of this seminar series! See the schedule and register for the calls below.
If you have suggestions for R packages of interest or potential speakers, please reach out at info@ecoforecast.org. In particular, we are looking for speakers who are women and/or from historically marginalized communities. Speakers are provided a $150 honorarium.

Target audience: Quantitative environmental scientists and ecologists either in-training (graduate students and postdocs) or working professionals in academia, government agencies, or non-governmental organizations. Attendees are expected to be proficient in R.

Webinar structure: Each seminar is 1 to 1.5 hours long and is led by a different invited speaker with expertise in a given topic or statistical method. Speakers spend the first part of the webinar presenting a project where they used the method, then share R code or packages used for the statistical method. Presenters walk through the code, taking time to describe common pitfalls or stumbling blocks for performing the method and visualizing results. R code is available on this GitHub repository. Recordings from the webinars are available in the EFI YouTube Statistical Methods Webinar Series Playlist.

Dates/Times: We will have monthly webinars typically on the first Monday of each month at noon US Eastern unless otherwise noted.

Click HERE to find recordings and R resources from previous seminars.

Schedule of Upcoming Sessions

The following webinars are held at noon US Eastern time.

  1. March 4, 2024 – Toryn Schafer (Texas A&M University) and Chris Wikle (University of Missouri); Spatio-temporal modeling, Register Here
  2. April 15, 2024 – Tati Micheletti (Technische Universität Dresden); SpaDES for developing and running Spatial Discrete Event Simulation models, Register Here
  3. May 6, 2024 – Daniel Reuman (University of Kansas); wsyn Wavelet approaches to Studies of Synchrony in Ecology, Register Here

March 4, 2024 – Toryn Schafer (Texas A&M University) and Chris Wikle (University of Missouri); Spatio-temporal modeling

Register for the call HERE

Abstract: Spatio-temporal statistical models are increasingly being used across a wide variety of scientific disciplines to describe and predict spatially explicit processes that evolve over time. Such data can be continuous in space and time, events, areal summaries, or trajectories. Scientific interest can be quite varied, ranging from parameter inference, interpolation in space or time, forecasting, to simulation model emulation. Drs. Toryn Schafer and Chris Wikle present here an introduction to spatio-temporal statistics and how one can use R to work with such data.  Topics will include data exploration and visualization, an introduction to descriptive methods based on Gaussian processes, and dynamic spatio-temporal models that attempt to model the explicit evolution of spatial processes through time.  Toryn and Chris will present motivation for spatio-temporal statistical methods along with R examples illustrating these topics.  A good reference for this talk is the book Spatio-Temporal Statistics with R by Wikle, Zammit-Mangion, and Cressie, which is free to download at https://spacetimewithr.org/.   Attendees should expect to gain an appreciation for ways that spatio-temporal statistics can be used to enhance the study of ecological data and enough background to know which R packages could be useful for such analyses.

Toryn Schafer is an Assistant Professor and 2024 ConocoPhillips Data Science Faculty Fellow in the Department of Statistics at Texas A&M University. Dr. Schafer’s research interests span many topics, but are primarily related to spatio-temporal modeling, Bayesian statistics, and alternative learning frameworks such as machine learning, deep learning, and reinforcement learning. She is highly motivated by ecological and environmental applications.

Christopher K. Wikle is a Curators’ Distinguished Professor and Chair of Statistics at the University of Missouri. Dr. Wikle’s research interests are in spatial and spatio-temporal statistics applied to environmental, ecological, geophysical, agricultural and federal survey applications, with particular interest in dynamics. His work has been concerned with formulating computationally efficient deep hierarchical Bayesian models motivated by scientific principles, with more recent work at the interface of deep neural models in machine learning. Dr. Wikle has written two books on analyzing spatio-temporal data, “Statistics for Spatio-Temporal Data” (2011) and “Spatio-Temporal Statistics With R” (2019), the latter of which can be accessed at https://spacetimewithr.org/.

Summary of Recordings and R Resources from the Previous 2 Rounds of this Seminar Series:

  1. Applications of the `aniMotum` R package for quality control, behavioural estimation and simulation of animal movement data by James Grecian, February 5, 2024 (recording to be added by Feb 12)
  2. HMSC, Hierarchical Modelling of Species Communities by Nerea Abrego, December 4, 2023
  3. Ecological Forecasting with Dynamic Generalized Additive Models by Nicholas Clark, November 6, 2023
  4. Integrated Species Distribution Models by Neil Gilbert, May 1, 2023
  5. Bayesian Stable Isotope Mixing Models and the MixSIAR R package by Brian Stock, April 3, 2023
  6. State Space Models and the Template Model Builder (TMB) R package by Marie Auger-Méthé, March 6, 2023
  7. Spatial Modeling in Ecology by Marie-Josée Fortin, February 6, 2023
  8. Zero-Inflated GLM and GLMM by Alain Zuur and Elena Ieno, January 9, 2023
  9. Structural Equation Models and the piecewiseSEM R package by Jon Lefcheck, December 5, 2022
  10. Analysis of Bioacoustic Data by Marcelo Araya-Salas, November 7, 2022
  11. Spatial Occupancy Models and the spOccupancy R package by Jeff Doser & Andrew Finley, October 3, 2022
  12. Hidden Markov Models by Vianey Leos Barajas, May 2, 2022
  13. NIMBLE by Lauren Ponisio, April 18, 2022
  14. Multi-Species (Species Interactions) Occupancy Modeling by Chris Rota, April 4, 2022
  15. Integrated Step-Selection Analysis by Brian Smith and Tal Avgar, March 7, 2022
  16. Movement Ecology by Théo Michelot, February 7, 2022
  17. Generalized Joint Attribute Modeling (GJAM) by Tong Qiu, January 24, 2022
  18. Generalized Additive Models (GAMs) by Gavin Simpson, January 3, 2022
  19. Species Archetype Models and Regions of Common Profile Models by Skip Woolley, December 6, 2021
  20. Mixed Models by Ben Bolker, November 1, 2021

Recordings and R Resources from Individual Seminars

Applications of the `aniMotum` R package for quality control, behavioural estimation and simulation of animal movement data by James Grecian. February 5, 2024

Animal telemetry and bio-logging data are essential tools that allow us to understand the movements, behaviour, social interactions and foraging ecology of mobile or cryptic species. However, the data collected by telemetry and bio-logging devices are subject to issues such as irregularly timed intervals and location measurement errors. State-space models (SSMs) are powerful tools for dealing with these issues but can be difficult for non-specialists to implement. In this talk, Dr. James Grecian will introduce `aniMotum`, an R package for fitting SSMs to animal movement data. The aim of `aniMotum` is to provide a user friendly approach for (1) simple and fast quality control of error-prone animal location data; and (2) for inference of changes in movement behaviour along animal movement paths. James will first discuss different types of bio-logging data and their sources of measurement error, before outlining how aniMotum can help address these case studies. Attendees should leave with a basic understanding of how they could use `aniMotum` in their own animal movement analysis workflows.

James Grecian is a postdoctoral researcher at Durham University. Dr. Grecian’s research focuses on understanding the ecology of marine vertebrates and their responses to ecosystem change in the Anthropocene. This work includes expertise in animal biologging, remote sensing, stable isotope analysis and data science.

HMSC, Hierarchical Modelling of Species Communities by Nerea Abrego. December 4, 2023

Hierarchical Modelling of Species Communities (HMSC) is a model-based approach for analyzing community ecological data. The {HMSC} R package uses a Bayesian framework with Gibbs Markov chain Monte Carlo sampling, which is a flexible framework for Joint Species Distribution Modelling (JSDM). This framework can be used to relate species occurrences or abundances to environmental covariates, species traits, and phylogenetic relationships.

Nerea Abrego is an ecologist at the University of Jyväskylä who jointly manages the Predictive Community Ecology Group (PreCom). Dr. Abrego is particularly interested in developing empirical and statistical methods for efficiently acquiring and analyzing community ecology data. Many of her studies have focused on fungi, but she has also worked on communities of other taxonomic groups such as insects, birds, plants, and bacteria.

Ecological Forecasting with Dynamic Generalized Additive Models by Nicholas Clark. November 6, 2023

  • The recording of the presentation on Dynamic Generalized Additive Models and the mvgam R package is here: https://youtu.be/0zZopLlomsQ
  • Nicholas starts the R tutorial at time 31:35 and the Q&A starts at time 1:11:13.
  • The presentation and R resources are available on GitHub HERE
  • The Multivariate (Dynamic) Generalized Additive Models, mvgam R package information referenced at time 31:20 can be found at: https://nicholasjclark.github.io/mvgam/

Time series analysis and forecasting are standard goals in applied ecology.  But ecological forecasting is difficult because ecology is complex. The abundances of species, for example, fluctuate for many reasons. Food and shelter availability limit survival. Biotic interactions affect colonization and vital rates. Severe weather events and climate variation alter habitat suitability. These sources of variation make it difficult to understand, let alone predict, ecosystem change. Moreover, most available time series software cannot handle features that dominate ecological data, including overdispersion, clustering, missingness, discreteness and nonlinear effects.

In this talk, Dr. Clark introduces Dynamic Generalized Additive Models (DGAMs) as one solution to meet this complexity. He illustrates a number of models that can be tackled with the {mvgam} R package, which builds Stan code to specify probabilistic Bayesian models that include nonlinear smooth functions, random effects and dynamic processes, all with a simple interface that is familiar to most R users.

Nicholas Clark is an Australian Research Council Early Career Fellow in the School of Veterinary Science at the University of Queensland. He is broadly interested in exploring new ways to (1) understand how ecological communities are formed and (2) predict how they will change over time. Dr. Clark’s research focuses on developing computational tools and adapting techniques from statistical forecasting to study how organisms and ecosystems respond to change, with applications across a variety of ecological systems. He is also the developer of {mvgam}, an R package for fitting and interrogating Bayesian Dynamic GAMs.

Integrated Species Distribution Models by Neil Gilbert. May 1, 2023

  • The recording of the presentation on iSDMs and the Q&A is here: https://youtu.be/8pOyTjBFeOI
  • Neil starts the R tutorial at time 33:00 and the Q&A starts at time 58:18.
  • The presentation and R resources are available on GitHub HERE
  • During the Q&A Neil shared resources that are helpful for people getting started with iSDMs. Those resources are:
  • Books 
  • Papers 

Integrated species distribution models (iSDMs) allow ecologists to use multiple types of data to estimate the distribution and abundance of species. The key aspect of iSDMs is information sharing between data sources; these models assume that different types of data (for example, counts and presence-only observations) provide information about the same underlying process of interest (the occurrence or abundance of species). Importantly, iSDMs allow information to be shared among data sources while accounting for biases or differences in the observation processes between data types. iSDMs are an appealing approach because they allow researchers to bring all available sources of information to bear on ecological problems, and in so doing often expand the scope of inference over space and/or time. In this webinar, we will provide an overview of iSDMs and demonstrate how to fit them to simulated data with the nimble R package. To follow along with the coding demonstration, please have a working C++ compiler installed on your system, as well as the nimble R package. For installation and setup instructions, please refer to Chapter 4 of the NIMBLE user manual: http://r-nimble.org/html_manual/cha-installing-nimble.html

Neil Gilbert is an ecologist with interests spanning from animal behavior to macroecology. Dr. Gilbert is an NSF postdoctoral fellow at Michigan State University focused on applying integrated models to investigate links between biodiversity and disease risk. Neil is keenly interested in data integration, or the process of combining seemingly disparate datasets into a cohesive whole within statistical models.

Bayesian Stable Isotope Mixing Models and the MixSIAR R package by Brian Stock. April 3, 2023

  • The recording of the presentation on Mixing Models and the MixSIAR R package and the Q&A are here: https://youtu.be/LuoEDJBq-wA
  • Brian starts the R tutorial at time 35:48 and the Q&A starts at time 59:34.
  • The presentation and R resources are available on GitHub HERE

Mixing models are statistical tools that use biotracer (e.g. stable isotopes or fatty acids) data to estimate the contributions of sources to a mixture. They are often used by ecologists to estimate animal diet proportions (sources = prey, mixture = consumers), but other applications include habitat use (sources = regions, mixture = animals that can move between regions) and sediment mixing in river systems (sources = upstream land uses, mixture = downstream sediment). MixSIAR is an open-source R package that unifies several years of advances in Bayesian mixing model theory since MixSIR and SIAR into a common framework. In particular, MixSIAR allows scientists to account for: consumer variability via covariate effects (i.e. allow consumers to not all have the same diet), source sampling error (fits source data within model, i.e. admit the sample mean is not the truth), alternative error structures, and informative priors. In this webinar, Dr. Stock will demonstrate the use of MixSIAR and discuss common misconceptions and sticking points.

Brian Stock develops statistical methods to improve our understanding and management of fisheries. Dr. Stock is dedicated to achieving balance between sustainable harvest and conservation of marine populations. He works as a research scientist in the Demersal Fish group at the Norwegian Institute of Marine Research, primarily on the assessment of coastal cod. Brian completed his Ph.D. at Scripps Institution of Oceanography, UC San Diego and a postdoc at NOAA’s Northeast Fisheries Science Center in Woods Hole, MA. 

State Space Models and the Template Model Builder (TMB) R package by Marie Auger-Méthé. March 6, 2023

  • The recording of the presentation on State Space Models and the TMB R package and the Q&A are here: https://youtu.be/V_2Aw_GvzqM
  • Marie starts the R code tutorial in the presentation at time 21:00 and the Q&A starts at time 1:02:35.
  • The presentation and R code are available on GitHub HERE

State–space models (SSMs) are an important modeling framework for analyzing ecological time series and are commonly used to model population dynamics, animal movement, and capture–recapture data. SSMs are popular because they can account for biological variation in the state process separately from observation error, and can be used to model continuous, count, binary, and categorical data. 

SSMs can be fitted with a variety of tools, each having their own strengths and weaknesses. One increasingly popular tool is the R package Template Model Builder (TMB). This package allows users to fit SSMs using Maximum Likelihood Estimation and has the advantage of being computationally efficient and relatively flexible.
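To make the split between process variation and observation error concrete, here is a minimal sketch of the simplest SSM, a random-walk state observed with Gaussian noise, filtered with the classic Kalman recursions. It is written in plain Python purely for illustration. It is not how TMB fits models and is not taken from the webinar materials, and the function name, variances, and data are hypothetical.

```python
def local_level_filter(obs, process_var, obs_var, init_mean, init_var):
    """Kalman filter for a random-walk state observed with Gaussian error.

    State:        x[t] = x[t-1] + process noise (variance process_var)
    Observation:  y[t] = x[t]   + observation noise (variance obs_var)
    A value of None in `obs` marks a missing observation.
    Returns a list of (filtered mean, filtered variance), one per time step.
    """
    mean, var = init_mean, init_var
    filtered = []
    for y in obs:
        # Prediction step: biological (process) variation widens uncertainty.
        var = var + process_var
        if y is not None:
            # Update step: weigh the observation by its reliability.
            gain = var / (var + obs_var)
            mean = mean + gain * (y - mean)
            var = (1.0 - gain) * var
        filtered.append((mean, var))
    return filtered


# Hypothetical log-abundance series with one missing survey.
series = [2.1, 2.4, None, 2.2, 2.9]
for step, (m, v) in enumerate(
    local_level_filter(series, process_var=0.05, obs_var=0.2,
                       init_mean=2.0, init_var=1.0)
):
    print(step, round(m, 3), round(v, 3))
```

At the missing time step the filtered mean is carried forward unchanged and only the variance grows, which is how an SSM handles gaps in a time series.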

Dr. Auger-Méthé starts by introducing SSMs. She then explains when it is advantageous to use TMB to fit an SSM. She demonstrates how to code an SSM in TMB, highlighting some of the peculiarities of using a package at the interface of R and C++. Dr. Auger-Méthé finishes by detailing a few methods to assess whether an SSM is adequate for the data at hand. The presentation is based on a recent review led by Dr. Auger-Méthé in Ecological Monographs (open access).

Marie Auger-Méthé is an Associate Professor at the University of British Columbia. Most of her work is interdisciplinary in nature and at the intersection between ecology, statistics, and polar and marine sciences. Her recent focus has been on developing and applying statistical models to understand the movement and space use of marine species.
Dr. Auger-Méthé is broadly interested in developing and applying statistical tools to infer behavioral and population processes from empirical data.

Spatial Modeling in Ecology by Marie-Josée Fortin. February 6, 2023

  • The recording of the presentation on Spatial Modeling in Ecology and the Q&A are here: https://youtu.be/O_6kxHAIovo
  • Marie-Josée starts explaining the R code in the presentation at time 36:07 and moves to R Studio at time 47:13. The Q&A starts at time 57:29
  • The presentation and R code are available on GitHub HERE

When performing parametric tests with ecological data, the assumption of independence of the errors is often violated, increasing the Type I error and biasing the estimation of regression parameters. The lack of independence in the errors can arise because objects (samples, observations, etc.) that are closer together tend to be more similar than those that are further apart. Dr. Fortin focuses only on the spatial dependency that generates spatially autocorrelated data. Determining the degree of spatial autocorrelation and the spatial scale of ecological variables is paramount to relating the ecological response to the covariates. Yet, spatial autocorrelation can be a nuisance that causes issues while performing statistical inference. Dr. Fortin provides an overview of the different spatial regression types that can be used when data are spatially autocorrelated. As no single R package computes all these spatial regression types, a subset of them is presented.
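One common way to quantify the degree of spatial autocorrelation is Moran's I, which compares the similarity of neighbouring values to the overall variance. The sketch below is a plain Python illustration and is not taken from the webinar code. The transect, the neighbour definition (adjacent sites only), and the values are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only between neighbours.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


# Hypothetical transect of 6 sites; neighbours are adjacent sites.
n_sites = 6
weights = [[1 if abs(i - j) == 1 else 0 for j in range(n_sites)]
           for i in range(n_sites)]

smooth = [1, 2, 3, 4, 5, 6]        # similar values sit next to each other
alternating = [1, 6, 1, 6, 1, 6]   # neighbours are dissimilar

print(morans_i(smooth, weights))       # positive: nearby sites are alike
print(morans_i(alternating, weights))  # negative: nearby sites differ
```

Values near zero suggest little spatial structure, while clearly positive values signal the kind of dependence that inflates Type I error in standard regression.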

Marie-Josée Fortin is a Professor in the Department of Ecology and Evolutionary Biology at the University of Toronto. Dr. Fortin’s research has four subject areas: spatial ecology, spatial and landscape statistics, conservation, and disturbance ecology. She explores the maintenance of biodiversity within ecosystems and appropriate conservation strategies for species affected by land use and climate change. This includes the analyses of how environmental factors and ecological processes affect the movement, persistence, and range dynamics of species at the landscape and geographical range in both forested and aquatic environments. 

Zero-Inflated GLM and GLMM by Alain Zuur and Elena Ieno. January 9, 2023

  • The recording of the presentation on Zero-Inflated GLM and GLMM and the Q&A are here: https://youtu.be/ISN9SE__QOU
  • Alain starts going through R code with the owl example at time 1:05:12 and the Q&A with Alain and Elena starts at time 1:18:17
  • The presentation and R code are available on GitHub HERE

Drs. Zuur and Ieno will start with a brief revision of the Poisson, negative binomial, Bernoulli, Gamma and Tweedie distributions. They will then discuss how these distributions are used in Poisson, negative binomial, Bernoulli, gamma and Tweedie generalised linear models (GLM) and generalised linear mixed-effects models (GLMM).

They will use two case study chapters to show how one can decide whether a GLM(M) or generalised additive mixed effects model (GAMM) can cope with a data set that contains an excessive number of zeros. The first case study uses count data and the second uses continuous data. In both analyses, they will extend the GLM(M)s and GAM(M)s towards zero-inflated models.

This is a non-mathematical presentation.
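Although the talk itself avoids mathematics, the idea behind zero inflation for counts fits in a few lines: a zero-inflated Poisson mixes a point mass at zero (probability `pi`) with an ordinary Poisson count (mean `lam`). The Python sketch below is only an illustration of that textbook definition and is not the presenters' code. The parameter values are hypothetical.

```python
import math


def zip_pmf(k, lam, pi):
    """Probability of count k under a zero-inflated Poisson.

    pi  : probability of a structural ("false") zero
    lam : mean of the Poisson part
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from two sources: structural zeros and Poisson zeros.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


# Hypothetical values: 30% structural zeros, Poisson mean of 2.
lam, pi = 2.0, 0.3
plain_zero = math.exp(-lam)       # P(0) under an ordinary Poisson
inflated_zero = zip_pmf(0, lam, pi)
print(round(plain_zero, 3), round(inflated_zero, 3))
```

Comparing the two printed probabilities shows how many more zeros the zero-inflated model expects than a plain Poisson with the same mean parameter.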

Alain F. Zuur is senior statistician and director of Highland Statistics Ltd., a statistical consultancy company based in the UK. Together with his colleague, Elena Ieno, he runs about 20–25 statistics courses (covering a wide range of topics) per year. They wrote three books with Springer, and wrote a ‘Beginner’s Guide to …’ book series as self-publishing authors.

Elena Ieno is a senior marine biologist with over 15 years of experience in teaching statistics to biologists and environmental scientists. Elena has authored and co-authored 10 books on the analysis of ecological data and she has also written two case study chapters on the analysis of time series applied on environmental indices and forensic entomology.

Structural Equation Models and the piecewiseSEM R package by Jon Lefcheck. December 5, 2022

Structural equation modeling (SEM) is a rapidly growing statistical technique in ecology and the environmental sciences. SEM unites multiple variables in a single causal network, thereby allowing simultaneous tests of multiple hypotheses. The idea of causality is central to SEM as the technique implicitly assumes that the relationships among variables represent (generally) causal links. Because variables can be both predictors and responses, SEM is also a useful tool for quantifying both direct and indirect (cascading) effects. Piecewise SEM (or confirmatory path analysis) expands upon traditional SEM by introducing a flexible mathematical framework that can incorporate a wide variety of model structures, distributions, and assumptions. These include: interactions and non-normal responses, random effects and hierarchical models, alternate correlation structures (including phylogenetic, spatial, and temporal), and true non-linear functions.

Jon Lefcheck is the Tennenbaum Coordinating Scientist for the Smithsonian MarineGEO Network. Dr. Lefcheck tests and employs ecological theory to inform nature-based solutions to mitigate response to climate change and other anthropogenic impacts across marine and estuarine ecosystems. Currently, he is implementing long-term monitoring and coordinated experiments across the MarineGEO network.

Analysis of Bioacoustic Data by Marcelo Araya-Salas. November 7, 2022

  • The recording of the presentation on Analysis of Bioacoustic Data, and the Q&A are here: https://youtu.be/dYlkDCUTbAs
  • Marcelo starts going through the R code at time 42:22 and the Q&A starts at time 1:11:56
  • Marcelo’s presentation, R code and data are available on GitHub HERE

Bioacoustic data provide a noninvasive method to monitor animals. Compared to other sound analysis software, R can be a powerful tool for analyzing animal acoustic signals because it is highly flexible and allows for customizable analyses that can be tailored to research questions and the characteristics of vocalizations. Dr. Araya-Salas is deeply involved in the development of computational tools for bioacoustic analyses, including warbleR, which provides functions to streamline high-throughput acoustic analysis of animal sounds; Rraven, which aims to simplify the use of R for bioacoustic research; baRulho, to quantify acoustic signal transmission and degradation; ohun, to optimize automatic detection; and dynaSpec, for creating dynamic spectrograms.

Marcelo Araya-Salas is a professor at the Universidad de Costa Rica. He is interested in the role of social learning/cultural evolution in shaping animal vocal signals at both ecological and evolutionary time frames. Dr. Araya-Salas studies the communication systems of (mostly neotropical) taxa, making use of ‘single species’ behavioral studies, comparative phylogenetic methods and cutting-edge acoustic analyses.

Spatial Occupancy Models and the spOccupancy R package by Jeff Doser and Andrew Finley. October 3, 2022

  • The recording of the presentation and explanation of the spOccupancy package, and the Q&A are here: https://youtu.be/arYqlfs6lIQ
  • Jeff starts going through the R code at time 21:30 and the Q&A starts at time 57:38
  • Jeff’s presentation pdf including the citations he mentioned and the R code and data are available on GitHub HERE

Occupancy modeling is a common approach to assess species distribution patterns, while explicitly accounting for measurement errors common in detection-nondetection data. Numerous extensions of the basic single species occupancy model exist to model multiple species, spatial autocorrelation, residual species correlations, and to integrate multiple data sources. However, development of specialized and computationally efficient software to incorporate such extensions, especially for large datasets, is scarce or absent. Drs. Doser and Finley will demonstrate how to use the spOccupancy R package to fit single-species, multi-species, and integrated spatially-explicit occupancy models in a Bayesian framework.
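The way a basic single-species occupancy model accounts for imperfect detection can be seen in the likelihood of one site's detection history. The Python sketch below illustrates that standard formulation only. It is not the spOccupancy API, it ignores covariates and spatial random effects, and the values of `psi` and `p` are hypothetical.

```python
def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history (list of 0/1 per visit).

    psi : probability the site is occupied
    p   : probability of detecting the species on a visit, given occupancy
    """
    # Probability of this exact history if the site is occupied.
    detect = 1.0
    for y in history:
        detect *= p if y == 1 else (1.0 - p)
    if any(history):
        # At least one detection: the site must be occupied.
        return psi * detect
    # No detections: either occupied but missed every time, or unoccupied.
    return psi * detect + (1.0 - psi)


# Hypothetical values: 60% occupancy, 50% per-visit detection, 3 visits.
print(site_likelihood([1, 0, 1], psi=0.6, p=0.5))
print(site_likelihood([0, 0, 0], psi=0.6, p=0.5))
```

The all-zero history keeps a sizeable contribution from occupied-but-undetected sites, which is why naive presence/absence summaries tend to underestimate occupancy.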

Jeffrey Doser is a postdoctoral research associate in the Department of Integrative Biology at Michigan State University in the Zipkin Quantitative Ecology Lab. Dr. Doser received his PhD in Forestry and Ecology, Evolution, and Behavior at Michigan State University. His research interests lie in the development and application of statistical models for methods of monitoring wildlife populations across large spatio-temporal regions by leveraging a variety of data sources, including citizen science data and acoustic recordings.

Andrew Finley is a professor at Michigan State University with a joint appointment in the Departments of Forestry and Geography, and is an adjunct in the Department of Statistics & Probability. A central theme in Dr. Finley’s research is the use of hierarchical models to integrate information from disparate sources to improve inference and prediction. His recent interest is in improving frameworks for modeling exposure to pollutants, climate change, and health outcomes (ecosystem and public).

Hidden Markov Models in Ecology by Vianey Leos Barajas. May 2, 2022

  • The recording of the presentation, Vianey’s walkthrough of the R code, and the Q&A are here: https://youtu.be/TSu6hsT7_6A
  • Vianey starts going through the R code at time 36:13 and the Q&A starts at time 55:45.
  • Vianey’s presentation and R code and data are available on GitHub HERE
  • Additional papers shared during the webinar include:
    • Pohle, J., Langrock, R., van Beest, F.M. et al. Selecting the Number of States in Hidden Markov Models: Pragmatic Solutions Illustrated Using Animal Movement. JABES 22, 270–293 (2017). https://doi.org/10.1007/s13253-017-0283-8
    • Valle, D.; Jameel, Y.; Betancourt, B.; Azeria, E.; Attias, N.; Cullen, J. 2022. Automatic selection of number of clusters using Bayesian clustering and sparsity inducing priors. Ecological Applications, 32:e2524. https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1002/eap.2524
    • Cullen, J. A., Poli, C. L., Fletcher, R. J., & Valle, D. (2022). Identifying latent behavioural states in animal movement with M4, a nonparametric Bayesian method. Methods in Ecology and Evolution, 13, 432–446. https://doi.org/10.1111/2041-210X.13745

Hidden Markov models (HMMs) are a widely applied modeling framework for data with serial dependence in ecology. An HMM is a time series model involving two layers, an observable state-dependent process and an unobservable state process, where the unobservable state process can be thought to serve as a proxy for biological processes of interest. For instance, in application of HMMs to animal movement, the states can serve as a proxy for animal behavior. It is also straightforward to incorporate environmental variables in the state and/or observation process, account for missing data, and account for individual variation through the use of random effects. Dr. Leos Barajas will demonstrate how to fit HMMs in a Bayesian framework with the R packages ‘rstan’ and ‘cmdstanr’, both of which use the programming language Stan, as well as how to interpret the results and common pitfalls of an HMM analysis.
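The two-layer structure described above leads to a likelihood that can be computed efficiently with the forward algorithm, which sums over all hidden state sequences one time step at a time. The following Python sketch is a simplified illustration and is not the Stan code from the webinar. It assumes two hypothetical behavioural states and a discrete observation (short or long step), whereas real movement HMMs typically use continuous step lengths and turning angles.

```python
def forward_likelihood(obs, init, trans, emis):
    """Likelihood of an observation sequence under a discrete HMM.

    obs   : list of observation indices
    init  : init[s] is the probability of starting in state s
    trans : trans[r][s] is the probability of moving from state r to s
    emis  : emis[s][o] is the probability of observing o in state s
    """
    n = len(init)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [init[s] * emis[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n)) * emis[s][o]
            for s in range(n)
        ]
    return sum(alpha)


# Hypothetical two-state example: state 0 = "resting", state 1 = "travelling";
# observation 0 = short step, observation 1 = long step.
init = [0.5, 0.5]
trans = [[0.9, 0.1],
         [0.2, 0.8]]
emis = [[0.8, 0.2],
        [0.3, 0.7]]

print(forward_likelihood([0, 0, 1, 1, 1], init, trans, emis))
```

For long series the recursion is normally run on the log scale (or with rescaling) to avoid numerical underflow; that refinement is omitted here for brevity.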

Vianey Leos Barajas is an Assistant Professor in the Department of Statistical Sciences and the School of the Environment at the University of Toronto and leads the Bayesian Ecological and Environmental Statistics (B.E.E.S.) research group. B.E.E.S. is dedicated to the development of statistical methodology to answer pressing ecological and environmental questions. Dr. Leos Barajas’ work focuses on the analysis of sensor data collected from animals and the environment over time and space but also includes collaborations in health and other areas.

NIMBLE by Lauren Ponisio. April 18, 2022

NIMBLE, short for Numerical Inference for statistical Models using Bayesian and Likelihood Estimation, is a system for building and sharing analysis methods in R for statistical models. The NIMBLE system provides a flexible language for declaring a wide range of hierarchical models, a framework for defining algorithms that operate on this representation of models, and a compiler for generating equivalent C++.

Lauren Ponisio is an assistant professor at the University of Oregon, where she uses modeling, synthesis, and field-based work to study pollinators and understand the mechanisms by which species interactions maintain species diversity. Dr. Ponisio is working with NIMBLE to build common hierarchical models used in ecology, mainly occupancy models, and is the lead author of this study looking at NIMBLE’s MCMC performance and customizations for a variety of ecological models.

Multi-Species (Species Interactions) Occupancy Modeling by Christopher Rota. April 4, 2022

  • The recording of the presentation and Q&A are here: https://youtu.be/tj_OCO77_sc
    • Chris started with the R code at time 26:45. The Q&A starts at time 1:08:21.
  • Chris’ presentation and R code and data are available on GitHub HERE
  • Here is a quick link to the presentation which has links to the papers Chris references
  • During the presentation, Chris recommended the book: Applied Hierarchical Modeling in Ecology by Marc Kéry & Andy Royle

Multi-species occupancy models incorporate both environmental variables and interspecific correlations when estimating factors that influence occupancy, all while accounting for imperfect detection.  Further, multi-species occupancy models can be used to explore whether interspecific correlations vary across environmental gradients.  Given the detail with which multi-species occupancy models are able to investigate interspecific correlations, they are best suited for relatively small species groups. Dr. Rota will demonstrate how to use the ‘unmarked’ R package to fit, interpret, and solve common problems associated with multi-species occupancy models.

Christopher Rota is an Assistant Professor of Wildlife & Fisheries Resources at West Virginia University. Dr. Rota’s research addresses diverse questions in applied vertebrate ecology working with birds, mammals, reptiles, and amphibians. He is interested in understanding factors that shape the spatial distribution of species, and the dynamic interplay between space use and demography. A common link throughout his research is the application and development of modern statistical techniques that capture many of the myriad processes giving rise to ecological data sets.

Integrated Step-Selection Analysis by Brian Smith and Tal Avgar. March 7, 2022

  • The recording of the presentation and Q&A are here: https://youtu.be/jiY9N-TNRjs
    • Brian walked through the R code at time 27:58, Tal went over FAQs at time 1:01:58, and the Q&A starts at time 1:08:13.
  • R code and presentation slides are available on GitHub HERE.
  • Answers to additional questions that were not covered during the live session are available on GitHub HERE.
  • Citations shared during the presentation:
    • Avgar, et al. 2016. Integrated step selection analysis: bridging the gap between resource selection and animal movement. Methods Ecol Evol, 7: 619-630. https://doi.org/10.1111/2041-210X.12528
    • Fieberg, et al. 2021. A ‘How to’ guide for interpreting parameters in habitat-selection analyses. J Anim Ecol. 90: 1027–1043. https://doi.org/10.1111/1365-2656.13441
    • Fieberg et al. 2017. Used-habitat calibration plots: A new procedure for validating species distribution, resource selection, and step-selection models. Ecography. 41. https://doi.org/10.1111/ecog.03123
    • Prokopenko, C.M., Boyce, M.S. and Avgar, T. (2017), Characterizing wildlife behavioural responses to roads using integrated step selection analysis. J Appl Ecol, 54: 470-479. https://doi.org/10.1111/1365-2664.12768
    • Avgar et al. 2017. Relative Selection Strength: Quantifying effect size in habitat- and step-selection inference. Ecol Evol. 7: 5322–5330. https://doi.org/10.1002/ece3.3122
    • Signer et al. 2017. Estimating utilization distributions from fitted step-selection functions. Ecosphere 8(4):e01771. https://doi.org/10.1002/ecs2.1771
    • Additional references are available in the FAQ section of the pdf in the GitHub repository

A habitat selection function is a model of the relative probability that an available spatial unit will be used by an animal given its habitat value, but how do we appropriately define availability? In an integrated Step-Selection Analysis (iSSA), availability is defined by the animal’s ‘selection-free movement kernel’, which is fitted in conjunction with a conditional habitat-selection function. Parameter estimates are obtained using a conditional-logistic regression by contrasting each ‘used step’ (a straight line connecting two consecutive observed positions of the animal) against a set of ‘available steps’ (randomly sampled from one of several possible theoretical distributions). iSSA thus relaxes the implicit assumption that movement is independent of habitat selection and instead allows simultaneous inference on both processes, resulting in an empirically parametrized mechanistic space-use model.

In this webinar, we will highlight the R package ‘amt’ for implementing iSSA, from raw data through simulations from the mechanistic space-use model.
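As a rough illustration of the conditional-logistic contrast described above, the sketch below computes the log-likelihood for a set of strata, each pairing one used step with its available steps. It is written in plain Python for brevity and does not use or reproduce the ‘amt’ package. The function name, the two covariates (a habitat value and the log of step length), and the toy numbers are all hypothetical.

```python
import math

def conditional_logit_loglik(beta, strata):
    """Log-likelihood of a conditional logistic regression.

    beta   : list of coefficients, one per covariate
    strata : list of (used, available) pairs, where `used` is the covariate
             list of the observed step and `available` is a list of covariate
             lists for the randomly sampled steps in the same stratum
    """
    total = 0.0
    for used, available in strata:
        # Linear predictor for the used step (index 0) and each available step
        scores = [sum(b * x for b, x in zip(beta, step))
                  for step in [used] + available]
        # Log-sum-exp over the whole stratum, computed stably
        peak = max(scores)
        log_denominator = peak + math.log(sum(math.exp(s - peak) for s in scores))
        # Contribution: log probability that the used step is the one chosen
        total += scores[0] - log_denominator
    return total

# Hypothetical toy data: covariates are [habitat value, log step length]
toy_strata = [
    ([0.9, 0.2], [[0.1, 0.5], [0.3, 1.0], [0.2, 0.1]]),
    ([0.8, 0.4], [[0.4, 0.3], [0.1, 0.9], [0.5, 0.2]]),
]

no_selection = conditional_logit_loglik([0.0, 0.0], toy_strata)
habitat_selection = conditional_logit_loglik([2.0, 0.0], toy_strata)
print(no_selection, habitat_selection)
```

With all coefficients at zero every step in a stratum is equally likely, so each stratum contributes the log of one over the number of steps. A positive habitat coefficient raises the likelihood here because the toy used steps end in higher-valued habitat than the available ones.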

Brian Smith is a PhD student, co-advised by Tal Avgar and Dan MacNulty, studying the space-use ecology of northern Yellowstone elk and the feedbacks between space-use and demography. Brian is particularly interested in how density-dependent habitat selection interacts with predation risk and how animals balance this tradeoff between “many mouths to feed” and “safety in numbers”. His goal is to find insights from individual behavior that scale up to population- and community-level patterns.

Tal Avgar is an Assistant Professor of Movement Ecology in the Department of Wildland Resources and Ecology Center at Utah State University. Dr. Avgar’s research focuses on the ecological and evolutionary causes and consequences of animal movement behaviour. The premise behind Dr. Avgar’s research is that a quantitative understanding of the processes underlying animal movement behaviours is essential, not only as a means of identifying ecological needs and interactions at the individual level, but also as a mechanistic key to emerging population and community patterns.

Movement Ecology by Théo Michelot. February 7, 2022

Recent developments in tracking technology have made it possible to collect high volumes of data on animal movement and behaviour, e.g., animal trajectories using GPS tags, or detailed activity profiles with accelerometers. Increasingly sophisticated statistical methods are required to obtain ecological inferences from these complex data (which often include autocorrelation, and can reach millions of observations).

This webinar will provide a very brief overview of existing frameworks, and will then focus on one main theme: using location (long-lat) data to learn about animals’ behaviour. In particular, we will discuss how hidden Markov models (HMMs) can be used to draw inferences about the behavioural state process underlying observed movement patterns. The outcomes of an HMM analysis include movement parameters (such as mean step length) for each behavioural state, as well as an estimated state for each time of observation. It is also possible to estimate the effect of covariates (e.g., temperature, bathymetry) on the behavioural dynamics of the animal, which is often of great ecological interest. We will illustrate the application of this method with the R package momentuHMM, and discuss common practical challenges with model fitting.

A secondary theme of this webinar will be the filtering and regularisation of animal tracking data. HMMs assume that animal locations are observed at regular time intervals and with no error. When this assumption is not satisfied, a two-stage approach is typically applied, and we will demonstrate this using the R packages foieGras and crawl.
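To make the HMM outputs mentioned above more concrete, here is a minimal sketch of a two-state HMM for step lengths: the forward algorithm gives the log-likelihood, and the Viterbi algorithm gives one estimated state per observation. It is plain Python, not momentuHMM. The gamma-distributed step lengths, the state labels, and all parameter values are assumptions chosen for illustration, and the parameters are fixed by hand rather than estimated.

```python
import math

def gamma_logpdf(x, shape, scale):
    """Log density of a gamma distribution (x > 0)."""
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def forward_loglik(steps, init, trans, params):
    """Log-likelihood of the step lengths via the scaled forward algorithm."""
    n_states = len(init)
    alpha = [init[k] * math.exp(gamma_logpdf(steps[0], *params[k]))
             for k in range(n_states)]
    loglik = 0.0
    for t in range(len(steps)):
        if t > 0:
            # Propagate through the transition matrix, then weight by density
            alpha = [sum(alpha[j] * trans[j][k] for j in range(n_states))
                     * math.exp(gamma_logpdf(steps[t], *params[k]))
                     for k in range(n_states)]
        norm = sum(alpha)          # probability of this observation given the past
        loglik += math.log(norm)
        alpha = [a / norm for a in alpha]   # rescale to avoid underflow
    return loglik

def viterbi(steps, init, trans, params):
    """Most likely state sequence (assumes strictly positive init and trans)."""
    n_states = len(init)
    log_delta = [math.log(init[k]) + gamma_logpdf(steps[0], *params[k])
                 for k in range(n_states)]
    back = []
    for y in steps[1:]:
        prev, pointers, log_delta = log_delta, [], []
        for k in range(n_states):
            best = max(range(n_states),
                       key=lambda j: prev[j] + math.log(trans[j][k]))
            pointers.append(best)
            log_delta.append(prev[best] + math.log(trans[best][k])
                             + gamma_logpdf(y, *params[k]))
        back.append(pointers)
    state = max(range(n_states), key=lambda k: log_delta[k])
    path = [state]
    for pointers in reversed(back):   # trace the best path backwards
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical step lengths (km): short, long, then short again
steps = [0.1, 0.2, 0.15, 5.0, 6.0, 4.5, 0.1]
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
params = [(1.0, 0.2), (4.0, 1.25)]   # (shape, scale): state 0 "resting", state 1 "travelling"

print(forward_loglik(steps, init, trans, params))
print(viterbi(steps, init, trans, params))
```

In a real analysis the state-dependent parameters and transition probabilities would be estimated by maximizing this likelihood, which is where the practical challenges of model fitting arise.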

Théo Michelot is a postdoctoral researcher in statistics at the Centre for Research into Ecological and Environmental Modelling (CREEM) at the University of St. Andrews. Dr. Michelot is developing flexible stochastic differential equation models, and using them as continuous-time models of animal movement and behaviour. Additional research interests include hidden Markov models and applications in ecology and statistical software development.

Generalized Joint Attribute Modeling (GJAM) by Tong Qiu. January 24, 2022

  • The recording of the presentation and Q&A are here: https://youtu.be/OYDWLbK335U.
    • Tong walked through the code at time 17:09 and the Q&A starts at time 52:08.
  • R code and Tong’s presentation slides are available on GitHub HERE. See slide 19 for additional resources and references
  • gjam Vignettes
  • Example of model and prediction on multiple species group: https://pbgjam.env.duke.edu/

The Generalized Joint Attribute Model (GJAM) is a probabilistic framework that allows combinations of presence-absence, ordinal, continuous, discrete, composition, zero-inflated, and censored data.  The gjam R package provides inference on sensitivity to input variables, correlations between responses, model selection, prediction of responses, inverse prediction of predictors, and community classification by response to predictors. This model is useful for creating probabilistic forecasts of species distribution and abundance that incorporate a wide range of ecological data and can accommodate massive zeros by relying on censoring.

Tong Qiu is a Postdoc Associate at Duke University. Dr. Qiu’s research aims to understand how the function and structure of the terrestrial ecosystem respond to global environmental changes at regional to global scales. He uses a data-model synthesis approach that integrates satellite and airborne remote sensing, monitoring networks, and forest inventory with Bayesian hierarchical models. Dr. Qiu uses GJAM to model responses of 1) forest trees and 2) ground beetles to climate-habitat interactions.

Generalized Additive Models (GAMs) by Gavin Simpson. January 3, 2022

  • The recording of the presentation and Q&A are here: https://youtu.be/Ukfvd8akfco
    • Gavin’s walk-through of the R code starts at time 33:49 and the Q&A starts at time 1:19:10.
  • R code, slides, resources, and answers to the questions we didn’t get to in the Q&A and the R code shared by Skip are available on GitHub HERE. A quick link to the presentation slides is HERE.

Generalized Additive Models were introduced as an extension to linear and generalized linear models, where the relationships between the response and covariates are not specified up-front by the analyst but are learned from the data themselves. This learning is achieved by viewing the effect of a covariate on the response as a smooth function, rather than following a fixed form (linear, quadratic, etc). The smooth functions are represented in the GAM using penalized splines, in which a penalty against fitting overly-complex functions is employed. GAMs are most useful when the relationships between covariates and response are nonlinear, and GAMs have found particular use for modelling, inter alia, spatiotemporal data.

The presentation will briefly explain what a GAM is and how penalized splines work before focusing on the practical aspects of fitting GAMs to data using the mgcv R package, and will be most useful to ecologists who already have some familiarity with linear and generalized linear models.
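The idea of a penalty against overly-complex functions can be sketched without any spline machinery. The example below is a discrete stand-in, not what mgcv does: instead of a spline basis it estimates one fitted value per observation and penalizes squared second differences (a Whittaker-style smoother). The function names, the toy data, and the hand-picked penalty weight `lam` are assumptions for illustration; in practice the smoothing parameter is estimated from the data.

```python
import math

def second_difference_penalty(n):
    """Return the n x n matrix D'D, where D takes second differences."""
    P = [[0.0] * n for _ in range(n)]
    for r in range(n - 2):
        idx, coef = (r, r + 1, r + 2), (1.0, -2.0, 1.0)
        for i, ci in zip(idx, coef):
            for j, cj in zip(idx, coef):
                P[i][j] += ci * cj
    return P

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def penalized_smooth(y, lam):
    """Minimize sum((y - f)^2) + lam * sum(second differences of f squared)."""
    n = len(y)
    P = second_difference_penalty(n)
    A = [[(1.0 if i == j else 0.0) + lam * P[i][j] for j in range(n)]
         for i in range(n)]
    return solve(A, y)

def roughness(f):
    """Sum of squared second differences: the quantity being penalized."""
    return sum((f[i] - 2 * f[i + 1] + f[i + 2]) ** 2 for i in range(len(f) - 2))

# Hypothetical data: a smooth curve plus an alternating wiggle
noisy = [math.sin(i / 3) + 0.3 * (-1) ** i for i in range(20)]
smooth = penalized_smooth(noisy, lam=10.0)
print(roughness(noisy), roughness(smooth))
```

Setting `lam` to zero returns the data unchanged, while very large values push the fit toward a straight line, since a straight line has zero second differences and so incurs no penalty.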

Gavin Simpson is an Assistant Professor in the Department of Animal Science at Aarhus University. Dr. Simpson’s research focuses on modelling large regional to global spatio-temporal data sets, using generalized additive models (GAMs) and functional statistical methods to examine broad ecosystem responses to environmental change. He is an active member of the R and Data Science communities and was a lead developer on the vegan package for multivariate data analysis and wrote the permute package for restricted permutation tests that allow multi-species data analyses from complex experimental designs. Dr. Simpson is currently developing a package, gratia, to work with GAMs fitted in R.

Species Archetype Models and Regions of Common Profile Models by Skip Woolley. December 6, 2021

  • The recording of the presentation and Q&A are here: https://youtu.be/ukx7ZFX-71A
    • Skip’s walk-through of the R code for SAMs starts at time 21:54 and the R code for RCPs starts at time 58:26.
  • Resources and R code shared by Skip are available on GitHub HERE.

Dr. Woolley will present two types of finite mixture models that extend GLMs by allowing for multiple components. Specifically, he will present on Species Archetype Models (SAM; Dunstan et al. 2011) and the Region of Common Profile models (RCP; Foster et al. 2013, 2017). Together, these approaches cover inferential situations where understanding joint responses of species is of primary importance (SAMs) or where managing groups of sites is of primary importance (RCPs). Species Archetype Models (SAMs) are a “Mixture-of-regressions”, and describe how a homogeneous group of species varies with the environment. The environmental gradients are represented by covariates in the model. Regions of Common Profile (RCP) models are a type of ‘Mixture-of-Experts Models’ and try to describe how groups of sites vary with the environment. The sites are grouped based on the profile of biological content at the sites, so that sites with relatively similar observed assemblages are grouped together. The RCPs are defined by estimating how these groups vary with environment.
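The mixture-of-regressions idea behind SAMs can be illustrated with a toy calculation: given a few candidate archetypes, each a logistic response to one environmental covariate, compute the posterior probability that a species belongs to each archetype from its presence/absence record. This is a sketch in plain Python under strong assumptions, not the method as implemented by the authors cited above. The function names, the shared intercept, and the fixed archetype slopes and mixing weights are hypothetical; in a full fit these quantities would be estimated rather than supplied.

```python
import math

def inv_logit(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def archetype_membership(presence, env, intercept, slopes, mix_weights):
    """Posterior probability of each archetype for one species.

    presence    : list of 0/1 observations, one per site
    env         : environmental covariate value at each site
    intercept   : species intercept (assumed known here)
    slopes      : one environmental slope per archetype
    mix_weights : prior probability of each archetype (sums to 1)
    """
    log_post = []
    for slope, weight in zip(slopes, mix_weights):
        log_lik = math.log(weight)
        for y, x in zip(presence, env):
            p = inv_logit(intercept + slope * x)
            log_lik += math.log(p if y == 1 else 1.0 - p)
        log_post.append(log_lik)
    # Normalize on the log scale for numerical stability
    peak = max(log_post)
    unnormalized = [math.exp(lp - peak) for lp in log_post]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical gradient and two archetypes: one increasing, one decreasing
env = [-2.0, -1.0, 0.0, 1.0, 2.0]
slopes = [2.0, -2.0]
mix_weights = [0.5, 0.5]

species_a = [0, 0, 0, 1, 1]   # found at the high end of the gradient
species_b = [1, 1, 0, 0, 0]   # found at the low end of the gradient

membership_a = archetype_membership(species_a, env, 0.0, slopes, mix_weights)
membership_b = archetype_membership(species_b, env, 0.0, slopes, mix_weights)
print(membership_a, membership_b)
```

Species with similar responses to the gradient end up with high membership in the same archetype, which is the sense in which an archetype describes a homogeneous group of species.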

Skip Woolley is a research fellow at the University of Melbourne working on Integrated Environmental Assessment Modelling and he is a visiting scientist at CSIRO. His research focuses on the development, implementation and interpretation of statistical modelling for integrated environmental risk assessment. Dr. Woolley’s research also focuses on understanding how biodiversity interacts with economic, social and environmental drivers of human activities and pressures, to better protect and reduce the risk of biodiversity loss into the future.

Mixed Models by Ben Bolker. November 1, 2021

“Mixed models” refers to a broad class of statistical models that extend linear and generalized linear models to handle data where observations are measured within discrete groups such as field sites; years or other temporal blocks; individuals that are observed multiple times; genotypes; species; etc. They can be thought of (equivalently) as (1) accounting for the correlation among observations from the same group; (2) estimating the variability among groups; or (3) parsimoniously estimating the effects of groups. They are most useful when the experimental or observational design includes a large number of groups with varying numbers of observations per group.

This presentation will be most useful to ecologists who already have some familiarity with linear and generalized linear models.
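The third view above, parsimoniously estimating group effects, can be sketched for the simplest case: a random-intercept model in which each group mean is shrunk toward the overall mean, more strongly for groups with few observations. The sketch is plain Python and assumes the within-group variance `sigma2` and among-group variance `tau2` are known, which is a simplification; mixed-model software estimates them from the data. The function name and the toy site data are hypothetical.

```python
def partial_pooling(groups, sigma2, tau2):
    """Shrunken group means for a random-intercept model with known variances.

    groups : dict mapping group name to a list of observations
    sigma2 : within-group (residual) variance, assumed known
    tau2   : among-group variance, assumed known
    """
    means = {g: sum(ys) / len(ys) for g, ys in groups.items()}
    # Variance of each raw group mean around the overall mean
    total_var = {g: tau2 + sigma2 / len(ys) for g, ys in groups.items()}
    # Precision-weighted estimate of the overall mean
    overall = (sum(means[g] / total_var[g] for g in groups)
               / sum(1.0 / total_var[g] for g in groups))
    estimates = {}
    for g in groups:
        weight = tau2 / total_var[g]   # 1 = trust the group, 0 = trust the overall mean
        estimates[g] = {
            "raw_mean": means[g],
            "weight": weight,
            "shrunken_mean": overall + weight * (means[g] - overall),
        }
    return overall, estimates

# Hypothetical field sites with very different sample sizes
sites = {
    "site_a": [4.0, 5.0, 6.0, 5.0, 5.0, 5.0, 4.5, 5.5],
    "site_b": [9.0],
    "site_c": [2.0, 3.0],
}

overall, estimates = partial_pooling(sites, sigma2=1.0, tau2=1.0)
for name, est in estimates.items():
    print(name, est)
```

The single observation at the hypothetical site_b is pulled furthest toward the overall mean, which is why this approach suits designs with many groups and uneven numbers of observations per group.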

Ben Bolker is the Director of the School for Computational Science and Engineering and Acting Associate Chair for Mathematics at McMaster University. His interests include spatial, theoretical, mathematical, computational and statistical ecology, evolution and epidemiology, plant community, ecosystem, and epidemic dynamics. He has two books, including Ecological Models and Data in R, and is the co-author of a Very Short Introduction to Infectious Disease with Marta Wayne. Dr. Bolker maintains a popular GLMM FAQ, and keeps miscellaneous mixed models resources here.