Fits a vector autoregressive spatio-temporal model using a minimal feature-set and a widely used interface.

Usage

tinyVAST(
  formula,
  data,
  sem = NULL,
  dsem = NULL,
  family = gaussian(),
  space_columns = c("x", "y"),
  spatial_graph = NULL,
  time_column = "time",
  times = NULL,
  variable_column = "var",
  variables = NULL,
  distribution_column = "dist",
  delta_options = list(delta_formula = ~1),
  spatial_varying = NULL,
  control = tinyVASTcontrol(),
  ...
)

Arguments

formula

Formula with response on left-hand-side and predictors on right-hand-side, parsed by mgcv and hence allowing s(.) for splines or offset(.) for an offset.

data

Data-frame of predictor, response, and offset variables. Also includes variables that specify space, time, variables, and the distribution for samples, as identified by arguments variable_column, time_column, space_columns, and distribution_column.

sem

Specification for structural equation model structure for constructing a space-variable interaction. sem=NULL disables the space-variable interaction; see make_sem_ram().

dsem

Specification for time-series structural equation model structure, including lagged or simultaneous effects, for constructing a space-time-variable interaction. dsem=NULL disables the space-time-variable interaction; see make_dsem_ram() or make_eof_ram().

family

A function returning an object of class family, including gaussian(), lognormal(), tweedie(), binomial(), Gamma(), or poisson(). Alternatively, a named list of these functions, with names that match the levels of data$distribution_column, which allows different families for different rows of data. Delta-model families are also possible; see Families for the delta-model options.
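
As a rough illustration of the named-list form, the sketch below pairs each level of the default distribution column ("dist") with a family. The toy data frame and the level names "count" and "biomass" are hypothetical; only base-R family constructors are used, and the final check simply confirms that every level in the data has a matching list entry.

```r
# Hypothetical toy data: two sample types recorded in the "dist" column
Data <- data.frame(
  y    = c(3, 0, 5, 1.2, 0.7, 2.4),
  dist = c("count", "count", "count", "biomass", "biomass", "biomass")
)

# Named list of families; names are assumed to match the levels of Data$dist
family <- list(
  count   = poisson(),
  biomass = gaussian()
)

# Sanity check before fitting: every level in the data has a family
missing_levels <- setdiff(unique(Data$dist), names(family))
stopifnot(length(missing_levels) == 0)
```

A list like this would then be supplied as family = family, with distribution_column = "dist" identifying the column that selects the family for each row.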

space_columns

A string or character vector naming the column(s) of data that give the location of each sample. When spatial_graph is an igraph object, space_columns is a string naming a column whose levels match the names of the vertices of that object. When spatial_graph is an fmesher or sfnetwork object, space_columns is a character vector naming the columns of data that hold the coordinates of each sample.

spatial_graph

Object that represents spatial relationships, either using fmesher::fm_mesh_2d() to apply the SPDE method, igraph::make_empty_graph() for independent time-series, igraph::make_graph() to apply a simultaneous autoregressive (SAR) process, sfnetwork_mesh() for stream networks, or NULL to specify a single site.

time_column

A character string indicating the column of data listing the time-interval for each sample, from the set of times in argument times.

times

An integer vector listing the set of times in order. If times=NULL, then it is filled in as the vector of integers from the minimum to the maximum value of the column of data named by time_column.

variable_column

A character string indicating the column of data listing the variable for each sample, from the set of variables in argument variables.

variables

A character vector listing the set of variables. If variables=NULL, then it is filled in as the unique values from the column of data named by variable_column.
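
The defaults described for times and variables can be mimicked in base R, which may help when checking what a fit will assume. This is only a sketch of the documented behaviour, not the package's internal code; the toy data frame and its values are hypothetical, and the ordering of variables shown here is simply what base-R unique() returns.

```r
# Hypothetical long-format data using the default column names "time" and "var"
Data <- data.frame(
  time = c(2003L, 2001L, 2005L, 2001L),
  var  = c("cod", "haddock", "cod", "cod"),
  n    = c(1.2, 0.4, 2.1, 0.9)
)
time_column <- "time"
variable_column <- "var"

# Documented default for times: every integer from the earliest to the latest
# time, including years with no samples (2002 and 2004 here)
times <- seq(min(Data[[time_column]]), max(Data[[time_column]]))

# Documented default for variables: the unique values in the variable column
variables <- unique(Data[[variable_column]])
```

Passing times or variables explicitly is one way to control their order, or to include levels that have no samples.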

distribution_column

A character string indicating the column of data listing the distribution for each sample, from the set of names in argument family.

delta_options

A named list with slots for delta_formula, delta_sem, and delta_dsem. These follow the same format as formula, sem, and dsem, but specify options for the second linear predictor of a delta model, and are only used (or estimable) when a delta family is used for some samples.
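
A minimal sketch of how such a list might be assembled is shown below. The covariate w and the choice of a smooth term are hypothetical, and setting delta_sem and delta_dsem to NULL assumes that NULL disables those components in the same way it does for sem and dsem.

```r
# Default from the usage signature: intercept-only second linear predictor
delta_default <- list(delta_formula = ~ 1)

# Hypothetical alternative: a smooth of covariate w in the second linear
# predictor, with no SEM or DSEM structure (assumed to be disabled by NULL)
delta_custom <- list(
  delta_formula = ~ s(w),
  delta_sem     = NULL,
  delta_dsem    = NULL
)
```

Note that delta_formula is one-sided, as in the default, because the response is already given by formula.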

spatial_varying

A formula specifying spatially varying coefficients.

control

Output from tinyVASTcontrol(), used to define user settings.

...

Not used.

Details

tinyVAST includes four basic inputs that specify the model structure:

  • formula specifies covariates and splines in a Generalized Additive Model;

  • dsem specifies interactions among variables and over time, constructing the space-time-variable interaction;

  • sem specifies interactions among variables, constructing the space-variable interaction;

  • spatial_graph specifies spatial correlations.

The default dsem=NULL turns off all multivariate and temporal indexing, such that spatial_graph is then ignored and the model collapses to a standard model using gam. To specify a univariate spatial model, the user must specify both spatial_graph and dsem="", where the latter is then parsed to include a single exogenous variance for the single variable.

Model type | How to specify
Generalized additive model | specify spatial_graph=NULL and dsem="", and then use formula to specify splines and covariates
Dynamic structural equation model (including vector autoregressive, dynamic factor analysis, ARIMA, and structural equation models) | specify spatial_graph=NULL and use dsem to specify interactions among variables and over time
Univariate spatial model | specify spatial_graph and dsem="", where the latter is then parsed to include a single exogenous variance for the single variable
Multivariate spatial model | specify spatial_graph and use dsem (without any lagged effects) to specify spatial interactions
Vector autoregressive spatio-temporal model | specify spatial_graph and use dsem to specify interactions among variables and over time, where spatio-temporal variables are constructed via the separable interaction of dsem and spatial_graph

See also

Details section of make_dsem_ram() for a summary of the math involved with constructing the DSEM, and doi:10.1111/2041-210X.14289 for more background on math and inference.

doi:10.48550/arXiv.2401.10193 for more details on how GAM, SEM, and DSEM components are combined from a statistical and software-user perspective.

summary.tinyVAST() to visualize parameter estimates related to SEM and DSEM model components.

Examples

# Simulate a 2D AR1 spatial process with a cyclic confounder w
n_x = n_y = 25
n_w = 10
R_xx = exp(-0.4 * abs(outer(1:n_x, 1:n_x, FUN="-")) )
R_yy = exp(-0.4 * abs(outer(1:n_y, 1:n_y, FUN="-")) )
z = mvtnorm::rmvnorm(1, sigma=kronecker(R_xx,R_yy) )

# Simulate nuisance variable w, which enters z through an oscillatory (day-night) process
w = sample(1:n_w, replace=TRUE, size=length(z))
Data = data.frame( expand.grid(x=1:n_x, y=1:n_y), w=w, z=as.vector(z) + cos(w/n_w*2*pi))
Data$n = Data$z + rnorm(nrow(Data), sd=1)

# Add column identifying the variable (the multivariate dimension)
Data$var = "n"

# make mesh
mesh = fmesher::fm_mesh_2d( Data[,c('x','y')], n=100 )

# fit model
out = tinyVAST( data = Data,
                formula = n ~ s(w),
                spatial_graph = mesh,
                sem = "n <-> n, sd_n" )