Fits a vector autoregressive spatio-temporal model using a minimal feature-set and a widely used interface.

Usage

tinyVAST(
  formula,
  data,
  sem = NULL,
  dsem = NULL,
  family = gaussian(),
  space_columns = c("x", "y"),
  spatial_graph = NULL,
  time_column = "time",
  times = NULL,
  variable_column = "var",
  variables = NULL,
  distribution_column = "dist",
  delta_options = list(delta_formula = ~1),
  spatial_varying = NULL,
  weights = NULL,
  control = tinyVASTcontrol(),
  ...
)

Arguments

formula

Formula with response on left-hand-side and predictors on right-hand-side, parsed by mgcv and hence allowing s(.) for splines or offset(.) for an offset.

data

Data-frame of predictor, response, and offset variables. Also includes variables that specify space, time, variables, and the distribution for samples, as identified by arguments variable_column, time_column, space_columns, and distribution_column.
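As a rough illustration of the long format described above, the sketch below builds a small data frame using the default column names shown in the Usage section (x, y, time, var, dist). The object name toy, the column name response, the label "obs", and all values are made up for illustration.

```r
# Hypothetical toy data in long format: one row per sample.
# Column names follow the defaults of space_columns, time_column,
# variable_column, and distribution_column.
set.seed(1)
toy = expand.grid( x = 1:3, y = 1:3, time = 2001:2002, var = c("a", "b"),
                   stringsAsFactors = FALSE )
toy$dist = "obs"                   # one distribution label shared by every row
toy$response = rnorm(nrow(toy))    # simulated response, for illustration only
str(toy)
```

Each combination of location, time, and variable occupies its own row, so a multivariate data set is stacked rather than spread across columns.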

sem

Specification for structural equation model structure for constructing a space-variable interaction. sem=NULL disables the space-variable interaction; see make_sem_ram().

dsem

Specification for time-series structural equation model structure including lagged or simultaneous effects for constructing a space-variable interaction. dsem=NULL disables the space-variable interaction; see make_dsem_ram() or make_eof_ram().

family

A function returning an object of class family, including gaussian(), lognormal(), tweedie(), binomial(), Gamma(), or poisson(). Alternatively, can be a named list of these functions, with names that match the levels of data$distribution_column, to allow different families by row of data. Delta-model families are also possible; see Families for delta-model options.
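A minimal sketch of the named-list form, assuming a distribution column called dist (the default) with two made-up levels, "count" and "cont". Only base R family functions are used here; the data values and level names are hypothetical.

```r
# Hypothetical data: counts from one source, continuous values from another.
dat = data.frame(
  y    = c(3, 0, 5, 1.2, 0.7, 2.4),
  dist = c("count", "count", "count", "cont", "cont", "cont")
)

# Names of the list must match the levels found in the distribution column.
fam = list( count = poisson(), cont = gaussian() )

# Quick sanity check before fitting: every level has a family.
stopifnot( all(unique(dat$dist) %in% names(fam)) )

# The list would then be supplied as, e.g.,
#   tinyVAST( ..., family = fam, distribution_column = "dist" )
```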

space_columns

A string or character vector that indicates the column(s) of data giving the location of each sample. When spatial_graph is an igraph object, space_columns is a string with levels matching the names of vertices of that object. When spatial_graph is an fmesher or sfnetwork object, space_columns is a character vector indicating columns of data with coordinates for each sample.

spatial_graph

Object that represents spatial relationships, either using fmesher::fm_mesh_2d() to apply the SPDE method, igraph::make_empty_graph() for independent time-series, igraph::make_graph() to apply a simultaneous autoregressive (SAR) process, sfnetwork_mesh() for stream networks, or NULL to specify a single site.

time_column

A character string indicating the column of data listing the time-interval for each sample, from the set of times in argument times.

times

An integer vector listing the set of times in order. If times=NULL, then it is filled in as the vector of integers from the minimum to the maximum value of data[[time_column]].
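The default described above can be reproduced by hand to see which times a model would span. This is a sketch of the documented behaviour, not the package's internal code, and it assumes the default time_column = "time"; the data values are made up.

```r
# Hypothetical samples observed in only three distinct years.
dat = data.frame( time = c(2003L, 2001L, 2006L, 2003L) )

# Documented default when times = NULL: every integer from min to max.
times_default = seq( min(dat$time), max(dat$time) )
times_default
```

Note that years with no samples (here 2002, 2004, and 2005) are still included in the set of times.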

variable_column

A character string indicating the column of data listing the variable for each sample, from the set of variables in argument variables.

variables

A character vector listing the set of variables. If variables=NULL, then it is filled in as the unique values from data[[variable_column]].

distribution_column

A character string indicating the column of data listing the distribution for each sample, from the set of names in argument family.

delta_options

A named list with slots for delta_formula, delta_sem, and delta_dsem. These follow the same format as formula, sem, and dsem, but specify options for the second linear predictor of a delta model, and are only used (or estimable) when a delta family is used for some samples.

spatial_varying

A formula specifying spatially varying coefficients.

weights

A numeric vector of optional likelihood weights for the data likelihood. Weights do not have to sum to one and are not internally modified. The weights argument must be a vector, not the name of a variable in the data frame.

control

Output from tinyVASTcontrol(), used to define user settings.

...

Not used.

Details

tinyVAST includes four basic inputs that specify the model structure:

  • formula specifies covariates and splines in a Generalized Additive Model;

  • dsem specifies interactions among variables and over time, constructing the space-time-variable interaction;

  • sem specifies interactions among variables, constructing the space-variable interaction;

  • spatial_graph specifies spatial correlations.

The default dsem=NULL turns off all multivariate and temporal indexing, such that spatial_graph is then ignored and the model collapses to a standard model fitted using gam. To specify a univariate spatial model, the user must specify both spatial_graph and dsem="", where the latter is then parsed to include a single exogenous variance for the single variable.

| Model type | How to specify |
|---|---|
| Generalized additive model | specify spatial_graph=NULL and dsem="", and then use formula to specify splines and covariates |
| Dynamic structural equation model (including vector autoregressive, dynamic factor analysis, ARIMA, and structural equation models) | specify spatial_graph=NULL and use dsem to specify interactions among variables and over time |
| Univariate spatial model | specify spatial_graph and dsem="", where the latter is then parsed to include a single exogenous variance for the single variable |
| Multivariate spatial model | specify spatial_graph and use dsem (without any lagged effects) to specify spatial interactions |
| Vector autoregressive spatio-temporal model | specify spatial_graph and use dsem to specify interactions among variables and over time, where spatio-temporal variables are constructed via the separable interaction of dsem and spatial_graph |
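To make the specifications above more concrete, the sketch below writes out candidate strings for sem and dsem. The arrow-and-lag notation used for dsem is an assumption based on the notation described in make_dsem_ram(), so check that help page (and make_sem_ram() for sem) before relying on it. The variable name n and the parameter names rho and sd_n are hypothetical.

```r
# Univariate spatial model: an empty dsem string, paired with a spatial_graph.
dsem_univariate = ""

# Assumed DSEM notation: "from -> to, lag, parameter_name".
# First-order autoregression for a single variable n, plus its variance.
dsem_ar1 = "
  n -> n, 1, rho
  n <-> n, 0, sd_n
"

# SEM notation as used in the Examples section: no lag field.
sem_variance = "n <-> n, sd_n"

# These strings would be supplied as, e.g.,
#   tinyVAST( ..., spatial_graph = mesh, dsem = dsem_ar1 )
```

Keeping each specification in a named string makes it easier to swap model types while leaving formula and data unchanged.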

See also

Details section of make_dsem_ram() for a summary of the math involved with constructing the DSEM, and doi:10.1111/2041-210X.14289 for more background on math and inference

doi:10.48550/arXiv.2401.10193 for more details on how GAM, SEM, and DSEM components are combined from a statistical and software-user perspective

summary.tinyVAST() to visualize parameter estimates related to SEM and DSEM model components

Examples

library(tinyVAST)

# Simulate a 2D AR1 spatial process with a cyclic confounder w
n_x = n_y = 25
n_w = 10
R_xx = exp(-0.4 * abs(outer(1:n_x, 1:n_x, FUN="-")) )
R_yy = exp(-0.4 * abs(outer(1:n_y, 1:n_y, FUN="-")) )
z = mvtnorm::rmvnorm(1, sigma=kronecker(R_xx,R_yy) )

# Simulate nuisance variable w from an oscillatory (day-night) process
w = sample(1:n_w, replace=TRUE, size=length(z))
Data = data.frame( expand.grid(x=1:n_x, y=1:n_y), w=w, z=as.vector(z) + cos(w/n_w*2*pi))
Data$n = Data$z + rnorm(nrow(Data), sd=1)

# Add column identifying the variable for each sample
Data$var = "n"

# make mesh
mesh = fmesher::fm_mesh_2d( Data[,c('x','y')], n=100 )

# fit model
out = tinyVAST( data = Data,
                formula = n ~ s(w),
                spatial_graph = mesh,
                sem = "n <-> n, sd_n" )
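The simulation above relies on two properties that can be checked in base R: exp(-0.4 * |i - j|) is an AR1 correlation with lag-one correlation exp(-0.4), and the Kronecker product of the two one-dimensional correlation matrices gives a separable 2D correlation. The check below uses a reduced grid (n_x = 4, n_y = 3), an assumption made only to keep the matrices small.

```r
# Reduced grid sizes, chosen only for a quick check.
n_x = 4
n_y = 3
rho = exp(-0.4)    # implied lag-one correlation

R_xx = exp(-0.4 * abs(outer(1:n_x, 1:n_x, FUN="-")) )
R_yy = exp(-0.4 * abs(outer(1:n_y, 1:n_y, FUN="-")) )

# Same matrix written in the usual AR1 form rho^|i-j|
R_ar1 = rho^abs(outer(1:n_x, 1:n_x, FUN="-"))
stopifnot( isTRUE(all.equal(R_xx, R_ar1)) )

# Separable 2D correlation: entry [(i-1)*n_y + k, (j-1)*n_y + l]
# equals R_xx[i,j] * R_yy[k,l]
R = kronecker(R_xx, R_yy)
stopifnot( isTRUE(all.equal(R[(2-1)*n_y + 1, (4-1)*n_y + 3], R_xx[2,4] * R_yy[1,3])) )
dim(R)
```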