Fits a vector autoregressive spatio-temporal (VAST) model using a minimal feature-set and a widely used interface.

Usage

tinyVAST(
  formula,
  data,
  time_term = NULL,
  space_term = NULL,
  spacetime_term = NULL,
  family = gaussian(),
  space_columns = c("x", "y"),
  spatial_domain = NULL,
  time_column = "time",
  times = NULL,
  variable_column = "var",
  variables = NULL,
  distribution_column = "dist",
  delta_options = list(formula = ~1),
  spatial_varying = NULL,
  weights = NULL,
  control = tinyVASTcontrol(),
  ...
)

Arguments

formula

Formula with response on left-hand-side and predictors on right-hand-side, parsed by mgcv and hence allowing s(.) for splines or offset(.) for an offset.

data

Data-frame of predictor, response, and offset variables. Also includes variables that specify space, time, variables, and the distribution for samples, as identified by arguments variable_column, time_column, space_columns, and distribution_column.

time_term

Specification for time-series structural equation model structure for constructing a time-variable interaction that defines a time-varying intercept for each variable (i.e., applies uniformly across space). time_term=NULL disables the time-variable interaction; see make_dsem_ram() for notation.

space_term

Specification for structural equation model structure for constructing a space-variable interaction. space_term=NULL disables the space-variable interaction; see make_sem_ram() for notation.
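As a minimal sketch of what such a specification might look like, the text below extends the single-variable form used in the Examples ("n <-> n, sd_n") to two hypothetical variables "a" and "b". The directed-path arrow and the parameter names are assumptions for illustration; make_sem_ram() remains the reference for the notation.

```r
# Hypothetical two-variable specification, one path per line:
#   a -> b   directed effect of a on b, with parameter name beta_ab (assumed notation)
#   a <-> a  exogenous variance of a, with parameter name sd_a (form used in the Examples)
#   b <-> b  exogenous variance of b, with parameter name sd_b
space_term <- "
  a -> b, beta_ab
  a <-> a, sd_a
  b <-> b, sd_b
"

# Split into individual path lines to inspect the text before passing it on
paths <- trimws(strsplit(space_term, "\n")[[1]])
paths <- paths[nzchar(paths)]
print(paths)
```

The resulting string would be passed as space_term, with data$var (or the column named by variable_column) containing the levels "a" and "b".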

spacetime_term

Specification for time-series structural equation model structure including lagged or simultaneous effects for constructing a time-variable interaction, which is then combined in a separable process with the spatial correlation to form a space-time-variable interaction (i.e., the interaction occurs locally at each site). spacetime_term=NULL disables the space-time-variable interaction; see make_dsem_ram() or make_eof_ram() for notation.

family

A function returning an object of class family, including gaussian(), lognormal(), tweedie(), binomial(), Gamma(), poisson(), nbinom1(), or nbinom2(). Alternatively, can be a named list of these functions, with names that match levels of data$distribution_column to allow different families by row of data. Delta model families are also possible; see Families for delta-model options.
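A minimal sketch of the named-list form follows. The data frame, its values, and the level names "count" and "biomass" are hypothetical; the only requirement taken from the description above is that the list names match the levels of the distribution column.

```r
# Hypothetical data: two sample types distinguished by the `dist` column
# (`dist` is the default value of distribution_column)
Data <- data.frame(
  y    = c(3, 0, 5, 1.2, 0.7, 2.4),
  dist = c("count", "count", "count", "biomass", "biomass", "biomass")
)

# One family per level of Data$dist; the list names must match those levels
Family <- list(
  count   = poisson(),
  biomass = Gamma(link = "log")
)

# Check that every level in the data has a matching family before fitting
missing_levels <- setdiff(unique(Data$dist), names(Family))
stopifnot(length(missing_levels) == 0)
```

The list would then be passed as family = Family together with distribution_column = "dist".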

space_columns

A string or character vector that indicates the column(s) of data giving the location of each sample. When spatial_domain is an igraph object, space_columns is a string naming a column with levels matching the names of vertices of that object. When spatial_domain is an fmesher or sfnetwork object, space_columns is a character vector indicating columns of data with coordinates for each sample.

spatial_domain

Object that represents spatial relationships, either using fmesher::fm_mesh_2d() to apply the SPDE method, igraph::make_empty_graph() for independent time-series, igraph::make_graph() to apply a simultaneous autoregressive (SAR) process, sfnetwork_mesh() for stream networks, or NULL to specify a single site. If using igraph, then the graph must have vertex names V(graph)$name that match levels of data[,'space_columns'].
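The vertex-name requirement for igraph domains can be checked before fitting. The sketch below assumes three hypothetical sites "a", "b", and "c" joined in a chain, and a hypothetical data column `site` that would be named via space_columns = "site".

```r
# Hypothetical sites "a", "b", "c" connected in a chain (undirected);
# character edge vectors are interpreted by igraph as vertex names
graph <- igraph::make_graph(c("a", "b", "b", "c"), directed = FALSE)

# Hypothetical samples, each assigned to one site
Data <- data.frame(site = c("a", "b", "c", "a"), n = c(1.2, 0.4, 2.1, 0.9))

# Every site appearing in the data must be a named vertex of the graph
stopifnot(all(Data$site %in% igraph::V(graph)$name))
```

The graph would then be passed as spatial_domain = graph.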

time_column

A character string indicating the column of data listing the time-interval for each sample, from the set of times in argument times.

times

An integer vector listing the set of times in order. If times=NULL, then it is filled in as the vector of integers from the minimum to maximum value of data$time.
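The default fill-in described above can be reproduced by hand, which is useful for seeing how gaps in sampling are treated. This sketch uses a hypothetical `time` column with no samples in 2003.

```r
# Hypothetical sampled years with a gap (no samples in 2003)
Data <- data.frame(time = c(2001, 2002, 2004, 2005))

# Documented default when times = NULL: all integers from min to max
times <- seq(min(Data$time), max(Data$time))
print(times)
```

The gap year 2003 is therefore part of the time index even though it has no samples.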

variable_column

A character string indicating the column of data listing the variable for each sample, from the set of variables in argument variables.

variables

A character vector listing the set of variables. If variables=NULL, then it is filled in as the unique values from the column of data named by variable_column.

distribution_column

A character string indicating the column of data listing the distribution for each sample, from the set of names in argument family.

delta_options

A named list with slots for formula, space_term, and spacetime_term. These specify options for the second linear predictor of a delta model, and are only used (or estimable) when a delta family is used for some samples.

spatial_varying

A formula specifying spatially varying coefficients.

weights

A numeric vector of optional weights for the data likelihood. Weights do not have to sum to one and are not internally modified. The weights argument must be a vector, not the name of a variable in the data frame.

control

Output from tinyVASTcontrol(), used to define user settings.

...

Not used.

Details

tinyVAST includes four basic inputs that specify the model structure:

  • formula specifies covariates and splines in a Generalized Additive Model;

  • space_term specifies interactions among variables, constructing the space-variable interaction;

  • spacetime_term specifies interactions among variables and over time, constructing the space-time-variable interaction;

  • spatial_domain specifies spatial correlations.

The default spacetime_term=NULL and space_term=NULL turn off all multivariate and temporal indexing, such that spatial_domain is then ignored and the model collapses to a generalized additive model using gam. To specify a univariate spatial model, the user must specify spatial_domain and either space_term="" or spacetime_term="", where the latter two are then parsed to include a single exogenous variance for the single variable.

| Model type | How to specify |
| --- | --- |
| Generalized additive model | specify spatial_domain=NULL, space_term="", and spacetime_term="", and then use formula to specify splines and covariates |
| Dynamic structural equation model (including vector autoregressive, dynamic factor analysis, ARIMA, and structural equation models) | specify spatial_domain=NULL and use spacetime_term to specify interactions among variables and over time |
| Univariate spatio-temporal model, or multiple independent spatio-temporal variables | specify spatial_domain and spacetime_term="", where the latter is then parsed to include a single exogenous variance for the single variable |
| Multivariate spatial model including interactions | specify spatial_domain and use space_term to specify spatial interactions |
| Vector autoregressive spatio-temporal model (i.e., lag-1 interactions among variables) | specify spatial_domain and use spacetime_term to specify interactions among variables and over time, where spatio-temporal variables are constructed via the separable interaction of spacetime_term and spatial_domain |

See also

Details section of make_dsem_ram() for a summary of the math involved with constructing the DSEM, and doi:10.1111/2041-210X.14289 for more background on math and inference.

doi:10.48550/arXiv.2401.10193 for more details on how GAM, SEM, and DSEM components are combined from a statistical and software-user perspective.

summary.tinyVAST() to visualize parameter estimates related to SEM and DSEM model components.

Examples

library(tinyVAST)

# Simulate a separable two-dimensional AR1 spatial process
n_x = n_y = 25
n_w = 10
R_xx = exp(-0.4 * abs(outer(1:n_x, 1:n_x, FUN="-")) )
R_yy = exp(-0.4 * abs(outer(1:n_y, 1:n_y, FUN="-")) )
z = mvtnorm::rmvnorm(1, sigma=kronecker(R_xx,R_yy) )

# Simulate nuisance variable w, which adds an oscillatory (day-night) signal to z
w = sample(1:n_w, replace=TRUE, size=length(z))
Data = data.frame( expand.grid(x=1:n_x, y=1:n_y), w=w, z=as.vector(z) + cos(w/n_w*2*pi))
Data$n = Data$z + rnorm(nrow(Data), sd=1)

# Add columns for multivariate and/or temporal dimensions
Data$var = "n"

# make SPDE mesh for spatial term
mesh = fmesher::fm_mesh_2d( Data[,c('x','y')], n=100 )

# fit model with cyclic confounder as GAM term
out = tinyVAST( data = Data,
                formula = n ~ s(w),
                spatial_domain = mesh,
                space_term = "n <-> n, sd_n" )