Fits a vector autoregressive spatio-temporal model using a minimal feature-set and a widely used interface.
Usage
tinyVAST(
  formula,
  data,
  sem = NULL,
  dsem = NULL,
  family = gaussian(),
  space_columns = c("x", "y"),
  spatial_graph = NULL,
  time_column = "time",
  times = NULL,
  variable_column = "var",
  variables = NULL,
  distribution_column = "dist",
  delta_options = list(delta_formula = ~1),
  spatial_varying = NULL,
  control = tinyVASTcontrol(),
  ...
)
Arguments
- formula
Formula with the response on the left-hand side and predictors on the right-hand side, parsed by mgcv and hence allowing s(.) for splines or offset(.) for an offset.
- data
Data frame of predictor, response, and offset variables. Also includes variables that specify space, time, variables, and the distribution for samples, as identified by arguments variable_column, time_column, space_columns, and distribution_column.
- sem
Specification for the structural equation model structure used to construct a space-variable interaction. sem=NULL disables the space-variable interaction; see make_sem_ram().
- dsem
Specification for the time-series structural equation model structure, including lagged or simultaneous effects, used to construct a space-time-variable interaction. dsem=NULL disables the space-time-variable interaction; see make_dsem_ram() or make_eof_ram().
- family
A function returning an object of class family, including gaussian(), lognormal(), tweedie(), binomial(), Gamma(), or poisson(). Alternatively, can be a named list of these functions, with names that match levels of data$distribution_column, to allow different families by row of data. Delta-model families are also possible; see Families for delta-model options.
- space_columns
A string or character vector naming the column(s) of data that give the location of each sample. When spatial_graph is an igraph object, space_columns is a string identifying a column of data with levels matching the names of vertices of that object. When spatial_graph is an fmesher or sfnetwork object, space_columns is a character vector indicating columns of data with coordinates for each sample.
- spatial_graph
Object that represents spatial relationships, either using fmesher::fm_mesh_2d() to apply the SPDE method, igraph::make_empty_graph() for independent time-series, igraph::make_graph() to apply a simultaneous autoregressive (SAR) process, sfnetwork_mesh() for stream networks, or NULL to specify a single site.
- time_column
A character string indicating the column of data listing the time-interval for each sample, from the set of times in argument times.
- times
An integer vector listing the set of times in order. If times=NULL, then it is filled in as the vector of integers from the minimum to maximum value of data$time.
- variable_column
A character string indicating the column of data listing the variable for each sample, from the set of variables in argument variables.
- variables
A character vector listing the set of variables. If variables=NULL, then it is filled in as the unique values from data$variable_column.
- distribution_column
A character string indicating the column of data listing the distribution for each sample, from the set of names in argument family. If variables=NULL, then it is filled in as the unique values from data$variables.
- delta_options
A named list with slots for delta_formula, delta_sem, and delta_dsem. These follow the same format as formula, sem, and dsem, but specify options for the second linear predictor of a delta model, and are only used (or estimable) when a delta family is used for some samples.
- spatial_varying
A formula specifying spatially varying coefficients.
- control
Output from tinyVASTcontrol(), used to define user settings.
- ...
Not used.
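As a rough illustration of the defaults described above, the following base-R sketch mimics how times and variables are filled in from the data when left as NULL, and checks that a named family list covers every level of the distribution column. The data frame dat, its values, and the level names obs_a and obs_b are hypothetical; the code only mirrors the documented behaviour and is not the package's internal implementation.

```r
# Hypothetical long-format data: two variables observed between 2001 and 2004
dat <- data.frame(
  time = c(2001L, 2002L, 2004L, 2001L, 2003L),
  var  = c("cod", "cod", "cod", "crab", "crab"),
  dist = c("obs_a", "obs_a", "obs_a", "obs_b", "obs_b"),
  y    = c(1.2, 0.7, 1.9, 3, 5),
  stringsAsFactors = FALSE
)

# Documented default for times = NULL:
# integers from the minimum to the maximum of the time column
times <- seq(min(dat$time), max(dat$time))

# Documented default for variables = NULL:
# unique values of the variable column
variables <- unique(dat$var)

# A named list of families, one entry per level of the distribution column
family <- list(obs_a = gaussian(), obs_b = poisson())

# Every distribution level used in the data should have a matching family
stopifnot(all(unique(dat$dist) %in% names(family)))

print(times)
print(variables)
```

Supplying times explicitly is mainly useful when the modelled period should extend beyond the years actually present in the data.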
Details
tinyVAST includes four basic inputs that specify the model structure:
- formula specifies covariates and splines in a Generalized Additive Model;
- dsem specifies interactions among variables and over time, constructing the space-time-variable interaction;
- sem specifies interactions among variables, constructing the space-variable interaction;
- spatial_graph specifies spatial correlations.
The default dsem=NULL turns off all multivariate and temporal indexing, such that spatial_graph is then ignored, and the model collapses to a standard model using gam. To specify a univariate spatial model, the user must specify both spatial_graph and dsem="", where the latter is then parsed to include a single exogenous variance for the single variable.
Model type | How to specify |
Generalized additive model | specify spatial_graph=NULL and dsem="", and then use formula to specify splines and covariates |
Dynamic structural equation model (including vector autoregressive, dynamic factor analysis, ARIMA, and structural equation models) | specify spatial_graph=NULL and use dsem to specify interactions among variables and over time |
Univariate spatial model | specify spatial_graph and dsem="", where the latter is then parsed to include a single exogenous variance for the single variable |
Multivariate spatial model | specify spatial_graph and use dsem (without any lagged effects) to specify spatial interactions |
Vector autoregressive spatio-temporal model | specify spatial_graph and use dsem to specify interactions among variables and over time, where spatio-temporal variables are constructed via the separable interaction of dsem and spatial_graph |
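The rows of the table differ mainly in whether spatial_graph is supplied and in what the dsem string contains. The sketch below shows what such specification strings might look like. It assumes the path convention "from -> to, lag, parameter_name" with one path per line, which should be checked against make_dsem_ram(); the variable names a and b, all parameter names, and the helper path_lags() are hypothetical. The strings are only built and inspected here, not passed to a fitted model.

```r
# Univariate spatial model: an empty string, which (per the text above) is
# parsed to a single exogenous variance for the single variable
dsem_univariate <- ""

# Multivariate spatial model: simultaneous effects only, no lagged paths
# ASSUMPTION: each path is written as "from -> to, lag, parameter_name"
dsem_multivariate <- "
  a -> b, 0, beta_ab
"

# Vector autoregressive model: lag-1 effects within and between variables
dsem_var <- "
  a -> a, 1, rho_aa
  b -> b, 1, rho_bb
  a -> b, 1, rho_ab
  b -> a, 1, rho_ba
"

# Hypothetical helper (not part of tinyVAST): extract the lag of each path
path_lags <- function(spec) {
  lines <- trimws(strsplit(spec, "\n", fixed = TRUE)[[1]])
  lines <- lines[nzchar(lines)]
  if (length(lines) == 0) return(integer(0))
  fields <- strsplit(lines, ",", fixed = TRUE)
  as.integer(trimws(vapply(fields, `[`, character(1), 2)))
}

path_lags(dsem_multivariate)  # lags of the simultaneous-only specification
path_lags(dsem_var)           # lags of the autoregressive specification
```

Checking the lags this way is one quick means of confirming that a specification meant for a purely spatial multivariate model contains no lagged effects before combining it with spatial_graph.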
See also
Details section of make_dsem_ram() for a summary of the math involved with constructing the DSEM, and doi:10.1111/2041-210X.14289 for more background on math and inference.
doi:10.48550/arXiv.2401.10193 for more details on how GAM, SEM, and DSEM components are combined from a statistical and software-user perspective.
summary.tinyVAST() to visualize parameter estimates related to SEM and DSEM model components.
Examples
library(tinyVAST)

# Simulate a 2D AR1 spatial process with a cyclic confounder w
n_x = n_y = 25
n_w = 10
R_xx = exp(-0.4 * abs(outer(1:n_x, 1:n_x, FUN="-")) )
R_yy = exp(-0.4 * abs(outer(1:n_y, 1:n_y, FUN="-")) )
z = mvtnorm::rmvnorm(1, sigma=kronecker(R_xx,R_yy) )
# Simulate nuisance covariate w, which affects z through an oscillatory (day-night) process
w = sample(1:n_w, replace=TRUE, size=length(z))
Data = data.frame( expand.grid(x=1:n_x, y=1:n_y), w=w, z=as.vector(z) + cos(w/n_w*2*pi))
Data$n = Data$z + rnorm(nrow(Data), sd=1)
# Add column identifying the variable (the multivariate dimension)
Data$var = "n"
# make mesh
mesh = fmesher::fm_mesh_2d( Data[,c('x','y')], n=100 )
# fit model
out = tinyVAST( data = Data,
                formula = n ~ s(w),
                spatial_graph = mesh,
                sem = "n <-> n, sd_n" )