Skip to contents

Bivariate generalized linear mixed model structure

tinyVAST is a bivariate extension of a generalized linear mixed model (see Tables 1 and 2 for notation), which includes two linear predictors that each include four additive components:

  1. Spatial interactions among variables: The user can specify interactions among variables at a given site (or for spatially correlated latent variables) using arrow notation derived from path analysis, based on the interface from R package sem;

  2. Temporal interaction among variables: The user can specify simultaneous and lagged interactions among variables over time using an expanded arrow-and-lag notation that is derived from R package dsem, where these interactions the annual intercept for a given variable therefore apply uniformly for all locations.

  3. Spatio-temporal interactions among variables: The user can specify simultaneous and lagged interactions among variables over time, where these interactions occur on a site-by-site basis.

  4. Generalized additive model: The user specifies a formula for a generalized additive model (GAM) that is interpreted by R package mgcv. If other model components are missing, tinyVAST estimates parameters that are similar to mgcv, with small differences resulting from different methods for parameter estimation;

These four components are assembled into two linear predictors:

p1,i=𝐗1,i𝛂1+𝐙1,i𝛄1+𝐀i𝛀1,c[i]+𝐃1,c[i],t[i]+𝐀i𝐄1,c[i],t[i]p2,i=𝐗2,i𝛂2+𝐙2,i𝛄2+𝐀i𝛀2,c[i]+𝐃2,c[i],t[i]+𝐀i𝐄2,c[i],t[i] \begin{aligned} p_{\mathrm 1,i} &= \mathbf X_{\mathrm 1,i} \mathbf\alpha_{\mathrm 1} + \mathbf Z_{\mathrm 1,i} \mathbf\gamma_{\mathrm 1} + \mathbf A_{i} \mathbf\Omega_{\mathrm 1,c[i]} + \mathbf D_{\mathrm 1,c[i],t[i]} + \mathbf A_i \mathbf E_{\mathrm 1,c[i],t[i]} \\ p_{\mathrm 2,i} &= \mathbf X_{\mathrm 2,i} \mathbf\alpha_{\mathrm 2} + \mathbf Z_{\mathrm 2,i} \mathbf\gamma_{\mathrm 2} + \mathbf A_{i} \mathbf\Omega_{\mathrm 2,c[i]} + \mathbf D_{\mathrm 2,c[i],t[i]} + \mathbf A_i \mathbf E_{\mathrm 2,c[i],t[i]} \end{aligned}

where

  • p1,ip_{\mathrm 1,i} is the first linear predictor;
  • 𝐗1,i𝛂1+𝐙1,i𝛄1\mathbf X_{\mathrm 1,i} \mathbf\alpha_{\mathrm 1} + \mathbf Z_{\mathrm 1,i} \mathbf\gamma_{\mathrm 1} is the GAM component
  • 𝛂1\mathbf\alpha_{\mathrm 1} is the GAM fixed effects with associated design matrix 𝐗1\mathbf X_{\mathrm 1}, where 𝐗1,i\mathbf X_{\mathrm 1,i} is the row of the design matrix for sample ii;
  • 𝛄1\mathbf\gamma_{\mathrm 1} is the GAM random effects with associated design matrix 𝐙\mathbf Z, where 𝐙1,i\mathbf Z_{\mathrm 1,i} is the row of the design matrix;
  • 𝐀i𝛀1,c[i]\mathbf A_{i} \mathbf\Omega_{\mathrm 1,c[i]} is the projection of the space-veriable interaction 𝛀\mathbf\Omega to sample ii;
  • 𝐃1,c[i],t[i]\mathbf D_{\mathrm 1,c[i],t[i]} is the time-variable interaction 𝐃\mathbf D for sample ii;
  • 𝐀i𝐄1,c[i],t[i]\mathbf A_i \mathbf E_{\mathrm 1,c[i],t[i]} is the projection of the space-variable-time interaction 𝐄\mathbf E to sample ii;
  • p2,ip_{\mathrm 2,i} is the second linear predictor, and terms are defined similarly but using subscript-2;

The linear predictors are then passed through a bivariate inverse-link function to specify the distribution for errors:

yi∌fe[i](ge[i]−1(p1,i,p2,i),Ξe[i]) y_i \sim f_{e[i]}( g_{e[i]}^{-1} (p_{\mathrm 1,i}, p_{\mathrm 2,i}), \theta_{e[i]} )

where

  • fe[i]f_{e[i]} probability density or mass function for sample ii;
  • ge[i]−1(p1,i,p2,i)g_{e[i]}^{-1} (p_{\mathrm 1,i}, p_{\mathrm 2,i}) is the bivariate inverse-link function that transforms linear predictors to the central tendancy parameters of the distribution;
  • Ξe[i]\theta_{e[i]} is the dispersion parameters for distribution e[i]e[i];

In the simple case, the distribution only requires a single linear predictor such that đ©2=𝟎\mathbf p_{\mathrm 2} = \mathbf 0 by construction and drops out of the model. In this case, the model collapses to a generalized linear mixed model. For example we might have a log-link and Poisson distribution, such that it collapses to a log-linked Poisson GLMM, yi∌Poisson(ep1,i)y_i \sim \mathrm{Poisson}(e^{p_{\mathrm 1,i}}).

However, tinyVAST can also handle a delta-model using either logit-log or Poisson-linked link functions:

ge[i]−1(p1,i,p2,i)=(ÎŒ1,i,ÎŒ2,i) g_{e[i]}^{-1} (p_{\mathrm 1,i}, p_{\mathrm 2,i}) = ( \mu_{\mathrm 1,i}, \mu_{\mathrm 2,i} )

where ÎŒ1,i\mu_{\mathrm 1,i} and ÎŒ2,i\mu_{\mathrm 2,i} are the two linear predictors, such that đ”Œ[yi]=ÎŒ1,iÎŒ2,i\mathbb{E}[y_i] = \mu_{\mathrm 1,i} \mu_{\mathrm 2,i};

For the conventional logit-log bivariate link function we obtain:

Ό1,i=ep1,i1+ep1,iΌ2,i=ep2,i \begin{aligned} \mu_{\mathrm 1,i} &= \frac{e^{p_{\mathrm 1,i}}}{1+e^{p_{\mathrm 1,i}}} \\ \mu_{\mathrm 2,i} &= e^{p_{\mathrm 2,i}} \end{aligned}

while for the Poisson-linked link function we obtain:

ÎŒ1,i=1−e−p1,iÎŒ2,i=ep1,iÎŒ1,iep2,i \begin{aligned} \mu_{\mathrm 1,i} &= 1 - e^{-p_{\mathrm 1,i}} \\ \mu_{\mathrm 2,i} &= \frac{e^{p_{\mathrm 1,i}}}{\mu_{\mathrm 1,i}} e^{p_{\mathrm 2,i}} \end{aligned}

In either case, ÎŒ1,i\mu_{\mathrm 1,i} is the Pr(Y>0)\mathrm{Pr}(Y>0) (and is fitted with a Bernoulli distribution), while ÎŒ2,i\mu_{\mathrm 2,i} is the central tendancy for positive values, i.e., đ”Œ(Y|Y>0)=ÎŒ2,i\mathbb{E}(Y | Y>0) = \mu_{\mathrm 2,i}.

Spatial domains

Linear predictors include spatially autocorrelated latent variables. These variables are treated as Gaussian Markov random fields (GMRFs), and evaluating the probability density of GMRFs involves calculating the precision matrix 𝐐domain=đšș−1\mathbf Q_{\mathrm{domain}} = \mathbf\Sigma^{-1} as the inverse of the spatial covariance matrix. tinyVAST involves three options for specifying this spatial precision:

  • Stochastic partial differentiaul equation (SPDE): The analyst can approximate spatial variation over a continuous two-dimensional surface by constructing finite element mesh (FEM), treating the value at vertices as a GMRF, and then using bilinear interpolation (i.e., a piecewise linear approximation) to interpolate from vertices to the spatial domain. In this case, the precision is constructred as:

𝐐domain=τ2(Îș4𝐌0+2Îș2𝐌1+𝐌2) \mathbf Q_{\mathrm{domain}} = \tau^2 ( \kappa^4 \mathbf M_{\mathrm 0} + 2\kappa^2 \mathbf M_{\mathrm 1} + \mathbf M_{\mathrm 2} ) where every row 𝐀i\mathbf A_i of the interpolation matrix 𝐀\mathbf A is nonzero for only the three vertices of the triangle that contains sample ii

  • Simultaneous autoregressive (SAR): Alternatively, the analyst can specify an areal model representing the value within spatial strata:

𝐐domain=τ2(𝐈−Îș𝐀*)2 \mathbf Q_{\mathrm{domain}} = \tau^2 (\mathbf I - \kappa \mathbf A^*)^2 where 𝐀*\mathbf A^* is the adjacency matrix of the graph specified by the analyst, and each row 𝐀i\mathbf A_i of interpolation matrix 𝐀\mathbf A is nonzero only for the single spatial stratum containing sample ii (and noting that adjacency matrix 𝐀*\mathbf A^* is different from interpolation matrix 𝐀\mathbf A, but we use the same term for both due to a collision in standard notation).

  • Stream networks: Finally, the analyst can specify that sites are partially correlated if they are adjacent along a stream network (while ignoring flow direction). This results in an Onstein-Uhlenbeck process along the stream network, or an exponential tail-down model.

Structural equation models

tinyVAST also involves specifying a structural equation model (SEM). This SEM be viewed either:

  1. Weak interpretation: as a flexible way to parameterize the correlation among variables; or
  2. Strong interpretation: as a structural causal model, allowing inference about counterfactual changes to the system.

To specify a SEM, the user uses arrow notation. For example, to specify a linear model this involves:

w1 -> w2, b_12
w1 <-> w1, sd_1
w2 <-> w2, sd_2

This then estimates a single slope parameter (represented with a one-headed arrow), as well as the variance of W1W_1 and W2W_2 (specified with two-headed arrows). In a more complicated case, W1W_1 might cause W2W_2, which in turn causes W3W_3. This is then represented as:

w1 -> w2, b_12
w2 -> w3, b_23
w1 <-> w1, s_1
w2 <-> w2, s_2
w3 <-> w3, s_3

Path coefficients for one-headed arrows then define path matrix 𝐏\mathbf P:

𝐏=(000b12000b230) \mathbf P = \begin{pmatrix} 0 & 0 & 0 \\ b_{12} & 0 & 0 \\ 0 & b_{23} & 0 \end{pmatrix}

and coefficents for two-headed arrows define the Cholesky 𝐆\mathbf G of the exnogenous covariance matrix 𝐆T𝐆\mathbf G^T \mathbf G:

𝐆=(s1000s2000s3) \mathbf G = \begin{pmatrix} s_{1} & 0 & 0 \\ 0 & s_{2} & 0 \\ 0 & 0 & s_{3} \end{pmatrix}

These matrices are define a simultaneous equation model:

$$ \mathbf{ w = P w + \epsilon} \\ \mathbf\epsilon \sim \mathrm{MVN}( \mathbf 0, \mathbf G^T \mathbf G ) $$ where the variance Var(𝐰)=(𝐈−𝐏)−1𝐆2(𝐈−𝐏T)−1\mathrm{Var}(\mathbf w) = (\mathbf{I - P})^{-1} \mathbf G^2 (\mathbf{I - P}^T)^{-1}. This then results in a sparse precision matrix:

𝐐=(𝐈−𝐏T)𝐆−1𝐆−T(𝐈−𝐏) \mathbf Q = (\mathbf{I - P}^T) \mathbf G^{-1} \mathbf G^{-T} (\mathbf{I - P})

Dynamic structural equation models

Similarly, tinyVAST involves specifying dynamic structural equation models (DSEM). To specify a DSEM, the user uses arrow-and-lag notation. For example, to specify a univariate first-order autoregressive process:

w1 -> w1, 1, rho

If there were four time-intervals (T=4T=4) this would then result in the path matrix:

𝐏=(0000ρ0000ρ0000ρ0) \mathbf P = \begin{pmatrix} 0 & 0 & 0 & 0 \\ \rho & 0 & 0 & 0 \\ 0 & \rho & 0 & 0 \\ 0 & 0 & \rho & 0 \end{pmatrix}

and when the DSEM involves multiple times and variables, the sparse precision is formed by summing across the Kronecker product of time-lag and interaction matrices. This DSEM defines a GMRF over a nonseparable interaction of time and variables, represented by a matrix 𝐐time_term\mathbf Q_{\mathrm{time\_term}} with dimension CT×CTCT \times CT. The user can specify a separate arrow-and-lag notation to define the precision matrix for the time-variable interaction 𝐐time_term\mathbf Q_{\mathrm{time\_term}} and for the space-time-variable interaction 𝐐spacetime_term\mathbf Q_{\mathrm{spacetime\_term}}. The time term 𝐐time_term\mathbf Q_{\mathrm{time\_term}} is used to define the time-varying intercept for each variable:

vec(𝐃)∌MVN(𝟎,𝐐time_term) \mathrm{vec}(\mathbf D) \sim \mathrm{MVN}(\mathbf 0, \mathbf Q_{\mathrm{time\_term}}) Meanwhile, the space-time term 𝐐spacetime_term\mathbf Q_{\mathrm{spacetime\_term}} is combined with the spatial precision 𝐐space_term\mathbf Q_{\mathrm{space\_term}} as we explain next.

Spatial interactions for SEM and DSEM

tinyVAST uses the SEM and DSEM notation to construct the joint precision for the space-variable interaction 𝛀\mathbf\Omega with dimension S×CS \times C, and the space-time-variable interaction 𝐄\mathbf E with dimension S×C×TS \times C \times T. To do so, it constructs the separable precision for each process:

vec(𝐄)∌MVN(𝟎,𝐐spacetime_term⊗𝐐domain) \mathrm{vec}(\mathbf E) \sim \mathrm{MVN}(\mathbf 0, \mathbf Q_{\mathrm{spacetime\_term}} \otimes \mathbf Q_{\mathrm{domain}})

where the precision matrix 𝐐spacetime_term⊗𝐐domain\mathbf Q_{\mathrm{spacetime\_term}} \otimes \mathbf Q_{\mathrm{domain}} has dimension STC×STCSTC \times STC to match the length of vec(𝐄)\mathrm{vec}(\mathbf E), and vec(𝛀)∌MVN(𝟎,𝐐space_term⊗𝐐domain) \mathrm{vec}(\mathbf\Omega) \sim \mathrm{MVN}(\mathbf 0, \mathbf Q_{\mathrm{space\_term}} \otimes \mathbf Q_{\mathrm{domain}})

where the precision matrix 𝐐space_term⊗𝐐domain\mathbf Q_{\mathrm{space\_term}} \otimes \mathbf Q_{\mathrm{domain}} has dimension SC×SCSC \times SC, to match the length of vec(𝛀)\mathrm{vec}(\mathbf\Omega).

Generalized additive model

Finally, the analyst can specify a generalized additive model using syntax from package mgcv. For example this might involve:

count ~ year + offset(log_area) + s(depth) + s(species, bs="re")

If year and species are factors and depth and log_area are continuous, then this would specify a fixed effect for each level year, a spline smoother for depth, using log_area as an offset, and estimating a random intercept for each level of species. This formula is parsed internally to assemble fixed effects in design matrix 𝐗\mathbf X and the basis functions for spline smoothers and random effects in design matrix 𝐙\mathbf Z. The coefficients 𝛄\mathbf\gamma associated with smoothers and random effects are then specified to follow a GMRF:

đ›„âˆŒGMRF(𝟎,𝐐gam) \mathbf\gamma \sim \mathrm{GMRF}( \mathbf 0, \mathbf Q_{\mathrm{gam}}) where 𝐐gam\mathbf Q_{\mathrm{gam}} is a blockwise diagonal matrix, assembled from estimated variance parameters and matrices constructed by mgcv.

Table 1: Subscript notation
Symbol Description
ii Index for each sample, ii in (1,2,...,I)(1,2,...,I)
s[i]s[i] spatial coordinate for sample ii, ss in (1,2,...,S)(1,2,...,S)
t[i]t[i] time-interval for sample ii, tt in (1,2,...,T)(1,2,...,T)
c[i]c[i] category for sample ii, cc in (1,2,...,C)(1,2,...,C)
e[i]e[i] error distribution and link function for sample ii
Table 2: Symbol notation, code representation (in model output or in model template code), and descriptions.
Symbol Code Description
yy y_i Observed response data
p1p_1 p_i first linear predictor
p2p_2 p2_i second linear predictor