Computes the exact Euler loglikelihood for any amount of missing data using a Kalman filter.

mou.loglik(X, dt, nvar.obs, Gamma, Lambda, Phi, mu0, Sigma0)

Arguments

X

An nobs x ndims matrix of complete data.

dt

A scalar or length nobs-1 vector of interobservation times.

nvar.obs

A scalar or length nobs vector of integers between 0 and ndims denoting the number of observed SDE variables in each row of data. Defaults to ndims. See sde.init() for details.

Gamma

An ndims x ndims matrix of linear-drift parameters. See Details.

Lambda

A length-ndims vector of constant-drift parameters. See Details.

Phi

An ndims x ndims positive definite variance matrix. See Details.

mu0, Sigma0

Mean and variance of marginal multivariate normal distribution of X[1,]. Defaults to iid standard normals for each component.
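Putting the argument shapes together, the following base-R sketch constructs a valid set of inputs; all variable values here are illustrative, and the mou.loglik() call itself is commented out so the sketch runs without msde attached.

```r
nobs <- 20; ndims <- 2
X <- matrix(rnorm(nobs * ndims), nobs, ndims)     # nobs x ndims data matrix
dt <- rep(0.1, nobs - 1)                          # interobservation times
nvar.obs <- rep(ndims, nobs)                      # all components observed
Gamma <- -0.5 * diag(ndims)                       # linear-drift matrix
Lambda <- rnorm(ndims)                            # constant-drift vector
Phi <- crossprod(matrix(rnorm(ndims^2), ndims))   # positive definite variance
mu0 <- rep(0, ndims); Sigma0 <- diag(ndims)       # marginal law of X[1,]
# with msde attached:
# ll <- mou.loglik(X, dt, nvar.obs, Gamma, Lambda, Phi, mu0, Sigma0)
```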

Value

Scalar value of the loglikelihood. See Details.

Details

The \(p\)-dimensional multivariate Ornstein-Uhlenbeck (mOU) process \(Y_t = (Y_{1t}, \ldots, Y_{pt})\) satisfies the SDE $$ dY_t = (\Gamma Y_t + \Lambda)dt + \Phi^{1/2} dB_t, $$ where \(B_t = (B_{1t}, \ldots, B_{pt})\) is \(p\)-dimensional Brownian motion. Its Euler discretization is of the form $$ Y_{n+1} = Y_n + (\Gamma Y_n + \Lambda) \Delta_n + \Phi^{1/2} \Delta B_n, $$ where \(Y_n = Y(t_n)\), \(\Delta_n = t_{n+1} - t_n\) and $$ \Delta B_n = B(t_{n+1}) - B(t_n) \stackrel{\textnormal{ind}}{\sim} \mathcal N(0, \Delta_n I). $$ Thus, \(Y_0, \ldots, Y_N\) is a multivariate normal Markov chain for which the marginal distribution of any subset of timepoints and/or components can be efficiently calculated using the Kalman filter. This can be used to check the MCMC output of sde.post() as in the example.
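For complete data no filtering is needed: the Euler loglikelihood is simply a sum of conditional Gaussian log-densities along the chain. The following base-R sketch simulates the discretization above and evaluates that loglikelihood directly; the helper names euler_sim() and euler_loglik() are illustrative and not part of msde.

```r
set.seed(1)
p <- 2
Gamma <- -0.5 * diag(p)   # stable linear drift
Lambda <- c(1, -1)        # constant drift
Phi <- diag(p)            # diffusion variance
dt <- 0.1; N <- 100

# simulate Y_0, ..., Y_N from the Euler recursion
euler_sim <- function(N, y0, dt, Gamma, Lambda, Phi) {
  p <- length(y0)
  Y <- matrix(NA, N + 1, p)
  Y[1, ] <- y0
  C <- chol(Phi) # Phi^{1/2} on the Cholesky scale
  for (n in 1:N) {
    dB <- sqrt(dt) * rnorm(p) # Delta B_n ~ N(0, dt * I)
    Y[n + 1, ] <- Y[n, ] + (Gamma %*% Y[n, ] + Lambda) * dt + t(C) %*% dB
  }
  Y
}

# complete-data Euler loglik: sum of conditional normal log-densities
euler_loglik <- function(Y, dt, Gamma, Lambda, Phi) {
  N <- nrow(Y) - 1
  V <- Phi * dt            # conditional variance of each increment
  Vi <- solve(V)
  ldet <- determinant(V)$modulus # log-determinant of V
  ll <- 0
  for (n in 1:N) {
    mu <- Y[n, ] + (Gamma %*% Y[n, ] + Lambda) * dt
    r <- Y[n + 1, ] - mu
    ll <- ll - 0.5 * (ncol(Y) * log(2 * pi) + ldet + t(r) %*% Vi %*% r)
  }
  as.numeric(ll)
}

Y <- euler_sim(N, y0 = c(0, 0), dt, Gamma, Lambda, Phi)
ll <- euler_loglik(Y, dt, Gamma, Lambda, Phi)
```

With all components observed at every timepoint, this direct computation and the Kalman-filter computation in mou.loglik() target the same quantity; the filter is what makes the partially observed case tractable.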

Examples

# \donttest{
# bivariate OU model
bmod <- sde.examples("biou")

# simulate some data

# true parameter values
Gamma0 <- .1 * crossprod(matrix(rnorm(4), 2, 2))
Lambda0 <- rnorm(2)
Phi0 <- crossprod(matrix(rnorm(4), 2, 2))
Psi0 <- chol(Phi0) # precompiled model uses the Cholesky scale
theta0 <- c(Gamma0, Lambda0, Psi0[c(1,3,4)])
names(theta0) <- bmod$param.names
# initial value
Y0 <- rnorm(2)
names(Y0) <- bmod$data.names

# simulation
dT <- runif(1, max = .1) # time step
nObs <- 10
bsim <- sde.sim(bmod, x0 = Y0, theta = theta0,
                dt = dT, dt.sim = dT, nobs = nObs)
#> Number of normal draws required: 10
#> Running simulation...
#> Execution time: 0 seconds.
#> Bad Draws = 0
YObs <- bsim$data

# inference via MCMC
binit <- sde.init(bmod, x = YObs, dt = dT, theta = theta0,
                  nvar.obs = 1) # second component is unobserved
# only Lambda1 is unknown
fixed.params <- rep(TRUE, bmod$nparams)
names(fixed.params) <- bmod$param.names
fixed.params["Lambda1"] <- FALSE
# prior on (Lambda1, Y_0)
hyper <- list(mu = c(0, 0), Sigma = diag(2))
names(hyper$mu) <- bmod$data.names
dimnames(hyper$Sigma) <- rep(list(bmod$data.names), 2)

# posterior sampling
nsamples <- 1e5
burn <- 1e3
bpost <- sde.post(bmod, binit, hyper = hyper,
                  fixed.params = fixed.params,
                  nsamples = nsamples, burn = burn)
#> Output size:
#> params = 9e+05
#> data = 40000
#> Running posterior sampler...
#> Execution time: 0.3 seconds.
#> Bridge accept: min = 74.8%, avg = 87.6%
#> Lambda1 accept: 44.2%
#> Y20 accept: 100%
L1.mcmc <- bpost$params[,"Lambda1"]

# analytic posterior
L1.seq <- seq(min(L1.mcmc), max(L1.mcmc), len = 500)
L1.loglik <- sapply(L1.seq, function(l1) {
  lambda <- Lambda0
  lambda[1] <- l1
  mou.loglik(X = YObs, dt = dT, nvar.obs = 1,
             Gamma = Gamma0, Lambda = lambda, Phi = Phi0,
             mu0 = hyper$mu, Sigma0 = hyper$Sigma)
})
# normalize density
L1.Kalman <- exp(L1.loglik - max(L1.loglik))
L1.Kalman <- L1.Kalman/sum(L1.Kalman)/(L1.seq[2] - L1.seq[1])

# compare
hist(L1.mcmc, breaks = 100, freq = FALSE,
     main = expression(p(Lambda[1]*" | "*bold(Y)[1])),
     xlab = expression(Lambda[1]))
lines(L1.seq, L1.Kalman, col = "red")
legend("topright", legend = c("Analytic", "MCMC"), pch = c(NA, 22), lty = c(1, NA), col = c("red", "black"))
# }