Software

\(\trainee\): Student trainee. \(\cotrainee\): Student co-trainee. \(\coraut\): Package maintainer. \(\eqaut\): Equally-contributing authors.

“pfjax: Particle filtering in JAX.”

Lysy\(\coraut\) M, Subramani\(\trainee\) P, Ramkissoon\(\trainee\) J, Wu\(\trainee\) M, Ko\(\trainee\) M, Choptra\(\trainee\) K (2022). Programming language(s): \(\textsf{Python}\). Version: 0.0.2. GitHub, website.

A collection of tools for estimating the parameters of state-space models using particle filtering methods, with JAX as the backend for JIT-compiling models and automatic differentiation.
“subdiff: Subdiffusive Modeling in Passive Particle-Tracking Microrheology.”

Lysy\(\coraut\) M, Ling\(\trainee\) Y (2021). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 0.0.1.9002. GitHub, website.

Tools for implementing various models for particle subdiffusion in biological fluids. In addition to the well-known semiparametric least-squares estimator based on the mean square displacement (MSD), the package provides functions for simulation, inference, and goodness-of-fit for two fully-parametric subdiffusion models: fractional Brownian motion and the generalized Langevin equation (GLE) model with Rouse memory kernel. A generic model class allows users to easily implement custom subdiffusion models, with a ready-made framework for drift, high-frequency error correction, and efficient maximum likelihood estimation.
“aq2020: Statistical Analysis of Air Quality Data.”

Lysy\(\coraut\) M, Al-Abadleh HA, Neil L, Patel P, Mohammed W, Khalaf Y (2021). Programming language(s): \(\textsf{R}\). Version: 0.0.0.9000. GitHub, website.

Provides the statistical tools created for the analysis in Al-Abadleh et al (2021) doi:10.1016/j.jhazmat.2021.125445, including step-by-step instructions to replicate the p-value calculations and figures.
“SpatialGEV: Efficient parameter estimation for spatial extreme value models.”

Chen\(\cotrainee\) M, Lysy M, Ramezan R (2021). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 1.0.0. CRAN, GitHub.

Fits latent variable models with the GEV distribution as the data likelihood and the GEV parameters following latent Gaussian processes. The models in this package are built using the template model builder ‘TMB’ in R, which quickly integrates out the latent variables using the Laplace approximation. In the fit function, users can choose which GEV parameter(s) are treated as spatially varying random effects following a Gaussian process, so that spatial GEV models of different complexities can be fit to a dataset without writing the models in ‘TMB’ directly. The package also offers methods to sample from the posteriors of both fixed and random effects, as well as from the posterior predictive distributions at different spatial locations. Methods for fitting this class of models are described in Chen, Ramezan, and Lysy (2021) <arXiv:2110.07051>.
“realPSD: Robust and Efficient Calibration of Parametric Power Spectral Density Models.”

Zhu\(\trainee\) F, Lysy\(\coraut\) M (2021). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 0.0.0.9000. GitHub.

Provides tools for parametric inference for power spectral density models fit to high-throughput data. Three estimators are implemented: maximum likelihood estimate (MLE) using the Whittle likelihood, nonlinear least-squares (NLS), and log-periodogram regression (LP). For arbitrary user-supplied models, the corresponding objective functions for each estimator are provided with gradients and Hessians using automatic differentiation via the R/C++ package ‘TMB’. Also provided are routines for the automated removal of periodic noise produced by instrument electronics, which typically contaminates recordings of high-throughput data.
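As a rough illustration of the Whittle objective mentioned above, the following Python sketch evaluates the negative Whittle log-likelihood (up to an additive constant) for a user-supplied PSD model. It is not the realPSD interface, which is written in R/C++; the function name `whittle_nll`, its arguments, and the assumption that the periodogram is already computed and normalized to the same scale as the model PSD are all hypothetical choices made for this example.

```python
import numpy as np

def whittle_nll(psd_model, freqs, periodogram):
    """Negative Whittle log-likelihood, up to an additive constant.

    psd_model   : callable mapping an array of frequencies to model PSD values
                  (hypothetical interface, assumed for this sketch).
    freqs       : frequencies at which the periodogram was computed.
    periodogram : periodogram ordinates, assumed to be on the same scale
                  as the model PSD.
    """
    freqs = np.asarray(freqs, dtype=float)
    s = np.asarray(psd_model(freqs), dtype=float)
    pgram = np.asarray(periodogram, dtype=float)
    # Each frequency contributes log S(f) + I(f) / S(f).
    return float(np.sum(np.log(s) + pgram / s))

def flat_psd(sigma2):
    """White-noise PSD model: constant level sigma2 at every frequency."""
    return lambda f: np.full_like(f, sigma2)
```

For the flat model, the objective is minimized when the level equals the average periodogram ordinate, which gives a quick sanity check of the implementation.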
“flexEL: Flexible Empirical Likelihood Methods for Regression.”

Huang\(\trainee\) S, Lysy\(\coraut\) M (2021). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 0.0.0.1. GitHub.

Various tools for implementing and calibrating empirical likelihood models. In particular, provides the loglikelihood and gradient functions for arbitrary moment constraint matrices. The inner optimization problem is efficiently computed in C++ using the ‘Eigen’ linear algebra library. Also provides functions for implementing right-censored regression models, where the inner optimization is conducted via expectation-maximization. Users may interface with the library through R or directly through C++, as the underlying C++ code is exposed as a standalone header-only library.
“SuperGauss: Superfast Likelihood Inference for Stationary Gaussian Time Series.”

Ling\(\trainee\) Y, Lysy\(\coraut\) M (2020). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 2.0.3. GitHub, CRAN.

Likelihood evaluations for stationary Gaussian time series are typically obtained via the Durbin-Levinson algorithm, which scales as O(n^2) in the number of time series observations. This package provides a “superfast” O(n log^2 n) algorithm written in C++, crossing over with Durbin-Levinson around n = 300. Efficient implementations of the score and Hessian functions are also provided, leading to superfast versions of inference algorithms such as Newton-Raphson and Hamiltonian Monte Carlo. The C++ code provides a Toeplitz matrix class packaged as a header-only library, to simplify low-level usage in other packages and outside of R.
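For reference, the baseline O(n^2) Durbin-Levinson recursion can be sketched in a few lines of Python. This is only an illustration of the classical algorithm that the package improves upon, not the superfast method and not the SuperGauss interface; the function name `durbin_levinson_loglik` and the zero-mean assumption are choices made for this example.

```python
import numpy as np

def durbin_levinson_loglik(x, acf):
    """Log-likelihood of a zero-mean stationary Gaussian series (sketch).

    x   : observations x[0], ..., x[n-1].
    acf : autocovariances acf[k] = Cov(x[t], x[t+k]) for k = 0, ..., n-1.
    """
    x = np.asarray(x, dtype=float)
    acf = np.asarray(acf, dtype=float)
    n = x.size
    v = acf[0]            # one-step prediction variance
    phi = np.empty(0)     # coefficients of the best linear predictor
    loglik = -0.5 * (np.log(2 * np.pi * v) + x[0] ** 2 / v)
    for t in range(1, n):
        # Partial autocorrelation at lag t.
        kappa = (acf[t] - phi @ acf[t - 1:0:-1]) / v
        # Update predictor coefficients and prediction variance.
        phi = np.concatenate([phi - kappa * phi[::-1], [kappa]])
        v *= 1.0 - kappa ** 2
        # Innovation: x[t] minus its prediction from x[t-1], ..., x[0].
        resid = x[t] - phi @ x[t - 1::-1]
        loglik += -0.5 * (np.log(2 * np.pi * v) + resid ** 2 / v)
    return float(loglik)
```

Each pass through the loop costs O(t) operations, which is where the overall O(n^2) scaling comes from.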
“rodeo: Probabilistic numerical solution of ordinary differential equations.”

Wu\(\trainee\) M, Lysy\(\coraut\) M (2020). Programming language(s): \(\textsf{Python}\)/\(\textsf{C++}\). Version: 0.4. GitHub, website.

A collection of various probabilistic numeric algorithms to solve ordinary differential equation initial value problems.
“kalmantv: Fast and lightweight Kalman filtering and smoothing.”

Wu\(\trainee\) M, Lysy\(\coraut\) M (2020). Programming language(s): \(\textsf{Python}\)/\(\textsf{C++}\). Version: 0.2.4.9000. GitHub, PyPI, website.

Provides a simple and fast Python interface to the time-varying Kalman filtering and smoothing algorithms.
“svcommon: Fast Inference for Common-Factor Stochastic Volatility Models.”

Lysy M (2020). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 0.1.0. GitHub, website.

Provides various tools for estimating the parameters of the common-factor multivariate stochastic volatility model of Fang et al (2020) doi:10.1002/cjs.11536 and extensions. In particular, the complete-data likelihood implementation scales linearly in the number of assets, and latent volatilities are efficiently marginalized using the Laplace approximation in the R package ‘TMB’ with very high accuracy. Combined with a carefully initialized block coordinate descent algorithm, maximum likelihood estimation can be conducted two orders of magnitude faster than with alternative parameter inference algorithms.
“mniw: The Matrix-Normal Inverse-Wishart Distribution.”

Lysy\(\coraut\) M, Yates\(\trainee\) B (2019). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 1.0. CRAN, website.

Density evaluation and random number generation for the Matrix-Normal Inverse-Wishart (MNIW) distribution, as well as the Matrix-Normal, Matrix-T, Wishart, and Inverse-Wishart distributions. Core calculations are implemented in a portable (header-only) C++ library, with matrix manipulations using the ‘Eigen’ library for linear algebra. Also provided is a Gibbs sampler for Bayesian inference on a random-effects model with multivariate normal observations.
“LMN: Inference for Linear Models with Nuisance Parameters.”

Lysy\(\coraut\) M, Yates\(\trainee\) B (2019). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 1.1.2. CRAN, GitHub.

Efficient Frequentist profiling and Bayesian marginalization of parameters for which the conditional likelihood is that of a multivariate linear regression model. Arbitrary inter-observation error correlations are supported, with optimized calculations provided for independent-heteroskedastic and stationary dependence structures.
“TMBtools: Tools for Developing R Packages Interfacing with ‘TMB’.”

Lysy M (2019). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 0.9.2. GitHub.

Provides helper functions for creating packages which contain ‘TMB’ source code.
“losmix: Inference for Gaussian Location-Scale Mixed-Effects Models.”

Lysy M (2019). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 0.0.1. GitHub.

Tools for fitting Gaussian linear mixed-effects models where both the mean and variance terms are random. Marginal likelihoods are compiled in C++ with automatic differentiation using the ‘TMB’ library, such that efficient gradient-based optimization algorithms can be readily employed. The package provides C++ entry points for users to extend the basic model with minimal effort.
“LocalCop: Local Likelihood Inference for Conditional Copula Models.”

Lysy\(\coraut\) M, Acar EF (2019). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 0.0.0.9000. GitHub, website.

Implements a local likelihood estimator for the dependence parameter in bivariate conditional copula models. Copula family and local likelihood bandwidth parameters are selected by leave-one-out cross-validation. The models are implemented in ‘TMB’, meaning that the local score function is efficiently calculated via automatic differentiation (AD), such that quasi-Newton algorithms may be used for parameter estimation.
“hlm: Inference for a Heteroscedastic Linear Model.”

Lysy\(\coraut\) M, You\(\trainee\) T, Wang\(\trainee\) Y (2019). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 0.0.0.9000. GitHub, website.

Provides efficient algorithms to compute the MLE of a heteroscedastic regression model, with linear mean and log-linear variance covariates. An Expectation-Maximization algorithm is provided for right-censored responses, which are expected in the context of survival analysis for the corresponding heteroscedastic accelerated failure-time model of Wang et al (2019).
“rdoxygen: Create Doxygen Documentation for Source Code.”

Schmid C, Lysy\(\coraut\) M (2019). Programming language(s): \(\textsf{R}\). Version: 2.0.0. GitHub.

Create ‘doxygen’ documentation for source code in R packages, and optionally make it accessible as an R vignette. Includes an ‘RStudio’ Addin to easily trigger the doxygenize process.
“GaussCop: Utilities for the Gaussian Copula Distribution.”

Lysy\(\coraut\) M, Ramkissoon\(\trainee\) J (2019). Programming language(s): \(\textsf{R}\). Version: 0.9. GitHub, website.

Provides functions for density evaluation and both joint and conditional random sampling from a Gaussian Copula distribution with arbitrary margins. This is achieved by storing univariate marginal distributions as extended density (‘xDensity’) objects which provide ‘d/p/q/r’ methods, and which can be estimated from data using a variety of parametric, nonparametric, and semiparametric approaches.
“rstantools: Tools for developing packages interfacing with ‘Stan’.”

Gabry J, Goodrich B, Lysy M, Siegert S (2018). Programming language(s): \(\textsf{R}\)/\(\textsf{Stan}\). Version: 2.2.0. CRAN, GitHub.

Provides various tools for developers of R packages interfacing with ‘Stan’ https://mc-stan.org, including functions to set up the required package structure, S3 generics and default methods to unify function naming across ‘Stan’-based R packages, and vignettes with recommendations for developers.
“optimCheck: Graphical and Numerical Checks for Mode-Finding Routines.”

Lysy M (2018). Programming language(s): \(\textsf{R}\). Version: 1.0. CRAN, GitHub, website.

Tools for checking that the output of an optimization algorithm is indeed at a local mode of the objective function. This is accomplished graphically by calculating all one-dimensional “projection plots” of the objective function, i.e., varying each input variable one at a time with all other elements of the potential solution being fixed. The numerical values in these plots can be readily extracted for the purpose of automated and systematic unit-testing of optimization routines.
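The projection idea described above can be sketched in Python as follows. This is a minimal illustration, not the optimCheck interface (which is an R package); the function names `projection_values` and `is_local_max`, the relative grid width, and the numerical tolerance are all hypothetical choices for this example.

```python
import numpy as np

def projection_values(fun, x_sol, rel_width=0.1, n_points=21):
    """Evaluate fun along each coordinate axis through x_sol.

    For coordinate i, a grid is laid out around x_sol[i] while all other
    coordinates are held at their candidate-solution values.
    Returns (grids, values), each of shape (len(x_sol), n_points).
    """
    x_sol = np.asarray(x_sol, dtype=float)
    grids, values = [], []
    for i in range(x_sol.size):
        # Grid half-width is relative to the magnitude of the coordinate.
        half = rel_width * max(abs(x_sol[i]), 1.0)
        grid = np.linspace(x_sol[i] - half, x_sol[i] + half, n_points)
        vals = np.empty(n_points)
        for j, g in enumerate(grid):
            x = x_sol.copy()
            x[i] = g          # vary coordinate i only
            vals[j] = fun(x)
        grids.append(grid)
        values.append(vals)
    return np.array(grids), np.array(values)

def is_local_max(fun, x_sol, tol=1e-12, **kwargs):
    """Numerical check: no projection value exceeds fun(x_sol)."""
    _, values = projection_values(fun, x_sol, **kwargs)
    f_sol = fun(np.asarray(x_sol, dtype=float))
    return bool(np.all(values <= f_sol + tol))
```

The arrays returned by `projection_values` could be plotted one coordinate at a time, while `is_local_max` is the kind of check that might go into a unit test for an optimizer. Passing the check along coordinate axes is necessary but not sufficient for a local mode.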
“PK1: Mixed Effect SDE Inference for the One-Compartment Pharmacodynamic Model.”

Lysy\(\coraut\) M, Rai\(\trainee\) K (2018). Programming language(s): \(\textsf{R}\)/\(\textsf{Stan}\). Version: 0.0.0.9011. GitHub.

Simulation and inference for the PK1 model featuring fixed/mixed effects, SDE/ODE implementations, and with/without measurement error.
“MADPop: MHC Allele-Based Differencing Between Populations.”

Lysy\(\coraut\) M, Kim\(\trainee\) PWJ, Robinson T (2018). Programming language(s): \(\textsf{R}\)/\(\textsf{Stan}\). Version: 1.1.2. CRAN, GitHub.

Tools for the analysis of population differences using the Major Histocompatibility Complex (MHC) genotypes of samples having a variable number of alleles (1-4) recorded for each individual. A hierarchical Dirichlet-Multinomial model on the genotype counts is used to pool small samples from multiple populations for pairwise tests of equality. Bayesian inference is implemented via the ‘rstan’ package. Bootstrapped and posterior p-values are provided for chi-squared and likelihood ratio tests of equal genotype probabilities.
“msde: Bayesian Inference for Multivariate Stochastic Differential Equations.”

Lysy\(\coraut\) M, Zhu\(\trainee\) F, Tong\(\trainee\) J, Kitt\(\trainee\) T, Delaney N (2017). Programming language(s): \(\textsf{R}\)/\(\textsf{C++}\). Version: 1.0.5. GitHub, website.

Implements an MCMC sampler for the posterior distribution of arbitrary time-homogeneous multivariate stochastic differential equation (SDE) models with possibly latent components. The package provides a simple entry point to integrate user-defined models directly with the sampler’s C++ code, and parallelizes large portions of the calculations when compiled with ‘OpenMP’.
“nicheROVER: Niche Region and Niche Overlap Metrics for Multidimensional Ecological Niches.”

Lysy\(\coraut\) M, Statsko\(\cotrainee\) AD, Swanson HK (2014). Programming language(s): \(\textsf{R}\). Version: 1.1.1. CRAN, GitHub.

Implementation of a probabilistic method to calculate ‘nicheROVER’ (niche _r_egion and niche _over_lap) metrics using multidimensional niche indicator data (e.g., stable isotopes, environmental variables, etc.). The niche region is defined as the joint probability density function of the multidimensional niche indicators at a user-defined probability alpha (e.g., 95%). Uncertainty is accounted for in a Bayesian framework, and the method can be extended to three or more indicator dimensions. It provides directional estimates of niche overlap, accounts for species-specific distributions in multivariate niche space, and produces unique and consistent bivariate projections of the multivariate niche region. The article by Swanson et al. (2015) doi:10.1890/14-0235.1 provides a detailed description of the methodology. See the package vignette for a worked example using fish stable isotope data.