Wishart Distribution

The Wishart distribution on a random positive-definite matrix $\boldsymbol{X}_{q\times q}$ is denoted $\boldsymbol{X} \sim \operatorname{Wish}(\boldsymbol{\Psi}, \nu)$, and defined as $\boldsymbol{X} = (\boldsymbol{L}\boldsymbol{Z})(\boldsymbol{L}\boldsymbol{Z})'$ (a sampling sketch follows the list), where:

  • $\boldsymbol{\Psi}_{q\times q} = \boldsymbol{L}\boldsymbol{L}'$ is the positive-definite matrix scale parameter,

  • $\nu > q$ is the shape parameter,

  • $\boldsymbol{Z}_{q\times q}$ is a random lower-triangular matrix with elements

$$ Z_{ij} \begin{cases} \overset{\;\textrm{iid}\;}{\sim} \operatorname{Normal}(0,1) & i > j \\ \overset{\:\textrm{ind}\:}{\sim} \chi_{(\nu-i+1)} & i = j \\ = 0 & i < j, \end{cases} $$

    where $\chi_{(k)}$ denotes the chi distribution, i.e., the square root of a $\chi^2_{(k)}$ random variable.
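
For concreteness, here is a minimal NumPy sketch of this construction; the helper name `sample_wishart` is illustrative, and the sanity check uses the standard fact that $E[\boldsymbol{X}] = \nu\boldsymbol{\Psi}$.

```python
# Minimal sketch of the lower-triangular Bartlett-type construction
# X = (L Z)(L Z)'.  The helper name `sample_wishart` is illustrative.
import numpy as np

def sample_wishart(Psi, nu, rng):
    q = Psi.shape[0]
    L = np.linalg.cholesky(Psi)                    # Psi = L L'
    Z = np.zeros((q, q))
    below = np.tril_indices(q, k=-1)
    Z[below] = rng.standard_normal(len(below[0]))  # iid N(0,1) below diagonal
    df = nu - np.arange(1, q + 1) + 1              # nu - i + 1, i = 1..q
    Z[np.diag_indices(q)] = np.sqrt(rng.chisquare(df))
    LZ = L @ Z
    return LZ @ LZ.T

rng = np.random.default_rng(0)
Psi = np.array([[2.0, 0.5], [0.5, 1.0]])
nu = 5.0
draws = np.stack([sample_wishart(Psi, nu, rng) for _ in range(10000)])
print(draws.mean(axis=0))   # should be close to E[X] = nu * Psi
print(nu * Psi)
```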

The log-density of the Wishart distribution is

$$ \log p(\boldsymbol{X} \mid \boldsymbol{\Psi}, \nu) = -\tfrac{1}{2} \left[\mathrm{tr}(\boldsymbol{\Psi}^{-1} \boldsymbol{X}) + (q+1-\nu)\log |\boldsymbol{X}| + \nu \log |\boldsymbol{\Psi}| + \nu q \log(2) + 2 \log \Gamma_q\big(\tfrac{\nu}{2}\big)\right], $$

where $\Gamma_n(x)$ is the multivariate Gamma function defined as

$$ \Gamma_n(x) = \pi^{n(n-1)/4} \prod_{j=1}^n \Gamma\big(x + \tfrac{1}{2}(1-j)\big). $$
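
The log-density and $\Gamma_n(x)$ can be transcribed directly with NumPy/SciPy (`scipy.special.multigammaln(a, q)` computes $\log\Gamma_q(a)$); a rough sketch, cross-checked against `scipy.stats.wishart`:

```python
# Sketch: the Wishart log-density above, cross-checked against SciPy.
import numpy as np
from scipy.special import multigammaln      # multigammaln(a, q) = log Gamma_q(a)
from scipy.stats import wishart

def wishart_logpdf(X, Psi, nu):
    q = X.shape[0]
    _, ldX = np.linalg.slogdet(X)
    _, ldPsi = np.linalg.slogdet(Psi)
    tr = np.trace(np.linalg.solve(Psi, X))  # tr(Psi^{-1} X)
    return -0.5 * (tr + (q + 1 - nu) * ldX + nu * ldPsi
                   + nu * q * np.log(2) + 2 * multigammaln(nu / 2, q))

Psi = np.array([[2.0, 0.5], [0.5, 1.0]])
nu = 5.0
X = np.array([[3.0, 1.0], [1.0, 4.0]])
print(wishart_logpdf(X, Psi, nu))
print(wishart.logpdf(X, df=nu, scale=Psi))  # should agree
```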

Inverse-Wishart Distribution

The Inverse-Wishart distribution $\boldsymbol{X} \sim \operatorname{InvWish}(\boldsymbol{\Psi}, \nu)$ is defined as $\boldsymbol{X}^{-1} \sim \operatorname{Wish}(\boldsymbol{\Psi}^{-1}, \nu)$. Its log-density is given by

$$ \log p(\boldsymbol{X} \mid \boldsymbol{\Psi}, \nu) = -\tfrac{1}{2} \left[\mathrm{tr}(\boldsymbol{\Psi}\boldsymbol{X}^{-1}) + (\nu+q+1) \log |\boldsymbol{X}| - \nu \log |\boldsymbol{\Psi}| + \nu q \log(2) + 2 \log \Gamma_q\big(\tfrac{\nu}{2}\big)\right]. $$
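
A similar sketch for the Inverse-Wishart log-density, checked against `scipy.stats.invwishart`, whose `scale` argument corresponds to $\boldsymbol{\Psi}$ here:

```python
# Sketch: the Inverse-Wishart log-density above, checked against SciPy.
import numpy as np
from scipy.special import multigammaln
from scipy.stats import invwishart

def invwishart_logpdf(X, Psi, nu):
    q = X.shape[0]
    _, ldX = np.linalg.slogdet(X)
    _, ldPsi = np.linalg.slogdet(Psi)
    tr = np.trace(Psi @ np.linalg.inv(X))       # tr(Psi X^{-1})
    return -0.5 * (tr + (nu + q + 1) * ldX - nu * ldPsi
                   + nu * q * np.log(2) + 2 * multigammaln(nu / 2, q))

Psi = np.array([[2.0, 0.5], [0.5, 1.0]])
nu = 5.0
X = np.array([[1.5, 0.2], [0.2, 0.8]])
print(invwishart_logpdf(X, Psi, nu))
print(invwishart.logpdf(X, df=nu, scale=Psi))   # should agree
```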

Properties

If $\boldsymbol{X}_{q\times q} \sim \operatorname{Wish}(\boldsymbol{\Psi},\nu)$, then for a nonzero vector $\boldsymbol{a}\in \mathbb{R}^q$ we have

$$ \frac{\boldsymbol{a}'\boldsymbol{X}\boldsymbol{a}}{\boldsymbol{a}'\boldsymbol{\Psi}\boldsymbol{a}} \sim \chi^2_{(\nu)}. $$
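
A rough Monte Carlo illustration of this property (all values are illustrative):

```python
# Sketch: Monte Carlo check that a'Xa / a'Psi a behaves like a chi^2_(nu).
import numpy as np
from scipy.stats import wishart, chi2

rng = np.random.default_rng(1)
Psi = np.array([[2.0, 0.5], [0.5, 1.0]])
nu = 6.0
a = np.array([1.0, -2.0])
draws = wishart.rvs(df=nu, scale=Psi, size=20000, random_state=rng)
r = np.einsum('i,nij,j->n', a, draws, a) / (a @ Psi @ a)
print(r.mean(), chi2.mean(nu))   # both roughly nu
print(r.var(), chi2.var(nu))     # both roughly 2 * nu
```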

Matrix-Normal Distribution

The Matrix-Normal distribution on a random matrix $\boldsymbol{X}_{p \times q}$ is denoted $\boldsymbol{X} \sim \operatorname{MatNorm}(\boldsymbol{\Lambda}, \boldsymbol{\Sigma}_R, \boldsymbol{\Sigma}_C)$, and defined as $\boldsymbol{X} = \boldsymbol{L}\boldsymbol{Z}\boldsymbol{U} + \boldsymbol{\Lambda}$, where:

  • $\boldsymbol{\Lambda}_{p \times q}$ is the mean matrix parameter,
  • $(\boldsymbol{\Sigma}_R)_{p \times p} = \boldsymbol{L}\boldsymbol{L}'$ is the row-variance matrix parameter,
  • $(\boldsymbol{\Sigma}_C)_{q \times q} = \boldsymbol{U}'\boldsymbol{U}$ is the column-variance matrix parameter,
  • $\boldsymbol{Z}_{p\times q}$ is a random matrix with $Z_{ij} \overset{\;\textrm{iid}\;}{\sim} \operatorname{Normal}(0,1)$.

The log-density of the Matrix-Normal distribution is

$$ \log p(\boldsymbol{X} \mid \boldsymbol{\Lambda}, \boldsymbol{\Sigma}_R, \boldsymbol{\Sigma}_C) = -\tfrac{1}{2} \left[\mathrm{tr}\big(\boldsymbol{\Sigma}_C^{-1}(\boldsymbol{X}-\boldsymbol{\Lambda})'\boldsymbol{\Sigma}_R^{-1}(\boldsymbol{X}-\boldsymbol{\Lambda})\big) + pq \log(2\pi) + p \log |\boldsymbol{\Sigma}_C| + q \log |\boldsymbol{\Sigma}_R|\right]. $$
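
A sketch of this log-density, checked against `scipy.stats.matrix_normal`, whose `rowcov` and `colcov` arguments play the roles of $\boldsymbol{\Sigma}_R$ and $\boldsymbol{\Sigma}_C$:

```python
# Sketch: the Matrix-Normal log-density above, checked against SciPy.
import numpy as np
from scipy.stats import matrix_normal

def matnorm_logpdf(X, Lambda, SigmaR, SigmaC):
    p, q = X.shape
    R = X - Lambda
    # tr(SigmaC^{-1} R' SigmaR^{-1} R)
    tr = np.trace(np.linalg.solve(SigmaC, R.T) @ np.linalg.solve(SigmaR, R))
    _, ldR = np.linalg.slogdet(SigmaR)
    _, ldC = np.linalg.slogdet(SigmaC)
    return -0.5 * (tr + p * q * np.log(2 * np.pi) + p * ldC + q * ldR)

rng = np.random.default_rng(2)
p, q = 3, 2
Lambda = rng.standard_normal((p, q))
SigmaR = np.eye(p) + 0.3
SigmaC = np.array([[1.0, 0.4], [0.4, 2.0]])
X = rng.standard_normal((p, q))
print(matnorm_logpdf(X, Lambda, SigmaR, SigmaC))
print(matrix_normal.logpdf(X, mean=Lambda, rowcov=SigmaR, colcov=SigmaC))
```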

Properties

If $\boldsymbol{X}_{p \times q} \sim \operatorname{MatNorm}(\boldsymbol{\Lambda}, \boldsymbol{\Sigma}_R, \boldsymbol{\Sigma}_C)$, then for nonzero vectors $\boldsymbol{a}\in \mathbb{R}^p$ and $\boldsymbol{b}\in \mathbb{R}^q$ we have

$$ \boldsymbol{a}' \boldsymbol{X}\boldsymbol{b} \sim \operatorname{Normal}(\boldsymbol{a}' \boldsymbol{\Lambda}\boldsymbol{b}, \; \boldsymbol{a}'\boldsymbol{\Sigma}_R\boldsymbol{a}\cdot \boldsymbol{b}'\boldsymbol{\Sigma}_C\boldsymbol{b}). $$
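
A rough Monte Carlo check of the mean and variance of $\boldsymbol{a}'\boldsymbol{X}\boldsymbol{b}$ (all values are illustrative):

```python
# Sketch: Monte Carlo check of the mean and variance of a'Xb.
import numpy as np
from scipy.stats import matrix_normal

rng = np.random.default_rng(3)
p, q = 3, 2
Lambda = np.arange(p * q, dtype=float).reshape(p, q)
SigmaR = np.eye(p) + 0.3
SigmaC = np.array([[1.0, 0.4], [0.4, 2.0]])
a, b = np.array([1.0, 0.0, -1.0]), np.array([2.0, 1.0])
X = matrix_normal.rvs(mean=Lambda, rowcov=SigmaR, colcov=SigmaC,
                      size=20000, random_state=rng)
s = np.einsum('i,nij,j->n', a, X, b)
print(s.mean(), a @ Lambda @ b)                       # means should agree
print(s.var(), (a @ SigmaR @ a) * (b @ SigmaC @ b))   # variances should agree
```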

Matrix-Normal Inverse-Wishart Distribution

The Matrix-Normal Inverse-Wishart distribution on a random matrix $\boldsymbol{X}_{p \times q}$ and random positive-definite matrix $\boldsymbol{V}_{q\times q}$ is denoted $(\boldsymbol{X},\boldsymbol{V}) \sim \operatorname{MNIW}(\boldsymbol{\Lambda}, \boldsymbol{\Sigma}, \boldsymbol{\Psi}, \nu)$, and defined as

$$ \begin{aligned} \boldsymbol{X} \mid \boldsymbol{V} & \sim \operatorname{MatNorm}(\boldsymbol{\Lambda}, \boldsymbol{\Sigma}, \boldsymbol{V}) \\ \boldsymbol{V} & \sim \operatorname{InvWish}(\boldsymbol{\Psi}, \nu). \end{aligned} $$
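
Sampling from the MNIW is just this two-stage composition; a minimal sketch (the helper name `rmniw` is illustrative):

```python
# Sketch: drawing (X, V) ~ MNIW(Lambda, Sigma, Psi, nu) by composition.
import numpy as np
from scipy.stats import invwishart, matrix_normal

def rmniw(Lambda, Sigma, Psi, nu, rng):
    # V ~ InvWish(Psi, nu), then X | V ~ MatNorm(Lambda, Sigma, V)
    V = np.atleast_2d(invwishart.rvs(df=nu, scale=Psi, random_state=rng))
    X = matrix_normal.rvs(mean=Lambda, rowcov=Sigma, colcov=V,
                          random_state=rng)
    return X, V

rng = np.random.default_rng(5)
X, V = rmniw(Lambda=np.zeros((3, 2)), Sigma=np.eye(3),
             Psi=np.eye(2), nu=5.0, rng=rng)
print(X.shape, V.shape)   # (3, 2) (2, 2)
```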

Properties

The MNIW distribution is the conjugate prior for the multivariate-response regression model

$$ \boldsymbol{Y}_{n \times q} \sim \operatorname{MatNorm}(\boldsymbol{X}_{n\times p} \boldsymbol{\beta}_{p \times q}, \boldsymbol{V}, \boldsymbol{\Sigma}). $$

That is, if $(\boldsymbol{\beta}, \boldsymbol{\Sigma}) \sim \operatorname{MNIW}(\boldsymbol{\Lambda}, \boldsymbol{\Omega}^{-1}, \boldsymbol{\Psi}, \nu)$, then

$$ \boldsymbol{\beta}, \boldsymbol{\Sigma} \mid \boldsymbol{Y} \sim \operatorname{MNIW}(\hat{\boldsymbol{\Lambda}}, \hat{\boldsymbol{\Omega}}^{-1}, \hat{\boldsymbol{\Psi}}, \hat\nu), $$

where

$$ \begin{aligned} \hat{\boldsymbol{\Omega}} & = \boldsymbol{X}'\boldsymbol{V}^{-1}\boldsymbol{X} + \boldsymbol{\Omega} & \hat{\boldsymbol{\Psi}} & = \boldsymbol{\Psi} + \boldsymbol{Y}'\boldsymbol{V}^{-1}\boldsymbol{Y} + \boldsymbol{\Lambda}'\boldsymbol{\Omega}\boldsymbol{\Lambda} - \hat{\boldsymbol{\Lambda}}'\hat{\boldsymbol{\Omega}}\hat{\boldsymbol{\Lambda}} \\ \hat{\boldsymbol{\Lambda}} & = \hat{\boldsymbol{\Omega}}^{-1}(\boldsymbol{X}'\boldsymbol{V}^{-1}\boldsymbol{Y} + \boldsymbol{\Omega}\boldsymbol{\Lambda}) & \hat\nu & = \nu + n. \end{aligned} $$
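
These updates can be transcribed directly; a minimal NumPy sketch (the function name `mniw_posterior` is illustrative):

```python
# Sketch: the conjugate MNIW update above; argument names mirror the formulas.
import numpy as np

def mniw_posterior(Y, X, V, Lambda, Omega, Psi, nu):
    Vi = np.linalg.inv(V)
    Omega_hat = X.T @ Vi @ X + Omega
    Lambda_hat = np.linalg.solve(Omega_hat, X.T @ Vi @ Y + Omega @ Lambda)
    Psi_hat = (Psi + Y.T @ Vi @ Y + Lambda.T @ Omega @ Lambda
               - Lambda_hat.T @ Omega_hat @ Lambda_hat)
    nu_hat = nu + Y.shape[0]
    return Lambda_hat, Omega_hat, Psi_hat, nu_hat
```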

Matrix-t Distribution

The Matrix-$t$ distribution on a random matrix $\boldsymbol{X}_{p \times q}$ is denoted $\boldsymbol{X} \sim \operatorname{MatT}(\boldsymbol{\Lambda}, \boldsymbol{\Sigma}_R, \boldsymbol{\Sigma}_C, \nu)$, and defined as the marginal distribution of $\boldsymbol{X}$ for $(\boldsymbol{X}, \boldsymbol{V}) \sim \operatorname{MNIW}(\boldsymbol{\Lambda}, \boldsymbol{\Sigma}_R, \boldsymbol{\Sigma}_C, \nu)$. Its log-density is given by

$$ \begin{aligned} \log p(\boldsymbol{X} \mid \boldsymbol{\Lambda}, \boldsymbol{\Sigma}_R, \boldsymbol{\Sigma}_C, \nu) & = -\tfrac{1}{2} \Big[(\nu+p+q-1)\log \big| I + \boldsymbol{\Sigma}_R^{-1}(\boldsymbol{X}-\boldsymbol{\Lambda})\boldsymbol{\Sigma}_C^{-1}(\boldsymbol{X}-\boldsymbol{\Lambda})'\big| \\ & \qquad + q \log |\boldsymbol{\Sigma}_R| + p \log |\boldsymbol{\Sigma}_C| + pq \log(\pi) \\ & \qquad - 2\log \Gamma_q\big(\tfrac{\nu+p+q-1}{2}\big) + 2\log \Gamma_q\big(\tfrac{\nu+q-1}{2}\big)\Big]. \end{aligned} $$
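
A direct transcription of this log-density as written (the function name is illustrative; `multigammaln(a, q)` is $\log\Gamma_q(a)$):

```python
# Sketch: a direct transcription of the Matrix-t log-density as written above.
import numpy as np
from scipy.special import multigammaln    # multigammaln(a, q) = log Gamma_q(a)

def matrix_t_logpdf(X, Lambda, SigmaR, SigmaC, nu):
    p, q = X.shape
    R = X - Lambda
    # I + SigmaR^{-1} R SigmaC^{-1} R'
    M = np.eye(p) + np.linalg.solve(SigmaR, R) @ np.linalg.solve(SigmaC, R.T)
    _, ldM = np.linalg.slogdet(M)
    _, ldR = np.linalg.slogdet(SigmaR)
    _, ldC = np.linalg.slogdet(SigmaC)
    return -0.5 * ((nu + p + q - 1) * ldM + q * ldR + p * ldC
                   + p * q * np.log(np.pi)
                   - 2 * multigammaln((nu + p + q - 1) / 2, q)
                   + 2 * multigammaln((nu + q - 1) / 2, q))
```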

Properties

If $\boldsymbol{X}_{p\times q} \sim \operatorname{MatT}(\boldsymbol{\Lambda}, \boldsymbol{\Sigma}_R, \boldsymbol{\Sigma}_C, \nu)$, then for nonzero vectors $\boldsymbol{a}\in \mathbb{R}^p$ and $\boldsymbol{b}\in \mathbb{R}^q$ we have

$$ \frac{\boldsymbol{a}'\boldsymbol{X}\boldsymbol{b} - \mu}{\sigma} \sim t_{(\nu - q + 1)}, $$

where
$$ \mu = \boldsymbol{a}'\boldsymbol{\Lambda}\boldsymbol{b}, \qquad \sigma^2 = \frac{\boldsymbol{a}'\boldsymbol{\Sigma}_R\boldsymbol{a}\cdot \boldsymbol{b}'\boldsymbol{\Sigma}_C\boldsymbol{b}}{\nu - q + 1}. $$
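
A rough Monte Carlo check of this property, drawing $\boldsymbol{X}$ via the two-stage MNIW definition above (all values are illustrative):

```python
# Sketch: Monte Carlo check of the scalar projection property, drawing X
# from the two-stage MNIW definition of the Matrix-t.
import numpy as np
from scipy.stats import invwishart, matrix_normal, t as t_dist

rng = np.random.default_rng(4)
p, q, nu = 3, 2, 8.0
Lambda = np.zeros((p, q))
SigmaR = np.eye(p)
SigmaC = np.array([[1.0, 0.3], [0.3, 1.0]])
a, b = np.array([1.0, -1.0, 0.5]), np.array([0.7, 1.0])

vals = np.empty(10000)
for i in range(vals.size):
    V = invwishart.rvs(df=nu, scale=SigmaC, random_state=rng)
    X = matrix_normal.rvs(mean=Lambda, rowcov=SigmaR, colcov=V,
                          random_state=rng)
    vals[i] = a @ X @ b

mu = a @ Lambda @ b
sigma = np.sqrt((a @ SigmaR @ a) * (b @ SigmaC @ b) / (nu - q + 1))
z = (vals - mu) / sigma
print(z.var(), t_dist.var(nu - q + 1))   # both roughly (nu-q+1)/(nu-q-1)
```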

Random-Effects Normal Distribution

Consider the multivariate normal distribution on $q$-dimensional vectors $\boldsymbol{x}$ and $\boldsymbol{\mu}$ given by

$$ \begin{aligned} \boldsymbol{x} \mid \boldsymbol{\mu} & \sim \operatorname{Normal}(\boldsymbol{\mu}, \boldsymbol{V}) \\ \boldsymbol{\mu} & \sim \operatorname{Normal}(\boldsymbol{\lambda}, \boldsymbol{\Sigma}). \end{aligned} $$

The random-effects normal distribution is defined as the posterior distribution $\boldsymbol{\mu} \sim p(\boldsymbol{\mu} \mid \boldsymbol{x})$, which is given by

$$ \boldsymbol{\mu} \mid \boldsymbol{x} \sim \operatorname{Normal}\big(\boldsymbol{G}(\boldsymbol{x}-\boldsymbol{\lambda}) + \boldsymbol{\lambda}, \; \boldsymbol{G}\boldsymbol{V}\big), \qquad \boldsymbol{G} = \boldsymbol{\Sigma}(\boldsymbol{V} + \boldsymbol{\Sigma})^{-1}. $$

The notation for this distribution is $\boldsymbol{\mu} \sim \operatorname{RxNorm}(\boldsymbol{x}, \boldsymbol{V}, \boldsymbol{\lambda}, \boldsymbol{\Sigma})$.
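
A minimal sketch of the RxNorm parameters and a draw from the distribution (helper names are illustrative):

```python
# Sketch: the RxNorm posterior mean and variance, and one draw.
import numpy as np

def rxnorm_params(x, V, lam, Sigma):
    G = Sigma @ np.linalg.inv(V + Sigma)
    return G @ (x - lam) + lam, G @ V    # posterior mean, posterior variance

rng = np.random.default_rng(6)
V = np.array([[1.0, 0.2], [0.2, 0.5]])
Sigma = np.array([[2.0, 0.0], [0.0, 1.0]])
x, lam = np.array([1.0, -1.0]), np.zeros(2)
mean, cov = rxnorm_params(x, V, lam, Sigma)
cov = 0.5 * (cov + cov.T)                # symmetrize against round-off
print(mean, cov)
print(rng.multivariate_normal(mean, cov))
```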

Hierarchical Normal-Normal Model

The hierarchical Normal-Normal model is defined as

$$ \begin{aligned} \boldsymbol{y}_i \mid \boldsymbol{\mu}_i, \boldsymbol{\beta}, \boldsymbol{\Sigma} & \overset{\:\textrm{ind}\:}{\sim} \operatorname{Normal}(\boldsymbol{\mu}_i, \boldsymbol{V}_i) \\ \boldsymbol{\mu}_i \mid \boldsymbol{\beta}, \boldsymbol{\Sigma} & \overset{\:\textrm{ind}\:}{\sim} \operatorname{Normal}(\boldsymbol{x}_i'\boldsymbol{\beta}, \boldsymbol{\Sigma}) \\ (\boldsymbol{\beta}, \boldsymbol{\Sigma}) & \sim \operatorname{MNIW}(\boldsymbol{\Lambda}, \boldsymbol{\Omega}^{-1}, \boldsymbol{\Psi}, \nu), \end{aligned} $$

where:

  • $(\boldsymbol{y}_i)_{q\times 1}$ is the response vector for subject $i$,
  • $(\boldsymbol{\mu}_i)_{q\times 1}$ is the random effect for subject $i$,
  • $(\boldsymbol{V}_i)_{q\times q}$ is the error variance for subject $i$,
  • $(\boldsymbol{x}_i)_{p\times 1}$ is the covariate vector for subject $i$,
  • $\boldsymbol{\beta}_{p \times q}$ is the random-effects coefficient matrix,
  • $\boldsymbol{\Sigma}_{q \times q}$ is the random-effects error variance.

Let $\boldsymbol{Y}_{n\times q} = (\boldsymbol{y}_{1},\ldots,\boldsymbol{y}_{n})'$, $\boldsymbol{X}_{n\times p} = (\boldsymbol{x}_{1},\ldots,\boldsymbol{x}_{n})'$, and $\boldsymbol{\Theta}_{n \times q} = (\boldsymbol{\mu}_{1},\ldots,\boldsymbol{\mu}_{n})'$. If interest lies in the posterior distribution $p(\boldsymbol{\Theta}, \boldsymbol{\beta}, \boldsymbol{\Sigma} \mid \boldsymbol{Y}, \boldsymbol{X})$, then a Gibbs sampler can be used to cycle through the following conditional distributions:

$$ \begin{aligned} \boldsymbol{\mu}_i \mid \boldsymbol{\beta}, \boldsymbol{\Sigma}, \boldsymbol{Y}, \boldsymbol{X} & \overset{\:\textrm{ind}\:}{\sim} \operatorname{RxNorm}(\boldsymbol{y}_i, \boldsymbol{V}_i, \boldsymbol{x}_i'\boldsymbol{\beta}, \boldsymbol{\Sigma}) \\ \boldsymbol{\beta}, \boldsymbol{\Sigma} \mid \boldsymbol{\Theta}, \boldsymbol{Y}, \boldsymbol{X} & \sim \operatorname{MNIW}(\hat{\boldsymbol{\Lambda}}, \hat{\boldsymbol{\Omega}}^{-1}, \hat{\boldsymbol{\Psi}}, \hat\nu), \end{aligned} $$

where $\hat{\boldsymbol{\Lambda}}$, $\hat{\boldsymbol{\Omega}}$, $\hat{\boldsymbol{\Psi}}$, and $\hat\nu$ are obtained from the MNIW conjugate posterior formula with $\boldsymbol{Y} \gets \boldsymbol{\Theta}$ and $\boldsymbol{V} \gets \boldsymbol{I}_n$, since the second stage of the model implies $\boldsymbol{\Theta} \mid \boldsymbol{\beta}, \boldsymbol{\Sigma} \sim \operatorname{MatNorm}(\boldsymbol{X}\boldsymbol{\beta}, \boldsymbol{I}_n, \boldsymbol{\Sigma})$.
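
To make the two-block sweep concrete, here is one possible self-contained sketch that simulates a data set from the model and runs the Gibbs sampler, assuming the MNIW update with $\boldsymbol{Y} \gets \boldsymbol{\Theta}$ and $\boldsymbol{V} \gets \boldsymbol{I}_n$; all names, dimensions, and prior values are illustrative and not taken from any particular package.

```python
# Sketch: simulate one data set from the hierarchical Normal-Normal model,
# then run the two-block Gibbs sweep above.
import numpy as np
from scipy.stats import invwishart, matrix_normal

rng = np.random.default_rng(0)
n, p, q = 50, 2, 3

# --- simulate data ---------------------------------------------------------
beta_true = rng.standard_normal((p, q))
Sigma_true = np.eye(q) + 0.3
X = rng.standard_normal((n, p))
Vs = np.stack([np.eye(q) for _ in range(n)])           # V_i = I_q here
Theta_true = X @ beta_true + rng.multivariate_normal(np.zeros(q), Sigma_true, size=n)
Y = np.stack([rng.multivariate_normal(Theta_true[i], Vs[i]) for i in range(n)])

# --- MNIW prior hyperparameters --------------------------------------------
Lambda0, Omega0 = np.zeros((p, q)), np.eye(p)
Psi0, nu0 = np.eye(q), q + 2.0

# --- Gibbs sampler ----------------------------------------------------------
beta, Sigma, Theta = np.zeros((p, q)), np.eye(q), Y.copy()
beta_draws = []
for it in range(500):
    # mu_i | beta, Sigma, y_i ~ RxNorm(y_i, V_i, x_i' beta, Sigma)
    for i in range(n):
        lam = X[i] @ beta
        G = Sigma @ np.linalg.inv(Vs[i] + Sigma)
        cov = G @ Vs[i]
        cov = 0.5 * (cov + cov.T)                      # symmetrize for stability
        Theta[i] = rng.multivariate_normal(G @ (Y[i] - lam) + lam, cov)
    # beta, Sigma | Theta ~ MNIW posterior with Y <- Theta and V <- I_n
    Omega_hat = X.T @ X + Omega0
    Lambda_hat = np.linalg.solve(Omega_hat, X.T @ Theta + Omega0 @ Lambda0)
    Psi_hat = (Psi0 + Theta.T @ Theta + Lambda0.T @ Omega0 @ Lambda0
               - Lambda_hat.T @ Omega_hat @ Lambda_hat)
    nu_hat = nu0 + n
    Sigma = invwishart.rvs(df=nu_hat, scale=Psi_hat, random_state=rng)
    beta = matrix_normal.rvs(mean=Lambda_hat, rowcov=np.linalg.inv(Omega_hat),
                             colcov=Sigma, random_state=rng)
    beta_draws.append(beta)

print(np.mean(beta_draws[100:], axis=0))   # should be roughly beta_true
print(beta_true)
```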