# Solved – Canonical form representation of a Linear Gaussian CPD

This question was asked on Physics Stack Exchange but didn't get an answer, and it was suggested this would be a better place. Two years later I am wondering the same thing. Here is the question with slightly different wording:

How can a linear Gaussian conditional probability distribution be represented in canonical form?

For example, let $\mathbf{X}$ and $\mathbf{Y}$ be two sets of continuous variables, with $|\mathbf{X}| = n$ and $|\mathbf{Y}| = m$. Let

$p(\mathbf{Y} \mid \mathbf{X}) = \mathcal{N}(\mathbf{Y} \mid \mathbf{a} + B\mathbf{X};\ C)$

where $\mathbf{a}$ is a vector of dimension $m$, $B$ is an $m \times n$ matrix, and $C$ is an $m \times m$ matrix.
How does one represent that in canonical form?
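To make the setup concrete, here is a minimal sketch (NumPy; the sizes and parameter values are made up for illustration) of sampling from such a linear Gaussian CPD:

```python
import numpy as np

rng = np.random.default_rng(42)
n, m = 3, 2  # |X| = n, |Y| = m (illustrative sizes)

# Made-up CPD parameters: a is m-dimensional, B is m x n, C is m x m SPD.
a = rng.standard_normal(m)
B = rng.standard_normal((m, n))
L = rng.standard_normal((m, m))
C = L @ L.T + m * np.eye(m)  # shift to guarantee positive definiteness

def sample_y_given_x(x, rng):
    """Draw Y ~ N(a + B x, C), i.e. the linear Gaussian CPD p(Y | X)."""
    return rng.multivariate_normal(a + B @ x, C)

x = rng.standard_normal(n)
y = sample_y_given_x(x, rng)
print(y.shape)  # (2,)
```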

This is puzzling me particularly since a linear Gaussian is not necessarily a Gaussian probability distribution.
The canonical representation of a Gaussian has
$K = \Sigma^{-1}$ and $\mathbf{h} = \Sigma^{-1} \boldsymbol{\mu}$.
How can one have a $K$ and $\mathbf{h}$ for something that is not a Gaussian?
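For a proper Gaussian the conversion is straightforward; a minimal sketch (NumPy, with illustrative values) of going from moment form to canonical form and back:

```python
import numpy as np

# Moment-form parameters of a 2-D Gaussian (illustrative values).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

# Canonical (information) form: K = Sigma^{-1}, h = Sigma^{-1} mu.
K = np.linalg.inv(Sigma)
h = K @ mu

# Round trip back to moment form recovers mu and Sigma.
mu_back = np.linalg.solve(K, h)
Sigma_back = np.linalg.inv(K)

print(np.allclose(mu, mu_back), np.allclose(Sigma, Sigma_back))  # both True
```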


I have an answer, found with help from two technical reports (I can only post one link; I will post the other one in comments) [1], [2]. The report in [1] only showed a univariate Gaussian. Here is my attempt at the multivariate case.

The basic idea is to use Bayes' law: $p(Y \mid X) = \frac{p(Y, X)}{p(X)}$
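For reference, a canonical factor has the form $\mathcal{C}(K, \mathbf{h}, g) = \exp\left(-\tfrac{1}{2}\mathbf{x}^T K \mathbf{x} + \mathbf{h}^T \mathbf{x} + g\right)$, and such factors divide by subtracting their parameters component-wise (after padding both to a common scope), which is what makes this route work. A short statement of the rule:

```latex
% Division of canonical-form factors: parameters subtract component-wise.
\frac{\mathcal{C}(K_1, \mathbf{h}_1, g_1)}{\mathcal{C}(K_2, \mathbf{h}_2, g_2)}
  = \mathcal{C}\!\left(K_1 - K_2,\ \mathbf{h}_1 - \mathbf{h}_2,\ g_1 - g_2\right)
```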

We know from [2] that the joint distribution of the linear Gaussian is:

$p(X, Y) = \mathcal{N}\left( \begin{pmatrix} \boldsymbol{\mu}_X \\ B \boldsymbol{\mu}_X + \mathbf{a} \end{pmatrix}, \Sigma_{X,Y} \right)$

With the process-noise covariance $\Sigma_{w}$, we have

$\Sigma_{X,Y} = \begin{pmatrix} B^T \Sigma_{w}^{-1} B + \Sigma_{X}^{-1} & -B^T \Sigma_{w}^{-1} \\ -\Sigma_{w}^{-1} B & \Sigma_{w}^{-1} \end{pmatrix}^{-1} = \begin{pmatrix} \Sigma_{X} & \Sigma_{X} B^{T} \\ B \Sigma_{X} & \Sigma_{w} + B \Sigma_{X} B^T \end{pmatrix}$
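As a sanity check, the block-matrix identity above can be verified numerically; this sketch uses randomly generated SPD covariances (illustrative values, not from the reports):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2  # |X| = n, |Y| = m

# Random SPD covariances for X and the process noise w (made-up values).
A1 = rng.standard_normal((n, n))
Sigma_X = A1 @ A1.T + n * np.eye(n)
A2 = rng.standard_normal((m, m))
Sigma_w = A2 @ A2.T + m * np.eye(m)
B = rng.standard_normal((m, n))

Kw = np.linalg.inv(Sigma_w)
KX = np.linalg.inv(Sigma_X)

# Left-hand side: inverse of the joint precision (block form from the text).
J = np.block([[B.T @ Kw @ B + KX, -B.T @ Kw],
              [-Kw @ B,           Kw]])
lhs = np.linalg.inv(J)

# Right-hand side: joint covariance in moment form.
rhs = np.block([[Sigma_X,     Sigma_X @ B.T],
                [B @ Sigma_X, Sigma_w + B @ Sigma_X @ B.T]])

print(np.allclose(lhs, rhs))  # True
```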

Now, to get $p(Y \mid X)$, we divide by $p(X)$, which in canonical form has

$K_X = \Sigma_{X}^{-1}$ and $\mathbf{h}_X = K_X \boldsymbol{\mu}_X$

Dividing it out gives us:

$K_{Y|X} = \begin{pmatrix} B^T \Sigma_{w}^{-1} B & -B^T \Sigma_{w}^{-1} \\ -\Sigma_{w}^{-1} B & \Sigma_{w}^{-1} \end{pmatrix}$, $\mathbf{h}_{Y|X} = \mathbf{0}$, $g_{Y|X} = -\log\left((2\pi)^{m/2} |\Sigma_{w}|^{1/2}\right)$

with $m$ the dimension of $\mathbf{Y}$.

Note that I went with zero-mean process noise and also assumed $\mathbf{a}$ to be zero.

The result in canonical form is not a valid Gaussian on its own, since the conditional's $K$ matrix is not invertible (it has rank $m$, not $n + m$). Multiplying it with $p(X)$, however, gives a valid Gaussian, as one would expect.
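The rank argument can be checked numerically; this sketch (NumPy, with made-up SPD covariances) confirms that the conditional's $K$ is rank-deficient, and that adding $K_X$ to its $X$ block restores a full-rank precision matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2  # |X| = n, |Y| = m

# Made-up SPD covariances for the process noise and for X.
A1 = rng.standard_normal((m, m))
Sigma_w = A1 @ A1.T + m * np.eye(m)
A2 = rng.standard_normal((n, n))
Sigma_X = A2 @ A2.T + n * np.eye(n)
B = rng.standard_normal((m, n))

Kw = np.linalg.inv(Sigma_w)

# Canonical K of the conditional p(Y|X) (zero-mean noise, a = 0).
K_cond = np.block([[B.T @ Kw @ B, -B.T @ Kw],
                   [-Kw @ B,       Kw]])

# K_cond factors as M^T Kw M with M = [B, -I], so its rank is only m
# over n + m dimensions: not a normalizable Gaussian over (X, Y).
rank = np.linalg.matrix_rank(K_cond)
print(rank)  # 2, i.e. m

# Multiplying by p(X) adds K_X = Sigma_X^{-1} to the X block,
# which restores a full-rank (positive definite) joint precision.
K_joint = K_cond.copy()
K_joint[:n, :n] += np.linalg.inv(Sigma_X)
print(np.linalg.matrix_rank(K_joint))  # 5, i.e. n + m
```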
