Solved – Raising a variance-covariance matrix to a negative half power

I want to implement the following formula where $C$ is variance-covariance matrix of variables x, y, and z:

$$C = \begin{bmatrix}\operatorname{cov}(x,x)&\operatorname{cov}(x,y)&\operatorname{cov}(x,z)\\\operatorname{cov}(y,x)&\operatorname{cov}(y,y)&\operatorname{cov}(y,z)\\\operatorname{cov}(z,x)&\operatorname{cov}(z,y)&\operatorname{cov}(z,z)\end{bmatrix}$$

$$S = C^{-1/2}$$

I understand the diagonal and inverse operations, but I am unclear on the meaning of raising a variance-covariance matrix to a negative half power. Thus, my questions:

  1. What does it mean to raise a covariance matrix to a negative half power?
  2. What general ideas of linear algebra does this assume?
  3. Is there any nice formula to raise a covariance matrix to a negative half power?

What the operation $C^{-\frac{1}{2}}$ refers to is the decorrelation of the underlying sample into uncorrelated components; $C^{-\frac{1}{2}}$ is used as a whitening matrix. This is a natural operation when we want to analyse each column/source of the original data matrix $A$ (having a covariance matrix $C$) through an uncorrelated matrix $Z$. The most common way of implementing such whitening is through the Cholesky decomposition (where we use $C = LL^T$; see this thread for an example with "colouring" a sample), but here we use the slightly less common Mahalanobis whitening (where we use $C = C^{0.5} C^{0.5}$). The whole operation in R would go a bit like this:

    set.seed(323)
    N <- 10000; p <- 3
    # Define the real C
    ( C <- base::matrix( data = c(4,2,1,2,3,2,1,2,3), ncol = 3, byrow = TRUE) )
    # Generate the uncorrelated data (ground truth)
    Z <- base::matrix( ncol = 3, rnorm(N*p) )
    # Estimate the colouring matrix C^0.5
    CSqrt <- expm::sqrtm(C)
    # "Colour" the data / usually we use Cholesky (LL^T) but using C^0.5 is valid too
    A <- t( CSqrt %*% t(Z) )
    # Get the sample estimate of C
    ( CEst <- round( digits = 2, cov( A )) )
    # Estimate the whitening matrix C^-0.5
    CEstInv <- expm::sqrtm(solve(CEst))
    # Whiten the data
    ZEst <- t( CEstInv %*% t(A) )
    # Check that indeed we have whitened the data
    ( round( digits = 1, cov( cbind(ZEst, Z) ) ) )
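For comparison, the Cholesky route mentioned above can be sketched as follows. This is an illustrative addition (not part of the original answer); it regenerates a coloured sample with the same seed and true $C$ so the snippet is self-contained:

```r
# Illustrative sketch: whitening via the Cholesky factor instead of C^-0.5.
# If C = L L^T (L lower triangular), then solve(L) %*% x decorrelates x.
set.seed(323)
N <- 10000; p <- 3
C <- matrix(c(4, 2, 1, 2, 3, 2, 1, 2, 3), ncol = 3, byrow = TRUE)
Z <- matrix(rnorm(N * p), ncol = p)
A <- Z %*% chol(C)                # colour the data: chol() returns U with C = U^T U
LEst <- t(chol(cov(A)))           # lower-triangular factor of the sample covariance
ZEst <- t(solve(LEst) %*% t(A))   # whiten the data
round(cov(ZEst), 1)               # approximately the 3 x 3 identity matrix
```

Both whitening matrices give uncorrelated components; they differ only by a rotation, since any matrix $W$ with $W C W^T = I$ is a valid whitener.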

So to succinctly answer the question raised:

  1. It means that we can decorrelate the sample $A$ associated with that covariance matrix $C$ in such a way that we get uncorrelated components. This is commonly referred to as whitening.
  2. The general linear algebra idea it assumes is that a (covariance) matrix can act as a projection operator (to generate a correlated sample by "colouring"), and so can its inverse (to decorrelate/"whiten" a sample).
  3. Yes; the easiest way to raise a valid covariance matrix to any power (the negative square root is just a special case) is by using its eigendecomposition $C = V \Lambda V^T$, where $V$ is an orthonormal matrix holding the eigenvectors of $C$ and $\Lambda$ is a diagonal matrix holding the eigenvalues. We can then readily raise the diagonal matrix $\Lambda$ to whatever power we wish and get the relevant result.

A small code snippet showcasing point 3.

    # Get the eigendecomposition of the covariance matrix
    myEigDec <- eigen(cov(A))
    # Use the eigendecomposition to get the inverse square root
    myEigDec$vectors %*% diag( 1/sqrt( myEigDec$values ) ) %*% t(myEigDec$vectors)
    # Use the eigendecomposition to get the "negative half power" (same as above)
    myEigDec$vectors %*% diag( ( myEigDec$values )^(-0.5) ) %*% t(myEigDec$vectors)
    # And to confirm with the R library expm
    solve(expm::sqrtm(cov(A)))
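As a quick sanity check (an illustrative addition, not part of the original answer), one can verify numerically that the eigendecomposition-based $C^{-0.5}$ really behaves like a negative half power, i.e. $C^{-0.5} C^{-0.5} = C^{-1}$ and $C^{-0.5} C\, C^{-0.5} = I$:

```r
# Illustrative sketch: check the matrix-power identities for C^-0.5
# computed via the eigendecomposition C = V Lambda V^T.
C <- matrix(c(4, 2, 1, 2, 3, 2, 1, 2, 3), ncol = 3, byrow = TRUE)
eig <- eigen(C)
CInvSqrt <- eig$vectors %*% diag(eig$values^(-0.5)) %*% t(eig$vectors)
max(abs(CInvSqrt %*% CInvSqrt - solve(C)))       # ~ 0: squaring gives C^-1
max(abs(CInvSqrt %*% C %*% CInvSqrt - diag(3)))  # ~ 0: C^-0.5 C C^-0.5 = I
```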
