For random variables $X,Y\in\mathbb{R}$ we say that they are orthogonal if $E(XY)=0$ and uncorrelated if $E((X-E(X))(Y-E(Y)))=0$. In what follows I assume all random variables are centered, so orthogonal and uncorrelated mean the same thing.

Consider the case $X,Y\in\mathbb{R}^i$ for some $i>1$. Does orthogonal/uncorrelated mean that the expected value of their inner product is zero, that is:

\begin{align}\tag{1}
E(X^TY)=0? \qquad\text{[Note that }X^TY\in\mathbb{R}\text{ so } 0\in\mathbb{R}\text{]}
\end{align}

Or does it mean that the cross-covariance matrix is zero, that is:

\begin{align}\tag{2}
E(XY^T)=0?\qquad\text{[Note that }XY^T\in\mathbb{R}^{i\times i}\text{ so } 0\in\mathbb{R}^{i\times i}\text{]}
\end{align}

Definition (2) says that every coordinate of $X$ is uncorrelated with every coordinate of $Y$. Moreover, (2) implies (1), because $E(X^TY)$ is the trace of $E(XY^T)$, but (1) does not imply (2).
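The relationship between the two definitions can be checked numerically. This is a minimal sketch using numpy and Monte Carlo estimates; the standard-normal vectors in $\mathbb{R}^3$ are an illustrative choice, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative centered random vectors in R^3 (distribution is an assumption).
n = 100_000
X = rng.normal(size=(n, 3))
Y = rng.normal(size=(n, 3))

# Definition (1): estimate of the scalar E[X^T Y].
inner = np.mean(np.einsum('ni,ni->n', X, Y))

# Definition (2): estimate of the matrix E[X Y^T].
cross = (X.T @ Y) / n

# E[X^T Y] equals the trace of E[X Y^T], which is why (2) implies (1).
assert np.isclose(inner, np.trace(cross))
```

If the matrix in (2) vanishes, its trace (definition (1)) vanishes too; the converse fails, as the example later in the answer shows.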

I lean towards (2) as the more natural definition. For one thing, I have seen people describe $X\in\mathbb{R}^i$ and $Y\in\mathbb{R}^j$ as uncorrelated/orthogonal even when $i\neq j$, in which case definition (1) breaks down.


#### Best Answer

**Correlation and orthogonality, although closely related concepts, are not the same thing.** This question is confusing because **both answers are correct,** depending on which version of the question is understood!

**"Orthogonal"** is *always* understood in mathematics to be a relation relative to an inner product. In particular, an inner product associates a *scalar* to ordered pairs of vectors. In the question the vectors are random variables like $X$ and $Y$ having values in $\mathbb{R}^i.$ They qualify as "vectors" in the abstract sense that we can (a) sum any two of them and (b) multiply any one of them by any scalar in a way that satisfies the usual axioms of vector spaces. Thus, *only (1) can possibly be considered as a definition of "orthogonal,"* because it alone of (1) and (2) concerns a possible inner product. It's straightforward to show that the map $(X,Y)\to E[X^\prime Y]$ is indeed an inner product (on the space of square-integrable equivalence classes of random variables).

Notice that this definition requires $X$ and $Y$ to have values in a common vector space $\mathbb{R}^i.$

**On the other hand, "uncorrelated"** would typically be taken to mean that *each component of $X$ is uncorrelated with each component of $Y.$* Equivalently, the covariance of each component of $X$ with each component of $Y$ is zero. This is what (2) states. One justification for this interpretation is that "uncorrelated" ought to refer to a *correlation matrix.* As always, a correlation matrix is obtained from a covariance matrix. The covariance matrix for the concatenated vector $(X,Y)$ contains four blocks: one is the variance-covariance matrix of $X;$ another is the variance-covariance matrix of $Y;$ and the other two (which are transposes of each other) give the cross-covariances $E[XY^\prime] = E[YX^\prime]^\prime.$ Thus, *"uncorrelated" is a statement about the structure of the covariance matrix of the vector-valued random variable $(X,Y).$*
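The block structure described above is easy to exhibit numerically. In this sketch the dimensions (3 and 2) and the standard-normal distributions are illustrative assumptions, not taken from the answer:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative example: X takes values in R^3, Y in R^2.
n = 50_000
X = rng.normal(size=(n, 3))
Y = rng.normal(size=(n, 2))

# Covariance matrix of the concatenated vector (X, Y): a 5x5 matrix
# containing four blocks.
C = np.cov(np.hstack([X, Y]), rowvar=False)

cov_XX = C[:3, :3]   # variance-covariance matrix of X  (3x3)
cov_YY = C[3:, 3:]   # variance-covariance matrix of Y  (2x2)
cov_XY = C[:3, 3:]   # cross-covariances                (3x2)
cov_YX = C[3:, :3]   # their transpose                  (2x3)

# The two off-diagonal blocks are transposes of each other.
assert np.allclose(cov_XY, cov_YX.T)
```

"Uncorrelated" in sense (2) says exactly that the two off-diagonal blocks are zero.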

Notice that this latter sense does not require $X$ and $Y$ to have the same dimensions. For instance, $X$ could have values in $\mathbb{R}^3$ and $Y$ have values in $\mathbb{R}^2.$ The cross-covariance matrices are $3\times 2$ and $2\times 3$ matrices.

**Finally,** to drive the point home, let's exhibit an example of orthogonal but correlated random vectors. Let $Z$ be a (scalar) random variable with unit variance $\operatorname{Var}(Z)=1.$ Define $X=(Z,Z)^\prime$ and $Y=(Z,-Z)^\prime.$ Then

$$E[X^\prime Y] = E\left[\pmatrix{Z&Z}\pmatrix{Z\\-Z}\right] = E[Z^2-Z^2] = 0$$

demonstrates orthogonality, yet

$$E[X Y^\prime] = E\left[\pmatrix{Z\\Z}\pmatrix{Z&-Z}\right] = \pmatrix{E[Z^2] & E[Z(-Z)]\\E[Z^2]& E[Z(-Z)]} = \pmatrix{1 & -1\\1 & -1}$$

demonstrates correlation. (Indeed, because all the components of $X$ and $Y$ have unit variances, this *is* the cross-correlation matrix of $X$ and $Y.$)
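A quick simulation confirms both calculations. This sketch assumes numpy and takes $Z$ to be standard normal (the answer only requires unit variance):

```python
import numpy as np

rng = np.random.default_rng(2)

# The answer's example: Z with unit variance (standard normal assumed here),
# X = (Z, Z)', Y = (Z, -Z)'.
n = 200_000
Z = rng.normal(size=n)
X = np.stack([Z, Z], axis=1)
Y = np.stack([Z, -Z], axis=1)

# Orthogonality: X'Y = Z^2 - Z^2 is identically zero.
inner = np.einsum('ni,ni->n', X, Y)
assert np.allclose(inner, 0)

# Yet the cross-moment matrix E[X Y'] is far from zero:
cross = (X.T @ Y) / n
print(np.round(cross, 2))   # approximately [[1, -1], [1, -1]]
```

The sample cross-moment matrix matches the $\pmatrix{1 & -1\\1 & -1}$ computed above up to simulation error.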

Because I do not want to leave anyone with the mistaken impression that "orthogonal" means $X^\prime Y=0$ almost surely (as is the case in the preceding example), introduce a second variable $U,$ independent of $Z,$ for which $\Pr(U=1)=\Pr(U=-3)=1/2.$ Set $X=(Z,Z)^\prime$ and $Y=(Z,UZ)^\prime.$ Now $X^\prime Y = Z^2 + UZ^2 = (1+U)Z^2.$ Half the time this equals $2Z^2$ and the other half of the time it equals $-2Z^2,$ so on average the value is zero: $X$ and $Y$ are orthogonal. Yet, provided $Z$ is not almost surely zero, there is a positive chance that $X^\prime Y$ is nonzero.
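This second example can be simulated too. Again numpy and a standard-normal $Z$ are assumptions of the sketch, not of the answer:

```python
import numpy as np

rng = np.random.default_rng(3)

# The answer's second example: U independent of Z with
# Pr(U=1) = Pr(U=-3) = 1/2; X = (Z, Z)', Y = (Z, U*Z)'.
n = 400_000
Z = rng.normal(size=n)
U = rng.choice([1.0, -3.0], size=n)

# X'Y = (1 + U) Z^2, which is 2 Z^2 or -2 Z^2 with equal probability.
inner = (1 + U) * Z**2

print(inner.mean())          # near 0: X and Y are orthogonal on average
print((inner != 0).mean())   # ...but X'Y is almost never exactly zero
```

Orthogonality here is a statement about the *expectation* of $X^\prime Y,$ not about the random variable $X^\prime Y$ itself.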
