I am reading A Tutorial on Principal Component Analysis by Shlens, 2014, and it mentions these two notions: "second-order dependencies" and "higher order dependencies". I could not find any clear explanation of them. What do they mean?
The goal of the analysis is to decorrelate the data, or said in other terms, the goal is to remove second-order dependencies in the data. In the data sets of Figure 6, higher order dependencies exist between the variables. Therefore, removing second-order dependencies is insufficient at revealing all structure in the data.
Best Answer
PCA is based on variances and covariances, $\mathrm{E}[x_i x_j]$ (assuming zero-mean variables). These measure second-order dependencies because the data enter as terms of order 2. After PCA, the principal components have zero covariance with one another, so the second-order dependencies have been removed. Higher-order dependencies may still exist, however, e.g. $\mathrm{E}[x_i x_j x_k] \neq 0$ for some $i$, $j$, and $k$. Because PCA removes second-order dependencies by applying a linear transform, it in a way "reveals" them in the form of that transform, but it does not "reveal" higher-order dependencies.
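To make this concrete, here is a minimal sketch in Python with NumPy. The data set is an assumption made purely for illustration (points scattered around a parabola, not taken from the tutorial), and the variable names are hypothetical. It runs PCA through an eigendecomposition of the covariance matrix, then compares a second-order moment of the principal components with a third-order one.

```python
import numpy as np

# Synthetic data (an illustrative assumption): y depends on x only through x**2,
# so x and y are nearly uncorrelated yet strongly dependent.
rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(-1.0, 1.0, size=n)
y = x**2 + 0.05 * rng.normal(size=n)

data = np.column_stack([x, y])
data = data - data.mean(axis=0)  # make the variables zero-mean

# PCA via eigendecomposition of the sample covariance matrix.
cov = np.cov(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # sort by decreasing variance
pcs = data @ eigvecs[:, order]           # principal component scores
pc1, pc2 = pcs[:, 0], pcs[:, 1]

# Second-order dependency: E[pc1 * pc2], zero up to rounding after PCA.
second_order = np.mean(pc1 * pc2)

# One possible higher-order dependency: E[pc1^2 * pc2].
third_order = np.mean(pc1**2 * pc2)

print("E[pc1 * pc2]   =", second_order)
print("E[pc1^2 * pc2] =", third_order)
```

In this sketch the second-order moment vanishes up to floating-point error, while the third-order moment stays clearly away from zero: the components are decorrelated but not independent. The sign of the third-order moment depends on the arbitrary sign of the eigenvectors, so only its magnitude is meaningful.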