Does linear discriminant analysis always project the points to a line? Most of the graphical illustrations of LDA that I see online use an example of 2-dimensional points that are projected onto a straight line $y = mx + c$. If the points were each a 10-dimensional vector, does LDA still project them onto a line? Or would it project them onto a hyperplane with 9 dimensions or fewer?
Another question about projections: suppose I have a vector $Y = [a, b, c, d]$. The projection of this vector onto a given line is the product of the line's direction vector $V$ and the vector $Y$. This is equivalent to the dot product $V^{T} Y$, which gives just one number (a scalar).
This seems to be how LDA works. So, if I may ask, does LDA map a full $n$-dimensional vector onto a scalar (a single number)?
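For concreteness, here is a minimal NumPy sketch of that dot-product projection (the numbers are arbitrary, and $V$ is chosen to be a unit vector):

```python
import numpy as np

Y = np.array([1.0, 2.0, 3.0, 4.0])   # a 4-dimensional vector
V = np.array([0.5, 0.5, 0.5, 0.5])   # unit direction vector of the line

# Projection of Y onto the line spanned by V: a single scalar, V^T Y
scalar = V @ Y
print(scalar)                        # 5.0
```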
Apologies in advance for my newbie question.
Best Answer
LDA seeks to reduce dimensionality while preserving as much of the class-discriminatory information as possible. Assume we have a set of $d$-dimensional observations $X$, belonging to $C$ different classes. The goal of LDA is to find a linear transformation (projection) matrix $L$ that maps the set of labelled observations $X$ into another coordinate system $Y$ such that the class separability is maximized. The dataset is transformed into the new subspace as:
\begin{equation} Y = XL \end{equation}
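As a minimal sketch of the shapes involved (the matrix $L$ here is random, not an actual LDA solution):

```python
import numpy as np

n, d, C = 150, 10, 3       # 150 observations, 10 dimensions, 3 classes
k = C - 1                  # at most C - 1 useful discriminant directions

X = np.random.randn(n, d)  # data matrix, one observation per row
L = np.random.randn(d, k)  # projection matrix (columns = discriminant directions)

Y = X @ L                  # transformed data
print(Y.shape)             # (150, 2)
```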
The columns of the matrix $L$ are a subset of the eigenvectors (non-orthogonal in general) corresponding to the $C-1$ largest eigenvalues of the square matrix $J$, obtained as:
\begin{equation} J = S_{W}^{-1} S_B \end{equation}
where $S_W$ and $S_B$ are the within-class and between-class scatter matrices, respectively.
When it comes to dimension reduction in LDA, if some eigenvalues have a significantly larger magnitude than others, then we might be interested in keeping only those dimensions, since they contain more information about our data distribution. This becomes particularly interesting because $S_B$ is the sum of $C$ matrices of rank $\leq 1$, and the class mean vectors are constrained by $\frac{1}{C}\sum_{i=1}^C \mu_i = \mu$ (Rao, 1948). Therefore, $S_B$ will be of rank $C-1$ or less, meaning that there are only $C-1$ eigenvalues that can be non-zero. For this reason, even if the dimensionality $k$ of the subspace $Y$ can be chosen arbitrarily, it does not make sense to keep more than $C-1$ dimensions, as they will not carry any useful information. In fact, in LDA the smallest $d - (C-1)$ eigenvalues are zero, and therefore the subspace $Y$ should have exactly $k = C-1$ dimensions. See the sketch below.
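Putting the pieces together, here is a rough NumPy sketch of the procedure on made-up data (a minimal illustration, not an optimized or numerically robust implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: C = 3 classes of d = 10-dimensional observations
d, C, n_per_class = 10, 3, 50
X = np.vstack([rng.normal(loc=i, size=(n_per_class, d)) for i in range(C)])
y = np.repeat(np.arange(C), n_per_class)

mu = X.mean(axis=0)            # overall mean
Sw = np.zeros((d, d))          # within-class scatter matrix
Sb = np.zeros((d, d))          # between-class scatter matrix
for c in range(C):
    Xc = X[y == c]
    mu_c = Xc.mean(axis=0)
    Sw += (Xc - mu_c).T @ (Xc - mu_c)
    diff = (mu_c - mu).reshape(-1, 1)
    Sb += len(Xc) * (diff @ diff.T)

# Eigen-decomposition of J = Sw^{-1} Sb, sorted by decreasing eigenvalue
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
order = np.argsort(eigvals.real)[::-1]
eigvals, eigvecs = eigvals.real[order], eigvecs.real[:, order]

# Only the first C - 1 = 2 eigenvalues are non-zero (the rest are numerically ~0)
print(np.round(eigvals, 4))

L = eigvecs[:, :C - 1]         # keep the C - 1 informative directions
Y = X @ L                      # projected data, shape (150, 2)
print(Y.shape)
```

So for the original question: with 10-dimensional inputs and 3 classes, the data are projected onto a 2-dimensional subspace, not a line; with only 2 classes, $C-1 = 1$ and each observation is indeed mapped to a single scalar.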