Does linear discriminant analysis always project the points to a line? Most of the graphical illustrations of LDA that I see online use an example of 2-dimensional points which are projected onto a straight line y=mx+c. If the points were each a 10-dimensional vector, does LDA still project them to a line?

Or would it project them onto a hyperplane with 9 or fewer dimensions?

Another question about projections: If I have a vector Y=[a,b,c,d], the projection of this vector onto a given line is the dot product of the line's direction vector V with the vector Y, given by transpose(V).Y, which yields just one number (a scalar).

This seems to be how LDA works. So, if I may ask, does LDA map a full n-dimensional vector onto a scalar (a single number)?
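The dot-product projection described above can be sketched in Python (the numbers below are made up for illustration):

```python
import numpy as np

Y = np.array([1.0, 2.0, 3.0, 4.0])  # the 4-dimensional vector [a, b, c, d]
V = np.array([0.5, 0.5, 0.5, 0.5])  # unit direction vector of the line
projection = V @ Y                  # transpose(V).Y -- a single scalar
print(projection)  # 5.0
```

Projecting onto a single direction does indeed collapse the whole vector to one number.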

Apologies in advance for my newbie question.


#### Best Answer

LDA seeks to reduce dimensionality while preserving as much of the class discriminatory information as possible. Assume we have a set of $d$-dimensional observations $X$ belonging to $C$ different classes. The goal of LDA is to find a linear transformation (projection) matrix $L$ that converts the set of labelled observations $X$ into another coordinate system $Y$ such that the class separability is maximized. The dataset is transformed into the new subspace as:

\begin{equation} Y = XL \end{equation}

The columns of the matrix $L$ are the eigenvectors of the square matrix $J$ associated with its $C-1$ largest eigenvalues (these eigenvectors are, in general, not orthogonal). $J$ is obtained as:

\begin{equation} J = S_{W}^{-1} S_B \end{equation}

where $S_W$ and $S_B$ are the within-class and between-class scatter matrices, respectively.
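As a sketch of how these scatter matrices are built, the following toy example (invented data, equal class sizes, assuming NumPy) computes $S_W$, $S_B$, and the eigenvalues of $J$:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: C = 3 classes with d = 4 features, 20 points per class
classes = [rng.normal(loc=m, size=(20, 4)) for m in (0.0, 2.0, 4.0)]
X = np.vstack(classes)
mu = X.mean(axis=0)                 # overall mean

d = X.shape[1]
S_W = np.zeros((d, d))              # within-class scatter
S_B = np.zeros((d, d))              # between-class scatter
for Xc in classes:
    mu_c = Xc.mean(axis=0)
    S_W += (Xc - mu_c).T @ (Xc - mu_c)
    S_B += len(Xc) * np.outer(mu_c - mu, mu_c - mu)

J = np.linalg.inv(S_W) @ S_B        # the matrix whose top eigenvectors form L
eigvals = np.linalg.eigvals(J)
print(np.sum(np.abs(eigvals) > 1e-8))  # count of non-zero eigenvalues: 2 = C-1
```

Even though $J$ is $4 \times 4$, only $C-1 = 2$ of its eigenvalues are (numerically) non-zero, which anticipates the rank argument below.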

When it comes to dimensionality reduction in LDA, if some eigenvalues have a significantly larger magnitude than the others, we may want to keep only those dimensions, since they contain more information about our data distribution. This becomes particularly interesting because $S_B$ is the sum of $C$ matrices of rank $\leq 1$, and the class mean vectors are constrained by $\frac{1}{C}\sum_{i=1}^C \mu_i = \mu$ (Rao, 1948). Therefore, $S_B$ has rank $C-1$ or less, meaning that at most $C-1$ eigenvalues are non-zero. For this reason, even though the dimensionality $k$ of the subspace $Y$ can be chosen arbitrarily, it makes no sense to keep more than $C-1$ dimensions, as the extra ones carry no useful information. In fact, in LDA the smallest $d-(C-1)$ eigenvalues are exactly zero, so the subspace $Y$ should have exactly $k = C-1$ dimensions.
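This can be checked directly with scikit-learn's `LinearDiscriminantAnalysis` (assuming scikit-learn is installed): on the iris data ($d = 4$, $C = 3$), the transformed data has exactly $C-1 = 2$ dimensions:

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)    # 150 samples, 4 features, 3 classes
lda = LinearDiscriminantAnalysis()   # n_components defaults to min(C-1, d)
Y = lda.fit_transform(X, y)
print(Y.shape)  # (150, 2)
```

So to answer the original question: a 10-dimensional dataset with $C$ classes is projected onto a subspace of at most $C-1$ dimensions, which is a line only when $C = 2$.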

### Similar Posts:

- Solved – How can eigenfaces (PCA eigenvectors on face image data) be displayed as images
- Solved – Why decision boundary is of (D-1) dimensions
- Solved – Difference between big data and high dimensional data
- Solved – Dimensionally weighted distance between two points in n-dimensional space