I am dealing with a 2-class LDA classification problem.
During the test phase (after training), I'm trying to project a feature vector into the lower-dimensional space.
How do we get the projected test feature vector?
Is it
- $Y = (X - \text{mean}) \, W$
- $Y = X W$
Which of the above is correct? ($X$ is the feature vector, $W$ is the weight vector obtained during training, and $Y$ is the resulting projected vector.)
Best Answer
(You probably found out by now, but in case someone else needs this:)
Centering the data is independent of the projection (LDA projects into an $n_\text{classes} - 1$ dimensional space, and it doesn't matter at all whether this is one or more dimensions).
Generally speaking, translation (i.e. using a different center) doesn't change the predictions of an LDA model, since they depend on distances between the classes in LD space. This means implementations of LDA are free to choose whatever centering they prefer, so whether, and exactly which, center you need to subtract depends on the implementation of LDA you use.
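To see why, note that subtracting a fixed center $\mu$ before projecting only shifts every score by the same constant $\mu W$, so all pairwise distances in LD space stay the same. A minimal sketch on iris (the global column means here are just one arbitrary choice of center, not necessarily what any given implementation uses):

```r
library(MASS)

fit <- lda(Species ~ ., data = iris)
W   <- fit$scaling                  # projection matrix learned in training

X <- as.matrix(iris[, -5])
Y_raw      <- X %*% W                                        # no centering
Y_centered <- scale(X, center = TRUE, scale = FALSE) %*% W   # center by column means

# The two projections differ only by the constant offset mu %*% W,
# so every pairwise distance in LD space is identical:
head(Y_raw - Y_centered)                                  # each row is the same constant
all.equal(as.vector(dist(Y_raw)), as.vector(dist(Y_centered)))  # TRUE
```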
As an example, MASS::lda in R uses the mean of the class means, weighted by the prior probabilities:

```r
means <- colSums(prior * object$means)
scaling <- object$scaling
x <- scale(x, center = means, scale = FALSE) %*% scaling
```
To test this:

```r
> library(MASS)
> LDA <- lda(Species ~ ., data = iris, prior = c(.8, .1, .1))
> plot(predict(LDA)$x, asp = 1)
> LDscores <- scale(iris[, -5], center = colSums(LDA$prior * LDA$means), scale = FALSE) %*% LDA$scaling
> points(LDscores, pch = 20, col = 2)
> summary(LDscores - predict(LDA)$x)
      LD1         LD2
 Min.   :0   Min.   :0
 1st Qu.:0   1st Qu.:0
 Median :0   Median :0
 Mean   :0   Mean   :0
 3rd Qu.:0   3rd Qu.:0
 Max.   :0   Max.   :0
```
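The same recipe answers the original question: for a held-out feature vector, subtract the implementation's center and multiply by the scaling, i.e. $Y = (X - \text{mean}) \, W$ with the mean as defined by that implementation. A sketch continuing the session above (the three iris rows here stand in for hypothetical unseen test data):

```r
# Hypothetical held-out rows (same feature columns as training):
newX <- as.matrix(iris[c(1, 51, 101), -5])

# Y = (X - mean) %*% W, with MASS's prior-weighted mean of the class means:
mu <- colSums(LDA$prior * LDA$means)
Y  <- scale(newX, center = mu, scale = FALSE) %*% LDA$scaling

# Matches what predict() computes internally (should be TRUE):
all.equal(Y, predict(LDA, iris[c(1, 51, 101), ])$x, check.attributes = FALSE)
```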