I am fitting a partial least squares (PLS) regression for one categorical factor with two levels (`be` or `nottobe`) using the `pls` package in R. I apply the `round()` function to the predicted values to decide whether each result belongs to the first or the second level of my factor. Does this approach sound correct?

```r
library(pls)

# Artificial data
T <- as.factor(sort(rep(c("be", "nottobe"), 100)))
y1 <- c(rnorm(100, 1, 0.1), rnorm(100, 1, 0.1))
y2 <- c(rnorm(100, 10, 0.3), rnorm(100, 10, 0.6))
y3 <- c(rnorm(100, 10, 2.3), rnorm(100, 11, 2.6))
y4 <- c(rnorm(100, 5, 0.5), rnorm(100, 7, 0.5))
y5 <- c(rnorm(100, 0, 0.1), rnorm(100, 0, 0.1))

# Create the data frame (factor levels coded as 1 = "be", 2 = "nottobe")
avaliacao <- as.numeric(T)
espectro <- cbind(y1, y2, y3, y4, y5)
dados <- data.frame(avaliacao = I(as.matrix(avaliacao)),
                    bands = I(as.matrix(espectro)))

# PLS regression
taumato <- plsr(avaliacao ~ bands, ncomp = 5, validation = "LOO", data = dados)
summary(taumato)

# Component analysis
plot(taumato, plottype = "scores", comps = 1:5)

# Cross-validation
taumato.cv <- crossval(taumato, segments = 10)
plot(MSEP(taumato.cv), legendpos = "topright")
summary(taumato.cv, what = "validation")
plot(taumato, xlab = "measured", ylab = "predicted", ncomp = 3, asp = 1,
     main = " ", line = TRUE)

# Prediction with 3 components on new artificial data
# (T holds the true classes of the 200 new samples; predict() does not use it)
T <- as.factor(sort(rep(c("be", "nottobe"), 100)))
y1 <- c(rnorm(100, 1, 0.1), rnorm(100, 1, 0.1))
y2 <- c(rnorm(100, 10, 0.3), rnorm(100, 10, 0.6))
y3 <- c(rnorm(100, 10, 2.3), rnorm(100, 11, 2.6))
y4 <- c(rnorm(100, 5, 0.5), rnorm(100, 7, 0.5))
y5 <- c(rnorm(100, 0, 0.1), rnorm(100, 0, 0.1))
espectro2 <- cbind(y1, y2, y3, y4, y5)
new.dados <- data.frame(bands = I(as.matrix(espectro2)))
round(predict(taumato, ncomp = 3, newdata = new.dados))
```

#### Best Answer

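PLS regression followed by a "hardening" threshold that converts the continuous output into hard class decisions is known as PLS-DA, and yes, that is frequently done. With the two levels coded as 1 and 2, `round()` amounts to a threshold at 1.5.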

If you go for PLS-DA, you typically want to adjust the threshold for unequal numbers of training cases in the classes.
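As a rough illustration of such an adjustment, the sketch below replaces the fixed cut at 1.5 with the midpoint between the class-wise mean predictions. This is only one possible heuristic, not the only or the standard rule. The helpers `class_midpoint_threshold()` and `harden()` are hypothetical names introduced here, and `pred` is an artificial stand-in for the continuous predictions you would get from the PLS model (ideally cross-validated ones).

```r
# Hypothetical helper: midpoint between the mean prediction of each class.
class_midpoint_threshold <- function(pred, y) {
  stopifnot(length(pred) == length(y))
  mean(tapply(pred, y, mean))
}

# Hypothetical helper: convert continuous predictions into hard class labels.
harden <- function(pred, threshold, levels = c("be", "nottobe")) {
  factor(ifelse(pred < threshold, levels[1], levels[2]), levels = levels)
}

set.seed(1)
# Artificial, unbalanced example: 150 "be" (coded 1) and 50 "nottobe" (coded 2).
y <- factor(rep(c("be", "nottobe"), times = c(150, 50)))
# Stand-in for continuous PLS predictions (assumption: simulated, not from a
# fitted model); the minority class is pulled towards the majority class.
pred <- c(rnorm(150, 1.1, 0.2), rnorm(50, 1.7, 0.2))

thr <- class_midpoint_threshold(pred, y)

# Compare the fixed cut used by round() with the adjusted threshold.
table(truth = y, rounded = harden(pred, 1.5))
table(truth = y, adjusted = harden(pred, thr))
```

Whichever rule you use, the threshold should be chosen on training or cross-validated predictions only, never on the samples you later want to classify.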

However, there are more advanced and possibly also more appropriate possibilities: you can use the PLS as regularization for "proper" classification models such as LDA (PLS-LDA) or logistic regression (PLS-LR; this is a type II nonlinear PLS model according to Rosipal's description).
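A minimal sketch of the PLS-LDA idea, assuming the `pls` and `MASS` packages are installed: the PLS scores serve as a low-dimensional input to `lda()`, which then makes the class decision. The data mimic the artificial example from the question; the helper `make_bands()`, the other object names, and the choice of 3 components are assumptions made for illustration.

```r
library(pls)
library(MASS)

set.seed(1)

# Hypothetical helper: artificial bands resembling the question's data.
make_bands <- function(n = 100) {
  cbind(y1 = c(rnorm(n, 1, 0.1),  rnorm(n, 1, 0.1)),
        y2 = c(rnorm(n, 10, 0.3), rnorm(n, 10, 0.6)),
        y3 = c(rnorm(n, 10, 2.3), rnorm(n, 11, 2.6)),
        y4 = c(rnorm(n, 5, 0.5),  rnorm(n, 7, 0.5)),
        y5 = c(rnorm(n, 0, 0.1),  rnorm(n, 0, 0.1)))
}

grp <- factor(rep(c("be", "nottobe"), each = 100))
dados <- data.frame(avaliacao = as.numeric(grp), bands = I(make_bands()))

# Step 1: PLS as regularization / dimension reduction (3 components assumed).
fit <- plsr(avaliacao ~ bands, ncomp = 3, data = dados)
sc_train <- scores(fit)[, 1:3, drop = FALSE]

# Step 2: LDA on the PLS scores makes the actual class decision.
lda_fit <- lda(sc_train, grouping = grp)

# New samples: project onto the same PLS components, then classify with LDA.
grp_new <- factor(rep(c("be", "nottobe"), each = 100))
new.dados <- data.frame(bands = I(make_bands()))
sc_new <- predict(fit, newdata = new.dados, comps = 1:3, type = "scores")
pred_class <- predict(lda_fit, sc_new)$class

table(truth = grp_new, predicted = pred_class)
```

In a real analysis the number of components would be chosen by cross-validating the whole PLS + LDA pipeline, not the PLS step alone.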

