Solved – Why are discriminant analysis results in R (lda) and SPSS different?: Constant term

I ran a discriminant analysis with lda() in R and in SPSS, but the scalings differ. Why?

And how can I get the (Constant) term in R, as in the SPSS result?


    head(data)
      ï..smoke age selfcon anxiety absence subtestb
    1        1  36      42      17       3       30
    2        1  45      45      21       0       29
    3        1  43      36      13       8       23
    4        2  25      25      23      14       20
    5        2  36      32      25       9       16
    6        2  25      19      27       5       20

lda() result:

    lda(x,cl)$scaling
                    LD1
    age      -0.0237009
    selfcon  -0.0800297
    anxiety   0.0999290
    absence   0.0115092
    subtestb -0.1341198

SPSS result:

    age               .024
    selfcon           .080
    anxiety score    −.100
    absence          −.012
    subtestb          .134
    (Constant)      −4.543

Except for the constant and a flipped sign, the numbers in SPSS are just the rounded versions of the numbers in R. There is no constant in R because, by default, the lda function from the MASS package centers the data.

Because of questions in the comments I added:

Comparing the numbers in R with those in SPSS: (1) they have opposite signs, but that makes no difference; it only means that the class R treats as positive is treated as negative by SPSS (it is just a matter of how the binary outcome is coded); (2) in SPSS they are rounded to 3 digits; and (3) in SPSS you have a constant, while in R you don't. That is because R centers the data.
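To make point (3) concrete, here is a short sketch of where the constant goes. The symbols are my own notation, not taken from either program's output: $w$ is the coefficient vector (the LD1 column of the scaling) and $\bar{x}$ is the vector the data are centered at, which I am assuming is the overall mean of the predictors.

$$
\mathrm{LD1}(x) \;=\; w^\top (x - \bar{x}) \;=\; \underbrace{-\,w^\top \bar{x}}_{\text{constant}} \;+\; w^\top x
$$

So a score computed on centered data with no intercept is the same as a score computed on the raw data with intercept $-w^\top \bar{x}$. Flipping the sign of $w$ flips the sign of the constant as well.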

To obtain the constant in R, you can apply the LDA formulas given in this pdf (section 4.3). Formula (4.9) of that reference shows the constant on the first line (no $x$ there) and the coefficients on the second line. The bulleted list on the next page shows how to estimate the parameters from your data.
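As a shortcut that avoids working through the formulas by hand, the constant can probably also be recovered directly from the fitted object. The sketch below is a minimal illustration, not the asker's actual analysis: it uses two species from the built-in iris data as a stand-in for the question's x and cl, and the names center, constant and scores are mine. It assumes that predict() for an lda fit centers the predictors at the prior-weighted mean of the group means, which with the default priors (the class proportions) is the overall mean.

```r
library(MASS)

# Stand-in data (assumption: the question's x and cl are not available here)
d  <- subset(iris, Species != "setosa")
x  <- as.matrix(d[, 1:4])
cl <- droplevels(d$Species)

fit <- lda(x, cl)

# Centering vector: prior-weighted mean of the group means.
# fit$means is groups x variables, so row i is multiplied by fit$prior[i].
center <- colSums(fit$prior * fit$means)

# Constant of the unstandardized discriminant function
constant <- -sum(center * fit$scaling[, "LD1"])

# Check: constant + raw x times coefficients vs. the scores from predict()
scores <- constant + x %*% fit$scaling[, "LD1"]
all.equal(as.numeric(scores), as.numeric(predict(fit, x)$x[, "LD1"]))
```

To compare with SPSS, multiply both the coefficients and the constant by -1 if the signs are flipped, as they are in the question.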
