I ran a discriminant analysis with lda() in R and in SPSS, but the scalings were different. Why?
Also, how can I get the (Constant) in R, as in the SPSS result?
data:
head(data)
  ï..smoke age selfcon anxiety absence subtestb
1        1  36      42      17       3       30
2        1  45      45      21       0       29
3        1  43      36      13       8       23
4        2  25      25      23      14       20
5        2  36      32      25       9       16
6        2  25      19      27       5       20
lda() result:
lda(x,cl)$scaling
                LD1
age      -0.0237009
selfcon  -0.0800297
anxiety   0.0999290
absence   0.0115092
subtestb -0.1341198
SPSS result:
age              .024
selfcon          .080
anxiety score   -.100
absence         -.012
subtestb         .134
(Constant)     -4.543
Best Answer
Except for the sign and the constant, the numbers in SPSS are just rounded versions of the numbers in R. There is no constant in R because, by default, the R function 'lda' from the MASS package centers the data.
Because of questions in the comments I added:
If you compare the numbers in R with those in SPSS, then (1) they have opposite signs, but that makes no difference: it only means that the class R takes as positive is taken as negative by SPSS (it is just a matter of how the binary outcome is coded); (2) in SPSS they are rounded to 3 digits; and (3) in SPSS you have a constant while in R you don't, because R centers the data.
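Points (1) and (2) can be checked directly on the numbers posted in the question. This is only a quick sketch; the vector names `r_scaling` and `spss_coef` are made up here for illustration:

```r
# Scalings reported by lda() in the question
r_scaling <- c(age = -0.0237009, selfcon = -0.0800297, anxiety = 0.0999290,
               absence = 0.0115092, subtestb = -0.1341198)

# Coefficients reported by SPSS in the question (without the constant)
spss_coef <- c(age = 0.024, selfcon = 0.080, anxiety = -0.100,
               absence = -0.012, subtestb = 0.134)

# Flip the sign and round to 3 digits, then compare with SPSS
flipped <- round(-r_scaling, 3)
same <- isTRUE(all.equal(flipped, spss_coef))
same
```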
To obtain the constant in R, you can apply the LDA formulas given in this pdf (section 4.3). Formula (4.9) of that reference shows the constant on the first line (there is no $x$ in it) and the coefficients on the second line. The bulleted list on the next page shows how to estimate the parameters from your data.
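A shorter route than working through formula (4.9), offered here only as a sketch: predict() for an 'lda' fit subtracts the prior-weighted mean of the group means from the data before multiplying by the scaling, so the constant implied by that centering is minus the centering vector times the scaling. The example uses two species of the built-in iris data as a stand-in, not the data from the question, and I am assuming (not verifying) that SPSS's (Constant) is defined the same way.

```r
library(MASS)  # provides lda()

# Stand-in two-class data (NOT the asker's data): two iris species
d  <- droplevels(subset(iris, Species != "setosa"))
x  <- as.matrix(d[, 1:4])
cl <- d$Species

fit <- lda(x, cl)

# Centering vector used by predict(): prior-weighted mean of the group means.
# With the default priors (class proportions) this equals colMeans(x).
center <- colSums(fit$prior * fit$means)

# Constant implied by the centering
constant <- -sum(center * fit$scaling[, "LD1"])

# Coefficients and constant in one table, laid out like the SPSS output
coef_table <- rbind(fit$scaling, "(Constant)" = constant)
coef_table

# Check: uncentered x times coefficients plus constant gives predict()'s scores
scores <- drop(x %*% fit$scaling) + constant
ok <- isTRUE(all.equal(scores, drop(predict(fit, x)$x),
                       check.attributes = FALSE))
ok
```

If the signs come out opposite to SPSS, as in the question, negating both the coefficients and the constant gives an equivalent discriminant function.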