I would like to run a factor analysis in lavaan with a factor having only one manifest indicator. My problem is that I don't know the correct syntax for the model specification.
What I already know is that I can either (1) fix the item's loading on the single-indicator factor to 1, or (2) specify an estimate for the item's error variance, which is what I want.
The error variance depends on the item's empirical variance and its reliability. I compute the reliability with the Spearman-Brown formula: the original factor has 4 items, and the single-item version is a "lengthening" to 1/4 of that length. In my example the reliability goes down from .87 to .63, which, I think, underestimates the item's reliability.
The item's variance is 1.20 and the error variance is (1 - alpha) * var = (1 - .63) * 1.20 = .45. Now here is the question. With three other ordinary factors A, B, and C, and D1 as the single indicator for D, is this the correct model specification?

    model <- '
      Factor.A =~ A1 + A2 + A3 + A4
      Factor.B =~ B1 + B2 + B3 + B3
      Factor.C =~ C1 + C2
      Factor.D =~ D1
      D1 ~~ 0.45*D1'
    Data <- ....
    fit <- sem(model, data=Data, estimator="MLM")
My particular problem is that every source says that I have to specify the error variance, but lavaan's ~~ operator specifies a variance. If that means the item's reliable variance rather than its error variance, the correct coefficient would be D1's estimated alpha times its variance: .63 * 1.20 = .76.
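The arithmetic in the question can be re-checked with a few lines of base R. This is only a sketch: the function name `spearman_brown` and the variable names are made up for illustration, and the inputs (.87, 4 items, variance 1.20) are the values quoted in the question.

```r
# Spearman-Brown prophecy formula: reliability of a test whose length is
# changed by the factor k, given the reliability rho of the original test.
# (spearman_brown is a hypothetical helper name, not part of any package.)
spearman_brown <- function(rho, k) {
  (k * rho) / (1 + (k - 1) * rho)
}

rho_full <- 0.87   # reliability of the 4-item scale (from the question)
var_item <- 1.20   # observed variance of D1 (from the question)

# Going from 4 items to 1 item means a length factor of k = 1/4.
rho_single <- spearman_brown(rho_full, k = 1 / 4)

error_var <- (1 - rho_single) * var_item   # unreliable part of the variance
true_var  <- rho_single * var_item         # reliable part of the variance

print(round(c(reliability = rho_single, error = error_var, true = true_var), 2))
```

The two parts always add up to the item variance. With the unrounded reliability they come to about .45 and .75; the .76 in the question results from multiplying the already rounded .63 by 1.20.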
Best Answer
I hope you have found an answer by now, but for readers who still want to know how to do this in the lavaan package, here's how.
To start I simulated some data:

    > head(dat)
              A1         A2          A3         A4          B1         B2         B3         C1          C2         D1
    1  1.6785322  0.9257293 -0.39660571 -0.5171069  1.47728589  1.3256104 -1.2620390 -0.6492827 -1.10679078  0.8026393
    2  1.6168768  1.9164575  1.09444280 -0.3433172 -2.55549628  2.0257767  0.3753301 -2.2027485 -1.74793846 -0.7827619
    3 -0.4532672 -1.8770901 -0.01629435 -1.3525647  0.05900466 -1.3453644 -1.3048589  2.1052869 -0.07766467  0.6288775
    4  0.6806613  1.2028459 -0.51391579 -1.1764455  1.08308724 -1.7084728 -0.4183617  1.4533609  1.80628226  1.5631844
    5  0.2953281 -2.1000532  0.03250903 -1.8928100  0.49891131  0.1838630 -1.1338902 -0.4802558 -0.33459527 -0.5051095
    6  2.3563684  1.2439698 -0.85265611  2.1545112 -2.01701660 -0.8861477 -2.3937187 -1.5670614 -0.56750672 -1.8588870
Next I built a model with a single indicator for Factor.D as described above. In this case the trick is fixing the residual variance of the observed variable D1 to 0. This is the same as saying that the latent variable accounts for all of the variance in the observed variable (i.e., a latent variable with a single indicator). Since the simulated data contain only B1 to B3, Factor.B gets three indicators here.

    model <- '
      Factor.A =~ A1 + A2 + A3 + A4
      Factor.B =~ B1 + B2 + B3
      Factor.C =~ C1 + C2
      Factor.D =~ D1
      D1 ~~ 0*D1
    '
Next, I run the model and obtain my output. To stick with the example above I will use the robust MLM estimator, but it isn't required in this situation.
    > fit<-sem(model, data=dat, estimator = 'MLM')
    >
    > summary(fit, fit.measures=T, standardized =T)
    lavaan (0.5-22) converged normally after 79 iterations

      Number of observations                           500

      Estimator                                         ML        Robust
      Minimum Function Test Statistic               23.764        24.159
      Degrees of freedom                                30            30
      P-value (Chi-square)                           0.783         0.765
      Scaling correction factor                                    0.984
        for the Satorra-Bentler correction

    Model test baseline model:

      Minimum Function Test Statistic              990.791       971.021
      Degrees of freedom                                45            45
      P-value                                        0.000         0.000

    User model versus baseline model:

      Comparative Fit Index (CFI)                    1.000         1.000
      Tucker-Lewis Index (TLI)                       1.010         1.009

      Robust Comparative Fit Index (CFI)                           1.000
      Robust Tucker-Lewis Index (TLI)                              1.009

    Loglikelihood and Information Criteria:

      Loglikelihood user model (H0)              -8035.768     -8035.768
      Loglikelihood unrestricted model (H1)      -8023.886     -8023.886

      Number of free parameters                         35            35
      Akaike (AIC)                               16141.535     16141.535
      Bayesian (BIC)                             16289.047     16289.047
      Sample-size adjusted Bayesian (BIC)        16177.954     16177.954

    Root Mean Square Error of Approximation:

      RMSEA                                          0.000         0.000
      90 Percent Confidence Interval          0.000  0.023  0.000  0.024
      P-value RMSEA <= 0.05                          1.000         1.000

      Robust RMSEA                                                 0.000
      90 Percent Confidence Interval                        0.000  0.024

    Standardized Root Mean Square Residual:

      SRMR                                           0.021         0.021

    Parameter Estimates:

      Information                                 Expected
      Standard Errors                           Robust.sem

    Latent Variables:
                       Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
      Factor.A =~
        A1                1.000                               0.892    0.664
        A2                1.003    0.088   11.378    0.000    0.895    0.672
        A3                1.071    0.093   11.500    0.000    0.955    0.718
        A4                1.080    0.091   11.899    0.000    0.963    0.680
      Factor.B =~
        B1                1.000                               0.961    0.685
        B2                1.081    0.087   12.492    0.000    1.038    0.732
        B3                1.098    0.095   11.597    0.000    1.055    0.744
      Factor.C =~
        C1                1.000                               0.564    0.409
        C2                2.224    4.074    0.546    0.585    1.254    0.925
      Factor.D =~
        D1                1.000                               0.973    1.000

    Covariances:
                       Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
      Factor.A ~~
        Factor.B          0.059    0.051    1.157    0.247    0.069    0.069
        Factor.C          0.001    0.028    0.049    0.961    0.003    0.003
        Factor.D         -0.015    0.045   -0.332    0.740   -0.017   -0.017
      Factor.B ~~
        Factor.C          0.033    0.065    0.518    0.605    0.062    0.062
        Factor.D          0.072    0.047    1.520    0.128    0.077    0.077
      Factor.C ~~
        Factor.D          0.003    0.028    0.108    0.914    0.006    0.006

    Intercepts:
                       Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
       .A1               -0.017    0.060   -0.275    0.783   -0.017   -0.012
       .A2               -0.016    0.060   -0.270    0.787   -0.016   -0.012
       .A3               -0.016    0.060   -0.264    0.791   -0.016   -0.012
       .A4               -0.017    0.063   -0.268    0.789   -0.017   -0.012
       .B1                0.063    0.063    1.006    0.314    0.063    0.045
       .B2               -0.003    0.063   -0.041    0.967   -0.003   -0.002
       .B3               -0.044    0.063   -0.690    0.490   -0.044   -0.031
       .C1               -0.054    0.062   -0.874    0.382   -0.054   -0.039
       .C2               -0.012    0.061   -0.198    0.843   -0.012   -0.009
       .D1               -0.006    0.044   -0.135    0.893   -0.006   -0.006
        Factor.A          0.000                               0.000    0.000
        Factor.B          0.000                               0.000    0.000
        Factor.C          0.000                               0.000    0.000
        Factor.D          0.000                               0.000    0.000

    Variances:
                       Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
       .D1                0.000                               0.000    0.000
       .A1                1.010    0.079   12.772    0.000    1.010    0.559
       .A2                0.970    0.081   12.032    0.000    0.970    0.548
       .A3                0.859    0.079   10.857    0.000    0.859    0.485
       .A4                1.077    0.097   11.153    0.000    1.077    0.537
       .B1                1.046    0.091   11.476    0.000    1.046    0.531
       .B2                0.934    0.096    9.753    0.000    0.934    0.464
       .B3                0.895    0.103    8.678    0.000    0.895    0.446
       .C1                1.582    0.593    2.665    0.008    1.582    0.833
       .C2                0.266    2.878    0.092    0.926    0.266    0.145
        Factor.A          0.796    0.099    8.017    0.000    1.000    1.000
        Factor.B          0.923    0.118    7.848    0.000    1.000    1.000
        Factor.C          0.318    0.584    0.544    0.586    1.000    1.000
        Factor.D          0.947    0.059   15.955    0.000    1.000    1.000
Note that the estimated variance of Factor.D is exactly the variance of D1 (using the population formula, which divides by n instead of n - 1):

    > var(dat$D1)*499/500
    [1] 0.9468973
And there you have it.
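For readers who, like the asker, want to fix the error variance to a reliability-based value instead of 0, the same syntax should carry over. When D1 is an indicator of Factor.D, lavaan lists `D1 ~~ D1` as `.D1` under Variances, which is its notation for a residual variance, so the number placed there is the error variance, (1 - reliability) * variance. The following is a minimal sketch under stated assumptions: the data are simulated with base R using made-up population values, the helper names `make_factor` and `item` are hypothetical, and the reliability of .63 is taken from the question.

```r
library(lavaan)

# Simulated data, purely for illustration (all population values are assumptions).
set.seed(123)
n <- 500
g <- rnorm(n)                                   # shared component so the factors correlate
make_factor <- function() 0.6 * g + 0.8 * rnorm(n)   # variance 0.36 + 0.64 = 1
fA <- make_factor()
fB <- make_factor()
fC <- make_factor()
fD <- sqrt(0.75) * make_factor()                # true-score variance of about 0.75

item <- function(f) f + rnorm(n)                # indicator = factor + unit-variance noise
dat <- data.frame(
  A1 = item(fA), A2 = item(fA), A3 = item(fA), A4 = item(fA),
  B1 = item(fB), B2 = item(fB), B3 = item(fB),
  C1 = item(fC), C2 = item(fC),
  D1 = fD + rnorm(n, sd = sqrt(0.45))           # error variance of about 0.45
)

# Error variance implied by an assumed reliability of .63 for D1.
rel_D1  <- 0.63
err_var <- round((1 - rel_D1) * var(dat$D1), 4)

# Same model as above, but with D1's residual variance fixed to err_var.
model <- paste0('
  Factor.A =~ A1 + A2 + A3 + A4
  Factor.B =~ B1 + B2 + B3
  Factor.C =~ C1 + C2
  Factor.D =~ D1
  D1 ~~ ', err_var, '*D1
')

fit <- sem(model, data = dat, estimator = "MLM")

# Pull out the fixed residual variance and the estimated factor variance.
pe <- parameterEstimates(fit)
theta_D1 <- pe$est[pe$lhs == "D1" & pe$op == "~~" & pe$rhs == "D1"]
psi_D    <- pe$est[pe$lhs == "Factor.D" & pe$op == "~~" & pe$rhs == "Factor.D"]

print(c(residual_D1 = theta_D1, factor_D = psi_D,
        var_D1_population = var(dat$D1) * (n - 1) / n))
```

In this setup the estimated variance of Factor.D plus the fixed residual variance should reproduce the population-formula variance of D1, up to optimizer tolerance. Fixing the residual to 0, as in the answer, is the special case in which the item is treated as perfectly reliable.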