# Solved – How to specify a lavaan sem model with a single-indicator factor

I would like to run a factor analysis in lavaan with a factor having only one manifest indicator. My problem is that I don't know the correct syntax for the model specification.

What I already know is that I can either (1) fix the item's loading on the single-indicator factor to 1, or (2) specify an estimate for the item's error variance, which is what I want.

The error variance depends on the item's empirical variance and reliability. I compute the reliability with the Spearman-Brown formula. The original factor has 4 items, so the single-item version is an "enlargement" to 1/4 of the original length. In my example the reliability goes down from .87 to .63, which, I think, underestimates the item's reliability.
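As a check on the arithmetic, the Spearman-Brown step can be written out as follows. The symbols are introduced here for illustration: $\rho$ is the reliability of the 4-item scale, $k$ is the length factor, and $\rho_k$ is the predicted reliability of the rescaled test.

$$
\rho_k = \frac{k\,\rho}{1 + (k - 1)\,\rho},
\qquad
\rho_{1/4} = \frac{0.25 \times 0.87}{1 + (0.25 - 1) \times 0.87}
= \frac{0.2175}{0.3475} \approx 0.63
$$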

The item's variance is 1.20 and the error variance is (1 - alpha) * var = (1 - .63) * 1.20 = .45. Now here is the question: with three other ordinary factors A, B and C, and D1 as the single indicator for D, is this the correct model specification:

```
model <- '
    Factor.A =~ A1 + A2 + A3 + A4
    Factor.B =~ B1 + B2 + B3 + B3
    Factor.C =~ C1 + C2
    Factor.D =~ D1
    D1 ~~ 0.45*D1
'
Data <- ....
fit <- sem(model, data=Data, estimator="MLM")
```

My particular problem is that every source says that I have to specify the error variance. But lavaan's `~~` operator is used to specify a variance. If that means the variance explained by the factor rather than the error variance, the correct coefficient would be D1's estimated alpha times its variance = .63 * 1.20 = .76.


I hope that you have found an answer at this point, but for the viewers who still want to know how to do this in the `lavaan` package, here's how.

To start I simulated some data:

```
> head(dat)
          A1         A2          A3         A4          B1         B2         B3         C1          C2         D1
1  1.6785322  0.9257293 -0.39660571 -0.5171069  1.47728589  1.3256104 -1.2620390 -0.6492827 -1.10679078  0.8026393
2  1.6168768  1.9164575  1.09444280 -0.3433172 -2.55549628  2.0257767  0.3753301 -2.2027485 -1.74793846 -0.7827619
3 -0.4532672 -1.8770901 -0.01629435 -1.3525647  0.05900466 -1.3453644 -1.3048589  2.1052869 -0.07766467  0.6288775
4  0.6806613  1.2028459 -0.51391579 -1.1764455  1.08308724 -1.7084728 -0.4183617  1.4533609  1.80628226  1.5631844
5  0.2953281 -2.1000532  0.03250903 -1.8928100  0.49891131  0.1838630 -1.1338902 -0.4802558 -0.33459527 -0.5051095
6  2.3563684  1.2439698 -0.85265611  2.1545112 -2.01701660 -0.8861477 -2.3937187 -1.5670614 -0.56750672 -1.8588870
```
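The simulation code itself is not shown in the post. A minimal base-R sketch that produces data of the same shape might look like the following; the seed, the unit loadings, the unit variances, and the helper `indicator()` are all assumptions made for illustration, so the numbers will not match the output above.

```r
# Hypothetical re-creation of the simulated data; the seed, loadings and
# variances are assumptions, not the values used in the original post.
set.seed(42)
n <- 500

# Four independent standard-normal latent factors
f <- replicate(4, rnorm(n))
colnames(f) <- c("A", "B", "C", "D")

# Each indicator = its factor + independent unit-variance noise
indicator <- function(factor_scores) factor_scores + rnorm(n)

dat <- data.frame(
  A1 = indicator(f[, "A"]), A2 = indicator(f[, "A"]),
  A3 = indicator(f[, "A"]), A4 = indicator(f[, "A"]),
  B1 = indicator(f[, "B"]), B2 = indicator(f[, "B"]),
  B3 = indicator(f[, "B"]),
  C1 = indicator(f[, "C"]), C2 = indicator(f[, "C"]),
  D1 = f[, "D"]  # single indicator, here generated without measurement error
)

head(dat)
```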

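Next I built a model with a single indicator for `Factor.D`, as described above. In this case the trick is fixing the residual (error) variance of the observed variable `D1` to 0. This is the same thing as saying that the latent variable accounts for all of the variance in the observed variable (i.e., a latent variable with a single indicator that is measured without error). Because `D1` is an indicator of a latent variable, lavaan's `~~` operator refers to its residual variance, which is printed as `.D1` in the output. To allow for measurement error you would fix it to the estimated error variance instead of 0, for example `D1 ~~ 0.45*D1` as in the question.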

```
model <- '
  Factor.A =~ A1 + A2 + A3 + A4
  Factor.B =~ B1 + B2 + B3
  Factor.C =~ C1 + C2
  Factor.D =~ D1
  D1 ~~ 0*D1   # fix the residual variance of D1 to 0
'
```
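If you would rather fix the residual variance to an error-variance estimate, as the question asks, the model string could be built along these lines. This is only a sketch: `reliability` and `item_var` are the rounded figures from the question, and the names `err_var` and `model_err` are made up here.

```r
# Values taken from the question; replace them with your own estimates
reliability <- 0.63   # Spearman-Brown estimate for the single item
item_var    <- 1.20   # observed variance of D1

# Error variance = (1 - reliability) * observed variance
err_var <- (1 - reliability) * item_var

# Same measurement model, but the residual variance of D1 is fixed
# to the estimated error variance instead of 0
model_err <- sprintf('
  Factor.A =~ A1 + A2 + A3 + A4
  Factor.B =~ B1 + B2 + B3
  Factor.C =~ C1 + C2
  Factor.D =~ D1
  D1 ~~ %.3f*D1
', err_var)

cat(model_err)
```

With the rounded inputs this fixes the residual variance at 0.444, close to the .45 used in the question. The resulting string can be passed to `sem()` in place of `model`.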

Next, I run the model and obtain my output. To stick with the example above I will use the robust `MLM` estimator, but it isn't required in this situation.

```
> fit<-sem(model, data=dat, estimator = 'MLM')
> 
> summary(fit, fit.measures=T, standardized =T)
lavaan (0.5-22) converged normally after  79 iterations

  Number of observations                           500

  Estimator                                         ML      Robust
  Minimum Function Test Statistic               23.764      24.159
  Degrees of freedom                                30          30
  P-value (Chi-square)                           0.783       0.765
  Scaling correction factor                                  0.984
    for the Satorra-Bentler correction

Model test baseline model:

  Minimum Function Test Statistic              990.791     971.021
  Degrees of freedom                                45          45
  P-value                                        0.000       0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                    1.000       1.000
  Tucker-Lewis Index (TLI)                       1.010       1.009

  Robust Comparative Fit Index (CFI)                         1.000
  Robust Tucker-Lewis Index (TLI)                            1.009

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -8035.768   -8035.768
  Loglikelihood unrestricted model (H1)      -8023.886   -8023.886

  Number of free parameters                         35          35
  Akaike (AIC)                               16141.535   16141.535
  Bayesian (BIC)                             16289.047   16289.047
  Sample-size adjusted Bayesian (BIC)        16177.954   16177.954

Root Mean Square Error of Approximation:

  RMSEA                                          0.000       0.000
  90 Percent Confidence Interval          0.000  0.023       0.000  0.024
  P-value RMSEA <= 0.05                          1.000       1.000

  Robust RMSEA                                               0.000
  90 Percent Confidence Interval                             0.000  0.024

Standardized Root Mean Square Residual:

  SRMR                                           0.021       0.021

Parameter Estimates:

  Information                                 Expected
  Standard Errors                           Robust.sem

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Factor.A =~
    A1                1.000                               0.892    0.664
    A2                1.003    0.088   11.378    0.000    0.895    0.672
    A3                1.071    0.093   11.500    0.000    0.955    0.718
    A4                1.080    0.091   11.899    0.000    0.963    0.680
  Factor.B =~
    B1                1.000                               0.961    0.685
    B2                1.081    0.087   12.492    0.000    1.038    0.732
    B3                1.098    0.095   11.597    0.000    1.055    0.744
  Factor.C =~
    C1                1.000                               0.564    0.409
    C2                2.224    4.074    0.546    0.585    1.254    0.925
  Factor.D =~
    D1                1.000                               0.973    1.000

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Factor.A ~~
    Factor.B          0.059    0.051    1.157    0.247    0.069    0.069
    Factor.C          0.001    0.028    0.049    0.961    0.003    0.003
    Factor.D         -0.015    0.045   -0.332    0.740   -0.017   -0.017
  Factor.B ~~
    Factor.C          0.033    0.065    0.518    0.605    0.062    0.062
    Factor.D          0.072    0.047    1.520    0.128    0.077    0.077
  Factor.C ~~
    Factor.D          0.003    0.028    0.108    0.914    0.006    0.006

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .A1               -0.017    0.060   -0.275    0.783   -0.017   -0.012
   .A2               -0.016    0.060   -0.270    0.787   -0.016   -0.012
   .A3               -0.016    0.060   -0.264    0.791   -0.016   -0.012
   .A4               -0.017    0.063   -0.268    0.789   -0.017   -0.012
   .B1                0.063    0.063    1.006    0.314    0.063    0.045
   .B2               -0.003    0.063   -0.041    0.967   -0.003   -0.002
   .B3               -0.044    0.063   -0.690    0.490   -0.044   -0.031
   .C1               -0.054    0.062   -0.874    0.382   -0.054   -0.039
   .C2               -0.012    0.061   -0.198    0.843   -0.012   -0.009
   .D1               -0.006    0.044   -0.135    0.893   -0.006   -0.006
    Factor.A          0.000                               0.000    0.000
    Factor.B          0.000                               0.000    0.000
    Factor.C          0.000                               0.000    0.000
    Factor.D          0.000                               0.000    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .D1                0.000                               0.000    0.000
   .A1                1.010    0.079   12.772    0.000    1.010    0.559
   .A2                0.970    0.081   12.032    0.000    0.970    0.548
   .A3                0.859    0.079   10.857    0.000    0.859    0.485
   .A4                1.077    0.097   11.153    0.000    1.077    0.537
   .B1                1.046    0.091   11.476    0.000    1.046    0.531
   .B2                0.934    0.096    9.753    0.000    0.934    0.464
   .B3                0.895    0.103    8.678    0.000    0.895    0.446
   .C1                1.582    0.593    2.665    0.008    1.582    0.833
   .C2                0.266    2.878    0.092    0.926    0.266    0.145
    Factor.A          0.796    0.099    8.017    0.000    1.000    1.000
    Factor.B          0.923    0.118    7.848    0.000    1.000    1.000
    Factor.C          0.318    0.584    0.544    0.586    1.000    1.000
    Factor.D          0.947    0.059   15.955    0.000    1.000    1.000
```

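Note that the estimated variance of `Factor.D` (0.947) is exactly the variance of `D1` computed with the population formula, i.e. with a divisor of N rather than N - 1, which is what maximum likelihood estimation uses: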

```
> var(dat$D1)*499/500
[1] 0.9468973
```
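The rescaling by 499/500 is simply the conversion from the sample variance (divisor N - 1) to the population variance (divisor N). A small base-R check, using a made-up vector `x` as a stand-in for `dat$D1`, illustrates the equivalence:

```r
set.seed(1)
x <- rnorm(500)   # hypothetical stand-in for dat$D1
n <- length(x)

sample_var <- var(x)                  # var() divides by n - 1
ml_var     <- mean((x - mean(x))^2)   # population formula divides by n

# The two agree once the sample variance is rescaled by (n - 1) / n
all.equal(ml_var, sample_var * (n - 1) / n)
```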

And there you have it.
