Solved – Testing Measurement Invariance with Robust Estimators Yields Bizarre (Improved) Model Fit Indexes

I'm in the process of fitting measurement invariance models using the semTools package for R. For the basic measurement model (not distinguishing between groups), I used a robust ML estimator (MLR), and everything seemed fine. But I noticed that when I used this robust estimator (or an alternative robust estimator like MLM) to test measurement invariance, my models yielded fit indexes that seemed highly implausible, if not impossible. Specifically, the CFI and the RMSEA both improve after constraining loadings and intercepts to equality, rather than worsening (as they ought to). When I re-run these models with a traditional ML estimator, the problem disappears (but then I lose the benefit of correcting my test statistic for non-normality in my indicators).

Any guidance or ideas about what may be going on here would be seriously appreciated.

Update 1

Okay, as requested by Jeremy Miles, here's an update with some model code and output, in order to better diagnose the issue:

Update 2

Posted some factor loadings and latent covariances from both groups, to help diagnose why the scaling factor is changing so much between models.

Model Syntax

library(semTools)
library(lavaan)

RNS.invar <- "
manage =~ RNS_3 + RNS_8 + RNS_6 + RNS_19 + RNS_13 + RNS_4 + RNS_12 + RNS_15 + RNS_1 + RNS_22 + RNS_18 + RNS_5 + RNS_24
agree =~ RNS_8 + RNS_19 + RNS_13 + RNS_12 + RNS_15 + RNS_1 + RNS_18 + RNS_17 + RNS_11 + RNS_14 + RNS_7 + RNS_9 + RNS_16 + RNS_5 + RNS_2
explicit =~ RNS_3 + RNS_20 + RNS_4 + RNS_1 + RNS_22 + RNS_18 + RNS_17 + RNS_9 + RNS_16 + RNS_10 + RNS_5 + RNS_2
punish =~ RNS_12 + RNS_22 + RNS_9 + RNS_5 + RNS_21 + RNS_20 + RNS_23 + RNS_24
"

invariance <- measurementInvariance(RNS.invar, data = dat, group = "Male", estimator = "MLR")

Output: invariance$fit.configural

Model Fit Indexes

Estimator                                         ML      Robust
Minimum Function Test Statistic              983.743     890.242
Degrees of freedom                               444         444
P-value (Chi-square)                           0.000       0.000
Scaling correction factor                                  1.105
  for the Yuan-Bentler correction
Comparative Fit Index (CFI)                    0.858       0.846
RMSEA                                          0.100       0.091

Factor Loadings

Group 1 [1]:

Latent Variables:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage =~
    RNS_3            1.000                               1.428    0.865
    RNS_8            0.813    0.088    9.201    0.000    1.161    0.673
    RNS_6            0.988    0.088   11.189    0.000    1.411    0.811
    RNS_19           0.889    0.116    7.680    0.000    1.270    0.694

Group 2 [2]:

Latent Variables:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage =~
    RNS_3            1.000                               1.549    0.907
    RNS_8            0.798    0.091    8.741    0.000    1.237    0.823
    RNS_6            0.982    0.102    9.617    0.000    1.521    0.861
    RNS_19           0.728    0.121    6.002    0.000    1.127    0.695

Latent Covariances

Group 1 [1]:

Covariances:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage ~~
    agree            0.190    0.098    1.948    0.051    0.382    0.382
    explicit         0.004    0.020    0.183    0.855    0.174    0.174
    punish           0.056    0.051    1.095    0.274    0.324    0.324
  agree ~~
    explicit        -0.002    0.012   -0.177    0.859   -0.400   -0.400
    punish           0.001    0.005    0.171    0.865    0.021    0.021
  explicit ~~
    punish          -0.000    0.001   -0.152    0.879   -0.085   -0.085

Group 2 [2]:

Covariances:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage ~~
    agree            0.063    0.066    0.963    0.336    0.137    0.137
    explicit         0.039    0.167    0.232    0.817    0.877    0.877
    punish           0.015    0.257    0.059    0.953    0.415    0.415
  agree ~~
    explicit         0.000    0.003    0.184    0.854    0.057    0.057
    punish          -0.001    0.023   -0.059    0.953   -0.195   -0.195
  explicit ~~
    punish           0.000    0.006    0.057    0.955    0.519    0.519

Output: invariance$fit.loadings

Model Fit Indexes

Estimator                                         ML      Robust
Minimum Function Test Statistic             1021.659     861.367
Degrees of freedom                               488         488
P-value (Chi-square)                           0.000       0.000
Scaling correction factor                                  1.186
  for the Yuan-Bentler correction
Comparative Fit Index (CFI)                    0.859       0.871
RMSEA                                          0.095       0.079

Factor Loadings

Group 1 [1]:

Latent Variables:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage =~
    RNS_3            1.000                               1.462    0.873
    RNS_8   (.p2.)   0.791    0.054   14.673    0.000    1.156    0.678
    RNS_6   (.p3.)   0.981    0.053   18.469    0.000    1.435    0.817
    RNS_19  (.p4.)   0.777    0.079    9.807    0.000    1.136    0.644

Group 2 [2]:

Latent Variables:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage =~
    RNS_3            1.000                               1.536    0.913
    RNS_8   (.p2.)   0.791    0.054   14.673    0.000    1.215    0.800
    RNS_6   (.p3.)   0.981    0.053   18.469    0.000    1.508    0.863
    RNS_19  (.p4.)   0.777    0.079    9.807    0.000    1.194    0.698

Latent Covariances

Group 1 [1]:

Covariances:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage ~~
    agree            0.177    0.049    3.592    0.000    0.392    0.392
    explicit         0.001    0.001    1.683    0.092    0.164    0.164
    punish           0.005    0.002    2.877    0.004    0.304    0.304
  agree ~~
    explicit        -0.001    0.000   -3.217    0.001   -0.389   -0.389
    punish           0.000    0.000    0.000    1.000    0.000    0.000
  explicit ~~
    punish          -0.000    0.000   -0.479    0.632   -0.061   -0.061

Group 2 [2]:

Covariances:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage ~~
    agree            0.113    0.063    1.798    0.072    0.221    0.221
    explicit         0.004    0.001    2.745    0.006    0.335    0.335
    punish           0.010    0.002    3.948    0.000    0.455    0.455
  agree ~~
    explicit        -0.001    0.000   -4.246    0.000   -0.527   -0.527
    punish          -0.000    0.001   -0.476    0.634   -0.060   -0.060
  explicit ~~
    punish           0.000    0.000    1.130    0.259    0.149    0.149

Output: invariance$fit.intercepts

Model Fit Indexes

Estimator                                         ML      Robust
Minimum Function Test Statistic             1049.546     754.119
Degrees of freedom                               508         508
P-value (Chi-square)                           0.000       0.000
Scaling correction factor                                  1.392
  for the Yuan-Bentler correction
Comparative Fit Index (CFI)                    0.857       0.915
RMSEA                                          0.094       0.063

Factor Loadings

Group 1 [1]:

Latent Variables:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage =~
    RNS_3            1.000                               1.460    0.872
    RNS_8   (.p2.)   0.802    0.047   17.113    0.000    1.171    0.678
    RNS_6   (.p3.)   0.986    0.044   22.531    0.000    1.440    0.817
    RNS_19  (.p4.)   0.786    0.054   14.597    0.000    1.148    0.648

Group 2 [2]:

Latent Variables:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage =~
    RNS_3            1.000                               1.532    0.913
    RNS_8   (.p2.)   0.802    0.047   17.157    0.000    1.229    0.800
    RNS_6   (.p3.)   0.986    0.044   22.278    0.000    1.512    0.863
    RNS_19  (.p4.)   0.786    0.054   14.504    0.000    1.205    0.700

Latent Covariances

Group 1 [1]:

Covariances:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage ~~
    agree            0.175    0.044    3.964    0.000    0.391    0.391
    explicit         0.001    0.001    1.711    0.087    0.167    0.167
    punish           0.004    0.001    3.116    0.002    0.306    0.306
  agree ~~
    explicit        -0.001    0.000   -5.071    0.000   -0.391   -0.391
    punish           0.000    0.000    0.003    0.997    0.000    0.000
  explicit ~~
    punish          -0.000    0.000   -0.338    0.735   -0.059   -0.059

Group 2 [2]:

Covariances:
                  Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  manage ~~
    agree            0.111    0.054    2.045    0.041    0.220    0.220
    explicit         0.003    0.001    3.713    0.000    0.340    0.340
    punish           0.007    0.001    5.103    0.000    0.454    0.454
  agree ~~
    explicit        -0.001    0.000   -6.576    0.000   -0.530   -0.530
    punish          -0.000    0.000   -0.520    0.603   -0.062   -0.062
  explicit ~~
    punish           0.000    0.000    1.139    0.255    0.153    0.153

Summary

As I would expect, the $\chi^2$ statistic increases with each added level of invariance when estimated using the boring ol' ML estimator. But the robust estimates of the $\chi^2$ statistic decrease with each added level of invariance constraints when estimated using the MLR estimator…

The source of your problem is the robust (Satorra-Bentler scaled) chi-square statistic that comes with robust estimation. When testing for measurement invariance, we compare a less constrained model (configural invariance) to more constrained ones (metric or scalar invariance). The comparison that is usually applied is a chi-square difference test, which compares the chi-square of the less constrained model to that of the more constrained model, testing the null hypothesis that both models fit equally well.

In addition, some authors argue that one may also look at the change in RMSEA or CFI, but there are no firm guidelines on how large a change in these statistics is acceptable. My advice is therefore to look first at the change in the model chi-square and the associated p-value for the null hypothesis above. I will accordingly first answer your question in terms of the chi-square change and then address the change in CFI and RMSEA.


Testing the change in model Chi-square

MLR uses a scaled version of the chi-square statistic to obtain robust standard errors and test statistics, following a paper by Satorra and Bentler in Psychometrika. The problem you are facing is that, as you say, the scaled chi-squares decrease across more constrained versions of the model. In fact, the simple scaled chi-square difference between your models is negative and thus undefined.

This behavior is to be expected, because the difference between two scaled chi-squares is not itself chi-square distributed. A chi-square difference test using scaled chi-squares therefore needs to be adapted before the difference can be interpreted in the usual way. Specifically, the adjustment goes as follows. First we calculate a scaling correction factor:

$$s = \frac{d_0 c_0 - d_1 c_1}{d_0 - d_1}$$

where $d_0$ is the degrees of freedom of the nested (constrained) model and $d_1$ that of the unconstrained model. Furthermore, $c_0$ and $c_1$ are the scaling correction factors reported by lavaan or other SEM packages such as Mplus. Subsequently, we calculate a corrected chi-square difference

$$\Delta\chi^2 = \frac{T_0 c_0 - T_1 c_1}{s}$$

where $T_0$ and $T_1$ are the scaled (robust) model chi-squares. This adjusted chi-square difference is then referred to a central chi-square distribution with degrees of freedom equal to the difference in degrees of freedom between the two models.

To provide an example for your data, here is a short R script testing configural against metric invariance:

d0 <- 488      # df of the constrained (metric) model, from your output
d1 <- 444      # df of the unconstrained (configural) model
c0 <- 1.186    # scaling correction factors
c1 <- 1.105
T0 <- 861.367  # scaled (robust) model chi-squares
T1 <- 890.242

(cd  <- (d0*c0 - d1*c1)/(d0 - d1))  # scaling correction factor
[1] 2.003364
(TRd <- (T0*c0 - T1*c1)/cd)         # adjusted difference in model chi-squares
[1] 18.90014
(df  <- d0 - d1)                    # difference in degrees of freedom
[1] 44
1 - pchisq(TRd, df)                 # p-value
[1] 0.9996636

We can see that the scaled chi-square difference is 18.9 (and now it has a positive sign!), which is not significant at the $\alpha = .05$ level. Hence the data provide no evidence against metric invariance.

There is a lot of documentation on this problem on the Mplus website, including a discussion of difference testing with scaled chi-square. The correction I suggest is the simple adjustment variant, which in some cases may still yield a negative chi-square difference. There is a more recent and more sophisticated approach, the strictly positive chi-square difference test, which is also described on the Mplus website.
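To make the simple adjustment reusable beyond this one comparison, it can be wrapped in a small function. The sketch below is my own (in Python rather than R); the helper `chisq_sf_even_df` uses a closed-form chi-square tail that is exact only when the df difference is a positive even integer, as it is here. With SciPy available you would use `scipy.stats.chi2.sf` instead.

```python
import math

def chisq_sf_even_df(x, df):
    """Chi-square survival function P(X > x).

    Uses the closed form exp(-x/2) * sum_{i=0}^{df/2-1} (x/2)^i / i!,
    which is exact only when df is a positive even integer."""
    assert df > 0 and df % 2 == 0, "closed form requires even df"
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total

def scaled_chisq_diff(T0, T1, d0, d1, c0, c1):
    """Satorra-Bentler 'simple adjustment' scaled chi-square difference test.

    T0, d0, c0: scaled chi-square, df, and scaling factor of the more
    constrained (nested) model; T1, d1, c1: the same quantities for the
    less constrained model. Returns (adjusted difference, df difference, p)."""
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)   # scaling correction factor
    TRd = (T0 * c0 - T1 * c1) / cd         # adjusted chi-square difference
    ddf = d0 - d1
    return TRd, ddf, chisq_sf_even_df(TRd, ddf)

# Configural vs. metric invariance, numbers taken from the output above:
TRd, ddf, p = scaled_chisq_diff(T0=861.367, T1=890.242,
                                d0=488, d1=444, c0=1.186, c1=1.105)
print(TRd, ddf, p)  # TRd ≈ 18.90, ddf = 44, p ≈ 0.9997
```

Note that $T_0 c_0$ recovers the unscaled ML chi-square (here $861.367 \times 1.186 \approx 1021.6$), which is why the adjusted difference comes out positive even though the scaled chi-squares themselves decreased.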


Improvement in fit indices (RMSEA and CFI):

It was remarked that my answer did not yet sufficiently address the improvement in RMSEA and CFI that was observed over increasingly constrained versions of the baseline model. To understand this, we first of all need the definitions of the two statistics:

$$\mathrm{RMSEA} = \sqrt{\frac{\chi^2 - df}{df\,(n-1)}}$$

and

$$\mathrm{CFI} = \frac{(\chi_0^2 - df_0) - (\chi_1^2 - df_1)}{\chi_0^2 - df_0}$$

where $0$ and $1$ indicate the null (baseline) model and the tested model, respectively. It can be seen that both fit measures depend on the $\chi^2$ and the $df$ of the model. The scaled $\chi^2$ is designed to be more 'robust' to many practical problems, in particular the violation of multivariate normality in continuous factor analysis. If we regard the scaled $\chi^2$ as more valid than the unscaled $\chi^2$, we may conclude that the scaled RMSEA and scaled CFI are likewise more accurate versions. In lavaan you therefore need to check that you looked at the scaled RMSEA and scaled CFI (the 'Robust' column of the output).

Assuming that you did, it can be seen from the definitions of the two indices that an improvement in RMSEA and CFI across more constrained versions of the model is actually possible; in fact, it is desirable!

To see this, first assume that the chi-square does not change between the constrained and unconstrained models, i.e. the stricter model is true. The number of free parameters nevertheless decreases, so the $df$ increase. Now let $a$ denote the unconstrained (e.g. configural) and $b$ the constrained (e.g. metric) model. We then know that $df_a < df_b$ while assuming $\chi^2_a = \chi^2_b = \chi^2$ (i.e. no decrease in fit; the more constrained model is true). Now we ask whether it is possible that

$$\mathrm{RMSEA}_a > \mathrm{RMSEA}_b$$

as well as

$$\mathrm{CFI}_a < \mathrm{CFI}_b$$

It is particularly easy to see this for the $\mathrm{CFI}$, because there we have

$$\mathrm{CFI}_a < \mathrm{CFI}_b \Leftrightarrow (\chi_a^2 - df_a) - (\chi_b^2 - df_b) > 0 \\ \Leftrightarrow (\chi^2 - df_a) - (\chi^2 - df_b) > 0 \\ \Leftrightarrow df_b > df_a$$

which is always true if $\chi^2_a = \chi^2_b = \chi^2$. Hence the $\mathrm{CFI}$ of the more constrained model can be larger than that of the unconstrained model, and necessarily is when the fit of the two models is exactly equal. For the $\mathrm{RMSEA}$ the situation is even simpler: holding $\chi^2$ fixed, a larger $df$ both shrinks the numerator $\chi^2 - df$ and grows the denominator $df\,(n-1)$, so $\mathrm{RMSEA}_b < \mathrm{RMSEA}_a$ must hold as well (provided $\chi^2 > df$).
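The argument can be checked numerically. A minimal Python sketch (all numbers are hypothetical, including the sample size `n` and the baseline-model values, and the plain uncapped definitions of the indices are used, ignoring the truncation at zero that software applies):

```python
import math

def rmsea(chi2, df, n):
    """Uncapped RMSEA: sqrt((chi2 - df) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2, df, chi2_null, df_null):
    """Uncapped CFI: 1 - (chi2 - df) / (chi2_null - df_null)."""
    return 1.0 - (chi2 - df) / (chi2_null - df_null)

# Hypothetical numbers: identical chi-square for both models, but the
# constrained model b has more degrees of freedom than model a.
chi2, n = 900.0, 300
df_a, df_b = 444, 508
chi2_null, df_null = 5000.0, 528   # hypothetical baseline (null) model

print(rmsea(chi2, df_a, n), rmsea(chi2, df_b, n))  # RMSEA shrinks: b < a
print(cfi(chi2, df_a, chi2_null, df_null),
      cfi(chi2, df_b, chi2_null, df_null))         # CFI grows: b > a
```

With $\chi^2$ held fixed, any increase in $df$ improves both indices, which is exactly the pattern in the output above.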

In conclusion, what you observe is possible. We are particularly likely to find it in situations where the model $\chi^2$ changes only marginally while the number of additionally constrained parameters is large. This is exactly the result we expect when the more constrained model is the true model and the less constrained model was specified too flexibly (over-parametrized). Thus an improvement in the two fit measures is even better news than a (small) worsening!
