I am trying to use the gee package in R to fit a GEE model to some clinical trial data. The model fits fine with an independence or exchangeable correlation structure. I'm now trying to use an AR-1 structure as follows:
eff.gee.ar <- gee(cluster_severity ~ logtime + cluster + cluster*logtime, corstr = "AR-M", Mv = 1, id = ID, data=ral2)
but this is causing the following error message:
Error in gee(cluster_severity ~ logtime + cluster + cluster * logtime, : VC_GEE_covlag: arg has > MAX_COVLAG rows
The data are sensitive so I cannot share them, but here is the head() view to give you an idea of how they are arranged. They are sorted so that the rows for each ID are always contiguous (as the gee package documentation instructs). There are ~70k rows in the ral2 data frame.
> head(ral2, n=20)
   ID WEEK  cluster cluster_severity  logtime
1   1    0 cluster1        2.0000000 0.000000
2   1    0 cluster2        0.7500000 0.000000
3   1    0 cluster3        1.5000000 0.000000
4   1    0 cluster4        2.3333333 0.000000
5   1    2 cluster1        1.4000000 1.098612
6   1    2 cluster2        0.0000000 1.098612
7   1    2 cluster3        1.0000000 1.098612
8   1    2 cluster4        2.3333333 1.098612
9   1    4 cluster1        0.2000000 1.609438
10  1    4 cluster2        0.0000000 1.609438
11  1    4 cluster3        0.7500000 1.609438
12  1    4 cluster4        3.0000000 1.609438
13  1    6 cluster1        0.4000000 1.945910
14  1    6 cluster2        0.0000000 1.945910
15  1    6 cluster3        0.5000000 1.945910
16  1    6 cluster4        2.0000000 1.945910
17  2    0 cluster1        1.8000000 0.000000
18  2    0 cluster2        0.2500000 0.000000
19  2    0 cluster3        0.7500000 0.000000
20  2    0 cluster4        0.6666667 0.000000
Any advice or illumination regarding this error message would be greatly appreciated. Google only returns 6 results for the error message :(.
My only intuition so far is that the AR-1 structure cannot be fit to individuals for whom there is only one data point (e.g. subjectID == 2 in the above illustration), although for some reason I expected that a GEE would be fine with this missingness.
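One way to check this intuition is to count how many rows and how many distinct weeks each ID contributes. The following is a minimal sketch on a made-up data frame (`toy` is hypothetical and only mimics the column layout of ral2, since the real data cannot be shared). It also makes visible that, in this layout, each ID contributes four rows per week, so the block of rows gee() sees for one ID is not a single series over time.

```r
# Toy data with the same columns as ral2 (values are made up for illustration)
toy <- expand.grid(cluster = paste0("cluster", 1:4),
                   WEEK = c(0, 2, 4, 6),
                   ID = 1:3,
                   stringsAsFactors = FALSE)

# Give subject 2 only the baseline visit, as an example of a sparse subject
toy <- toy[!(toy$ID == 2 & toy$WEEK > 0), ]

# Keep rows for each ID contiguous and ordered in time
toy <- toy[order(toy$ID, toy$WEEK, toy$cluster), ]

# Number of rows per ID (the block of rows that shares one id value)
rows_per_id <- table(toy$ID)

# Number of distinct visits per ID
weeks_per_id <- tapply(toy$WEEK, toy$ID, function(w) length(unique(w)))

print(rows_per_id)
print(weeks_per_id)

# IDs observed at only one time point
single_visit_ids <- names(weeks_per_id)[weeks_per_id == 1]
print(single_visit_ids)
```

Running the same three summaries on ral2 (with `ral2` in place of `toy`) would show whether any ID has an unusually large or unusually small block of rows, which seems like a reasonable first thing to rule out for an error that complains about the number of rows.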
On a related note, is it true that GEEs are robust to misspecification of the correlation structure anyway? This seems to be the word on the street, but I haven't found citations for or against this view.
Thanks in advance to any GEE-ers.
Best Answer
Apparently GEEs are a lonely topic on Cross Validated these days 🙁
I think the AR-M structure didn't work because there was high correlation between some of the random effects in the model. I fixed it by trying harder to model the data in a mixed-effects regression framework with an appropriate random effects setup.
I still haven't seen any definitive answer on misspecification of the correlation structure in GEEs, but in my experience it doesn't seem to affect the fixed-effects estimates much (they usually agree to 3 s.f.). Perhaps it is more of an issue in smaller datasets; this one is reasonably large.
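For anyone who wants to try this kind of comparison themselves, here is a rough sketch on simulated data. Everything in it is an assumption made for illustration: the data frame `d`, the effect sizes, and the one-row-per-subject-per-week layout are invented and are not the trial data. It fits the same mean model under three working correlation structures with gee() and puts the coefficients side by side. The fits are skipped if the gee package is not installed.

```r
set.seed(1)

# Simulated longitudinal data: one row per subject per visit (hypothetical)
n_id  <- 200
weeks <- c(0, 2, 4, 6)
d <- data.frame(id   = rep(seq_len(n_id), each = length(weeks)),
                week = rep(weeks, times = n_id))
d$logtime <- log(d$week + 1)

# A subject-level random intercept induces within-subject correlation
u   <- rnorm(n_id, sd = 1)
d$y <- 2 - 0.5 * d$logtime + u[d$id] + rnorm(nrow(d), sd = 0.5)

est <- NULL
if (requireNamespace("gee", quietly = TRUE)) {
  # Same mean model, three different working correlation structures
  fit_ind <- gee::gee(y ~ logtime, id = id, data = d,
                      corstr = "independence")
  fit_exc <- gee::gee(y ~ logtime, id = id, data = d,
                      corstr = "exchangeable")
  fit_ar1 <- gee::gee(y ~ logtime, id = id, data = d,
                      corstr = "AR-M", Mv = 1)

  # Coefficients side by side, one column per working structure
  est <- cbind(independence = coef(fit_ind),
               exchangeable = coef(fit_exc),
               ar1          = coef(fit_ar1))
  print(signif(est, 3))
} else {
  message("Package 'gee' is not installed; skipping the model fits.")
}
```

A balanced simulated design like this one is a fairly easy case, so close agreement here says little about unbalanced or sparse data. It is only meant as a template for running the same side-by-side check on real data.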