Solved – Performing subgroup analysis using the metafor package

When I performed a subgroup analysis on a categorical moderator named "moda" (with two levels: m and n) in my data,

dat=read.csv("D:\...\bothlevels.csv",header=T,sep=",") #the data consist of single proportions
transf.ies=escalc(measure="PFT",xi=cases,ni=total,data=dat,add=0) #note that I used the double arcsine transformation
transf.pes.m=rma(yi,vi,data=transf.ies,subset=(moda=="m"),method="DL")
transf.pes.n=rma(yi,vi,data=transf.ies,subset=(moda=="n"),method="DL")
pes.m=predict(transf.pes.m,transf=transf.ipft.hm,targ=list(ni=dat$total),digits=4); pes.m
pes.n=predict(transf.pes.n,transf=transf.ipft.hm,targ=list(ni=dat$total),digits=4); pes.n

the results showed that:

pes.m:  pred    ci.lb   ci.ub   cr.lb   cr.ub
        0.7641  0.6760  0.8422  0.2769  1.0000

pes.n:  pred    ci.lb   ci.ub   cr.lb   cr.ub
        0.5442  0.4727  0.6149  0.1752  0.8872

But when I separated my data into two csv files according to the levels of the moderator and ran a meta-analysis on each one, the estimated average effect sizes and the corresponding CIs were slightly different from before.

dat=read.csv("D:\...\levelm.csv",header=T,sep=",")
transf.ies=escalc(measure="PFT",xi=cases,ni=total,data=dat,add=0)
transf.pes=rma(yi,vi,data=transf.ies,method="DL")
pes.m=predict(transf.pes,transf=transf.ipft.hm,targ=list(ni=dat$total)); pes.m

pes.m:  pred    ci.lb   ci.ub   cr.lb   cr.ub
        0.7647  0.6764  0.8430  0.2764  1.0000

dat=read.csv("D:\...\leveln.csv",header=T,sep=",")
transf.ies=escalc(measure="PFT",xi=cases,ni=total,data=dat,add=0)
transf.pes=rma(yi,vi,data=transf.ies,method="DL")
pes.n=predict(transf.pes,transf=transf.ipft.hm,targ=list(ni=dat$total)); pes.n

pes.n:  pred    ci.lb   ci.ub   cr.lb   cr.ub
        0.5441  0.4727  0.6146  0.1759  0.8864

I wondered how this happened. The issue occurred with or without transformation of the original data. Note that the data contain no proportions of 0 or 1, so I don't think the small discrepancy is due to the adjustment applied to such proportions.

Below are the csv files of my data:

bothlevels.csv
levelm.csv
leveln.csv

The issue here is that if you perform one analysis, you get a single estimate of $\tau^2$. If you split the dataset into subsets, you get a separate estimate of $\tau^2$ in each subset, and this leads to different estimates throughout. Note that this all supposes you are fitting random-effects models, which is indeed the case here.
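To see the point concretely, here is a minimal sketch (reusing the transf.ies object computed from bothlevels.csv above; no specific numbers are claimed here): a single mixed-effects model that includes the moderator estimates one $\tau^2$ shared by both levels, whereas fitting each level separately gives each subset its own $\tau^2$.

library(metafor)
# one model including the moderator: a single tau^2 shared by both levels
res.both=rma(yi,vi,mods=~moda,data=transf.ies,method="DL")
res.both$tau2
# separate fits per level: each subset gets its own tau^2 estimate
res.m=rma(yi,vi,data=transf.ies,subset=(moda=="m"),method="DL")
res.n=rma(yi,vi,data=transf.ies,subset=(moda=="n"),method="DL")
c(res.m$tau2,res.n$tau2)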

There is an extensive explanation on the page linked to by Wolfgang, which also shows how to obtain estimates from the combined dataset that match those from the separate subsets by allowing $\tau^2$ to vary between subsets.
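For what it's worth, one way to let $\tau^2$ differ by subgroup within a single model is rma.mv() with struct="DIAG". The sketch below is only illustrative: the study identifier column is added here for the example (it is not in the original data), and rma.mv() uses REML rather than DL, so the results will be close to, but not necessarily identical to, the subset fits above.

library(metafor)
dat2=transf.ies # the escalc() output for bothlevels.csv from the question
dat2$study=seq_len(nrow(dat2)) # illustrative study id, one row per study
# struct="DIAG" gives each level of moda its own tau^2;
# mods=~moda-1 yields a separate pooled (transformed) estimate per level
res=rma.mv(yi,vi,mods=~moda-1,random=~moda|study,struct="DIAG",data=dat2)
res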
