How to apply sampling weights in R

I'm working with a large, national survey that was collected using complex survey methods. In order to make my results representative I need to account for sample weights and other survey design features (e.g., sampling strata). I'm new to this methodology, so apologies if the answers here are obvious.

Some of my models involve only a subset of the data (e.g., only female participants).

I have one question:

Do I need to adjust the sample weights to reflect the fact that I am only analyzing a subsample (e.g., females)? My understanding is that not adjusting the weights can bias results (the standard errors in particular).

Yes, you do need to use the weights. You do not adjust the weights themselves. Rather, by using them you account for the complex design of the survey and obtain unbiased, efficient estimates of the parameters of interest. If you ignore the weights, the point estimates will usually be biased, and even when they are not, the analysis may be inefficient. Incorrect standard errors do not by themselves make an estimate biased, but they do make the inference conservative (or anti-conservative).

For instance, if I wanted to know the prevalence of a rare disease, I might conduct a simple random sample (SRS) and then collect an additional sample from a high-risk subpopulation. If I analyze only the SRS, the estimate is unbiased but very inefficient. By pooling both samples and weighting each observation by the inverse of its probability of being sampled, I can get a much tighter confidence interval for the prevalence of the disease.
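The arithmetic behind inverse probability weighting can be sketched in base R. Every number here is made up for illustration, and for simplicity the sketch assumes a plain two-stratum sample (low risk and high risk, with the high-risk group heavily oversampled) rather than the exact SRS-plus-supplement design in the example above. The names `pop_size`, `samp_size`, `stratum`, `disease`, and `wt` are hypothetical.

```r
# Hypothetical population and sample sizes per stratum (illustrative only)
pop_size  <- c(low = 9000, high = 1000)  # people in the population
samp_size <- c(low = 100,  high = 100)   # people sampled (high risk oversampled)

# One row per sampled person: 2 of 100 low-risk and 20 of 100 high-risk
# respondents have the disease
stratum <- rep(names(samp_size), times = samp_size)
disease <- c(rep(c(1, 0), times = c(2, 98)),
             rep(c(1, 0), times = c(20, 80)))

# Sampling weight = 1 / selection probability = N_h / n_h
wt <- (pop_size / samp_size)[stratum]

# Ignoring the weights overstates prevalence, because high-risk people
# make up half of the sample but only a tenth of the population
unweighted <- mean(disease)               # 22 / 200 = 0.11

# Weighting restores each stratum to its population share
weighted <- weighted.mean(disease, wt)    # 380 / 10000 = 0.038

print(c(unweighted = unweighted, weighted = weighted))
```

This only reproduces the point estimate. Standard errors that respect the design need design-aware tools rather than `weighted.mean()`.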

In R, the survey package has methods for calculating mean differences and GLM estimates from complex designs.
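As a rough sketch of what that looks like in practice, the code below uses the `api` example data that ships with the survey package, not the survey from the question. The subgroup `sch.wide == "Yes"` is just a stand-in for something like "female participants", and the model formula is arbitrary. The main point is that the design is declared once on the full sample and the subgroup is then taken with `subset()` on the design object, with the weights left unchanged, so that the standard errors still reflect the full design.

```r
library(survey)

# Example data shipped with the survey package (not the poster's survey)
data(api)

# Declare the design once, on the full sample: strata, weights, and the
# finite population correction all come from columns of apistrat
dstrat <- svydesign(ids = ~1, strata = ~stype, weights = ~pw,
                    fpc = ~fpc, data = apistrat)

# Weighted mean for the whole population
svymean(~api00, dstrat)

# Subgroup analysis: subset the design object, not the raw data frame
d_sub <- subset(dstrat, sch.wide == "Yes")
svymean(~api00, d_sub)

# Design-based test of a difference in means between two groups
svyttest(api00 ~ sch.wide, dstrat)

# Design-based GLM fitted within the subgroup
fit <- svyglm(api00 ~ ell + meals, design = d_sub)
summary(fit)
```

For a real survey, the `ids`, `strata`, and `weights` formulas would point at whatever design variables the data provider documents. Multistage designs typically also need a cluster (PSU) variable in `ids`.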
