I'm working with a large, national survey that was collected using complex survey methods. In order to make my results representative I need to account for sample weights and other survey design features (e.g., sampling strata). I'm new to this methodology, so apologies if the answers here are obvious.

Some of my models involve only a subset of the data (e.g., only female participants).

I have one question:

Do I need to adjust the sample weights to reflect the fact that I am only analyzing a subsample (e.g., females)? My understanding is that not adjusting the weights can bias results (the standard errors in particular).


#### Best Answer

You do need to *use* the weights, but you do not adjust them, and that holds when you analyze only a subsample such as females. The weights are what adjust the analysis for the complex survey design, so that estimates of the parameters of interest are unbiased and efficient. If you ignore the weights, the estimates will usually be biased, or at best inefficient. Wrong standard errors do not by themselves mean that the point estimate is biased, but they do make inference conservative (or anti-conservative). For instance, to estimate the prevalence of a rare disease, I might draw a simple random sample (SRS) and then collect an additional sample from a high-risk subpopulation. If I analyze only the SRS, the estimate is unbiased but very inefficient. If I instead use both samples and weight each individual by the inverse of their probability of selection, I can get a much tighter confidence interval for the prevalence of disease.
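To make the weighting idea concrete, here is a minimal sketch in base R. It is a simplified version of the rare-disease example: every number (group sizes, selection probabilities, case counts) is made up for illustration, and the column names `group`, `disease`, `p_sel` and `w` are hypothetical.

```r
# Hypothetical sample (all numbers are assumptions for illustration):
# 500 high-risk people sampled with probability 0.5,
# 900 low-risk people sampled with probability 0.1.
samp <- data.frame(
  group   = rep(c("high", "low"), times = c(500, 900)),
  disease = c(rep(c(1, 0), times = c(50, 450)),  # 50 cases among high-risk
              rep(c(1, 0), times = c(9, 891))),  # 9 cases among low-risk
  p_sel   = rep(c(0.5, 0.1), times = c(500, 900))
)

# Inverse probability weight: each person represents 1 / p_sel people
samp$w <- 1 / samp$p_sel

# Ignoring the weights over-represents the high-risk group
unweighted <- mean(samp$disease)

# Weighting restores each group's share of the population
weighted <- weighted.mean(samp$disease, samp$w)

print(c(unweighted = unweighted, weighted = weighted))
```

With these made-up numbers the unweighted proportion is pulled toward the oversampled high-risk group, while the weighted one matches the population the sample was drawn from. This only illustrates the point estimate; valid standard errors also need the design information.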

In R, the `survey` package has methods for calculating mean differences and GLM estimates from complex designs.
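As a rough sketch of what a subsample analysis might look like with that package, the example below uses the `apistrat` data that ships with `survey`, since the asker's data are not available. The condition `sch.wide == "Yes"` is only a stand-in for a subgroup such as female participants, and the model formula is an arbitrary illustration.

```r
library(survey)

data(api)  # example data sets bundled with the survey package

# Declare the design once, on the full sample, with the original weights
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    fpc = ~fpc, data = apistrat)

# Take the subgroup from the design object rather than from the raw
# data frame, so the weights and strata stay attached to it
d_sub <- subset(dstrat, sch.wide == "Yes")

# Weighted mean and its design-based standard error in the subgroup
sub_mean <- svymean(~api00, d_sub)
print(sub_mean)

# Design-based linear model in the same subgroup
fit <- svyglm(api00 ~ ell + meals, design = d_sub)
print(summary(fit))
```

The weights are left untouched here. The subgroup is defined through `subset()` on the design object, which is the package's documented route to subpopulation (domain) estimates.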
