We are interested in fitting a multiple logistic regression model using data obtained from a national survey of children with special health care needs. The data come with an accompanying weight variable intended to standardize the sampled children to the national population about which we intend to make inference. The weights do not sum to 1, nor are they integers (they may take values such as 23.2). The model is being fit using SAS v9.2. In consulting the documentation for the LOGISTIC procedure, I noticed the following statement in the syntax description:
Caution: PROC LOGISTIC does not compute the proper variance estimators
if you are analyzing survey data and specifying the sampling weights
through the WEIGHT statement. The SURVEYLOGISTIC procedure is designed
to perform the necessary, and correct, computations.
I don't understand why this should be an issue. If model-based standard errors are being computed, then the weighted maximum likelihood estimator should give standard errors that are correct for the population of interest. Is this correct? If the above statement is correct, what likelihood function is SAS's logistic regression solver optimizing?
Best Answer
Most statistical software that is not built specifically for survey (sampling) weights interprets a weight variable either as a frequency, so that a row with weight w counts as w independent observations, or as a correction for heteroscedasticity, and it fits the likelihood function accordingly. The parameter estimates will be correct, but the standard errors will be wildly wrong, usually far too small (i.e., claiming excess precision).
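To make the point concrete, here is a minimal sketch in Python/NumPy (not SAS, and not a reproduction of what PROC LOGISTIC or PROC SURVEYLOGISTIC do internally). It maximizes the weighted log-likelihood sum_i w_i * [y_i log p_i + (1 - y_i) log(1 - p_i)] and then computes two sets of standard errors: a "naive" one from the inverse weighted information, which is what you get when weights are read as frequencies, and a sandwich (linearization-style) one. The data are simulated, the function names are made up for illustration, and the sandwich here assumes a single-stage design with no strata or clusters, so treat it as an illustration of the idea rather than a full survey variance estimator.

```python
import numpy as np

def fit_weighted_logit(X, y, w, n_iter=25):
    """Newton-Raphson for the weighted log-likelihood sum_i w_i * loglik_i."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (w * (y - p))                     # weighted score
        info = X.T @ (X * (w * p * (1 - p))[:, None])   # weighted information
        beta = beta + np.linalg.solve(info, score)
    return beta

def standard_errors(X, y, w, beta):
    """Return (naive, sandwich) standard errors for a weighted logit fit."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    info = X.T @ (X * (w * p * (1 - p))[:, None])
    bread = np.linalg.inv(info)
    # Naive: weights treated as frequencies, so the apparent sample size is sum(w).
    se_naive = np.sqrt(np.diag(bread))
    # Sandwich: uses the observed variability of the weighted score contributions.
    # Assumption: independent observations, no strata or clusters.
    u = X * (w * (y - p))[:, None]
    meat = u.T @ u
    se_sandwich = np.sqrt(np.diag(bread @ meat @ bread))
    return se_naive, se_sandwich

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x))))
w = rng.uniform(5.0, 50.0, size=n)   # non-integer weights, e.g. 23.2

beta = fit_weighted_logit(X, y, w)
se_naive, se_sandwich = standard_errors(X, y, w, beta)
print("beta:       ", beta)
print("naive SE:   ", se_naive)
print("sandwich SE:", se_sandwich)
```

A quick way to see that the naive standard errors cannot be right: multiply every weight by 10. The point estimates and the sandwich standard errors do not change, but the naive standard errors shrink by a factor of sqrt(10), even though no new information has been collected.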