I am looking at state-wide data (the entire population) of each school's grade as a function of the school's poverty index. The data appear to me to be heteroskedastic. I am drawing a line of best fit (LBF) through the data using linear regression (I changed this from a 2-degree polynomial). What I am attempting to do is look at how each school is doing compared to how it was ‘predicted’ to do. In past years a different test was used, and the resulting data were clearly linear and moderately heteroskedastic. I then used SD as a measure so that I could compare across multiple years.
The question I have is: how can I find the SD around an LBF through heteroskedastic data? I just clustered the data based on Chernick's reply. You can see what it now looks like. Very interesting.
So y is school grade and x is poverty index. If you have repeated x values, or several x values close together, you could compute the average squared distance from the observed y to the fitted y in each group. These averages could serve as variance estimates for the residuals, which you can plot against x (picking a central x for each group) to see how the variance changes with x.
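The grouping idea above can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the data are synthetic, the function name `binned_residual_variance`, the choice of equal-count (quantile) bins, and the use of the bin median as the "central x" are all hypothetical choices, not something specified in the answer. The last step, dividing each residual by its group's SD to get a locally standardized score, is an extension aimed at the question's goal of comparing schools across years and is likewise an assumption.

```python
import numpy as np

def binned_residual_variance(x, y, n_bins=10):
    """Fit a straight line, then estimate residual variance within groups of nearby x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Ordinary least-squares line of best fit.
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (slope * x + intercept)

    # Group schools into bins with roughly equal counts (assumed grouping rule).
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    bin_idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)

    centers = np.full(n_bins, np.nan)
    variances = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            centers[b] = np.median(x[mask])                # central x for the group
            variances[b] = np.mean(residuals[mask] ** 2)   # average squared distance to the fit

    # Assumed extension: residuals expressed in units of the local (group) SD.
    z_scores = residuals / np.sqrt(variances[bin_idx])
    return centers, variances, residuals, z_scores

# Synthetic example: the spread of grades grows with the poverty index.
rng = np.random.default_rng(0)
poverty = rng.uniform(0, 100, size=2000)
grade = 90 - 0.3 * poverty + rng.normal(0, 2 + 0.1 * poverty)

centers, variances, residuals, z_scores = binned_residual_variance(poverty, grade)
for c, v in zip(centers, variances):
    print(f"central x = {c:6.1f}   residual variance = {v:8.2f}")
```

Plotting `variances` against `centers` shows how the spread changes with x. Whether ten bins is appropriate depends on how many schools there are; too few schools per bin makes the variance estimates noisy.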