Solved – Heteroskedasticity and standard deviation

I am looking at state-wide data (the entire population) of school grades as a function of each school's poverty index. The data appear to me to be unconditionally heteroskedastic. I am drawing a line of best fit (LBF) through the data using linear regression (I changed this from a 2nd-degree polynomial). What I am attempting to do is see how each school is doing compared with how it was 'predicted' to do. In past years a different test was used, and the data were clearly linear and moderately heteroskedastic. I then used SD as a measure so that I could compare across multiple years.

The question I have is: how can I find the SD about an LBF through heteroskedastic data? I have just clustered the data based on Chernick's reply. You can see what it now looks like. Very interesting.

So y is school grade and x is poverty index. If you have repeated x values, or several x values close together, you could compute the average squared distance from the observed y to the fitted y in each group. These averages could serve as variance estimates for the residuals, which you can plot against x (picking a central x for each group) to see how the variance changes with x.
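A minimal sketch of that grouping idea in Python, using only the standard library. The function names (`fit_line`, `binned_residual_variance`), the equal-count grouping, the choice of the median as the "central x", and the synthetic data are all assumptions made for illustration; they are not the poster's actual data or method.

```python
import random
from statistics import fmean, median


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y on x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def binned_residual_variance(x, y, n_bins=5):
    """Return a list of (central x, mean squared residual), one per group.

    Points are sorted by x and split into n_bins groups of nearly equal
    size (an assumed grouping rule; any sensible clustering of x works).
    """
    if n_bins < 1 or n_bins > len(x):
        raise ValueError("n_bins must be between 1 and the number of points")
    slope, intercept = fit_line(x, y)
    # Pair each x with its residual (observed y minus fitted y), sorted by x.
    pairs = sorted((xi, yi - (slope * xi + intercept)) for xi, yi in zip(x, y))
    size, extra = divmod(len(pairs), n_bins)
    result, start = [], 0
    for b in range(n_bins):
        stop = start + size + (1 if b < extra else 0)
        group = pairs[start:stop]
        start = stop
        center = median([xi for xi, _ in group])      # central x of the group
        variance = fmean([r ** 2 for _, r in group])  # average squared distance
        result.append((center, variance))
    return result


# Synthetic example (made-up numbers): noise SD grows with the poverty index.
random.seed(0)
poverty = [random.uniform(0, 100) for _ in range(1000)]
grade = [90 - 0.4 * p + random.gauss(0, 1 + 0.1 * p) for p in poverty]

for center, variance in binned_residual_variance(poverty, grade, n_bins=5):
    print(f"x ~ {center:6.1f}   variance ~ {variance:8.2f}   SD ~ {variance ** 0.5:6.2f}")
```

The square root of each group's value gives a local SD estimate at that group's central x, which is one way to read off how the spread around the fitted line changes with x.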
