For a Bayesian logistic regression problem, I have created a posterior predictive distribution. Sampling from it gives me thousands of 0/1 draws for each observation. Visualizing the goodness of fit this way is less than informative, for example:

This plot shows the 10,000 samples plus the observed data point (far to the left you can just make out a red line: that's the observation). The problem is that this plot is hardly informative, and I'll have 23 of them, one for each data point.

Is there a better way to visualize the 23 data points plus their posterior samples?
Another attempt based on the paper here
I have a feeling you're not quite giving us all the details of your situation, but given what we have in front of us, let's consider the utility of a simple dot-plot to display the information.

The only real things to note here (that aren't perhaps default behaviors) are:
- I utilized redundant encodings, shape and color, to discriminate between the observed values of no defects and defects. With such simple information, placing a dot on the graph is not strictly necessary. Also, when a point is near the middle values, it takes more look-up to tell whether the observed value is zero or one.
- I sorted the graphic according to observed proportion.
Sorting is the real kicker for dot-plots like these. Sorting by proportion here makes it easy to uncover high-residual observations. A system that lets you easily sort by values contained in the plot, or by external characteristics of the cases, gives you the most bang for your buck.

This advice extends to continuous observations as well. You could color/shape the points according to whether the residual is negative or positive, and then size the points by the absolute (or squared) residual. IMO that isn't necessary here, though, because of the simplicity of the observed values.
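To make the suggestion concrete, here is a minimal sketch of such a sorted dot-plot in matplotlib. The data are simulated stand-ins for your setup (23 observations, 10,000 posterior predictive 0/1 draws each); the variable names and the simulated probabilities are assumptions, not your actual model output.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

# Hypothetical stand-in data: 23 observations, each with 10,000
# posterior predictive 0/1 draws; obs_y holds the observed outcomes.
n_obs, n_draws = 23, 10_000
true_p = rng.uniform(0.05, 0.95, size=n_obs)
pp_draws = rng.binomial(1, true_p, size=(n_draws, n_obs))
obs_y = rng.binomial(1, true_p)

# Collapse each observation's draws to its predicted proportion of 1s.
pred_prop = pp_draws.mean(axis=0)

# Sort by predicted proportion -- the "real kicker" for dot-plots.
order = np.argsort(pred_prop)

fig, ax = plt.subplots(figsize=(5, 6))
for rank, i in enumerate(order):
    defect = obs_y[i] == 1
    # Redundant encodings: shape AND color both mark the observed outcome.
    ax.scatter(pred_prop[i], rank,
               marker="^" if defect else "o",
               color="firebrick" if defect else "steelblue")
ax.set_xlim(0, 1)
ax.set_yticks(range(n_obs))
ax.set_yticklabels([f"case {i}" for i in order])
ax.set_xlabel("posterior predictive proportion of defects")
fig.tight_layout()
fig.savefig("dotplot.png")
```

Cases where a blue circle (observed 0) sits at a high predicted proportion, or a red triangle (observed 1) at a low one, pop out immediately as the high-residual observations.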