The proper way to compute the r-squared between a binned distribution of observed values and a continuous probability density function

I have circular data — observations each of which falls between -180 degrees and +180 degrees — divided into a 15-bin histogram.

I'd like to see how well a continuous PDF — specifically a mixture of a von Mises distribution and a uniform distribution, with particular parameters — fits the observed histogram.

To quantify this fit, I'd like to use the r-squared statistic. I'd like to leave aside the question of whether I should be using r-squared or something else. My choice of r-squared is based on Zhang & Luck, Nature, 2008 and Zhang & Luck, Psychological Science, 2009, work I'm trying to replicate. (These papers did exactly what I want to do: compute the r-squared between a 15-bin histogram of circular data and the mixture model.) But if you'd like to suggest a better method, and can describe it clearly, I'd be happy to try it out.

My question is, how should I compute the r-squared? Should I bin the continuous function, and then compare the PDF bin heights to the observed bin heights? Should I take the mean of the PDF over the range spanned by each bin of the observed data? Should I compare the bin centers to the corresponding points in the continuous function?
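To make the candidate approaches concrete, here is a minimal sketch using NumPy. Everything specific in it is an assumption for illustration: the parameter values (`mu_deg`, `kappa`, `g`), the synthetic data, the helper names `mixture_pdf_deg` and `r_squared`, and the definition of r-squared as 1 - SS_res/SS_tot. It is not taken from Zhang & Luck. Note that binning the PDF (integrating it over each bin) and taking the mean of the PDF over each bin differ only by a factor of the bin width, so they give the same r-squared once the observed histogram is expressed in matching units. Evaluating the PDF at the bin centers is a different, cruder approximation.

```python
import numpy as np

def mixture_pdf_deg(x_deg, mu_deg=0.0, kappa=8.0, g=0.3):
    """Density (per degree) of a von Mises + uniform mixture on [-180, 180].

    mu_deg, kappa and g (the uniform weight) are hypothetical values
    chosen only for this illustration.
    """
    x = np.deg2rad(np.asarray(x_deg, dtype=float))
    mu = np.deg2rad(mu_deg)
    # von Mises density per radian; np.i0 is the modified Bessel function I0
    vm = np.exp(kappa * np.cos(x - mu)) / (2 * np.pi * np.i0(kappa))
    # convert to per-degree units and mix with the uniform density 1/360
    return (1 - g) * vm * (np.pi / 180) + g / 360.0

def r_squared(observed, predicted):
    """1 - SS_res / SS_tot (one common definition, assumed here)."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1 - ss_res / ss_tot

# 15 equal-width bins spanning -180..180 degrees
edges = np.linspace(-180, 180, 16)
centers = 0.5 * (edges[:-1] + edges[1:])
width = edges[1] - edges[0]

# Synthetic observations drawn from the same mixture (stand-in for real data)
rng = np.random.default_rng(0)
n = 2000
is_guess = rng.random(n) < 0.3
samples = np.where(
    is_guess,
    rng.uniform(-180, 180, n),
    np.rad2deg(rng.vonmises(0.0, 8.0, n)),
)
counts, _ = np.histogram(samples, bins=edges)
observed = counts / (counts.sum() * width)  # histogram as density per degree

# Approach 1: mean of the PDF over each bin (midpoint rule on a fine grid).
# Multiplying by `width` would give the binned PDF (probability per bin).
sub = 200
offsets = (np.arange(sub) + 0.5) / sub * width
fine = edges[:-1, None] + offsets[None, :]
pred_mean = mixture_pdf_deg(fine).mean(axis=1)

# Approach 2: PDF evaluated at the bin centers only
pred_center = mixture_pdf_deg(centers)

r2_mean = r_squared(observed, pred_mean)
r2_center = r_squared(observed, pred_center)
print(f"r^2 (bin-averaged PDF): {r2_mean:.4f}")
print(f"r^2 (PDF at bin centers): {r2_center:.4f}")
```

With wide bins and a sharply peaked von Mises component, the bin-center value overstates the average density in the bin containing the peak, which is one reason to prefer averaging (or integrating) the PDF over each bin when comparing it to a histogram.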

You can plot the empirical data against your distribution with the fitted parameters. Please see the answer at this link, and plot a histogram of your original data together with a smoothed histogram of your fitted data, where data1 is the original data and data2 is the fitted data generated using the MLE estimates.
