I have aggregated (total) view counts for YouTube videos. I took the log of those views and calculated autoregressive coefficients that can be used for tests of view predictability. Let's say I have an array of aggregated daily views for each video. The coefficient (koef) for each video is calculated as:
koef = aggregatedViews / aggregatedViews
The list of koefs for all videos forms the target distribution.
Initially, I thought it would be half-normal, but it does not look normal. Is this a Pareto Type I distribution?
How to determine the type of probability distribution for a dataset?
You can use the fitdistrplus package in R. First, plot a Cullen and Frey graph using the descdist function to find candidate distributions. Then fit the best candidates to your data using fitdist. Next, test the hypothesis that your data come from each of these distributions with a Kolmogorov-Smirnov test or an Anderson-Darling test. Finally, select a fitted distribution using graphical methods or by comparing measures of fit quality such as AIC values.
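To make the fit, test, and compare steps concrete, here is a minimal sketch in plain Python rather than R. It does not use fitdistrplus. It fits three candidates (exponential, log-normal, Pareto Type I) by maximum likelihood, then computes the Kolmogorov-Smirnov distance and the AIC for each. The function names, the choice of candidates, and the synthetic `sample` are assumptions for illustration only, and the Pareto fit assumes the scale is simply the smallest observation.

```python
import math
import random


def fit_exponential(xs):
    # MLE for the exponential rate is 1 / mean.
    rate = len(xs) / sum(xs)
    loglik = sum(math.log(rate) - rate * x for x in xs)

    def cdf(x):
        return 1.0 - math.exp(-rate * x)

    return {"name": "exponential", "k": 1, "params": (rate,),
            "loglik": loglik, "cdf": cdf}


def fit_lognormal(xs):
    # MLE: mean and (biased) standard deviation of log(x).
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    loglik = sum(
        -math.log(x * sigma * math.sqrt(2.0 * math.pi))
        - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2)
        for x in xs
    )

    def cdf(x):
        z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
        return 0.5 * (1.0 + math.erf(z))

    return {"name": "lognormal", "k": 2, "params": (mu, sigma),
            "loglik": loglik, "cdf": cdf}


def fit_pareto(xs):
    # Pareto Type I MLE: scale = min(x), shape = n / sum(log(x / scale)).
    x_min = min(xs)
    alpha = len(xs) / sum(math.log(x / x_min) for x in xs)
    loglik = sum(
        math.log(alpha) + alpha * math.log(x_min) - (alpha + 1.0) * math.log(x)
        for x in xs
    )

    def cdf(x):
        return 1.0 - (x_min / x) ** alpha if x >= x_min else 0.0

    return {"name": "pareto", "k": 2, "params": (x_min, alpha),
            "loglik": loglik, "cdf": cdf}


def ks_statistic(xs, cdf):
    # Largest gap between the empirical CDF and the fitted CDF.
    ordered = sorted(xs)
    n = len(ordered)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(ordered)
    )


def aic(fit):
    # AIC = 2 * (number of parameters) - 2 * log-likelihood; lower is better.
    return 2.0 * fit["k"] - 2.0 * fit["loglik"]


# Synthetic stand-in for the per-video coefficients (NOT real data).
random.seed(0)
sample = [random.paretovariate(2.5) for _ in range(2000)]

fits = [f(sample) for f in (fit_exponential, fit_lognormal, fit_pareto)]
for fit in fits:
    print(fit["name"], "KS =", round(ks_statistic(sample, fit["cdf"]), 4),
          "AIC =", round(aic(fit), 1))

best = min(fits, key=aic)
print("lowest AIC:", best["name"])
```

Note that when the parameters are estimated from the same data, the standard Kolmogorov-Smirnov p-values are no longer exact, so the KS distance here is best read as a descriptive measure alongside the AIC and the plots.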
Here you can find a nice example of this procedure: