# Solved – Drought index calculation

I am calculating a drought index and I have a set of values. The next step in the procedure is to standardize the values so they can be compared across regions and time scales. The paper that establishes this particular drought index fits its dataset to the log-logistic distribution and uses the probability distribution function to standardize the values. My dataset does not follow the log-logistic distribution, or any other distribution I can test for. Is there a procedure to standardize values that do not follow a known distribution?

What I am trying to do is calculate the drought index values to use in an attempt to correlate drought index with irrigation water demand.

I guess the author is referring to the Standardized Precipitation Index (SPI; McKee et al., 1993) or one derived from it (e.g., SRI (Shukla & Wood, 2008), for runoff; SSI (Hao & AghaKouchak, 2013), for soil moisture; SPEI (Vicente-Serrano et al., 2010), for precipitation and evapotranspiation; etc.). Although each of these indices was originally proposed with a specific probability distribution function (PDF), other PDFs have been used to compute them since their publication. For example, Guttman (1999) found that the Pearson type III was the best model to compute the SPI for a large data set in the U.S., while Lana et al. (2001) found that the Poisson-gamma distribution was the best for the data analyzed in Catalonia, Spain.

Furthermore, there are non-parametric approaches for computing this type of drought index. For example, Farahmand & AghaKouchak (2015) proposed using the empirical probability estimated with the general formula for plotting positions (Helsel et al., 2020):

$$p = \frac{i - \alpha}{n - \alpha - \beta + 1}$$

where $$i$$ denotes the rank of non-zero values in the data set, from the smallest; $$n$$ is the sample size; and $$\alpha = \beta = 0.44$$.
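As a rough illustration, here is a minimal Python sketch of the plotting-position approach. It makes a few assumptions that are not in the papers cited above: the function name `empirical_standardized_index` is made up, all values are ranked (zeros are not treated separately), ties get their average rank, and the empirical probability is mapped to an index through the inverse standard normal CDF, which is the usual last step for SPI-type indices.

```python
from statistics import NormalDist


def empirical_standardized_index(values, alpha=0.44, beta=0.44):
    """Rank-based standardized index (hypothetical helper, not from the cited papers)."""
    n = len(values)
    # Indices of the observations, sorted from the smallest value.
    order = sorted(range(n), key=lambda k: values[k])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        # Find the run of tied values starting at `pos`.
        end = pos
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Assumption: tied values share the average of their 1-based ranks.
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    # Plotting position: p = (i - alpha) / (n - alpha - beta + 1).
    probs = [(r - alpha) / (n - alpha - beta + 1) for r in ranks]
    # Map each probability to a standard normal quantile.
    std_normal = NormalDist()
    return [std_normal.inv_cdf(p) for p in probs]


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(empirical_standardized_index([12.0, 3.5, 7.1, 20.4, 5.0]))
```

Because the index depends only on ranks, it does not matter which distribution the data follow, which is the point of the non-parametric approach.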

Also, Kumar et al. (2016) used Kernel Density Estimation (KDE) with a Gaussian kernel to estimate the probability distribution of the data set.
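The KDE idea can be sketched in the same way. The cumulative probability of a Gaussian KDE is the average of normal CDFs centered on each observation, and that probability can then be converted to a standardized value. This is only a sketch under my own assumptions, not the procedure of Kumar et al. (2016): the function name `kde_standardized_index` is hypothetical, and the default bandwidth uses a normal-reference rule of thumb ($$h = 1.06\, s\, n^{-1/5}$$), which may differ from what the paper uses.

```python
from statistics import NormalDist, stdev


def kde_standardized_index(values, bandwidth=None):
    """Standardized index from a Gaussian KDE (hypothetical helper)."""
    n = len(values)
    if bandwidth is None:
        # Assumption: normal-reference rule-of-thumb bandwidth.
        bandwidth = 1.06 * stdev(values) * n ** (-1 / 5)
    std_normal = NormalDist()
    index = []
    for x in values:
        # KDE cumulative probability: mean of normal CDFs centered on each observation.
        cdf = sum(std_normal.cdf((x - xi) / bandwidth) for xi in values) / n
        # Keep the probability strictly inside (0, 1) so inv_cdf is defined.
        cdf = min(max(cdf, 1e-12), 1 - 1e-12)
        index.append(std_normal.inv_cdf(cdf))
    return index


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(kde_standardized_index([12.0, 3.5, 7.1, 20.4, 5.0]))
```

Unlike the rank-based version, the result here depends on the bandwidth, so it is worth checking how sensitive the index is to that choice.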

I guess the key idea here is that you're not constrained to use a specific PDF. Rather, use the one that best fits your data, or, if different sites in your analysis are best fit by different PDFs, use a non-parametric approach.
