Pearson correlation measures linear association between variables, while Spearman correlation measures monotonic relationships, which may be non-linear. I computed the Pearson and Spearman correlations between different features, and both gave similar values. What does this indicate? How can a linear method give values similar to those of a non-linear method?
Best Answer
Say you have two sets of values, $X$ and $Y$. The Spearman correlation coefficient is obtained by rank transforming $X$ and $Y$, then calculating the Pearson correlation coefficient. If each value in $X$ and $Y$ is a linear function of its rank, the Pearson and Spearman correlation coefficients will be identical.
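As a rough illustration of that definition, here is a minimal sketch that computes Spearman as Pearson on ranks. The helper names (`ranks`, `pearson`, `spearman`) and the test data are my own choices for illustration, and the ranking helper assumes there are no ties.

```python
import numpy as np

def ranks(values):
    """Return 1-based ranks of the values (assumes no ties)."""
    values = np.asarray(values)
    order = np.argsort(values)
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(1, len(values) + 1)
    return r

def pearson(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman correlation: Pearson applied to the rank-transformed data."""
    return pearson(ranks(x), ranks(y))

# Illustrative data: a monotonic but non-linear relationship.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = np.exp(x)

print("Pearson: ", pearson(x, y))
print("Spearman:", spearman(x, y))
```

Because `exp` is strictly increasing, it preserves ranks, so Spearman is 1 here while Pearson is lower.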
Pearson and Spearman correlation coefficients will be similar when there's an underlying linear relationship between $X$ and $Y$. But the reverse isn't necessarily true: similar coefficients don't imply a linear relationship.

Here's an example where $X$ and $Y$ are constructed to be independent (i.e. no relationship), but have identical Pearson and Spearman correlation coefficients. I generated $X$ by randomly permuting a list of integers from 1 to 20, multiplying these values by 0.2, and adding 0.3. I did the same to generate $Y$, but multiplied by 0.5 and added 0.1. $X$ and $Y$ are based on separate random permutations, so they're independent, and a scatter plot of $Y$ against $X$ shows no structure. Each value in $X$ and $Y$ is a linear function of its rank, so plotting the values against their ranks gives a straight line.

In my run, the Pearson and Spearman correlation coefficients were both 0.1699. Of course, the fact that the coefficients are positive is just by chance. $X$ and $Y$ are independent, so if you performed this procedure many times, the average correlation would be 0. But on every iteration, the Pearson and Spearman correlation coefficients would be identical.
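The construction above can be sketched in a few lines. This is not the code I originally ran: the seed, the number of repetitions, and the helper names are assumptions made for illustration, so it will not reproduce the 0.1699 figure, only the behaviour.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman_no_ties(x, y):
    """Spearman correlation as Pearson on ranks (valid when there are no ties)."""
    rank = lambda v: np.argsort(np.argsort(v)) + 1.0
    return pearson(rank(x), rank(y))

rng = np.random.default_rng(42)  # arbitrary seed, chosen for repeatability
n = 20
results = []
for _ in range(1000):
    # Independent permutations of 1..20, each mapped linearly.
    x = 0.2 * rng.permutation(np.arange(1, n + 1)) + 0.3
    y = 0.5 * rng.permutation(np.arange(1, n + 1)) + 0.1
    results.append((pearson(x, y), spearman_no_ties(x, y)))

results = np.array(results)
max_gap = float(np.max(np.abs(results[:, 0] - results[:, 1])))
mean_r = float(results[:, 0].mean())

print("largest |Pearson - Spearman|:", max_gap)
print("mean correlation over runs:  ", mean_r)
```

The largest gap should be at the level of floating-point rounding, and the mean correlation should be close to 0, even though individual runs can give clearly non-zero values.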