What is the explanation for obtaining a *Pearson's correlation coefficient* value that is significantly larger (a factor of ~2) than the *Spearman's rank correlation coefficient* value (on the same data)?

Doesn't this go against the idea that *Spearman's rank correlation coefficient*, being the *Pearson's correlation coefficient* of the ranked data, can be seen as a generalization of *Pearson's* measure to monotonic dependences instead of linear ones? How can the correlation coefficient for a monotonic dependence be smaller than that for a strictly linear one?

I was surprised to see that this was possible in a dataset with $N \approx 100$ elements. I should add that the p-value associated with *Pearson's correlation coefficient* is reported as 0.0, while that of *Spearman's rank* is ~0.10.

**Possible explanation:**

This behaviour might be driven by the extreme values of the dataset. Below I compare the values of Pearson's c.c. ($\rho$) and Spearman's rank c.c. ($\rho_r$) before and after removing them. The p-values are two-sided.

Full dataset: $\rho$ = 0.381 (p-value: 0.000), $\rho_r$ = 0.151 (p-value: 0.131)

One outlier removed: $\rho$ = 0.336 (p-value: 0.001), $\rho_r$ = 0.125 (p-value: 0.213)

Three outliers removed: $\rho$ = 0.167 (p-value: 0.100), $\rho_r$ = 0.076 (p-value: 0.459)

The remaining distribution (plotted) does not seem to be affected by the presence of outliers, and yet it still exhibits the same behaviour. The full data is available here; note the outliers correspond to the first three rows.


#### Best Answer

This is a simple dataset, where the points come alternating from two linear functions:

The Pearson correlation detects that there is a general upward trend in the combined data (red and black together), and gives r = .453. The Spearman correlation sees only the ranks, which are distributed like this:

High and low ranks alternate, so there is no clear trend for Spearman to pick up: Spearman r = .079. The Pearson value is 5.7 times as high, and you can easily increase that ratio by extending the sequence. You can even get a negative Spearman together with a positive Pearson just by leaving out the last value. So nothing stands in the way of a combination of a large Pearson r and a small Spearman r, and the picture above is even a bit similar to yours.

You can easily see how I constructed the data by looking at them:

1, -.01, 2, -.02, 3, -.03, 4, -.04, 5, -.05, 6, -.06, 7, -.07, 8, -.08, 9, -.09, 10
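If you want to check this yourself, here is a minimal sketch in plain Python. It assumes the x values are simply the positions 1 to 19 in the sequence (the answer does not state them explicitly), and the helper names `pearson`, `ranks` and `spearman` are made up for this illustration. The rank helper assumes there are no ties, which holds for this data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n of the values (assumes no ties, true for this data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Rebuild the sequence 1, -.01, 2, -.02, ..., 9, -.09, 10
y = []
for k in range(1, 10):
    y += [k, -0.01 * k]
y.append(10)

# Assumption: x is just the position in the sequence
x = list(range(1, len(y) + 1))

r_full = pearson(x, y)
rho_full = spearman(x, y)

# Leave out the last value, as suggested above
r_trimmed = pearson(x[:-1], y[:-1])
rho_trimmed = spearman(x[:-1], y[:-1])

print(f"full:    Pearson {r_full:.3f}, Spearman {rho_full:.3f}")
print(f"trimmed: Pearson {r_trimmed:.3f}, Spearman {rho_trimmed:.3f}")
```

With the full sequence, Pearson comes out at roughly 0.45 and Spearman at roughly 0.08. With the last value dropped, Spearman turns slightly negative while Pearson stays positive.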

Hope that helps, Bernhard
