# Is it possible to get Z-scores ranging from -17 to +20?

I have a data set in which I am trying to find outliers. I am using Python libraries to compute the Z-scores with the code below:

``from scipy import stats``

``df['z_score'] = stats.zscore(df[column_Name])``

``new_df = df.loc[df['z_score'].abs() > 3]``

The problem is that a large percentage of my sample has a Z-score > 3 or < -3, so I can't simply drop those rows.

So I checked the Z-scores for all of these columns and rows. They range from -17 to +20. Is it normal to get such large Z-scores, and what does this say about my data?

In this case, how should I proceed? Clearly I can't use a Z-score threshold of 3. How is this handled in practice?

I am new to data science. I searched online but did not find much help on this, so any leads will be appreciated.
I also don't understand the range of -5 to 10 that is displayed at the bottom of the box plot. Looking at it, it seems that the data outside the range -5 to 10 are my outliers.

It means that your data set is more prone to extreme observations than a normal distribution with the same variance. For a normal distribution, you have about a $$0.27\%$$ chance of getting an observation with a z-score of magnitude greater than $$3$$, and it’s extraordinarily unusual to observe z-scores with magnitudes like $$17$$ and $$20$$.
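To put numbers on "extraordinarily unusual", here is a minimal sketch that evaluates the two-sided tail probability of a standard normal at the thresholds mentioned in the question. The helper name `two_sided_tail` is made up for illustration; `scipy.stats.norm.sf` is the survival function, $$1 - \text{CDF}$$.

```python
from scipy import stats


def two_sided_tail(z):
    """Return P(|Z| > z) for a standard normal Z (hypothetical helper)."""
    # sf(z) is the upper-tail probability; double it for both tails.
    return 2 * stats.norm.sf(z)


# Thresholds taken from the question: the usual cutoff of 3 and the observed extremes.
for z in (3, 17, 20):
    print(f"P(|Z| > {z}) = {two_sided_tail(z):.3g}")
```

If the data really were normal, values beyond 17 standard deviations would essentially never appear in any sample of realistic size, so seeing them is strong evidence of heavy tails (or of a few gross errors in the data).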
This is related to a quantity called kurtosis, which quantifies the propensity of a distribution to have extreme values. Every normal distribution has a kurtosis of $$3$$. If you load your data into R and call `kurtosis` from the `moments` package, I would expect you to get a value well above $$3$$. The Python equivalent, since you’re working in Python, is `scipy.stats.kurtosis`. Note that by default (`fisher=True`) scipy subtracts $$3$$ to give the so-called excess kurtosis, so a normal distribution scores $$0$$ there.
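As a rough illustration of the kurtosis check, the sketch below compares a simulated normal sample with a simulated heavy-tailed one. The data are synthetic and are not the asker's; the Student-t distribution with 3 degrees of freedom is just an assumed stand-in for "heavy tails", and `kurtosis_report` is a hypothetical helper name.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic data for illustration only.
normal_sample = rng.normal(size=100_000)
heavy_sample = rng.standard_t(df=3, size=100_000)  # assumed heavy-tailed example


def kurtosis_report(x):
    """Return (excess kurtosis, Pearson kurtosis, largest |z-score|)."""
    excess = stats.kurtosis(x)                # default fisher=True: normal -> 0
    pearson = stats.kurtosis(x, fisher=False)  # Pearson definition: normal -> 3
    max_abs_z = np.abs(stats.zscore(x)).max()
    return excess, pearson, max_abs_z


for name, sample in [("normal", normal_sample), ("heavy-tailed", heavy_sample)]:
    excess, pearson, max_abs_z = kurtosis_report(sample)
    print(f"{name}: excess={excess:.2f}, pearson={pearson:.2f}, max |z|={max_abs_z:.1f}")
```

On your own data you would call `kurtosis_report(df[column_Name].dropna())` instead. A Pearson kurtosis far above 3 (excess far above 0) would be consistent with the very large z-scores you are seeing.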