Shapiro-Wilk Test for Normality in R

[This article was first published on R – data technik, and kindly contributed to R-bloggers].


I think the Shapiro-Wilk test is a great way to check whether a variable is normally distributed. Normality is an important assumption behind many models and tests, so it is worth checking before you build or evaluate one.

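Let’s look at how to do this in R! The test is built in as shapiro.test(), and here we run it on the CreditScore column of our data frame:

shapiro.test(data$CreditScore)
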
And here is the output:

Shapiro-Wilk normality test
data: data$CreditScore
W = 0.96945, p-value = 0.2198

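So how do we read this? It may look like the p-value is too high, but it is not. The null hypothesis of the Shapiro-Wilk test is that the data come from a normal distribution, so a p-value above 0.05 means we fail to reject normality. Our p-value of 0.2198 gives us no evidence that the variable departs from a normal distribution. Keep in mind that this is a lack of evidence against normality, not proof of it.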
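To make that reading concrete, here is a minimal sketch of pulling the p-value out of the test result in code. The credit_score vector is simulated stand-in data (an assumption, since the original data set is not shown here), and alpha = 0.05 is just the conventional cutoff:

```r
# Simulated stand-in for data$CreditScore (hypothetical values, not the real data)
set.seed(123)
credit_score <- rnorm(50, mean = 680, sd = 50)

# shapiro.test() returns an "htest" object:
# the W statistic is in $statistic and the p-value is in $p.value
result <- shapiro.test(credit_score)
alpha <- 0.05  # conventional significance level (an assumption, pick your own)

# A p-value above alpha means we fail to reject the null hypothesis of normality
if (result$p.value > alpha) {
  verdict <- "fail to reject normality"
} else {
  verdict <- "reject normality"
}

cat("W =", round(unname(result$statistic), 5),
    "| p-value =", signif(result$p.value, 4),
    "|", verdict, "\n")
```

Note that shapiro.test() only accepts between 3 and 5000 non-missing values, so a larger data set would need to be sampled first.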

Let’s make a histogram to take a look using base R graphics:

hist(data$CreditScore, main="Credit Score", xlab="Credit Score", border="light blue", col="blue", las=1, breaks=5)

The histogram does look roughly bell-shaped, which agrees with the test result.
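One way to make that visual check a little less subjective is to draw a normal curve over the histogram. This is only a sketch: credit_score is again simulated stand-in data (hypothetical, not the original credit scores), and the curve uses the sample mean and standard deviation as its parameters:

```r
# Simulated stand-in for data$CreditScore (hypothetical values, not the real data)
set.seed(123)
credit_score <- rnorm(50, mean = 680, sd = 50)

# freq = FALSE puts the y-axis on the density scale,
# so the bars and the curve are directly comparable
hist(credit_score, freq = FALSE, main = "Credit Score", xlab = "Credit Score",
     border = "lightblue", col = "blue", las = 1)

# Overlay a normal density using the sample mean and standard deviation
m <- mean(credit_score)
s <- sd(credit_score)
curve(dnorm(x, mean = m, sd = s), add = TRUE, lwd = 2)
```

If the bars follow the curve reasonably well, the picture tells the same story as the Shapiro-Wilk result.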

Great! Now we can reasonably treat our credit scores as normal and move on to tests that rely on that assumption.
