In the previous lecture, we saw that by creating a Q-Q plot for residuals, we can visually inspect whether residuals follow a normal distribution or not. Now, let's look at a more concrete method to test the normality of residuals.
We'll use the same models as in the previous lecture, i.e. "model" between "b" and "a" and "model2" between "c" and "a". To test the correlation coefficient in R, we will use the following command:
cor.test()
But before that, let's first find the residuals for both models.
residual <- resid(model)
residual2 <- resid(model2)
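Now, let's find the standard deviation of both sets of residuals in R using sd(). We store the results as sd1 and sd2 rather than sd, so that the variable does not share its name with the function.
sd1 <- sd(residual)
sd2 <- sd(residual2)
Now, for our correlation test we need normally distributed variables. Each one has a mean equal to the mean of the residuals, i.e. 0, and a standard deviation equal to that of the corresponding residuals. Let's create one such variable for each model. The first argument of rnorm(), 100, is the number of values to draw; it must match the number of residuals.
list1 <- rnorm(100,0,sd1)
list2 <- rnorm(100,0,sd2)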
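The setup steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the lecture's data are not shown in this section, so a and b are simulated stand-ins, b is assumed to be the response, and the seed value and coefficients are arbitrary.

```r
# Hypothetical data: the lecture's a and b are not shown, so we simulate stand-ins.
set.seed(42)                          # fix the seed so the random draws are repeatable
a <- runif(100, 0, 10)
b <- 2 + 3 * a + rnorm(100, 0, 1.5)

# Assumption: "model between b and a" means b regressed on a.
model <- lm(b ~ a)
residual <- resid(model)

# Residuals of a least-squares fit with an intercept average to zero
# (up to floating-point error).
mean_res <- mean(residual)

# Standard deviation of the residuals, stored under a name that
# does not shadow the sd() function.
sd1 <- sd(residual)

# Draw as many normal values as there are residuals.
list1 <- rnorm(length(residual), mean = 0, sd = sd1)

print(c(mean = mean_res, sd = sd1))
```

Using length(residual) instead of a hard-coded 100 keeps the two vectors the same length if the data set changes.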
Now, we will find the correlation between residual and list1, as well as between residual2 and list2.
cor.test(residual, list1)
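We get a correlation estimate of 0.1225952 and a p-value of 0.22. Because list1 is drawn at random, these values will differ from run to run unless a seed is fixed with set.seed().
Since the p-value is greater than 0.05, we fail to reject the null hypothesis of zero correlation at the 5% significance level, i.e. we have no evidence that residual is correlated with list1. In this lecture's approach, that is read as the residuals of the first model not following a normal distribution. Keep in mind, though, that list1 is generated independently of the residuals, so a correlation near zero is expected whatever the distribution of the residuals; treat this conclusion with caution.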
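To make the decision rule explicit, here is a minimal sketch of how the pieces of a cor.test() result can be read. The vectors x and y are hypothetical stand-ins for residual and list1, and the seed is arbitrary.

```r
set.seed(1)
x <- rnorm(100)            # hypothetical stand-in for the residuals
y <- rnorm(100)            # independent normal draws, playing the role of list1

result <- cor.test(x, y)   # Pearson correlation test by default

r_hat <- unname(result$estimate)   # sample correlation coefficient
p_val <- result$p.value            # p-value for H0: true correlation is 0
ci    <- result$conf.int           # 95% confidence interval for the correlation

# Decision at the 5% significance level:
# TRUE means reject H0, FALSE means fail to reject H0.
reject_h0 <- p_val < 0.05

print(list(r = r_hat, p = p_val, ci = ci, reject = reject_h0))
```

A p-value above 0.05 only means the data do not contradict zero correlation; it does not prove that the correlation is exactly zero.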
cor.test(residual2, list2)
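We get a correlation estimate of 0.1495362 and a p-value of 0.1376. These values also depend on the random draw in list2.
Since the p-value is greater than 0.05, we fail to reject the null hypothesis of zero correlation at the 5% significance level, i.e. we have no evidence that residual2 is correlated with list2. In this lecture's approach, that is read as the residuals of the second model not following a normal distribution either. As list2 is also generated independently of residual2, a near-zero correlation is expected here whatever the shape of the residuals, so this conclusion deserves caution.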
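For comparison, a common variant of the correlation idea pairs the sorted residuals with the theoretical normal quantiles, which is the correlation behind the Q-Q plot. The sketch below is a different procedure from the one used in this lecture, shown only as an illustration; the data, the seed and the helper name qq_correlation are hypothetical.

```r
set.seed(7)
res_normal <- rnorm(100, 0, 2)          # hypothetical residuals that are normal
res_skewed <- rexp(100, rate = 1) - 1   # hypothetical residuals that are skewed

# Correlation between the ordered values and the normal quantiles
# expected at the same positions (the points of a normal Q-Q plot).
qq_correlation <- function(res) {
  n <- length(res)
  theoretical <- qnorm(ppoints(n))      # expected standard normal quantiles
  cor(sort(res), theoretical)
}

r_normal <- qq_correlation(res_normal)
r_skewed <- qq_correlation(res_skewed)

print(c(normal = r_normal, skewed = r_skewed))
```

Here a correlation close to 1 supports normality, while a clearly lower value points away from it. Because the values are sorted before being compared, the result does not depend on a fresh random draw.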