Linear regression is a statistical tool commonly used to model the relationship between a response variable and one explanatory variable. What factors affect a linear regression model? How can you perform linear regression in R? Please provide an example to illustrate your answer.
First, you need to check the assumptions of the linear regression model.
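There are four assumptions associated with a linear regression model, commonly stated as: a linear relationship between the response and the predictor, independent errors, constant error variance (homoscedasticity), and normally distributed errors.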
These can be checked with the four-in-one diagnostic plot in R, produced by calling plot() on the fitted model.
It is also important to check for the presence of outliers.
Example using R:
Step 1. Import the data into R.
Step 2. Fit the regression model using the lm() function.
Step 3. Plot the fitted model to get the four-in-one plot.
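The three steps above can be sketched end to end as follows. This is a minimal illustration, not a prescribed workflow: the temporary CSV file stands in for a real data file, and the names `csv_path`, `people`, and `model` are assumptions introduced here. The values are the same heights and weights used in the example in this answer.

```r
# Step 1: import data. A temporary CSV stands in for a real file (assumption).
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("height,weight",
    "151,63", "174,81", "138,56", "186,91", "128,47",
    "136,57", "179,76", "163,72", "152,62", "131,48"),
  csv_path
)
people <- read.csv(csv_path)

# Step 2: fit the regression model with lm().
model <- lm(weight ~ height, data = people)

# Step 3: plot the fitted model; a 2 x 2 layout puts the four
# diagnostic plots on one page. par() returns the old settings,
# which are restored afterwards.
old_par <- par(mfrow = c(2, 2))
plot(model)
par(old_par)
```

With a real data set, only the path passed to read.csv() and the column names in the formula would need to change.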
Example:
Consider predicting a person's weight using their height as the predictor variable.
> height = c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
> weight = c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
> fit=lm(weight~height)
> summary(fit)
Call:
lm(formula = weight ~ height)

Residuals:
    Min      1Q  Median      3Q     Max
-6.3002 -1.6629  0.0412  1.8944  3.9775

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -38.45509    8.04901  -4.778  0.00139 **
height        0.67461    0.05191  12.997 1.16e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.253 on 8 degrees of freedom
Multiple R-squared: 0.9548, Adjusted R-squared: 0.9491
F-statistic: 168.9 on 1 and 8 DF, p-value: 1.164e-06
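The coefficient table in this output describes a fitted line, weight = intercept + slope * height. As a rough sketch of how those estimates might be used, the snippet below refits the same model and predicts the weight for a height of 160. The value 160 and the names `new_person` and `predicted` are hypothetical, chosen only for illustration.

```r
# Same data as in the example above.
height <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
weight <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
fit <- lm(weight ~ height)

# Extract the estimated intercept and slope.
coef(fit)

# Predict weight for a hypothetical new height of 160 (assumption).
new_person <- data.frame(height = 160)
predicted <- unname(predict(fit, newdata = new_person))

# The same value computed by hand from the coefficients:
# about -38.455 + 0.6746 * 160, i.e. roughly 69.5.
by_hand <- coef(fit)[["(Intercept)"]] + coef(fit)[["height"]] * 160
```

The slope of about 0.67 means that, in this sample, each additional unit of height is associated with about 0.67 additional units of weight.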
> par(mfrow=c(2,2)) # divides the graphics window into 4 sections
> plot(fit)
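The four-in-one plot includes a Residuals vs Leverage panel, which is one place to look for outliers. The same check can also be made numerically. The sketch below is one possible approach, not the only one: the cutoffs (Cook's distance above 4/n, absolute standardized residual above 2) are common rules of thumb assumed here, and the names `cooks`, `std_res`, and `threshold` are introduced for illustration.

```r
# Same data and model as in the example above.
height <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
weight <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
fit <- lm(weight ~ height)

# Cook's distance measures how much each observation influences the fit.
cooks <- cooks.distance(fit)

# Standardized residuals put the residuals on a common scale.
std_res <- rstandard(fit)

# Rule-of-thumb cutoff (assumption): flag points with distance above 4/n.
threshold <- 4 / length(cooks)

# Indices of observations flagged by each rule (possibly none).
which(cooks > threshold)
which(abs(std_res) > 2)
```

Any flagged observation is worth inspecting before deciding whether it is a data error or a genuine but unusual value.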