This question involves the use of simple linear regression on the fat data set, which can be found in the faraway library.
Use the lm() function to perform a simple linear regression with brozek (percent body fat using the reference method) as the response and abdom (abdomen circumference in cm) as the predictor. Print the results of the summary() function and submit them along with your answers to the following questions.
library(faraway) #Importing faraway library
library(dplyr) #Importing dplyr library
glimpse(fat) #a short summary with data values of dataset fat
#Using lm() function to fit the linear model, where abdom is the predictor variable and brozek is the dependent variable from dataset fat.
lml <- lm(brozek~abdom, data = fat)
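#New data frame holding the abdomen circumference (100 cm) at which to predict
newfat <- data.frame(abdom = 100)
#Predicted percent body fat with its 95% confidence interval
pred <- predict(lml, newfat, interval = "confidence")
pred
#95% prediction interval for an individual with the same abdomen circumference
predict(lml, newfat, interval = "prediction")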
#Scatter plot
plot(brozek~abdom, data = fat)
#Using abline() function to extract the coefficients of the fitted model and add the regression line to the plot
abline(lml)
names(lml)
#Using summary function to output the results of linear regression model
summary(lml)
lml
Explanation:
1. How strong is the relationship between the predictor and the response?
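A. Since the p-value for the slope is less than 0.05, we can reject the null hypothesis that the slope β is 0. Hence there is a statistically significant linear relationship between the predictor and the dependent variable. The p-value only shows that a relationship exists; its strength is measured by the Multiple R-squared value in the summary() output, which is the proportion of the variance in brozek explained by abdom.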
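The quantities that speak to significance and strength can be pulled out of the fitted model directly. The following is a minimal sketch, assuming the faraway package is installed; the variable names r_squared, slope_p and r are chosen here for illustration only.

```r
library(faraway)  # assumption: the faraway package is installed

# Same model as in the answer above
lml <- lm(brozek ~ abdom, data = fat)
s <- summary(lml)

# Proportion of the variance in brozek explained by abdom
r_squared <- s$r.squared

# p-value of the t-test for H0: slope = 0
slope_p <- coef(s)["abdom", "Pr(>|t|)"]

# Sample correlation; in simple linear regression its square equals R-squared
r <- cor(fat$brozek, fat$abdom)

print(c(r_squared = r_squared, correlation = r, slope_p_value = slope_p))
```

An R-squared close to 1 indicates a strong linear relationship, while a value close to 0 indicates a weak one, regardless of how small the p-value is.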
2. What is the predicted percent body fat associated with an abdomen circumference of 100 cm? What are the associated 95% confidence and prediction intervals?
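A. The predicted percent body fat is the fit value returned by predict(). The 95% confidence interval for the mean percent body fat at an abdomen circumference of 100 cm is between 23.611% fat and 23.97347% fat. The 95% prediction interval is obtained the same way with interval = "prediction"; it is wider than the confidence interval because it also accounts for the scatter of individual observations around the regression line.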
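Both intervals can be computed side by side with predict(). The following is a minimal sketch, assuming the faraway package is installed; new_obs, conf_int and pred_int are illustrative names, not part of the original answer.

```r
library(faraway)  # assumption: the faraway package is installed

lml <- lm(brozek ~ abdom, data = fat)

# The abdomen circumference (in cm) at which to predict
new_obs <- data.frame(abdom = 100)

# Interval for the mean percent body fat at abdom = 100
conf_int <- predict(lml, new_obs, interval = "confidence", level = 0.95)

# Interval for a single new individual with abdom = 100
pred_int <- predict(lml, new_obs, interval = "prediction", level = 0.95)

# Each result has the columns fit, lwr and upr
print(rbind(confidence = conf_int[1, ], prediction = pred_int[1, ]))
```

The fit column is identical in both rows; only the lwr and upr bounds differ.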
3. Plot the scatterplot of response against the predictor. Use the abline() function to display the least squares regression line.
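A. The scatterplot with the least squares regression line is produced by the plot(brozek~abdom, data = fat) and abline(lml) calls in the code above.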
4. Are there any outliers?
A. Yes, there are 2 outliers in this data. In a simple linear regression model, an outlier is an observation with a large residual: its value of the dependent variable is unusual compared with what its predictor value would suggest.
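One common way to screen for such observations is to look at the standardized residuals. The following is a minimal sketch, assuming the faraway package is installed; the cutoff of 3 is a rule-of-thumb assumption, not something stated in the question, and the points it flags need not match the ones identified by eye from the plot.

```r
library(faraway)  # assumption: the faraway package is installed

lml <- lm(brozek ~ abdom, data = fat)

# Standardized (internally studentized) residuals of the fitted model
std_res <- rstandard(lml)

# Assumed rule-of-thumb cutoff for a "large" residual
threshold <- 3

# Row indices of observations whose standardized residual exceeds the cutoff
flagged <- which(abs(std_res) > threshold)

# Show the flagged observations together with their standardized residuals
print(cbind(fat[flagged, c("brozek", "abdom")], std_res = std_res[flagged]))
```

Calling plot(lml) also produces the standard diagnostic plots, which label the most extreme observations by row number.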