Use RStudio.
Consider the pressure data frame. It has two columns: temperature and pressure.
• Construct a scatterplot with pressure on the vertical axis and temperature on the horizontal axis.
• The graph of the following function passes through the plotted points reasonably well: y = (0.168 + 0.007 * x)^(20/3). Recall that the differences between the pressure values predicted by the curve (i.e. y) and the observed pressure values (i.e. the pressure values in the data frame) are called residuals. Construct a normal QQ-plot of these residuals and decide whether they are normally distributed or whether they follow a skewed distribution. Write your answer as a comment in your R script file.
• Now apply the power transformation pressure^(3/20) to the pressure values. Plot the transformed values against temperature. Is there a linear trend? Write your answer as a comment in your R script file.
• Now build a simple linear regression model of the transformed pressure, pressure^(3/20), on temperature. Extract the residuals from the model and obtain a normal QQ-plot of them. Are the residuals normally distributed? Write your answer as a comment in your R script file.
• For comparison, redraw the QQ-plot of the residuals from the curve and the QQ-plot of the residuals from the simple linear regression model on the transformed data side by side in a 1 × 2 layout on the graphics page, using par(mfrow = c(1, 2)).
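The exponent 3/20 in the transformation is the reciprocal of the curve's exponent 20/3, so raising the curve to that power should recover the straight line 0.168 + 0.007 * x. A minimal sketch of that check, assuming only the built-in pressure data frame (the names y_curve and y_transformed are illustrative, not part of the exercise):

```r
# Temperatures in the built-in pressure data frame
x <- pressure$temperature

# The given curve, and the same curve after the 3/20 power transformation
y_curve <- (0.168 + 0.007 * x)^(20/3)
y_transformed <- y_curve^(3/20)

# Compare the transformed curve with the straight line 0.168 + 0.007 * x
# (all.equal() allows for floating-point rounding)
all.equal(y_transformed, 0.168 + 0.007 * x)
```

This is why a line with intercept 0.168 and slope 0.007 is a natural reference when the transformed pressure is plotted against temperature.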
Code:
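library(dplyr)  # provides %>% and mutate()

# Scatterplot: temperature on the horizontal axis, pressure on the vertical axis
plot(pressure$temperature, pressure$pressure,
     xlab = "temperature", ylab = "pressure")
# Overlay the given curve; curve() evaluates the expression in x over [0, 400]
curve((0.168 + 0.007 * x)^(20/3), 0, 400, add = TRUE)

# Residuals of the curve: observed pressure minus predicted pressure
predicted <- (0.168 + 0.007 * pressure$temperature)^(20/3)
curve_resid <- pressure$pressure - predicted  # avoids masking stats::resid()
qqnorm(curve_resid)
qqline(curve_resid)  # reference line for judging normality vs. skewness

# Power transformation pressure^(3/20), plotted against temperature
s <- pressure %>% mutate(pressure_transformed = pressure^(3/20))
plot(s$temperature, s$pressure_transformed)
abline(0.168, 0.007)  # line implied by the curve: intercept 0.168, slope 0.007

# Simple linear regression of transformed pressure on temperature
model <- lm(pressure_transformed ~ temperature, data = s)
model
# Normal QQ-plot of the residuals extracted from the model
model_resid <- residuals(model)
qqnorm(model_resid)
qqline(model_resid)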
Figure: normal QQ-plot of the residuals after the power transformation.
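The script above does not cover the last task, the side-by-side comparison. A minimal sketch of one way to do it, assuming the built-in pressure data frame and base R only; the names curve_resid, fit, lm_resid and old_par are illustrative, and residuals are taken as observed minus predicted:

```r
# Residuals from the given curve: observed minus predicted pressure
curve_pred  <- (0.168 + 0.007 * pressure$temperature)^(20/3)
curve_resid <- pressure$pressure - curve_pred

# Residuals from the linear model fitted to the transformed pressure
fit <- lm(I(pressure^(3/20)) ~ temperature, data = pressure)
lm_resid <- residuals(fit)

# 1 x 2 layout: both QQ-plots on one graphics page
old_par <- par(mfrow = c(1, 2))
qqnorm(curve_resid, main = "Curve residuals")
qqline(curve_resid)
qqnorm(lm_resid, main = "Linear model residuals")
qqline(lm_resid)
par(old_par)  # restore the previous layout
```

Saving the value returned by par() and restoring it afterwards keeps later plots from inheriting the two-panel layout. Points that bend away from the qqline() reference line at one or both ends suggest skewness or heavy tails rather than normality.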