Question

USE R STUDIO.


Consider the pressure data frame, which has two columns, temperature and pressure.

• Construct a scatterplot with pressure on the vertical axis and temperature on the horizontal axis.

• The graph of the following function passes through the plotted points reasonably well: y = (0.168 + 0.007 * x)^(20/3). Recall that the differences between the pressure values predicted by the curve (i.e. y) and the observed pressure values (i.e. the pressure values in the data frame) are called residuals. Construct a normal QQ-plot of these residuals and decide whether they are normally distributed or follow a skewed distribution. Write your conclusion as a comment in your R script file.

• Now, apply the power transformation pressure^(3/20) to the pressure data values. Plot these transformed values against temperature. Is there a linear trend? Write it as a comment in your R script file.

• Now build a simple linear regression model between temperature and the transformed pressure pressure^(3/20). Extract residuals from the model. Obtain a normal QQ-plot. Are the residuals normally distributed? Write it as a comment in your R script file.

• For comparison, redraw the QQ-plot of the residuals from the curve and the QQ-plot of the residuals from the simple linear regression model on the transformed data, displayed side by side in a 1 × 2 layout on the graphics page using the mfrow parameter of par().
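
A short derivation may help explain why the exponent 3/20 is the natural choice in the third bullet. Assuming the curve from the second bullet describes the data well, raising both sides to the power 3/20 undoes the exponent 20/3 (the base is positive for temperatures of 0 and above):

```latex
y = (0.168 + 0.007\,x)^{20/3}
\quad\Longrightarrow\quad
y^{3/20} = \left[(0.168 + 0.007\,x)^{20/3}\right]^{3/20} = 0.168 + 0.007\,x
```

So the transformed pressure should be roughly linear in temperature, with an intercept near 0.168 and a slope near 0.007. The fitted regression coefficients need not match these values exactly.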

Solutions

Expert Solution

code

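# Scatterplot: temperature on the horizontal axis, pressure on the vertical axis
plot(pressure$temperature, pressure$pressure,
     xlab = "temperature", ylab = "pressure")

# Overlay the suggested curve y = (0.168 + 0.007 * x)^(20/3)
curve((0.168 + 0.007 * x)^(20/3), from = 0, to = 400, add = TRUE)

# Residuals: observed pressure minus the pressure predicted by the curve
x <- pressure$temperature
y <- (0.168 + 0.007 * x)^(20/3)
resid_curve <- pressure$pressure - y

# Normal QQ-plot of the curve residuals, with a reference line
qqnorm(resid_curve)
qqline(resid_curve)
# Comment: most residuals are close to zero and a few at high temperature are
# large and positive, so the distribution looks right-skewed, not normal.

# Power transformation pressure^(3/20), plotted against temperature
s <- transform(pressure, pressure_transformed = pressure^(3/20))
plot(s$temperature, s$pressure_transformed)
abline(0.168, 0.007)
# Comment: the transformed values rise roughly linearly with temperature.

# Simple linear regression of the transformed pressure on temperature
model <- lm(pressure_transformed ~ temperature, data = s)
model

# Residuals from the model and their normal QQ-plot
resid_lm <- resid(model)
qqnorm(resid_lm)
qqline(resid_lm)
# Comment: judge normality by how closely the points follow the reference line.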
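
The last part of the question, the side-by-side comparison, could be sketched as follows. This is a minimal self-contained example, not the original answer: the names resid_curve, resid_lm, fit and old_par are illustrative, and the residuals are taken as observed minus predicted.

```r
# Sketch only: recompute both sets of residuals so the block runs on its own.
x <- pressure$temperature
resid_curve <- pressure$pressure - (0.168 + 0.007 * x)^(20/3)

# Regression on the transformed scale; I() protects the power inside the formula.
fit <- lm(I(pressure^(3/20)) ~ temperature, data = pressure)
resid_lm <- resid(fit)

# mfrow is a graphical parameter set through par(): 1 row, 2 columns.
old_par <- par(mfrow = c(1, 2))

qqnorm(resid_curve, main = "Curve residuals")
qqline(resid_curve)

qqnorm(resid_lm, main = "Regression residuals (transformed)")
qqline(resid_lm)

# Restore the previous layout so later plots are not affected.
par(old_par)
```

Saving the value returned by par() and restoring it afterwards keeps the 1 × 2 layout from leaking into later plots in the same session.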

Figure: normal QQ-plot of the residuals

Figure: normal QQ-plot of the residuals after the power transformation
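
The visual reading of the QQ-plots can be backed up with a couple of numbers. The sketch below is an optional supplement that the question does not ask for: skewness() is a hypothetical helper defined here (base R has no such function), and shapiro.test() is the Shapiro-Wilk normality test from the stats package. With only 19 observations the test has limited power, so treat the p-values as a rough guide.

```r
# Sketch: numeric checks to accompany the QQ-plots (not part of the original answer).
x <- pressure$temperature
resid_curve <- pressure$pressure - (0.168 + 0.007 * x)^(20/3)
resid_lm <- resid(lm(I(pressure^(3/20)) ~ temperature, data = pressure))

# Hypothetical helper: moment-based sample skewness (0 for a symmetric sample,
# positive for a long right tail, negative for a long left tail).
skewness <- function(r) {
  d <- r - mean(r)
  mean(d^3) / mean(d^2)^1.5
}

skewness(resid_curve)
skewness(resid_lm)

# Shapiro-Wilk test: a small p-value is evidence against normality.
shapiro.test(resid_curve)
shapiro.test(resid_lm)
```

A clearly positive skewness for the curve residuals would agree with a QQ-plot whose upper tail bends away from the reference line.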

