Question

An endocrinologist was interested in exploring the relationship between the level of a steroid (Y) and age (X) in healthy subjects whose ages ranged from 8 to 25 years. She collected a sample of 27 healthy subjects in this age range. The data are located in the file problem01.txt, where the first column represents X = age and the second column represents Y = steroid level. For all R programming, print the input code and the output values.

(a) Read the file problem01.txt into R using the read.table() function. You’ll need to set the working directory to the file location. Make a scatterplot of steroid (Y) versus age (X). Include the plot.

(b) Use R to fit a simple linear regression. Write down the fitted equation and multiple R^2 from the summary() output. Also comment on the p-value for the $\beta_1$ coefficient.

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

(c) Make a scatterplot of the fitted values versus the standardized residuals for the model in part (b). Are there any violations of the model assumptions? Include a copy of your plot.

(d) Create a quadratic regression in R. Write down the fitted equation, multiple R^2, and the p-value for $\beta_1$ from the summary() output. Compare to part (b).

$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \varepsilon_i$

problem01.txt

"age" "steroid"

15 14.1

10 8.5

13 10.8

16 18.4

10 4.7

18 23.3

16 16.4

10 9.4

16 17.7

23 35.8

19 25.4

18 24.9

24 42.1

19 26.5

24 40

12 10.7

13 11.6

10 3.6

23 37.9

17 16.8

19 24

23 37.7

20 29.6

14 13.7

19 23.1

11 8.3

17 19.6

9 7.8

11 7.1

13 13.3

18 20.8

25 44.4

9 9.7

12 12.5

22 34.9

8 4.3

9 5.9

8 6

22 36.2

15 11.7

10 5.3

15 15.6

9 6.6

14 15.7

13 10.5

17 20.7

23 36.8

23 37.2

8 5

16 19.6

16 18.9

15 16.1

10 7.7

14 11.9

12 9

8 4.4

8 2.7

8 5.2

16 19.3

20 27.5

20 27.8

13 12.9

12 12.8

13 9.3

15 16.1

19 25

13 10.5

13 9

18 22.3

22 33.6

9 4.9

19 28.4

15 14

21 30.6

19 24.8

R Outline Sample

########################

####### Part (a) #######

########################

# First save the file 'problem01.txt' on your computer.

# Next, set the working directory to the file location by doing the following:

# 1) Click on 'Session' on the top menu

# 2) Select 'Set Working Directory' > 'Choose Directory'

# 3) Select the folder where 'problem01.txt' is saved

# Read in data using the read.table() function.

dat <-

attach(dat)

# Create a scatterplot of age (X) vs steroid (Y)

# Write code here

########################

####### Part (b) #######

########################

# Fit a simple linear regression, then display the summary

fit <- # Enter code for simple linear regression

summary(fit)

########################

####### Part (c) #######

########################

# Plot the fitted values versus the standardized residuals for the fitted

# equation in part (b). Use the functions: sigma(), resid(), and predict()

y.hat <-

e.std <-

plot(y.hat, e.std, main = "Standardized Residuals vs. Fitted Value")

########################

####### Part (d) #######

########################

Solutions

Expert Solution

rm(list = ls())

# Read the data; file.choose() opens a dialog for selecting problem01.txt
# (alternatively, pass the path to the file directly)

df <- read.table(file.choose(), header = TRUE)

head(df)

# Scatterplot of steroid (Y) versus age (X); column names match the file header

plot(df$age, df$steroid, xlab = "Age", ylab = "Steroid level")

# Fit the simple linear model

model.linear <- lm(steroid ~ age, data = df)

# Summary of the model: coefficients, p-values and R-squared

summary(model.linear)

# Add age^2 to the data

df$age2 <- df$age^2

# Fit the quadratic regression

model.quadratic <- lm(steroid ~ age + age2, data = df)

# Summary of the quadratic model

summary(model.quadratic)

Part(a)

Part(b)

Age is highly correlated with steroid level, and the R^2 is very high, so the linear model fits the data quite well.
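The quantities asked for in part (b) can be pulled out of the fitted object rather than read off the printed summary. The following is only a sketch: it uses the first ten rows of problem01.txt typed in by hand so that it runs without the file, so its numbers will differ from those of the full data set. The names dat, fit, s, b, r2 and p_slope are illustrative and do not come from the original solution.

```r
# First ten rows of problem01.txt, typed in so the sketch is self-contained
# (assumption: the full analysis would use every row of the file)
dat <- data.frame(
  age     = c(15, 10, 13, 16, 10, 18, 16, 10, 16, 23),
  steroid = c(14.1, 8.5, 10.8, 18.4, 4.7, 23.3, 16.4, 9.4, 17.7, 35.8)
)

fit <- lm(steroid ~ age, data = dat)
s   <- summary(fit)

b       <- coef(fit)                    # intercept and slope estimates
r2      <- s$r.squared                  # "Multiple R-squared" in summary()
p_slope <- coef(s)["age", "Pr(>|t|)"]   # p-value for the slope

# Print the fitted equation, R-squared and slope p-value
cat(sprintf("steroid-hat = %.3f + %.3f * age\n", b[1], b[2]))
cat(sprintf("R-squared = %.4f, p-value for slope = %.3g\n", r2, p_slope))
```

A small p-value for the slope means the data are inconsistent with the hypothesis that the slope is zero; on its own it does not show that a straight line is the right functional form.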

Part(c)

Residuals vs. fitted

There is a non-linear pattern in the residuals, which suggests we should try a quadratic model.

Standardized residuals vs. fitted

This plot is useful for checking for heteroscedasticity, but it does not show clear evidence of it.
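The plot for part (c) could be produced along the lines suggested in the outline, using predict(), resid() and sigma(). This is a sketch under assumptions: it uses only the first ten rows of problem01.txt, entered by hand so it runs on its own, and it scales the residuals by the residual standard error as the outline hints. The built-in rstandard() is an alternative that also adjusts for leverage.

```r
# First ten rows of problem01.txt (assumption: use the full file in practice)
dat <- data.frame(
  age     = c(15, 10, 13, 16, 10, 18, 16, 10, 16, 23),
  steroid = c(14.1, 8.5, 10.8, 18.4, 4.7, 23.3, 16.4, 9.4, 17.7, 35.8)
)

fit <- lm(steroid ~ age, data = dat)

y.hat <- predict(fit)               # fitted values
e.std <- resid(fit) / sigma(fit)    # residuals divided by residual std. error

plot(y.hat, e.std,
     main = "Standardized Residuals vs. Fitted Value",
     xlab = "Fitted value", ylab = "Standardized residual")
abline(h = 0, lty = 2)              # reference line at zero
```

A curved band of points around the zero line points to a missing non-linear term, while a funnel shape would point to non-constant variance.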

Part(d)

model.quadratic

The quadratic model has a high R-squared. Once the squared term is included in the model, the unsquared (linear) term no longer contributes significantly. Hence there is a quadratic relationship between the variables.
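One way to make the comparison with part (b) concrete is to fit both models and test them against each other. The sketch below rests on assumptions: it uses the first ten rows of problem01.txt entered by hand, writes the squared term inline with I(age^2) instead of adding a column, and uses illustrative names fit.lin and fit.quad that are not in the original solution. The results on the full data set will differ.

```r
# First ten rows of problem01.txt (assumption: use the full file in practice)
dat <- data.frame(
  age     = c(15, 10, 13, 16, 10, 18, 16, 10, 16, 23),
  steroid = c(14.1, 8.5, 10.8, 18.4, 4.7, 23.3, 16.4, 9.4, 17.7, 35.8)
)

fit.lin  <- lm(steroid ~ age, data = dat)
fit.quad <- lm(steroid ~ age + I(age^2), data = dat)  # I() protects the power

# Multiple R-squared for each model, side by side
r2 <- c(linear    = summary(fit.lin)$r.squared,
        quadratic = summary(fit.quad)$r.squared)
print(r2)

# Coefficient table for the quadratic model (estimates and p-values)
print(coef(summary(fit.quad)))

# F-test of whether the squared term improves on the linear fit
print(anova(fit.lin, fit.quad))
```

Because age and age^2 are strongly correlated, the p-value of the linear term can become large once the squared term is present even when age clearly matters, so the anova() comparison is a more direct check of the quadratic term.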

