Question

The following code simulates data on resting heart rate vs. age, then simulates running multiple regression analysis repeatedly and saving the coefficients (i.e. slopes) from the model. This model assumes a slightly quadratic relationship between heart rate and age. Copy and run this code, then answer the questions below.

n <- 50
coefs_age <- rep(0,1e5)
coefs_age2 <- rep(0,1e5)
for (i in 1:1e5) {
  Age=round(runif(n,min=18,max=90))
  Age2 <- Age^2
  HR <- 84-Age*0.5+Age2*0.035+rnorm(n,sd=15)
  model <- lm(HR~Age+Age2)
  coefs_age[i] <- summary(model)$coefficients[2,1]
  coefs_age2[i] <- summary(model)$coefficients[3,1]
}
mean(coefs_age)
mean(coefs_age2)

  1. What are the mean values of coefs_age and coefs_age2? Do these mean values seem "right" to you, based on the code in the simulation? Explain why or why not.
  2. Make a scatter plot of a single instance of the data simulation that is run 100,000 times in this for loop. Give this plot the title "Heart rate vs. Age". Give the x axis the name "Age", and give the y axis the name "Heart rate". Make the data points a little bit smaller than their default.
  3. Add code that saves the p-values for the two slope coefficients in this model, and re-run the simulation. For each slope coefficient, report the estimated power.
  4. Modify the simulation so that "model" uses only Age as a predictor variable, not Age2. This means that the simulated regression will be incorrectly assuming a linear rather than quadratic relationship. Explain how this changes the results, both in terms of the values that the slope for Age takes on, and the values that the p-value for the slope takes on.

Use RStudio.

Solutions

Expert Solution

Run the code in RStudio:

set.seed(1001)
n <- 50
MC <- 1e5

# Storage for the quadratic model (HR ~ Age + Age2)
coefs_age <- rep(0, MC)
coefs_age2 <- rep(0, MC)
pvalue_age <- rep(0, MC)
pvalue_age2 <- rep(0, MC)

# Storage for the Age-only model (HR ~ Age)
model1_coefs_age <- model1_pvalue_age <- rep(0, MC)

for (i in 1:MC) {
  Age <- round(runif(n, min = 18, max = 90))
  Age2 <- Age^2
  HR <- 84 - Age * 0.5 + Age2 * 0.035 + rnorm(n, sd = 15)

  model <- lm(HR ~ Age + Age2)
  coefs_age[i] <- summary(model)$coefficients[2, 1]
  coefs_age2[i] <- summary(model)$coefficients[3, 1]

  # Addition to save the p-values of the coefficients:
  pvalue_age[i] <- summary(model)$coefficients[2, 4]
  pvalue_age2[i] <- summary(model)$coefficients[3, 4]

  # Model with only Age as predictor (read from model1, not model):
  model1 <- lm(HR ~ Age)
  model1_coefs_age[i] <- summary(model1)$coefficients[2, 1]
  model1_pvalue_age[i] <- summary(model1)$coefficients[2, 4]
}

mean(coefs_age)
mean(coefs_age2)
cat("The means of coefs_age and coefs_age2 are very close to the coefficients (-0.5 and 0.035) used to simulate HR, which is as expected")

# Scatter plot of the last simulated data set (cex < 1 shrinks the points)
plot(Age, HR, main = "Heart rate vs. Age", xlab = "Age", ylab = "Heart rate", cex = 0.5)

# Estimated power of each slope: proportion of p-values below the significance level.
# The question does not specify a level; 0.01 is the value chosen in this solution.
level_of_significance <- 0.01
cat("\nEstimated power of coefficient of Age is ", mean(pvalue_age < level_of_significance))
cat("\nEstimated power of coefficient of Age2 is ", mean(pvalue_age2 < level_of_significance))

# Slope and estimated power for Age in the alternate (Age-only) model:
mean(model1_coefs_age)
cat("\nCoefficient of Age changes to ", mean(model1_coefs_age))
cat("\nEstimated power of coefficient of Age is ", mean(model1_pvalue_age < level_of_significance))
cat("\nOverall regression is not as good as the model with Age^2")
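
For question 4, it can help to compare the two models side by side on a smaller run. The following is a minimal sketch, not part of the original solution: it assumes 2,000 replications instead of 1e5 (so it finishes in seconds), an arbitrary seed, and the same 0.01 significance level used above. The names `reps`, `alpha`, `ct_full`, `ct_lin` and `approx_slope` are introduced here for illustration only. When Age2 is left out, the Age slope has to absorb the quadratic term, so its mean should land near -0.5 + 0.035 * Cov(Age, Age^2) / Var(Age); for ages spread symmetrically over 18 to 90 that ratio is 18 + 90 = 108, giving roughly 3.28 rather than -0.5.

```r
# Reduced-size sketch: 2,000 replications instead of 1e5 (an assumption for speed).
set.seed(1)            # arbitrary seed, only for reproducibility
n     <- 50
reps  <- 2000
alpha <- 0.01          # same significance level as the solution above

slope_full <- p_full_age <- p_full_age2 <- numeric(reps)
slope_lin  <- p_lin <- numeric(reps)

for (i in seq_len(reps)) {
  # Same data-generating process as in the question
  Age  <- round(runif(n, min = 18, max = 90))
  Age2 <- Age^2
  HR   <- 84 - Age * 0.5 + Age2 * 0.035 + rnorm(n, sd = 15)

  # Coefficient tables for the quadratic model and the Age-only model
  ct_full <- summary(lm(HR ~ Age + Age2))$coefficients
  ct_lin  <- summary(lm(HR ~ Age))$coefficients

  slope_full[i]  <- ct_full["Age",  "Estimate"]
  p_full_age[i]  <- ct_full["Age",  "Pr(>|t|)"]
  p_full_age2[i] <- ct_full["Age2", "Pr(>|t|)"]
  slope_lin[i]   <- ct_lin["Age",   "Estimate"]
  p_lin[i]       <- ct_lin["Age",   "Pr(>|t|)"]
}

# Approximate slope the Age-only model converges to (omitted-variable argument)
approx_slope <- -0.5 + 0.035 * (18 + 90)

results <- c(
  mean_slope_full = mean(slope_full),          # should be near the true -0.5
  mean_slope_lin  = mean(slope_lin),           # should be near approx_slope
  approx_slope    = approx_slope,
  power_full_age  = mean(p_full_age  < alpha), # Age in the quadratic model
  power_full_age2 = mean(p_full_age2 < alpha), # Age2 in the quadratic model
  power_lin_age   = mean(p_lin       < alpha)  # Age in the Age-only model
)
print(round(results, 3))
```

Under these assumptions, expect the Age-only slope to be positive and far from -0.5, with p-values that are almost always below 0.01. In the quadratic model, by contrast, Age and Age2 are strongly correlated, so the Age coefficient is estimated with a wide standard error and is detected much less often. A small p-value in the Age-only model therefore reflects a strong overall upward trend, not a correct estimate of the simulated -0.5.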

