In: Statistics and Probability
The following code simulates data on resting heart rate vs. age, then simulates running multiple regression analysis repeatedly and saving the coefficients (i.e. slopes) from the model. This model assumes a slightly quadratic relationship between heart rate and age. Copy and run this code, then answer the questions below.
n <- 50
coefs_age <- rep(0,1e5) coefs_age2 <- rep(0,1e5) for (i in 1:1e5) {
Age=round(runif(n,min=18,max=90))
Age2 <- Age^2
HR <- 84-Age*0.5+Age2*0.035+rnorm(n,sd=15) model <- lm(HR~Age+Age2)
coefs_age[i] <- summary(model)$coefficients[2,1] coefs_age2[i] <- summary(model)$coefficients[3,1]
} mean(coefs_age) mean(coefs_age2)
using the Rstudio
RUN THE CODE IN R STUDIO:
set.seed(1001) n <- 50 MC = 1e5 coefs_age <- rep(0,MC) coefs_age2 <- rep(0,MC) pvalue_age <- rep(0,MC) pvalue_age2 <- rep(0,MC) model1_coefs_age <- model1_pvalue_age <- rep(0,MC) for (i in 1:MC) { Age=round(runif(n,min=18,max=90)) Age2 <- Age^2 HR <- 84-Age*0.5+Age2*0.035+rnorm(n,sd=15) model <- lm(HR~Age+Age2) coefs_age[i] <- summary(model)$coefficients[2,1] coefs_age2[i] <- summary(model)$coefficients[3,1] # Addition to save p-values of co-effecient: pvalue_age[i] <- summary(model)$coefficients[2,4] pvalue_age2[i] <- summary(model)$coefficients[3,4] # Model with only Age as predictor: model1 <- lm(HR~Age) model1_coefs_age[i] <- summary(model)$coefficients[2,1] model1_pvalue_age[i] <- summary(model)$coefficients[2,4] } mean(coefs_age) mean(coefs_age2) cat("mean of coefs_age and coef_age2 is very close to co-effecients of the variables as per the regression line, which is as expected") #Scatter plot of last simulation plot(Age,HR,main = "Heart rate vs. Age",xlab = "Age" ,ylab = "Heart rate",cex = 0.5) # Eastimtaed power of each slope: level_of_significance = 0.01 cat("\nEstimated power of co-eff of age is ",mean(pvalue_age < level_of_significance)) cat("\nEstimated power of co-eff of age2 is ",mean(pvalue_age2 < level_of_significance)) # Eastimtaed power of slope of Age for alternate model: level_of_significance = 0.01 mean(model1_coefs_age) cat("\nCo-effecient of Age changes to ",mean(model1_coefs_age)) cat("\nEstimated power of co-eff of age is ",mean(model1_pvalue_age < level_of_significance)) cat("\nOverall regression is not as good as the model with Age^2")
OUTPUT: