6.) The ages and annual salaries of 10 electrical engineers in the city of Ogdenville are listed below.
Age | Annual Salary in $1,000's |
24 | 64 |
26 | 67 |
31 | 76 |
32 | 79 |
39 | 84 |
40 | 83 |
44 | 87 |
48 | 91 |
51 | 96 |
55 | 102 |
A.) Compute the least-squares regression line for this data set, where Age will be your explanatory variable. Make sure to write down the code you entered into RStudio to help you construct your least-squares regression line. What does the slope represent in your least-squares regression line?
B.) Your friend, Lenny Leonard, is a 43-year-old electrical engineer who just moved into the Ogdenville area. He wants to know roughly what his salary would be in Ogdenville. Use the least-squares regression line you constructed in Part A.) to determine his salary. Make sure to round your answer to the nearest thousand dollars.
C.) Your other friend, Hans Moleman, is 76 years old, is still working as an electrical engineer, and also just moved into the Ogdenville area. Hans would like to know roughly what his salary would be in Ogdenville. Would it be appropriate to use the least-squares regression line you constructed in Part A.) to compute his salary? If yes, then please find it and round your answer to the nearest thousand dollars. If not, please explain why.
Solution:
We are given the ages and annual salaries of 10 electrical engineers in the city of Ogdenville, and we have to compute the least-squares regression line for this data set.
We fit the regression line in RStudio.
# Defining the data
> Age = c(24,26,31,32,39,40,44,48,51,55)
> Salary = c(64,67,76,79,84,83,87,91,96,102)
> df = data.frame(Age,Salary)
> df
   Age Salary
1   24     64
2   26     67
3   31     76
4   32     79
5   39     84
6   40     83
7   44     87
8   48     91
9   51     96
10  55    102
a)
# Fitting the least-squares regression line
> mod = lm(Salary ~ Age, data = df)
> summary(mod)
Call:
lm(formula = Salary ~ Age, data = df)
Residuals:
    Min      1Q  Median      3Q     Max
-2.1988 -1.4567 -0.6372  1.2391  3.8939
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 39.47692    2.67797   14.74 4.41e-07 ***
Age          1.11341    0.06649   16.75 1.64e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.117 on 8 degrees of freedom
Multiple R-squared: 0.9723, Adjusted R-squared: 0.9688
F-statistic: 280.5 on 1 and 8 DF, p-value: 1.636e-07
From the regression output, the least-squares regression line is:
Salary = 39.4769 + 1.1134*Age
Here, the slope is 1.1134. This means that for each additional year of age, the predicted annual salary increases by 1.1134 units on average. Since salary is measured in $1,000's, that is roughly $1,113 per year of age.
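As a cross-check on the `lm()` output, the slope and intercept can also be computed directly from the standard least-squares formulas (slope = Sxy / Sxx, intercept = mean of Salary minus slope times mean of Age). The following is a minimal sketch; the names `Sxy`, `Sxx`, `slope`, `intercept`, and `fit` are illustrative and do not appear in the solution above.

```r
# Data from the problem statement
Age <- c(24, 26, 31, 32, 39, 40, 44, 48, 51, 55)
Salary <- c(64, 67, 76, 79, 84, 83, 87, 91, 96, 102)

# Sums of cross-products and squares about the means
Sxy <- sum((Age - mean(Age)) * (Salary - mean(Salary)))
Sxx <- sum((Age - mean(Age))^2)

# Least-squares slope and intercept computed by hand
slope <- Sxy / Sxx
intercept <- mean(Salary) - slope * mean(Age)

# Fit the same model with lm() and print both sets of coefficients
fit <- lm(Salary ~ Age)
print(c(intercept = intercept, slope = slope))
print(coef(fit))
```

Both printed pairs should agree with the Estimate column of the summary above (about 39.4769 and 1.1134).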
b) Lenny Leonard is a 43-year-old electrical engineer who just moved into the Ogdenville area, and he wants to know roughly what his salary would be in Ogdenville.
So age = 43
> df1 = data.frame(Age = 43)
> predict(mod,newdata = df1)
       1
87.35365
Hence, Lenny's salary in Ogdenville would be roughly $87,000.
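As a hand check on the `predict()` output, the same value can be obtained by substituting Age = 43 into the fitted line. This sketch uses the coefficients rounded to four decimals, so the last digits differ slightly from R's 87.35365, but the result still rounds to 87 (in $1,000's).

```latex
\widehat{\text{Salary}} = 39.4769 + 1.1134 \times 43
                        = 39.4769 + 47.8762
                        = 87.3531 \approx 87 \quad \text{(in \$1{,}000's)}
```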
C) Hans Moleman is 76 years old, is still working as an electrical engineer, and also just moved into the Ogdenville area. He would like to know roughly what his salary would be in Ogdenville.
The regression line computed in part A is
Salary = 39.4769 + 1.1134*Age
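No, it would not be appropriate to use this line for Hans. The ages in the data set range from 24 to 55, so a prediction at age 76 is an extrapolation well beyond the observed data. The R-squared value of 0.9723 only shows that the line fits well within that age range; it does not tell us whether the same linear trend continues out to age 76.
For reference, plugging Age = 76 into the line gives:
> df2 = data.frame(Age = 76)
> predict(mod, newdata = df2)
       1
124.0963
This value (about $124,000) is what the line produces mechanically, but it should not be reported as a reliable estimate of Hans's salary.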
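One quick check before trusting a prediction like this is whether the new age falls inside the range of ages used to fit the line. A minimal sketch in R follows; `new_age`, `age_range`, and `inside` are illustrative names that are not part of the original solution.

```r
# Ages used to fit the regression line (from the problem statement)
Age <- c(24, 26, 31, 32, 39, 40, 44, 48, 51, 55)

# Age we would like to predict for (Hans Moleman)
new_age <- 76

# Smallest and largest observed ages
age_range <- range(Age)

# TRUE only if the new age lies within the observed range
inside <- new_age >= age_range[1] && new_age <= age_range[2]

print(age_range)
print(inside)
```

Here the observed range is 24 to 55 and `inside` is FALSE, which flags the age-76 prediction as an extrapolation.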