Question

In: Statistics and Probability


  1. (20 pts) Use the “Distance.sav” (SPSS) data set (located below) to perform a linear regression analysis. This dataset shows how far on average a person in Illinois drives each year. Write your findings using the format presented in the class slides.
     (2 pts) How much of the variation in the dependent variable is explained by the variation in the independent variable? What statistic did you use?
     (2 pts) Is the linear model significantly different than zero? Why or why not? What statistic did you use?
     (4 pts) Do the model assumptions hold? Why or why not? Provide a thorough response. (Hint: Create both a Histogram and a Normal probability plot using the standardized residuals. Also create a scatterplot of Regression Standardized Residuals and the Regression Standardized Predicted Value of the dependent variable. Attach a copy of each below.)
Year Distance
1960 1472.08
1961 1564.80
1962 1603.03
1963 1670.65
1964 1840.97
1965 1936.46
1966 2031.93
1967 2093.46
1968 2163.59
1969 2205.16
1970 2281.37
1971 2398.31
1972 2503.06
1973 2623.12
1974 2575.82
1975 2604.13
1976 2740.65
1977 2791.32
1978 2886.16
1979 2870.89
1980 3049.89
1981 3107.49
1982 3202.19
1983 3240.61
1984 3400.64
1985 3461.57
1986 3617.96
1987 3887.96
1988 4148.67
1989 4476.36
1990 4506.32
1991 4499.51
1992 4487.92
1993 4470.72
1994 4559.77
1995 4636.48
1996 4745.51
1997 4831.20
1998 4897.49
1999 4978.39
2000 4958.52
2001 5024.30
2002 5131.16
2003 5152.03

Solutions

Expert Solution

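About 98.3% of the total variation in the dependent variable (Distance) is explained by the variation in the independent variable (Year). The statistic used is the coefficient of determination, R² (reported as "R Square" in the SPSS Model Summary table).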
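As a cross-check outside SPSS, R² can be recomputed directly from the table in the question. The following is a minimal sketch in plain Python, not the SPSS procedure itself; the helper name `fit_simple_ols` is hypothetical, and it assumes the years run consecutively from 1960 as listed.

```python
# Distance values copied from the table in the question, one per year 1960-2003.
distance = [
    1472.08, 1564.80, 1603.03, 1670.65, 1840.97, 1936.46, 2031.93, 2093.46,
    2163.59, 2205.16, 2281.37, 2398.31, 2503.06, 2623.12, 2575.82, 2604.13,
    2740.65, 2791.32, 2886.16, 2870.89, 3049.89, 3107.49, 3202.19, 3240.61,
    3400.64, 3461.57, 3617.96, 3887.96, 4148.67, 4476.36, 4506.32, 4499.51,
    4487.92, 4470.72, 4559.77, 4636.48, 4745.51, 4831.20, 4897.49, 4978.39,
    4958.52, 5024.30, 5131.16, 5152.03,
]
years = list(range(1960, 1960 + len(distance)))


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x (hypothetical helper).

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # In simple regression, R^2 is the squared correlation between x and y.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_simple_ols(years, distance)
print(f"slope = {slope:.3f}, intercept = {intercept:.1f}, R^2 = {r_squared:.4f}")
```

The printed R² should be compared with the "R Square" value in the SPSS output; small differences in the last digits can come from rounding in the listed data.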

From the t test for the Year coefficient, the p-value is reported as 0.000 (that is, p < 0.001), which is below 0.05, so the slope of the linear model is significantly different from zero. The statistic used is the t statistic, whose observed value is t = 49.15.
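The t statistic for the slope can also be reproduced by hand as the slope divided by its standard error, with n − 2 degrees of freedom. This is a sketch under the same assumptions as the data table in the question (consecutive years from 1960); `slope_t_statistic` is a hypothetical helper name, not an SPSS function.

```python
import math

# Distance values copied from the table in the question, one per year 1960-2003.
distance = [
    1472.08, 1564.80, 1603.03, 1670.65, 1840.97, 1936.46, 2031.93, 2093.46,
    2163.59, 2205.16, 2281.37, 2398.31, 2503.06, 2623.12, 2575.82, 2604.13,
    2740.65, 2791.32, 2886.16, 2870.89, 3049.89, 3107.49, 3202.19, 3240.61,
    3400.64, 3461.57, 3617.96, 3887.96, 4148.67, 4476.36, 4506.32, 4499.51,
    4487.92, 4470.72, 4559.77, 4636.48, 4745.51, 4831.20, 4897.49, 4978.39,
    4958.52, 5024.30, 5131.16, 5152.03,
]
years = list(range(1960, 1960 + len(distance)))


def slope_t_statistic(x, y):
    """Return (t, df, se_slope) for H0: slope = 0 in simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    # Residual sum of squares around the fitted line (centered form).
    sse = sum((yi - (mean_y + slope * (xi - mean_x))) ** 2
              for xi, yi in zip(x, y))
    df = n - 2                      # two estimated parameters
    mse = sse / df                  # residual variance estimate
    se_slope = math.sqrt(mse / sxx)
    return slope / se_slope, df, se_slope


t_stat, df, se_slope = slope_t_statistic(years, distance)
print(f"t = {t_stat:.2f} on {df} degrees of freedom (SE of slope = {se_slope:.3f})")
```

If SciPy is installed, `2 * scipy.stats.t.sf(abs(t_stat), df)` would give the two-sided p-value. In simple regression the ANOVA F statistic equals the square of this t, so both tests lead to the same conclusion.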

From the normal probability plot, the normality assumption appears to hold. Since the points in the residual plot are scattered randomly on both sides of the horizontal line, the assumptions of linearity and equal variance also appear to hold. However, the plot of standardized residuals against standardized predicted values shows one standardized residual greater than 3, so an outlier may be present.
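The residual checks described above can be approximated numerically when the plots are not at hand. The sketch below assumes the standardized residual is the raw residual divided by the square root of the residual mean square; studentized or deleted residuals are scaled differently and come out somewhat larger, so whether a point crosses the cutoff of 3 depends on the definition used. The helper name and the screening cutoff of 2 are illustrative choices, not taken from the original solution.

```python
import math

# Distance values copied from the table in the question, one per year 1960-2003.
distance = [
    1472.08, 1564.80, 1603.03, 1670.65, 1840.97, 1936.46, 2031.93, 2093.46,
    2163.59, 2205.16, 2281.37, 2398.31, 2503.06, 2623.12, 2575.82, 2604.13,
    2740.65, 2791.32, 2886.16, 2870.89, 3049.89, 3107.49, 3202.19, 3240.61,
    3400.64, 3461.57, 3617.96, 3887.96, 4148.67, 4476.36, 4506.32, 4499.51,
    4487.92, 4470.72, 4559.77, 4636.48, 4745.51, 4831.20, 4897.49, 4978.39,
    4958.52, 5024.30, 5131.16, 5152.03,
]
years = list(range(1960, 1960 + len(distance)))


def standardized_residuals(x, y):
    """Residuals of the least-squares line, each divided by sqrt(MSE)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    residuals = [yi - (mean_y + slope * (xi - mean_x)) for xi, yi in zip(x, y)]
    root_mse = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return [r / root_mse for r in residuals]


z = standardized_residuals(years, distance)

# Observation with the largest standardized residual in absolute value.
largest = max(zip(years, z), key=lambda pair: abs(pair[1]))

# Screening cutoff of 2 is an illustrative choice; 3 is the stricter rule
# mentioned in the solution.
flagged = [(yr, round(v, 2)) for yr, v in zip(years, z) if abs(v) > 2]

print(f"largest |standardized residual|: year {largest[0]}, value {largest[1]:.2f}")
print("observations with |z| > 2:", flagged)

# Printing the signs in year order makes long runs of same-signed residuals
# visible, which is worth comparing with the residual scatterplot.
print("".join("+" if v > 0 else "-" for v in z))
```

Long runs of the same sign in the last line would suggest the residuals are not scattered independently over time, which is worth checking against the scatterplot before accepting the linearity assumption.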


Related Solutions

(18 pts) Use the “Car Filter.sav” (SPSS) dataset (located below) to perform a Two-Way ANOVA. This dataset contains noise level readings taken from inside different sized cars equipped with different sized filters. Write your findings using the format presented in the class slides. (2 pts) What is the mean and standard deviation of each filter’s noise level? (2 pts) Based on the means and standard deviations calculated above for each filter, do you believe the mean noise levels are statistically...
Use SPSS to follow the steps below and conduct a simple linear regression of the following data: Calories (Xi) Sodium (Yi) 186 495 181 477 176 425 149 322 184 482 190 587 158 370 139 322 175 479 148 375 State your hypotheses (e.g. HA: “calories will significantly predict sodium”) Create a scatterplot of the data. State if the scatterplot appears to contain a linear relationship. Conduct the analysis in SPSS. Include all of the important outputs (e.g. ANOVA...
Question No. 01: Linear Regression Analysis in SPSS Statistics a. Assume a case study to use simple linear regression for analysis and precisely interpret the results of your study. Also, use Y=aX + b to predict the results. b. Suppose another case study to use multiple linear regression, Interpret the results tactfully. Also, use Z=aX+bY+c to predict the results. (Use screenshots as required).
Use Excel to prepare a Linear Regression Analysis. Use data samples below for populations and determine if the selected independent variable is affecting the dependent variable. Use an alpha of 5% for ANOVA and Correlation Coefficient. Explain the results. Data samples Group A 104,103,101,99,97,101,101 Group B 101,100,95,99,101,103,97 Group C 100,96,99,95,99,102,106 Group D 97,99,99,101,105,100,99
Directions: Use SPSS to compute the Regression Line. Problem: Using the following set of data and Excel, compute the regression line. The data set represents the number of hours of training to predict how severe injuries will be if someone is injured playing football. Briefly summarize your findings. Training Injuries Training Injuries 12 8 11 5 3 7 16 7 22 2 14 8 12 5 15 3 11 4 16 7 31 1 22 3 27 5 24 8...
Refer to the TV Revenue data set. Perform a complete multiple regression analysis that might be used to predict net revenue using all provided explanatory variables (there are 4 explanatory variables). Complete all steps for the multiple regression as outlined in class and modify the original model if necessary. Use an alpha = .10 for all hypotheses tests. Make sure you show each required step for any hypothesis test. Provide all required Minitab output with your written responses. Obs NetRevenue...
Perform a linear regression on this data set Assessed Value Heating Area Age 184400 2000 3.42 177400 1710 11.50 175700 1450 8.33 185900 1760 0.00 179100 1930 7.42 170400 1200 32.00 175800 1550 16.00 185900 1930 2.00 178500 1590 1.75 179200 1500 2.75 186700 1900 0.00 179300 1390 0.00 174500 1540 12.58 183800 1890 2.75 176800 1590 7.17 Be sure to plot the data and plot and include various graphs that would help determine if the data is normally distributed...
13. Linear regression analysis was performed for a standard addition data set based on this experiment. The analysis yielded the following results: m = 0.006110, b = 0.008170. What is the unknown's riboflavin concentration (ppm) in the standard addition solutions? a) 2.0 b) 13.37 c) 1.337 d) 1.67 15. The riboflavin concentration in the unknown powder solution is 58.0 ppm. The total volume of the powder solution is 250.00 mL. What is the concentration in the tablet (mg ribo/tablet) if...
The standard project is to use multiple regression analysis to analyze a data set. The data set is a study of student persistent enrolling in the next semester based on Gender, Age, GPA, a 22 questionnaire on self-efficacy, and student enrollment status. The educational researcher wants to study the relationship between student enrollment status as it relates to gender, age, GPA, and the total response to a 22 questionnaire survey. a. The estimated multiple regression analysis equation. b. Does the...
Question #11 – Regression Analysis Use the data provided to: Perform the “Tests to Check the Validity of a Regression” Show both the calculated and critical values Estimate Y when X = 4 (round to 2 decimal places) Use a level of significance of 5% (α = .05). Clearly show the null and alternate hypothesis. Graphs are not required. X Y 3 14 7 26 6 23 4 17 7 28 5 20 8 29 2 11