Question

  1. Use the paired dataset given in the table below to conduct a simple regression analysis as follows:

  (10 pts.) Using these 4 paired data points, complete the table by calculating all of the terms (such as x^2 and x*y) used in the formulas for this question. Fill in the other column titles yourself and make the calculations. Add columns if needed. (The table should be written on your answer paper.)

i       xi    yi     xi^2   xi*yi
1       2     32
2       1     27
3       5     93
4       8     101
Total
Avr.

  a. (15 pts.) Using the values in the table, find the intercept and slope of the estimated regression line.
  b. (15 pts.) Check the homogeneous variance assumption for the errors of your estimated regression model using a graph.
  c. (15 pts.) Conduct a hypothesis test to conclude whether or not there is a significant linear relationship between variable x and variable y.

  2. (15 pts.) Sample three scores from a standard normal distribution, square each score, and sum the squares. What is the probability that the sum of these squares will be 8 or higher? Explain how you find the result.

(Hint: use a special type of distribution we learned that represents the described case to find the probability.)

Solutions

Expert Solution

a.

i       xi    yi     xi^2   xi*yi
1       2     32     4      64
2       1     27     1      27
3       5     93     25     465
4       8     101    64     808
Total   16    253    94     1364
Avr.    4     63.25  23.5   341

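For linear regression, y = a + bx, where a is the intercept and b is the slope.

The least-squares estimates are

$$b = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}, \qquad a = \bar{y} - b\,\bar{x}$$

Then, b = (4*1364 - 16*253) / (4*94 - 16^2) = 1408/120 = 11.73

and a = 63.25 - 11.7333*4 = 16.32 (using the unrounded slope)

Therefore, intercept = 16.32 and slope = 11.73

Equation: y = 16.32 + 11.73*x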
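As a quick check of the arithmetic above, here is a minimal Python sketch that rebuilds the table totals and applies the same least-squares formulas. The function name `fit_line` is only illustrative and does not come from the original answer.

```python
# Data from the question: four (x, y) pairs.
xs = [2, 1, 5, 8]
ys = [32, 27, 93, 101]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)                               # 16
    sum_y = sum(ys)                               # 253
    sum_x2 = sum(x * x for x in xs)               # 94
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # 1364
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


intercept, slope = fit_line(xs, ys)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
# prints: slope = 11.7333, intercept = 16.3167
```

Carrying the unrounded slope into the intercept gives 16.3167, which agrees with the software output in part c.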

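b. To check this assumption, plot the residuals against the predicted values of y. The predicted values and residuals below are computed with the unrounded coefficients (a = 16.3167, b = 11.7333).

i       xi    yi     xi^2   xi*yi   yi hat   ei = yi - yi hat
1       2     32     4      64      39.78    -7.78
2       1     27     1      27      28.05    -1.05
3       5     93     25     465     74.98    18.02
4       8     101    64     808     110.18   -9.18
Total   16    253    94     1364    253      0
Avr.    4     63.25  23.5   341     63.25    0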

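Then, plot the residuals (vertical axis) against the predicted values (horizontal axis).

The residuals scatter around zero and do not form any specific pattern, such as a funnel shape.

Hence there is no evidence of heteroscedasticity, and the constant-variance (homoscedasticity) assumption appears reasonable, although four points give only a weak check.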
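The fitted values and residuals used for the plot can be reproduced with a short Python sketch like the one below. It only prints the (predicted, residual) pairs in order of predicted value. Drawing the actual scatter plot would need a plotting library, which is left out here to keep the example self-contained.

```python
# Data from the question.
xs = [2, 1, 5, 8]
ys = [32, 27, 93, 101]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept in deviation form.
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Predicted values and residuals e_i = y_i - y_hat_i.
fitted = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

# List the points as they would appear left to right on the residual plot.
for f, e in sorted(zip(fitted, residuals)):
    print(f"predicted = {f:7.2f}   residual = {e:7.2f}")

# Residual sum of squares (SSE).
sse = sum(e * e for e in residuals)
print(f"sum of residuals = {sum(residuals):.4f}, SSE = {sse:.4f}")
```

The residuals sum to zero (up to floating-point error), as they must for a least-squares fit with an intercept, and the SSE matches the Residual SS reported in the regression output.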

c.

Running regression in R:

Regression Statistics
Multiple R          0.947475
R Square            0.897709   (explains 89.77% of the variability)
Adjusted R Square   0.846563
Standard Error      15.33976
Observations        4

ANOVA
Source       df   SS         MS         F          Significance F
Regression   1    4130.133   4130.133   17.55201   0.052525   (not less than 0.05, so not significant at 5%)
Residual     2    470.6167   235.3083
Total        3    4600.75

Term           Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept      16.31667       13.57663         1.20182    0.352433   -42.0989    74.73219
X Variable 1   11.73333       2.800645         4.189511   0.052525   -0.31687    23.78354

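The hypotheses are H0: slope = 0 (no linear relationship) against H1: slope ≠ 0. The test statistic is t = 4.19 on 2 degrees of freedom (equivalently F = 17.55), with p-value 0.0525. Since 0.0525 > 0.05, we fail to reject H0: the linear relationship is not significant at the 5% level, although it would be significant at any alpha greater than 0.0525.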
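The test statistics in the output can be recomputed by hand, as in the Python sketch below. One assumption to note: the p-value line uses a closed form of the t distribution that holds only for 2 degrees of freedom (the case here, since n - 2 = 2). For other sample sizes a statistics library or a t table would be needed.

```python
import math

# Data from the question.
xs = [2, 1, 5, 8]
ys = [32, 27, 93, 101]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Sums of squares for the ANOVA table.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
ssr = slope ** 2 * s_xx
mse = sse / (n - 2)

# t test for H0: slope = 0, and the equivalent F test.
se_slope = math.sqrt(mse / s_xx)
t_stat = slope / se_slope
f_stat = ssr / mse

# Two-sided p-value. This closed form is valid ONLY for df = 2:
# P(|T| >= t) = 1 - |t| / sqrt(t^2 + 2).
p_value = 1 - abs(t_stat) / math.sqrt(t_stat ** 2 + 2)

print(f"t = {t_stat:.4f}, F = {f_stat:.4f}, p = {p_value:.4f}")
print("reject H0 at 5%" if p_value < 0.05 else "fail to reject H0 at 5%")
```

In simple regression F equals t squared, so both tests give the same p-value.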

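2. For a standard normal distribution, Z ~ N(0,1), i.e., mean = 0 and variance = 1.

Let the three scores X1, X2, X3 be independent draws from this distribution (for example 0, 1, and -1; Z falls between -3 and 3 about 99.73% of the time). The answer does not depend on the particular values drawn.

Then the sum of their squares, W = X1^2 + X2^2 + X3^2, follows a chi-square distribution with 3 degrees of freedom.

Then, P(W greater than or equal to 8) ≈ 0.046, i.e., about 0.05 (using a calculator).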
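The tail probability can be checked without a calculator. The sketch below uses the closed-form upper tail of the chi-square distribution for 3 degrees of freedom, and adds a small simulation as a sanity check. The function names, trial count, and seed are illustrative choices, not part of the original answer.

```python
import math
import random


def chi2_sf_3df(x):
    """P(W >= x) for W ~ chi-square with 3 degrees of freedom, x >= 0.

    Closed form specific to 3 df:
    erfc(sqrt(x/2)) + (2/sqrt(pi)) * sqrt(x/2) * exp(-x/2).
    """
    h = x / 2
    return math.erfc(math.sqrt(h)) + 2 / math.sqrt(math.pi) * math.sqrt(h) * math.exp(-h)


def simulate(threshold=8.0, trials=100_000, seed=0):
    """Estimate P(Z1^2 + Z2^2 + Z3^2 >= threshold) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        w = sum(rng.gauss(0, 1) ** 2 for _ in range(3))
        if w >= threshold:
            hits += 1
    return hits / trials


p_exact = chi2_sf_3df(8)
p_sim = simulate()
print(f"exact P(W >= 8) = {p_exact:.4f}")   # prints: exact P(W >= 8) = 0.0460
print(f"simulated estimate = {p_sim:.4f}")
```

The exact value is close to 0.05 because the 5% critical value for 3 degrees of freedom is 7.815, just below 8.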


