Question

Rejection Region

After reviewing data from a sample, an inference can be made about the population. For example,

Find a data set on the internet. Some suggested search terms: Free Data Sets, Medical Data Sets, Education Data Sets.

  1. Introduce your Data Set and Cite the Source.
  2. What trends do you notice in your data set?
  3. Based on the trends and the history of your data set, make a claim. What kind of test (left, right, two-tailed) would you have to complete?
  4. Explain the steps needed to complete the Hypothesis Test. What is needed?

Solutions

Expert Solution

I used a sample data set of mortgage interest rates and median home prices, obtained from

https://journalistsresource.org/wp-content/uploads/2014/11/Sample-data-sets-for-linear-regression1.xlsx

The analysis below is done in R, using RStudio.

# Linear Regression Analysis:

# Step 1 : Importing the data
setwd("C:/Users/raqui/Desktop")   # folder that contains data.csv
getwd()                           # confirm the working directory
data = read.csv("data.csv", header = TRUE)   # read.csv already returns a data frame
data

# Step 2 : Exploratory data analysis

# interest_rate is the explanatory variable and Median_home_price is the dependent (response) variable

summary(data$interest_rate)
summary(data$Median_home_price)

boxplot(data$interest_rate)
boxplot(data$Median_home_price)

# Step 3 : Examining the trend : whether it is linear or quadratic or cubic or something else

x = data$interest_rate
y = data$Median_home_price

# Standardize both variables, since x and y are measured in different units

y_std = (y - mean(y))/sd(y)
x_std = (x - mean(x))/sd(x)

plot(x_std,y_std)
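
As a side note on the standardization step, base R's scale() function performs the same centering and scaling. The sketch below uses made-up numbers (x_demo is hypothetical, not the mortgage data) purely to illustrate that the two approaches agree.

```r
# Hypothetical values for illustration only (not taken from the data set)
x_demo = c(3.5, 4.0, 4.5, 5.0, 6.5)

# Manual standardization, as in Step 3
manual = (x_demo - mean(x_demo)) / sd(x_demo)

# scale() centers and scales a column; as.numeric() drops the matrix attributes
via_scale = as.numeric(scale(x_demo))

all.equal(manual, via_scale)   # TRUE
mean(manual)                   # approximately 0
sd(manual)                     # 1, up to floating-point rounding
```

Either form works; the manual version makes the formula explicit, while scale() is shorter when several columns need the same treatment.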

# Inference: We can see a downward trend of home price with the interest rate

# Step 4: Actual regression fitting

l1 = lm(y_std~x_std)
summary(l1)

Output:

> summary(l1)

Call:
lm(formula = y_std ~ x_std)

Residuals:
    Min      1Q  Median      3Q     Max
-0.9648 -0.7410 -0.1867  0.5735  1.4147

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.163e-17  2.030e-01   0.000   1.0000
x_std       -6.202e-01  2.097e-01  -2.958   0.0104 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.812 on 14 degrees of freedom
Multiple R-squared: 0.3846,   Adjusted R-squared: 0.3406
F-statistic: 8.749 on 1 and 14 DF, p-value: 0.01038


plot(l1)

# Inference: The linear model fits poorly (R-squared is only 0.3846), so we move to a higher-degree polynomial

# Step 5: Regression for a higher degree

l2 = lm(y_std~poly(x_std,2))
summary(l2)

Output:

> summary(l2)

Call:
lm(formula = y_std ~ poly(x_std, 2))

Residuals:
     Min       1Q   Median       3Q      Max
-0.92704 -0.26805 -0.04894  0.13192  1.07002

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)     -1.716e-16  1.392e-01   0.000  1.00000
poly(x_std, 2)1 -2.402e+00  5.567e-01  -4.315  0.00084 ***
poly(x_std, 2)2  2.281e+00  5.567e-01   4.097  0.00126 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5567 on 13 degrees of freedom
Multiple R-squared: 0.7314,   Adjusted R-squared: 0.6901
F-statistic: 17.7 on 2 and 13 DF, p-value: 0.0001945


plot(l2)

plot(x_std,y_std)
ord = order(x_std)   # sort by x so the fitted curve is drawn left to right
lines(x_std[ord], predict(l2)[ord], col = "red")

# Inference: The quadratic model is a clear improvement over the linear one (adjusted R-squared rises from 0.3406 to 0.6901)
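
One way to back up this comparison with a formal test is an F-test between the two nested models, which base R provides through anova(). The sketch below runs on simulated data (x_sim, y_sim, and the fitted objects are hypothetical names, and the curved relationship is an assumption made for the demonstration), so its numbers say nothing about the mortgage data.

```r
# Simulated data for illustration only (assumption: a curved relationship plus noise)
set.seed(1)
x_sim = seq(-2, 2, length.out = 16)
y_sim = x_sim^2 - x_sim + rnorm(16, sd = 0.3)

fit_lin  = lm(y_sim ~ x_sim)            # straight line
fit_quad = lm(y_sim ~ poly(x_sim, 2))   # adds a quadratic term

# F-test for nested models: H0 is that the quadratic term adds nothing
cmp = anova(fit_lin, fit_quad)
print(cmp)

p_quad = cmp[["Pr(>F)"]][2]   # p-value for the added quadratic term
p_quad < 0.05                 # TRUE means the quadratic model fits significantly better
```

On the models fitted above, the equivalent call would be anova(l1, l2). With a single added term, this F-test is equivalent to the two-sided t-test on the quadratic coefficient.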

# Now we move to the hypothesis test.
For each regression coefficient, the null hypothesis is H0: beta = 0 (the term has no effect), against the alternative H1: beta != 0 (the term has an effect).
Because the alternative allows a deviation in either direction, this is a two-tailed test. We reject H0 when the p-value in the model output is below the significance level (0.05 here); otherwise we fail to reject it.
In our case the intercept is not significant: its p-value (1.00) is far above the 0.05 significance level. This is expected, because standardizing y forces its mean to zero, so the intercept can safely be dropped from this standardized model.

For the explanatory variable, both the 1st-degree and 2nd-degree coefficients are significant: their p-values (0.00084 and 0.00126) are well below the 0.05 significance level. Hence both terms are kept in the regression equation.
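
The same decision can be expressed through the rejection region instead of p-values. The sketch below copies the t statistics and the 13 residual degrees of freedom from the summary(l2) output; the variable names (alpha, df, t_crit, t_obs) are illustrative choices, not part of the original analysis.

```r
alpha = 0.05   # significance level
df    = 13     # residual degrees of freedom from summary(l2)

# Two-tailed test: reject H0 when |t| exceeds the upper alpha/2 quantile
t_crit = qt(1 - alpha / 2, df)
t_crit                     # about 2.16

# t statistics copied from the summary(l2) output
t_obs = c(intercept = 0.000, degree1 = -4.315, degree2 = 4.097)

# Does each statistic fall in the rejection region?
abs(t_obs) > t_crit        # FALSE for the intercept, TRUE for both polynomial terms

# Equivalent decision via two-sided p-values
2 * pt(-abs(t_obs), df)
```

The rejection region is therefore t < -t_crit or t > t_crit. Both polynomial terms fall inside it, while the intercept does not.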

Thus, in this way, we can explore a data set, state a hypothesis about it, and test that hypothesis with a regression analysis.

Hope this answer has helped you.
Thanks !!

