Question

1. In the iris data, build a linear regression model to predict Sepal.Length based on both Petal.Length and Species.

a. Calculate the regression equation, including the interaction.

b. From this equation, you should be able to find 3 regression lines (one for each Species). Interpret each of the 3 slopes of the lines in the context of the problem. Remember that both numerical variables are measured in centimeters.

c. Plot the 3 regression lines in a scatterplot of Sepal.Length vs. Petal.Length. Use a different color for each Species.

d. Predict the Sepal.Length for a Versicolor iris with a petal that is 3.4 cm long.

e. Conduct a partial F-test to see if the Petal.Length:Species interaction terms are significant. State the hypothesis, p-value, and conclusion of the test.

f. Now calculate the regression equation without the interaction term. Interpret the slope in the context of the problem.

g. Using the no-interaction model, I’m 90% sure that the sepal length of a Virginica iris with a 5.5-cm-long petal will be between __________ cm and __________ cm.

h. Identify and interpret the R-squared (R²) from the no-interaction model.

Solutions

Expert Solution

Answer:

Note: only 4 sub-questions may be answered in one post.

Parts a to d are answered below, with the R code given.

Parts e to h are not answered in this post.

> library(interactions)
> library(ggplot2)
> fitiris <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)
> summary(fitiris)

Call:
lm(formula = Sepal.Length ~ Petal.Length * Species, data = iris)

Residuals:
     Min       1Q   Median       3Q      Max
-0.73479 -0.22785 -0.03132  0.24375  0.93608

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)
(Intercept)                      4.2132     0.4074  10.341  < 2e-16 ***
Petal.Length                     0.5423     0.2768   1.959  0.05200 .
Speciesversicolor               -1.8056     0.5984  -3.017  0.00302 **
Speciesvirginica                -3.1535     0.6341  -4.973 1.85e-06 ***
Petal.Length:Speciesversicolor   0.2860     0.2951   0.969  0.33405
Petal.Length:Speciesvirginica    0.4534     0.2901   1.563  0.12029
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3365 on 144 degrees of freedom
Multiple R-squared: 0.8405,   Adjusted R-squared: 0.8349
F-statistic: 151.7 on 5 and 144 DF, p-value: < 2.2e-16
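
For parts a and b, the three per-species lines can be read off the coefficient table: setosa is the reference level, and each other species adds its own intercept offset and slope adjustment. A minimal sketch of that bookkeeping, assuming only base R; the name `lines_by_species` is illustrative and not part of the original answer.

```r
# Refit the interaction model from the answer (iris ships with base R).
fitiris <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)
b <- coef(fitiris)

# setosa is the reference level, so its line uses the baseline terms only;
# versicolor and virginica add their offset and interaction coefficients.
lines_by_species <- data.frame(
  Species   = levels(iris$Species),
  intercept = unname(b["(Intercept)"] +
                       c(0, b["Speciesversicolor"], b["Speciesvirginica"])),
  slope     = unname(b["Petal.Length"] +
                       c(0, b["Petal.Length:Speciesversicolor"],
                         b["Petal.Length:Speciesvirginica"]))
)
print(lines_by_species, digits = 4)
```

Using the rounded coefficients printed in the summary, the lines are approximately setosa: Sepal.Length = 4.2132 + 0.5423 Petal.Length; versicolor: Sepal.Length = 2.4076 + 0.8283 Petal.Length; virginica: Sepal.Length = 1.0597 + 0.9957 Petal.Length. Each slope is the estimated change in mean sepal length, in cm, for every additional 1 cm of petal length within that species.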

> plot(iris$Petal.Length, iris$Sepal.Length, pch=21, bg=c("red","green3","blue")[unclass(iris$Species)], main="Edgar Anderson's Iris Data", xlab="Petal.Length", ylab="Sepal.Length")
> interact_plot(fitiris, pred = Petal.Length, modx = Species)
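
Part c asks for the three lines on one scatterplot of Sepal.Length against Petal.Length. One way to do this without extra packages is sketched below, assuming base R graphics; the colour vector `cols` and the choice to draw each line only over its own species' petal-length range are illustrative choices, not taken from the original answer.

```r
fitiris <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)
cols <- c(setosa = "red", versicolor = "green3", virginica = "blue")

# Scatterplot with the response (Sepal.Length) on the y-axis.
plot(iris$Petal.Length, iris$Sepal.Length, pch = 21,
     bg = cols[as.character(iris$Species)],
     xlab = "Petal.Length (cm)", ylab = "Sepal.Length (cm)")

# One fitted line per species, drawn over that species' observed range.
for (s in levels(iris$Species)) {
  rng  <- range(iris$Petal.Length[iris$Species == s])
  grid <- data.frame(Petal.Length = rng,
                     Species = factor(s, levels = levels(iris$Species)))
  lines(grid$Petal.Length, predict(fitiris, newdata = grid),
        col = cols[s], lwd = 2)
}
legend("topleft", legend = names(cols), col = cols, lwd = 2,
       pch = 21, pt.bg = cols)
```

Restricting each line to its species' range avoids extrapolating, for example, the setosa line out to petal lengths that no setosa flower in the data has.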
> (4.2132 - 1.8056) + (0.5423 + 0.2860) * 3.4
[1] 5.22382
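
The part d prediction can be cross-checked with `predict()`, and the same pattern extends to parts e to h by fitting the no-interaction model alongside the interaction model. For the partial F-test in part e, the null hypothesis is that both Petal.Length:Species coefficients are zero (one common slope), against the alternative that at least one is nonzero. This is a hedged sketch, assuming only base R; the object names (`fit_int`, `fit_add`, `pred_d`, `ftest`, `pi_g`, `r2_add`) are illustrative and not from the original answer.

```r
# Interaction model from the answer, plus the additive (no-interaction) model.
fit_int <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)
fit_add <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)

# (d) Versicolor iris with a 3.4 cm petal, using the interaction model.
pred_d <- predict(fit_int,
                  newdata = data.frame(Petal.Length = 3.4,
                                       Species = "versicolor"))

# (e) Partial F-test: compares the nested models; the second row of the
#     table holds the F statistic and p-value for the interaction terms.
ftest <- anova(fit_add, fit_int)

# (f) Coefficients of the no-interaction model (one slope for all species).
coef_add <- coef(fit_add)

# (g) 90% prediction interval for a Virginica iris with a 5.5 cm petal.
pi_g <- predict(fit_add,
                newdata = data.frame(Petal.Length = 5.5,
                                     Species = "virginica"),
                interval = "prediction", level = 0.90)

# (h) R-squared of the no-interaction model.
r2_add <- summary(fit_add)$r.squared

print(pred_d)
print(ftest)
print(coef_add)
print(pi_g)
print(r2_add)
```

In part f the single Petal.Length coefficient is interpreted as the change in mean sepal length per 1 cm of petal length, holding species fixed. Part g needs `interval = "prediction"` rather than `"confidence"`, because the statement is about one individual flower and not about the mean of all such flowers. In part h, R-squared is the proportion of the variation in Sepal.Length explained by Petal.Length and Species together.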
