Question

Analyze and interpret the effect of the explanatory variables on milk intake (dl.milk) in the kfm data set (ISwR package) using a multiple regression model in R.

1) Run a regression of dl.milk on all other variables. Is there significant evidence that milk intake can be explained by the other variables?

2) Find regression models that use fewer explanatory variables, i.e., select a subset of variables so that a better fit can be achieved.

Solutions

Expert Solution

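Using R Code:

Load the ISwR library, which provides the kfm data set.
The data set has 50 observations and 7 variables:
"no" "dl.milk" "sex" "weight" "ml.suppl" "mat.weight" "mat.height"
Recode sex as a 0/1 indicator: girl = 0, boy = 1 (this is what the ifelse() call in the code produces).
Drop the first column, "no", which is only an observation number.
dl.milk is the dependent variable.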
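
The direction of the 0/1 coding for sex only decides the sign of the sex coefficient, not the quality of the fit. A minimal sketch of that point follows, using a small simulated data frame rather than kfm; the names toy, sex_boy, sex_girl and the simulated values are assumptions made for illustration only.

```r
# Hypothetical toy data (NOT the kfm data): a two-level factor, one numeric
# predictor and a simulated response.
set.seed(1)
toy <- data.frame(
  sex = factor(rep(c("boy", "girl"), each = 10)),
  x   = rnorm(20)
)
toy$y <- 2 + 0.5 * (toy$sex == "boy") + 1.5 * toy$x + rnorm(20, sd = 0.3)

# Coding A: girl = 0, boy = 1 (what ifelse(sex == "girl", 0, 1) produces)
toy$sex_boy <- ifelse(toy$sex == "girl", 0, 1)
fit_boy <- lm(y ~ sex_boy + x, data = toy)

# Coding B: girl = 1, boy = 0
toy$sex_girl <- 1 - toy$sex_boy
fit_girl <- lm(y ~ sex_girl + x, data = toy)

# Leaving sex as a factor: R builds the indicator itself (reference level "boy")
fit_factor <- lm(y ~ sex + x, data = toy)

# The sex coefficient flips sign between codings A and B ...
coef(fit_boy)["sex_boy"]
coef(fit_girl)["sex_girl"]

# ... but the fitted values, and hence R-squared and the F test, are identical.
all.equal(fitted(fit_boy), fitted(fit_girl))
all.equal(unname(coef(fit_factor)["sexgirl"]), unname(coef(fit_girl)["sex_girl"]))
```

Either coding is acceptable as long as the interpretation of the sex coefficient names the group coded as 1.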


R Code:


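library(ISwR)   # provides the kfm data set
library(dplyr)  # provides mutate() and %>%
print(kfm)
dim(kfm)        # 50 rows, 7 columns
names(kfm)
# Recode sex as a numeric indicator: girl = 0, boy = 1
kfm1 <- kfm %>%
  mutate(sex = ifelse(sex == "girl", 0, 1))
head(kfm1)
kfm1 <- kfm1[, -1]  # drop the observation number "no"
# Regress dl.milk on all remaining variables
rgmod1 <- lm(dl.milk ~ ., data = kfm1)
summary(rgmod1)
coefficients(rgmod1)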


output :


> summary(rgmod1)

Call:
lm(formula = dl.milk ~ ., data = kfm1)

Residuals:
     Min       1Q   Median       3Q      Max
-1.74201 -0.81173 -0.00926  0.78326  2.52646

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -12.181372   4.322605  -2.818 0.007212 **
sex           0.499532   0.312672   1.598 0.117284
weight        1.349124   0.322450   4.184 0.000135 ***
ml.suppl     -0.002233   0.001241  -1.799 0.078829 .
mat.weight    0.006212   0.023708   0.262 0.794535
mat.height    0.072278   0.030169   2.396 0.020906 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.075 on 44 degrees of freedom
Multiple R-squared:  0.5459,  Adjusted R-squared:  0.4943
F-statistic: 10.58 on 5 and 44 DF,  p-value: 1.03e-06

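Interpretation:
At the 5% level, weight (p = 0.000135) and mat.height (p = 0.020906) are significant predictors.
sex (p = 0.117), ml.suppl (p = 0.079) and mat.weight (p = 0.795) are not significant at the 5% level; ml.suppl is significant only at the 10% level.
Overall F test: F(5, 44) = 10.58, p = 1.03e-06.
Since p < 0.05, the model as a whole is significant, so milk intake is explained in part by the other variables.
Multiple R-squared = 0.5459, so the model explains about 54.6% of the variation in dl.milk.
The model can be used to predict dl.milk.
Regression model:
dl.milk = -12.181371613 + 0.4995321988*sex + 1.349124010*weight - 0.002232952*ml.suppl + 0.006211857*mat.weight + 0.072278226*mat.height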
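
As a cross-check, the overall F statistic and its p-value can be recomputed from the printed R-squared, and the fitted equation can be evaluated by hand. The sketch below uses only base R and numbers copied from the summary output; the infant described by new_x is hypothetical, and the units noted in its comments are assumed from the ISwR data description.

```r
# Values copied from the printed summary of the full model
r2 <- 0.5459   # Multiple R-squared
k  <- 5        # number of predictors
n  <- 50       # number of observations in kfm

# Overall F statistic: F = (R2 / k) / ((1 - R2) / (n - k - 1))
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))
p_val  <- pf(f_stat, df1 = k, df2 = n - k - 1, lower.tail = FALSE)
round(f_stat, 2)   # agrees with the printed 10.58
signif(p_val, 3)   # about 1e-06: small, but not exactly zero

# Coefficients copied (rounded) from the printed table
b <- c(intercept = -12.181372, sex = 0.499532, weight = 1.349124,
       ml.suppl = -0.002233, mat.weight = 0.006212, mat.height = 0.072278)

# Hypothetical infant (illustrative values, not a row of kfm):
# boy (sex = 1), 5.5 kg, no supplement, mother 60 kg and 165 cm
new_x <- c(intercept = 1, sex = 1, weight = 5.5,
           ml.suppl = 0, mat.weight = 60, mat.height = 165)

# Predicted milk intake in dl per 24 h: sum of coefficient * value
pred <- sum(b * new_x)
pred
```

With the fitted object available, predict(rgmod1, newdata = ...) returns the same kind of prediction without copying coefficients by hand.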


Solution 2:
Now drop the variables that are not significant at the 5% level and refit the model with weight and mat.height only:


R Code:

rgmod2 <- lm(dl.milk~weight+mat.height,data=kfm1)
summary(rgmod2)
output:
Residuals:
     Min       1Q   Median       3Q      Max
-2.19598 -0.82149  0.01822  0.75582  2.83375

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -11.92014    4.07325  -2.926  0.00527 **
weight        1.42862    0.31338   4.559 3.67e-05 ***
mat.height    0.07063    0.02636   2.680  0.01013 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.109 on 47 degrees of freedom
Multiple R-squared:  0.4835,  Adjusted R-squared:  0.4615
F-statistic: 22 on 2 and 47 DF, p-value: 1.811e-0
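
Interpretation:
Multiple R-squared = 0.4835, so the model explains 48.35% of the variation in dl.milk.
The remaining 51.65% is unexplained variation.
Overall F test: F(2, 47) = 22, p < 0.001.
Since p < 0.05, the model is significant, and both weight (p = 3.67e-05) and mat.height (p = 0.01013) are individually significant.
Compared with the full model, the adjusted R-squared falls from 0.4943 to 0.4615 and the residual standard error rises from 1.075 to 1.109, so the reduced model is simpler but does not fit better by these measures.
Final regression model:
(Intercept) weight mat.height
-11.92014253 1.42862096 0.07062876
dl.milk = -11.92014253 + 1.42862096*weight + 0.07062876*mat.height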
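
Whether dropping sex, ml.suppl and mat.weight costs a significant amount of fit can be checked with a partial F test between the two nested models. The sketch below reconstructs that test in base R from the rounded residual standard errors and degrees of freedom printed in the two summaries, so the result is approximate; the variable names are illustrative.

```r
# Values copied from the two printed summaries
rse_full <- 1.075; df_full <- 44   # full model (5 predictors)
rse_red  <- 1.109; df_red  <- 47   # reduced model (weight, mat.height)

# Residual sum of squares = (residual standard error)^2 * residual df
rss_full <- rse_full^2 * df_full
rss_red  <- rse_red^2  * df_red

# Partial F test for the q dropped variables
q <- df_red - df_full              # 3 variables dropped
f_partial <- ((rss_red - rss_full) / q) / (rss_full / df_full)
p_partial <- pf(f_partial, df1 = q, df2 = df_full, lower.tail = FALSE)

round(f_partial, 2)
round(p_partial, 3)
```

The F statistic comes out at about 2 with a p-value above 0.05, so the three dropped variables are not jointly significant at the 5% level, which supports the smaller model on grounds of parsimony. With both fits in the workspace, anova(rgmod2, rgmod1) performs the same comparison on the unrounded values, and step(rgmod1) is one way to automate the search over subsets using AIC.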

