Using MS Excel and its random number generator function, generate 30 observations for each of the following columns of daily averages:
Body weight with random values between 100 and 250 lbs
Calorie intake with random values between 1000 and 3000 calories
Workout duration with random values between 0 and 60 minutes
Sleep duration with random values between 2 and 12 hours
Work duration with random values between 0 and 12 hours
Assuming that the values are averages over 1 year, conduct the following:
1. Descriptive statistics for each category
2. Correlation analysis between weight and calorie intake
3. Analysis of variance
4. Regression analysis
Formulate a hypothesis of your choice using weight, calorie intake, workout duration, sleep duration, and/or work duration. Which statistical test would you select to validate the hypothesis?
What are your observations?
Let y denote body weight, x1 calorie intake, x2 workout duration, x3 sleep duration, and x4 work duration.
The random values generated under the given conditions are:
S.No. | y | x1 | x2 | x3 | x4 |
1 | 180 | 1938 | 7 | 2 | 5 |
2 | 149 | 1783 | 58 | 9 | 11 |
3 | 166 | 2974 | 37 | 11 | 0 |
4 | 183 | 1540 | 58 | 11 | 7 |
5 | 173 | 2247 | 14 | 2 | 6 |
6 | 135 | 2398 | 21 | 3 | 3 |
7 | 124 | 2274 | 57 | 10 | 9 |
8 | 172 | 2545 | 43 | 3 | 2 |
9 | 108 | 2194 | 59 | 2 | 9 |
10 | 193 | 2678 | 25 | 4 | 8 |
11 | 191 | 1867 | 14 | 4 | 10 |
12 | 237 | 1412 | 4 | 4 | 7 |
13 | 190 | 1711 | 4 | 12 | 8 |
14 | 192 | 1018 | 30 | 7 | 8 |
15 | 184 | 1090 | 60 | 5 | 3 |
16 | 234 | 2080 | 31 | 8 | 8 |
17 | 192 | 1758 | 40 | 10 | 12 |
18 | 232 | 2107 | 48 | 4 | 4 |
19 | 179 | 1701 | 8 | 10 | 4 |
20 | 212 | 2818 | 44 | 9 | 3 |
21 | 207 | 2758 | 10 | 8 | 7 |
22 | 213 | 2105 | 28 | 3 | 1 |
23 | 179 | 2278 | 37 | 5 | 6 |
24 | 184 | 2359 | 58 | 7 | 4 |
25 | 171 | 2901 | 21 | 8 | 0 |
26 | 113 | 1734 | 22 | 5 | 1 |
27 | 211 | 2242 | 51 | 10 | 6 |
28 | 163 | 1253 | 55 | 7 | 0 |
29 | 246 | 2069 | 9 | 5 | 0 |
30 | 199 | 1014 | 48 | 12 | 4 |
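For readers working outside Excel, a table of this shape could be generated as sketched below. This is a minimal Python illustration, not the procedure used for the values above: the column names, the `generate_observations` helper, and the seed are assumptions made for the example, and `random.randint` stands in for an integer generator such as Excel's RANDBETWEEN.

```python
import random

# (low, high) bounds per column, taken from the problem statement.
# The column names are illustrative, not from the original worksheet.
RANGES = {
    "weight_lb": (100, 250),
    "calories": (1000, 3000),
    "workout_min": (0, 60),
    "sleep_hr": (2, 12),
    "work_hr": (0, 12),
}


def generate_observations(n=30, seed=None):
    """Return n rows of integers drawn uniformly within each column's bounds."""
    rng = random.Random(seed)
    return [
        {name: rng.randint(low, high) for name, (low, high) in RANGES.items()}
        for _ in range(n)
    ]


rows = generate_observations(30, seed=2024)  # the seed is arbitrary
for row in rows[:3]:
    print(row)
```

Because every column is drawn independently, any relationship found between the columns afterwards is due to chance alone.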
1. Descriptive Statistics for each category are
2. Correlation between Body Weight and Calories Intake is
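One way to compute these two items from the table is sketched below in Python, using only the standard library. The helper names `describe` and `pearson_r` are assumptions made for this example, and the set of statistics shown is a subset of what a spreadsheet summary would typically report.

```python
import math
import statistics

# Columns transcribed from the table above.
data = {
    "y": [180, 149, 166, 183, 173, 135, 124, 172, 108, 193, 191, 237, 190, 192, 184,
          234, 192, 232, 179, 212, 207, 213, 179, 184, 171, 113, 211, 163, 246, 199],
    "x1": [1938, 1783, 2974, 1540, 2247, 2398, 2274, 2545, 2194, 2678, 1867, 1412, 1711, 1018, 1090,
           2080, 1758, 2107, 1701, 2818, 2758, 2105, 2278, 2359, 2901, 1734, 2242, 1253, 2069, 1014],
    "x2": [7, 58, 37, 58, 14, 21, 57, 43, 59, 25, 14, 4, 4, 30, 60,
           31, 40, 48, 8, 44, 10, 28, 37, 58, 21, 22, 51, 55, 9, 48],
    "x3": [2, 9, 11, 11, 2, 3, 10, 3, 2, 4, 4, 4, 12, 7, 5,
           8, 10, 4, 10, 9, 8, 3, 5, 7, 8, 5, 10, 7, 5, 12],
    "x4": [5, 11, 0, 7, 6, 3, 9, 2, 9, 8, 10, 7, 8, 8, 3,
           8, 12, 4, 4, 3, 7, 1, 6, 4, 0, 1, 6, 0, 0, 4],
}


def describe(values):
    """Basic descriptive statistics for one column."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length columns."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


summary = {name: describe(col) for name, col in data.items()}
r_weight_calories = pearson_r(data["y"], data["x1"])

for name, stats in summary.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
print("r(y, x1) =", round(r_weight_calories, 4))
```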
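Let us regress body weight (y) on calorie intake (x1), workout duration (x2), sleep duration (x3), and work duration (x4).
The linear regression equation is
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \varepsilon$ ... (1)
where $\beta_0$ is the intercept, $\beta_j$ is the coefficient of $x_j$ (j = 1, 2, 3, 4), and $\varepsilon$ is the random error term.
Our hypothesis tests whether at least one of the regressors is statistically significant, i.e.,
Null hypothesis $H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$
versus
Alternative hypothesis $H_1: \beta_j \neq 0$
for at least one j = 1, 2, 3, 4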
3. The analysis of variance table is given below
Since the p-value (Significance F) = 0.5486 > 0.05, we do not reject the null hypothesis: there is no evidence that any of the independent variables in equation (1) is statistically significant in explaining the dependent variable.
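The sums of squares and the F statistic behind such an analysis of variance table can be reproduced outside Excel. The following is a minimal sketch in plain Python, assuming ordinary least squares fitted through the normal equations on mean-centred data; the helpers `solve`, `ols`, and `anova` are names introduced for this illustration. It stops at the F statistic because the standard library has no F distribution; the p-value would still need a statistics package or a worksheet function such as Excel's F.DIST.RT.

```python
# Columns transcribed from the table above.
y = [180, 149, 166, 183, 173, 135, 124, 172, 108, 193, 191, 237, 190, 192, 184,
     234, 192, 232, 179, 212, 207, 213, 179, 184, 171, 113, 211, 163, 246, 199]
x1 = [1938, 1783, 2974, 1540, 2247, 2398, 2274, 2545, 2194, 2678, 1867, 1412, 1711, 1018, 1090,
      2080, 1758, 2107, 1701, 2818, 2758, 2105, 2278, 2359, 2901, 1734, 2242, 1253, 2069, 1014]
x2 = [7, 58, 37, 58, 14, 21, 57, 43, 59, 25, 14, 4, 4, 30, 60,
      31, 40, 48, 8, 44, 10, 28, 37, 58, 21, 22, 51, 55, 9, 48]
x3 = [2, 9, 11, 11, 2, 3, 10, 3, 2, 4, 4, 4, 12, 7, 5,
      8, 10, 4, 10, 9, 8, 3, 5, 7, 8, 5, 10, 7, 5, 12]
x4 = [5, 11, 0, 7, 6, 3, 9, 2, 9, 8, 10, 7, 8, 8, 3,
      8, 12, 4, 4, 3, 7, 1, 6, 4, 0, 1, 6, 0, 0, 4]


def solve(a, b):
    """Solve the linear system a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - tail) / m[r][r]
    return beta


def ols(y, columns):
    """Least-squares intercept and slopes, computed on mean-centred data."""
    n = len(y)
    y_bar = sum(y) / n
    means = [sum(col) / n for col in columns]
    yc = [v - y_bar for v in y]
    xc = [[v - mean for v in col] for col, mean in zip(columns, means)]
    sxx = [[sum(p * q for p, q in zip(u, v)) for v in xc] for u in xc]
    sxy = [sum(p * q for p, q in zip(u, yc)) for u in xc]
    slopes = solve(sxx, sxy)
    intercept = y_bar - sum(b * mean for b, mean in zip(slopes, means))
    return intercept, slopes


def anova(y, columns):
    """Regression ANOVA: degrees of freedom, sums of squares, F and R-squared."""
    n, k = len(y), len(columns)
    intercept, slopes = ols(y, columns)
    fitted = [intercept + sum(b * col[i] for b, col in zip(slopes, columns))
              for i in range(n)]
    y_bar = sum(y) / n
    ss_total = sum((v - y_bar) ** 2 for v in y)
    ss_regression = sum((f - y_bar) ** 2 for f in fitted)
    ss_residual = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return {
        "df": (k, n - k - 1, n - 1),  # regression, residual, total
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "ss_total": ss_total,
        "f": (ss_regression / k) / (ss_residual / (n - k - 1)),
        "r_squared": ss_regression / ss_total,
    }


table = anova(y, [x1, x2, x3, x4])
for key, value in table.items():
    print(key, value)
```

The printed F and R-squared can be compared with the Significance F and R-Square reported in the solution.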
4. The regression analysis table is given below:
From the 'Summary Output' it is inferred that
a. The p-values of all individual parameters exceed 0.05, so none of the parameter estimates is statistically significantly different from 0.
b. R-Square = 0.1109, so the regressor variables explain only 11.09% of the total variation in the dependent variable; these regressors are not sufficient to explain it.
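The reported R-Square and the overall F test are two views of the same fit, which the short Python sketch below illustrates. It uses only values stated in this solution (n = 30 observations, k = 4 regressors, R-Square = 0.1109) and the standard identities for the overall F statistic and adjusted R-Square; the variable names are chosen for this example.

```python
n = 30              # observations
k = 4               # regressors x1..x4
r_squared = 0.1109  # R-Square reported in the summary output

# Overall F statistic implied by R-Square, with (k, n - k - 1) degrees of freedom.
f_stat = (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# Adjusted R-Square penalises the fit for the number of regressors used.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print("F =", round(f_stat, 4))
print("Adjusted R-Square =", round(adjusted_r_squared, 4))
```

An F statistic below 1 on (4, 25) degrees of freedom is in line with the large Significance F of 0.5486, and the slightly negative adjusted R-Square points the same way: with randomly generated columns, the regressors add no explanatory value beyond chance.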