1. What role does tenure play in analyzing customer profitability for online and offline customers? (Do a regression analysis with 9profit as the dependent variable and 9tenure and 9online as independent variables. Discuss the results of the regression. Include R-square and its interpretation, the results of the hypothesis test on the whole model's significance, and the significance of the individual variables.)
2. We want to compare profit from 1999 to profit from 2000 for the same customers. Profit for the same customers in year 2000 is given by the variable 0profit. As you can see in the data set, some of the data for year 2000 are missing. How would you compare the profit from 1999 to 2000 if some data are missing? (Here you don't have to do the actual comparison. You don't have to do a hypothesis test. You should discuss how you would handle the missing data.)
Please use the attached link for the data:
https://drive.google.com/file/d/1jSekuFbrHLLJT6KXDz8TiMzcpfG6aVTD/view
1. The summary statistics are given below:
SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.192155908
R Square | 0.036923893
Adjusted R Square | 0.036862998
Standard Error | 267.7633336
Observations | 31634

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 2 | 86948482.61 | 43474241.3 | 606.359 | 0
Residual | 31631 | 2267854222 | 71697.2028 | |
Total | 31633 | 2354802704 | | |

Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 46.36364191 | 2.451080502 | 18.9155933 | 2.33E-79 | 41.55942857 | 51.16785526
X Variable 1 | 16.55286472 | 4.612836494 | 3.588435173 | 0.000333 | 7.511525361 | 25.59420409
X Variable 2 | 6.211189274 | 0.178479097 | 34.80065385 | 2E-260 | 5.861363286 | 6.561015262
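As a sanity check on how to read this output, the headline statistics can be recomputed from the ANOVA sums of squares. The sketch below is only illustrative: the numbers are copied from the table above, and the variable names are my own. R Square is the regression sum of squares divided by the total sum of squares, so a value of about 0.037 means the model explains roughly 3.7% of the variation in profit. The F statistic is the ratio of the regression mean square to the residual mean square, and it is this value that the reported Significance F refers to.

```python
import math

# Values copied from the ANOVA table in the regression output.
ss_regression = 86948482.61
ss_residual = 2267854222.0
ss_total = 2354802704.0
n_obs = 31634
k = 2  # number of independent variables in the model

df_residual = n_obs - k - 1  # residual degrees of freedom

# Share of the variation in profit explained by the model.
r_squared = ss_regression / ss_total

# R Square penalised for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual

# Overall model test: regression mean square over residual mean square.
f_stat = (ss_regression / k) / (ss_residual / df_residual)

# Standard error of the regression (typical size of a residual).
std_error = math.sqrt(ss_residual / df_residual)

print(f"R Square          = {r_squared:.6f}")
print(f"Adjusted R Square = {adj_r_squared:.6f}")
print(f"F                 = {f_stat:.3f}")
print(f"Standard Error    = {std_error:.4f}")
```

Each t Stat in the coefficient table can be checked the same way, by dividing the coefficient by its standard error.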
2. To handle the missing values, there are two broad approaches: deletion and imputation. Deletion comes in three forms: row (listwise) deletion, column deletion, and pairwise deletion. Imputation methods, in turn, depend on whether the data form a general problem or a time-series problem.
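To make the difference between the two approaches concrete, here is a minimal sketch using a handful of made-up profit values (they are not taken from the data set). It compares row deletion, which keeps only customers observed in both years, with simple mean imputation, which is just one of several possible imputation methods. The list names are hypothetical.

```python
from statistics import mean

# Hypothetical toy records (made-up values, not from the data set).
# None marks a customer whose year-2000 profit is missing.
profit_1999 = [21.0, -6.0, 130.0, 58.0, -49.0]
profit_2000 = [30.0, None, 120.0, None, -40.0]

# Option 1: row deletion. Keep only customers with profit in both years,
# then compare the paired values.
pairs = [(p99, p00) for p99, p00 in zip(profit_1999, profit_2000)
         if p00 is not None]
mean_diff_deletion = mean(p00 - p99 for p99, p00 in pairs)

# Option 2: mean imputation. Replace each missing year-2000 value with
# the mean of the observed year-2000 values, then compare all customers.
fill_value = mean(p for p in profit_2000 if p is not None)
imputed_2000 = [fill_value if p is None else p for p in profit_2000]
mean_diff_imputed = mean(p00 - p99
                         for p99, p00 in zip(profit_1999, imputed_2000))

print(f"Customers kept after deletion: {len(pairs)} of {len(profit_1999)}")
print(f"Mean change (row deletion):    {mean_diff_deletion:.2f}")
print(f"Mean change (mean imputation): {mean_diff_imputed:.2f}")
```

The two options generally give different answers. Deletion shrinks the sample and can bias the comparison if the customers with missing data differ systematically from the rest, while mean imputation keeps every customer but understates the variability of the year-2000 profit.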