*** Please solve in R ***
You now pull another sample of 20 policyholders. See “Prob 11-2 claims data.csv” for the data. For this data set, assume a negative binomial distribution is your first choice as a data generating process that could have produced the counts of claims. Perform a negative binomial regression to analyze the data; report your findings.
Claims | Years |
1 | 1.042191 |
7 | 2.497906 |
2 | 3.676603 |
4 | 3.798047 |
1 | 4.870992 |
9 | 7.689958 |
1 | 8.02383 |
6 | 8.305029 |
2 | 9.052364 |
9 | 10.44021 |
5 | 11.44318 |
5 | 12.12458 |
2 | 13.32714 |
4 | 14.65696 |
11 | 14.78745 |
5 | 16.20412 |
0 | 16.27578 |
13 | 16.48458 |
21 | 16.94984 |
22 | 18.3575 |
Answer --> The steps below perform the negative binomial regression in R and show the resulting output.
#### Data input ####
> Claims <- c(1,7,2,4,1,9,1,6,2,9,5,5,2,4,11,5,0,13,21,22)
> Years <- c(1.042191,2.497906,3.676603,3.798047,4.870992,7.689958,8.02383,8.305029,9.052364,10.44021,11.44318,12.12458,13.32714,14.65696,14.78745,16.20412,16.27578,16.48458,16.94984,18.3575)
> Policy_Data <- data.frame(Claims, Years) # Build the data frame
> dim(Policy_Data) # Check the dimensions of the data (20 observations, 2 variables)
[1] 20 2
##### Load the 'MASS' package, which provides glm.nb() for negative binomial regression #####
> library(MASS)
##### Fit the negative binomial regression #####
> negBinomModel <- glm.nb(Claims ~ Years, data = Policy_Data) # Negative binomial model with log link
> negBinomModel
Call:  glm.nb(formula = Claims ~ Years, data = Policy_Data, init.theta = 2.359874145,
    link = log)

Coefficients:
(Intercept)        Years
    0.74543      0.09494

Degrees of Freedom: 19 Total (i.e. Null);  18 Residual
Null Deviance:     30.44
Residual Deviance: 21.81    AIC: 115.8
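For reference, the model that glm.nb() fits with its default log link can be written out as below. The symbols \mu_i, \beta_0, \beta_1 and \theta are notation introduced here for illustration and do not appear in the original question; the variance formula is the standard parameterization used by MASS.

$$
\text{Claims}_i \sim \text{NegBin}(\mu_i, \theta), \qquad
\log \mu_i = \beta_0 + \beta_1\,\text{Years}_i, \qquad
\operatorname{Var}(\text{Claims}_i) = \mu_i + \frac{\mu_i^{2}}{\theta}
$$

With the estimates printed above, the fitted mean is

$$
\hat{\mu}_i = \exp\left(0.74543 + 0.09494\,\text{Years}_i\right)
$$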
The full model summary from R is shown below.
> summary(negBinomModel) # Model summary

Call:
glm.nb(formula = Claims ~ Years, data = Policy_Data, init.theta = 2.359874145,
    link = log)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.7874  -0.8940  -0.3172   0.4420   1.3660

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.74543    0.42534   1.753  0.07968 .
Years        0.09494    0.03441   2.759  0.00579 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Negative Binomial(2.3599) family taken to be 1)

    Null deviance: 30.443  on 19  degrees of freedom
Residual deviance: 21.806  on 18  degrees of freedom
AIC: 115.85

Number of Fisher Scoring iterations: 1

              Theta:  2.36
          Std. Err.:  1.10

 2 x log-likelihood:  -109.849
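Because the model uses a log link, the coefficients in the summary are on the log scale. One way to turn them into reportable findings is to exponentiate them into rate ratios, as in the minimal sketch below. It refits the same model so that it runs on its own; the Wald intervals from confint.default() and the grid of Years values in new_years are illustrative choices, not part of the original answer.

```r
library(MASS)  # provides glm.nb()

# Same data as in the question
Claims <- c(1, 7, 2, 4, 1, 9, 1, 6, 2, 9, 5, 5, 2, 4, 11, 5, 0, 13, 21, 22)
Years <- c(1.042191, 2.497906, 3.676603, 3.798047, 4.870992, 7.689958,
           8.02383, 8.305029, 9.052364, 10.44021, 11.44318, 12.12458,
           13.32714, 14.65696, 14.78745, 16.20412, 16.27578, 16.48458,
           16.94984, 18.3575)
Policy_Data <- data.frame(Claims, Years)

# Refit the negative binomial model (log link is the default)
negBinomModel <- glm.nb(Claims ~ Years, data = Policy_Data)

# Exponentiate the log-scale coefficients to get multiplicative effects
rate_ratios <- exp(coef(negBinomModel))
print(round(rate_ratios, 4))

# Wald 95% confidence intervals, also on the rate-ratio scale
wald_ci <- exp(confint.default(negBinomModel))
print(round(wald_ci, 4))

# Expected claim counts at a few illustrative values of Years (hypothetical grid)
new_years <- data.frame(Years = c(1, 5, 10, 15))
predicted <- predict(negBinomModel, newdata = new_years, type = "response")
print(round(predicted, 2))
```

Using the estimates reported in the summary, exp(0.09494) is about 1.10, so each additional year is associated with roughly a 10% higher expected number of claims. The Years coefficient is significant at the 1% level (p = 0.00579), while the intercept is not significant at the 5% level (p = 0.07968). The estimated dispersion parameter is theta = 2.36 with a standard error of 1.10.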