Find the ‘best fit’ equation of net income for last year on last year’s sales. Test the significance of the overall model at 1% level of significance.
Company | Market value | Sales | Profits | Assets | Recent share price | P-E Ratio | Yield |
1 | 42926 | 9663 | 2446.6 | 11086 | 38 | 18 | 2.67 |
2 | 31557 | 37799 | 975.0 | 38870 | 47 | 33 | 3.76 |
3 | 19143 | 7230 | 1093.5 | 9590 | 59 | 18 | 2.85 |
4 | 9915 | 4908 | 737.6 | 19429 | 46 | 16 | 6.66 |
5 | 9094 | 989 | 267.7 | 1203 | 30 | 37 | 0.00 |
6 | 7206 | 13428 | 952.4 | 111896 | 70 | 8 | 1.72 |
7 | 7164 | 5814 | 319.4 | 5662 | 68 | 22 | 2.96 |
8 | 6340 | 3962 | 478.9 | 12578 | 29 | 15 | 4.87 |
9 | 4996 | 3525 | 183.1 | 3987 | 33 | 27 | 1.92 |
10 | 4211 | 3702 | 56.8 | 4070 | 33 | 74 | 3.30 |
11 | 4041 | 4102 | 282.3 | 50863 | 54 | 17 | 2.23 |
12 | 3789 | 1619 | 79.2 | 1490 | 87 | 48 | 0.32 |
13 | 3744 | 8311 | 194.0 | 5458 | 60 | 19 | 3.16 |
14 | 3618 | 3832 | 128.0 | 2769 | 35 | 20 | 0.00 |
15 | 3200 | 3434 | 190.0 | 7483 | 29 | 21 | 5.54 |
16 | 3167 | 2330 | 146.1 | 2458 | 58 | 22 | 2.43 |
17 | 2759 | 3472 | 138.6 | 3175 | 205 | 19 | 0.49 |
18 | 2636 | 1172 | 172.7 | 6455 | 27 | 15 | 3.31 |
19 | 2567 | 3858 | 91.4 | 3188 | 19 | 36 | 3.20 |
20 | 2416 | 6895 | 115.6 | 1812 | 21 | 22 | 0.00 |
21 | 2300 | 1553 | 202.3 | 4802 | 27 | 13 | 6.23 |
22 | 2206 | 1739 | 139.6 | 3005 | 33 | 16 | 0.61 |
23 | 2012 | 3376 | 65.2 | 2994 | 34 | 31 | 1.18 |
24 | 2010 | 1773 | 133.8 | 6859 | 18 | 24 | 0.00 |
25 | 1994 | 3389 | 28.0 | 3266 | 43 | 66 | 2.64 |
26 | 1707 | 644 | 29.4 | 845 | 41 | 58 | 0.98 |
27 | 1612 | 5550 | 120.7 | 3162 | 34 | 13 | 4.71 |
28 | 1404 | 505 | 107.1 | 2273 | 27 | 14 | 5.77 |
29 | 1318 | 2152 | 99.0 | 2008 | 28 | 14 | 2.12 |
30 | 1285 | 1220 | 64.7 | 920 | 13 | 20 | 1.85 |
31 | 1281 | 2867 | 112.6 | 15925 | 28 | 11 | 3.67 |
32 | 1261 | 577 | 60.5 | 628 | 34 | 21 | 0.00 |
33 | 1253 | 840 | 84.9 | 13626 | 38 | 21 | 0.74 |
34 | 1216 | 1386 | 102.6 | 16844 | 38 | 13 | 3.73 |
35 | 1066 | 2219 | 39.1 | 1662 | 21 | 27 | 1.76 |
36 | 1060 | 2650 | 53.7 | 1479 | 35 | 20 | 1.47 |
37 | 1034 | 219 | 10.6 | 250 | 30 | 57 | 0.00 |
38 | 1021 | 819 | 34.3 | 1566 | 37 | 28 | 0.00 |
39 | 1011 | 3352 | 54.4 | 1319 | 21 | 20 | 0.00 |
40 | 956 | 528 | 42.5 | 438 | 27 | 22 | 2.48 |
41 | 832 | 966 | 69.7 | 1844 | 38 | 9 | 2.86 |
42 | 824 | 461 | 55.4 | 502 | 24 | 15 | 0.00 |
43 | 805 | 883 | 16.1 | 495 | 35 | 47 | 0.00 |
44 | 788 | 600 | 39.7 | 584 | 31 | 22 | 0.65 |
45 | 692 | 389 | 26.6 | 497 | 30 | 25 | 0.00 |
46 | 633 | 708 | 35.8 | 1020 | 24 | 20 | 5.45 |
47 | 616 | 526 | 40.9 | 475 | 27 | 16 | 2.12 |
48 | 602 | 351 | 50.7 | 3916 | 48 | 12 | 3.60 |
49 | 585 | 453 | 27.1 | 331 | 44 | 23 | 0.63 |
50 | 581 | 705 | 39.4 | 472 | 20 | 15 | 3.16 |
> # Reading the data in console
> df = read.csv(file.choose(),header = T)
> colnames(df)
[1] "Company"            "Market.value"       "Sales"              "Profits"
[5] "Assets"             "Recent.share.price" "P.E.Ratio"          "Yield"
> dim(df)
[1] 50  8
> str(df)
'data.frame':   50 obs. of  8 variables:
 $ Company           : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Market.value      : int  42926 31557 19143 9915 9094 7206 7164 6340 4996 4211 ...
 $ Sales             : int  9663 37799 7230 4908 989 13428 5814 3962 3525 3702 ...
 $ Profits           : num  2447 975 1094 738 268 ...
 $ Assets            : int  11086 38870 9590 19429 1203 111896 5662 12578 3987 4070 ...
 $ Recent.share.price: int  38 47 59 46 30 70 68 29 33 33 ...
 $ P.E.Ratio         : int  18 33 18 16 37 8 22 15 27 74 ...
 $ Yield             : num  2.67 3.76 2.85 6.66 0 1.72 2.96 4.87 1.92 3.3 ...
> summary(df)
    Company       Market.value       Sales            Profits            Assets
 Min.   : 1.00   Min.   :  581   Min.   :  219.0   Min.   :  10.60   Min.   :   250
 1st Qu.:13.25   1st Qu.: 1024   1st Qu.:  735.8   1st Qu.:  44.55   1st Qu.:  1066
 Median :25.50   Median : 1850   Median : 1962.5   Median :  95.20   Median :  2882
 Mean   :25.50   Mean   : 4209   Mean   : 3468.9   Mean   : 220.71   Mean   :  7951
 3rd Qu.:37.75   3rd Qu.: 3712   3rd Qu.: 3799.5   3rd Qu.: 180.50   3rd Qu.:  6257
 Max.   :50.00   Max.   :42926   Max.   :37799.0   Max.   :2446.60   Max.   :111896
 Recent.share.price   P.E.Ratio         Yield
 Min.   : 13.00     Min.   : 8.00   Min.   :0.000
 1st Qu.: 27.00     1st Qu.:15.25   1st Qu.:0.520
 Median : 33.50     Median :20.00   Median :2.120
 Mean   : 39.52     Mean   :24.40   Mean   :2.196
 3rd Qu.: 42.50     3rd Qu.:26.50   3rd Qu.:3.275
 Max.   :205.00     Max.   :74.00   Max.   :6.660
> plot(df)
> cor(df[,-1])
                   Market.value        Sales    Profits     Assets Recent.share.price
Market.value         1.00000000 0.6922622439  0.9341491  0.2761694       0.1110567665
Sales                0.69226224 1.0000000000  0.5452554  0.5097578       0.1532372418
Profits              0.93414905 0.5452554054  1.0000000  0.4174388       0.1232055395
Assets               0.27616939 0.5097577692  0.4174388  1.0000000       0.1870074786
Recent.share.price   0.11105677 0.1532372418  0.1232055  0.1870075       1.0000000000
P.E.Ratio           -0.01095553 0.0004982515 -0.1736945 -0.2136470      -0.0004741459
Yield                0.17005089 0.1828134131  0.2154797  0.1271457      -0.1053225181
                       P.E.Ratio      Yield
Market.value       -0.0109555282  0.1700509
Sales               0.0004982515  0.1828134
Profits            -0.1736944984  0.2154797
Assets             -0.2136469763  0.1271457
Recent.share.price -0.0004741459 -0.1053225
P.E.Ratio           1.0000000000 -0.2648121
Yield              -0.2648120721  1.0000000
> # Here the net income is the Profits column.
> # Fitting the equation of net income for last year on last year's sales
> mod = lm(Profits~Sales, data = df)
> summary(mod)

Call:
lm(formula = Profits ~ Sales, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-590.41  -98.62  -68.27  -22.69 1983.27

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 84.829673  57.131079   1.485    0.144
Sales        0.039170   0.008692   4.506 4.23e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 343.1 on 48 degrees of freedom
Multiple R-squared:  0.2973,    Adjusted R-squared:  0.2827
F-statistic: 20.31 on 1 and 48 DF,  p-value: 4.229e-05

> # Best-fit equation: Profits = 84.83 + 0.03917 * Sales
> # The overall F-test p-value is 4.229e-05 < 0.01, so the model is significant at the 1% level.
> # Fitting net income for last year on all the remaining variables
> df1 = df[,-1] # the company number is only a unique row label, so it carries no information
> mod1 = lm(Profits~ .,df1)
> summary(mod1)

Call:
lm(formula = Profits ~ ., data = df1)

Residuals:
     Min       1Q   Median       3Q      Max
-211.494  -25.299   -2.766   32.283  142.073

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)        56.3927458 32.4206991   1.739   0.0891 .
Market.value        0.0587314  0.0019425  30.234  < 2e-16 ***
Sales              -0.0262996  0.0029623  -8.878 2.81e-11 ***
Assets              0.0062518  0.0007296   8.568 7.55e-11 ***
Recent.share.price  0.1368850  0.3895267   0.351   0.7270
P.E.Ratio          -2.6194392  0.7825693  -3.347   0.0017 **
Yield               7.8043555  6.0061320   1.299   0.2007
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 73.92 on 43 degrees of freedom
Multiple R-squared:  0.9708,    Adjusted R-squared:  0.9667
F-statistic: 238.2 on 6 and 43 DF,  p-value: < 2.2e-16

> # Market.value, Sales, Assets and P.E.Ratio are each significant at the 1% level (p < 0.01);
> # Recent.share.price and Yield are not. The overall model is also significant (p-value < 2.2e-16).
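The session above reads the data through file.choose(), so it cannot be rerun without the original CSV. As a rough, self-contained sketch of the same simple regression, the Sales and Profits columns can be typed in from the table and the overall F-test carried out explicitly. The names sales, profits, alpha and f_crit are illustrative choices and do not come from the original answer; if the table was transcribed correctly, the fit should agree with the summary(mod) output.

```r
# Sales and Profits columns copied from the table (companies 1 to 50).
sales <- c(9663, 37799, 7230, 4908, 989, 13428, 5814, 3962, 3525, 3702,
           4102, 1619, 8311, 3832, 3434, 2330, 3472, 1172, 3858, 6895,
           1553, 1739, 3376, 1773, 3389, 644, 5550, 505, 2152, 1220,
           2867, 577, 840, 1386, 2219, 2650, 219, 819, 3352, 528,
           966, 461, 883, 600, 389, 708, 526, 351, 453, 705)
profits <- c(2446.6, 975.0, 1093.5, 737.6, 267.7, 952.4, 319.4, 478.9, 183.1, 56.8,
             282.3, 79.2, 194.0, 128.0, 190.0, 146.1, 138.6, 172.7, 91.4, 115.6,
             202.3, 139.6, 65.2, 133.8, 28.0, 29.4, 120.7, 107.1, 99.0, 64.7,
             112.6, 60.5, 84.9, 102.6, 39.1, 53.7, 10.6, 34.3, 54.4, 42.5,
             69.7, 55.4, 16.1, 39.7, 26.6, 35.8, 40.9, 50.7, 27.1, 39.4)

df <- data.frame(Sales = sales, Profits = profits)

# Simple linear regression of net income (Profits) on Sales.
mod <- lm(Profits ~ Sales, data = df)

# summary()$fstatistic is a named vector: value, numdf, dendf.
fstat <- summary(mod)$fstatistic

# Upper-tail p-value of the overall F-test, and the 1% critical value.
alpha   <- 0.01  # significance level given in the question
p_value <- pf(fstat[["value"]], fstat[["numdf"]], fstat[["dendf"]],
              lower.tail = FALSE)
f_crit  <- qf(1 - alpha, fstat[["numdf"]], fstat[["dendf"]])

cat("Intercept:", coef(mod)[["(Intercept)"]], " Slope:", coef(mod)[["Sales"]], "\n")
cat("F =", fstat[["value"]], "on", fstat[["numdf"]], "and", fstat[["dendf"]], "DF\n")
cat("p-value =", p_value, " critical F =", f_crit, "\n")
cat("Significant at the 1% level:", p_value < alpha, "\n")
```

Comparing the p-value with alpha and comparing the F statistic with the critical value are equivalent decision rules; both are printed here only to make the test explicit.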
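The overall F statistic of the simple regression of Profits on Sales can also be checked by hand from the reported R-squared. This is a sketch assuming the usual overall F-test with k = 1 predictor and n = 50 observations. The critical value of roughly 7.19 is taken from standard F tables, not from the R output.

```latex
H_0:\ \beta_1 = 0 \quad \text{vs.} \quad H_1:\ \beta_1 \neq 0

F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}
  = \frac{0.2973 / 1}{(1 - 0.2973)/48}
  \approx 20.31

F \approx 20.31 \;>\; F_{0.01;\,1,\,48} \approx 7.19
\ \Rightarrow\ \text{reject } H_0 \text{ at the 1\% level}
```

With a single predictor the F statistic equals the square of the slope's t value, and 4.506 squared is about 20.30, which matches the reported F up to rounding.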