5. What is the skewness and kurtosis of each data set?
6. Generate a histogram plot of each of the data sets.
7. Based on the variability of the data, what do you think the next step would be to analyze the data?
Age | Income |
29 | 9315 |
25 | 6590 |
28 | 9668 |
27 | 8412 |
25 | 1654 |
24 | 2431 |
25 | 6977 |
19 | 8966 |
27 | 9327 |
18 | 3871 |
25 | 9934 |
19 | 2236 |
19 | 3035 |
29 | 2518 |
19 | 3616 |
19 | 9219 |
28 | 1090 |
18 | 5368 |
26 | 2832 |
29 | 1899 |
5. The summary statistics for each data set are as follows:
Age | |
Mean | 23.9 |
Standard Error | 0.9316086822 |
Median | 25 |
Mode | 19 |
Standard Deviation | 4.166280684 |
Sample Variance | 17.35789474 |
Kurtosis | -1.5946043 |
Skewness | -0.3145955467 |
Range | 11 |
Minimum | 18 |
Maximum | 29 |
Sum | 478 |
Count | 20 |
Income | |
Mean | 5447.9 |
Standard Error | 724.4938411 |
Median | 4619.5 |
Mode | None (all 20 values are distinct) |
Standard Deviation | 3240.034956 |
Sample Variance | 10497826.52 |
Kurtosis | -1.769589839 |
Skewness | 0.1720874245 |
Range | 8844 |
Minimum | 1090 |
Maximum | 9934 |
Sum | 108958 |
Count | 20 |
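From the tables above, Age has a skewness of about -0.31 and a kurtosis of about -1.59, while Income has a skewness of about 0.17 and a kurtosis of about -1.77. Both data sets are therefore close to symmetric (Age slightly left-skewed, Income slightly right-skewed) and, since Excel reports excess kurtosis, both are flatter than a normal distribution.
These values were calculated using the Data Analysis option in MS Excel.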
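The summary figures above can also be reproduced outside Excel. The following is a minimal sketch, assuming the Excel output uses the bias-adjusted sample formulas for skewness and excess kurtosis; the function names (`sample_skewness`, `sample_excess_kurtosis`, `describe`) are illustrative and not part of the original answer.

```python
import math
import statistics

age = [29, 25, 28, 27, 25, 24, 25, 19, 27, 18,
       25, 19, 19, 29, 19, 19, 28, 18, 26, 29]
income = [9315, 6590, 9668, 8412, 1654, 2431, 6977, 8966, 9327, 3871,
          9934, 2236, 3035, 2518, 3616, 9219, 1090, 5368, 2832, 1899]


def sample_skewness(values):
    """Bias-adjusted sample skewness: n / ((n-1)(n-2)) * sum(z^3)."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    z3 = sum(((v - mean) / s) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * z3


def sample_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)
    z4 = sum(((v - mean) / s) ** 4 for v in values)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return lead * z4 - correction


def describe(values):
    """Return the main rows of the summary table as a dictionary."""
    modes = statistics.multimode(values)
    return {
        "mean": statistics.fmean(values),
        "standard_error": statistics.stdev(values) / math.sqrt(len(values)),
        "median": statistics.median(values),
        # multimode returns every value when nothing repeats: report no mode
        "mode": modes[0] if len(modes) < len(values) else None,
        "standard_deviation": statistics.stdev(values),
        "sample_variance": statistics.variance(values),
        "kurtosis": sample_excess_kurtosis(values),
        "skewness": sample_skewness(values),
        "range": max(values) - min(values),
        "sum": sum(values),
        "count": len(values),
    }


for name, data in (("Age", age), ("Income", income)):
    print(name)
    for key, value in describe(data).items():
        print(f"  {key}: {value}")
```

Because no Income value repeats, `describe(income)` reports the mode as `None` instead of picking an arbitrary value.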
6. Histogram for Age (each bin value is the upper limit of its interval):
Bin | Frequency | Cumulative % |
18 | 2 | 10.00% |
20 | 5 | 35.00% |
22 | 0 | 35.00% |
24 | 1 | 40.00% |
26 | 5 | 65.00% |
28 | 4 | 85.00% |
30 | 3 | 100.00% |
More | 0 | 100.00% |
Histogram for Income (each bin value is the upper limit of its interval):
Bin | Frequency | Cumulative % |
2000 | 3 | 15.00% |
4000 | 7 | 50.00% |
6000 | 1 | 55.00% |
8000 | 2 | 65.00% |
10000 | 7 | 100.00% |
More | 0 | 100.00% |
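The two frequency tables can be rebuilt with a short script. This is a sketch under the assumption that each bin value is an inclusive upper limit, so a value falls in the first bin whose limit is greater than or equal to it; the `histogram` helper and the text bars are illustrative only.

```python
from bisect import bisect_left

age = [29, 25, 28, 27, 25, 24, 25, 19, 27, 18,
       25, 19, 19, 29, 19, 19, 28, 18, 26, 29]
income = [9315, 6590, 9668, 8412, 1654, 2431, 6977, 8966, 9327, 3871,
          9934, 2236, 3035, 2518, 3616, 9219, 1090, 5368, 2832, 1899]


def histogram(values, upper_limits):
    """Return (bin label, frequency, cumulative %) rows.

    upper_limits must be sorted in ascending order. Values above the last
    limit are counted in a final "More" bin.
    """
    counts = [0] * (len(upper_limits) + 1)
    for v in values:
        # Index of the first limit that is >= v, or len(upper_limits) for "More"
        counts[bisect_left(upper_limits, v)] += 1

    labels = [str(u) for u in upper_limits] + ["More"]
    rows, running = [], 0
    for label, count in zip(labels, counts):
        running += count
        rows.append((label, count, 100.0 * running / len(values)))
    return rows


def print_histogram(title, rows):
    """Print the table in the same layout, with a simple text bar per bin."""
    print(title)
    print("Bin | Frequency | Cumulative % |")
    for label, count, cumulative in rows:
        print(f"{label} | {count} | {cumulative:.2f}% | {'#' * count}")


print_histogram("Histogram for Age:", histogram(age, [18, 20, 22, 24, 26, 28, 30]))
print_histogram("Histogram for Income:", histogram(income, [2000, 4000, 6000, 8000, 10000]))
```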
7. Based on the data, our next step would be to calculate the correlation between Age and Income to check whether the two variables are linearly related.
 | Age | Income |
Age | 1 | |
Income | 0.06171967151 | 1 |
We observe a very low correlation between Age and Income (r ≈ 0.06), so there is essentially no linear relationship between the two variables in this sample.
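The correlation coefficient can be checked directly from its definition. The sketch below assumes the reported value is the Pearson product-moment correlation; `pearson_r` is an illustrative helper, not part of the original answer.

```python
import math

age = [29, 25, 28, 27, 25, 24, 25, 19, 27, 18,
       25, 19, 19, 29, 19, 19, 28, 18, 26, 29]
income = [9315, 6590, 9668, 8412, 1654, 2431, 6977, 8966, 9327, 3871,
          9934, 2236, 3035, 2518, 3616, 9219, 1090, 5368, 2832, 1899]


def pearson_r(xs, ys):
    """Pearson correlation: sum of cross-deviations over the root of the
    product of the two sums of squared deviations."""
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(age, income)
print(f"r = {r:.4f}")
print(f"r squared = {r * r:.4f}")  # share of variance explained by a linear fit
```

The squared coefficient gives the share of the variance in one variable that a straight-line fit on the other would explain, which is a convenient way to read how weak the relationship is.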