Question

Use methods of descriptive statistics to summarize the data and comment on your findings.

Income ($1000s)  Household Size  Amount Charged ($)
54 3 4,016
30 2 3,159
32 4 5,100
50 5 4,742
31 2 1,864
55 2 4,070
37 1 2,731
40 2 3,348
66 4 4,764
51 3 4,110
25 3 4,208
48 4 4,219
27 1 2,477
33 2 2,514
65 3 4,214
63 4 4,965
42 6 4,412
21 2 2,448
44 1 2,995
37 5 4,171
62 6 5,678
21 3 3,623
55 7 5,301
42 2 3,020
41 7 4,828
54 6 5,573
30 1 2,583
48 2 3,866
34 5 3,586
67 4 5,037
50 2 3,605
67 5 5,345
55 6 5,370
52 2 3,890
62 3 4,705
64 2 4,157
22 3 3,579
29 4 3,890
39 2 2,972
35 1 3,121
39 4 4,183
54 3 3,730
23 6 4,127
27 2 2,921
26 7 4,603
61 2 4,273
30 2 3,067
22 4 3,074
46 5 4,820
66 4 5,149

Solutions

Expert Solution


> Income=scan()
1: 54   30   32   50   31   55   37   40   66   51   25   48   27   33   65   63   42   21   44   37   62   21   55   42   41   54   30   48   34   67   50   67   55   52   62   64   22   29   39   35   39   54   23   27   26   61   30   22   46   66
51:
Read 50 items
> Household=scan()
1: 3   2   4   5   2   2   1   2   4   3   3   4   1   2   3   4   6   2   1   5   6   3   7   2   7   6   1   2   5   4   2   5   6   2   3   2   3   4   2   1   4   3   6   2   7   2   2   4   5   4
51:
Read 50 items
> Amount=scan()
1: 4016   3159   5100   4742   1864   4070   2731   3348   4764   4110   4208   4219   2477   2514   4214   4965   4412   2448   2995   4171   5678   3623   5301   3020   4828   5573   2583   3866   3586   5037   3605   5345   5370   3890   4705   4157   3579   3890   2972   3121   4183   3730   4127   2921   4603   4273   3067   3074   4820   5149
51:
Read 50 items
> d=cbind(Income,Household,Amount)
> colMeans(d)
Income Household Amount
43.48 3.42 3964.06
> c(median(d[,1]),median(d[,2]),median(d[,3]))#medians
[1] 42 3 4090
> hist(d[,1])


> hist(d[,2])


> ##Household size is positively skewed, i.e. most households have only a few members and large households are comparatively rare
> hist(d[,3])


> ##Amount is slightly negatively skewed
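The skewness comments above are read off the histograms. As a rough numerical cross-check, the sketch below computes a moment-based skewness coefficient for household size and amount charged. The helper name `skewness` is introduced here for illustration (it is not a base R function), and it uses the simple population-moment form, which is only one of several common definitions.

```r
# Data copied from the question
Household <- c(3, 2, 4, 5, 2, 2, 1, 2, 4, 3,
               3, 4, 1, 2, 3, 4, 6, 2, 1, 5,
               6, 3, 7, 2, 7, 6, 1, 2, 5, 4,
               2, 5, 6, 2, 3, 2, 3, 4, 2, 1,
               4, 3, 6, 2, 7, 2, 2, 4, 5, 4)
Amount <- c(4016, 3159, 5100, 4742, 1864, 4070, 2731, 3348, 4764, 4110,
            4208, 4219, 2477, 2514, 4214, 4965, 4412, 2448, 2995, 4171,
            5678, 3623, 5301, 3020, 4828, 5573, 2583, 3866, 3586, 5037,
            3605, 5345, 5370, 3890, 4705, 4157, 3579, 3890, 2972, 3121,
            4183, 3730, 4127, 2921, 4603, 4273, 3067, 3074, 4820, 5149)

# Hypothetical helper: third central moment divided by the
# second central moment raised to the power 3/2
skewness <- function(x) {
  dev <- x - mean(x)
  mean(dev^3) / mean(dev^2)^1.5
}

skew_household <- skewness(Household)
skew_amount <- skewness(Amount)
print(c(Household = skew_household, Amount = skew_amount))
```

A positive value indicates a longer right tail and a negative value a longer left tail, so the signs should agree with the histogram comments: positive for household size, negative for amount charged.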
> cor(d)
             Income Household    Amount
Income    1.0000000 0.1725335 0.6309742
Household 0.1725335 1.0000000 0.7528432
Amount    0.6309742 0.7528432 1.0000000
> #Household size and Amount are strongly correlated (r = 0.75). Income is more strongly correlated with Amount (r = 0.63) than with Household size (r = 0.17)
> sqrt(round(c(var(d[,1]),var(d[,2]),var(d[,3])),5))#standard deviation
[1] 14.550742 1.738988 933.494082
> sqrt(round(c(var(d[,1]),var(d[,2]),var(d[,3])),5))/colMeans(d)#coefficient of variation
Income Household Amount
0.3346537 0.5084761 0.2354894
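The means, medians, standard deviations and coefficients of variation above are computed one call at a time. As an alternative, the sketch below collects the same measures, plus quartiles and range, into a single table. The name `summary_table` is made up for this example, and the quartiles use R's default `quantile()` method (type 7), which may differ slightly from textbook or spreadsheet conventions.

```r
# Data copied from the question
Income <- c(54, 30, 32, 50, 31, 55, 37, 40, 66, 51,
            25, 48, 27, 33, 65, 63, 42, 21, 44, 37,
            62, 21, 55, 42, 41, 54, 30, 48, 34, 67,
            50, 67, 55, 52, 62, 64, 22, 29, 39, 35,
            39, 54, 23, 27, 26, 61, 30, 22, 46, 66)
Household <- c(3, 2, 4, 5, 2, 2, 1, 2, 4, 3,
               3, 4, 1, 2, 3, 4, 6, 2, 1, 5,
               6, 3, 7, 2, 7, 6, 1, 2, 5, 4,
               2, 5, 6, 2, 3, 2, 3, 4, 2, 1,
               4, 3, 6, 2, 7, 2, 2, 4, 5, 4)
Amount <- c(4016, 3159, 5100, 4742, 1864, 4070, 2731, 3348, 4764, 4110,
            4208, 4219, 2477, 2514, 4214, 4965, 4412, 2448, 2995, 4171,
            5678, 3623, 5301, 3020, 4828, 5573, 2583, 3866, 3586, 5037,
            3605, 5345, 5370, 3890, 4705, 4157, 3579, 3890, 2972, 3121,
            4183, 3730, 4127, 2921, 4603, 4273, 3067, 3074, 4820, 5149)
d <- data.frame(Income, Household, Amount)

# One row per variable, one column per descriptive measure
summary_table <- data.frame(
  mean   = sapply(d, mean),
  median = sapply(d, median),
  sd     = sapply(d, sd),   # sample standard deviation (n - 1 denominator)
  min    = sapply(d, min),
  q1     = sapply(d, function(x) unname(quantile(x, 0.25))),
  q3     = sapply(d, function(x) unname(quantile(x, 0.75))),
  max    = sapply(d, max)
)

# Coefficient of variation: standard deviation relative to the mean
summary_table$cv <- summary_table$sd / summary_table$mean

print(round(summary_table, 3))
```

Household size has the largest coefficient of variation of the three variables, so it is the most variable relative to its mean, while amount charged is the least.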
> #regression on amount charged using income and household as predictors
> m=lm(Amount~Income+Household)
> m

Call:
lm(formula = Amount ~ Income + Household)

Coefficients:
(Intercept) Income Household
1304.90 33.13 356.30

> summary(m)

Call:
lm(formula = Amount ~ Income + Household)

Residuals:
     Min       1Q   Median       3Q      Max
-1180.62  -155.31     7.05   194.56  1309.66

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 1304.905    197.655   6.602 3.29e-08 ***
Income        33.133      3.968   8.350 7.68e-11 ***
Household    356.296     33.201  10.732 3.12e-14 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 398.1 on 47 degrees of freedom
Multiple R-squared: 0.8256,   Adjusted R-squared: 0.8181
F-statistic: 111.2 on 2 and 47 DF, p-value: < 2.2e-16
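The adjusted R-squared and the F-statistic in this output both follow from the multiple R-squared, the sample size and the number of predictors. The sketch below recomputes them from the rounded R-squared printed above, so the results agree with the output only up to rounding. The variable names are chosen for this example.

```r
# Values read from the summary output above
r2 <- 0.8256   # multiple R-squared (rounded)
n  <- 50       # number of households
k  <- 2        # number of predictors (Income, Household)

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Overall F-statistic on k and n - k - 1 degrees of freedom
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))

print(c(adjusted_r2 = adj_r2, f_statistic = f_stat))
```

An R-squared of 0.8256 means that income and household size together account for roughly 83% of the variation in amount charged in this sample.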

> #model is Amount = 1304.905 + 33.133*Income + 356.296*Household
> #Holding income fixed, each additional household member raises the predicted amount charged by about $356.30; holding household size fixed, each additional $1000 of income raises it by about $33.13
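To illustrate how the fitted equation is used, the sketch below plugs in a hypothetical household that is not taken from the data: income of $40,000 (entered as 40, since income is measured in $1000s) and three members. The function name `predict_amount` is introduced only for this example, and the coefficients are the rounded estimates reported above.

```r
# Fitted equation with the rounded coefficients from the output above;
# income is in $1000s, household is the number of members
predict_amount <- function(income, household) {
  1304.905 + 33.133 * income + 356.296 * household
}

# Hypothetical household: $40,000 income, 3 members
pred <- predict_amount(40, 3)
print(pred)

# Effect of one more household member at the same income
extra_member <- predict_amount(40, 4) - predict_amount(40, 3)
print(extra_member)
```

Predictions of this kind are most trustworthy inside the observed ranges of the data (incomes of 21 to 67 thousand dollars and household sizes of 1 to 7).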

