Description of Variables/Data Dictionary:
The following table is a data dictionary that describes the variables and their locations in this dataset (Note: Dataset is on second page of this document):
Variable Name | Location in Dataset | Variable Description | Coding |
UniqueID# | First Column | Unique number used to identify each survey responder | Each responder has a unique number from 1-30 |
SE-MaritalStatus | Second Column | Marital Status of Head of Household | Not Married/Married |
SE-Income | Third Column | Total Annual Household Income | Amount in US Dollars |
SE-AgeHeadHousehold | Fourth Column | Age of the Head of Household | Age in Years |
SE-FamilySize | Fifth Column | Total Number of People in Family (Both Adults and Children) | Number of People in Family |
USD-Food | Sixth Column | Total Amount of Annual Expenditures on Food | Amount in US Dollars |
USD-Meat | Seventh Column | Total Amount of Annual Expenditure on Meat | Amount in US Dollars |
USD-Bakery | Eighth Column | Total Amount of Annual Expenditure on Bakery | Amount in US Dollars |
USD-Fruits | Ninth Column | Total Amount of Annual Expenditure on Fruit | Amount in US Dollars |
How to read the data set: Each row contains information from one household. For instance, the first row of the dataset starting on the next page shows us that: the head of household is not married and is 39 years old, has an annual household income of $96,727, a family size of 2, annual food expenditures of $7,051, and spends $904 on meat, $345 on bakery items, and $759 on fruit.
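As a minimal sketch of this row layout, the snippet below pairs each cell of the first household with its column name from the data dictionary. The helper name parse_row is hypothetical and only used for illustration; it assumes every column except marital status holds a whole number.

```python
# Column names follow the data dictionary, in dataset order.
COLUMNS = [
    "UniqueID#", "SE-MaritalStatus", "SE-Income", "SE-AgeHeadHousehold",
    "SE-FamilySize", "USD-Food", "USD-Meat", "USD-Bakery", "USD-Fruits",
]

def parse_row(cells):
    """Pair each cell with its column name (hypothetical helper).

    Marital status stays as text; every other cell is read as an integer.
    """
    record = {}
    for name, cell in zip(COLUMNS, cells):
        record[name] = cell if name == "SE-MaritalStatus" else int(cell)
    return record

# First household, exactly as described in the text above.
first_row = parse_row(
    ["1", "Not Married", "96727", "39", "2", "7051", "904", "345", "759"]
)
print(first_row["SE-MaritalStatus"], first_row["SE-Income"], first_row["USD-Fruits"])
```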
UniqueID# | SE-MaritalStatus | SE-Income | SE-AgeHeadHousehold | SE-FamilySize | USD-Food | USD-Meat | USD-Bakery | USD-Fruits |
1 | Not Married | 96727 | 39 | 2 | 7051 | 904 | 345 | 759 |
2 | Not Married | 95366 | 48 | 2 | 7130 | 904 | 344 | 760 |
3 | Not Married | 95432 | 51 | 1 | 7089 | 900 | 350 | 765 |
4 | Not Married | 96886 | 44 | 2 | 6982 | 917 | 359 | 752 |
5 | Not Married | 97469 | 35 | 4 | 6900 | 915 | 335 | 773 |
6 | Not Married | 95744 | 52 | 4 | 7040 | 906 | 353 | 753 |
7 | Not Married | 98717 | 40 | 3 | 7036 | 889 | 348 | 768 |
8 | Not Married | 94929 | 59 | 2 | 6948 | 899 | 345 | 771 |
9 | Not Married | 97912 | 49 | 1 | 6937 | 913 | 353 | 770 |
10 | Not Married | 96244 | 56 | 4 | 7073 | 918 | 338 | 773 |
11 | Not Married | 96621 | 54 | 2 | 7000 | 911 | 344 | 768 |
12 | Not Married | 97681 | 53 | 4 | 7097 | 921 | 341 | 767 |
13 | Not Married | 96697 | 49 | 2 | 6971 | 898 | 357 | 779 |
14 | Not Married | 96522 | 43 | 4 | 6991 | 922 | 349 | 758 |
15 | Not Married | 96664 | 53 | 3 | 7051 | 906 | 346 | 772 |
16 | Married | 95208 | 52 | 4 | 8970 | 1116 | 452 | 979 |
17 | Married | 106622 | 49 | 4 | 10865 | 1554 | 534 | 1240 |
18 | Married | 95801 | 54 | 3 | 9395 | 1211 | 449 | 1018 |
19 | Married | 97611 | 44 | 5 | 9037 | 1147 | 449 | 994 |
20 | Married | 97835 | 30 | 5 | 8671 | 1062 | 390 | 1005 |
21 | Married | 107235 | 38 | 6 | 10856 | 1322 | 549 | 1156 |
22 | Married | 101890 | 48 | 2 | 11089 | 1481 | 541 | 1157 |
23 | Married | 107511 | 56 | 3 | 10682 | 1428 | 564 | 1169 |
24 | Married | 95385 | 50 | 4 | 9101 | 1179 | 450 | 1001 |
25 | Married | 106627 | 56 | 3 | 10363 | 1561 | 585 | 1178 |
26 | Married | 107795 | 51 | 3 | 11278 | 1408 | 544 | 1231 |
27 | Married | 107338 | 67 | 2 | 11710 | 1533 | 541 | 1324 |
28 | Married | 105601 | 19 | 4 | 10330 | 1377 | 568 | 1098 |
29 | Married | 96362 | 37 | 2 | 8789 | 983 | 355 | 1146 |
30 | Married | 99610 | 36 | 2 | 9513 | 721 | 367 | 1025 |
Variable | n | Measure(s) of Central Tendency | Measure(s) of Dispersion |
Variable: Income | | Median= | SD = |
Graph and/or Table: Histogram of Income
(Place Histogram here)
Description of Findings.
Variable | n | Measure(s) of Central Tendency | Measure(s) of Dispersion |
Variable: FamilySize | | Mean= | SD= |
Graph and/or Table.
(Place Graph or Table Here)
Description of Findings.
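The blank cells in the two tables above (n, median or mean, and SD) could be computed along the following lines. This is only a sketch: it assumes the SE-Income and SE-FamilySize values are exactly as listed in the dataset, uses the sample standard deviation (n - 1 divisor), and the function name summarize is a hypothetical helper.

```python
import statistics

# Values copied from the SE-Income and SE-FamilySize columns, in row order.
income = [
    96727, 95366, 95432, 96886, 97469, 95744, 98717, 94929, 97912, 96244,
    96621, 97681, 96697, 96522, 96664, 95208, 106622, 95801, 97611, 97835,
    107235, 101890, 107511, 95385, 106627, 107795, 107338, 105601, 96362, 99610,
]
family_size = [
    2, 2, 1, 2, 4, 4, 3, 2, 1, 4, 2, 4, 2, 4, 3,
    4, 4, 3, 5, 5, 6, 2, 3, 4, 3, 3, 2, 4, 2, 2,
]

def summarize(values):
    """Return n, mean, median and sample standard deviation (n - 1 divisor)."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
    }

income_summary = summarize(income)
family_summary = summarize(family_size)

print(f"Income: n={income_summary['n']}, "
      f"median={income_summary['median']}, SD={income_summary['sd']:.2f}")
print(f"FamilySize: n={family_summary['n']}, "
      f"mean={family_summary['mean']:.2f}, SD={family_summary['sd']:.2f}")
```

The median is reported for income and the mean for family size only because that is what the two tables ask for; the helper returns both so either can be read off.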
ANSWER:
The data dictionary describes the variables and their locations in this dataset.
We consider three variables: marital status, food, and fruit.
Marital status is a categorical variable with two categories: married and not married.
Food and fruit are quantitative variables.
1. Confidence Interval Analysis: For one expenditure variable, select and run the appropriate method for estimating a parameter, based on a statistic (i.e., confidence interval method) and complete the following table
Here we use the confidence interval method for the variable food.
95% confidence interval for population mean (mu) is,
Xbar - E < mu < Xbar + E
where Xbar is sample mean and
E is margin of error.
E = Zc * (sd / sqrt(n))
For 95% confidence Zc = 1.96
Now we have to find sample mean and standard deviation of the data.
Xbar = 8464.83
sd = 1794.65
n = 30
E = 1.96 * (1794.65 / sqrt(30)) = 642.21
lower limit = Xbar - E = 8464.83 - 642.21 = 7822.62
upper limit = Xbar + E = 8464.83 + 642.21 = 9107.04
95% confidence interval for population mean is (7822.62, 9107.04)
So we are 95% confident that the population mean lies between these two limits.
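The interval can be checked with a few lines of code. The sketch below applies E = Zc * (sd / sqrt(n)) to the summary values quoted in this answer (Xbar = 8464.83, sd = 1794.65, n = 30), with the standard deviation, not the mean, in the numerator. The function name z_confidence_interval is a hypothetical helper, and Zc = 1.96 is taken from the text rather than computed.

```python
from math import sqrt

def z_confidence_interval(xbar, sd, n, z_crit=1.96):
    """Return (lower, upper, margin) for a z-based interval for the mean.

    margin = z_crit * sd / sqrt(n); z_crit = 1.96 corresponds to 95% confidence.
    """
    margin = z_crit * sd / sqrt(n)
    return xbar - margin, xbar + margin, margin

# Summary values as quoted in the answer for the food variable.
lower, upper, margin = z_confidence_interval(xbar=8464.83, sd=1794.65, n=30)
print(f"E = {margin:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the standard deviation is estimated from a sample of 30, a t critical value (about 2.045 for 29 degrees of freedom) could be passed as z_crit instead; it gives a slightly wider interval.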
2. Hypothesis Testing: Using the second expenditure variable (with socioeconomic variable as the grouping variable for making two groups), select and run the appropriate method for making decisions about two parameters relative to observed statistics (i.e., two sample hypothesis testing method) and complete the following table
Now we use fruit as the second variable and marital status as the grouping variable.
We can test here,
H0 : mu1 = mu2 Vs H1 : mu1 not= mu2
where mu1 and mu2 are the population mean fruit expenditures for the two marital status groups (married and not married).
Assume alpha = level of significance = 0.05
We can use here two sample t-test assuming equal variances.
We can do this test in MINITAB.
Enter data into MINITAB sheet --> Stat --> Basic Statistics --> 2-Sample t --> Both samples are in one column --> Samples : fruit --> Sample IDs : marital status --> Options --> Confidence level : 95.0 --> Hypothesized difference : 0.0 --> Alternative hypothesis : not equal --> Assume equal variances --> OK --> OK
————— 06-08-2018 11:03:00 ————————————————————
Welcome to Minitab, press F1 for help.
Two-Sample T-Test and CI: Fruit, marital status
Two-sample T for Fruit
marital
status N Mean StDev SE Mean
0 15 1115 107 28
1 15 1226 1782 460
Difference = μ (0) - μ (1)
Estimate for difference: -111
95% CI for difference: (-1055, 833)
T-Test of difference = 0 (vs ≠): T-Value = -0.24 P-Value = 0.811 DF = 28
Both use Pooled StDev = 1262.2626
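The T-Value and pooled standard deviation in this output can be reproduced from the rounded summary rows (N, Mean, StDev) alone. The sketch below uses the usual equal-variance two-sample t formula; the function name pooled_t_from_summary is a hypothetical helper. Because the inputs are the rounded values printed above, the pooled StDev comes out close to, but not exactly, 1262.2626.

```python
from math import sqrt

def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t statistic from group summaries.

    Returns (t, degrees of freedom, pooled standard deviation).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_sd = sqrt(pooled_var)
    std_error = pooled_sd * sqrt(1 / n1 + 1 / n2)
    t = (mean1 - mean2) / std_error
    return t, df, pooled_sd

# Group 0 and group 1 rows exactly as printed in the Minitab output.
t_value, df, pooled_sd = pooled_t_from_summary(15, 1115, 107, 15, 1226, 1782)
print(f"T = {t_value:.2f}, DF = {df}, pooled StDev = {pooled_sd:.1f}")
```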
Test statistic = -0.24
P-value = 0.811
Since P-value > alpha, we fail to reject H0 at the 5% level of significance.
Conclusion : There is no significant evidence that the two population means differ.
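As a cross-check, the same pooled t-test can be run directly on the USD-Fruits column. This is a sketch that assumes the values in the dataset table are the intended data; the function name pooled_t is hypothetical, and 2.048 is the two-sided 5% critical value of t with 28 degrees of freedom. Note that the married group as listed has a mean near 1115 and a standard deviation near 107, matching group 0 in the output above, while the not married group as listed has a mean near 766, not the 1226 (StDev 1782) reported for group 1. The result from the listed table may therefore differ from the quoted output, and the data entered into the worksheet is worth re-checking.

```python
from math import sqrt
from statistics import mean, variance

# USD-Fruits values copied from the dataset table, split by SE-MaritalStatus.
fruit_not_married = [759, 760, 765, 752, 773, 753, 768, 771,
                     770, 773, 768, 767, 779, 758, 772]
fruit_married = [979, 1240, 1018, 994, 1005, 1156, 1157, 1169,
                 1001, 1178, 1231, 1324, 1098, 1146, 1025]

T_CRIT_28 = 2.048  # two-sided critical value, alpha = 0.05, 28 df

def pooled_t(a, b):
    """Equal-variance two-sample t statistic and degrees of freedom."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / std_error, df

t_value, df = pooled_t(fruit_not_married, fruit_married)
decision = "reject H0" if abs(t_value) > T_CRIT_28 else "fail to reject H0"
print(f"mean (not married) = {mean(fruit_not_married):.1f}")
print(f"mean (married)     = {mean(fruit_married):.1f}")
print(f"T = {t_value:.2f}, DF = {df}: {decision}")
```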