Assignment #1: Descriptive Statistics Data Analysis Plan
Identifying Information
Student (Full Name):
Class:
Instructor:
Date:
Scenario: Please write a few lines describing your scenario and the four variables (in addition to income) you have selected.
Use Table 1 to report the variables selected for this assignment. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 1. Variables Selected for the Analysis
Variable Name in the Data Set | Description (See the data dictionary for describing the variables.) | Type of Variable (Qualitative or Quantitative) |
Variable 1: “Income” | Annual household income in USD. | Quantitative |
Variable 2: | | |
Variable 3: | | |
Variable 4: | | |
Variable 5: | | |
Reason(s) for Selecting the Variables and Expected Outcome(s):
Variable 1: “Income” -
Variable 2: “ “ -
Variable 3: “ “ -
Variable 4: “ “ -
Variable 5: “ “ -
Data Set Description:
Proposed Data Analysis:
Measures of Central Tendency and Dispersion
Complete Table 2, Numerical Summaries of the Selected Variables, and briefly explain why you chose those measures. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 2. Numerical Summaries of the Selected Variables
Variable Name | Measures of Central Tendency and Dispersion | Rationale for Why Appropriate |
Variable 1: “Income” | Number of Observations; Median; Sample Standard Deviation | I am using the median for two reasons: (1) if there are outliers or the data are not normally distributed, the median is a more robust measure of central tendency than the mean; (2) the variable is quantitative. I am using the sample standard deviation for three reasons: (1) the data are a sample from a larger data set; (2) it is the most commonly used measure of dispersion; (3) the variable is quantitative. |
Variable 2: | | |
Variable 3: | | |
Variable 4: | | |
Variable 5: | | |
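The rationale given for the median and the sample standard deviation can be checked on the income column. The sketch below is illustrative only: it copies the 30 SE-Income values from the data set in this document and uses Python's standard `statistics` module. The variable names are assumptions made for this example.

```python
import math
import statistics

# SE-Income for the 30 households, copied from the data set in this document.
incomes = [
    97681, 96727, 95432, 96928, 94929, 95744, 95366, 96697, 96572, 96653,
    96664, 96621, 96886, 96244, 94867, 98351, 109312, 111478, 107511, 95835,
    110553, 95706, 110651, 98491, 99610, 97663, 115766, 107235, 106627, 109523,
]

n = len(incomes)                             # number of observations
median_income = statistics.median(incomes)   # middle value, robust to outliers
mean_income = statistics.mean(incomes)       # pulled toward large values
sample_sd = statistics.stdev(incomes)        # divides by n - 1 (sample)
population_sd = statistics.pstdev(incomes)   # divides by n (population)

print(f"n = {n}")
print(f"median = {median_income:.0f}, mean = {mean_income:.0f}")
print(f"sample SD = {sample_sd:.0f}, population SD = {population_sd:.0f}")
```

For these values the median is 96,907 and the mean is larger, which is consistent with a right-skewed distribution and supports reporting the median. The sample standard deviation is slightly larger than the population version because it divides by n - 1 instead of n.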
Graphs and/or Tables
Complete Table 3, Type of Graphs and/or Tables for Selected Variables, and briefly explain why you chose those graphs and/or tables. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 3. Type of Graphs and/or Tables for Selected Variables
Variable Name | Graph and/or Table | Rationale for Why Appropriate |
Variable 1: “Income” | Graph: I will use a histogram to show the distribution of the data. | A histogram is one of the best plots for showing the shape of the distribution of quantitative data, including whether it is approximately normal. |
Variable 2: | | |
Variable 3: | | |
Variable 4: | | |
Variable 5: | | |
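To see what a histogram of income would reveal, the bin counts can be tallied directly. This is a minimal sketch, not a prescribed method: the 5,000 USD bin width, the helper `bin_counts`, and the text rendering are all assumptions chosen for illustration, and the income values are copied from the data set in this document.

```python
from collections import Counter

# SE-Income for the 30 households, copied from the data set in this document.
incomes = [
    97681, 96727, 95432, 96928, 94929, 95744, 95366, 96697, 96572, 96653,
    96664, 96621, 96886, 96244, 94867, 98351, 109312, 111478, 107511, 95835,
    110553, 95706, 110651, 98491, 99610, 97663, 115766, 107235, 106627, 109523,
]

BIN_WIDTH = 5_000  # assumed bin width in USD; other widths are equally valid


def bin_counts(values, width):
    """Count values per bin, keyed by the lower edge of each bin."""
    counts = Counter((v // width) * width for v in values)
    lowest, highest = min(counts), max(counts)
    # Include empty bins so gaps in the distribution stay visible.
    return {start: counts.get(start, 0)
            for start in range(lowest, highest + width, width)}


histogram = bin_counts(incomes, BIN_WIDTH)
for start, count in histogram.items():
    print(f"{start:>7,}-{start + BIN_WIDTH - 1:>7,} | {'#' * count}")
```

With this bin width, 19 of the 30 households fall between 95,000 and 99,999 and a second cluster sits above 105,000, so the histogram shows the shape of the distribution and does not by itself confirm normality.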
Missing data
STAT200 Introduction to Statistics
The dataset for Written Assignments
Description of Dataset:
The data is a random sample from the US Department of Labor’s 2016 Consumer Expenditure Surveys (CE) and provides information about the composition of households and their annual expenditures (https://www.bls.gov/cex/). It contains information from 30 households, where a survey responder provided the requested information; it is all self-reported information. This dataset contains four socioeconomic variables (whose names start with SE) and four expenditure variables (whose names start with USD).
Description of Variables/Data Dictionary:
The following table is a data dictionary that describes the variables and their locations in this dataset (Note: the dataset follows the data dictionary):
Variable Name | Location in Dataset | Variable Description | Coding |
UniqueID# | First Column | Unique number used to identify each survey responder | Each responder has a unique number from 1-30 |
SE-MaritalStatus | Second Column | Marital Status of Head of Household | Not Married/Married |
SE-Income | Third Column | Annual Household Income | Amount in US Dollars |
SE-AgeHeadHousehold | Fourth Column | Age of the Head of Household | Age in Years |
SE-FamilySize | Fifth Column | Total Number of People in Family (Both Adults and Children) | Number of People in Family |
USD-AnnualExpenditures | Sixth Column | Total Amount of Annual Expenditures | Amount in US Dollars |
USD-Housing | Seventh Column | Total Amount of Annual Expenditure on Housing | Amount in US Dollars |
USD-Electricity | Eighth Column | Total Amount of Annual Expenditure on Electricity | Amount in US Dollars |
USD-Water | Ninth Column | Total Amount of Annual Expenditure on Water | Amount in US Dollars |
How to read the data set: Each row contains information from one household. For instance, the first row of the dataset below shows that the head of household is not married and is 53 years old, and that the household has an annual income of $97,681, a family size of 4, and annual expenditures of $56,124, of which $18,676 is spent on housing, $1,468 on electricity, and $551 on water.
UniqueID# | SE-MaritalStatus | SE-Income | SE-AgeHeadHousehold | SE-FamilySize | USD-AnnualExpenditures | USD-Housing | USD-Electricity | USD-Water |
1 | Not Married | 97681 | 53 | 4 | 56124 | 18676 | 1468 | 551 |
2 | Not Married | 96727 | 39 | 2 | 56440 | 18376 | 1441 | 542 |
3 | Not Married | 95432 | 51 | 1 | 55120 | 18391 | 1458 | 548 |
4 | Not Married | 96928 | 43 | 3 | 55932 | 18701 | 1479 | 520 |
5 | Not Married | 94929 | 59 | 2 | 55247 | 18483 | 1451 | 546 |
6 | Not Married | 95744 | 52 | 4 | 55963 | 18435 | 1465 | 555 |
7 | Not Married | 95366 | 48 | 2 | 57082 | 18576 | 1478 | 538 |
8 | Not Married | 96697 | 49 | 2 | 56453 | 18520 | 1469 | 545 |
9 | Not Married | 96572 | 59 | 2 | 56515 | 18648 | 1480 | 552 |
10 | Not Married | 96653 | 51 | 4 | 56488 | 18838 | 1470 | 535 |
11 | Not Married | 96664 | 53 | 3 | 55558 | 18502 | 1478 | 553 |
12 | Not Married | 96621 | 54 | 2 | 55746 | 18149 | 1455 | 540 |
13 | Not Married | 96886 | 44 | 2 | 55321 | 18312 | 1450 | 523 |
14 | Not Married | 96244 | 56 | 4 | 56051 | 18484 | 1457 | 539 |
15 | Not Married | 94867 | 60 | 1 | 55512 | 18633 | 1485 | 523 |
16 | Married | 98351 | 34 | 3 | 76558 | 26513 | 1342 | 547 |
17 | Married | 109312 | 37 | 6 | 80801 | 25392 | 1514 | 743 |
18 | Married | 111478 | 29 | 5 | 82699 | 24949 | 1503 | 814 |
19 | Married | 107511 | 56 | 3 | 83347 | 22915 | 1723 | 773 |
20 | Married | 95835 | 54 | 3 | 73092 | 23252 | 1300 | 705 |
21 | Married | 110553 | 23 | 4 | 81419 | 26991 | 1421 | 719 |
22 | Married | 95706 | 52 | 4 | 71597 | 22376 | 1315 | 694 |
23 | Married | 110651 | 58 | 4 | 83766 | 22899 | 1682 | 754 |
24 | Married | 98491 | 22 | 3 | 75996 | 26283 | 1326 | 620 |
25 | Married | 99610 | 36 | 2 | 73550 | 27164 | 1330 | 627 |
26 | Married | 97663 | 51 | 3 | 72971 | 23150 | 1320 | 689 |
27 | Married | 115766 | 41 | 4 | 83448 | 25679 | 1511 | 767 |
28 | Married | 107235 | 38 | 6 | 83471 | 26074 | 1486 | 769 |
29 | Married | 106627 | 56 | 3 | 82676 | 22414 | 1688 | 709 |
30 | Married | 109523 | 37 | 5 | 84002 | 26771 | 1457 | 768 |
I selected MaritalStatus, FamilySize, AnnualExpenditures, and Food.
The data set covers 30 households and records family size, income, and expenditures. The scenario is that of a married single parent with a family size of 2; the head of the household is 35 years old. The aim is to determine a household budget plan. In addition to income, the variables selected are the marital status of the head of the household, family size, annual expenditures, and food expenditure.
Reason(s) for Selecting the Variables and Expected Outcome(s):
Data Set Description:
The data were analyzed in two parts, split by the marital status of the head of the household.
The first part covers households whose head is not married. Their mean income is USD 96458. The average age of the head of household is 50 years, and the family size is 2 to 3. Annual expenditures are approximately USD 55918, with food expenses of USD 7018.
For households with married heads, the mean income is USD 103264, annual expenses are USD 77807, and food expenses alone are USD 10210.5. The family size is between 3 and 4 members, and the average age of the head of the household is 45 years.
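A two-part summary of this kind can be reproduced by grouping households on marital status and averaging within each group. The sketch below is one possible way to do it, under stated assumptions: the `households` dictionary and the `summary` layout are hypothetical names for this example, and the (SE-Income, USD-AnnualExpenditures) pairs are copied from the data set as printed in this document, so the printed means are not guaranteed to match figures computed from a different version of the data.

```python
import statistics

# (SE-Income, USD-AnnualExpenditures) per household, grouped by SE-MaritalStatus.
households = {
    "Not Married": [
        (97681, 56124), (96727, 56440), (95432, 55120), (96928, 55932),
        (94929, 55247), (95744, 55963), (95366, 57082), (96697, 56453),
        (96572, 56515), (96653, 56488), (96664, 55558), (96621, 55746),
        (96886, 55321), (96244, 56051), (94867, 55512),
    ],
    "Married": [
        (98351, 76558), (109312, 80801), (111478, 82699), (107511, 83347),
        (95835, 73092), (110553, 81419), (95706, 71597), (110651, 83766),
        (98491, 75996), (99610, 73550), (97663, 72971), (115766, 83448),
        (107235, 83471), (106627, 82676), (109523, 84002),
    ],
}

# Per-group sample size and means for the two quantitative variables.
summary = {
    status: {
        "n": len(rows),
        "mean_income": statistics.mean([income for income, _ in rows]),
        "mean_expenditures": statistics.mean([spent for _, spent in rows]),
    }
    for status, rows in households.items()
}

for status, stats in summary.items():
    print(f"{status}: n={stats['n']}, "
          f"mean income={stats['mean_income']:.0f}, "
          f"mean expenditures={stats['mean_expenditures']:.0f}")
```

Grouping first and summarizing second keeps the qualitative variable (marital status) separate from the quantitative ones, which is why the comparison is reported per group.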
Table 1. Variables Selected for the Analysis
Variable Name in the Data Set | Description | Type of Variable (Qualitative or Quantitative) |
Variable 1: “Income” | Annual household income in USD. | Quantitative |
Variable 2: “MaritalStatus” | Marital Status of Head of Household | Qualitative |
Variable 3: “FamilySize” | Total Number of People in Family (Both Adults and Children) | Quantitative |
Variable 4: “AnnualExpenditures” | Total Amount of Annual Expenditures | Quantitative |
Variable 5: “Food” | Total Amount of Annual Expenditure on Food | Quantitative |