Scenario: A 28-year-old with a bachelor's degree has no children and is a new homeowner in the state of Maryland. As head of the household, they must determine a household budgeting plan.
Use Table 1 to report the variables selected for this assignment. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
| Variable Name in Dataset | Description | Type of Variable (Qualitative or Quantitative) |
| Income | annual household income in USD | Quantitative |
| Marital Status | | |
| Age | | |
| Family Size | | |
| Housing | | |
Reason(s) for selecting the variable and expected Outcome(s)
1) Income
2) Marital Status
3) Age
4) Family Size
5) Housing
Data Set Description:
Proposed Data Analysis:
Measures of Central Tendency and Dispersion
Complete Table 2. Numerical Summaries of the Selected Variables and briefly explain why you chose those measurements. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 2. Numerical Summaries of the Selected Variables
| Variable Name | Measures of Central Tendency and Dispersion | Rationale for Why Appropriate |
| Variable 1: “Income” | Number of Observations; Median; Sample Standard Deviation | I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative. |
| Marital Status | | |
| Age | | |
| Family Size | | |
| Housing | | |
Graphs and/or Tables
Complete Table 3. Type of Graphs and/or Tables for Selected Variables and briefly explain why you chose those graphs and/or tables. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 3. Type of Graphs and/or Tables for Selected Variables
| Variable Name | Graph and/or Table | Rationale for Why Appropriate |
| Variable 1: “Income” | Graph: I will use a histogram to show the distribution of the data. | A histogram is one of the best plots for showing the distribution of quantitative data, including whether it is approximately normal. |
| Marital Status | | |
| Age | | |
| Family Size | | |
| Housing | | |
The data is a random sample from the US Department of Labor’s 2016 Consumer Expenditure Surveys (CE) and provides information about the composition of households and their annual expenditures (https://www.bls.gov/cex/). It contains information from 30 households, where a survey responder provided the requested information; it is all self-reported information. This dataset contains four socioeconomic variables (whose names start with SE) and four expenditure variables (whose names start with USD).
| Variable name in dataset | Description | Type of variable |
| Income | annual household income in USD | Quantitative |
| Marital Status | married | Qualitative |
| Age | 28 years | Quantitative |
| Family Size | 2 persons | Quantitative |
| Housing | Owned house | Qualitative |
Table 2. Numerical Summaries of the Selected Variables
| Variable name in dataset | Measures of Central Tendency and Dispersion | Rationale for Why Appropriate |
| Income | Number of Observations; Median; Sample Standard Deviation | I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative. |
| Marital Status | Mode | I am using the mode because the variable is qualitative. |
| Age | Number of Observations; Mean; Sample Standard Deviation | I am using mean for two reasons: 1. It is a crude measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative. |
| Family Size | Number of Observations; Mean; Sample Standard Deviation | I am using mean for two reasons: 1. It is a crude measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative. |
| Housing | Mode | I am using the mode because the variable is qualitative. |
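The measures named in Table 2 can be sketched in a few lines of Python. This is only an illustration: the values below are hypothetical and are not taken from the CE data set, and the variable names are made up for the example. It shows why the median is preferred when an outlier is present and how the sample standard deviation and mode are obtained.

```python
import math
import statistics

# Hypothetical values for illustration only; not taken from the CE data set.
income = [42000, 51000, 58000, 63000, 75000, 250000]  # last value is an outlier
marital_status = ["Married", "Not Married", "Married",
                  "Married", "Not Married", "Married"]

n = len(income)                                # number of observations
median_income = statistics.median(income)      # middle value, robust to the outlier
mean_income = statistics.mean(income)          # pulled upward by the outlier
sd_income = statistics.stdev(income)           # sample SD (divides by n - 1)
mode_status = statistics.mode(marital_status)  # most frequent category

# The same sample standard deviation written out by hand.
sd_by_hand = math.sqrt(sum((x - mean_income) ** 2 for x in income) / (n - 1))

print("n =", n)
print("median =", median_income)
print("mean =", round(mean_income, 2))
print("sample SD =", round(sd_income, 2))
print("mode =", mode_status)
```

With one very large income in the list, the mean lands well above the median, which is the situation the Income rationale describes. The mode is the only one of these measures that applies to a qualitative variable such as Marital Status.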
Table 3. Type of Graphs and/or Tables for Selected Variables
| Variable name in dataset | Graph and/or Table | Rationale for Why Appropriate |
| Income | Graph: I will use a histogram to show the distribution of the data. | A histogram is one of the best plots for showing the distribution of quantitative data, including whether it is approximately normal. |
| Marital Status | Pie chart or bar chart | Either chart effectively displays the relative frequencies of a small number of groups of a qualitative variable. |
| Age | Histogram | A histogram is one of the best graphs for showing whether the data follow a normal distribution. |
| Family Size | Histogram | A histogram is one of the best graphs for showing whether the data follow a normal distribution. |
| Housing | Pie chart or bar chart | Either chart effectively displays the relative frequencies of a small number of groups of a qualitative variable. |
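The quantities behind the graphs in Table 3 can be sketched as follows. This is a rough illustration under stated assumptions: the data values are hypothetical, the helper functions `relative_frequencies` and `histogram_counts` are invented for this example, and the bin start and width are arbitrary choices. A bar or pie chart displays relative frequencies of categories, while a histogram displays counts in equal-width bins.

```python
from collections import Counter

# Hypothetical values for illustration only; not taken from the CE data set.
housing = ["Owned", "Rented", "Owned", "Owned", "Rented"]
age = [24, 28, 31, 35, 39, 42, 47, 58]

def relative_frequencies(values):
    """Return each category's share of the total (what a bar or pie chart shows)."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}

def histogram_counts(values, start, width, n_bins):
    """Count values in n_bins equal-width bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside the bin range are ignored
            counts[index] += 1
    return counts

housing_freq = relative_frequencies(housing)
age_bins = histogram_counts(age, start=20, width=10, n_bins=4)

# Text rendering of the histogram: one '#' per observation in each bin.
for i, count in enumerate(age_bins):
    low = 20 + 10 * i
    print(f"{low}-{low + 9}: {'#' * count}")
print(housing_freq)
```

The shape of the bin counts (roughly symmetric, skewed, or with gaps) is what one would inspect to judge whether a quantitative variable looks approximately normal.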