Question

The consumer food database contains five variables: Annual Food Spending per Household, Annual Household Income, Non-Mortgage Household Debt, Geographic Region of the U.S. of the Household, and Household Location. There are 200 entries for each variable in this database representing 200 different households from various regions and locations in the United States. Annual Food Spending per Household, Annual Household Income, and Non-Mortgage Household Debt are all given in dollars. The variable Region tells in which one of four regions the household resides. In this variable, the Northeast is coded as 1, the Midwest is coded 2, the South is coded as 3, and the West is coded as 4. The variable Location is coded as 1 if the household is in a metropolitan area and 2 if the household is outside a metro area. The data in this database were randomly derived and developed based on actual national norms.

Provide a 1,600-word detailed, statistical report including the following:

  • Explain the context of the case
  • Provide a research foundation for the topic
  • Present graphs
  • Explain outliers
  • Prepare calculations
  • Conduct hypotheses tests
  • Discuss inferences you have made from the results

This assignment is broken down into four parts:

  • Part 1 - Preliminary Analysis
  • Part 2 - Examination of Descriptive Statistics
  • Part 3 - Examination of Inferential Statistics
  • Part 4 - Conclusion/Recommendations

Part 1 - Preliminary Analysis (3-4 paragraphs)

Generally, as a statistics consultant, you will be given a problem and data. At times, you may have to gather additional data. For this assignment, assume all the data is already gathered for you.

State the objective:

  • What are the questions you are trying to address?

Describe the population in the study clearly and in sufficient detail:

  • What is the sample?

Discuss the types of data and variables:

  • Are the data quantitative or qualitative?
  • What are levels of measurement for the data?

Part 2 - Descriptive Statistics (3-4 paragraphs)

Examine the given data.

Present the descriptive statistics (mean, median, mode, range, standard deviation, variance, CV, and five-number summary).

Identify any outliers in the data.

Present any graphs or charts you think are appropriate for the data.

Note: Ideally, we want to assess the conditions of normality too. However, for the purpose of this exercise, assume data is drawn from normal populations.

Part 3 - Inferential Statistics (2-3 paragraphs)

Use the Part 3: Inferential Statistics document.

  • Create (formulate) hypotheses
  • Run formal hypothesis tests
  • Make decisions. Your decisions should be stated in non-technical terms.

Hint: A final conclusion saying "reject the null hypothesis" by itself without explanation is basically worthless to those who hired you. Similarly, stating the conclusion is false or rejected is not sufficient.

Part 4 - Conclusion and Recommendations (1-2 paragraphs)

Include the following:

  • What are your conclusions?
  • What do you infer from the statistical analysis?
  • State the interpretations in non-technical terms. What information might lead to a different conclusion?
  • Are there any variables missing?
  • What additional information would be valuable to help draw a more certain conclusion?

Solutions

Expert Solution

Preliminary Analysis

The purpose of this case study is to analyze the consumer food spending data provided by the University of Phoenix for four regions of the United States, with emphasis on the Midwest (coded as Region 2 in the data set). The data set contains five variables with 200 observations each. The case study centers on three objectives: 1.) test whether the average annual food spending for a household in the Midwest region of the U.S. is more than $8,000, using a 1% level of significance; 2.) test whether there is a significant difference in annual food spending between households in a metro area and households outside metro areas, using α = 0.01; and 3.) perform three one-way ANOVAs, one for each of the three dependent variables (Annual Food Spending, Annual Household Income, Non-Mortgage Household Debt), using Region as the independent variable with four classification levels (the four regions of the U.S.), and find all significant differences by region.

The central question is: “Is the average annual food spending for a household located in the Midwest region of the United States greater than $8,000?” The population of interest is households in the United States, and the sample consists of 200 households drawn from various regions and locations. The independent variables are qualitative. Region is coded 1 - Northeast, 2 - Midwest, 3 - South, and 4 - West. Location is coded 1 if the household is in a metropolitan area and 2 if the household is outside a metropolitan area.

The data set also contains quantitative data that serve as the dependent variables: Annual Food Spending per Household, Annual Household Income, and Non-Mortgage Household Debt, all measured in U.S. dollars. Each observation is classified by region and location. Because the three dollar variables have a true zero and meaningful differences, they are measured at the ratio level. Region and Location are categories coded as numbers and are therefore nominal. The calculations show variation among the regions, and the ratio-level dollar variables are the ones used to answer the question.

Descriptive Statistics

Descriptive statistics are important tools for describing the features of a data set. They summarize the sample through measures such as the mean and the spread, and their overall function is to describe what the data are and what the data show. Descriptive statistics help us simplify large amounts of data in a sensible way ("Descriptive Statistics", 2017). The descriptive analysis in this section was produced with the Data Analysis function of the Analysis ToolPak in Microsoft Excel 2016. Two outliers were identified. The first was in Annual Food Spending: a value of 17,740, which exceeds the upper bound of 16,974. The second was in Annual Household Income: a value of 96,132, which exceeds the upper bound for that variable. Non-Mortgage Household Debt had no identified outliers. The tables below present the mean, median, mode, range, standard deviation, variance, CV, and five-number summary.

A.) Descriptive Analysis for Consumer Food data: Annual Food Spending

B.) Descriptive Analysis for Consumer Food data: Annual Household Income

C.) Descriptive Analysis for Consumer Food data: Non-Mortgage Household Debt
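The statistics listed for this section can also be reproduced outside Excel. The sketch below is a minimal illustration, not the author's workbook: the function name `describe` and the sample values are hypothetical, and it assumes outliers are flagged with the common 1.5 × IQR fences, which the solution does not state explicitly.

```python
import statistics

def describe(values):
    """Summarize one dollar variable and flag outliers with 1.5 * IQR fences."""
    data = sorted(values)
    mean = statistics.mean(data)
    stdev = statistics.stdev(data)  # sample standard deviation
    # "inclusive" treats the data min and max as the 0th and 100th percentiles
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "mean": mean,
        "median": median,
        "mode": statistics.multimode(data),
        "range": data[-1] - data[0],
        "stdev": stdev,
        "variance": statistics.variance(data),
        "cv": stdev / mean,  # coefficient of variation
        "five_number": (data[0], q1, median, q3, data[-1]),
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }

# Hypothetical food-spending values in dollars (NOT taken from the database).
sample = [4000, 6000, 7000, 8000, 9000, 10000, 11000, 12000, 30000]
summary = describe(sample)
print(summary["five_number"], summary["upper_fence"], summary["outliers"])
```

Applied to a full column such as Annual Food Spending, any value above the upper fence would be reported in the same way as the 17,740 figure discussed above.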

Inferential Analysis

In this part of the case study, several tests are run using inferential analysis, in which conclusions about the population are drawn from the sample provided. The first test examines whether the average annual food spending for a household in the Midwest region of the U.S. is more than $8,000, using the Midwest region data and a 1% level of significance. The second test examines whether there is a significant difference in annual food spending between households in a metro area and households outside metro areas, with α = 0.01. The third test analyzes annual food spending, annual household income, and non-mortgage household debt by region to determine whether there are any significant findings.

Test 1

To test whether the average Annual Food Spending per Household in the Midwest region of the U.S. is more than $8,000, the data were filtered to Region 2 (the Midwest) and the descriptive statistics for annual food spending in that region were used. A z test was run of the null hypothesis that average household food spending in the Midwest equals $8,000 against the alternative hypothesis that the average is greater than $8,000. The test did not reject the null hypothesis at the 1% level: z = 1.34 is below the one-tail critical value of 2.326, and the one-tail p-value of 0.09 exceeds 0.01. The sample mean of about $8,660 is above $8,000, but the difference is not statistically significant.

Test Hypothesis:

H0: µ = 8000

H1: µ > 8000

Test Statistics:

z-Test: Two Sample for Means

                                Annual Food Spending    Test
Mean                            8659.688889             8000
Known Variance                  5449631                 5449631
Observations                    45                      45
Hypothesized Mean Difference    0
z                               1.34043846
P(Z<=z) one-tail                0.09005142
z Critical one-tail             2.326347874
P(Z<=z) two-tail                0.180102839
z Critical two-tail             2.575829304
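The output above comes from Excel's two-sample z tool, with the hypothesized value of 8000 entered as a second column. As a cross-check, a one-sample z statistic can be computed directly from the reported mean, variance, and count. This is a minimal sketch that assumes the variance in the table can stand in for the known population variance; the function name `one_sample_z` is illustrative.

```python
import math
from statistics import NormalDist

def one_sample_z(sample_mean, variance, n, mu0):
    """Return (z, upper-tail p-value) for H0: mu = mu0 against H1: mu > mu0."""
    standard_error = math.sqrt(variance / n)
    z = (sample_mean - mu0) / standard_error
    p_upper = 1.0 - NormalDist().cdf(z)
    return z, p_upper

# Mean, variance, and count reported for the Midwest region in the table above.
z, p = one_sample_z(8659.688889, 5449631, 45, 8000)

alpha = 0.01
z_critical = NormalDist().inv_cdf(1 - alpha)  # one-tail critical value, about 2.326
reject = z > z_critical

print(f"z = {z:.3f}, one-tail p = {p:.4f}, reject H0 at 1%: {reject}")
```

Under these assumptions the one-sample statistic comes out near 1.90, larger than the 1.34 in the Excel output (which divides by the standard error of a difference between two samples) but still below the 1% critical value of 2.326.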

Test 2

The second test was performed to determine whether there is a significant difference in annual food spending between households in a metro area and households outside metro areas, with α = 0.01. The annual food spending data were split by location (metro and outside metro) using the descriptive analysis function in Excel. The test performed was a two-sample t-test assuming unequal variances, with the null hypothesis that the two means are equal. The test rejected the null hypothesis in favor of the alternative: t = 2.72 exceeds the two-tail critical value of 2.603, and the two-tail p-value of 0.007 is below 0.01. There is a significant difference in annual food spending between households in metro areas and households outside them, with metro households spending more on average.

Test Hypothesis:

H0: µ metro = µ outside metro

H1: µ metro ≠ µ outside metro

Test Statistics:

t-Test: Two-Sample Assuming Unequal Variances

                                1 Inside Metro    2 Outside Metro
Mean                            9435.933333       8261.2625
Variance                        10526695.37       7904552.956
Observations                    120               80
Hypothesized Mean Difference    0
df                              185
t Stat                          2.719835073
P(T<=t) one-tail                0.003576947
t Critical one-tail             2.34667322
P(T<=t) two-tail                0.007153893
t Critical two-tail             2.602665303
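The t statistic and degrees of freedom in this output can be checked from the summary rows alone. The sketch below applies the standard unequal-variance (Welch) formulas to the reported means, variances, and counts; the function name `welch_t` is illustrative, and the critical value is copied from the table rather than computed.

```python
import math

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, df) for a two-sample t-test assuming unequal variances."""
    v1 = var1 / n1  # squared standard error of the first mean
    v2 = var2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary values from the table: inside metro first, outside metro second.
t_stat, df = welch_t(9435.933333, 10526695.37, 120,
                     8261.2625, 7904552.956, 80)

t_critical_two_tail = 2.602665303  # two-tail critical value reported in the table
reject = abs(t_stat) > t_critical_two_tail

print(f"t = {t_stat:.4f}, df = {df:.1f}, reject H0 at 1%: {reject}")
```

The computed t agrees with the reported t Stat, and the degrees of freedom round to the 185 shown in the table.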

Test 3

The third test examined whether each of the three variables is significantly affected by differences among the four regions. A one-way ANOVA was used to test the null hypothesis that the regional means are equal against the alternative hypothesis that they are not all equal. In every table below the F statistic far exceeds the critical value and the p-value is effectively zero, so the null hypothesis was rejected, and the case study reads this as a significant difference among the regions and within the three data sets. Note that each table below takes the three dollar variables as its groups within a single region (three groups, 2 degrees of freedom between groups), so the F statistics shown compare the variables with one another inside each region.

Test Hypothesis:

H0: µ NE = µ MW = µ South = µ West

H1: at least one regional mean differs from the others

Test Analysis:

The ANOVA summaries display differences among the four regions for Annual Food Spending. The Northeast (Region 1) and the West (Region 4) are similar, with per-household averages of $9,467.98 and $9,492.55; their regional totals across the sampled households average $545,084.50. The Midwest (Region 2) and the South (Region 3) are lower, with per-household averages of $8,659.69 and $7,834; their regional totals average $351,522.00. Average Annual Household Income ranged from a low of $50,508 in the South to a high of $58,141.73 in the West, and the four regional averages have a mean of about $55,117.60. Non-Mortgage Household Debt appeared not to be a major factor among the regions. The regional totals across the sampled households were $824,556.30 in the Northeast (Region 1), $576,322 in the Midwest (Region 2), $748,678 in the South (Region 3), and $971,274.90 in the West (Region 4). These are sums over each region's households, not per-household figures; the per-household averages are $13,742.60, $12,807.20, $18,717, and $17,659.54 respectively. Non-mortgage debt reflects consumer spending other than food spending. The tables below give the one-way ANOVA output for the three dependent variables, one table per region.

ANOVA Tables

ANOVA Table A: Single Factor

Region 1

SUMMARY
Groups                             Count    Sum         Average    Variance
Annual Food Spending ($)           60       568079      9467.98    13937489.34
Annual Household Income ($)        60       3441731     57362.2    288077734.2
Non mortgage household debt ($)    60       824556.3    13742.6    64029624.43

ANOVA
Source of Variation    SS             df     MS         F              P-value    F crit
Between Groups         84295915653    2      4.2E+10    345.4327364    8E-62      4.727093
Within Groups          21596646029    177    1.2E+08
Total                  1.05893E+11    179

ANOVA Table B: Single Factor

Region 2

SUMMARY
Groups                             Count    Sum       Average    Variance
Annual Food Spending ($)           45       389686    8659.69    5449631
Annual Household Income ($)        45       2E+06     54458.4    1.8E+08
Non mortgage household debt ($)    45       576322    12807.2    4.7E+07

ANOVA
Source of Variation    SS          df     MS         F          P-value     F crit
Between Groups         5.77E+10    2      2.9E+10    364.159    1.86E-54    4.769637
Within Groups          1.05E+10    132    7.9E+07
Total                  6.82E+10    134

ANOVA Table C: Single Factor

Region 3

SUMMARY
Groups                             Count    Sum       Average    Variance
Annual Food Spending ($)           40       313358    7834       7410059
Annual Household Income ($)        40       2E+06     50508      1.72E+08
Non mortgage household debt ($)    40       748678    18717      99289894

ANOVA
Source of Variation    SS           df     MS       F           P-value     F crit
Between Groups         3.934E+10    2      2E+10    211.9474    1.26E-39    4.791
Within Groups          1.086E+10    117    9E+07
Total                  5.019E+10    119

ANOVA Table D: Single Factor

Region 4

SUMMARY
Groups                             Count    Sum         Average     Variance
Annual Food Spending ($)           55       522090      9492.545    9378327.9
Annual Household Income ($)        55       3197795     58141.73    172415144
Non mortgage household debt ($)    55       971274.9    17659.54    69306094

ANOVA
Source of Variation    SS          df     MS          F            P-value     F crit
Between Groups         7.47E+10    2      3.73E+10    445.98594    1.32E-66    4.738598
Within Groups          1.36E+10    162    83699855
Total                  8.82E+10    164
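The assignment asks for ANOVAs with Region as the factor, that is, one variable compared across the four regions. Because the summary rows above report each region's count, average, and variance, that F statistic can be sketched without the raw data. The code below is a minimal illustration for Annual Food Spending, assuming the table values are correct as printed; the function name `anova_from_summary` is illustrative, and the result still has to be compared with an F critical value for the resulting degrees of freedom at α = 0.01, which is not computed here.

```python
def anova_from_summary(groups):
    """One-way ANOVA from per-group (count, mean, variance) tuples.

    Returns (F, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n

    # Between-groups sum of squares: spread of the group means around the grand mean
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups sum of squares: recovered from the sample variances
    ss_within = sum((n - 1) * var for n, _, var in groups)

    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Annual Food Spending rows from Tables A-D: (count, average, variance) per region.
food_by_region = [
    (60, 9467.98, 13937489.34),  # Region 1, Northeast
    (45, 8659.69, 5449631),      # Region 2, Midwest
    (40, 7834, 7410059),         # Region 3, South
    (55, 9492.545, 9378327.9),   # Region 4, West
]

f_stat, df_between, df_within = anova_from_summary(food_by_region)
print(f"F = {f_stat:.3f} with ({df_between}, {df_within}) degrees of freedom")
```

The same function can be applied to the income and debt rows, although several of those entries are printed in rounded scientific notation, so the result would only be approximate.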

Conclusion

The mean Annual Food Spending for households in the Midwest region was not significantly different from $8,000 at the 1% level. The sample mean was above $8,000, but a difference of that size could have occurred by chance, reflecting factors such as the season, available produce, the opening and closing of restaurants, and household incomes. Annual food spending for households inside metro areas was significantly different from spending for households outside metro areas, with metro households spending more on average. This suggests that living in a city or metro location is more expensive than living outside one, with costs driven up by availability and convenience. Residents moving to a metro area can be advised to prepare for higher spending than before, and prospective investors can be advised likewise. The comparison of the variables by region shows similarities in annual food spending, while annual incomes vary across the regions. The conclusions of this analysis apply to household living expenses as captured in this sample and should not be extended beyond them.

Furthermore, this type of information can help determine which restaurants, health services, stores, or even schools would be beneficial in particular parts of the United States. Used properly, statistics indicate how likely it is that the variation observed among groups reflects the influence of other variables rather than chance.

