Question

In: Statistics and Probability


Description of Variables/Data Dictionary:

The following table is a data dictionary that describes the variables and their locations in this dataset (Note: Dataset is on second page of this document):

Variable Name        Location in Dataset  Variable Description                                         Coding
UniqueID#            First Column         Unique number used to identify each survey responder         Each responder has a unique number from 1-30
SE-MaritalStatus     Second Column        Marital Status of Head of Household                          Not Married/Married
SE-Income            Third Column         Total Annual Household Income                                Amount in US Dollars
SE-AgeHeadHousehold  Fourth Column        Age of the Head of Household                                 Age in Years
SE-FamilySize        Fifth Column         Total Number of People in Family (Both Adults and Children)  Number of People in Family
USD-Food             Sixth Column         Total Amount of Annual Expenditures on Food                  Amount in US Dollars
USD-Meat             Seventh Column       Total Amount of Annual Expenditure on Meat                   Amount in US Dollars
USD-Bakery           Eighth Column        Total Amount of Annual Expenditure on Bakery                 Amount in US Dollars
USD-Fruits           Ninth Column         Total Amount of Annual Expenditure on Fruit                  Amount in US Dollars

How to read the data set: Each row contains information from one household. For instance, the first row of the dataset starting on the next page shows us that: the head of household is not married and is 39 years old, has an annual household income of $96,727, a family size of 2, annual food expenditures of $7,051, and spends $904 on meat, $345 on bakery items, and $759 on fruit.

UniqueID#  SE-MaritalStatus  SE-Income  SE-AgeHeadHousehold  SE-FamilySize  USD-Food  USD-Meat  USD-Bakery  USD-Fruits
1          Not Married       96727      39                   2              7051      904       345         759
2          Not Married       95366      48                   2              7130      904       344         760
3          Not Married       95432      51                   1              7089      900       350         765
4          Not Married       96886      44                   2              6982      917       359         752
5          Not Married       97469      35                   4              6900      915       335         773
6          Not Married       95744      52                   4              7040      906       353         753
7          Not Married       98717      40                   3              7036      889       348         768
8          Not Married       94929      59                   2              6948      899       345         771
9          Not Married       97912      49                   1              6937      913       353         770
10         Not Married       96244      56                   4              7073      918       338         773
11         Not Married       96621      54                   2              7000      911       344         768
12         Not Married       97681      53                   4              7097      921       341         767
13         Not Married       96697      49                   2              6971      898       357         779
14         Not Married       96522      43                   4              6991      922       349         758
15         Not Married       96664      53                   3              7051      906       346         772
16         Married           95208      52                   4              8970      1116      452         979
17         Married           106622     49                   4              10865     1554      534         1240
18         Married           95801      54                   3              9395      1211      449         1018
19         Married           97611      44                   5              9037      1147      449         994
20         Married           97835      30                   5              8671      1062      390         1005
21         Married           107235     38                   6              10856     1322      549         1156
22         Married           101890     48                   2              11089     1481      541         1157
23         Married           107511     56                   3              10682     1428      564         1169
24         Married           95385      50                   4              9101      1179      450         1001
25         Married           106627     56                   3              10363     1561      585         1178
26         Married           107795     51                   3              11278     1408      544         1231
27         Married           107338     67                   2              11710     1533      541         1324
28         Married           105601     19                   4              10330     1377      568         1098
29         Married           96362      37                   2              8789      983       355         1146
30         Married           99610      36                   2              9513      721       367         1025

Variable          n    Measure(s) of Central Tendency  Measure(s) of Dispersion
Variable: Income       Median=                         SD =

Graph and/or Table: Histogram of Income

(Place Histogram here)

Description of Findings.

Variable              n    Measure(s) of Central Tendency  Measure(s) of Dispersion
Variable: FamilySize       Mean=                           SD=

Graph and/or Table.

(Place Graph or Table Here)

Description of Findings.

Solutions

Expert Solution

ANSWER:

The data dictionary describes the variables and their locations in this dataset.

We consider three variables: marital status, food and fruit.

Marital status is a categorical variable with two categories, married and not married.

Food and fruit are quantitative variables.

1. Confidence Interval Analysis: For one expenditure variable, select and run the appropriate method for estimating a parameter, based on a statistic (i.e., confidence interval method) and complete the following table

Here we use the confidence interval method for the variable food.

95% confidence interval for population mean (mu) is,

Xbar - E < mu < Xbar + E

where Xbar is sample mean and

E is margin of error.

E = Zc * (sd / sqrt(n))

For 95% confidence Zc = 1.96

Now we have to find sample mean and standard deviation of the data.

Xbar = 8464.83

sd = 1794.65

n = 30

E = 1.96 * (1794.65 / sqrt(30)) = 642.21

lower limit = Xbar - E = 8464.83 - 642.21 = 7822.62

upper limit = Xbar + E = 8464.83 + 642.21 = 9107.04

95% confidence interval for population mean is (7822.62, 9107.04)

So we are 95% confident that the population mean annual food expenditure lies between these two limits.
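
As a quick check of the arithmetic, the formula E = Zc * (sd / sqrt(n)) can be evaluated in a few lines of Python. This is only a sketch: it takes the summary statistics quoted in the answer (Xbar, sd, n) as given rather than recomputing them from the raw table, and the function name z_interval is a hypothetical helper, not part of any library.

```python
import math

# Summary statistics as quoted in the answer (assumed, not recomputed here).
x_bar = 8464.83   # sample mean of annual food expenditure
sd = 1794.65      # sample standard deviation
n = 30            # sample size
z_c = 1.96        # critical z value for 95% confidence


def z_interval(mean, sd, n, z):
    """Return (margin of error, lower limit, upper limit) for mean +/- z * sd / sqrt(n)."""
    margin = z * sd / math.sqrt(n)
    return margin, mean - margin, mean + margin


margin, lower, upper = z_interval(x_bar, sd, n, z_c)
print(f"E = {margin:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Note that the margin of error is built from the standard deviation, not the mean, so it shrinks as n grows.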

2. Hypothesis Testing: Using the second expenditure variable (with socioeconomic variable as the grouping variable for making two groups), select and run the appropriate method for making decisions about two parameters relative to observed statistics (i.e., two sample hypothesis testing method) and complete the following table

Now we use fruit as the second expenditure variable and marital status as the grouping variable.

We can test here,

H0 : mu1 = mu2 Vs H1 : mu1 not= mu2

where mu1 and mu2 are the population mean fruit expenditures for the two marital status groups (not married and married).

Assume alpha = level of significance = 0.05

We can use a two-sample t-test assuming equal variances.

We can do this test in MINITAB.

Enter data into MINITAB sheet --> Stat --> Basic Statistics --> 2-Sample t --> Both samples are in one column --> Samples : fruit --> Sample IDs : marital status --> Options --> Confidence level : 95.0 --> Hypothesized difference : 0.0 --> Alternative hypothesis : not equal --> Assume equal variances --> OK --> OK


————— 06-08-2018 11:03:00 ————————————————————

Welcome to Minitab, press F1 for help.

Two-Sample T-Test and CI: Fruit, marital status

Two-sample T for Fruit

marital
status    N  Mean  StDev  SE Mean
0        15  1115    107       28
1        15  1226   1782      460


Difference = μ (0) - μ (1)
Estimate for difference: -111
95% CI for difference: (-1055, 833)
T-Test of difference = 0 (vs ≠): T-Value = -0.24 P-Value = 0.811 DF = 28
Both use Pooled StDev = 1262.2626

Test statistic = -0.24

P-value = 0.811

P-value > alpha

Fail to reject H0 at the 5% level of significance.

Conclusion: There is not sufficient evidence that the two population means differ.
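
The pooled (equal-variance) t statistic reported by MINITAB can be reproduced from the group summaries alone. The sketch below assumes the rounded N, Mean and StDev values printed in the output, so the pooled standard deviation differs slightly from MINITAB's unrounded 1262.2626. The helper name pooled_t is hypothetical, and 2.048 is the standard two-sided 5% critical value of t with 28 degrees of freedom.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming equal variances, from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: df-weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_sd = math.sqrt(pooled_var)
    # Standard error of the difference between the two sample means.
    std_err = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    t_value = (mean1 - mean2) / std_err
    return t_value, df, pooled_sd


# Rounded group summaries as printed in the MINITAB output (groups 0 and 1).
t_value, df, pooled_sd = pooled_t(15, 1115, 107, 15, 1226, 1782)

T_CRIT = 2.048  # two-sided critical value for alpha = 0.05, df = 28
reject_h0 = abs(t_value) > T_CRIT

print(f"T = {t_value:.2f}, DF = {df}, pooled SD = {pooled_sd:.1f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Comparing |T| with the critical value gives the same decision as comparing the P-value with alpha.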

