Question


Use the following data to create the contingency tables.

AGE

Male

16 17 17 19 19 19 18 17 18 17
16 19 19 19 17 16 17 16 19 19
24 31 23 44 21 42 23 43 43 33
30 41 35 40 24 43 22 30 25 32
43 51 55 80 61 58 65 52 67 75
90 63 71 74

Female

17 16 17 19 19 18 17 19 16 18
19 17 19 17 18 19 19 16 33 23
46 46 23 21 46 47 48 47 48 30
35 24 48 49 47 25 84 54 77 63
51 72 90 57 69 81

1. In the first table, use gender (male and female) as your row variable and age (<20, 20-50, and >50) for your column variable. Run a Chi-square test of independence and find the test statistic, p-value, and degrees of freedom.

2. In the second table, use gender (male and female) as your row variable and age (<18, 18-25, 26-45, and >45) for your column variable. Run a Chi-square test of independence and find the test statistic, p-value, and degrees of freedom.

3. Compare the results and comment on problems that may occur when categorizing continuous variables.

Solutions

Expert Solution

H0: The variables gender and age are independent.

H1: The variables gender and age are not independent.

(H0: Null hypothesis; H1: Alternative hypothesis)

1.

Contingency table (Observed frequencies):

Gender    <20    20-50    >50    Total
Male      20     21       13     54
Female    18     18       10     46
Total     38     39       23     N = 100
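The observed counts above can be checked against the raw ages. The following is a minimal Python sketch, not part of the original solution. The helper name `bin_counts` and the convention of describing each category by its inclusive upper limit are assumptions made for illustration.

```python
from bisect import bisect_left

# Raw ages copied from the question.
male = [16, 17, 17, 19, 19, 19, 18, 17, 18, 17,
        16, 19, 19, 19, 17, 16, 17, 16, 19, 19,
        24, 31, 23, 44, 21, 42, 23, 43, 43, 33,
        30, 41, 35, 40, 24, 43, 22, 30, 25, 32,
        43, 51, 55, 80, 61, 58, 65, 52, 67, 75,
        90, 63, 71, 74]
female = [17, 16, 17, 19, 19, 18, 17, 19, 16, 18,
          19, 17, 19, 17, 18, 19, 19, 16, 33, 23,
          46, 46, 23, 21, 46, 47, 48, 47, 48, 30,
          35, 24, 48, 49, 47, 25, 84, 54, 77, 63,
          51, 72, 90, 57, 69, 81]


def bin_counts(ages, upper_edges):
    """Count ages per category (hypothetical helper).

    upper_edges holds the inclusive upper limit of every category except
    the last, open-ended one, e.g. [19, 50] for <20, 20-50, >50.
    """
    counts = [0] * (len(upper_edges) + 1)
    for age in ages:
        # bisect_left returns how many upper limits lie strictly below age,
        # which is the index of the category the age belongs to.
        counts[bisect_left(upper_edges, age)] += 1
    return counts


male_counts = bin_counts(male, [19, 50])
female_counts = bin_counts(female, [19, 50])
print("Male  ", male_counts, sum(male_counts))
print("Female", female_counts, sum(female_counts))
```

Passing `[17, 25, 45]` as the upper limits gives the categories <18, 18-25, 26-45, and >45 used in part 2.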


Expected frequency, E = (corresponding row total × corresponding column total) / N

Contingency table (Expected frequencies):

Gender    <20                  20-50                >50                  Total
Male      54*38/100 = 20.52    54*39/100 = 21.06    54*23/100 = 12.42    54
Female    46*38/100 = 17.48    46*39/100 = 17.94    46*23/100 = 10.58    46
Total     38                   39                   23                   N = 100
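The expected frequencies can also be computed programmatically from the observed table. This is a small Python sketch under the assumption that the table is stored as a list of rows. The function name `expected_counts` is hypothetical.

```python
# Observed counts: rows are Male, Female; columns are <20, 20-50, >50.
observed = [[20, 21, 13],
            [18, 18, 10]]


def expected_counts(table):
    """Expected count per cell = row total * column total / N (hypothetical helper)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for label, row in zip(("Male", "Female"), expected):
    print(label, [round(e, 2) for e in row])
```

Each row of expected counts sums to the same row total as the observed counts, which is a quick sanity check on the arithmetic.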

Test statistic (χ²):

Sl. No.   Observed frequency (O)   Expected frequency (E)   (O-E)²/E
1.        20                       20.52                    (20-20.52)²/20.52 = 0.0132
2.        18                       17.48                    (18-17.48)²/17.48 = 0.0155
3.        21                       21.06                    0.0002
4.        18                       17.94                    0.0002
5.        13                       12.42                    0.0271
6.        10                       10.58                    0.0318
Total     100                      100                      0.088

The test statistic, χ² = Σ (O-E)²/E = 0.088

Degrees of freedom, df = (r-1)(c-1) = (2-1)(3-1) = 1(2) = 2, where r is the number of rows and c is the number of columns.

For the test statistic χ² = 0.088 with df = 2, the p-value = 0.956954

The p-value is far above the usual significance levels (0.01, 0.05, and 0.10), so we fail to reject the null hypothesis that "the variables gender and age are independent".
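The statistic and p-value for the first table can be reproduced with the Python standard library alone. The sketch below is an illustration, not the method used in the original solution. It relies on the fact that for df = 2 the upper-tail probability of the chi-square distribution is exp(-χ²/2). The function name `chi_square_statistic` is hypothetical.

```python
import math

# Observed counts: rows are Male, Female; columns are <20, 20-50, >50.
observed = [[20, 21, 13],
            [18, 18, 10]]


def chi_square_statistic(table):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count for this cell
            stat += (o - e) ** 2 / e
    return stat


stat = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Closed form valid only for df = 2: P(X > x) = exp(-x / 2).
p_value = math.exp(-stat / 2)

print(f"chi-square = {stat:.4f}, df = {df}, p-value = {p_value:.4f}")
```

Because the code does not round the individual terms, the statistic comes out at about 0.0879 rather than 0.088, and the p-value at about 0.957. The conclusion is unchanged.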

2.

Contingency table for observed frequencies:

Gender    <18    18-25    26-45    >45    Total
Male      10     17       14       13     54
Female    8      15       3        20     46
Total     18     32       17       33     N = 100

Contingency table for expected frequencies:

Gender    <18     18-25    26-45    >45      Total
Male      9.72    17.28    9.18     17.82    54
Female    8.28    14.72    7.82     15.18    46
Total     18      32       17       33       N = 100

Test statistic (χ²):

Sl. No.   Observed frequency (O)   Expected frequency (E)   (O-E)²/E
1.        10                       9.72                     0.0081
2.        8                        8.28                     0.0095
3.        17                       17.28                    0.0045
4.        15                       14.72                    0.0053
5.        14                       9.18                     2.5308
6.        3                        7.82                     2.9709
7.        13                       17.82                    1.3037
8.        20                       15.18                    1.5305
Total     100                      100                      8.3633

The test statistic, χ² = Σ (O-E)²/E = 8.3633

Degrees of freedom, df = (r-1)(c-1) = (2-1)(4-1) = 1(3) = 3

For the test statistic χ² = 8.3633 with df = 3, the p-value = 0.039071

The p-value is below 0.05 and 0.10, so at those significance levels we reject the null hypothesis that "the variables gender and age are independent".

(At the 0.01 significance level, however, the p-value of 0.039071 exceeds 0.01, so we cannot reject the null hypothesis.)
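The second test can be checked the same way. The following Python sketch is an illustration only. It assumes the observed and expected counts from the tables above and uses the closed-form upper-tail probability for df = 3, erfc(√(x/2)) + √(2x/π)·exp(-x/2). The function name `chi2_sf_df3` is hypothetical.

```python
import math

# Rows are Male, Female; columns are <18, 18-25, 26-45, >45.
observed = [[10, 17, 14, 13],
            [8, 15, 3, 20]]
expected = [[9.72, 17.28, 9.18, 17.82],
            [8.28, 14.72, 7.82, 15.18]]

# Pearson chi-square statistic: sum of (O - E)^2 / E over all eight cells.
stat = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(stat)

# True means "reject the null hypothesis of independence" at that level.
decisions = {alpha: p_value < alpha for alpha in (0.01, 0.05, 0.10)}

print(f"chi-square = {stat:.4f}, p-value = {p_value:.4f}")
print(decisions)
```

The `decisions` dictionary makes the dependence on the significance level explicit: the same p-value leads to rejection at 0.05 and 0.10 but not at 0.01.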

3.

The results of 1. and 2. differ sharply: the p-value is 0.956954 for the first table and 0.039071 for the second, which leads to opposite conclusions at the 5% and 10% significance levels.

This happens because the continuous variable (age in this case) was categorized with different cut points.

When categorizing continuous variables, the choice of cut points is a major problem. Where to cut depends on the researcher and on the question being asked, but there is rarely one obviously correct choice, and different cut points may produce contradictory conclusions, as above.

Another problem is the loss of information: once ages are grouped, all values within a category are treated as identical. In the first table, for example, a 21-year-old and a 49-year-old fall into the same 20-50 category.
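The effect of the cut points can be seen by running both categorizations on the same raw ages. The sketch below is a rough Python illustration, not part of the original solution. The function name `chi_square_from_cuts` is hypothetical, and the 5% critical values (5.991 for df = 2, 7.815 for df = 3) are taken from standard chi-square tables.

```python
from bisect import bisect_left

# Raw ages copied from the question.
male = [16, 17, 17, 19, 19, 19, 18, 17, 18, 17, 16, 19, 19, 19, 17, 16, 17, 16,
        19, 19, 24, 31, 23, 44, 21, 42, 23, 43, 43, 33, 30, 41, 35, 40, 24, 43,
        22, 30, 25, 32, 43, 51, 55, 80, 61, 58, 65, 52, 67, 75, 90, 63, 71, 74]
female = [17, 16, 17, 19, 19, 18, 17, 19, 16, 18, 19, 17, 19, 17, 18, 19, 19,
          16, 33, 23, 46, 46, 23, 21, 46, 47, 48, 47, 48, 30, 35, 24, 48, 49,
          47, 25, 84, 54, 77, 63, 51, 72, 90, 57, 69, 81]


def chi_square_from_cuts(groups, upper_edges):
    """Bin each group's ages at the given inclusive upper limits and return
    the Pearson chi-square statistic together with its degrees of freedom."""
    table = []
    for ages in groups:
        row = [0] * (len(upper_edges) + 1)
        for age in ages:
            row[bisect_left(upper_edges, age)] += 1
        table.append(row)
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    stat = sum((o - rt * ct / n) ** 2 / (rt * ct / n)
               for r, rt in zip(table, row_totals)
               for o, ct in zip(r, col_totals))
    df = (len(table) - 1) * len(upper_edges)  # (rows - 1) * (columns - 1)
    return stat, df


# 5% critical values from standard chi-square tables, keyed by df.
CRITICAL_5PCT = {2: 5.991, 3: 7.815}

schemes = {"<20, 20-50, >50": [19, 50],
           "<18, 18-25, 26-45, >45": [17, 25, 45]}

results = {}
for name, edges in schemes.items():
    stat, df = chi_square_from_cuts([male, female], edges)
    reject = stat > CRITICAL_5PCT[df]
    results[name] = (stat, df, reject)
    print(f"{name}: chi-square = {stat:.4f}, df = {df}, reject at 5%: {reject}")
```

Only the list of cut points changes between the two runs, yet the decision at the 5% level flips, which is the problem described above.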

