According to the Census Bureau, the distribution by ethnic background of the New York City population in a recent year was
Hispanic: 28%, Black: 24%, White: 35%, Asian: 12%, Others: 1%
The manager of a large housing complex in the city wonders whether
the distribution by race of the complex's residents is consistent
with the population distribution. To find out, she records data from
a random sample of 800 residents. The table below displays the sample
data.
Race | Hispanic | Black | White | Asian | Others
Count | 212 | 202 | 270 | 94 | 22
Are these data significantly different from the city's distribution by race? Carry out an appropriate test at the α = 0.05 level of significance to support your answer.
a. State the null and alternate hypotheses (write them mathematically) and write your claim.
b. Find the standardized test statistic.
c. Identify the rejection region (critical region) and the fail-to-reject region.
Decide whether to reject or fail to reject the null.
Make an interpretation of your decision in the context.
a.
Let p1, p2, p3, p4, p5 be the true proportions of Hispanic, Black, White, Asian, and Other residents in the housing complex, and let p10 = 0.28, p20 = 0.24, p30 = 0.35, p40 = 0.12, p50 = 0.01 be the corresponding proportions in the city population. The hypotheses are:
H0 : p1 = p10, p2 = p20, p3 = p30, p4 = p40, p5 = p50 (that is, p1 = 0.28, p2 = 0.24, p3 = 0.35, p4 = 0.12, p5 = 0.01)
H1 : At least one of the complex's proportions differs from the corresponding city proportion, i.e. pi ≠ pi0 for at least one category i.
b.
The chi-square goodness-of-fit test statistic, written χ², is given as:
χ² = Σ (O - E)² / E, summed over the five categories.
O are the observed frequencies, read directly from the sample data. E are the expected frequencies, calculated as the sample size times the population proportion for each category (e.g. we expect 28% of 800, i.e. 224 residents, to be Hispanic):
224, 192, 280, 96, 8 are the expected frequencies for Hispanic, Black, White, Asian, and Others, respectively.
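As a quick check on the expected frequencies, here is a minimal Python sketch that multiplies the sample size by each city proportion. The variable names are my own. The "every expected count is at least 5" check is the usual rule of thumb for this test, not something stated in the problem.

```python
# City proportions from the problem statement, in table order.
city_proportions = {
    "Hispanic": 0.28,
    "Black": 0.24,
    "White": 0.35,
    "Asian": 0.12,
    "Others": 0.01,
}
sample_size = 800

# Expected count under H0 = sample size * hypothesized proportion.
expected = {race: sample_size * p for race, p in city_proportions.items()}

# Rule-of-thumb condition (an assumption of this sketch, not from the problem):
# the chi-square approximation is usually considered adequate when every
# expected count is at least 5.
all_large_enough = all(count >= 5 for count in expected.values())

for race, count in expected.items():
    print(f"{race}: {count:.0f}")
print("All expected counts >= 5:", all_large_enough)
```

The smallest expected count is 8 (Others), so the condition holds for this sample.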
χ² = (212 - 224)²/224 + (202 - 192)²/192 + (270 - 280)²/280 + (94 - 96)²/96 + (22 - 8)²/8
χ² ≈ 0.643 + 0.521 + 0.357 + 0.042 + 24.5
χ² ≈ 26.06
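The arithmetic above can be reproduced with a short Python sketch. The list names are illustrative. The per-category contributions are kept separately so it is easy to see which category drives the statistic.

```python
# Observed and expected counts in table order:
# Hispanic, Black, White, Asian, Others.
labels = ["Hispanic", "Black", "White", "Asian", "Others"]
observed = [212, 202, 270, 94, 22]
expected = [224, 192, 280, 96, 8]

# Each category contributes (O - E)^2 / E to the statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

for label, c in zip(labels, contributions):
    print(f"{label}: {c:.3f}")
print(f"chi-square = {chi_square:.2f}")

# Category with the largest contribution.
largest = labels[contributions.index(max(contributions))]
print("Largest contribution:", largest)
```

Almost all of the statistic comes from the Others category (24.5 of about 26.06), where 22 residents were observed against 8 expected.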
c.
For α = 0.05 and df = 4 (i.e. number of categories - 1):
critical value = 9.488
Test statistic values of 9.488 or less lie in the fail-to-reject region, and values greater than 9.488 lie in the critical (rejection) region. Our test statistic, 26.06, is greater than 9.488, so it lies in the rejection region and we reject the null hypothesis.
At the 0.05 level of significance, the sample of 800 residents gives sufficient evidence that the distribution by race of the complex's residents differs from the city's distribution.
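The critical value and the decision can also be checked without a table. The sketch below uses only the Python standard library. It relies on the closed form of the chi-square upper-tail probability, which holds for even degrees of freedom only, and finds the critical value by bisection. The helper names chi2_sf_even_df and chi2_critical_value are my own, not from any library. In practice a statistics package (for example scipy.stats.chi2.ppf) would normally be used instead.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, even df only.

    For even df, P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def chi2_critical_value(alpha, df, lo=0.0, hi=100.0, tol=1e-9):
    """Find x with P(X > x) = alpha by bisection (the tail decreases in x)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_sf_even_df(mid, df) > alpha:
            lo = mid   # tail still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2.0


alpha = 0.05
df = 4                   # 5 categories - 1
chi_square = 26.0625     # test statistic from part b

critical_value = chi2_critical_value(alpha, df)
p_value = chi2_sf_even_df(chi_square, df)
reject = chi_square > critical_value

print(f"critical value = {critical_value:.3f}")
print(f"p-value = {p_value:.2e}")
print("Reject H0:", reject)
```

The p-value is far below 0.05, which agrees with the rejection-region decision above.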