In: Statistics and Probability
Mc Kinley Inc., a major consulting firm conducts a study for Kneewise, a major jeans and apparel manufacturer to determine the relationship between the age of apparel buyers (basically, their customers) and the primary channel from which they will buy the merchandise. Kneewise has three major channels of retail/distribution; department stores, mail-order catalogs and internet kiosks. 1000 apparel buyers were selected from each age group, and the primary source from which the apparel buyer bought the merchandise was noted. Columns A, B, and C contain the channel used to buy merchandise. These are department stores, mail-order catalogs and internet kiosks respectively. The five rows 1,2,3,4 and 5 correspond to the age groups, viz., less than 25, 25 and less than 35, 35 and less than 45, 45 and less than 55, and 55 and over respectively.
[Data Set Assgt#3.xls (Columns A - C)]
The data are in columns A - C of an Excel spreadsheet.
Column A 'Stores', column B 'Catalogs', column C 'Internet Kiosks'.
Select "Chi Square Tests ® Independence of Rows and Columns/ Test of homogeneity" from the Add-ins tab / KPK Data Analysis (or the KPK pull down in 2003). Enter the appropriate range of cells for the Input Range of Observed Frequencies. Select an appropriate cell for Output Range (leaving adequate space for the output). Keep the default significance level of 0.05.
KPK Data Analysis will create the cross tabulations of the expected and observed frequencies, and perform the Chi Square test of independence.
You may now re-label the cells currently labeled as Coln 1 through Coln 3 and Row 1 through Row 5, according to the information provided in the problem description.
Assignment Questions:
What are the null hypothesis and alternative hypothesis to test whether the age groups are homogeneous across the channels from which the apparel was purchased?
Are any expected frequencies less than 5?
What is the p-value of the test?
Do the data indicate that the age groups are homogeneous across the channels from which the apparel was purchased at the 5% significance level?
What recommendation regarding marketing effort in the various channels should be made relative to the various age group markets?
Part 2. **********************************************************************
Doctor and Ramble Inc. hands out free samples of their shampoos usually during freshman orientation on college campuses. It is noteworthy that with more than 2000 college campuses targeted nationwide by Doctor and Ramble, their annual promotion budget runs into the millions. Assume that the data in Columns F - I, which indicate whether the campus population actually availed of the free samples and whether they decided to switch to Doctor and Ramble's brand of shampoos, was obtained from a randomly selected, large university campus. The four rows of Columns F - I were use to indicate four adult age categories: 18 and less than 30, 30 and less than 40, 40 and less than 50, 50 and over years of age respectively.
[Data Set Assgt#3.xls (Columns F - I)]
The data are in columns F - I of the Excel dataset spreadsheet.
The columns F through I indicate whether a student "Did not sample and did not switch", "Did not sample but switched", " Sampled but did not switch" and "Sampled and switched" respectively.
Conduct the Chi-Square test using the KPK Data Analysis macros, as indicated earlier but at a significance level of 0.01. You may select a new sheet for the Output Range.
Combine the columns (add the values) in which shoppers did not sample (columns F & G) into column K. Also combine the columns in which shoppers did sample (columns H& I) into column L.
Repeat the Chi-Square test at a significance level of 0.01 for columns K & L.
Assignment Questions:
What are the null and alternative hypotheses for test of independence for the four original groups of students?
What are the expected values and p-value?
What conclusions can be drawn from this test for independence?
What are the null and alternative hypotheses to test the combined groups of students?
What conclusions can be drawn from the test of these combined groups?
Hints for Excel Assignment 3
Part 1
1. Labels will need to be added following the calculations (for either part).
2. Samples of the print-outs provided below are to help you ensure that you are on the right track.
Before labeling your output should look like the one shown below. (Note some values have been replaced with X/x .)
Crosstabulation Table |
Calculation of the Chi-Square Test |
||||||
OBSERVED |
DESCRIPTION |
VALUE |
|||||
Coln 1 |
Coln 2 |
Coln 3 |
Total |
c2* |
xxxxxxx |
||
Row 1 |
669 |
141 |
190 |
1000 |
p-value |
xxxxxxx |
|
Row 2 |
254 |
337 |
409 |
1000 |
Critical value |
xxxxxxx |
|
Row 3 |
290 |
325 |
385 |
1000 |
a |
0.05 |
|
Row 4 |
557 |
254 |
189 |
1000 |
df |
8 |
|
Row 5 |
725 |
200 |
75 |
1000 |
|||
Total |
2495 |
1257 |
1248 |
5000 |
|||
EXPECTED |
|||||||
Coln 1 |
Coln 2 |
Coln 3 |
Total |
||||
Row 1 |
xxxxx |
Xxxxx |
Xxxxx |
1000 |
|||
Row 2 |
Xxxxx |
Xxxxx |
xxxxx |
1000 |
|||
Row 3 |
xxxxx |
Xxxxx |
Xxxxx |
1000 |
|||
Row 4 |
xxxxx |
Xxxxx |
Xxxxx |
1000 |
|||
Row 5 |
Xxxxx |
xxxxx |
xxxxx |
1000 |
|||
Total |
2495 |
1257 |
1248 |
5000 |
|||
Part 2
For the second part, the data is in Columns F through I of the spreadsheet. Before labeling, your raw output should look like the one shown below. (Note some values have been replaced with X/x.)
Crosstabulation Table |
Calculation of the Chi-Square Test |
|||||||
OBSERVED |
DESCRIPTION |
VALUE |
||||||
Coln 1 |
Coln 2 |
Coln 3 |
Coln 4 |
Total |
c2* |
Xxxx |
||
Row 1 |
xxxx |
xxxx |
xxxx |
xxxx |
61 |
p-value |
Xxxx |
|
Row 2 |
xxxx |
xxxx |
xxxx |
xxxx |
71 |
Critical value |
Xxxx |
|
Row 3 |
xxxx |
xxxx |
xxxx |
xxxx |
63 |
a |
0.01 |
|
Row 4 |
xxxx |
xxxx |
xxxx |
xxxx |
61 |
df |
9 |
|
Total |
64 |
62 |
70 |
60 |
256 |
|||
EXPECTED |
|||||
Coln 1 |
Coln 2 |
Coln 3 |
Coln 4 |
Total |
|
Row 1 |
xxxx |
xxxx |
xxxx |
Xxxx |
61 |
Row 2 |
Xxxx |
xxxx |
Xxxx |
Xxxx |
71 |
Row 3 |
Xxxx |
Xxxx |
Xxxx |
Xxxx |
63 |
Row 4 |
xxxx |
Xxxx |
Xxxx |
xxxx |
61 |
Total |
64 |
62 |
70 |
60 |
256 |
You will need to modify the data to answer the second question of this part. First, combine the students that tried the samples, and those who didn't, in separate columns. The following is an example of the output of the Chi Square analysis:
Crosstabulation Table |
Calculation of the Chi-Square Test |
|||||
OBSERVED |
DESCRIPTION |
VALUE |
||||
Coln 1 |
Coln 2 |
Total |
c2* |
Xxxx |
||
Row 1 |
23 |
38 |
61 |
p-value |
Xxxx |
|
Row 2 |
27 |
44 |
71 |
Critical value |
Xxxx |
|
Row 3 |
41 |
22 |
63 |
a |
Xxxx |
|
Row 4 |
35 |
26 |
61 |
df |
xxxx |
|
Total |
126 |
130 |
256 |
|||
EXPECTED |
||||||
Coln 1 |
Coln 2 |
Total |
||||
Row 1 |
xxxx |
xxxx |
Xxxx |
|||
Row 2 |
xxxx |
Xxxx |
Xxxx |
|||
Row 3 |
xxxx |
Xxxx |
Xxxx |
|||
Row 4 |
xxxx |
xxxx |
xxxx |
|||
Total |
126 |
130 |
256 |
Experiential Exercise
As you work this assignment think about the following questions. Then form a team of 3 to 5 and discuss each of the following. You can engage in this discussion by meeting or your group can use a Wiki to engage in an online discussion. Instructions for setting up a Wiki are provided on the excel assignment page of our course web site.
What wording tells you the alternative and null hypothesis?
What wording tells you the type of statistical test to perform? For example, is a z or t statistic appropriate?
What wording tells you that this is a one or two tail hypothesis test?
What were the steps you used to obtain the calculated value of the test statistic?
What were the steps you used to obtain the critical value of the test statistic?
How do you use the calculated and critical value to make a statistical decision about this test?
How do you obtain the p value for the test statistic?
How do you use the p value and the level of significance to make a statistical decision about this test?
How does the result of your test relate to the statistical significance of your findings?
What managerial implications can you conclude from the results of your test?
To be ready for the “Excel Quiz 3” HLS Web Test you should prepare the following:
correct output as required for both part 1 and part 2
a statement of the null/alternative hypotheses to test independence for Part I and for Part II (Both 4-column and 2-column analyses).
a decision, giving reason, corresponding to each hypothesis test (highlight, bold,
label, or underline any relevant numbers in the output used to make the decision)
answers to all of the assignment questions for each part typed at the bottom of the spreadsheet containing the analysis for that part.
The Excel assignments are each graded via a short Excel Quiz in HLS Web Test that is open for about 48 hours as listed in the syllabus and in your HLS progress report. You are expected to use your output and written answers to complete the quiz. You are not required to turn in the output.
The questions below are much like the quiz you will have in WEBTEST. If you can answer these, you should have no difficulty with those that will be asked. However, these questions are not the exact questions that you will have to answer.
SAMPLE WEBTEST QUIZ
The correct answers to the sample quiz questions are marked with an asterisk.
Part I Questions:
1. What is the critical value of the Chi-square statistic for testing the relationship between purchase channel and age group?
A) 48.82 B) 810.88 C) 0.05 D) 15.507* E) 0
2. What is your conclusion for the relationship between purchase channel and age group?
A) Factors “Age” and “Purchase Channel” are independent.
B) Age groups are homogeneous across the purchase channels.
C) Age groups are distributed differently across the purchase channels*
Part II Questions:
3. What is calculated Chi-square value to test the belief that "consumer response (= decisions on whether to sample the product or switch to the product) is independent of age”?
A) 21.666 B) 14.757 C) 0.0331 D) 11.344 E) 18.182*
4. What was the critical (table) value of the Chi-square statistic for the above test?
A) 21.666* B) 11.344 C) 14.757 D) 0.002 E) 0.0331
5. What is your conclusion for the relationship between consumer response and age?
A) There is insufficient evidence to conclude that consumer response and age are related.*
B) There is sufficient evidence to conclude that consumer response and age are related.
Excel Data that is needed for the assignment:
669 | 141 | 190 | 14 | 9 | 20 | 18 | ||
254 | 337 | 409 | 11 | 16 | 22 | 22 | ||
290 | 325 | 385 | 22 | 19 | 11 | 11 | ||
557 | 254 | 189 | 17 | 18 | 17 | 9 | ||
725 | 200 | 75 |