Question

1. From Statistics and Data Analysis from Elementary to Intermediate by Tamhane and
Dunlop, pg 339. The following table gives the eye color and hair color of 592 students.

                  Hair Color
Eye Color       Black   Brown   Red   Blond   Row Total
Brown              68     119    26       7         220
Blue               20      84    17      94         215
Hazel              15      54    14      10          93
Green               5      29    14      16          64
Column Total      108     286    71     127         592

a) What test should we use to test that the eye color and hair color are associated? Give the null
and alternative hypothesis.
b) Conduct the test at α = 0.05. What do you conclude? Give more than reject or fail to reject
H0.

2. The following data is looking at how long it takes to get to work. Let x = commuting
distance (miles) and y = commuting time (minutes)

x 15 16 17 18 19 20
y 42 35 45 42 49 46

a) Give a scatterplot of this data and comment on the direction, form, and strength of this relationship.

b) Determine the least-squares estimate equation for this data set.
c) Give the coefficient of determination, R^2, and comment on what that means.
d) Give the residual plot based on the least-squares estimate equation.
e) Test whether this least-squares estimate equation specifies a useful relationship between commuting
distance and commuting time.

Solutions

Expert Solution

1.

a) What test should we use to test that the eye color and hair color are associated? Give the null
and alternative hypotheses.

We shall be using a test for independence/association of attributes between eye color and the hair color.

Null Hypothesis: There is no association between eye color and the hair color

Alternate Hypothesis: There is an association between eye color and hair color.

b). Conduct the test at α = 0.05. What do you conclude? Give more than reject or fail to reject
H0.

Level of significance: α = 0.05

Test statistic:

$$\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where $O_{ij}$ is the observed frequency in the $(i, j)$ cell and $E_{ij}$ is the expected frequency in the $(i, j)$ cell, given by $E_{ij} = R_i C_j / T$, where $R_i$ is the total of row $i$, $C_j$ is the total of column $j$, and $T$ is the total of all the values. We first calculate the expected frequencies and then the chi-square statistic.

The observed frequencies:

OBS Black Brown Red Blond Total
Brown 68 119 26 7 220
Blue 20 84 17 94 215
Hazel 15 54 14 10 93
Green 5 29 14 16 64
Total 108 286 71 127 592

The expected frequencies.

EXPECTED Black Brown Red Blond Total
Brown 40.13514 106.2838 26.38514 47.19595 220
Blue 39.22297 103.8682 25.78547 46.12331 215
Hazel 16.96622 44.92905 11.15372 19.95101 93
Green 11.67568 30.91892 7.675676 13.72973 64
Total 108 286 71 127 592

The contribution of each cell, (O - E)^2 / E, to the chi-square statistic:

CHISQ Black Brown Red Blond Total
Brown 19.34591 1.521419 0.005622 34.23417 55.10712
Blue 9.421078 3.80046 2.993334 49.69672 65.91159
Hazel 0.227865 1.831378 0.726335 4.96329 7.748867
Green 3.816879 0.119094 5.210887 0.375399 9.522259
Total 32.81173 7.27235 8.936177 89.26958 138.2898

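The critical value of χ² at α = 0.05 with (4 − 1)(4 − 1) = 9 df is 16.9190.

The calculated value of χ² is 138.2898, which exceeds the critical value, so we reject the null hypothesis. Hence, we conclude that eye color and hair color are associated. The largest contributions come from the Blond column: blue-eyed students are blond far more often than independence would predict (94 observed vs. about 46 expected), and brown-eyed students far less often (7 observed vs. about 47 expected).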
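The calculation above can be reproduced with a short script. This is a minimal sketch in plain Python, not the method the original solution used (its tables look like spreadsheet output). The variable names are illustrative, and the critical value is copied from the solution rather than computed.

```python
# Observed counts from the table: rows = eye color, columns = hair color
# (Black, Brown, Red, Blond).
observed = [
    [68, 119, 26, 7],   # Brown eyes
    [20, 84, 17, 94],   # Blue eyes
    [15, 54, 14, 10],   # Hazel eyes
    [5, 29, 14, 16],    # Green eyes
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: E_ij = R_i * C_j / T.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
CRITICAL_VALUE = 16.9190  # value quoted in the solution for alpha = 0.05, 9 df

print(f"chi-square = {chi_sq:.4f}, df = {df}")
print("reject H0" if chi_sq > CRITICAL_VALUE else "fail to reject H0")
```

Computing the totals from the raw counts, rather than typing them in, also double-checks the row and column totals given in the question.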

2. The given data:

x   y
15 42
16 35
17 45
18 42
19 49
20 46

a) The scatter plot.

The scatter plot shows an upward (positive) trend, and the points roughly follow a line, but the relationship is fairly weak because the points lie some distance from that line.

b) Determine the least-squares estimate equation for this data set.

For the calculation of the least-squares estimates:

x   y x^2 y^2 x*y
15 42 225 1764 630
16 35 256 1225 560
17 45 289 2025 765
18 42 324 1764 756
19 49 361 2401 931
20 46 400 2116 920
Total 105 259 1855 11295 4562

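With n = 6, the slope and intercept estimates are

$$\hat{\beta}_1 = \frac{\sum xy - (\sum x)(\sum y)/n}{\sum x^2 - (\sum x)^2/n} = \frac{4562 - (105)(259)/6}{1855 - 105^2/6} = \frac{29.5}{17.5} = 1.6857$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 43.1667 - (1.6857)(17.5) = 13.6667$$

Therefore the estimated linear equation is $\hat{y} = 13.6667 + 1.6857x$.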
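As a check on the hand calculation, the column totals and the estimates can be recomputed in a few lines. This is a sketch in plain Python using the textbook sum formulas; the variable names are illustrative and not taken from the original solution.

```python
# Commuting distance (miles) and commuting time (minutes) from the question.
x = [15, 16, 17, 18, 19, 20]
y = [42, 35, 45, 42, 49, 46]
n = len(x)

# Column totals, matching the "Total" row of the working table.
sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Corrected sums of squares and cross-products.
s_xx = sum_xx - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Least-squares slope and intercept.
slope = s_xy / s_xx
intercept = sum_y / n - slope * sum_x / n

print(f"y-hat = {intercept:.4f} + {slope:.4f} x")
```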

c) Give the coefficient of determination, R2, comment on what that means.

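The coefficient of determination is the square of the correlation coefficient. The correlation is given by

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{29.5}{\sqrt{(17.5)(114.8333)}} = 0.6581$$

where $S_{xy} = \sum xy - (\sum x)(\sum y)/n = 29.5$, $S_{xx} = \sum x^2 - (\sum x)^2/n = 17.5$, and $S_{yy} = \sum y^2 - (\sum y)^2/n = 114.8333$.

The coefficient of determination is $R^2 = r^2 = 0.4331$. This implies that 43.31% of the variation in commuting time is explained by the linear model in commuting distance.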
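The same quantities can be recomputed directly from the data. This is a sketch in plain Python with illustrative variable names; it uses deviations from the means, which is algebraically equivalent to the sum formulas.

```python
import math

x = [15, 16, 17, 18, 19, 20]
y = [42, 35, 45, 42, 49, 46]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Corrected sums of squares and cross-products.
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = s_xy / math.sqrt(s_xx * s_yy)  # sample correlation coefficient
r_squared = r ** 2                 # coefficient of determination

print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
```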

d) Give the residual plot based on the least-squares estimate equation.
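The residuals behind such a plot can be tabulated as follows. This is a sketch in plain Python that refits the line from the data; it prints the residuals rather than drawing them, and the variable names are illustrative. Plotting each residual against x, with a horizontal reference line at zero, gives the residual plot.

```python
x = [15, 16, 17, 18, 19, 20]
y = [42, 35, 45, 42, 49, 46]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Refit the least-squares line so the block is self-contained.
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Residual = observed y minus fitted y.
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
sse = sum(e ** 2 for e in residuals)  # error (residual) sum of squares

for xi, yi, fi, ei in zip(x, y, fitted, residuals):
    print(f"x={xi:2d}  y={yi:2d}  fitted={fi:8.4f}  residual={ei:8.4f}")
print(f"sum of squared residuals = {sse:.4f}")
```

With an intercept in the model the residuals sum to zero (up to rounding), which is a quick sanity check on the fit.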

e) Test whether this least-squares estimate equation specifies a useful relationship between commuting
distance and commuting time.

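For this, we test H0: β1 = 0 (no linear relationship) against H1: β1 ≠ 0 using the ANOVA table, which requires the F-value for the regression sum of squares.

The total sum of squares is

$$TSS = \sum y^2 - \frac{(\sum y)^2}{n} = 11295 - \frac{259^2}{6} = 114.8333$$

The regression sum of squares is

$$RSS = \hat{\beta}_1 \left( \sum xy - \frac{(\sum x)(\sum y)}{n} \right) = (1.685714)(29.5) = 49.7286$$

where $\hat{\beta}_1 = 1.685714$ is the estimated slope.

The error sum of squares is ESS = TSS − RSS = 114.8333 − 49.7286 = 65.1048.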

We shall form the ANOVA table:

df SS MS F Significance F
Regression 1 49.72857 49.72857 3.055295 0.15539
Residual 4 65.10476 16.27619
Total 5 114.8333

From the table above, the p-value is 0.1554 > 0.05. Hence we fail to reject the null hypothesis and conclude that there is not enough evidence of a linear relationship between x and y. Therefore, the equation does not specify a useful relationship between commuting distance and commuting time.
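The ANOVA quantities and the p-value can be checked with the sketch below, written in plain Python. It keeps the solution's naming (RSS for the regression sum of squares, ESS for the error sum of squares). To avoid an external statistics library it makes one assumption worth flagging: an F statistic with (1, 4) df is the square of a t statistic with 4 df, and the t distribution with exactly 4 df has a closed-form CDF, so the helper `t_cdf_4df` (a hypothetical name) works only for this sample size.

```python
import math

x = [15, 16, 17, 18, 19, 20]
y = [42, 35, 45, 42, 49, 46]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = s_xy / s_xx

tss = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares
rss = slope * s_xy                         # regression sum of squares
ess = tss - rss                            # error sum of squares

df_regression, df_error = 1, n - 2
f_stat = (rss / df_regression) / (ess / df_error)


def t_cdf_4df(t):
    """Closed-form CDF of Student's t with exactly 4 degrees of freedom."""
    u = 1 + t * t / 4
    return 0.5 + 0.375 * (t / math.sqrt(u)) * (1 - (t * t / u) / 12)


# F(1, 4) = t(4)^2, so the F-test p-value equals the two-sided t p-value.
p_value = 2 * (1 - t_cdf_4df(math.sqrt(f_stat)))

print(f"F = {f_stat:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

For other sample sizes the closed form no longer applies, and a statistics library's F distribution would be the natural replacement.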

