Question

1. The data set on sheet #1 gives data on GPA category and number of hours studied. Construct comparative box plots of the data by GPA category. Then conduct a two-sample t-test on the data for whether GPA category influences the number of hours studied. Be prepared to explain the results of the test and the meaning of the box plots and how they relate to each other. Then redo the analysis by replacing the ordinal GPA category with a numerical dummy variable with Low=0, High=1. Run a regression analysis on how study hours (x) influence GPA category (y). Include the scatterplot. Compare the results of the two tests. Be able to state the null and alternative hypotheses.

Student GPA Hours per week
1 Low 6
2 Low 18
3 Low 16
4 Low 14
5 High 0
6 Low 22
7 Low 15
8 Low 12
9 High 6
10 Low 7
11 Low 5
12 High 20
13 High 9
14 High 9
15 Low 22
16 Low 23
17 High 8
18 Low 7
19 Low 14
20 Low 12
21 Low 0
22 High 7
23 High 4
24 Low 9
25 Low 0
26 Low 0
27 High 6
28 High 14
29 Low 10
30 Low 9
31 High 5
32 High 7
33 High 4
34 High 16
35 High 0
36 Low 20
37 Low 13
38 High 0
39 High 4
40 Low 6
41 Low 17
42 Low 8
43 High 4
44 Low 0
45 High 16
46 Low 17
47 Low 4
48 High 11
49 Low 14
50 Low 16
51 High 11
52 High 7
53 High 4
54 Low 11
55 Low 8
56 High 2
57 Low 0
58 Low 0
59 High 13
60 Low 18
61 Low 28
62 High 1
63 Low 20
64 Low 13
65 Low 4
66 Low 7
67 High 11
68 Low 12
69 High 5
70 Low 7
71 Low 22
72 High 8
73 Low 19
74 Low 8
75 High 2
76 High 11
77 Low 18
78 Low 20
79 High 7
80 High 4
81 High 4
82 High 16
83 High 15
84 Low 9
85 High 8
86 High 10
87 Low 13
88 High 9
89 Low 2
90 Low 22
91 Low 12
92 High 6
93 High 9
94 Low 20
95 Low 14
96 High 7
97 High 15
98 High 9
99 High 2
100 Low 23

Solutions

Expert Solution

Box plot:

From the box plots we can observe that Low-GPA students tend to study more hours per week than High-GPA students: the box for the Low group sits higher than the box for the High group. We now test this observation formally with a two-sample t-test.
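The numbers behind comparative box plots are the five-number summaries of each group. The sketch below computes them with Python's standard library, with the hours split by GPA category in student order. The helper name `five_number_summary` and the choice of the "inclusive" quartile method are assumptions made for illustration; plotting software may use a slightly different quartile rule.

```python
import statistics

# Hours per week, split by GPA category (student order from the table above).
low = [6, 18, 16, 14, 22, 15, 12, 7, 5, 22, 23, 7, 14, 12, 0, 9, 0, 0, 10,
       9, 20, 13, 6, 17, 8, 0, 17, 4, 14, 16, 11, 8, 0, 0, 18, 28, 20, 13,
       4, 7, 12, 7, 22, 19, 8, 18, 20, 9, 13, 2, 22, 12, 20, 14, 23]
high = [0, 6, 20, 9, 9, 8, 7, 4, 6, 14, 5, 7, 4, 16, 0, 0, 4, 4, 16, 11,
        11, 7, 4, 2, 13, 1, 11, 5, 8, 2, 11, 7, 4, 4, 16, 15, 8, 10, 9, 6,
        9, 7, 15, 9, 2]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot displays."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)


for name, vals in (("Low", low), ("High", high)):
    print(name, "n =", len(vals), "summary =", five_number_summary(vals))
```

Comparing the two medians and the two interquartile ranges side by side is what the box plots do visually.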

2 sample t-test:

Null hypothesis H0: There is no difference between the mean number of hours studied by Low-GPA students and the mean number of hours studied by High-GPA students.

Alternative hypothesis H1: The mean number of hours studied by Low-GPA students is greater than the mean number of hours studied by High-GPA students.

(So this is a right-tailed, i.e. one-tailed, test.)

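Test statistic:

$$t = \frac{\bar{x}_L - \bar{x}_H}{s_p \sqrt{\frac{1}{n_L} + \frac{1}{n_H}}}$$

where

$$s_p = \sqrt{\frac{(n_L - 1)s_L^2 + (n_H - 1)s_H^2}{n_L + n_H - 2}}$$

is the pooled standard deviation, $\bar{x}_L$ and $\bar{x}_H$ are the sample means, $s_L$ and $s_H$ are the sample standard deviations, and $n_L$ and $n_H$ are the sample sizes of the Low and High groups.

By the usual definitions of mean and standard deviation, the data give

$n_L = 55$, $\bar{x}_L \approx 12.109$, $s_L \approx 7.218$; $n_H = 45$, $\bar{x}_H \approx 7.689$, $s_H \approx 4.842$; so $s_p \approx 6.264$.

Substituting the above values in the test statistic equation we get

t = 3.511

and degrees of freedom $n_L + n_H - 2 = 98$.

To draw the conclusion, we compare the t value (3.511) with the t-distribution value at the 5% level of significance ($\alpha = 0.05$) with 98 degrees of freedom. (This is called the critical value.)

From the t-distribution table we get $t_{0.05,\,98} \approx 1.661$.

Since $3.511 > 1.661$, we reject the null hypothesis at the 5% level of significance.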

This means: "The mean number of hours studied by Low-GPA students is greater than the mean number of hours studied by High-GPA students."

Or

"GPA category is associated with the number of hours studied."
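The pooled two-sample t-test above can be reproduced numerically. The following is a minimal sketch using only Python's standard library; the function name `pooled_t` is an assumption for illustration, and the critical value is hard-coded from a t table (one-tailed, alpha = 0.05, 98 degrees of freedom) rather than computed.

```python
import math
import statistics

# Hours per week, split by GPA category (student order from the table above).
low = [6, 18, 16, 14, 22, 15, 12, 7, 5, 22, 23, 7, 14, 12, 0, 9, 0, 0, 10,
       9, 20, 13, 6, 17, 8, 0, 17, 4, 14, 16, 11, 8, 0, 0, 18, 28, 20, 13,
       4, 7, 12, 7, 22, 19, 8, 18, 20, 9, 13, 2, 22, 12, 20, 14, 23]
high = [0, 6, 20, 9, 9, 8, 7, 4, 6, 14, 5, 7, 4, 16, 0, 0, 4, 4, 16, 11,
        11, 7, 4, 2, 13, 1, 11, 5, 8, 2, 11, 7, 4, 4, 16, 15, 8, 10, 9, 6,
        9, 7, 15, 9, 2]


def pooled_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(a), len(b)
    var1, var2 = statistics.variance(a), statistics.variance(b)  # sample variances
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / df)  # pooled SD
    t = (statistics.mean(a) - statistics.mean(b)) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, df


t_stat, df = pooled_t(low, high)
T_CRIT = 1.661  # assumed table value: one-tailed, alpha = 0.05, df = 98

print(f"t = {t_stat:.3f}, df = {df}")  # should match the t = 3.511 reported above
print("reject H0" if t_stat > T_CRIT else "fail to reject H0")
```

The pooled form is used here because it is the version with 98 degrees of freedom; an unequal-variance (Welch) test would give a different statistic and a non-integer degrees of freedom.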

Scatter Plot:

We can observe from the scatter plot above that the points lie on only two horizontal lines (y = 0 and y = 1), so a straight line is a poor description of the relationship between the number of hours studied and GPA category. Since the dependent variable, GPA category, is binary (0 or 1), we can try to fit a logistic regression.
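The question also asks to compare the regression with the t-test. As a hedged sketch that is not part of the original solution, one can fit an ordinary least-squares line of the dummy variable (Low = 0, High = 1) on hours and test its slope. Because this slope test depends only on the correlation between the dummy and hours, its t statistic should equal the pooled two-sample t statistic in magnitude, with a negative sign because High-GPA students report fewer hours. The variable names below are assumptions for illustration.

```python
import math

# Hours per week, split by GPA category (student order from the table above).
low = [6, 18, 16, 14, 22, 15, 12, 7, 5, 22, 23, 7, 14, 12, 0, 9, 0, 0, 10,
       9, 20, 13, 6, 17, 8, 0, 17, 4, 14, 16, 11, 8, 0, 0, 18, 28, 20, 13,
       4, 7, 12, 7, 22, 19, 8, 18, 20, 9, 13, 2, 22, 12, 20, 14, 23]
high = [0, 6, 20, 9, 9, 8, 7, 4, 6, 14, 5, 7, 4, 16, 0, 0, 4, 4, 16, 11,
        11, 7, 4, 2, 13, 1, 11, 5, 8, 2, 11, 7, 4, 4, 16, 15, 8, 10, 9, 6,
        9, 7, 15, 9, 2]

x = low + high                           # study hours
y = [0] * len(low) + [1] * len(high)     # dummy: Low = 0, High = 1
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx                # least-squares slope
b0 = y_bar - b1 * x_bar       # least-squares intercept

# Residual sum of squares and standard error of the slope (n - 2 df).
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
se_b1 = math.sqrt(sse / (n - 2) / sxx)
t_slope = b1 / se_b1

print(f"slope = {b1:.4f}, t = {t_slope:.3f}, df = {n - 2}")
```

A two-sided test of the slope with 98 degrees of freedom and the one-sided t-test above therefore lead to the same conclusion about the association.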

Logistic regression Model:

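The logistic regression model is given by

$$P(y = 1 \mid x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$

where x is the number of hours studied per week, y is the GPA dummy variable (Low = 0, High = 1), and $\beta_0$, $\beta_1$ are coefficients estimated from the data by maximum likelihood.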

and we get the model,
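A minimal sketch of how the logistic coefficients could be estimated from the data, assuming maximum likelihood via Newton-Raphson iterations written out by hand for the two-parameter case. The function name `fit_logistic`, the starting values, and the stopping tolerance are assumptions for illustration; a statistics package would normally be used instead.

```python
import math

# Hours per week, split by GPA category (student order from the table above).
low = [6, 18, 16, 14, 22, 15, 12, 7, 5, 22, 23, 7, 14, 12, 0, 9, 0, 0, 10,
       9, 20, 13, 6, 17, 8, 0, 17, 4, 14, 16, 11, 8, 0, 0, 18, 28, 20, 13,
       4, 7, 12, 7, 22, 19, 8, 18, 20, 9, 13, 2, 22, 12, 20, 14, 23]
high = [0, 6, 20, 9, 9, 8, 7, 4, 6, 14, 5, 7, 4, 16, 0, 0, 4, 4, 16, 11,
        11, 7, 4, 2, 13, 1, 11, 5, 8, 2, 11, 7, 4, 4, 16, 15, 8, 10, 9, 6,
        9, 7, 15, 9, 2]

x = low + high                           # study hours
y = [0] * len(low) + [1] * len(high)     # dummy: Low = 0, High = 1


def prob(b0, b1, xi):
    """P(y = 1 | x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Maximum-likelihood fit of (b0, b1) by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = prob(b0, b1, xi)
            w = p * (1.0 - p)
            g0 += yi - p             # score for the intercept
            g1 += (yi - p) * xi      # score for the slope
            h00 += w                 # entries of the information matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # Newton step: inverse(X'WX) * score
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


b0_hat, b1_hat = fit_logistic(x, y)
print(f"logit(p) = {b0_hat:.4f} + {b1_hat:.4f} * hours")
```

A negative slope means the estimated probability of being in the High GPA category decreases as reported study hours increase, which agrees in direction with the t-test.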

