Please solve all of the questions using R and explain the answers.
Use the SATGPA data set in the Stat2Data package. Test at α = 0.01.
1) Create the following three variables, then print all six variables.
A) Create a new variable "SAT", which is the sum of MathSAT and VerbalSAT.
B) Create a second new variable "SATLevel", and assign it the value 1 when SAT<=1100, 2 when 1100<SAT<=1200, 3 when 1200<SAT<=1300, and 4 when SAT>1300.
C) Create a third new variable "GPALevel", and assign it the value 1 when GPA<=2.8, 2 when 2.8<GPA<=3.3, 3 when 3.3<GPA<=3.5, and 4 when GPA>3.5.
D) Print all the data in ascending order of GPALevel and, where GPALevel is the same, in descending order of SAT.
2) Use the Chi-Square test to conclude whether SATLevel and GPALevel are independent.
3) Compute the mean and variance of "GPA" for each level of "GPALevel", and compute the correlation matrix for the four variables: MathSAT, VerbalSAT, GPA and SAT.
4) Do the data provide sufficient evidence to indicate that the mean of MathSAT is significantly greater than the mean of VerbalSAT?
5) Test whether the proportion of students with MathSAT greater than VerbalSAT is 0.5.
I am providing the code with conclusions. You can run it and check.
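########## Question 1
library(Stat2Data) # provides the SATGPA data set
data("SATGPA")
y=SATGPA;gpa=y$GPA
sat=y$MathSAT+y$VerbalSAT
# Pre-allocate the level vectors, then fill them in range by range
satlevel=rep(NA,length(sat))
satlevel[which(sat<=1100)]=1;satlevel[which(sat<=1200&sat>1100)]=2
satlevel[which(sat>1300)]=4;satlevel[which(sat<=1300&sat>1200)]=3
gpalevel=rep(NA,length(gpa))
gpalevel[which(gpa<=2.8)]=1;gpalevel[which(gpa>2.8&gpa<=3.3)]=2
gpalevel[which(gpa>3.5)]=4;gpalevel[which(gpa>3.3&gpa<=3.5)]=3
newdata=cbind(y,SAT=sat,SATLevel=satlevel,GPALevel=gpalevel)
# Sort by GPALevel ascending, then by SAT descending within each level
newdata <- newdata[order(gpalevel,-sat),]
print(newdata) # all six variables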
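The level assignment in Question 1 can also be written with base R's cut(), which bins a numeric vector in a single call. The sketch below is only an illustration: sat_demo and sat_level_demo are hypothetical names, and the values are made up to exercise each boundary rather than taken from SATGPA.

```r
# Hypothetical SAT totals, chosen only to hit every boundary case.
sat_demo <- c(1000, 1100, 1101, 1200, 1250, 1300, 1301)

# right = TRUE (the default) closes each interval on the right:
# (-Inf,1100], (1100,1200], (1200,1300], (1300,Inf]
# labels = FALSE returns integer codes 1..4 instead of a factor.
sat_level_demo <- cut(sat_demo,
                      breaks = c(-Inf, 1100, 1200, 1300, Inf),
                      labels = FALSE)

print(sat_level_demo)
# [1] 1 1 2 2 3 3 4
```

The same call with breaks c(-Inf, 2.8, 3.3, 3.5, Inf) would reproduce the GPALevel rule.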
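#############Question 2
# chisq.test() is in the stats package, which R loads by default, so MASS is not needed
tbl = table(satlevel, gpalevel)
tbl
chisq.test(tbl)
# The p-value is large, so at alpha = 0.01 we cannot reject the null hypothesis
# that SATLevel and GPALevel are independent.
# If R warns that the chi-squared approximation may be incorrect, some expected
# counts are small; fisher.test(tbl) is an exact alternative.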
####Question 3
mean(gpa[which(gpalevel==1)]);var(gpa[which(gpalevel==1)])
mean(gpa[which(gpalevel==2)]);var(gpa[which(gpalevel==2)])
mean(gpa[which(gpalevel==3)]);var(gpa[which(gpalevel==3)])
mean(gpa[which(gpalevel==4)]);var(gpa[which(gpalevel==4)])
newdata<-cbind(y,sat)
cor(newdata)
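The per-level means and variances in Question 3 can be computed in one pass with tapply() instead of one line per level. This is a minimal sketch on made-up values; gpa_demo, level_demo, group_mean and group_var are hypothetical names, not part of the original script.

```r
# Hypothetical GPA values and their level codes (not from SATGPA).
gpa_demo   <- c(2.5, 2.7, 3.0, 3.2, 3.4, 3.6, 3.8)
level_demo <- c(1, 1, 2, 2, 3, 4, 4)

# tapply() splits the first argument by the second and applies the function
# to each group, returning one value per level.
group_mean <- tapply(gpa_demo, level_demo, mean)
group_var  <- tapply(gpa_demo, level_demo, var)

# Note: var() returns NA for a level with a single observation (level 3 here).
print(rbind(mean = group_mean, variance = group_var))
```

With the script's own vectors, the equivalent calls would be tapply(gpa, gpalevel, mean) and tapply(gpa, gpalevel, var).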
####### Question 4
t.test(y$MathSAT,y$VerbalSAT,alternative = "greater")
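# Here the p-value is less than 0.05, so at alpha = 0.05 we reject the null hypothesis,
# i.e. the data provide sufficient evidence to indicate that the mean
# of MathSAT is significantly greater than the mean of VerbalSAT.
# The question specifies alpha = 0.01, so this conclusion carries over to that
# level only if the p-value is also below 0.01.
# Both scores come from the same students, so adding paired = TRUE to the
# t.test() call may be more appropriate.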
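Question 5 is not covered by the code above. One possible approach, sketched here under assumptions, is an exact binomial test on the number of students whose MathSAT exceeds their VerbalSAT. The helper prop_greater_test and the demo vectors are hypothetical and use made-up scores; ties are counted as "not greater".

```r
# Hypothetical helper: exact binomial test of
# H0: P(MathSAT > VerbalSAT) = p0, two-sided.
prop_greater_test <- function(math, verbal, p0 = 0.5, alpha = 0.01) {
  n_greater <- sum(math > verbal)   # ties count as "not greater"
  n_total   <- length(math)
  result    <- binom.test(n_greater, n_total, p = p0,
                          conf.level = 1 - alpha)
  list(n_greater = n_greater,
       n_total   = n_total,
       p_value   = result$p.value,
       reject    = result$p.value < alpha)
}

# Made-up scores for illustration only (not from SATGPA).
math_demo   <- c(600, 650, 700, 550, 620, 580)
verbal_demo <- c(580, 660, 690, 500, 600, 590)

res <- prop_greater_test(math_demo, verbal_demo)
print(res)
```

On the real data the call would be prop_greater_test(y$MathSAT, y$VerbalSAT); reject H0 at alpha = 0.01 only if the reported p-value is below 0.01. prop.test() is a large-sample alternative, but the exact test is the safer choice for a small data set.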