Question

Please show the R code used to get the answers.

state           sat     takers  income  years   public  expend  rank
Iowa            1088    3       326     16.79   87.8    25.60   89.7
SouthDakota     1075    2       264     16.07   86.2    19.95   90.6
NorthDakota     1068    3       317     16.57   88.3    20.62   89.8
Kansas          1045    5       338     16.30   83.9    27.14   86.3
Nebraska        1045    5       293     17.25   83.6    21.05   88.5
Montana         1033    8       263     15.91   93.7    29.48   86.4
Minnesota       1028    7       343     17.41   78.3    24.84   83.4
Utah            1022    4       333     16.57   75.2    17.42   85.9
Wyoming         1017    5       328     16.01   97.0    25.96   87.5
Wisconsin       1011    10      304     16.85   77.3    27.69   84.2
Oklahoma        1001    5       358     15.95   74.2    20.07   85.6
Arkansas        999     4       295     15.49   86.4    15.71   89.2
Tennessee       999     9       330     15.72   61.2    14.58   83.4
NewMexico       997     8       316     15.92   79.5    22.19   83.7
Idaho           995     7       285     16.18   92.1    17.80   85.9
Mississippi     988     3       315     16.76   67.9    15.36   90.1
Kentucky        985     6       330     16.61   71.4    15.69   86.4
Colorado        983     16      333     16.83   88.3    26.56   81.8
Washington      982     19      309     16.23   87.5    26.53   83.2
Arizona         981     11      314     15.98   80.9    19.14   84.3
Illinois        977     14      347     15.80   74.6    24.41   78.7
Louisiana       975     5       394     16.85   44.8    19.72   82.9
Missouri        975     10      322     16.42   67.7    20.79   80.6
Michigan        973     10      335     16.50   80.7    24.61   81.8
WestVirginia    968     7       292     17.08   90.6    18.16   86.2
Alabama         964     6       313     16.37   69.6    13.84   83.9
Ohio            958     16      306     16.52   71.5    21.43   79.5
NewHampshire    925     56      248     16.35   78.1    20.33   73.6
Alaska          923     31      401     15.32   96.5    50.10   79.6
Nevada          917     18      288     14.73   89.1    21.79   81.1
Oregon          908     40      261     14.48   92.1    30.49   79.3
Vermont         904     54      225     16.50   84.2    20.17   75.8
California      899     36      293     15.52   83.0    25.94   77.5
Delaware        897     42      277     16.95   67.9    27.81   71.4
Connecticut     896     69      287     16.75   76.8    26.97   69.8
NewYork         896     59      236     16.86   80.4    33.58   70.5
Maine           890     46      208     16.05   85.7    20.55   74.6
Florida         889     39      255     15.91   80.5    22.62   74.6
Maryland        889     50      312     16.90   80.4    25.41   71.5
Virginia        888     52      295     16.08   88.8    22.23   72.4
Massachusetts   888     65      246     16.79   80.7    31.74   69.9
Pennsylvania    885     50      241     17.27   78.6    27.98   73.4
RhodeIsland     877     59      228     16.67   79.7    25.59   71.4
NewJersey       869     64      269     16.37   80.6    27.91   69.8
Texas           868     32      303     14.95   91.7    19.55   76.4
Indiana         860     48      258     14.39   90.2    17.93   74.1
Hawaii          857     47      277     16.40   67.6    21.21   69.9
NorthCarolina   827     47      224     15.31   92.8    19.92   75.3
Georgia         823     51      250     15.55   86.5    16.52   74.0
SouthCarolina   790     48      214     15.42   88.1    15.60   74.0

The data set is from the first year that SAT scores were published on a state-by-state basis in the U.S. It was originally published in the Harvard Educational Review in 1984, and is also reported in Ramsey and Schafer, 1997. The variables included are:

sat = average total SAT score for the state
takers = percent of eligible students in the state who took the exam
income = the median family income of students in the state who took the exam
years = the average number of years that the test-takers had for studies in the core subjects
public = percentage of test-takers attending public secondary schools
expend = the state's expenditures on education in hundreds of dollars per student
rank = the median percentile ranking of the test-takers in their high-school class

Perform a multiple regression to predict the average SAT score in the state from the other variables, and answer the following questions. Present copies of the relevant portions of the R output, and for each question indicate which portion of the output you used and how you used it.

a) Test whether the other six variables as a group are statistically significant predictors of the states' average SAT scores. (Report the p-value and your conclusion.)

b) Check the assumptions, and state whether you feel comfortable trusting the results of the regression.

c) For which, if any, of the predictor variables is multicollinearity a concern?

d) Which two states’ independent variables are most different from those of the other states?

Solutions

Expert Solution

A)

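The multiple linear regression model is fitted with lm(), regressing sat on the other six variables (takers, income, years, public, expend and rank).

Part (a) asks about the six predictors as a group, so the relevant portion of the summary() output is the overall F-statistic and its p-value on the last line. It tests the null hypothesis that all six slope coefficients are zero. If that p-value is below 0.05, the predictors as a group are statistically significant.

From the individual coefficient t-tests in the same output, only years, expend and rank are significant, since their p-values are below 0.05.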
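
The original R code and output are not shown, so the following is only a minimal sketch of how the model for part (a) could be fitted and the overall F-test extracted. The object names (sat_txt, sat_df, fit, p_overall) are assumptions made for illustration; the data block repeats the table from the question so the snippet runs on its own.

```r
# Data copied from the table in the question (two states per line).
sat_txt <- "
Iowa 1088 3 326 16.79 87.8 25.60 89.7  SouthDakota 1075 2 264 16.07 86.2 19.95 90.6
NorthDakota 1068 3 317 16.57 88.3 20.62 89.8  Kansas 1045 5 338 16.30 83.9 27.14 86.3
Nebraska 1045 5 293 17.25 83.6 21.05 88.5  Montana 1033 8 263 15.91 93.7 29.48 86.4
Minnesota 1028 7 343 17.41 78.3 24.84 83.4  Utah 1022 4 333 16.57 75.2 17.42 85.9
Wyoming 1017 5 328 16.01 97.0 25.96 87.5  Wisconsin 1011 10 304 16.85 77.3 27.69 84.2
Oklahoma 1001 5 358 15.95 74.2 20.07 85.6  Arkansas 999 4 295 15.49 86.4 15.71 89.2
Tennessee 999 9 330 15.72 61.2 14.58 83.4  NewMexico 997 8 316 15.92 79.5 22.19 83.7
Idaho 995 7 285 16.18 92.1 17.80 85.9  Mississippi 988 3 315 16.76 67.9 15.36 90.1
Kentucky 985 6 330 16.61 71.4 15.69 86.4  Colorado 983 16 333 16.83 88.3 26.56 81.8
Washington 982 19 309 16.23 87.5 26.53 83.2  Arizona 981 11 314 15.98 80.9 19.14 84.3
Illinois 977 14 347 15.80 74.6 24.41 78.7  Louisiana 975 5 394 16.85 44.8 19.72 82.9
Missouri 975 10 322 16.42 67.7 20.79 80.6  Michigan 973 10 335 16.50 80.7 24.61 81.8
WestVirginia 968 7 292 17.08 90.6 18.16 86.2  Alabama 964 6 313 16.37 69.6 13.84 83.9
Ohio 958 16 306 16.52 71.5 21.43 79.5  NewHampshire 925 56 248 16.35 78.1 20.33 73.6
Alaska 923 31 401 15.32 96.5 50.10 79.6  Nevada 917 18 288 14.73 89.1 21.79 81.1
Oregon 908 40 261 14.48 92.1 30.49 79.3  Vermont 904 54 225 16.50 84.2 20.17 75.8
California 899 36 293 15.52 83.0 25.94 77.5  Delaware 897 42 277 16.95 67.9 27.81 71.4
Connecticut 896 69 287 16.75 76.8 26.97 69.8  NewYork 896 59 236 16.86 80.4 33.58 70.5
Maine 890 46 208 16.05 85.7 20.55 74.6  Florida 889 39 255 15.91 80.5 22.62 74.6
Maryland 889 50 312 16.90 80.4 25.41 71.5  Virginia 888 52 295 16.08 88.8 22.23 72.4
Massachusetts 888 65 246 16.79 80.7 31.74 69.9  Pennsylvania 885 50 241 17.27 78.6 27.98 73.4
RhodeIsland 877 59 228 16.67 79.7 25.59 71.4  NewJersey 869 64 269 16.37 80.6 27.91 69.8
Texas 868 32 303 14.95 91.7 19.55 76.4  Indiana 860 48 258 14.39 90.2 17.93 74.1
Hawaii 857 47 277 16.40 67.6 21.21 69.9  NorthCarolina 827 47 224 15.31 92.8 19.92 75.3
Georgia 823 51 250 15.55 86.5 16.52 74.0  SouthCarolina 790 48 214 15.42 88.1 15.60 74.0
"
# Column template: one character field followed by seven numeric fields.
cols <- list(state = "", sat = 0, takers = 0, income = 0, years = 0,
             public = 0, expend = 0, rank = 0)
sat_df <- as.data.frame(scan(text = sat_txt, what = cols, quiet = TRUE),
                        stringsAsFactors = FALSE)

# Regress the average SAT score on the other six variables.
fit <- lm(sat ~ takers + income + years + public + expend + rank, data = sat_df)
s <- summary(fit)
print(s)  # coefficient t-tests, R-squared, and the overall F-test on the last line

# Overall F-test: H0 is that all six slopes are zero.
f <- s$fstatistic
p_overall <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
cat("Overall F-test p-value:", format.pval(p_overall), "\n")
```

The p-value printed at the end is the one to report for part (a); the t-tests in the coefficient table answer a different question, namely whether each predictor matters after the others are already in the model.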

B)

The diagnostic plots produced by the code are used to check the assumptions.

1)

Residual plot - The residuals show a pattern rather than a random scatter around zero, which suggests there might be heteroscedasticity (non-constant variance) in the model.

2)

Q-Q plot - The standardized residuals depart from the reference line, which suggests that the normality assumption is not satisfied.

Because both assumptions appear to be violated, the results of this regression should not be trusted as they stand.
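
The plots themselves are not reproduced here, so the following is a hedged sketch of one way the two diagnostic plots for part (b) could be drawn. The object names (sat_txt, sat_df, fit) are assumptions, and the Shapiro-Wilk test is an optional numeric check that the original solution does not mention. The data block repeats the table from the question so the snippet runs on its own.

```r
# Data copied from the table in the question (two states per line).
sat_txt <- "
Iowa 1088 3 326 16.79 87.8 25.60 89.7  SouthDakota 1075 2 264 16.07 86.2 19.95 90.6
NorthDakota 1068 3 317 16.57 88.3 20.62 89.8  Kansas 1045 5 338 16.30 83.9 27.14 86.3
Nebraska 1045 5 293 17.25 83.6 21.05 88.5  Montana 1033 8 263 15.91 93.7 29.48 86.4
Minnesota 1028 7 343 17.41 78.3 24.84 83.4  Utah 1022 4 333 16.57 75.2 17.42 85.9
Wyoming 1017 5 328 16.01 97.0 25.96 87.5  Wisconsin 1011 10 304 16.85 77.3 27.69 84.2
Oklahoma 1001 5 358 15.95 74.2 20.07 85.6  Arkansas 999 4 295 15.49 86.4 15.71 89.2
Tennessee 999 9 330 15.72 61.2 14.58 83.4  NewMexico 997 8 316 15.92 79.5 22.19 83.7
Idaho 995 7 285 16.18 92.1 17.80 85.9  Mississippi 988 3 315 16.76 67.9 15.36 90.1
Kentucky 985 6 330 16.61 71.4 15.69 86.4  Colorado 983 16 333 16.83 88.3 26.56 81.8
Washington 982 19 309 16.23 87.5 26.53 83.2  Arizona 981 11 314 15.98 80.9 19.14 84.3
Illinois 977 14 347 15.80 74.6 24.41 78.7  Louisiana 975 5 394 16.85 44.8 19.72 82.9
Missouri 975 10 322 16.42 67.7 20.79 80.6  Michigan 973 10 335 16.50 80.7 24.61 81.8
WestVirginia 968 7 292 17.08 90.6 18.16 86.2  Alabama 964 6 313 16.37 69.6 13.84 83.9
Ohio 958 16 306 16.52 71.5 21.43 79.5  NewHampshire 925 56 248 16.35 78.1 20.33 73.6
Alaska 923 31 401 15.32 96.5 50.10 79.6  Nevada 917 18 288 14.73 89.1 21.79 81.1
Oregon 908 40 261 14.48 92.1 30.49 79.3  Vermont 904 54 225 16.50 84.2 20.17 75.8
California 899 36 293 15.52 83.0 25.94 77.5  Delaware 897 42 277 16.95 67.9 27.81 71.4
Connecticut 896 69 287 16.75 76.8 26.97 69.8  NewYork 896 59 236 16.86 80.4 33.58 70.5
Maine 890 46 208 16.05 85.7 20.55 74.6  Florida 889 39 255 15.91 80.5 22.62 74.6
Maryland 889 50 312 16.90 80.4 25.41 71.5  Virginia 888 52 295 16.08 88.8 22.23 72.4
Massachusetts 888 65 246 16.79 80.7 31.74 69.9  Pennsylvania 885 50 241 17.27 78.6 27.98 73.4
RhodeIsland 877 59 228 16.67 79.7 25.59 71.4  NewJersey 869 64 269 16.37 80.6 27.91 69.8
Texas 868 32 303 14.95 91.7 19.55 76.4  Indiana 860 48 258 14.39 90.2 17.93 74.1
Hawaii 857 47 277 16.40 67.6 21.21 69.9  NorthCarolina 827 47 224 15.31 92.8 19.92 75.3
Georgia 823 51 250 15.55 86.5 16.52 74.0  SouthCarolina 790 48 214 15.42 88.1 15.60 74.0
"
cols <- list(state = "", sat = 0, takers = 0, income = 0, years = 0,
             public = 0, expend = 0, rank = 0)
sat_df <- as.data.frame(scan(text = sat_txt, what = cols, quiet = TRUE),
                        stringsAsFactors = FALSE)
fit <- lm(sat ~ takers + income + years + public + expend + rank, data = sat_df)

# Two panels side by side: residuals vs fitted values, and a normal Q-Q plot.
par(mfrow = c(1, 2))

# Constant variance and linearity: look for curvature or a funnel shape.
plot(fitted(fit), resid(fit),
     xlab = "Fitted SAT score", ylab = "Residual",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)

# Normality: points should follow the reference line.
qqnorm(rstandard(fit), main = "Normal Q-Q (standardized residuals)")
qqline(rstandard(fit))

# Optional numeric check of normality (not part of the original solution).
print(shapiro.test(rstandard(fit)))
```

Calling plot(fit) on the fitted model would produce R's built-in versions of the same diagnostics, plus a residuals-versus-leverage panel.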

C)

The correlation matrix of the predictors shows that the correlation between takers and rank is very strong, at -0.9428, and the correlation between takers and income is moderately strong, at -0.6619.

So multicollinearity is a concern mainly for takers and rank, and to a lesser extent for income.
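
As a sketch of how the multicollinearity check for part (c) could be done, the code below prints the correlation matrix of the six predictors and also computes variance inflation factors (VIFs). The VIFs are an addition not used in the original solution; they are taken here as the diagonal of the inverse of the predictor correlation matrix, which equals 1 / (1 - R_j^2) for each predictor. The object names (sat_txt, sat_df, X, vif) are assumptions, and the data block repeats the table from the question so the snippet runs on its own.

```r
# Data copied from the table in the question (two states per line).
sat_txt <- "
Iowa 1088 3 326 16.79 87.8 25.60 89.7  SouthDakota 1075 2 264 16.07 86.2 19.95 90.6
NorthDakota 1068 3 317 16.57 88.3 20.62 89.8  Kansas 1045 5 338 16.30 83.9 27.14 86.3
Nebraska 1045 5 293 17.25 83.6 21.05 88.5  Montana 1033 8 263 15.91 93.7 29.48 86.4
Minnesota 1028 7 343 17.41 78.3 24.84 83.4  Utah 1022 4 333 16.57 75.2 17.42 85.9
Wyoming 1017 5 328 16.01 97.0 25.96 87.5  Wisconsin 1011 10 304 16.85 77.3 27.69 84.2
Oklahoma 1001 5 358 15.95 74.2 20.07 85.6  Arkansas 999 4 295 15.49 86.4 15.71 89.2
Tennessee 999 9 330 15.72 61.2 14.58 83.4  NewMexico 997 8 316 15.92 79.5 22.19 83.7
Idaho 995 7 285 16.18 92.1 17.80 85.9  Mississippi 988 3 315 16.76 67.9 15.36 90.1
Kentucky 985 6 330 16.61 71.4 15.69 86.4  Colorado 983 16 333 16.83 88.3 26.56 81.8
Washington 982 19 309 16.23 87.5 26.53 83.2  Arizona 981 11 314 15.98 80.9 19.14 84.3
Illinois 977 14 347 15.80 74.6 24.41 78.7  Louisiana 975 5 394 16.85 44.8 19.72 82.9
Missouri 975 10 322 16.42 67.7 20.79 80.6  Michigan 973 10 335 16.50 80.7 24.61 81.8
WestVirginia 968 7 292 17.08 90.6 18.16 86.2  Alabama 964 6 313 16.37 69.6 13.84 83.9
Ohio 958 16 306 16.52 71.5 21.43 79.5  NewHampshire 925 56 248 16.35 78.1 20.33 73.6
Alaska 923 31 401 15.32 96.5 50.10 79.6  Nevada 917 18 288 14.73 89.1 21.79 81.1
Oregon 908 40 261 14.48 92.1 30.49 79.3  Vermont 904 54 225 16.50 84.2 20.17 75.8
California 899 36 293 15.52 83.0 25.94 77.5  Delaware 897 42 277 16.95 67.9 27.81 71.4
Connecticut 896 69 287 16.75 76.8 26.97 69.8  NewYork 896 59 236 16.86 80.4 33.58 70.5
Maine 890 46 208 16.05 85.7 20.55 74.6  Florida 889 39 255 15.91 80.5 22.62 74.6
Maryland 889 50 312 16.90 80.4 25.41 71.5  Virginia 888 52 295 16.08 88.8 22.23 72.4
Massachusetts 888 65 246 16.79 80.7 31.74 69.9  Pennsylvania 885 50 241 17.27 78.6 27.98 73.4
RhodeIsland 877 59 228 16.67 79.7 25.59 71.4  NewJersey 869 64 269 16.37 80.6 27.91 69.8
Texas 868 32 303 14.95 91.7 19.55 76.4  Indiana 860 48 258 14.39 90.2 17.93 74.1
Hawaii 857 47 277 16.40 67.6 21.21 69.9  NorthCarolina 827 47 224 15.31 92.8 19.92 75.3
Georgia 823 51 250 15.55 86.5 16.52 74.0  SouthCarolina 790 48 214 15.42 88.1 15.60 74.0
"
cols <- list(state = "", sat = 0, takers = 0, income = 0, years = 0,
             public = 0, expend = 0, rank = 0)
sat_df <- as.data.frame(scan(text = sat_txt, what = cols, quiet = TRUE),
                        stringsAsFactors = FALSE)

# Keep only the six predictors; the response (sat) plays no role here.
X <- sat_df[, c("takers", "income", "years", "public", "expend", "rank")]

# Pairwise correlations between predictors.
print(round(cor(X), 4))

# Variance inflation factors: diagonal of the inverse correlation matrix.
vif <- diag(solve(cor(X)))
names(vif) <- names(X)
print(round(vif, 2))
```

A common rule of thumb treats a VIF above 5 or 10 as a sign of problematic multicollinearity, though the threshold is a convention rather than a formal test.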

D)

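The leverage values show that observations 29 and 22 have the highest leverage, which means the independent variables of these two states are most different from those of the other states.

Observation 29 is Alaska and observation 22 is Louisiana, so Alaska and Louisiana are the two states.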
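
The leverage calculation for part (d) is not shown in the original, so this is a minimal sketch of one way to obtain it with hatvalues(), which returns the diagonal of the hat matrix. Leverage depends only on the predictor values, not on sat. The object names (sat_txt, sat_df, fit, h, top2, cutoff) are assumptions, and the 2p/n cutoff is a common rule of thumb rather than something the original solution states. The data block repeats the table from the question so the snippet runs on its own.

```r
# Data copied from the table in the question (two states per line).
sat_txt <- "
Iowa 1088 3 326 16.79 87.8 25.60 89.7  SouthDakota 1075 2 264 16.07 86.2 19.95 90.6
NorthDakota 1068 3 317 16.57 88.3 20.62 89.8  Kansas 1045 5 338 16.30 83.9 27.14 86.3
Nebraska 1045 5 293 17.25 83.6 21.05 88.5  Montana 1033 8 263 15.91 93.7 29.48 86.4
Minnesota 1028 7 343 17.41 78.3 24.84 83.4  Utah 1022 4 333 16.57 75.2 17.42 85.9
Wyoming 1017 5 328 16.01 97.0 25.96 87.5  Wisconsin 1011 10 304 16.85 77.3 27.69 84.2
Oklahoma 1001 5 358 15.95 74.2 20.07 85.6  Arkansas 999 4 295 15.49 86.4 15.71 89.2
Tennessee 999 9 330 15.72 61.2 14.58 83.4  NewMexico 997 8 316 15.92 79.5 22.19 83.7
Idaho 995 7 285 16.18 92.1 17.80 85.9  Mississippi 988 3 315 16.76 67.9 15.36 90.1
Kentucky 985 6 330 16.61 71.4 15.69 86.4  Colorado 983 16 333 16.83 88.3 26.56 81.8
Washington 982 19 309 16.23 87.5 26.53 83.2  Arizona 981 11 314 15.98 80.9 19.14 84.3
Illinois 977 14 347 15.80 74.6 24.41 78.7  Louisiana 975 5 394 16.85 44.8 19.72 82.9
Missouri 975 10 322 16.42 67.7 20.79 80.6  Michigan 973 10 335 16.50 80.7 24.61 81.8
WestVirginia 968 7 292 17.08 90.6 18.16 86.2  Alabama 964 6 313 16.37 69.6 13.84 83.9
Ohio 958 16 306 16.52 71.5 21.43 79.5  NewHampshire 925 56 248 16.35 78.1 20.33 73.6
Alaska 923 31 401 15.32 96.5 50.10 79.6  Nevada 917 18 288 14.73 89.1 21.79 81.1
Oregon 908 40 261 14.48 92.1 30.49 79.3  Vermont 904 54 225 16.50 84.2 20.17 75.8
California 899 36 293 15.52 83.0 25.94 77.5  Delaware 897 42 277 16.95 67.9 27.81 71.4
Connecticut 896 69 287 16.75 76.8 26.97 69.8  NewYork 896 59 236 16.86 80.4 33.58 70.5
Maine 890 46 208 16.05 85.7 20.55 74.6  Florida 889 39 255 15.91 80.5 22.62 74.6
Maryland 889 50 312 16.90 80.4 25.41 71.5  Virginia 888 52 295 16.08 88.8 22.23 72.4
Massachusetts 888 65 246 16.79 80.7 31.74 69.9  Pennsylvania 885 50 241 17.27 78.6 27.98 73.4
RhodeIsland 877 59 228 16.67 79.7 25.59 71.4  NewJersey 869 64 269 16.37 80.6 27.91 69.8
Texas 868 32 303 14.95 91.7 19.55 76.4  Indiana 860 48 258 14.39 90.2 17.93 74.1
Hawaii 857 47 277 16.40 67.6 21.21 69.9  NorthCarolina 827 47 224 15.31 92.8 19.92 75.3
Georgia 823 51 250 15.55 86.5 16.52 74.0  SouthCarolina 790 48 214 15.42 88.1 15.60 74.0
"
cols <- list(state = "", sat = 0, takers = 0, income = 0, years = 0,
             public = 0, expend = 0, rank = 0)
sat_df <- as.data.frame(scan(text = sat_txt, what = cols, quiet = TRUE),
                        stringsAsFactors = FALSE)
fit <- lm(sat ~ takers + income + years + public + expend + rank, data = sat_df)

# Leverage of each state: diagonal entries of the hat matrix.
h <- hatvalues(fit)
names(h) <- as.character(sat_df$state)

# The two states whose predictor values are furthest from the rest.
top2 <- head(sort(h, decreasing = TRUE), 2)
print(top2)

# Rule-of-thumb cutoff: twice the average leverage, 2 * p / n,
# where p counts the intercept plus the six slopes.
cutoff <- 2 * length(coef(fit)) / nrow(sat_df)
print(h[h > cutoff])
```

The row positions of the states in top2 can be recovered with which(names(h) %in% names(top2)) to compare against the observation numbers quoted in the solution.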

