Question

In: Statistics and Probability

1. Explain the R code: SAT_2010 <- mutate(SAT_2010, Salary = salary/1000)

Note: column name is case sensitive.

2. For the following R code segment, answer questions a) and b).
SAT_plot <- ggplot(data = SAT_2010, aes(x = Salary, y = total)) +
  geom_point() + geom_smooth(method = "lm") +
  ylab("Average total score on the SAT") +
  xlab("Average teacher salary (in thousands)")
SAT_plot

a) Explain the code. Specifically, pay attention to ggplot.
b) What will happen if we delete the last line, SAT_plot? Why?


2. Which command in 7.6 creates a variable called SAT_grp, and how is the variable's value determined? What is this sampling technique called?

3. What is the confounding factor in this case, and how does it affect the outcome?


4. Check the output and see whether the column Salary is added as a new column or replaces the column salary. Explain the reason.

5. In lm(x ~ y), which is the dependent variable and which is the independent variable?

Solutions

Expert Solution

  1. The code in the first question creates an additional variable called Salary in SAT_2010 from the existing variable salary. Each value of Salary is the corresponding salary divided by 1000, and the result is assigned back to SAT_2010.
    • The new variable is created with the mutate function from the dplyr package.
  2. a) ggplot is a function from the ggplot2 package. It creates the plot object, and the other functions are added to it as layers with +.
    • The data argument takes the data frame that the plot is drawn from, here SAT_2010.
    • aes maps variables to the axes: x is the x-axis variable (Salary) and y is the y-axis variable (total).
    • geom_point() draws a point for each observation, which gives a scatterplot.
    • geom_smooth(method = "lm") adds a straight line fitted by a linear model, with a confidence band around it by default.
    • ylab sets the y-axis label of the plot.
    • xlab sets the x-axis label of the plot.

b) If you delete the last line, SAT_plot, the plot will not be displayed. The assignment only stores the plot object in SAT_plot; the plot is drawn when the object is printed, which is what typing its name on a line by itself does.
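
The difference between storing and displaying a plot can be sketched as follows. This is only an illustration: SAT_2010 is not included in the question, so the data frame toy and its values are made up, and toy_plot is a hypothetical name.

```r
library(ggplot2)

# Hypothetical toy data standing in for SAT_2010 (values are made up).
toy <- data.frame(
  Salary = c(40, 45, 50, 55, 60),
  total  = c(1050, 1020, 1000, 990, 960)
)

# Assignment only stores the plot object; nothing is drawn here.
toy_plot <- ggplot(data = toy, aes(x = Salary, y = total)) +
  geom_point() +
  geom_smooth(method = "lm") +
  ylab("Average total score on the SAT") +
  xlab("Average teacher salary (in thousands)")

# Printing the object (or typing toy_plot alone at the console) draws it.
print(toy_plot)
```

Inside a function or a loop, a bare toy_plot line is not printed automatically, so print(toy_plot) is the safer way to show the figure there.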

Note: The data and code required for the 2nd and 3rd questions were not given.

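4. Salary is added as a new column, as mentioned in the answer to the 1st question.

  • mutate creates a new variable whenever the name on the left of = does not already exist in the data. Column names are case sensitive, so Salary is a different name from salary, and the original salary column is kept.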
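
A small sketch of the two cases, assuming a made-up data frame toy in place of SAT_2010 (the names toy, added and replaced are hypothetical):

```r
library(dplyr)

# Hypothetical toy data standing in for SAT_2010 (values are made up).
toy <- data.frame(state = c("A", "B"), salary = c(40000, 55000))

# New name (capital S): a column is added and the original stays.
added <- mutate(toy, Salary = salary / 1000)
names(added)     # "state" "salary" "Salary"

# Same name (lowercase s): the existing column is overwritten.
replaced <- mutate(toy, salary = salary / 1000)
names(replaced)  # "state" "salary"
```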

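5. In lm(x ~ y), x is the dependent (response) variable and y is the independent (explanatory) variable: the variable on the left of ~ is modeled as a function of the variable on the right.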
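
One way to check this in R, using made-up vectors x and y purely for illustration:

```r
# Hypothetical toy vectors (values are made up for illustration).
x <- c(2.1, 3.9, 6.2, 7.8, 10.1)
y <- c(1, 2, 3, 4, 5)

fit <- lm(x ~ y)

# The left-hand side of the formula is the response (dependent variable).
response_name  <- all.vars(formula(fit))[1]   # "x"
predictor_name <- all.vars(formula(fit))[2]   # "y"

# The fitted coefficients are an intercept and a slope for the predictor y.
names(coef(fit))  # "(Intercept)" "y"
```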

