Question

Please use the statistical software R.

Consider the dataset called fandango in the fivethirtyeight package:

  1. Identify the Top 5 best rated and Top 5 worst rated movies based on rottentomatoes.

  2. Identify the Top 5 best rated and Top 5 worst rated movies based on the average of three users’ scores (rottentomatoes_user, metacritic_user, and imdb).

  3. Visualize the difference between Fandango stars and actual Fandango ratings. Comment on what you see.

  4. Construct a formal test to see if there is a significant difference between Fandango stars and actual Fandango ratings.

Solutions

Expert Solution

The R code below works through each part of the problem.

# Install the package once if it is not already available, then load it
# install.packages("fivethirtyeight")
library(fivethirtyeight)

# Work on a copy of the dataset
z <- fandango

# Confirm that the column used in part 1 exists
"rottentomatoes" %in% names(z)

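# Part 1: worst 5 by Rotten Tomatoes critic score (lowest first)
z$film[order(z$rottentomatoes)[1:5]]

# Part 1: top 5 by Rotten Tomatoes critic score (highest first)
z$film[order(z$rottentomatoes, decreasing = TRUE)[1:5]]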


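# Part 2: average of the three user scores.
# rottentomatoes_user is on a 0-100 scale while metacritic_user and imdb
# are on 0-10, so rescale it first; otherwise it dominates the average.
new_average <- rowMeans(cbind(z$rottentomatoes_user / 10,
                              z$metacritic_user,
                              z$imdb))

# Part 2: worst 5 based on the average user score
z$film[order(new_average)[1:5]]

# Part 2: top 5 based on the average user score
z$film[order(new_average, decreasing = TRUE)[1:5]]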

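# Part 3: displayed stars (red) and underlying rating values (blue), one point per film.
# ylim covers both variables so that no rating values fall outside the plot.
plot(z$fandango_stars, col = "red", xlab = "Film index", ylab = "Fandango score",
     ylim = range(c(z$fandango_stars, z$fandango_ratingvalue)),
     main = "Comparison of Fandango stars and actual Fandango ratings")
points(z$fandango_ratingvalue, col = "blue")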


# Part 4: paired t-test, since both scores are measured on the same films.
# H0: the mean difference between stars and rating value is zero.
t.test(z$fandango_stars, z$fandango_ratingvalue, paired = TRUE)

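Here the p-value is reported as 0.000, which is below 0.05, so we reject the null hypothesis of no mean difference and conclude that Fandango stars and actual Fandango ratings differ significantly.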
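
For parts 3 and 4, it can be clearer to look at the per-film difference (stars minus rating value) instead of two overlaid sets of points. The following is a minimal sketch on a small made-up data frame (`toy`, whose values are hypothetical and only reuse the fandango column names) so that it runs without the package. With the real data, `z$fandango_stars` and `z$fandango_ratingvalue` would take the place of the toy columns.

```r
# Toy stand-in for the two fandango columns (values are made up for illustration).
toy <- data.frame(
  fandango_stars       = c(5.0, 4.5, 4.0, 3.5, 4.5, 3.0, 4.0, 5.0),
  fandango_ratingvalue = c(4.5, 4.3, 3.6, 3.5, 4.1, 2.7, 3.9, 4.8)
)

# Per-film difference: displayed stars minus underlying rating value.
diff_stars <- toy$fandango_stars - toy$fandango_ratingvalue

# Numeric summary and histogram of the differences.
summary(diff_stars)
hist(diff_stars,
     main = "Fandango stars minus rating value (toy data)",
     xlab = "Difference")

# A paired t-test is the same as a one-sample t-test on the differences.
paired_test <- t.test(toy$fandango_stars, toy$fandango_ratingvalue, paired = TRUE)
one_sample  <- t.test(diff_stars, mu = 0)

# Nonparametric alternative if normality of the differences is doubtful.
# exact = FALSE uses the normal approximation, which tolerates ties and zeros.
signed_rank <- wilcox.test(diff_stars, mu = 0, exact = FALSE)
```

If the histogram of differences sits entirely at or above zero, that is a sign the displayed stars are systematically no lower than the underlying rating values, which is the kind of observation part 3 asks for.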

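The ranking steps in parts 1 and 2 follow the same pattern, so they can be wrapped in a small helper. This is only a sketch on made-up data: `top_bottom` is a hypothetical helper name rather than a function from any package, the film names and scores are invented, and dividing the Rotten Tomatoes user score by 10 assumes it is on a 0-100 scale while the other two scores are on 0-10.

```r
# Toy data reusing the fandango column names (all values are made up).
toy <- data.frame(
  film                = c("A", "B", "C", "D", "E", "F", "G"),
  rottentomatoes_user = c(86, 42, 71, 90, 55, 33, 78),
  metacritic_user     = c(7.1, 4.0, 6.8, 8.8, 5.5, 3.2, 7.5),
  imdb                = c(7.8, 5.1, 7.0, 8.6, 6.3, 4.6, 7.4),
  stringsAsFactors    = FALSE
)

# Hypothetical helper: return the k best and k worst films for a score vector.
top_bottom <- function(films, score, k = 5) {
  ord <- order(score, decreasing = TRUE)   # indices from best to worst
  list(
    best  = films[head(ord, k)],           # highest scores first
    worst = films[rev(tail(ord, k))]       # lowest scores first
  )
}

# Put the three user scores on a common 0-10 scale before averaging.
user_avg <- rowMeans(cbind(toy$rottentomatoes_user / 10,
                           toy$metacritic_user,
                           toy$imdb))

result <- top_bottom(toy$film, user_avg, k = 3)
result$best
result$worst
```

Note that `order()` breaks ties by position, so films with equal scores at the cutoff are kept or dropped arbitrarily. Checking the scores around the fifth position is worthwhile before reporting a top-5 list.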
