Question

3 dplyr
Let’s work with the diamonds data set:

data(diamonds)
head(diamonds)

A) Calculate the average price of a diamond:

[your code here]

B) Use group_by() to group diamonds by color, then use summarise() to calculate the average price and the standard deviation in price by color:

[your code here]

C) Use group_by() to group diamonds by cut, then use summarise() to count the number of observations by cut:

[your code here]

D) Use filter() to remove observations with a depth greater than 62, then use group_by() to group diamonds by clarity, then use summarise() to find the maximum price of a diamond by clarity:

[your code here]

E) Use mutate() and log() to add a new variable to the data called “log_price”:

[your code here]

Expert Solution

Solution-A:

Rcode:

library(ggplot2)
library(dplyr)

diamonds %>%
summarise(Average = mean(price))

Output:

Average
<dbl>
1 3933.
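As a quick cross-check on the pipeline above, the same average can be computed with base R's mean() applied to the price column. This is only a sketch; it assumes ggplot2 is installed, since that package ships the diamonds data, and the object names avg_tbl and avg_base are made up for illustration.

```r
library(ggplot2)  # provides the diamonds data set
library(dplyr)

# dplyr route: a one-row tibble with a single column
avg_tbl <- diamonds %>%
  summarise(Average = mean(price))

# base R route: a plain numeric value
avg_base <- mean(diamonds$price)

# both routes should agree
all.equal(avg_tbl$Average, avg_base)
```

The summarise() form is handy when more statistics will be added later; mean() on the column is enough when only one number is needed.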

Solution-B:

Rcode:

diamonds %>%
group_by(color) %>%
summarise(Avg_price = mean(price),
std_deviation = sd(price))

Output:

color Avg_price std_deviation
<ord> <dbl> <dbl>
1 D 3170. 3357.
2 E 3077. 3344.
3 F 3725. 3785.
4 G 3999. 4051.
5 H 4487. 4216.
6 I 5092. 4722.
7 J 5324. 4438.
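One way to sanity-check grouped means like these is to weight each color's average by its group size and compare the result with the overall mean from part A. The sketch below assumes the same packages as above; the n_obs column and the by_color and overall names are illustrative additions, not part of the posted answer.

```r
library(ggplot2)  # provides the diamonds data set
library(dplyr)

by_color <- diamonds %>%
  group_by(color) %>%
  summarise(Avg_price = mean(price),
            std_deviation = sd(price),
            n_obs = n())          # group size, used as the weight below

# weighting each color's mean by its group size recovers the overall mean
overall <- weighted.mean(by_color$Avg_price, by_color$n_obs)
all.equal(overall, mean(diamonds$price))
```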

Solution-C:

Rcode:

diamonds %>%
group_by(cut) %>%
summarise(counts = n())

Output:

cut counts
<ord> <int>
1 Fair 1610
2 Good 4906
3 Very Good 12082
4 Premium 13791
5 Ideal 21551
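For counting rows per group, dplyr also offers count(), which behaves like group_by() followed by summarise(n()). The exercise asks for the group_by() form, so this is shown only as an equivalent shortcut; the cut_counts name is made up for illustration.

```r
library(ggplot2)  # provides the diamonds data set
library(dplyr)

# count() groups by cut and tallies the rows in one step;
# name = "counts" matches the column name used in the answer above
cut_counts <- diamonds %>%
  count(cut, name = "counts")

# the group sizes should add up to the number of rows in the data
sum(cut_counts$counts) == nrow(diamonds)
```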

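Solution-D:

Rcode:

# filter() keeps the rows where the condition is TRUE, so removing
# diamonds with depth greater than 62 means keeping depth <= 62
dep_le_62 <- filter(diamonds, depth <= 62)
dep_le_62 %>%
group_by(clarity) %>%
summarise(max_price = max(price))

Output (as originally posted; it was generated with filter(diamonds, depth > 62), which keeps the deep diamonds instead of removing them, so it does not match the code above):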

clarity max_price
<ord> <int>
1 I1 18531
2 SI2 18804
3 SI1 18818
4 VS2 18791
5 VS1 18500
6 VVS2 18768
7 VVS1 18777
8 IF 18552
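Part D asks to remove observations with a depth greater than 62. Because filter() keeps the rows for which its condition is TRUE, that request translates to the condition depth <= 62. A small sketch on a made-up tibble (the toy data and the toy_max name are hypothetical, not taken from diamonds) shows the effect:

```r
library(dplyr)

# hypothetical toy data, not taken from diamonds
toy <- tibble(
  clarity = c("SI1", "SI1", "VS1", "VS1"),
  depth   = c(61.5, 63.0, 62.0, 64.2),
  price   = c(500, 900, 700, 1200)
)

toy_max <- toy %>%
  filter(depth <= 62) %>%            # drops the rows with depth 63.0 and 64.2
  group_by(clarity) %>%
  summarise(max_price = max(price))  # maximum over the remaining rows only

toy_max
```

Only the rows with depth 61.5 and 62.0 survive the filter, so the maximum for each clarity comes from those rows alone.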

orchestra answered 2 years ago
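The posted answer stops at part D. A minimal sketch for part E, assuming the natural logarithm of price is what is wanted (log() defaults to base e) and that the result is stored in a new object, here hypothetically named diamonds_log:

```r
library(ggplot2)  # provides the diamonds data set
library(dplyr)

# mutate() adds log_price as a new column and leaves the others unchanged
diamonds_log <- diamonds %>%
  mutate(log_price = log(price))

head(diamonds_log)
```

Assigning the result to a new name leaves the original diamonds object untouched; assigning back to diamonds instead would add the column in place.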
