Assignment:
Install and load the ggplot2 package.
Load the "diamonds" dataset.
R code:
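install.packages("ggplot2")  # one-time install from CRAN
library(ggplot2)             # attach the package; diamonds becomes available
?diamonds                    # open the help page for the dataset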
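Because install.packages() downloads the package again on every run, a script that will be executed more than once may prefer a guarded install. The following is a minimal sketch, assuming an internet connection and a configured CRAN mirror are available whenever the package is missing:

```r
# Install ggplot2 only if it is not already available, then attach it.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
library(ggplot2)

# diamonds is lazy-loaded with the package; the documentation reports
# 53940 rows and 10 variables, so dim() is a quick sanity check.
dim(diamonds)
```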
1. Explore the dataset and state your insights.
2. Create plots for the dataset.
3. Provide a summary of descriptive statistics.
4. Run the regressions, research, investigate, and comment on R^2 and on the regression plots (one line each).
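Steps 1 to 3 could be started along the following lines. This is only a sketch of one possible approach: the choice of variables (price, carat, cut) and the object names price_by_cut, p_hist and p_scatter are illustrative assumptions, not requirements of the assignment.

```r
library(ggplot2)

# Step 1: explore the structure and the first few rows.
str(diamonds)
head(diamonds)

# Step 3: descriptive statistics for every column, plus median price per cut.
summary(diamonds)
price_by_cut <- aggregate(price ~ cut, data = diamonds, FUN = median)
print(price_by_cut)

# Step 2: two starter plots (distribution of price, and price against carat).
p_hist <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(bins = 50)
p_scatter <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1)
print(p_hist)
print(p_scatter)
```

The low alpha value in the scatter plot is there because almost 54,000 points overlap heavily; other remedies (sampling, binning) would serve equally well.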
#===========================================
# DV = price, IV or IVs = your choice
# Can we create and compare models to predict price?
# Question: investigate and comment on R^2 and on the plots.
# Compare the regression models and discuss R^2: is there any improvement?
# Based on your understanding of regression models, select the best model
# to predict the price of diamonds from the dataset.
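One way to set up the model comparison described above is sketched here. The three model formulas and the names m1, m2, m3, r2 and adj_r2 are assumptions chosen for illustration; the assignment leaves the choice of independent variables open.

```r
library(ggplot2)

# Model 1: price explained by carat alone.
m1 <- lm(price ~ carat, data = diamonds)

# Model 2: add the three quality grades as predictors.
m2 <- lm(price ~ carat + cut + color + clarity, data = diamonds)

# Model 3: the same predictors on a log-log scale. Its R^2 describes
# log(price), so it is not directly comparable with models 1 and 2.
m3 <- lm(log(price) ~ log(carat) + cut + color + clarity, data = diamonds)

# Collect R^2 and adjusted R^2 side by side for the comparison.
models <- list(m1 = m1, m2 = m2, m3 = m3)
r2 <- sapply(models, function(m) summary(m)$r.squared)
adj_r2 <- sapply(models, function(m) summary(m)$adj.r.squared)
print(round(cbind(r2, adj_r2), 4))

# Regression plot for model 1: the data with the fitted straight line.
p_fit <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
print(p_fit)

# Base R diagnostic for model 2: residuals against fitted values.
plot(m2, which = 1)
```

Since model 1 is nested in model 2, R^2 cannot decrease when the extra predictors are added, which is why adjusted R^2 and the residual plot are worth reading alongside it before picking a "best" model.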
# Name your R file LastNameFirstInitial.R and include your full name
# in the first line of the script.
diamonds {ggplot2} R Documentation
Prices of over 50,000 round cut diamonds
Description
A dataset containing the prices and other attributes of almost
54,000 diamonds. The variables are as follows:
Usage
diamonds
Format
A data frame with 53940 rows and 10 variables:
price
price in US dollars ($326–$18,823)
carat
weight of the diamond (0.2–5.01)
cut
quality of the cut (Fair, Good, Very Good, Premium, Ideal)
color
diamond colour, from J (worst) to D (best)
clarity
a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
x
length in mm (0–10.74)
y
width in mm (0–58.9)
z
depth in mm (0–31.8)
depth
total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43–79)
table
width of top of diamond relative to widest point (43–95)
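The Format section above can be checked against the data itself as part of the exploration step. The sketch below makes two assumptions that are not stated in the documentation: that depth is stored in percent, so the formula is multiplied by 100, and that rows with a zero dimension should be excluded before recomputing it. The names n_zero, d, depth_calc and depth_diff are illustrative.

```r
library(ggplot2)

# Dimensions and factor levels as described in the Format section.
dim(diamonds)
levels(diamonds$cut)
levels(diamonds$color)
levels(diamonds$clarity)

# Rows with a zero dimension (the documented ranges for x, y, z start at 0).
n_zero <- sum(diamonds$x == 0 | diamonds$y == 0 | diamonds$z == 0)
print(n_zero)

# Recompute depth from 2 * z / (x + y), scaled to percent (assumption),
# using only rows where all three dimensions are positive.
d <- subset(diamonds, x > 0 & y > 0 & z > 0)
depth_calc <- 100 * 2 * d$z / (d$x + d$y)
depth_diff <- abs(depth_calc - d$depth)
summary(depth_diff)
```

Large values in depth_diff, or dimensions near the documented maxima of y and z, point to rows worth a closer look before they are used in a regression.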