Background: This course is all about data visualization. However, we must first have some understanding of the dataset that we are using to create the visualizations.
Assignment:
Questions/Requests:
Your document should use an easy-to-read font in MS Word (other word processors are fine to use, but save it in MS Word format).
The dataset is dataset_price_personal_computers.csv. You can download the CSV file from this link:
https://drive.google.com/file/d/1Op6XIzU5WuVF-w1OHqdcUJXMTUICaFy0/view
Sorry, I have learned that the file could not be downloaded. Try this link and let me know.
Please provide answers to the questions, including screenshots and code.
Thanks
# read the CSV and drop the first column
data <- read.csv("C:/Users/meet/Desktop/dataset_price_personal_computers.csv")[, -1]
#1
summary(data)
#2
str(data)
#creating a copy of data and saving it in data1
data1<-data
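#converting columns cd, multi & premium to integer type as they are factors
#(selecting by name keeps the source and target columns aligned)
cat_cols <- c("cd", "multi", "premium")
data1[cat_cols] <- lapply(data1[cat_cols], function(x) as.integer(as.factor(x)))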
# now finding the correlation between the variables
cor(data1)
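cor() only accepts numeric columns, which is why the categorical columns are converted first. A minimal sketch on made-up data follows, assuming those columns hold "yes"/"no" values (an assumption about the file; toy and its values are hypothetical). Coding them as 0/1 gives the same correlations as any other two-value integer coding, since correlation is unaffected by shifting a variable.

```r
# Hypothetical toy data; the "yes"/"no" coding is an assumption about the real file
toy <- data.frame(
  price   = c(1499, 1795, 1595, 1849, 3295, 3695),
  ram     = c(4, 2, 4, 8, 16, 16),
  cd      = c("no", "no", "no", "no", "yes", "yes"),
  premium = c("yes", "yes", "yes", "no", "yes", "yes"),
  stringsAsFactors = FALSE
)

# Recode the yes/no columns as 1/0 so that every column is numeric
yes_no <- c("cd", "premium")
toy_num <- toy
toy_num[yes_no] <- lapply(toy_num[yes_no], function(x) as.integer(x == "yes"))

# Pairwise Pearson correlations between all columns
cor_matrix <- cor(toy_num)
print(round(cor_matrix, 2))
```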
#3
summary(data$price)
#3 another way
min(data$price)
max(data$price)
mean(data$price)
median(data$price)
#4 the correlation values between Price, Ram, and Ads
cor(data[,c('price','ram','ads')])
#5 Create a subset of the dataset with only Price, CD, and Premium.
data_subset1 <- subset(data, select = c(price, cd, premium))
#6 Create a subset of the dataset with only Price, HD, and Ram
#where Price is greater than or equal to $1750.
data_subset2 <- subset(data, price >= 1750, select = c(price, hd, ram))
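The row-and-column selection in #6 can also be written with plain bracket indexing. The sketch below uses a small made-up data frame (toy and its values are hypothetical, not taken from the CSV) to show that both forms return the same rows and columns.

```r
# Hypothetical toy data standing in for the real dataset
toy <- data.frame(
  price = c(1499, 1750, 2995),
  hd    = c(80, 210, 340),
  ram   = c(4, 8, 16)
)

# subset(): the condition and the column names are evaluated inside the data frame
by_subset <- subset(toy, price >= 1750, select = c(price, hd, ram))

# bracket indexing: logical row filter, character vector of column names
by_index <- toy[toy$price >= 1750, c("price", "hd", "ram")]

print(by_subset)
print(by_index)
```

subset() reads more cleanly in interactive work, while bracket indexing is usually the safer choice inside functions, because it does not rely on non-standard evaluation.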
#7 What percentage of the computers sold were Premium?
# answer is 90.22%
tab_premium <- table(data$premium)
prop.table(tab_premium) * 100
#8 How many Premium computers with CDs were sold?
# answer is 2824
# rows are premium, columns are cd
table(premium = data$premium, cd = data$cd)
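#9 How many Premium computers with CDs priced over $2000 were sold?
# reported answer is 2002; that figure came from tabulating cd alone for price > 2000,
# so confirm it against the premium/cd cross-table below
over_2000 <- data[data$price > 2000, ]
table(premium = over_2000$premium, cd = over_2000$cd)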
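Questions 7 to 9 can also be answered directly with logical conditions instead of reading cells from a table. The sketch below uses made-up data and assumes the cd and premium columns are coded "yes"/"no" (toy, its values, and that coding are all hypothetical).

```r
# Hypothetical toy data; the real values come from the CSV
toy <- data.frame(
  price   = c(1499, 1795, 2595, 2849, 3295, 3695),
  cd      = c("no", "yes", "no", "yes", "yes", "yes"),
  premium = c("yes", "yes", "yes", "no", "yes", "yes")
)

# Share of premium computers, as a percentage
pct_premium <- mean(toy$premium == "yes") * 100

# Count of premium computers that have a CD drive
n_premium_cd <- sum(toy$premium == "yes" & toy$cd == "yes")

# Same count, restricted to computers priced over $2000
n_premium_cd_over_2000 <- sum(
  toy$premium == "yes" & toy$cd == "yes" & toy$price > 2000
)

print(c(pct_premium, n_premium_cd, n_premium_cd_over_2000))
```

Combining all conditions with & in a single sum() makes it harder to drop one of them by accident, which is easy to do when indexing one column by a filter on another.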