Using RStudio:
1. Read the iris data set into a data frame.
2. Print the first few lines of the iris dataset.
3. Output all the entries with Sepal Length > 5.
4. Plot a box plot of Petal Length with a color of your choice.
5. Plot a histogram of Sepal Width.
6. Plot a scatter plot showing the relationship between Petal Length and Petal Width.
7. Find the mean of Sepal Length by species. Hint: You could use the tapply function. Other methods are also acceptable.
8. Use the subset function to extract only rows where the species is "versicolor."
9. Install the dplyr package and load it in your console.
10. Use a function in the dplyr package to show only rows with Sepal Length <6 belonging to species "virginica."
Code:
# The built-in datasets package provides the iris data.
# Loading dataset iris
library(datasets)
data(iris)
#1) Reading the dataset into a data frame.
cat("data loaded into data frame\n\n")
data <- data.frame(iris)
#2) First few lines of the iris dataset.
cat("first few lines of dataset\n\n")
head(data)
cat("\n\n")
Output:
#3) Output all the entries with Sepal Length > 5.
cat("entries with Sepal Length > 5\n\n")
ans1 <- data[(data[,1]>5),]
ans1
cat("\n\n")
Output:
Note: The output for this question is long; run the code on your own machine to see it in full.
#4) Plot a box plot of Petal Length with a color of your choice.
cat("boxplot of petal length\n\n")
boxplot(Petal.Length ~ Species, data=iris,
main="Box Plot",
col="red",
xlab="Species",
ylab="Petal Length")
cat("\n\n")
Output:
#5) histogram of Sepal Width
cat("Histogram of sepal width\n\n")
sepal_width <- data$Sepal.Width
hist(sepal_width)
cat("\n\n")
Output:
#6) Scatter plot showing the relationship between Petal Length and Petal Width.
cat("Scatter plot\n\n")
plot(data$Petal.Length, data$Petal.Width)
cat("\n\n")
Output:
#7) Find the mean of Sepal Length by species.
cat("mean of Sepal Length by species\n\n")
# Store the result under a name that does not mask the mean() function
mean_by_species <- tapply(iris$Sepal.Length, iris$Species, mean)
mean_by_species
cat("\n\n")
Output:
#8) Use the subset function to extract only rows where the species is "versicolor".
cat("subset function to extract only rows\n\n")
irisV <- subset(data, Species == "versicolor")
irisV
cat("\n\n")
Output:
#9) Install the dplyr package.
# Note: run this once in your R console to install the library.
install.packages("dplyr")
#10) Only rows with Sepal Length < 6 belonging to species "virginica".
library(dplyr)
cat("Sepal Length < 6 belonging to species virginica\n\n")
# Column names are case-sensitive: Species and Sepal.Length
ans <- filter(data, Species == "virginica", Sepal.Length < 6)
ans
Output:
Explanation to code:
1) library() loads a package into the R session.
2) data() loads a specific dataset, given its name as the argument.
3) data.frame() creates a data frame from the data taken from the iris dataset.
4) head() prints the first few rows of the data frame (six by default).
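The four functions described above can be tried together in a short session. The sketch below is illustrative and uses only base R; the variable name `df` is an assumption made here (the answer's code uses `data`), chosen so that the `data()` function is not shadowed.

```r
# Load the built-in datasets package and the iris data set
library(datasets)
data(iris)

# Copy iris into a data frame; `df` is a hypothetical name used for illustration
df <- data.frame(iris)

# Confirm the object type and its size (iris has 150 rows and 5 columns)
print(class(df))
print(dim(df))

# head() shows six rows by default; the n argument changes that
print(head(df, n = 3))
```

Passing a different `n` to head() is a quick way to preview more or fewer rows before running the later steps.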
Detailed Explanation:
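1) ans1 <- data[(data[,1]>5),]
Selects the rows whose first column (Sepal.Length) is greater than 5; the empty index after the comma keeps every column.
2) boxplot(Petal.Length ~ Species, data=iris,
main="Box Plot",
col="red",
xlab="Species",
ylab="Petal Length")
Draws one red box of Petal.Length for each species.
3) sepal_width <- data$Sepal.Width
hist(sepal_width)
Extracts the Sepal.Width column and plots its histogram.
4) plot(data$Petal.Length, data$Petal.Width)
Plots Petal.Length on the x-axis against Petal.Width on the y-axis.
5) mean_by_species <- tapply(iris$Sepal.Length, iris$Species, mean)
Applies mean() to Sepal.Length within each species; the result gets its own name so that mean() is not masked.
6) irisV <- subset(data, Species == "versicolor")
Keeps only the rows whose Species is "versicolor".
7) ans <- filter(data, Species == "virginica", Sepal.Length < 6)
Keeps the rows that satisfy both conditions; column names are case-sensitive, so they must be written Species and Sepal.Length.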
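R treats column names as case-sensitive, which matters for the filter() call in item 7. As a quick check, here is a small sketch using only base R; the name `cols` is introduced here purely for illustration.

```r
library(datasets)
data(iris)

# List the exact column names of iris
cols <- names(iris)
print(cols)

# The capitalised names exist; the lower-case spellings do not
print("Species" %in% cols)        # TRUE
print("species" %in% cols)        # FALSE
print("Sepal.Length" %in% cols)   # TRUE
print("sepal.length" %in% cols)   # FALSE
```

Running names() on a data frame before filtering is a cheap way to avoid "object not found" errors caused by a mistyped column name.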
Note: Refer to the R documentation for more information on these functions; the explanations here cover only what the assignment needs.
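Since the assignment allows other methods for the mean by species, one possible base R alternative to tapply is aggregate(), and the dplyr filter can likewise be mirrored with subset(). The sketch below is one option among several, not the required approach; the names `means_by_species` and `small_virginica` are assumptions made for illustration.

```r
library(datasets)
data(iris)

# Mean Sepal.Length per species, returned as a data frame
# rather than the named vector that tapply() produces
means_by_species <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
print(means_by_species)

# Base R counterpart of the dplyr filter:
# virginica rows with Sepal.Length below 6
small_virginica <- subset(iris, Species == "virginica" & Sepal.Length < 6)
print(small_virginica)
```

Typing ?aggregate or ?subset at the console opens the corresponding help page, which lists the remaining arguments.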