This is for Predictive Analytics.
1. Read the iris data set into a data frame.
2. Print the first few lines of the iris dataset.
3. Output all the entries with Sepal Length > 5.
4. Plot a box plot of Petal Length with a color of your choice.
5. Plot a histogram of Sepal Width.
6. Plot a scatter plot showing the relationship between Petal Length and Petal Width.
7. Find the mean of Sepal Length by species. Hint: You could use the tapply function. Other methods are also acceptable.
8. Use the subset function to extract only rows where the species is "versicolor".
9. Install the dplyr package and load it on your console.
10. Use a function in the dplyr package to show only rows with Sepal Length < 6 belonging to species "virginica".
Submit all the code and subsequent output as a Word file.
I am not able to attach a file, so I am sharing the code here instead. Run it in your R console to produce the output and plots.
Here is the code, numbered to match the tasks above:
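# 1. Read the iris data set into a data frame (iris ships with base R)
iris_data <- iris
# 2. Print the first few lines and the column names
head(iris_data)
names(iris_data)
# 3. All entries with Sepal Length > 5
iris_data[iris_data$Sepal.Length > 5, ]
# 4. Box plot of Petal Length
boxplot(iris_data$Petal.Length, main = "Petal Length", col = "gold")
# 5. Histogram of Sepal Width
hist(iris_data$Sepal.Width, main = "Sepal Width", col = "pink")
# 6. Scatter plot of Petal Length vs. Petal Width
plot(iris_data$Petal.Length, iris_data$Petal.Width, col = "red")
# 7. Mean of Sepal Length by species
tapply(iris_data$Sepal.Length, iris_data$Species, mean)
# 8. Rows where the species is "versicolor"
subset(iris_data, Species == "versicolor")
# 9. Install dplyr (needed only once) and load it
install.packages("dplyr")
library(dplyr)
# 10. Rows with Sepal Length < 6 belonging to species "virginica"
filter(iris_data, Sepal.Length < 6 & Species == "virginica")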
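Task 7 says other methods are also acceptable, so here is one possible alternative to tapply, as a minimal sketch using only base R. The variable names means_tapply and means_aggregate are my own, chosen for illustration. It computes the mean Sepal Length per species with aggregate() and checks that the result agrees with tapply:

```r
# Built-in iris data; no extra packages needed.
iris_data <- iris

# Approach used in the answer: tapply returns a named vector of group means.
means_tapply <- tapply(iris_data$Sepal.Length, iris_data$Species, mean)

# Alternative: aggregate() with a formula returns a data frame,
# one row per species.
means_aggregate <- aggregate(Sepal.Length ~ Species, data = iris_data, FUN = mean)

print(means_tapply)
print(means_aggregate)

# Both approaches should give the same means, in the same species order.
stopifnot(isTRUE(all.equal(as.numeric(means_tapply),
                           means_aggregate$Sepal.Length)))
```

The aggregate() version may be more convenient if you want the result as a data frame, for example to paste into a report as a table.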
Hope this helps!