Question

This is for Predictive Analytics.

1. Read the iris data set into a data frame.

2. Print the first few lines of the iris dataset.

3. Output all the entries with Sepal Length > 5.

4. Plot a box plot of Petal Length with a color of your choice.

5. Plot a histogram of Sepal Width.

6. Plot a scatter plot showing the relationship between Petal Length and Petal Width.

7. Find the mean of Sepal Length by species. Hint: You could use the tapply function. Other methods are also acceptable.

8. Use the subset function to extract only rows where the species is "versicolor."

9. Install the dplyr package and load it on your console.

10. Use a function in the dplyr package to show only rows with Sepal Length <6 belonging to species "virginica."

Submit all the code and the resulting output as a Word file.

Solutions

Expert Solution

I am not able to attach a file, so I am sharing the code and output snippets here instead.

Here is the code:

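# 1. Read the built-in iris data set into a data frame
iris_data <- iris
# 2. Print the first few lines (and the column names)
head(iris_data)
names(iris_data)
# 3. All entries with Sepal Length > 5
iris_data[iris_data$Sepal.Length > 5, ]
# 4. Box plot of Petal Length
boxplot(iris_data$Petal.Length, main = "Petal Length", col = "gold")
# 5. Histogram of Sepal Width
hist(iris_data$Sepal.Width, main = "Sepal Width", col = "pink")
# 6. Scatter plot of Petal Length vs. Petal Width
plot(iris_data$Petal.Length, iris_data$Petal.Width, col = "red")
# 7. Mean Sepal Length by species
tapply(iris_data$Sepal.Length, iris_data$Species, mean)
# 8. Rows where the species is "versicolor"
subset(iris_data, Species == "versicolor")
# 9. Install dplyr (only needed once) and load it
install.packages("dplyr")
library(dplyr)
# 10. Rows with Sepal Length < 6 belonging to species "virginica"
filter(iris_data, Sepal.Length < 6, Species == "virginica")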
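
The hint in step 7 says other methods are also acceptable. As a rough sketch of two alternatives to tapply (not necessarily what the course expects), the same per-species means can be computed with base R's aggregate() or with dplyr's group_by() and summarise(). The names means_base, means_dplyr and mean_sepal_length are made up for illustration, and the dplyr part only runs if the package is already installed.

```r
# Alternative ways to compute the mean Sepal.Length per species (step 7).
iris_data <- iris  # built-in data set, as in the answer

# Base R: aggregate() returns a data frame with one row per species.
means_base <- aggregate(Sepal.Length ~ Species, data = iris_data, FUN = mean)
print(means_base)

# dplyr: group the rows by species, then summarise each group.
# Skipped when dplyr is not installed (assumption: it may not be).
if (requireNamespace("dplyr", quietly = TRUE)) {
  means_dplyr <- dplyr::summarise(
    dplyr::group_by(iris_data, Species),
    mean_sepal_length = mean(Sepal.Length)
  )
  print(means_dplyr)
}
```

Both approaches return a data frame rather than the named vector that tapply gives, which can be easier to paste into a report or to merge with other summaries.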

Hope this helps!

