(1) Read in the data and create an R data frame named tennis.dfr 
that has the following names for its columns:  first.name, last.name,
major.match.wins, major.match.losses, overall.match.wins, 
overall.match.losses, major.titles, overall.titles.  (Note that the 
data file has several explanatory lines before the real data begin 
that should be skipped when reading in the data lines.)
NOTE:  For the file name, you must use the following web address (URL): 
"http://people.stat.sc.edu/hitchcock/tennisplayers2018.txt".  
Please do not have your code read in the file from your own personal directory.

(2) Create and add two more columns called major.winning.pct and 
overall.winning.pct (showing winning percentage in the "major" and 
"overall" categories, respectively) to this data frame.
  
Note that "winning percentage" is defined 
as (match wins)/(match wins + match losses).

(3) Sort the data frame by major titles, from most to least.  
Have your program print the sorted data frame.

(4) Perform a nested sort, sorting the data frame first by major
titles (from most to least), and then by major winning percentage 
(from most to least) within major-title levels.
Have your program print this sorted data frame.

(5) Have R extract the subset of the data frame consisting of players
with at least 6 major titles.  Call this new data frame: greatest.dfr
Have your program print this new data frame.

(6)  In the most efficient way possible, have R calculate the sample means 
for each of the numeric variables in the tennis.dfr data set.
(Hint: Extract the appropriate subset of the data frame first.)

(7) Use the write.table() function to write the data set tennis.dfr to an
external file simply called "tennisdata.txt".  Make sure the external file includes the column names.
Also, make sure the players' names are NOT surrounded by quotes in the 
external file.

Expert Solution

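(1) R-Code:

# Read the file straight from the URL; skip = 7 passes over the explanatory lines at the top
tennis.dfr <- read.table("http://people.stat.sc.edu/hitchcock/tennisplayers2018.txt",
header = FALSE, fill = TRUE, skip = 7)
colnames(tennis.dfr) <- c("first.name", "last.name", "major.match.wins", "major.match.losses",
"overall.match.wins", "overall.match.losses", "major.titles",
"overall.titles")

(2)

# Winning percentage = (match wins)/(match wins + match losses)
tennis.dfr$major.winning.pct <- tennis.dfr$major.match.wins / (tennis.dfr$major.match.wins + tennis.dfr$major.match.losses)
tennis.dfr$overall.winning.pct <- tennis.dfr$overall.match.wins / (tennis.dfr$overall.match.wins + tennis.dfr$overall.match.losses)

(3)

# Negating the key sorts from most to least; print() writes the result to the console
tennis.dfr <- tennis.dfr[order(-tennis.dfr$major.titles), ]
print(tennis.dfr)

(4)

# The second key breaks ties within each major-title level
tennis.dfr <- tennis.dfr[order(-tennis.dfr$major.titles, -tennis.dfr$major.winning.pct), ]
print(tennis.dfr)

(5)

library(dplyr)
greatest.dfr <- tennis.dfr %>% filter(major.titles >= 6)
print(greatest.dfr)

(6)

# Columns 3 to 8 are the original numeric variables; use 3:10 to include the two percentage columns as well
colMeans(tennis.dfr[, 3:8])

Output:

    major.match.wins   major.match.losses   overall.match.wins overall.match.losses
          159.966667            41.700000           700.900000           225.833333
        major.titles       overall.titles
            6.366667            50.700000

(7)

# quote = FALSE keeps the players' names unquoted; col.names = TRUE writes the header row
write.table(tennis.dfr, "tennisdata.txt", col.names = TRUE, row.names = FALSE, quote = FALSE, sep = " ")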
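
As a quick sanity check, the same steps can be tried on a small made-up data frame where the right answers are easy to see by eye. This is only a sketch: the names toy, toy.sorted, toy.great, toy.means and out.file, and all of the values, are illustrative assumptions and are not taken from the tennis file.

```r
# Made-up players (hypothetical values, not from tennisplayers2018.txt)
toy <- data.frame(
  last.name          = c("A", "B", "C", "D"),
  major.match.wins   = c(30, 90, 60, 45),
  major.match.losses = c(10, 10, 20, 5),
  major.titles       = c(2, 8, 8, 2),
  stringsAsFactors   = FALSE
)

# Winning percentage = wins / (wins + losses)
toy$major.winning.pct <- toy$major.match.wins /
  (toy$major.match.wins + toy$major.match.losses)

# Nested sort: major titles first, then winning percentage, both descending
toy.sorted <- toy[order(-toy$major.titles, -toy$major.winning.pct), ]
print(toy.sorted)

# Subset of players with at least 6 major titles
toy.great <- toy.sorted[toy.sorted$major.titles >= 6, ]

# Means of every numeric column, selected by type rather than by position
toy.means <- colMeans(toy[, sapply(toy, is.numeric)])
print(toy.means)

# Write without quotes or row names, then read the file back to check it
out.file <- tempfile(fileext = ".txt")
write.table(toy.sorted, out.file, col.names = TRUE, row.names = FALSE, quote = FALSE)
written.lines <- readLines(out.file)
toy.back <- read.table(out.file, header = TRUE)
```

Selecting the numeric columns with sapply(toy, is.numeric) avoids hard-coding column positions, which may be convenient if columns are added or reordered later.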

orchestra answered 2 years ago
