2. The data set `MLB-TeamBatting-S16.csv` contains MLB team batting data for selected variables. Load the data set from the given URL using the code below. This data set was obtained from [Baseball Reference](https://www.baseball-reference.com/leagues/MLB/2016-standard-batting.shtml). The variables are:
* Tm - Team
* Lg - League: American League (AL), National League (NL)
* BatAge - Batters’ average age
* RPG - Runs Scored Per Game
* G - Games Played or Pitched
* AB - At Bats
* R - Runs Scored/Allowed
* H - Hits/Hits Allowed
* HR - Home Runs Hit/Allowed
* RBI - Runs Batted In
* SO - Strikeouts
* BA - Hits/At Bats
* SH - Sacrifice Hits (Sacrifice Bunts)
* SF - Sacrifice Flies

Using the `mlb16.data` data, do the following:
i) use `filter` to select teams using the following conditions:
a) the team is the Cardinals (`Tm` is `STL`).
b) the team had more than 1400 hits (`H`) in the 2016 season.
c) the team's league (`Lg`) is the National League (`NL`).
ii) use `arrange` to sort the teams in decreasing order of home runs (`HR`).
iii) use `arrange` to display the teams in decreasing order of `RBI`.
iv) use `group_by` to group the teams by league, and `summarise` to compute the average `RBI` within each league. Use the pipe operator `%>%` to chain multiple functions.

### Code chunk
```{r}
# load the data set
mlb16.data <- read.csv("https://raw.githubusercontent.com/jpailden/rstatlab/master/data/MLB-TeamBatting-S16.csv")
str(mlb16.data) # check structure
head(mlb16.data) # show first six rows

# last R code line
```
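
One possible way to work through parts i) to iv) is sketched below. It is a minimal sketch, not the posted solution: it assumes the `dplyr` package is installed, that the CSV columns are named exactly as in the variable list above (`Tm`, `Lg`, `H`, `HR`, `RBI`), and that the three conditions in part i) are meant as three separate filters. The object names (`stl`, `high.hits`, `nl.teams`, `by.hr`, `by.rbi`, `avg.rbi`) are illustrative choices only.

```r
library(dplyr)

# Load the data set from the same URL used in the code chunk above
mlb16.data <- read.csv("https://raw.githubusercontent.com/jpailden/rstatlab/master/data/MLB-TeamBatting-S16.csv")

# i) filter: keep only the rows that satisfy a condition
#    (assumption: a), b) and c) are applied separately, not combined)
stl       <- mlb16.data %>% filter(Tm == "STL")   # a) Cardinals
high.hits <- mlb16.data %>% filter(H > 1400)      # b) more than 1400 hits
nl.teams  <- mlb16.data %>% filter(Lg == "NL")    # c) National League teams

# ii) arrange: sort teams by home runs, largest first
by.hr <- mlb16.data %>% arrange(desc(HR))

# iii) arrange: sort teams by runs batted in, largest first
by.rbi <- mlb16.data %>% arrange(desc(RBI))

# iv) group_by + summarise: average RBI within each league
#     (mean.RBI is an illustrative column name)
avg.rbi <- mlb16.data %>%
  group_by(Lg) %>%
  summarise(mean.RBI = mean(RBI, na.rm = TRUE))

# Inspect the results
stl
high.hits
nl.teams
head(by.hr)
head(by.rbi)
avg.rbi
```

If the conditions in part i) were instead meant to hold at the same time, they could be passed together, as in `filter(Lg == "NL", H > 1400)`, since `filter` combines comma-separated conditions with a logical AND.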

Expert Solution

Code and results are given below

milcah answered 1 year ago

Load “Lock5Data” into your R console. Load the “OlympicMarathon” data set in “Lock5Data”. This data set contains the population of all times to finish the 2008 Olympic Men’s Marathon. a) What is the population size? b) Now, using the “Minutes” column, generate a random sample of size 5. c) Calculate the sample mean and record it (create an Excel sheet or write an R program to record this). d) Repeat steps (b) and (c) 10,000 times (that means you have recorded 10,000...

Data Set: The data set (attached) is a modified CSV file on all international flights departing from US airports between January and June 2019, as reported by the US Department of Transportation (https://data.transportation.gov/Aviation/International_Report_Passengers/xgub-n9bw). Each record holds a route (origin to destination) operated by an airline. This CSV file was modified to keep it simple and relatively small. Here is a description of each column: Column 1 – Month (1 – January, 2 – February, 3 – March, 4 – April, 5...

The family college data set contains a sample of 792 cases with two variables, teen and parents, and is summarized in the table below. The teen variable is either college or not, where the teenager is labeled as college if she went to college immediately after high school. The parents variable takes the value degree if at least one parent of the teenager completed a college degree. Parents Degree Parents No Degree Total Teen College 231 214 445 Teen Not college...

**Number 2 implemented in R (RStudio).** Set up the Auto data: Load the ISLR package and the Auto data. Determine the median value for mpg. Use the median to create a new column in the data set named mpglevel, which is 1 if mpg > median and 0 otherwise. Make sure this variable is a factor. We will use mpglevel as the target (response) variable for the algorithms. Use the names() function to verify that your new column...

Use the ERA and Win% data for the 8 randomly selected MLB pitchers below to answer the following questions. (Please include the steps used to answer the questions. Thank you.)
* ERA: 0.035, 0.021, 0.022, 0.025, 0.015, 0.029, 0.015, 0.026
* Win%: 0.545, 0.554, 0.718, 0.610, 0.675, 0.535, 0.665, 0.604

Question 1A: Compute the IQR for Win%. Question 2B: Compute the covariance between ERA and Win%. Question 3C: Compute the correlation between ERA and Win%.
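
For the ERA and Win% question above, the three quantities could be computed in base R roughly as follows. This is a sketch under assumptions: the vector names `era` and `win.pct` are illustrative, `IQR()` uses R's default quantile method (type 7), which may differ slightly from a textbook's hand-calculation rule, and `cov()` and `cor()` return the sample (n - 1) covariance and the Pearson correlation.

```r
# Values copied from the question, in the order given
era     <- c(0.035, 0.021, 0.022, 0.025, 0.015, 0.029, 0.015, 0.026)
win.pct <- c(0.545, 0.554, 0.718, 0.610, 0.675, 0.535, 0.665, 0.604)

# 1A) Interquartile range of Win% (Q3 - Q1, default type-7 quantiles)
iqr.win <- IQR(win.pct)

# 2B) Sample covariance between ERA and Win% (divides by n - 1)
cov.era.win <- cov(era, win.pct)

# 3C) Pearson correlation between ERA and Win%
cor.era.win <- cor(era, win.pct)

iqr.win
cov.era.win
cor.era.win
```

The correlation is the covariance rescaled by the two standard deviations, so `cov.era.win / (sd(era) * sd(win.pct))` should agree with `cor.era.win`, which makes a quick consistency check.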

The data file contains displacement (in mm)-load (in N) data for a mechanical test that was conducted on an unknown metal. The initial length and diameter of the specimen are also given. a. (5 pts.) Using the data and a computer program (such as Excel), create an engineering stress-engineering strain graph with proper labels. The stress axis should be in the units of MPa. You do not need to show your spreadsheet or software code used to make the graph....

Using the MLB attendance data, calculate 90% confidence intervals for each team's mean attendance. What sampling distribution did you use? Why? Can you conclude that one team or another has larger crowds given these confidence intervals? Brewers White Sox Indians Twins 45,341 38,088 16,789 48,711 22,603 26,337 18,082 24,439 23,649 24,141 14,887 27,539 41,758 33,278 13,843 26,047 41,282 27,653 14,066 24,552 28,019 23,139 20,484 30,131 22,331 25,390 25,065 15,869 17,386 25,459 25,402 27,783 41,522 26,342 25,721 35,269 41,209 30,193 27,250 ...

Trauma and Metabolic Expenditure: The data set “MetabolicExpenditures” contains metabolic expenditures for eight randomly selected patients admitted to a hospital for reasons other than trauma (NonTrauma) and for seven randomly selected patients admitted for multiple fractures (Trauma). (Data from C.L. Long, et al. “Contribution of Skeletal Muscle Protein in Elevated Rates of Whole Body Protein Catabolism in Trauma Patients,” American Journal of Clinical Nutrition, 34 (1981): 1087–1093.) Before collecting the data, the researchers hypothesize that trauma patients have higher metabolic expenditures...

R Programming: Load the {ISLR} and {GGally} libraries. Load and attach the College{ISLR} data set. 1.2 Inspect the data with the ggpairs(){GGally} function, but do not run the ggpairs plots on all variables because it will take a very long time. Only include these variables in your ggpairs plot: “Outstate”, “S.F.Ratio”, “Private”, “PhD”, “Grad.Rate”. 1.3 Briefly answer: if we are interested in predicting out-of-state tuition (Outstate), can you tell from the plots if any of the other variables have a curvilinear relationship...

Use the Moneyball data set which contains selected statistics for Major League Baseball teams from 1962–2012. Based on historical data, the probability that in a given year the NYM will make the playoffs is p = 7/47 = 0.149. Let X be the discrete random variable that gives the total number of Playoffs made by NYM in the last 20 years, i.e., from 1993 to 2012. (12 Points) What is the probability that the total...
