The data for this section comes from data.world and describes
professional soccer players. (The data is a few years old, I think from
2016.)
```{r}
players <- read.csv('https://s3.amazonaws.com/douglas2/data/players.csv',
                    stringsAsFactors = FALSE)
```
1. Without splitting the club variable, determine how many of
the 7,282 players are Forwards.
```{r}
# Your Solution Here
substring(players$club, 1, 7)[substring(players$club, 1, 7) == 'Forwards']
```
2. Create separate columns for position and club (information
currently found in club column)
```{r}
# Your Solution Here
```
3. Construct a barplot that displays the number of players at each
position.
```{r}
# Your Solution Here
```
4. Construct a barplot that displays the average minutes played
by position.
```{r}
# Your Solution Here
```
5. What percent of clubs have an average player rank of 200 or
less?
```{r}
# Your Solution Here
```
These questions are meant to be answered in RStudio. Please show me the exact steps that generate the result each question asks for.
ANSWER:
CODE TEXT
# Read the CSV and store it in players
players <- read.csv('https://s3.amazonaws.com/douglas2/data/players.csv', stringsAsFactors = FALSE)
# 1. Number of Forward players, without splitting the club variable
# substring() takes the first 7 characters of each club entry; comparing them
# with "Forward" gives a logical vector, and sum() counts the TRUE values
forward_players <- sum(substring(players$club, 1, 7) == "Forward")
# Display the result
cat("Number of Forward players:", forward_players, "\n")
# 2. Create separate columns for position and club (information currently found in club column)
library(stringr)
# Split each entry into exactly two pieces at the first " - " separator
splitClub <- str_split_fixed(players$club, " - ", 2)
# Create the position and clubName columns in players
players$position <- splitClub[, 1]
players$clubName <- splitClub[, 2]
# 3. Construct a barplot that displays the number of players at each position.
# table() counts the players at each position
positionCount <- table(players$position)
barplot(positionCount, ylab = "Count", main = "Position Count")
# 4. Construct a barplot that displays the average minutes played by position.
# aggregate() computes the mean of playedmins within each position
meanMin <- aggregate(players$playedmins, list(players$position), mean)
# Extract the mean values and label them with their position names
meanVals <- meanMin$x
names(meanVals) <- meanMin$Group.1
barplot(meanVals, ylab = "Mean Minutes", main = "Mean Minutes Played by Position")
# 5. What percent of clubs have an average player rank of 200 or less?
# aggregate() computes the mean player rank for each club
avgRank <- aggregate(players$Rank, list(players$clubName), mean)
# Share of clubs whose average rank is 200 or less, as a percentage
per <- sum(avgRank$x <= 200) / nrow(avgRank) * 100
# Display the result
cat("Percentage of clubs:", per, "%\n")
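The steps above can be sanity-checked on a small made-up data frame before running them on the full file. The sketch below is only an illustration: the rows in `toy` are invented, and it assumes the `club` column has the form "Position - Club" and that the columns are named `playedmins` and `Rank`, as in the answer. It uses base R only (`startsWith()`, `sub()`, `tapply()`) as an alternative to `stringr` and `aggregate()`.

```r
# Hypothetical toy data in the assumed "Position - Club" layout
toy <- data.frame(
  club = c("Forward - Alpha FC", "Midfielder - Alpha FC",
           "Forward - Beta United", "Defender - Beta United",
           "Goalkeeper - Gamma City"),
  playedmins = c(900, 1200, 300, 1500, 1800),
  Rank = c(50, 150, 250, 350, 100),
  stringsAsFactors = FALSE
)

# 1. Count forwards without splitting the column
n_forwards <- sum(startsWith(toy$club, "Forward"))

# 2. Split on the first " - ": the text before it is the position,
#    the text after it is the club name
toy$position <- sub(" - .*$", "", toy$club)
toy$clubName <- sub("^.*? - ", "", toy$club, perl = TRUE)

# 3. Players per position (this table is what barplot() would draw)
position_count <- table(toy$position)

# 4. Mean minutes played per position, as a named vector
mean_mins <- tapply(toy$playedmins, toy$position, mean)

# 5. Percent of clubs whose average player rank is 200 or less
avg_rank <- tapply(toy$Rank, toy$clubName, mean)
pct <- mean(avg_rank <= 200) * 100

print(n_forwards)      # 2
print(position_count)
print(mean_mins)
print(round(pct, 2))   # 66.67
```

Because every toy value is small enough to check by hand, a mismatch here points to a wrong assumption about the separator or the column names rather than a problem in the data.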