Question

The data for this section comes from data.world and represents professional soccer players (the data is a few years old, I think from 2016).

```{r}
players <- read.csv('https://s3.amazonaws.com/douglas2/data/players.csv', stringsAsFactors = F)
```

1. Without splitting the club variable, determine how many of the 7,282 players are Forwards.
```{r}
# Your Solution Here

substring(players$club, 1, 7) [substring(players$club, 1, 7)=='Forwards']
```

2. Create separate columns for position and club (information currently found in club column)
```{r}
# Your Solution Here
```

3. Construct a barplot that displays the number of players at each position.
```{r}
# Your Solution Here
```

4. Construct a barplot that displays the average minutes played by position.
```{r}
# Your Solution Here
```

5. What percent of clubs have an average player rank of 200 or less?
```{r}
# Your Solution Here
```

These questions are meant to be solved in RStudio. Please show me the exact steps that generate the result each question asks for.

Solutions

Expert Solution

ANSWER:

The commented code below works through the five questions in order.

CODE TEXT

# Read the CSV and store it in players
players <- read.csv('https://s3.amazonaws.com/douglas2/data/players.csv', stringsAsFactors = FALSE)

# 1. Count Forwards without splitting the club variable.
# substring() extracts the first 7 characters of club; comparing them to "Forward"
# gives a logical vector, and sum() counts its TRUE values.
forward_players <- sum(substring(players$club, 1, 7) == "Forward")
# Display the result
cat("Number of Forward players:", forward_players, "\n")
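The attempt in the question compares a seven-character substring against the eight-character string 'Forwards', so it can never match. A minimal sketch of the difference on a made-up vector (the club strings are hypothetical and assume a "Position - Club" layout; the real file may differ):

```r
# Hypothetical sample; assumes values look like "Position - Club"
club <- c("Forward - Club A", "Midfielder - Club B",
          "Forward - Club C", "Defender - Club A")

# Seven characters can never equal an eight-character string
n_wrong <- sum(substring(club, 1, 7) == "Forwards")

# Comparing against the seven-character prefix works
n_substr <- sum(substring(club, 1, 7) == "Forward")

# startsWith() (base R >= 3.3.0) avoids counting characters by hand
n_prefix <- sum(startsWith(club, "Forward"))

cat(n_wrong, n_substr, n_prefix, "\n")
```

If the column may contain missing values, `sum(..., na.rm = TRUE)` keeps the count from turning into `NA`.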

# 2. Create separate columns for position and club (information currently found in club column)
library(stringr)
# Split each value at the first " - " into exactly two pieces
splitClub <- str_split_fixed(players$club, " - ", 2)
# Add the position and clubName columns to players
players$position <- splitClub[, 1]
players$clubName <- splitClub[, 2]
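If stringr is not installed, the same split can probably be done with base R alone. The sketch below runs on a hypothetical two-row data frame and assumes, as the answer does, that " - " separates position from club:

```r
# Hypothetical sample; assumes " - " separates position from club
toy <- data.frame(club = c("Forward - Club A", "Midfielder - Club B"),
                  stringsAsFactors = FALSE)

# Drop everything from the first " - " onward to keep the position
toy$position <- sub(" - .*$", "", toy$club)
# Drop everything up to and including the first " - " to keep the club name
toy$clubName <- sub("^.*? - ", "", toy$club, perl = TRUE)

print(toy)
```

Both patterns cut at the first separator only, so a club name that itself contains " - " stays in one piece, which matches splitting into at most two parts.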

# 3. Construct a barplot that displays the number of players at each position.
# table() counts the players at each position
positionCount <- table(players$position)
barplot(positionCount, ylab = "Count", main = "Position Count")

# 4. Construct a barplot that displays the average minutes played by position.
meanMin <- aggregate(players$playedmins, list(players$position), mean)
# Extract the mean values and name them by position
meanVals <- meanMin$x
names(meanVals) <- meanMin$Group.1
barplot(meanVals, ylab = "Mean Minutes", main = "Mean Minutes Played by Position")
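As a possible shortcut, `tapply()` returns the per-group means already named by position, so the separate naming step is not needed. The sketch uses a hypothetical data frame whose column names mirror those in the answer; the `NA` is included only to show the effect of `na.rm = TRUE`, since it is not known whether the real data has missing minutes:

```r
# Hypothetical sample; column names mirror those used in the answer
toy <- data.frame(position   = c("Forward", "Forward", "Defender", "Defender"),
                  playedmins = c(900, 1100, 2000, NA))

# tapply() returns a named vector with one mean per position;
# na.rm = TRUE skips missing minutes instead of returning NA for the group
mean_mins <- tapply(toy$playedmins, toy$position, mean, na.rm = TRUE)

print(mean_mins)
```

The result can be passed straight to `barplot(mean_mins)`.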

# 5. What percent of clubs have an average player rank of 200 or less?
# aggregate() computes the mean rank for each club
avgRank <- aggregate(players$Rank, list(players$clubName), mean)
# Share of clubs whose mean rank is at most 200, as a percentage
per <- sum(avgRank$x <= 200) / nrow(avgRank) * 100
# Display the result
cat("Percentage of clubs:", per, "%\n")
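To sanity-check the percentage logic without the real file, it can help to run it on a tiny data frame where the answer is known by hand. Everything below is hypothetical sample data; only the column names are taken from the answer:

```r
# Hypothetical sample; column names mirror those used in the answer
toy <- data.frame(clubName = c("Club A", "Club A", "Club B", "Club C"),
                  Rank     = c(100, 200, 300, 50))

# Mean rank per club: Club A = 150, Club B = 300, Club C = 50
avg_rank <- aggregate(Rank ~ clubName, data = toy, FUN = mean)

# The mean of a logical vector is the share of TRUE values
pct <- mean(avg_rank$Rank <= 200) * 100

cat(sprintf("%.1f%% of clubs\n", pct))
```

Two of the three toy clubs qualify, so the sketch prints roughly 66.7%. Taking `mean()` of the comparison is equivalent to dividing the count of qualifying clubs by the number of clubs.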
