Your responses are to be based on your own experiences and insights. Do not use materials and examples from other sources, including the Internet. Students have used examples from outside sources in the past and have failed this final assessment and in some cases, failed the course due to evidence of plagiarism.
1) In an IT job, hypothesis testing can be used to check whether a skills training changed the average performance of a team. Performance here is measured by the year-end ratings of the individual team members.
The null hypothesis is that the training had no impact, i.e. the average year-end rating is the same before and after the training. The alternative hypothesis is that the average rating increased after the training.
The data can be collected from the manager who has managed the team over the years.
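As a rough illustration of the test described above, the sketch below computes a paired t statistic on the before/after ratings of the same team members. The ratings are made-up numbers, the function name `paired_t_statistic` is hypothetical, and the choice of a one-sided paired t-test at the 5% level is an assumption about how the comparison would be run.

```python
import math
import statistics

# Hypothetical year-end ratings (scale 1-5) for the same 8 team members.
before = [3.0, 3.5, 2.5, 4.0, 3.0, 3.5, 3.0, 4.0]
after = [3.5, 3.5, 3.0, 4.5, 3.0, 4.0, 3.5, 4.0]


def paired_t_statistic(before, after):
    """Return the t statistic for the mean of the paired differences."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff / (sd_diff / math.sqrt(n))


t_stat = paired_t_statistic(before, after)

# Critical value from a t table: one-sided test, alpha = 0.05, df = n - 1 = 7.
T_CRITICAL = 1.895
reject_null = t_stat > T_CRITICAL

print(f"t = {t_stat:.3f}, reject null hypothesis: {reject_null}")
```

If the t statistic exceeds the critical value, the null hypothesis of "no impact" is rejected in favor of the alternative that ratings increased.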
2) Before an election, people hold varying opinions about the number of votes a certain candidate will win. The mean of the surveyed estimates can be calculated, and the margin of error (computed from the standard deviation of the estimates and the number of people surveyed) can then be used to build a confidence interval for the number of votes the candidate is expected to win at a given confidence level.
The data can be collected by surveying the population using a probability sampling approach such as stratified sampling.
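The calculation above can be sketched as follows. The survey estimates are invented for illustration, the function name `confidence_interval` is hypothetical, and the interval assumes a normal approximation (a z critical value); for a small sample like this one, a t critical value would give a somewhat wider interval.

```python
import math
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, confidence=0.95):
    """Return (low, high) as mean +/- z * s / sqrt(n), using a normal approximation."""
    n = len(sample)
    center = mean(sample)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. about 1.96 for 95%
    margin = z * stdev(sample) / math.sqrt(n)  # margin of error
    return center - margin, center + margin


# Hypothetical vote-count estimates given by 10 surveyed people.
estimates = [5200, 4800, 5100, 4950, 5300, 5050, 4900, 5150, 5000, 5250]

low, high = confidence_interval(estimates)
print(f"95% confidence interval: {low:.0f} to {high:.0f} votes")
```

Surveying more people shrinks the margin of error, because the standard deviation is divided by the square root of the sample size.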
3) Regression analysis can be used to analyze the relationship between my salary and the number of weekly hours I spend learning new skills over a period of time. If my salary has tended to rise during periods of higher weekly training hours, this suggests that training may have helped me earn more through upgraded skills, although the association alone does not prove it.
The independent variable is the number of training hours per week.
The dependent variable is the salary earned.
Data can be collected by logging the average number of hours spent per week in training, along with the salary over time.
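A minimal sketch of this regression, assuming a simple straight-line (least-squares) fit with one independent variable. The hours and salary figures are made up, and the function name `fit_line` is hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical log: average weekly training hours and salary (in thousands) per year.
weekly_hours = [1, 2, 3, 4, 5]
salary = [50, 52, 55, 57, 61]

slope, intercept = fit_line(weekly_hours, salary)
print(f"salary = {intercept:.1f} + {slope:.1f} * weekly_hours")
```

A positive slope means salary was higher in periods with more weekly training hours. It describes the association in the logged data and does not by itself rule out other causes, such as general tenure-based raises.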
4) The ratings of employees in a team are typically plotted as a histogram or fitted to a bell curve (normal distribution). Either of these two graphs can then be used to decide how much of the budget pool to award to employees falling in different segments of the distribution.
The pros of using a normal curve are that it supports relative grading with a distribution that approximates many real-life phenomena, and that it is easy to use. The con is that it has very thin tails and hence does not model extreme scenarios well: such scenarios often occur with a probability that is clearly non-zero, whereas the normal distribution treats them as near-zero-probability events.
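To make the thin-tails point concrete, the sketch below compares the probability of landing more than k standard deviations from the mean under a normal distribution and under a Laplace distribution with the same variance. The Laplace distribution is only an assumed stand-in for a heavier-tailed alternative, and both function names are hypothetical.

```python
import math
from statistics import NormalDist


def normal_two_sided_tail(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return 2 * (1 - NormalDist().cdf(k))


def laplace_two_sided_tail(k):
    """P(|X| > k) for a Laplace variable with mean 0 and variance 1.

    Variance 1 corresponds to scale b = 1 / sqrt(2), and P(|X| > k) = exp(-k / b).
    """
    return math.exp(-math.sqrt(2) * k)


for k in (2, 3, 4):
    print(k, normal_two_sided_tail(k), laplace_two_sided_tail(k))
```

At 4 standard deviations the normal tail probability is far smaller than the Laplace one, which is the sense in which the normal curve treats extreme ratings as nearly impossible.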