Instructions
Scenario (information repeated for deliverable 01, 03, and 04)
A major client of your company is interested in the salary distributions of jobs in the state of Minnesota that range from $30,000 to $200,000 per year. As a Business Analyst, your boss asks you to research and analyze the salary distributions. You are given a spreadsheet that contains the following information:
A listing of the jobs by title
The salary (in dollars) for each job
You have already explained some of the basic statistics to your client, and he really liked your work. Now he wants you to analyze the confidence intervals.
Background information on the Data
The data set in the spreadsheet consists of 364 records from the Bureau of Labor Statistics that you will be analyzing. It lists job titles in the state of Minnesota with yearly salaries ranging from approximately $30,000 to $200,000.
What to Submit
Your boss wants you to submit the spreadsheet with the completed calculations. Your research and analysis should be present within the answers provided on the worksheet.
WORKSHEET:
Deliverable 03 Worksheet
Discuss the importance of constructing confidence intervals for the population mean by answering these questions.
What are confidence intervals?
What is a point estimate?
What is the best point estimate for the population mean? Explain.
Why do we need confidence intervals?
Answer and Explanation:
Enter your step-by-step answer and explanations here.
Using the data from the Excel workbook, construct a 95% confidence interval for the population mean. Assume that your data is normally distributed and σ is unknown. Include a statement that correctly interprets the confidence interval in context of the scenario.
Hint: Use the sample mean and sample standard deviation from Deliverable 1.
Answer and Explanation:
Enter your step-by-step answer and explanations here.
Using the data from the Excel workbook, construct a 99% confidence interval for the population mean. Assume that your data is normally distributed and σ is unknown. Include a statement that correctly interprets the confidence interval in context of the scenario.
Hint: Use the sample mean and sample standard deviation from Deliverable 1.
Answer and Explanation:
Enter your step-by-step answer and explanations here.
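The 95% and 99% interval questions above both use the same calculation: sample mean plus or minus a t critical value times s/√n, with n - 1 degrees of freedom because σ is unknown. The sketch below is one possible way to work it, not the worksheet's official solution. It assumes the mean (62,306) and standard deviation (19,149.21386) reported in the summary table of the posted answer and n = 364 from the data description. Because the Python standard library has no t distribution, the helper `t_critical` (a hypothetical name) approximates the critical value with a standard series expansion around the normal quantile; Excel's `T.INV.2T` or a t table would be the usual route.

```python
from math import sqrt
from statistics import NormalDist


def t_critical(confidence, df):
    """Approximate two-sided t critical value.

    Uses the first terms of the classical series expansion of the
    t quantile around the normal quantile z. This is an approximation,
    accurate to several decimals for large df such as 363.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    return z + g1 / df + g2 / df**2


def mean_confidence_interval(mean, sd, n, confidence):
    """Return (lower, upper) for the population mean when sigma is unknown."""
    margin = t_critical(confidence, n - 1) * sd / sqrt(n)
    return mean - margin, mean + margin


# Assumed inputs: values taken from the summary table and data description.
MEAN, SD, N = 62306, 19149.21386, 364

for level in (0.95, 0.99):
    low, high = mean_confidence_interval(MEAN, SD, N, level)
    print(f"{level:.0%} CI: (${low:,.0f}, ${high:,.0f}), width ${high - low:,.0f}")
```

Running both levels side by side shows the trade-off raised in the comparison question: the 99% interval is more likely to capture the true mean salary, but it is wider and therefore less precise.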
Compare your answers for (2) and (3). You notice that the 99% confidence interval is wider. What is the advantage of using a wider confidence interval? Why would you not always use the 99% confidence interval? Explain with an example.
Answer and Explanation:
Enter your step-by-step answer and explanations here.
We want to estimate the mean salary in Minnesota. How many jobs must be randomly selected if we want 95% confidence that the sample mean is within $126 of the population mean, given σ = $1150?
Is the current sample size of 364 in the data set in our Excel workbook large enough? Explain.
Answer and Explanation:
Enter your step-by-step answer and explanations here.
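For the sample size question above, σ is given, so the usual formula is n = (z·σ/E)², rounded up to the next whole number, where E is the desired margin of error. The following is a minimal sketch under that assumption; the function name `required_sample_size` is made up for illustration.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence):
    """Smallest n so that the sample mean is within `margin` of the
    population mean at the given confidence level (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: rounding down would give a margin larger than requested.
    return ceil((z * sigma / margin) ** 2)


n_required = required_sample_size(sigma=1150, margin=126, confidence=0.95)
current_n = 364  # number of records in the workbook, per the data description

print("Required sample size:", n_required)
print("Current sample large enough:", current_n >= n_required)
```

Comparing `current_n` with `n_required` answers the second part of the question directly.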
Discuss the importance of the Measures of Center and the
Measures of Variation.
Measures of central tendency are important because they give us a single
central value that represents the given set of
observations.
The measures of variation describe how much the
observations vary around that central value.
Together, these measures help us summarize the data and give us a
general idea of its distribution and spread.
What are the measures of center and why are they
important?
The measures of central tendency are the mean, median, and mode. They
summarize the entire data set and give us an estimate of the central
value around which the data are spread.
What are the measures of variation and why are they
important?
The measures of dispersion/variation are the range, variance, and
standard deviation. They give us an idea of how far the
observations lie from the central value.
Summary Statistic | Value | Formula | Interpretation | Excel formula |
---|---|---|---|---|
Mean | 62,306 | | The average salary for the different jobs in the state of Minnesota is 62,306. | =average(array) |
Median | 56,520 | Arrange the data in ascending or descending order. The middle value is the median. | The median salary for the different jobs in the state of Minnesota is 56,520, i.e. 50% of the salaries lie below this value and 50% lie above it. | =median(array) |
Mode | 46,100 | The most frequently occurring number in the data. | The most frequently occurring salary for the different jobs in the state of Minnesota is 46,100. | =mode(array) |
Midrange | 80,010 | | The midpoint between the largest and smallest salaries for the different jobs in the state of Minnesota, i.e. the average of the two extreme values. | =(max(array) + min(array))/2 |
Range | 79,680 | | The difference between the largest and smallest salaries for the different jobs in the state of Minnesota. It gives a rough idea of how spread out salaries are in the state. | =max(array) - min(array) |
Variance | 366,692,391.3 | | Variance estimates how far the salaries of the different jobs in the state vary from the mean. A lower variance is preferred in ideal conditions. | =var(array) |
Standard Deviation | 19,149.21386 | | Standard deviation is the square root of the variance and also measures the amount of variability in the salaries of the different jobs. | =stdev(array) |

where array is the cell range containing the salary values.
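For readers working outside Excel, the same summary statistics can be sketched in Python with the standard `statistics` module. The salary list below is made up purely for illustration and is not the workbook data, so its results will not match the table above. As with Excel's `=var` and `=stdev`, `statistics.variance` and `statistics.stdev` compute the sample versions (dividing by n - 1).

```python
import statistics

# Hypothetical salaries in dollars, invented for illustration only.
salaries = [30000, 45000, 45000, 60000, 120000]

summary = {
    "mean": statistics.mean(salaries),                 # =average(array)
    "median": statistics.median(salaries),             # =median(array)
    "mode": statistics.mode(salaries),                 # =mode(array)
    "midrange": (max(salaries) + min(salaries)) / 2,   # =(max(array) + min(array))/2
    "range": max(salaries) - min(salaries),            # =max(array) - min(array)
    "variance": statistics.variance(salaries),         # =var(array), sample variance
    "standard deviation": statistics.stdev(salaries),  # =stdev(array), sample std dev
}

for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```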