(a.) In order to determine performance-based bonuses, you collect data on the average value of weekly sales ($), largest sale made ($), and current salary ($) for various sales people.
What would the cases be (to be put in the rows of the data table before analyzing it)?
- sales people
- largest sale made ($)
- average value of weekly sales ($)
- current salary ($)
(b.) For a study of TV shows, you obtain the rating, market share, and advertising revenue for 50 popular TV shows.
Which of these are statistical questions that could be answered from this data set? (Check all that apply.)
- Is there a relationship between the rating and advertising revenue of the 50 shows in this sample?
- For all TV shows on network television, what percentage have ratings of 5.0 or lower?
- What is the distribution of ratings for the 50 shows in this sample?
- Which of the 50 popular shows in this sample would you rate the highest?
- Does the market share of TV shows differ based on the gender of viewers?
(c.) Indicate how the following data set should be organized in order to analyze it. Indicate which items should go in the rows (the cases), and what the headings of columns should be (the variables).
Data collected for financial planning: daily income, day number of the month, daily income predicted by last month, difference between predicted income and realized income.
- Each row is a day. Columns hold the day number (identifier), the income prediction, the actual income, and the difference between the actual and predicted income.
- Each row is a difference between the actual and predicted income. Columns hold the differences (identifier), day number, predicted income, and actual income for that day.
- Each row is a different income amount. Columns hold the actual income amount (identifier), day number, predicted income, and the difference between the actual and predicted income.
- Each row is an income prediction. Columns hold the income (identifier), the day in which the income was predicted, the actual income of the predicted day, and the difference between the actual and predicted income.
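a. Cases are the objects on which data are collected, so each case occupies one row of the data table.
Here the cases are:
- sales people
The other three options (average value of weekly sales, largest sale made, and current salary) are variables measured on each sales person, so they are the column headings.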
b. The questions that can be answered from this data set are:
- Is there a relationship between the rating and advertising revenue of the 50 shows in this sample?
Both variables are recorded for every show, so we can plot them on a two-dimensional scatter plot, compute the correlation, and fit a regression line.
- What is the distribution of ratings for the 50 shows in this sample?
We can build a frequency table of the ratings and plot it, for example as a histogram.
Why can't the other questions be answered?
- For all TV shows on network television, what percentage have ratings of 5.0 or lower?
The data cover only 50 popular shows, not all network shows. The percentage can be computed within the sample, but a sample chosen for popularity is not representative, so it does not give the percentage for all shows.
- Which of the 50 popular shows in this sample would you rate the highest?
This asks for a personal opinion, so it is not a statistical question.
- Does the market share of TV shows differ based on the gender of viewers?
No data on the gender of viewers were collected.
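The two sample-level questions in part b (the relationship between rating and advertising revenue, and the distribution of ratings) can be sketched in a few lines of Python. This is only an illustration: the four shows and their values are made up, the field names `rating` and `ad_revenue` are assumptions, and the real data set would have 50 rows.

```python
from collections import Counter
from math import sqrt

# Hypothetical values for illustration only; the real data set has 50 shows.
shows = [
    {"show": "A", "rating": 4.0, "ad_revenue": 100.0},
    {"show": "B", "rating": 5.0, "ad_revenue": 120.0},
    {"show": "C", "rating": 6.0, "ad_revenue": 150.0},
    {"show": "D", "rating": 5.0, "ad_revenue": 130.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


ratings = [s["rating"] for s in shows]
revenue = [s["ad_revenue"] for s in shows]

# Relationship question: correlation between rating and advertising revenue.
r = pearson(ratings, revenue)

# Distribution question: frequency table of the ratings in the sample.
rating_counts = Counter(ratings)

print(f"correlation = {r:.3f}")
for value, count in sorted(rating_counts.items()):
    print(f"rating {value}: {count} show(s)")
```

Both results describe only the shows in the sample, which is why these two questions are answerable from the data while a question about all network shows is not.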
c. Each row is a day. Columns hold the day number (identifier), the income prediction, the actual income, and the difference between the actual and predicted income.
The data are collected once per day, so the days are the cases. The day number identifies each row, and the predicted income, actual income, and their difference are the variables.
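A layout with one row per day, as the reasoning about day numbers suggests, might look like the following sketch. The income figures are invented, the column names are assumptions, and the difference is taken here as actual minus predicted, which the question does not specify.

```python
# Hypothetical (day, predicted income, actual income) records for illustration.
records = [
    (1, 200.0, 180.0),
    (2, 150.0, 175.0),
    (3, 220.0, 220.0),
]

# One row per day (the case); every other column is a variable for that day.
# Assumption: difference = actual income - predicted income.
table = [
    {
        "day": day,
        "predicted_income": predicted,
        "actual_income": actual,
        "difference": actual - predicted,
    }
    for day, predicted, actual in records
]

for row in table:
    print(row)
```

Using the day number as the identifier keeps each row unique; income amounts or differences could repeat across days, so they would not work as row identifiers.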