Bullying in schools has been the focus of various programs that work to reduce incidents of bullying and increase the reporting of activity related to and/or involving this behavior. A very large school district implemented a new anti-bullying program designed to reduce incidents of bullying. For the past 20 years, the school district has received an average of 370 incidents of bullying across all schools in its district. The data below represent the number of bullying incidents for 15 schools at the end of the first year after implementing the new anti-bullying program:
Bullying incidents reported
340 384 370 352 328 348 365 355 335 330 360 376 325 375 368
A. What is the dependent variable and its scale of measurement (i.e., NOIR)?
B. What is the null hypothesis for the problem described above?
C. Conduct a test of the null hypothesis using α = .05. Be sure to properly state your statistical conclusion.
D. Provide an interpretation of your statistical conclusion from part c.
E. What type of statistical error might you have made?
F. How could the researcher increase statistical power for this study?
G. What is the 95% confidence interval for this study?
H. Provide an interpretation for the interval obtained in part g.
I. How does the confidence interval obtained in part g compare to your statistical conclusion from part c? (i.e., Where does the value expected by the null hypothesis fall in the confidence interval and how does this support or not support your decision to reject or not reject the null hypothesis?)
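Parts C, G, and I can be checked with a one-sample t test. The following is a minimal Python sketch under stated assumptions, not a definitive solution: it assumes a two-tailed test against the historical mean of 370, and the critical value `T_CRIT` is a hand-entered t-table lookup (df = 14), not something computed by the code.

```python
import math
import statistics

# Bullying incidents reported by the 15 schools (from the problem statement).
incidents = [340, 384, 370, 352, 328, 348, 365, 355,
             335, 330, 360, 376, 325, 375, 368]
mu_0 = 370       # historical average, the value expected under the null hypothesis
T_CRIT = 2.145   # ASSUMPTION: two-tailed t-table value for alpha = .05, df = 14

n = len(incidents)
mean = statistics.mean(incidents)
sd = statistics.stdev(incidents)   # sample SD (n - 1 in the denominator)
se = sd / math.sqrt(n)             # standard error of the mean
t_stat = (mean - mu_0) / se

# 95% confidence interval for the population mean.
ci_low = mean - T_CRIT * se
ci_high = mean + T_CRIT * se

# Two-tailed decision rule: reject when |t| exceeds the critical value.
reject_null = abs(t_stat) > T_CRIT

print(f"mean = {mean:.2f}, SD = {sd:.2f}, SE = {se:.2f}")
print(f"t({n - 1}) = {t_stat:.2f}, reject H0: {reject_null}")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```

If the null value 370 falls outside the printed interval, the interval and the test agree on rejecting the null hypothesis, which is the comparison part I asks about.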
In: Statistics and Probability
The birthday problem considers the probability that two people in a group of a given size have the same birth date. We will assume a 365 day year (no leap year birthdays).
Code set-up
Dobrow 2.28 provides useful R code for simulating the birthday problem. Imagine we want to obtain an empirical estimate of the probability that two people in a class of a given size will have the same birth date. The code
trial = sample(1:365, numstudents, replace=TRUE)
simulates birthdays from a group of numstudents students. So you can assign numstudents or just replace numstudents with the number of students in the class of interest.
If we store the list of birthdays in the variable trial, the code
2 %in% table(trial)
will create a frequency table of birthdays and then determine if there is a match (2 birthdays the same). We can use this code in an if-else statement to record whether a class has at least one pair of students with the same birth date. We then can embed the code within a for-loop to repeat the experiment, store successes in a vector, and then take the average number of successes (a birthday match) across the repeated tasks.
The problems
Recall that the true probability is 1-prod(seq(343,365))/(365)^23 which is approximately 50%.
# [Place code here]
Place your answers to the three items below here:
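For readers who want to cross-check the R outline above, here is a hedged Python analogue of the same simulation loop. The function names, the seed, and the number of trials are illustrative choices, not part of the original exercise. Note one deliberate difference: this sketch counts a class as a success whenever any birthday is repeated, whereas `2 %in% table(trial)` only detects a birthday that occurs exactly twice.

```python
import random
from math import prod


def has_match(num_students, rng):
    """Return True if at least two simulated students share a birthday."""
    birthdays = [rng.randint(1, 365) for _ in range(num_students)]
    # A repeated birthday makes the set smaller than the list.
    return len(set(birthdays)) < num_students


def estimate_match_probability(num_students, num_trials, seed=0):
    """Repeat the experiment and average the successes (True counts as 1)."""
    rng = random.Random(seed)
    successes = [has_match(num_students, rng) for _ in range(num_trials)]
    return sum(successes) / num_trials


# Exact probability for 23 students, mirroring 1-prod(seq(343,365))/(365)^23.
exact = 1 - prod(range(343, 366)) / 365**23
estimate = estimate_match_probability(num_students=23, num_trials=20000)

print(f"exact = {exact:.4f}, simulated = {estimate:.4f}")
```

With enough trials the simulated proportion should settle near the exact value of roughly 50%.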
In: Statistics and Probability
The following data are from an experiment designed to investigate the perception of corporate ethical values among individuals specializing in marketing (higher scores indicate higher ethical values).
Marketing Managers | Marketing Research | Advertising |
---|---|---|
3 | 6 | 6 |
4 | 6 | 6 |
4 | 5 | 6 |
3 | 6 | 7 |
5 | 5 | 6 |
5 | 5 | 5 |
a) Use α = 0.05 to test for significant differences in perception among the three groups.
Find the value of the test statistic. _______
Find the p-value. (Round your answer to three decimal places.)
p-value = _______
b) At the α = 0.05 level of significance, we can conclude that there are differences in the perceptions for marketing managers, marketing research specialists, and advertising specialists. Use the procedures in Section 13.3 to determine where the differences occur. Use α = 0.05. (Use the Bonferroni adjustment.)
Find the value of LSD. (Round your comparisonwise error rate to four decimal places. Round your answer to three decimal places.)
LSD = _______
Find the pairwise absolute difference between sample means for each pair of treatments.
|x̄MM − x̄MR| = ________
|x̄MR − x̄A| = ________
|x̄MM − x̄A| = ________
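The ANOVA and Bonferroni steps can be sketched in plain Python as follows. This is an illustration under assumptions, not the textbook's worked solution: the p-value uses a closed form that holds only because the numerator degrees of freedom equal 2, and `T_BONF` is a hand-entered t-table lookup for 15 error degrees of freedom at the Bonferroni-adjusted level.

```python
import math
from itertools import combinations
from statistics import mean

# Scores from the table (MM = marketing managers, MR = marketing research, A = advertising).
groups = {
    "MM": [3, 4, 4, 3, 5, 5],
    "MR": [6, 6, 5, 6, 5, 5],
    "A": [6, 6, 6, 7, 6, 5],
}
T_BONF = 2.694  # ASSUMPTION: t-table value, df = 15, upper-tail area 0.0167 / 2

all_scores = [x for g in groups.values() for x in g]
grand_mean = mean(all_scores)
k, n_total = len(groups), len(all_scores)

# Between-treatments and within-treatments (error) sums of squares.
sstr = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
sse = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
mstr = sstr / (k - 1)
mse = sse / (n_total - k)
f_stat = mstr / mse

# Upper-tail area of F(2, df2); this closed form is valid only for numerator df = 2.
df2 = n_total - k
p_value = (df2 / (df2 + 2 * f_stat)) ** (df2 / 2)

# Bonferroni: split alpha across the 3 pairwise comparisons, then compute LSD.
alpha_cw = round(0.05 / 3, 4)
n_per_group = len(groups["MM"])
lsd = T_BONF * math.sqrt(mse * (2 / n_per_group))
diffs = {
    (a, b): abs(mean(groups[a]) - mean(groups[b]))
    for a, b in combinations(groups, 2)
}

print(f"F = {f_stat:.2f}, p-value = {p_value:.3f}")
print(f"comparisonwise alpha = {alpha_cw}, LSD = {lsd:.3f}")
for pair, d in diffs.items():
    print(pair, f"|diff| = {d:.2f}", "significant" if d > lsd else "not significant")
```

A pair of treatments is declared different when its absolute mean difference exceeds LSD.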
In: Statistics and Probability
Here are some prices for randomly selected grocery items from the grocery store:
Items Prices:
Cheese $3.29
Butter $4.99
Eggs $3.49
Yogurt $3.49
Juice $3.89
Tea $3.69
Chips $3.99
Soda $1.99
Pastry $2.99
Cereal $4.99
Oats $3.29
Almond Milk $2.79
Almonds $4.39
Popcorn $3.29
Crackers $3.59
Ice Cream $6.99
Cookies $2.99
Jam $3.69
Peanut Butter $3.29
Coffee $3.19
Green Tea $4.99
BBQ Sauce $2.99
Oil $6.69
Mayonnaise $4.59
Mustard $2.99
1. Compute the sample mean x̄ and the sample standard deviation s of the grocery store prices.
2. State the mean μx̄ and standard deviation σx̄ of the sample mean in terms of the population mean and population standard deviation of the original variable x.
3. Construct the 95% confidence interval for the mean grocery price. Assume the population standard deviation is $1.10.
4. Compute the minimum sample size required to have a margin of error of at most $0.30 while keeping the confidence level at 95%. Assume the population standard deviation is $1.10.
5. Construct the 95% confidence interval for the mean grocery store price. (This time, assume that the standard deviation σx is unknown.)
6. Suppose the mean grocery price for Safeway is known to be $4.10. Test the hypothesis that the mean grocery price for this grocery store differs from Safeway's mean price at significance level α = 0.10. Assume the population standard deviation is $1.10.
7. Retest the hypothesis that the mean grocery price for this grocery store differs from Safeway's mean price at significance level α = 0.10. (This time, assume that the standard deviation σx is unknown.)
This is for the purposes of checking answers and comparing work. Thank you.
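For checking work on items 1 and 3 through 7, a rough Python sketch follows. It assumes two-sided tests against $4.10, and the two t critical values (`T_CI`, `T_TEST`) are hand-entered table lookups for 24 degrees of freedom; everything else is computed from the listed prices.

```python
import math
from statistics import NormalDist, mean, stdev

# The 25 listed prices, in the order given.
prices = [3.29, 4.99, 3.49, 3.49, 3.89, 3.69, 3.99, 1.99, 2.99, 4.99,
          3.29, 2.79, 4.39, 3.29, 3.59, 6.99, 2.99, 3.69, 3.29, 3.19,
          4.99, 2.99, 6.69, 4.59, 2.99]
SIGMA = 1.10     # population SD given in the problem
MU_0 = 4.10      # Safeway mean price under the null hypothesis
T_CI = 2.064     # ASSUMPTION: t-table value, df = 24, upper-tail area 0.025
T_TEST = 1.711   # ASSUMPTION: t-table value, df = 24, upper-tail area 0.05

n = len(prices)
x_bar = mean(prices)
s = stdev(prices)                      # sample SD (n - 1 denominator)
std_normal = NormalDist()
z_95 = std_normal.inv_cdf(0.975)       # about 1.96

# Items 3 and 4: z interval and minimum sample size with sigma known.
z_margin = z_95 * SIGMA / math.sqrt(n)
z_ci = (x_bar - z_margin, x_bar + z_margin)
n_required = math.ceil((z_95 * SIGMA / 0.30) ** 2)

# Item 5: t interval with sigma unknown.
t_margin = T_CI * s / math.sqrt(n)
t_ci = (x_bar - t_margin, x_bar + t_margin)

# Items 6 and 7: two-sided z test and t test at alpha = 0.10.
z_stat = (x_bar - MU_0) / (SIGMA / math.sqrt(n))
p_value = 2 * std_normal.cdf(-abs(z_stat))
t_stat = (x_bar - MU_0) / (s / math.sqrt(n))
reject_t = abs(t_stat) > T_TEST

print(f"n = {n}, mean = {x_bar:.3f}, s = {s:.3f}")
print(f"z CI = ({z_ci[0]:.3f}, {z_ci[1]:.3f}), required n = {n_required}")
print(f"t CI = ({t_ci[0]:.3f}, {t_ci[1]:.3f})")
print(f"z = {z_stat:.3f}, p = {p_value:.3f}; t = {t_stat:.3f}, reject: {reject_t}")
```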
In: Statistics and Probability
Making Comparisons
Is it fair to compare an individual to a population of individuals? Why or why not?
Is it fair to compare two groups of individuals? Why or why not?
Is it fair to compare two different countries on differing levels of happiness for example? Why or why not?
(Hint: these are not all “unfair” otherwise we would not be able to make any comparisons, unless you believe that we should not make any comparisons. If this is so, please defend yourself).
In: Statistics and Probability
Q1. In a study of binge drinking among undergraduates at Ohio University, a researcher was interested in gender differences as related to binge drinking and to drinking-related arrests. She wanted to know two things: (a) Is there a significant relationship between gender and binge drinking (as defined by 5 or more drinks at one sitting), and (b) Is there a significant relationship between gender and drinking-related arrests? A random sample of males and females were asked about their experiences with binge drinking and with drinking-related arrests.
Use the numbers below for this question only!
Binge Drinking? | Yes | No |
---|---|---|
Male | 292 | 251 |
Female | 276 | 268 |
What would the expected value for the "male-yes" cell be?
Q2. In a study of binge drinking among undergraduates at Ohio University, a researcher was interested in gender differences as related to binge drinking and to drinking-related arrests. She wanted to know two things: (a) Is there a significant relationship between gender and binge drinking (as defined by 5 or more drinks at one sitting), and (b) Is there a significant relationship between gender and drinking-related arrests? A random sample of males and females were asked about their experiences with binge drinking and with drinking-related arrests.
Use the numbers below for this question only!
Binge Drinking? | Yes | No |
---|---|---|
Male | 225 | 274 |
Female | 275 | 214 |
What would the expected value for the "female-no" cell be?
Q3. In a study of binge drinking among undergraduates at Ohio University, a researcher was interested in gender differences as related to binge drinking and to drinking-related arrests. She wanted to know two things: (a) Is there a significant relationship between gender and binge drinking (as defined by 5 or more drinks at one sitting), and (b) Is there a significant relationship between gender and drinking-related arrests? A random sample of males and females were asked about their experiences with binge drinking and with drinking-related arrests. Test for a relationship in the following data:
Use the numbers below for this question only!
Binge Drinking? | Yes | No |
---|---|---|
Male | 54 | 25 |
Female | 22 | 46 |
What is the calculated chi-squared value?
Q4. Using a critical value of 3.84, and based on the obtained chi-square value in Question 3, is there a significant relationship between gender and binge drinking?
Yes
No
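The expected counts and the chi-square statistic for these 2 × 2 tables follow the usual rule (row total × column total ÷ grand total). A small Python sketch, with helper names chosen here for illustration and no continuity correction applied:

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_square(table):
    """Pearson chi-square statistic (no continuity correction)."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Rows are Male, Female; columns are Yes, No.
q1 = [[292, 251], [276, 268]]
q2 = [[225, 274], [275, 214]]
q3 = [[54, 25], [22, 46]]
CRITICAL = 3.84  # critical value given in Question 4 (df = 1)

male_yes = expected_counts(q1)[0][0]
female_no = expected_counts(q2)[1][1]
chi2_q3 = chi_square(q3)

print(f"Q1 expected male-yes = {male_yes:.2f}")
print(f"Q2 expected female-no = {female_no:.2f}")
print(f"Q3 chi-square = {chi2_q3:.2f}, significant: {chi2_q3 > CRITICAL}")
```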
In: Statistics and Probability
Q1. A local sports bar wanted to determine whether Ohio University students prefer a particular type of food in their establishment. A sample of students' responses is reproduced below. Do students prefer a particular type of bar food? Use critical value = 6.58.
Use the numbers below for this question only!
Nachos | Pizza | Chicken Wings | Cheese Sticks |
---|---|---|---|
33 | 34 | 46 | 46 |
What would the expected value for Cheese Sticks be?
Q2. A local sports bar wanted to determine whether Ohio University students prefer a particular type of food in their establishment. A sample of students' responses is reproduced below. Do students prefer a particular type of bar food? Use critical value = 6.58.
Use the numbers below for this question only!
Nachos | Pizza | Chicken Wings | Cheese Sticks |
---|---|---|---|
44 | 40 | 42 | 43 |
What is the calculated chi-squared value?
Q3. Using a critical value of 6.58, was there a significant preference for what students eat in a sports bar based on the obtained chi-square value in Question 2?
Yes
No
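For the goodness-of-fit questions, a minimal sketch follows. It assumes the null hypothesis of no preference, so each of the four foods has the same expected count; the critical value is the one stated in the question.

```python
def goodness_of_fit(observed):
    """Chi-square goodness of fit against equal expected counts."""
    expected = sum(observed) / len(observed)
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    return expected, chi2


# Order: Nachos, Pizza, Chicken Wings, Cheese Sticks.
q1_counts = [33, 34, 46, 46]
q2_counts = [44, 40, 42, 43]
CRITICAL = 6.58  # critical value given in the question

q1_expected, _ = goodness_of_fit(q1_counts)
q2_expected, q2_chi2 = goodness_of_fit(q2_counts)

print(f"Q1 expected per food = {q1_expected:.2f}")
print(f"Q2 chi-square = {q2_chi2:.3f}, significant: {q2_chi2 > CRITICAL}")
```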
In: Statistics and Probability
Ages of Proofreaders: At a large publishing company, the mean age of proofreaders is 36.2 years and the standard deviation is 3.7 years. Assume the variable is normally distributed. Round intermediate z-value calculations to two decimal places and the final answers to at least four decimal places.
If a proofreader from the company is randomly selected, find the probability that his or her age will be between 35.5 and 37 years.
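One way to check this probability is to standardize both ages, round each z-value to two decimals as the problem instructs, and take the difference of the standard normal CDF values. A short sketch using only the Python standard library:

```python
from statistics import NormalDist

MEAN_AGE, SD_AGE = 36.2, 3.7
LOW, HIGH = 35.5, 37.0

# Round intermediate z-values to two decimals, as the problem requires.
z_low = round((LOW - MEAN_AGE) / SD_AGE, 2)
z_high = round((HIGH - MEAN_AGE) / SD_AGE, 2)

std_normal = NormalDist()
probability = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(f"z_low = {z_low}, z_high = {z_high}, P = {probability:.4f}")
```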
In: Statistics and Probability
A particular lake is known to be one of the best places to catch a certain type of fish. In this table, x = number of fish caught in a 6-hour period. The percentage data are the percentages of fishermen who caught x fish in a 6-hour period while fishing from shore.
x | 0 | 1 | 2 | 3 | 4 or more |
---|---|---|---|---|---|
% | 43% | 35% | 15% | 6% | 1% |
(b) Find the probability that a fisherman selected at random fishing from shore catches one or more fish in a 6-hour period. (Enter a number. Round your answer to two decimal places.)
(c) Find the probability that a fisherman selected at random fishing from shore catches two or more fish in a 6-hour period. (Enter a number. Round your answer to two decimal places.)
(d) Compute μ, the expected value of the number of fish caught per fisherman in a 6-hour period (round 4 or more to 4). (Enter a number. Round your answer to two decimal places.) μ = ________ fish
(e) Compute σ, the standard deviation of the number of fish caught per fisherman in a 6-hour period (round 4 or more to 4). (Enter a number. Round your answer to three decimal places.) σ = ________ fish
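Parts (b) through (e) reduce to sums over the probability table. A minimal sketch, treating "4 or more" as exactly 4 as the problem instructs:

```python
import math

# Probability of catching x fish; "4 or more" is treated as 4.
pmf = {0: 0.43, 1: 0.35, 2: 0.15, 3: 0.06, 4: 0.01}

p_one_or_more = sum(p for x, p in pmf.items() if x >= 1)
p_two_or_more = sum(p for x, p in pmf.items() if x >= 2)

# Expected value and standard deviation of a discrete random variable.
mu = sum(x * p for x, p in pmf.items())
variance = sum((x - mu) ** 2 * p for x, p in pmf.items())
sigma = math.sqrt(variance)

print(f"P(x >= 1) = {p_one_or_more:.2f}, P(x >= 2) = {p_two_or_more:.2f}")
print(f"mu = {mu:.2f}, sigma = {sigma:.3f}")
```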
In: Statistics and Probability
What does it mean to say that we are going to use a sample to draw an inference about a population? Why is a random sample so important for this process? If we wanted a random sample of students in the cafeteria, why couldn’t we just choose the students who order Diet Pepsi with their lunch? Comment on the statement, “A random sample is like a miniature population, whereas samples that are not random are likely to be biased.” Why would the students who order Diet Pepsi with lunch not be a random sample of students in the cafeteria?
In: Statistics and Probability
What are the differences between MANOVA and discriminant analysis? What situations best suit each multivariate technique?
In: Statistics and Probability
In 2-3 paragraphs and in your own words, describe missing data and how to deal with it.
In: Statistics and Probability
A new design for the braking system on a certain type of car has been proposed. For the current system, the true average braking distance at 40 mph under specified conditions is known to be 120 ft. It is proposed that the new design be implemented only if sample data strongly indicates a reduction in true average braking distance for the new design.
(a) Define the parameter of interest.
μ = true average braking distance for the new design
μ = true average braking distance for the old design
p̂ = true proportion of cars whose braking distances reduced
p̂ = true proportion of cars whose braking distances did not reduce
State the relevant hypotheses.
H0: p̂ = 120, Ha: p̂ < 120
H0: μ = 120, Ha: μ ≠ 120
H0: μ = 120, Ha: μ > 120
H0: μ = 120, Ha: μ < 120
H0: p̂ = 120, Ha: p̂ ≠ 120
(b) Suppose braking distance for the new system is normally distributed with σ = 11. Let X̄ denote the sample average braking distance for a random sample of 36 observations. Which values of x̄ are more contradictory to H0 than 117.2?
x̄ ≥ 117.2
x̄ ≤ 117.2
What is the P-value in this case? (Round your answer to four decimal places.)
What conclusion is appropriate if α = 0.10?
The new design does have a mean braking distance less than 120 feet at 40 mph.
The new design does not have a mean braking distance less than 120 feet at 40 mph.
(c) What is the probability that the new design is not implemented when its true average braking distance is actually 115 ft and the test from part (b) is used? (Round your answer to four decimal places.)
You may need to use the appropriate table in the Appendix of Tables to answer this question.
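Parts (b) and (c) can be checked numerically. The sketch below assumes a lower-tailed z test (Ha: μ < 120) and reads "the test from part (b)" as the level α = 0.10 test; it does not round z to two decimals, so results may differ slightly from a table-based answer.

```python
import math
from statistics import NormalDist

MU_0, SIGMA, N = 120, 11, 36
X_BAR = 117.2      # observed sample mean from part (b)
ALPHA = 0.10       # ASSUMPTION: level of the test reused in part (c)
MU_TRUE = 115      # alternative value considered in part (c)

se = SIGMA / math.sqrt(N)          # standard error of the sample mean
std_normal = NormalDist()

# Part (b): lower-tailed P-value, since small sample means contradict H0.
z = (X_BAR - MU_0) / se
p_value = std_normal.cdf(z)
reject_null = p_value <= ALPHA

# Part (c): the test rejects when the sample mean falls at or below this cutoff.
cutoff = MU_0 + std_normal.inv_cdf(ALPHA) * se
# Type II error: probability the sample mean stays above the cutoff when mu = 115.
beta = 1 - NormalDist(MU_TRUE, se).cdf(cutoff)

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject_null}")
print(f"cutoff = {cutoff:.2f}, P(not implemented | mu = 115) = {beta:.4f}")
```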
In: Statistics and Probability
Describe an application of multiple discriminant analysis that is specific to your industry or to your academic interests (Data Science). Explain why this technique is suitable in terms of the measurement scale of the variables and their roles.
In: Statistics and Probability
Describe an application of multiple regression analysis that is specific to your industry or to your academic interests (Data Science). Explain why this technique is suitable in terms of the measurement scale of the variables and their roles.
In: Statistics and Probability