Questions
Bullying in schools has been the focus of various programs which work to reduce incidents of...

Bullying in schools has been the focus of various programs that work to reduce incidents of bullying and increase the reporting of activity related to and/or involving this behavior. A very large school district implemented a new anti-bullying program designed to reduce incidents of bullying. For the past 20 years, the school district has received an average of 370 incidents of bullying across all schools in its district. The data below represent the number of bullying incidents for 15 schools at the end of the first year after implementing the new anti-bullying program:

Bullying incidents reported

340 384 370 352 328 348 365 355 335 330 360 376 325 375 368

A. (1) What is the independent variable and its levels?

(2) What is the dependent variable and its scale (i.e., NOIR) of measurement?

B. What is the null hypothesis for the problem described above?

C. Conduct a test of the null hypothesis using α = .05. Be sure to properly state your statistical conclusion.

D. Provide an interpretation of your statistical conclusion from part c.

E. What type of statistical error might you have made?

F. How could the researcher increase statistical power for this study?

G. What is the 95% confidence interval for this study?

H. Provide an interpretation for the interval obtained in part g.

I. How does the confidence interval obtained in part g compare to your statistical conclusion from part c? (i.e., Where does the value expected by the null hypothesis fall in the confidence interval and how does this support or not support your decision to reject or not reject the null hypothesis?)
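
For checking parts C and G, a minimal R sketch follows. It assumes a two-sided one-sample t test against the historical mean of 370 is the intended procedure (the population standard deviation is not given); since the program is described as designed to reduce incidents, a one-tailed test may be what your course expects instead. The variable names are illustrative only.

```r
# Bullying incidents reported by the 15 schools (copied from the question).
incidents <- c(340, 384, 370, 352, 328, 348, 365, 355, 335, 330,
               360, 376, 325, 375, 368)
mu0 <- 370  # historical average stated in the question

# Assumption: two-sided one-sample t test of H0: mu = 370.
result <- t.test(incidents, mu = mu0, conf.level = 0.95)

sample_mean <- mean(incidents)
t_stat <- unname(result$statistic)  # t statistic
df <- unname(result$parameter)      # degrees of freedom, n - 1
p_value <- result$p.value
ci_95 <- result$conf.int            # 95% confidence interval for the mean

print(result)
```

If 370 falls outside `ci_95`, that agrees with rejecting the null hypothesis at α = .05 in a two-sided test, which is the comparison part I asks about.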

In: Statistics and Probability

The birthday problem considers the probability that two people in a group of a given size...

The birthday problem considers the probability that two people in a group of a given size have the same birth date. We will assume a 365 day year (no leap year birthdays).

Code set-up

Dobrow 2.28 provides useful R code for simulating the birthday problem. Imagine we want to obtain an empirical estimate of the probability that two people in a class of a given size will have the same birth date. The code

trial = sample(1:365, numstudents, replace=TRUE)

simulates birthdays from a group of numstudents students. So you can assign numstudents or just replace numstudents with the number of students in the class of interest.

If we store the list of birthdays in the variable trial, the code

2 %in% table(trial)

will create a frequency table of birthdays and then determine if there is a match (2 birthdays the same). We can use this code in an if-else statement to record whether a class has at least one pair of students with the same birth date. We then can embed the code within a for-loop to repeat the experiment, store successes in a vector, and then take the average number of successes (a birthday match) across the repeated tasks.

The problems

  • Simulate the birthday problem to obtain an empirical estimate of the probability that two people in a class of 23 will have the same birth date. In particular, simulate birthdays for 1000 classes (for(i in 1:1000){...}) each of size 23 and compute the proportion of these classes in which at least one pair of students has the same birth date.

Recall that the true probability is 1-prod(seq(343,365))/(365)^23 which is approximately 50%.

  • Using your simulation code, estimate the number of students needed in the class so that the probability of a match is 95%. (You may do this by trial and error.)
  • Using your simulation code, find the approximate probability that three people have the same birthday in a class of 50 students.

# [Place code here]
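
One possible way to assemble the pieces described in the code set-up is sketched below. The helper names `has_match` and `estimate_match`, the seed, and the range of class sizes tried are assumptions made for illustration, not part of Dobrow's code. The sketch also tests `max(table(trial)) >= k` rather than `2 %in% table(trial)`: the two agree except when the only repeated date is shared by three or more students, a case the quoted line would count as no match.

```r
set.seed(2024)  # arbitrary seed so the run is repeatable

# TRUE if at least one birth date is shared by k or more students.
has_match <- function(numstudents, k = 2) {
  trial <- sample(1:365, numstudents, replace = TRUE)
  max(table(trial)) >= k
}

# Proportion of simulated classes in which a match occurs.
estimate_match <- function(numstudents, k = 2, nclasses = 1000) {
  successes <- numeric(nclasses)
  for (i in 1:nclasses) {
    successes[i] <- if (has_match(numstudents, k)) 1 else 0
  }
  mean(successes)
}

# Item 1: class of 23, compared with the exact probability.
p23 <- estimate_match(23)
exact23 <- 1 - prod(seq(343, 365)) / 365^23

# Item 2: trial and error over a range of class sizes (range is a guess).
sizes <- 40:55
p_by_size <- sapply(sizes, estimate_match)

# Item 3: at least three students sharing a date in a class of 50.
p_triple_50 <- estimate_match(50, k = 3)
```

With only 1000 classes per estimate, each proportion carries simulation noise of roughly one to two percentage points, so the class size found in item 2 can shift by one or two students between runs.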

Place your answers to the three items below here:

  • [Ans 1]


In: Statistics and Probability

The following data are from an experiment designed to investigate the perception of corporate ethical values...

The following data are from an experiment designed to investigate the perception of corporate ethical values among individuals specializing in marketing (higher scores indicate higher ethical values).

Marketing     Marketing
Managers      Research      Advertising
   3             6              6
   4             6              6
   4             5              6
   3             6              7
   5             5              6
   5             5              5

a) Use α = 0.05 to test for significant differences in perception among the three groups.

Find the value of the test statistic. _______

Find the p-value. (Round your answer to three decimal places.)

p-value = _______

b) At the α = 0.05 level of significance, we can conclude that there are differences in the perceptions for marketing managers, marketing research specialists, and advertising specialists. Use the procedures in Section 13.3 to determine where the differences occur. Use α = 0.05. (Use the Bonferroni adjustment.)

Find the value of LSD. (Round your comparisonwise error rate to four decimal places. Round your answer to three decimal places.)

LSD = _______

Find the pairwise absolute difference between sample means for each pair of treatments.

|x̄MM − x̄MR| = ________

|x̄MM − x̄A| = ________

|x̄MR − x̄A| = ________
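
The calculations requested in parts a) and b) can be sketched in R as follows. This assumes a one-way ANOVA followed by Fisher's LSD with a Bonferroni-adjusted comparisonwise error rate of 0.05/3, which is how "Section 13.3" is read here; the group labels MM, MR and A are shorthand introduced for the sketch.

```r
# Scores by group, read column-wise from the table in the question.
scores <- data.frame(
  value = c(3, 4, 4, 3, 5, 5,   # Marketing Managers (MM)
            6, 6, 5, 6, 5, 5,   # Marketing Research (MR)
            6, 6, 6, 7, 6, 5),  # Advertising (A)
  group = factor(rep(c("MM", "MR", "A"), each = 6),
                 levels = c("MM", "MR", "A"))
)

# One-way ANOVA.
fit <- aov(value ~ group, data = scores)
tab <- summary(fit)[[1]]
f_stat <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]
mse <- tab[["Mean Sq"]][2]    # mean square error
df_error <- tab[["Df"]][2]    # error degrees of freedom

# Bonferroni: three pairwise comparisons share the overall alpha of 0.05.
alpha_cw <- round(0.05 / 3, 4)
lsd <- qt(1 - alpha_cw / 2, df_error) * sqrt(mse * (1 / 6 + 1 / 6))

# Pairwise absolute differences between sample means.
group_means <- tapply(scores$value, scores$group, mean)
abs_diffs <- abs(outer(group_means, group_means, "-"))
```

Any pair whose entry in `abs_diffs` exceeds `lsd` would be declared different under this procedure.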

In: Statistics and Probability

Here are some prices for randomly selected grocery items from the grocery store: Items Prices: Cheese...

Here are some prices for randomly selected grocery items from the grocery store:

Items Prices:

Cheese $3.29

Butter $4.99

Eggs $3.49

Yogurt $3.49

Juice $3.89

Tea $3.69

Chips $3.99

Soda $1.99

Pastry $2.99

Cereal $4.99

Oats $3.29

Almond Milk $2.79

Almonds $4.39

Popcorn $3.29

Crackers $3.59

Ice Cream $6.99

Cookies $2.99

Jam $3.69

Peanut Butter $3.29

Coffee $3.19

Green Tea $4.99

BBQ Sauce $2.99

Oil $6.69

Mayonnaise $4.59

Mustard $2.99

1. Compute the sample mean x̄ and the sample standard deviation of the grocery store prices.

2. State the mean µx̄ and standard deviation σx̄ of the sample mean x̄ in terms of the population mean and population standard deviation of the original variable x.

3. Construct the 95% confidence interval for the mean grocery price. Assume the population standard deviation is $1.10.

4. Compute the minimum sample size required to have a margin of error of at most $0.30 while keeping the confidence level at 95%. Assume the population standard deviation is $1.10.

5. Construct the 95% confidence interval for the mean grocery store price. (This time, assume that the standard deviation σx is unknown.)

6. Suppose the mean grocery price for Safeway is known to be $4.10. Test the hypothesis that the mean grocery price for this grocery store differs from the mean price at the other grocery store, at significance level α = 0.10. Assume the population standard deviation is $1.10.

7. Retest the hypothesis that the mean grocery price for this grocery store differs from the mean price at the other grocery store, at significance level α = 0.10. (This time, assume that the standard deviation σx is unknown.)

This is for the purposes of checking answers and comparing work. Thank you.
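
For comparing work on items 1 and 3 through 7, a rough R sketch is given below. It assumes two-sided tests, z procedures where σ = $1.10 is stated and t procedures where σ is unknown. All variable names are illustrative.

```r
# The 25 prices listed in the question, in order.
prices <- c(3.29, 4.99, 3.49, 3.49, 3.89, 3.69, 3.99, 1.99, 2.99, 4.99,
            3.29, 2.79, 4.39, 3.29, 3.59, 6.99, 2.99, 3.69, 3.29, 3.19,
            4.99, 2.99, 6.69, 4.59, 2.99)
n <- length(prices)

# Item 1: sample mean and sample standard deviation.
xbar <- mean(prices)
s <- sd(prices)

# Item 3: 95% z interval with sigma taken as known.
sigma <- 1.10
z <- qnorm(0.975)
ci_z <- xbar + c(-1, 1) * z * sigma / sqrt(n)

# Item 4: smallest n with margin of error at most 0.30.
n_min <- ceiling((z * sigma / 0.30)^2)

# Item 5: 95% t interval with sigma unknown.
ci_t <- t.test(prices, conf.level = 0.95)$conf.int

# Item 6: two-sided z test of H0: mu = 4.10 (two-sided is an assumption).
mu0 <- 4.10
z_stat <- (xbar - mu0) / (sigma / sqrt(n))
p_z <- 2 * pnorm(-abs(z_stat))

# Item 7: the same hypothesis with a t test.
t_res <- t.test(prices, mu = mu0)
p_t <- t_res$p.value
```

Each p-value is then compared with α = 0.10 to reach the decision in items 6 and 7.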

In: Statistics and Probability

Making Comparisons Is it fair to compare an individual to a population of individuals? Why or...

Making Comparisons

Is it fair to compare an individual to a population of individuals? Why or why not?

Is it fair to compare two groups of individuals? Why or why not?

Is it fair to compare two different countries on differing levels of happiness for example? Why or why not?

(Hint: these are not all “unfair” otherwise we would not be able to make any comparisons, unless you believe that we should not make any comparisons. If this is so, please defend yourself).

In: Statistics and Probability

Q.1 In a study of binge drinking among undergraduates at Ohio University, a researcher was interested...

Q.1 In a study of binge drinking among undergraduates at Ohio University, a researcher was interested in gender differences as related to binge drinking and to drinking-related arrests. She wanted to know two things: (a) Is there a significant relationship between gender and binge drinking (as defined by 5 or more drinks at one sitting), and (b) Is there a significant relationship between gender and drinking-related arrests? A random sample of males and females were asked about their experiences with binge drinking and with drinking-related arrests.

Use the numbers below for this question only!

                Binge Drinking?

                YES        NO

   Male         292        251

   Female       276        268


What would the expected value for the "male-yes" cell be?

Q2. In a study of binge drinking among undergraduates at Ohio University, a researcher was interested in gender differences as related to binge drinking and to drinking-related arrests. She wanted to know two things: (a) Is there a significant relationship between gender and binge drinking (as defined by 5 or more drinks at one sitting), and (b) Is there a significant relationship between gender and drinking-related arrests? A random sample of males and females were asked about their experiences with binge drinking and with drinking-related arrests.

Use the numbers below for this question only!

                Binge Drinking?

                YES        NO

   Male         225        274

   Female       275        214


What would the expected value for the "female-no" cell be?

Q3. In a study of binge drinking among undergraduates at Ohio University, a researcher was interested in gender differences as related to binge drinking and to drinking-related arrests. She wanted to know two things: (a) Is there a significant relationship between gender and binge drinking (as defined by 5 or more drinks at one sitting), and (b) Is there a significant relationship between gender and drinking-related arrests? A random sample of males and females were asked about their experiences with binge drinking and with drinking-related arrests. Test for a relationship in the following data:

Use the numbers below for this question only!

                Binge Drinking?

                YES        NO

   Male         54         25

   Female       22         46


What is the calculated chi-squared value?

Q4. Using a critical value of 3.84, and based on the obtained chi-square value in Question 3, is there a significant relationship between gender and binge drinking?

Yes

No
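
The expected counts and the chi-square statistic in Q1 through Q4 follow the usual rule, expected = (row total × column total) / grand total. A short R sketch is shown below; the helper `expected_counts` is a name made up for this example, and `correct = FALSE` is an assumption that the hand calculation does not use Yates' continuity correction.

```r
# Build a 2x2 table from counts given row by row (Male, then Female).
make_table <- function(counts) {
  matrix(counts, nrow = 2, byrow = TRUE,
         dimnames = list(Gender = c("Male", "Female"),
                         Binge = c("Yes", "No")))
}

# Expected count for every cell: row total * column total / grand total.
expected_counts <- function(tab) {
  outer(rowSums(tab), colSums(tab)) / sum(tab)
}

binge_q1 <- make_table(c(292, 251, 276, 268))
binge_q2 <- make_table(c(225, 274, 275, 214))
binge_q3 <- make_table(c(54, 25, 22, 46))

male_yes_q1 <- expected_counts(binge_q1)[1, 1]   # Q1: "male-yes" cell
female_no_q2 <- expected_counts(binge_q2)[2, 2]  # Q2: "female-no" cell

# Q3: Pearson chi-square without continuity correction (assumption).
q3_test <- chisq.test(binge_q3, correct = FALSE)
chi_sq <- unname(q3_test$statistic)

# Q4: compare with the critical value given in the question.
significant <- chi_sq > 3.84
```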

In: Statistics and Probability

Q1. A local sports bar wanted to determine whether Ohio University students prefer a particular type...

Q1. A local sports bar wanted to determine whether Ohio University students prefer a particular type of food in their establishment. A sample of student responses is reproduced below. Do students prefer a particular type of bar food? Use critical value = 6.58.

Use the numbers below for this question only!

Nachos        Pizza        Chicken Wings        Cheese Sticks

   33               34                    46                          46


What would the expected value for Cheese Sticks be?

Q2. A local sports bar wanted to determine whether Ohio University students prefer a particular type of food in their establishment. A sample of student responses is reproduced below. Do students prefer a particular type of bar food? Use critical value = 6.58.

Use the numbers below for this question only!

Nachos        Pizza        Chicken Wings        Cheese Sticks

   44               40                    42                          43


What is the calculated chi-squared value?

Q.3 Using a critical value of 6.58, was there a significant preference for what students eat in a sports bar based on the obtained chi-square value in Question 2?

Yes

No
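
These three questions describe a chi-square goodness-of-fit test. The sketch below assumes the null hypothesis is equal preference across the four foods, so every expected count is the total divided by 4; the critical value 6.58 is taken from the question as given.

```r
# Q1: expected count per category under equal preference.
q1_counts <- c(Nachos = 33, Pizza = 34,
               `Chicken Wings` = 46, `Cheese Sticks` = 46)
expected_q1 <- sum(q1_counts) / length(q1_counts)

# Q2: chi-square statistic; chisq.test() defaults to equal proportions.
q2_counts <- c(Nachos = 44, Pizza = 40,
               `Chicken Wings` = 42, `Cheese Sticks` = 43)
gof <- chisq.test(q2_counts)
chi_sq <- unname(gof$statistic)

# Q3: compare with the critical value stated in the question.
significant <- chi_sq > 6.58
```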

In: Statistics and Probability

Ages of Proofreaders At a large publishing company, the mean age of proofreaders is 36.2 years...

Ages of Proofreaders At a large publishing company, the mean age of proofreaders is 36.2 years and the standard deviation is 3.7 years. Assume the variable is normally distributed. Round intermediate z-value calculations to two decimal places and the final answers to at least four decimal places.

If a proofreader from the company is randomly selected, find the probability that his or her age will be between 35.5 and 37 years.
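
A quick way to check this in R is sketched below. It rounds each z-value to two decimals first, as the question instructs, and also computes the unrounded probability for comparison. The variable names are illustrative.

```r
mu <- 36.2     # mean age
sigma <- 3.7   # standard deviation

# z-values rounded to two decimals, as the question requires.
z_low <- round((35.5 - mu) / sigma, 2)
z_high <- round((37 - mu) / sigma, 2)

# P(35.5 < X < 37) from the rounded z-values.
prob <- pnorm(z_high) - pnorm(z_low)

# Same probability without intermediate rounding, for comparison.
prob_exact <- pnorm(37, mean = mu, sd = sigma) -
  pnorm(35.5, mean = mu, sd = sigma)
```

The two results differ slightly in the third decimal place; the rounded version is the one that matches a printed z table.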

In: Statistics and Probability

A particular lake is known to be one of the best places to catch a certain type...

A particular lake is known to be one of the best places to catch a certain type of fish. In this table, x = number of fish caught in a 6-hour period. The percentage data are the percentages of fishermen who caught x fish in a 6-hour period while fishing from shore.

x      0      1      2      3      4 or more
%      43%    35%    15%    6%     1%

(b) Find the probability that a fisherman selected at random fishing from shore catches one or more fish in a 6-hour period. (Enter a number. Round your answer to two decimal places.)

(c) Find the probability that a fisherman selected at random fishing from shore catches two or more fish in a 6-hour period. (Enter a number. Round your answer to two decimal places.)

(d) Compute μ, the expected value of the number of fish caught per fisherman in a 6-hour period (round 4 or more to 4). (Enter a number. Round your answer to two decimal places.) μ = _____ fish

(e) Compute σ, the standard deviation of the number of fish caught per fisherman in a 6-hour period (round 4 or more to 4). (Enter a number. Round your answer to three decimal places.) σ = _____ fish
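
Parts (b) through (e) all come from the same discrete distribution, with μ = Σ x·P(x) and σ = sqrt(Σ (x − μ)²·P(x)). A minimal R sketch, treating "4 or more" as exactly 4 as the question directs:

```r
x <- 0:4                              # number of fish ("4 or more" as 4)
p <- c(0.43, 0.35, 0.15, 0.06, 0.01)  # probabilities from the table

p_one_or_more <- sum(p[x >= 1])       # part (b)
p_two_or_more <- sum(p[x >= 2])       # part (c)

mu <- sum(x * p)                      # part (d): expected value
sigma <- sqrt(sum((x - mu)^2 * p))    # part (e): standard deviation
```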

In: Statistics and Probability

What does it mean to say that we are going to use a sample to draw...

What does it mean to say that we are going to use a sample to draw an inference about a population? Why is a random sample so important for this process? If we wanted a random sample of students in the cafeteria, why couldn’t we just choose the students who order Diet Pepsi with their lunch? Comment on the statement, “A random sample is like a miniature population, whereas samples that are not random are likely to be biased.” Why would the students who order Diet Pepsi with lunch not be a random sample of students in the cafeteria?

In: Statistics and Probability

What are the differences between MANOVA and discriminant analysis? What situations best suit each multivariate technique?

What are the differences between MANOVA and discriminant analysis? What situations best suit each multivariate technique?

In: Statistics and Probability

In 2-3 paragraphs and in your own words, describe missing data and how to deal with...

In 2-3 paragraphs and in your own words, describe missing data and how to deal with it.

In: Statistics and Probability

A new design for the braking system on a certain type of car has been proposed....

A new design for the braking system on a certain type of car has been proposed. For the current system, the true average braking distance at 40 mph under specified conditions is known to be 120 ft. It is proposed that the new design be implemented only if sample data strongly indicates a reduction in true average braking distance for the new design.

(a) Define the parameter of interest.

  • μ = true average braking distance for the new design
  • μ = true average braking distance for the old design
  • p = true proportion of cars whose braking distances reduced
  • p = true proportion of cars whose braking distances did not reduce


State the relevant hypotheses.

  • H0: p = 120, Ha: p < 120
  • H0: μ = 120, Ha: μ ≠ 120
  • H0: μ = 120, Ha: μ > 120
  • H0: μ = 120, Ha: μ < 120
  • H0: p = 120, Ha: p ≠ 120


(b) Suppose braking distance for the new system is normally distributed with σ = 11. Let X̄ denote the sample average braking distance for a random sample of 36 observations. Which values of x̄ are more contradictory to H0 than 117.2?

  • x̄ ≥ 117.2
  • x̄ ≤ 117.2


What is the P-value in this case? (Round your answer to four decimal places.)


What conclusion is appropriate if α = 0.10?

  • The new design does have a mean braking distance less than 120 feet at 40 mph.
  • The new design does not have a mean braking distance less than 120 feet at 40 mph.


(c) What is the probability that the new design is not implemented when its true average braking distance is actually 115 ft and the test from part (b) is used? (Round your answer to four decimal places.)


You may need to use the appropriate table in the Appendix of Tables to answer this question.
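
As an alternative to the table lookup, the P-value in part (b) and the type II error probability in part (c) can be sketched in R. This assumes the lower-tailed z test of H0: μ = 120 against Ha: μ < 120 with σ = 11 and n = 36, and a level α = 0.10 test for part (c). Table-based answers may differ slightly because z is rounded to two decimals there.

```r
mu0 <- 120     # braking distance under H0
sigma <- 11    # known standard deviation
n <- 36
xbar <- 117.2  # observed sample mean
alpha <- 0.10

se <- sigma / sqrt(n)     # standard error of the sample mean
z <- (xbar - mu0) / se    # test statistic
p_value <- pnorm(z)       # lower-tailed P-value
reject <- p_value <= alpha

# Part (c): H0 is rejected when the sample mean is at or below this cutoff.
cutoff <- mu0 + qnorm(alpha) * se

# Probability of NOT rejecting when the true mean is 115 (type II error).
beta <- 1 - pnorm((cutoff - 115) / se)
```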

In: Statistics and Probability

Describe an application of multiple discriminant analysis that is specific to your industry or to your...

Describe an application of multiple discriminant analysis that is specific to your industry or to your academic interests (Data Science). Explain why this technique is suitable in terms of the measurement scale of the variables and their roles.

In: Statistics and Probability

Describe an application of multiple regression analysis that is specific to your industry or to your...

Describe an application of multiple regression analysis that is specific to your industry or to your academic interests (Data Science). Explain why this technique is suitable in terms of the measurement scale of the variables and their roles.

In: Statistics and Probability