In your own words, describe an event. How would you explain an intersection and a complement to someone who has never read the chapter introducing probability? Because your reader has not seen these concepts before, support your explanation with examples. The examples cannot come from the textbook; create your own.
In: Statistics and Probability
Briefly describe the specific characteristics of the normal curve.
Conceptually, where does it come from?
What human behavior, trait, or characteristic can you think of that is normally distributed?
Can you think of one that isn't? Remember you cannot use categorical (nominal or ordinal) level measurements for this example.
In: Statistics and Probability
In the United States, voters who are neither Democrat nor Republican are called Independent. It is believed that 12% of voters are Independent. A survey asked 30 people to identify themselves as Democrat, Republican, or Independent.
A. What is the probability that none of the people are Independent? Probability =
B. What is the probability that fewer than 6 are Independent? Probability =
C. What is the probability that more than 2 people are Independent? Probability =
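One way to set up parts A to C is as a binomial calculation, a sketch that assumes the 30 respondents answer independently and each is Independent with probability 0.12. The helper names `binom_pmf` and `binom_cdf` are illustrative, not from the question.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 30, 0.12  # 30 respondents, 12% Independent (from the question)

p_none = binom_pmf(0, n, p)              # A: P(X = 0)
p_fewer_than_6 = binom_cdf(5, n, p)      # B: P(X < 6) = P(X <= 5)
p_more_than_2 = 1 - binom_cdf(2, n, p)   # C: P(X > 2) = 1 - P(X <= 2)

print(f"A: {p_none:.4f}  B: {p_fewer_than_6:.4f}  C: {p_more_than_2:.4f}")
```

Note that "fewer than 6" stops at 5 and "more than 2" is handled through the complement of "at most 2".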
In: Statistics and Probability
A high school teacher hypothesizes a negative relationship
between performance in exams and performance in presentations. To
examine this, the teacher computes a correlation of -0.23 from a
random sample of 29 students from class. What can the teacher
conclude with α = 0.01?
a) Obtain/compute the appropriate values to make a
decision about H0.
(Hint: Make sure to write down the null and alternative hypotheses
to help solve the problem.)
critical value = ; test statistic =
Decision: ---Select one--- (Reject H0, Fail to reject
H0)
b) Compute the corresponding effect size(s) and
indicate magnitude(s).
If not appropriate, input and/or select "na" below.
effect size = ; ---Select one--- (na,
trivial effect, small effect, medium effect, large effect)
c) Make an interpretation based on the
results.
a. There is a significant positive relationship between performance in exams and performance in presentations.
b. There is a significant negative relationship between performance in exams and performance in presentations.
c. There is no significant relationship between performance in exams and performance in presentations.
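The decision in part a) can be sketched with the usual t transformation of a correlation. This assumes a one-tailed test (the teacher hypothesizes a negative relationship) and takes the critical value from a printed t table rather than computing it; some courses compare r against a critical r instead, which leads to the same decision.

```python
from math import sqrt

r, n = -0.23, 29   # sample correlation and sample size from the question
df = n - 2         # degrees of freedom for testing rho = 0

# t statistic for H0: rho = 0 against H1: rho < 0
t_stat = r * sqrt(df) / sqrt(1 - r**2)

# Assumption: one-tailed critical value for alpha = 0.01, df = 27, read from a t table
t_crit = -2.473

reject_h0 = t_stat < t_crit
r_squared = r**2   # proportion of shared variance, one common effect size for r

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}, r^2 = {r_squared:.4f}")
```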
In: Statistics and Probability
1. Given are five observations for two variables, x and y.
xi | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
yi | 4 | 7 | 4 | 10 | 15 |
(d) Develop the estimated regression equation by computing the values of b0 and b1 using \( b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \) and \( b_0 = \bar{y} - b_1 \bar{x} \).
(e) Use the estimated regression equation to predict the value of y when x = 2.
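The formulas in part (d) translate almost line for line into code. The sketch below applies them to the five observations given; the names `sxy`, `sxx`, and `predict` are illustrative.

```python
x = [1, 2, 3, 4, 5]
y = [4, 7, 4, 10, 15]

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Numerator and denominator of the slope formula
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

def predict(x_new):
    """Predicted y from the estimated regression equation."""
    return b0 + b1 * x_new

print(f"y-hat = {b0:.1f} + {b1:.1f}x; prediction at x = 2: {predict(2):.1f}")
```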
2. Companies in the U.S. car rental market vary greatly in terms of the size of the fleet, the number of locations, and annual revenue. In 2011, Hertz had 320,000 cars in service and annual revenue of approximately $4.2 billion. Suppose the following data show the number of cars in service (1,000s) and the annual revenue ($ millions) for six smaller car rental companies.
Company | Cars (1,000s) | Revenue ($ millions) |
---|---|---|
Company A | 11.5 | 118 |
Company B | 10.0 | 135 |
Company C | 9.0 | 100 |
Company D | 5.5 | 37 |
Company E | 4.2 | 42 |
Company F | 3.3 | 30 |
(c) Use the least squares method to develop the estimated regression equation that can be used to predict annual revenue (in $ millions) given the number of cars in service (in 1,000s). (Round your numerical values to three decimal places.)
ŷ =_____
(d) For every additional car placed in service, estimate how much annual revenue will change (in dollars). (Round your answer to the nearest integer.)
Annual revenue will increase by $ ___ , for every additional car placed in service.
(e) A particular rental company has 6,000 cars in service. Use the estimated regression equation developed in part (c) to predict annual revenue (in $ millions) for this company. (Round your answer to the nearest integer.)
$ __ million
In: Statistics and Probability
Mean and Standard Deviation of Prices of Six HIV Drugs
Drug | Mean Price | Standard Deviation in Price |
---|---|---|
A | $3,200 | $856 |
B | $6,800 | $925 |
C | $11,352 | $754 |
D | $3,945 | $920 |
E | $2,120 | $645 |
F | $4,856 | $540 |
√n
and 95% confidence intervals for the mean drug prices for each drug group. Based on your calculated estimates, which of the drugs have statistically significantly different prices?
In: Statistics and Probability
In a large university, 20% of the students are male. If a random sample of twenty-two students is selected:
a. What is the probability that the sample contains exactly twelve male students?
b. What is the probability that the sample will contain no male students?
c. What is the probability that the sample will contain exactly twenty female students?
d. What is the probability that the sample will contain more than nine male students?
e. What is the probability that the sample will contain fewer than ten male students?
f. What is the expected number of male students?
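Treating the number of male students as binomial with n = 22 and p = 0.20 (an assumption that is reasonable when the university is large relative to the sample), the six parts reduce to a few sums. The sketch highlights two details that are easy to miss: twenty females means two males, and parts d and e are complements.

```python
from math import comb

n, p = 22, 0.20  # sample size and proportion of male students (from the question)

def pmf(k):
    """P(exactly k male students in the sample)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_exactly_12 = pmf(12)                             # a
p_no_males = pmf(0)                                # b
p_20_females = pmf(2)                              # c: 20 females means 2 males
p_fewer_than_10 = sum(pmf(k) for k in range(10))   # e: P(X <= 9)
p_more_than_9 = 1 - p_fewer_than_10                # d: complement of e
expected_males = n * p                             # f: mean of a binomial

for label, value in [("a", p_exactly_12), ("b", p_no_males), ("c", p_20_females),
                     ("d", p_more_than_9), ("e", p_fewer_than_10), ("f", expected_males)]:
    print(f"{label}: {value:.6f}")
```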
In: Statistics and Probability
In the 2015 AFC Championship game, there was a charge that the New England Patriots deflated their footballs
for an advantage. The balls should be inflated to between 12.5 and 13.5 pounds per square inch. The actual
measurements for this game are listed below.
11.50 10.85 11.15 10.70 11.10 12.60 12.55 11.10 10.95 10.50 10.90
Use a significance level to test the claim that the population mean is less than 12.5 psi. State
clearly whether the Patriots’ balls are deflated or not. (You may treat the measurements as a random
sample from a normal population.)
a) Calculate the sample mean and the sample standard deviation.
b) State the hypotheses.
d) Find the P-value and the critical value(s). Label the value(s) you found on the sketches below.
e) State the initial conclusion regarding the null hypothesis.
f) State the final conclusion in your own words that addresses the original claim.
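Parts a) and the test statistic can be sketched as a one-sample t computation with H0: mu = 12.5 against H1: mu < 12.5. The significance level is not stated in the question, so the sketch stops at the statistic and its degrees of freedom; the critical value and P-value would come from a t table or statistical software.

```python
from math import sqrt
from statistics import mean, stdev

psi = [11.50, 10.85, 11.15, 10.70, 11.10, 12.60, 12.55, 11.10, 10.95, 10.50, 10.90]
mu_0 = 12.5  # hypothesized mean under H0

n = len(psi)
x_bar = mean(psi)
s = stdev(psi)                 # sample standard deviation (n - 1 in the denominator)
standard_error = s / sqrt(n)
t_stat = (x_bar - mu_0) / standard_error
df = n - 1

print(f"mean = {x_bar:.4f}, s = {s:.4f}, t = {t_stat:.3f}, df = {df}")
```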
In: Statistics and Probability
McAllister et al. (2012) compared varsity football and hockey players with varsity athletes from noncontact sports to determine whether exposure to head impacts during one season has an effect on cognitive performance. In their study, tests of new learning performance were significantly poorer for the contact sport athletes compared to the noncontact sport athletes.
The following presents data similar to the results obtained. Data are the scores of a neurological test. Higher scores indicate better performance.
Type of sport
Noncontact Sport Athletes | Contact Sport Athletes |
---|---|
10, 8, 7, 9, 13, 7, 6, 12 | 7, 4, 9, 3, 7, 6, 10, 2 |
Computations by hand
a. What are IV and DV in this study?
b. Compute the mean and the standard deviation for each condition (show your work). (use the definition formula to compute each SS)
c. Are the neurological test scores for contact sport athletes significantly different from the neurological test scores for noncontact sport athletes? Use a two-tailed test with α = .05.
1) State the null hypothesis in words and in a statistical form.
2) State the alternative hypothesis in words and a statistical form.
3) Compute the appropriate statistic to test the hypotheses. Sketch the distribution
with the estimated standard error and locate the critical region(s) with the critical value(s).
4) State your statistical decision
5) Compute Cohen’s d. Interpret what the d really means in this context.
6) Compute 95% CI (2).
7) What is your conclusion? Interpret the results. Do not forget to include a statistical form (e.g., t-score, df, α, Cohen’s d)
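The hand computations in parts b and c can be checked with a short script. This is a sketch of the pooled-variance independent-samples t test, which is the usual choice when SS is computed per group; the critical value is an assumption taken from a t table, and `mean_and_ss` is an illustrative helper name.

```python
from math import sqrt

noncontact = [10, 8, 7, 9, 13, 7, 6, 12]
contact = [7, 4, 9, 3, 7, 6, 10, 2]

def mean_and_ss(scores):
    """Return the mean and the sum of squared deviations (definitional formula)."""
    m = sum(scores) / len(scores)
    ss = sum((x - m) ** 2 for x in scores)
    return m, ss

m1, ss1 = mean_and_ss(noncontact)
m2, ss2 = mean_and_ss(contact)
n1, n2 = len(noncontact), len(contact)
df = n1 + n2 - 2

pooled_var = (ss1 + ss2) / df
se = sqrt(pooled_var / n1 + pooled_var / n2)   # estimated standard error
t_stat = (m1 - m2) / se
cohens_d = (m1 - m2) / sqrt(pooled_var)

# Assumption: two-tailed critical value for alpha = .05, df = 14, read from a t table
t_crit = 2.145
ci = ((m1 - m2) - t_crit * se, (m1 - m2) + t_crit * se)

print(f"M1 = {m1}, SS1 = {ss1}, M2 = {m2}, SS2 = {ss2}")
print(f"t({df}) = {t_stat:.3f}, d = {cohens_d:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The standard deviation for each condition follows from the SS values as the square root of SS / (n - 1).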
In: Statistics and Probability
According to a 2017 AAA survey, 35% of Americans planned to take a family vacation (a vacation more than
50 miles from home involving two or more immediate family members). Suppose a recent survey of 300
randomly selected Americans found that 115 planned on taking a family vacation. Use a significance level
to test the claim that the proportion of Americans planning a family vacation has changed since 2017.
a) State the hypotheses.
b) Calculate the test statistic.
c) Find the P-value and the critical value(s). Label the value(s) you found on the sketches below.
d) State the initial conclusion regarding the null hypothesis.
e) State the final conclusion in your own words that addresses the original claim.
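Parts a) to c) follow the one-proportion z test, sketched below with H0: p = 0.35 against H1: p ≠ 0.35 (two-tailed, because the claim is that the proportion has changed). The significance level is not given in the question, so the sketch reports the P-value and leaves the comparison to the reader.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.35        # proportion under H0 (the 2017 survey value)
n, x = 300, 115  # sample size and number planning a family vacation
p_hat = x / n

# The standard error uses p0 because the test assumes H0 is true
se = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se

# Two-tailed P-value from the standard normal distribution
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"p-hat = {p_hat:.4f}, z = {z:.3f}, P-value = {p_value:.4f}")
```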
In: Statistics and Probability
This is an exploratory problem intended to introduce the idea of
curvilinear regression. Personally, I was a bit shocked to discover
that multiple LINEAR regression is the main vehicle to calculate
regressions for data with nonlinear relationships...sounds a bit
counter-intuitive. However, if we think of the higher-power terms
(quadratic, cubic, etc.) as distinct variables, the ideas work well
together.
Here is a data set for students in a gifted program. The first
score (X1 = GPA) is the students’ math grade from last year,
and the second score (Y = SAT) is their SAT-M score. As this is
a non-representative group (when considering the population of all
students taking math classes in high school), it is not unexpected
to see range-restriction effects (generally all high performing,
few lower performing representatives) or ceiling effects (maximum
score on the SAT-M is 800). In data such as this, it is not
uncommon to see non-linear trends.
GPA | SAT |
---|---|
3.2 | 760 |
3.8 | 775 |
3 | 760 |
2.8 | 745 |
4 | 770 |
3.5 | 760 |
3.1 | 760 |
3.2 | 770 |
3.3 | 765 |
3.5 | 765 |
3.5 | 755 |
3.3 | 760 |
3.6 | 765 |
2.9 | 750 |
2.1 | 725 |
3.2 | 765 |
3.4 | 770 |
3.8 | 765 |
2.2 | 720 |
2.8 | 760 |
2.8 | 755 |
3.6 | 755 |
3.6 | 770 |
3.5 | 765 |
3.4 | 770 |
Step 1: Copy the data into your preferred
statistical software program. Change the variable names to GPA and
SAT if need be. Before doing any analysis, look at a scatterplot of
the data with GPA on the horizontal axis and SAT on the vertical
axis. Be sure to note any trends.
The following includes information for Excel users. If you are
not using Excel, please disregard.
Step 2: Run a regression (Data Analysis >
Regression) with SAT as the Y variable and GPA as the X variable. Again, be sure to note what
evidence supports the assumptions for a regression analysis. Report
the regression equation and the requested statistics:
SAT = ___ + ___ × GPA
(Report regression coefficients accurate to 3 decimal
places.)
R²adj = ___
(Report accurate to 3 decimal places.)
Step 3: Create a third variable called GPAsq (for
squared GPA). In Excel, use a formula, something like =B1^2 and
fill down the rest of the column.
Step 4: Run the quadratic regression by adding the
independent variable GPAsq to the model. Report the regression
equation and the requested statistics:
SAT = ___ + ___ × GPA + ___ × GPA²
(Report regression coefficients accurate to 3 decimal
places.)
R²adj = ___
(Report accurate to 3 decimal places.)
Step 5: Notice how the adjusted coefficient of
multiple determination changed from the bivariate regression to the
quadratic (multiple) regression. The next step is to determine if
this more complicated model is statistically significantly better
than the more parsimonious linear model.
For the multiple regression model, what was the F-ratio
and the resulting P-value?
Fmodel = ___
(Report accurate to 2 decimal places.)
P = ___
(Report accurate to 3 decimal places.)
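For readers not using Excel, Steps 2 to 5 can be sketched in plain Python. This is not the Excel procedure the problem describes: it solves the normal equations directly, with GPA² entered as a second predictor column exactly as Step 3 suggests. The helper names `solve` and `fit` are illustrative. The P-value for the F-ratio is omitted because the F distribution is not in the standard library.

```python
gpa = [3.2, 3.8, 3.0, 2.8, 4.0, 3.5, 3.1, 3.2, 3.3, 3.5, 3.5, 3.3, 3.6,
       2.9, 2.1, 3.2, 3.4, 3.8, 2.2, 2.8, 2.8, 3.6, 3.6, 3.5, 3.4]
sat = [760, 775, 760, 745, 770, 760, 760, 770, 765, 765, 755, 760, 765,
       750, 725, 765, 770, 765, 720, 760, 755, 755, 770, 765, 770]

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit(columns, y):
    """Least squares fit of y on an intercept plus the given predictor columns.

    Returns (coefficients, R squared, adjusted R squared)."""
    n, k = len(y), len(columns)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k + 1)]
           for p in range(k + 1)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return beta, r2, adj_r2

gpa_sq = [g ** 2 for g in gpa]                       # Step 3: squared GPA

beta_lin, r2_lin, adj_lin = fit([gpa], sat)          # Step 2: linear model
beta_quad, r2_quad, adj_quad = fit([gpa, gpa_sq], sat)  # Step 4: quadratic model

# Step 5: overall F-ratio for the quadratic model (k = 2 predictors)
n, k = len(sat), 2
f_model = (r2_quad / k) / ((1 - r2_quad) / (n - k - 1))

print("linear:   ", [round(b, 3) for b in beta_lin], round(adj_lin, 3))
print("quadratic:", [round(b, 3) for b in beta_quad], round(adj_quad, 3))
print("F_model:  ", round(f_model, 2))
```

Treating GPA² as just another column is what lets an ordinary multiple linear regression fit a curved relationship.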
In: Statistics and Probability
Construct a 90% confidence interval of the average gasoline price per gallon based on a recent sample of 48 stations. The sample mean was $2.63. Assume the population standard deviation is $0.31.
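Because the population standard deviation is given, a z interval applies. A minimal sketch, using `NormalDist` from the standard library to look up the critical value instead of a printed table:

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma = 48, 2.63, 0.31   # values from the question
confidence = 0.90

# Critical value leaving (1 - confidence) / 2 in each tail
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"z = {z:.3f}, margin = {margin:.4f}, CI = (${lower:.3f}, ${upper:.3f})")
```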
In: Statistics and Probability
Find the value of z such that 0.03 of the area lies to the left of z. Round your answer to two decimal places.
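A left-tail area maps to a z value through the inverse of the standard normal CDF. As a quick check on a table lookup, assuming Python's standard library is available:

```python
from statistics import NormalDist

left_area = 0.03
z = NormalDist().inv_cdf(left_area)   # z with 3% of the area to its left

print(round(z, 2))
```

The result is negative, as expected for any left-tail area below 0.5.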
In: Statistics and Probability
Suppose that Phil and Jen each have a bag containing 10 distinguishable objects. The two bags have the same content. Phil and Jen randomly choose 3 objects each from their bags. Consider the following random variables:
X: the number of objects chosen by both Phil and Jen
Y: the number of objects not chosen by either Phil or Jen
Z: the number of objects chosen by exactly one of Phil and Jen
Compute E(X), E(Y), and E(XZ).
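One way to check the three expectations is to note that, once Phil's 3 objects are fixed, X is hypergeometric, and that Y and Z are both determined by X: the two draws cover 6 − X distinct objects, so Y = 10 − (6 − X) and Z = 6 − 2X. The sketch below sums over the four possible values of X with exact fractions; it assumes each person draws without replacement and that all 3-object subsets are equally likely.

```python
from fractions import Fraction
from math import comb

N, K = 10, 3  # objects per bag, objects each person draws

def p_overlap(k):
    """P(X = k): Jen's 3 picks include exactly k of the 3 objects Phil picked."""
    return Fraction(comb(K, k) * comb(N - K, K - k), comb(N, K))

e_x = e_y = e_xz = Fraction(0)
for k in range(K + 1):
    x = k
    y = N - (2 * K - k)   # 10 minus the number of distinct objects chosen
    z = 2 * K - 2 * k     # objects chosen by exactly one person
    e_x += p_overlap(k) * x
    e_y += p_overlap(k) * y
    e_xz += p_overlap(k) * x * z

print(e_x, e_y, e_xz)
```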
In: Statistics and Probability
Random Sample Selection 40 out of 60 students
What value is two standard deviations above the mean?
What value is 1.5 standard deviations below the mean?
Construct a histogram displaying your data.
In complete sentences, describe the shape of your graph.
Do you notice any potential outliers? If so, what values are they?
Show your work in how you used the potential outlier formula to
determine whether or not the values might be outliers.
Construct a box plot displaying your data.
Does the middle 50% of the data appear to be concentrated together
or spread apart? Explain how you determined this.
Looking at both the histogram and the box plot, discuss the
distribution of your data.
# of pencils | Frequency | Cumulative Frequency | Relative Frequency | Cumulative Relative Frequency |
---|---|---|---|---|
0 | 5 | 5 | 0.125 | 0.125 |
1 | 14 | 19 | 0.35 | 0.475 |
2 | 10 | 29 | 0.25 | 0.725 |
3 | 7 | 36 | 0.175 | 0.90 |
4 | 1 | 37 | 0.025 | 0.925 |
5 | 0 | 37 | 0 | 0.925 |
6 | 0 | 37 | 0 | 0.925 |
7 | 1 | 38 | 0.025 | 0.95 |
8 | 1 | 39 | 0.025 | 0.975 |
9 | 0 | 39 | 0 | 0.975 |
10 | 1 | 40 | 0.025 | 1 |
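The standard deviation and outlier questions can be answered from the frequency table by expanding it back into the 40 raw values. The sketch assumes the sample standard deviation (n − 1 denominator) and the common 1.5 × IQR rule for potential outliers, with quartiles taken as the medians of the lower and upper halves; other quartile conventions can give slightly different fences.

```python
from statistics import mean, median, stdev

# Frequency table: number of pencils -> number of students
freq = {0: 5, 1: 14, 2: 10, 3: 7, 4: 1, 5: 0, 6: 0, 7: 1, 8: 1, 9: 0, 10: 1}
data = sorted(value for value, count in freq.items() for _ in range(count))

x_bar, s = mean(data), stdev(data)
two_sd_above = x_bar + 2 * s
one_and_half_sd_below = x_bar - 1.5 * s

# Quartiles as medians of the lower and upper halves of the sorted data
half = len(data) // 2
q1, q3 = median(data[:half]), median(data[-half:])
iqr = q3 - q1
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = sorted({v for v in data if v < lower_fence or v > upper_fence})

print(f"mean = {x_bar:.2f}, s = {s:.3f}")
print(f"2 SD above = {two_sd_above:.3f}, 1.5 SD below = {one_and_half_sd_below:.3f}")
print(f"Q1 = {q1}, Q3 = {q3}, fences = ({lower_fence}, {upper_fence}), outliers = {outliers}")
```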
In: Statistics and Probability