Suppose a certain state university's college of business obtained the following results on the salaries of a recent graduating class:
Finance Majors | Business Analytics Majors |
---|---|
n1 = 140 | n2 = 30 |
x̄1 = $48,437 | x̄2 = $55,217 |
s1 = $19,000 | s2 = $10,000 |
(a)
Formulate hypotheses so that, if the null hypothesis is rejected, we can conclude that salaries for Finance majors are significantly lower than the salaries of Business Analytics majors. Use α = 0.05. (Let μ1 = the population mean salary for Finance majors, and let μ2 = the population mean salary for Business Analytics majors. Enter != for ≠ as needed.)
H0:
Ha:
(b)
What is the value of the test statistic? (Use μ1 − μ2. Round your answer to three decimal places.)
(c)
What is the p-value? (Round your answer to four decimal places.)
p-value =
(d)
What is your conclusion?
Reject H0. We can conclude that salaries for Finance majors are significantly lower than the salaries of Business Analytics majors.
Reject H0. We cannot conclude that salaries for Finance majors are significantly lower than the salaries of Business Analytics majors.
Do not reject H0. We cannot conclude that salaries for Finance majors are significantly lower than the salaries of Business Analytics majors.
Do not reject H0. We can conclude that salaries for Finance majors are significantly lower than the salaries of Business Analytics majors.
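One way to check parts (b) through (d) is to compute the unpooled (Welch) two-sample t statistic directly from the summary statistics. This is a sketch, assuming the intended hypotheses are H0: μ1 − μ2 ≥ 0 and Ha: μ1 − μ2 < 0 and that the variances are not pooled; the variable names are illustrative only.

```r
# Summary statistics copied from the table in the question
n1 <- 140; xbar1 <- 48437; s1 <- 19000   # Finance majors
n2 <- 30;  xbar2 <- 55217; s2 <- 10000   # Business Analytics majors

# Per-sample variance of the mean, and the unpooled standard error
v1 <- s1^2 / n1
v2 <- s2^2 / n2
se <- sqrt(v1 + v2)

# Test statistic for Ha: mu1 - mu2 < 0 (assumed direction)
t_stat <- (xbar1 - xbar2) / se

# Welch-Satterthwaite degrees of freedom (not rounded here)
df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Lower-tail p-value
p_value <- pt(t_stat, df)

round(c(t = t_stat, df = df, p = p_value), 4)
p_value < 0.05   # TRUE means reject H0 at alpha = 0.05
```

Some textbooks round the degrees of freedom down to a whole number before looking up the p-value, which can change the fourth decimal place slightly.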
In: Math
A consumer product testing organization uses a survey of readers to obtain customer satisfaction ratings for the nation's largest supermarkets. Each survey respondent is asked to rate a specified supermarket based on a variety of factors such as: quality of products, selection, value, checkout efficiency, service, and store layout. An overall satisfaction score summarizes the rating for each respondent with 100 meaning the respondent is completely satisfied in terms of all factors. Suppose sample data representative of independent samples of two supermarkets' customers are shown below.
Supermarket 1 | Supermarket 2 |
---|---|
n1 = 260 | n2 = 300 |
x̄1 = 82 | x̄2 = 81 |
(a)
Formulate the null and alternative hypotheses to test whether there is a difference between the population mean customer satisfaction scores for the two retailers. (Let μ1 = the population mean satisfaction score for Supermarket 1's customers, and let μ2 = the population mean satisfaction score for Supermarket 2's customers. Enter != for ≠ as needed.)
H0:
Ha:
Assume that experience with the satisfaction rating scale indicates that a population standard deviation of 17 is a reasonable assumption for both retailers. Conduct the hypothesis test.
Calculate the test statistic. (Use μ1 − μ2. Round your answer to two decimal places.)
Report the p-value. (Round your answer to four decimal places.)
p-value =
At a 0.05 level of significance what is your conclusion?
Reject H0. There is not sufficient evidence to conclude that the population mean satisfaction scores differ for the two retailers.
Do not reject H0. There is not sufficient evidence to conclude that the population mean satisfaction scores differ for the two retailers.
Do not reject H0. There is sufficient evidence to conclude that the population mean satisfaction scores differ for the two retailers.
Reject H0. There is sufficient evidence to conclude that the population mean satisfaction scores differ for the two retailers.
Which retailer, if either, appears to have the greater customer satisfaction?
Supermarket 1
Supermarket 2
neither
Provide a 95% confidence interval for the difference between the population mean customer satisfaction scores for the two retailers. (Use x̄1 − x̄2. Round your answers to two decimal places.)
________ to ________
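Because a common population standard deviation of 17 is given, a two-sample z test seems to be what the question intends. The sketch below assumes a two-sided test (H0: μ1 − μ2 = 0, Ha: μ1 − μ2 ≠ 0); the variable names are illustrative.

```r
# Summary statistics from the question
n1 <- 260; xbar1 <- 82   # Supermarket 1
n2 <- 300; xbar2 <- 81   # Supermarket 2
sigma <- 17              # assumed known for both populations

# Standard error of xbar1 - xbar2 with known sigma
se <- sigma * sqrt(1 / n1 + 1 / n2)

# z statistic and two-sided p-value
z <- (xbar1 - xbar2) / se
p_value <- 2 * pnorm(-abs(z))

# 95% confidence interval for mu1 - mu2
z_crit <- qnorm(0.975)
ci <- (xbar1 - xbar2) + c(-1, 1) * z_crit * se

round(c(z = z, p = p_value, lower = ci[1], upper = ci[2]), 4)
```

If the interval contains 0, that agrees with not rejecting H0 at the 0.05 level.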
In: Math
Let S be the square centered at the origin with sides of length 2, and C be the unit circle centered at the origin.
(a) If you randomly throw a point on S, what is the probability that it will lie in C?
Ans: 0.785
(b) Describe how you could use simulation to estimate the probability in part (a).
(c) How can you use simulation to estimate π?
For parts (b) and (c), there may be a need to generate random variables for the simulation by Box-Muller, accept/reject, or importance sampling. Any help on (b) and (c) will be appreciated!
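A minimal Monte Carlo sketch for parts (b) and (c): points uniform on the square only need two independent uniform draws on [-1, 1], so Box-Muller or accept/reject is not required here. The seed and the sample size are arbitrary choices.

```r
set.seed(1)      # arbitrary seed, only for reproducibility
n <- 100000      # number of simulated points (arbitrary)

# A uniform point on S = [-1, 1] x [-1, 1]
x <- runif(n, min = -1, max = 1)
y <- runif(n, min = -1, max = 1)

# The point lies in the unit circle C when x^2 + y^2 <= 1
inside <- x^2 + y^2 <= 1

# (b) estimate of P(point in C) = area(C) / area(S) = pi / 4
p_hat <- mean(inside)

# (c) since p = pi / 4, an estimate of pi is 4 * p_hat
pi_hat <- 4 * p_hat

# Approximate standard error of p_hat
se_hat <- sqrt(p_hat * (1 - p_hat) / n)

c(p_hat = p_hat, pi_hat = pi_hat, se = se_hat)
```

Increasing `n` shrinks the standard error at the rate 1/sqrt(n).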
In: Math
Part 1.
3.13 Overweight baggage: Suppose weights of the checked baggage of airline passengers follow a nearly normal distribution with mean 44.8 pounds and standard deviation 3.3 pounds. Most airlines charge a fee for baggage that weighs in excess of 50 pounds. Determine what percent of airline passengers incur this fee. (Round to the nearest percent.) __________
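The fee applies to the upper tail of the normal distribution, so one possible check uses `pnorm` with `lower.tail = FALSE`. This is a sketch under the stated normal model; the variable name is illustrative.

```r
# P(weight > 50) for weight ~ Normal(mean = 44.8, sd = 3.3)
p_fee <- pnorm(50, mean = 44.8, sd = 3.3, lower.tail = FALSE)

# Express as a percent, rounded to the nearest whole percent
round(100 * p_fee)
```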
Part 2.
There are two distributions for GRE scores based on the two parts of the exam. For the verbal part of the exam, the mean is 151 and the standard deviation is 7. For the quantitative part, the mean is 153 and the standard deviation is 7.67. Use this information to compute each of the following: (Round to the nearest whole number.)
a) The score of a student who scored in the 80th percentile on the Quantitative Reasoning section. ________
b) The score of a student who scored worse than 65% of the test takers in the Verbal Reasoning section. ________
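Both parts ask for a score at a given percentile, which `qnorm` returns. The sketch assumes both score distributions are normal (the question does not say so explicitly) and reads "worse than 65% of the test takers" as the 35th percentile.

```r
# a) 80th percentile of Quantitative scores, Normal(153, 7.67) assumed
quant_80 <- qnorm(0.80, mean = 153, sd = 7.67)

# b) scoring worse than 65% of test takers means 35% scored lower,
#    so this is the 35th percentile of Verbal scores, Normal(151, 7) assumed
verbal_35 <- qnorm(0.35, mean = 151, sd = 7)

round(c(quantitative = quant_80, verbal = verbal_35))
```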
Part 3.
3.10 Heights of 10 year olds: Heights of 10 year olds, regardless of gender, closely follow a normal distribution with mean 56 inches and standard deviation 5 inches.
a) What is the probability that a randomly chosen 10 year old is shorter than 47 inches? (Keep 4 decimal places.) ____________
b) What is the probability that a randomly chosen 10 year old is between 60 and 66 inches? (Keep 4 decimal places.) __________
c) If the tallest 10% of the class is considered "very tall", what is the height cutoff for "very tall"? (Keep 2 decimal places.) ________ inches
d) The height requirement for Batman the Ride at Six Flags Magic Mountain is 55 inches. What percent of 10 year olds cannot go on this ride? (Keep 2 decimal places.) _______%
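The four parts map onto `pnorm` (probabilities) and `qnorm` (cutoffs). A sketch with illustrative variable names, assuming "cannot go on this ride" means shorter than 55 inches:

```r
mu <- 56; sigma <- 5   # heights of 10 year olds, in inches

# a) P(height < 47)
p_a <- pnorm(47, mu, sigma)

# b) P(60 < height < 66), as a difference of two lower-tail areas
p_b <- pnorm(66, mu, sigma) - pnorm(60, mu, sigma)

# c) cutoff for the tallest 10%, which is the 90th percentile
cut_c <- qnorm(0.90, mu, sigma)

# d) percent shorter than the 55 inch requirement
pct_d <- 100 * pnorm(55, mu, sigma)

round(c(a = p_a, b = p_b), 4)
round(c(c = cut_c, d = pct_d), 2)
```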
Part 4.
3.12 Speeding on the I-5, Part I: The distribution of passenger vehicle speeds traveling on the Interstate 5 Freeway (I-5) in California is nearly normal with a mean of 72.1 miles/hour and a standard deviation of 5 miles/hour. (Keep 2 decimal places.)
a) What percent of passenger vehicles travel slower than 80 miles/hour? _________%
b) What percent of passenger vehicles travel between 60 and 80 miles/hour? ____________%
c) How fast do the fastest 5% of passenger vehicles travel? __________ miles/hour
d) The speed limit on this stretch of the I-5 is 70 miles/hour. Approximate what percentage of the passenger vehicles travel above the speed limit on this stretch of the I-5. __________%
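The same two functions cover this part. The sketch reads part (c) as asking for the 95th percentile, the speed that the fastest 5% exceed.

```r
mu <- 72.1; sigma <- 5   # passenger vehicle speeds, in miles/hour

# a) percent slower than 80
pct_a <- 100 * pnorm(80, mu, sigma)

# b) percent between 60 and 80
pct_b <- 100 * (pnorm(80, mu, sigma) - pnorm(60, mu, sigma))

# c) speed exceeded by the fastest 5% (95th percentile)
speed_c <- qnorm(0.95, mu, sigma)

# d) percent above the 70 miles/hour limit (upper tail)
pct_d <- 100 * pnorm(70, mu, sigma, lower.tail = FALSE)

round(c(a = pct_a, b = pct_b, c = speed_c, d = pct_d), 2)
```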
In: Math
1. Which of the following variables is an example of a categorical variable?
a) The amount of money you spend on eating out each month.
b) The time it takes you to write a test.
c) The geographic region of the country in which you live.
d) The weight of a cereal box.
2. Which of the following is an example of a discrete random variable?
a) The monthly electric bill for a local business.
b) The number of people eating at a local café between noon and 2:00 p.m.
c) The amount of time it takes for a worker to complete a complex task.
d) The percentage of people living below the poverty level in Boston.
3. A measurement scale that rates product quality as either 1 = poor, 2 = average and 3 = good is known as:
a) Nominal
b) Ordinal
c) Interval
d) Ratio
4. Which of the following statements involve descriptive statistics as opposed to inferential statistics?
a) The Alcohol, Tobacco and Firearms Department reported that Seattle had 1,825 registered gun dealers in 2013.
b) Based on a survey of 380 magazine readers, the magazine reports that 30% of its readers prefer double column articles.
c) The FAA samples 425 traffic controllers in order to estimate the percent retiring due to job stress related illness.
d) Based on a sample of 350 professional baseball players, a baseball magazine reported that 23% of the parents of all professional baseball players did not play baseball.
5. Suppose a survey is taken of 300 high school seniors out of a total of 1,000 seniors. This group is probably a:
a) Sample
b) Population
c) System
d) Process
6. Which of the following is a quantitative variable?
a) the make of a washing machine
b) a person's gender
c) price of a car in thousands of dollars
d) whether a person is a college graduate or not
7. Pareto's principle is applied to a wide variety of behavior over many systems. It is sometimes referred to as the:
a) "20-80" Rule
b) "80-20" Rule
c) "10-90" Rule
d) "90-10" Rule
8. Which of the following is most likely a continuous numerical variable?
a) the number of gallons of paint purchased
b) the number of reams of paper ordered
c) the population of Egypt in 2005
d) the number of miles of interstate highways
9. A company has developed a new battery, but the average lifetime is unknown. In order to estimate this average, a sample of 110 batteries is tested and the average lifetime of this sample is found to be 200 hours. The 200 hours is the value of a:
a) parameter
b) statistic
c) sampling frame
d) population
In: Math
In the 2008 General Social Survey, participants were asked: “To what extent do you consider yourself a religious person?” Of 2023 surveyed, 317 responded: “not at all.”
(a) Construct a 95% confidence interval for the true proportion of US adults who would respond “not at all” to this question.
(b) What is the margin of error for your confidence interval?
(c) How large would the sample need to be to obtain a margin of error of ±1%?
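A sketch of the usual large-sample (Wald) interval for a proportion. For part (c) it shows two common conventions: planning with the observed proportion, and planning with the conservative value 0.5. Which one is intended is an assumption the question leaves open.

```r
x <- 317; n <- 2023
p_hat <- x / n

# (a), (b): 95% Wald interval and its margin of error
z_crit <- qnorm(0.975)
me <- z_crit * sqrt(p_hat * (1 - p_hat) / n)
ci <- p_hat + c(-1, 1) * me

# (c): sample size for a margin of error of 0.01
target <- 0.01
n_needed <- ceiling(z_crit^2 * p_hat * (1 - p_hat) / target^2)   # uses p_hat
n_conservative <- ceiling(z_crit^2 * 0.25 / target^2)            # uses p = 0.5

round(c(p_hat = p_hat, lower = ci[1], upper = ci[2], me = me), 4)
c(n_needed = n_needed, n_conservative = n_conservative)
```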
In: Math
A Canadian male has recently had a Prostate Specific Antigen (PSA) test to determine whether he has prostate cancer. The false-positive rate of a PSA test is 14%. If he does have prostate cancer, the PSA test will be positive 79% of the time.
Because this male is showing symptoms that are consistent with prostate cancer, it is assumed that the chance he has prostate cancer prior to taking the PSA test is 0.17.
Part (a) What is the probability that the PSA test will yield a positive result? (Use four decimals in your answer.)
Part (b) If the PSA test gives a positive result, what is the probability that he does not have prostate cancer? (Use four decimals.)
Part (c) Suppose the PSA test result is negative, indicating that he does not have prostate cancer and his symptoms are a result of something else. What is the probability that he does have prostate cancer? (Use four decimals.)
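The three parts follow from the law of total probability and Bayes' rule. A sketch with illustrative names, reading the false-positive rate as P(positive | no cancer):

```r
prior <- 0.17   # P(cancer) before the test
sens  <- 0.79   # P(positive | cancer)
fpr   <- 0.14   # P(positive | no cancer)

# (a) law of total probability
p_pos <- sens * prior + fpr * (1 - prior)

# (b) Bayes' rule: P(no cancer | positive)
p_healthy_given_pos <- fpr * (1 - prior) / p_pos

# (c) Bayes' rule: P(cancer | negative)
p_cancer_given_neg <- (1 - sens) * prior / (1 - p_pos)

round(c(a = p_pos, b = p_healthy_given_pos, c = p_cancer_given_neg), 4)
```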
In: Math
Employers sometimes seem to prefer executives who appear physically fit, despite the legal troubles that may result. Employers may also favor certain personality characteristics. Researchers are interested in determining whether fitness and personality are related. In one study, random samples of middle-aged managers who had volunteered for a fitness program were divided into low-fitness and high-fitness groups based on a physical examination. The subjects then took the Cattell Sixteen Personality Factor Questionnaire.
Here are the data for the “ego strength” personality factor:
Low fitness: 4.99 4.24 4.74 4.93 4.16 5.53 4.12 5.1 4.47 5.3 3.12 3.77 5.09 5.4
High fitness: 6.68 6.42 7.32 6.38 6.16 5.93 7.08 6.37 6.53 6.68 5.71 6.2 6.04 6.51
Is there a statistically significant difference in mean ego strength for the two fitness groups? Conduct a complete and appropriate hypothesis test at the 2% significance level.
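One reasonable approach is a two-sided two-sample t test on the listed values. The sketch below uses the Welch version (`var.equal = FALSE`), which is an assumption; a pooled test would be an alternative if equal variances are judged plausible. A complete answer would also state the hypotheses and check the normality assumption.

```r
low  <- c(4.99, 4.24, 4.74, 4.93, 4.16, 5.53, 4.12,
          5.10, 4.47, 5.30, 3.12, 3.77, 5.09, 5.40)
high <- c(6.68, 6.42, 7.32, 6.38, 6.16, 5.93, 7.08,
          6.37, 6.53, 6.68, 5.71, 6.20, 6.04, 6.51)

# H0: mu_low = mu_high versus Ha: mu_low != mu_high
res <- t.test(low, high, alternative = "two.sided",
              var.equal = FALSE, conf.level = 0.98)
res

# Decision at the 2% significance level
alpha <- 0.02
res$p.value < alpha   # TRUE means reject H0
```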
In: Math
In a study done by the Ohio Department of Job and Family Services, it was determined that 38 (out of 62) poor children who attended preschool needed social services later in life, compared to 49 (out of 61) poor children who did not attend preschool.
A. Does this study provide significant evidence that preschool reduces the need for social services later in life? Do a complete and appropriate hypothesis test using α = .01.
B. Construct a 98% confidence interval estimate of the difference in the proportion of poor children who attended preschool and poor children who did not attend preschool who needed social services later in life. Interpret the practical meaning of the resulting interval estimate, in the context of the problem, in plain English.
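A sketch of the two-proportion z procedures, assuming group 1 is the preschool group and the alternative is one-sided (Ha: p1 − p2 < 0). The test uses the pooled standard error and the interval uses the unpooled one, which is the common textbook convention.

```r
x1 <- 38; n1 <- 62   # attended preschool, needed social services
x2 <- 49; n2 <- 61   # did not attend, needed social services
p1 <- x1 / n1
p2 <- x2 / n2

# A. z test with the pooled proportion under H0: p1 = p2
p_pool  <- (x1 + x2) / (n1 + n2)
se_pool <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z       <- (p1 - p2) / se_pool
p_value <- pnorm(z)   # lower tail, because Ha is p1 - p2 < 0

# B. 98% confidence interval for p1 - p2 with the unpooled standard error
se_unpooled <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci <- (p1 - p2) + c(-1, 1) * qnorm(0.99) * se_unpooled

round(c(z = z, p = p_value, lower = ci[1], upper = ci[2]), 4)
p_value < 0.01   # TRUE means reject H0 at alpha = .01
```

The p-value here lands close to 0.01, so the decision should be based on the unrounded value rather than on a rounded z.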
In: Math
Types of Data: Classify the level of measurement for the described data as nominal, ordinal, interval, or ratio.
(a) the gas mileage from 30 different types of cars ANSWER> ????
(b) the colors of all models of a certain type of car ANSWER> NOMINAL
(c) the years of major tsunami events ANSWER> ?????
In: Math
What is the relevant outcome of conducting two or more tests in a particular experimental research design?
In: Math
3. Use `sample()` to generate tosses of a biased coin with $Pr(Head) = 0.6$.
i) get a sample of size 10 tosses and tally the results
ii) get a sample of size 30 tosses and tally the results
iii) get a sample of size 100 tosses and tally the results
iv) what do you notice about the proportion of heads in each sample?
### Code chunk
```{r}
# start your code
# last R code line
```
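One possible way to fill in the chunk is sketched below. The helper `toss()` and the seed are illustrative choices, not part of the exercise; the `prob` argument of `sample()` is what makes the coin biased.

```r
set.seed(42)   # arbitrary seed, only for reproducibility

# Hypothetical helper: n tosses of a coin with Pr(Head) = p_head
toss <- function(n, p_head = 0.6) {
  sample(c("H", "T"), size = n, replace = TRUE,
         prob = c(p_head, 1 - p_head))
}

# Parts i) to iii): tally each sample and report the proportion of heads
for (n in c(10, 30, 100)) {
  tosses <- toss(n)
  print(table(tosses))
  cat("n =", n, " proportion of heads =", mean(tosses == "H"), "\n")
}
```

For part iv), the proportion of heads should tend to settle closer to 0.6 as the sample size grows, although any single run can deviate.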
In: Math
For this exercise, you will need to use the package `mosaic` to find numerical and graphical summaries.
```{r warning=FALSE, message=FALSE}
# install packages if necessary
if (!require(mosaic)) install.packages("mosaic")
if (!require(dplyr)) install.packages("dplyr")
if (!require(gapminder)) install.packages("gapminder")
# load the package in R
library(mosaic) # load the package mosaic to use its functions
library(dplyr) # load the package dplyr to use its functions
library(gapminder) # load the package gapminder for question 1
```
1. Using the gapminder data in the lesson, do the following:
i) use `filter` to select all countries with the following arguments:
a) life expectancy larger than 60 years.
b) United Kingdom and Vietnam and years greater than 1990.
ii) use `arrange` and `slice` to select the countries with the top 15 GDP per capita `gdpPercap`. Use the pipe `%>%` operator to string multiple functions.
iii) use `mutate` to create a new variable called `gdpPercap_lifeExp` which is the quotient of `gdpPercap` and `lifeExp` and display the output.
iv) use `summarise` to find the average or mean value of the variable `gdpPercap_lifeExp` created in part (iii).
v) use `group_by` to group the countries by `continent`; and `summarise` to compute the average life expectancy `lifeExp` within each continent. Use the pipe `%>%` operator to string multiple functions.
### Code chunk
```{r}
# load the necessary packages
library(mosaic)
library(dplyr)
library(gapminder)
# last R code line
```
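A sketch of how the five parts might look with dplyr, assuming the `gapminder` data frame from the gapminder package with columns `country`, `continent`, `year`, `lifeExp` and `gdpPercap`. Part ii) is read here as the 15 rows (country-year pairs) with the highest `gdpPercap`, so the same country can appear more than once; the object names are illustrative.

```r
library(dplyr)
library(gapminder)

# i-a) life expectancy larger than 60 years
long_lived <- gapminder %>% filter(lifeExp > 60)

# i-b) United Kingdom and Vietnam, years greater than 1990
uk_vn <- gapminder %>%
  filter(country %in% c("United Kingdom", "Vietnam"), year > 1990)

# ii) top 15 rows by GDP per capita
top15 <- gapminder %>%
  arrange(desc(gdpPercap)) %>%
  slice(1:15)

# iii) quotient of gdpPercap and lifeExp as a new column
with_ratio <- gapminder %>%
  mutate(gdpPercap_lifeExp = gdpPercap / lifeExp)
with_ratio

# iv) mean of the new variable
with_ratio %>% summarise(mean_ratio = mean(gdpPercap_lifeExp))

# v) average life expectancy within each continent
by_continent <- gapminder %>%
  group_by(continent) %>%
  summarise(mean_lifeExp = mean(lifeExp))
by_continent
```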
In: Math
2. The data set `MLB-TeamBatting-S16.csv` contains MLB Team Batting
Data for selected variables. Load the data set from the given url
using the code below. This data set was obtained from [Baseball
Reference](https://www.baseball-reference.com/leagues/MLB/2016-standard-batting.shtml).
* Tm - Team
* Lg - League: American League (AL), National League (NL)
* BatAge - Batters’ average age
* RPG - Runs Scored Per Game
* G - Games Played or Pitched
* AB - At Bats
* R - Runs Scored/Allowed
* H - Hits/Hits Allowed
* HR - Home Runs Hit/Allowed
* RBI - Runs Batted In
* SO - Strikeouts
* BA - Hits/At Bats
* SH - Sacrifice Hits (Sacrifice Bunts)
* SF - Sacrifice Flies
Using the `mlb16.data` data, do the following:
i) use `filter` to select teams with the following arguments:
a) Cardinals team `STL`.
b) teams with more than 1400 Hits `H` in the 2016 season.
c) team league `Lg` is National League `NL`.
ii) use `arrange` to display the teams in decreasing number of home runs `HR`.
iii) use `arrange` to display the teams in decreasing number of `RBI`.
iv) use `group_by` to group the teams per league; and `summarise` to compute the average `RBI` within each league. Use the pipe `%>%` operator to string multiple functions.
### Code chunk
```{r}
# load the data set
mlb16.data <-
read.csv("https://raw.githubusercontent.com/jpailden/rstatlab/master/data/MLB-TeamBatting-S16.csv")
str(mlb16.data) # check structure
head(mlb16.data) # show first six rows
# last R code line
```
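The requested steps can be sketched as a small helper that takes any data frame with the columns `Tm`, `Lg`, `H`, `HR` and `RBI` listed above. The function name `mlb_summaries` is hypothetical, and the sketch assumes the Cardinals are coded as `"STL"` in `Tm`, as the exercise states.

```r
library(dplyr)

# Hypothetical helper: returns each requested view as a named list
mlb_summaries <- function(teams) {
  list(
    # i-a) Cardinals only
    cardinals = teams %>% filter(Tm == "STL"),
    # i-b) teams with more than 1400 hits
    many_hits = teams %>% filter(H > 1400),
    # i-c) National League teams
    national  = teams %>% filter(Lg == "NL"),
    # ii) decreasing number of home runs
    by_hr     = teams %>% arrange(desc(HR)),
    # iii) decreasing number of RBI
    by_rbi    = teams %>% arrange(desc(RBI)),
    # iv) average RBI within each league
    rbi_by_lg = teams %>%
      group_by(Lg) %>%
      summarise(mean_RBI = mean(RBI))
  )
}
```

After running the chunk above, `results <- mlb_summaries(mlb16.data)` followed by, for example, `results$rbi_by_lg` would show the league averages. It is worth checking `head(mlb16.data)` first for any summary rows that should be excluded.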
In: Math
Suppose we roll two standard 6-sided dice.
(a) Compute the expected value of the sum of the rolls.
(b) Compute the variance of the sum of the rolls.
(c) If X represents the maximum value that appears in the two rolls, what is the expected value of X? What’s the probability of sum = 7?
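Since there are only 36 equally likely outcomes, all four quantities can be checked by exact enumeration rather than simulation. A sketch with illustrative names, assuming fair and independent dice:

```r
# All 36 equally likely outcomes of two fair dice
rolls   <- expand.grid(d1 = 1:6, d2 = 1:6)
total   <- rolls$d1 + rolls$d2
largest <- pmax(rolls$d1, rolls$d2)

# (a) expected value of the sum
e_sum <- mean(total)

# (b) variance of the sum: mean squared deviation over the outcomes
#     (var() would divide by n - 1, which is not wanted here)
var_sum <- mean((total - e_sum)^2)

# (c) expected value of the maximum, and P(sum = 7)
e_max   <- mean(largest)
p_seven <- mean(total == 7)

c(e_sum = e_sum, var_sum = var_sum, e_max = e_max, p_seven = p_seven)
```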
In: Math