1. One of the relatively uncommon instances in which researchers know the population standard deviation is the Intelligence Quotient, or IQ, test. In general, the average IQ score in large, diverse populations is 100 and the standard deviation is 15. Suppose that your sample of 300 members of your community gives you a mean IQ score of 108. Calculate a 90% confidence interval for the mean and indicate which answers come closest to those that would fill the blanks in the following interpretation: we can be 90% confident that the mean IQ score in this community lies between _____ and _____ .
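A minimal Python sketch of the z-interval calculation for this question, assuming the usual formula mean ± z·σ/√n with σ known. The function name `z_interval` is only illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sigma, n, confidence):
    """Two-sided z confidence interval for a mean when sigma is known."""
    # Critical value: for 90% confidence this is the 95th percentile of N(0, 1).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return mean - margin, mean + margin

iq_low, iq_high = z_interval(mean=108, sigma=15, n=300, confidence=0.90)
print(f"90% CI for the mean IQ: ({iq_low:.2f}, {iq_high:.2f})")
```

The printed bounds can then be compared with the answer choices offered for the two blanks.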
2. Suppose that you are a city planner who obtains a sample of 20 randomly selected members of a mid-sized town in order to determine the average amount of money that residents spend on transportation each month (such as fuel, vehicle repairs, and public transit). You do not have the population standard deviation. To 3 decimal places, what is the critical value for the 95% confidence interval? In the same scenario as 7.09, suppose you obtained a mean of $167 spent on transportation and a standard deviation of $40. Calculate a 95% confidence interval for the mean and select the values that come closest to those that would fill the spaces in the following interpretation: we can be 95% confident that the mean amount of money spent on transportation lies between _____ and _____ .
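Because σ is unknown here, the interval uses Student's t with n − 1 = 19 degrees of freedom. The sketch below is one way to get the critical value with only the Python standard library: it integrates the t density numerically and inverts it by bisection. The helper names (`t_pdf`, `t_cdf`, `t_quantile`) are hypothetical, and a statistics package or a t table would normally be used instead.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on the CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, sample_mean, sample_sd = 20, 167, 40
t_crit = t_quantile(0.975, n - 1)          # two-sided 95% interval
margin = t_crit * sample_sd / sqrt(n)
ci_low, ci_high = sample_mean - margin, sample_mean + margin
print(f"t critical value: {t_crit:.3f}")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```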
In: Math
(Combinations -- Poker Probabilities) Suppose you deal a poker hand of 5 cards from a standard deck.
(a) What is the probability of a flush (all the same suit) of all red cards?
(b) What is the probability of a full house where the 3-of-a-kind includes two black cards, and the 2-of-a-kind are not clubs?
(c) What is the probability of a pair, where among the 3 non-paired cards, we have 3 distinct suits?
Hint: These are straightforward modifications of the formulae given in lecture. You may quote formulae already given in lecture or lab without attribution and without explaining how to get the formula.
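As a hedged illustration of the counting approach for part (a) only: a red flush must be all hearts or all diamonds, so there are 2 · C(13, 5) such hands out of C(52, 5). This sketch assumes straight flushes still count as flushes; if the lecture formula excludes them, the count would need adjusting.

```python
from math import comb

total_hands = comb(52, 5)          # all 5-card hands
red_flushes = 2 * comb(13, 5)      # all hearts or all diamonds (straight flushes included)
p_red_flush = red_flushes / total_hands

print(f"{red_flushes} / {total_hands} = {p_red_flush:.6f}")
```

Parts (b) and (c) follow the same pattern: count the qualifying hands with `comb` and divide by `total_hands`.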
In: Math
11. Private Four-Year College Enrollment. A random sample of enrollments in Pennsylvania’s private four-year colleges is listed here. Check for normality. Answer: Not normal. Please show work; this is a review for an exam.
1350, 1886, 1743, 1290, 1767, 2067, 1118, 3980, 1773, 4605, 1445, 3883, 1486, 980, 1217, 3587
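One common way to "show work" for a normality check is to compare the mean and median, compute Pearson's index of skewness, PI = 3(mean − median)/s, and look for outliers with the 1.5 × IQR rule. The sketch below assumes that is the intended method; the course may use a histogram or a different cutoff, so treat the thresholds as assumptions.

```python
from statistics import mean, median, stdev, quantiles

enrollments = [1350, 1886, 1743, 1290, 1767, 2067, 1118, 3980,
               1773, 4605, 1445, 3883, 1486, 980, 1217, 3587]

avg = mean(enrollments)
med = median(enrollments)
sd = stdev(enrollments)                      # sample standard deviation

# Pearson's index of skewness; values far from 0 suggest a skewed distribution.
pearson_index = 3 * (avg - med) / sd

# Outlier fences from the quartiles (statistics.quantiles default method).
q1, _, q3 = quantiles(enrollments, n=4)
iqr = q3 - q1
outliers = [v for v in enrollments if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print(f"mean = {avg:.1f}, median = {med:.1f}, s = {sd:.1f}")
print(f"Pearson index = {pearson_index:.2f}")
print(f"Q1 = {q1}, Q3 = {q3}, outliers = {outliers}")
```

A mean well above the median, as here, points to right skew, which is consistent with the stated answer of "not normal".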
In: Math
Here is a bivariate data set.
x | y |
---|---|
39 | 45 |
43 | 43 |
12 | 48 |
36 | 38 |
29 | 33 |
31 | 31 |
31 | 37 |
20 | 39 |
-4 | 51 |
52 | 31 |
Find the correlation coefficient and report it accurate to four
decimal places.
r =
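A short Python sketch of the Pearson correlation for the table above, computed directly from the definition r = Sxy / √(Sxx · Syy). The function name `pearson_r` is illustrative.

```python
from math import sqrt

x = [39, 43, 12, 36, 29, 31, 31, 20, -4, 52]
y = [45, 43, 48, 38, 33, 31, 37, 39, 51, 31]

def pearson_r(xs, ys):
    """Pearson correlation coefficient from sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

r = pearson_r(x, y)
print(f"r = {r:.4f}")
```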
In: Math
Determine the margin of error for a 99% confidence interval to estimate the population mean when s = 39 for the sample sizes below.
a) n = 14
b) n = 34
c) n = 53
____________________________________________________________________________________
Construct a 90% confidence interval to estimate the population mean when x̄ = 122 and s = 26 for the sample sizes below.
a) n = 40
b) n = 50
c) n = 100
In: Math
Problem 3: A linear regression using a famous data set found in Freedman et al. (1991), ‘Statistics’. Table 1 gives the per capita consumption of cigarettes in various countries in 1930 and the death rates (number of deaths per million people) from lung cancer for 1950.
Table 1: Death rate data in Freedman
Obs | Country | Cigarette | Deaths per million |
---|---|---|---|
1 | Australia | 480 | 180 |
2 | Canada | 500 | 150 |
3 | Denmark | 380 | 170 |
4 | Finland | 1100 | 350 |
5 | Great Britain | 1100 | 460 |
6 | Iceland | 230 | 60 |
7 | Netherlands | 490 | 240 |
8 | Norway | 250 | 90 |
9 | Sweden | 300 | 110 |
10 | Switzerland | 510 | 250 |
11 | USA | 1300 | 200 |
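The question text does not say which regression outputs are required, so the following is only a sketch under the assumption that deaths per million is regressed on cigarette consumption by ordinary least squares.

```python
cigarettes = [480, 500, 380, 1100, 1100, 230, 490, 250, 300, 510, 1300]
deaths = [180, 150, 170, 350, 460, 60, 240, 90, 110, 250, 200]

n = len(cigarettes)
mean_x = sum(cigarettes) / n
mean_y = sum(deaths) / n

# Least-squares estimates: slope = Sxy / Sxx, intercept = ybar - slope * xbar.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cigarettes, deaths))
sxx = sum((x - mean_x) ** 2 for x in cigarettes)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(f"deaths = {intercept:.2f} + {slope:.4f} * cigarettes")
```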
In: Math
Can someone please explain these steps from this data description?
All of Bubba Gump’s data has recently been integrated in a data warehouse. That enterprise data warehouse was built specifically to support data mining initiatives like the one you have been assigned to conduct, by consolidating data from multiple operations and channels in one place and integrating the data across sources for a complete view of the customer experience. For the first time, Bubba Gump analysts can link sales transactions to specific customers at specific restaurants, for example. It also means that you can link customer transactions across channels; that is, for any given customer, you can link to both their restaurant purchases, their online purchases, and (in some cases) their purchases from third-party retail partners.
You have been selected to develop and execute the data mining analysis plan for Bubba Gump’s customer analysis project. Your project will be the first major data mining project conducted against the new Bubba Gump data warehouse. Because Bubba Gump’s data was not previously integrated in a single data warehouse, company leadership has never been able to analyze its customers across their complete experience. In other words, customer restaurant purchases, online purchases, and third-party retailer purchases could not be analyzed together previously; each channel had to be analyzed separately.
As a first step, a sample of 500 customers has been selected from the analytics data warehouse and given a survey in exchange for purchase credits at one of Bubba Gump’s sales channels. The survey sample was selected from the universe of customers who have made purchases from at least one Bubba Gump outlet (restaurant, web store, etc.). Responses to various customer satisfaction questions were recorded, and historical purchase information has been extracted from the data warehouse for each customer in the sample.
I am having trouble understanding what the questions below are asking for. Could someone please explain them to me? Thank you.
Your task is to analyze the survey responses to understand
whether there are natural “clusters” within Bubba Gump’s customer
population. You are then to
create a visualization of this survey data that describes Bubba
Gump’s customers across any dimensions that define those
subgroups.
Your Assignment
In your response, address the following critical elements:
Analysis Tools
What data mining tools will you use to perform the analysis?
Why these particular ones?
Data Visualizations
What data visualizations will you use in your report, and why?
Research Question
What is the specific research question that needs to be
addressed?
What research question will you work from in order to analyze
the given data for meaningful
patterns?
Research Measurement
How will you determine if your research question was answered or if
your hypothesis-generation was successful?
How will you measure progress?
Follow-Up Questions
What are cogent follow-up questions or explorations that should
follow from your initial research?
Research and Support
Are there any published sources or other resources that address
your line of inquiry? Where do they fall short? How will they help
guide your analysis?
In: Math
Test the hypothesis that the mean travel time in minutes from Point A to Point B is equal to the mean travel time in minutes from Point B to Point A. First you must find the means and standard deviations. Then perform and list the complete required steps for the TWO required hypothesis tests and ALSO USE THE P-VALUE AS A REJECTION RULE FOR BOTH TESTS. One hypothesis test is an F test for the equality of the variances of travel times, and the second is a t test for the equality of the means of travel times in minutes. The F test must be performed first in order to select either Case 1 or Case 2 for the t test. PLEASE SHOW HOW YOU OBTAINED ALL ANSWERS.
Recorded Time values in minutes from point A to point B: 32, 34, 51, 30, 29, 35, 36, 29, 32, 29, 33, 32, 29, 30, 33, 30, 30, 33, 30, 31, 35, 35, 34, 32, 33, 33, 31, 33, 34, 30, 30, 29, 34, 32, 36, 29, 30, 32, 30, 33, 31
n1=41
Recorded Time values in minutes from point B to point A: 36, 28, 48, 28, 27, 54, 34, 29, 26, 34, 33, 42, 29, 34, 31, 48, 27, 42, 28, 45, 26, 43, 32, 41, 30, 36, 27, 44, 29, 29, 35, 26, 31, 28, 27, 28, 32, 41, 34, 28, 31
n2=41
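A hedged Python sketch of the descriptive statistics and the two test statistics. It computes the F ratio (larger sample variance over smaller) and both versions of the two-sample t statistic: equal-variance (pooled) and unequal-variance (Welch). Which of these the course calls Case 1 or Case 2 is not stated in the question, and the p-values would still have to come from F and t tables or software.

```python
from math import sqrt
from statistics import mean, stdev, variance

a_to_b = [32, 34, 51, 30, 29, 35, 36, 29, 32, 29, 33, 32, 29, 30, 33, 30, 30, 33, 30, 31, 35,
          35, 34, 32, 33, 33, 31, 33, 34, 30, 30, 29, 34, 32, 36, 29, 30, 32, 30, 33, 31]
b_to_a = [36, 28, 48, 28, 27, 54, 34, 29, 26, 34, 33, 42, 29, 34, 31, 48, 27, 42, 28, 45, 26,
          43, 32, 41, 30, 36, 27, 44, 29, 29, 35, 26, 31, 28, 27, 28, 32, 41, 34, 28, 31]

n1, n2 = len(a_to_b), len(b_to_a)
mean_ab, mean_ba = mean(a_to_b), mean(b_to_a)
var_ab, var_ba = variance(a_to_b), variance(b_to_a)   # sample variances

# F statistic for equality of variances, larger variance in the numerator.
f_stat = max(var_ab, var_ba) / min(var_ab, var_ba)

# Equal-variance (pooled) t statistic, df = n1 + n2 - 2.
pooled_var = ((n1 - 1) * var_ab + (n2 - 1) * var_ba) / (n1 + n2 - 2)
t_pooled = (mean_ab - mean_ba) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# Unequal-variance (Welch) t statistic.
t_welch = (mean_ab - mean_ba) / sqrt(var_ab / n1 + var_ba / n2)

print(f"A->B: mean = {mean_ab:.3f}, s = {stdev(a_to_b):.3f}")
print(f"B->A: mean = {mean_ba:.3f}, s = {stdev(b_to_a):.3f}")
print(f"F = {f_stat:.3f}, t (pooled) = {t_pooled:.3f}, t (Welch) = {t_welch:.3f}")
```

With equal sample sizes the two t statistics coincide numerically; the choice between them only changes the degrees of freedom used for the p-value.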
In: Math
When you do a Kruskal-Wallis test, why do you always use a right-tailed test?
In: Math
A budget is an expression of management's expectations and goals
concerning future revenues and costs. To increase their
effectiveness, many budgets are flexible, including allowances for
the effect of variation in uncontrolled variables. For example, the
costs and revenues of many production plants are greatly affected
by the number of units produced by the plant during the budget
period, and this may be beyond a plant manager's control. Standard
cost-accounting procedures can be used to adjust the direct-cost
parts of the budget for the level of production, but it is often
more difficult to handle overhead. In many cases, statistical
methods are used to estimate the relationship between overhead
(y) and the level of production (x) using
historical data. As a simple example, consider the historical data
for a certain plant. Enter the data into EXCEL and analyze it to
answer the following items.
Production (in 10,000) units: | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
Overhead costs (in $1000): | 13 | 11.4 | 14 | 15 | 15.6 | 15.2 | 17.6 |
(a) Test the hypotheses H0: β1 = 0 versus Ha: β1 ≠ 0 at the 5% level of significance.
State the decision rule. (Which option below should we use?)
1. Reject H0 if p value > 0.025. Do not reject H0 if p value ≤ 0.025.
2. Reject H0 if p value > 0.05. Do not reject H0 if p value ≤ 0.05.
3. Reject H0 if p value < 0.025. Do not reject H0 if p value ≥ 0.025.
4. Reject H0 if p value < 0.05. Do not reject H0 if p value ≥ 0.05.
State the appropriate test statistic name, degrees of freedom, test statistic value, and the associated p-value (enter your degrees of freedom as a whole number, the test statistic value to three decimal places, and the p-value to four decimal places).
(Choose: x, z, p, t, G) ( _______ ) = ___________ , p (Choose: ≥, >, <, ≤, =) ___________
State your decision. Are production and overhead costs linearly related?
1.) Reject the null hypothesis: Yes, production and overhead costs appear to be linearly related
2.) Reject the null hypothesis: No, production and overhead costs do not appear to be linearly related.
3.) Do not reject the null hypothesis: Yes, production and overhead costs appear to be linearly related.
4.) Do not reject the null hypothesis: No, production and overhead costs do not appear to be linearly related.
(b) Test the hypotheses H0: β1 = 1 versus Ha: β1 ≠ 1 at the 5% level of significance.
State the appropriate test statistic name, degrees of freedom, test statistic value, and the associated p-value (enter your degrees of freedom as a whole number, the test statistic value to three decimal places, and the p-value to four decimal places).
(Choose: x, t, p, G, z) ( ________ ) = __________ , p (Choose: =, ≥, >, ≤, <) _____________
State your decision. Choose one of the following:
1.) Do not reject the null hypothesis: The slope of the line representing the relationship between overhead costs and production is not significantly different from 1.
2.) Do not reject the null hypothesis: The slope of the line representing the relationship between overhead costs and production is significantly different from 1.
3.) Reject the null hypothesis: The slope of the line representing the relationship between overhead costs and production is not significantly different from 1.
4.) Reject the null hypothesis: The slope of the line representing the relationship between overhead costs and production is significantly different from 1.
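The question asks for Excel, so the following Python sketch is only a cross-check of the arithmetic behind the regression output. It assumes simple linear regression of overhead on production and computes the slope, its standard error, and the t statistics for H0: β1 = 0 and H0: β1 = 1, each with n − 2 = 5 degrees of freedom. The two-sided p-values would still come from the t distribution, for example Excel's T.DIST.2T.

```python
from math import sqrt

production = [5, 6, 7, 8, 9, 10, 11]                 # in 10,000 units
overhead = [13, 11.4, 14, 15, 15.6, 15.2, 17.6]      # in $1000

n = len(production)
mean_x = sum(production) / n
mean_y = sum(overhead) / n

sxx = sum((x - mean_x) ** 2 for x in production)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(production, overhead))
syy = sum((y - mean_y) ** 2 for y in overhead)

slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual variance and standard error of the slope.
sse = syy - slope * sxy
df = n - 2
se_slope = sqrt(sse / df / sxx)

t_beta0 = (slope - 0) / se_slope     # part (a): H0: beta1 = 0
t_beta1 = (slope - 1) / se_slope     # part (b): H0: beta1 = 1

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}, SE(slope) = {se_slope:.4f}")
print(f"t({df}) for beta1 = 0: {t_beta0:.3f}")
print(f"t({df}) for beta1 = 1: {t_beta1:.3f}")
```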
In: Math
James is a baseball player who hits left-handed. Based on his past statistics, his strikeout rate against left-handed pitchers is 12.5%. He would like to reduce this rate, so he changes his batting stance. To test whether it works, he uses a pitching machine to simulate 200 at-bats. In these, he struck out 16 times.
James conducts a one-proportion hypothesis test at the 5% significance level, to test whether the true proportion of strikeouts against left-handed pitchers using James's new stance is less than 12.5%.
(a) H0:p=0.125; Ha:p<0.125, which is a left-tailed test.
(b) Use Excel to test whether the true proportion of strikeouts against left-handed pitchers using James's new stance is less than 12.5%. Identify the test statistic, z, and p-value from the Excel output, rounding to three decimal places.
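As a cross-check on the Excel output (not a replacement for it), here is a minimal Python sketch of the left-tailed one-proportion z test, assuming the standard error is computed under H0 with p0 = 0.125.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.125            # hypothesized strikeout rate
n = 200               # simulated at-bats
strikeouts = 16

p_hat = strikeouts / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Left-tailed test: p-value is the area below z under the standard normal curve.
p_value = NormalDist().cdf(z)

print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, p-value = {p_value:.3f}")
```

The p-value is then compared with the 5% significance level to decide whether to reject H0.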
In: Math
The Aluminum Association reports that the average American uses 56.8 pounds of aluminum in a year. A random sample of 51 households is monitored for one year to determine aluminum usage. If the population standard deviation of annual usage is 12.1 pounds, what is the probability that the sample mean will be each of the following?
a. More than 61 pounds
b. More than 56 pounds
c. Between 55 and 57 pounds
d. Less than 54 pounds
e. Less than 48 pounds
(Round the values of z to 2 decimal places. Round your answers to 4 decimal places.)
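A minimal Python sketch of these sampling-distribution probabilities, assuming the sample mean is approximately normal with mean 56.8 and standard error σ/√n. Note that it uses unrounded z values, so the results can differ slightly in the fourth decimal from answers based on z rounded to 2 places as the question requests.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 56.8, 12.1, 51
xbar_dist = NormalDist(mu, sigma / sqrt(n))   # sampling distribution of the mean

p_a = 1 - xbar_dist.cdf(61)                   # more than 61 pounds
p_b = 1 - xbar_dist.cdf(56)                   # more than 56 pounds
p_c = xbar_dist.cdf(57) - xbar_dist.cdf(55)   # between 55 and 57 pounds
p_d = xbar_dist.cdf(54)                       # less than 54 pounds
p_e = xbar_dist.cdf(48)                       # less than 48 pounds

for label, p in zip("abcde", (p_a, p_b, p_c, p_d, p_e)):
    print(f"{label}. {p:.4f}")
```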
In: Math
What is the error of the predicted Systolic BP when Age = 39?
Age | Systolic BP | Year | Stories | Height | Year | Germany GDP |
---|---|---|---|---|---|---|
47 | 145 | 1990 | 54 | 770 | 1950 | 5.725433 |
65 | 162 | 1980 | 47 | 677 | 1951 | 6.256754 |
46 | 142 | 1990 | 28 | 428 | 1952 | 6.70308 |
67 | 170 | 1989 | 38 | 410 | 1953 | 7.256435 |
42 | 124 | 1966 | 29 | 371 | 1954 | 7.72644 |
67 | 158 | 1976 | 38 | 504 | 1955 | 8.570349 |
56 | 154 | 1974 | 80 | 1136 | 1956 | 9.076571 |
64 | 162 | 1991 | 52 | 695 | 1957 | 9.45931 |
56 | 150 | 1982 | 45 | 551 | 1958 | 9.665697 |
59 | 140 | 1986 | 40 | 550 | 1959 | 10.259906 |
34 | 110 | 1931 | 49 | 568 | 1960 | 10.608815 |
42 | 128 | 1979 | 33 | 504 | 1961 | 11.032132 |
48 | 130 | 1988 | 50 | 560 | 1962 | 11.384714 |
45 | 135 | 1973 | 40 | 512 | 1963 | 11.611703 |
17 | 114 | 1981 | 31 | 448 | 1964 | 12.266443 |
20 | 116 | 1983 | 40 | 538 | 1965 | 12.813883 |
19 | 124 | 1968 | 27 | 410 | 1966 | 13.016213 |
36 | 136 | 1927 | 31 | 409 | 1967 | 12.964814 |
50 | 142 | 1969 | 35 | 504 | 1968 | 13.730252 |
39 | 120 | 1988 | 57 | 777 | 1969 | 14.665157 |
21 | 120 | 1987 | 31 | 496 | 1970 | 15.392277 |
44 | 160 | 1960 | 26 | 386 | 1971 | 15.720841 |
53 | 158 | 1984 | 39 | 530 | 1972 | 16.197464 |
63 | 144 | 1976 | 25 | 360 | 1973 | 16.907173 |
29 | 130 | 1920 | 23 | 355 | 1974 | 16.97702 |
25 | 125 | 1931 | 102 | 1250 | 1975 | 16.72403 |
69 | 175 | 1989 | 72 | 802 | 1976 | 17.6721 |
 | | 1907 | 57 | 741 | 1977 | 18.195684 |
 | | 1988 | 54 | 739 | 1978 | 18.798212 |
 | | 1990 | 56 | 650 | 1979 | 19.640699 |
 | | 1973 | 45 | 592 | 1980 | 19.935295 |
 | | 1983 | 42 | 577 | 1981 | 19.903635 |
 | | 1971 | 36 | 500 | 1982 | 19.723139 |
 | | 1969 | 30 | 469 | 1983 | 19.985983 |
 | | 1971 | 22 | 320 | | |
 | | 1988 | 31 | 441 | | |
 | | 1989 | 52 | 845 | | |
 | | 1973 | 29 | 435 | | |
 | | 1987 | 34 | 435 | | |
 | | 1931 | 20 | 375 | | |
 | | 1931 | 33 | 364 | | |
 | | 1924 | 18 | 340 | | |
 | | 1931 | 23 | 375 | | |
 | | 1991 | 30 | 450 | | |
 | | 1973 | 38 | 529 | | |
 | | 1976 | 31 | 412 | | |
 | | 1990 | 62 | 722 | | |
 | | 1983 | 48 | 574 | | |
 | | 1984 | 29 | 498 | | |
 | | 1986 | 40 | 493 | | |
 | | 1986 | 30 | 379 | | |
 | | 1992 | 42 | 579 | | |
 | | 1973 | 36 | 458 | | |
 | | 1988 | 33 | 454 | | |
 | | 1979 | 72 | 952 | | |
 | | 1972 | 57 | 784 | | |
 | | 1930 | 34 | 476 | | |
 | | 1978 | 46 | 453 | | |
 | | 1978 | 30 | 440 | | |
 | | 1977 | 21 | 428 | | |
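Only the Age and Systolic BP columns matter for this question. A hedged Python sketch follows, assuming "error" means the residual (observed minus predicted) from a least-squares line of Systolic BP on Age, evaluated at the one subject whose age is 39.

```python
ages = [47, 65, 46, 67, 42, 67, 56, 64, 56, 59, 34, 42, 48, 45,
        17, 20, 19, 36, 50, 39, 21, 44, 53, 63, 29, 25, 69]
systolic = [145, 162, 142, 170, 124, 158, 154, 162, 150, 140, 110, 128, 130, 135,
            114, 116, 124, 136, 142, 120, 120, 160, 158, 144, 130, 125, 175]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(systolic) / n

# Least-squares fit of Systolic BP on Age.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, systolic))
sxx = sum((x - mean_x) ** 2 for x in ages)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

age = 39
observed = systolic[ages.index(age)]      # the observed BP for the subject aged 39
predicted = intercept + slope * age
residual = observed - predicted           # error = observed - predicted

print(f"BP = {intercept:.3f} + {slope:.3f} * Age")
print(f"predicted at Age {age}: {predicted:.3f}, observed: {observed}, error: {residual:.3f}")
```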
In: Math
Assume that females have pulse rates that are normally distributed with a mean of μ = 73.0 beats per minute and a standard deviation of σ = 12.5 beats per minute. Complete parts (a) through (c) below.
a. If 1 adult female is randomly selected, find the probability that her pulse rate is between 67 beats per minute and 79 beats per minute. The probability is _____ . (Round to four decimal places as needed.)
b. If 4 adult females are randomly selected, find the probability that they have pulse rates with a mean between 67 beats per minute and 79 beats per minute. The probability is _____ . (Round to four decimal places as needed.)
c. Why can the normal distribution be used in part (b), even though the sample size does not exceed 30?
A. Since the distribution is of individuals, not sample means, the distribution is a normal distribution for any sample size.
B. Since the distribution is of sample means, not individuals, the distribution is a normal distribution for any sample size.
C. Since the original population has a normal distribution, the distribution of sample means is a normal distribution for any sample size.
D. Since the mean pulse rate exceeds 30, the distribution of sample means is a normal distribution for any sample size.
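A minimal Python sketch for parts (a) and (b), assuming the only difference between them is the standard deviation used: σ for a single individual and σ/√n for the mean of n = 4. Values from a rounded z table may differ slightly in the fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 73.0, 12.5

# (a) One individual: use the population distribution directly.
individual = NormalDist(mu, sigma)
p_individual = individual.cdf(79) - individual.cdf(67)

# (b) Mean of 4 individuals: same mean, standard error sigma / sqrt(4).
mean_of_4 = NormalDist(mu, sigma / sqrt(4))
p_mean_of_4 = mean_of_4.cdf(79) - mean_of_4.cdf(67)

print(f"(a) {p_individual:.4f}")
print(f"(b) {p_mean_of_4:.4f}")
```

The probability in (b) is larger because sample means vary less than individual values.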
In: Math