Questions
1.One of the relatively uncommon instances in which researchers know the population standard deviation is in...

1. One of the relatively uncommon instances in which researchers know the population standard deviation is the Intelligence Quotient (IQ) test. In general, the average IQ score in large, diverse populations is 100 and the standard deviation is 15. Suppose that your sample of 300 members of your community gives you a mean IQ score of 108. Calculate a 90% confidence interval for the mean and indicate which answers come closest to those that would fill the blanks in the following interpretation: we can be 90% confident that the mean IQ score in this community lies between _____ and _____ .

2. Suppose that you are a city planner who obtains a sample of 20 randomly selected members of a mid-sized town in order to determine the average amount of money that residents spend on transportation each month (such as fuel, vehicle repairs, and public transit). You do not have the population standard deviation. To 3 decimal places, what is the critical value for the 95% confidence interval? In the same scenario as 7.09, suppose you obtained a mean of $167 spent on transportation and a standard deviation of $40. Calculate a 95% confidence interval for the mean and select the values that come closest to those that would fill the spaces in the following interpretation: we can be 95% confident that the mean amount of money spent on transportation lies between _____ and _____ .
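
Both intervals follow the same pattern, mean ± critical value × standard deviation / √n. The sketch below is only an illustration of that pattern. The helper name `mean_ci` is made up for this example, and the t critical value for 19 degrees of freedom is hard-coded from a standard t-table (an assumption, since Python's standard library has no t distribution).

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(xbar, sd, n, crit):
    """Return (lower, upper) for xbar +/- crit * sd / sqrt(n)."""
    margin = crit * sd / sqrt(n)
    return xbar - margin, xbar + margin


# Question 1: sigma is known, so the critical value is a standard normal quantile.
z_90 = NormalDist().inv_cdf(0.95)  # two-sided 90% leaves 5% in each tail
iq_low, iq_high = mean_ci(108, 15, 300, z_90)

# Question 2: sigma is unknown and n = 20, so Student's t with 19 df applies.
# ASSUMPTION: 2.093 is the usual t-table value for 95% confidence, df = 19.
T_95_DF19 = 2.093
cost_low, cost_high = mean_ci(167, 40, 20, T_95_DF19)

print(f"IQ: {iq_low:.2f} to {iq_high:.2f}")
print(f"Transportation: {cost_low:.2f} to {cost_high:.2f}")
```

If SciPy is available, the t critical value could be computed instead of read from a table.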

In: Math

(Combinations -- Poker Probabilities) Suppose you deal a poker hand of 5 cards from a standard...

(Combinations -- Poker Probabilities) Suppose you deal a poker hand of 5 cards from a standard deck.

(a) What is the probability of a flush (all the same suit) of all red cards?

(b) What is the probability of a full house where the 3-of-a-kind includes two black cards, and the 2-of-a-kind are not clubs?

(c) What is the probability of a pair, where among the 3 non-paired cards, we have 3 distinct suits?

Hint: These are straightforward modifications of the formulae given in lecture. You may quote formulae already given in lecture or lab without attribution and without explaining how to get the formula.
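
As a rough illustration of the counting approach for part (a), the sketch below counts hands whose five cards all come from one red suit. It assumes "flush" here includes straight flushes; if the lecture formula excludes them, those hands would need to be subtracted. Parts (b) and (c) depend on how the lecture formulae treat suits, so they are not attempted here.

```python
from math import comb

total_hands = comb(52, 5)             # all 5-card hands from a 52-card deck
one_suit_hands = comb(13, 5)          # 5 cards drawn from a single suit
red_flush_hands = 2 * one_suit_hands  # the suit is hearts or diamonds

# ASSUMPTION: straight flushes are counted as flushes in this tally.
p_red_flush = red_flush_hands / total_hands
print(red_flush_hands, total_hands, round(p_red_flush, 6))
```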

In: Math

11.Private Four-Year College Enrollment A random sample of enrollments in Pennsylvania’s private four-year colleges is listed...

11. Private Four-Year College Enrollment: A random sample of enrollments in Pennsylvania’s private four-year colleges is listed here. Check for normality. Answer: Not normal. Please show your work; this is a review for an exam.

1350, 1886, 1743, 1290, 1767, 2067, 1118, 3980, 1773, 4605, 1445, 3883, 1486, 980, 1217, 3587
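
One common textbook check for normality compares the mean with the median through Pearson's index of skewness, 3(mean − median)/s, alongside a histogram and a search for outliers. The sketch below computes only the index for the sample above, so it is a partial check under that assumption rather than a full normality test.

```python
from statistics import mean, median, stdev

enrollments = [1350, 1886, 1743, 1290, 1767, 2067, 1118, 3980,
               1773, 4605, 1445, 3883, 1486, 980, 1217, 3587]

sample_mean = mean(enrollments)
sample_median = median(enrollments)
sample_sd = stdev(enrollments)  # sample standard deviation (n - 1 denominator)

# Pearson's index of skewness: positive when the mean sits above the median,
# which points to a right-skewed distribution.
pearson_index = 3 * (sample_mean - sample_median) / sample_sd

print(sample_mean, sample_median, round(sample_sd, 1), round(pearson_index, 2))
```

A histogram of the same values would show the cluster below about 2100 and the separate group above 3500 that pulls the mean upward.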

In: Math

Here is a bivariate data set. x y 39 45 43 43 12 48 36 38...

Here is a bivariate data set.

x y
39 45
43 43
12 48
36 38
29 33
31 31
31 37
20 39
-4 51
52 31



Find the correlation coefficient and report it accurate to four decimal places.
r =
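
A minimal sketch of the calculation, using the definition r = Sxy / √(Sxx · Syy) on the ten pairs listed above. The variable names are chosen for this example only.

```python
from math import sqrt

x = [39, 43, 12, 36, 29, 31, 31, 20, -4, 52]
y = [45, 43, 48, 38, 33, 31, 37, 39, 51, 31]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = sxy / sqrt(sxx * syy)
print(f"r = {r:.4f}")
```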

In: Math

Determine the margin of error for a 99​% confidence interval to estimate the population mean when...

Determine the margin of error for a 99% confidence interval to estimate the population mean when s = 39 for the sample sizes below.

a) n = 14

b) n = 34

c) n = 53

Construct a 90% confidence interval to estimate the population mean when x̄ = 122 and s = 26 for the sample sizes below.

a) n = 40

b) n = 50

c) n = 100

In: Math

Problem 3: A linear regression by using famous data set found in Freedman et al. (1991)...

Problem 3: A linear regression using a famous data set found in Freedman et al. (1991), ‘Statistics’. Table 1 gives the per capita consumption of cigarettes in various countries in 1930 and the death rates (number of deaths per million people) from lung cancer for 1950.

Table 1: Death rate data in Freedman

Obs  Country        Cigarette  Deaths per million
1    Australia      480        180
2    Canada         500        150
3    Denmark        380        170
4    Finland        1100       350
5    Great Britain  1100       460
6    Iceland        230        60
7    Netherlands    490        240
8    Norway         250        90
9    Sweden         300        110
10   Switzerland    510        250
11   USA            1300       200

  1. Order the data by cigarette consumption and split them into 5 groups of sizes (2, 2, 2, 2, 3). Calculate the standard deviation within each group, and plot the standard deviation against cigarette consumption.
  2. Based on the graph in 1, what kind of transformation should be used to stabilize the variance?
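
The grouping step can be sketched as follows. This assumes the standard deviation is taken over the death rates within each group and that each group is located at its mean cigarette consumption; the problem statement does not say either explicitly. The plot itself is left out because it needs a plotting library.

```python
from statistics import mean, stdev

# (country, cigarettes per capita in 1930, lung cancer deaths per million in 1950)
data = [
    ("Australia", 480, 180), ("Canada", 500, 150), ("Denmark", 380, 170),
    ("Finland", 1100, 350), ("Great Britain", 1100, 460), ("Iceland", 230, 60),
    ("Netherlands", 490, 240), ("Norway", 250, 90), ("Sweden", 300, 110),
    ("Switzerland", 510, 250), ("USA", 1300, 200),
]

ordered = sorted(data, key=lambda row: row[1])  # ascending cigarette consumption
sizes = (2, 2, 2, 2, 3)

groups, start = [], 0
for size in sizes:
    groups.append(ordered[start:start + size])
    start += size

# ASSUMPTION: one point per group = (mean cigarettes, SD of deaths).
summary = [
    (mean([row[1] for row in group]), stdev([row[2] for row in group]))
    for group in groups
]

for mean_cig, sd_deaths in summary:
    print(f"{mean_cig:8.1f}  {sd_deaths:8.2f}")
```

If the spread grows with the level of consumption, as the printed pairs suggest, a log or square-root transformation is the usual candidate to consider.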

In: Math

Can someone please explain these steps from this data description All of Bubba Gump’s data has...

Can someone please explain these steps from this data description

All of Bubba Gump’s data has recently been integrated in a data warehouse. That enterprise data warehouse was built specifically to support data mining initiatives like the one you have been assigned to conduct, by consolidating data from multiple operations and channels in one place and integrating the data across sources for a complete view of the customer experience. For the first time, Bubba Gump analysts can link sales transactions to specific customers at specific restaurants, for example. It also means that you can link customer transactions across channels; that is, for any given customer, you can link to their restaurant purchases, their online purchases, and (in some cases) their purchases from third-party retail partners.

You have been selected to develop and execute the data mining analysis plan for Bubba Gump’s customer analysis project. Your project will be the first major data mining project conducted against the new Bubba Gump data warehouse. Because Bubba Gump’s data was not previously integrated in a single data warehouse, company leadership has never been able to analyze its customers across their complete experience. In other words, customer restaurant purchases, online purchases, and third-party retailer purchases could not be analyzed together previously; each channel had to be analyzed separately.

As a first step, a sample of 500 customers has been selected from the analytics data warehouse and given a survey in exchange for purchase credits at one of Bubba Gump’s sales channels. The survey sample was selected from the universe of customers who have made purchases from at least one Bubba Gump outlet (restaurant, web store, etc.). Responses to various customer satisfaction questions were recorded, and historical purchase information has been extracted from the data warehouse for each customer in the sample.

I am having trouble understanding what the questions below are asking. Could someone please explain them to me? Thank you.

Your task is to analyze the survey responses to understand whether there are natural “clusters” within Bubba Gump’s customer population. You are then to create a visualization of this survey data that describes Bubba Gump’s customers across any dimensions that define those subgroups.

Your Assignment
In your response, address the following critical elements:

Analysis Tools
What data mining tools will you use to perform the analysis? Why these particular ones?

Data Visualizations
What data visualizations will you use in your report, and why?

Research Question
What is the specific research question that needs to be addressed? What research question will you work from in order to analyze the given data for meaningful patterns?

Research Measurement
How will you determine if your research question was answered or if your hypothesis-generation was successful? How will you measure progress?

Follow-Up Questions
What are cogent follow-up questions or explorations that should follow from your initial research?

Research and Support
Are there any published sources or other resources that address your line of inquiry? Where do they fall short? How will they help guide your analysis?

In: Math

Test the Hypothesis If the Mean travel time in minutes between Point A to Point B...

Test the hypothesis that the mean travel time in minutes from Point A to Point B is equal to the mean travel time in minutes from Point B to Point A. First you must find the means and standard deviations. Then perform and list the complete required steps for the TWO required hypothesis tests and ALSO USE THE P-VALUE AS A REJECTION RULE FOR BOTH TESTS. One hypothesis test is an F test for the equality of the variances of the travel times, and the second is a t test for the equality of the means of the travel times in minutes. The F test must be performed first in order to select either Case 1 or Case 2 for the t test. PLEASE SHOW HOW YOU OBTAINED ALL ANSWERS.

Recorded Time values in minutes from point A to point B: 32, 34, 51, 30, 29, 35, 36, 29, 32, 29, 33, 32, 29, 30, 33, 30, 30, 33, 30, 31, 35, 35, 34, 32, 33, 33, 31, 33, 34, 30, 30, 29, 34, 32, 36, 29, 30, 32, 30, 33, 31

n1=41

Recorded Time values in minutes from point B to point A: 36, 28, 48, 28, 27, 54, 34, 29, 26, 34, 33, 42, 29, 34, 31, 48, 27, 42, 28, 45, 26, 43, 32, 41, 30, 36, 27, 44, 29, 29, 35, 26, 31, 28, 27, 28, 32, 41, 34, 28, 31

n2=41
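
A sketch of the test statistics only, assuming Case 1 means the pooled (equal-variance) t test and Case 2 means the unequal-variance (Welch) t test; the question does not define the cases. The p-values need F and t distributions, which are not in Python's standard library, so they would come from tables, Excel, or a package such as SciPy.

```python
from math import sqrt
from statistics import mean, variance

a_to_b = [32, 34, 51, 30, 29, 35, 36, 29, 32, 29, 33, 32, 29, 30, 33, 30, 30,
          33, 30, 31, 35, 35, 34, 32, 33, 33, 31, 33, 34, 30, 30, 29, 34, 32,
          36, 29, 30, 32, 30, 33, 31]
b_to_a = [36, 28, 48, 28, 27, 54, 34, 29, 26, 34, 33, 42, 29, 34, 31, 48, 27,
          42, 28, 45, 26, 43, 32, 41, 30, 36, 27, 44, 29, 29, 35, 26, 31, 28,
          27, 28, 32, 41, 34, 28, 31]

n1, n2 = len(a_to_b), len(b_to_a)
m1, m2 = mean(a_to_b), mean(b_to_a)
v1, v2 = variance(a_to_b), variance(b_to_a)  # sample variances

# F statistic with the larger sample variance in the numerator.
f_stat = max(v1, v2) / min(v1, v2)

# Pooled (equal-variance) t statistic, df = n1 + n2 - 2.
pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_pooled = (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# Unequal-variance (Welch) t statistic; its df comes from the Welch formula.
t_welch = (m1 - m2) / sqrt(v1 / n1 + v2 / n2)

print(f"means: {m1:.3f}, {m2:.3f}  sds: {sqrt(v1):.3f}, {sqrt(v2):.3f}")
print(f"F = {f_stat:.3f}, t (pooled) = {t_pooled:.3f}, t (Welch) = {t_welch:.3f}")
```

Because the two samples have the same size, the pooled and Welch statistics coincide; only their degrees of freedom, and therefore their p-values, differ.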

In: Math

When you do a Kruskal Wallis test, why do you always use a right tail test?

When you do a Kruskal Wallis test, why do you always use a right tail test?
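
One way to see why only the right tail matters: the Kruskal-Wallis statistic H = 12 / (N(N+1)) · Σ Rᵢ²/nᵢ − 3(N+1) is built from squared rank sums, so any kind of difference between groups pushes H upward, while values near zero mean the groups' ranks are well mixed. The sketch below uses made-up data with no ties (so no tie correction is applied) to show both situations.

```python
def kruskal_wallis_h(groups):
    """H statistic for groups whose values are all distinct (no tie correction)."""
    pooled = sorted(value for group in groups for value in group)
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # ranks 1..N
    n_total = len(pooled)
    rank_term = sum(
        sum(rank[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


# Hypothetical data: fully separated groups versus fully interleaved groups.
separated = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
interleaved = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

h_separated = kruskal_wallis_h(separated)
h_interleaved = kruskal_wallis_h(interleaved)
print(h_separated, h_interleaved)
```

The separated groups give the larger H, so evidence against the null hypothesis always sits in the upper tail of the reference distribution.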

In: Math

A budget is an expression of management's expectations and goals concerning future revenues and costs. To...

A budget is an expression of management's expectations and goals concerning future revenues and costs. To increase their effectiveness, many budgets are flexible, including allowances for the effect of variation in uncontrolled variables. For example, the costs and revenues of many production plants are greatly affected by the number of units produced by the plant during the budget period, and this may be beyond a plant manager's control. Standard cost-accounting procedures can be used to adjust the direct-cost parts of the budget for the level of production, but it is often more difficult to handle overhead. In many cases, statistical methods are used to estimate the relationship between overhead (y) and the level of production (x) using historical data. As a simple example, consider the historical data for a certain plant. Enter the data into EXCEL and analyze it to answer the following items.

Production (in 10,000) units: 5 6 7 8 9 10 11
Overhead costs (in $1000): 13 11.4 14 15 15.6 15.2 17.6

(a) Test the hypotheses H0: β1 = 0 versus  Ha: β1 ≠ 0 at the 5% level of significance.

State the decision rule. (Which option below should we use?)

1. Reject H0 if p value > 0.025. Do not reject H0 if p value ≤ 0.025.
2. Reject H0 if p value > 0.05. Do not reject H0 if p value ≤ 0.05.
3. Reject H0 if p value < 0.025. Do not reject H0 if p value ≥ 0.025.
4. Reject H0 if p value < 0.05. Do not reject H0 if p value ≥ 0.05.


State the appropriate test statistic name, degrees of freedom, test statistic value, and the associated p-value (Enter your degrees of freedom as a whole number, the test statistic value to three decimal places, and the p-value to four decimal places).

(Choose: (x, z, p, t, G)) ( _______  ) = ___________  , p (Choose: (≥, >, <, ≤, = )) ___________

State your decision. Are production and overhead costs linearly related?

1.) Reject the null hypothesis: Yes, production and overhead costs appear to be linearly related

2.) Reject the null hypothesis: No, production and overhead costs do not appear to be linearly related.

3.) Do not reject the null hypothesis: Yes, production and overhead costs appear to be linearly related.

4.) Do not reject the null hypothesis: No, production and overhead costs do not appear to be linearly related.



(b) Test the hypotheses  H0: β1 = 1 versus  Ha: β1 ≠ 1 at the 5% level of significance.

State the appropriate test statistic name, degrees of freedom, test statistic value, and the associated p-value (Enter your degrees of freedom as a whole number, the test statistic value to three decimal places, and the p-value to four decimal places).

(Choose: (x, t, p, G, z)) (________  ) = __________ , p (Choose:(= ≥ > ≤ < )) _____________

State your decision. Choose one of the following:

1.) Do not reject the null hypothesis: The slope of the line representing the relationship between overhead costs and production is not significantly different from 1.

2.) Do not reject the null hypothesis: The slope of the line representing the relationship between overhead costs and production is significantly different from 1.

3.) Reject the null hypothesis: The slope of the line representing the relationship between overhead costs and production is not significantly different from 1.

4.) Reject the null hypothesis: The slope of the line representing the relationship between overhead costs and production is significantly different from 1.
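
The question asks for Excel, but the same quantities can be cross-checked by hand. The sketch below fits the least-squares line and forms the t statistics for both null hypotheses, t = (b1 − β1,0) / SE(b1) with n − 2 degrees of freedom. The p-values are not computed because the standard library has no t distribution; they would come from Excel or a t-table.

```python
from math import sqrt

production = [5, 6, 7, 8, 9, 10, 11]               # in 10,000 units
overhead = [13, 11.4, 14, 15, 15.6, 15.2, 17.6]    # in $1000

n = len(production)
x_bar = sum(production) / n
y_bar = sum(overhead) / n

sxx = sum((x - x_bar) ** 2 for x in production)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(production, overhead))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual sum of squares and the standard error of the slope.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(production, overhead))
df = n - 2
se_slope = sqrt(sse / df / sxx)

t_for_zero = (slope - 0) / se_slope  # part (a): H0 is beta1 = 0
t_for_one = (slope - 1) / se_slope   # part (b): H0 is beta1 = 1

print(f"slope = {slope:.4f}, SE = {se_slope:.4f}, df = {df}")
print(f"t (beta1 = 0) = {t_for_zero:.3f}, t (beta1 = 1) = {t_for_one:.3f}")
```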

In: Math

James is a baseball player who hits left handed. Based on his past statistics, his strikeout...

James is a baseball player who hits left handed. Based on his past statistics, his strikeout rate against left-handed pitchers is 12.5%. He would like to reduce this rate, so he changes his batting stance. To test whether it works, he uses a pitching machine to simulate 200 at bats. In these, he struck out 16 times.

James conducts a one-proportion hypothesis test at the 5% significance level, to test whether the true proportion of strikeouts against left-handed pitchers using James's new stance is less than 12.5%.

(a) H0:p=0.125; Ha:p<0.125, which is a left-tailed test.

(b) Use Excel to test whether the true proportion of strikeouts against left-handed pitchers using James's new stance is less than 12.5%. Identify the test statistic, z, and p-value from the Excel output, rounding to three decimal places.
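
As a cross-check on the Excel output, the left-tailed one-proportion z test can be sketched with the standard library. The standard error uses the null proportion 0.125, as is usual for this test.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.125        # strikeout rate under H0
n = 200           # simulated at bats
strikeouts = 16

p_hat = strikeouts / n
standard_error = sqrt(p0 * (1 - p0) / n)  # uses p0, not p_hat
z = (p_hat - p0) / standard_error

# Left-tailed test: the p-value is the area to the left of z.
p_value = NormalDist().cdf(z)

print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, p-value = {p_value:.3f}")
```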

In: Math

Problem 3: A linear regression by using famous data set found in Freedman et al. (1991)...

Problem 3: A linear regression using a famous data set found in Freedman et al. (1991), ‘Statistics’. Table 1 gives the per capita consumption of cigarettes in various countries in 1930 and the death rates (number of deaths per million people) from lung cancer for 1950.

Table 1: Death rate data in Freedman

Obs  Country        Cigarette  Deaths per million
1    Australia      480        180
2    Canada         500        150
3    Denmark        380        170
4    Finland        1100       350
5    Great Britain  1100       460
6    Iceland        230        60
7    Netherlands    490        240
8    Norway         250        90
9    Sweden         300        110
10   Switzerland    510        250
11   USA            1300       200

  1. Perform the simple linear regression with and without the USA and make the overlay graph.
  2. The question is: should we use the USA data, and should the regression line pass through the origin? Give your answer.
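
A minimal sketch of the two fits, using ordinary least squares with an intercept. The helper name `fit_line` is made up for this example, and the overlay graph is omitted because it needs a plotting library.

```python
# (country, cigarettes per capita in 1930, lung cancer deaths per million in 1950)
data = [
    ("Australia", 480, 180), ("Canada", 500, 150), ("Denmark", 380, 170),
    ("Finland", 1100, 350), ("Great Britain", 1100, 460), ("Iceland", 230, 60),
    ("Netherlands", 490, 240), ("Norway", 250, 90), ("Sweden", 300, 110),
    ("Switzerland", 510, 250), ("USA", 1300, 200),
]


def fit_line(rows):
    """Least-squares (intercept, slope) for deaths regressed on cigarettes."""
    xs = [row[1] for row in rows]
    ys = [row[2] for row in rows]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept_all, slope_all = fit_line(data)
intercept_no_usa, slope_no_usa = fit_line([row for row in data if row[0] != "USA"])

print(f"with USA:    deaths = {intercept_all:.2f} + {slope_all:.4f} * cigarettes")
print(f"without USA: deaths = {intercept_no_usa:.2f} + {slope_no_usa:.4f} * cigarettes")
```

Comparing the two intercepts is one informal way to judge whether a line through the origin is plausible once the USA is set aside.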

In: Math

The Aluminum Association reports that the average American uses 56.8 pounds of aluminum in a year....

The Aluminum Association reports that the average American uses 56.8 pounds of aluminum in a year. A random sample of 51 households is monitored for one year to determine aluminum usage. If the population standard deviation of annual usage is 12.1 pounds, what is the probability that the sample mean will be each of the following?

a. More than 61 pounds

b. More than 56 pounds

c. Between 55 and 57 pounds

d. Less than 54 pounds

e. Less than 48 pounds

(Round the values of z to 2 decimal places. Round your answers to 4 decimal places.)
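
The sample mean of n = 51 households has standard error σ/√n, so each part is a normal probability for x̄. The sketch below uses unrounded z values, so its answers can differ slightly in the fourth decimal place from a solution that rounds z to 2 decimals as the exercise requests.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 56.8, 12.1, 51
xbar_dist = NormalDist(mu, sigma / sqrt(n))  # sampling distribution of the mean

p_more_than_61 = 1 - xbar_dist.cdf(61)
p_more_than_56 = 1 - xbar_dist.cdf(56)
p_between_55_57 = xbar_dist.cdf(57) - xbar_dist.cdf(55)
p_less_than_54 = xbar_dist.cdf(54)
p_less_than_48 = xbar_dist.cdf(48)

for label, p in [("a", p_more_than_61), ("b", p_more_than_56),
                 ("c", p_between_55_57), ("d", p_less_than_54),
                 ("e", p_less_than_48)]:
    print(f"{label}. {p:.4f}")
```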

In: Math

What is the error of the predicted Systolic BP when Age = 39? Age Systolic BP...

What is the error of the predicted Systolic BP when Age = 39?

Age  Systolic BP  Year  Stories  Height  Year  Germany GDP
47   145          1990  54       770     1950  5.725433
65   162          1980  47       677     1951  6.256754
46   142          1990  28       428     1952  6.70308
67   170          1989  38       410     1953  7.256435
42   124          1966  29       371     1954  7.72644
67   158          1976  38       504     1955  8.570349
56   154          1974  80       1136    1956  9.076571
64   162          1991  52       695     1957  9.45931
56   150          1982  45       551     1958  9.665697
59   140          1986  40       550     1959  10.259906
34   110          1931  49       568     1960  10.608815
42   128          1979  33       504     1961  11.032132
48   130          1988  50       560     1962  11.384714
45   135          1973  40       512     1963  11.611703
17   114          1981  31       448     1964  12.266443
20   116          1983  40       538     1965  12.813883
19   124          1968  27       410     1966  13.016213
36   136          1927  31       409     1967  12.964814
50   142          1969  35       504     1968  13.730252
39   120          1988  57       777     1969  14.665157
21   120          1987  31       496     1970  15.392277
44   160          1960  26       386     1971  15.720841
53   158          1984  39       530     1972  16.197464
63   144          1976  25       360     1973  16.907173
29   130          1920  23       355     1974  16.97702
25   125          1931  102      1250    1975  16.72403
69   175          1989  72       802     1976  17.6721
                  1907  57       741     1977  18.195684
                  1988  54       739     1978  18.798212
                  1990  56       650     1979  19.640699
                  1973  45       592     1980  19.935295
                  1983  42       577     1981  19.903635
                  1971  36       500     1982  19.723139
                  1969  30       469     1983  19.985983
                  1971  22       320
                  1988  31       441
                  1989  52       845
                  1973  29       435
                  1987  34       435
                  1931  20       375
                  1931  33       364
                  1924  18       340
                  1931  23       375
                  1991  30       450
                  1973  38       529
                  1976  31       412
                  1990  62       722
                  1983  48       574
                  1984  29       498
                  1986  40       493
                  1986  30       379
                  1992  42       579
                  1973  36       458
                  1988  33       454
                  1979  72       952
                  1972  57       784
                  1930  34       476
                  1978  46       453
                  1978  30       440
                  1977  21       428
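
Only the Age and Systolic BP columns are needed for this question. A sketch of the calculation, assuming "error" means the residual (observed minus predicted) from a simple least-squares regression of systolic BP on age, evaluated at the observation with Age = 39:

```python
ages = [47, 65, 46, 67, 42, 67, 56, 64, 56, 59, 34, 42, 48, 45,
        17, 20, 19, 36, 50, 39, 21, 44, 53, 63, 29, 25, 69]
systolic = [145, 162, 142, 170, 124, 158, 154, 162, 150, 140, 110, 128, 130, 135,
            114, 116, 124, 136, 142, 120, 120, 160, 158, 144, 130, 125, 175]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(systolic) / n

sxx = sum((x - x_bar) ** 2 for x in ages)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, systolic))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# The data contain one observation with Age = 39; its observed BP is 120.
observed = systolic[ages.index(39)]
predicted = intercept + slope * 39
residual = observed - predicted  # ASSUMPTION: error = observed - predicted

print(f"BP = {intercept:.3f} + {slope:.4f} * Age")
print(f"predicted = {predicted:.2f}, residual = {residual:.2f}")
```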

In: Math

Assume that females have pulse rates that are normally distributed with a mean of mu equals...

Assume that females have pulse rates that are normally distributed with a mean of μ = 73.0 beats per minute and a standard deviation of σ = 12.5 beats per minute. Complete parts (a) through (c) below.

a. If 1 adult female is randomly selected, find the probability that her pulse rate is between 67 beats per minute and 79 beats per minute. The probability is _____. (Round to four decimal places as needed.)

b. If 4 adult females are randomly selected, find the probability that they have pulse rates with a mean between 67 beats per minute and 79 beats per minute. The probability is _____. (Round to four decimal places as needed.)

c. Why can the normal distribution be used in part (b), even though the sample size does not exceed 30?

A. Since the distribution is of individuals, not sample means, the distribution is a normal distribution for any sample size.
B. Since the distribution is of sample means, not individuals, the distribution is a normal distribution for any sample size.
C. Since the original population has a normal distribution, the distribution of sample means is a normal distribution for any sample size.
D. Since the mean pulse rate exceeds 30, the distribution of sample means is a normal distribution for any sample size.
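
Parts (a) and (b) differ only in the standard deviation used: σ for one individual, σ/√n for the mean of n = 4. A short sketch with the standard library, using unrounded z values (a table lookup with z rounded to 2 decimals may differ slightly in the last digit):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 73.0, 12.5

individual = NormalDist(mu, sigma)              # one randomly selected female
mean_of_four = NormalDist(mu, sigma / sqrt(4))  # mean pulse rate of 4 females

p_individual = individual.cdf(79) - individual.cdf(67)
p_mean_of_four = mean_of_four.cdf(79) - mean_of_four.cdf(67)

print(f"a. {p_individual:.4f}")
print(f"b. {p_mean_of_four:.4f}")
```

The sampling distribution in (b) is exactly normal here because the population itself is assumed normal, so no large-sample condition is needed.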

In: Math