Questions
A company would like to estimate its total cost equation using customer records.  The company has randomly...

A company would like to estimate its total cost equation using customer records.  The company has randomly sampled 28 customer records. Each customer record contains a Customer #, the Order Size, and the Total Cost of the Order.  The analyst remembers from accounting and economics classes taken in college that

TOTAL COST = Fixed Costs + Variable Cost per Unit * Order Size.

The analyst sees that this is a linear relationship in which the TOTAL COST depends on the Fixed Costs, which do not depend on order size, and a Variable Cost per Unit, which is multiplied by the Order Size. The analyst decides to use simple linear regression to estimate the firm’s Total Cost function. Use the data file, Estimating a Total Cost Regression Model.xlsx, to answer the following questions:

  1. What is the dependent variable in this analysis? What is the independent variable in this analysis?
  2. Use Excel to estimate the regression model. State the estimated total cost function.
  3. What is the estimated Fixed Cost for the company? Remember that fixed costs are independent of output, so you can estimate them as the Total Cost when output is 0. (Look at the regression output produced by Excel for question 2.)
  4. What is the estimated average unit variable cost for the company? (Look at the regression output produced by Excel for question 2.)

Customer #   Order Size (Quantity)   Total Cost of Order
10211 28 1631
10212 31 1923
10213 43 2070
10214 47 2392
10215 32 1886
10216 43 2307
10217 25 1486
10218 46 2448
10219 41 2210
10220 48 2401
10221 29 1860
10222 32 1786
10223 49 2485
10224 44 2203
10225 33 1855
10226 46 2380
10227 42 2102
10228 31 1683
10229 30 1706
10230 35 1955
10231 34 1992
10232 33 1926
10233 27 1852
10234 32 1807
10235 31 1880
10236 42 2134
10237 39 1979
10238 36 1882

In: Statistics and Probability

An accounting firm checks the accuracy of a company’s records, which contain 13 inaccurate accounts out...

An accounting firm checks the accuracy of a company’s records, which contain 13 inaccurate accounts out of a total of 50. Because of time constraints, the accounting firm can audit only eight of the 50 accounts. The company supplied the accounting firm with eight randomly selected accounts. However, none of the eight accounts contains inaccuracies. In light of this, an investigation asks: is it true that the company randomly selected the eight accounts to be audited, or did the company purposefully supply only accurate accounts at each step of the process? State the outcome of the investigation at the significance level α = 0.10.
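
One way to frame this is as a hypergeometric calculation: if the eight accounts really were drawn at random without replacement, how likely is it that none of the 13 inaccurate accounts appears? The sketch below assumes that framing (the question does not name a specific test) and uses illustrative variable names.

```python
from math import comb

total_accounts = 50
inaccurate = 13
audited = 8
alpha = 0.10

# Under H0 (random selection without replacement), the number of inaccurate
# accounts in the sample is hypergeometric. P(0 inaccurate) is the chance of
# choosing all 8 accounts from the 37 accurate ones.
p_value = comb(total_accounts - inaccurate, audited) / comb(total_accounts, audited)
reject_h0 = p_value < alpha

print(f"P(no inaccurate accounts in sample | random selection) = {p_value:.4f}")
print(f"Reject random selection at alpha = {alpha}: {reject_h0}")
```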

A random sample of 24 chemical solutions is obtained, and their strengths are measured. The sample...

A random sample of 24 chemical solutions is obtained, and their strengths are measured. The sample mean is 5437.2 and the sample standard deviation is 376.9.

(a) Construct a two-sided 98% confidence interval for the average strength.

(b) Estimate how many additional chemical solutions need to be measured in order to obtain a two-sided 98% confidence interval for the average strength with a margin of error of 150.
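
Both parts follow from the t-interval formula, x̄ ± t·s/√n. The sketch below is a rough illustration only: the critical value 2.500 is an assumed table lookup for t with 23 degrees of freedom and 0.01 in each tail, and part (b) reuses that same critical value and the current sample standard deviation as a planning estimate.

```python
from math import ceil, sqrt

n, xbar, s = 24, 5437.2, 376.9
t_crit = 2.500  # assumed t-table value: upper 0.01 point, 23 degrees of freedom

# (a) two-sided 98% confidence interval for the mean strength
margin = t_crit * s / sqrt(n)
ci = (xbar - margin, xbar + margin)

# (b) smallest n with t_crit * s / sqrt(n) <= 150, treating t_crit and s as fixed
target_margin = 150
n_required = ceil((t_crit * s / target_margin) ** 2)
additional = max(n_required - n, 0)

print(f"98% CI: ({ci[0]:.1f}, {ci[1]:.1f})")
print(f"Estimated additional measurements needed: {additional}")
```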

In the dataset airways (see R code), we have the change in airflow from moderate exercise...

In the dataset airways (see R code), we have the change in airflow from moderate exercise for 19 subjects under 2 different exposure conditions – regular air (air) and 0.25% sulfur dioxide (so2).

a) Look at the correlation, and use the t-table to test the null hypothesis that air flow change under these two conditions is uncorrelated. Test at significance level 0.05. Show your work.

b) Use a linear model and the summary function in R to test the null hypothesis that air flow changes are uncorrelated among people at the alpha = 0.05 level.

c) Look at the scatterplot of the airway changes. Does a linear model fully explain the relationship between these variables? Why or why not?

R code:

##################

subject = 1:19

air = c(0.82,0.86,1.86,1.64,12.57,1.56,1.28,1.08,4.29,1.37,14.68,3.64,3.89,0.58,9.50,0.93,0.49,31.04,1.66)

so2 = c(0.72,1.05,1.40,2.30,13.49,0.62,2.41,2.32,8.19,6.33,19.88,8.87,9.25,6.59,2.17,9.93,13.44,16.25,19.89)

airways = data.frame(subject,air,so2)

airways

# part a, get the correlation

cor(air,so2)

# part b, make a model and get the summary

fit = lm(air ~ so2, data=airways)

summary(fit)

## part c, to see a scatterplot...

plot(air ~ so2, data=airways)

Some researchers in child development wanted to develop ways to increase the spatial-temporal reasoning of preschool...

Some researchers in child development wanted to develop ways to increase the spatial-temporal reasoning of preschool children. This ability, sometimes referred to as thinking in pictures, is important for generating and conceptualizing solutions to multistep problems and is an important aspect of early childhood development. A researcher designed a study to evaluate several methods proposed to accelerate the growth in spatial-temporal reasoning. The researcher wanted to find out which one, if any, was most effective in increasing development in this area. The 3 methods proposed included: taking piano lessons for 3 months (piano), playing specially developed computer video games for 3 months (computer), and playing specially developed games in small groups supervised by a trained instructor (instructor). The researcher also included a control group that consisted of children who did not receive any special instruction (control). The effectiveness of the programs was assessed by measuring reasoning before the study and then after the 3-month intervention period. To simplify analysis, the response variable in this study was the difference in the two reasoning scores (score after 3-month intervention period – score at beginning of study).

1. Levene’s test is a common test used to assess whether the homogeneity of variance assumption is met. The researcher carried out this test to determine whether the assumption was met for these data. The results are shown below. Based on the results, can the researcher report that it seems reasonable to believe that this assumption has been met?

Levene's Test for Homogeneity of Variance (center = "mean")
      Df F value Pr(>F)
group  3  0.0579 0.9816
      96

a. No, the assumption is not met for these data, the test of the null hypothesis of equal variances is rejected (p<.05).
b. No, the assumption is not met for these data, the test of the null hypothesis of equal variances is not rejected (p>.05).
c. Yes, the assumption seems to be met for these data, the test of the null hypothesis of equal variances is rejected (p<.05).
d. Yes, the assumption seems to be met for these data, the test of the null hypothesis of equal variances is not rejected (p>.05).

2. The ANOVA table for this study is shown below. However, some of the values from the table are missing. What is the F value (c)?

Analysis of Variance Table

Response: reasoningdiff
          Df Sum Sq Mean Sq F value    Pr(>F)
method     3 727.24       a       c 7.807e-12 ***
Residuals 96 952.30       b
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

a. 242.413
b. Cannot be determined from the information provided.
c. 24.437
d. 9.92
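
The missing table entries follow from arithmetic on the values that are shown: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. A minimal sketch, using the numbers from the ANOVA table above (variable names are illustrative):

```python
# Values read from the ANOVA table
ss_method, df_method = 727.24, 3
ss_resid, df_resid = 952.30, 96

ms_method = ss_method / df_method   # entry (a)
ms_resid = ss_resid / df_resid      # entry (b)
f_value = ms_method / ms_resid      # entry (c)

print(f"a = {ms_method:.3f}, b = {ms_resid:.3f}, c = {f_value:.3f}")
```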

3. The ANOVA result in question 2 is being used to test which one of the following null hypotheses?

a. Ho:   b. Ho:   c. Ho:   d. Ho:

4. Consider the p-value in the output presented in question 2 (data analysis output: 7.807e-12 ***). What decision would be made about the null hypothesis using a significance level of .01?

a. The null hypothesis would not be rejected; the obtained p-value is greater than the chosen significance level.
b. The null hypothesis would be rejected; the obtained p-value is less than the chosen significance level.
c. The null hypothesis would be rejected; the obtained p-value is greater than the chosen significance level.
d. The null hypothesis would not be rejected; the obtained p-value is less than the chosen significance level.

5. Assuming it was appropriate to conduct post hoc tests, the following results were obtained for the child development study. Based on these results and using a significance level of .05, which groups do not differ?

  Posthoc multiple comparisons of means : Tukey HSD
    95% family-wise confidence level

$method
                      diff    lwr.ci     upr.ci    pval
piano-instructor    -3.608 -5.937181 -1.2788192 0.00059 ***
computer-instructor -2.636 -4.965181 -0.3068192 0.01998 *
control-instructor  -7.512 -9.841181 -5.1828192 4.3e-10 ***
computer-piano       0.972 -1.357181  3.3011808 0.69580
control-piano       -3.904 -6.233181 -1.5748192 0.00017 ***
control-computer    -4.876 -7.205181 -2.5468192 2.1e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

a. computer-piano
b. control-computer
c. piano-instructor
d. computer-instructor

6. In reporting the results of the child development study, the researcher produced the table of descriptive statistics that you see below. Remember, the response variable of interest is the difference in reasoning score (score after intervention period – score before intervention started). Based on the calculation of this variable, a positive value would indicate that the reasoning score was higher after the intervention period and a negative value would indicate that the reasoning score was lower after the intervention period. In general, a higher reasoning score would be considered a better outcome.

  method     count   mean     sd
1 instructor    25   7.4    3.37
2 piano         25   3.79   2.75
3 computer      25   4.76   3.14
4 control       25  -0.112  3.31

Based on the descriptive table, the results shown in question 2 and the information provided, which intervention, if any, would you recommend to early childhood educators as a means of increasing the spatial-temporal reasoning in preschool children?

a. instructor
b. piano
c. computer
d. does not seem to matter

An American Medical Association study showed 20% of all Americans suffer from high blood pressure. Suppose...

An American Medical Association study showed that 20% of all Americans suffer from high blood pressure. Suppose we randomly sample ten Americans to determine the number in the sample who have high blood pressure.

  1. What is the probability that at most two persons have high blood pressure? _____
  2. What is the probability that exactly five persons have high blood pressure? _____
  3. What is the probability that between two and six persons, inclusive, have high blood pressure? ______
  4. What is the variance of the number of persons who have high blood pressure (in random samples of ten individuals)? _______
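
With ten independent draws and a 20% chance on each, the count of people with high blood pressure can be modeled as binomial with n = 10 and p = 0.20. The sketch below assumes that model; the helper `binom_pmf` is a hypothetical name introduced for illustration.

```python
from math import comb

n, p = 10, 0.20

def binom_pmf(k: int) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_at_most_2 = sum(binom_pmf(k) for k in range(0, 3))   # P(X <= 2)
p_exactly_5 = binom_pmf(5)                             # P(X = 5)
p_2_to_6 = sum(binom_pmf(k) for k in range(2, 7))      # P(2 <= X <= 6)
variance = n * p * (1 - p)                             # Var(X) = np(1 - p)

print(f"P(X <= 2)      = {p_at_most_2:.4f}")
print(f"P(X = 5)       = {p_exactly_5:.4f}")
print(f"P(2 <= X <= 6) = {p_2_to_6:.4f}")
print(f"Var(X)         = {variance:.2f}")
```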

Let x be a random variable that represents the weights in kilograms (kg) of healthy adult...

Let x be a random variable that represents the weights in kilograms (kg) of healthy adult female deer (does) in December in a national park. Then x has a distribution that is approximately normal with mean μ = 50.0 kg and standard deviation σ = 8.6 kg. Suppose a doe that weighs less than 41 kg is considered undernourished.

(a) What is the probability that a single doe captured (weighed and released) at random in December is undernourished? (Round your answer to four decimal places.)

(b) If the park has about 2700 does, what number do you expect to be undernourished in December? (Round your answer to the nearest whole number.)

(c) To estimate the health of the December doe population, park rangers use the rule that the average weight of n = 65 does should be more than 47 kg. If the average weight is less than 47 kg, it is thought that the entire population of does might be undernourished. What is the probability that the average weight x̄ for a random sample of 65 does is less than 47 kg (assuming a healthy population)? (Round your answer to four decimal places.)

(d) Compute the probability that x̄ < 51.6 kg for 65 does (assume a healthy population). (Round your answer to four decimal places.)

Suppose park rangers captured, weighed, and released 65 does in December, and the average weight was x̄ = 51.6 kg. Do you think the doe population is undernourished or not? Explain.

Since the sample average is above the mean, it is quite unlikely that the doe population is undernourished.

Since the sample average is above the mean, it is quite likely that the doe population is undernourished.

Since the sample average is below the mean, it is quite unlikely that the doe population is undernourished.

Since the sample average is below the mean, it is quite likely that the doe population is undernourished.
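
Parts (a) through (d) use two normal distributions: the weight of a single doe, N(50.0, 8.6), and the mean of 65 does, which has the same center but standard deviation 8.6/√65. A minimal sketch with Python's `statistics.NormalDist` (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50.0, 8.6, 65

single_doe = NormalDist(mu, sigma)             # weight of one doe
sample_mean = NormalDist(mu, sigma / sqrt(n))  # mean weight of n = 65 does

p_under = single_doe.cdf(41)                # (a) P(x < 41)
expected_under = round(2700 * p_under)      # (b) expected count among 2700 does
p_mean_below_47 = sample_mean.cdf(47)       # (c) P(sample mean < 47)
p_mean_below_51_6 = sample_mean.cdf(51.6)   # (d) P(sample mean < 51.6)

print(f"(a) {p_under:.4f}")
print(f"(b) {expected_under}")
print(f"(c) {p_mean_below_47:.4f}")
print(f"(d) {p_mean_below_51_6:.4f}")
```

Answers read from a printed z-table may differ slightly in the last decimal place because the table rounds z to two decimals.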

55% of the males in our Spring 2016 Stats class out of a sample of 161...

55% of the males in our Spring 2016 Stats class, out of a sample of 161 males, are from South Dakota. 52% of the females, out of a sample of 157, are from South Dakota. Assume that the two samples were independent and representative. Construct and interpret a 99% large-sample confidence interval for the difference between the proportions of males and females in our Stats class who are from South Dakota. Can we conclude that there is a significant difference between the proportions of males and females who are from South Dakota at the 99% level? Be sure to check the sample size first.
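
The large-sample interval for a difference of proportions is (p₁ − p₂) ± z·√(p₁q₁/n₁ + p₂q₂/n₂). The sketch below assumes the common rule of thumb that each group needs at least 10 expected successes and 10 expected failures; your course may use a different threshold. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_m, n_m = 0.55, 161   # males: sample proportion, sample size
p_f, n_f = 0.52, 157   # females: sample proportion, sample size

# Sample-size check (assumed rule of thumb: at least 10 per cell)
cells = [n_m * p_m, n_m * (1 - p_m), n_f * p_f, n_f * (1 - p_f)]
large_enough = all(c >= 10 for c in cells)

z = NormalDist().inv_cdf(1 - 0.01 / 2)   # two-sided 99% critical value
diff = p_m - p_f
se = sqrt(p_m * (1 - p_m) / n_m + p_f * (1 - p_f) / n_f)
lower, upper = diff - z * se, diff + z * se
contains_zero = lower <= 0 <= upper

print(f"Sample sizes adequate: {large_enough}")
print(f"99% CI for p_male - p_female: ({lower:.3f}, {upper:.3f})")
print(f"Interval contains 0 (no significant difference): {contains_zero}")
```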

Yale Law School says 74% of their students pass the bar exam on their first try....

Yale Law School says 74% of their students pass the bar exam on their first try.

a) To simulate passing students, we could assign the random digits as:
    1. 00 to 49 = pass first try, 50 to 99 = fail first try
    2. 0 to 7 = pass first try, 8 to 9 = fail first try
    3. 00 to 73 = pass first try, 74 to 99 = fail first try
    4. 0 to 4 = pass first try, 5 to 9 = fail first try

Using the random digit assignment from part a), and starting at line 106 simulate a class of 20 students.

Student        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
Random digit
Pass or fail

b) Which student was the first in the list to fail the exam?

c) What percentage of these 20 students passed on their first try?

d) How does that percentage compare to the 74% we set up as our probability model? If there is a difference, why do you think that is?

The distance traveled by individuals to work is exponentially distributed, with a mean of 9 miles. a)...

The distance traveled by individuals to work is exponentially distributed, with a mean of 9 miles.
a) What is the probability that they travel between 5 and 11 miles?

b) What is the 80th percentile of this distribution?
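
For an exponential distribution with mean m, the CDF is F(x) = 1 − e^(−x/m), and the p-th percentile solves F(x) = p, giving x = −m·ln(1 − p). A minimal sketch of both parts (the helper `exp_cdf` is a hypothetical name used for illustration):

```python
from math import exp, log

mean_miles = 9.0

def exp_cdf(x: float) -> float:
    """CDF of an exponential distribution with the given mean."""
    return 1 - exp(-x / mean_miles)

p_5_to_11 = exp_cdf(11) - exp_cdf(5)      # a) P(5 < X < 11)
pct_80 = -mean_miles * log(1 - 0.80)      # b) 80th percentile

print(f"P(5 < X < 11)   = {p_5_to_11:.4f}")
print(f"80th percentile = {pct_80:.2f} miles")
```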

Question 4 The list of all means you would compute from all possible samples you could...

Question 4

The list of all means you would compute from all possible samples you could draw of a given size from a given population is called the:

Standard Error of the Mean

Confidence Level

Sampling Distribution of Means

Confidence Interval

Question 5

The larger the sample size, the _______ the standard error of the mean.

larger

smaller

more dispersed.

less dispersed.

Question 6

What does the central limit theorem tell us about the mean of the sampling distribution of means?

The mean of the sampling distribution of means is always equal to 0.

The mean of the sampling distribution of means is equal to the overall population mean.

The mean of the sampling distribution of means is equal to the overall population standard deviation.

The central limit theorem tells us nothing about the mean of the sampling distribution of means.

Question 7

What does the central limit theorem tell us about the standard deviation of the sampling distribution of means, or standard error?

The standard error is equal to the standard deviation of the overall population divided by the square root of the sample size.

The standard error is equal to either 1.97 or 2.57.

The standard error is equal to the sample mean times the sample size.

The central limit theorem tells us nothing about the standard deviation of the sampling distribution of means, or standard error.

Question 8

Which of the following symbol labels is NOT correct?

A. σ   Standard Deviation of the Population

B. s   Standard Deviation of the Sample

C. n   Sample Size

D. μ   Mean of the Sample

Question 9

If the mean of a sample is 50 and the standard deviation of the sample is 15, what Z score corresponds to a raw score of 35?

a. +1.5

b. -1.5

c. +1.0

d. -1.0

Question 10

Use the interactive website to determine what percent of the cases in the variable distribution described in Question 9 (Mean = 50 SD = 15) fall below the raw score of 35.

a. .1586%

b. 15.86%

c. .7344%

d. 73.44%

Question 11

Given the variable distribution described in Question 9 (Mean = 50 SD = 15), what z score corresponds to a raw score of 65?

a. +1.5

b. +.5

c. +1.0

d. -1.0
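
Questions 9 through 11 all rest on the z-score formula z = (raw score − mean) / SD and on the standard normal CDF. A short sketch, assuming the distribution is normal as the interactive tool in Question 10 implies (variable names are illustrative):

```python
from statistics import NormalDist

mean, sd = 50, 15

z_35 = (35 - mean) / sd   # Question 9
z_65 = (65 - mean) / sd   # Question 11

# Question 10: percent of cases below a raw score of 35 under a normal model
pct_below_35 = NormalDist().cdf(z_35) * 100

print(f"z for 35: {z_35:+.1f}")
print(f"z for 65: {z_65:+.1f}")
print(f"Percent below 35: {pct_below_35:.2f}%")
```

The percentage may differ from the listed answer choice in the second decimal place, depending on how the tool rounds.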

Explain why researchers typically focus on statistical independence rather than statistical dependence.

A paper described a survey of 501 undergraduate students at a state university in the southwestern...

A paper described a survey of 501 undergraduate students at a state university in the southwestern region of the United States. Each student in the sample was classified according to class standing (freshman, sophomore, junior, or senior) and body art category (body piercings only, tattoos only, both tattoos and body piercings, no body art).

Use the data in the accompanying table to determine if there is evidence of an association between class standing and response to the body art question. Assume that it is reasonable to regard the sample of students as representative of the students at this university. Use α = 0.01.

            Body Piercings   Tattoos   Both Body Piercing   No Body
            Only             Only      and Tattoos          Art
Freshman    62               7         15                   86
Sophomore   44               11        10                   65
Junior      20               9         7                    46
Senior      21               17        24                   57

Calculate the test statistic. (Round your answer to two decimal places.)

χ² =

What is the P-value for the test? (Round your answer to three decimal places.)

P-value =
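
The test statistic is the usual chi-square sum over all cells of (observed − expected)² / expected, where each expected count is (row total × column total) / grand total. The sketch below computes the statistic and degrees of freedom with the standard library only. The P-value is the upper-tail area of a chi-square distribution with those degrees of freedom, which needs a table or a statistics package and is not computed here.

```python
# Observed counts; columns: piercings only, tattoos only, both, no body art
observed = {
    "Freshman":  [62, 7, 15, 86],
    "Sophomore": [44, 11, 10, 65],
    "Junior":    [20, 9, 7, 46],
    "Senior":    [21, 17, 24, 57],
}

rows = list(observed.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(rows):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

dof = (len(rows) - 1) * (len(col_totals) - 1)

print(f"chi-square = {chi2:.2f} with {dof} degrees of freedom")
```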

Given the following hypotheses: H0 : μ ≤ 10 H1 : μ > 10 For a...

Given the following hypotheses: H0 : μ ≤ 10, H1 : μ > 10. For a random sample of 10 observations, the sample mean was 11 and the sample standard deviation 4.20. Using the .10 significance level:

(a) State the decision rule. (Round your answer to 3 decimal places.) _____ H0 if t >

(b) Compute the value of the test statistic. (Round your answer to 2 decimal places.) Value of the test statistic ____

(c) What is your decision regarding the null hypothesis? ______ H0. The mean ___ greater than 10.
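
The three parts can be sketched as a one-sample, right-tailed t test with t = (x̄ − μ₀) / (s/√n). The critical value 1.383 below is an assumed table lookup for 9 degrees of freedom with 0.10 in the upper tail, hard-coded because the standard library has no t distribution.

```python
from math import sqrt

n, xbar, s, mu0 = 10, 11.0, 4.20, 10.0
t_crit = 1.383  # assumed t-table value: upper 0.10 point, 9 degrees of freedom

# (b) test statistic
t_stat = (xbar - mu0) / (s / sqrt(n))

# (a) decision rule: reject H0 if t > t_crit; (c) apply it
reject_h0 = t_stat > t_crit

print(f"Reject H0 if t > {t_crit}")
print(f"t = {t_stat:.2f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```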

Forty new automobiles were tested for fuel efficiency by the Environmental Protection Agency (in miles per...

Forty new automobiles were tested for fuel efficiency by the Environmental Protection Agency (in miles per gallon). The individual values and frequency distribution are displayed below.

24 19 22 29 17 31 27 33 27 18

32 24 34 23 31 23 24 8 34 34

23 32 17 22 23 18 31 23 23 30

19 26 31 19 37 23 16 26 30 31

CLASS    FREQUENCY
8-12      1
13-17     3
18-22     8
23-27    14
28-32     9
33-37     5

Find the median, mode, and midrange of the data set.
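
All three measures can be computed directly from the 40 individual values as listed above: the median is the middle of the sorted data, the mode is the most frequent value, and the midrange is the average of the minimum and maximum. A minimal sketch (variable names are illustrative):

```python
from statistics import median, multimode

# The 40 miles-per-gallon values, copied row by row from the listing above
mpg = [
    24, 19, 22, 29, 17, 31, 27, 33, 27, 18,
    32, 24, 34, 23, 31, 23, 24, 8, 34, 34,
    23, 32, 17, 22, 23, 18, 31, 23, 23, 30,
    19, 26, 31, 19, 37, 23, 16, 26, 30, 31,
]

data_median = median(mpg)                 # average of the 20th and 21st sorted values
data_modes = multimode(mpg)               # list of the most frequent value(s)
midrange = (min(mpg) + max(mpg)) / 2      # halfway between the extremes

print(f"Median:   {data_median}")
print(f"Mode(s):  {data_modes}")
print(f"Midrange: {midrange}")
```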
