Question

Q2d13A

After successfully “pitching” his project to management, Angus MacDonald has started to collect the data needed to build his model. Angus's first order of business is to collect data on shipping times for steel shipments between Nanticoke, ON and Halifax, NS. To estimate this time, Angus has obtained transit times from two shipping firms (A & B) that currently ship oil between refineries in Nanticoke and Dartmouth, NS. (Dartmouth is located across the harbor from Halifax.) Each firm reported its shipping time in days for its last 10 deliveries:

A B

10 8
8 9
10 7
2 18
11 8
8 12
6 4
11 5
11 6
9 11

a. Assuming transit times between the two ports to be normally distributed, determine if the 4th data entry for Firm A (2 days) can be considered to be an anomaly.

b. Considering only shipping Firm B, how many data samples will Angus need to collect to ensure that he can obtain an estimate for transit time that is within ±1 day 19 times out of 20?

c. Based on the results from this pilot study, which of the two firms should be selected to deliver steel to Halifax? State any assumptions that you make and justify your results.

d. Assume Angus collects 100 data points from Firm B only and obtains the following histogram:

Shipping Time   Count
0 – 4.999       10
5 – 9.999       40
10 – 14.999     30
15 – 19.999     8
20+             12

Angus hypothesizes that the data come from a Normal (8, 4) distribution. Apply an appropriate statistical test to determine if Angus's assumption is correct.

e. Comment on Angus's assumption that the transit time data are normally distributed. Strictly considering what the data represent (transit times), why might we be surprised if the data are normally distributed? Again, based only on what the data represent, what other input distributions should be considered for this data? Finally, if the data were shown not to be normally distributed, why might a t-test be inappropriate for comparing the transit times for Firm A and Firm B? If a t-test is not appropriate, what statistical test (or tests) could or should we apply? Please note, there are no calculations required to complete this question.

NOTE: This is an Industrial Engineering question in statistics...

Solutions

Expert Solution

Variables A and B are the delivery times (in days) reported by the two firms for shipping oil between refineries in Nanticoke and Dartmouth.

The data are continuous and, as the question states, we assume they are normally distributed.

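a) For Firm A, the 4th data entry (2 days) looks like an anomaly, i.e. an outlier or extreme value.

Mean = mean(A)

= 8.6 ~ 9

The sample standard deviation is about 2.84 days, so the entry lies (2 - 8.6) / 2.84 ≈ -2.33 standard deviations from the mean. Under the normality assumption only about 1% of observations fall that far below the mean, so the entry can reasonably be treated as an anomaly.

One simple way to handle an outlier in a continuous, normally distributed variable is to replace it with the mean of the data, so the 4th entry for Firm A would become 9.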
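As a rough check of part (a), the standardized distance of the 4th entry can be computed directly and compared with a Grubbs' critical value. This is only a sketch: the choice of Grubbs' test and the 5% significance level are assumptions, not something the original solution specifies.

```r
# Shipping times (days) for Firm A, copied from the question.
A <- c(10, 8, 10, 2, 11, 8, 6, 11, 11, 9)
n <- length(A)

# Standardized distance of the 4th entry from the sample mean.
z <- (A[4] - mean(A)) / sd(A)

# Grubbs' statistic: largest absolute standardized deviation in the sample.
G <- max(abs(A - mean(A))) / sd(A)

# Two-sided Grubbs' critical value (alpha = 0.05 is an assumed choice).
alpha  <- 0.05
t_crit <- qt(1 - alpha / (2 * n), df = n - 2)
G_crit <- (n - 1) / sqrt(n) * sqrt(t_crit^2 / (n - 2 + t_crit^2))

cat("z =", round(z, 3), " G =", round(G, 3), " G_crit =", round(G_crit, 3), "\n")
```

If G exceeds G_crit, the most extreme value is flagged as an outlier at the chosen level.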

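b) The pilot study used 10 samples from Firm B, which give a sample standard deviation of s ≈ 4.077 days. For a margin of error of E = 1 day at 95% confidence (19 times out of 20), the required sample size is n = (z * s / E)^2 = (1.96 * 4.077 / 1)^2 ≈ 63.9, so about 64 samples are needed.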
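The sample-size calculation for part (b) can be sketched in R. The normal-quantile formula n = (z * s / E)^2 with the pilot standard deviation plugged in is an assumed approach; the t-based variant is shown only as a more conservative alternative for a pilot of 10 observations.

```r
# Shipping times (days) for Firm B, copied from the question.
B <- c(8, 9, 7, 18, 8, 12, 4, 5, 6, 11)

E    <- 1      # desired margin of error, in days
conf <- 0.95   # "19 times out of 20"
s    <- sd(B)  # pilot estimate of the standard deviation

# Normal-based sample size, rounded up to a whole observation.
n_z <- ceiling((qnorm(1 - (1 - conf) / 2) * s / E)^2)

# More conservative variant using the t quantile with the pilot's 9 df.
n_t <- ceiling((qt(1 - (1 - conf) / 2, df = length(B) - 1) * s / E)^2)

cat("normal-based n =", n_z, " t-based n =", n_t, "\n")
```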

c) To decide which firm is better for transporting the steel, we use a t-test.

The data are continuous and we are comparing the average performance of two firms, so a two-sample t-test is appropriate under the normality assumption.

Test of Hypothesis:

H0: the average shipping time of firm A is less than or equal to that of firm B

against,

H1: the average shipping time of firm A is greater than that of firm B

t-test by using R:

> t.test(A,B,alternative = "greater",var.equal = FALSE,conf.level = 0.95)

data: A and B

t = -0.12734, df = 16.058, p-value = 0.5499

alternative hypothesis: true difference in means is greater than 0

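# Decision rule: if the p-value is greater than the 0.05 level of significance, we fail to reject the null hypothesis.
# From the results above, p-value (0.5499) > 0.05,
# so we fail to reject the null hypothesis.

i.e. there is no evidence that the average shipping time of firm A is greater than that of firm B.

The sample means are nearly equal (8.6 days for A versus 8.8 days for B), so the test alone does not separate the firms. Firm A is preferred only marginally, on the basis of its slightly lower sample mean and its smaller spread (sample standard deviation of about 2.8 days versus about 4.1 days for B).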
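For reference, the Welch statistic reported by t.test can be reproduced by hand from the two samples. The sketch below assumes the data vectors are entered exactly as in the question; the variable names are illustrative.

```r
# Shipping times (days), copied from the question.
A <- c(10, 8, 10, 2, 11, 8, 6, 11, 11, 9)
B <- c(8, 9, 7, 18, 8, 12, 4, 5, 6, 11)

# Per-sample variance of the mean.
vA <- var(A) / length(A)
vB <- var(B) / length(B)

# Welch t statistic and Welch-Satterthwaite degrees of freedom.
t_stat <- (mean(A) - mean(B)) / sqrt(vA + vB)
df     <- (vA + vB)^2 / (vA^2 / (length(A) - 1) + vB^2 / (length(B) - 1))

# One-sided p-value for H1: mean(A) > mean(B).
p_value <- pt(t_stat, df = df, lower.tail = FALSE)

cat("t =", round(t_stat, 5), " df =", round(df, 3), " p =", round(p_value, 4), "\n")
```

The same numbers should come out of t.test(A, B, alternative = "greater", var.equal = FALSE).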

d) the frequency distribution table for this sample is,

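Class (Xi)     fi     mi        fi*mi
0 – 4.999      10     2.4995    24.995
5 – 9.999      40     7.4995    299.98
10 – 14.999    30     12.4995   374.985
15 – 19.999    8      17.4995   139.996
20+            12     20        240
Total          100              1079.956

Here fi is the class frequency and mi is the class midpoint; the open-ended 20+ class is represented by its lower bound, 20.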

The mean of data is,

Mean = sum(fi * mi) / sum(fi)

= 1079.956 / 100

= 10.79956
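The grouped-mean arithmetic above can be checked with a few lines of R. The midpoints are the ones used in the table, including the assumed value of 20 for the open-ended last class.

```r
# Class frequencies and midpoints from the frequency table.
f <- c(10, 40, 30, 8, 12)
m <- c(2.4995, 7.4995, 12.4995, 17.4995, 20)  # 20 stands in for the 20+ class

# Grouped mean: frequency-weighted average of the midpoints.
grouped_mean <- sum(f * m) / sum(f)
cat("grouped mean =", grouped_mean, "\n")
```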

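The hypothesized distribution, however, is N(8, 4),

i.e. its mean of 8 does not match the grouped sample mean of about 10.8.

This suggests the sample is not from N(8, 4), although comparing means is only an informal check; a goodness-of-fit test such as the chi-square test on the histogram counts is the appropriate way to confirm it.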
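A chi-square goodness-of-fit test on the histogram counts could be sketched as follows. Several choices here are assumptions rather than part of the original solution: the second parameter of N(8, 4) is read as a standard deviation of 4, the first class is extended to everything below 5, and the last two classes are merged because their expected counts are very small.

```r
# Observed counts, with the 15-19.999 and 20+ classes merged.
observed <- c(10, 40, 30, 8 + 12)

# Class boundaries on the real line (first class taken as everything below 5).
breaks <- c(-Inf, 5, 10, 15, Inf)

# Expected counts under N(mean = 8, sd = 4); sd = 4 is an assumed reading.
probs    <- diff(pnorm(breaks, mean = 8, sd = 4))
expected <- sum(observed) * probs

# Pearson chi-square statistic; no parameters were estimated from the data,
# so the degrees of freedom are (number of classes - 1).
chi_sq  <- sum((observed - expected)^2 / expected)
df      <- length(observed) - 1
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

cat("chi-square =", round(chi_sq, 2), " df =", df, " p-value =", signif(p_value, 3), "\n")
```

A statistic above qchisq(0.95, df) rejects the hypothesized distribution at the 5% level. Reading the 4 as a variance (sd = 2) only changes the sd argument.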

e) To check whether each variable follows a normal distribution, use the Shapiro-Wilk test:

H0: The sample distribution follows the normal distribution.

against,

H1: The sample distribution does not follow the normal distribution.

Shapiro Test using R:

> shapiro.test(A)

Shapiro-Wilk normality test

data: A

W = 0.90161, p-value = 0.2282

> shapiro.test(B)

Shapiro-Wilk normality test

data: B

W = 0.90757, p-value = 0.2647

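# Decision rule: if the p-value is greater than the 0.05 level of significance, we fail to reject the null hypothesis.

Both p-values > 0.05

i.e. we fail to reject the null hypothesis.

By this decision rule, there is no evidence against normality for either variable. With only 10 observations per firm the test has little power, so this does not prove that the transit times are normally distributed.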

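To compare the average of two variables, use a t-test when the normality assumption holds. If the data are not normally distributed, a t-test on samples this small may be unreliable, and a nonparametric alternative such as the Wilcoxon rank-sum (Mann-Whitney) test can be used instead.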
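If normality is in doubt, a rank-based comparison of the two firms can be sketched with R's wilcox.test. Using this test is an assumption about what a suitable alternative would be; exact = FALSE is set because the data contain ties, so the normal approximation is used.

```r
# Shipping times (days), copied from the question.
A <- c(10, 8, 10, 2, 11, 8, 6, 11, 11, 9)
B <- c(8, 9, 7, 18, 8, 12, 4, 5, 6, 11)

# Two-sided Wilcoxon rank-sum (Mann-Whitney) test.
res <- wilcox.test(A, B, alternative = "two.sided", exact = FALSE)
print(res)
```

A p-value above 0.05 would again mean that the pilot data do not show a significant difference between the two firms.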

