Question

An administrator wanted to study the utilization of long-distance telephone service by a department. One variable of interest (let’s call it X) is the length, in minutes, of long-distance calls made during one month. There were 38 calls that resulted in a connection. The lengths of the calls, already ordered from smallest to largest, are presented in the following table.

1.6   1.7   1.8   1.8   1.9   2.1   2.5   3.0   3.0   4.4
4.5   4.5   5.9   7.1   7.4   7.5   7.7   8.6   9.3   9.5
12.7  15.3  15.5  15.9  15.9  16.1  16.5  17.3  17.5  19.0
19.4  22.5  23.5  24.0  31.7  32.8  43.5  53.3

Which one of the following statements is not true?

  1. The 75th percentile (Q3) is 17.5 minutes.

  2. The 50th percentile (Q2) is 9.4 minutes.

  3. The 25th percentile (Q1) is 4.4 minutes.

  4. Q3 - Q2 > Q2 - Q1

  5. Average X > Median X.

  6. X distribution is positively skewed.

  7. The percentile rank of 5.9 minutes is 13.

  8. Range of X is 51.7 minutes.

  9. IQR (Inter-Quartile Range) is 13.1 minutes.

  10. There are 2 outliers in X distribution.

Q4: (This continues Q3: 2 marks) Which one of the following cannot be used to describe the distribution of X?

  1. A Histogram.

  2. A Stemplot.

  3. Skewness and Kurtosis.

  4. Mean and SD (Standard Deviation).

  5. The 5-number Summary.

  6. The coefficient of determination.

  7. The coefficient of relative variation (CRV).

  8. The 1.5 IQR Rule.

  9. The Deciles.

  10. A Boxplot.

Solutions

Expert Solution

Which one of the following statements is not true?

We now compute each of the required quantities to determine which statement is not true.

i)

The 75th percentile (Q3) is 17.5 minutes.

The sample size is n = 38 (even). We first find the median, then take Q3 as the median of the upper half of the data.

Median = ( (n/2)th observation + (n/2 + 1)th observation ) / 2

       = ( (38/2)th observation + (38/2 + 1)th observation ) / 2

       = ( 19th observation + 20th observation ) / 2

From the given data, the 19th observation is 9.3 and the 20th observation is 9.5.

Thus Median = ( 9.3 + 9.5 ) / 2 = 18.8 / 2 = 9.4

The 75th percentile is the median of the observations greater than the median value 9.4, which are

9.5 12.7 15.3 15.5 15.9 15.9 16.1 16.5 17.3 17.5 19.0 19.4 22.5 23.5 24.0 31.7 32.8 43.5 53.3

The number of observations greater than 9.4 is n3 = 19 (odd), so their median is the

( n3 + 1 )/2 th observation = ( 19 + 1 )/2 th observation = 10th observation

of this upper half. The 10th observation in the list above is 17.5.

Thus our 75th percentile is 17.5.

Hence

The 75th percentile (Q3) is 17.5 minutes.   - TRUE
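The median-of-halves steps above can be sketched in Python. This is a minimal illustration under that one convention: the helper name `quartiles_by_halves` is made up for this example, and software that interpolates between observations may report slightly different quartiles.

```python
from statistics import median

# Call lengths in minutes, already sorted (n = 38), copied from the question.
calls = [
    1.6, 1.7, 1.8, 1.8, 1.9, 2.1, 2.5, 3.0, 3.0, 4.4,
    4.5, 4.5, 5.9, 7.1, 7.4, 7.5, 7.7, 8.6, 9.3, 9.5,
    12.7, 15.3, 15.5, 15.9, 15.9, 16.1, 16.5, 17.3, 17.5, 19.0,
    19.4, 22.5, 23.5, 24.0, 31.7, 32.8, 43.5, 53.3,
]


def quartiles_by_halves(sorted_values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Hypothetical helper for illustration. For an odd sample size the middle
    observation is left out of both halves (an assumption; some textbooks
    include it in both).
    """
    n = len(sorted_values)
    half = n // 2
    lower = sorted_values[:half]
    upper = sorted_values[half:] if n % 2 == 0 else sorted_values[half + 1:]
    return median(lower), median(sorted_values), median(upper)


q1, q2, q3 = quartiles_by_halves(calls)
print(f"Q1 = {q1}, Q2 = {q2:.1f}, Q3 = {q3}")
```

With n = 38 each half holds 19 observations, so Q1 and Q3 are actual data values (the 10th observation of each half) and only Q2 is an average of two observations.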

ii)

The 50th percentile (Q2) is 9.4 minutes.

The 50th percentile is the median. With n = 38 (even),

Median = ( (n/2)th observation + (n/2 + 1)th observation ) / 2

       = ( (38/2)th observation + (38/2 + 1)th observation ) / 2

       = ( 19th observation + 20th observation ) / 2

From the given data, the 19th observation is 9.3 and the 20th observation is 9.5.

Thus Median = ( 9.3 + 9.5 ) / 2 = 18.8 / 2 = 9.4

Thus The 50th percentile (Q2) is 9.4 minutes. - TRUE

iii)

The 25th percentile (Q1) is 4.4 minutes.

The 25th percentile is the median of the observations less than the median value 9.4, which are

1.6 1.7 1.8 1.8 1.9 2.1 2.5 3.0 3.0 4.4 4.5 4.5 5.9 7.1 7.4 7.5 7.7 8.6 9.3

The number of observations less than 9.4 is n1 = 19 (odd), so their median is the

( n1 + 1 )/2 th observation = ( 19 + 1 )/2 th observation = 10th observation

The 10th observation is 4.4.

Thus our 25th percentile is 4.4.

Thus The 25th percentile (Q1) is 4.4 minutes. - TRUE

iv)

Q3 - Q2 > Q2 - Q1

Now

Q3 - Q2 = 17.5 - 9.4 = 8.1

Q2 - Q1 = 9.4 - 4.4 = 5.0

Since 8.1 > 5.0, Q3 - Q2 > Q2 - Q1. - TRUE

v)

Average X > Median X.

The mean of X is the sum of all observations divided by n:

Mean = [ 1.6 + 1.7 + 1.8 + 1.8 + 1.9 + 2.1 + ... + 22.5 + 23.5 + 24.0 + 31.7 + 32.8 + 43.5 + 53.3 ] / 38

     = 508.2 / 38 = 13.37368

Thus Mean = 13.37368

And Median = 9.4

Hence Average X > Median X. - TRUE
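The comparison of mean and median can be checked with Python's standard `statistics` module. This is only a quick sketch; the variable names `avg` and `med` are chosen for this example.

```python
from statistics import mean, median

# Call lengths in minutes, already sorted (n = 38), copied from the question.
calls = [
    1.6, 1.7, 1.8, 1.8, 1.9, 2.1, 2.5, 3.0, 3.0, 4.4,
    4.5, 4.5, 5.9, 7.1, 7.4, 7.5, 7.7, 8.6, 9.3, 9.5,
    12.7, 15.3, 15.5, 15.9, 15.9, 16.1, 16.5, 17.3, 17.5, 19.0,
    19.4, 22.5, 23.5, 24.0, 31.7, 32.8, 43.5, 53.3,
]

avg = mean(calls)    # sum of the observations divided by n
med = median(calls)  # average of the 19th and 20th observations

print(f"mean = {avg:.5f}, median = {med:.1f}, mean > median: {avg > med}")
```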

vi)

X distribution is positively skewed.

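As a rule of thumb, when the mean is greater than the median the distribution is positively (right) skewed.

Here Mean = 13.37368 and Median = 9.4, so Mean > Median. The long right tail of the data (for example 43.5 and 53.3) points the same way.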

Hence X distribution is positively skewed. - TRUE
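The mean-versus-median comparison is only a rule of thumb, so it may help to compute a skewness coefficient directly. The sketch below uses the moment coefficient g1 = m3 / m2^(3/2), which is one of several skewness definitions; the function name `moment_skewness` is made up for this example.

```python
from statistics import mean

# Call lengths in minutes, already sorted (n = 38), copied from the question.
calls = [
    1.6, 1.7, 1.8, 1.8, 1.9, 2.1, 2.5, 3.0, 3.0, 4.4,
    4.5, 4.5, 5.9, 7.1, 7.4, 7.5, 7.7, 8.6, 9.3, 9.5,
    12.7, 15.3, 15.5, 15.9, 15.9, 16.1, 16.5, 17.3, 17.5, 19.0,
    19.4, 22.5, 23.5, 24.0, 31.7, 32.8, 43.5, 53.3,
]


def moment_skewness(values):
    """Moment coefficient of skewness g1 = m3 / m2**1.5 (hypothetical helper)."""
    n = len(values)
    centre = mean(values)
    m2 = sum((x - centre) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - centre) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


g1 = moment_skewness(calls)
print(f"g1 = {g1:.3f} ({'positive' if g1 > 0 else 'non-positive'} skew)")
```

A positive g1 agrees with the conclusion that X is positively skewed.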

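vii)

The percentile rank of 5.9 minutes is 13.

No. 13 is the ordinal rank (position) of 5.9 in the ordered data, not its percentile rank:

X      1.6   1.7   1.8   1.8   1.9   2.1   2.5   3.0   3.0   4.4   4.5   4.5   5.9

rank   1     2     3     4     5     6     7     8     9     10    11    12    13

The percentile rank is the percentage of observations below (or, by some definitions, at or below) the given value. Here 12 of the 38 observations are below 5.9, so the percentile rank is 12/38 * 100 = 31.6 (or 13/38 * 100 = 34.2 if 5.9 itself is counted), that is, roughly 32 to 34 rather than 13.

The percentile rank of 5.9 minutes is 13. - NOT TRUE

Since exactly one statement is not true, this is the answer to the question.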
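For reference, a percentile rank is usually defined as a percentage of the sample, which is different from the position of a value in the ordered list. The sketch below computes both for 5.9 under three common conventions; which convention a given course uses is an assumption to check, and the variable names are chosen for this example.

```python
# Call lengths in minutes, already sorted (n = 38), copied from the question.
calls = [
    1.6, 1.7, 1.8, 1.8, 1.9, 2.1, 2.5, 3.0, 3.0, 4.4,
    4.5, 4.5, 5.9, 7.1, 7.4, 7.5, 7.7, 8.6, 9.3, 9.5,
    12.7, 15.3, 15.5, 15.9, 15.9, 16.1, 16.5, 17.3, 17.5, 19.0,
    19.4, 22.5, 23.5, 24.0, 31.7, 32.8, 43.5, 53.3,
]

value = 5.9
n = len(calls)
below = sum(1 for x in calls if x < value)   # observations strictly below 5.9
equal = sum(1 for x in calls if x == value)  # observations equal to 5.9

position = below + 1  # 1-based ordinal rank in the sorted data

pr_below = 100 * below / n                    # "percent below" convention
pr_at_or_below = 100 * (below + equal) / n    # "percent at or below" convention
pr_midpoint = 100 * (below + 0.5 * equal) / n # midpoint convention

print(f"ordinal rank = {position}")
print(f"percentile rank: {pr_below:.1f} / {pr_midpoint:.1f} / {pr_at_or_below:.1f}")
```

Under all three conventions the percentile rank falls between about 31 and 35, well away from the ordinal rank.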

viii)

Range of X is 51.7 minutes.

Range = max - min = 53.3 - 1.6 = 51.7

Range of X is 51.7 minutes. - TRUE

ix)

IQR (Inter-Quartile Range) is 13.1 minutes.

IQR = Q3 - Q1 = 17.5 - 4.4 = 13.1

Hence IQR (Inter-Quartile Range) is 13.1 minutes. - TRUE

x)

There are 2 outliers in X distribution.

By the 1.5 IQR rule, an outlier is any data point more than 1.5 interquartile ranges (IQRs) below the first quartile or above the third quartile.

Lower fence = Q1 - 1.5 * IQR = 4.4 - 1.5 * 13.1 = 4.4 - 19.65 = -15.25

Upper fence = Q3 + 1.5 * IQR = 17.5 + 1.5 * 13.1 = 17.5 + 19.65 = 37.15

Hence observations that are not outliers lie in the interval ( -15.25 , 37.15 ).

The 37th and 38th observations, 43.5 and 53.3, fall outside this interval, and no observation falls below the lower fence.

Hence 43.5 and 53.3 are outliers.

So we have 2 outlying observations.

There are 2 outliers in X distribution. - TRUE
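The fence calculation can be sketched as follows. The quartiles are taken as medians of the lower and upper halves, as in the solution; the names `lower_fence` and `upper_fence` are chosen for this example.

```python
from statistics import median

# Call lengths in minutes, already sorted (n = 38), copied from the question.
calls = [
    1.6, 1.7, 1.8, 1.8, 1.9, 2.1, 2.5, 3.0, 3.0, 4.4,
    4.5, 4.5, 5.9, 7.1, 7.4, 7.5, 7.7, 8.6, 9.3, 9.5,
    12.7, 15.3, 15.5, 15.9, 15.9, 16.1, 16.5, 17.3, 17.5, 19.0,
    19.4, 22.5, 23.5, 24.0, 31.7, 32.8, 43.5, 53.3,
]

half = len(calls) // 2     # n is even, so the data split into two halves of 19
q1 = median(calls[:half])  # median of the lower half
q3 = median(calls[half:])  # median of the upper half
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr  # note the minus sign below Q1
upper_fence = q3 + 1.5 * iqr  # and the plus sign above Q3

outliers = [x for x in calls if x < lower_fence or x > upper_fence]
print(f"fences = ({lower_fence:.2f}, {upper_fence:.2f}), outliers = {outliers}")
```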

Q4: (This continues Q3: 2 marks) Which one of the following cannot be used to describe the distribution of X?

i)

A Histogram. - A histogram displays the shape and spread of continuous sample data.

Hence a histogram can be used to describe the distribution of X
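As a rough illustration, a text histogram of X can be built by counting observations per bin. The bin width of 10 minutes is an assumption made for this sketch; other widths would be equally valid.

```python
# Call lengths in minutes, already sorted (n = 38), copied from the question.
calls = [
    1.6, 1.7, 1.8, 1.8, 1.9, 2.1, 2.5, 3.0, 3.0, 4.4,
    4.5, 4.5, 5.9, 7.1, 7.4, 7.5, 7.7, 8.6, 9.3, 9.5,
    12.7, 15.3, 15.5, 15.9, 15.9, 16.1, 16.5, 17.3, 17.5, 19.0,
    19.4, 22.5, 23.5, 24.0, 31.7, 32.8, 43.5, 53.3,
]

bin_width = 10  # assumed bin width in minutes
n_bins = int(max(calls) // bin_width) + 1
counts = [0] * n_bins
for x in calls:
    counts[int(x // bin_width)] += 1  # bin i covers [i * width, (i + 1) * width)

for i, count in enumerate(counts):
    low = i * bin_width
    print(f"{low:>2} to {low + bin_width:<2} | {'*' * count} ({count})")
```

The bars shrink steadily from left to right, which matches the positive skew found in Q3.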

ii)

A Stemplot. -

You could make a frequency distribution table or a histogram for the values, or you can use a stem-and-leaf plot and let the numbers themselves show much the same information.

Hence a stemplot can be used to describe the distribution of X

iii)

Skewness and Kurtosis. -

Skewness is a measure of symmetry, or more precisely, the lack of symmetry

Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution

Hence Skewness and Kurtosis can be used to describe the distribution of X

iv)

Mean and SD (Standard Deviation). -

The mean can be used to get an overall idea or picture of the data set.

Standard deviation measures the spread of a data distribution .

Hence Mean and SD (Standard Deviation) can be used to describe the distribution of X

v)

The 5-number Summary.

The five-number summary consists of five values: the most extreme values in the data set (the minimum and maximum), the lower and upper quartiles, and the median. This makes the five-number summary a useful description of both centre and spread.

Hence 5-number Summary can be used to describe the distribution of X
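A minimal sketch of the five-number summary for X, using the same median-of-halves quartiles as in Q3 (the tuple name `five_number_summary` is chosen for this example):

```python
from statistics import median

# Call lengths in minutes, already sorted (n = 38), copied from the question.
calls = [
    1.6, 1.7, 1.8, 1.8, 1.9, 2.1, 2.5, 3.0, 3.0, 4.4,
    4.5, 4.5, 5.9, 7.1, 7.4, 7.5, 7.7, 8.6, 9.3, 9.5,
    12.7, 15.3, 15.5, 15.9, 15.9, 16.1, 16.5, 17.3, 17.5, 19.0,
    19.4, 22.5, 23.5, 24.0, 31.7, 32.8, 43.5, 53.3,
]

half = len(calls) // 2  # n is even, so each half has 19 observations

five_number_summary = (
    min(calls),            # minimum
    median(calls[:half]),  # Q1: median of the lower half
    median(calls),         # Q2: median
    median(calls[half:]),  # Q3: median of the upper half
    max(calls),            # maximum
)

labels = ("min", "Q1", "median", "Q3", "max")
for label, value in zip(labels, five_number_summary):
    print(f"{label:>6}: {value:.1f}")
```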

vi)

The coefficient of determination.

The coefficient of determination (R squared) measures how much of the variability in one variable is explained by its relationship to another variable.

It is sometimes referred to as a measure of "goodness of fit".

It is used in statistical analysis to assess how well a model explains the observed outcomes.

So the coefficient of determination applies to a regression model, which needs at least one other variable in addition to X. Here there is only the single variable X, hence it cannot describe the distribution of X. This is the answer to Q4.

vii)

The coefficient of relative variation (CRV).

The coefficient of relative variation (relative standard deviation) is a statistical measure of the dispersion of data points around the mean

Hence coefficient of relative variation (CRV) can be used to describe the distribution of X
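The coefficient of relative variation is the standard deviation divided by the mean. The sketch below uses the sample standard deviation (divisor n - 1), which is an assumption; some texts use the population form and express the result as a percentage.

```python
from statistics import mean, stdev

# Call lengths in minutes, already sorted (n = 38), copied from the question.
calls = [
    1.6, 1.7, 1.8, 1.8, 1.9, 2.1, 2.5, 3.0, 3.0, 4.4,
    4.5, 4.5, 5.9, 7.1, 7.4, 7.5, 7.7, 8.6, 9.3, 9.5,
    12.7, 15.3, 15.5, 15.9, 15.9, 16.1, 16.5, 17.3, 17.5, 19.0,
    19.4, 22.5, 23.5, 24.0, 31.7, 32.8, 43.5, 53.3,
]

sd = stdev(calls)        # sample standard deviation (divisor n - 1)
crv = sd / mean(calls)   # relative variation as a proportion of the mean

print(f"SD = {sd:.2f}, CRV = {crv:.2f} ({100 * crv:.0f}% of the mean)")
```

A CRV close to 1 says the spread of call lengths is almost as large as the typical call length itself.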

viii)

The 1.5 IQR Rule.

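The 1.5 IQR rule flags as outliers any observations more than 1.5 IQRs below Q1 or above Q3. It is based on the IQR, which is often seen as a better measure of spread than the range because it is not affected by outliers.

Hence the 1.5 IQR rule can be used to describe the distribution of X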

ix)

The Deciles

A decile is a quantitative method of splitting up a set of ranked data into 10 equally large subsections

Deciles are similar to quartiles. But while quartiles sort data into four quarters, deciles sort data into ten equal parts

So Deciles can be used to describe the distribution of X
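Deciles can be obtained with `statistics.quantiles` from the Python standard library (Python 3.8 or later). Its default interpolation method is not necessarily the convention used in this course, so the cut points below are a sketch and may differ slightly from hand calculations.

```python
from statistics import quantiles

# Call lengths in minutes, already sorted (n = 38), copied from the question.
calls = [
    1.6, 1.7, 1.8, 1.8, 1.9, 2.1, 2.5, 3.0, 3.0, 4.4,
    4.5, 4.5, 5.9, 7.1, 7.4, 7.5, 7.7, 8.6, 9.3, 9.5,
    12.7, 15.3, 15.5, 15.9, 15.9, 16.1, 16.5, 17.3, 17.5, 19.0,
    19.4, 22.5, 23.5, 24.0, 31.7, 32.8, 43.5, 53.3,
]

# n=10 asks for the 9 cut points that split the data into 10 equal groups.
deciles = quantiles(calls, n=10)

for k, cut in enumerate(deciles, start=1):
    print(f"D{k} = {cut:.2f}")
```

The fifth decile coincides with the median, and the widening gaps between the upper deciles again reflect the long right tail.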

x)

A boxplot

A boxplot is a standardized way of displaying the distribution of data based on a five number summary .

So from a boxplot we can see outliers and whether the data are symmetric or skewed.

Hence boxplot can be used to describe the distribution of X

