Question



Many statistical procedures require that we draw a sample from a population whose distribution is approximately normal. Often we don’t know whether the population is approximately normal when we draw the sample, so the only way we can assess whether the population is approximately normal is to examine the sample. Assessing normality is more important for small samples. Below, you’ll see some small samples and you’ll be asked to assess whether the populations they are drawn from can be treated as approximately normal.

DATA-

  1. The following data set is given. Determine whether it is reasonable to treat the following sample as though it comes from an approximately normal population. Include any charts or graphs you make in Excel here and justify your answer.

    2.6, 4.2, 1.5, 2.0, 0.6, 0.7, 6.6, 2.2, 9.7, 1.8, 4.2, 4.4, 0.6, 0.2

  2. The following normal quantile plot illustrates a sample. Determine whether it is reasonable to treat this sample as though it comes from an approximately normal population. Explain your answer.

  3. The following histogram illustrates a sample. Determine whether it is reasonable to treat this sample as though it comes from an approximately normal population. Explain your answer.

  4. The following data set is given. Determine whether it is reasonable to treat the following sample as though it comes from an approximately normal population. Include any charts or graphs you make in Excel here and justify your answer.

    8.8, 11.2, 11.6, 6.3, 9.3, 10.5, 14.6, 8.5, 7.3, 7.5, 5.2, 9.0, 4.3, 9.9, 7.8, 13.1, 12.3, 10.1

Solutions

Expert Solution

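A Q-Q plot (normal quantile plot) is a convenient way to check in Excel whether data are approximately normally distributed.

Sort the data in ascending order. With n the number of values and x_i the value at rank i (i = 1, ..., n), calculate the following 3 values for each rank:

a) Probability for rank i = (i - 0.5)/n

b) Z-Score for rank i = NORMSINV(probability for rank i)

c) Standardized value of x_i = (x_i - sample mean of all values)/sample standard deviation of all values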
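The same three calculations can be sketched outside Excel. The following is a minimal Python illustration, not the original workbook: it assumes Python 3.8 or later for `statistics.NormalDist`, whose `inv_cdf` method plays the role of NORMSINV here, and the function name `qq_table` is made up for this example.

```python
from statistics import NormalDist, mean, stdev


def qq_table(data):
    """Return one (rank, value, probability, z_score, standardized) row per value."""
    values = sorted(data)            # ascending order, as in the Excel sheet
    n = len(values)
    avg = mean(values)
    sd = stdev(values)               # sample standard deviation (n - 1 divisor)
    std_normal = NormalDist()        # mean 0, standard deviation 1
    rows = []
    for i, x in enumerate(values, start=1):
        p = (i - 0.5) / n            # a) probability for rank i
        z = std_normal.inv_cdf(p)    # b) same role as NORMSINV(p)
        s = (x - avg) / sd           # c) standardized value of x
        rows.append((i, x, p, z, s))
    return rows


# First data set from the question (14 values).
sample = [2.6, 4.2, 1.5, 2.0, 0.6, 0.7, 6.6, 2.2, 9.7, 1.8, 4.2, 4.4, 0.6, 0.2]
for rank, x, p, z, s in qq_table(sample):
    print(f"{rank:2d}  {x:4.1f}  {p:.6f}  {z:9.5f}  {s:9.5f}")
```

Plotting the Z-score column on the horizontal axis against the standardized column on the vertical axis gives the Q-Q plot.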

We now apply this method to each data set:

1) For the first data set, which contains 14 values, we sort the values in ascending order, calculate the probability, Z-Score and standardized value for each one in Excel, and tabulate the results as shown below:

| S.No | Data | Probability = (i-0.5)/n | Z-Score | Standardised Data |
|---|---|---|---|---|
| 1 | 0.2 | 0.035714 | -1.80274 | -1.0293 |
| 2 | 0.6 | 0.107143 | -1.24187 | -0.87958 |
| 3 | 0.6 | 0.178571 | -0.92082 | -0.87958 |
| 4 | 0.7 | 0.25 | -0.67449 | -0.84215 |
| 5 | 1.5 | 0.321429 | -0.46371 | -0.54272 |
| 6 | 1.8 | 0.392857 | -0.27188 | -0.43043 |
| 7 | 2 | 0.464286 | -0.08964 | -0.35558 |
| 8 | 2.2 | 0.535714 | 0.089642 | -0.28072 |
| 9 | 2.6 | 0.607143 | 0.27188 | -0.131 |
| 10 | 4.2 | 0.678571 | 0.463708 | 0.467864 |
| 11 | 4.2 | 0.75 | 0.67449 | 0.467864 |
| 12 | 4.4 | 0.821429 | 0.920823 | 0.542722 |
| 13 | 6.6 | 0.892857 | 1.241867 | 1.366162 |
| 14 | 9.7 | 0.964286 | 1.802743 | 2.526464 |

Then, using the last 2 columns (Z-Score and Standardised Data), we draw a Q-Q plot, i.e. a scatter plot with a linear trend line, which shows how far the standardized values deviate from what a normal distribution would give:

In the Q-Q plot above, the blue curve (the quantile plot) deviates substantially from the black linear trend line. Hence this data set cannot be treated as coming from an approximately normal population.
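The visual judgement can be backed by numbers. The sketch below is an added illustration in Python, not part of the Excel solution. It assumes the reference line is the identity line y = x (standardized value equal to Z-score) rather than Excel's fitted trend line, and the helper name `qq_gaps` is hypothetical.

```python
from statistics import NormalDist, mean, stdev


def qq_gaps(data):
    """Standardized value minus theoretical Z-score for each sorted value."""
    values = sorted(data)
    n = len(values)
    avg, sd = mean(values), stdev(values)   # stdev uses the n - 1 divisor
    std_normal = NormalDist()
    return [
        (x - avg) / sd - std_normal.inv_cdf((i - 0.5) / n)
        for i, x in enumerate(values, start=1)
    ]


# First data set from the question (14 values).
first_set = [2.6, 4.2, 1.5, 2.0, 0.6, 0.7, 6.6, 2.2, 9.7, 1.8, 4.2, 4.4, 0.6, 0.2]
gaps = qq_gaps(first_set)
print("gap at smallest value:", round(gaps[0], 3))
print("gap at largest value: ", round(gaps[-1], 3))
print("largest absolute gap: ", round(max(abs(g) for g in gaps), 3))
```

For this data set both the smallest and the largest points sit more than 0.7 above the identity line while several middle points sit below it. That bowed shape is what a right-skewed sample typically produces.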

2) For the second data set, which contains 18 values, we again sort the values in ascending order, calculate the probability, Z-Score and standardized value for each one in Excel, and tabulate the results as shown below:

| S.No | Data | Probability = (i-0.5)/n | Z-Score | Standardised Data |
|---|---|---|---|---|
| 1 | 4.3 | 0.027778 | -1.91451 | -1.84816 |
| 2 | 5.2 | 0.083333 | -1.38299 | -1.51512 |
| 3 | 6.3 | 0.138889 | -1.08532 | -1.10807 |
| 4 | 7.3 | 0.194444 | -0.86163 | -0.73803 |
| 5 | 7.5 | 0.25 | -0.67449 | -0.66402 |
| 6 | 7.8 | 0.305556 | -0.50849 | -0.55301 |
| 7 | 8.5 | 0.361111 | -0.35549 | -0.29398 |
| 8 | 8.8 | 0.416667 | -0.21043 | -0.18297 |
| 9 | 9 | 0.472222 | -0.06968 | -0.10896 |
| 10 | 9.3 | 0.527778 | 0.069685 | 0.002056 |
| 11 | 9.9 | 0.583333 | 0.210428 | 0.224082 |
| 12 | 10.1 | 0.638889 | 0.35549 | 0.29809 |
| 13 | 10.5 | 0.694444 | 0.508488 | 0.446107 |
| 14 | 11.2 | 0.75 | 0.67449 | 0.705138 |
| 15 | 11.6 | 0.805556 | 0.861634 | 0.853155 |
| 16 | 12.3 | 0.861111 | 1.085325 | 1.112185 |
| 17 | 13.1 | 0.916667 | 1.382994 | 1.408219 |
| 18 | 14.6 | 0.972222 | 1.914506 | 1.963284 |

Using Excel's scatterplot, we again plot the Q-Q Plot as shown below:

Here in the Q-Q plot, we see that most of the points, except 2 points, lie close to the black linear trend line. Hence it is reasonable to treat this data set as coming from an approximately normal population.
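One way to put a number on "close to the trend line" is the correlation between the theoretical Z-scores and the sorted data, which approaches 1 as the Q-Q points become collinear. The sketch below is an added Python illustration, not part of the Excel solution. The helper name `qq_correlation` is hypothetical, and no formal cutoff value is implied.

```python
from math import sqrt
from statistics import NormalDist, mean


def qq_correlation(data):
    """Pearson correlation between sorted data and their theoretical Z-scores."""
    values = sorted(data)
    n = len(values)
    std_normal = NormalDist()
    z = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    z_bar, x_bar = mean(z), mean(values)
    cov = sum((a - z_bar) * (b - x_bar) for a, b in zip(z, values))
    var_z = sum((a - z_bar) ** 2 for a in z)
    var_x = sum((b - x_bar) ** 2 for b in values)
    return cov / sqrt(var_z * var_x)


# Both data sets from the question.
first_set = [2.6, 4.2, 1.5, 2.0, 0.6, 0.7, 6.6, 2.2, 9.7, 1.8, 4.2, 4.4, 0.6, 0.2]
second_set = [8.8, 11.2, 11.6, 6.3, 9.3, 10.5, 14.6, 8.5, 7.3, 7.5,
              5.2, 9.0, 4.3, 9.9, 7.8, 13.1, 12.3, 10.1]
r_first = qq_correlation(first_set)
r_second = qq_correlation(second_set)
print(f"first data set:  r = {r_first:.3f}")
print(f"second data set: r = {r_second:.3f}")
```

The second data set gives a correlation noticeably closer to 1 than the first, which agrees with the visual reading of the two plots. Because correlation is unchanged by shifting and rescaling, using the raw data instead of the standardized values gives the same result.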

