Question

[4 5 5 2 4 4 6 3 3 7 5 3 6 3 4 4 6 5 4 5 3 7 5 5 4 2 6 5 6 6]

This is my dataset

  1. Find the mean, median, mode, variance, standard deviation, coefficient of variation, range, 70th percentile, 3rd quartile and skewness of the data, and define what each of these statistics measures. For example, the mean is a measure of central tendency; what about the rest?
  2. Use Chebyshev’s rule to find the percent of data values located within 1,5 standard deviations of the mean.
  3. Create a new series by standardization, that is, find the z-score of each data value in the dataset. Find out whether there is any outlier using the z-score rule. If you forgot it, check slide 21 of lecture note 3b.
  4. Draw a box plot (5-number summary of the data) and find out whether there are any outliers.

Solutions

The mean, median and mode are measures of central tendency.

The mean measures the average of the data values. The median is the value that separates the bottom 50% of the observations from the top 50%. The mode is the most frequently occurring observation in the data.
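
As a quick check of these definitions on the dataset from the question, here is a minimal sketch using Python's standard `statistics` module. The variable names are illustrative and not part of the original solution.

```python
import statistics

# Dataset from the question (30 values).
data = [4, 5, 5, 2, 4, 4, 6, 3, 3, 7, 5, 3, 6, 3, 4, 4, 6, 5, 4, 5,
        3, 7, 5, 5, 4, 2, 6, 5, 6, 6]

mean = statistics.mean(data)      # sum of the values divided by their count
median = statistics.median(data)  # middle of the sorted data (average of the 15th and 16th values)
mode = statistics.mode(data)      # most frequently occurring value

print(round(mean, 4), median, mode)
```

The values sum to 137, so the mean is 137/30, about 4.57; the median and the mode are both 5.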

The variance, standard deviation, coefficient of variation and range are measures of dispersion.

The variance and standard deviation measure the spread of the observations around their mean. The coefficient of variation (CV) is the ratio of the standard deviation to the mean. The higher the CV, the greater the dispersion relative to the mean. The range is the difference between the largest and the smallest observation.
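
The dispersion measures can be sketched the same way. This assumes the sample formulas (dividing by n - 1), which is what `statistics.variance` and `statistics.stdev` use; the original solution does not say whether sample or population formulas were intended.

```python
import statistics

# Dataset from the question (30 values).
data = [4, 5, 5, 2, 4, 4, 6, 3, 3, 7, 5, 3, 6, 3, 4, 4, 6, 5, 4, 5,
        3, 7, 5, 5, 4, 2, 6, 5, 6, 6]

# Assumption: sample variance and standard deviation (divisor n - 1).
variance = statistics.variance(data)
std_dev = statistics.stdev(data)

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / statistics.mean(data)

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

print(round(variance, 4), round(std_dev, 4), round(cv, 4), data_range)
```

Under the population formulas (`statistics.pvariance`, `statistics.pstdev`) the variance and standard deviation would come out slightly smaller.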

Chebyshev's rule applies to any data set, whatever its distribution. It states that the proportion of observations lying within k standard deviations of the mean is at least 1 - 1/k^2, for any k > 1.

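For k = 5, we get 1 - 1/5^2 = 0.96. Hence at least 96% of the observations lie within 5 standard deviations of the mean. If the "1,5" in the question is read as the decimal 1.5, the same rule gives 1 - 1/1.5^2 ≈ 0.556, so at least about 55.6% of the observations lie within 1.5 standard deviations of the mean.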
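
The bound is easy to evaluate and to compare with the data. The sketch below is illustrative only: the function names are hypothetical, k = 1.5 is an assumed reading of the "1,5" in the question, and the sample standard deviation is assumed.

```python
import statistics

# Dataset from the question (30 values).
data = [4, 5, 5, 2, 4, 4, 6, 3, 3, 7, 5, 3, 6, 3, 4, 4, 6, 5, 4, 5,
        3, 7, 5, 5, 4, 2, 6, 5, 6, 6]


def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's rule is informative only for k > 1")
    return 1 - 1 / k**2


def share_within(values, k):
    """Observed proportion of values within k sample standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation
    return sum(abs(x - mean) <= k * sd for x in values) / len(values)


# Compare the guaranteed minimum with what the data actually show.
for k in (1.5, 5):
    print(k, round(chebyshev_lower_bound(k), 4), round(share_within(data, k), 4))
```

For this dataset, 26 of the 30 values fall within 1.5 standard deviations of the mean, comfortably above the guaranteed minimum; Chebyshev's rule is a lower bound, not an estimate.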

Consider the following output;

The column "standardized" gives the standardized values (Z scores) for the data. Since no observation has a Z score outside the interval (-2, 2), we conclude that there are no outliers in the data.
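
The standardized series can be recomputed directly from the data. This is a sketch, not the software output the solution refers to: it assumes the sample standard deviation and treats any value with |z| >= 2 as an outlier, matching the (-2, 2) interval used above.

```python
import statistics

# Dataset from the question (30 values).
data = [4, 5, 5, 2, 4, 4, 6, 3, 3, 7, 5, 3, 6, 3, 4, 4, 6, 5, 4, 5,
        3, 7, 5, 5, 4, 2, 6, 5, 6, 6]

mean = statistics.mean(data)
std_dev = statistics.stdev(data)  # assumption: sample standard deviation

# z-score: distance of each value from the mean, in standard deviations.
z_scores = [(x - mean) / std_dev for x in data]

# Flag values whose z-score falls outside the open interval (-2, 2).
outliers = [x for x, z in zip(data, z_scores) if abs(z) >= 2]

print(round(min(z_scores), 3), round(max(z_scores), 3))
print(outliers)
```

The smallest value (2) and the largest value (7) both have z-scores inside (-2, 2), so the list of outliers is empty, in line with the conclusion above.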

