Question

In: Statistics and Probability

A data set has 1000 observations. In the data, a quantitative variable's highest value is 780 and its lowest value is 95. a) How many classes would you recommend? b) What class interval would you recommend?

Expert Solution

The data set consists of 1000 observations.

Minimum value = 95 and maximum value = 780

To determine the number of classes, we use the 2^k > n rule.

The number of classes should be neither too many nor too few. A rough guideline for constructing k classes for sample data is to choose the smallest integer k such that 2^k > n, where n is the sample size.

a) The sample size is n = 1000. Since 2⁹ = 512 < 1000 and 2¹⁰ = 1024 > 1000, the smallest k that satisfies the rule is 10.

We would use k = 10 classes.
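The search for the smallest k can be sketched in a few lines of Python. This is only an illustration of the 2^k > n guideline stated above; the function name `number_of_classes` is a hypothetical helper, not part of the original solution.

```python
def number_of_classes(n: int) -> int:
    """Return the smallest integer k such that 2**k > n (the 2^k rule)."""
    k = 1
    # Keep increasing k until 2**k strictly exceeds the sample size.
    while 2 ** k <= n:
        k += 1
    return k


k = number_of_classes(1000)
print(k)  # 10, because 2**9 = 512 <= 1000 and 2**10 = 1024 > 1000
```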

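Class width is computed as (Largest - Smallest) / number of classes

= (780 - 95) / 10 = 68.5

Class width ≈ 69, rounding up so that the 10 classes cover the full range from 95 to 780. (Rounding down to 68 would end the last class at 774 and leave out the highest value.)

b) With a class width of 69, the class intervals are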

95 - 163

164 - 232

233 - 301

302 - 370

371 - 439

440 - 508

509 - 577

578 - 646

647 - 715

716 - 784
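The intervals listed above can be reproduced with a short Python sketch. It assumes whole-number data, inclusive class limits, and a raw width rounded up to the next integer (so consecutive lower limits are 69 apart); the function name `class_intervals` is hypothetical and used only for illustration.

```python
import math


def class_intervals(lowest: int, highest: int, k: int) -> list[tuple[int, int]]:
    """Return k inclusive (lower, upper) class limits for whole-number data."""
    # Round the raw width up so that the k classes cover the full range.
    width = math.ceil((highest - lowest) / k)
    intervals = []
    for i in range(k):
        lower = lowest + i * width
        upper = lower + width - 1  # inclusive upper limit
        intervals.append((lower, upper))
    return intervals


intervals = class_intervals(95, 780, 10)
for lower, upper in intervals:
    print(f"{lower} - {upper}")
```

The first class printed is 95 - 163 and the last is 716 - 784, so the highest value, 780, falls inside the final class.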

orchestra answered 2 years ago

Using R and a for loop: assume a data set of 1000 observations and 10 predictors. How would one use a for loop to cycle through different proportions of training and test sizes? For example, 20% of the data goes to training and 80% to test in the first iteration, with each iteration adding another 10% to the training set. So the first set = (20% train, 80% test), the second set = (30% train, 70% test), the third set = (40% train, 60% test), and so on.

A set of data contains 40 observations. The lowest value is 50 and the largest is 119. The data are to be organized into a frequency distribution. a. How many classes would you suggest? b. What would you suggest as the lower limit of the first class?

Collect at least 11 observations of quantitative data. Give the source of your data. Construct a dot plot of your data. Compute the mean, median, range, and sample standard deviation. Presentation counts.

What is the median of a set of data? How is the median represented for a set of n observations when i) n is odd, ii) n is even? iii) What is the weighted median of a frequency distribution?

A data set with whole numbers has a low value of 20 and a high value of 89. Find the class width for a frequency table with seven classes. If possible, Please show work in notebook.

The quantitative data set under consideration has roughly a bell-shaped distribution. Apply the empirical rule to answer the following question. A quantitative data set of size 90 has mean 40 and standard deviation 6. Approximately how many observations lie between 22 and 58? Approximately _____ observations lie between 22 and 58.

Given a data set with 100 observations, a goodness of fit test to see if a sample follows a uniform distribution, a Poisson distribution, or a normal distribution will have the same number of degrees of freedom. True or false? When a contingency table of expected frequencies is constructed, the null hypothesis is that all of the cells in the table are equally likely. True or false? Thank you :)

Given two sets of data, A and B. i) Data set A has an r value of -.81 and data set B has an r value of .94 Describe the differences between the two data sets as completely as you can using the regression information we have learned. ii) Which linear regression equation, the one for A or the one for B, would probably be a better predictor? Why?

Generate a simulated data set with 100 observations based on the following model. Each data point is a vector Z= (X, Y) where X describes the age of a machine New, FiveYearsOld, and TenYearsOld and Y describes whether the quality of output from the machine Normal or Abnormal. The probabilities of a machine being in the three states are P(X = New) = 1/4 P(X = FiveYearsOld) = 1/3 P(X = TenYearsOld) = 5/12 The probabilities of Normal output conditioned...

The lowest and highest observations in a population are 14 and 48, respectively. What is the minimum sample size n required to estimate μ with 90% confidence if the desired margin of error is E = 1.5? What happens to n if you decide to estimate μ with 95% confidence? (You may find it useful to reference the z table. Round intermediate calculations to at least 4 decimal places and "z" value to 3 decimal places. Round up your answers...