Question

Generate a simulated data set with 100 observations based on the following model. Each data point is a vector Z = (X, Y), where X describes the age of a machine (New, FiveYearsOld, or TenYearsOld) and Y describes whether the quality of output from the machine is Normal or Abnormal. The probabilities of a machine being in each of the three states are

P(X = New) = 1/4

P(X = FiveYearsOld) = 1/3

P(X = TenYearsOld) = 5/12

The probabilities of Normal output conditioned on machine age are

P(Y = Normal | X = New) = 8/10

P(Y = Normal | X = FiveYearsOld) = 8/10

P(Y = Normal | X = TenYearsOld) = 4/10

Your data should consist of two vectors, x and y, both of class character. Convert these to factors using the as.factor function. Analyze your simulated data using the chisq.test function with inputs x=x, y=y. Then repeat the analysis with the same function, but with simulated p-values, using the inputs x=x, y=y, simulate.p.value=TRUE, B=10000. Would you trust the p-value from the asymptotic distribution or the simulated p-value more? What conclusions can you draw about your simulated data from this analysis?
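
One possible way to set up the simulation and both tests is sketched below in R. It is not an official solution. The seed, the variable names (ages, p_age, p_normal, fit_asym, fit_sim), and the two-step sampling approach (draw the age first, then the output quality given that age) are assumptions chosen for illustration.

```r
set.seed(1)  # assumed seed, used only to make the sketch reproducible
n <- 100

# Model from the question: marginal distribution of machine age X
ages  <- c("New", "FiveYearsOld", "TenYearsOld")
p_age <- c(1/4, 1/3, 5/12)

# P(Y = Normal | X) for each age category
p_normal <- c(New = 0.8, FiveYearsOld = 0.8, TenYearsOld = 0.4)

# Step 1: draw the machine age for each observation
x <- sample(ages, size = n, replace = TRUE, prob = p_age)

# Step 2: draw the output quality conditional on the sampled age
y <- ifelse(runif(n) < unname(p_normal[x]), "Normal", "Abnormal")

# Both vectors are of class character at this point
stopifnot(is.character(x), is.character(y))

# Convert to factors, as the question asks
x <- as.factor(x)
y <- as.factor(y)

# Chi-squared test of independence using the asymptotic distribution
fit_asym <- chisq.test(x = x, y = y)

# Same test with a Monte Carlo p-value based on 10000 simulated tables
fit_sim <- chisq.test(x = x, y = y, simulate.p.value = TRUE, B = 10000)

print(table(x, y))         # observed counts
print(fit_asym$expected)   # expected counts under independence
print(fit_asym$p.value)    # asymptotic p-value
print(fit_sim$p.value)     # simulated p-value
```

When comparing the two p-values, the expected counts are a useful guide: the asymptotic chi-squared approximation is usually considered reasonable when they are not too small (a common rule of thumb is at least 5 per cell). Under this model with n = 100, the smallest expected count under independence is roughly 9, so the two p-values would typically be close. The simulated p-value does not rely on the large-sample approximation, at the cost of some Monte Carlo error. Because Y does depend on X in the generating model, a small p-value would be consistent with how the data were simulated.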
