True or false: Markov analysis is a type of analysis that allows us to predict future states by using the current state probabilities and a matrix of transition probabilities.
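As a minimal illustration of the idea in the statement, the sketch below multiplies a vector of state probabilities by a matrix of transition probabilities to get the next-period probabilities. The two-state numbers and the names `state`, `P`, and `step` are hypothetical and chosen only for illustration.

```python
# Hypothetical two-state example (illustrative numbers, not from the question).
state = [0.4, 0.6]   # current state probabilities
P = [[0.8, 0.2],     # row i holds the transition probabilities out of state i
     [0.1, 0.9]]


def step(pi, matrix):
    """Return the next-period state probabilities, i.e. the product pi * matrix."""
    n = len(matrix)
    return [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]


next_state = step(state, P)
print(next_state)
```

Applying `step` repeatedly projects the probabilities further ahead, one period per call.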
In: Math
Use the R script to answer the following questions (write your answers in the R script as comments starting with ##):
(1) Import FarmSize.csv into RStudio. Use the correct function to build a linear regression model predicting the average size of a farm from the number of farms. Give the model a name (e.g. FarmSize_Model). Call the model name to inspect the intercept and slope of the regression model. Verify the answers against your manual calculation.
(2) Use the correct function to generate the model's residuals for the 12 examples in the dataset. Create a residual plot with the independent variable on the x axis and the residual on the y axis.
(3) Use the correct function to inspect SSE, Se and r². Write down the values for these measures. Verify the answers against your manual calculation.
(4) Use the correct function to inspect the result of the significance test for the slope. What is the t value for the slope test? What is the p value? What is the statistical decision?
Year | NumberofFarms | AverageSize |
1950 | 5.65 | 213 |
1955 | 4.65 | 258 |
1960 | 3.96 | 297 |
1965 | 3.36 | 340 |
1970 | 2.95 | 374 |
1975 | 2.52 | 420 |
1980 | 2.44 | 426 |
1985 | 2.29 | 441 |
1990 | 2.15 | 460 |
1995 | 2.07 | 469 |
2000 | 2.17 | 434 |
2005 | 2.1 | 444 |
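The manual calculation that parts (1) to (4) ask you to verify can be sketched as follows. This is a Python cross-check under the usual simple least-squares formulas, not the R script the question asks for; in R, `lm()`, `resid()`, and `summary()` should report the same quantities. The variable names are illustrative.

```python
import math

# Data copied from the table above.
farms = [5.65, 4.65, 3.96, 3.36, 2.95, 2.52, 2.44, 2.29, 2.15, 2.07, 2.17, 2.10]
size = [213, 258, 297, 340, 374, 420, 426, 441, 460, 469, 434, 444]

n = len(farms)
x_bar = sum(farms) / n
y_bar = sum(size) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - x_bar) ** 2 for x in farms)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(farms, size))
syy = sum((y - y_bar) ** 2 for y in size)

# Least-squares slope and intercept.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residuals, SSE, standard error of the estimate (Se), and r squared.
residuals = [y - (intercept + slope * x) for x, y in zip(farms, size)]
sse = sum(e ** 2 for e in residuals)
se = math.sqrt(sse / (n - 2))
r2 = 1 - sse / syy

# t statistic for H0: slope = 0, with n - 2 degrees of freedom.
t_slope = slope / (se / math.sqrt(sxx))

print(f"intercept={intercept:.3f} slope={slope:.3f}")
print(f"SSE={sse:.3f} Se={se:.3f} r2={r2:.4f} t={t_slope:.3f}")
```

The p value for the slope still has to come from a t table or from R, since the Python standard library has no t distribution.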
In: Math
I need a regression analysis done on the following numbers.
IC | Price | Income | Temp | Lag-temp |
0.386 | 0.27 | 78 | 41 | 56 |
0.374 | 0.282 | 79 | 56 | 63 |
0.393 | 0.277 | 81 | 63 | 68 |
0.425 | 0.28 | 80 | 68 | 69 |
0.406 | 0.272 | 76 | 69 | 65 |
0.344 | 0.262 | 78 | 65 | 61 |
0.327 | 0.275 | 82 | 61 | 47 |
0.288 | 0.267 | 79 | 47 | 32 |
0.269 | 0.265 | 76 | 32 | 24 |
0.256 | 0.277 | 79 | 24 | 28 |
0.286 | 0.282 | 82 | 28 | 26 |
0.298 | 0.27 | 85 | 26 | 32 |
0.329 | 0.272 | 86 | 32 | 40 |
0.318 | 0.287 | 83 | 40 | 55 |
0.381 | 0.277 | 84 | 55 | 63 |
0.381 | 0.287 | 82 | 63 | 72 |
0.47 | 0.28 | 80 | 72 | 72 |
0.443 | 0.277 | 78 | 72 | 67 |
0.386 | 0.277 | 84 | 67 | 60 |
0.342 | 0.277 | 86 | 60 | 44 |
0.319 | 0.292 | 85 | 44 | 40 |
0.307 | 0.287 | 87 | 40 | 32 |
0.284 | 0.277 | 94 | 32 | 27 |
0.326 | 0.285 | 92 | 27 | 28 |
0.309 | 0.282 | 95 | 28 | 33 |
0.359 | 0.265 | 96 | 33 | 41 |
0.376 | 0.265 | 94 | 41 | 52 |
0.416 | 0.265 | 96 | 52 | 64 |
0.437 | 0.268 | 91 | 64 | 71 |
In: Math
A systematic random sample was taken from the set of all Presidents of the United States. The data file potus heights.csv contains the random sample, including the height (in inches) of each sampled President. (a) From these data, estimate the average height of United States Presidents. Calculate two error bounds for your estimate, one using the usual SRS formula and one using the successive difference variance estimator. (b) Which variance estimator is more appropriate for these data? Briefly explain.
president | hgt |
Van Buren | 56 |
McKinley | 57 |
Harrison | 68 |
Carter | 69 |
Roosevelt | 70 |
Cleveland | 71 |
Buchanan | 72 |
Kennedy | 72 |
AJackson | 73 |
GHWBush | 74 |
Lincoln | 76 |
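A rough sketch of the two variance estimates in part (a) is given below. It rests on several assumptions that the question does not state: the finite population correction is omitted because the population size is not given here, the error bound is taken as two standard errors, and the successive differences use the rows in the order listed above, which may not be the order in which the systematic sample was drawn.

```python
import math

# Heights (inches) in the order listed in the table above (assumed sampling order).
hgt = [56, 57, 68, 69, 70, 71, 72, 72, 73, 74, 76]
n = len(hgt)
mean_hgt = sum(hgt) / n

# Usual SRS estimate of the variance of the sample mean
# (finite population correction omitted: population size is an unknown here).
s2 = sum((y - mean_hgt) ** 2 for y in hgt) / (n - 1)
var_srs = s2 / n

# Successive difference estimate: squared differences between neighbouring
# sample units, divided by 2 * n * (n - 1).
sq_diffs = sum((hgt[i] - hgt[i - 1]) ** 2 for i in range(1, n))
var_sd = sq_diffs / (2 * n * (n - 1))

# Error bounds taken as two standard errors (an assumed convention).
bound_srs = 2 * math.sqrt(var_srs)
bound_sd = 2 * math.sqrt(var_sd)

print(f"mean={mean_hgt:.3f} bound_srs={bound_srs:.3f} bound_sd={bound_sd:.3f}")
```

Because the successive difference estimator depends on the ordering of the units, rerunning the sketch with the rows in their true sampling order may change `bound_sd` noticeably.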
In: Math
A regional transit company wants to determine whether there is a relationship between the age of a bus and the annual maintenance cost. A sample of 10 buses resulted in the following data:
Age of Bus (years) | Annual Maintenance Cost ($) |
1 | 350 |
2 | 370 |
2 | 480 |
2 | 520 |
2 | 590 |
3 | 550 |
4 | 750 |
4 | 800 |
5 | 790 |
5 | 950 |
Instructions:
In: Math
The observations are $Y_1, \ldots, Y_n$. The model is $Y_i = \beta x_i + \varepsilon_i$, $i = 1, \ldots, n$, where (i) $x_1, \ldots, x_n$ are known constants, and (ii) $\varepsilon_1, \ldots, \varepsilon_n$ are iid $N(0, \sigma^2)$. Find the MLEs of $\beta$ and $\sigma^2$. Are they jointly sufficient for $\beta$ and $\sigma^2$?
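One way to organize the derivation is sketched below, writing the error terms as $\varepsilon_i$ and assuming $\sum_i x_i^2 > 0$ so that the estimator of $\beta$ is well defined. The symbol $\ell$ for the log-likelihood is notation introduced here, not taken from the question.

```latex
\begin{aligned}
\ell(\beta,\sigma^2)
  &= -\frac{n}{2}\log(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(Y_i-\beta x_i)^2, \\
\frac{\partial \ell}{\partial \beta}
  &= \frac{1}{\sigma^2}\sum_{i=1}^{n} x_i\,(Y_i-\beta x_i) = 0
  \;\Longrightarrow\;
  \hat\beta = \frac{\sum_{i=1}^{n} x_i Y_i}{\sum_{i=1}^{n} x_i^2}, \\
\frac{\partial \ell}{\partial \sigma^2}
  &= -\frac{n}{2\sigma^2}
     + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(Y_i-\beta x_i)^2 = 0
  \;\Longrightarrow\;
  \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(Y_i-\hat\beta x_i)^2, \\
\sum_{i=1}^{n}(Y_i-\beta x_i)^2
  &= n\hat\sigma^2 + (\hat\beta-\beta)^2\sum_{i=1}^{n} x_i^2 .
\end{aligned}
```

The last line holds because the cross term $\sum_i x_i (Y_i - \hat\beta x_i)$ is zero by the first-order condition for $\hat\beta$. It shows that the likelihood depends on the data only through $(\hat\beta, \hat\sigma^2)$, so by the factorization theorem the pair is jointly sufficient for $(\beta, \sigma^2)$.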
In: Math
Assume that a sample is used to estimate a population mean μ. Find the 95% confidence interval for a sample of size 56 with a mean of 65.3 and a standard deviation of 6.4. Enter your answer as an open interval (i.e., in parentheses) accurate to 3 decimal places.
95% C.I. =
The answer should be obtained without any preliminary rounding.
In: Math
True or False
1. In a completely randomized experimental design with 10 treatments, if the sample size (n) is 40 and α = 0.05, then Tukey's critical value is qα = 4.82.
2. The Chi-Square distribution is a right-skewed distribution that is dependent on two degrees of freedom (the numerator df and the denominator df).
In: Math
Let x represent the average annual salary of college and university professors (in thousands of dollars) in the United States. For all colleges and universities in the United States, the population variance of x is approximately σ² = 47.1. However, a random sample of 15 colleges and universities in Kansas showed that x has a sample variance s² = 85.4. Use a 5% level of significance to test the claim that the variance for colleges and universities in Kansas is greater than 47.1. Find a 95% confidence interval for the population variance.
(a) What is the level of significance?
State the null and alternate hypotheses.
Ho: σ² = 47.1; H1: σ² < 47.1
Ho: σ² < 47.1; H1: σ² = 47.1
Ho: σ² = 47.1; H1: σ² ≠ 47.1
Ho: σ² = 47.1; H1: σ² > 47.1
(b) Find the value of the chi-square statistic for the sample. (Round your answer to two decimal places.)
What are the degrees of freedom?
What assumptions are you making about the original distribution?
We assume an exponential population distribution.
We assume a binomial population distribution.
We assume a normal population distribution.
We assume a uniform population distribution.
(c) Find or estimate the P-value of the sample test statistic.
P-value > 0.100
0.050 < P-value < 0.100
0.025 < P-value < 0.050
0.010 < P-value < 0.025
0.005 < P-value < 0.010
P-value < 0.005
(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?
Since the P-value > α, we fail to reject the null hypothesis.
Since the P-value > α, we reject the null hypothesis.
Since the P-value ≤ α, we reject the null hypothesis.
Since the P-value ≤ α, we fail to reject the null hypothesis.
(e) Interpret your conclusion in the context of the application.
At the 5% level of significance, there is insufficient evidence to conclude the variance of annual salaries is greater in Kansas.
At the 5% level of significance, there is sufficient evidence to conclude the variance of annual salaries is greater in Kansas.
(f) Find the requested confidence interval for the population variance. (Round your answers to two decimal places.)
lower limit |
upper limit |
Interpret the results in the context of the application.
We are 95% confident that σ² lies within this interval.
We are 95% confident that σ² lies above this interval.
We are 95% confident that σ² lies outside this interval.
We are 95% confident that σ² lies below this interval.
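The arithmetic behind parts (b), (c), and (f) can be sketched with the standard library alone. The sketch relies on the closed-form upper-tail probability of a chi-square distribution with an even number of degrees of freedom, which happens to apply here (14 degrees of freedom), and finds quantiles by bisection rather than from a table. The helper names `chi2_sf_even` and `chi2_upper_quantile` are made up for this example.

```python
import math

n, s2, sigma2_0 = 15, 85.4, 47.1
df = n - 1

# Test statistic: (n - 1) * s^2 / sigma_0^2.
chi2_stat = df * s2 / sigma2_0


def chi2_sf_even(x, df):
    """P(X > x) for a chi-square variable; valid only when df is even."""
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


def chi2_upper_quantile(tail, df, lo=0.0, hi=200.0):
    """Find x with P(X > x) = tail by bisection (the tail decreases in x)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_sf_even(mid, df) > tail:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Right-tailed P-value for H1: sigma^2 > 47.1.
p_value = chi2_sf_even(chi2_stat, df)

# 95% confidence interval for the population variance.
chi2_upper = chi2_upper_quantile(0.025, df)
chi2_lower = chi2_upper_quantile(0.975, df)
ci_low = df * s2 / chi2_upper
ci_high = df * s2 / chi2_lower

print(f"chi2={chi2_stat:.2f} df={df} p={p_value:.4f}")
print(f"95% CI for variance: ({ci_low:.2f}, {ci_high:.2f})")
```

For odd degrees of freedom the closed form does not apply, and a chi-square table or a statistics library would be needed instead.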
In: Math
a. Explain why stratifying the sample in order to control the effect of other factors is not practical.
b. Why is it important to specify a variable of interest and to distinguish between it and control variables?
In: Math
At this point, we have a variety of options when choosing a test or a confidence interval. I'd like for you to walk us through your process of choosing based on what you see in the problem. What do you look for? What helps you decide what to choose?
In: Math
Part 1
The systolic blood pressure of adults in the USA is nearly normally distributed with a mean of 120 and a standard deviation of 23.
Someone qualifies as having Stage 2 high blood pressure if their systolic blood pressure is 160 or higher.
a. Around what percentage of adults in the USA have Stage 2 high blood pressure? Give your answer rounded to two decimal places. ____ %
b. If you sampled 2000 people, how many would you expect to have BP > 160? Give your answer to the nearest person. Note: I had a bit of an issue encoding rounded answers, so try rounding both up and down if there's an issue! ____ people
c. Stage 1 high BP is specified as systolic BP between 140 and 160. What percentage of adults in the US qualify for Stage 1? _______ %
d. Your doctor tells you that you are in the 30th percentile for blood pressure among US adults. What is your systolic BP? Round to 2 decimal places. ________ mmHg
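The four parts of Part 1 all reduce to evaluating a normal CDF or its inverse. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later) is shown below; the variable names are illustrative, and the same pattern applies to the later parts with their own means and standard deviations.

```python
from statistics import NormalDist

# Systolic blood pressure model from Part 1: mean 120, standard deviation 23.
bp = NormalDist(mu=120, sigma=23)

# a. Percentage with BP of 160 or higher (upper tail).
pct_stage2 = (1 - bp.cdf(160)) * 100

# b. Expected count among 2000 sampled people.
expected_stage2 = 2000 * (1 - bp.cdf(160))

# c. Percentage with BP between 140 and 160.
pct_stage1 = (bp.cdf(160) - bp.cdf(140)) * 100

# d. Blood pressure at the 30th percentile (inverse CDF).
bp_30th = bp.inv_cdf(0.30)

print(f"a={pct_stage2:.2f}% b={expected_stage2:.0f} c={pct_stage1:.2f}% d={bp_30th:.2f}")
```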
Part 2
In the country of United States of Heightlandia, the height
measurements of ten-year-old children are approximately normally
distributed with a mean of 55.2 inches, and standard deviation of
8.7 inches.
What is the probability that the height of a randomly chosen child
is between 44.35 and 45.05 inches? Do not round
until you get your final answer, and then round to 3 decimal
places.
Part 3
A distribution of values is normal with a mean of 130.7 and a
standard deviation of 89.2.
Find the probability that a randomly selected value is greater than
202.1.
P(X > 202.1) =
Part 4
A distribution of values is normal with a mean of 150.1 and a
standard deviation of 78.5.
Find the probability that a randomly selected value is between
-38.3 and -30.5.
P(-38.3 < X < -30.5) =
Part 5
Company XYZ knows that replacement times for the portable MP3 players it produces are normally distributed with a mean of 3.6 years and a standard deviation of 1.1 years.
Find the probability that a randomly selected portable MP3 player will have a replacement time of less than 0.5 years.
P(X < 0.5 years) = ________
Enter your answer accurate to 4 decimal places. Answers obtained using exact z-scores or z-scores rounded to 3 decimal places are accepted.
If the company wants to provide a warranty so that only 4.4% of the portable MP3 players will be replaced before the warranty expires, what is the time length of the warranty?
warranty = _______ years
Part 6
The combined SAT scores for the students at a local high school
are normally distributed with a mean of 1503 and a standard
deviation of 308. The local college includes a minimum score of
1626 in its admission requirements.
What percentage of students from this school earn scores that fail
to satisfy the admission requirement?
P(X < 1626) = _______%
In: Math
On April 1, 1992, New Jersey's minimum wage was increased from $4.25 to $5.05 per hour, while the minimum wage in Pennsylvania stayed at $4.25 per hour. Energetic students collected data on 410 fast food restaurants in New Jersey (the treatment group) and eastern Pennsylvania (the control group). The "before" period is February 1992, and the "after" period is November 1992. Using these data, we will estimate the effect of the "treatment," raising the New Jersey minimum wage, on employment at fast food restaurants in New Jersey (i.e., H_0: δ = 0 versus H_A: δ < 0). It is easier and more general to compute the differences-in-differences (DID) estimate in regression format than from the sample means directly. Let y = FTE employment. The treatment variable is the indicator variable NJ = 1 if the observation is from New Jersey and zero if it is from Pennsylvania. The time indicator is D = 1 if the observation is from November and zero if it is from February.
(a) Write out the regression equation.
(b) Report the least squares estimates.
(c) At the α = .05 level of significance, the rejection region for the left-tail test of the above hypotheses is t ≤ -1.645. What is your conclusion?
(d) As with randomized controlled (quasi) experiments, it is interesting to check the robustness of the result from (c). Add indicator variables for the fast food chain and for whether the restaurant was company-owned rather than franchise-owned. Do these changes alter the DID estimate?
(e) Add indicator variables for geographical regions within the survey area. Do these changes alter the DID estimate?
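For reference, the usual regression form of a differences-in-differences setup with the indicators NJ and D defined in the question can be written as below. The coefficient names $\beta_1, \beta_2, \beta_3$ and the error term $e_{it}$ are notation assumed here; only $\delta$ appears in the question.

```latex
\begin{aligned}
y_{it} &= \beta_1 + \beta_2\, NJ_i + \beta_3\, D_t + \delta\,(NJ_i \times D_t) + e_{it}, \\
\hat\delta &= \bigl(\bar y_{NJ,\mathrm{Nov}} - \bar y_{NJ,\mathrm{Feb}}\bigr)
            - \bigl(\bar y_{PA,\mathrm{Nov}} - \bar y_{PA,\mathrm{Feb}}\bigr), \\
t &= \frac{\hat\delta}{\operatorname{se}(\hat\delta)}, \qquad
\text{reject } H_0 \text{ if } t \le -1.645 .
\end{aligned}
```

In the basic model without extra controls, the least squares estimate of $\delta$ equals the difference of the two before-and-after changes in sample means shown in the second line. Once the chain, ownership, or region indicators from parts (d) and (e) are added, that equality need not hold exactly.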
In: Math
Describe the kind of data that are collected for an independent-measures t test and the hypotheses that the test evaluates. The key to helping formulate your explanation would be to include the assumptions of this statistical model, the type of sample used in this model, and a statement about the null hypothesis.
In: Math