Question 1
Why can you never have 100% confidence in correctly estimating the population characteristic of interest?
When are you able to use the t distribution to develop the confidence interval estimation for the mean?
Why is it true that for a given sample size, n, an increase in confidence is achieved by widening (and making less precise) the confidence interval?
Why is the sample size needed to determine the proportion smaller when the population proportion is 0.20 than when the population proportion is 0.50?
What is the difference between a null hypothesis, H0, and an alternative hypothesis, H1?
What is the difference between a one-tailed test and a two-tailed test?
What is meant by a p value?
What is the six-step critical value approach to hypothesis testing? List the six steps.
Why can you never have 100% confidence in correctly estimating the population characteristic of interest?
In statistics, a confidence interval (CI) is a kind of interval
estimate of a population parameter and is used to indicate the
reliability of an estimate. It is an observed interval (i.e. it is
calculated from the observations), in principle different from
sample to sample, that frequently includes the parameter of
interest, if the experiment is repeated. How frequently the
observed interval contains the parameter is determined by the
confidence level or confidence coefficient. More specifically, the
meaning of the term "confidence level" is that, if confidence
intervals are constructed across many separate data analyses of
repeated (and possibly different) experiments, the proportion of
such intervals that contain the true value of the parameter will
match the confidence level; this is guaranteed by the reasoning
underlying the construction of confidence intervals.[1][2][3]
Confidence intervals consist of a range of values (interval) that
act as good estimates of the unknown population parameter. However,
in rare cases, none of these values may cover the value of the
parameter. The level of confidence of the confidence interval would
indicate the probability that the confidence range captures this
true population parameter given a distribution of samples. It does
not describe any single sample. This value is represented by a
percentage, so when we say, "we are 99% confident that the true
value of the parameter is in our confidence interval", we express
that 99% of the observed confidence intervals will hold the true
value of the parameter. After a sample is taken, the population
parameter either is in the resulting interval or is not; no
chance is involved. The level of confidence is set by the researcher
(not determined by the data). If a corresponding hypothesis test is
performed, the confidence level corresponds with the level of
significance, i.e. a 95% confidence interval reflects a
significance level of 0.05, and the confidence interval contains
the parameter values that, when tested, should not be rejected with
the same sample. Greater levels of confidence give larger
confidence intervals, and hence less precise estimates of the
parameter. Confidence intervals of difference parameters not
containing 0 imply that there is a statistically significant
difference between the populations.
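The repeated-sampling meaning of the confidence level can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: a normal population with a hypothetical true mean of 50 and a known standard deviation of 10, a z-based interval, and a made-up helper name `coverage`. It counts how often the interval computed from a fresh sample contains the true mean.

```python
import random
from statistics import NormalDist, mean

def coverage(confidence=0.95, n=30, trials=2000, mu=50.0, sigma=10.0, seed=0):
    """Fraction of z-intervals (sigma assumed known) that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw a new sample from the assumed population each time.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = mean(sample)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

print(f"observed coverage: {coverage():.3f}")
```

The observed fraction should land close to the chosen confidence level, but any single interval either contains the true mean or does not.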
Certain factors may affect the confidence interval size including
size of sample, level of confidence, and population variability. A
larger sample size normally will lead to a better estimate of the
population parameter.
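As a rough illustration of how these factors interact, the sketch below computes the half-width (margin of error) of a z-based interval for a mean, z * sigma / sqrt(n). The function name `margin_of_error` and the values sigma = 10 and n = 25 are assumptions chosen for the example, and the population standard deviation is treated as known.

```python
from statistics import NormalDist

def margin_of_error(confidence, sigma, n):
    """Half-width of a two-sided z-interval for a mean (sigma assumed known)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sigma / n ** 0.5

# Same sample size, increasing confidence: the interval gets wider.
for confidence in (0.90, 0.95, 0.99):
    print(confidence, round(margin_of_error(confidence, sigma=10.0, n=25), 2))

# Same confidence, larger sample: the interval gets narrower.
for n in (25, 100, 400):
    print(n, round(margin_of_error(0.95, sigma=10.0, n=n), 2))
```

With n held fixed, the only way to raise the confidence level is to use a larger critical value z, which widens the interval.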
A confidence interval does not predict that the true value of the
parameter has a particular probability of being in the confidence
interval given the data actually obtained. (An interval intended to
have such a property, called a credible interval, can be estimated
using Bayesian methods; but such methods bring with them their own
distinct strengths and weaknesses).
When are you able to use the t distribution to develop the confidence interval estimation for the mean?
Student’s t-distribution (or simply the t-distribution) is a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown, so the sample standard deviation is used in its place.
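A minimal sketch of a t-based interval for the mean follows. The ten sample values are hypothetical, the helper name `t_interval` is made up for the example, and the critical value 2.262 is the tabulated two-sided 95% value for 9 degrees of freedom, hard-coded because the Python standard library has no t quantile function.

```python
from statistics import mean, stdev

# Hypothetical sample of n = 10 measurements (illustrative values only).
sample = [9.8, 10.2, 10.4, 9.9, 10.0, 10.1, 9.7, 10.3, 10.0, 9.6]

T_CRIT_95_DF9 = 2.262  # tabulated two-sided 95% t critical value, df = n - 1 = 9
Z_CRIT_95 = 1.960      # normal critical value, shown for comparison only

def t_interval(data, crit):
    """Interval xbar +/- crit * s / sqrt(n), with s the sample standard deviation."""
    n = len(data)
    xbar = mean(data)
    se = stdev(data) / n ** 0.5  # estimated standard error of the mean
    return xbar - crit * se, xbar + crit * se

print("t interval:", t_interval(sample, T_CRIT_95_DF9))
print("z interval:", t_interval(sample, Z_CRIT_95))
```

The t critical value is larger than the z value, so the t interval is wider; this reflects the extra uncertainty from estimating the standard deviation from a small sample.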
Why is it true that for a given sample size, n, an increase in confidence is achieved by widening (and making less precise) the confidence interval?
Why is the sample size needed to determine the proportion smaller when the population proportion is 0.20 than when the population proportion is 0.50?
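The sample size formula for p is: n = (z/ME)^2 * p * (1-p),
where z is the critical value for the chosen confidence level and
ME is the desired margin of error.
The required sample size is largest when p = .5, because the
quantity p * (1-p) is largest when p = .5, i.e.
.1 * (1-.1) = .09
.3 * (1-.3) = .21
.5 * (1-.5) = .25
.7 * (1-.7) = .21
.9 * (1-.9) = .09
For p = .2 the product is .2 * (1-.2) = .16, which is smaller than
.25, so for the same z and ME the required n is smaller.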
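The formula n = (z/ME)^2 * p * (1-p) can be evaluated directly to compare the two cases in the question. This is a sketch under assumed settings: 95% confidence and a margin of error of 0.05, neither of which is given in the question, and a made-up function name `sample_size_for_proportion`. The result is rounded up because a sample size must be a whole number.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p, margin, confidence=0.95):
    """Smallest whole n satisfying n >= (z / margin)^2 * p * (1 - p)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z / margin) ** 2 * p * (1 - p))

for p in (0.20, 0.50):
    print(p, sample_size_for_proportion(p, margin=0.05))
```

Under these assumptions the required n is 246 for p = 0.20 and 385 for p = 0.50, in line with the products .16 and .25.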