Question

4. You have collected weekly earnings and age data from a sub-sample of 1,744 individuals using the Current Population Survey in a given year.

(a) Given the overall mean of $434.49 and a standard deviation of $294.67, construct a 99% confidence interval for average earnings in the entire population. State the meaning of this interval in words, rather than just in numbers. If you constructed a 90% confidence interval instead, would it be smaller or larger? What is the intuition?

(b) When dividing your sample into people 45 years and older, and younger than 45, the information shown in the table is found.

| Age Category | Average Earnings | Standard Deviation | N    |
|--------------|------------------|--------------------|------|
| Age ≥ 45     | $468.87          | $308.64            | 507  |
| Age < 45     | $412.20          | $276.63            | 1237 |

Test whether or not the difference in average earnings is statistically significant. Given your knowledge of age-earning profiles, does this result make sense?

Solutions

Expert Solution

a)

The 99% confidence interval for mean weekly earnings is

434.49 ± 2.58 × 294.67/√1744 = 434.49 ± 18.20 = ($416.29, $452.69).

Based on the sample at hand, the best guess for the population mean is $434.49. However, because of random sampling error, this guess is likely to be wrong. Instead, the interval estimate for average earnings runs from $416.29 to $452.69. Committing to such an interval repeatedly, over many samples, implies that the resulting statement is incorrect about 1 out of 100 times.

For a 90% confidence interval, the only change in the calculation of the confidence interval is to replace 2.58 by 1.64.

Hence the confidence interval is smaller. A smaller interval implies, given the same average earnings and the standard deviation, that the statement will be false more often.

The larger the confidence interval, the more likely it is to contain the population value.
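The arithmetic in part (a) can be checked with a minimal sketch. It assumes the large-sample normal approximation and the rounded critical values 2.58 and 1.64 used above; the function name `mean_confidence_interval` is chosen here for illustration only.

```python
import math

def mean_confidence_interval(mean, sd, n, z):
    """Large-sample confidence interval for a mean: mean ± z * sd / sqrt(n)."""
    standard_error = sd / math.sqrt(n)
    margin = z * standard_error
    return mean - margin, mean + margin

# Values from part (a); 2.58 and 1.64 are the rounded normal critical values.
ci99 = mean_confidence_interval(434.49, 294.67, 1744, 2.58)
ci90 = mean_confidence_interval(434.49, 294.67, 1744, 1.64)

print("99%% CI: (%.2f, %.2f)" % ci99)
print("90%% CI: (%.2f, %.2f)" % ci90)
```

The 90% interval comes out narrower than the 99% interval because only the critical value changes while the standard error stays the same.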

b)

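Assuming unequal population variances, the t-statistic for the difference in means is

t = (468.87 − 412.20) / √(308.64²/507 + 276.63²/1237) = 56.67 / 15.80 ≈ 3.59,

which is statistically significant at conventional levels, whether we use a two-sided or a one-sided t test.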

Hence the null hypothesis of equal average earnings in the two groups is rejected.

Age earning profiles typically take on an inverted U shape. Maximum earnings occur in the 40s, depending on some other factors such as years of education, which are not considered here. Hence it is not clear if the alternative hypothesis should be one sided or two sided. In such a situation, it is best to assume a two sided alternative hypothesis.
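The test in part (b) can be reproduced with a short sketch. It assumes unequal variances and, because both groups are large, uses the standard normal distribution for the p-value rather than a t distribution; the function name `unequal_variance_t` is illustrative.

```python
import math

def unequal_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """t-statistic for a difference in means without pooling the variances."""
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se_diff, se_diff

# Values from the table: age >= 45 versus age < 45.
t_stat, se_diff = unequal_variance_t(468.87, 308.64, 507, 412.20, 276.63, 1237)

# Two-sided p-value under the normal approximation: P(|Z| > |t|) = erfc(|t| / sqrt(2)).
p_two_sided = math.erfc(abs(t_stat) / math.sqrt(2))

print("standard error of the difference: %.2f" % se_diff)
print("t-statistic: %.2f" % t_stat)
print("two-sided p-value: %.5f" % p_two_sided)
```

Since the t-statistic exceeds both 1.96 and 2.58, the conclusion does not depend on whether the alternative is one-sided or two-sided.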

