Question

The data set data_ksubs.csv contains information on net financial wealth (nettfa), age of the survey respondent (age), annual family income (inc), family size (fsize), and participation in certain pension plans for people in the United States. The wealth and income variables are both recorded in thousands of dollars. In particular, the variable e401k equals 1 if the person is eligible for a 401(k), an employer-sponsored retirement savings plan, and 0 otherwise.

(a) Create a scatter plot of nettfa against inc. Is there a visible correlation between nettfa and inc? Do you think that a regression of nettfa on inc may feature heteroskedasticity? Explain.
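One way to approach part (a) is sketched below in Python. Since data_ksubs.csv is not reproduced here, the sketch runs on SYNTHETIC stand-in data whose error spread grows with income; none of the numbers it prints describe the real data set. The helper names (`simple_ols_residuals`, `residual_sd_by_bin`, `plot_scatter`) are hypothetical. The idea is that, beyond eyeballing the scatter plot, one can compare the spread of the regression residuals across income quartiles: a spread that rises with inc is the fan shape typical of heteroskedasticity.

```python
import numpy as np

def simple_ols_residuals(x, y):
    """Residuals from regressing y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def residual_sd_by_bin(x, resid, n_bins=4):
    """Standard deviation of the residuals within quantile bins of x."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    # Interior edges split the observations into bins 0 .. n_bins - 1.
    bin_idx = np.searchsorted(edges[1:-1], x, side="right")
    return np.array([resid[bin_idx == b].std(ddof=1) for b in range(n_bins)])

def plot_scatter(inc, nettfa, path="nettfa_vs_inc.png"):
    """Save a scatter plot of nettfa against inc (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.scatter(inc, nettfa, s=5, alpha=0.4)
    ax.set_xlabel("inc (thousands of dollars)")
    ax.set_ylabel("nettfa (thousands of dollars)")
    fig.savefig(path)
    plt.close(fig)

# SYNTHETIC stand-in data (an assumption, NOT the contents of data_ksubs.csv):
# the error standard deviation is made proportional to income on purpose.
rng = np.random.default_rng(0)
inc = rng.uniform(10, 100, size=2000)
nettfa = -10 + 0.8 * inc + rng.normal(size=2000) * 0.5 * inc

corr = np.corrcoef(inc, nettfa)[0, 1]
resid = simple_ols_residuals(inc, nettfa)
sd_by_bin = residual_sd_by_bin(inc, resid)
print("correlation:", round(corr, 3))
print("residual SD by income quartile:", np.round(sd_by_bin, 1))
```

With the real file, the two arrays would instead come from the nettfa and inc columns of data_ksubs.csv (column names assumed from the question text), and `plot_scatter(inc, nettfa)` would produce the requested figure.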

Solutions

Expert Solution

(b) Suppose that the least-squares assumptions are satisfied, and estimate the following regression model:

$$\text{nettfa}_i = \beta_0 + \beta_1\,\text{male}_i + \beta_2\,\text{e401k}_i + \beta_3\,\text{inc}_i + \beta_4\,\text{age}_i + u_i, \qquad i = 1, \dots, n. \tag{2}$$

Report the estimated regression coefficients and discuss their signs (whether or not they are as expected), their heteroskedasticity-robust standard errors, and their significance levels. Also report the $R^2$, the adjusted $R^2$ ($\bar{R}^2$), and the value of the F-statistic for the null hypothesis that all the slope coefficients are equal to 0. Do you reject the null hypothesis?
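The quantities requested for model (2) can be computed as in the following minimal sketch. It uses plain NumPy and the HC1 variant of the heteroskedasticity-robust covariance matrix (one common choice; the question does not say which variant is intended). The function names `ols_robust` and `robust_f` are hypothetical, and the data are SYNTHETIC stand-ins with made-up coefficients, so the printed numbers say nothing about data_ksubs.csv.

```python
import numpy as np

def ols_robust(X, y):
    """OLS with HC1 robust standard errors; X must contain a constant column."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator: (X'X)^-1 [sum_i u_i^2 x_i x_i'] (X'X)^-1 * n / (n - k)
    meat = (X * resid[:, None] ** 2).T @ X
    cov = XtX_inv @ meat @ XtX_inv * n / (n - k)
    ssr = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    return {
        "beta": beta,
        "se": np.sqrt(np.diag(cov)),
        "cov": cov,
        "r2": 1 - ssr / sst,
        "adj_r2": 1 - (ssr / (n - k)) / (sst / (n - 1)),
    }

def robust_f(fit, idx):
    """Robust F-statistic (Wald statistic / q) for H0: beta[idx] = 0."""
    b = fit["beta"][idx]
    V = fit["cov"][np.ix_(idx, idx)]
    return float(b @ np.linalg.solve(V, b)) / len(idx)

# SYNTHETIC stand-in data (assumed coefficients, NOT the real data set).
rng = np.random.default_rng(1)
n = 3000
male = rng.integers(0, 2, n).astype(float)
e401k = rng.integers(0, 2, n).astype(float)
inc = rng.uniform(10, 100, n)
age = rng.uniform(25, 64, n)
u = rng.normal(size=n) * 0.3 * inc  # heteroskedastic errors
nettfa = -50 + 2 * male + 6 * e401k + 0.8 * inc + 1.0 * age + u

X = np.column_stack([np.ones(n), male, e401k, inc, age])
fit = ols_robust(X, nettfa)
f_all_slopes = robust_f(fit, [1, 2, 3, 4])  # H0: all slopes equal 0

for name, b, se in zip(["const", "male", "e401k", "inc", "age"], fit["beta"], fit["se"]):
    print(f"{name:6s} coef={b:8.3f}  robust se={se:6.3f}  t={b / se:7.2f}")
print("R2:", round(fit["r2"], 3), "adj R2:", round(fit["adj_r2"], 3))
print("robust F (all slopes = 0):", round(f_all_slopes, 1))
```

The robust F-statistic is compared with the critical value of an F distribution with 4 and n - 5 degrees of freedom (or a chi-squared with 4 degrees of freedom divided by 4 in large samples). If statsmodels is available, fitting with `cov_type="HC1"` should report comparable standard errors.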

(c) We now introduce additional variables and nonlinearities into the model. We add the square of age (agesq), the square of income (incsq), a dummy for the individual being married (marr), and the household size (fsize). We thus estimate the following model:

$$\begin{aligned}
\text{nettfa}_i = {} & \beta_0 + \beta_1\,\text{male}_i + \beta_2\,\text{e401k}_i + \beta_3\,\text{inc}_i + \beta_4\,\text{age}_i \\
& + \beta_5\,\text{incsq}_i + \beta_6\,\text{agesq}_i + \beta_7\,\text{marr}_i + \beta_8\,\text{fsize}_i + u_i, \qquad i = 1, \dots, n.
\end{aligned} \tag{3}$$

Obtain the OLS estimates of the regression coefficients and their heteroskedasticity-robust standard errors. Compare the new estimates with those obtained in (b). How have they changed? Compare the adjusted $R^2$ of this model with that of model (2). Obtain the F-statistic for the null hypothesis that $\beta_5$, $\beta_6$, $\beta_7$, and $\beta_8$ are jointly equal to 0, and for the null hypothesis that $\beta_5$ and $\beta_6$ are jointly equal to 0.
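The comparison between models (2) and (3) and the two joint tests can be sketched as follows. As before, this is an illustration on SYNTHETIC stand-in data with made-up coefficients, not a solution based on data_ksubs.csv; `ols_robust` and `robust_f` are hypothetical helper names, and HC1 is an assumed choice of robust covariance. Each joint test is a robust Wald statistic divided by the number of restrictions.

```python
import numpy as np

def ols_robust(X, y):
    """OLS with HC1 robust covariance; X must contain a constant column."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = (X * resid[:, None] ** 2).T @ X
    cov = XtX_inv @ meat @ XtX_inv * n / (n - k)
    ssr = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    adj_r2 = 1 - (ssr / (n - k)) / (sst / (n - 1))
    return {"beta": beta, "se": np.sqrt(np.diag(cov)), "cov": cov, "adj_r2": adj_r2}

def robust_f(fit, idx):
    """Robust F-statistic (Wald statistic / q) for H0: beta[idx] = 0."""
    b = fit["beta"][idx]
    V = fit["cov"][np.ix_(idx, idx)]
    return float(b @ np.linalg.solve(V, b)) / len(idx)

# SYNTHETIC stand-in data (assumed coefficients, NOT the real data set).
rng = np.random.default_rng(2)
n = 3000
male = rng.integers(0, 2, n).astype(float)
e401k = rng.integers(0, 2, n).astype(float)
marr = rng.integers(0, 2, n).astype(float)
fsize = rng.integers(1, 7, n).astype(float)
inc = rng.uniform(10, 100, n)
age = rng.uniform(25, 64, n)
u = rng.normal(size=n) * 0.3 * inc  # heteroskedastic errors
nettfa = (20 + 2 * male + 6 * e401k - 0.3 * inc - 1.5 * age
          + 0.01 * inc**2 + 0.03 * age**2 + u)

# Model (2): constant, male, e401k, inc, age.
X2 = np.column_stack([np.ones(n), male, e401k, inc, age])
# Model (3): model (2) plus incsq, agesq, marr, fsize (columns 5 to 8).
X3 = np.column_stack([X2, inc**2, age**2, marr, fsize])

fit2 = ols_robust(X2, nettfa)
fit3 = ols_robust(X3, nettfa)

f_added = robust_f(fit3, [5, 6, 7, 8])  # H0: beta5 = beta6 = beta7 = beta8 = 0
f_squares = robust_f(fit3, [5, 6])      # H0: beta5 = beta6 = 0

print("shared coefficients, model (2):", np.round(fit2["beta"], 3))
print("shared coefficients, model (3):", np.round(fit3["beta"][:5], 3))
print("adj R2 (2):", round(fit2["adj_r2"], 3), " adj R2 (3):", round(fit3["adj_r2"], 3))
print("robust F, beta5..beta8 = 0:", round(f_added, 1))
print("robust F, beta5 = beta6 = 0:", round(f_squares, 1))
```

Once the squared terms enter, the coefficients on inc and age no longer measure constant marginal effects: the effect of income becomes $\beta_3 + 2\beta_5\,\text{inc}_i$, which is why the estimates of $\beta_3$ and $\beta_4$ can change sharply between the two models.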

