Question

I have a question that involves the use of Stata.

What regression would I have to run to answer this question?

I have the following variables:

  • WEEKPAY – Gross weekly pay in the respondent’s main job
  • GENDER – The respondent’s reported gender
  • MON – The month the respondent started her current job
  • YEAR – The year the respondent started her current job

What is the average size of the gender pay gap after the implementation (2018) of the regulation? Run a regression to estimate this. Think carefully about how the variables in your dataset might need to be transformed in order to interpret the estimated coefficients in a reasonable way. Justify your choice of model.

GRSSWK  Gender  Mon       Year
430     male    January   2018
420     male    January   2017
390     female  April     2017
390     male    June      2018
450     female  August    2017
400     female  December  2017
550     male    March     2017
500     male    March     2018
420     male    June      2018

Solutions

Expert Solution

We need to fit a regression model to the given data to estimate the average size of the gender pay gap after the implementation (2018) of the regulation. Three of the four columns are used. The first column, GRSSWK, is the gross weekly pay in the respondent's main job; the second column is the respondent's gender; and the last column is the year the respondent started his or her current job. GRSSWK is the response variable, and year and gender are the explanatory variables. In this data year is either 2017 or 2018 and gender is either male or female, so both explanatory variables are categorical with two levels. This is therefore a two-way ANOVA (analysis of variance) problem, which we can write as a multiple linear regression on two indicator (dummy) variables x1 and x2 representing year and gender respectively.

x1 = 1 if the start year is 2018 and x1 = 0 if it is 2017.

x2 = 1 if the respondent is male and x2 = 0 if the respondent is female.
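As a minimal sketch of this coding, the indicators can be built directly from the raw columns instead of being typed by hand. The data frame name pay and its column names are illustrative assumptions; the values are transcribed from the question's table.

```r
# Values transcribed from the question's table; the names pay, grsswk, x1, x2 are illustrative.
pay <- data.frame(
  grsswk = c(430, 420, 390, 390, 450, 400, 550, 500, 420),
  gender = c("male", "male", "female", "male", "female",
             "female", "male", "male", "male"),
  year   = c(2018, 2017, 2017, 2018, 2017, 2017, 2017, 2018, 2018)
)

# x1 = 1 for a 2018 start, 0 for a 2017 start.
pay$x1 <- as.integer(pay$year == 2018)
# x2 = 1 for male, 0 for female.
pay$x2 <- as.integer(pay$gender == "male")

# Cross-tabulate the two indicators to see how many respondents fall in each group.
tab <- table(year2018 = pay$x1, male = pay$x2)
print(tab)
```

The cross-tabulation is worth checking before fitting anything: in these nine rows no woman started in 2018, so that group is empty.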

Our model is

y=a + bx1 + cx2 + e

where y is GRSSWK, a is the intercept, b and c are the regression coefficients on x1 and x2, and e is the error term with mean 0 and constant variance.
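To see what each coefficient measures, it may help to write out the expected pay in each group under this coding: a for women who started in 2017, a + c for men who started in 2017, a + b for women who started in 2018, and a + b + c for men who started in 2018. The difference between men and women who started in the same year is therefore c. The sketch below checks this against the observed group means; it assumes the nine rows from the question's table, and the variable names year2018 and male are illustrative.

```r
# Values transcribed from the question's table; variable names are illustrative.
weekpay  <- c(430, 420, 390, 390, 450, 400, 550, 500, 420)
year2018 <- c(1, 0, 0, 1, 0, 0, 0, 1, 1)   # x1: 1 = started in 2018
male     <- c(1, 1, 0, 1, 0, 0, 1, 1, 1)   # x2: 1 = male

fit <- lm(weekpay ~ year2018 + male)
est <- coef(fit)

# Observed mean pay in each non-empty group.
mean_f17 <- mean(weekpay[male == 0 & year2018 == 0])
mean_m17 <- mean(weekpay[male == 1 & year2018 == 0])
mean_m18 <- mean(weekpay[male == 1 & year2018 == 1])

# With three non-empty groups and three coefficients, the fit reproduces the group means.
print(c(a = est[["(Intercept)"]], female_2017 = mean_f17))
print(c(c = est[["male"]], male_minus_female_2017 = mean_m17 - mean_f17))
print(c(b = est[["year2018"]], male_2018_minus_male_2017 = mean_m18 - mean_m17))
```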

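The estimate of c gives the average gender pay gap: it is the expected difference in weekly pay between men (x2 = 1) and women (x2 = 0) who started in the same year. The intercept a is not a gap; it is the expected pay of the baseline group, women who started in 2017. Because the model has no year-by-gender interaction, it assumes the gap is the same for 2017 and 2018 starters.

We can fit the model in R. The code is given below:

weekpay <- c(430, 420, 390, 390, 450, 400, 550, 500, 420) ## response y (GRSSWK)
year <- c(1, 0, 0, 1, 0, 0, 0, 1, 1) ## indicator of year (x1): 1 = 2018, 0 = 2017
gender <- c(1, 1, 0, 1, 0, 0, 1, 1, 1) ## indicator of gender (x2): 1 = male, 0 = female
year <- as.factor(year) ## treat year as a factor
gender <- as.factor(gender) ## treat gender as a factor
lm(weekpay ~ year + gender) ## fit the regression model

Output of R:

Call:
lm(formula = weekpay ~ year + gender)

Coefficients:
(Intercept)        year1      gender1
     413.33       -50.00        71.67

Hence the estimated gender pay gap is 71.67: men earn about 72 more per week than women who started in the same year. The intercept, 413.33, is the average weekly pay of women who started in 2017, not the gap. Note that none of the nine respondents shown is a woman who started in 2018, so this sample cannot estimate a separate gap for 2018 starters; the estimate relies on the assumption that the gap is the same in both years.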
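The question also asks whether the variables need to be transformed. One common option, which is not part of the solution above, is to model the logarithm of weekly pay so that the gender coefficient reads as an approximate proportional gap and not as an absolute one. The following is only a sketch under that assumption, using the nine rows from the question's table and illustrative variable names.

```r
# Values transcribed from the question's table; variable names are illustrative.
weekpay  <- c(430, 420, 390, 390, 450, 400, 550, 500, 420)
year2018 <- c(1, 0, 0, 1, 0, 0, 0, 1, 1)   # 1 = started in 2018
male     <- c(1, 1, 0, 1, 0, 0, 1, 1, 1)   # 1 = male

# Same additive model as before, but with log weekly pay as the response.
fit_log <- lm(log(weekpay) ~ year2018 + male)

# The coefficient on male is a difference in mean log pay;
# exp(coefficient) - 1 converts it to a ratio of geometric means, in percent.
log_gap <- coef(fit_log)[["male"]]
gap_pct <- 100 * (exp(log_gap) - 1)
print(gap_pct)
```

As with the model in levels, this specification assumes the same proportional gap for 2017 and 2018 starters.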

