Question

600,000 students in 79 countries take a PISA test every three years to evaluate their learning in reading, math, and science. Students in the Beijing-Shanghai-Jiangsu-Zhejiang regions in China achieved the highest average scores in all three subjects in 2018. Canada ranked around 12th overall but has had steadily declining scores in math and science. We have data on average math scores and characteristics of 500 school districts across Canada. This problem asks you to build a regression model to look for factors that affect average math test scores in various school districts in Canada.

I run a basic regression equation that explains math scores (SCORE) as a function of:
1. STR = the student-teacher ratio (number of students per teacher).
2. SPEND = government spending per student in the school district, in dollars per student.
3. HIESL = a dummy variable equal to 1 if the percentage of students learning English as a new language is above 20% and equal to 0 if it is below 20%.
The regression also includes the usual constant term.

There is a categorical variable called PROVINCE for the province of the school district. This variable =1 for Ontario, =2 for Quebec, =3 for Alberta, and =4 for British Columbia.

There is also a categorical variable called INCOME, which codes the average per capita income of households in the school district. INCOME = 1 if the district has average per capita income below $40000, = 2 if the district has per capita income between $40000 and $60000, and = 3 if the district has per capita income >= $60000.

How do I add PROVINCE and INCOME into my regression model? Which specification is correct?

  • A.

    SCORE = b2STR + b3SPEND + b4HIESL + b5ON + b6QC + b7BC + b8AB + b9INC1 + b10INC2 + b11INC3

    where
    ON = 1 if the district is in Ontario and 0 otherwise
    QC = 1 if the district is in Quebec and 0 otherwise
    BC = 1 if the district is in BC and 0 otherwise
    AB = 1 if the district is in Alberta and 0 otherwise
    INC1 = 1 if the district is in the low income range and 0 otherwise
    INC2 = 1 if the district is in the middle income range and 0 otherwise
    INC3 = 1 if the district is in the high income range and 0 otherwise

  • B.

    SCORE = b1 + b2STR + b3SPEND + b4HIESL + b5PROVINCE + b6INCOME

    where PROVINCE takes values 1 to 4
    and INCOME takes values 1 to 3.

  • C.

    SCORE = b1 + b2STR + b3SPEND + b4HIESL + b5ON + b6QC + b7BC + b8AB + b9INC1 + b10INC2 + b11INC3

    where
    ON = 1 if the district is in Ontario and 0 otherwise
    QC = 1 if the district is in Quebec and 0 otherwise
    BC = 1 if the district is in BC and 0 otherwise
    AB = 1 if the district is in Alberta and 0 otherwise
    INC1 = 1 if the district is in the low income range and 0 otherwise
    INC2 = 1 if the district is in the middle income range and 0 otherwise
    INC3 = 1 if the district is in the high income range and 0 otherwise

  • D.

    SCORE = b1 + b2STR + b3SPEND + b4HIESL + b5ON + b6QC + b7BC + b8INC2 + b9INC3

    where
    ON = 1 if the district is in Ontario and 0 otherwise
    QC = 1 if the district is in Quebec and 0 otherwise
    BC = 1 if the district is in BC and 0 otherwise
    INC2 = 1 if the district is in the middle income range and 0 otherwise
    INC3 = 1 if the district is in the high income range and 0 otherwise

  • E.

    SCORE = b1 + b2STR + b3SPEND + b4HIESL + b5ON + b6QC + b7BC + b8AB + b9INC2 + b10INC3

    where
    ON = 1 if the district is in Ontario and 0 otherwise
    QC = 1 if the district is in Quebec and 0 otherwise
    BC = 1 if the district is in BC and 0 otherwise
    AB = 1 if the district is in Alberta and 0 otherwise
    INC2 = 1 if the district is in the middle income range and 0 otherwise
    INC3 = 1 if the district is in the high income range and 0 otherwise

Solutions

Expert Solution

Note that the categorical variable PROVINCE takes 4 values:

1 for Ontario

2 for Quebec

3 for Alberta

4 for British Columbia

The categorical variable INCOME takes 3 values:

INCOME = 1 if the district has average per capita income below $40000

INCOME = 2 if the district has per capita income between $40000 and $60000

INCOME = 3 if the district has per capita income >= $60000

Note that the regression model, like the standard classical linear regression model, includes an intercept (the constant term b1).

Since there are two categorical variables, PROVINCE with 4 values and INCOME with 3 values, the regression needs dummy variables built from those values. The PROVINCE dummies are ON, QC, BC and AB,

and the three INCOME dummies are INC1, INC2 and INC3.
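A short worked identity may help show why all seven dummies cannot be kept alongside the intercept. The subscript i is added notation (not in the question) that indexes school districts. Each district lies in exactly one of the four provinces and exactly one income range, so:

```latex
\begin{align*}
ON_i + QC_i + BC_i + AB_i &= 1 \\
INC1_i + INC2_i + INC3_i &= 1
\end{align*}
```

Both sums reproduce the intercept's column of ones, so with every dummy included the regressors are perfectly collinear and OLS cannot separate b1 from the dummy coefficients.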

To avoid the dummy variable trap, keep one value of each of the two categorical variables as the base category and leave its dummy out of the regression. The regression equation then becomes:

SCORE = b1 + b2STR + b3SPEND + b4HIESL + b5ON + b6QC + b7BC + b8INC2 + b9INC3

where
ON = 1 if the district is in Ontario and 0 otherwise
QC = 1 if the district is in Quebec and 0 otherwise
BC = 1 if the district is in BC and 0 otherwise
INC2 = 1 if the district is in the middle income range and 0 otherwise
INC3 = 1 if the district is in the high income range and 0 otherwise

Note that in the above equation the dummy variables AB and INC1 are omitted, so Alberta and the low income range serve as the base categories.
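As an illustration of what a base category means, the expected scores for two otherwise identical low-income districts can be compared. This sketch assumes the regression error has zero conditional mean, which the question does not state explicitly:

```latex
\begin{align*}
E[SCORE \mid \text{Alberta, low income}] &= b_1 + b_2\,STR + b_3\,SPEND + b_4\,HIESL \\
E[SCORE \mid \text{Ontario, low income}] &= b_1 + b_2\,STR + b_3\,SPEND + b_4\,HIESL + b_5 \\
\text{difference} &= b_5
\end{align*}
```

So the coefficient on ON is read as the Ontario score gap relative to Alberta, holding the other regressors fixed. The remaining dummy coefficients are read the same way against their own base category.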

Thus, the correct answer is: D.
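The dummy-trap argument can also be checked numerically. The following is a minimal sketch, not part of the original solution: it builds the regressor matrix for options A, C, D and E on made-up toy districts and compares the number of columns with the matrix rank. The `districts` data, the formulas for STR and SPEND, and the helper names `rank`, `row` and `check` are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product


def rank(rows):
    """Exact matrix rank via Gaussian elimination on Fractions."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot: column c adds nothing new
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


# Toy districts (made up): one per PROVINCE x INCOME x HIESL combination.
# STR and SPEND are invented formulas, chosen only so that they are not
# exact linear combinations of the dummies.
districts = [
    {"STR": 15 + p * q, "SPEND": 8000 + 500 * p * h, "HIESL": h,
     "PROVINCE": p, "INCOME": q}
    for p, q, h in product((1, 2, 3, 4), (1, 2, 3), (0, 1))
]


def row(d, spec):
    """Regressor values for one district under option A, C, D or E."""
    # PROVINCE codes from the question: 1=ON, 2=QC, 3=AB, 4=BC.
    on, qc, ab, bc = (int(d["PROVINCE"] == k) for k in (1, 2, 3, 4))
    inc1, inc2, inc3 = (int(d["INCOME"] == k) for k in (1, 2, 3))
    base = [d["STR"], d["SPEND"], d["HIESL"]]
    return {
        "A": base + [on, qc, bc, ab, inc1, inc2, inc3],  # no intercept
        "C": [1] + base + [on, qc, bc, ab, inc1, inc2, inc3],
        "D": [1] + base + [on, qc, bc, inc2, inc3],
        "E": [1] + base + [on, qc, bc, ab, inc2, inc3],
    }[spec]


def check(spec):
    """Return (number of columns, rank) of the regressor matrix."""
    X = [row(d, spec) for d in districts]
    return len(X[0]), rank(X)


for spec in "ACDE":
    n_cols, r = check(spec)
    print(spec, n_cols, r, "full rank" if r == n_cols else "collinear")
```

On these toy data only option D has a rank equal to its number of columns, so it is the only one of the four whose coefficients OLS can pin down uniquely. Option A drops the intercept but remains collinear, because the four province dummies and the three income dummies both sum to one. Option B is left out of the check because its problem is different: it treats the codes 1 to 4 and 1 to 3 as numeric magnitudes rather than as category labels.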

