Question

Suppose we have data from a health survey conducted in year 2000. Data were obtained from a random sample of 1000 persons.

An OLS linear regression analysis was carried out in the following way:

Dependent Variable: Systolic blood pressure (SBP, in mmHg)

Independent Variables: Gender (1 if female, 0 if male)

Age (in years)

Education (binary variables for “Not graduated from high school” and “Graduated from high school (but not from college)”; the reference category is “Graduated from college”)

Part of the results is shown below. The column labeled “Beta” shows the estimated partial regression coefficients. (The betas for the reference categories, “Male” and “Graduated from college”, can be interpreted as fixed at zero.) The p-values are for two-sided tests.

| Variables | Beta | p-value |
|---|---|---|
| (Constant) | 100.00 | <0.01 |
| Gender (Female) | -3.00 | 0.04 |
| Age (in years) | 0.50 | <0.01 |
| Education: Not graduated from high school | 5.00 | <0.01 |
| Education: Graduated from high school | 2.00 | 0.08 |

1. According to the results of this regression analysis, how much expected difference in systolic blood pressure (in mmHg) is estimated:

1-1. between the two education categories, “Not graduated from high school” and “Graduated from college”, controlling for gender and age (i.e., among those who have the same gender and the same age)?

1-2. between males and females, controlling for age and education?

2. Suppose we change the reference category of education from “Graduated from college” to “Graduated from high school” and do the same regression analysis again.

What will be the value of partial regression coefficient (beta) for “Not graduated from high school”?

(Hint: The expected SBP differences among the education categories do not change.)

Solutions

Expert Solution

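The estimated regression equation, read off the Beta column of the results table, is

$$\widehat{\text{SBP}} = 100 - 3\,\text{Gender} + 0.5\,\text{Age} + 5\,\text{NGHS} + 2\,\text{GHS}$$

where Gender = 1 is female and Gender = 0 is male,

NGHS = 1 when Education = "Not graduated from high school", NGHS = 0 otherwise,

GHS = 1 when Education = "Graduated from high school", GHS = 0 otherwise.

When Education = "Graduated from college", NGHS = 0 and GHS = 0 (the reference category).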

1-1. The expected systolic blood pressure when the education category is “Not graduated from high school”,

setting NGHS=1, GHS=0 we get

$$\widehat{\text{SBP}} = 100 - 3\,\text{Gender} + 0.5\,\text{Age} + 5(1) + 2(0) = 105 - 3\,\text{Gender} + 0.5\,\text{Age}$$

The expected systolic blood pressure when the education category is “Graduated from college”,

setting NGHS=0, GHS=0 we get

$$\widehat{\text{SBP}} = 100 - 3\,\text{Gender} + 0.5\,\text{Age} + 5(0) + 2(0) = 100 - 3\,\text{Gender} + 0.5\,\text{Age}$$

For the same gender and age, the two expressions differ only by the NGHS coefficient. Hence the coefficient of “Not graduated from high school”, 5, indicates that the predicted systolic blood pressure for those who did not graduate from high school is 5 mmHg higher than for those who graduated from college, which is the reference category.

ans: the expected difference in systolic blood pressure between the two education categories, “Not graduated from high school” and “Graduated from college”, controlling for gender and age, is 5 mmHg

1-2. The coefficient of Gender is -3. Following the same logic as in 1-1, since male is the reference category, the coefficient of Gender (-3) indicates that the predicted systolic blood pressure for females is 3 mmHg lower than that for males of the same age and education.

ans: the expected difference in systolic blood pressure between males and females, controlling for age and education, is 3 mmHg (females lower than males)
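The two answers above can be checked numerically. The following is a minimal Python sketch, not part of the original solution: the function name `predict_sbp`, the category labels, and the age of 40 are illustrative assumptions, and only the coefficients come from the results table.

```python
# Coefficients copied from the results table in the question.
COEF = {"const": 100.0, "female": -3.0, "age": 0.5, "nghs": 5.0, "ghs": 2.0}


def predict_sbp(female, age, education):
    """Fitted SBP in mmHg.

    education is 'nghs', 'ghs', or 'college' (the reference category);
    these labels are hypothetical names chosen for this sketch.
    """
    nghs = 1 if education == "nghs" else 0
    ghs = 1 if education == "ghs" else 0
    return (COEF["const"] + COEF["female"] * female + COEF["age"] * age
            + COEF["nghs"] * nghs + COEF["ghs"] * ghs)


# Hold gender and age fixed (a male aged 40, chosen arbitrarily)
# and vary only education: this is question 1-1.
diff_education = predict_sbp(0, 40, "nghs") - predict_sbp(0, 40, "college")

# Hold age and education fixed and vary only gender: this is question 1-2.
diff_gender = predict_sbp(1, 40, "college") - predict_sbp(0, 40, "college")

print(diff_education, diff_gender)  # 5.0 -3.0
```

Because the model has no interaction terms, any other fixed age or education level gives the same two differences.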

2. Suppose we change the reference category of education from “Graduated from college” to “Graduated from high school” and do the same regression analysis again.

Let b0 be the new intercept, b3 the partial regression coefficient for "Not graduated from high school", and b4 the partial regression coefficient for “Graduated from college”.

The new codes would be

NGHS=1 when Education = "Not graduated from high school", NGHS=0 otherwise

GC=1 when Education = "Graduated from college", GC=0 otherwise

NGHS=0, GC=0 when Education = "Graduated from high school" (the reference category)

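Changing the reference category of education does not affect the coefficients of Gender and Age, so the new estimated equation has the form

$$\widehat{\text{SBP}} = b_0 - 3\,\text{Gender} + 0.5\,\text{Age} + b_3\,\text{NGHS} + b_4\,\text{GC}$$

where b0 is the intercept, b3 is the estimated coefficient of NGHS, and b4 is the estimated coefficient of GC.

The expected systolic blood pressure for education category “Not graduated from high school”,

setting NGHS=1, GC=0 we get

$$\widehat{\text{SBP}} = (b_0 + b_3) - 3\,\text{Gender} + 0.5\,\text{Age}$$

Similarly, the expected systolic blood pressure for education category “Graduated from college”, when the reference category is “Graduated from high school”,

setting NGHS=0, GC=1 we get

$$\widehat{\text{SBP}} = (b_0 + b_4) - 3\,\text{Gender} + 0.5\,\text{Age}$$

The expected systolic blood pressure for education category “Graduated from high school”, when the reference category is “Graduated from high school”,

setting NGHS=0, GC=0 we get

$$\widehat{\text{SBP}} = b_0 - 3\,\text{Gender} + 0.5\,\text{Age}$$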

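However, from the original model (reference category “Graduated from college”), the estimated equation for education category “Not graduated from high school” is

$$\widehat{\text{SBP}} = 105 - 3\,\text{Gender} + 0.5\,\text{Age}$$

the estimated equation for education category “Graduated from high school” is

$$\widehat{\text{SBP}} = 102 - 3\,\text{Gender} + 0.5\,\text{Age}$$

and the estimated equation for education category “Graduated from college” is

$$\widehat{\text{SBP}} = 100 - 3\,\text{Gender} + 0.5\,\text{Age}$$

Since the estimated SBP for each education category remains the same irrespective of the reference category, the constants must match category by category:

$$b_0 + b_3 = 105, \qquad b_0 = 102, \qquad b_0 + b_4 = 100$$

Solving these we get

$$b_0 = 102, \qquad b_3 = 105 - 102 = 3, \qquad b_4 = 100 - 102 = -2$$

ans: The value of the partial regression coefficient (beta) for “Not graduated from high school” would be 3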
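The same re-referencing can be sketched in Python. This is an illustrative check rather than the original author's method; the helper name `rereference` and the category labels are assumptions. It relies only on the hint in the question: the fitted differences among education categories do not change, so each new coefficient is the old coefficient minus the old coefficient of the new reference category, and the intercept absorbs that shift.

```python
def rereference(const, effects, new_ref):
    """Re-express dummy-variable coefficients against a new reference category.

    effects maps every category to its coefficient, with the old
    reference category listed explicitly at 0.0.
    """
    shift = effects[new_ref]          # old coefficient of the new reference
    new_const = const + shift         # intercept now describes the new reference
    new_effects = {cat: beta - shift for cat, beta in effects.items()}
    return new_const, new_effects


# Education coefficients from the results table (college was the reference).
old_effects = {"college": 0.0, "ghs": 2.0, "nghs": 5.0}

new_const, new_effects = rereference(100.0, old_effects, "ghs")

print(new_const)            # 102.0
print(new_effects["nghs"])  # 3.0
print(new_effects["college"])  # -2.0
```

The sum of intercept and category coefficient is the same before and after (for example 100 + 5 = 102 + 3 = 105 for “Not graduated from high school”), which is the invariance the solution uses.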

