In: Statistics and Probability
1) In a study, the final statistical analysis showed that R-squared = 0.35 (p<0.01). Which one of the following interpretations best explains this result?
A) The model explains 65% of the variation in the outcome, because 1.00-0.35=65%.
B) No conclusions can be drawn because it is not apparent whether the estimated coefficients for each covariate were statistically significant.
C) About 35% of the variation in the outcome was explained by the independent variable(s).
D) The model explains about 35% of the variation for independent variable(s).
2) Which of the following describes the multicollinearity problem?
A) Some independent variables are strongly correlated with each other.
B) When the coefficient for a product of two independent variables (X1*X2) is statistically significant (p<0.05).
C) There is an interaction effect between two independent variables.
D) It will occur when linear regression encounters step-wise regression.
3) A PGY1 post-graduate conducted a survey study in her community. Of 10,000 surveyed residents, there are 200 persons with diabetes mellitus, 50 persons with heart disease and 20 persons with both diabetes and heart disease. If a selected resident has diabetes mellitus, what is the probability that this same individual also has heart disease? (Clue: need to calculate the relevant probability).
A) 10%
B) 20%
C) 0.2%
D) 40%
E) 0.5%
4) A clinical researcher plans to conduct a linear regression analysis to assess the health-related quality of life score, which is the primary outcome and is continuous. The health outcome will be regressed on 10 predictors or confounding factors, including age, sex, race, BMI, health education, family income, number of years since disease onset, etc. Based on our discussion in the lecture, at least how many patients does he/she need to recruit for this linear regression?
A) 50
B) 150
C) 200
D) 1500
E) 30
5) Since you learned multiple linear regression analysis in class, you are given the following linear regression model: Y (female life expectancy) = 82.7 – 0.12 * (fertility number) – 0.24 * (infant mortality per 1000). Please predict the female life expectancy in Ghana, where fertility number = 5.8 and infant mortality per 1000 = 58.3.
A) 80.7
B) 68.0
C) 57.6
D) 69.0
6) A research scientist conducted a factorial ANOVA for her clinical study, which involved 5 different therapy regimens in each of four different hospital settings. In order to assess the therapy effect, she would like to evaluate any interaction effect between hospital and regimen. The degrees of freedom for the interaction are equal to:
A) 7
B) 12
C) 8
D) 20
E) 6
7) There are two kinds of inferential statistics: parametric vs. non-parametric statistics. Which of the following is NOT a parametric inferential statistic?
A) Student t-test
B) F-test
C) Two-way ANOVA
D) Wilcoxon test
E) ANCOVA
Solution-1:
R-squared = 0.35
=> 35% of the variation in the outcome Y is explained by the independent variable(s) X
C) About 35% of the variation in the outcome was explained by the independent variable(s).
Solution-2:
Multicollinearity means that some of the independent variables are strongly correlated with each other.
Correlation between the dependent variable and the independent variables is expected and is not multicollinearity; the problem concerns correlation among the independent variables themselves.
Variance inflation factor: VIF = 1/(1 - R^2), where R^2 comes from regressing one independent variable on the other independent variables.
A common rule of thumb: if VIF > 10, that variable is considered multicollinear and may be excluded.
A) Some independent variables are strongly correlated with each other.
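The VIF formula above can be checked with a few lines of Python. This is only a minimal sketch: the function name vif and the example R^2 values are illustrative assumptions, not part of the original question.

```python
def vif(r_squared):
    """Variance inflation factor for one predictor.

    r_squared is the R^2 obtained by regressing that predictor
    on all the other predictors (hypothetical input here).
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# Illustrative values only: an uncorrelated predictor, a moderately
# correlated one, and one just past the VIF > 10 rule of thumb.
for r2 in (0.0, 0.5, 0.95):
    print(f"R^2 = {r2:.2f} -> VIF = {vif(r2):.2f}")
```

Under the rule of thumb quoted above, a predictor is flagged once its R^2 against the other predictors exceeds 0.9, since that is where the VIF passes 10.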
Solution-3:
P(D) = 200/10000 = 0.02
P(H) = 50/10000 = 0.005
P(D and H) = 20/10000 = 0.002
The question asks for the probability that a resident has heart disease given that he or she has diabetes, i.e. P(H|D).
From the conditional probability formula:
P(H|D) = P(D and H)/P(D)
= 0.002/0.02
= 0.1
= 0.1*100
= 10%
A) 10%
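Because the direction of conditioning is easy to mix up in question 3, a short Python sketch that computes both directions from the survey counts may help. The function and variable names are illustrative assumptions; the counts are the ones given in the question.

```python
def conditional_probability(n_both, n_given):
    """P(A | B) estimated from counts: n(A and B) / n(B)."""
    return n_both / n_given


n_diabetes = 200   # residents with diabetes mellitus
n_heart = 50       # residents with heart disease
n_both = 20        # residents with both conditions

# Question 3 conditions on diabetes: P(heart disease | diabetes)
p_heart_given_diabetes = conditional_probability(n_both, n_diabetes)

# The reverse direction, P(diabetes | heart disease), is a different quantity
p_diabetes_given_heart = conditional_probability(n_both, n_heart)

print(f"P(H|D) = {p_heart_given_diabetes:.0%}")
print(f"P(D|H) = {p_diabetes_given_heart:.0%}")
```

The total of 10,000 residents cancels out of both ratios, so only the three counts are needed. Since the question states that the selected resident has diabetes, P(H|D) is the quantity asked for.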
Solution-5:
Y (female life expectancy) = 82.7 – 0.12 * (fertility number) – 0.24 * (infant mortality per 1000)
Given: fertility number = 5.8
infant mortality per 1000 = 58.3
Y = 82.7 - 0.12*5.8 - 0.24*58.3
Y = 82.7 - 0.696 - 13.992
Y = 68.012
B) 68.0
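The substitution in Solution-5 can be reproduced with a small Python sketch. The function name predict_life_expectancy is a hypothetical helper; the coefficients and inputs are the ones stated in question 5.

```python
def predict_life_expectancy(fertility, infant_mortality):
    """Fitted model from question 5:
    Y = 82.7 - 0.12 * fertility - 0.24 * infant_mortality_per_1000
    """
    return 82.7 - 0.12 * fertility - 0.24 * infant_mortality


# Values given for Ghana in the question
y_ghana = predict_life_expectancy(fertility=5.8, infant_mortality=58.3)
print(f"Predicted female life expectancy: {y_ghana:.1f} years")
```

Rounded to one decimal place, the prediction matches option B.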