In the following study, a sample of n = 933 men and women aged 25 to 64 who had recently suffered severe back pain was selected. The objective of the study is to determine the factors that have the greatest influence on the severity of back pain. Each subject was given a questionnaire and scored on a scale of 0-100 for the severity of their back pain disability. In addition to age and sex, measurements were taken on height, weight, height/weight ratio, and lifetime participation in physical activities (active or inactive) for each of the n = 933 subjects. Discuss possible ways to analyze the data. What sort of problems can one expect in the analysis?
As stated in the problem, the objective of the study is to determine the factors that have the greatest influence on the severity of back pain.
The recorded variables are:
Severity of back pain : (ordinal in principle) but, as it is recorded on a scale of 0-100, it can be treated as quantitative for practical purposes.
Gender : (categorical, binary) male, female.
Age : (quantitative)
Height: (quantitative)
Weight: (quantitative)
Height/weight ratio: (quantitative)
Lifetime participation in physical activities: (categorical, binary) active, inactive.
For the analysis, a sample of n = 933 subjects is available.
This is essentially a multiple regression problem, with ‘severity of back pain’ as the dependent (response) variable and all other variables as independent (predictor) variables. The two binary predictors (gender and lifetime participation) enter the model as 0/1 indicator variables.
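As a rough illustration of this setup, the sketch below fits a multiple regression by ordinary least squares with NumPy. The data are simulated stand-ins, not the study's measurements; the units, the effect sizes, and the helper name `fit_ols` are assumptions made only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 933  # sample size stated in the problem

# Simulated stand-in data (assumption: NOT the real study data).
age = rng.uniform(25, 64, n)
male = rng.integers(0, 2, n)        # indicator: 1 = male, 0 = female
height = rng.normal(170, 10, n)     # assumed to be in cm
weight = rng.normal(75, 12, n)      # assumed to be in kg
active = rng.integers(0, 2, n)      # indicator: 1 = active, 0 = inactive

# Hypothetical data-generating process, chosen only so the fit has
# something to recover; it says nothing about real back pain.
severity = 20 + 0.5 * age + 0.2 * weight - 8 * active + rng.normal(0, 5, n)


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2


X = np.column_stack([age, male, height, weight, active])
names = ["intercept", "age", "male", "height", "weight", "active"]
beta, r2 = fit_ols(X, severity)
for name, b in zip(names, beta):
    print(f"{name:>9}: {b: .3f}")
print(f"R^2 = {r2:.3f}")
```

With real data, the coefficient on each indicator variable would be read as the expected difference in severity score between the two groups, holding the other predictors fixed.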
The most significant factors for back pain are the variables that enter the final selected regression model, i.e. the set of predictors whose combined modeling and predictive ability is optimal in some sense. (Different optimality criteria can be used here, such as R², adjusted R², Mallows' Cp, AIC, or BIC. Plain R² never decreases when a predictor is added, so it is only useful for comparing models of the same size.)
Problems that may arise:
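One way to make this selection concrete is an exhaustive (best-subset) search scored by AIC or BIC. The sketch below is a minimal version under stated assumptions: the data are simulated, the function names `rss_and_size` and `best_subset` are hypothetical, and the criterion is the Gaussian form n·log(RSS/n) + penalty·p with additive constants dropped. With five candidate predictors there are only 32 subsets, so a full search is cheap.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 933

# Simulated stand-in data (assumption: NOT the real study data).
age = rng.uniform(25, 64, n)
male = rng.integers(0, 2, n)
height = rng.normal(170, 10, n)
weight = rng.normal(75, 12, n)
active = rng.integers(0, 2, n)
# Hypothetical truth: only age, weight and activity affect severity.
severity = 20 + 0.5 * age + 0.2 * weight - 8 * active + rng.normal(0, 5, n)

X = np.column_stack([age, male, height, weight, active])
names = ["age", "male", "height", "weight", "active"]


def rss_and_size(X_sub, y):
    """Residual sum of squares and number of fitted coefficients."""
    A = np.hstack([np.ones((len(y), 1)), X_sub])  # intercept + chosen columns
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid), A.shape[1]


def best_subset(X, y, names, criterion="bic"):
    """Search every subset of predictors; a lower score is better."""
    n_obs = len(y)
    penalty = np.log(n_obs) if criterion == "bic" else 2.0  # otherwise AIC
    best_score, best_names = None, None
    for k in range(X.shape[1] + 1):
        for cols in itertools.combinations(range(X.shape[1]), k):
            rss, p = rss_and_size(X[:, list(cols)], y)
            score = n_obs * np.log(rss / n_obs) + penalty * p
            if best_score is None or score < best_score:
                best_score, best_names = score, [names[c] for c in cols]
    return best_score, best_names


score, chosen = best_subset(X, severity, names, criterion="bic")
print("Selected by BIC:", chosen)
```

BIC penalises each extra coefficient by log(n), which is heavier than AIC's penalty of 2 for a sample this large, so BIC tends to select the smaller model of the two.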
Multicollinearity, i.e. a near-linear association among the predictor variables, may arise in this problem, because one of the predictors (height/weight ratio) is a function of two other predictors, height and weight. Multicollinearity can be diagnosed, e.g. by eigenvalue analysis of the predictors' correlation matrix, and it can be handled with techniques such as ridge regression or LASSO regression.
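The diagnosis and one remedy can be sketched as follows. The data are simulated, and the helper name `ridge` is hypothetical. The sketch uses eigenvalue analysis of the correlation matrix as described above, and adds variance inflation factors (VIFs), a common diagnostic that the answer does not name, computed as the diagonal of the inverse correlation matrix. The ridge estimate uses the closed form on standardised predictors.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 933

# Simulated stand-in data (assumption: NOT the real study data).
age = rng.uniform(25, 64, n)
height = rng.normal(170, 10, n)
weight = rng.normal(75, 12, n)
ratio = height / weight             # derived from the two columns above
severity = 20 + 0.5 * age + 0.2 * weight + rng.normal(0, 5, n)

X = np.column_stack([age, height, weight, ratio])
names = ["age", "height", "weight", "ratio"]

# Eigenvalue analysis: a very small eigenvalue of the correlation matrix
# signals a near-linear dependence among the predictors.
R = np.corrcoef(X, rowvar=False)
eigvals = np.linalg.eigvalsh(R)
condition_number = eigvals.max() / eigvals.min()

# VIF_j is the j-th diagonal element of the inverse correlation matrix.
vif = np.diag(np.linalg.inv(R))

print("eigenvalues     :", np.round(eigvals, 4))
print("condition number:", round(float(condition_number), 1))
for name, v in zip(names, vif):
    print(f"VIF({name}) = {v:.1f}")


def ridge(X, y, lam):
    """Closed-form ridge coefficients on standardised predictors."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ yc)


b_ols = ridge(X, severity, lam=0.0)     # lam = 0 reduces to least squares
b_ridge = ridge(X, severity, lam=50.0)  # lam = 50 is an arbitrary example value
print("OLS coefficients  :", np.round(b_ols, 3))
print("ridge coefficients:", np.round(b_ridge, 3))
```

In practice the ridge penalty would be chosen by cross-validation rather than fixed in advance. A simpler remedy is to drop either the ratio or the pair of height and weight from the candidate predictors.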