In a multiple linear regression case study, the dependent variable is house_value and the independent variables are house_age, crime_rate, and tax_rate. I am trying to build a model to predict the house value. How should the model assumptions be stated, and what are the assumptions in this case? Thanks!
Here, the dependent variable is the house value (in appropriate currency units); let us denote it by $y$.
The model:
The independent variables are the age of the house ($x_1$), the crime rate in that locality ($x_2$), and the tax rate ($x_3$). Let us assume that there are $n$ houses in that locality. The multiple linear regression model is then given by:
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \epsilon_i, \quad i = 1, 2, \dots, n$$
where
$y_i$ is the value of the ith house
$x_{1i}$ is the age of the ith house
$x_{2i}$ is the crime rate for the ith house
$x_{3i}$ is the tax rate for the ith house
$\beta_0$ is the intercept, i.e. the expected value of a house when all independent variables are zero. It can be read as a baseline house value, and it equals the overall mean only if the independent variables are centered.
$\beta_1$ is the effect due to the age of the house. It can be interpreted as the slope: the change in expected house value per unit increase in age, with the other variables held fixed. It is often negative, since the value tends to come down as the house gets older.
$\beta_2$ is the effect due to the crime rate. It can be interpreted as the slope: the change in expected house value per unit increase in crime rate, with the other variables held fixed. Again, I would expect it to be negative, since the value tends to go down when the crime rate increases.
$\beta_3$ is the effect due to the tax rate. It can be interpreted as the slope: the change in expected house value per unit increase in tax rate, with the other variables held fixed.
$\epsilon_i$ is the error term in the model.
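To make the model concrete, here is a minimal sketch of fitting it by ordinary least squares with NumPy. The data are simulated, so the sample size, the variable ranges, and the "true" coefficients in `beta_true` are all made-up assumptions for illustration, not values from the case study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical number of houses

# Simulated predictors (ranges are assumptions, for illustration only)
house_age = rng.uniform(0, 50, n)
crime_rate = rng.uniform(0, 10, n)
tax_rate = rng.uniform(1, 5, n)

# Assumed coefficients (intercept, age, crime, tax) used to simulate y
beta_true = np.array([300.0, -1.5, -8.0, -4.0])
eps = rng.normal(0.0, 10.0, n)  # iid normal errors with constant variance

# Design matrix: a column of ones for the intercept, then the predictors
X = np.column_stack([np.ones(n), house_age, crime_rate, tax_rate])
house_value = X @ beta_true + eps

# Least-squares estimates of the four coefficients
beta_hat, *_ = np.linalg.lstsq(X, house_value, rcond=None)

fitted = X @ beta_hat
residuals = house_value - fitted

for name, b in zip(["intercept", "house_age", "crime_rate", "tax_rate"], beta_hat):
    print(f"{name}: {b:.3f}")
```

With real data you would replace the simulated arrays with the observed columns; the residuals computed at the end are what the assumptions are usually checked against.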
Assumptions:
1. The observations (house values) are independent of one another
2. The error terms $\epsilon_i$ are independent and follow a normal distribution with mean zero and constant variance (homoscedasticity),
i.e. $\epsilon_i \overset{iid}{\sim} N(0, \sigma^2)$
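3. The model is additive and linear in the parameters: each independent variable contributes its own separate term to the expected house value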
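The assumptions above are usually checked on the residuals after fitting. The following is a rough sketch using NumPy only, again on simulated data (the sample size, ranges, and coefficients are assumptions). The three summaries are informal diagnostics, not formal tests: a Durbin-Watson statistic near 2 suggests uncorrelated residuals (it is only meaningful if the rows have a natural order), a variance ratio near 1 between low and high fitted values suggests constant variance, and skewness and excess kurtosis near 0 are consistent with normality.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500  # hypothetical number of houses

# Simulated data that satisfies the assumptions by construction
house_age = rng.uniform(0, 50, n)
crime_rate = rng.uniform(0, 10, n)
tax_rate = rng.uniform(1, 5, n)
X = np.column_stack([np.ones(n), house_age, crime_rate, tax_rate])
house_value = X @ np.array([300.0, -1.5, -8.0, -4.0]) + rng.normal(0.0, 10.0, n)

# Fit by least squares and compute residuals
beta_hat, *_ = np.linalg.lstsq(X, house_value, rcond=None)
fitted = X @ beta_hat
resid = house_value - fitted

# Independence: Durbin-Watson statistic (about 2 when residuals are uncorrelated)
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

# Constant variance: compare residual variance for low vs high fitted values
order = np.argsort(fitted)
low, high = resid[order[: n // 2]], resid[order[n // 2:]]
var_ratio = high.var(ddof=1) / low.var(ddof=1)

# Normality: sample skewness and excess kurtosis of the standardized residuals
z = (resid - resid.mean()) / resid.std()
skewness = np.mean(z ** 3)
excess_kurtosis = np.mean(z ** 4) - 3.0

print(f"Durbin-Watson: {dw:.2f}")
print(f"Variance ratio (high/low fitted): {var_ratio:.2f}")
print(f"Skewness: {skewness:.2f}, excess kurtosis: {excess_kurtosis:.2f}")
```

In practice, plots (residuals against fitted values, and a normal Q-Q plot of the residuals) are a common complement to these numbers.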