In: Statistics and Probability
Answers:
**********************************************************************************************************************************
Ans a)
Hypothesis testing uses sample data to assess the plausibility of a hypothesis: analysts measure and examine a random sample of the population being analyzed, and the test provides evidence about how plausible the hypothesis is, given the data.
The purpose of hypothesis testing in statistics and econometrics is to determine whether there is enough statistical evidence in favor of a certain belief, or hypothesis, about a parameter.
****************************************************************************************************************************
Ans b)
1. A z-test is a statistical test used to determine whether two population means differ when the variances are known and the sample size is large. It can be used to test hypotheses in which the test statistic, under the null hypothesis, follows (approximately) a standard normal distribution.
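As a sketch, the two-sample z statistic can be computed directly when the population variances are known (all sample numbers below are invented for illustration):

```python
import math

def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Z statistic for H0: mu1 == mu2 when the population variances are known."""
    se = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / se

# Hypothetical samples: means 52 and 50, known variances 9 and 16, n = 100 each.
z = two_sample_z(mean1=52.0, mean2=50.0, var1=9.0, var2=16.0, n1=100, n2=100)
# Compare |z| with 1.96, the two-sided critical value at the 5% level.
print(round(z, 3))  # prints 4.0
```

Here |z| = 4.0 > 1.96, so at the 5% level the null hypothesis of equal means would be rejected.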
*********************************************************************************************************************************
2. The test statistic is a t statistic (t) defined by the following equation: t = (x̄ - μ) / SE, where x̄ is the sample mean, μ is the hypothesized population mean in the null hypothesis, and SE is the standard error of the mean. The P-value is the probability of observing a sample statistic at least as extreme as the test statistic, assuming the null hypothesis is true.
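As a minimal sketch of the formula above, the one-sample t statistic can be computed by hand (the sample values are invented for illustration):

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t = (xbar - mu0) / SE, where SE = s / sqrt(n)."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)      # sample standard deviation (n - 1 divisor)
    se = s / math.sqrt(n)
    return (xbar - mu0) / se

# Hypothetical data: test H0: mu = 10 against a two-sided alternative.
sample = [9.8, 10.4, 10.1, 9.6, 10.6, 10.3]
t = one_sample_t(sample, mu0=10.0)
# Compare |t| with the critical value of the t distribution with n - 1 df.
```

With so small a |t|, this invented sample would not reject H0 at conventional levels.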
*****************************************************************************************************************************
3. t-Tests. The tests are used to conduct hypothesis tests on the regression coefficients obtained in simple linear regression. A statistic based on the t distribution is used to test the two-sided hypothesis that the true slope, β1, equals some constant value (most commonly that β1 = 0, i.e., the regressor has no effect).
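A minimal sketch of the slope t-test in simple linear regression, using ordinary least squares formulas (the x and y data are invented for illustration):

```python
import math
import statistics

def slope_t_stat(x, y):
    """t statistic for H0: slope == 0 in simple linear regression y = b0 + b1*x."""
    n = len(x)
    xbar, ybar = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                        # OLS slope estimate
    b0 = ybar - b1 * xbar                 # OLS intercept estimate
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return b1 / se_b1

# Hypothetical data with a strong linear trend.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.8]
t = slope_t_stat(x, y)
# Compare |t| with the critical value of the t distribution with n - 2 df.
```

Here the large |t| reflects how tightly these invented points cluster around a line.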
*******************************************************************************************************************************
4. The F-test of overall significance indicates whether your linear regression model provides a better fit to the data than a model that contains no independent variables (an intercept-only model). It fits in with other regression statistics, such as R-squared: R-squared tells you how well your model fits the data, and the F-test assesses whether that fit is statistically significant.
The F-test for overall significance has the following two hypotheses:
Null hypothesis: all of the model's slope coefficients are equal to zero (the intercept-only model fits the data as well as your model).
Alternative hypothesis: at least one slope coefficient is not equal to zero.
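The overall F statistic can be computed directly from R-squared; a minimal sketch (the R-squared value, sample size, and number of regressors are invented):

```python
def overall_f(r_squared, n, k):
    """Overall-significance F statistic:
    F = (R^2 / k) / ((1 - R^2) / (n - k - 1)),
    with k regressors and n observations."""
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))

# Hypothetical fit: R^2 = 0.75 from a model with 3 regressors on 30 observations.
f = overall_f(r_squared=0.75, n=30, k=3)
# Compare with the critical value of the F distribution with (k, n - k - 1) df.
```

A large F relative to the F(k, n − k − 1) critical value rejects the null that all slopes are zero.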
*********************************************************************************************************************
5. Durbin-Watson (DW) test of first-order serial correlation
Serial correlation occurs in time-series studies when the errors associated with a given time period carry over into future time periods. With first-order serial correlation, errors in one time period are correlated directly with errors in the ensuing time period.
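The DW statistic itself is simple to compute from a residual series; a sketch (the residual values are invented for illustration):

```python
def durbin_watson(resid):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2).
    Values near 2 suggest no first-order serial correlation;
    values near 0 suggest positive, near 4 negative, serial correlation."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den

# Hypothetical regression residuals ordered in time.
dw = durbin_watson([0.5, 0.4, 0.6, -0.3, -0.5, 0.2])
```

The statistic is then compared with the Durbin-Watson lower and upper bounds for the given sample size and number of regressors.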
*************************************************************************************************************************
6. In econometrics, the Park test is a test for heteroscedasticity. The test is based on the method proposed by Rolla Edward Park for estimating linear regression parameters in the presence of heteroscedastic error terms.
Test description:
Park, on noting a standard recommendation of assuming proportionality between error term variance and the square of the regressor, suggested instead that analysts 'assume a structure for the variance of the error term' and suggested one such structure:
σ²_εi = σ² X_i^γ e^(v_i), or, taking logarithms, ln(σ²_εi) = ln(σ²) + γ ln(X_i) + v_i,
in which the error terms v_i are considered well behaved.
This relationship is used as the basis for this test.
The modeler first runs the unadjusted regression
Y_i = β_0 + β_1 X_1,i + … + β_(p−1) X_(p−1),i + ε_i,
where the latter contains p − 1 regressors, and then squares and takes the natural logarithm of each of the residuals ε̂_i, which serve as estimators of the ε_i. The squared residuals ε̂_i² in turn estimate σ²_εi.
If, then, in a regression of ln(ε̂_i²) on the natural logarithm of one or more of the regressors X_i, we arrive at statistical significance for non-zero values on one or more of the estimated coefficients γ̂_i, we reveal a connection between the residuals and the regressors. We reject the null hypothesis of homoscedasticity and conclude that heteroscedasticity is present.
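The auxiliary regression above can be sketched in a few lines: regress ln(ε̂_i²) on ln(X_i) and test the slope γ (the regressor values and residuals below are invented for illustration):

```python
import math
import statistics

def park_test_t(x, resid):
    """t statistic for H0: gamma == 0 in the Park auxiliary regression
    ln(e_i^2) = alpha + gamma * ln(x_i) + v_i.
    A significantly non-zero gamma indicates heteroscedasticity."""
    lx = [math.log(v) for v in x]
    ly = [math.log(e ** 2) for e in resid]
    n = len(lx)
    xbar, ybar = statistics.mean(lx), statistics.mean(ly)
    sxx = sum((v - xbar) ** 2 for v in lx)
    gamma = sum((v - xbar) * (w - ybar) for v, w in zip(lx, ly)) / sxx
    alpha = ybar - gamma * xbar
    sse = sum((w - alpha - gamma * v) ** 2 for v, w in zip(lx, ly))
    se = math.sqrt(sse / (n - 2) / sxx)   # standard error of gamma
    return gamma / se

# Hypothetical residuals whose spread grows with the regressor.
t = park_test_t(x=[1, 2, 3, 4, 5], resid=[0.1, 0.3, 0.5, 0.9, 1.4])
```

A |t| exceeding the t critical value with n − 2 degrees of freedom would reject homoscedasticity.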
*********************************************************************************************************************************
Ans c)
The error term is included to make the model more accurate. The input variables specified in the model are not enough to account fully for the output variable: there are often other influences that cannot be quantified or measured, and these are absorbed by the error term.
An error term is a residual variable produced by a statistical or mathematical model, which is created when the model does not fully represent the actual relationship between the independent variables and the dependent variables.
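A toy illustration of the idea (all numbers invented): data generated from y = 2 + 3x + u, where u is the unobserved error term capturing everything the input variable leaves out.

```python
import random

random.seed(0)  # make the invented example reproducible
x = [i / 10 for i in range(50)]
u = [random.gauss(0.0, 0.5) for _ in x]          # unobserved error term
y = [2.0 + 3.0 * xi + ui for xi, ui in zip(x, u)]
# Only x and y are observed; u is never seen directly.
# A fitted model's residuals serve only as estimates of u.
```

This is why residuals and error terms are distinct: the error term belongs to the true data-generating process, while residuals come from the fitted model.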
********************************************************************************************************************************