In: Statistics and Probability
Answer the questions below based on the following table:
             Fatigue   Vigor     Sleepiness
Tension      .36*      –.31*     .08
Fatigue                –.73***   .57**
Vigor                            –.49**

* p < .05, ** p < .01, *** p < .001
The strongest correlation is between _________________________
and ________________________. (Give variable names)
The weakest correlation is between __________________________
and ________________________. (Give variable names)
The correlation between sleepiness and fatigue is ___________________
(indicate direction) and ________________________ (indicate strength).
1. When examining the relation between two variables, when is it better to use the regression coefficient β (beta) instead of the correlation coefficient r to test for significant effects?
2. Psychologists are growing less and less enthused with p-values (or cut-off points). What statistics do they prefer (as a better alternative to p-values) for evaluating studies’ results, and why?
-> The strongest correlation is between Fatigue and Vigor (r = –.73).
-> The weakest correlation is between Tension and Sleepiness (r = .08).
-> The correlation between Sleepiness and Fatigue is r = .57: positive in direction and moderate in strength.
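The answers above follow from comparing absolute values of r, since the sign only indicates direction. A minimal sketch (the dictionary simply copies the table's upper triangle):

```python
# Pairwise correlations copied from the table above (upper triangle only).
pairs = {
    ("Tension", "Fatigue"): 0.36,
    ("Tension", "Vigor"): -0.31,
    ("Tension", "Sleepiness"): 0.08,
    ("Fatigue", "Vigor"): -0.73,
    ("Fatigue", "Sleepiness"): 0.57,
    ("Vigor", "Sleepiness"): -0.49,
}

# Strength is judged by the absolute value of r, not by its sign.
strongest = max(pairs, key=lambda k: abs(pairs[k]))
weakest = min(pairs, key=lambda k: abs(pairs[k]))
print(strongest, pairs[strongest])  # ('Fatigue', 'Vigor') -0.73
print(weakest, pairs[weakest])      # ('Tension', 'Sleepiness') 0.08
```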
1.) If we just want to know how strongly two variables are associated, we can check the correlation. Note, though, that a low correlation does not mean the variables are unrelated: correlation only captures linear relationships, and the variables may still be related non-linearly.
If we want to predict the value of one variable from another, or to quantify how much one variable changes when the other changes, then we should use regression.
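Both points can be illustrated with simulated data. The sketch below (variable names and the true slope of 2 are my own choices, and it uses the unstandardized slope b for simplicity) contrasts r, which is bounded and measures association strength, with the regression slope, which estimates the change in y per unit change in x; it then shows a perfect nonlinear relationship that r almost entirely misses:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear case: r and the regression slope answer different questions.
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)                   # true slope is 2
r = np.corrcoef(x, y)[0, 1]                          # strength of linear association
b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # expected change in y per unit x
print(f"r = {r:.2f}, b = {b:.2f}")                   # |r| <= 1; b should be near 2

# Nonlinear case: a deterministic relationship with near-zero correlation.
u = rng.uniform(-1, 1, size=200)
v = u ** 2                                           # perfectly determined by u, but not linear
print(f"r(u, v) = {np.corrcoef(u, v)[0, 1]:.2f}")    # close to 0 despite v = u^2
```

The slope b keeps growing with the effect size while r saturates at 1, which is why regression is the tool for "how much does y change" questions.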
2.)
J. Neyman (and E. S. Pearson) did not agree with Fisher's logic of using p-values for decisions. They developed a purely decision-theoretic approach that allows one to decide between two alternative hypotheses (A vs. B) with a pre-defined confidence. The aim is to make a rational decision between A and B; it is not about rejecting a null hypothesis. Either one accepts A or one accepts B. The rationality of the decision comes from the attempt to maximize its expected utility, which requires defining the utility of each possible outcome.
The critical point is that both hypotheses must be defined, and it must be reasonable to assume that one of them is true. Note that Fisher was never concerned with the "truth" or "falsehood" of the null hypothesis. Here, however, we must be convinced that either A or B is a good description of reality; only then can we make a rational decision based on the sampled data. This requires stating the utility (wins or losses) of correctly and wrongly accepting A and B. From these utilities one can derive the required confidences for the decisions, which translate into the "probability of accepting B when A is actually the good description of reality" (alpha; the type-I error) and the "probability of accepting A when B is actually the good description of reality" (beta; the type-II error). Sometimes the complement 1 – beta = power is reported instead of beta.
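These two error rates can be made concrete by simulation. A sketch (sample sizes, the 0.5 SD shift, and the t-test are my own illustrative choices, not part of the original framework description) estimating alpha when A is true and beta when B is true:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, alpha, reps = 30, 0.05, 5000

# Type-I error: both samples come from the same population (A is true),
# so every rejection is a false positive.
false_pos = 0
for _ in range(reps):
    a = rng.normal(0, 1, n)
    b = rng.normal(0, 1, n)
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_pos += 1
emp_alpha = false_pos / reps
print(f"empirical alpha = {emp_alpha:.3f}")   # should be close to 0.05

# Type-II error: a real shift of 0.5 SD exists (B is true),
# so every non-rejection is a miss.
misses = 0
for _ in range(reps):
    a = rng.normal(0, 1, n)
    b = rng.normal(0.5, 1, n)
    if stats.ttest_ind(a, b).pvalue >= alpha:
        misses += 1
emp_beta = misses / reps
print(f"empirical beta = {emp_beta:.3f}, power = {1 - emp_beta:.3f}")
```

Note how beta (and thus power) depends on the precise alternative B: shrink the 0.5 SD shift and the miss rate climbs, which is exactly why B must be specified before alpha and beta can be traded off.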
It seems obvious that this procedure is not well suited for research questions. In research it is usually not possible to distinguish two precise hypotheses (what, precisely, should B be?), and there is usually no way to state a utility function. But then it is not possible to select alpha and beta (or the power) so as to justify a rational decision, and if we do not know whether the decision is rational, why do all this?
A possible way out might be to use Bayesian statistics to obtain a posterior distribution together with a utility function over the hypothesis space. The decision could then be based on the sign of the integral of posterior × utility. This would have the advantage that B need not be specified in advance (the posterior tells us what to believe, given the data), and such decisions would be rational in a sense, but it would again forgo control of the error rates.
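The posterior-times-utility integral can be sketched numerically. Everything below is a toy illustration of the idea, not a recommended analysis: the data are simulated, the prior is flat, sigma is assumed known, and the linear utility for "acting as if mu > 0" is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(0.3, 1.0, size=50)   # hypothetical sample

# Grid posterior for the mean mu (flat prior, known sigma = 1).
mu = np.linspace(-2, 2, 2001)
log_lik = -0.5 * ((data[:, None] - mu[None, :]) ** 2).sum(axis=0)
post = np.exp(log_lik - log_lik.max())
post /= post.sum()

# Hypothetical utility of acting: gain proportional to mu, so acting
# when mu is actually negative incurs a loss.
utility = mu

# Decision rule: act iff the expected utility (posterior x utility,
# summed over the grid) is positive.
expected_utility = (post * utility).sum()
decision = "act" if expected_utility > 0 else "refrain"
print(f"E[utility] = {expected_utility:.3f} -> {decision}")
```

With a flat prior the expected utility here reduces to the posterior mean of mu, so the rule acts whenever the data pull the posterior above zero; no alpha level appears anywhere, which is the loss of error-rate control mentioned above.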