How is it that psychologists can make inferences about unmeasurably large populations based on measurements taken from relatively small samples of only thirty people?
Psychologists agree that if their ideas and theories about human behaviour are to be taken seriously, they must be backed up by data. However, the research of different psychologists is designed with different goals in mind, and the different goals require different approaches. These varying approaches are known as research designs. A research design is the specific method a researcher uses to collect, analyze, and interpret data.
Psychologists use three major types of research designs in their
research, and each provides an essential avenue for scientific
investigation.
1) Descriptive research is research designed to
provide a snapshot of the current state of affairs.
2) Correlational research is research designed to
discover relationships among variables and to allow the prediction
of future events from present knowledge.
3) Experimental research is research in which
initial equivalence among research participants in more than one
group is created, followed by a manipulation of a given experience
for these groups and a measurement of the influence of the
manipulation, so that the result can be generalized to the whole
population.
The goal of experimental research design is to provide more
definitive conclusions about the causal relationships among the
variables in the research hypothesis than is available from
correlational designs. In an experimental research design, the
variables of interest are called the independent variable (or
variables) and the dependent variable. The independent variable in
an experiment is the causing variable that is created (manipulated)
by the experimenter. The dependent variable in an experiment is a
measured variable that is expected to be influenced by the
experimental manipulation. The research hypothesis suggests that
the manipulated independent variable or variables will cause
changes in the measured dependent variables.
Null Hypothesis and Testing
Before comparing the different tests, we need a clear understanding
of what a null hypothesis is. A null hypothesis proposes that no
significant difference exists in a set of given observations. For
the tests discussed here, in general:
Null: The two sample means are equal
Alternate: The two sample means are not equal
To decide whether to reject a null hypothesis, a test statistic is
calculated. This test statistic is then compared with a critical
value, and if its magnitude is greater than the critical value the
null hypothesis is rejected.
Critical Value
A critical value is a point (or points) on the scale of the test
statistic beyond which we reject the null hypothesis, and it is
derived from the level of significance α of the test. The critical
value marks how extreme the test statistic must be before we
conclude that two sample means are unlikely to belong to the same
distribution. A higher critical value corresponds to a smaller α,
so a larger difference is needed before the null hypothesis is
rejected. The usual critical value for a two-tailed test at
α = 0.05 is 1.96, which is based on the fact that 95% of the area
of a normal distribution is within 1.96 standard deviations of the
mean.
Critical values can be used for hypothesis testing in the following
way:
1. Calculate the test statistic.
2. Calculate the critical values based on the significance level
alpha.
3. Compare the test statistic with the critical values.
If the magnitude of the test statistic is lower than the critical
value, we fail to reject the null hypothesis; otherwise we reject
it.
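The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the test statistic follows a standard normal distribution, and the function names z_critical and reject_null are hypothetical, chosen for this example.

```python
from statistics import NormalDist


def z_critical(alpha: float, two_tailed: bool = True) -> float:
    """Critical value of the standard normal distribution for a given alpha."""
    # A two-tailed test splits alpha equally between both tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


def reject_null(test_statistic: float, alpha: float = 0.05) -> bool:
    """Two-tailed decision: reject when |statistic| exceeds the critical value."""
    return abs(test_statistic) > z_critical(alpha)


print(round(z_critical(0.05), 2))  # 1.96
print(reject_null(2.5))            # True
print(reject_null(1.0))            # False
```

With α = 0.05 the two-tailed critical value comes out at about 1.96, matching the figure quoted in the text.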
In statistics, “population” refers to the total set of
observations that can be made. For example, if we want to calculate
the average height of humans present on the earth, the “population”
will be “all the people actually present on the earth”.
A sample, on the other hand, is a set of data collected or selected
by a pre-defined procedure. For our example above, it will be a
small group of people selected randomly from some parts of the
earth.
To draw valid inferences from a sample by testing a hypothesis, it
is necessary that the sample is random.
Z-test
In a z-test, the sample is assumed to be normally distributed. A
z-score is calculated with population parameters such as
“population mean” and “population standard deviation” and is used
to test the hypothesis that the sample drawn belongs to the same
population.
Null: Sample mean is the same as the population mean
Alternate: Sample mean is not the same as the population mean
The statistic used for this hypothesis test is called the
z-statistic, which is calculated as
z = (x − μ) / (σ / √n), where
x = sample mean
μ = population mean
σ = population standard deviation
n = sample size
σ / √n = standard error of the mean
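If the magnitude of the test statistic is lower than the critical
value, we fail to reject the null hypothesis; otherwise we reject
it.
As a rule of thumb, when the sample size is less than or equal to
30, or the population standard deviation is unknown, we use a
t-test instead of a z-test.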
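As an illustration, the z-statistic can be computed as in the sketch below. The function name one_sample_z and the example scores are hypothetical, and the sketch assumes the population mean and standard deviation are known in advance.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z(sample, pop_mean, pop_sd):
    """Return the z-statistic and two-tailed p-value for a one-sample z-test."""
    n = len(sample)
    standard_error = pop_sd / sqrt(n)  # sigma / sqrt(n)
    z = (mean(sample) - pop_mean) / standard_error
    # Two-tailed p-value: area in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical test scores, compared with an assumed population
# mean of 100 and population standard deviation of 15.
scores = [104, 110, 98, 107, 111, 103, 109, 106, 102]
z, p = one_sample_z(scores, pop_mean=100, pop_sd=15)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Here the magnitude of z is below 1.96, so at α = 0.05 the null hypothesis would not be rejected for this made-up sample.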
T-test
A t-test is used to compare means. Like a z-test, a t-test also
assumes a normal distribution of the sample. A t-test is used when
the population parameters (mean and standard deviation) are not
known.
There are three versions of the t-test:
1. Independent samples t-test, which compares the means of two
groups
2. Paired sample t-test, which compares means from the same group at
different times
3. One sample t-test, which tests the mean of a single group against
a known mean.
The statistic for this hypothesis test is called the t-statistic.
For two independent samples it is calculated as
t = (x1 − x2) / (s × √(1/n1 + 1/n2)), where
x1 = mean of sample 1
x2 = mean of sample 2
n1 = size of sample 1
n2 = size of sample 2
s = pooled standard deviation of the two samples (assuming the two
groups have equal variances)
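A minimal sketch of the independent-samples t-statistic follows. It assumes the two groups share a common variance, so a pooled sample standard deviation is used in the denominator; the function name pooled_t and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return the t-statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    standard_error = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    t = (mean(sample1) - mean(sample2)) / standard_error
    return t, df


# Hypothetical measurements from two groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t, df = pooled_t(group_a, group_b)
print(f"t = {t:.2f}, df = {df}")
```

The resulting t is then compared with the critical value of the t distribution for the given degrees of freedom, taken from a t table or a statistics library.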
Chi-Square Test
The chi-square test is used to compare categorical variables. There
are two types of chi-square test:
1. Goodness of fit test, which determines if a sample matches the
population.
2. Test of independence, which compares two variables in a
contingency table to check whether they are related. The observed
counts are compared with the counts expected under independence.
a. A small chi-square value means that the data fit the expected
counts
b. A large chi-square value means that the data do not fit.
The hypothesis being tested in the chi-square test of independence
is
Null: Variable A and Variable B are independent
Alternate: Variable A and Variable B are not independent.
The statistic used to measure significance, in this case, is called
the chi-square statistic. The formula used for calculating the
statistic is
χ² = Σ [ (O_r,c − E_r,c)² / E_r,c ], where
O_r,c = observed frequency in row r, column c
E_r,c = expected frequency in row r, column c
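The chi-square statistic for a contingency table can be sketched as below. The expected frequency of each cell is taken as (row total × column total) / grand total, which is the standard choice under the null hypothesis of independence; the function name and the counts are hypothetical.

```python
def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[r] * col_totals[c] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical 2 x 2 table of observed counts.
observed_counts = [[10, 20],
                   [20, 10]]
chi2, df = chi_square_independence(observed_counts)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

For one degree of freedom the critical value at α = 0.05 is 3.841, so a statistic above that value would lead to rejecting independence for this made-up table.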
Note: As one can see, in all the tests a statistic is compared
with a critical value to decide whether to reject a hypothesis.
However, the statistic and the way to calculate it differ depending
on the type of variable, the number of samples being analyzed, and
whether the population parameters are known. Thus, depending on
such factors, a suitable test and null hypothesis are chosen.
Descriptive research, for its part, has a distinct disadvantage in that, although it allows us to get an idea of what is currently happening, it is usually limited to static pictures. Although descriptions of particular experiences may be interesting, they are not always transferable to other individuals in other situations, nor do they tell us exactly why specific behaviours or events occurred. For instance, descriptions of individuals who have suffered a stressful event, such as a war or an earthquake, can be used to understand the individuals’ reactions to the event but cannot tell us anything about the long-term effects of the stress. And because there is no comparison group that did not experience the stressful situation, we cannot know what these individuals would be like if they hadn’t had the stressful experience.