Critical Thinking Questions
What does a z-score tell you about a number in a data set? [1 sentence]
A z-score tells you how many standard deviations a data point lies above (positive z) or below (negative z) the mean of its data set.
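As a small illustration of the answer above, the sketch below computes z = (x - mean) / standard deviation for a made-up data set. The function name `z_score` and the list `scores` are assumptions chosen for this example, and the population standard deviation is used; a sample standard deviation would give slightly different values.

```python
from statistics import mean, pstdev

def z_score(x, data):
    """Return how many standard deviations x lies from the mean of data."""
    mu = mean(data)        # center of the data set
    sigma = pstdev(data)   # population standard deviation (an assumption here)
    return (x - mu) / sigma

# Hypothetical data set: mean 5, population standard deviation 2
scores = [2, 4, 4, 4, 5, 5, 7, 9]

print(z_score(9, scores))  # 2.0: two standard deviations above the mean
print(z_score(5, scores))  # 0.0: exactly at the mean
```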
What two quantities do we need to fully describe a normal distribution? [1 sentence]
A normal distribution is fully described by two parameters: its mean and its standard deviation.
How is probability determined from a continuous distribution? Why is this easy for the uniform distribution and not so easy for the normal distribution? [2 sentences]
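For a continuous distribution, the probability that a value falls in an interval is the area under the density curve over that interval, found by integrating the density between the interval's limits.
This is easy for the uniform distribution because its density is constant, so the area is just a rectangle (width times height), whereas the Gaussian density of the normal distribution has no elementary antiderivative and its area must be found numerically or from a z-table.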
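The contrast can be sketched in a few lines. The function names below are hypothetical, and the normal case relies on the error function `math.erf` from Python's standard library, which stands in for the numerical integration or table lookup mentioned above.

```python
import math

def uniform_prob(a, b, lo, hi):
    """P(lo <= X <= hi) for X uniform on [a, b]: the area of a rectangle."""
    lo, hi = max(lo, a), min(hi, b)     # clip the interval to [a, b]
    width = max(hi - lo, 0.0)           # rectangle width
    height = 1.0 / (b - a)              # constant density
    return width * height

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the normal density to the left of x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_prob(lo, hi, mu=0.0, sigma=1.0):
    """P(lo <= X <= hi) for a normal X, as a difference of two left-tail areas."""
    return normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)

p_uniform = uniform_prob(0, 10, 2, 5)   # width 3 times height 1/10
p_normal = normal_prob(-1, 1)           # roughly 0.68 for the standard normal
print(round(p_uniform, 4), round(p_normal, 4))
```

The uniform case needs only arithmetic, while the normal case depends on a special function that is itself evaluated numerically.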
What does the symmetric bell shape of the normal curve imply about the distribution of individuals in a normal population? [2 sentences]
The symmetric bell shape implies that individuals are distributed in the same way on either side of the mean, with most clustered near the mean and progressively fewer found farther from it in either direction.
Because of this symmetry, the mean, median, and mode all coincide at the center of the distribution.
How can the empirical rule be restated in terms of z-scores and percentiles? Restate it for four of the seven z-scores.
The z-score tells how many standard deviations a data point lies from the mean of the normal curve, with the mean as the center of the distribution (z = 0). The percentile of a data point is the percentage of data values that fall below it.
The empirical rule says that if a distribution is roughly normal, then approximately 68%, 95%, and 99.7% of data values lie within one, two, and three standard deviations of the mean, respectively.
We can restate the empirical rule in terms of z-scores and percentiles as follows.
Within z = -2 to z = 2
A z-score of -2 corresponds to a percentile of 2.28%, meaning 2.28% of data values lie below z = -2, and a z-score of 2 corresponds to a percentile of 97.72%, meaning 97.72% of data values lie below z = 2. Approximately 95% of data values therefore lie between z = -2 and z = 2.
Within z = -1 to z = 1
A z-score of -1 corresponds to a percentile of 15.87%, meaning 15.87% of data values lie below z = -1, and a z-score of 1 corresponds to a percentile of 84.13%, meaning 84.13% of data values lie below z = 1. Approximately 68% of data values therefore lie between z = -1 and z = 1.
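The percentiles quoted above can be checked with a short sketch. It assumes Python 3.8 or later, where `statistics.NormalDist` is available; the helper name `percentile` is made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def percentile(z):
    """Percentage of values in a standard normal population lying below z."""
    return 100 * standard_normal.cdf(z)

for z in (-2, -1, 1, 2):
    print(f"z = {z:+d}: {percentile(z):.2f}% of values lie below")

# Share of values between -1 and 1, and between -2 and 2
within_one = percentile(1) - percentile(-1)   # about 68%
within_two = percentile(2) - percentile(-2)   # about 95%
print(f"within 1 sd: {within_one:.1f}%, within 2 sd: {within_two:.1f}%")
```

Subtracting the lower percentile from the upper one recovers the familiar 68% and 95% figures of the empirical rule.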