Explain how z-scores are incorporated into R calculations and confidence intervals (CI).
Z-scores are a stand-in for the actual measurement: they represent the distance of a value from the mean, measured in standard deviations, $z = (x - \mu)/\sigma$. So a z-score of 2.0 means the measurement is 2 standard deviations above the mean.
It is easy to calculate the Z-scores in R.
First, specify a normal distribution, say for the weights of the students in a class: $N(\mu, \sigma^2)$, where $\mu$ and $\sigma$ are the mean and standard deviation respectively.
Let us use the z-score to find the probability of finding a student who weighs more than 72 kg, i.e. for a value x = 72 we want to find the z-score $z = (72 - \mu)/\sigma$. Start by entering the values of $\mu$ and $\sigma$ in R.
We first generate 20 samples from the normal distribution and then find the z-score for x = 72.
R-code :
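A minimal sketch of this step. The values `mu = 60` and `sigma = 6` are assumptions, since the parameters are not stated here; they are chosen only so that x = 72 lands at z = 2, which matches the probability 0.9772499 quoted further down. The variable names and the seed are likewise illustrative.

```r
# Hypothetical parameters (assumed, not given in the text)
mu    <- 60   # assumed mean weight in kg
sigma <- 6    # assumed standard deviation in kg

set.seed(1)                                   # make the random draw reproducible
weights <- rnorm(20, mean = mu, sd = sigma)   # 20 simulated student weights

x <- 72
z <- (x - mu) / sigma                         # z-score of x under N(mu, sigma^2)
z                                             # 2 for the assumed mu and sigma

# z-scores of the simulated weights, relative to the same mu and sigma
z_weights <- (weights - mu) / sigma
```

With different values of `mu` and `sigma` the same two lines for `x` and `z` apply unchanged; only the resulting z-score differs.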
Now we want to find the probability of a student weighing more than 72 kg. The z-score is used to determine the area (the probability) under the distribution curve beyond the z-score value we are interested in.
Both R and typical z-score tables return the area under the curve from -infinity up to the specified value (which in this case is x = 72), so the area beyond that value is 1 minus the returned area.
Traditionally we look the z-score up in a table and read off the probability. R, however, has a function `pnorm` that gives a more precise answer than a printed table. [`pnorm` stands for "probability normal distribution".]
R code :
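A short sketch of the lookup, assuming the z-score for x = 72 is 2 (the value consistent with the lower-tail area 0.9772499 used in this answer). The variable names are illustrative.

```r
z <- 2   # assumed z-score for x = 72

p_below <- pnorm(z)                              # P(Z <= 2): area from -Inf up to z
p_above <- 1 - p_below                           # P(Z > 2): upper-tail area
p_above_direct <- pnorm(z, lower.tail = FALSE)   # same upper tail, computed directly

round(p_below, 7)   # 0.9772499
round(p_above, 4)   # 0.0228
```

`pnorm` also accepts `mean` and `sd` arguments, so the raw value can be passed without standardizing it first, for example `pnorm(72, mean = mu, sd = sigma)` once `mu` and `sigma` are defined.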
So our required probability of finding a student weighing more than 72 kg is (1 - 0.9772499) = 0.0227501, or about 0.023.
Confidence Interval with Z score :
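The CI for the population mean of the normal distribution is $(\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n})$, where $\bar{x}$ is the sample mean, $\sigma$ is the known population standard deviation and $n$ is the sample size.
Input the values in R and apply the formula above to get the confidence interval for the mean. Say $\alpha = 0.1$; then we know $z_{\alpha/2} = z_{0.05} = 1.645$.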
The `qnorm()` function gives the critical z-value corresponding to a given lower-tail area. We will use it with a lower-tail area of $1 - \alpha/2 = 0.95$ to verify the value $z_{\alpha/2} = 1.645$, and hence find the CI.
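A minimal sketch of both steps, assuming the interval has the usual known-sigma form, sample mean plus or minus $z_{\alpha/2}\,\sigma/\sqrt{n}$. The sample is simulated with hypothetical parameters (mean 60, standard deviation 6, seed 1), none of which come from this answer, so the numeric interval is illustrative only.

```r
alpha  <- 0.1
z_crit <- qnorm(1 - alpha / 2)   # critical value for a lower-tail area of 0.95
round(z_crit, 3)                 # 1.645

# Hypothetical data: 20 simulated weights with an assumed known sigma
sigma <- 6
set.seed(1)
weights <- rnorm(20, mean = 60, sd = sigma)

n      <- length(weights)
x_bar  <- mean(weights)                 # sample mean
margin <- z_crit * sigma / sqrt(n)      # half-width of the interval

ci <- c(lower = x_bar - margin, upper = x_bar + margin)
ci                                      # 90% confidence interval for the mean
```

Changing `alpha` changes only `z_crit`; for example, `alpha <- 0.05` gives the familiar critical value of about 1.96.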
Hence we get the confidence interval for the population mean.