Question

Using NumPy (or a similar library), how do I calculate the population mean and population standard deviation of a column of values? Also, how do I calculate the range of values around the mean that covers 95% of the values in the column (the 95% confidence interval)?

Solutions

Expert Solution

numpy.mean()

The arithmetic mean is the sum of the elements along an axis divided by the number of elements. The numpy.mean() function returns the arithmetic mean of the elements in the array. If an axis is specified, the mean is computed along that axis; otherwise it is computed over all elements of the array.

Example:

import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])

print('Our array is:')
print(a)
print()

# No axis given: mean of all nine elements
print('Applying mean() function:')
print(np.mean(a))
print()

# axis=0: mean of each column
print('Applying mean() function along axis 0:')
print(np.mean(a, axis=0))
print()

# axis=1: mean of each row
print('Applying mean() function along axis 1:')
print(np.mean(a, axis=1))

It will produce the following output:

Our array is:
[[1 2 3]
 [3 4 5]
 [4 5 6]]

Applying mean() function:
3.6666666666666665

Applying mean() function along axis 0:
[2.66666667 3.66666667 4.66666667]

Applying mean() function along axis 1:
[2. 4. 5.]

Standard Deviation

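Standard deviation is the square root of the average of the squared deviations from the mean. Dividing by the number of elements N (rather than N - 1) gives the population standard deviation, which is what numpy.std() computes by default. The formula is as follows: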

std = sqrt(mean(abs(x - x.mean())**2))

If the array is [1, 2, 3, 4], then its mean is 2.5. Hence the squared deviations are [2.25, 0.25, 0.25, 2.25]. Their sum is 5, so their mean is 5/4 = 1.25, and the standard deviation is sqrt(5/4), which is approximately 1.1180.

Example:

import numpy as np

# Population standard deviation (default ddof=0)
print(np.std([1, 2, 3, 4]))

It will produce the following output:

1.118033988749895
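Since the question asks about a column of values, here is a minimal sketch that applies the same functions to one column of a 2-D array. The array `data` and its contents are made up for illustration. The `ddof` argument of np.std controls the divisor: `ddof=0` (the default) divides by N and gives the population standard deviation, while `ddof=1` divides by N - 1 and gives the sample standard deviation.

```python
import numpy as np

# Hypothetical data: each row is an observation, each column a variable.
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0],
                 [4.0, 40.0]])

col = data[:, 0]  # select the first column: [1, 2, 3, 4]

pop_mean = np.mean(col)
pop_std = np.std(col, ddof=0)     # population std: divide by N
sample_std = np.std(col, ddof=1)  # sample std: divide by N - 1

# The same population std computed directly from the formula above.
manual_std = np.sqrt(np.mean(np.abs(col - col.mean()) ** 2))

print("population mean:", pop_mean)
print("population std: ", pop_std)
print("sample std:     ", sample_std)
print("manual std:     ", manual_std)
```

If the column lives in a pandas DataFrame instead, note that pandas uses ddof=1 by default for its std() method, so the population value needs ddof=0 passed explicitly.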

95% confidence interval

A 95% confidence interval means that if we were to take 100 different samples and compute a 95% confidence interval for each sample, then approximately 95 of the 100 confidence intervals would contain the true mean value (μ). In practice, however, we select one random sample and generate one confidence interval, which may or may not contain the true mean. The observed interval may over- or underestimate μ. Consequently, the 95% CI is a likely range for the true, unknown parameter.

The confidence interval does not reflect variability in the unknown parameter, which is a fixed value. Rather, it reflects the amount of random error in the sample and provides a range of values that is likely to include the unknown parameter. Another way of thinking about a confidence interval is that it is the range of likely values of the parameter (defined as the point estimate ± margin of error) at a specified level of confidence.
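The repeated-sampling interpretation can be checked with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal, its mean and standard deviation (`true_mean`, `true_std`) are made-up values, the population standard deviation is treated as known, and 1.96 is used as the two-sided 95% normal critical value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed (hypothetical) population parameters and simulation settings.
true_mean, true_std = 50.0, 10.0
n, trials = 30, 10_000
z = 1.96  # two-sided 95% critical value of the standard normal

# Draw `trials` independent samples of size n, one per row.
samples = rng.normal(true_mean, true_std, size=(trials, n))
sample_means = samples.mean(axis=1)

# Margin of error when the population standard deviation is known.
margin = z * true_std / np.sqrt(n)

# For each sample, check whether its interval contains the true mean.
covered = (sample_means - margin <= true_mean) & (true_mean <= sample_means + margin)
coverage = covered.mean()

print(f"Fraction of intervals containing the true mean: {coverage:.3f}")
```

The printed fraction should be close to 0.95, although any single interval either contains the true mean or does not.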

Suppose we want to generate a 95% confidence interval estimate for an unknown population mean. This means that, before the sample is drawn, there is a 95% probability that the interval we will compute contains the true population mean. Thus, P( [sample mean] - margin of error < μ < [sample mean] + margin of error) = 0.95. Once a specific interval has been computed from the data, it either contains μ or it does not.
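To connect this back to the question, the sketch below computes two different 95% ranges for a column, since the wording could mean either one. The column `col` is randomly generated stand-in data, and the value 1.96 assumes the data (or the sample mean) is approximately normally distributed. The first range, mean ± 1.96 × standard deviation, is where about 95% of the individual values fall. The second, mean ± 1.96 × standard error, is a 95% confidence interval for the mean itself, treating the column as a sample.

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical column of values; replace with your own data.
col = rng.normal(loc=100.0, scale=15.0, size=1000)

mean = np.mean(col)
std = np.std(col)  # population standard deviation (ddof=0)
z = 1.96           # two-sided 95% critical value of the standard normal

# Range expected to contain about 95% of the values (assumes normality).
lo, hi = mean - z * std, mean + z * std
frac_inside = np.mean((col >= lo) & (col <= hi))

# Distribution-free alternative: the central 95% of the observed values.
p_lo, p_hi = np.percentile(col, [2.5, 97.5])

# 95% confidence interval for the mean, treating the column as a sample.
sem = np.std(col, ddof=1) / np.sqrt(col.size)
ci_lo, ci_hi = mean - z * sem, mean + z * sem

print(f"mean = {mean:.2f}, population std = {std:.2f}")
print(f"~95% of values in [{lo:.2f}, {hi:.2f}] (fraction inside: {frac_inside:.3f})")
print(f"2.5th to 97.5th percentile: [{p_lo:.2f}, {p_hi:.2f}]")
print(f"95% CI for the mean: [{ci_lo:.2f}, {ci_hi:.2f}]")
```

The confidence interval for the mean is much narrower than the range covering 95% of the values, because the standard error shrinks with the square root of the number of observations. For small samples, a Student's t critical value would be more appropriate than 1.96.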

