(a) Highlight the characteristics of a statistically representative sample. (b) Explain the four main factors that determine the size of most samples in statistics. (c) List and explain the types of secondary data you know. (d) Identify and give an example of any 5 types of secondary data you know.
(a) In real life we often have a large set of data (the population), and it is sometimes not possible to access all of it. Instead of using the whole set, we work with a subset of it chosen at random. This technique is known as random sampling.
Properties:
1. The sample should be a good representative of the population.
2. Under random sampling, the sample mean is an unbiased estimator of the population mean.
3. The sample should be proportional: it should be large enough to represent the universe (the population) properly and to provide statistical stability or reliability.
4. The sample should be economical. The quality of the sampling that can be achieved depends on how much money is available to spend on it.
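Properties 1 and 2 can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is made-up data (10,000 normally distributed values), and the helper name draw_simple_random_sample, the sample size of 100 and the 2,000 repetitions are all hypothetical choices, not part of the question.

```python
import random
import statistics


def draw_simple_random_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement (hypothetical helper)."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


# Made-up population for illustration only; this is not real data.
population_rng = random.Random(0)
population = [population_rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.fmean(population)

# Repeat the sampling many times and average the resulting sample means.
sample_means = [
    statistics.fmean(draw_simple_random_sample(population, 100, seed=i))
    for i in range(2_000)
]
average_of_sample_means = statistics.fmean(sample_means)

print("population mean:        ", round(population_mean, 3))
print("average of sample means:", round(average_of_sample_means, 3))
```

Any single sample mean will differ from the population mean, but the average over many random samples should land very close to it, which is what "unbiased" means here.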
(b) Sample size determination:
1. Using experience: in estimation or hypothesis testing, if we know the confidence coefficient, the margin of error, etc., we can get some idea of the required sample size.
2. Using a target variance for the estimate that will eventually be derived from the sample: if high precision is required, this translates to a low target variance of the estimator, and hence a larger sample.
3. Using the power of a statistical hypothesis test: we can determine the sample size needed to reach a chosen power.
4. Using a confidence interval: at a given confidence level, the narrower the confidence interval required, the larger the sample size needed.
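The points above can be turned into rough formulas. The following is a minimal sketch, assuming we are estimating or testing a population mean, that the population standard deviation sigma is known (or guessed from experience), and that the normal approximation applies. The function names and the numbers in the usage lines are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def sample_size_for_margin(sigma, margin_of_error, confidence=0.95):
    """n so that the confidence interval for the mean has half-width <= margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)


def sample_size_for_target_variance(sigma, target_variance):
    """n so that Var(sample mean) = sigma^2 / n is at most target_variance."""
    return math.ceil(sigma ** 2 / target_variance)


def sample_size_for_power(sigma, effect, alpha=0.05, power=0.80):
    """n for a two-sided one-sample z-test to detect a mean shift of size `effect`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)


# Assumed values for illustration: sigma = 10.
print(sample_size_for_margin(10, 2))           # margin of error 2 at 95% confidence
print(sample_size_for_margin(10, 1))           # halving the margin needs about 4x the sample
print(sample_size_for_target_variance(10, 1))  # target variance 1 for the sample mean
print(sample_size_for_power(10, 5))            # detect a shift of 5 with 80% power
```

In each case a stricter requirement (smaller margin of error, lower target variance, higher power) leads to a larger sample size.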
(c) Secondary data refers to data that is collected by someone other than the user.
Secondary data can be obtained from different sources:
Census: every 10 years the Indian government conducts a census, from which we can get a lot of information about the individuals in the country (age, sex, religion, annual income, etc.).
Newspapers: data or information collected from a newspaper is secondary data.
Hospital records: we can collect information about health-related issues from hospital records, which is secondary data for us.
Educational institutions: these are also a source of secondary data. We can collect the marks and ranks of students in different exams.
I have answered parts (c) and (d) together because the two questions are similar.