Using appropriate examples, define the following terms as used in data analysis.
Sample Statistic: A sample is a subset of a population. A sample statistic is a numerical summary, such as a mean or a proportion, computed from that sample.
For example, if we want to know the mean height of people in the USA, measuring the height of every resident is not practical. Instead, we draw a random sample from the population (a smaller set of people rather than the entire population of the USA) and measure their heights to compute the mean height.
This mean height is a sample statistic, which we use to estimate the mean height of the entire population of the USA.
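The height example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is simulated, and its size (100,000), mean (170 cm), spread (10 cm), and the sample size (500) are made-up values rather than real data.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 simulated heights in cm (made-up parameters).
population = [random.gauss(170, 10) for _ in range(100_000)]

# Simple random sample of 500 people, drawn without replacement.
sample = random.sample(population, 500)

population_mean = statistics.fmean(population)  # the parameter (unknown in practice)
sample_mean = statistics.fmean(sample)          # the sample statistic

print(f"population mean: {population_mean:.2f} cm")
print(f"sample mean:     {sample_mean:.2f} cm")
```

In practice only the sample mean is available; the population mean is computed here just to show how close the estimate tends to be.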
Significance Level:
To understand the significance level, the critical region must first be understood.
The critical region is the set of values of the test statistic, within the sample space S, that leads to rejection of the null hypothesis (H0).
The probability α that a random value of the statistic falls in the critical region when the null hypothesis is true is known as the level of significance.
The significance level equals the Type I error rate, that is, the probability of rejecting the null hypothesis when it is true. You can think of this error rate as the probability of a false positive: the test results lead you to believe that an effect exists when it actually does not.
For example:
If the level of significance is 1%, then in cases where the null hypothesis is actually true, there is a 1% chance of rejecting it.
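One way to see the 1% figure is to simulate many samples for which the null hypothesis is true and count how often a test rejects it. The sketch below assumes a two-sided z-test with a known standard deviation; the values of `mu0`, `sigma`, `n`, and `trials` are hypothetical choices for illustration.

```python
import random
from statistics import NormalDist, fmean

random.seed(0)  # fixed seed so the sketch is reproducible

alpha = 0.01            # significance level
mu0, sigma = 0.0, 1.0   # H0: the mean equals mu0; sigma is assumed known
n = 30                  # size of each sample
trials = 20_000         # number of simulated experiments

# Two-sided critical value: the critical region is |z| > z_crit.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

rejections = 0
for _ in range(trials):
    # Data are generated with mean mu0, so H0 is true in every trial.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
    if abs(z) > z_crit:
        rejections += 1  # a false positive (Type I error)

type_i_rate = rejections / trials
print(f"critical value: {z_crit:.3f}")
print(f"observed false-positive rate: {type_i_rate:.4f} (alpha = {alpha})")
```

The observed rejection rate should land close to `alpha`, since every rejection here is a Type I error by construction.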
Scale (level) of measurement:
There are four types of measurement scale, and each indicates what kind of data is being collected and analyzed. The scales are distinguished by four properties: identity, magnitude, equal intervals, and an absolute zero. They are used for qualitative and quantitative analysis of data:
1. Nominal Scale: It is used for identification purposes and strictly for qualitative analysis. It is also known as the categorical scale, because it assigns labels to attributes. Only frequencies, percentages, and the mode can be computed for this type of variable. For example, we can assign the numbers 1, 2, and 3 to the colours blue, green, and red; the numbers mean nothing beyond serving as codes for easier identification.
2. Ordinal Scale: It involves ranking or ordering the attributes and, like the nominal scale, is used for qualitative analysis. The attributes are usually arranged in increasing or decreasing order, but the gaps between adjacent ranks need not be equal. For example, in a customer satisfaction survey where 1 indicates not satisfied, 2 satisfied, 3 highly satisfied, and 4 best experience ever, the ratings form an ordinal scale.
3. Interval Scale: It is a quantitative scale in which the levels are ordered, the difference between any two values is meaningful, and equal differences represent equal amounts; the zero point is arbitrary. It is an extension of the ordinal scale, with the main difference being the existence of equal intervals. For example, for temperature measured in degrees Celsius or Fahrenheit, the order of the values matters and the differences are meaningful: the gap between 10 and 30 degrees equals the gap between 50 and 70 degrees. Because zero degrees does not mean an absence of temperature, 30 degrees is not three times as hot as 10 degrees.
4. Ratio Scale: This is also a quantitative scale and is the highest level of measurement. As an extension of the interval scale, it has all four properties of a measurement scale: identity, magnitude, equal intervals, and an absolute zero. This allows comparison of both differences and relative magnitudes; for instance, 20 kg is twice as heavy as 10 kg.
Some examples of ratio scales include length, weight, and time (duration).
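The practical difference between the interval and ratio scales can be checked with a unit conversion. In the sketch below, the helper names `c_to_f` and `kg_to_lb` and the specific values are chosen only for illustration: a ratio of two temperatures changes when the unit changes (so it carries no meaning), while a ratio of two weights does not.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def kg_to_lb(kg):
    """Convert kilograms to pounds (approximate conversion factor)."""
    return kg * 2.20462


# Interval scale (temperature): the zero point is arbitrary.
ratio_c = 30.0 / 10.0                  # ratio of two Celsius readings
ratio_f = c_to_f(30.0) / c_to_f(10.0)  # the same readings in Fahrenheit

# Differences, however, stay comparable in either unit.
diff_ratio_c = (70.0 - 50.0) / (30.0 - 10.0)
diff_ratio_f = (c_to_f(70.0) - c_to_f(50.0)) / (c_to_f(30.0) - c_to_f(10.0))

# Ratio scale (weight): zero means "no weight", so ratios survive conversion.
ratio_kg = 20.0 / 10.0
ratio_lb = kg_to_lb(20.0) / kg_to_lb(10.0)

print(f"temperature ratio: {ratio_c:.2f} (C) vs {ratio_f:.2f} (F)")
print(f"interval ratio:    {diff_ratio_c:.2f} (C) vs {diff_ratio_f:.2f} (F)")
print(f"weight ratio:      {ratio_kg:.2f} (kg) vs {ratio_lb:.2f} (lb)")
```

The temperature ratio depends on the unit, which is why statements such as "three times as hot" are avoided on an interval scale, whereas "twice as heavy" holds in any unit of weight.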