Question

In: Math

Discuss the three properties (characteristics) of data and explain some of the descriptive measures associated with each property.

Solutions

Expert Solution

The three properties of data are:

a) Central Tendency of data

b) Dispersion of data

c) Correlation of data

a) Central Tendency of data:

A measure of central tendency (also called a measure of center or central location) is a summary measure that describes a whole set of data with a single value representing the middle or center of its distribution.
There are three main measures of central tendency: mean, median and mode. Each describes the typical or central value of the distribution in a different way.

Mean: The mean is the sum of the values of all observations in a dataset divided by the number of observations. It is also known as the arithmetic average.

Consider the retirement age distribution:
54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60
The mean is calculated by adding together all the values (54+54+54+55+56+57+57+58+58+60+60 = 623) and dividing by the number of observations (11), which gives approximately 56.6 years.

Median: The median is the middle value in a distribution when the values are arranged in ascending or descending order.
The median divides the distribution in half: 50% of the observations lie on either side of it. In a distribution with an odd number of observations, the median is the middle value.

Looking at the retirement age distribution (which has 11 observations), the median is the middle value, which is 57 years:
54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

When the distribution has an even number of observations, the median value is the mean of the two middle values. In the following distribution, the two middle values are 56 and 57, therefore the median equals 56.5 years:
52, 54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60
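The mean and median calculations above can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original answer.

```python
from statistics import mean, median

# Retirement ages from the text (11 observations, odd count).
ages_odd = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]
# The 12-observation variant from the text (even count).
ages_even = [52] + ages_odd

mean_age = mean(ages_odd)        # 623 / 11
median_odd = median(ages_odd)    # the single middle (6th) value
median_even = median(ages_even)  # mean of the 6th and 7th values

print(round(mean_age, 1), median_odd, median_even)
```

For an even number of observations, `median` averages the two middle values, which matches the rule described in the text.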

Mode: The mode is the most commonly occurring value in a distribution.

Consider this dataset showing the retirement age of 11 people, in whole years:
54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60
This table shows a simple frequency distribution of the retirement age data.

Age    Frequency
54     3
55     1
56     1
57     2
58     2
60     2

The most commonly occurring value is 54, therefore the mode of this distribution is 54 years.
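The frequency distribution and the mode can be reproduced as follows. This is a minimal sketch using `collections.Counter` and `statistics.mode` from the Python standard library; the variable names are illustrative.

```python
from collections import Counter
from statistics import mode

# Retirement ages from the text.
ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]

# Count how often each age occurs (the frequency distribution).
frequency = Counter(ages)
for age in sorted(frequency):
    print(age, frequency[age])

# The mode is the value with the highest frequency.
most_common_age = mode(ages)
print("mode:", most_common_age)
```

Note that a dataset can have more than one mode; `statistics.multimode` returns all of them when several values tie for the highest frequency.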

b) Dispersion of data:

Measures of central tendency alone are not adequate to describe data. Two data sets can have the same mean and still be entirely different. To describe data, one therefore also needs to know the extent of its variability, which is given by measures of dispersion. The range and the standard deviation are two commonly used measures of dispersion.

Range: The range is the difference between the highest and lowest values.

In {4, 6, 9, 3, 7} the lowest value is 3, and the highest is 9, so the range is 9 − 3 = 6.
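As a small illustration, the range can be computed directly from the minimum and maximum. The sketch below uses the set from the text plus two hypothetical datasets (`narrow` and `wide`, invented here purely for illustration) that share the same mean but differ in spread.

```python
from statistics import mean

def data_range(values):
    """Return the difference between the highest and lowest values."""
    return max(values) - min(values)

# Dataset from the text.
values = [4, 6, 9, 3, 7]
print(data_range(values))

# Hypothetical datasets: same mean, very different spread.
narrow = [9, 10, 11]
wide = [0, 10, 20]
print(mean(narrow), mean(wide))              # identical means
print(data_range(narrow), data_range(wide))  # different ranges
```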

Standard deviation: The standard deviation is a statistic that measures the dispersion of a dataset relative to its mean. It is calculated as the square root of the variance, which in turn is based on the deviation of each data point from the mean. The further the data points are from the mean, the more spread out the data and the higher the standard deviation.

The formula for standard deviation:

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$$

where x_i is the ith observation, μ is the mean and N is the number of observations. When the data are a sample rather than a whole population, the divisor is N − 1 instead of N.

Example: The standard deviation can be thought of as a typical distance between the data points and their mean. The mean of 8, 9, 10, 11, and 12 is 10 (8+9+10+11+12 = 50; 50/5 = 10). The deviations from the mean are −2, −1, 0, 1, and 2. Squaring them gives 4, 1, 0, 1, and 4, which sum to 10. Dividing by the number of data points gives the variance (10/5 = 2), and its square root is the standard deviation: √2 ≈ 1.41. (Treating the five values as a sample and dividing by 4 instead gives √2.5 ≈ 1.58.) Note that simply averaging the absolute distances (2+1+0+1+2 = 6; 6/5 = 1.2) gives the mean absolute deviation, which is a different measure of spread and not the standard deviation.
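The following minimal sketch works through the numbers 8, 9, 10, 11, 12 in Python. It computes the average absolute distance from the mean (the mean absolute deviation) alongside the standard deviation proper, defined as the square root of the variance, so the two quantities can be compared. `pstdev` and `stdev` are the population and sample versions in the standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, pstdev, stdev

data = [8, 9, 10, 11, 12]
center = mean(data)  # 50 / 5 = 10

# Average absolute distance from the mean: (2+1+0+1+2) / 5.
mean_abs_dev = sum(abs(x - center) for x in data) / len(data)

# Standard deviation = square root of the variance.
population_sd = pstdev(data)  # divides the squared deviations by N
sample_sd = stdev(data)       # divides the squared deviations by N - 1

print(mean_abs_dev, round(population_sd, 2), round(sample_sd, 2))
```

Squaring the deviations weights large deviations more heavily, which is why the standard deviation here is larger than the mean absolute deviation.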

c) Correlation of data:

Correlation is a bivariate analysis that measures the strength of association between two variables and the direction of the relationship. The correlation coefficient ranges from −1 to +1. A value of ±1 indicates a perfect association between the two variables; as the coefficient approaches 0, the relationship becomes weaker. The sign of the coefficient gives the direction: a + sign indicates a positive relationship and a − sign a negative one. Several types of correlation are measured in statistics; the Pearson r correlation is the most commonly used.

Pearson r correlation: Pearson r is the most widely used correlation statistic for measuring the degree of relationship between linearly related variables. For example, in the stock market, Pearson r can be used to measure how strongly two stocks are related to each other. The following formula is used to calculate the Pearson r correlation:

$$r_{xy} = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$$

where
r_xy = Pearson r correlation coefficient between x and y
n = number of observations
x_i = value of x (for the ith observation)
y_i = value of y (for the ith observation)
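A minimal Python sketch of the Pearson r calculation follows. It assumes the common computational form r = (n Σxy − Σx Σy) / (√(n Σx² − (Σx)²) · √(n Σy² − (Σy)²)), which uses only the symbols n, x_i and y_i listed above. The function name `pearson_r` and the three series are hypothetical, chosen only to show a perfect positive and a perfect negative relationship.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between paired sequences x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return numerator / denominator

# Hypothetical series, invented for illustration only.
series_a = [1, 2, 3, 4, 5]
series_b = [2, 4, 6, 8, 10]   # rises exactly with series_a
series_c = [10, 8, 6, 4, 2]   # falls exactly as series_a rises

r_positive = pearson_r(series_a, series_b)  # close to +1
r_negative = pearson_r(series_a, series_c)  # close to -1
print(round(r_positive, 6), round(r_negative, 6))
```

On Python 3.10 or later, `statistics.correlation` computes the same coefficient and can be used as a cross-check.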

