In order to determine whether the lexical development of kids is associated with their physical development, data were collected among the kids of a primary school. In a simple random sample of size n = 5, the level of lexical development was determined as the number of words in an individual's active vocabulary. The level of physical development was measured as the body size or length (in cm) of an individual, i.e. the distance between the soles of the feet and the crown of the head in an upright and straight posture (see table 1).
The mean size of the kids in this sample is 127 cm. The standard deviation of the size of the kids in this sample is 16 cm.
The mean number of words in the active vocabulary of the kids in this sample is 503.2.
The standard deviation of the number of words in the active vocabulary of the kids in this sample is 160.01.
Table 1:
Size (cm) | Vocabulary (number of words)
108 | 290
137 | 607
126 | 491
148 | 703
116 | 425
a. In order to describe the relationship between the lexical and physical development of the children in the given school, compute the linear correlation coefficient based on the sample data given in table 1.
b. Describe the properties of the linear correlation coefficient in general and interpret the results of subtask a accordingly!
c. Describe which possible explanations have to be considered when interpreting scatter diagrams and correlation coefficients that indicate a strong linear correlation between two variables. Interpret the results of subtask a accordingly!
d. Do a simple linear regression and determine the equation that best describes the relationship between the variables of the given set of bivariate data.
e. Determine and interpret the coefficient of determination for the model that you determined in d.
a)
The correlation coefficient comes out to be 0.99.
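As a rough check on this value, the sample correlation coefficient can be recomputed from the table 1 data. The sketch below uses only the Python standard library; the names size, vocab and pearson_r are chosen here for illustration and are not part of the original task.

```python
from math import sqrt

# Data from table 1 (variable names are illustrative)
size = [108, 137, 126, 148, 116]    # body size in cm
vocab = [290, 607, 491, 703, 425]   # words in active vocabulary

def pearson_r(x, y):
    """Sample linear correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross products and of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r = pearson_r(size, vocab)
print(round(r, 3))  # 0.992
```

The n - 1 factors of the sample covariance and the two sample standard deviations cancel, so they do not appear in the formula.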
b)
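In general, the linear correlation coefficient r lies between -1 and +1. Its sign gives the direction of the linear relationship and its absolute value gives the strength: values close to -1 or +1 indicate a strong linear relationship, values close to 0 indicate a weak linear relationship or none. It has no unit, and it does not change if the two variables are swapped.
Here r is about 0.99, so the two variables (vocabulary and size) are strongly and positively correlated: in this sample, taller kids tend to have a larger active vocabulary, and vice versa.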
c)
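A strong linear correlation does not by itself show that one variable causes the other. The possible explanations to consider are a direct cause-and-effect relationship between the two variables, a common cause (a lurking variable) that drives both, and coincidence, which cannot be ruled out with a sample as small as n = 5. Here the correlation is strong and positive, but it is unlikely that growing taller causes a larger vocabulary. A more plausible explanation is a common cause such as age, since older kids tend to be both taller and to know more words.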
d)
We fit a simple linear regression model here. In a spreadsheet such as Excel this can be done with the function LINEST(y_values, x_values, TRUE, TRUE), where y_values contains the Vocabulary (number of words) values and x_values contains the Size values.
Select a range of 5 rows and 2 columns, type the formula in the first cell and then press Ctrl + Shift + Enter.
The equation comes out to be:
Vocabulary = -756.5 + 9.92 * Size
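For readers without a spreadsheet at hand, the same least-squares line can be sketched in plain Python. This is only an illustration of the standard formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x); the helper name fit_line and the variable names are assumptions made here.

```python
# Data from table 1 (variable names are illustrative)
size = [108, 137, 126, 148, 116]    # x: body size in cm
vocab = [290, 607, 491, 703, 425]   # y: words in active vocabulary

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line(size, vocab)
print(f"Vocabulary = {intercept:.1f} + {slope:.2f} * Size")
# Vocabulary = -756.5 + 9.92 * Size
```

The line should only be used for sizes within the observed range of about 108 to 148 cm; the negative intercept has no meaningful interpretation.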
e)
The coefficient of determination (R²) comes out to be about 0.984, which means that about 98.4% of the variation in vocabulary in this sample is explained by the linear relationship with the size of the kid.
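The coefficient of determination can also be checked directly from its definition, R² = 1 - SSE / SST, where SSE is the sum of squared residuals of the fitted line and SST is the total sum of squares of the vocabulary values. The following is a minimal sketch with illustrative variable names; it refits the line so that it runs on its own.

```python
# Data from table 1 (variable names are illustrative)
size = [108, 137, 126, 148, 116]    # x: body size in cm
vocab = [290, 607, 491, 703, 425]   # y: words in active vocabulary

n = len(size)
mean_x = sum(size) / n
mean_y = sum(vocab) / n

# Least-squares slope and intercept
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(size, vocab))
sxx = sum((x - mean_x) ** 2 for x in size)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual and total sums of squares
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(size, vocab))
sst = sum((y - mean_y) ** 2 for y in vocab)

r_squared = 1 - sse / sst
print(round(r_squared, 4))  # 0.9837
```

For simple linear regression with one explanatory variable, this value equals the square of the linear correlation coefficient from subtask a.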