The data set for this question set (Tab Q1 in the Excel data file) comes from a research project that tracks the elderly residents in a community to monitor their cognitive function and general health. Based on the literature, education is considered a protective factor against dementia, and memory decline is usually the first sign of dementia. So the researchers would like to know whether education level (measured in number of years of formal schooling) is correlated with memory function (a standardized memory test score) in their sample of elderly residents.
a. Create a scatter plot between the two variables using Excel. Paste the plot here and format it into an APA-styled “figure” (see Assignment Guides for APA format). Be sure to submit the Excel file that you used to create the scatter plot. (2 points)
b. Calculate the mean and standard deviation for the two variables separately.
c. Calculate the Z scores for all the scores of the two variables, separately.
Tips: It may help to prevent error and to increase clarity if the process and/or the answers (z scores) are listed in a table format.
Subject ID | education | memory |
1 | 16 | 112 |
2 | 16 | 117 |
3 | 12 | 96 |
4 | 17 | 114 |
5 | 16 | 106 |
6 | 14 | 80 |
7 | 14 | 89 |
8 | 12 | 94 |
9 | 16 | 113 |
10 | 13 | 111 |
11 | 16 | 112 |
12 | 18 | 84 |
13 | 12 | 108 |
14 | 16 | 95 |
15 | 12 | 117 |
16 | 14 | 83 |
17 | 8 | 106 |
18 | 12 | 100 |
19 | 18 | 120 |
20 | 14 | 103 |
21 | 14 | 89 |
22 | 12 | 94 |
23 | 14 | 113 |
24 | 12 | 111 |
25 | 12 | 108 |
The purpose of the study is to determine whether the number of years of education is correlated with memory function.
The question provides data from 25 participants. To look for a correlation, we first draw a scatter plot of the data. Because we are examining the effect of education on memory, education goes on the x-axis and the dependent variable, memory, goes on the y-axis.
[Figure: scatter plot of memory test score (y-axis) against years of education (x-axis)]
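Next, we calculate the mean and standard deviation of each column.
The formula for the mean is:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
where x_i are the data points in the column and n is the number of data points in the column.
The formula for the (sample) standard deviation is:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
where \bar{x} is the mean of the column.
Solving for both columns, we get: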
Statistic | education | memory |
Mean | 14 | 103 |
Standard deviation | 2.3629 | 11.7721 |
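The summary values above can be cross-checked outside Excel. The following is a minimal sketch in Python, assuming the sample standard deviation (divisor n - 1), which is the version that matches the reported 2.3629 and 11.7721. The variable names are illustrative and not part of the original solution.

```python
from statistics import mean, stdev

# Data copied from the question (subjects 1 to 25, in order).
education = [16, 16, 12, 17, 16, 14, 14, 12, 16, 13, 16, 18, 12,
             16, 12, 14, 8, 12, 18, 14, 14, 12, 14, 12, 12]
memory = [112, 117, 96, 114, 106, 80, 89, 94, 113, 111, 112, 84, 108,
          95, 117, 83, 106, 100, 120, 103, 89, 94, 113, 111, 108]

# statistics.stdev uses the n - 1 divisor (sample standard deviation).
summary = {
    name: (mean(values), stdev(values))
    for name, values in (("education", education), ("memory", memory))
}

for name, (m, s) in summary.items():
    print(f"{name}: mean = {m}, sd = {s:.4f}")
```

In Excel, the corresponding functions would be AVERAGE and STDEV.S applied to each column.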
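Next, we calculate the z-score for each value. This is done using the formula:
$$z_i = \frac{x_i - \bar{x}}{s}$$
where \bar{x} and s are the mean and standard deviation of the column that x_i belongs to.
Solving for each column, we get: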
Subject ID | education | memory | z_education | z_memory |
1 | 16 | 112 | 0.8464 | 0.7645 |
2 | 16 | 117 | 0.8464 | 1.1893 |
3 | 12 | 96 | -0.8464 | -0.5946 |
4 | 17 | 114 | 1.2696 | 0.9344 |
5 | 16 | 106 | 0.8464 | 0.2548 |
6 | 14 | 80 | 0 | -1.9538 |
7 | 14 | 89 | 0 | -1.1893 |
8 | 12 | 94 | -0.8464 | -0.7645 |
9 | 16 | 113 | 0.8464 | 0.8495 |
10 | 13 | 111 | -0.4232 | 0.6796 |
11 | 16 | 112 | 0.8464 | 0.7645 |
12 | 18 | 84 | 1.6928 | -1.614 |
13 | 12 | 108 | -0.8464 | 0.4247 |
14 | 16 | 95 | 0.8464 | -0.6796 |
15 | 12 | 117 | -0.8464 | 1.1893 |
16 | 14 | 83 | 0 | -1.6989 |
17 | 8 | 106 | -2.5393 | 0.2548 |
18 | 12 | 100 | -0.8464 | -0.2548 |
19 | 18 | 120 | 1.6928 | 1.4441 |
20 | 14 | 103 | 0 | 0 |
21 | 14 | 89 | 0 | -1.1893 |
22 | 12 | 94 | -0.8464 | -0.7645 |
23 | 14 | 113 | 0 | 0.8495 |
24 | 12 | 111 | -0.8464 | 0.6796 |
25 | 12 | 108 | -0.8464 | 0.4247 |
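The z-score table can be reproduced with a short script. This is a sketch under the same assumption as above (sample standard deviation, divisor n - 1). The helper name `z_scores` is hypothetical and chosen for illustration only.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values using the sample mean and sample SD (n - 1)."""
    m, s = mean(values), stdev(values)
    return [(x - m) / s for x in values]

# Data copied from the question (subjects 1 to 25, in order).
education = [16, 16, 12, 17, 16, 14, 14, 12, 16, 13, 16, 18, 12,
             16, 12, 14, 8, 12, 18, 14, 14, 12, 14, 12, 12]
memory = [112, 117, 96, 114, 106, 80, 89, 94, 113, 111, 112, 84, 108,
          95, 117, 83, 106, 100, 120, 103, 89, 94, 113, 111, 108]

z_education = z_scores(education)
z_memory = z_scores(memory)

# Print the rows in the same layout as the table above.
print("Subject ID | education | memory | z_education | z_memory |")
rows = zip(education, memory, z_education, z_memory)
for subject_id, (e, mem, ze, zm) in enumerate(rows, start=1):
    print(f"{subject_id} | {e} | {mem} | {ze:.4f} | {zm:.4f} |")
```

Because the script divides by the unrounded standard deviation, a few entries can differ from the table by one unit in the fourth decimal place. For example, subject 17's education z-score comes out as -2.5392 rather than -2.5393. The table values appear to have been computed with the standard deviation already rounded to four decimals. As a sanity check, each z-score column should have a mean of 0 and a sample standard deviation of 1.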