2. You completed the study described above to determine the extent of anxiety symptoms in students at BMCC (random sample of 1,000 students). You also collected data on demographic variables. You sent out 1,000 surveys and got back 500.
a. What is the n? _____________
b. On your survey, 350 students were female. Construct
a frequency distribution table for sex and create the appropriate
graph to display the results.
What kind of variable is sex? _________________________
c. Your measure of anxiety symptoms can range from 0
to 20, where a higher number indicates more symptoms. Your measure
had a mean of 7.2 with a standard deviation of 2.3. The median was
6, Q1 was 3 and Q3 was 10. The responses on this variable in your
sample ranged from 0 to 19.
What kind of variable is this? _____________________
Test for outliers using this information. Which measures should you
use for your descriptive statistics, the mean and standard
deviation or the median and IQR?
Create a box whisker plot using this information.
d. You decide to split up your data by sex.
For women, minimum = 0, Q1 = 5, median = 8, Q3 = 11, maximum =
19.
For men, minimum = 0, Q1 = 2, median = 5, Q3 = 8, maximum = 15.
Create side-by-side box-whisker plots. Can you tell if there are
differences between sexes? Explain.
e. You decide to create groupings for this variable
based on the severity of symptoms. Anything below a score of 6 is
“normal”, between 7 and 10 is “at risk” and anything above 10 is
“likely anxiety.” In your sample, 283 participants fall into the
“normal” group, 134 fall into the “at risk” group, while the rest
are in the “likely anxiety” group. Please create a frequency
distribution table and the appropriate graph to display the
results.
What kind of variable is this? ________________
Solution:
a) n = 500, the number of surveys returned.
b)Sex is a nominal categorical variable.
The frequency distribution table for sex is as follows:
Sex | Number of students |
Female | 350 |
Male | 150 |
Total | 500 |
A bar chart is the appropriate graph for this frequency distribution, with one bar for each sex and the number of students on the vertical axis.
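As a rough illustration, the frequency table can be extended with relative frequencies and a simple text bar for each category. This is only a sketch: the helper name `frequency_table` and the 40-character bar width are choices made for this example and are not part of the original solution.

```python
# Counts taken from the problem statement: 350 of the 500 respondents were female.
counts = {"Female": 350, "Male": 150}
n = sum(counts.values())


def frequency_table(counts):
    """Return rows of (category, count, relative frequency)."""
    total = sum(counts.values())
    return [(label, c, c / total) for label, c in counts.items()]


rows = frequency_table(counts)
for label, c, rel in rows:
    # Crude text bar: 40 characters would represent 100% of the sample.
    bar = "#" * round(rel * 40)
    print(f"{label:<8}{c:>5}{rel:>8.1%}  {bar}")
print(f"{'Total':<8}{n:>5}")
```

The bar lengths are proportional to the counts, which is the same information a bar chart conveys.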
c) The anxiety score, which ranged from 0 to 19 in the sample, is a continuous numeric variable.
Given data:
Q1 | 3 |
Median | 6 |
Q3 | 10 |
Mean | 7.2 |
Standard deviation | 2.3 |
Use the interquartile range (IQR) test to check for outliers. A value xi is not an outlier if it satisfies:
Q1 - 1.5 (Q3 - Q1) < xi < Q3 + 1.5 (Q3 - Q1)
3 - 1.5 (10 - 3) < xi < 10 + 1.5 (10 - 3)
-7.5 < xi < 20.5
The responses on this variable in the sample ranged from 0 to 19, which lies entirely within the interval -7.5 < xi < 20.5. So, there are no outliers.
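The IQR test can be checked with a few lines of Python. This is a minimal sketch, assuming the usual multiplier of 1.5; the function name `iqr_fences` is made up for this example.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences for the IQR test."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles and observed range from part c of the problem.
lower, upper = iqr_fences(3, 10)
observed_min, observed_max = 0, 19

# An outlier exists only if some observation falls outside the fences,
# so checking the smallest and largest observations is enough.
has_outliers = observed_min < lower or observed_max > upper
print(lower, upper, has_outliers)  # -7.5 20.5 False
```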
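Use the median and IQR for the descriptive statistics. Although there are no outliers, the mean (7.2) is larger than the median (6), which suggests a right-skewed distribution.
The box-and-whisker plot is drawn from the following five-number summary.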
Minimum | 0 |
Q1 | 3 |
Median | 6 |
Q3 | 10 |
Maximum | 19 |
d)
Comparison of center: The figure shows that the median anxiety score of women (8) is greater than that of men (5).
Comparison of dispersion: The IQRs are equal (11 - 5 = 6 for women and 8 - 2 = 6 for men).
Comparison of skewness: Both distributions appear to be right-skewed, and the men's distribution is slightly more skewed than the women's.
Comparison of potential outliers: There are no outliers for either men or women.
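The comparisons above can be reproduced from the two five-number summaries. The sketch below is illustrative only: the function name `describe` and the use of whisker lengths as a rough indicator of skewness are assumptions of this example, not part of the original solution.

```python
# Five-number summaries from part d: (minimum, Q1, median, Q3, maximum).
summaries = {
    "Women": (0, 5, 8, 11, 19),
    "Men": (0, 2, 5, 8, 15),
}


def describe(minimum, q1, median, q3, maximum, k=1.5):
    """Summarize spread, whisker lengths, and the IQR outlier check."""
    iqr = q3 - q1
    return {
        "median": median,
        "iqr": iqr,
        "lower_whisker": q1 - minimum,   # distance from minimum to Q1
        "upper_whisker": maximum - q3,   # distance from Q3 to maximum
        "has_outliers": minimum < q1 - k * iqr or maximum > q3 + k * iqr,
    }


results = {group: describe(*s) for group, s in summaries.items()}
for group, stats in results.items():
    print(group, stats)
```

In both groups the upper whisker is longer than the lower whisker, which is consistent with right skew, and the imbalance is larger for men (7 versus 2) than for women (8 versus 5).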