STAT101
PatientID | Age | Sex | County | CardioRisk | Height | Weight | BloodGroup | Stroke | RegularEx | Group | Cholesterol1 | Cholesterol2 |
225 | 58 | Male | Offaly | Low | 178.2 | 103.5 | AB | N | N | Placebo | 6.1 | 4.6 |
226 | 61 | Male | Carlow | Medium | 173.9 | 63 | AB | Y | N | Control | 2.2 | 5.7 |
227 | 57 | Female | Donegal | Medium | 160.9 | 76.1 | B | N | Y | Control | 5.8 | 5 |
228 | 43 | Male | Offaly | High | 175 | 83.7 | AB | Y | N | Control | 3.6 | 2.4 |
229 | 37 | Female | Longford | Low | 156.8 | 68 | B | N | Y | Control | 4.9 | 6 |
230 | 29 | Male | Leitrim | Medium | 165.1 | 79.1 | A | Y | Y | Placebo | 3.5 | 5.3 |
231 | 52 | Male | Cavan | Low | 166.6 | 60.4 | A | Y | Y | Placebo | 2.9 | 4 |
232 | 47 | Female | Westmeath | Low | 166.8 | 63.6 | B | N | Y | Control | 4 | 3.2 |
233 | 28 | Male | Wicklow | Low | 171.5 | 67.2 | O | Y | N | Control | 4.2 | 4 |
Q1
Give the R commands (run in RStudio) that answer the following questions.
a. How many patients are in each of the Control and Placebo groups?
b. Provide a histogram and describe the distribution of age groups in your data.
c. Body Mass Index (BMI) is defined as kg/m² where kg is the individual’s weight in kilograms and m
is the individual’s height in metres. Provide a histogram and describe the distribution of BMI in the
whole sample. Calculate the sample mean BMI of both males and females.
d. Identify the counties with the highest and lowest mean weights.
e. Compute how many females in the sample belong to Blood Group O.
f. What percentage of men belong to Blood Group B?
g. Calculate the mean difference in each subject’s cholesterol level from the start to the end of the study.
(Hint: create a new variable choldiff.)
R code and output:
> d=read.table('data1.csv',header=T,sep=',')
> head(d)
PatientID Age Sex County CardioRisk Height Weight BloodGroup Stroke
1 225 58 Male Offaly Low 178.2 103.5 AB N
2 226 61 Male Carlow Medium 173.9 63.0 AB Y
3 227 57 Female Donegal Medium 160.9 76.1 B N
4 228 43 Male Offaly High 175.0 83.7 AB Y
5 229 37 Female Longford Low 156.8 68.0 B N
6 230 29 Male Leitrim Medium 165.1 79.1 A Y
RegularEx Group Cholesterol1 Cholesterol2
1 N Placebo 6.1 4.6
2 N Control 2.2 5.7
3 Y Control 5.8 5.0
4 N Control 3.6 2.4
5 Y Control 4.9 6.0
6 Y Placebo 3.5 5.3
> attach(d)
The following objects are masked from d (pos = 3):
Age, BloodGroup, CardioRisk, Cholesterol1, Cholesterol2, County,
Group, Height, PatientID, RegularEx, Sex, Stroke, Weight
Que.a
> length(which(Group=='Placebo'))
[1] 3
> length(which(Group=='Control'))
[1] 6
There are 3 patients in the Placebo group and 6 patients in the Control group.
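As a cross-check on the two length(which(...)) calls above, the same counts can be obtained in one step with table(). This is only a sketch: the Group vector is re-typed by hand from the nine rows shown in the question rather than read from data1.csv, and the name group_counts is illustrative.

```r
# Group column re-typed from the nine rows shown in the question
# (an assumption: the real data1.csv may contain more rows).
Group <- c("Placebo", "Control", "Control", "Control", "Control",
           "Placebo", "Placebo", "Control", "Control")

# table() counts every distinct value of Group in a single call.
group_counts <- table(Group)
print(group_counts)
```

table() also reports any group label that was not anticipated, which the two separate which() calls would silently miss.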
Que.b
> hist(Age)
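The question also asks for a description of the age distribution, which hist(Age) alone does not give. A minimal sketch of how the description could be backed up numerically, assuming the nine ages shown in the question (the 10-year bin edges and the names age_summary, age_bins and age_counts are illustrative choices, not part of the original answer):

```r
# Ages re-typed from the nine rows shown in the question.
Age <- c(58, 61, 57, 43, 37, 29, 52, 47, 28)

# Minimum, quartiles, median, mean and maximum in one call.
age_summary <- summary(Age)
print(age_summary)

# Count patients per decade; right = FALSE gives bins [20,30), [30,40), ...
age_bins <- cut(Age, breaks = seq(20, 70, by = 10), right = FALSE)
age_counts <- table(age_bins)
print(age_counts)
```

For these nine patients the ages run from 28 to 61 with a median of 47, and the decade counts are 2, 1, 2, 3 and 1, so the ages are spread fairly evenly with a slight concentration in the fifties. With so few observations, any statement about the shape of the distribution is tentative.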
Que.c
> BMI = round(Weight/(Height/100)^2, 2)   # Height is recorded in cm, so convert to metres first
> hist(BMI)
> a=cbind(Sex, BMI)
> a
Sex BMI
[1,] "Male" "32.59"
[2,] "Male" "20.83"
[3,] "Female" "29.39"
[4,] "Male" "27.33"
[5,] "Female" "27.66"
[6,] "Male" "29.02"
[7,] "Male" "21.76"
[8,] "Female" "22.86"
[9,] "Male" "22.85"
> bmi_m = BMI[which(Sex=='Male')];bmi_m
[1] 32.59 20.83 27.33 29.02 21.76 22.85
> bmi_f = BMI[which(Sex=='Female')];bmi_f
[1] 29.39 27.66 22.86
> mean(bmi_m)
[1] 25.73
> mean(bmi_f)
[1] 26.63667
Histogram: produced by hist(BMI) above. BMI ranges from 20.83 to 32.59 in this sample.
Sample mean BMI for male = 25.73
Sample mean BMI for female = 26.64
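Que.d
Part d is not answered in the transcript. One possible approach, sketched here under the assumption that only the nine rows shown in the question are used, is to average Weight within each County using tapply(). The names county_means, heaviest and lightest are illustrative.

```r
# County and Weight re-typed from the nine rows shown in the question.
County <- c("Offaly", "Carlow", "Donegal", "Offaly", "Longford",
            "Leitrim", "Cavan", "Westmeath", "Wicklow")
Weight <- c(103.5, 63, 76.1, 83.7, 68, 79.1, 60.4, 63.6, 67.2)

# Mean weight per county; the result is a named vector indexed by county.
county_means <- tapply(Weight, County, mean)
print(sort(county_means))

# which.max()/which.min() return the position of the extreme value,
# and names() recovers the county label attached to it.
heaviest <- names(which.max(county_means))
lightest <- names(which.min(county_means))
print(c(highest = heaviest, lowest = lightest))
```

On these nine rows, Offaly has the highest mean weight (93.6 kg, averaged over two patients) and Cavan the lowest (60.4 kg, a single patient). Every county other than Offaly contributes one patient only, so its "mean" is just that patient's weight.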
Que.e
> length(which(Sex=='Female' & BloodGroup == 'O' ))
[1] 0
No females in the sample belong to Blood Group O.
Que.f
> length(which(Sex=='Male' & BloodGroup == 'B' ))
[1] 0
No males in the sample belong to Blood Group B, so the percentage is 0%.
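The count above happens to be zero, so the percentage follows immediately. For a general answer the count has to be divided by the number of men. A minimal sketch, assuming the nine rows shown in the question (the names is_male, pct_male_B and pct_table are illustrative):

```r
# Sex and BloodGroup re-typed from the nine rows shown in the question.
Sex <- c("Male", "Male", "Female", "Male", "Female",
         "Male", "Male", "Female", "Male")
BloodGroup <- c("AB", "AB", "B", "AB", "B", "A", "A", "B", "O")

# The mean of a logical vector is the proportion of TRUE values,
# so this is the share of men whose blood group is B, in percent.
is_male <- Sex == "Male"
pct_male_B <- 100 * mean(BloodGroup[is_male] == "B")
print(pct_male_B)

# Full breakdown: percentage of each blood group within each sex
# (margin = 1 makes every row sum to 100).
pct_table <- 100 * prop.table(table(Sex, BloodGroup), margin = 1)
print(round(pct_table, 1))
```

The row-percentage table answers parts e and f at once, since it shows every blood group for both sexes.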
Que.g
> choldiff= Cholesterol2 - Cholesterol1
> mean(choldiff)
[1] 0.3333333
The mean change in cholesterol from the start to the end of the study (Cholesterol2 - Cholesterol1) is 0.3333, i.e. an average increase.
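The hint asks for a new variable choldiff. A sketch of one way to store it as a column of the data frame instead of relying on attach(), assuming the nine rows shown in the question (the data frame here is rebuilt by hand, and mean_diff and n_increase are illustrative names):

```r
# Cholesterol readings re-typed from the nine rows shown in the question.
d <- data.frame(
  Cholesterol1 = c(6.1, 2.2, 5.8, 3.6, 4.9, 3.5, 2.9, 4.0, 4.2),
  Cholesterol2 = c(4.6, 5.7, 5.0, 2.4, 6.0, 5.3, 4.0, 3.2, 4.0)
)

# End-of-study minus start-of-study, kept alongside the original columns.
d$choldiff <- d$Cholesterol2 - d$Cholesterol1

mean_diff <- mean(d$choldiff)      # average change per subject
n_increase <- sum(d$choldiff > 0)  # subjects whose cholesterol rose
print(mean_diff)
print(n_increase)
```

Keeping choldiff inside d means later summaries, for example by Group, can use the same data frame without depending on which objects are attached.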