- Event time T follows an exponential distribution with a mean of 40
- Censoring time Tc follows an exponential distribution with a mean of 25
- Generate 500 observations, with a censoring flag indicating whether censoring happened before the event
Question: What do you think the percent of censoring should be? Show your calculation or reasoning.
I used R to generate the event times and the censoring times, and then flagged the censored observations in Excel.
Let us start with the data.
Only a snippet of the data is shown here, because the full data set of 500 observations exceeds the character limit.
Event time T | Censoring time Tc | Flag (1 = censored) |
11.49726987 | 76.9530783 | 0 |
108.0359437 | 16.75622817 | 1 |
87.30937175 | 10.53936965 | 1 |
28.37716795 | 52.51840041 | 0 |
50.44413026 | 13.69920358 | 1 |
39.1176727 | 78.27119046 | 0 |
53.74690909 | 5.32343887 | 1 |
6.175411977 | 57.61599983 | 0 |
10.45824168 | 9.579730277 | 1 |
30.60202226 | 13.96628568 | 1 |
191.8416877 | 3.719486971 | 1 |
23.68727703 | 7.696135622 | 1 |
44.21200581 | 25.37628167 | 1 |
107.1747418 | 72.07776978 | 1 |
1.974580679 | 12.66587234 | 0 |
84.32676456 | 89.42452943 | 0 |
31.8948435 | 98.25278729 | 0 |
39.20494169 | 0.89817956 | 1 |
36.75132677 | 38.480403 | 0 |
55.89845888 | 36.14671673 | 1 |
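The flag rule used in the table can be sketched in R as follows. This is a minimal illustration using only the first five rows shown above; the variable names and the extra `observed_time` column are assumptions added for illustration and are not part of the original spreadsheet.

```r
# First five rows of the table above
event_time  <- c(11.49726987, 108.0359437, 87.30937175, 28.37716795, 50.44413026)
censor_time <- c(76.9530783, 16.75622817, 10.53936965, 52.51840041, 13.69920358)

# Flag: 1 = censored (censoring time is smaller), 0 = event observed
flag <- as.integer(censor_time < event_time)

# Assumed extra column: the time actually observed is whichever comes first
observed_time <- pmin(event_time, censor_time)

print(data.frame(event_time, censor_time, observed_time, flag))
```

The resulting flags can be compared row by row with the third column of the table.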
In the full sample, 299 of the 500 observations are censored; that is, in 299 cases censoring occurred before the event actually happened.
Hence, the percentage of censoring is 299/500 ≈ 0.6.
About 60% of the data will be censored if we use the censoring distribution given in the question.
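The simulated figure can be compared with the exact probability. The following is a sketch of that calculation, assuming T and Tc are independent (the question does not say so explicitly), with rates λ = 1/40 for the event time and λc = 1/25 for the censoring time:

```latex
\begin{aligned}
P(T_c < T) &= \int_0^\infty P(T > t)\, f_{T_c}(t)\, dt
            = \int_0^\infty e^{-\lambda t}\, \lambda_c e^{-\lambda_c t}\, dt \\
           &= \frac{\lambda_c}{\lambda + \lambda_c}
            = \frac{1/25}{1/40 + 1/25}
            = \frac{8}{13} \approx 0.615
\end{aligned}
```

Under that assumption, about 61.5% of observations are expected to be censored, or roughly 308 out of 500. A sample count of 299 is within ordinary sampling variation of that value.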
For reference, here is the R code:
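# Event times: exponential with mean 40 (rate = 1/40)
T = rexp(500, 1/40)
# Censoring times: exponential with mean 25 (rate = 1/25)
T.c = rexp(500, 1/25)
# Difference: positive when censoring occurs before the event
C = T - T.c
# Censoring flag (1 = censored), computed here in R instead of Excel
flag = as.integer(C > 0)
data = cbind(T, T.c, C, flag)
data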
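As a rough check on the 60% figure, the simulation can be repeated with a larger sample and compared with the closed-form probability for two independent exponential times. This is only a sketch: the helper names `censor_prob` and `simulate_censoring`, the seed, and the sample size of 100000 are assumptions chosen for illustration, and independence of the two times is assumed.

```r
# Theoretical censoring probability for independent exponential times,
# parameterized by their means (hypothetical helper, not from the original code)
censor_prob <- function(mean_event, mean_censor) {
  rate_event  <- 1 / mean_event
  rate_censor <- 1 / mean_censor
  rate_censor / (rate_event + rate_censor)
}

# Simulated proportion of pairs in which censoring comes before the event
simulate_censoring <- function(n, mean_event, mean_censor) {
  event_time  <- rexp(n, rate = 1 / mean_event)
  censor_time <- rexp(n, rate = 1 / mean_censor)
  mean(censor_time < event_time)
}

set.seed(123)  # arbitrary seed, used only for reproducibility
p_theory <- censor_prob(40, 25)
p_sim    <- simulate_censoring(100000, 40, 25)
print(c(theory = p_theory, simulated = p_sim))
```

With a sample this large the simulated proportion should sit close to the theoretical one, whereas a sample of 500 can easily differ by a few percentage points.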