G-Test of Independence
In a study of the relation between blood type and disease, large samples of patients with peptic ulcers, patients with gastric cancer, and control persons (free from any of these diseases) were classified as to blood type (O, A, B, AB). In this problem, the relatively small numbers of AB patients were omitted for simplicity. The observed numbers are as follows:
Blood Type    Peptic Ulcer    Gastric Cancer    Controls
O             983             383               2892
A             679             416               2625
B             134             84                570
Perform the G-test of independence to analyze the dataset. Work the problem by hand and with SAS.
Graph the relative frequencies of the observed data.
Null Hypothesis (H0): Blood type and disease status are independent; that is, a patient's blood type does not affect whether they have peptic ulcer, gastric cancer, or neither disease.
Total patients = (983+679+134+383+416+84+2892+2625+570) = 8766
% of patients with Peptic Ulcer = (983+679+134)/8766 = 20.49%
% of patients with Gastric Cancer = (383+416+84)/8766 = 10.07%
% of patients with no disease = (2892+2625+570)/8766 = 69.44%
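The column totals and percentages above can be checked with a short script. This is a minimal sketch in Python; the variable names (observed, col_totals, col_percent) are chosen here for illustration and are not part of the original problem.

```python
# Observed counts from the table: rows are blood types,
# columns are (peptic ulcer, gastric cancer, controls).
observed = {
    "O": [983, 383, 2892],
    "A": [679, 416, 2625],
    "B": [134, 84, 570],
}
diseases = ["Peptic Ulcer", "Gastric Cancer", "Controls"]

# Column totals: one total per disease group.
col_totals = [sum(row[j] for row in observed.values()) for j in range(3)]
grand_total = sum(col_totals)

# Share of each disease group among all patients, in percent.
col_percent = [100 * c / grand_total for c in col_totals]

for name, total, pct in zip(diseases, col_totals, col_percent):
    print(f"{name}: {total} / {grand_total} = {pct:.2f}%")
```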
Calculating expected values
E(O blood group patients having Peptic Ulcer) = 20.49% of (983+383+2892) = 0.2049*4258 = 872.46
E(O blood group patients having Gastric Cancer) = 10.07% of (983+383+2892) = 0.1007*4258 = 428.78
E(O blood group patients having no disease) = 69.44% of (983+383+2892) = 0.6944*4258 = 2956.76
E(A blood group patients having Peptic Ulcer) = 20.49% of (679+416+2625) = 0.2049*3720 = 762.23
E(A blood group patients having Gastric Cancer) = 10.07% of (679+416+2625) = 0.1007*3720 = 374.60
E(A blood group patients having no disease) = 69.44% of (679+416+2625) = 0.6944*3720 = 2583.17
E(B blood group patients having Peptic Ulcer) = 20.49% of (134+84+570) = 0.2049*788 = 161.46
E(B blood group patients having Gastric Cancer) = 10.07% of (134+84+570) = 0.1007*788 = 79.35
E(B blood group patients having no disease) = 69.44% of (134+84+570) = 0.6944*788 = 547.19
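The same expected counts can be reproduced as (row total * column total) / grand total. The sketch below is in Python with illustrative variable names. Because it uses the exact column totals instead of percentages rounded to two decimals, its results differ slightly from the hand values (for example, about 872.39 instead of 872.46 for O with Peptic Ulcer).

```python
# Observed counts; rows are blood types O, A, B and
# columns are peptic ulcer, gastric cancer, controls.
observed = [
    [983, 383, 2892],  # O
    [679, 416, 2625],  # A
    [134, 84, 570],    # B
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence for each cell.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for label, row in zip("OAB", expected):
    print(label, [round(e, 2) for e in row])
```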
G = 2*sum(Oi * ln(Oi/Ei)) for i = 1 to 9, where Oi is the observed count and Ei is the expected count in cell i
G = 2*[983*ln(983/872.46) + 383*ln(383/428.78) + 2892*ln(2892/2956.76) + 679*ln(679/762.23) + 416*ln(416/374.60) + 2625*ln(2625/2583.17) + 134*ln(134/161.46) + 84*ln(84/79.35) + 570*ln(570/547.19)]
G = 40.6422
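As a check on the hand calculation, the G statistic can be computed directly from the observed table. This is a sketch in Python using only the standard library; the names observed, expected, and g_stat are illustrative. It uses unrounded expected counts, so the result should agree with the hand value to about two decimal places rather than exactly.

```python
import math

# Observed counts; rows are blood types O, A, B and
# columns are peptic ulcer, gastric cancer, controls.
observed = [
    [983, 383, 2892],
    [679, 416, 2625],
    [134, 84, 570],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts under independence.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# G = 2 * sum over all cells of O * ln(O / E).
g_stat = 2 * sum(
    o * math.log(o / e)
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

print(f"G = {g_stat:.2f}")
```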
degrees of freedom = (rows-1)*(columns-1) = (3-1)*(3-1) = 4
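p-value < 0.00001 (approximately 3 * 10^-8 for a chi-square distribution with 4 degrees of freedom), which is below alpha = 0.1 => Null Hypothesis rejected. The data indicate that blood type and disease status are associated.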
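The p-value is the upper-tail probability of a chi-square distribution with 4 degrees of freedom at the computed G. For an even number of degrees of freedom this tail has a closed form, so it can be sketched in Python without a statistics library. The helper name chi2_sf_even_df is hypothetical and defined only for this example; the value of G is taken from the calculation above.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, valid only for even df.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2
    term = 1.0   # current term (x/2)^k / k!, starting at k = 0
    total = 1.0  # running sum of the terms
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


g_stat = 40.6422  # G from the hand calculation
df = 4            # (3 - 1) * (3 - 1)
alpha = 0.1

p_value = chi2_sf_even_df(g_stat, df)
print(f"p-value = {p_value:.3g}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

If SciPy is available, scipy.stats.chi2.sf(g_stat, df) should return the same tail probability and also handles odd degrees of freedom.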