Question

Naive Bayes Theorem

See the dataset D in Table 1. It consists of clinical data about 14 patients. Using the data in D, determine the Naive Bayes classifier and predict the Disease label for the patients in Table 2. Then compare your predicted labels with the ground-truth labels (i.e., column 'Disease') and report the accuracy P.

Table 1: Dataset D with clinical data of 14 patients

ID  HBP  BMI       Drink  Weight      Disease
1   Yes  Normal    No     Overweight  Yes
2   No   Normal    Yes    Normal      No
3   No   Critical  No     Overweight  Yes
4   No   High      Yes    Overweight  Yes
5   Yes  Critical  Yes    Obese       Yes
6   Yes  High      Yes    Normal      Yes
7   No   High      No     Obese       No
8   Yes  Normal    Yes    Normal      Yes
9   Yes  Critical  No     Obese       Yes
10  No   Normal    No     Overweight  No
11  No   Critical  Yes    Normal      Yes
12  Yes  High      No     Overweight  No
13  Yes  Normal    Yes    Overweight  Yes
14  Yes  High      No     Obese       No

Table 2: Test data with additional 5 patients

ID  HBP  BMI       Drink  Weight      Disease
15  Yes  Normal    No     Overweight  Yes
16  No   Normal    Yes    Normal      No
17  No   Critical  No     Overweight  Yes
18  No   High      Yes    Overweight  Yes
19  Yes  Critical  Yes    Obese       Yes

Solutions

Expert Solution

From the Disease column of Table 1, the class priors are P(Disease = Yes) = 9/14 and P(Disease = No) = 5/14.
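For reference, the decision rule these priors feed into can be written out as follows. This is a sketch of the standard Naive Bayes rule; the symbols c (a class label), x_1, ..., x_4 (the HBP, BMI, Drink and Weight values of one patient X) and \hat{y} (the predicted label) are notation introduced here, not taken from the question.

```latex
\[
\hat{y}(X) = \arg\max_{c \in \{\text{Yes},\,\text{No}\}} \; P(c) \prod_{j=1}^{4} P(x_j \mid c)
\]
\[
P(c \mid X) = \frac{P(c) \prod_{j=1}^{4} P(x_j \mid c)}
                   {\sum_{c'} P(c') \prod_{j=1}^{4} P(x_j \mid c')},
\qquad P(\text{Yes}) = \tfrac{9}{14}, \quad P(\text{No}) = \tfrac{5}{14}
\]
```

The denominator is the same for both classes, so it does not change which class wins; it only rescales the two scores so that they sum to 1.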

Table 1 can be split by class (Disease = Yes and Disease = No):

Disease = Yes (9 patients)

ID  HBP  BMI       Drink  Weight      Disease
1   Yes  Normal    No     Overweight  Yes
3   No   Critical  No     Overweight  Yes
4   No   High      Yes    Overweight  Yes
5   Yes  Critical  Yes    Obese       Yes
6   Yes  High      Yes    Normal      Yes
8   Yes  Normal    Yes    Normal      Yes
9   Yes  Critical  No     Obese       Yes
11  No   Critical  Yes    Normal      Yes
13  Yes  Normal    Yes    Overweight  Yes

Disease = No (5 patients)

ID  HBP  BMI       Drink  Weight      Disease
2   No   Normal    Yes    Normal      No
7   No   High      No     Obese       No
10  No   Normal    No     Overweight  No
12  Yes  High      No     Overweight  No
14  Yes  High      No     Obese       No
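The class-conditional frequencies implied by this split can be tabulated with a short script. The following is a minimal sketch, assuming plain relative frequencies with no smoothing; the names `FEATURES`, `TRAIN`, `fit_counts` and `likelihood` are hypothetical helpers chosen for illustration, and `Fraction` is used only to keep the values exact.

```python
from collections import Counter
from fractions import Fraction

FEATURES = ["HBP", "BMI", "Drink", "Weight"]

# Table 1 rows as (HBP, BMI, Drink, Weight, Disease), IDs 1..14 in order.
TRAIN = [
    ("Yes", "Normal", "No", "Overweight", "Yes"),
    ("No", "Normal", "Yes", "Normal", "No"),
    ("No", "Critical", "No", "Overweight", "Yes"),
    ("No", "High", "Yes", "Overweight", "Yes"),
    ("Yes", "Critical", "Yes", "Obese", "Yes"),
    ("Yes", "High", "Yes", "Normal", "Yes"),
    ("No", "High", "No", "Obese", "No"),
    ("Yes", "Normal", "Yes", "Normal", "Yes"),
    ("Yes", "Critical", "No", "Obese", "Yes"),
    ("No", "Normal", "No", "Overweight", "No"),
    ("No", "Critical", "Yes", "Normal", "Yes"),
    ("Yes", "High", "No", "Overweight", "No"),
    ("Yes", "Normal", "Yes", "Overweight", "Yes"),
    ("Yes", "High", "No", "Obese", "No"),
]


def fit_counts(rows):
    """Count patients per class and per (feature, value, class) combination."""
    class_counts = Counter(row[-1] for row in rows)
    joint = Counter()
    for row in rows:
        label = row[-1]
        # zip stops after the four feature columns, skipping the label.
        for feature, value in zip(FEATURES, row):
            joint[(feature, value, label)] += 1
    return class_counts, joint


def likelihood(feature, value, label, class_counts, joint):
    """Unsmoothed estimate of P(feature = value | Disease = label)."""
    # Counter returns 0 for combinations never seen in the training data.
    return Fraction(joint[(feature, value, label)], class_counts[label])


class_counts, joint = fit_counts(TRAIN)
for label in ("Yes", "No"):
    for index, feature in enumerate(FEATURES):
        values = sorted({row[index] for row in TRAIN})
        cells = ", ".join(
            f"{v}: {likelihood(feature, v, label, class_counts, joint)}"
            for v in values
        )
        print(f"P({feature} | Disease = {label}) -> {cells}")
```

A combination that never occurs in the training data, such as BMI = Critical together with Disease = No, gets probability 0 under this unsmoothed estimate.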

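Now, for each test row X in Table 2, compute the score P(X | Disease = c) P(Disease = c) for c = Yes and c = No, using the class-conditional frequencies from the split of Table 1, and predict the class with the larger score.

Row 1 (ID 15): X = (HBP = Yes, BMI = Normal, Drink = No, Weight = Overweight)

P(X | Disease = Yes) P(Disease = Yes) = (6/9)*(3/9)*(3/9)*(4/9)*(9/14) ≈ 0.0212

P(X | Disease = No) P(Disease = No) = (2/5)*(2/5)*(4/5)*(2/5)*(5/14) ≈ 0.0183

Normalizing by the sum of the two scores gives P(Disease = Yes | X) ≈ 0.0212/(0.0212 + 0.0183) ≈ 0.54 and P(Disease = No | X) ≈ 0.46. The normalizer is the same for both classes, so it does not affect the prediction.

Predicted Row 1: Yes

Similarly,

ID 16: Yes score = (3/9)*(3/9)*(6/9)*(3/9)*(9/14) ≈ 0.0159, No score = (3/5)*(2/5)*(1/5)*(1/5)*(5/14) ≈ 0.0034. Predicted: Yes
ID 17: Yes score = (3/9)*(4/9)*(3/9)*(4/9)*(9/14) ≈ 0.0141, No score = 0. Predicted: Yes
ID 18: Yes score = (3/9)*(2/9)*(6/9)*(4/9)*(9/14) ≈ 0.0141, No score = (3/5)*(3/5)*(1/5)*(2/5)*(5/14) ≈ 0.0103. Predicted: Yes
ID 19: Yes score = (6/9)*(4/9)*(6/9)*(2/9)*(9/14) ≈ 0.0282, No score = 0. Predicted: Yes

The No score is exactly 0 only for IDs 17 and 19, because BMI = Critical never occurs with Disease = No in Table 1. For IDs 15, 16 and 18 the No score is nonzero but smaller than the Yes score. All five patients are therefore predicted 'Yes', and since the true label of ID 16 is 'No', ID 16 is misclassified.

So, the accuracy P = (4/5)*100 = 80%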
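The whole procedure can be checked end to end with a short script. This is a minimal sketch under stated assumptions: unsmoothed relative frequencies estimated from Table 1, ties (none occur here) broken by whichever class `max` meets first, and exact arithmetic via `Fraction`. The function names `fit`, `scores` and `predict` are hypothetical and not part of any library.

```python
from collections import Counter
from fractions import Fraction

# Rows are (HBP, BMI, Drink, Weight, Disease).
TRAIN = [  # Table 1, IDs 1..14 in order
    ("Yes", "Normal", "No", "Overweight", "Yes"),
    ("No", "Normal", "Yes", "Normal", "No"),
    ("No", "Critical", "No", "Overweight", "Yes"),
    ("No", "High", "Yes", "Overweight", "Yes"),
    ("Yes", "Critical", "Yes", "Obese", "Yes"),
    ("Yes", "High", "Yes", "Normal", "Yes"),
    ("No", "High", "No", "Obese", "No"),
    ("Yes", "Normal", "Yes", "Normal", "Yes"),
    ("Yes", "Critical", "No", "Obese", "Yes"),
    ("No", "Normal", "No", "Overweight", "No"),
    ("No", "Critical", "Yes", "Normal", "Yes"),
    ("Yes", "High", "No", "Overweight", "No"),
    ("Yes", "Normal", "Yes", "Overweight", "Yes"),
    ("Yes", "High", "No", "Obese", "No"),
]
TEST = [  # Table 2, IDs 15..19 in order
    ("Yes", "Normal", "No", "Overweight", "Yes"),
    ("No", "Normal", "Yes", "Normal", "No"),
    ("No", "Critical", "No", "Overweight", "Yes"),
    ("No", "High", "Yes", "Overweight", "Yes"),
    ("Yes", "Critical", "Yes", "Obese", "Yes"),
]


def fit(rows):
    """Count patients per class and per (column index, value, class)."""
    class_counts = Counter(row[-1] for row in rows)
    joint = Counter()
    for *features, label in rows:
        for i, value in enumerate(features):
            joint[(i, value, label)] += 1
    return class_counts, joint


def scores(x, class_counts, joint):
    """Return the unnormalized score P(c) * prod_j P(x_j | c) for each class c."""
    total = sum(class_counts.values())
    result = {}
    for label, count in class_counts.items():
        score = Fraction(count, total)  # class prior
        for i, value in enumerate(x):
            score *= Fraction(joint[(i, value, label)], count)
        result[label] = score
    return result


def predict(x, class_counts, joint):
    """Pick the class with the largest unnormalized score."""
    s = scores(x, class_counts, joint)
    return max(s, key=s.get)


model = fit(TRAIN)
predictions = [predict(row[:-1], *model) for row in TEST]
correct = sum(p == row[-1] for p, row in zip(predictions, TEST))
accuracy = Fraction(correct, len(TEST))

for patient_id, row, p in zip(range(15, 20), TEST, predictions):
    s = scores(row[:-1], *model)
    print(patient_id, {k: round(float(v), 4) for k, v in s.items()}, "->", p)
print("accuracy:", float(accuracy))
```

Calling `scores` on a single row shows the two per-class values that decide each prediction, which makes it easy to see which rows have a zero score for one class and which are decided by a comparison of two nonzero scores.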

