Using three files (breast-cancer.arff, diabetes.arff and iris.arff) in Weka 3.8
Task 1: Classification performance evaluation [10 marks]
In this comparative analysis task, you are required to evaluate the classification performance of five algorithms on three datasets using Weka. Load the breast-cancer.arff, diabetes.arff and iris.arff datasets into Weka one at a time and run each of the algorithms below with its default settings. Then collect the 10-fold cross-validation classification results for quantitative evaluation.
a. MultilayerPerceptron
b. Naive Bayes
c. J48
d. RandomForest
e. REPTree
You need to write a report that compares the performance of these algorithms on the datasets. The report should contain a quantitative comparison of classification accuracy in terms of the confusion matrix and the other performance metrics reported by Weka. Include the necessary screenshots, tables, graphs, etc. to make your report comprehensive and to reveal insightful details of the performance comparison.
Comparative Classification Accuracy Using Weka

| Algorithm | breast-cancer.arff | diabetes.arff | iris.arff |
|---|---|---|---|
| Multilayer Perceptron | 64.68% | 75.39% | 97.33% |
| Naïve Bayes | 71.67% | 76.30% | 96.00% |
| J48 | 75.52% | 73.82% | 96.00% |
| Random Forest | 69.58% | 75.78% | 95.33% |
| REPTree | 70.62% | 75.26% | 94.00% |

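Each percentage in the table is the number of correctly classified instances divided by the number of instances in the dataset. As a quick sanity check, the sketch below recomputes the breast-cancer.arff column from the counts reported in the detailed outcomes for Tests 1 to 5. The function name `accuracy_percent` is illustrative only, and the result is truncated (not rounded) to two decimals because that appears to be how the table values were written down.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return accuracy as a percentage, truncated to two decimals.

    Integer arithmetic is used so that truncation is not affected by
    floating-point rounding.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    hundredths = correct * 10000 // total  # accuracy in hundredths of a percent
    return hundredths / 100


# Correctly classified instances on breast-cancer.arff (286 instances),
# copied from the detailed outcomes for Tests 1-5.
breast_cancer_correct = {
    "Multilayer Perceptron": 185,
    "Naive Bayes": 205,
    "J48": 216,
    "Random Forest": 199,
    "REPTree": 202,
}

breast_cancer_accuracy = {
    name: accuracy_percent(correct, 286)
    for name, correct in breast_cancer_correct.items()
}

for name, acc in breast_cancer_accuracy.items():
    print(f"{name}: {acc:.2f}%")
```

The same function can be applied to the diabetes.arff (768 instances) and iris.arff (150 instances) counts.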
Detailed classification outcome: Test 1

| Algorithm | Multilayer Perceptron |
|---|---|
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 185 |
| Incorrectly classified instances | 101 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 150 | 51 | a = no-recurrence-events |
| 50 | 35 | b = recurrence-events |

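Besides accuracy, Weka reports per-class precision, recall and F-measure, all of which follow from the confusion matrix. The sketch below derives them for the Test 1 matrix, assuming Weka's layout in which rows are the actual classes and columns are the predicted ("classified as") classes. The helper name `per_class_metrics` is illustrative and not part of Weka.

```python
def per_class_metrics(matrix):
    """Return precision, recall and F-measure for each class.

    matrix[i][j] is the number of instances of actual class i that were
    classified as class j (rows = actual, columns = predicted).
    """
    n = len(matrix)
    results = []
    for k in range(n):
        tp = matrix[k][k]
        actual = sum(matrix[k])                          # row sum
        predicted = sum(matrix[i][k] for i in range(n))  # column sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        results.append(
            {"precision": precision, "recall": recall, "f_measure": f_measure}
        )
    return results


# Test 1: Multilayer Perceptron on breast-cancer.arff
labels = ["no-recurrence-events", "recurrence-events"]
test1_matrix = [
    [150, 51],
    [50, 35],
]

test1_metrics = per_class_metrics(test1_matrix)
for label, m in zip(labels, test1_metrics):
    print(f"{label}: precision={m['precision']:.3f} "
          f"recall={m['recall']:.3f} F={m['f_measure']:.3f}")
```

The function works unchanged on the three-class iris matrices, since it only relies on row and column sums.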
Detailed classification outcome: Test 2

| Algorithm | Naïve Bayes |
|---|---|
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 205 |
| Incorrectly classified instances | 81 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 168 | 33 | a = no-recurrence-events |
| 48 | 37 | b = recurrence-events |

Detailed classification outcome: Test 3

| Algorithm | J48 |
|---|---|
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 216 |
| Incorrectly classified instances | 70 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 193 | 8 | a = no-recurrence-events |
| 62 | 23 | b = recurrence-events |

Detailed classification outcome: Test 4

| Algorithm | Random Forest |
|---|---|
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 199 |
| Incorrectly classified instances | 87 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 175 | 26 | a = no-recurrence-events |
| 61 | 24 | b = recurrence-events |

Detailed classification outcome: Test 5

| Algorithm | REPTree |
|---|---|
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 202 |
| Incorrectly classified instances | 84 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 183 | 18 | a = no-recurrence-events |
| 66 | 19 | b = recurrence-events |

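The breast-cancer.arff classes are imbalanced: the rows of each confusion matrix above sum to 201 and 85 instances. A useful reference point is therefore the accuracy obtained by always predicting the majority class, which is what Weka's ZeroR baseline does. The sketch below compares the five reported results against that baseline; the variable names are illustrative.

```python
# Class sizes are the row sums of the confusion matrices in Tests 1-5.
class_counts = {"no-recurrence-events": 201, "recurrence-events": 85}

# Correctly classified instances reported in Tests 1-5.
correct_counts = {
    "Multilayer Perceptron": 185,
    "Naive Bayes": 205,
    "J48": 216,
    "Random Forest": 199,
    "REPTree": 202,
}

total = sum(class_counts.values())
baseline_correct = max(class_counts.values())  # always predict the majority class
baseline_accuracy = baseline_correct / total

beats_baseline = {
    name: correct > baseline_correct
    for name, correct in correct_counts.items()
}

print(f"Majority-class baseline: {100 * baseline_accuracy:.2f}%")
for name, correct in correct_counts.items():
    verdict = "above" if beats_baseline[name] else "not above"
    print(f"{name}: {100 * correct / total:.2f}% ({verdict} baseline)")
```

On these figures, Multilayer Perceptron and Random Forest do not exceed the majority-class baseline on this dataset, which is worth discussing in the report alongside the raw accuracies.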
Detailed classification outcome: Test 6

| Algorithm | Multilayer Perceptron |
|---|---|
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 579 |
| Incorrectly classified instances | 189 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 416 | 84 | a = tested negative |
| 105 | 163 | b = tested positive |

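Weka's summary output also includes the Kappa statistic, which corrects the observed agreement for the agreement expected by chance. A minimal sketch of Cohen's kappa, computed from the Test 6 confusion matrix, is given below. The function name `cohen_kappa` is illustrative, and rows are assumed to be actual classes with columns as predictions.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows = actual)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(matrix[k][k] for k in range(n)) / total

    # Chance agreement: sum over classes of (row share * column share).
    expected = sum(
        sum(matrix[k]) * sum(matrix[i][k] for i in range(n))
        for k in range(n)
    ) / (total * total)

    if expected == 1.0:
        return 0.0  # degenerate case: only one class actually occurs
    return (observed - expected) / (1.0 - expected)


# Test 6: Multilayer Perceptron on diabetes.arff
test6_matrix = [
    [416, 84],
    [105, 163],
]

test6_kappa = cohen_kappa(test6_matrix)
print(f"Kappa: {test6_kappa:.4f}")
```

A kappa near 0 means the classifier does little better than chance given the class distribution, while a value near 1 means almost perfect agreement.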
Detailed classification outcome: Test 7

| Algorithm | Naïve Bayes |
|---|---|
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 586 |
| Incorrectly classified instances | 182 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 422 | 78 | a = tested negative |
| 104 | 164 | b = tested positive |

Detailed classification outcome: Test 8

| Algorithm | J48 |
|---|---|
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 567 |
| Incorrectly classified instances | 201 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 407 | 93 | a = tested negative |
| 108 | 160 | b = tested positive |

Detailed classification outcome: Test 9

| Algorithm | Random Forest |
|---|---|
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 582 |
| Incorrectly classified instances | 186 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 418 | 82 | a = tested negative |
| 104 | 164 | b = tested positive |

Detailed classification outcome: Test 10

| Algorithm | REPTree |
|---|---|
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 578 |
| Incorrectly classified instances | 190 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 423 | 77 | a = tested negative |
| 113 | 155 | b = tested positive |

Detailed classification outcome: Test 11

| Algorithm | Multilayer Perceptron |
|---|---|
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 146 |
| Incorrectly classified instances | 4 |

Confusion Matrix

| a | b | c | Classified as |
|---|---|---|---|
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 48 | 2 | b = Iris-versicolor |
| 0 | 2 | 48 | c = Iris-virginica |

Detailed classification outcome: Test 12

| Algorithm | Naïve Bayes |
|---|---|
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 144 |
| Incorrectly classified instances | 6 |

Confusion Matrix

| a | b | c | Classified as |
|---|---|---|---|
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 48 | 2 | b = Iris-versicolor |
| 0 | 4 | 46 | c = Iris-virginica |

Detailed classification outcome: Test 13

| Algorithm | Random Forest |
|---|---|
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 143 |
| Incorrectly classified instances | 7 |

Confusion Matrix

| a | b | c | Classified as |
|---|---|---|---|
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 47 | 3 | b = Iris-versicolor |
| 0 | 4 | 46 | c = Iris-virginica |

Detailed classification outcome: Test 14

| Algorithm | J48 (pruned tree) |
|---|---|
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 144 |
| Incorrectly classified instances | 6 |

Confusion Matrix

| a | b | c | Classified as |
|---|---|---|---|
| 49 | 1 | 0 | a = Iris-setosa |
| 0 | 47 | 3 | b = Iris-versicolor |
| 0 | 2 | 48 | c = Iris-virginica |

Detailed classification outcome: Test 15

| Algorithm | REPTree |
|---|---|
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 141 |
| Incorrectly classified instances | 9 |

Confusion Matrix

| a | b | c | Classified as |
|---|---|---|---|
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 46 | 4 | b = Iris-versicolor |
| 0 | 5 | 45 | c = Iris-virginica |

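For a three-class problem, the off-diagonal cells of the confusion matrix show which classes are mistaken for one another. The sketch below lists those cells for the Test 15 matrix, assuming rows are actual classes and columns are predicted classes. The helper name `misclassifications` is illustrative.

```python
def misclassifications(matrix, labels):
    """Return (actual, predicted, count) for every non-zero off-diagonal cell."""
    n = len(matrix)
    return [
        (labels[i], labels[j], matrix[i][j])
        for i in range(n)
        for j in range(n)
        if i != j and matrix[i][j] > 0
    ]


iris_labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# Test 15: confusion matrix on iris.arff
test15_matrix = [
    [50, 0, 0],
    [0, 46, 4],
    [0, 5, 45],
]

test15_errors = misclassifications(test15_matrix, iris_labels)
for actual, predicted, count in test15_errors:
    print(f"{count} x {actual} classified as {predicted}")
```

In this matrix every error is a confusion between Iris-versicolor and Iris-virginica, while Iris-setosa is always classified correctly.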