Using three files (breast-cancer.arff, diabetes.arff and iris.arff) in Weka 3.8
Task 1: Classification performance evaluation [10 marks]
In this comparative analysis task, you are required to evaluate the classification performance of five algorithms on three datasets using Weka. Load the breast-cancer.arff, diabetes.arff and iris.arff datasets into Weka one at a time and run each of the algorithms below with its default settings. Then collect the 10-fold cross-validation classification results for quantitative evaluation.
a. MultilayerPerceptron
b. Naive Bayes
c. J48
d. RandomForest
e. REPTree
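To make the 10-fold cross-validation step concrete, here is a minimal sketch of how the instances are partitioned: each instance lands in exactly one test fold, and the model is trained on the other nine folds. The function names are hypothetical, and the sketch assumes a plain round-robin split; Weka additionally shuffles and stratifies the folds, which is not reproduced here.

```python
def k_fold_indices(n_instances, k=10):
    """Assign indices 0..n_instances-1 to k folds of near-equal size (round-robin)."""
    folds = [[] for _ in range(k)]
    for i in range(n_instances):
        folds[i % k].append(i)
    return folds


def cross_validation_splits(n_instances, k=10):
    """Yield (train_indices, test_indices) once per fold."""
    folds = k_fold_indices(n_instances, k)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


# breast-cancer.arff has 286 instances, so the folds cannot all be the same size.
splits = list(cross_validation_splits(286, k=10))
print([len(test) for _, test in splits])
```

Because every instance is tested exactly once, the counts of correctly and incorrectly classified instances over all ten folds add up to the full dataset size.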
You need to write a report that compares the performance of these algorithms on the datasets. The report should contain a quantitative comparison of classification accuracy in terms of the confusion matrix and the other performance metrics reported by Weka. Include the necessary screenshots, tables, graphs, etc. to make your report comprehensive and to reveal insightful details of the performance comparison.
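As an illustration of how the "other performance metrics" follow from a confusion matrix, the sketch below derives accuracy and per-class precision, recall and F1. It assumes Weka's layout (rows are actual classes, columns are predicted classes) and uses the Multilayer Perceptron matrix reported for breast-cancer.arff in this answer; the function name is hypothetical.

```python
def per_class_metrics(matrix, labels):
    """Return overall accuracy and per-class precision, recall and F1.

    matrix[i][j] counts instances of actual class i predicted as class j.
    """
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(size))
    per_class = {}
    for c, label in enumerate(labels):
        true_pos = matrix[c][c]
        actual = sum(matrix[c])                             # row sum
        predicted = sum(matrix[r][c] for r in range(size))  # column sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}
    return correct / total, per_class


# Multilayer Perceptron on breast-cancer.arff (values taken from this report).
mlp_breast_cancer = [[150, 51], [50, 35]]
accuracy, metrics = per_class_metrics(
    mlp_breast_cancer, ["no-recurrence-events", "recurrence-events"]
)
print(f"accuracy = {accuracy:.4f}")
for label, values in metrics.items():
    print(label, {name: round(value, 3) for name, value in values.items()})
```

Per-class figures matter here because overall accuracy can hide a weak result on the smaller class.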
Comparative Classification Accuracy Using Weka

| Algorithm | breast-cancer.arff | diabetes.arff | iris.arff |
| --- | --- | --- | --- |
| Multilayer Perceptron | 64.68% | 75.39% | 97.33% |
| Naïve Bayes | 71.68% | 76.30% | 96.00% |
| J48 | 75.52% | 73.83% | 96.00% |
| Random Forest | 69.58% | 75.78% | 95.33% |
| REPTree | 70.63% | 75.26% | 94.00% |
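The accuracy percentages can be recomputed from the correctly classified instance counts listed in the detailed outcomes. The sketch below does that, picks the most accurate algorithm per dataset and averages each algorithm over the three datasets. The dictionary layout and function names are assumptions made for illustration, and the unweighted mean is only one possible way to summarise across datasets.

```python
INSTANCES = {"breast-cancer.arff": 286, "diabetes.arff": 768, "iris.arff": 150}

# Correctly classified instances per algorithm and dataset, from the detailed outcomes.
CORRECT = {
    "Multilayer Perceptron": {"breast-cancer.arff": 185, "diabetes.arff": 579, "iris.arff": 146},
    "Naive Bayes": {"breast-cancer.arff": 205, "diabetes.arff": 586, "iris.arff": 144},
    "J48": {"breast-cancer.arff": 216, "diabetes.arff": 567, "iris.arff": 144},
    "Random Forest": {"breast-cancer.arff": 199, "diabetes.arff": 582, "iris.arff": 143},
    "REPTree": {"breast-cancer.arff": 202, "diabetes.arff": 578, "iris.arff": 141},
}


def accuracy_percent(algorithm, dataset):
    """Accuracy in percent for one algorithm on one dataset."""
    return 100.0 * CORRECT[algorithm][dataset] / INSTANCES[dataset]


def best_per_dataset():
    """Map each dataset to the algorithm with the most correct classifications."""
    return {d: max(CORRECT, key=lambda a: CORRECT[a][d]) for d in INSTANCES}


def mean_accuracy(algorithm):
    """Unweighted mean accuracy (percent) over the three datasets."""
    return sum(accuracy_percent(algorithm, d) for d in INSTANCES) / len(INSTANCES)


for dataset, winner in best_per_dataset().items():
    print(f"{dataset}: {winner} ({accuracy_percent(winner, dataset):.2f}%)")
for algorithm in CORRECT:
    print(f"{algorithm}: mean {mean_accuracy(algorithm):.2f}%")
```

No single algorithm wins on all three datasets, which is why the per-dataset view is reported alongside the average.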
Detailed classification outcome: Test 1

| Property | Value |
| --- | --- |
| Algorithm | Multilayer Perceptron |
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 185 |
| Incorrectly classified instances | 101 |

Confusion Matrix

| a | b | Classified as |
| --- | --- | --- |
| 150 | 51 | a = no-recurrence-events |
| 50 | 35 | b = recurrence-events |
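Weka also reports a Kappa statistic, which corrects accuracy for the agreement expected by chance. As a hedged sketch (the function name is hypothetical, and the formula is the standard Cohen's kappa rather than anything taken from Weka's source), it can be computed from the Test 1 confusion matrix like this:

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows actual, columns predicted)."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: product of the row and column marginals for each class.
    expected = sum(
        sum(matrix[c]) * sum(matrix[r][c] for r in range(size))
        for c in range(size)
    ) / (total * total)
    if expected == 1.0:
        return 0.0  # degenerate case: every instance in one class and one prediction
    return (observed - expected) / (1.0 - expected)


# Multilayer Perceptron on breast-cancer.arff (Test 1).
print(round(cohen_kappa([[150, 51], [50, 35]]), 4))
```

A kappa close to zero means the classifier is doing little better than chance once the class distribution is taken into account, even when raw accuracy looks moderate.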
Detailed classification outcome: Test 2

| Property | Value |
| --- | --- |
| Algorithm | Naïve Bayes |
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 205 |
| Incorrectly classified instances | 81 |

Confusion Matrix

| a | b | Classified as |
| --- | --- | --- |
| 168 | 33 | a = no-recurrence-events |
| 48 | 37 | b = recurrence-events |
Detailed classification outcome: Test 3

| Property | Value |
| --- | --- |
| Algorithm | J48 |
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 216 |
| Incorrectly classified instances | 70 |

Confusion Matrix

| a | b | Classified as |
| --- | --- | --- |
| 193 | 8 | a = no-recurrence-events |
| 62 | 23 | b = recurrence-events |
Detailed classification outcome: Test 4

| Property | Value |
| --- | --- |
| Algorithm | Random Forest |
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 199 |
| Incorrectly classified instances | 87 |

Confusion Matrix

| a | b | Classified as |
| --- | --- | --- |
| 175 | 26 | a = no-recurrence-events |
| 61 | 24 | b = recurrence-events |
Detailed classification outcome: Test 5

| Property | Value |
| --- | --- |
| Algorithm | REPTree |
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 202 |
| Incorrectly classified instances | 84 |

Confusion Matrix

| a | b | Classified as |
| --- | --- | --- |
| 183 | 18 | a = no-recurrence-events |
| 66 | 19 | b = recurrence-events |
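Since breast-cancer.arff is imbalanced (201 instances of class a against 85 of class b, going by the row sums of the matrices above), it helps to compare each accuracy against a majority-class baseline, i.e. always predicting class a, which is what Weka's ZeroR classifier does. The sketch below is an illustration under that assumption; the names are hypothetical.

```python
# Class distribution of breast-cancer.arff, read off the confusion matrix row sums.
CLASS_COUNTS = {"no-recurrence-events": 201, "recurrence-events": 85}

# Correctly classified instances from Tests 1 to 5.
CORRECT_BREAST_CANCER = {
    "Multilayer Perceptron": 185,
    "Naive Bayes": 205,
    "J48": 216,
    "Random Forest": 199,
    "REPTree": 202,
}


def majority_baseline(class_counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(class_counts.values()) / sum(class_counts.values())


def beats_baseline(correct, class_counts):
    """Map each algorithm to True if its accuracy exceeds the majority baseline."""
    total = sum(class_counts.values())
    baseline = majority_baseline(class_counts)
    return {name: count / total > baseline for name, count in correct.items()}


print(f"baseline = {100 * majority_baseline(CLASS_COUNTS):.2f}%")
for name, better in beats_baseline(CORRECT_BREAST_CANCER, CLASS_COUNTS).items():
    print(f"{name}: {'above' if better else 'at or below'} baseline")
```

On this dataset the margin over the baseline is small for every algorithm, so the recall on recurrence-events is arguably a more informative figure than accuracy alone.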
Detailed classification outcome: Test 6

| Property | Value |
| --- | --- |
| Algorithm | Multilayer Perceptron |
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 579 |
| Incorrectly classified instances | 189 |

Confusion Matrix

| a | b | Classified as |
| --- | --- | --- |
| 416 | 84 | a = tested negative |
| 105 | 163 | b = tested positive |
Detailed classification outcome: Test 7

| Property | Value |
| --- | --- |
| Algorithm | Naïve Bayes |
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 586 |
| Incorrectly classified instances | 182 |

Confusion Matrix

| a | b | Classified as |
| --- | --- | --- |
| 422 | 78 | a = tested negative |
| 104 | 164 | b = tested positive |
Detailed classification outcome: Test 8

| Property | Value |
| --- | --- |
| Algorithm | J48 |
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 567 |
| Incorrectly classified instances | 201 |

Confusion Matrix

| a | b | Classified as |
| --- | --- | --- |
| 407 | 93 | a = tested negative |
| 108 | 160 | b = tested positive |
Detailed classification outcome: Test 9

| Property | Value |
| --- | --- |
| Algorithm | Random Forest |
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 582 |
| Incorrectly classified instances | 186 |

Confusion Matrix

| a | b | Classified as |
| --- | --- | --- |
| 418 | 82 | a = tested negative |
| 104 | 164 | b = tested positive |
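When numbers are copied by hand from Weka's output into a report, a quick consistency check catches transcription slips: the matrix entries should sum to the instance count, the diagonal should equal the correctly classified count, and the off-diagonal cells should equal the incorrectly classified count. The sketch below applies such a check to the Test 9 figures; the function name and message wording are hypothetical.

```python
def check_report(matrix, correct, incorrect, instances):
    """Return a list of inconsistencies between a confusion matrix and reported counts."""
    problems = []
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    if total != instances:
        problems.append(f"matrix sums to {total}, expected {instances} instances")
    if diagonal != correct:
        problems.append(f"diagonal sums to {diagonal}, expected {correct} correct")
    if total - diagonal != incorrect:
        problems.append(f"off-diagonal sums to {total - diagonal}, expected {incorrect} incorrect")
    if correct + incorrect != instances:
        problems.append("correct + incorrect does not equal the instance count")
    return problems


# Random Forest on diabetes.arff (Test 9): an empty list means the figures agree.
print(check_report([[418, 82], [104, 164]], correct=582, incorrect=186, instances=768))
```

The same call can be repeated for every test in the report with its own matrix and counts.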
Detailed classification outcome: Test 10

| Property | Value |
| --- | --- |
| Algorithm | REPTree |
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 578 |
| Incorrectly classified instances | 190 |

Confusion Matrix

| a | b | Classified as |
| --- | --- | --- |
| 423 | 77 | a = tested negative |
| 113 | 155 | b = tested positive |
Detailed classification outcome: Test 11

| Property | Value |
| --- | --- |
| Algorithm | Multilayer Perceptron |
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 146 |
| Incorrectly classified instances | 4 |

Confusion Matrix

| a | b | c | Classified as |
| --- | --- | --- | --- |
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 48 | 2 | b = Iris-versicolor |
| 0 | 2 | 48 | c = Iris-virginica |
Detailed classification outcome: Test 12

| Property | Value |
| --- | --- |
| Algorithm | Naïve Bayes |
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 144 |
| Incorrectly classified instances | 6 |

Confusion Matrix

| a | b | c | Classified as |
| --- | --- | --- | --- |
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 48 | 2 | b = Iris-versicolor |
| 0 | 4 | 46 | c = Iris-virginica |
Detailed classification outcome: Test 13

| Property | Value |
| --- | --- |
| Algorithm | Random Forest |
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 143 |
| Incorrectly classified instances | 7 |

Confusion Matrix

| a | b | c | Classified as |
| --- | --- | --- | --- |
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 47 | 3 | b = Iris-versicolor |
| 0 | 4 | 46 | c = Iris-virginica |
Detailed classification outcome: Test 14

| Property | Value |
| --- | --- |
| Algorithm | J48 |
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 144 |
| Incorrectly classified instances | 6 |

Confusion Matrix

| a | b | c | Classified as |
| --- | --- | --- | --- |
| 49 | 1 | 0 | a = Iris-setosa |
| 0 | 47 | 3 | b = Iris-versicolor |
| 0 | 2 | 48 | c = Iris-virginica |
Detailed classification outcome: Test 15

| Property | Value |
| --- | --- |
| Algorithm | REPTree |
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 141 |
| Incorrectly classified instances | 9 |

Confusion Matrix

| a | b | c | Classified as |
| --- | --- | --- | --- |
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 46 | 4 | b = Iris-versicolor |
| 0 | 5 | 45 | c = Iris-virginica |
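For the three-class iris results, the off-diagonal cells show which classes get mixed up. The sketch below lists the confused (actual, predicted) pairs for the Test 15 matrix, largest first; the function name is hypothetical and the approach is just one simple way to read a multi-class confusion matrix.

```python
def confused_pairs(matrix, labels):
    """Return (count, actual, predicted) for every non-zero off-diagonal cell, largest first."""
    pairs = []
    for actual_idx, row in enumerate(matrix):
        for predicted_idx, count in enumerate(row):
            if actual_idx != predicted_idx and count > 0:
                pairs.append((count, labels[actual_idx], labels[predicted_idx]))
    return sorted(pairs, reverse=True)


IRIS_LABELS = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# Confusion matrix from Test 15 on the iris dataset.
test_15_iris = [[50, 0, 0], [0, 46, 4], [0, 5, 45]]
for count, actual, predicted in confused_pairs(test_15_iris, IRIS_LABELS):
    print(f"{count} x {actual} classified as {predicted}")
```

In this matrix all errors fall between Iris-versicolor and Iris-virginica, while Iris-setosa is separated cleanly.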