Question

USING 3 FILES breast-cancer.arff, diabetes.arff and iris.arff IN WEKA 3.8

Task 1: Classification performance evaluation [10 marks]

In this comparative analysis task, you are required to evaluate the classification performance of five algorithms on three datasets using Weka. Load the breast-cancer.arff, diabetes.arff and iris.arff datasets into Weka one at a time and run each of the algorithms below with its default settings. Then collect the 10-fold cross-validation classification results for quantitative evaluation.

a. MultilayerPerceptron

b. Naive Bayes

c. J48

d. RandomForest

e. REPTree

You need to write a report that shows a performance comparison of these algorithms on the datasets. The report should contain a quantitative comparison of classification accuracy in terms of the confusion matrix and the other performance metrics used in Weka. Include the necessary screenshots, tables, graphs, etc. to make your report comprehensive and to reveal insightful details of the performance comparison.

Solutions

Expert Solution

Comparative Classification Accuracy Using Weka

| Algorithm / Dataset | breast-cancer.arff | diabetes.arff | iris.arff |
|---|---|---|---|
| Multilayer Perceptron | 64.69% | 75.39% | 97.33% |
| Naïve Bayes | 71.68% | 76.30% | 96.00% |
| J48 | 75.52% | 73.83% | 96.00% |
| Random Forest | 69.58% | 75.78% | 95.33% |
| REPTree | 70.63% | 75.26% | 94.00% |
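
Each accuracy figure is the number of correctly classified instances divided by the total number of instances. As a minimal sketch, the Python snippet below recomputes the percentages from the counts listed in the detailed outcomes (Tests 1 to 15) and picks the top scorer per dataset. The dictionary keys and function names are illustrative labels chosen for this sketch, not Weka identifiers.

```python
# Instance totals and correctly classified counts, copied from the
# detailed outcomes (Tests 1-15) in this answer.
TOTALS = {"breast-cancer": 286, "diabetes": 768, "iris": 150}
CORRECT = {
    "MultilayerPerceptron": {"breast-cancer": 185, "diabetes": 579, "iris": 146},
    "NaiveBayes": {"breast-cancer": 205, "diabetes": 586, "iris": 144},
    "J48": {"breast-cancer": 216, "diabetes": 567, "iris": 144},
    "RandomForest": {"breast-cancer": 199, "diabetes": 582, "iris": 143},
    "REPTree": {"breast-cancer": 202, "diabetes": 578, "iris": 141},
}


def accuracy_percent(correct: int, total: int) -> float:
    """Percentage of instances classified correctly."""
    return 100.0 * correct / total


def best_per_dataset(correct, totals):
    """Return {dataset: (algorithm, accuracy %)} for the top scorer."""
    best = {}
    for dataset, total in totals.items():
        winner = max(correct, key=lambda algo: correct[algo][dataset])
        best[dataset] = (winner, accuracy_percent(correct[winner][dataset], total))
    return best


if __name__ == "__main__":
    for dataset, (algo, acc) in best_per_dataset(CORRECT, TOTALS).items():
        print(f"{dataset}: {algo} ({acc:.2f}%)")
```

On these counts, J48 scores highest on breast-cancer, Naïve Bayes on diabetes and the Multilayer Perceptron on iris, although the gaps on diabetes and iris are only a few instances wide.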

Detailed classification outcome: Test 1

| Algorithm | Multilayer Perceptron |
|---|---|
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 185 |
| Incorrectly classified instances | 101 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 150 | 51 | a = no-recurrence-events |
| 50 | 35 | b = recurrence-events |
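
Accuracy alone hides how each class is handled, so the per-class metrics that Weka lists (TP rate, FP rate, precision, recall, F-measure) are worth deriving from the confusion matrix. The sketch below does this for a two-class matrix, assuming Weka's layout of actual classes in rows and predicted classes in columns. The function name `binary_metrics` and its `positive` parameter are hypothetical names used only for illustration.

```python
def binary_metrics(matrix, positive=0):
    """Metrics for one class of a 2x2 confusion matrix.

    Rows are actual classes and columns are predicted classes.
    `positive` is the row/column index of the class being scored.
    """
    negative = 1 - positive
    tp = matrix[positive][positive]
    fn = matrix[positive][negative]
    fp = matrix[negative][positive]
    tn = matrix[negative][negative]

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # also the TP rate
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "fp_rate": fp_rate,
        "f_measure": f_measure,
    }


# Test 1: Multilayer Perceptron on breast-cancer.arff
TEST_1 = [[150, 51],
          [50, 35]]

if __name__ == "__main__":
    for index, label in enumerate(["a", "b"]):
        print(label, binary_metrics(TEST_1, positive=index))
```

For Test 1, recall is about 0.75 for class a but only about 0.41 for class b (recurrence-events), so the overall accuracy is carried mainly by the majority class.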

Detailed classification outcome: Test 2

| Algorithm | Naïve Bayes |
|---|---|
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 205 |
| Incorrectly classified instances | 81 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 168 | 33 | a = no-recurrence-events |
| 48 | 37 | b = recurrence-events |

Detailed classification outcome: Test 3

| Algorithm | J48 |
|---|---|
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 216 |
| Incorrectly classified instances | 70 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 193 | 8 | a = no-recurrence-events |
| 62 | 23 | b = recurrence-events |

Detailed classification outcome: Test 4

| Algorithm | Random Forest |
|---|---|
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 199 |
| Incorrectly classified instances | 87 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 175 | 26 | a = no-recurrence-events |
| 61 | 24 | b = recurrence-events |

Detailed classification outcome: Test 5

| Algorithm | REPTree |
|---|---|
| Dataset | breast-cancer.arff |
| Instances | 286 |
| Attributes | 10 |
| Correctly classified instances | 202 |
| Incorrectly classified instances | 84 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 183 | 18 | a = no-recurrence-events |
| 66 | 19 | b = recurrence-events |

Detailed classification outcome: Test 6

| Algorithm | Multilayer Perceptron |
|---|---|
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 579 |
| Incorrectly classified instances | 189 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 416 | 84 | a = tested_negative |
| 105 | 163 | b = tested_positive |

Detailed classification outcome: Test 7

| Algorithm | Naïve Bayes |
|---|---|
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 586 |
| Incorrectly classified instances | 182 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 422 | 78 | a = tested_negative |
| 104 | 164 | b = tested_positive |
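
Weka's summary output also includes a Kappa statistic, which corrects the observed agreement for the agreement expected by chance. The sketch below computes Cohen's kappa in its standard form from a confusion matrix, using the Naïve Bayes diabetes matrix from Test 7 as input. The function name `cohen_kappa` is an illustrative choice, and the code assumes the matrix is not degenerate (chance agreement below 1).

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix.

    Rows are actual classes, columns are predicted classes.
    Assumes expected (chance) agreement is strictly below 1.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: product of row and column marginals per class.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Test 7: Naive Bayes on diabetes.arff
TEST_7 = [[422, 78],
          [104, 164]]

if __name__ == "__main__":
    print(f"kappa = {cohen_kappa(TEST_7):.4f}")
```

A kappa near 0 means the classifier does little better than chance and a value near 1 means near-perfect agreement, so it is a useful companion to raw accuracy on imbalanced data such as diabetes.arff (500 negative versus 268 positive instances).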

Detailed classification outcome: Test 8

| Algorithm | J48 |
|---|---|
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 567 |
| Incorrectly classified instances | 201 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 407 | 93 | a = tested_negative |
| 108 | 160 | b = tested_positive |

Detailed classification outcome: Test 9

| Algorithm | Random Forest |
|---|---|
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 582 |
| Incorrectly classified instances | 186 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 418 | 82 | a = tested_negative |
| 104 | 164 | b = tested_positive |

Detailed classification outcome: Test 10

| Algorithm | REPTree |
|---|---|
| Dataset | diabetes.arff |
| Instances | 768 |
| Attributes | 9 |
| Correctly classified instances | 578 |
| Incorrectly classified instances | 190 |

Confusion Matrix

| a | b | Classified as |
|---|---|---|
| 423 | 77 | a = tested_negative |
| 113 | 155 | b = tested_positive |

Detailed classification outcome: Test 11

| Algorithm | Multilayer Perceptron |
|---|---|
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 146 |
| Incorrectly classified instances | 4 |

Confusion Matrix

| a | b | c | Classified as |
|---|---|---|---|
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 48 | 2 | b = Iris-versicolor |
| 0 | 2 | 48 | c = Iris-virginica |

Detailed classification outcome: Test 12

| Algorithm | Naive Bayes Classifier |
|---|---|
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 144 |
| Incorrectly classified instances | 6 |

Confusion Matrix

| a | b | c | Classified as |
|---|---|---|---|
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 48 | 2 | b = Iris-versicolor |
| 0 | 4 | 46 | c = Iris-virginica |

Detailed classification outcome: Test 13

| Algorithm | RandomForest |
|---|---|
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 143 |
| Incorrectly classified instances | 7 |

Confusion Matrix

| a | b | c | Classified as |
|---|---|---|---|
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 47 | 3 | b = Iris-versicolor |
| 0 | 4 | 46 | c = Iris-virginica |

Detailed classification outcome: Test 14

| Algorithm | J48 pruned tree |
|---|---|
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 144 |
| Incorrectly classified instances | 6 |

Confusion Matrix

| a | b | c | Classified as |
|---|---|---|---|
| 49 | 1 | 0 | a = Iris-setosa |
| 0 | 47 | 3 | b = Iris-versicolor |
| 0 | 2 | 48 | c = Iris-virginica |

Detailed classification outcome: Test 15

| Algorithm | REPTree |
|---|---|
| Dataset | iris.arff |
| Instances | 150 |
| Attributes | 5 |
| Correctly classified instances | 141 |
| Incorrectly classified instances | 9 |

Confusion Matrix

| a | b | c | Classified as |
|---|---|---|---|
| 50 | 0 | 0 | a = Iris-setosa |
| 0 | 46 | 4 | b = Iris-versicolor |
| 0 | 5 | 45 | c = Iris-virginica |
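
For the three-class iris results, a single summary figure comparable to the "Weighted Avg." row in Weka's per-class output can be obtained by weighting each class's F-measure by its number of instances. The sketch below shows one way to do that, using the iris confusion matrix from Test 15 as input. The function names are hypothetical and the code assumes actual classes in rows and predicted classes in columns.

```python
def per_class_f_measure(matrix):
    """F-measure of every class in a square confusion matrix."""
    n = len(matrix)
    scores = []
    for i in range(n):
        tp = matrix[i][i]
        actual = sum(matrix[i])                          # TP + FN
        predicted = sum(matrix[r][i] for r in range(n))  # TP + FP
        denom = actual + predicted                       # 2*TP + FN + FP
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def weighted_f_measure(matrix):
    """Average of per-class F-measures, weighted by class size."""
    total = sum(sum(row) for row in matrix)
    supports = [sum(row) for row in matrix]
    scores = per_class_f_measure(matrix)
    return sum(f * s for f, s in zip(scores, supports)) / total


# Test 15: iris.arff, classes a = setosa, b = versicolor, c = virginica
TEST_15 = [[50, 0, 0],
           [0, 46, 4],
           [0, 5, 45]]

if __name__ == "__main__":
    print([round(f, 4) for f in per_class_f_measure(TEST_15)])
    print(round(weighted_f_measure(TEST_15), 4))
```

Across all five iris matrices the errors fall almost entirely between versicolor and virginica, so per-class figures are more informative here than the overall accuracy.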

