Machine Learning with Weka Cornelia Caragea Thanks to Eibe Frank for some of the slides.

Page 1:

Machine Learning with Weka

Cornelia Caragea

Thanks to Eibe Frank for some of the slides

Page 2:

Outline

• Weka: A Machine Learning Toolkit

• Preparing Data

• Building Classifiers

• Interpreting Results

Page 3:

WEKA: the software

• Machine learning/data mining software written in Java (distributed under the GNU General Public License)

• Used for research, education, and applications

• Main features:
    • Comprehensive set of data pre-processing tools, learning algorithms and evaluation methods
    • Graphical user interfaces (incl. data visualization)
    • Environment for comparing learning algorithms

Page 4:

WEKA: versions

• There are several versions of WEKA:
    • WEKA: “book version” compatible with description in data mining book
    • WEKA: “GUI version” adds graphical user interfaces (book version is command-line only)
    • WEKA: “development version” with lots of improvements

Page 5:

WEKA: resources

• API Documentation, Tutorials, Source code

• WEKA mailing list

• Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations

• Weka-related Projects:
    • Weka-Parallel - parallel processing for Weka
    • RWeka - linking R and Weka
    • YALE - Yet Another Learning Environment
    • Many others…

Page 6:

Weka: web site

http://www.cs.waikato.ac.nz/ml/weka/

Page 7:

WEKA: launching

java -jar weka.jar

Page 8:

Outline

• Weka: A Machine Learning Toolkit

• Preparing Data

• Building Classifiers

• Interpreting Results

Page 9:

@relation heart-disease-simplified

@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}

@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
...

WEKA only deals with “flat” files
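
To make the structure of the ARFF file above concrete, here is a minimal sketch of a reader for it. This is not WEKA's own loader: the function name `parse_arff` and the returned layout are assumptions made for illustration, and the sketch handles only `numeric` and nominal attributes, `%` comments, and `?` for missing values.

```python
import csv

# The slide's example file, reproduced as a string (without the "..." row).
ARFF_TEXT = """\
@relation heart-disease-simplified

@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}

@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
"""


def parse_arff(text):
    """Parse a simple ARFF string into (relation, attributes, rows).

    attributes is a list of (name, kind) pairs, where kind is either the
    string "numeric" or a list of the allowed nominal values.
    Missing values ("?") are returned as None.
    """
    relation = None
    attributes = []
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            # Each data line is one comma-separated instance ("flat" file).
            values = [v.strip() for v in next(csv.reader([line]))]
            row = []
            for value, (_, kind) in zip(values, attributes):
                if value == "?":
                    row.append(None)
                elif kind == "numeric":
                    row.append(float(value))
                else:
                    row.append(value)
            rows.append(row)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                kind = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)
print(relation)
print(attributes[1])
print(rows[3])
```

Each instance is a single row with one value per declared attribute, which is what “flat” means here: there are no nested or relational structures.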

Page 11:

Explorer: pre-processing the data

• Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary

• Data can also be read from a URL or from an SQL database (using JDBC)

• Pre-processing tools in WEKA are called “filters”

• WEKA contains filters for:
    • Discretization, normalization, resampling, attribute selection, transforming and combining attributes, …
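
As a rough illustration of what two of these filters do, the sketch below applies min-max normalization and equal-width discretization to the cholesterol column of the heart-disease example. It is a plain-Python approximation, not WEKA's filter code; the function names `normalize` and `discretize` and the choice of three bins are assumptions for this example.

```python
def normalize(values):
    """Min-max scale numeric values into [0, 1]; None (missing) is kept."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    span = hi - lo
    return [
        None if v is None else (0.0 if span == 0 else (v - lo) / span)
        for v in values
    ]


def discretize(values, bins=3):
    """Equal-width binning: map each numeric value to a bin index 0..bins-1."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    width = (hi - lo) / bins
    result = []
    for v in values:
        if v is None:
            result.append(None)
        elif width == 0:
            result.append(0)
        else:
            # The maximum value would land in bin `bins`, so clamp it.
            result.append(min(int((v - lo) / width), bins - 1))
    return result


# Cholesterol values from the four example instances; None marks the "?".
cholesterol = [233.0, 286.0, 229.0, None]
print(normalize(cholesterol))
print(discretize(cholesterol, bins=3))
```

Both functions pass missing values through unchanged, so a later step can decide how to handle them.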

Page 12:

Import Datasets into WEKA

Page 23:

Outline

• Weka: A Machine Learning Toolkit

• Preparing Data

• Building Classifiers

• Interpreting Results

Page 24:

Explorer: building “classifiers”

• Classifiers in WEKA are models for predicting nominal or numeric quantities

• Implemented learning schemes include:
    • Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …

• “Meta”-classifiers include:
    • Bagging, boosting, stacking, error-correcting output codes, locally weighted learning, …
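
To show what one of these learning schemes does, here is a small sketch of an instance-based classifier (k-nearest neighbours) predicting a nominal class. It is written from scratch and is not WEKA's implementation; the function name `knn_predict` is an assumption, and the training data mixes the slide's example values with made-up rows purely for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=1):
    """Predict the class of `query` by majority vote among its k nearest
    training instances (Euclidean distance over numeric features)."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy data: (age, cholesterol) -> class. Rows 2, 3 and 6 are hypothetical
# and were added only to give each class three instances.
train = [
    ((63.0, 233.0), "not_present"),
    ((38.0, 210.0), "not_present"),
    ((45.0, 220.0), "not_present"),
    ((67.0, 286.0), "present"),
    ((67.0, 229.0), "present"),
    ((70.0, 300.0), "present"),
]

print(knn_predict(train, (69.0, 295.0), k=3))
print(knn_predict(train, (40.0, 212.0), k=1))
```

An instance-based classifier stores the training data and defers all work to prediction time, which is why there is no separate training function in this sketch.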

Page 41:

Outline

• Weka: A Machine Learning Toolkit

• Preparing Data

• Building Classifiers

• Interpreting Results

Page 42:

Understanding Output
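
The slide itself is only a title, so the following is a hedged sketch of two quantities that WEKA's classifier evaluation output reports: the fraction of correctly classified instances and a confusion matrix (rows are actual classes, columns are predicted classes). The function names and the prediction lists are hypothetical and chosen only for illustration.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a matrix with rows = actual class, columns = predicted class."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(actual, predicted):
    """Fraction of instances whose predicted class equals the actual class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical results for five test instances (not real WEKA output).
labels = ["present", "not_present"]
actual = ["present", "present", "not_present", "not_present", "present"]
predicted = ["present", "not_present", "not_present", "not_present", "present"]

print(accuracy(actual, predicted))
for row in confusion_matrix(actual, predicted, labels):
    print(row)
```

Off-diagonal entries of the matrix are the misclassifications; here one "present" instance was predicted as "not_present".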

Page 43:

Decision Tree Output (1)

Page 44:

Decision Tree Output (2)

Page 45:

To Do List

• Try Decision Tree, Naïve Bayes, and Logistic Regression classifiers on a different Weka dataset

• Use various parameters