Introduction to Data Mining and Machine Learning Techniques
Iza Moise, Evangelos Pournaras, Dirk Helbing
Overview
Main principles of data mining
• Definition
• Steps of a data mining process
• Supervised vs. unsupervised data mining
Applications
Data mining functionalities
Definition
Data mining is the automated process of discovering interesting (non-trivial, previously unknown, insightful and potentially useful) information or patterns, as well as descriptive, understandable, and predictive models from large-scale data.
Also known as:
• Knowledge discovery in databases (KDD)
• Data analysis
• Information harvesting
• Business intelligence
Goals
• search for consistent patterns and/or systematic relationships in the data
• validate the findings by applying the detected patterns to new subsets of data
Data Mining is...
• k-means clustering
• decision trees
• neural networks
• Bayesian networks
Data Mining is not...
• Data warehousing
• SQL / Ad Hoc Queries / Reporting
• Software Agents
• Online Analytical Processing (OLAP)
• Data Visualization
Knowledge Discovery Process
Steps of a data mining process
1. Exploration
2. Model building and validation
3. Deployment
Exploration
• data preparation
→ data cleaning, data transformation etc.
• elaborate exploratory analyses using a wide variety of graphical and statistical methods, depending on the nature of the analytic problem
• determine the general nature of models that can be taken into account in the next stage
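A minimal sketch of the exploration step, assuming a small made-up list of records: the field names `age` and `income`, the values, and the helper names `clean`, `min_max_scale` and `summarize` are all illustrative assumptions, not part of the lecture. It drops incomplete rows (cleaning), rescales one attribute to [0, 1] (transformation) and computes a few summary statistics.

```python
import statistics

# Hypothetical raw records; None marks a missing value.
raw = [
    {"age": 25, "income": 30000},
    {"age": 40, "income": None},
    {"age": 35, "income": 50000},
    {"age": 50, "income": 70000},
]

def clean(records):
    """Data cleaning: keep only records with no missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def min_max_scale(values):
    """Data transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def summarize(values):
    """Exploratory statistics: mean, median and sample standard deviation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

cleaned = clean(raw)
incomes = [r["income"] for r in cleaned]
scaled_incomes = min_max_scale(incomes)
summary = summarize(incomes)
print(len(cleaned), scaled_incomes, summary)
```

Dropping incomplete rows is only one possible cleaning policy; imputing missing values would be an equally reasonable choice, depending on the analytic problem.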
Model building and validation, and Deployment
• Model building and validation:
→ consider various models and choose the best one
→ validate the model on existing data
• Deployment:
→ apply the model to new data
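The two stages above can be sketched as follows. This is an illustration under stated assumptions, not the lecturers' method: the data are invented, the two candidate models (a constant mean predictor and a least-squares line) are arbitrary choices, and "best" is taken to mean lowest mean squared error on a held-out validation split.

```python
def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Least-squares straight line y = a * x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return lambda x: a * x + b

def mse(model, xs, ys):
    """Mean squared error of a model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical existing data, split into training and validation parts.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
train_x, train_y = xs[::2], ys[::2]
valid_x, valid_y = xs[1::2], ys[1::2]

# Model building and validation: fit each candidate, score it on held-out data.
candidates = {"mean": fit_mean, "line": fit_line}
models = {name: fit(train_x, train_y) for name, fit in candidates.items()}
scores = {name: mse(model, valid_x, valid_y) for name, model in models.items()}
best_name = min(scores, key=scores.get)

# Deployment: apply the chosen model to a new, previously unseen input.
prediction = models[best_name](10)
print(best_name, scores, prediction)
```

Scoring on data the model was not fitted to is what makes the comparison meaningful; scoring on the training part alone would favour models that merely memorise it.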
Two styles of data mining: Predictive and Descriptive
1. Predictive
→ Predict the value of a specific attribute (target or dependent) based on the values of other attributes (explanatory)
2. Descriptive
→ Derive patterns that summarise the relationships in the data
Supervised vs. Unsupervised
Supervised
• predictive or directed
• useful when you have a specific target value to predict about your data
• Classification, Regression, Anomaly Detection
Unsupervised
• descriptive or undirected
• finds hidden structure and relations within the data
• Clustering, Association, Feature Extraction
Supervised Data Mining
• a pre-specified target variable
• explanatory variables ↔ one (or more) dependent variables
• the algorithm is given many training examples for which the target is known
• the algorithm then identifies the “new” data points that match the model of each target value
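As one possible illustration of the points above, the sketch below uses a 1-nearest-neighbour classifier. The technique, the training points and the target labels `"low"` and `"high"` are assumptions chosen for brevity; the slides do not prescribe a particular algorithm here.

```python
import math

# Hypothetical training data: (explanatory variables, known target value).
training = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((5.0, 5.0), "high"),
    ((6.0, 5.5), "high"),
]

def predict(point, examples):
    """Return the target of the training example closest to `point`."""
    nearest = min(examples, key=lambda ex: math.dist(point, ex[0]))
    return nearest[1]

# "New" data points whose target is unknown.
new_points = [(1.2, 1.4), (5.5, 5.0)]
labels = [predict(p, training) for p in new_points]
print(labels)
```

Each new point receives the target value of its closest training example, which is the simplest way of matching new data to what was learned from labelled data.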
Unsupervised Data Mining
• determine the existence of classes or clusters in the data
• exploratory analysis
• all variables are treated in the same way
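A minimal sketch of how clusters can be found without any target variable, using k-means on one-dimensional data. The data values, the function name `kmeans_1d`, the fixed iteration count and the naive initialisation (the first k points) are simplifying assumptions made for this example.

```python
def kmeans_1d(points, k, iterations=10):
    """Plain k-means on one-dimensional data; no target variable is used."""
    centroids = list(points[:k])  # naive deterministic start: first k points
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: abs(p - centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = sum(members) / len(members)
    return centroids, labels

# Hypothetical measurements that form two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids, labels = kmeans_1d(points, k=2)
print(centroids, labels)
```

In practice the result depends on the starting centroids, so implementations usually try several random starts; the fixed start here only keeps the example deterministic.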
Fielded Applications
• Web mining
→ PageRank, Search Engine, Recommenders, Social Networks
• Screening Images
• Diagnosis
• Marketing and Sales
→ Market basket analysis, Direct marketing
• Biology, biomedicine, astronomy, chemistry
Beers and Diapers
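The beer-and-diapers story is an often-told anecdote used to illustrate market basket analysis. A minimal sketch of the two usual measures for a rule such as {diapers} → {beer}, assuming a handful of made-up transactions (the baskets below are invented and carry no empirical meaning):

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"beer", "diapers", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated probability of `consequent` given `antecedent`."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

rule_support = support({"diapers", "beer"}, transactions)
rule_confidence = confidence({"diapers"}, {"beer"}, transactions)
print(rule_support, rule_confidence)
```

Support says how often the items appear together at all; confidence says how often beer is in the basket once diapers are. A rule is usually considered interesting only when both exceed chosen thresholds.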
1. Supervised Data Mining
• Classification
• Regression
• Outlier detection
• Frequent pattern mining
2. Unsupervised Data Mining
• Clustering
• Feature Extraction
→ definition
→ real use-cases
→ method
→ pros and cons