Machine learning: an introduction
Copyright © 2015, SAS Institute Inc. All rights reserved.
Josefin Rosén, Senior Analytical Expert, SAS Institute
Twitter: @rosenjosefin
#SASFORUMSE

Agenda
What is machine learning?
When, where, and how is machine learning used?
Example: deep learning
Machine learning in SAS

Machine learning: what is it?
Wikipedia: Machine learning, a branch of artificial intelligence,
concerns the construction and study of systems that can learn
from data.
SAS: Machine learning is a branch of artificial intelligence that
automates the building of systems that learn from data, identify
patterns, and make decisions – with minimal human intervention.

What is what, really?
[Diagram of related fields: Databases, Statistics, Information Retrieval, AI, Computational Neuroscience, Data Mining, Data Science, Machine Learning, Pattern Recognition]

Machine learning: what is it?
"Complicated methods, but useful results"

When is machine learning used?
When the model's predictive accuracy matters more than its interpretability
When traditional approaches are a poor fit, for example when you have:
• more variables than observations
• many correlated variables
• unstructured data
• fundamentally nonlinear or unusual phenomena
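
As a rough illustration of the "more variables than observations" case, the sketch below fits ridge regression to two observations with three variables. Ordinary least squares has no unique solution here because the matrix XᵀX is singular; adding a penalty `lam` to its diagonal makes the system solvable. Ridge regression appears in the deck's list of supervised methods, but the data, the function names, and the value of `lam` are made up for this example and are not from the presentation.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def ridge_fit(X, y, lam):
    """Return weights w solving (X^T X + lam * I) w = X^T y."""
    n_vars = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) for j in range(n_vars)]
            for i in range(n_vars)]
    for i in range(n_vars):
        gram[i][i] += lam  # the ridge penalty makes the system non-singular
    rhs = [sum(row[i] * target for row, target in zip(X, y))
           for i in range(n_vars)]
    return solve(gram, rhs)


# Hypothetical data: 2 observations, 3 variables.
X = [[1.0, 2.0, 3.0], [2.0, 1.0, 0.0]]
y = [1.0, 2.0]
weights = ridge_fit(X, y, lam=1e-3)
predictions = [sum(w * v for w, v in zip(weights, row)) for row in X]
```

With a small penalty the fitted values stay close to the targets; a larger `lam` shrinks the weights more strongly and trades fit for stability.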

[Figure panels: Training data, Regression, Decision tree, Neural network]
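
One way to read this slide is that the same training data can be fit by different model families. The sketch below fits a least-squares line and a single-split decision tree (a stump) to a small made-up data set. The data and function names are hypothetical, and the neural network case is left out to keep the example short.

```python
def fit_line(xs, ys):
    """Least-squares line: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def fit_stump(xs, ys):
    """One-split regression tree: returns (threshold, left_mean, right_mean)."""
    best = None
    levels = sorted(set(xs))
    for low, high in zip(levels, levels[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


# Hypothetical step-shaped training data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
slope, intercept = fit_line(xs, ys)
threshold, left_mean, right_mean = fit_stump(xs, ys)
```

On step-shaped data like this, the stump captures the jump with one threshold, while the line spreads the change across the whole range.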

Where is machine learning used?
Some examples:
Recommendation applications
Fraud detection
Predictive maintenance
Text analytics
Pattern and image recognition
Google's self-driving car

Data Mining, Machine Learning
SUPERVISED LEARNING (know y)
Regression, LASSO regression, Logistic regression, Ridge regression
Decision tree, Gradient boosting, Random forests
Neural networks, SVM, Naïve Bayes, Neighbors, Gaussian processes
UNSUPERVISED LEARNING (don't know y)
A priori rules, Clustering
k-means clustering, Mean shift clustering, Spectral clustering
Kernel density estimation, Nonnegative matrix factorization, PCA
Kernel PCA, Sparse PCA
Singular value decomposition, SOM
SEMI-SUPERVISED LEARNING (sometimes know y)
Prediction and classification*, Clustering*, EM, TSVM, Manifold regularization, Autoencoders
Multilayer perceptron, Restricted Boltzmann machines
*In semi-supervised learning, supervised prediction and classification algorithms are often combined with clustering.
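
The split between "know y" and "don't know y" can be sketched with two small functions: a nearest-neighbor classifier that needs the labels y, and k-means clustering that sees only the inputs. Both methods appear in the deck's lists; the data, the labels, and the choice of initial cluster centers are made up for this example.

```python
def squared_distance(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def nearest_neighbor_predict(train_x, train_y, query):
    """Supervised: return the label y of the closest training point."""
    distances = [squared_distance(x, query) for x in train_x]
    return train_y[distances.index(min(distances))]


def k_means(points, initial_centers, iterations=10):
    """Unsupervised: Lloyd's algorithm, which uses no labels at all."""
    centers = [list(c) for c in initial_centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            distances = [squared_distance(p, c) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


# Hypothetical data: two well-separated groups.
X = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
y = ["a", "a", "a", "b", "b", "b"]

label = nearest_neighbor_predict(X, y, (1.5, 1.5))  # uses y
centers = k_means(X, [X[0], X[3]])                  # ignores y
```

A semi-supervised approach would sit between the two, using labels for the records that have them and structure in the unlabeled records for the rest.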

Deep learning
Deep learning: using neural networks with more than two hidden layers
Used successfully in pattern recognition, among other areas
Good at extracting features from a dataset
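
To make "more than two hidden layers" concrete, here is a minimal sketch of a forward pass through a network with three hidden layers. The layer sizes, the sigmoid activation, and the random weight initialization are assumptions chosen for the example; no training is performed.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, layers):
    """Return the activations of every layer, starting with the input."""
    activations = [x]
    for weights, biases in layers:
        x = [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
        activations.append(x)
    return activations


rng = random.Random(0)
sizes = [4, 8, 6, 4, 2]  # input, three hidden layers, output (hypothetical)
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
activations = forward([0.1, 0.5, 0.9, 0.3], layers)
hidden_features = activations[1:-1]  # outputs of the three hidden layers
```

The hidden-layer activations are what a trained deep network would expose as extracted features.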

MNIST training data
784 variables form a 28x28 pixel grid
784-dimensional input vector X = (x1,…,x784)
Grayscale values ranging from 0 to 255
60,000 labeled training images
10,000 test images without labels
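
The mapping from a 28x28 grid to a 784-dimensional input vector can be sketched as follows. The image here is synthetic rather than a real MNIST digit, the function name is hypothetical, and scaling the 0 to 255 grayscale values to the range 0 to 1 is a common preprocessing choice that the slides do not mention.

```python
def flatten_and_scale(image):
    """Turn a 28x28 grid of 0..255 values into a 784-long list of 0..1 values."""
    if len(image) != 28 or any(len(row) != 28 for row in image):
        raise ValueError("expected a 28x28 grid")
    return [pixel / 255.0 for row in image for pixel in row]


# Synthetic stand-in for one image: values cycle through 0..255.
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]
x = flatten_and_scale(image)  # x[0] is pixel 1, x[783] is pixel 784
```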

MNIST example
Train a stacked denoising autoencoder
Extract representative features from the MNIST data
Compare with PCA, using two principal components

Stacked denoising autoencoder
[Diagram: an input layer of partially corrupted input features, five hidden layers of hidden neurons (h1 to h5), and a target layer of uncorrupted output features. The extractable features come from the hidden layers.]

| Record ID | Pixel 1 | Pixel 2 | Pixel 3 | Pixel 4 | Pixel 5 | Pixel 6 | Pixel 7 | Pixel 8 | Pixel 9 | Pixel 10 | … |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 5 | 8 | 11 | 6 | 3 | … |
| 2 | 0 | 0 | 0 | 0 | 10 | 20 | 45 | 46 | 36 | 24 | … |
| 3 | 0 | 25 | 37 | 32 | 40 | 64 | 107 | 200 | 67 | 46 | … |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋱ |

| Record ID | Hidden Unit 1 | Hidden Unit 2 |
|---|---|---|
| 1 | 0.98754 | 0.32453 |
| 2 | 0.76854 | 0.87345 |
| 3 | 0.87435 | 0.05464 |
| ⋮ | ⋮ | ⋮ |

[Diagram: the input layer of partially corrupted input features followed by hidden layers h1 to h3, with the extractable features marked.]
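
The idea behind a denoising autoencoder can be sketched in a few lines: corrupt the input, train the network to reproduce the uncorrupted version, then read the hidden-unit outputs as features. The example below is a single-hidden-layer toy in plain Python, not the stacked network from the slides and not the SAS implementation. The data, layer sizes, masking-noise rate, learning rate, and number of epochs are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def corrupt(x, rate, rng):
    """Masking noise: set each input to 0 with probability `rate`."""
    return [0.0 if rng.random() < rate else v for v in x]


class DenoisingAutoencoder:
    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def decode(self, h):
        return [sigmoid(sum(w * v for w, v in zip(row, h)) + b)
                for row, b in zip(self.w2, self.b2)]

    def train_step(self, clean, noisy, lr):
        """One gradient step on squared error between output and CLEAN input."""
        h = self.encode(noisy)
        out = self.decode(h)
        d_out = [(o - t) * o * (1 - o) for o, t in zip(out, clean)]
        d_hid = [h[j] * (1 - h[j])
                 * sum(d_out[i] * self.w2[i][j] for i in range(len(out)))
                 for j in range(len(h))]
        for i, d in enumerate(d_out):
            for j in range(len(h)):
                self.w2[i][j] -= lr * d * h[j]
            self.b2[i] -= lr * d
        for j, d in enumerate(d_hid):
            for k in range(len(noisy)):
                self.w1[j][k] -= lr * d * noisy[k]
            self.b1[j] -= lr * d


def reconstruction_error(model, data):
    """Mean squared reconstruction error per record on uncorrupted inputs."""
    total = sum((o - t) ** 2
                for x in data
                for o, t in zip(model.decode(model.encode(x)), x))
    return total / len(data)


rng = random.Random(0)
data = [[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],  # hypothetical records
        [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
        [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
        [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]]
model = DenoisingAutoencoder(n_in=6, n_hidden=3, rng=rng)
error_before = reconstruction_error(model, data)
for _ in range(2000):
    for record in data:
        model.train_step(record, corrupt(record, 0.25, rng), lr=0.5)
error_after = reconstruction_error(model, data)
features = [model.encode(record) for record in data]  # one row per record
```

Each row of `features` plays the role of a "Hidden Unit 1, Hidden Unit 2, ..." row in the table above. A stacked version would repeat the same step, training a further autoencoder on the features produced by the previous one.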
Feature extraction – denoising autoencoder
Feature extraction - PCA
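
For comparison, extracting two principal components can be sketched with power iteration and deflation on the covariance matrix. This is a plain-Python illustration on made-up three-dimensional data, not the procedure used in the presentation. It assumes the data has at least two directions of nonzero variance and that the starting vector is not orthogonal to the leading components.

```python
import math


def center(data):
    """Subtract the column means from every record."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    return [[v - m for v, m in zip(row, means)] for row in data]


def covariance(centered):
    n, d = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(d)] for i in range(d)]


def leading_eigenpair(matrix, iterations=200):
    """Power iteration: returns (eigenvalue, unit eigenvector)."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(a * b for a, b in zip(v, mv)), v


def pca_two_components(data):
    """Return (components, scores) for the first two principal components."""
    centered = center(data)
    cov = covariance(centered)
    components = []
    for _ in range(2):
        value, vector = leading_eigenpair(cov)
        components.append(vector)
        # Deflation: remove the component just found from the matrix.
        cov = [[cov[i][j] - value * vector[i] * vector[j]
                for j in range(len(vector))] for i in range(len(vector))]
    scores = [[sum(a * b for a, b in zip(row, comp)) for comp in components]
              for row in centered]
    return components, scores


# Hypothetical data: 6 records, 3 variables.
data = [[2.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-2.0, 0.0, -1.0],
        [0.0, -1.0, 0.0], [4.0, 1.0, 2.0], [-4.0, -1.0, -2.0]]
components, scores = pca_two_components(data)
```

Each row of `scores` holds the two PCA features for one record, which is the linear counterpart to the two hidden-unit features from the autoencoder.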

SAS machine learning algorithms
Neural networks
Decision trees
Random forests
Associations and sequence discovery
Gradient boosting and bagging
Support vector machines
Nearest-neighbor mapping
K-means clustering
DBSCAN
Self-organizing maps
Local search optimization techniques such as genetic algorithms
Expectation maximization
Multivariate adaptive regression splines
Bayesian networks
Kernel density estimation
Principal components analysis
Singular value decomposition
Gaussian mixture models
Sequential covering rule building
Model ensembles
Recommendations

SAS products that use machine learning
SAS Enterprise Miner
SAS Text Miner
SAS In-Memory Statistics for Hadoop
SAS Visual Statistics
SAS/STAT
SAS/OR
SAS Factory Miner

Supervised learning algorithms

| Algorithm | SAS EM nodes | SAS procedures |
|---|---|---|
| Regression | High Performance Regression, LARS, Partial Least Squares, Regression | ADAPTIVEREG, GAM, GENMOD, GLMSELECT, HPGENSELECT, HPLOGISTIC, HPQUANTSELECT, HPREG, LOGISTIC, QUANTREG, QUANTSELECT, REG |
| Decision tree | Decision Tree, High Performance Tree | ARBORETUM, HPSPLIT |
| Random forest | High Performance Tree | HPFOREST |
| Gradient boosting | Gradient Boosting | ARBORETUM |
| Neural networks | AutoNeural, DMNeural, High Performance Neural, Neural Network | HPNEURAL, NEURAL |
| Support vector machine | High Performance Support Vector Machine | HPSVM |
| Naïve Bayes | | HPBNET* |
| Neighbors | Memory Based Reasoning | DISCRIM |

*PROC HPBNET can learn different network structures (naïve, TAN, PC, and MB) and automatically select the best model.

Unsupervised learning algorithms

| Algorithm | SAS EM nodes | SAS procedures |
|---|---|---|
| A priori rules | Association, Link Analysis | |
| K-means clustering | Cluster, High Performance Cluster | FASTCLUS, HPCLUS |
| Spectral clustering | | Custom solution through Base SAS and the DISTANCE and PRINCOMP procedures |
| Kernel density estimation | | KDE |
| Kernel PCA | | Custom solution through Base SAS and the CORR, PRINCOMP, and SCORE procedures |
| Singular value decomposition | | HPTMINE, IML |
| Self-organizing maps | SOM/Kohonen | |

Semi-supervised learning algorithms

| Algorithm | SAS EM nodes | SAS procedures |
|---|---|---|
| Denoising autoencoders | | HPNEURAL, NEURAL |

Why has interest in machine learning increased?
Big data
Computing resources
Powerful computers
Cheap data storage
"Space is big. You just won't believe how vastly, hugely, mind-bogglingly big it is"
Douglas Adams, "The Hitchhiker's Guide to the Galaxy"

Further reading
• White papers
http://www.sas.com/en_us/whitepapers/machine-learning-with-sas-enterprise-miner-107521.html
http://support.sas.com/resources/papers/proceedings14/SAS313-2014.pdf
• SAS links
http://www.sas.com/en_us/insights/analytics/machine-learning.html
http://www.sas.com/en_us/insights/articles/analytics/introduction-to-machine-learning-five-things-the-quants-wish-we-knew.html
• SAS Data Mining Community https://communities.sas.com/community/support-communities/sas_data_mining_and_text_mining/
• Big Data Matters Webinar Series: www.sas.com/bigdatamatters

Thank you!