Predictive Analytics, Machine Learning, and Regression

Paul Peeling, MathWorks, 25 May 2016
© 2016 The MathWorks, Inc.

Page 1:

Predictive Analytics, Machine Learning, and Regression

Paul Peeling, MathWorks, 25 May 2016

Page 2:

Agenda

1. Modelling choice and insights
2. Machine learning
3. Time series
4. Networks

Page 3:

Why develop predictive models?

§ Forecast prices/returns

§ Price complex instruments

§ Analyze impact of predictors (sensitivity analysis)

§ Stress testing

§ Gain economic/market insight

Page 4:

Predictive Modeling Workflow

Train: Iterate till you find the best model
Load data; preprocess data (summary statistics, PCA, filters, cluster analysis); supervised learning (classification, regression); model

Predict: Integrate trained models into applications
New data; preprocess data (summary statistics, PCA, filters, cluster analysis); model; prediction

Page 5:

Bank Marketing

§ Categorical Yes/No
§ Categorical Predictors
§ Continuous Predictors
§ Campaign Cost
§ False-Positive Rate
§ Customer Segmentation
§ Outliers

Page 6:

Forecasting S&P® 500

§ Univariate or multivariate?
§ Lagged returns
§ Innovations
§ Forecast
§ Variance

Page 7:

Sector Benchmarks

§ Dependent Variables
§ Measures of correlation
§ Relative Importance
§ Time Dependencies
§ Descriptive to Model

Page 8:

Machine learning workflow

Steps
1. Partition data for cross-validation and hold-out
2. Define predictors and responses
3. Select and compare models
4. Assess relevance of predictors

Refine
§ Look for value-added features
§ Trade-off complexity vs. accuracy
§ Account for prior knowledge
§ Experiment with categorical data representation
§ Explore individual models and ensemble with confidence
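
Step 1 of the workflow above (partitioning data for cross-validation and hold-out) can be sketched as follows. This is a minimal illustration in plain Python rather than the MATLAB tooling the talk is about; the function names `partition` and `cross_validation_splits`, the 20% hold-out fraction, and the choice of 5 folds are all assumptions made for the example, not values taken from the slides.

```python
import random


def partition(n_samples, holdout_fraction=0.2, n_folds=5, seed=0):
    """Return (holdout_indices, folds): a hold-out set plus k cross-validation folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # fixed seed so the split is reproducible
    n_holdout = int(round(n_samples * holdout_fraction))
    holdout = indices[:n_holdout]          # set aside; not used during model selection
    remaining = indices[n_holdout:]
    # Deal the remaining indices round-robin into k roughly equal folds.
    folds = [remaining[i::n_folds] for i in range(n_folds)]
    return holdout, folds


def cross_validation_splits(folds):
    """Yield (train_indices, validation_indices), one pair per fold."""
    for k, validation in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, validation


# Hypothetical data set of 100 observations.
holdout, folds = partition(n_samples=100)
splits = list(cross_validation_splits(folds))
```

Each candidate model would be trained and scored on every `(train, validation)` pair to compare models, and the hold-out indices would be used once at the end to estimate performance on unseen data.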

Page 9:

Time Series Regression Workflow

Steps
1. Choose appropriate domain
2. Fit Model
3. Simulate and Forecast

Refine
§ Significance Test for Structure
§ Parameter Test for Order
§ Apply constraints
§ Assess long-term behaviour
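
The fit, forecast, and long-term-behaviour steps above can be illustrated with the simplest possible time series regression. The sketch below is an assumption-laden stand-in: it fits a first-order autoregression, y[t] = c + phi * y[t-1], by ordinary least squares in plain Python, which is far simpler than the ARIMA / GARCH models the talk refers to. The helper names `fit_ar1` and `forecast_ar1` and the noise-free toy series are invented for the example.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    x = series[:-1]   # lagged values
    y = series[1:]    # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(c, phi, last_value, horizon):
    """Iterate the fitted recursion forward to get point forecasts."""
    forecasts = []
    value = last_value
    for _ in range(horizon):
        value = c + phi * value
        forecasts.append(value)
    return forecasts


# Synthetic, noise-free series generated with c = 1.0 and phi = 0.5 (illustration only).
series = [10.0]
for _ in range(30):
    series.append(1.0 + 0.5 * series[-1])

c, phi = fit_ar1(series)
forecasts = forecast_ar1(c, phi, series[-1], horizon=50)

# For |phi| < 1 the forecasts converge to the unconditional mean c / (1 - phi).
long_run_mean = c / (1.0 - phi)
```

Checking that the forecasts settle at `long_run_mean` is one simple way to assess long-term behaviour; a fitted `phi` at or above 1 in absolute value would signal a non-stationary model whose forecasts do not settle.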

Page 10:

Networks of dependent variables

Steps
1. Choose appropriate domain
2. Develop measure of distance
3. Determine threshold/significance

Refine
§ Analyse structure
§ Assess dynamic changes
§ Move from descriptive statistics to modelling
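
One common way to turn the steps above into a network is to convert correlations into distances and keep only the edges of a minimum spanning tree. The sketch below assumes the distance d = sqrt(2 * (1 - rho)), which is a widely used choice but is not stated on the slide, and uses Kruskal's algorithm in plain Python. The four series names and the correlation matrix are made up for illustration.

```python
import math


def correlation_to_distance(rho):
    """Map a correlation in [-1, 1] to a distance in [0, 2] (assumed metric)."""
    return math.sqrt(2.0 * (1.0 - rho))


def minimum_spanning_tree(names, corr):
    """Kruskal's algorithm on the complete graph weighted by correlation distance."""
    edges = sorted(
        (correlation_to_distance(corr[i][j]), i, j)
        for i in range(len(names))
        for j in range(i + 1, len(names))
    )
    parent = list(range(len(names)))   # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    tree = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:                # edge joins two components, so no cycle
            parent[root_i] = root_j
            tree.append((names[i], names[j], dist))
    return tree


# Hypothetical correlation matrix for four series.
names = ["A", "B", "C", "D"]
corr = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
tree = minimum_spanning_tree(names, corr)

# Degree in the tree is a crude centrality measure.
degree = {name: 0 for name in names}
for a, b, _ in tree:
    degree[a] += 1
    degree[b] += 1
```

Recomputing the tree over rolling windows of data is one way to assess dynamic changes in the structure.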

Page 11:

Techniques used

§ Classification Decision Tree
§ Logistic Regression
§ Receiver Operating Characteristic (ROC) curve
§ Statistical significance tests
§ ARIMA / GARCH model
§ Akaike Information Criterion
§ Minimal Spanning Tree
§ Centrality Metrics
§ Hidden Markov Model
§ http://uk.mathworks.com/company/newsletters/articles/exploring-risk-contagion-using-graph-theory-and-markov-chains.html
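
As an illustration of one technique from the list above, the following sketch builds an ROC curve from classifier scores and computes the area under it with the trapezoidal rule. It is plain Python written for this transcript, not code from the talk; the function names `roc_curve` and `auc` and the six labelled scores are assumptions made for the example.

```python
def roc_curve(labels, scores):
    """Return ROC points (false-positive rate, true-positive rate), one per distinct score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), reverse=True)   # highest score first
    points = [(0.0, 0.0)]
    tp = fp = 0
    for k, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        is_last_of_group = k + 1 == len(ranked) or ranked[k + 1][0] != score
        if is_last_of_group:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical labels (1 = positive) and classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_curve(labels, scores)
area = auc(points)
```

Each point corresponds to one decision threshold, so the curve shows the false-positive rate paid for a given true-positive rate.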

Page 12:

Learn More: Predictive Modelling with MATLAB

To learn more, visit: www.mathworks.com/machine-learning

Basket Selection using Stepwise Regression

Classification in the presence of missing data

Regression with Boosted Decision Trees

Hierarchical Clustering

Page 13:

Q&A