Machine Learning on Time Series Analysis
Cheng-Yu Han
Solutions
Challenges
Investing or Gambling?
FinLab
Personal Life Financial Plan
• Income per month: 5w (NTD)
• Spending per month: 3w (NTD)
• Investment: 70% (of total capital)
• Compound interest: 5% (each year)
Capital simulation
[Figure: capital vs. age from 25 years old to 60 years old (retire), comparing investing with no investing; capital levels marked at 840w, 1790w, and 3500w]
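The arithmetic behind the capital simulation can be sketched as follows. This is a minimal sketch, not the presenter's actual model: the function name `simulate_capital`, the yearly compounding convention, and the order of operations within a year are all assumptions. With nothing invested it reproduces the 840w level (2w saved per month for 35 years); the investing curves depend on compounding details the slide does not state, so no attempt is made to match 1790w or 3500w.

```python
def simulate_capital(start_age=25, retire_age=60, income=5.0, spending=3.0,
                     invested_share=0.7, annual_return=0.05):
    """Capital at retirement, in units of w NTD.

    Assumed convention: once per year the invested share of the current
    capital earns `annual_return`, then the year's savings are added.
    """
    capital = 0.0
    for _ in range(start_age, retire_age):
        capital += capital * invested_share * annual_return  # growth on the invested part
        capital += (income - spending) * 12                   # savings for this year
    return capital


if __name__ == "__main__":
    print("no investing:", simulate_capital(invested_share=0.0))
    print("70% invested:", round(simulate_capital(), 1))
```

Changing `invested_share` or `annual_return` shows how sensitive the retirement figure is to both inputs.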
Outline
Financial Data
• Features
• Labels
Machine Learning Models
• NN
• LSTM
• CNN
Evaluation
• Backtesting
• Purged K-fold
Trading programming language
• Easy learning curve for beginners
• Integrated with the language editor in trading platforms
• Can be extended by external DLLs
• Most of the functions are encrypted or the source code is not provided
• Does not support statistical analysis or machine learning toolkits
Trading programming language
• Friendly statistics toolkits
• Strong community and widely applied
• Easy to deploy (Flask/Django/…)
• More innovative data science applications
Artificial Intelligence papers
All of the papers available in the “artificial intelligence” section (arXiv)
[Figure: number of papers, with axis ticks at 1000, 2000, and 3000]
ML algorithms in finance?
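Financial Data (Features)
Supervised Machine Learning

Training: features and labels are fed to the ML model

| Color | Weight | Age | Category (label) |
|---|---|---|---|
| | 3.2 kg | 2 | cat |
| | 4.2 kg | 5 | cat |
| | 6.2 kg | 4 | dog |

Testing: features are fed to the trained ML model, and its predictions are compared with the true answers

| Color | Weight | Age | True Answer | Prediction |
|---|---|---|---|---|
| | 3.2 kg | 2 | cat | cat |
| | 4.2 kg | 5 | cat | dog |
| | 6.2 kg | 4 | dog | dog |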
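The train-then-test flow on the slide can be sketched in a few lines of Python. The weights, ages, and labels come from the slide; the 1-nearest-neighbour classifier and the helper names `predict_1nn` and `accuracy` are illustrative assumptions, since the slide does not say which model was used. The color feature is omitted because its values are not present in the transcript.

```python
def predict_1nn(train_x, train_y, x):
    """Return the label of the training row closest to x (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_x]
    return train_y[dists.index(min(dists))]


def accuracy(true_labels, predictions):
    """Fraction of predictions that match the true labels."""
    hits = sum(t == p for t, p in zip(true_labels, predictions))
    return hits / len(true_labels)


# Training data from the slide: (weight in kg, age) -> category.
train_x = [(3.2, 2), (4.2, 5), (6.2, 4)]
train_y = ["cat", "cat", "dog"]

if __name__ == "__main__":
    # A hypothetical new animal that is not on the slide.
    print(predict_1nn(train_x, train_y, (6.0, 3)))
    # Scoring the "True Answer" and "Prediction" columns shown on the slide.
    print(accuracy(["cat", "cat", "dog"], ["cat", "dog", "dog"]))
```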
Financial Data Structures
Fundamental data
Focuses on creating a portrait of a company
• Useful to combine with other data types
• Difficult to confirm the data release date
• Missing data is often backfilled
• Consider multiple corrections
Trading data
The characteristic footprint of market participants: trading book, price, broker trading summary, etc.
• Data often comes with a timestamp
• Can generate extra features (e.g. technical indicators)
• Massive amount of data generated in one day
• Some of the data is difficult to obtain
Creating Technical indicators
Price historical data → Generate technical indicators (RSI, KD) → Predict price movement
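As a rough illustration of turning price history into indicator features, the sketch below computes two common quantities. It rests on stated assumptions: `rsi` uses the simple-average variant of RSI rather than Wilder's smoothing, and `stochastic_k` is only the raw %K value that KD indicators are built from, without the smoothing that produces the K and D lines. The function names and default periods are illustrative, not taken from the slides.

```python
def rsi(closes, period=14):
    """Simple-average RSI over the last `period` price changes (0 to 100)."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    window = closes[-(period + 1):]
    changes = [b - a for a, b in zip(window[:-1], window[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if gains + losses == 0:
        return 50.0  # flat prices: treat as neutral
    if losses == 0:
        return 100.0
    rs = gains / losses
    return 100.0 - 100.0 / (1.0 + rs)


def stochastic_k(highs, lows, closes, period=9):
    """Raw stochastic %K: position of the last close within the recent high-low range."""
    highest = max(highs[-period:])
    lowest = min(lows[-period:])
    if highest == lowest:
        return 50.0  # no range: treat as neutral
    return 100.0 * (closes[-1] - lowest) / (highest - lowest)


if __name__ == "__main__":
    print(rsi([1, 2, 1, 2, 1], period=4))
    print(stochastic_k([10, 12, 11], [8, 9, 9], [9, 11, 10], period=3))
```

Both values lie between 0 and 100, so dividing by 100 is one simple way to obtain scaled features.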
Labeling
Challenges of Labeling the Data
Fixed time horizon: a popular method in the literature
[Figure: price vs. time from t to t + w, starting at p_t and ending at p_{t+w}; thresholds at p_t + τ and p_t − τ; labels 1, 0, −1]
• τ is a constant
• Does not have stop-loss limits
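A minimal sketch of fixed-time-horizon labeling, assuming the rule implied by the figure: compare the price w steps ahead with the current price and assign 1, 0, or −1 depending on whether the change exceeds the constant threshold τ. The function name `fixed_horizon_labels` and the use of price differences (rather than returns) are assumptions.

```python
def fixed_horizon_labels(prices, w, tau):
    """Label each time t by the price change over the next w steps.

    1  if p[t + w] - p[t] >  tau
    -1 if p[t + w] - p[t] < -tau
    0  otherwise
    The last w points get no label because their horizon is not observed.
    """
    labels = []
    for t in range(len(prices) - w):
        change = prices[t + w] - prices[t]
        if change > tau:
            labels.append(1)
        elif change < -tau:
            labels.append(-1)
        else:
            labels.append(0)
    return labels


if __name__ == "__main__":
    print(fixed_horizon_labels([100, 101, 103, 102, 99, 100], w=2, tau=1.5))
```

Because only the endpoint p_{t+w} is inspected, a large drop inside the window is invisible to the label, which is the missing stop-loss behaviour noted in the bullets.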
Label Generation Methods
• Triple barrier [Prado 2018]
• Continuous trading signals [Dash 2016]
• Trading Point decision [Chang 2009]
[Prado 2018] Advances in Financial Machine Learning
[Tsantekidis 2017] Using Deep Learning to Detect Price Change Indications in Financial Markets
[Dash 2016] A hybrid stock trading framework integrating technical analysis with machine learning techniques
[Chang 2009] Integrating a Piecewise Linear Representation Method and a Neural Network Model for Stock Trading Points Prediction
Triple barriers [Prado 2018]
[Figure: price vs. time from t to t + w, starting at p_t and ending at p_{t+w}; upper barrier at p_t + τ₁ and lower barrier at p_t − τ₂; labels 1, 0, −1]
• Horizontal barriers are defined by profit-taking and stop-loss limits
• τ₁ and τ₂ are dynamic, set according to estimated volatility
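The triple-barrier idea can be sketched as below: walk forward from t and label by whichever barrier is touched first. This is a simplified sketch under assumptions, not the reference implementation from [Prado 2018]: barriers are expressed as price offsets, the vertical barrier yields label 0, and `volatility` is a hypothetical helper that estimates volatility as the standard deviation of recent one-step price changes so that the barrier widths can be scaled from it.

```python
import statistics


def volatility(prices, t, lookback=20):
    """Std of one-step price changes over the `lookback` steps ending at t (past data only)."""
    start = max(1, t - lookback + 1)
    changes = [prices[i] - prices[i - 1] for i in range(start, t + 1)]
    return statistics.pstdev(changes) if len(changes) >= 2 else 0.0


def triple_barrier_label(prices, t, w, tau_up, tau_down):
    """Return 1 if the upper barrier is hit first, -1 if the lower one is, 0 on timeout."""
    p0 = prices[t]
    end = min(t + w, len(prices) - 1)
    for s in range(t + 1, end + 1):
        if prices[s] >= p0 + tau_up:
            return 1      # profit-taking barrier touched first
        if prices[s] <= p0 - tau_down:
            return -1     # stop-loss barrier touched first
    return 0              # vertical barrier: horizon expired


if __name__ == "__main__":
    prices = [100, 101, 99, 104, 98]
    print(triple_barrier_label(prices, t=0, w=4, tau_up=3, tau_down=3))
    print(triple_barrier_label(prices, t=0, w=4, tau_up=3, tau_down=1))
```

In practice the two widths might be set to multiples of `volatility(prices, t)`, which makes the barriers wider in turbulent periods.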
Continuous trading signals [Dash 2016]
[Figure: price vs. time from t to t + w, with the window maximum p^max_{t,t+w} and minimum p^min_{t,t+w} marked; signal values 0, 0.5, 1]
• Uses the momentum of the stock price
• The y(t) values are continuous
• Provides more detailed information
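One way to obtain a continuous label from the window minimum and maximum is to min-max normalize the current price within the window from t to t + w. The sketch below is a simplification offered under that assumption; it is not claimed to be the exact formula of [Dash 2016], and the name `continuous_signals` and the neutral value 0.5 for a flat window are illustrative choices.

```python
def continuous_signals(prices, w):
    """Position of p[t] between the min and max of prices[t : t + w + 1], in [0, 1]."""
    signals = []
    for t in range(len(prices) - w):
        window = prices[t:t + w + 1]
        lowest, highest = min(window), max(window)
        if highest == lowest:
            signals.append(0.5)  # flat window: no information either way
        else:
            signals.append((prices[t] - lowest) / (highest - lowest))
    return signals


if __name__ == "__main__":
    print(continuous_signals([10, 12, 11, 14, 13], w=2))
```

A value near 0 means the current price sits at the bottom of the upcoming window, and a value near 1 means it sits at the top.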
Trading point decision
• Find the local minimum and maximum points
• Divide the time series into subsegments
• Threshold value d → length of trend
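The segmentation step can be sketched as a top-down piecewise linear split: join the endpoints of a segment with a straight line, find the point farthest from that line, and split there if the distance exceeds the threshold d. This is a sketch under assumptions rather than the exact procedure of [Chang 2009]: distance is measured vertically, and `plr_turning_points` is a hypothetical name.

```python
def plr_turning_points(prices, d):
    """Indices of segment endpoints found by top-down piecewise linear splitting."""

    def split(lo, hi):
        if hi - lo < 2:
            return []
        best_i, best_dist = None, 0.0
        for i in range(lo + 1, hi):
            # Price on the straight line joining the segment's endpoints.
            line = prices[lo] + (prices[hi] - prices[lo]) * (i - lo) / (hi - lo)
            dist = abs(prices[i] - line)
            if dist > best_dist:
                best_i, best_dist = i, dist
        if best_i is None or best_dist <= d:
            return []  # segment is already close enough to a straight line
        return split(lo, best_i) + [best_i] + split(best_i, hi)

    last = len(prices) - 1
    return [0] + split(0, last) + [last]


if __name__ == "__main__":
    prices = [1, 2, 3, 4, 5, 4, 3, 2, 1]
    print(plr_turning_points(prices, d=1))
    print(plr_turning_points(prices, d=10))
```

A smaller d produces more, shorter segments and therefore more trading points; a larger d keeps only the longer trends.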
ML Models
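Neural Network
• Built to model the human brain
• Interprets numeric data through a kind of machine perception
[Figure: human neuron structure next to a single neuron model; inputs 1, x1, x2 with weights w0, w1, w2 feed a summation node Σ that produces v, followed by the activation g(v)]
y1 = g(w1x1 + w2x2 + w0)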
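The single neuron model y1 = g(w1x1 + w2x2 + w0) translates directly into code. The sigmoid used for g here is an assumption, since the slide does not name a specific activation function.

```python
import math


def sigmoid(v):
    """Logistic activation, squashing v into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def neuron(x1, x2, w0, w1, w2, g=sigmoid):
    """Weighted sum of the inputs plus the bias w0, passed through the activation g."""
    v = w1 * x1 + w2 * x2 + w0
    return g(v)


if __name__ == "__main__":
    print(neuron(1.0, 2.0, w0=0.5, w1=1.0, w2=-1.0))
```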
Neural Network
Single node in neural network
[Figure: inputs 1, x1, x2 with weights w0, w1, w2 feed Σ and g to produce y1]
Neural Network
Simplified expression
[Figure: inputs 1, x1, x2 with weights w0, w1, w2 connected directly to y1]
Neural Network
A layer contains multiple neurons
[Figure: inputs 1, x1, x2 connected to outputs y1, y2, y3, y4]
Deep Neural Network
Multi-layer deep neural network
[Figure: inputs 1, x1, x2 pass through multiple layers to outputs y1, y2, y3]
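Stacking neurons into layers and layers into a deep network amounts to repeating the same weighted-sum-plus-activation step. The sketch below is a plain-Python forward pass under assumptions: a sigmoid activation throughout, and hypothetical helper names `dense` and `forward`. The layer sizes in the usage example (2 inputs, 4 hidden neurons, 3 outputs) are chosen only for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dense(inputs, weights, biases, g=sigmoid):
    """One layer: each row of `weights` plus its bias defines one neuron."""
    return [g(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(x, layers):
    """Feed x through a list of (weights, biases) layers in order."""
    for weights, biases in layers:
        x = dense(x, weights, biases)
    return x


if __name__ == "__main__":
    hidden = ([[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6], [0.7, -0.8]], [0.0, 0.1, 0.2, 0.3])
    output = ([[0.1, 0.2, 0.3, 0.4]] * 3, [0.0, 0.0, 0.0])
    print(forward([1.0, 2.0], [hidden, output]))
```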
Neural Network Optimization
[Figure: cost function plotted over the weights w1 and w2]
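Optimization means moving the weights downhill on the cost surface. A minimal gradient descent sketch follows, assuming a linear neuron y = w1x1 + w2x2 and a mean-squared-error cost; the slide does not specify the cost function or optimizer, and the toy data, learning rate, and name `gradient_descent` are illustrative.

```python
def gradient_descent(xs, ys, lr=0.1, steps=500):
    """Fit y = w1*x1 + w2*x2 by minimizing mean squared error."""
    w1, w2 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad1 = grad2 = 0.0
        for (x1, x2), y in zip(xs, ys):
            error = w1 * x1 + w2 * x2 - y
            grad1 += 2.0 * error * x1 / n  # d(cost)/d(w1)
            grad2 += 2.0 * error * x2 / n  # d(cost)/d(w2)
        w1 -= lr * grad1                   # step against the gradient
        w2 -= lr * grad2
    return w1, w2


if __name__ == "__main__":
    # Toy data generated from y = 2*x1 - x2.
    xs = [(1, 0), (0, 1), (1, 1), (2, 1)]
    ys = [2, -1, 1, 3]
    print(gradient_descent(xs, ys))
```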
Deep Neural Network Training Result
Asset: Taiwan Capitalization Weighted Stock Index
Features: Scaled Technical Indicators
Labels: Fixed time horizon
Data split
• Train: 2006 ~ 2014
• Validate: 2015
• Backtest: 2016 ~ 2019-3-1
[Figure: backtest vs. benchmark, time axis from 2018-1-1 to 2019-7-1]
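A chronological split like the one on the slide can be sketched as follows. The date boundaries come from the slide; the function name `split_by_date` and the `(date, value)` row layout are assumptions. Splitting by date rather than at random keeps future observations out of the training set.

```python
from datetime import date


def split_by_date(rows):
    """Split (date, value) rows into train, validate, and backtest sets by period."""
    train, validate, backtest = [], [], []
    for day, value in rows:
        if date(2006, 1, 1) <= day <= date(2014, 12, 31):
            train.append((day, value))
        elif date(2015, 1, 1) <= day <= date(2015, 12, 31):
            validate.append((day, value))
        elif date(2016, 1, 1) <= day <= date(2019, 3, 1):
            backtest.append((day, value))
        # Rows outside all three periods are dropped.
    return train, validate, backtest


if __name__ == "__main__":
    rows = [(date(2010, 5, 3), 1.0), (date(2015, 6, 1), 2.0),
            (date(2018, 1, 2), 3.0), (date(2019, 7, 1), 4.0)]
    train, validate, backtest = split_by_date(rows)
    print(len(train), len(validate), len(backtest))
```

Any feature scaling would normally be fitted on the training rows only and then applied unchanged to the other two sets.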
Model Interpretation
Deep Neural Network Training Result
https://arxiv.org/pdf/1602.04938.pdf
• Explain the predictions of your machine learning models
• Approximate the predictions of the underlying black-box model
• Focus on training local surrogate models to explain individual predictions
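The local-surrogate idea in these bullets (the approach of the linked paper, which introduces LIME) can be illustrated from scratch: sample points around one instance, weight them by proximity, and fit a weighted linear model whose coefficients explain the black box near that instance. This is a simplified sketch and not the API of the `lime` package; the names `local_surrogate` and `solve`, the Gaussian perturbations, and the exponential kernel are all assumptions.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def local_surrogate(black_box, instance, n_samples=200, scale=0.5,
                    kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear model around `instance`.

    Returns [intercept, coef_1, ..., coef_k].
    """
    rng = random.Random(seed)
    k = len(instance)
    xtwx = [[0.0] * (k + 1) for _ in range(k + 1)]
    xtwy = [0.0] * (k + 1)
    for _ in range(n_samples):
        z = [v + rng.gauss(0.0, scale) for v in instance]      # perturbed sample
        dist2 = sum((a - b) ** 2 for a, b in zip(z, instance))
        weight = math.exp(-dist2 / kernel_width ** 2)           # closer samples count more
        row = [1.0] + z
        y = black_box(z)
        for i in range(k + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(k + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve(xtwx, xtwy)  # weighted least squares via the normal equations


if __name__ == "__main__":
    model = lambda z: math.tanh(z[0]) + 0.1 * z[1] ** 2  # stand-in black box
    print(local_surrogate(model, [0.0, 1.0]))
```

The coefficients describe the model's behaviour only in the neighbourhood of the chosen instance, so a different instance can yield a different explanation.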
Backtest
• Survivorship bias, look-ahead bias, transaction cost, outliers, overfitting
• Finding the lottery tickets that won the last game
Solutions
• Develop models for entire assets or asset classes
• Use bootstrap aggregating
• Record every backtest conducted
• Resist the temptation of reusing a failed strategy
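Two of the pitfalls listed above, look-ahead bias and transaction cost, can be made concrete in a very small backtest loop. This is a sketch under assumptions, not a full backtesting engine: `signals[t]` must be decided using information available up to time t only and is applied to the return from t to t + 1, and `cost` is a hypothetical proportional fee charged on every change of position.

```python
def backtest(prices, signals, cost=0.001):
    """Equity curve for positions in {-1, 0, 1}, starting from 1.0.

    The signal at t earns the return from t to t + 1, so no future price
    is used when a position is taken.
    """
    equity = 1.0
    position = 0
    curve = [equity]
    for t in range(len(prices) - 1):
        target = signals[t]
        if target != position:
            equity *= 1.0 - cost * abs(target - position)  # pay for the position change
            position = target
        step_return = prices[t + 1] / prices[t] - 1.0
        equity *= 1.0 + position * step_return
        curve.append(equity)
    return curve


if __name__ == "__main__":
    print(backtest([100, 110, 121], [1, 1, 0], cost=0.01)[-1])
```

Comparing the final equity with `cost=0.0` against a realistic fee shows how quickly frequent trading erodes an apparent edge.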
Conclusion
Machine Learning
Financial Data
• Features
• Labels
Machine Learning Models
• NN
• LSTM
• CNN
Evaluation
• Backtesting
• Purged Validation
Investing
• Life Plan
• Investing or gambling?
[Figure: capital vs. age from 25 years old to 60 years old; capital levels marked at 840w, 1790w, and 3500w]