Estimating Covariance Using Factorial Hidden Markov Models
Joao Sedoc (1, 2)
with: Jordan Rodu (3), Lyle Ungar (1), Dean Foster (1) and Jean Gallier (1)
(1) University of Pennsylvania, Philadelphia, PA
(2) Chivalric Trading
(3) Carnegie Mellon University, Pittsburgh, PA
PGMO Conference, October 29, 2014
Joao Sedoc Estimating Covariance Using Factorial Hidden Markov Models 1 / 42
Outline
1 Motivation
    What’s Novel?
    Portfolio Optimization
    Non-Stationary Covariance
2 Introduction to Factorial HMMs
    HMM Application to Problems
    Quick Overview of Hidden Markov Models
    Estimation
    Factorial HMM
3 Empirical Results
4 Conclusion
What’s Novel: Innovations to Factorial HMMs
Multiple time horizon HMM using a structured approach
Incorporation of high frequency data
Estimation in near real time
Continuous emission HMM
Provable bounds
Incorporation of exogenous data
What’s Novel: Application to Portfolio Optimization
Markowitz optimization is a well-known theory, but hard to do right
The allocation is optimized under exponential utility
$$\operatorname*{argmax}_{\alpha_{pos}} \; P^\top \alpha_{pos} - \frac{1}{2} \zeta \, \alpha_{pos}^\top \Sigma \, \alpha_{pos}$$
where
$\alpha_{pos}$ is the notional allocation,
$p_t$ is the asset price at time $t$,
$P_t = E[p_{t+\tau} \mid p_t]$ is the expected profit,
$\Sigma$ is the asset return covariance matrix, and
$\zeta$ is the risk aversion free variable.
In this talk we will only focus on improving covariance estimation
We want a better estimate of $\Sigma \to \Sigma_t$
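A minimal sketch of why the covariance estimate matters, assuming the objective reads $P^\top \alpha - \frac{1}{2} \zeta \, \alpha^\top \Sigma \, \alpha$ with no constraints on the allocation. Setting the gradient to zero then gives $\alpha^* = \Sigma^{-1} P / \zeta$. The function names and all numbers below are illustrative assumptions, not values from the talk.

```python
import numpy as np

def optimal_allocation(P, Sigma, zeta):
    """Unconstrained maximizer of P^T a - 0.5 * zeta * a^T Sigma a."""
    # First-order condition: P - zeta * Sigma a = 0.
    return np.linalg.solve(Sigma, P) / zeta

def certainty_equivalent(alpha, P, Sigma, zeta):
    return P @ alpha - 0.5 * zeta * alpha @ Sigma @ alpha

# Hypothetical two-asset inputs (not from the talk).
P = np.array([0.02, 0.01])
Sigma_calm = np.array([[0.04, 0.01],
                       [0.01, 0.02]])
Sigma_stressed = 4.0 * Sigma_calm   # same correlations, four times the variance
zeta = 2.0

alpha_calm = optimal_allocation(P, Sigma_calm, zeta)
alpha_stressed = optimal_allocation(P, Sigma_stressed, zeta)
print(alpha_calm, alpha_stressed)
```

Under these assumptions, scaling the covariance by four shrinks the allocation by four, so an estimate $\Sigma_t$ that tracks the current regime changes the position directly.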
Drawbacks of Current Models
Modern approaches are constrained by computational complexity
Trade-off between model richness and data richness
Difficult to both explain and identify the model
Incorporation of exogenous data is often difficult in empirical models
S&P 500 realized variance
Figure: S&P 500 variance (second resolution)
S&P 500 and 30 Year Treasury realized covariance
Figure: S&P 500 and 30 Year Treasury covariance (second resolution)
Common Applications of Hidden Markov Models
Gene recognition
Robotics
Natural language processing tasks
Speech Recognition
Hidden Markov Models
There are two primary assumptions for this basic HMM:
1 The underlying hidden state process is Markovian
2 Given the hidden states, the observations are independent
Figure: HMM with states $h_t$, $h_{t+1}$, and $h_{t+2}$ that emit observations $x_t$, $x_{t+1}$, and $x_{t+2}$ respectively.
Hidden Markov Models
The probability distribution over the next hidden state at time $t + 1$ depends only on the current hidden state at time $t$:
$$\Pr(h_{t+1} \mid h_t, \ldots, h_1) = \Pr(h_{t+1} \mid h_t).$$
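The two assumptions can be illustrated by sampling from a small discrete HMM. This is only a sketch: the matrices below are made-up example parameters, and `sample_hmm` is a hypothetical helper name. The convention assumed is column-stochastic, so column $j$ of `T` is $\Pr(h_{t+1} \mid h_t = j)$.

```python
import numpy as np

def sample_hmm(T, O, pi, length, rng):
    """Draw hidden states and observations from a discrete HMM.

    T[i, j] = Pr(h_{t+1} = i | h_t = j)   (columns sum to 1)
    O[x, h] = Pr(x_t = x | h_t = h)       (columns sum to 1)
    """
    n_obs, n_states = O.shape
    states, observations = [], []
    h = rng.choice(n_states, p=pi)
    for _ in range(length):
        states.append(int(h))
        # The observation depends only on the current hidden state.
        observations.append(int(rng.choice(n_obs, p=O[:, h])))
        # The next state depends only on the current hidden state.
        h = rng.choice(n_states, p=T[:, h])
    return states, observations

# Hypothetical parameters: 2 hidden states, 3 observation symbols.
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])
O = np.array([[0.7, 0.1],
              [0.2, 0.3],
              [0.1, 0.6]])
pi = np.array([0.6, 0.4])

rng = np.random.default_rng(0)
states, observations = sample_hmm(T, O, pi, 5000, rng)
print(states[:10], observations[:10])
```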
The Hidden Markov Model parameters
Figure: Pictorial view of HMM parameters: the transition matrix $T$, whose column $i$ is $\Pr(h_{t+1} \mid h_t = i)$; the collection of $\lambda(x)$'s, with entries $\Pr(x_{t+1} \mid h_{t+1})$; and the initial state distribution $\pi = \Pr(h_1)$.
Hidden Markov Models
The likelihood of a sequence of observations from a specified model is
$$\Pr(x_1, \ldots, x_t) = \sum_{h_1, \ldots, h_t} [\pi]_{h_1} \prod_{j=2}^{t} [T]_{h_j, h_{j-1}} \prod_{j=1}^{t} [\lambda(x_j)]_{h_j}$$
though we will not consider this particular form of the likelihood. Instead, we will look at a “new” form for the likelihood,
$$\Pr(x_t, \ldots, x_1) = \mathbf{1}^\top A(x_t) \cdots A(x_1) \pi$$
where $\lambda(x)$ is the distribution of the observation given a hidden state, and
$$A(x) = T \, \mathrm{diag}(\lambda(x)).$$
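The two forms of the likelihood can be checked against each other numerically. The sketch below uses made-up parameters for a 2-state, 3-symbol HMM (an assumption for illustration) and compares the explicit sum over hidden paths with the matrix product $\mathbf{1}^\top A(x_t) \cdots A(x_1) \pi$.

```python
import itertools
import numpy as np

# Hypothetical parameters; T and O are column-stochastic.
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])
O = np.array([[0.7, 0.1],     # O[x, h] = Pr(x | h), so lambda(x) is row x of O
              [0.2, 0.3],
              [0.1, 0.6]])
pi = np.array([0.6, 0.4])

def A(x):
    """A(x) = T diag(lambda(x))."""
    return T @ np.diag(O[x])

def likelihood_matrix(xs):
    """1^T A(x_t) ... A(x_1) pi, applying A(x_1) first."""
    v = pi
    for x in xs:
        v = A(x) @ v
    return float(np.ones(len(pi)) @ v)

def likelihood_sum(xs):
    """Explicit sum over all hidden state paths h_1, ..., h_t."""
    total = 0.0
    for hs in itertools.product(range(len(pi)), repeat=len(xs)):
        p = pi[hs[0]]
        for j in range(1, len(xs)):
            p *= T[hs[j], hs[j - 1]]
        for j in range(len(xs)):
            p *= O[xs[j], hs[j]]
        total += p
    return total

xs = [0, 2, 1, 1]
print(likelihood_matrix(xs), likelihood_sum(xs))
```

The path sum costs $k^t$ terms for $k$ hidden states, while the matrix form costs $t$ matrix-vector products, which is what makes the operator view convenient.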
Hidden Markov Models
Figure: $A(x)$, graphically: $T$ multiplied by the diagonal matrix of $\lambda(x) = \Pr(x \mid h)$ gives a matrix whose first column is $\Pr(h_{t+1}, x \mid h_t = 1)$.
Spectral Methods for Estimation
Spectral methods use singular value decomposition (SVD) and the method of moments.
Fast SVD instead of forward-backward (EM) estimation.
Computing observables for spectral estimation of an HMM, fully reduced third moment.
Estimation speed is critical given the size of high-frequency financial datasets.
For US equities, sampling per second yields roughly 5 million data points per year per stock!
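The order of magnitude can be reproduced with a short calculation. The session length and trading-day count below are assumptions (a 6.5-hour regular session and 252 trading days per year), not figures given in the talk.

```python
# Assumptions (not stated in the talk): a 6.5-hour regular US equity session
# (9:30 to 16:00) and 252 trading days per year.
SESSION_HOURS = 6.5
TRADING_DAYS_PER_YEAR = 252

seconds_per_session = int(SESSION_HOURS * 3600)
points_per_year = seconds_per_session * TRADING_DAYS_PER_YEAR
print(seconds_per_session, points_per_year)  # 23400 5896800
```

Under these assumptions the count is just under 6 million one-second samples per stock per year, the same order of magnitude as the figure quoted above.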
Spectral Algorithm Sketch
Calculate $E[X_2 \otimes X_1]$.
Calculate fast SVD of $E[X_2 \otimes X_1]$, keeping $k$ left singular vectors.
Reduce the data where $y = U^\top x$.
Compute the first three moments $E[Y_1]$, $E[Y_2 \otimes Y_1]$, $E[Y_3 \otimes Y_1 \otimes Y_2]$.
In the discrete case,
$$\Pr(x_t, \ldots, x_1) = b_\infty^\top B(y_t) \cdots B(y_1) \, b_1$$
where $B(y)$ is the similarity transform of $A(x)$.
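The first four steps can be sketched on sample data as follows. This is an illustration under stated assumptions: `reduced_moments` is a hypothetical helper, the input is a random placeholder series rather than market data, and an exact `np.linalg.svd` stands in for the fast (randomized) SVD mentioned in the talk.

```python
import numpy as np

def reduced_moments(X1, X2, X3, k):
    """X1, X2, X3: (N, d) arrays whose rows are consecutive observations."""
    n = X1.shape[0]
    P21 = X2.T @ X1 / n                       # estimate of E[X2 (x) X1], shape (d, d)
    U = np.linalg.svd(P21)[0][:, :k]          # top-k left singular vectors
    Y1, Y2, Y3 = X1 @ U, X2 @ U, X3 @ U       # row-wise y = U^T x
    m1 = Y1.mean(axis=0)                      # E[Y1]
    m2 = Y2.T @ Y1 / n                        # E[Y2 (x) Y1]
    m3 = np.einsum('ia,ib,ic->abc', Y3, Y1, Y2) / n   # E[Y3 (x) Y1 (x) Y2]
    return U, m1, m2, m3

rng = np.random.default_rng(0)
data = rng.standard_normal((500, 6))          # placeholder series, not market data
X1, X2, X3 = data[:-2], data[1:-1], data[2:]  # sliding triples (x1, x2, x3)
U, m1, m2, m3 = reduced_moments(X1, X2, X3, k=3)
print(U.shape, m1.shape, m2.shape, m3.shape)
```

After the reduction the third moment is a $k \times k \times k$ tensor instead of $d \times d \times d$, which is the main computational saving of working with $y = U^\top x$.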
Generalization to the Continuous Case
To generalize to the continuous case we need to take expectations, where
$$\Pr(x_t, \ldots, x_1) = b_\infty^\top B(G(x_t)) \cdots B(G(x_1)) \, b_1$$
and $G(x)$ is an estimate of $E[Y_2 \mid x_1]$.
$B(G(x))$ is exactly what we want, up to a constant factor depending on $x$.
Factorial HMM
Different state layers evolve differently
Figure: Factorial HMM diagram
Factorial HMM
Figure: Structured Factorial HMM diagram
Structured Factorial HMM Differences
Improvements
Faster estimation using Spectral methods
Intuition about time horizon
Simple layer aggregation
Drawbacks
Jumps in covariance estimation at hourly boundaries
Heuristic choice of time horizon
Requires lots of data
Stock Covariance
Model     Horizon   RMSE            N training   N out of sample
CAPM      daily     0.9 × 10^-5     3125         1000
CAPM      hourly    1.2 × 10^-7     ~40000       ~4000
CAPM      second    1.7 × 10^-8     ~4000000     ~400000
PCA (1)   daily     0.85 × 10^-5    3124         1000
PCA (1)   hourly    1.0 × 10^-7     ~40000       ~4000
PCA (1)   second    1.6 × 10^-8     ~4000000     ~400000
GARCH     daily     0.6 × 10^-5     3124         1000
GARCH     hourly    0.9 × 10^-7     ~40000       ~4000
GARCH     second    1.2 × 10^-8     ~4000000     ~400000
FHMM      daily     1.2 × 10^-6     3124         1000
FHMM      hourly    3.0 × 10^-7     ~40000       ~4000
FHMM      second    0.9 × 10^-9     ~4000000     ~400000
(1) 15 principal components
Summary
Major Contributions
Multiple time frames
Richer model
Intuitive explanation of model
Fast estimation
Thanks for listening!
Future Work
Empirical frequency selection
Expansion to other datasets (energy / weather)
Better estimation on lower time horizons
Test more distributions for G(x)
For Further Reading I
Spectral Algorithm for Learning Hidden Markov Models, Hsu, Kakade, Zhang 2009
Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions, Halko, Martinsson, Tropp 2011
Using Regression for Spectral Estimation, Foster, Rodu, Ungar, Wu 2013
Two Step CCA: A new spectral method for estimating vector models of words, Dhillon, Foster, Rodu, Ungar 2013
Spectral Dependency Parsing with Latent Variables, Collins, Dhillon, Foster, Rodu, Ungar 2012
Spectral Dimensionality Reduction for HMMs, Foster, Rodu, Ungar 2012
Papers and Projects In Progress
Spectral Estimation of HMMs with a continuous output distribution, Foster, Rodu, Ungar (in progress)
Spectral Estimation of hierarchical HMMs, Foster, Rodu, Sedoc, Ungar (in progress)
Appendix
Spectral Methods for Estimation
In this section we will describe how to build the observables $B(x)$.
First note that the first three moments of the data from an HMM yield the following theoretical form:
$$E[X_1] = M \pi$$
$$E[X_2 \otimes X_1] = M \, T \, \mathrm{diag}(\pi) \, M^\top$$
$$E[X_3 \otimes X_1 \otimes X_2] = M \, T \, \mathrm{diag}(\lambda(x)) \, T \, \mathrm{diag}(\pi) \, M^\top$$
where in this particular setting
$X_1$ is $\Pr(\Sigma_{t-1})$, $X_2$ is $\Pr(\Sigma_t)$, $X_3$ is $\Pr(\Sigma_{t+1})$,
$\pi$ is the initial state vector, and
column $i$ of $M$ is the expected value of $x$ given hidden state $i$.
Spectral Algorithm Sketch
Calculate $E[X_2 \otimes X_1]$.
Calculate fast SVD of $E[X_2 \otimes X_1]$, keeping $k$ left singular vectors.
Reduce the data where $y = U^\top x$.
Compute the first three moments $E[Y_1]$, $E[Y_2 \otimes Y_1]$, $E[Y_3 \otimes Y_1 \otimes Y_2]$.
Consider a $U$ such that $U^\top M$ is invertible; then estimating the second and third moments with the reduced data $y = U^\top x$ gives, in the discrete case,
$$B(x) \equiv E[Y_3 \otimes Y_1 \otimes Y_2](\lambda(x)) \, E[Y_2 \otimes Y_1]^{-1} = (U^\top M) \, T \, \mathrm{diag}(\lambda(x)) \, (U^\top M)^{-1}.$$
Generalization to the Continuous Case
To generalize to the continuous case we need to take expectations, where
$$B(G(x)) = (U^\top M) \, T \, \mathrm{diag}(\lambda(x)) \, (U^\top M)^{-1} \frac{1}{\Pr(x)}$$
where $\Pr(x)$ is the marginal probability, and $G(x)$ is a function of $E[Y_2 \mid x_1]$.
$B(G(x))$ is exactly what we want, up to a constant factor depending on $x$, as
$$\Pr(Y_1, \ldots, Y_t) \equiv b_\infty^\top B(G(x_t)) \cdots B(G(x_1)) \, b_1.$$
Outline
Continuous Emission HMM
Continuous Emission HMM
Define $g(x) \equiv E[Y_2 \mid x_1]$.
Let $h_t$ be the probability vector associated with being in a particular state at time $t$. Then
$$E[y_2 \mid h_2] = U^\top M h_2.$$
Also,
$$E[h_2 \mid h_1] = T h_1,$$
thus
$$E[y_2 \mid h_1] = U^\top M T h_1.$$
Continuous HMM Emission
To establish a belief about $h_1$ given $x_1$, recall Bayes' formula:
$$\Pr(h_1 \mid x_1) = \frac{\Pr(x_1 \mid h_1) \Pr(h_1)}{\Pr(x_1)}$$
We can arrange each probability into a vector, and because in the indicator vector case the probability vector is the same as the expected value vector, we have, in vector notation,
$$E[h_1 \mid x_1] = \frac{\mathrm{diag}(\pi) \, \lambda(x)}{\pi^\top \lambda(x)}$$
and so, putting together the pieces, we get
$$E[y_2 \mid x_1] = \frac{U^\top M \, T \, \mathrm{diag}(\pi) \, \lambda(x)}{\pi^\top \lambda(x)}$$
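The Bayes step and the resulting one-step prediction can be written in a few lines. This is a sketch with made-up parameters; the helper names are hypothetical, and the reduction matrix $U$ defaults to the identity (no reduction) purely for illustration.

```python
import numpy as np

def posterior_state(pi, lam_x):
    """E[h1 | x1] = diag(pi) lambda(x) / (pi^T lambda(x))."""
    return pi * lam_x / (pi @ lam_x)

def expected_next_observation(M, T, pi, lam_x, U=None):
    """E[y2 | x1] = U^T M T E[h1 | x1]; U=None means no reduction (U = I)."""
    pred = M @ T @ posterior_state(pi, lam_x)
    return pred if U is None else U.T @ pred

# Hypothetical 2-state, 3-symbol HMM with column-stochastic T and M.
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])
M = np.array([[0.7, 0.1],
              [0.2, 0.3],
              [0.1, 0.6]])
pi = np.array([0.6, 0.4])

lam_x = M[0]                      # lambda(x) for the first symbol: Pr(x = 0 | h)
post = posterior_state(pi, lam_x)
pred = expected_next_observation(M, T, pi, lam_x)
print(post, pred)
```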
Continuous HMM Emission
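Recall that the goal is to isolate $\lambda(x)$. Note that
$$E[y_2 \otimes y_1]^{-1} g(x) = \frac{(M^\top U)^{-1} \lambda(x)}{\pi^\top \lambda(x)} \equiv G(x)$$
When this is plugged into our fully reduced version of $B(\gamma)$, we get
$$B(G(x)) = (U^\top M) \, T \, \mathrm{diag}(M^\top U \, G(x)) \, (U^\top M)^{-1} = (U^\top M) \, T \, \mathrm{diag}(\lambda(x)) \, (U^\top M)^{-1} \frac{1}{\Pr(x)}$$
where $\Pr(x)$ is the marginal probability. $B(G(x))$ is exactly what we want, up to a constant factor depending on $x$.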
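The identity can be checked numerically. In this sketch the HMM parameters and the emission values `lam_x` are made-up, and $U$ is taken from a QR factorization of $M$ (an assumption chosen only so that $U^\top M$ is invertible); the population form $E[y_2 \otimes y_1] = (U^\top M) \, T \, \mathrm{diag}(\pi) \, (U^\top M)^\top$ is used in place of a sample estimate.

```python
import numpy as np

# Hypothetical parameters (column-stochastic T, full-column-rank M).
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])
M = np.array([[0.7, 0.1],
              [0.2, 0.3],
              [0.1, 0.6]])
pi = np.array([0.6, 0.4])

U = np.linalg.qr(M)[0]                 # orthonormal basis of range(M)
S = U.T @ M                            # U^T M, invertible by construction
Sigma = S @ T @ np.diag(pi) @ S.T      # population E[y2 (x) y1]

def G(lam_x):
    g = S @ T @ (pi * lam_x) / (pi @ lam_x)   # g(x) = E[y2 | x1]
    return np.linalg.solve(Sigma, g)          # E[y2 (x) y1]^{-1} g(x)

def B_of_G(lam_x):
    # M^T U G(x) equals S^T G(x), which recovers lambda(x) / Pr(x).
    return S @ T @ np.diag(S.T @ G(lam_x)) @ np.linalg.inv(S)

lam_x = np.array([0.8, 0.3])           # hypothetical emission densities at some x
target = S @ T @ np.diag(lam_x) @ np.linalg.inv(S)
print(np.allclose(B_of_G(lam_x) * (pi @ lam_x), target))
```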
Spectral Estimation Algorithm
Algorithm 1: Computing observables for spectral estimation of an HMM, fully reduced third moment
1: Input: Training examples $x^{(i)}$ for $i \in \{1, \ldots, M\}$, where $x^{(i)} = (x_1^{(i)}, x_2^{(i)}, x_3^{(i)})$.
2: Compute $\hat{E}[x_2 \otimes x_1] = \frac{1}{M} \sum_{i=1}^{M} x_2^{(i)} x_1^{(i)\top}$.
3: Compute the left $k$ singular vectors corresponding to the top $k$ singular values of $\hat{E}[x_2 \otimes x_1]$. Call the matrix of these vectors $U$.
4: Reduce data: $y = U^\top x$.
5: Compute $\mu = \frac{1}{M} \sum_{i=1}^{M} y_1^{(i)}$, $\Sigma = \frac{1}{M} \sum_{i=1}^{M} y_2^{(i)} y_1^{(i)\top}$ and tensor $C = \frac{1}{M} \sum_{i=1}^{M} y_3^{(i)} \otimes y_1^{(i)} \otimes y_2^{(i)}$.
6: Set $b_1 = \mu$ and $b_\infty^\top = b_1^\top \Sigma^{-1}$.
7: Right multiply each slice of the tensor in the $y_2$ direction (so $y_2$ is being sliced up, leaving the $y_3 y_1^\top$ matrices intact) by $\Sigma^{-1}$ to form $B(\gamma) = C(\gamma) \Sigma^{-1}$.
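A sketch of steps 3 to 7 for the discrete (indicator-vector) case. To keep the check exact, it plugs in population moments computed from a known, made-up HMM instead of sample averages, and uses an exact SVD rather than a fast one; all names are hypothetical. Here `M` denotes the emission matrix from the earlier slides, not the sample count used in the algorithm listing.

```python
import numpy as np

def population_moments(T, M, pi):
    """Exact first three moments of indicator-vector observations."""
    P1 = M @ pi                                   # E[X1]
    P21 = M @ T @ np.diag(pi) @ M.T               # E[X2 (x) X1]
    # P3[a, b, c] = Pr(X3 = a, X1 = b, X2 = c)
    P3 = np.einsum('i,bi,ji,cj,kj,ak->abc', pi, M, T, M, T, M)
    return P1, P21, P3

def spectral_observables(P1, P21, P3, k):
    U = np.linalg.svd(P21)[0][:, :k]              # step 3: top-k left singular vectors
    mu = U.T @ P1                                 # step 5: reduced moments
    Sigma = U.T @ P21 @ U
    C = np.einsum('abc,ap,bq,cr->pqr', P3, U, U, U)
    Sigma_inv = np.linalg.inv(Sigma)
    b1 = mu                                       # step 6
    b_inf = b1 @ Sigma_inv
    def B(gamma):                                 # step 7: B(gamma) = C(gamma) Sigma^{-1}
        return np.einsum('pqr,r->pq', C, gamma) @ Sigma_inv
    return U, b1, b_inf, B

def spectral_likelihood(xs, U, b1, b_inf, B):
    v = b1
    for x in xs:                                  # x_1 is applied first
        v = B(U[x]) @ v                           # gamma = U^T e_x is row x of U
    return float(b_inf @ v)

def forward_likelihood(xs, T, M, pi):
    v = pi
    for x in xs:
        v = T @ (M[x] * v)                        # A(x) v with A(x) = T diag(lambda(x))
    return float(v.sum())

# Hypothetical 2-state, 3-symbol HMM.
T = np.array([[0.9, 0.2], [0.1, 0.8]])
M = np.array([[0.7, 0.1], [0.2, 0.3], [0.1, 0.6]])
pi = np.array([0.6, 0.4])

U, b1, b_inf, B = spectral_observables(*population_moments(T, M, pi), k=2)
xs = [0, 2, 1, 1]
print(spectral_likelihood(xs, U, b1, b_inf, B), forward_likelihood(xs, T, M, pi))
```

With exact moments the two likelihoods agree to floating-point precision; with sample moments the agreement is only approximate and depends on the amount of data.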
Similarity Transform from A(x) to B(x)
Unfortunately, $A(x)$ isn’t directly learnable. However, an appropriate similarity transformation of $A(x)$ (of which there are more than one) is learnable by the method of moments, bypassing the need to recover the HMM parameters, and still gets us what we want. Note that
$$P(x_1, \ldots, x_t) = \mathbf{1}^\top A(x_t) \cdots A(x_1) \, \pi$$
$$= \underbrace{\mathbf{1}^\top S^{-1}}_{b_\infty^\top} \, \underbrace{S A(x_t) S^{-1}}_{B(x_t)} \, S \cdots S^{-1} \, S A(x_1) S^{-1} \, \underbrace{S \pi}_{b_1}$$
$$\equiv b_\infty^\top B(x_t) \cdots B(x_1) \, b_1$$
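The cancellation of the $S^{-1} S$ pairs holds for any invertible $S$, which a small numerical check makes concrete. The parameters and the matrix `S` below are arbitrary illustrative choices, not quantities from the talk.

```python
import numpy as np

# Hypothetical parameters (column-stochastic T and O).
T = np.array([[0.9, 0.2], [0.1, 0.8]])
O = np.array([[0.7, 0.1], [0.2, 0.3], [0.1, 0.6]])
pi = np.array([0.6, 0.4])

S = np.array([[2.0, 1.0],          # any invertible matrix works here
              [1.0, 3.0]])
S_inv = np.linalg.inv(S)

b1 = S @ pi                        # b_1 = S pi
b_inf = np.ones(2) @ S_inv         # b_inf^T = 1^T S^{-1}

def A(x):
    return T @ np.diag(O[x])

def B(x):
    return S @ A(x) @ S_inv        # B(x) = S A(x) S^{-1}

def likelihood_A(xs):
    v = pi
    for x in xs:
        v = A(x) @ v
    return float(np.ones(2) @ v)

def likelihood_B(xs):
    v = b1
    for x in xs:
        v = B(x) @ v
    return float(b_inf @ v)

xs = [2, 0, 0, 1]
print(likelihood_A(xs), likelihood_B(xs))
```

The operators $B(x)$ differ from $A(x)$ entry by entry, yet the sequence probability is unchanged, which is why recovering $T$ and $\lambda(x)$ themselves is unnecessary.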
Markowitz Optimization
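Given a vector of current prices $p_t$ and unknown future prices $P_{t+\tau}$, the market value is
$$\Psi_\alpha = \alpha_{pos}^\top (P_{t+\tau} - p_t) \qquad (1)$$
Assuming that the market is Gaussian, the price distribution is
$$P_{t+\tau} \sim N(\mu, \Sigma) \qquad (2)$$
Therefore the distribution of the portfolio is
$$\Psi_\alpha \sim N\left(\alpha_{pos}^\top (\mu - p_t), \; \alpha_{pos}^\top \Sigma \, \alpha_{pos}\right) \qquad (3)$$
The allocation is optimized under exponential utility, having risk-aversion parameter $\zeta$, and the certainty equivalent is maximized by the quadratic program (QP)
$$\operatorname*{argmax}_{\alpha_{pos}} CE(\alpha_{pos}) = P^\top \alpha_{pos} - \frac{1}{2} \zeta \, \alpha_{pos}^\top \Sigma \, \alpha_{pos} \qquad (4)$$
where $P$ is the expected profit, roughly defined as $P_t = E[P_{t+\tau} \mid p_t]$.
Numeric optimizers seek to minimize, so define the objective function as $f(\alpha) \equiv -CE(\alpha)$.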
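As a sketch of the minimization convention, the code below minimizes $f(\alpha) = -CE(\alpha)$ by plain gradient descent, assuming the certainty equivalent reads $P^\top \alpha - \frac{1}{2} \zeta \, \alpha^\top \Sigma \, \alpha$ and that $P = \mu - p_t$. Gradient descent is a stand-in for whatever numeric optimizer is actually used, and all numbers are made-up.

```python
import numpy as np

def f(alpha, P, Sigma, zeta):
    """Negated certainty equivalent, f(alpha) = -CE(alpha)."""
    return -(P @ alpha - 0.5 * zeta * alpha @ Sigma @ alpha)

def grad_f(alpha, P, Sigma, zeta):
    return -P + zeta * Sigma @ alpha

def minimize_f(P, Sigma, zeta, steps=500):
    # Step size 1/L, where L = zeta * (largest eigenvalue of Sigma).
    step = 1.0 / (zeta * np.linalg.eigvalsh(Sigma)[-1])
    alpha = np.zeros_like(P)
    for _ in range(steps):
        alpha = alpha - step * grad_f(alpha, P, Sigma, zeta)
    return alpha

# Hypothetical inputs: expected future prices mu, current prices p_t.
mu = np.array([101.0, 50.5])
p_t = np.array([100.0, 50.0])
P = mu - p_t                       # expected profit per unit held
Sigma = np.array([[4.0, 1.0],
                  [1.0, 2.0]])
zeta = 0.5

alpha_numeric = minimize_f(P, Sigma, zeta)
alpha_closed_form = np.linalg.solve(Sigma, P) / zeta
print(alpha_numeric, alpha_closed_form)
```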