Summary of Polson and Sokolov 2018: Deep Learning for Energy Markets
David Prentiss
OR750-004
November 12, 2018
The PJM Interconnection
I The Pennsylvania–New Jersey–Maryland Interconnection (PJM) is a regional transmission organization (RTO).
I It implements a wholesale electricity market for a network of producers and consumers in the Mid-Atlantic.
I Its primary purpose is to prevent outages or otherwise unmet demand.
I Obligations are exchanged in bilateral contracts, the day-ahead market, and the real-time market.
Local marginal price data
I Local Marginal Prices (LMP) are price data aggregated from prices at various locations and interconnection services in the network.
I They reflect the cost of producing and transmitting electricity in the network.
I Prices are non-linear because electricity.
I This paper proposes a NN to model price extremes.
Load vs. price
Load vs. previous load
RNN vs. long short-term memory
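Vanilla RNN
\[
h_t = \tanh\left( W \begin{pmatrix} h_{t-1} \\ x_t \end{pmatrix} \right)
\]
LSTM
\[
\begin{pmatrix} i \\ f \\ o \\ k \end{pmatrix}
= \begin{pmatrix} \sigma \\ \sigma \\ \sigma \\ \tanh \end{pmatrix}
\circ W \begin{pmatrix} h_{t-1} \\ x_t \end{pmatrix}
\]
\[
c_t = f \odot c_{t-1} + i \odot k, \qquad h_t = o \odot \tanh(c_t)
\]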
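The two recurrences above can be sketched in NumPy as follows. This is a minimal illustration, not the paper's implementation: the toy sizes, the random weights, and the row-wise stacking of the four gate blocks in W are assumptions, and biases are omitted to match the slide.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))


def rnn_step(W, h_prev, x):
    """Vanilla RNN: h_t = tanh(W [h_{t-1}; x_t])."""
    return np.tanh(W @ np.concatenate([h_prev, x]))


def lstm_step(W, h_prev, c_prev, x):
    """One LSTM step; W stacks the four gate blocks row-wise as (i, f, o, k)."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x])   # pre-activations, shape (4n,)
    i = sigmoid(z[0 * n:1 * n])           # input gate
    f = sigmoid(z[1 * n:2 * n])           # forget gate
    o = sigmoid(z[2 * n:3 * n])           # output gate
    k = np.tanh(z[3 * n:4 * n])           # candidate cell update
    c = f * c_prev + i * k                # c_t = f * c_{t-1} + i * k (elementwise)
    h = o * np.tanh(c)                    # h_t = o * tanh(c_t) (elementwise)
    return h, c


# Toy sizes (assumptions): 4 hidden units, 3 input features.
rng = np.random.default_rng(0)
n_hidden, n_input = 4, 3
W_rnn = rng.normal(size=(n_hidden, n_hidden + n_input))
W_lstm = rng.normal(size=(4 * n_hidden, n_hidden + n_input))
h0 = np.zeros(n_hidden)
c0 = np.zeros(n_hidden)
x1 = rng.normal(size=n_input)

h_rnn = rnn_step(W_rnn, h0, x1)
h_lstm, c_lstm = lstm_step(W_lstm, h0, c0, x1)
```

The only structural difference is the cell state c, which is carried forward additively and gated by f, so information can persist across many steps without passing through tanh at every step.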
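LSTM model
\[
\begin{pmatrix} i \\ f \\ o \\ k \end{pmatrix}
= \begin{pmatrix} \sigma \\ \sigma \\ \sigma \\ \tanh \end{pmatrix}
\circ W \begin{pmatrix} h_{t-1} \\ x_t \end{pmatrix}
\]
\[
c_t = f \odot c_{t-1} + i \odot k, \qquad h_t = o \odot \tanh(c_t)
\]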
Extreme value theory
I Extreme value analysis begins by filtering the data to select “extreme” values.
I Extreme values are selected by one of two methods.
  I Block maxima: Select the peak values after dividing the series into periods.
  I Peak over threshold: Select values larger than some threshold.
I The peak-over-threshold method is used in this paper.
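The two selection methods can be sketched in a few lines of Python. The load values below are made up for illustration and are not PJM data; the function names are hypothetical.

```python
def block_maxima(series, block_size):
    """Return the maximum of each consecutive block of length block_size.

    The final block may be shorter if len(series) is not a multiple
    of block_size.
    """
    return [max(series[i:i + block_size])
            for i in range(0, len(series), block_size)]


def peaks_over_threshold(series, threshold):
    """Return the values that strictly exceed the threshold."""
    return [y for y in series if y > threshold]


# Made-up hourly loads (illustrative values only).
loads = [28_500, 30_200, 31_400, 29_900, 32_100, 30_800, 27_600, 31_050]

maxima = block_maxima(loads, block_size=4)              # [31400, 32100]
exceedances = peaks_over_threshold(loads, threshold=31_000)  # [31400, 32100, 31050]
```

Block maxima keeps exactly one value per period, while peak over threshold keeps every exceedance, so it can retain several extremes from the same period.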
Peak over threshold
I The Pickands–Balkema–de Haan theorem (1974 and 1975) characterizes the asymptotic tail distribution of an unknown distribution.
I The distribution of events that exceed a threshold is approximated by the generalized Pareto distribution.
I Low threshold increases bias.
I High threshold increases variance.
Generalized Pareto distribution
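I CDF
\[
H(y \mid \sigma, \xi) = 1 - \left(1 + \xi \frac{y - u}{\sigma}\right)_+^{-1/\xi}
\]
I PDF
\[
h(y \mid \sigma, \xi) = \frac{1}{\sigma} \left(1 + \xi \frac{y - u}{\sigma}\right)^{-1/\xi - 1}
\]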
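As a sanity check on the two formulas, the density should be the derivative of the CDF. The sketch below implements both with the standard library and compares the PDF with a central finite difference of the CDF. The scale 500 and shape 0.1 are made-up values for illustration; only the threshold 31,000 comes from the slides.

```python
import math


def gpd_cdf(y, u, sigma, xi):
    """Generalized Pareto CDF for an observation y with threshold u."""
    z = (y - u) / sigma
    if z <= 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-z)        # exponential limit as xi -> 0
    base = max(1.0 + xi * z, 0.0)        # the (.)_+ truncation
    return 1.0 - base ** (-1.0 / xi)


def gpd_pdf(y, u, sigma, xi):
    """Generalized Pareto density, the derivative of gpd_cdf in y."""
    z = (y - u) / sigma
    if z < 0:
        return 0.0
    if xi == 0:
        return math.exp(-z) / sigma
    base = 1.0 + xi * z
    if base <= 0.0:
        return 0.0                       # outside the support when xi < 0
    return base ** (-1.0 / xi - 1.0) / sigma


# Threshold from the slides; sigma and xi are illustrative assumptions.
u, sigma, xi = 31_000.0, 500.0, 0.1
y = 31_800.0

density = gpd_pdf(y, u, sigma, xi)
# Central finite difference of the CDF with a step of 1 load unit.
numeric_slope = (gpd_cdf(y + 1.0, u, sigma, xi)
                 - gpd_cdf(y - 1.0, u, sigma, xi)) / 2.0
```

If the two numbers agree closely, the CDF and PDF are consistent with each other.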
Parameters
\[
h(y \mid \sigma, \xi) = \frac{1}{\sigma} \left(1 + \xi \frac{y - u}{\sigma}\right)^{-1/\xi - 1}
\]
I Location, u, is the threshold
I Scale, σ, is our learned parameter
I Shape, ξ = f(u, σ)? E_X[y] = σ + u ⟹ ξ = 0?
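On the shape question, a standard property of the generalized Pareto distribution may help. For ξ < 1 the mean is

\[
\mathbb{E}[y] = u + \frac{\sigma}{1 - \xi}, \qquad \xi < 1,
\]

so E[y] = σ + u holds exactly when ξ = 0. In that limit the distribution reduces to a shifted exponential:

\[
H(y \mid \sigma, 0) = 1 - \exp\left(-\frac{y - u}{\sigma}\right), \qquad y \ge u.
\]

This only shows that a mean of σ + u is consistent with ξ = 0; the slides do not say whether the paper fixes the shape this way.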
Fourier (ARIMA) model
Fourier (ARIMA) model vs DL
Demand forecasting DL–EVT
I DL–EVT Architecture
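\[
X \to \tanh\left(W^{(1)} X + b^{(1)}\right) \to Z^{(1)} \to \exp\left(\tanh\left(Z^{(1)}\right)\right) \to \sigma(X)
\]
I $W^{(1)} \in \mathbb{R}^{p \times 3}$, $x \in \mathbb{R}^p$, $p = 24$ (one day)
I Threshold, u = 31,000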
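A rough NumPy sketch of the forward pass follows, under stated assumptions. The slide does not say how the three hidden units are reduced to a single scale, so the output weights w2 and bias b2 below are hypothetical. The weights are random rather than trained, and x is a stand-in for 24 standardized hourly loads.

```python
import numpy as np


def dl_evt_scale(x, W1, b1, w2, b2):
    """Map one day of hourly inputs x (length p) to a positive GPD scale."""
    z1 = np.tanh(W1.T @ x + b1)                  # hidden layer Z^(1), 3 units
    # exp(tanh(.)) keeps the output positive and inside (e^-1, e).
    return float(np.exp(np.tanh(w2 @ z1 + b2)))


p = 24                                           # one day of hourly observations
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(p, 3))          # W^(1) in R^{p x 3}, as on the slide
b1 = np.zeros(3)
w2 = rng.normal(size=3)                          # hypothetical output weights (assumption)
b2 = 0.0                                         # hypothetical output bias (assumption)
x = rng.normal(size=p)                           # stand-in for standardized hourly loads

sigma = dl_evt_scale(x, W1, b1, w2, b2)
```

The exponential guarantees a positive scale, which the generalized Pareto likelihood requires. The bounded range implies the exceedances would need to be rescaled before fitting, which is a further assumption of this sketch.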
Vanilla DL vs. DL-EVT