Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu...
![Page 1: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/1.jpg)
Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines
Yu Nishiyama and Sumio Watanabe
Tokyo Institute of Technology, Japan
![Page 2: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/2.jpg)
Background
Learning machines: mixture models, hidden Markov models, Bayesian networks
Information systems: pattern recognition, natural language processing, gene analysis
Mathematically, these learning machines are singular statistical models.
Bayes learning is effective.
![Page 3: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/3.jpg)
Problem: Calculations which include a Bayes posterior require huge computational cost.
Mean field approximation: a Bayes posterior is approximated by a trial distribution.
Stochastic complexity: accuracy of approximation, difference from regular statistical models, model selection
![Page 4: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/4.jpg)
The asymptotic behavior of mean field stochastic complexity has been studied for:
Mixture models [K. Watanabe et al., 2004]
Reduced rank regressions [Nakajima et al., 2005]
Hidden Markov models [Hosino et al., 2005]
Stochastic context-free grammars [Hosino et al., 2005]
Neural networks [Nakano et al., 2005]
![Page 5: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/5.jpg)
Purpose
We derive an upper bound of the mean field stochastic complexity of complete bipartite graph-type Boltzmann machines.
Boltzmann machines: graphical models, spin systems
![Page 6: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/6.jpg)
Table of Contents
Review: Bayes Learning, Mean Field Approximation, Boltzmann Machines (Complete Bipartite Graph-type)
Main Theorem
Outline of the Proof
Discussion and Conclusion
![Page 7: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/7.jpg)
Bayes Learning
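Training data: $X^n = (X_1, X_2, \dots, X_n)$
$q(x)$: true distribution
$p(x|\omega)$: model
$\varphi(\omega)$: prior
$$p(\omega|X^n) = \frac{1}{Z(X^n)}\,\varphi(\omega)\prod_{i=1}^{n} p(X_i|\omega)$$
: Bayes posterior
$$p(x|X^n) = \int p(x|\omega)\,p(\omega|X^n)\,d\omega$$
: Bayes predictive distribution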
![Page 8: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/8.jpg)
Mean Field Approximation (1)
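The Bayes posterior can be rewritten as
$$p(\omega|X^n) = \frac{1}{Z(X^n)}\,\varphi(\omega)\prod_{i=1}^{n} p(X_i|\omega) = \frac{1}{Z(X^n)}\exp\{-n\tilde{H}_n(\omega)\}\,\varphi(\omega).$$
We consider a Kullback distance from a trial distribution $f(\omega)$ to the Bayes posterior $p(\omega|X^n)$:
$$D[f(\omega)\,\|\,p(\omega|X^n)] = \int f(\omega)\log\frac{f(\omega)}{\frac{1}{Z(X^n)}\exp\{-n\tilde{H}_n(\omega)\}\,\varphi(\omega)}\,d\omega \ge 0.$$
Expanding the logarithm gives
$$-\log Z(X^n) \le \int f(\omega)\log\frac{f(\omega)}{\varphi(\omega)}\,d\omega + n\int f(\omega)\,\tilde{H}_n(\omega)\,d\omega.$$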
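The inequality on this slide says that the trial-distribution functional is never smaller than $-\log Z(X^n)$, with equality when the trial distribution is the posterior itself. A minimal numerical sketch of that statement follows. It is not taken from the slides: the one-dimensional parameter, the quadratic stand-in `h_tilde` for $\tilde{H}_n$, the standard normal prior, and the sample size `N_SAMPLES` are all hypothetical choices made only for illustration.

```python
import math

N_SAMPLES = 20  # hypothetical sample size n


def h_tilde(omega):
    # Hypothetical stand-in for \tilde H_n(omega); not the slides' definition.
    return 0.5 * (omega - 1.0) ** 2


def log_prior(omega):
    # Standard normal prior (an assumption for this toy example).
    return -0.5 * math.log(2 * math.pi) - 0.5 * omega ** 2


def log_normal(omega, mean, var):
    return -0.5 * math.log(2 * math.pi * var) - (omega - mean) ** 2 / (2 * var)


def integrate(func, lo=-10.0, hi=10.0, steps=20000):
    # Plain trapezoidal rule on [lo, hi].
    dx = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, steps):
        total += func(lo + k * dx)
    return total * dx


def minus_log_z():
    # -log Z with Z = integral of exp(-n * h_tilde) * prior.
    z = integrate(lambda w: math.exp(-N_SAMPLES * h_tilde(w) + log_prior(w)))
    return -math.log(z)


def functional(mean, var):
    # Integral of f log(f / prior) + n * integral of f * h_tilde, for f = N(mean, var).
    def integrand(w):
        log_f = log_normal(w, mean, var)
        return math.exp(log_f) * (log_f - log_prior(w) + N_SAMPLES * h_tilde(w))
    return integrate(integrand)
```

In this toy setting the exact posterior is normal with variance `1 / (N_SAMPLES + 1)` and mean `N_SAMPLES / (N_SAMPLES + 1)`, so `functional` evaluated there should agree with `minus_log_z()` up to integration error, while any other normal trial distribution gives a larger value.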
![Page 9: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/9.jpg)
Mean Field Approximation (2)
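Taking the expectation over the training data,
$$E_{X^n}[-\log Z(X^n)] \le E_{X^n}\left[\int f(\omega)\log\frac{f(\omega)}{\varphi(\omega)}\,d\omega + n\int f(\omega)\,\tilde{H}_n(\omega)\,d\omega\right].$$
When we restrict the trial distribution $f(\omega)$ to
$$f(\omega) = \prod_{i=1}^{d} f_i(\omega_i),$$
the $f(\omega)$ which minimizes the right-hand side is called mean field approximation. The minimum value
$$\bar{F}(n) = E_{X^n}\left[\min_{f}\left\{\int f(\omega)\log\frac{f(\omega)}{\varphi(\omega)}\,d\omega + n\int f(\omega)\,\tilde{H}_n(\omega)\,d\omega\right\}\right]$$
is called mean field stochastic complexity.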
![Page 10: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/10.jpg)
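Complete Bipartite Graph-type Boltzmann Machines
[Figure: complete bipartite graph with $K$ hidden units $y_1, y_2, y_3, \dots, y_K$ and $M$ units $x_1, x_2, \dots, x_M$; the weight $w_{ij}$ connects $y_i$ and $x_j$, up to $w_{KM}$.]
Each element of $x = \{x_j\}_{j=1}^{M}$ and $y = \{y_i\}_{i=1}^{K}$ takes a value in $\{1, -1\}$. The parametric model is
$$p(x|w) = \frac{\sum_{y}\exp\left(\sum_{i=1}^{K}\sum_{j=1}^{M} w_{ij}x_j y_i\right)}{\sum_{x}\sum_{y}\exp\left(\sum_{i=1}^{K}\sum_{j=1}^{M} w_{ij}x_j y_i\right)} = \frac{1}{Z(w)}\prod_{i=1}^{K}\sum_{y_i}\exp\left(\sum_{j=1}^{M} w_{ij}x_j y_i\right) = \frac{1}{Z(w)}\prod_{i=1}^{K}\cosh\left(\sum_{j=1}^{M} w_{ij}x_j\right),$$
where $Z(w)$ denotes the normalizing constant of each expression; the factor $2$ from $\sum_{y_i = \pm 1}\exp(a y_i) = 2\cosh a$ is absorbed into it.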
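The step from the sum over hidden units to the product of hyperbolic cosines can be checked by brute force on a small machine. The sketch below enumerates all spin configurations for a hypothetical weight matrix and compares the two forms of $p(x|w)$; the function names and the example weights are illustrative assumptions, not part of the slides.

```python
import itertools
import math


def joint_weight(w, x, y):
    # exp(sum_i sum_j w_ij * x_j * y_i) for one configuration (x, y).
    return math.exp(sum(w[i][j] * x[j] * y[i]
                        for i in range(len(y)) for j in range(len(x))))


def p_marginal(w, x):
    # p(x|w) obtained by summing the joint weight over all hidden states y.
    n_hidden, n_visible = len(w), len(w[0])
    all_y = list(itertools.product((-1, 1), repeat=n_hidden))
    all_x = list(itertools.product((-1, 1), repeat=n_visible))
    numerator = sum(joint_weight(w, x, y) for y in all_y)
    denominator = sum(joint_weight(w, xx, y) for xx in all_x for y in all_y)
    return numerator / denominator


def cosh_product(w, x):
    # prod_i cosh(sum_j w_ij * x_j), the unnormalized closed form.
    return math.prod(math.cosh(sum(wij * xj for wij, xj in zip(row, x)))
                     for row in w)


def p_cosh(w, x):
    # p(x|w) from the cosh form, normalized over all visible states.
    n_visible = len(w[0])
    z = sum(cosh_product(w, xx)
            for xx in itertools.product((-1, 1), repeat=n_visible))
    return cosh_product(w, x) / z
```

For any weight matrix `w` given as a list of $K$ rows of length $M$, `p_marginal(w, x)` and `p_cosh(w, x)` should agree up to floating-point error, because the constant $2^K$ cancels between numerator and normalizer. The enumeration costs $2^{K+M}$ terms, so it is only meant for very small $K$ and $M$.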
![Page 11: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/11.jpg)
True Distribution
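[Figure: the same bipartite graph with $M$ units $x_1, x_2, \dots, x_M$ and hidden units $y_1, \dots, y_{K_0}, y_{K_0+1}, \dots, y_K$; $w^*_{ij} \neq 0$ for the first $K_0$ hidden units and $w^*_{ij} = 0$ for the remaining ones.]
We assume that the true distribution is included in the parametric model $p(x|w)$ and the number of hidden units is $K_0$ $(K_0 \le K)$.
True distribution is
$$p(x|w^*) = \frac{1}{Z(w^*)}\prod_{i=1}^{K_0}\cosh\left(\sum_{j=1}^{M} w^*_{ij}x_j\right).$$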
![Page 12: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/12.jpg)
Main Theorem
The mean field stochastic complexity of complete bipartite graph-type Boltzmann machines has the following upper bound.
$$\bar{F}(n) \le \frac{KM + K_0 M}{4}\log n + C$$
$M$: the number of input and output units
$K$: the number of hidden units (learning machine)
$K_0$: the number of hidden units (true distribution)
$C$: constant
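To get a feel for the size of the bound, the coefficient of $\log n$ can be tabulated for concrete unit counts. The sketch below assumes the coefficient reads $(KM + K_0 M)/4$ and compares it with the regular-model coefficient $KM/2$ quoted in the Discussion slide; the function names and the example numbers are hypothetical.

```python
from fractions import Fraction


def mean_field_coefficient(k, k0, m):
    # Coefficient of log n in the mean field upper bound, read as (K*M + K0*M) / 4.
    if not 0 <= k0 <= k:
        raise ValueError("expected 0 <= K0 <= K")
    return Fraction(k * m + k0 * m, 4)


def regular_coefficient(k, m):
    # Coefficient of log n for a regular statistical model with K*M parameters.
    return Fraction(k * m, 2)
```

For example, with $K = 5$, $K_0 = 2$ and $M = 10$ the mean field coefficient is $35/2$ against $25$ for the regular model; the two coincide only when $K_0 = K$.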
![Page 13: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/13.jpg)
Outline of the Proof (Methods)
$$\bar{F}(n) \le \int \tilde{f}(w)\log\frac{\tilde{f}(w)}{\varphi(w)}\,dw + n\int \tilde{f}(w)\,H(w)\,dw$$
$$\varphi(w) = \frac{1}{(2\pi)^{KM/2}}\exp\left\{-\frac{1}{2}\sum_{i=1}^{K}\sum_{j=1}^{M}(w_{ij} - \hat{w}_{ij})^2\right\}$$
$$\tilde{f}(w) = \frac{1}{Z(N)}\prod_{i=1}^{K}\prod_{j=1}^{M}\exp\{-N_{ij}(w_{ij} - \hat{w}_{ij})^2\}$$
Here $\tilde{f}(w)$ is taken from the normal distribution family, $\varphi(w)$ is the prior, and $H(w)$ depends on the Boltzmann machine.
![Page 14: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/14.jpg)
Outline of the Proof [lemma]
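For Kullback information $H(\omega)$ of parameter $\omega \in R^d$, if there exists a value $\hat{\omega}$ such that $H(\hat{\omega}) = 0$ and the number of elements of the set
$$\left\{\, i \;;\; \frac{\partial^2 H(\hat{\omega})}{\partial \omega_i^2} \neq 0 \,\right\}$$
is less than or equal to $r$, mean field stochastic complexity has the following upper bound.
$$\bar{F}(n) \le \frac{d + r}{4}\log n + O(1)$$
[Figure: Hessian matrix of size $d$, with a non-zero block of size $r$ and 0 elsewhere.]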
![Page 15: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/15.jpg)
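We apply this lemma to the Boltzmann machines.
Kullback information is given by
$$H(w) = \sum_{x}\frac{\prod_{i=1}^{K_0}\cosh\left(\sum_{j=1}^{M} w^*_{ij}x_j\right)}{Z(w^*)}\log\frac{\prod_{i=1}^{K_0}\cosh\left(\sum_{j=1}^{M} w^*_{ij}x_j\right)\big/ Z(w^*)}{\prod_{i=1}^{K}\cosh\left(\sum_{j=1}^{M} w_{ij}x_j\right)\big/ Z(w)}.$$
The second order differential is
$$\left.\frac{\partial^2 H(w)}{\partial w_{ij}^2}\right|_{w=\hat{w}} = \left\langle \left(t_{ij} - \langle t_{ij}\rangle_{\hat{w}}\right)^2 \right\rangle_{\hat{w}}.$$
Here
$$t_{ij} = x_j\tanh\left(\sum_{l=1}^{M} w_{il}x_l\right), \qquad \langle f(x|w)\rangle_{\hat{w}} = \sum_{x} p(x|\hat{w})\,f(x|w).$$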
![Page 16: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/16.jpg)
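The parameter $w^*$ is a true parameter, so $H(w^*) = 0$ and we can take $\hat{w} = w^*$, with $w^*_{ij} = 0$ for $i \in \{K_0+1, \dots, K\}$.
Then, for $i \in \{K_0+1, \dots, K\}$, $t_{ij}$ becomes
$$t_{ij} = x_j\tanh\left(\sum_{l=1}^{M} w^*_{il}x_l\right) = x_j\tanh(0) = 0,$$
and
$$\left.\frac{\partial^2 H(w)}{\partial w_{ij}^2}\right|_{w=w^*} = \left\langle \left(t_{ij} - \langle t_{ij}\rangle_{w^*}\right)^2 \right\rangle_{w^*} = 0.$$
[Figure: Hessian matrix of size $KM$, with a non-zero block of size $K_0 M$ and 0 elsewhere.]
Then, $d = KM$ and $r = K_0 M$ hold.
By using the lemma, we have
$$\bar{F}(n) \le \frac{KM + K_0 M}{4}\log n + C.$$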
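The counting argument on this slide can be illustrated numerically. The sketch below evaluates the variance of $t_{ij} = x_j\tanh(\sum_l w_{il}x_l)$ under $p(x|w^*)$ for every weight and counts the non-zero entries. It relies on the standard identity that the Hessian of the Kullback information at the true parameter equals the Fisher information; the function names, the tolerance, and the example weights are hypothetical and not part of the slides.

```python
import itertools
import math


def cosh_product(w, x):
    # prod_i cosh(sum_l w_il * x_l), the unnormalized model probability.
    return math.prod(math.cosh(sum(wil * xl for wil, xl in zip(row, x)))
                     for row in w)


def hessian_diagonal(w_star):
    # Variance of t_ij under p(x|w*), for every pair (i, j).
    n_hidden, n_visible = len(w_star), len(w_star[0])
    xs = list(itertools.product((-1, 1), repeat=n_visible))
    weights = [cosh_product(w_star, x) for x in xs]
    z = sum(weights)
    probs = [u / z for u in weights]
    diag = [[0.0] * n_visible for _ in range(n_hidden)]
    for i in range(n_hidden):
        for j in range(n_visible):
            t = [x[j] * math.tanh(sum(w_star[i][l] * x[l]
                                      for l in range(n_visible)))
                 for x in xs]
            mean = sum(p * v for p, v in zip(probs, t))
            diag[i][j] = sum(p * (v - mean) ** 2 for p, v in zip(probs, t))
    return diag


def count_nonzero(diag, tol=1e-12):
    # Number of diagonal entries that are numerically non-zero.
    return sum(1 for row in diag for v in row if v > tol)
```

With a true parameter whose last $K - K_0$ rows are zero, the rows of `hessian_diagonal` for those hidden units vanish, so `count_nonzero` returns at most $K_0 M$, which is the value of $r$ used in the lemma.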
![Page 17: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/17.jpg)
Discussion
Comparison with other studies
[Figure: stochastic complexity versus the number of training data $n$, in the asymptotic area. Curves: regular statistical model, $\frac{KM}{2}\log n + C$; Bayes learning, upper bound by algebraic geometry [Yamazaki]; mean field approximation, derived result, upper bound $\frac{KM + K_0 M}{4}\log n + C$.]
![Page 18: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology,](https://reader035.fdocuments.us/reader035/viewer/2022062401/5a4d1b627f8b9ab0599adfc1/html5/thumbnails/18.jpg)
Conclusion
We derived an upper bound of the mean field stochastic complexity of complete bipartite graph-type Boltzmann machines.
Future works: lower bound, comparison with experimental results