Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator...
-
Upload
cornelia-walsh -
Category
Documents
-
view
220 -
download
2
Transcript of Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator...
![Page 1: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/1.jpg)
Introduction to Radial Basis Function Networks
![Page 2: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/2.jpg)
Content
Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function Approximation The Projection Matrix Learning the Kernels Bias-Variance Dilemma The Effective Number of Parameters Model Selection Incremental Operations
![Page 3: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/3.jpg)
Introduction to Radial Basis Function Networks
Overview
![Page 4: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/4.jpg)
Typical Applications of NN
Pattern Classification
Function Approximation
Time-Series Forecasting
( )l f xl C N
( )fy x mY R y
mX R x
nX R x
1 2 3( ) ( , , , )t t tft x x x x
![Page 5: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/5.jpg)
Function Approximation
mX R fnY R
:f X YUnknown
mX R nY R
ˆ :f X YApproximator
f
![Page 6: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/6.jpg)
NeuralNetwork
Supervised Learning
UnknownFunction
ixiy
ˆ iy1,2,i +
+
ie
![Page 7: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/7.jpg)
Neural Networks as Universal Approximators
Feedforward neural networks with a single hidden layer of sigmoidal units are capable of approximating uniformly any continuous multivariate function, to any desired degree of accuracy.
– Hornik, K., Stinchcombe, M., and White, H. (1989). "Multilayer Feedforward Networks are Universal Approximators," Neural Networks, 2(5), 359-366.
Like feedforward neural networks with a single hidden layer of sigmoidal units, it can be shown that RBF networks are universal approximators.
– Park, J. and Sandberg, I. W. (1991). "Universal Approximation Using Radial-Basis-Function Networks," Neural Computation, 3(2), 246-257.
– Park, J. and Sandberg, I. W. (1993). "Approximation and Radial-Basis-Function Networks," Neural Computation, 5(2), 305-316.
![Page 8: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/8.jpg)
Statistics vs. Neural Networks
Statistics Neural Networksmodel networkestimation learningregression supervised learninginterpolation generalizationobservations training setparameters (synaptic) weightsindependent variables inputsdependent variables outputsridge regression weight decay
![Page 9: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/9.jpg)
Introduction to Radial Basis Function Networks
The Model ofFunction Approximator
![Page 10: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/10.jpg)
Linear Models
1
(( ) )ii
i
m
f w
x xFixed BasisFunctions
Weights
![Page 11: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/11.jpg)
Linear Models
Feature VectorsInputs
HiddenUnits
OutputUnits
• Decomposition• Feature Extraction• Transformation
Linearly weighted output
1
(( ) )ii
i
m
f w
x x
y
11 22 mm
x1 x2 xn
w1 w2 wm
x =
![Page 12: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/12.jpg)
Linear Models
Feature VectorsInputs
HiddenUnits
OutputUnits
• Decomposition• Feature Extraction• Transformation
Linearly weighted output
y
11 22 mm
x1 x2 xn
w1 w2 wm
x =
Can you say some bases?Can you say some bases?
1
(( ) )ii
i
m
f w
x x
![Page 13: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/13.jpg)
Example Linear Models
Polynomial
Fourier Series
( ) ii
iwf xx
0exp 2( )k
k jf w k xx
Are they orthogonal bases?Are they orthogonal bases?
( ) , 0,1,2,ii ix x
0( ) exp 2 0,1,, 2, k x j k x k
1
(( ) )ii
i
m
f w
x x
![Page 14: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/14.jpg)
y
11 22 mm
x1 x2 xn
w1 w2 wm
x =
Single-Layer Perceptrons as Universal Aproximators
HiddenUnits
With sufficient number of sigmoidal units, it can be a universal approximator.
1
(( ) )ii
i
m
f w
x x
![Page 15: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/15.jpg)
y
11 22 mm
x1 x2 xn
w1 w2 wm
x =
Radial Basis Function Networks as Universal Aproximators
HiddenUnits
With sufficient number of radial-basis-function units, it can also be a universal approximator.
1
(( ) )ii
i
m
f w
x x
![Page 16: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/16.jpg)
Non-Linear Models
1
(( ) )ii
i
m
f w
x xAdjusted by the
Learning process
Weights
1
(( ) )ii
i
m
f w
x x
![Page 17: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/17.jpg)
Introduction to Radial Basis Function Networks
The Radial Basis Function Networks
![Page 18: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/18.jpg)
Radial Basis Functions1
(( ) )ii
i
m
f w
x x
Center
Distance Measure
Shape
Three parameters for a radial function:
xi
r = ||x xi||
i(x)= (||x xi||)
![Page 19: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/19.jpg)
Typical Radial Functions
Gaussian
Hardy Multiquadratic
Inverse Multiquadratic
2
22 0 and r
r e r
2 2 0 and r r c c c r
2 2 0 and r c r c c r
![Page 20: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/20.jpg)
Gaussian Basis Function (=0.5,1.0,1.5)
2
22 0 and r
r e r
0.5 1.0
1.5
![Page 21: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/21.jpg)
Inverse Multiquadratic
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
-10 -5 0 5 10
2 2 0 and r c r c c r
c=1
c=2c=3
c=4c=5
![Page 22: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/22.jpg)
++
Most General RBF
1( ) ( )i iT
i μ x μx x
+1μ
2μ 3μ
Basis {i: i =1,2,…} is `near’ orthogonal.
![Page 23: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/23.jpg)
Properties of RBF’s
On-Center, Off Surround
Analogies with localized receptive fields found in several biological structures, e.g.,– visual cortex;– ganglion cells
![Page 24: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/24.jpg)
The Topology of RBF
Feature Vectorsx1 x2 xn
y1 ym
Inputs
HiddenUnits
OutputUnits
Projection
Interpolation
As a function approximator ( )y f x
![Page 25: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/25.jpg)
The Topology of RBF
Feature Vectorsx1 x2 xn
y1 ym
Inputs
HiddenUnits
OutputUnits
Subclasses
Classes
As a pattern classifier.
![Page 26: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/26.jpg)
Introduction to Radial Basis Function Networks
RBFN’s forFunction Approximation
![Page 27: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/27.jpg)
The idea
x
y Unknown Function to Approximate
TrainingData
![Page 28: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/28.jpg)
The idea
x
y Unknown Function to Approximate
TrainingData
Basis Functions (Kernels)
![Page 29: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/29.jpg)
The idea
x
y
Basis Functions (Kernels)
Function Learned
1
)) ((m
iiiwy f
xx
![Page 30: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/30.jpg)
The idea
x
y
Basis Functions (Kernels)
Function Learned
NontrainingSample
1
)) ((m
iiiwy f
xx
![Page 31: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/31.jpg)
The idea
x
y Function Learned
NontrainingSample
1
)) ((m
iiiwy f
xx
![Page 32: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/32.jpg)
Radial Basis Function Networks as Universal Aproximators
1
( ) ( )m
ii
iy wf
x x
x1 x2 xn
w1 w2 wm
x =
Training set ( )
1
( ),p
k
k
ky
xT
Goal (( ))k kfy x for all k
2
( )
1
( )min kp
k
k
SSE fy
x
2
)(
1
) (
1
p mk
ik
ki
i
y w
x
![Page 33: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/33.jpg)
Learn the Optimal Weight Vector
w1 w2 wm
x1 x2 xnx =
Training set ( )
1
( ),p
k
k
ky
xT
Goal (( ))k kfy x for all k
1
( ) ( )m
ii
iy wf
x x
2
( )
1
( )min kp
k
k
SSE fy
x
2
)(
1
) (
1
p mk
ik
ki
i
y w
x
![Page 34: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/34.jpg)
2
( )
1
( )min kp
k
k
SSE fy
x
2
)(
1
) (
1
p mk
ik
ki
i
y w
x
Regularization
Training set ( )
1
( ),p
k
k
ky
xT
Goal (( ))k kfy x for all k
2
1
m
iiiw
2
1
m
iiiw
0i
If regularization is unneeded, set 0i
C
![Page 35: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/35.jpg)
Learn the Optimal Weight Vector
22( )
1
( )
1
p mk
ki
ii
k wfyC
xMinimize
1
( ) ( )m
ii
iwf
x x
(
( )
) ( )
1
2 2kp
jk
j
k
k
j j
wfC
fw w
y
x
x
( ) ( ) ( )
1
2 2p
k kj
kjj
k f wy
x x
0
( ) * ( ) ( )
1 1
)* (p p
k k kj j
k kjj
kwf y
x x x
*
1
*( ) ( )i
m
ii
wf
x x
![Page 36: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/36.jpg)
Learn the Optimal Weight Vector
(1) ( ), ,T
pj j j φ x x
* * (1) * ( ), ,T
pf ff x xDefine
(1) ( ), ,Tpy yy
* *j
T Tj jjw φ f φ y 1, ,j m
( ) * ( ) ( )
1 1
)* (p p
k k kj j
k kjj
kwf y
x x x
1
( ) ( )m
ii
iwf
x x *
1
*( ) ( )i
m
ii
wf
x x
![Page 37: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/37.jpg)
Learn the Optimal Weight Vector
*1 1
*2 2
*1
*
*
1
2 2
*
T T
T T
T Tm mmm
w
w
w
φ f φ y
φ f φ y
φ f φ y
* *j
T Tj jjw φ f φ y 1, ,j m
1 2, , , mΦ φ φ φ
1
2
m
Λ
* * *1 2* , , ,
T
mw w ww
Define
**T T wΦ Λf Φ y
![Page 38: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/38.jpg)
Learn the Optimal Weight Vector
**T T wΦ Λf Φ y
* (1)
* (2)*
* ( )p
f
f
f
x
xf
x
*
1
*( ) ( )i
m
ii
wf
x x
(1)
1
(2)
1
(
*
*
1
*
)
k
k
m
kk
m
kk
mP
kk
k
w
w
w
x
x
x
(1) (1) (1)*1 21
(2) (2) (2) *1 2 2
*( ) ( ) ( )
1 2
m
m
p p pm
m
w
w
w
x x x
x x x
x x x
*Φw
* *T T ΛwΦ wΦ Φ y
![Page 39: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/39.jpg)
Learn the Optimal Weight Vector
* *T T ΛwΦ wΦ Φ y
*T T wΦ ΛΦ Φ y
1* T T Λw Φ Φ Φ y
Variance Matrix
1 TA Φ y
1 :ADesign Matrix:Φ
![Page 40: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/40.jpg)
Summary
1* T T Λw Φ Φ Φ y
1 TA Φ y
(1)
(2)
( )
j
jj
pj
x
xφ
x
(1)
(2)
( )p
y
y
y
y
1 2, , , mΦ φ φ φ
*1*2
*
*
m
w
w
w
w
1
2
m
Λ
1
)) ((m
iiiwy f
xx
Training set ( )
1
( ),p
k
k
ky
xT
![Page 41: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/41.jpg)
Introduction to Radial Basis Function Networks
The Projection Matrix
![Page 42: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/42.jpg)
The Empirical-Error Vector
1* Tw A Φ y
*1w *
2w *mw
( )kx ( )1
kx ( )2kx ( )k
nx( )kx ( )1
kx ( )2kx ( )k
nx
UnknownFunction
UnknownFunction
y ** f Φw
1φ 2φ mφ
![Page 43: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/43.jpg)
The Empirical-Error Vector
1* Tw A Φ y
*1w *
2w *mw
( )kx ( )1
kx ( )2kx ( )k
nx( )kx ( )1
kx ( )2kx ( )k
nx
UnknownFunction
UnknownFunction
y ** f Φw
1φ 2φ mφ
Error Vector
* e y f * y Φw 1 T ΦAy Φ y
1 Tp
AI Φ Φ yPy
![Page 44: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/44.jpg)
Sum-Squared-Error
e Py1 T
p P I Φ ΦA
T A Φ Φ Λ
1 2, , , mΦ φ φ φError Vector
T Py Py2Ty P y
2( ) * ( )
1
pk k
k
SSE y f
x
If =0, the RBFN’s learning algorithm is to minimize SSE (MSE).
![Page 45: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/45.jpg)
The Projection Matrix
e Py1 T
p P I Φ ΦA
T A Φ Φ Λ
1 2, , , mΦ φ φ φError Vector
2TSSE y P y
1 2( , , , )mspane φ φ φΛ 0
( )P Py Pee Py
Ty Py
![Page 46: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/46.jpg)
Introduction to Radial Basis Function Networks
Learning the Kernels
![Page 47: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/47.jpg)
RBFN’s as Universal Approximators
x1 x2 xn
y1 yl
12 m
w11 w12w1m
wl1 wl2wlml
Training set
) ( )(
1,
pk
k
k
yxT
2
2( ) exp
2j
jj
x μx
Kernels
![Page 48: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/48.jpg)
What to Learn?
Weights wij’s Centers j’s of j’s Widths j’s of j’s
Number of j’s Model Selection
x1 x2 xn
y1 yl
12 m
w11 w12w1m
wl1 wl2wlml
![Page 49: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/49.jpg)
One-Stage Learning
( ) ( ) ( )1
k k kij i i jw y f x x
( ) ( )
1
lk k
i ij jj
f w
x x
( ) ( ) ( )2 2
1
ljk k k
j j ij i iij
w y f
x μμ x x
2
( ) ( ) ( )3 3
1
ljk k k
j j ij i iij
w y f
x μx x
![Page 50: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/50.jpg)
One-Stage Learning
( ) ( )
1
lk k
i ij jj
f w
x xThe simultaneous updates of all three sets of parameters may be suitable for non-stationary environments or on-line setting.
The simultaneous updates of all three sets of parameters may be suitable for non-stationary environments or on-line setting.
( ) ( ) ( )1
k k kij i i jw y f x x
( ) ( ) ( )2 2
1
ljk k k
j j ij i iij
w y f
x μμ x x
2
( ) ( ) ( )3 3
1
ljk k k
j j ij i iij
w y f
x μx x
![Page 51: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/51.jpg)
x1 x2 xn
y1 yl
12 m
w11 w12w1m
wl1 wl2wlml
Two-Stage Training
Step 1
Step 2
Determines Centers j’s of j’s. Widths j’s of j’s. Number of j’s.
Determines Centers j’s of j’s. Widths j’s of j’s. Number of j’s.
Determines wij’s.Determines wij’s.
E.g., using batch-learning.
![Page 52: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/52.jpg)
Train the Kernels
![Page 53: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/53.jpg)
+
+
+
+
+
Unsupervised Training
![Page 54: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/54.jpg)
Subset Selection– Random Subset Selection– Forward Selection– Backward Elimination
Clustering Algorithms– KMEANS– LVQ
Mixture Models– GMM
Methods
![Page 55: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/55.jpg)
Subset Selection
![Page 56: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/56.jpg)
Random Subset Selection
Randomly choosing a subset of points from training set
Sensitive to the initially chosen points.
Using some adaptive techniques to tune– Centers– Widths– #points
![Page 57: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/57.jpg)
Clustering Algorithms
Partition the data points into K clusters.
Partition the data points into K clusters.
![Page 58: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/58.jpg)
Clustering Algorithms
Is such a partition satisfactory?
Is such a partition satisfactory?
![Page 59: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/59.jpg)
Clustering Algorithms
How about this?How about this?
![Page 60: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/60.jpg)
+
++
+
Clustering Algorithms
1
2
3
4
![Page 61: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/61.jpg)
Introduction to Radial Basis Function Networks
Bias-Variance Dilemma
![Page 62: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/62.jpg)
Goal Revisit
Minimize Prediction Error
• Ultimate Goal Generalization
• Goal of Our Learning Procedure
Minimize Empirical Error
![Page 63: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/63.jpg)
Badness of Fit Underfitting
– A model (e.g., network) that is not sufficiently complex can fail to detect fully the signal in a complicated data set, leading to underfitting.
– Produces excessive bias in the outputs. Overfitting
– A model (e.g., network) that is too complex may fit the noise, not just the signal, leading to overfitting.
– Produces excessive variance in the outputs.
![Page 64: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/64.jpg)
Underfitting/Overfitting Avoidance
Model selection Jittering Early stopping Weight decay
– Regularization– Ridge Regression
Bayesian learning Combining networks
![Page 65: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/65.jpg)
Best Way to Avoid Overfitting
Use lots of training data, e.g.,– 30 times as many training cases as there are we
ights in the network.– for noise-free data, 5 times as many training ca
ses as weights may be sufficient.
Don’t arbitrarily reduce the number of weights for fear of underfitting.
![Page 66: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/66.jpg)
Badness of Fit
Underfit Overfit
![Page 67: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/67.jpg)
Badness of Fit
Underfit Overfit
![Page 68: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/68.jpg)
Bias-Variance Dilemma
Underfit Overfit
Large bias Small bias
Small variance Large variance
However, it's not really a dilemma.
![Page 69: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/69.jpg)
Underfit Overfit
Large bias Small bias
Small variance Large variance
Bias-Variance Dilemma
More on overfitting
Easily lead to predictions that are far beyond the range of the training data.
Produce wild predictions in multilayer perceptrons even with noise-free data.
More on overfitting
Easily lead to predictions that are far beyond the range of the training data.
Produce wild predictions in multilayer perceptrons even with noise-free data.
![Page 70: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/70.jpg)
Bias-Variance Dilemma
It's not really a dilemma.
fitmore
overfitmore
underfit
Bias
Variance
![Page 71: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/71.jpg)
Bias-Variance Dilemma
Sets of functionsE.g., depend on # hidden nodes used.
The true modelnoise
bias
Solution obtained with training set 2.
Solution obtained with training set 1.
Solution obtained with training set 3.
biasbias
The mean of the bias=?
The variance of the bias=?
![Page 72: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/72.jpg)
Bias-Variance Dilemma
Sets of functionsE.g., depend on # hidden nodes used.
The true modelnoise
The mean of the bias=?
The variance of the bias=?
bias
Variance
![Page 73: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/73.jpg)
Model Selection
Sets of functionsE.g., depend on # hidden nodes used.
The true modelnoise
bias
Variance
Reduce the effective number of parameters.
Reduce the number of hidden nodes.
![Page 74: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/74.jpg)
Bias-Variance Dilemma
Sets of functionsE.g., depend on # hidden nodes used.
The true modelnoise
bias
( )g x( ) ( )y g x x
( )f x
Goal: 2min ( ) ( )E y f x x
![Page 75: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/75.jpg)
Bias-Variance Dilemma
Goal: 2min ( ) ( )E y f x x
2( ) ( )E y f x x 2( ) ( )E g f x x
2 2( ) ( ) 2 ( ) ( )E g f g f x x x x
2 2( ) ( ) 2 ( ) ( )E g f E g f E x x x x
Goal: 2min ( ) ( )E g f x x 2min ( ) ( )E y f x x
0 constant
![Page 76: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/76.jpg)
Bias-Variance Dilemma
2( ) ( )E g f x x 2[ ( )] [( ) () )( ]E f E fE g f x xx x
2[ ]( )) (E fE g x x 2[ ]( )) (E fE f x x
[ ( )] [ ( )) ( ]2 ( )E g fE f E f x xx x
2 2[ ( )] [ ( )]( ) ( ) [ ( )] [2 ( ) ( ) ( )]E f E f E f EE g g f ff x x x xx x x x
( ) ( )[ ( )] [ ( )]E E f Eg f f x xxx
( ) ( )E g f x x [ )]( ()EE fg x x [ ( ()] )E E f f x x [ ( )] [ ( )]E f E fE x x
( ) ( )E g E f x x [ )]( () EE fg x x [ ( )] ( )EE f f x x [ ( )] [ ( )]E f E f x x 0
2 22( ) ( ) ( ) ( )E y f E E g f x x x x
0
![Page 77: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/77.jpg)
Bias-Variance Dilemma
2 22( ) ( ) ( ) ( )E y f E E g f x x x x
2 22 [ ( )] [ ( )( ( ]) )E fE E fE g E f x x xx
Goal: 2min ( ) ( )E y f x x
noise bias2 variance
Cannot be minimized
Minimize both bias2 and variance
![Page 78: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/78.jpg)
Model Complexity vs. Bias-Variance
2 22( ) ( ) ( ) ( )E y f E E g f x x x x
2 22 [ ( )] [ ( )( ( ]) )E fE E fE g E f x x xx
Goal: 2min ( ) ( )E y f x x
noise bias2 variance
ModelComplexity(Capacity)
![Page 79: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/79.jpg)
Bias-Variance Dilemma
2 22( ) ( ) ( ) ( )E y f E E g f x x x x
2 22 [ ( )] [ ( )( ( ]) )E fE E fE g E f x x xx
Goal: 2min ( ) ( )E y f x x
noise bias2 variance
ModelComplexity(Capacity)
![Page 80: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/80.jpg)
Example (Polynomial Fits)
22exp( 16 )y x x
![Page 81: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/81.jpg)
Example (Polynomial Fits)
![Page 82: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/82.jpg)
Example (Polynomial Fits)
Degree 1 Degree 5
Degree 10 Degree 15
![Page 83: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/83.jpg)
Introduction to Radial Basis Function Networks
The Effective Number of Parameters
![Page 84: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/84.jpg)
Variance Estimation
Mean
Variance 22
1
1ˆ
p
ii
xp
In general, not available.
![Page 85: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/85.jpg)
Variance Estimation
Mean1
1ˆ
p
ii
x xp
Variance 22 2
1
ˆ1
1ˆ
p
ii
s xp
Loss 1 degree of freedom
![Page 86: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/86.jpg)
Simple Linear Regression
0 1y x 2~ (0, )N
01
y
x
y
x
![Page 87: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/87.jpg)
Simple Linear Regression
01
y
x
y
x
01ˆ
ˆ
y
x
0 1y x 2~ (0, )N
21
ˆp
i ii
SSE y y
Minimize
![Page 88: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/88.jpg)
Mean Squared Error (MSE)
01
y
x
y
x
01ˆ
ˆ
y
x
0 1y x 2~ (0, )N
21
ˆp
i ii
SSE y y
Minimize
2
2ˆ
SSEMSE
p
2
2ˆ
SSEMSE
p
Loss 2 degrees
of freedom
![Page 89: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/89.jpg)
Variance Estimation
2ˆSSE
MSEmp
Loss m degrees of freedom
m: #parameters of the model
![Page 90: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/90.jpg)
The Number of Parameters
w1 w2 wm
x1 x2 xnx =
1
( ) ( )m
ii
iy wf
x x
m#degrees of freedom:
![Page 91: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/91.jpg)
The Effective Number of Parameters ()
w1 w2 wm
x1 x2 xnx =
1
( ) ( )m
ii
iy wf
x x The projection Matrix
1 Tp
P I Φ ΦA
T A Φ Φ Λ
Λ 0
( )p trace P m
![Page 92: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/92.jpg)
The Effective Number of Parameters ()
The projection Matrix
1 Tp
P I Φ ΦA
T A Φ Φ Λ
Λ 0
1( ) Tptrace trace AP I Φ Φ
1 Tp trace ΦA Φ
( ) ( ) ( )trace trace trace A B A BFacts:( ) ( )trace traceAB BA
1 TTp trace
Φ Φ ΦΦ
Pf)
1T Tp trace
Φ ΦΦ Φ
mp trace I
p m ( )p trace P m
![Page 93: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/93.jpg)
Regularization
((
1
)
2
)p
k
k
k fy
x2
1
m
iiiw
Empirial
Er
Model s
penalror ts
yCo t
2
( )
1 1
( )p m
ki
k
k
iiwy
x
Penalize models withlarge weightsSSE
( )p trace PThe effective number of parameters:
![Page 94: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/94.jpg)
Regularization
((
1
)
2
)p
k
k
k fy
x2
1
m
iiiw
Empirial
Er
Model s
penalror ts
yCo t
2
( )
1 1
( )p m
ki
k
k
iiwy
x
Penalize models withlarge weightsSSE Without penalty
(i=0), there are m degrees of freedom
to minimize SSE (Cost).
The effective number of
parameters =m.
Without penalty (i=0), there are m
degrees of freedom to minimize SSE
(Cost).
The effective number of
parameters =m.
The effective number of parameters:
( )p trace P
![Page 95: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/95.jpg)
Regularization
((
1
)
2
)p
k
k
k fy
x2
1
m
iiiw
Empirial
Er
Model s
penalror ts
yCo t
2
( )
1 1
( )p m
ki
k
k
iiwy
x
Penalize models withlarge weightsSSE
With penalty (i>0), the liberty to
minimize SSE will be reduced.
The effective number of
parameters <m.
With penalty (i>0), the liberty to
minimize SSE will be reduced.
The effective number of
parameters <m.
The effective number of parameters:
( )p trace P
![Page 96: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/96.jpg)
Variance Estimation
2ˆ MSE
Loss degrees of freedom
The effective number of parameters:
( )p trace P
SSE
p
![Page 97: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/97.jpg)
Variance Estimation
2ˆ MSE
The effective number of parameters:
( )p trace P
( )
SSE
trace
P
![Page 98: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/98.jpg)
Introduction to Radial Basis Function Networks
Model Selection
![Page 99: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/99.jpg)
Model Selection Goal
– Choose the fittest model Criteria
– Least prediction error Main Tools (Estimate Model Fitness)
– Cross validation– Projection matrix
Methods– Weight decay (Ridge regression)– Pruning and Growing RBFN’s
![Page 100: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/100.jpg)
Empirical Error vs. Model Fitness
Minimize Prediction Error
• Ultimate Goal Generalization
• Goal of Our Learning Procedure
Minimize Empirical Error
Minimize Prediction Error
(MSE)
![Page 101: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/101.jpg)
Estimating Prediction Error
When you have plenty of data use independent test sets– E.g., use the same training set to
train different models, and choose the best model by comparing on the test set.
When data is scarce, use– Cross-Validation– Bootstrap
![Page 102: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/102.jpg)
Cross Validation
Simplest and most widely used method for estimating prediction error.
Partition the original set into several different ways and to compute an average score over the different partitions, e.g.,– K-fold Cross-Validation– Leave-One-Out Cross-Validation– Generalize Cross-Validation
![Page 103: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/103.jpg)
K-Fold CV
Split the set, say, D of available input-output patterns into k mutually exclusive subsets, say D1, D2, …, Dk.
Train and test the learning algorithm k times, each time it is trained on D\Di and tested on Di.
![Page 104: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/104.jpg)
K-Fold CV
Available DataAvailable Data
![Page 105: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/105.jpg)
K-Fold CV
Available DataAvailable DataD1 D2 D3. . . Dk
D1 D2 D3. . . Dk
D1 D2 D3. . . Dk
D1 D2 D3. . . Dk
D1 D2 D3. . . Dk
Test Set
Training Set
Estimate 2
![Page 106: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/106.jpg)
Leave-One-Out CV
Split the p available input-output patterns into a training set of size p1 and a test set of size 1.
Average the squared error on the left-out pattern over the p possible ways of partition.
A special case of k-fold CV.
![Page 107: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/107.jpg)
Error Variance Predicted by LOO
A special case of k-fold CV.
22
1
1ˆ ( )
p
LOO i i ii
y fp
x
( , ) : 1,2, ,k kD y k p x
\ ( , )i i iD D y x
Available input-output patterns.
1, ,i p Training sets of LOO.
if Function learned using Di as training set.
The estimate for the variance of prediction error using LOO:
Error-square for the left-out element.
![Page 108: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/108.jpg)
Error Variance Predicted by LOO
A special case of k-fold CV.
22
1
1ˆ ( )
p
LOO i i ii
y fp
x
( , ) : 1,2, ,k kD y k p x
\ ( , )i i iD D y x
Available input-output patterns.
1, ,i p Training sets of LOO.
if Function learned using Di as training set.
The estimate for the variance of prediction error using LOO:
Error-square for the left-out element.
Given a model, the function with least empirical error for Di.
Given a model, the function with least empirical error for Di.
As an index of model’s fitness.We want to find a model also minimize
this.
As an index of model’s fitness.We want to find a model also minimize
this.
![Page 109: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/109.jpg)
Error Variance Predicted by LOO
A special case of k-fold CV.
22
1
1ˆ ( )
p
LOO i i ii
y fp
x
( , ) : 1,2, ,k kD y k p x
\ ( , )i i iD D y x
Available input-output patterns.
1, ,i p Training sets of LOO.
if Function learned using Di as training set.
The estimate for the variance of prediction error using LOO:
Error-square for the left-out element.
How to estimate?How to estimate?
Are there any efficient ways?
Are there any efficient ways?
![Page 110: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/110.jpg)
Error Variance Predicted by LOO
22
1
1ˆ ( )
p
LOO i i ii
y fp
x
Error-square for the left-out element.
2 2(1
ˆ ˆ ( )) ˆTLOO di
pag y PP Py
![Page 111: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/111.jpg)
Generalized Cross-Validation
2 2(1
ˆ ˆ ( )) ˆTLOO di
pag y PP Py
( )( ) p
tracediag
p
PP I
2
22
ˆ ˆˆ
( )
T
GCV
p
trace
y P y
P
2
2
ˆ ˆ
( )
Tp
p
y P y
![Page 112: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/112.jpg)
More Criteria Based on CV
22
2
ˆ ˆˆ
( )
T
GCV
p
p
y P y GCV
(Generalized CV)
22 ˆ ˆ
ˆT
UEV p
y P y UEV
(Unbiased estimate of variance)
22 ˆ ˆ
ˆT
FPE
p
p p
y P y FPE(Final Prediction Error)
Akaike’sInformation Criter
ion
BIC(Bayesian Information Criterio)
22 ln( ) 1 ˆ ˆ
ˆT
BIC
p p
p p
y P y
![Page 113: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/113.jpg)
More Criteria Based on CV
22
2
ˆ ˆˆ
( )
T
GCV
p
p
y P y
22 ˆ ˆ
ˆT
UEV p
y P y
22 ˆ ˆ
ˆT
FPE
p
p p
y P y
22 ln( ) 1 ˆ ˆ
ˆT
BIC
p p
p p
y P y
2ˆUEV
p
p
2ˆUEV
p
p
2ln( ) 1ˆUEV
p p
p
2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC
![Page 114: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/114.jpg)
More Criteria Based on CV
22
2
ˆ ˆˆ
( )
T
GCV
p
p
y P y
22 ˆ ˆ
ˆT
UEV p
y P y
22 ˆ ˆ
ˆT
FPE
p
p p
y P y
22 ln( ) 1 ˆ ˆ
ˆT
BIC
p p
p p
y P y
2ˆUEV
p
p
2ˆUEV
p
p
2ln( ) 1ˆUEV
p p
p
2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC
?P
![Page 115: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/115.jpg)
Regularization
((
1
)
2
)p
k
k
k fy
x2
1
m
iiiw
Empirial
Er
Model s
penalror ts
yCo t
2
( )
1 1
( )p m
ki
k
k
iiwy
x
Penalize models withlarge weightsSSE
Standard Ridge Regression,
,i i
![Page 116: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/116.jpg)
Regularization
((
1
)
2
)p
k
k
k fy
x2
1
m
iiiw
Empirial
Er
Model s
penalror ts
yCo t
2
( )
1 1
( )p m
ki
k
k
iiwy
x
Penalize models withlarge weightsSSE
2
1
m
iiw
2
1
m
iiw
Standard Ridge Regression,
,i i
![Page 117: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/117.jpg)
Solution Review
2
(( ) )
1 1
2
1
min p m m
ki
k
ki i
i i
C w wy
x
1* Tw A Φ yT
m Φ Φ IA1 T
p P I Φ ΦA
Used to compute model selection criteria
![Page 118: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/118.jpg)
Example
22( ) (1 2 ) xy x x x e
50p
2~ (0,0.2 )N
Width of RBF r = 0.5Width of RBF r = 0.5
![Page 119: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/119.jpg)
Example2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC
2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC
22( ) (1 2 ) xy x x x e
50p
2~ (0,0.2 )N
Width of RBF r = 0.5Width of RBF r = 0.5
![Page 120: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/120.jpg)
Example2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC
2 2 2 2ˆ ˆ ˆ ˆUEV FPE GCV BIC
22( ) (1 2 ) xy x x x e
50p
2~ (0,0.2 )N
Width of RBF r = 0.5Width of RBF r = 0.5
How the determine the
optimal regularization
parameter effectively?How the determine the
optimal regularization
parameter effectively?
![Page 121: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/121.jpg)
Optimizing the Regularization Parameter
Re-Estimation Formula
2 1 2
1
ˆˆ
( )
T
T
trace
trace
A
w
Ay P y
A Pw
![Page 122: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/122.jpg)
Local Ridge Regression
Re-Estimation Formula
2 1 2
1
ˆˆ
( )
T
T
trace
trace
A
w
Ay P y
A Pw
![Page 123: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/123.jpg)
Example
sin(12 )y x 2~ (0,0.1 )N50m p
— Width of RBF
( )p trace P( )p trace P
0.05r
![Page 124: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/124.jpg)
Example
sin(12 )y x 2~ (0,0.1 )N50m p
— Width of RBF
( )p trace P( )p trace P
0.05r
![Page 125: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/125.jpg)
Example
sin(12 )y x 2~ (0,0.1 )N50m p
— Width of RBF0.05r
There are two local-minima.
2 1 2
1
ˆˆ
( )
T
T
trace
trace
A
w
Ay P y
A Pw
Using the about re-estimation formula, it will be stuck at the nearest local minimum.
That is, the solution depends on the initial setting.
![Page 126: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/126.jpg)
Example
sin(12 )y x 2~ (0,0.1 )N50m p
— Width of RBF0.05r
There are two local-minima.
2 1 2
1
ˆˆ
( )
T
T
trace
trace
A
w
Ay P y
A Pw
ˆ(0) 0.01
lim 0.1ˆ( )t
t
![Page 127: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/127.jpg)
Example
sin(12 )y x 2~ (0,0.1 )N50m p
— Width of RBF0.05r
There are two local-minima.
2 1 2
1
ˆˆ
( )
T
T
trace
trace
A
w
Ay P y
A Pw
5ˆ(0) 10
4ˆ( )lim 2.1 10t
t
![Page 128: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/128.jpg)
Example
sin(12 )y x 2~ (0,0.1 )N50m p
— Width of RBF0.05r
RMSE: Root Mean Squared ErrorIn real case, it is not available.
RMSE: Root Mean Squared ErrorIn real case, it is not available.
![Page 129: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/129.jpg)
Example
sin(12 )y x 2~ (0,0.1 )N50m p
— Width of RBF0.05r
RMSE: Root Mean Squared ErrorIn real case, it is not available.
RMSE: Root Mean Squared ErrorIn real case, it is not available.
![Page 130: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/130.jpg)
Local Ridge Regression
2
(( ) )
1 1
2
1
min p m m
ki
k
ki i
i i
C w wy
x
Standard Ridge Regression
2
(( ) )
1 1
2
1
min p m m
i ik
ik i
ik
i
C w wy
x
Local Ridge Regression
![Page 131: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/131.jpg)
Local Ridge Regression
2
(( ) )
1 1
2
1
min p m m
ki
k
ki i
i i
C w wy
x
Standard Ridge Regression
2
(( ) )
1 1
2
1
min p m m
i ik
ik i
ik
i
C w wy
x
Local Ridge Regression
j implies that j() can be removed.
j implies that j() can be removed.
![Page 132: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/132.jpg)
The Solutions1* Tw A Φ y
Tm Φ Φ IA
1 Tp
P I Φ ΦAUsed to compute model selection criteria
T A Φ Φ Λ
TA Φ Φ Linear Regression
Standard Ridge Regression
Local Ridge Regression
![Page 133: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/133.jpg)
Optimizing the Regularization Parameters
Incremental Operation
P: The current projection Matrix.
Pj: The projection Matrix obtained by removing j().
Tj j j j
j Tj j j j
P φ φ PP P
φ P φ
2
22
ˆ ˆˆ
( )
T
GCV
p
trace
y P y
P
![Page 134: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/134.jpg)
Optimizing the Regularization Parameters
2
22
ˆ ˆˆ
( )
T
GCV
p
trace
y P y
P
Solve 2ˆ
0GCV
j
Subject to 0j
Tj j j j
j Tj j j j
P φ φ PP P
φ P φ
![Page 135: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/135.jpg)
Optimizing the Regularization Parameters
2Tja y P y2T Tj j j jby P φ y P φ2 2( )T T
j j j j jc φ P φ y P φ
( )jtrace P2T
j j j φ P φ
if
if and ˆ0 if and
if
c bTj j j b a
c bTjj j j b a
c b c bT Tj j j j j jb a b a
a b
a b
a b
φ P φ
φ P φ
φ P φ φ P φ
if
if and ˆ0 if and
if
c bTj j j b a
c bTjj j j b a
c b c bT Tj j j j j jb a b a
a b
a b
a b
φ P φ
φ P φ
φ P φ φ P φ
Solve 2ˆ
0GCV
j
Subject to 0j
![Page 136: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/136.jpg)
Optimizing the Regularization Parameters
2Tja y P y2T Tj j j jby P φ y P φ2 2( )T T
j j j j jc φ P φ y P φ
( )jtrace P2T
j j j φ P φ
if
if and ˆ0 if and
if
c bTj j j b a
c bTjj j j b a
c b c bT Tj j j j j jb a b a
a b
a b
a b
φ P φ
φ P φ
φ P φ φ P φ
if
if and ˆ0 if and
if
c bTj j j b a
c bTjj j j b a
c b c bT Tj j j j j jb a b a
a b
a b
a b
φ P φ
φ P φ
φ P φ φ P φ
Solve 2ˆ
0GCV
j
Subject to 0j Remove j()
Remove j()
![Page 137: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/137.jpg)
The Algorithm
Initialize i’s. – e.g., performing standard ridge regression.
Repeat the following until GCV converges:– Randomly select j and compute– Perform local ridge regression– If GCV reduce & remove j()
ˆj
ˆj
![Page 138: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/138.jpg)
References
Kohavi, R. (1995), "A study of cross-validation and bootstrap for accuracy estimation and model selection," International Joint Conference on Artificial Intelligence (IJCAI).
Mark J. L. Orr (April 1996), Introduction to Radial Basis Function Networks, http://www.anc.ed.ac.uk/~mjo/intro/intro.html.
![Page 139: Introduction to Radial Basis Function Networks. Content Overview The Models of Function Approximator The Radial Basis Function Networks RBFN’s for Function.](https://reader038.fdocuments.us/reader038/viewer/2022110210/56649e8e5503460f94b91f93/html5/thumbnails/139.jpg)
Introduction to Radial Basis Function Networks
Incremental Operations