EigenTransfer: A Unified Framework for Transfer Learning
Wenyuan Dai, Ou Jin, Gui-Rong Xue, Qiang Yang and Yong Yu
Shanghai Jiao Tong University & Hong Kong University of Science and Technology
Outline
◦ Motivation
◦ Problem Formulation
◦ Graph Construction
◦ Simple Review on Spectral Analysis
◦ Learning from Graph Spectra
◦ Experiments Result
◦ Conclusion
Motivation
A variety of transfer learning tasks have been investigated.
◦ Lifelong Learning (Thrun, 1996)
◦ Multi-task Learning (Caruana, 1997)
◦ Cross-domain Learning (Wu et al., 2004)
◦ Cross-category Learning (Raina et al., 2006)
◦ Self-taught Learning (Raina et al., 2007)
General Framework
Difference
◦ Different tasks
◦ Different approaches & algorithms
Common
Auxiliary Data, Target Data (Training), and Target Data (Test) have common parts or relations.
We can have a graph:
[Figure: graph over Features, Auxiliary Data, Training Data, Test Data, and Labels]
New Representation
We can get the new representation of Training Data and Test Data by Spectral Analysis.
Then we can use our traditional non-transfer learner again.
Problem Formulation
Target Training Data: $\{x_t^{(i)}\}_{i=1}^{n}$, with labels
Target Test Data: $\{x_u^{(i)}\}_{i=1}^{k}$, without labels
Auxiliary Data: $\{x_a^{(i)}\}_{i=1}^{m}$
Task
◦ Cross-domain Learning
◦ Cross-category Learning
◦ Self-taught Learning
Graph Construction
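Cross-domain Learning
Edges (weight in parentheses):
◦ $x_t^{(i)}$ -( $\phi_{i,j}$ )- $f^{(j)}$
◦ $x_u^{(i)}$ -( $\phi_{i,j}$ )- $f^{(j)}$
◦ $x_a^{(i)}$ -( $\phi_{i,j}$ )- $f^{(j)}$
◦ $x_t^{(i)}$ -( 1 )- $C^{(j)}$
◦ $x_a^{(i)}$ -( 1 )- $C^{(j)}$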
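Cross-category Learning
Edges (weight in parentheses):
◦ $x_t^{(i)}$ -( $\phi_{i,j}$ )- $f^{(j)}$
◦ $x_u^{(i)}$ -( $\phi_{i,j}$ )- $f^{(j)}$
◦ $x_a^{(i)}$ -( $\phi_{i,j}$ )- $f^{(j)}$
◦ $x_t^{(i)}$ -( 1 )- $C_t^{(j)}$
◦ $x_a^{(i)}$ -( 1 )- $C_a^{(j)}$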
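Self-taught Learning
Edges (weight in parentheses):
◦ $x_t^{(i)}$ -( $\phi_{i,j}$ )- $f^{(j)}$
◦ $x_u^{(i)}$ -( $\phi_{i,j}$ )- $f^{(j)}$
◦ $x_a^{(i)}$ -( $\phi_{i,j}$ )- $f^{(j)}$
◦ $x_t^{(i)}$ -( 1 )- $C_t^{(j)}$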
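[Figure: the Doc-Token Matrix (rows: Doc, columns: Token) and the Adjacency Matrix, a block matrix whose rows and columns are the Doc, Feature, and Label nodes; the Feature-Feature, Feature-Label, Label-Feature, and Label-Label blocks are 0]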
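A minimal sketch of how such an adjacency matrix could be assembled from a doc-token matrix and the known labels. The function name `build_adjacency`, the convention of marking unlabeled documents with -1, and the toy numbers are assumptions made for illustration, not part of the original slides.

```python
import numpy as np

def build_adjacency(doc_token, labels, n_classes):
    """Assemble a symmetric block adjacency matrix over Doc, Feature and Label nodes.

    doc_token : (n_docs, n_tokens) array of non-negative doc-token weights
    labels    : one entry per doc; class index if labeled, -1 if unlabeled (assumed convention)
    """
    doc_token = np.asarray(doc_token, dtype=float)
    n_docs, n_tokens = doc_token.shape

    # Doc-Label block: weight 1 between a labeled doc and its class node.
    doc_label = np.zeros((n_docs, n_classes))
    for i, c in enumerate(labels):
        if c >= 0:
            doc_label[i, c] = 1.0

    n = n_docs + n_tokens + n_classes
    f0, c0 = n_docs, n_docs + n_tokens       # offsets of the Feature and Label nodes
    W = np.zeros((n, n))
    W[:n_docs, f0:c0] = doc_token            # Doc-Feature edges
    W[:n_docs, c0:] = doc_label              # Doc-Label edges
    W[f0:, :n_docs] = W[:n_docs, f0:].T      # mirror both blocks so that W is symmetric
    return W

# Toy data: 2 labeled training docs, 1 unlabeled test doc, 1 labeled auxiliary doc.
doc_token = [[2, 0, 1],
             [0, 3, 0],
             [1, 1, 0],
             [0, 2, 2]]
labels = [0, 1, -1, 1]
W = build_adjacency(doc_token, labels, n_classes=2)
print(W.shape)
```

Under these assumptions the three tasks differ only in the label column: in the self-taught setting the auxiliary docs would also be marked -1, and in the cross-category setting the auxiliary docs would point to their own class nodes.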
Simple Review on Spectral Analysis
G is an undirected weighted graph with weight matrix W, where $W_{ij} = W_{ji} \ge 0$.
D is a diagonal matrix, where $D_{ii} = \sum_j W_{ij}$.
Unnormalized graph Laplacian matrix: $L = D - W$
Normalized graph Laplacians:
$L_{sym} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} W D^{-1/2}$
$L_{rw} = D^{-1} L = I - D^{-1} W$
Calculate the first k eigenvectors $v_1, v_2, \dots, v_k$.
The new representation:
[Figure: matrix with rows Node1, Node2, Node3, Node4, … and columns v1, v2, v3; the row of Node2 is marked as the new feature vector of Node2]
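The definitions above can be checked on a tiny graph. This is a sketch only: the helper names `laplacians` and `spectral_embedding` are made up for the example, and "first k eigenvectors" is taken to mean the eigenvectors with the k smallest eigenvalues of the unnormalized Laplacian.

```python
import numpy as np

def laplacians(W):
    """Return (L, L_sym, L_rw) for a symmetric, non-negative weight matrix W."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                          # D_ii = sum_j W_ij
    if np.any(d <= 0):
        raise ValueError("every node needs a positive degree")
    L = np.diag(d) - W                         # L = D - W
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]   # D^{-1/2} L D^{-1/2}
    L_rw = L / d[:, None]                      # D^{-1} L
    return L, L_sym, L_rw

def spectral_embedding(W, k):
    """Row i of the returned matrix is the new k-dimensional feature vector of node i."""
    L, _, _ = laplacians(W)
    eigvals, eigvecs = np.linalg.eigh(L)       # eigenvalues in ascending order
    return eigvals[:k], eigvecs[:, :k]

# Path graph Node1 - Node2 - Node3 - Node4 with unit weights.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
vals, V = spectral_embedding(W, k=2)
print(V[1])   # new feature vector of Node2 (sign of each column is arbitrary)
```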
Learning from Graph Spectra
Graph G
Adjacency matrix of G: $W$
Graph Laplacian of G: $L = D - W$
Solve the generalized eigenproblem: $Lv = \lambda D v$
The first k eigenvectors form a new feature representation.
Apply traditional learners such as NB, SVM.
[Figure: adjacency matrix W as a block matrix with rows and columns Doc, Feature, Label]
[Figure: from the block matrix W (Doc, Feature, Label) compute L; the eigenvectors v1, v2 give new rows for the Train, Test, Auxiliary, Feature, and Label nodes; the Train rows and Test rows (v1, v2) are passed to the Classifier]
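The whole pipeline can be sketched end to end on toy data. Everything here is illustrative: the function names and the toy doc-token counts are invented, and a nearest-centroid rule stands in for the NB/SVM learners named on the slide. The generalized problem $Lv = \lambda D v$ is solved through the equivalent symmetric problem $L_{sym} u = \lambda u$ with $u = D^{1/2} v$.

```python
import numpy as np

def eigen_features(W, k):
    """Solve L v = lambda D v and return the k eigenvectors with the smallest lambda."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Substituting u = D^{1/2} v turns the problem into L_sym u = lambda u.
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    lam, U = np.linalg.eigh(L_sym)             # ascending eigenvalues
    V = d_inv_sqrt[:, None] * U[:, :k]         # map back: v = D^{-1/2} u
    return lam[:k], V

def nearest_centroid(train_X, train_y, test_X):
    """Stand-in learner (an assumption; the slides use NB, SVM, TSVM)."""
    classes = np.unique(train_y)
    centroids = np.stack([train_X[train_y == c].mean(axis=0) for c in classes])
    dists = ((test_X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]

# Docs in order: train0, train1, test0, test1, aux0, aux1 (4 tokens, 2 classes).
doc_token = np.array([[3, 2, 0, 0],
                      [0, 0, 2, 3],
                      [2, 3, 0, 0],
                      [0, 0, 3, 2],
                      [2, 2, 1, 0],
                      [0, 1, 2, 2]], dtype=float)
doc_label = np.array([[1, 0], [0, 1], [0, 0], [0, 0], [1, 0], [0, 1]], dtype=float)

n_docs, n_tok = doc_token.shape
n = n_docs + n_tok + doc_label.shape[1]
W = np.zeros((n, n))
W[:n_docs, n_docs:n_docs + n_tok] = doc_token  # Doc-Feature block
W[:n_docs, n_docs + n_tok:] = doc_label        # Doc-Label block
W = W + W.T                                    # symmetric adjacency matrix

lam, V = eigen_features(W, k=2)
train_X, test_X = V[0:2], V[2:4]               # rows of the training and test docs
pred = nearest_centroid(train_X, np.array([0, 1]), test_X)
print(pred)
```

Only the rows of the target training and test documents are handed to the learner; the auxiliary, feature, and label nodes influence the result through the graph alone.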
The only remaining problem is the computation time.
Luckily:
◦ Matrix L is sparse
◦ There are fast algorithms for solving eigenproblems on sparse matrices (Lanczos)
The final computational cost is linear in $nz(L) \cdot k$, where $nz(L)$ is the number of nonzero entries of L.
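A sketch of the sparse route, assuming SciPy is available. `scipy.sparse.linalg.eigsh` wraps ARPACK's implicitly restarted Lanczos method; the helper name `sparse_eigen_features` and the synthetic graph are assumptions for the example. Instead of asking for the smallest eigenvalues of a singular L directly, the sketch asks for the largest eigenvalues $\mu$ of $D^{-1/2} W D^{-1/2}$, which correspond to $\lambda = 1 - \mu$ in $Lv = \lambda D v$.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def sparse_eigen_features(W, k):
    """First k generalized eigenvectors of L v = lambda D v for a sparse symmetric W."""
    W = sp.csr_matrix(W, dtype=float)
    d = np.asarray(W.sum(axis=1)).ravel()
    D_inv_sqrt = sp.diags(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ W @ D_inv_sqrt            # same sparsity pattern as W
    # Each Lanczos step needs one sparse matrix-vector product with S.
    mu, U = eigsh(S, k=k, which="LA")          # k largest (algebraic) eigenvalues of S
    order = np.argsort(-mu)
    lam = 1.0 - mu[order]                      # smallest lambda first
    V = D_inv_sqrt @ U[:, order]               # v = D^{-1/2} u
    return lam, V

# Synthetic sparse graph: a ring plus one extra chord per node (about 4 edges per node).
n = 200
i = np.arange(n)
ring = sp.coo_matrix((np.ones(n), (i, (i + 1) % n)), shape=(n, n))
chords = sp.coo_matrix((np.ones(n), (i, (7 * i + 3) % n)), shape=(n, n))
A = ring + chords
W = (A + A.T).tocsr()

lam, V = sparse_eigen_features(W, k=3)
print(W.nnz, V.shape)
```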
Experiments
Basic Procedure
[Figure: experimental flow. Sample 15 positive instances & 15 negative instances as Training Data; Training Data, Test Data, and Auxiliary Data are turned into New Training Data and New Test Data; a Classifier (NB/SVM/TSVM) with CV gives the Result, next to the Baseline; repeat 10 times and calculate the average]
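The sampling protocol (15 positive and 15 negative training instances, 10 repetitions, averaged result) could be scripted roughly as follows. This is a sketch under assumptions: `repeated_evaluation` and `error_rate` are hypothetical names, all instances outside the sample are used as test data here, and a one-dimensional threshold rule stands in for the real classifiers.

```python
import numpy as np

def repeated_evaluation(y, evaluate, n_per_class=15, n_repeats=10, seed=0):
    """Average a score over repeated draws of a small balanced training sample."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    scores = []
    for _ in range(n_repeats):
        train_idx = np.concatenate([
            rng.choice(pos, n_per_class, replace=False),
            rng.choice(neg, n_per_class, replace=False),
        ])
        # Assumption: every instance not sampled for training is used for testing.
        test_idx = np.setdiff1d(np.arange(len(y)), train_idx)
        scores.append(evaluate(train_idx, test_idx))
    return float(np.mean(scores)), float(np.std(scores))

# Toy one-dimensional data: class 1 is shifted by one unit.
y = np.array([0, 1] * 50)
x = y + np.random.default_rng(1).normal(scale=0.3, size=y.size)

def error_rate(train_idx, test_idx):
    # Hypothetical stand-in learner: threshold halfway between the two class means.
    xt, yt = x[train_idx], y[train_idx]
    threshold = 0.5 * (xt[yt == 0].mean() + xt[yt == 1].mean())
    pred = (x[test_idx] > threshold).astype(int)
    return float(np.mean(pred != y[test_idx]))

mean_err, std_err = repeated_evaluation(y, error_rate)
print(f"{mean_err:.3f}±{std_err:.3f}")
```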
Cross-domain Learning
Data
◦ SRAA
◦ 20 Newsgroups (Lang, 1995)
◦ Reuters-21578
Target data and auxiliary data share the same categories (top directories), but belong to different domains (sub-directories).
[Figure: Cross-domain result with NB. Data sets: cdl-sraa1, cdl-sraa2, cdl-20ng1 to cdl-20ng6, cdl-reuters1 to cdl-reuters3, average. Series: Non-Transfer, Simple Combine, EigenTransfer. Axis range: 0 to 0.4]
[Figure: Cross-domain result with SVM. Data sets: cdl-sraa1, cdl-sraa2, cdl-20ng1 to cdl-20ng6, cdl-reuters1 to cdl-reuters3, average. Series: Non-Transfer, Simple Combine, EigenTransfer. Axis range: 0 to 0.45]
[Figure: Cross-domain result with TSVM. Data sets: cdl-sraa1, cdl-sraa2, cdl-20ng1 to cdl-20ng6, cdl-reuters1 to cdl-reuters3, average. Series: Non-Transfer, Simple Combine, EigenTransfer. Axis range: 0 to 0.4]
Cross-domain result on average
        Non-Transfer    Simple Combine    EigenTransfer
NB      0.250±0.036     0.239±0.000       0.134±0.031
SVM     0.190±0.039     0.213±0.000       0.095±0.018
TSVM    0.140±0.038     0.145±0.000       0.101±0.019
Cross-category Learning
Data
◦ 20 Newsgroups (Lang, 1995)
◦ Ohscal data set from OHSUMED (Hersh et al., 1994)
Randomly select two categories as target data. Take the other categories as labeled auxiliary data.
[Figure: Cross-category result with NB. Data sets: ccl-20ng1 to ccl-20ng5, ccl-ohs1 to ccl-ohs5, average. Series: Non-Transfer, EigenTransfer. Axis range: 0 to 0.3]
[Figure: Cross-category result with SVM. Data sets: ccl-20ng1 to ccl-20ng5, ccl-ohs1 to ccl-ohs5, average. Series: Non-Transfer, EigenTransfer. Axis range: 0 to 0.25]
[Figure: Cross-category result with TSVM. Data sets: ccl-20ng1 to ccl-20ng5, ccl-ohs1 to ccl-ohs5, average. Series: Non-Transfer, EigenTransfer. Axis range: 0 to 0.25]
Cross-category result on average
        Non-Transfer    EigenTransfer
NB      0.186±0.038     0.099±0.025
SVM     0.131±0.032     0.065±0.016
TSVM    0.104±0.010     0.091±0.013
Self-taught Learning
Data
◦ 20 Newsgroups (Lang, 1995)
◦ Ohscal data set from OHSUMED (Hersh et al., 1994)
Randomly select two categories as target data. Take the other categories as auxiliary data without labels.
[Figure: Self-taught result with NB. Data sets: stl-20ng1 to stl-20ng5, stl-ohs1 to stl-ohs5, average. Series: Non-Transfer, EigenTransfer. Axis range: 0 to 0.35]
[Figure: Self-taught result with SVM. Data sets: stl-20ng1 to stl-20ng5, stl-ohs1 to stl-ohs5, average. Series: Non-Transfer, EigenTransfer. Axis range: 0 to 0.25]
[Figure: Self-taught result with TSVM. Data sets: stl-20ng1 to stl-20ng5, stl-ohs1 to stl-ohs5, average. Series: Non-Transfer, EigenTransfer. Axis range: 0 to 0.25]
Self-taught result on average
        Non-Transfer    EigenTransfer
NB      0.189±0.038     0.107±0.032
SVM     0.126±0.030     0.070±0.017
TSVM    0.106±0.011     0.098±0.024
[Figure: Effect of the number of Eigenvectors]
[Figure: Labeled Target Data]
Conclusion
We proposed a general transfer learning framework.
It can model a variety of existing transfer learning problems and solutions.
Our experimental results show that it can greatly outperform non-transfer learners in many experiments.
Thank you!