Is Unlabeled Data Suitable for Multiclass SVM-based Web Page Classification?
Arkaitz Zubiaga, Víctor Fresno, Raquel Martínez
Universidad Nacional de Educación a Distancia
June 4, 2009
Index
1 Text Classification
2 Motivation
3 Support Vector Machines
4 Multiclass SVM
5 S3VM
6 Multiclass S3VM
7 Compared Approaches: Multiclass SVM vs Multiclass S3VM
8 Experiments
9 Results
10 Conclusions and Outlook
11 Thank you
A. Zubiaga, V. Fresno, R. Martınez (UNED) Unlabeled Data for Multiclass SVM June 4, 2009 2 / 31
Text Classification
What is it?
We have a set of documents:
$D = \{d_1, \dots, d_{|D|}\}$
With a set of predefined categories:
$C = \{c_1, \dots, c_{|C|}\}$
Classification assigns a Boolean value to each pair:
$\langle d_j, c_i \rangle \in D \times C$
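The pairwise view of classification can be illustrated with a minimal Python sketch. The document names, category names, and the single-label ground truth below are hypothetical and serve only to show the Boolean assignment over D × C.

```python
from itertools import product

# Hypothetical toy data: names are illustrative only.
documents = ["d1", "d2", "d3"]
categories = ["c1", "c2"]

# Hypothetical ground truth, assuming each document has exactly one category.
true_category = {"d1": "c1", "d2": "c2", "d3": "c1"}

# Assign True/False to every pair <d_j, c_i> in D x C.
assignment = {
    (d, c): true_category[d] == c
    for d, c in product(documents, categories)
}

for pair, value in sorted(assignment.items()):
    print(pair, value)
```

With |D| = 3 and |C| = 2 the assignment covers 6 pairs, one per element of D × C.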
Motivation
Many studies address plain text classification (e.g., news), but few address web page classification.
Typical web page classification task:
  Semi-supervised: few labeled documents.
  Multiclass: taxonomies with more than 2 categories.
(Joachims, 1999) showed the suitability of unlabeled data for binary tasks.
What about multiclass tasks?
  (Chapelle et al., 2006) did it over image datasets, but never for text/web pages.
Support Vector Machines
SVM
It looks for a hyperplane to separate the classes
Margin maximization
Optimization function:
$$\min \frac{1}{2} \|\omega\|^2 + C \sum_{i=1}^{n} \xi_i$$
Subject to:
$$y_i (\omega \cdot x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0$$
It only handles binary and supervised problems by nature.
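As an illustration of the soft-margin objective, the following sketch evaluates it for a fixed hyperplane. It does not solve the optimization problem; the function name, the toy points, and the choice C = 1 are assumptions made for this example.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def svm_primal_objective(w, b, X, y, C):
    """Evaluate the soft-margin objective at a given (w, b); labels are +1/-1."""
    # Smallest slack satisfying y_i (w . x_i + b) >= 1 - xi_i and xi_i >= 0.
    slacks = [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]
    objective = 0.5 * dot(w, w) + C * sum(slacks)
    return objective, slacks

# Hypothetical toy data: the second point lies on the hyperplane itself.
X = [[2.0, 0.0], [0.0, 0.0], [-2.0, 0.0]]
y = [1, 1, -1]
obj, slacks = svm_primal_objective(w=[1.0, 0.0], b=0.0, X=X, y=y, C=1.0)
print(obj, slacks)  # 1.5 [0.0, 1.0, 0.0]
```

Only the point inside the margin contributes a slack, so a larger C penalizes it more heavily relative to the margin term.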
Multiclass SVM
Approaches to multiclass SVM:
  Direct.
  Combining binary classifiers:
    One-against-one.
    One-against-all.
Usually applied to supervised tasks, but hardly ever to semi-supervised ones.
Multiclass SVM: Direct approach
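The optimization function considers all the hyperplanes at the same time:
$$\min \frac{1}{2} \sum_{m=1}^{n} \|w_m\|^2 + C \sum_{i=1}^{l} \sum_{m \neq y_i} \xi_i^m$$
Subject to:
$$w_{y_i} \cdot x_i + b_{y_i} \geq w_m \cdot x_i + b_m + 2 - \xi_i^m, \quad \xi_i^m \geq 0$$
Here m indexes the classes and i the l labeled documents.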
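To make the joint constraint concrete, the sketch below evaluates the direct multiclass objective for fixed hyperplanes, one per class. It is only an evaluation of the formula, not a solver; the function name and the toy data are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def direct_multiclass_objective(W, b, X, y, C):
    """W[m], b[m]: hyperplane of class m; y[i]: class index of document i."""
    regularizer = 0.5 * sum(dot(w, w) for w in W)
    slack_total = 0.0
    for xi, yi in zip(X, y):
        own = dot(W[yi], xi) + b[yi]
        for m in range(len(W)):
            if m == yi:
                continue
            other = dot(W[m], xi) + b[m]
            # Smallest slack with own >= other + 2 - slack and slack >= 0.
            slack_total += max(0.0, other + 2.0 - own)
    return regularizer + C * slack_total

# Hypothetical toy problem with 3 classes in 2 dimensions.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
X = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
y = [0, 1, 2]
print(direct_multiclass_objective(W, b, X, y, C=1.0))  # 6.0
```

The third document sits at the origin, where all classes score 0, so it violates both of its constraints by 2 and accounts for the whole slack term.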
Multiclass SVM: One-against-one
It creates $\frac{k \cdot (k-1)}{2}$ binary classifiers
$\operatorname{sign}(\omega_{ij}^T \cdot x + b_{ij})$ → add a vote for the winning class between i and j
The class with the most votes is the output.
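The voting scheme can be sketched as follows. The pairwise hyperplanes are hypothetical hand-picked values rather than trained SVMs, and the convention that a positive sign votes for class i, as well as the tie-breaking rule, are assumptions of this example.

```python
from collections import Counter
from itertools import combinations

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def one_against_one_predict(x, classifiers, k):
    """classifiers[(i, j)] = (w_ij, b_ij) for i < j; positive sign votes for i."""
    votes = Counter()
    for i, j in combinations(range(k), 2):
        w, b = classifiers[(i, j)]
        winner = i if dot(w, x) + b > 0 else j
        votes[winner] += 1
    # Ties go to the lowest class index (an arbitrary choice for this sketch).
    return max(range(k), key=lambda c: (votes[c], -c))

k = 3
# Hypothetical pairwise hyperplanes, one per unordered pair of classes.
classifiers = {
    (0, 1): ([1.0, -1.0], 0.0),
    (0, 2): ([1.0, 0.0], -1.0),
    (1, 2): ([0.0, 1.0], -2.0),
}
predicted = one_against_one_predict([2.0, 1.0], classifiers, k)
print(len(classifiers), predicted)  # 3 0
```

For k = 3 there are 3 pairwise classifiers; class 0 wins both of its duels here and is returned.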
Multiclass SVM: One-against-all
It creates k binary classifiers
The predicted class is:
$$\hat{c} = \arg\max_{i = 1, \dots, k} (\omega_i \cdot x + b_i)$$
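A minimal sketch of the one-against-all decision rule: each class has its own hyperplane, and the class with the largest decision value is returned. The hyperplanes below are hypothetical values, not trained classifiers.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def one_against_all_predict(x, classifiers):
    """classifiers[i] = (w_i, b_i); return the index with the largest score."""
    scores = [dot(w, x) + b for w, b in classifiers]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical hyperplanes for k = 3 classes.
classifiers = [
    ([1.0, 0.0], 0.0),
    ([0.0, 1.0], 0.0),
    ([-1.0, -1.0], 0.5),
]
print(one_against_all_predict([0.5, 2.0], classifiers))  # 1
```

Unlike one-against-one, only k decision values are computed per document, and no voting is needed.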
Semi-supervised SVM (S3VM)
Unlabeled documents are considered during the learning phase.
The optimization function becomes:
$$\min \frac{1}{2} \|\omega\|^2 + C \sum_{i=1}^{l} \xi_i + C^{*} \sum_{j=1}^{u} \xi_j^{*}$$
Convex optimization algorithms required.
Commonly used over binary taxonomies, but hardly ever with more classes.
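The role of the unlabeled term can be illustrated by evaluating the S3VM objective for a fixed hyperplane. This sketch assumes that each unlabeled document takes whichever label gives it the smaller slack, and it omits any class-balance constraint that a full implementation may impose; it is not the SVM-light algorithm, and all names and data are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def s3vm_objective(w, b, X_lab, y_lab, X_unl, C, C_star):
    """Evaluate the S3VM objective at a given (w, b); labels are +1/-1."""
    # Labeled slacks: usual hinge loss.
    lab_slacks = [max(0.0, 1.0 - yi * (dot(w, xi) + b))
                  for xi, yi in zip(X_lab, y_lab)]
    # Unlabeled slacks: best of both labels, i.e. max(0, 1 - |w . x + b|).
    unl_slacks = [max(0.0, 1.0 - abs(dot(w, xj) + b)) for xj in X_unl]
    return 0.5 * dot(w, w) + C * sum(lab_slacks) + C_star * sum(unl_slacks)

# Hypothetical toy data: one unlabeled point falls inside the margin.
X_lab, y_lab = [[2.0, 0.0], [-2.0, 0.0]], [1, -1]
X_unl = [[0.5, 0.0], [-3.0, 0.0]]
print(s3vm_objective([1.0, 0.0], 0.0, X_lab, y_lab, X_unl, C=1.0, C_star=0.5))  # 0.75
```

Unlabeled documents inside the margin are penalized regardless of their side, which pushes the hyperplane toward low-density regions.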
Multiclass S3VM
(Yajima and Kuo, 2006) present the following optimization function:
$$\min \left( \frac{1}{2} \sum_{i=1}^{h} {\beta^i}^T K^{-1} \beta^i + C \sum_{j=1}^{l} \sum_{i \neq y_j} \max\left(0, 1 - (\beta_j^{y_j} - \beta_j^i)\right)^2 \right)$$
where β represents the product of a vector and a kernel matrix defined by the authors.
(Chapelle et al., 2006): direct approach by means of the Continuation Method.
2 steps:
  (Qi et al., 2004) use Fuzzy C-Means to predict new unlabeled documents.
  (Xu and Schuurmans, 2005) rely on a clustering-based approach to label the unlabeled data.
Compared Approaches: Multiclass SVM vs Multiclass S3VM
Multiclass SVM vs Multiclass S3VM
2-steps-SVM/1-step-SVM: Multiclass SVM.
  Does an intermediate step adding newly labeled data improve the classifier's performance?
One-against-all-S3VM/One-against-all-SVM.
One-against-one-S3VM/One-against-one-SVM.
  Does unlabeled data help to improve the results of combined binary classifiers?
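One plausible reading of the 2-steps idea is sketched below: a classifier trained on the labeled documents labels the unlabeled ones, and a second classifier is trained on the union. To stay short and dependency-free, the sketch swaps in a nearest-centroid learner as a stand-in; it is NOT a multiclass SVM, and all function names and data are hypothetical.

```python
def train_centroids(X, y):
    """Stand-in learner (NOT an SVM): one mean vector per class."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for d, value in enumerate(xi):
            acc[d] += value
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(model, x):
    """Return the class whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(model[c], x))
    return min(sorted(model), key=sq_dist)

def two_steps(X_lab, y_lab, X_unl):
    # Step 1: learn from labeled data only and label the unlabeled documents.
    first = train_centroids(X_lab, y_lab)
    pseudo = [predict(first, x) for x in X_unl]
    # Step 2: retrain on labeled plus newly labeled documents.
    final = train_centroids(X_lab + X_unl, y_lab + pseudo)
    return final, pseudo

# Hypothetical toy data with two classes.
X_lab, y_lab = [[0.0, 0.0], [4.0, 4.0]], [0, 1]
X_unl = [[1.0, 0.0], [3.0, 4.0]]
model, pseudo = two_steps(X_lab, y_lab, X_unl)
print(pseudo, predict(model, [1.0, 1.0]))  # [0, 1] 0
```

The 1-step variant corresponds to stopping after the first training call and classifying the test documents directly.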
Experimental settings
Datasets:
  BankSearch: 10,000 web documents / 10 categories (4,000 for the training set).
  WebKB: 4,518 web documents / 6 categories (2,000 for the training set).
  Yahoo! Science: 788 web documents / 6 categories (200 for the training set).
Numerous labeled/unlabeled sets.
9 executions for each.
Representation: TF-IDF.
Software:
  SVM-light (http://svmlight.joachims.org)
  SVM-multiclass
Evaluation by means of accuracy (percentage of correct predictions).
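The representation and the evaluation measure can be sketched in a few lines. TF-IDF has several variants and the slides do not say which one was used; the version below assumes raw term frequency multiplied by log(N / df). The tokenized documents and label lists are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * math.log(n / df[term])
         for term, count in Counter(doc).items()}
        for doc in docs
    ]

def accuracy(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)

# Hypothetical tokenized documents.
vectors = tf_idf([["svm", "web"], ["svm", "page", "page"]])
print(vectors[0]["svm"])                       # 0.0, the term occurs in every document
print(accuracy([0, 1, 2, 1], [0, 1, 1, 1]))    # 75.0
```

Terms that occur in every document receive weight 0 under this variant, so they do not influence the classifier.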
Results
[Figure] Results: BankSearch
[Figure] Results: WebKB
[Figure] Results: Yahoo! Science
Supervised multiclass approaches (2-steps-SVM & 1-step-SVM) outperform the rest.
Among binary combinations, one-against-all outperforms one-against-one.
Unlabeled data slightly helps for one-against-all.
1-step-SVM and 2-steps-SVM show similar results, except for WebKB, where the former wins.
It could be due to the homogeneous nature of the WebKB dataset.
Conclusions and Outlook
Conclusions
Comparison of multiclass SVM and S3VM approaches for web page classification.
Direct and combining approaches.
Direct approaches outperform the rest.
Unlabeled data did not provide considerable improvements, and even worsened results in some cases.
Future Work
To add more multiclass S3VM approaches to the study.
To test with different SVM settings (kernel, parameters,...).
Thank you