An Introduction to Metric Learning for Clustering


Page 1: An Introduction to Metric Learning for Clustering

Metric Learning for Clustering

SCC5945 - Análise Semi-Supervisionada e Não-Supervisionada de Padrões em Dados

(Seminar)

Sidgley Camargo de Andrade, PhD student in computer science

Institute of Computer Science and Mathematics, University of São Paulo

June 2016

Page 2: An Introduction to Metric Learning for Clustering

Agenda

Constraint-based algorithms

Motivation

Metrics

Metric learning for clustering

MPCK-means algorithm

References

Page 3: An Introduction to Metric Learning for Clustering

Constraint-based algorithms

How can we help unsupervised algorithms find better solutions?

- Constraint-based methods, e.g. background knowledge supplied through pairwise constraints (Wagstaff et al., 2001):

  $Con_{=} \subseteq D \times D$: must-link constraints

  $Con_{\neq} \subseteq D \times D$: cannot-link constraints

- Active learning and self-learning

- Other ...
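
As a minimal sketch of how pairwise constraints can be represented and checked, the snippet below stores must-link and cannot-link constraints as index pairs and counts how many of them a given cluster labeling violates. The function name `count_violations` and the toy data are illustrative assumptions, not taken from Wagstaff et al. (2001).

```python
from typing import Iterable, Sequence, Tuple

Pair = Tuple[int, int]


def count_violations(labels: Sequence[int],
                     must_link: Iterable[Pair],
                     cannot_link: Iterable[Pair]) -> Tuple[int, int]:
    """Return (violated must-links, violated cannot-links) for a labeling."""
    # A must-link pair is violated when its two points land in different clusters.
    ml = sum(1 for i, j in must_link if labels[i] != labels[j])
    # A cannot-link pair is violated when its two points share a cluster.
    cl = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return ml, cl


# Toy example (hypothetical): five points indexed 0..4.
must_link = [(0, 1), (2, 3)]
cannot_link = [(1, 2), (3, 4)]
labels = [0, 0, 1, 1, 1]

print(count_violations(labels, must_link, cannot_link))  # (0, 1)
```

Here points 3 and 4 share a cluster despite a cannot-link constraint, so one cannot-link is violated and no must-link is.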

Are there any “problems” with the algorithms above?

Page 4: An Introduction to Metric Learning for Clustering

Motivation

Figure (Basu et al., 2008). Legend: [–] must-link, [- -] cannot-link.

Page 5: An Introduction to Metric Learning for Clustering

Metrics

Metrics quantify the relationships between data points (e.g., Euclidean distance, Mahalanobis distance).

What is the right metric?

There are few systematic mechanisms for tuning distance metrics, and the tuning is often done by hand (Xing et al., 2003).
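
To make the difference between the two distances named above concrete, here is a small sketch using NumPy. It computes a Mahalanobis-style distance for a matrix $A$: with $A = I$ it reduces to the Euclidean distance, and with $A$ set to an inverse covariance matrix it is the classical Mahalanobis distance. The function name `mahalanobis` and the matrix values are assumptions chosen for illustration.

```python
import numpy as np


def mahalanobis(x, y, A):
    """Distance sqrt((x - y)^T A (x - y)) for a positive semi-definite A."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(diff @ A @ diff))


x = np.array([0.0, 0.0])
y = np.array([3.0, 4.0])

# Identity matrix: the ordinary Euclidean distance.
euclid = mahalanobis(x, y, np.eye(2))

# A hand-picked diagonal matrix that down-weights the second feature.
A = np.diag([1.0, 0.0625])
weighted = mahalanobis(x, y, A)

print(euclid)    # 5.0
print(weighted)  # sqrt(9 + 16 * 0.0625) = sqrt(10)
```

Choosing the entries of `A` by hand, as done here, is exactly the manual tuning that metric learning aims to replace.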

Page 6: An Introduction to Metric Learning for Clustering

Metric learning for clustering

Assumption: keeping dissimilar points far from each other and similar points close to each other reduces the risk of errors.

Xing et al. (2003): Suppose a user indicates that certain points in an input space (say, $\mathbb{R}^n$) are considered by them to be “similar” (or “dissimilar”). Can we automatically learn a distance metric over $\mathbb{R}^n$ that respects these relationships, i.e., one that assigns small distances between the similar pairs and greater distances otherwise?

Learn a metric $d : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ over the input space.

Page 7: An Introduction to Metric Learning for Clustering

Problem

A simple approach is to require that similar pairs (must-link, the set $S$) have a small distance between them, whereas dissimilar pairs (cannot-link, the set $D$) have a greater distance between them.

$$d(x, y) = d_A(x, y) = \|x - y\|_A = \sqrt{(x - y)^T A (x - y)}$$

$$\min_A \sum_{(x_i, x_j) \in S} \|x_i - x_j\|_A^2$$

$$\text{s.t.} \quad \sum_{(x_i, x_j) \in D} \|x_i - x_j\|_A^2 \geq c, \qquad A \succeq 0$$

where $A \succeq 0$ is the constraint that the symmetric matrix $A$ must be positive semi-definite (so $d_A$ is a “pseudo-metric”) and $c \geq 1$ is a positive constant.

1. Question for class: Why is the constant $c$ positive?

2. Question for class: How can this be transformed into a maximization problem?
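
The sketch below evaluates the optimization problem as written on this slide for a candidate matrix $A$: it sums the squared distances over the similar pairs (the objective), sums them over the dissimilar pairs (the constraint), and checks that $A$ is positive semi-definite. It does not solve the problem. The helper names (`sq_dist_A`, `is_psd`, `evaluate`) and the toy data are assumptions for illustration only.

```python
import numpy as np


def sq_dist_A(x, y, A):
    """Squared distance ||x - y||_A^2 = (x - y)^T A (x - y)."""
    diff = x - y
    return float(diff @ A @ diff)


def is_psd(A, tol=1e-10):
    """True if A is symmetric with no eigenvalue below -tol."""
    return bool(np.allclose(A, A.T) and np.linalg.eigvalsh(A).min() >= -tol)


def evaluate(X, S, D, A, c=1.0):
    """Return (objective over S, constraint sum over D, feasibility of A)."""
    objective = sum(sq_dist_A(X[i], X[j], A) for i, j in S)
    separation = sum(sq_dist_A(X[i], X[j], A) for i, j in D)
    return objective, separation, is_psd(A) and separation >= c


# Toy data (hypothetical): similar pairs differ only in the second feature,
# dissimilar pairs differ only in the first feature.
X = np.array([[0.0, 0.0], [0.0, 2.0], [3.0, 0.0], [3.0, 2.0]])
S = [(0, 1), (2, 3)]
D = [(0, 2), (1, 3)]

print(evaluate(X, S, D, np.eye(2)))            # (8.0, 18.0, True)
print(evaluate(X, S, D, np.diag([1.0, 0.0])))  # (0.0, 18.0, True)
```

The second matrix ignores the feature along which similar points differ, so the objective drops to zero while the constraint still holds. It is positive semi-definite but not positive definite, which is why distinct points can end up at distance zero and $d_A$ is only a pseudo-metric.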

Page 8: An Introduction to Metric Learning for Clustering

Example – Xing et al. (2003)

Page 9: An Introduction to Metric Learning for Clustering

Metric Pairwise Constrained K-means (MPCK-means)

Assumes a separate metric matrix $A_h$ for each cluster $h$.

Permits an individual weight for each constraint, applied to the violation penalties $f_M$ (must-link) and $f_C$ (cannot-link); the penalty for a violated constraint is proportional to that constraint's weight.
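
The following is a simplified sketch of an MPCK-means-style objective, assuming one metric matrix per cluster and a weight per constraint as described above. The penalty forms used here (averaging the two clusters' metrics for a violated must-link, and subtracting the pair's squared distance from a per-cluster separation bound for a violated cannot-link), the log-determinant term, and the `max_sep` parameter are assumptions made for illustration; consult Bilenko et al. (2004) for the exact formulation.

```python
import numpy as np


def sq_dist(x, y, A):
    """Squared distance (x - y)^T A (x - y)."""
    diff = x - y
    return float(diff @ A @ diff)


def mpck_objective(X, labels, centroids, metrics,
                   must_link, cannot_link, max_sep):
    """Cluster distortion plus weighted penalties for violated constraints.

    must_link / cannot_link: lists of (i, j, weight).
    max_sep[h]: assumed stand-in for the largest squared separation under A_h.
    """
    total = 0.0
    # Distortion of each point under its own cluster's metric A_h.
    for x, h in zip(X, labels):
        _, logdet = np.linalg.slogdet(metrics[h])
        total += sq_dist(x, centroids[h], metrics[h]) - logdet
    # Must-link violated: the two points sit in different clusters.
    for i, j, w in must_link:
        if labels[i] != labels[j]:
            f_m = 0.5 * (sq_dist(X[i], X[j], metrics[labels[i]])
                         + sq_dist(X[i], X[j], metrics[labels[j]]))
            total += w * f_m
    # Cannot-link violated: the two points share a cluster.
    for i, j, w in cannot_link:
        if labels[i] == labels[j]:
            h = labels[i]
            total += w * (max_sep[h] - sq_dist(X[i], X[j], metrics[h]))
    return total


# Toy data (hypothetical): four points on a line, two clusters.
X = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [5.0, 0.0]])
centroids = np.array([[0.5, 0.0], [4.5, 0.0]])
metrics = [np.eye(2), np.eye(2)]
must_link = [(0, 1, 2.0)]
cannot_link = [(1, 2, 3.0)]
max_sep = [25.0, 25.0]

good = mpck_objective(X, [0, 0, 1, 1], centroids, metrics,
                      must_link, cannot_link, max_sep)
bad = mpck_objective(X, [0, 1, 1, 1], centroids, metrics,
                     must_link, cannot_link, max_sep)
print(good, bad)  # 1.0 63.0
```

The second labeling violates both constraints, so its cost grows by each constraint's weight times its penalty, on top of the larger distortion.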

Page 10: An Introduction to Metric Learning for Clustering

MPCK-means algorithm – Bilenko et al. (2004)

Page 11: An Introduction to Metric Learning for Clustering

MPCK-means algorithm – Bilenko et al. (2004)

Page 12: An Introduction to Metric Learning for Clustering

References

Basu, S., Davidson, I., and Wagstaff, K. (2008). Constrained Clustering: Advances in Algorithms, Theory, and Applications. Chapman & Hall/CRC, 1st edition.

Bilenko, M., Basu, S., and Mooney, R. J. (2004). Integrating constraints and metric learning in semi-supervised clustering. In Proceedings of the Twenty-first International Conference on Machine Learning, ICML ’04, pages 11–, New York, NY, USA. ACM.

Wagstaff, K., Cardie, C., Rogers, S., and Schrödl, S. (2001). Constrained k-means clustering with background knowledge. In Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01, pages 577–584, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

Xing, E. P., Ng, A. Y., Jordan, M. I., and Russell, S. (2003). Distance metric learning, with application to clustering with side-information. In Advances in Neural Information Processing Systems, pages 505–512. MIT Press.
