Data Mining and Matrices - Max Planck...


Data Mining and Matrices
01 – Introduction

Rainer Gemulla, Pauli Miettinen

April 18, 2013

Outline

1 What is data mining?

2 What is a matrix?

3 Why data mining and matrices?

4 Summary


What is data mining?

“Data mining is the process of discovering knowledge or patterns from massive amounts of data.” (Encyclopedia of Database Systems)

Rock Gold Tools Miners

Data Knowledge Software Analysts

Estimated $100 billion industry around managing and analyzing data.

Data, Data everywhere. The Economist, 2010.

What is data mining?


Science
- The Sloan Digital Sky Survey gathered 140 TB of information
- The NASA Center for Climate Simulation stores 32 PB of data
- 3B base pairs exist in the human genome
- The LHC registers 600M particle collisions per second, 25 PB/year

Social data
- 1M customer transactions are performed at Walmart per hour
- 25M Netflix customers view and rate hundreds of thousands of movies
- 40B photos have been uploaded to Facebook
- 200M active Twitter users write 400M tweets per day
- 4.6B mobile-phone subscriptions worldwide

Government, health care, news, stocks, books, web search, ...


What is data mining?


Prediction

Clustering

Outlier detection

“If it rains on Seven Sleepers’ Day, the rain will not relent for seven weeks.” (German folklore: “Regnet es am Siebenschläfertag, der Regen sieben Wochen nicht weichen mag.”)

Pattern mining

What is data mining?


Knowledge discovery pipeline


Focus of this lecture

Womb


mater (Latin) = mother

matrix (Latin) = pregnant animal

matrix (Late Latin) = womb; also source, origin

Since the 1550s: a place or medium where something is developed

Since the 1640s: an embedding or enclosing mass

Online Etymology Dictionary

Rectangular arrays of numbers

“Rectangular arrays” were known in ancient China (rod calculus, estimated as early as 300 BC)

1 0 0 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 0 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 1

The term “matrix” was coined by J. J. Sylvester in 1850


System of linear equations

Systems of linear equations can be written as matrices

3x + 2y + z = 39

2x + 3y + z = 34

x + 2y + 3z = 26

3 2 1 | 39
2 3 1 | 34
1 2 3 | 26

and then be solved using linear algebra methods (e.g., Gaussian elimination):

3 2 1  | 39
0 5 1  | 24
0 0 12 | 33

⟹ (x, y, z) = (9.25, 4.25, 2.75)
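The same system can be solved numerically with any linear algebra library; a minimal NumPy sketch using the coefficients from the example above (NumPy itself is not part of the slides):

```python
import numpy as np

# Coefficient matrix and right-hand side of the system above
A = np.array([[3.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 2.0, 3.0]])
b = np.array([39.0, 34.0, 26.0])

# Solve A (x, y, z)^T = b
x = np.linalg.solve(A, b)
print(x)  # ≈ [9.25 4.25 2.75]
```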


Set of data points

[Scatter plot of the 50 data points in the (x, y) plane]

   x       y
−3.84   −2.21
−3.33   −2.19
−2.55   −1.47
−2.46   −1.25
−1.49   −0.76
−1.67   −0.39
−1.30   −0.59
  ...     ...
 1.59    0.78
 1.53    1.02
 1.45    1.26
 1.86    1.18
 2.04    0.96
 2.42    1.24
 2.32    2.03
 2.90    1.35

Linear maps

Linear maps from R^3 to R:

f1(x, y, z) = 3x + 2y + z
f2(x, y, z) = 2x + 3y + z
f3(x, y, z) = x + 2y + 3z
f4(x, y, z) = x

Linear map f1 written as a matrix:

( 3 2 1 ) (x y z)^T = f1(x, y, z)

Linear map from R^3 to R^4:

( 3 2 1 )             ( f1(x, y, z) )
( 2 3 1 ) (x y z)^T = ( f2(x, y, z) )
( 1 2 3 )             ( f3(x, y, z) )
( 1 0 0 )             ( f4(x, y, z) )
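Stacking the four maps into one 4×3 matrix lets a single matrix-vector product evaluate them all; a NumPy sketch (the test vector v is an arbitrary choice, not from the slides):

```python
import numpy as np

# The 4x3 matrix stacking the linear maps f1..f4 above
F = np.array([[3, 2, 1],
              [2, 3, 1],
              [1, 2, 3],
              [1, 0, 0]])

v = np.array([9.25, 4.25, 2.75])  # an arbitrary point (x, y, z)
y = F @ v                          # evaluates f1..f4 in one product
print(y)  # f1=39, f2=34, f3=26, f4=9.25
```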


Original data

[Scatter plot of the original data points]

Rotated and stretched

[Scatter plot of the same points after applying a rotation-and-stretch linear map]

Graphs


Adjacency matrix

Objects and attributes

Anna, Bob, and Charlie went shopping

Anna bought butter and bread

Bob bought butter, bread, and beer

Charlie bought bread and beer

         Bread  Butter  Beer
Anna       1      1      0
Bob        1      1      1
Charlie    1      0      1

Customer transactions
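The transaction table above can be written down directly as a binary object-attribute matrix; a NumPy sketch (row order follows the table, with Charlie's row taken from the text: bread and beer):

```python
import numpy as np

customers = ["Anna", "Bob", "Charlie"]
items = ["Bread", "Butter", "Beer"]

# Binary object-attribute matrix: rows = customers, columns = items
D = np.array([[1, 1, 0],   # Anna: bread, butter
              [1, 1, 1],   # Bob: bread, butter, beer
              [1, 0, 1]])  # Charlie: bread, beer

# Column sums count how often each item was bought
counts = D.sum(axis=0)
print(counts.tolist())  # → [3, 2, 2]
```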

        Data  Matrix  Mining
Book 1    5     0       3
Book 2    0     0       7
Book 3    4     6       5

Document-term matrix

         Avatar  The Matrix  Up
Alice       ?        4        2
Bob         3        2        ?
Charlie     5        ?        3

Incomplete rating matrix

              Jan    Jun   Sep
Saarbrücken     1    11    10
Helsinki     −6.5  10.9   8.7
Cape Town    15.7   7.8   8.7

Cities and monthly temperatures

Many different kinds of data fit this object-attribute viewpoint.


What is a matrix?

A means to describe computation
- Rotation
- Rescaling
- Permutation
- Projection
- ...

Linear operators

A means to describe data


Rows         Columns      Entries
Objects      Attributes   Values
Equations    Variables    Coefficients
Data points  Axes         Coordinates
Vertices     Vertices     Edges

[Generic data matrix A: object i corresponds to row i, attribute j to column j, and entry Aij holds the value of attribute j for object i]

In data mining, we make use of both viewpoints simultaneously.


Key tool: Matrix decompositions

A matrix decomposition of a data matrix D is given by three matrices L, M, R such that

D = LMR,

where

D is an m × n data matrix,

L is an m × r matrix,

M is an r × r matrix,

R is an r × n matrix, and

r is an integer ≥ 1.


Dij = Σk,k′ Lik Mkk′ Rk′j

[Diagram: entry Dij is computed from row Li∗ of L, the matrix M, and column R∗j of R]

There are many different kinds of matrix decompositions, each putting certain constraints on the matrices L, M, R (which may not be easy to find).
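The entry-wise formula above can be checked against the matrix product on random data; a NumPy sketch (the dimensions m, r, n are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
m, r, n = 4, 2, 5
L = rng.standard_normal((m, r))
M = rng.standard_normal((r, r))
R = rng.standard_normal((r, n))

# Matrix product form: D = L M R
D = L @ M @ R

# Entry-wise form: D_ij = sum over k, k' of L_ik * M_kk' * R_k'j
D_entry = np.einsum('ik,kl,lj->ij', L, M, R)
assert np.allclose(D, D_entry)
```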

Example: Singular value decomposition

D (50×2) = L (50×2) · M (2×2) · R (2×2), with

M = ( 11.73    0   )
    (   0    1.71  )

[Three scatter plots: the data points of D; the corresponding rows of L; and the rows R1∗ and R2∗ of R, which form the new axes]
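NumPy computes this decomposition directly; a sketch on random stand-in data (not the 50×2 point cloud from the slide), taking L = U, M = diag(σ), R = Vᵀ:

```python
import numpy as np

# Random stand-in data with a dominant direction, shaped like the
# slide's example (50 points in 2 dimensions)
rng = np.random.default_rng(42)
D = rng.standard_normal((50, 2)) @ np.array([[3.0, 1.0], [1.0, 0.5]])

# Thin SVD: D = U diag(s) Vt
U, s, Vt = np.linalg.svd(D, full_matrices=False)
L, M, R = U, np.diag(s), Vt   # D = L M R with diagonal M
assert np.allclose(L @ M @ R, D)
print(s)  # singular values, largest first
```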

Example: Non-negative matrix factorization


[Illustration: a column D∗j (a face image) approximated as LR∗j, a combination of the parts stored in L]

Lee and Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 1999.
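A hedged sketch of NMF using scikit-learn on random non-negative data; the component count and solver settings are illustrative, not taken from the paper:

```python
import numpy as np
from sklearn.decomposition import NMF

# Non-negative toy data (e.g., pixel intensities or word counts)
rng = np.random.default_rng(0)
D = rng.random((20, 10))

# Factor D ≈ L R with L, R constrained to be non-negative
nmf = NMF(n_components=3, init='random', random_state=0, max_iter=500)
L = nmf.fit_transform(D)   # 20 x 3, non-negative
R = nmf.components_        # 3 x 10, non-negative
assert (L >= 0).all() and (R >= 0).all()
```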

Example: Latent Dirichlet allocation


[Illustration: topics (rows of R) as distributions over words, and per-document topic mixtures (L)]

Blei et al. Latent dirichlet allocation. JMLR, 2003.
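A sketch using scikit-learn's LatentDirichletAllocation on a toy document-term count matrix; the dimensions and topic count are illustrative:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Toy document-term count matrix (10 documents, 8 terms)
rng = np.random.default_rng(0)
counts = rng.integers(0, 5, size=(10, 8))

lda = LatentDirichletAllocation(n_components=2, random_state=0)
L = lda.fit_transform(counts)   # document-topic mixtures (rows sum to 1)
R = lda.components_             # topic-term weights
assert np.allclose(L.sum(axis=1), 1.0)
```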

Other matrix decompositions

Singular value decomposition (SVD)

k-means

Non-negative matrix factorization (NMF)

Semi-discrete decomposition (SDD)

Boolean matrix decomposition (BMF)

Independent component analysis (ICA)

Matrix completion

Probabilistic matrix factorization

. . .


What can we do with matrix decompositions?

Separate data from multiple processes

Remove noise from the data

Remove redundancy from the data

Reveal latent structure and similarities in the data

Fill in missing entries

Find local patterns

Reduce space consumption

Reduce computational cost

Aid visualization

Matrix decompositions can make data mining algorithms more effective. They may also provide insight into the data by themselves.


Factor interpretation of matrix decompositions

Assume that M is diagonal. Consider object i .

Row of R = part (or piece), called latent factor (“latent object”)

Entry of M = weight of corresponding part

Row of MR = weighted part

Row of L = “view” of corresponding row of D in terms of the weighted parts (r pieces of information)

r forces “compactness” (often r < n)


Each object can be viewed as a combination of r (weighted) “latent objects” (or “prototypical objects”). Similarly, each attribute can be viewed as a combination of r (weighted) “latent attributes.”

(e.g., latent attribute = “body size”; latent object relates body size to real attributes such as “height”, “weight”, “shoe size”)

Di∗ = Σk Lik Mkk Rk∗

[Diagram: row Di∗ of D obtained from row Li∗ of L combined with M and R]
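With diagonal M, a row of D is exactly the weighted combination of the rows of R described above; a NumPy sketch on random factors (dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
m, r, n = 5, 3, 4
L = rng.standard_normal((m, r))
M = np.diag(rng.random(r) + 0.5)   # diagonal M, as assumed above
R = rng.standard_normal((r, n))
D = L @ M @ R

i = 2
# Row i of D is a combination of the r weighted parts M_kk * R_k*
row = sum(L[i, k] * M[k, k] * R[k, :] for k in range(r))
assert np.allclose(row, D[i, :])
```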

Other interpretations

Geometric interpretation
- Transformation of the n-dimensional space into an r-dimensional space
- Row of R = axis
- Row of LM = coordinates

Component interpretation
- D is viewed as consisting of r layers (of the same shape as D)
- The k-th layer is described by L∗k Mkk Rk∗
- D = Σk L∗k Mkk Rk∗

Graph interpretation
- D is thought of as a bipartite graph with object and attribute vertices
- Edge weights measure the association between objects and attributes
- The decomposition is thought of as a tripartite graph with row, waypoint, and column vertices
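The component (layer) interpretation can be checked numerically: summing the r rank-1 layers recovers D. A NumPy sketch on random factors:

```python
import numpy as np

rng = np.random.default_rng(2)
m, r, n = 5, 3, 4
L = rng.standard_normal((m, r))
M = np.diag(rng.random(r) + 0.5)   # diagonal M
R = rng.standard_normal((r, n))
D = L @ M @ R

# D as a sum of r rank-1 layers L_*k M_kk R_k*
layers = [np.outer(L[:, k], R[k, :]) * M[k, k] for k in range(r)]
assert np.allclose(sum(layers), D)
```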

All interpretations are useful (more later).



Lessons learned

Data mining = from data to knowledge
→ Prediction, clustering, outlier detection, local patterns

Many different data types can be represented with a matrix
→ Linear equations, data points, maps, graphs, relational data, ...

Common interpretation: rows = objects, columns = attributes

Matrix decompositions reveal structure in the data
→ D = LMR

Many different decompositions with different applications exist
→ SVD, k-means, NMF, SDD, BMF, ICA, completion, ...

Factor interpretation: objects described by “latent attributes”


Suggested reading

David Skillicorn. Understanding Complex Datasets: Data Mining with Matrix Decompositions (Chapters 1–2). Chapman and Hall, 2007.
