Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s...
![Page 1: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/1.jpg)
Computer Science in the Information Age
John Hopcroft
Cornell University
Ithaca, New York
![Page 2: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/2.jpg)
Time of change
There is a fundamental revolution taking place that is changing all aspects of our lives.
Those individuals who recognize this and position themselves for the future will benefit enormously.
![Page 3: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/3.jpg)
Early years of Computer Science
Programming languages
Compilers
Operating systems
Network protocols
Algorithms
Computability
![Page 4: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/4.jpg)
Computer Science is changing
Structure of large networks
Large data sets
Information
Search
![Page 5: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/5.jpg)
Drivers of change
Computers are becoming ubiquitous
Speed sufficient for word processing, email, chat and spreadsheets
Merging of computing and communications
Data available in digital form
Devices being networked
![Page 6: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/6.jpg)
Computer Science departments are beginning to develop courses that cover the underlying theory
Random graphs
Phase transitions
Giant components
Spectral analysis
Small world phenomena
Grown graphs
![Page 7: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/7.jpg)
What is the theory needed to support the future?
Large amounts of data
Noisy data with outliers
High dimensional
Possibly power law distributed
![Page 8: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/8.jpg)
Internet queries
Today
Autos
Graph theory
Colleges, universities
Computer science
![Page 9: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/9.jpg)
Internet queries are changing
Today
Autos
Graph theory
Colleges, universities
Computer science
Tomorrow
Which car should I buy?
Construct an annotated bibliography on graph theory
Where should I go to college?
How did the field of CS develop?
![Page 10: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/10.jpg)
What car should I buy?
List of makes
Cost
Reliability
Fuel economy
Crash safety
Pertinent articles
Consumer Reports
Car and Driver
![Page 11: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/11.jpg)
Where should I go to college?
List of factors that might influence choice
Cost
Geographic location
Size
Type of institution
Metrics
Ranking of programs
Student faculty ratios
Graduates from your high school/neighborhood
![Page 12: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/12.jpg)
How did the field develop?
From ISI database create set of important papers in field
Extract important authors
author of key paper
author of several important papers
thesis advisor of Ph.D. student(s) who is (are) important author(s)
Extract key institutions
institution of important author(s)
Ph.D. granting institution of important authors
Output
Directed graph of key papers
Flow of Ph.D.’s geographically with time
Flow of ideas
![Page 13: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/13.jpg)
Temporal Cluster Histograms: NIPS Results
12: chip, circuit, analog, voltage, vlsi
11: kernel, margin, svm, vc, xi
10: bayesian, mixture, posterior, likelihood, em
9: spike, spikes, firing, neuron, neurons
8: neurons, neuron, synaptic, memory, firing
7: david, michael, john, richard, chair
6: policy, reinforcement, action, state, agent
5: visual, eye, cells, motion, orientation
4: units, node, training, nodes, tree
3: code, codes, decoding, message, hints
2: image, images, object, face, video
1: recurrent, hidden, training, units, error
NIPS k-means clusters (k=13)
![Page 14: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/14.jpg)
Publishing
Researcher makes discovery and writes technical paper
Submits paper to journal
Journal sends it out for refereeing
Revised article is copy edited and appears in print about two years later
Results are available worldwide through major research libraries
![Page 15: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/15.jpg)
The future of publishing
Advice to young faculty forced journals to let Google search them and ultimately to allow authors to post their article on their website.
What other changes are in store?
![Page 16: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/16.jpg)
Wikipedia
905,707 articles
Recent comparison showed Wikipedia almost as accurate as Encyclopedia Britannica
Major text source for formulas, definitions and proofs
![Page 17: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/17.jpg)
Cayley's formula From Wikipedia, the free encyclopedia.
In mathematics, Cayley's formula is a result in graph theory. It states that if n is a positive integer, the number of trees on n labeled vertices is $n^{n-2}$.
It is a particular case of Kirchhoff's theorem.
Proof of the formula
Let $T_n$ be the set of trees possible on the vertex set $\{v_1, v_2, \dots, v_n\}$. We seek to show that $|T_n| = n^{n-2}$.
We begin by proving a lemma:
Claim: Let $d_1, d_2, \dots, d_n$ be positive integers such that $\sum_{i=1}^{n} d_i = 2n - 2$. Let $\mathcal{A}$ be the set of trees on the vertex set $\{v_1, v_2, \dots, v_n\}$ such that the degree of $v_i$ (denoted $d(v_i)$) is $d_i$ for $i = 1, 2, \dots, n$. Then $|\mathcal{A}| = \frac{(n-2)!}{(d_1 - 1)!\,(d_2 - 1)! \cdots (d_n - 1)!}$.
Proof: We go by induction on n. For n = 1 and n = 2 the proposition holds trivially and is easy to verify. We move to the inductive step. Assume n > 2 and that the claim holds for degree sequences on n − 1 vertices. Since the $d_i$ are all positive but their sum is less than 2n,
there exists $k$ such that $d_k = 1$. Assume without loss of generality that $d_n = 1$.
For $i = 1, 2, \dots, n - 1$ let $\mathcal{B}_i$ be the set of trees on the vertex set $\{v_1, v_2, \dots, v_{n-1}\}$ such that:
![Page 18: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/18.jpg)
Fed Ex package tracking
![Page 19: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/19.jpg)
![Page 20: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/20.jpg)
![Page 21: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/21.jpg)
![Page 22: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/22.jpg)
![Page 23: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/23.jpg)
Theory to support new directions
Large graphs
Spectral analysis
High dimensions and dimension reduction
Clustering
Collaborative filtering
Extracting signal from noise
![Page 24: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/24.jpg)
Graph Theory of the 50’s
Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.
![Page 25: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/25.jpg)
Theory of Large Graphs
Large graphs
Billion vertices
Exact edges present not critical
Theoretical basis for study of large graphs
Maybe theory of graph generation
Invariant to small changes in definition
Must be able to prove basic theorems
![Page 26: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/26.jpg)
Erdős–Rényi
n vertices
each of $n^2$ potential edges is present with independent probability $p$
$\binom{N}{n} p^n (1-p)^{N-n}$
[Figure: binomial degree distribution; axes: vertex degree, number of vertices]
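The model on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `gnp_degrees` and its `seed` parameter are invented here, and each unordered pair of vertices is treated as one potential edge, which may differ from the slide's count of potential edges.

```python
import random
from collections import Counter


def gnp_degrees(n, p, seed=0):
    """Sample a random graph on n vertices and return every vertex degree.

    Assumption: each unordered pair {u, v} is one potential edge and is
    present independently with probability p.
    """
    rng = random.Random(seed)
    degree = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                degree[u] += 1
                degree[v] += 1
    return degree


if __name__ == "__main__":
    degrees = gnp_degrees(n=1000, p=0.01)
    histogram = Counter(degrees)
    # Print "degree: number of vertices"; the shape should look binomial.
    for k in sorted(histogram):
        print(k, histogram[k])
```

With these parameters the histogram should peak near the expected degree p(n − 1), which is the binomial shape the figure refers to.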
![Page 27: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/27.jpg)
![Page 28: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/28.jpg)
Generative models for graphs
Vertices and edges added at each unit of time
Rule to determine where to place edges
Uniform probability
Preferential attachment - gives rise to power law degree distributions
![Page 29: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/29.jpg)
Preferential attachment gives rise to the power law degree distribution common in many graphs
[Figure: degree distribution; axes: vertex degree, number of vertices]
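A minimal sketch of a grown graph with preferential attachment, assuming the simplest possible rule: one new vertex per time step, attached by a single edge to an existing vertex chosen with probability proportional to its degree. The function name and the starting graph (one edge between vertices 0 and 1) are choices made for this illustration, not details from the slides.

```python
import random
from collections import Counter


def preferential_attachment(n, seed=0):
    """Grow a graph to n vertices and return the list of vertex degrees."""
    rng = random.Random(seed)
    degree = [1, 1]        # start from a single edge between vertices 0 and 1
    endpoints = [0, 1]     # every edge contributes both of its endpoints
    for new_vertex in range(2, n):
        # A uniform draw from the endpoint list picks an existing vertex
        # with probability proportional to its current degree.
        target = rng.choice(endpoints)
        degree.append(1)
        degree[target] += 1
        endpoints.extend([new_vertex, target])
    return degree


if __name__ == "__main__":
    histogram = Counter(preferential_attachment(10000))
    for k in sorted(histogram)[:10]:
        print(k, histogram[k])
```

Replacing `rng.choice(endpoints)` with `rng.randrange(new_vertex)` gives the uniform rule, which makes it easy to compare the two degree distributions.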
![Page 30: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/30.jpg)
Protein interactions
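2730 proteins in database
3602 interactions between proteins

| SIZE OF COMPONENT | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | … | 1851 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NUMBER OF COMPONENTS | 48 | 179 | 50 | 25 | 14 | 6 | 4 | 6 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | | 1 |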
Science 1999 July 30; 285:751-753
![Page 31: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/31.jpg)
Giant Component
1. Create n isolated vertices
2. Add edges randomly one by one
3. Compute number of connected components
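The three steps above can be sketched with a union-find structure. This is an illustrative simulation, not the code behind the slides: the names are invented, and edges are drawn uniformly with replacement, so a repeated edge is simply a no-op.

```python
import random
from collections import Counter


class DisjointSet:
    """Union-find with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return
        if self.size[root_a] < self.size[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.size[root_a] += self.size[root_b]


def component_histogram(n, num_edges, seed=0):
    """Return {component size: number of components} after adding random edges."""
    rng = random.Random(seed)
    components = DisjointSet(n)
    for _ in range(num_edges):
        u, v = rng.sample(range(n), 2)   # two distinct endpoints
        components.union(u, v)
    histogram = Counter()
    for v in range(n):
        if components.find(v) == v:      # v is the root of its component
            histogram[components.size[v]] += 1
    return histogram


if __name__ == "__main__":
    for edges in (0, 1, 250, 500, 750):
        print(edges, sorted(component_histogram(1000, edges).items()))
```

Running it for a growing number of edges shows many small components at first and then a single component that absorbs a large share of the vertices.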
![Page 32: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/32.jpg)
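Giant Component

| Size of component | 1 |
|---|---|
| Number of components | 1000 |

| Size of component | 1 | 2 |
|---|---|---|
| Number of components | 998 | 1 |

| Size of component | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of components | 548 | 89 | 28 | 14 | 9 | 5 | 3 | 1 | 1 | 1 | 1 |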
![Page 33: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/33.jpg)
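Giant Component

| Size of component | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of components | 548 | 89 | 28 | 14 | 9 | 5 | 3 | 1 | 1 | 1 | 1 |

| Size of component | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 12 | 13 | 14 | 20 | 55 | 101 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of components | 367 | 70 | 24 | 12 | 9 | 3 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 1 | 1 | 1 |

| Size of component | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 11 | 514 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of components | 252 | 39 | 13 | 6 | 3 | 6 | 2 | 1 | 1 | 1 | 1 |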
![Page 34: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/34.jpg)
[Figure: structure of the web graph]
SCC: 56 million nodes
In: 44 million
Out: 44 million
Tendrils
Tubes
Disconnected components
Source: almaden.ibm.com
![Page 35: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/35.jpg)
Phase transitions
G(n,p)
Emergence of cycle
Giant component
Connected graph
N(p)
Emergence of arithmetic sequence
CNF satisfiability
Fix number of variables, increase number of clauses
![Page 36: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/36.jpg)
Access to Information
SMART Technology
document
aardvark 0
abacus 0
...
antitrust 42
...
CEO 17
...
microsoft 61
...
windows 14
wine 0
wing 0
winner 3
winter 0
...
zoo 0
zoology 0
Zurich 0
![Page 37: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/37.jpg)
Locating relevant documents
Query: Where can I get information on gates?
2,060,000 hits
Bill Gates 593,000
Gates county 177,000
baby gates 170,000
gates of heaven 169,000
automatic gates 83,000
fences and gates 43,000
Boolean gates 19,000
![Page 38: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/38.jpg)
Clustering documents
cluster documents
refine cluster
microsoft, windows, antitrust
Boolean gates
Gates County
automatic gates
gates
Bill Gates
![Page 39: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/39.jpg)
Refinement of another type: books
mathematics
physics
chemistry
astronomy
children’s books
textbooks
reference books
general population
![Page 40: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/40.jpg)
High Dimensions
Intuition from two and three dimensions not valid for high dimension
Volume of cube is one in all dimensions
Volume of sphere goes to zero
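The shrinking volume can be checked numerically with the standard closed form for the volume of a radius-1 ball in d dimensions, $V_d = \pi^{d/2} / \Gamma(d/2 + 1)$. The function name below is illustrative; the formula itself is the usual textbook one and is not taken from the slides.

```python
import math


def unit_ball_volume(d):
    """Volume of the radius-1 ball in d dimensions: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


if __name__ == "__main__":
    # The unit cube has volume 1 in every dimension; the ball's volume
    # rises briefly and then falls toward zero as d grows.
    for d in (1, 2, 3, 5, 10, 20, 50, 100):
        print(d, unit_ball_volume(d))
```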
![Page 41: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/41.jpg)
2 Dimensions
[Figure: unit sphere (radius 1) and unit square]
$\sqrt{\left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^2} = \frac{1}{\sqrt{2}} = 0.707$
![Page 42: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/42.jpg)
4 Dimensions
$\sqrt{\left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^2} = 1$
![Page 43: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/43.jpg)
d Dimensions
$\sqrt{d \left(\tfrac{1}{2}\right)^2} = \frac{\sqrt{d}}{2}$
[Figure: unit sphere of radius 1 and unit cube]
![Page 44: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/44.jpg)
Almost all of the volume of the unit cube is outside the unit sphere
![Page 45: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/45.jpg)
High dimension is fundamentally different from 2 or 3 dimensional space
![Page 46: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/46.jpg)
High dimensional data is inherently unstable
Given n random points in d dimensional space essentially all $n^2$ distances are equal.
$|x - y|^2 = \sum_{i=1}^{d} (x_i - y_i)^2$
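A small experiment makes the claim concrete. The sketch below assumes the random points have independent standard Gaussian coordinates, which the slide does not specify, and measures how much the pairwise distances vary relative to their mean. All names are illustrative.

```python
import math
import random
import statistics


def pairwise_distances(n, d, seed=0):
    """Euclidean distances between all pairs of n random points in d dimensions.

    Assumption: each coordinate is an independent standard Gaussian.
    """
    rng = random.Random(seed)
    points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    return [
        math.dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


def relative_spread(distances):
    """Standard deviation of the distances divided by their mean."""
    return statistics.pstdev(distances) / statistics.fmean(distances)


if __name__ == "__main__":
    for d in (2, 10, 100, 1000):
        print(d, relative_spread(pairwise_distances(30, d)))
```

The relative spread shrinks as d grows, which is the sense in which the distances become essentially equal.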
![Page 47: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/47.jpg)
Gaussian distribution
![Page 48: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/48.jpg)
Gaussian distribution
Probability mass concentrated between dotted lines
![Page 49: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/49.jpg)
Gaussian in high dimensions
![Page 50: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/50.jpg)
Two Gaussians
![Page 51: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/51.jpg)
![Page 52: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/52.jpg)
Distance between two random points from same Gaussian
Points on thin annulus of radius $\sqrt{d}$
Approximate by sphere of radius $\sqrt{d}$
Average distance between two points is $\sqrt{2d}$
(Place one point at the North Pole and the other at random. Almost surely the second point is near the equator.)
![Page 53: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/53.jpg)
![Page 54: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/54.jpg)
[Figure labels: $\sqrt{2d}$, $\sqrt{d}$, $\sqrt{d}$]
![Page 55: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/55.jpg)
Expected distance between pts from two Gaussians separated by δ
$\sqrt{\delta^2 + 2d}$
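Both distance estimates can be checked by simulation. This sketch assumes unit-variance spherical Gaussians whose centers differ by δ along a single coordinate; the helper names and parameter values are made up for the illustration.

```python
import math
import random
import statistics


def sample_point(center, rng):
    """Draw one point from a unit-variance spherical Gaussian at center."""
    return [c + rng.gauss(0.0, 1.0) for c in center]


def mean_distance(center_a, center_b, trials=200, seed=0):
    """Average distance between a point drawn around center_a and one around center_b."""
    rng = random.Random(seed)
    return statistics.fmean(
        math.dist(sample_point(center_a, rng), sample_point(center_b, rng))
        for _ in range(trials)
    )


if __name__ == "__main__":
    d, delta = 500, 10.0
    origin = [0.0] * d
    shifted = [delta] + [0.0] * (d - 1)
    # Same Gaussian: compare with sqrt(2d).
    print(mean_distance(origin, origin), math.sqrt(2 * d))
    # Two Gaussians separated by delta: compare with sqrt(delta^2 + 2d).
    print(mean_distance(origin, shifted), math.sqrt(delta ** 2 + 2 * d))
```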
![Page 56: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/56.jpg)
Can separate points from two Gaussians if
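$\sqrt{\delta^2 + 2d} > \sqrt{2d} + \gamma$
$\sqrt{2d}\left(1 + \frac{1}{2}\,\frac{\delta^2}{2d} + \cdots\right) > \sqrt{2d} + \gamma$
$\frac{\delta^2}{2\sqrt{2d}} > \gamma$
$\delta > (2\gamma)^{1/2} (2d)^{1/4}$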
![Page 57: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/57.jpg)
Dimension reduction
Project points onto subspace containing centers of Gaussians
Reduce dimension from d to k, the number of Gaussians
![Page 58: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/58.jpg)
Centers retain separation
Average distance between points reduced by a factor of $\sqrt{d/k}$
![Page 59: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/59.jpg)
Can separate Gaussians provided
$\sqrt{\delta^2 + 2k} > \sqrt{2k} + \gamma$
$\delta >$ some constant involving k and γ, independent of the dimension
![Page 60: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/60.jpg)
Ranking is important
Restaurants
Movies
Web pages
Multi billion dollar industry
![Page 61: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/61.jpg)
Page rank equals stationary probability of random walk
![Page 62: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/62.jpg)
restart
15%
![Page 63: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/63.jpg)
restart
15%
Restart yields strongly connected graph
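The random walk with restart can be sketched as a power iteration. This is a minimal illustration, not the production algorithm: the graph is an adjacency list invented for the example, the 15% restart matches the slide, and sending the walk from a vertex with no out-edges to a uniformly random vertex is an assumption made here.

```python
def pagerank(out_links, restart=0.15, iterations=100):
    """Approximate the stationary distribution of a random walk with restart.

    out_links[u] lists the vertices that u links to. With probability
    `restart` the walk jumps to a uniformly random vertex; otherwise it
    follows a random out-edge. Assumption: a vertex with no out-edges
    jumps uniformly at random.
    """
    n = len(out_links)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = [restart / n] * n
        for u, targets in enumerate(out_links):
            if targets:
                share = (1.0 - restart) * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                share = (1.0 - restart) * rank[u] / n
                for v in range(n):
                    new_rank[v] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical three-page web: 0 links to 1 and 2, 1 links to 2, 2 links to 0.
    print(pagerank([[1, 2], [2], [0]]))
```

Because every vertex can be reached through a restart, the walk has a single stationary distribution no matter how the original links are arranged.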
![Page 64: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/64.jpg)
Suppose you wish to increase the page rank of vertex v
Capture restart: web farm
Capture random walk: small cycles
![Page 65: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/65.jpg)
Capture restart
Buy 20,000 URLs and capture restart
Can be countered by small restart value.
Small restart increases web rank of page that captures random walk by small cycles.
![Page 66: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/66.jpg)
Capture random walk
[Figure labels: Pagerank 1 with 0.15 restart and 0.85 passed on; Pagerank 1.56 with 0.23 restart; Pagerank 0.66 with 0.1 restart]
![Page 67: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/67.jpg)
X = 1 + 0.85*Y
Y = 0.85*X/2
X = 1.56
Y = 0.66
[Figure labels: edge values 1, 0.56, 0.66, 0.66; restart 0.23 and 0.1]
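The two equations can be solved by substitution. The sketch below does so with parameters that generalize the slide's numbers; the function and parameter names are illustrative only.

```python
def cycle_pagerank(follow=0.85, out_degree=2, inflow=1.0):
    """Solve x = inflow + follow * y and y = follow * x / out_degree.

    Substituting the second equation into the first gives
    x = inflow / (1 - follow**2 / out_degree).
    """
    x = inflow / (1.0 - follow * follow / out_degree)
    y = follow * x / out_degree
    return x, y


if __name__ == "__main__":
    x, y = cycle_pagerank()
    print(x, y)
```

With the slide's values this gives x ≈ 1.566 and y ≈ 0.665, which the slide shows as 1.56 and 0.66.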
![Page 68: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/68.jpg)
If one loop increases Pagerank from 1 to 1.56 why not add many self loops?
Maximum increase in Pagerank is 6.67
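One way to see where 6.67 could come from is the extreme case in which every non-restart step out of the vertex returns to it. This reading is an assumption, not something stated on the slide; it keeps the inflow of 1 and the 0.85 follow probability used in the earlier example:

$$x = 1 + 0.85\,x \quad\Longrightarrow\quad x = \frac{1}{1 - 0.85} = \frac{1}{0.15} \approx 6.67$$

Under that assumption, no number of added loops can push the value past 1/0.15, because 15% of the walk is lost to restart on every step.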
![Page 69: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/69.jpg)
Discovery time – time to first reach a vertex by random walk from uniform start
[Figure: set S of pages]
Cannot lower discovery time of any page in S below minimum already in S
![Page 70: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/70.jpg)
Why not replace Pagerank by discovery time?
No efficient algorithm for discovery time
DiscoveryTime(v)
remove edges out of v
calculate Pagerank(v) in modified graph
![Page 71: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/71.jpg)
Is there a way for a spammer to raise Pagerank in a way that is not statistically detectable?
![Page 72: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/72.jpg)
Information is important
When a customer makes a purchase what else is he likely to buy?
Camera
Memory card
Batteries
Carrying case
Etc.
Knowing what a customer is likely to buy is important information.
![Page 73: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/73.jpg)
How can we extract information from a customer’s visit to a web site?
What web pages were visited?
What order?
How long?
![Page 74: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/74.jpg)
Collaborative filtering
Recommendations
Which pop-up ads
Detecting changes over time
Changes in a market
Buying habits
Access to information
![Page 75: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/75.jpg)
[Figure: matrix with 200 million rows (customers) and one million columns (products); each entry is the probability of a customer buying a product]
![Page 76: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/76.jpg)
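[Figure: the customer-by-product matrix written as a product of two matrices: a matrix with 200 million rows (customers) and 100 columns (categories) giving the probability of category given customer, times a matrix with 100 rows (categories) and one million columns (products) giving the probability of product given category]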
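The factorization through categories can be sketched on a toy example. Everything below is illustrative: the sizes (3 customers, 2 categories, 4 products), the probability values, and the helper `matmul` are assumptions, and each row is treated as a probability distribution for simplicity. The last lines compare how many numbers must be stored at the scale quoted on the slides.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

# Toy factors: rows are customers (first matrix) and categories (second).
category_given_customer = [
    [0.9, 0.1],
    [0.5, 0.5],
    [0.2, 0.8],
]
product_given_category = [
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.1, 0.3, 0.5],
]

# Probability of product given customer, summed over categories.
product_given_customer = matmul(category_given_customer, product_given_category)

# Storage at the scale on the slides.
customers, products, categories = 200_000_000, 1_000_000, 100
full_entries = customers * products
factored_entries = customers * categories + categories * products
```

With these sizes the full matrix has 2 x 10^14 entries, while the two factors together have about 2 x 10^10, roughly a ten-thousand-fold reduction.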
![Page 77: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/77.jpg)
Extracting Information from Large Data Sources
Data streams
Large data collections
Detecting changes in patterns
![Page 78: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/78.jpg)
Detecting trends before they become obvious
Is some category of customer changing their buying habits?
Purchases, travel destination, vacations
Is there some new trend in the stock market?
How do we detect changes in a large database over time?
![Page 79: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/79.jpg)
Identifying Changing Patterns in a Large Data Set
How soon can one detect a change in patterns in a large volume of information?
How large must a change be in order to distinguish it from random fluctuations?
![Page 80: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/80.jpg)
Conclusions
We are in an exciting time of change.
Information technology is a big driver of that change.
The computer science theory of the last thirty years needs to be extended to cover the next thirty years.
![Page 81: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/81.jpg)
Spectral Analysis
The model
$$\hat{G} = \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{4} \\ \tfrac{1}{4} & \tfrac{1}{2} \end{pmatrix} \;\rightarrow\; G = \text{0-1 matrix}$$

McSherry FOCS 2002
![Page 82: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/82.jpg)
Eigenvalue distribution
![Page 83: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/83.jpg)
Spectral Analysis
Recovering the graph structure
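$$G = U D U^{T} = \begin{pmatrix} u_1 & u_2 & \cdots & u_n \end{pmatrix} \begin{pmatrix} \lambda_1 & & & & & \\ & \lambda_2 & & & & \\ & & \lambda_3 & & & \\ & & & \lambda_4 & & \\ & & & & \ddots & \\ & & & & & \lambda_n \end{pmatrix} \begin{pmatrix} u_1 & u_2 & \cdots & u_n \end{pmatrix}^{T}$$

Keeping only the two largest eigenvalues:

$$\begin{pmatrix} u_1 & u_2 & \cdots & u_n \end{pmatrix} \begin{pmatrix} \lambda_1 & & & & \\ & \lambda_2 & & & \\ & & 0 & & \\ & & & \ddots & \\ & & & & 0 \end{pmatrix} \begin{pmatrix} u_1 & u_2 & \cdots & u_n \end{pmatrix}^{T} \approx \hat{G} = \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{4} \\ \tfrac{1}{4} & \tfrac{1}{2} \end{pmatrix}$$

McSherry FOCS 2002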
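A small experiment can make the recovery step concrete. The sketch below assumes the model is a two-block random graph in which vertices in the same block are joined with probability 1/2 and vertices in different blocks with probability 1/4; it samples a 0-1 matrix, approximates the two leading eigenvectors by power iteration, and splits the vertices by the sign of the second eigenvector. The graph size, the seed, the iteration count, and the function names are choices made for this example, and the sign rule is a simple stand-in rather than the algorithm from the cited paper.

```python
import math
import random

def sample_graph(n, p_in, p_out, rng):
    """Sample a symmetric 0-1 matrix; vertices 0..n/2-1 form the first block."""
    half = n // 2
    g = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            same_block = (i < half) == (j < half)
            if rng.random() < (p_in if same_block else p_out):
                g[i][j] = g[j][i] = 1
    return g

def power_iteration(g, rng, orthogonal_to=None, iterations=50):
    """Approximate a dominant eigenvector of the symmetric matrix g.

    If `orthogonal_to` (a unit vector) is given, its component is removed
    after every multiplication, so the result approximates the next
    eigenvector instead of the first.
    """
    v = [rng.random() - 0.5 for _ in range(len(g))]
    for _ in range(iterations):
        v = [sum(a * b for a, b in zip(row, v)) for row in g]
        if orthogonal_to is not None:
            overlap = sum(a * b for a, b in zip(v, orthogonal_to))
            v = [a - overlap * b for a, b in zip(v, orthogonal_to)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v

rng = random.Random(0)
n = 200
g = sample_graph(n, 0.5, 0.25, rng)
u1 = power_iteration(g, rng)
u2 = power_iteration(g, rng, orthogonal_to=u1)

# The sign pattern of the second eigenvector separates the two blocks.
guess = [x > 0 for x in u2]
truth = [i < n // 2 for i in range(n)]
agree = sum(a == b for a, b in zip(guess, truth)) / n
accuracy = max(agree, 1 - agree)   # eigenvectors are defined only up to sign
```

The design choice here is to avoid a full eigendecomposition: only two eigenvectors are needed, and in this model the two leading eigenvalues are well separated from the rest, so a few dozen iterations should suffice.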
![Page 84: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/84.jpg)
Power law distributions
If the degrees of a random matrix are power law distributed, the major eigenvectors are associated with neighborhoods of high degree vertices rather than with the structure of the graph.
Papadimitriou and Mihail
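$$\hat{M} = D \hat{G} D = \begin{pmatrix} d_1 & & & \\ & d_2 & & \\ & & \ddots & \\ & & & d_n \end{pmatrix} \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{4} \\ \tfrac{1}{4} & \tfrac{1}{2} \end{pmatrix} \begin{pmatrix} d_1 & & & \\ & d_2 & & \\ & & \ddots & \\ & & & d_n \end{pmatrix} \;\rightarrow\; M$$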
![Page 85: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/85.jpg)
Signal to noise ratio
Multiply every element of matrix by some fixed constant.
The bounds in spectral analysis are determined by maximum noise of an element not the average.
Multiplying low variance elements by constant increases signal without increasing maximum noise.
![Page 86: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/86.jpg)
Variance of random variable
$$x = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } 1-p \end{cases} \qquad \sigma^2(x) = p(1-p) \cong p$$
![Page 87: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/87.jpg)
Two ways of modifying x
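$$y = \begin{cases} 1 & \text{with probability } cp \\ 0 & \text{with probability } 1-cp \end{cases} \qquad \sigma^2(y) = cp(1-cp) \cong cp$$

$$z = \begin{cases} c & \text{with probability } p \\ 0 & \text{with probability } 1-p \end{cases} \qquad \sigma^2(z) = c^2 p(1-p) \cong c^2 p$$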
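The two modifications can be checked numerically. In the sketch below, y raises the probability of a 0-1 variable from p to cp, while z keeps the probability p but takes the value c instead of 1. The helper name `bernoulli_variance` and the values p = 0.001 and c = 4 are assumptions chosen for illustration.

```python
def bernoulli_variance(p, scale=1.0):
    """Variance of a variable that equals `scale` with probability p, else 0."""
    return scale ** 2 * p * (1 - p)

p, c = 0.001, 4.0
var_x = bernoulli_variance(p)             # original variable x
var_y = bernoulli_variance(c * p)         # probability increased by factor c
var_z = bernoulli_variance(p, scale=c)    # value multiplied by c

ratio_y = var_y / var_x   # close to c when p is small
ratio_z = var_z / var_x   # equal to c squared
```

For small p the first ratio is approximately c and the second is c squared, which matches the approximations on the slide.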
![Page 88: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/88.jpg)
Increasing probability by $c$ increases variance by $c$
Multiplying variable by $c$ increases variance by $c^2$
Thus we correct for an increase in probability by a factor of $c$ by multiplying the variable by $\frac{1}{\sqrt{c}}$

$$L = \begin{pmatrix} \frac{1}{\sqrt{d_1}} & & & \\ & \frac{1}{\sqrt{d_2}} & & \\ & & \ddots & \\ & & & \frac{1}{\sqrt{d_n}} \end{pmatrix} M \begin{pmatrix} \frac{1}{\sqrt{d_1}} & & & \\ & \frac{1}{\sqrt{d_2}} & & \\ & & \ddots & \\ & & & \frac{1}{\sqrt{d_n}} \end{pmatrix}$$
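The rescaling can be sketched as follows, under the assumption that entry (i, j) of M is divided by the square root of d_i d_j, which is what multiplying by 1/sqrt(c) with c = d_i d_j amounts to. The function names, the toy matrix, and the base probability 0.001 are hypothetical. The second helper checks that an entry whose probability was inflated by d_i d_j has roughly the same variance after rescaling, whatever the degrees are.

```python
import math

def degree_normalize(m, d):
    """Return L with L[i][j] = M[i][j] / sqrt(d[i] * d[j])."""
    n = len(m)
    return [[m[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]

def scaled_entry_variance(g, d_i, d_j):
    """Variance of a 0-1 entry with probability g * d_i * d_j after it is
    multiplied by 1 / sqrt(d_i * d_j)."""
    q = g * d_i * d_j
    return q * (1 - q) / (d_i * d_j)

# Toy symmetric 0-1 matrix and degree parameters.
m = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]
d = [2.0, 1.0, 1.0]
normalized = degree_normalize(m, d)

# Same base probability, different degree pairs: variances stay near 0.001.
variances = [scaled_entry_variance(0.001, d_i, d_j)
             for d_i, d_j in [(1, 1), (2, 3), (5, 4)]]
```

The point of the check is that the maximum noise of an element no longer grows with the degrees, which is the quantity the spectral bounds depend on.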
![Page 89: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/89.jpg)
Power law distributions arise in many different contexts
1) data
2) queries
![Page 90: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/90.jpg)
General theme emerging in clustering
Use SVD to find reduced subspace
Project data onto reduced subspace
Cluster
Although SVD minimizes the sum of squared error between points and their expected values, the error is not uniformly distributed and thus there are usually some outliers
Reproject data onto subspace through the approximate cluster centers
Recluster
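The steps above can be sketched for the simplest case of two clusters, where the reduced subspace is a single direction. This is a simplified illustration under stated assumptions, not the general method: the top singular direction is approximated by power iteration, clustering is a one-dimensional two-means, and the reprojection is onto the line through the two approximate centers. All function names and the synthetic data (two spherical Gaussians in 20 dimensions) are made up for the example.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_singular_direction(points, iterations=50):
    """Approximate the top right singular vector of the data matrix X
    by power iteration, computing X^T (X v) without forming X^T X."""
    d = len(points[0])
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        coeffs = [dot(p, v) for p in points]
        w = [sum(c * p[k] for c, p in zip(coeffs, points)) for k in range(d)]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return v

def two_means_1d(values, iterations=20):
    """Lloyd's algorithm with two centers on a line; returns 0/1 labels.
    Assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    labels = []
    for _ in range(iterations):
        labels = [0 if abs(x - lo) <= abs(x - hi) else 1 for x in values]
        group0 = [x for x, lab in zip(values, labels) if lab == 0]
        group1 = [x for x, lab in zip(values, labels) if lab == 1]
        lo, hi = sum(group0) / len(group0), sum(group1) / len(group1)
    return labels

def centroid(points):
    return [sum(p[k] for p in points) / len(points) for k in range(len(points[0]))]

def project_cluster_reproject(points):
    # Project onto the reduced subspace and cluster.
    v = top_singular_direction(points)
    labels = two_means_1d([dot(p, v) for p in points])
    # Reproject onto the line through the approximate centers and recluster.
    c0 = centroid([p for p, lab in zip(points, labels) if lab == 0])
    c1 = centroid([p for p, lab in zip(points, labels) if lab == 1])
    direction = [b - a for a, b in zip(c0, c1)]
    return two_means_1d([dot(p, direction) for p in points])

rng = random.Random(1)
d = 20
center_b = [6.0] + [0.0] * (d - 1)
points = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(100)]
points += [[c + rng.gauss(0, 1) for c in center_b] for _ in range(100)]
truth = [0] * 100 + [1] * 100

labels = project_cluster_reproject(points)
agree = sum(a == b for a, b in zip(labels, truth)) / len(truth)
accuracy = max(agree, 1 - agree)   # cluster labels are arbitrary
```

With k clusters the same idea would use a k-dimensional SVD subspace and a k-means step; the two-cluster version is used here only to keep the example short.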
![Page 91: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/91.jpg)
Two situations
Centers $(0, 0, \ldots, 0)$ and $(1, 0, \ldots, 0)$
Centers $(0, 0, \ldots, 0)$ and $\frac{1}{\sqrt{d}}(1, 1, \ldots, 1)$
![Page 92: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/92.jpg)
Two situations
Centers $(0, 0, \ldots, 0)$ and $(1, 0, \ldots, 0)$
Centers $(0, 0, \ldots, 0)$ and $\left(\frac{1}{\sqrt{d}}, \frac{1}{\sqrt{d}}, \ldots, \frac{1}{\sqrt{d}}\right)$
![Page 93: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/93.jpg)
For spherical Gaussian the two situations are equivalent – Probability distribution for spherical Gaussian depends only on distance from center.
For binomial distributions – the two situations are fundamentally different.
![Page 94: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/94.jpg)
Binomial distribution in d dimensions
$$x_i = \begin{cases} 0 & p = \tfrac{1}{2} \\ 1 & p = \tfrac{1}{2} \end{cases}$$
![Page 95: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/95.jpg)
Centers differ in one dimension
![Page 96: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/96.jpg)
Centers differ along all dimensions
![Page 97: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/97.jpg)
The two situations
![Page 98: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/98.jpg)
Balance
Suggests we define the balance of a unit vector by how uniformly the coordinates contribute to its length.
[Equation: bal(x), defined in terms of the norms $\|x\|_\infty$ and $\|x\|_2$]
![Page 99: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/99.jpg)
General method
Project data onto SVD subspace
Cluster
Draw lines through all $\binom{k}{2}$ pairs of cluster centers
Smooth lines
Project onto each line and cluster
![Page 100: Computer Science in the Information Age...Extracting signal from noise Graph Theory of the 50’s Theorem: A graph is planar if it does not contain a Kuratowski subgraph as a contraction.](https://reader035.fdocuments.us/reader035/viewer/2022071022/5fd7452dd303e61d2a779326/html5/thumbnails/100.jpg)
A B
C