Dynamic Graphs Compression and Dirchlet-multinomial Entropy · 2019. 2. 7. · Dynamic Graphs...

Dynamic Graphs Compression and Dirichlet-multinomial Entropy

Wojciech Szpankowski

Center for Science of Information, Purdue University

Joint work with P. Jacquet, A. Magner, K. Turowski

ITA, San Diego, 2019


Talk outline

1 Introduction: dynamic networks and structural information

2 Duplication graph models

Full duplication model: entropy

Compression algorithms: Arithmetic Encoding

3 Entropy of Dirichlet-multinomial distribution


Dynamic networks

Main idea

Model complex systems as a set of dynamic (time-evolving) interacting entities in which the spatial structure and patterns of interactions change over time.

Some challenges:

infer the underlying dynamic processes, and the parameters governing network evolution, from a sparsely sampled system state,

infer spatio-temporal properties under the assumption of a given underlying process, e.g. arrival sequences, clustering coefficient, degree distribution,

determine the minimum number of bits needed to describe dynamic networks.


Structural information in random graphs

For random graphs generated according to a known model, the main structural quantities are:

Automorphism set Aut(G) – the set of vertex permutations that preserve the graph's edge-vertex connectivity.

Feasible permutation set Γ(G) – the set of permutations σ of V(G) such that Pr(Gn = σ(G)) > 0.

Admissible set Adm(G) – the set of positive-probability graphs that can be obtained from G by applying some σ ∈ Γ(G).

Note: |Adm(G)| = |Γ(G)| / |Aut(G)|.
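The relation |Adm(G)| = |Γ(G)| / |Aut(G)| can be checked by brute force on a tiny graph. The sketch below is illustrative only: it assumes every vertex permutation is feasible (so |Γ(G)| = n!, as in the Erdős-Rényi case), and the helper name `count_automorphisms` is hypothetical.

```python
from itertools import permutations
from math import factorial


def count_automorphisms(n, edges):
    """Count vertex permutations that map the edge set onto itself."""
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for perm in permutations(range(n)):
        mapped = {frozenset((perm[u], perm[v])) for u, v in edges}
        if mapped == edge_set:
            count += 1
    return count


# Path 0 - 1 - 2: only the identity and the swap of the two ends
# preserve the edges.
path_edges = [(0, 1), (1, 2)]
aut = count_automorphisms(3, path_edges)

# Assumption: all 3! permutations are feasible, so |Gamma(G)| = 3!.
adm = factorial(3) // aut
print(aut, adm)  # prints "2 3"
```

The three admissible graphs correspond to the three choices of the middle vertex of the path.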


Labeled vs. unlabeled graphs

There are two flavours of duplication graph models:

1 unlabeled graphs S(Gn) – the vertices contain no additional information,

2 labeled graphs Gn – the vertices contain information, e.g. about their time of arrival.

Figure: Example unlabeled graph.


Figure: Example labeled graph (vertices labeled 1 through 9).


Compression of graphs and structures

Theorem (Structural entropy for a broad class of graph models)

If all graphs with a given structure are equiprobable, then

$$H(G_n) - H(S(G_n)) = E[\log |\Gamma(G_n)|] - E[\log |\mathrm{Aut}(G_n)|].$$

Proof.

The theorem follows directly from two simple facts:

$$H(G_n) = H(G_n, S(G_n)) = H(S(G_n)) + H(G_n \mid S(G_n)),$$

$$\Pr(G_n = G \mid S(G_n)) = \frac{1}{|\mathrm{Adm}(G)|} = \frac{|\mathrm{Aut}(G)|}{|\Gamma(G)|},$$

where $H(G_n) = -\sum_{G} \Pr(G_n = G) \log \Pr(G_n = G)$.


Previous results

Erdős-Rényi model G(n, p)

Graph on n vertices; each pair of nodes receives an edge independently, with probability p.

$$H(G) = \binom{n}{2} h(p), \qquad H(S(G)) = \binom{n}{2} h(p) - n \log n + O(n),$$

where $h(p) = -p \log p - (1 - p) \log(1 - p)$ is the binary entropy function.

Note:

|Γ(G)| = n!,

for $\frac{\ln n}{n} \ll p \ll 1 - \frac{\ln n}{n}$ it is true that |Aut(G)| = 1 whp.
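To get a feel for the size of the two quantities, the leading terms can be evaluated numerically. This is a rough sketch: it takes h(p) to be the (nonnegative) binary entropy in bits, drops the unspecified O(n) remainder, and uses hypothetical function names.

```python
import math


def binary_entropy(p):
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def er_labeled_entropy(n, p):
    """H(G) = C(n, 2) * h(p) for the labeled graph."""
    return math.comb(n, 2) * binary_entropy(p)


def er_structural_entropy_leading(n, p):
    """Leading terms of H(S(G)); the O(n) remainder is omitted."""
    return er_labeled_entropy(n, p) - n * math.log2(n)


n, p = 1000, 0.1
print(er_labeled_entropy(n, p), er_structural_entropy_leading(n, p))
```

The gap between the two values is the n log n bits (up to O(n)) saved by forgetting the vertex labels.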


Previous results

Preferential Attachment model PA(n, m)

Start from a single vertex v1 with m self-loops.

At each time t ≥ 2 a new vertex vt joins and makes m independent connections to the existing nodes, choosing vk with probability

$$\Pr[v_t \text{ connects to } v_k \mid G_{t-1}] = \frac{\deg_{t-1}(v_k)}{2m(t - 1)}.$$

H(G) = mn log n + m(log(2m − 1) − log(m!) − A(m)) n + o(n) for an explicitly known constant A(m),

if m ≥ 3, then H(S(G)) = (m − 1) n log n + O(n log log n).
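The attachment rule can be simulated with a list of edge endpoints, in which each vertex appears once per unit of degree. The following is a minimal sketch under that representation; the function name and the fixed random seed are illustrative choices, not part of the talk.

```python
import random


def preferential_attachment(n, m, seed=0):
    """Sample the edge list of PA(n, m) with vertices labeled 1..n."""
    rng = random.Random(seed)
    # A vertex appears in `endpoints` once per unit of degree, so a
    # uniform draw picks v_k with probability deg(v_k) / (2m(t - 1)).
    endpoints = [1] * (2 * m)  # v1 with m self-loops has degree 2m
    edges = [(1, 1)] * m
    for t in range(2, n + 1):
        # All m targets are drawn from the degrees in G_{t-1},
        # before any of the new edges is added.
        targets = [rng.choice(endpoints) for _ in range(m)]
        for k in targets:
            edges.append((t, k))
            endpoints.extend((t, k))
    return edges


edges = preferential_attachment(n=50, m=3)
print(len(edges))  # prints "150": m edges per vertex
```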


Duplication models

Basic duplication (vertex-copying) model DD(n, p, G0)

1 Start from an arbitrary (fixed) G0 on n0 vertices.

2 In each step t = 1, ..., n:

   1 add a new vertex vt to the graph,

   2 pick a vertex u uniformly at random from all previous vertices,

   3 attach vt to each vertex connected to u, independently with probability p.

There are many variants of this model, e.g. connecting vt to u, adding edges from vt to other vertices of G, or removing some edges.

We first consider the boundary case p = 1.
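The growth steps above can be sketched as a short simulation. The adjacency-set representation, the integer vertex labels (seed vertices 0..n0-1, new vertices after them) and the function name are assumptions made for illustration.

```python
import random


def duplication_graph(adj0, n, p, seed=0):
    """Grow DD(n, p, G0).

    adj0 maps seed vertices 0..n0-1 to their neighbor sets; new
    vertices receive the labels n0, n0 + 1, ...
    """
    rng = random.Random(seed)
    adj = {v: set(nb) for v, nb in adj0.items()}
    for _ in range(n):
        new = len(adj)          # label of the new vertex v_t
        u = rng.randrange(new)  # uniform over all previous vertices
        # Copy each edge of u independently with probability p.
        kept = {w for w in adj[u] if rng.random() < p}
        adj[new] = kept
        for w in kept:
            adj[w].add(new)
    return adj


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
g = duplication_graph(triangle, n=5, p=1.0)
print({v: sorted(nb) for v, nb in g.items()})
```

With p = 1 every new vertex ends up with exactly the same neighborhood as the seed vertex it descends from, which is the property the rest of the talk exploits.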


Example

Figure: Example graph growth in the full duplication model (seed vertices u1, ..., u6; new vertices v1, v2, v3).

Basic notions

Ancestor

The ancestor of v ∈ V(Gn) is the vertex u ∈ V(G0) from which v was ultimately copied.

Orbit

An orbit is the set of vertices in Gn with the same ancestor.

Figure: Example graph generated from the full duplication model, with the ancestor of each vertex in parentheses: u1 (u1), u2 (u2), u3 (u3), u4 (u4), u5 (u5), u6 (u6), v1 (u2), v2 (u1), v3 (u2).


Ball-and-urn model

Pólya urn model

Let each of n0 urns contain one ball: $C_{i,0} = 1$ for i = 1, ..., n0.

In each step, pick an urn with probability proportional to the number of balls in it and add one ball to it.

The distribution of $(C_{i,n})_{i=1}^{n_0}$ is also known in the literature as the Dirichlet-multinomial distribution DM(n, (1, ..., 1)).

Its marginal distribution is also well known, under the name beta-binomial distribution BBin(n, 1, n0 − 1):

$$\Pr(C_{i,n} = k + 1) = (n_0 - 1) \binom{n}{k} \frac{\Gamma(k + 1)\,\Gamma(n - k + n_0 - 1)}{\Gamma(n + n_0)},$$

where Γ(a) is the Euler gamma function.
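The urn process and its marginal can be written down directly. In this sketch the function names are hypothetical, the marginal formula is evaluated in log space for numerical stability, and n0 ≥ 2 is assumed.

```python
import math
import random


def polya_urn(n0, n, rng):
    """Run n steps of the urn process; returns the counts C_{i,n}."""
    counts = [1] * n0
    for _ in range(n):
        # Pick an urn with probability proportional to its ball count.
        i = rng.choices(range(n0), weights=counts)[0]
        counts[i] += 1
    return counts


def orbit_size_pmf(k, n, n0):
    """Pr(C_{i,n} = k + 1), i.e. BBin(n, 1, n0 - 1) at k (n0 >= 2)."""
    # C(n, k) * Gamma(k + 1) simplifies to n! / (n - k)!.
    log_p = (math.log(n0 - 1)
             + math.lgamma(n + 1) - math.lgamma(n - k + 1)
             + math.lgamma(n - k + n0 - 1) - math.lgamma(n + n0))
    return math.exp(log_p)


counts = polya_urn(n0=4, n=20, rng=random.Random(0))
total = sum(orbit_size_pmf(k, 20, 4) for k in range(21))
mean = sum(k * orbit_size_pmf(k, 20, 4) for k in range(21))
print(counts, total, mean)
```

The probabilities sum to 1 and the mean number of added balls per urn is n / n0, as expected by symmetry.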


Graph structure and ball-and-urn model

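$C_{i,n}$ is the number of vertices that have $u_i$ as their ancestor, and

$$H(S(G_n)) = H\big((C_{i,n})_{i=1}^{n_0}\big) = \sum_{i=1}^{n_0} E[\log C_{i,n}] = n_0 E[\log C_{1,n}].$$

Figure: (a) Graph: the example graph with the ancestor of each vertex in parentheses: u1 (u1), u2 (u2), u3 (u3), u4 (u4), u5 (u5), u6 (u6), v1 (u2), v2 (u1), v3 (u2). (b) Representation: orbit sizes C1,n = 2, C2,n = 3, C3,n = 1, C4,n = 1, C5,n = 1, C6,n = 1.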


Automorphisms and ball-and-urn model

Automorphisms: we can always swap vertices within one orbit, but never between orbits, therefore:

$$E[\log |\mathrm{Aut}(G_n)|] = \sum_{i=1}^{n_0} E[\log C_{i,n}!] = n_0 E[\log C_{1,n}!].$$

Feasible permutations Γ(G): we can always swap two vertices if (i) both are in G0 or (ii) both are not in G0; in addition, (iii) the starting graph G_{n0} can be formed by selecting one node in each orbit. Thus:

$$E[\log |\Gamma(G_n)|] = \log n_0! + \log n! + n_0 E[\log C_{1,n}].$$


Solution

Finally, we get

$$E[\log |\mathrm{Aut}(G_n)|] = n \log n - n H_{n_0} \log e + \frac{3 n_0}{2} \log n + O(1),$$

$$E[\log |\Gamma(G_n)|] = n \log n - n \log e + \frac{2 n_0 + 1}{2} \log n + O(1),$$

$$H(S(G_n)) = (n_0 - 1) \log n + O(1),$$

$$H(G_n) = n (H_{n_0} - 1) \log e + \frac{n_0 - 1}{2} \log n + O(1).$$

Note: H(Gn) = Θ(n) while H(S(Gn)) = Θ(log n), which is unusual among well-known random graph models!
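The first asymptotic formula can be checked numerically against the exact expectation n0 E[log C1,n!], computed from the beta-binomial marginal BBin(n, 1, n0 − 1). This is only a sanity-check sketch: it reads H_{n0} as the n0-th harmonic number and log as log base 2 (both assumptions, though consistent with the formulas), and the function names are hypothetical.

```python
import math

LOG2E = math.log2(math.e)


def orbit_size_pmf(k, n, n0):
    """Pr(C_{1,n} = k + 1) for BBin(n, 1, n0 - 1), with n0 >= 2."""
    log_p = (math.log(n0 - 1)
             + math.lgamma(n + 1) - math.lgamma(n - k + 1)
             + math.lgamma(n - k + n0 - 1) - math.lgamma(n + n0))
    return math.exp(log_p)


def expected_log_aut(n, n0):
    """Exact n0 * E[log2 C_{1,n}!]; lgamma(k + 2) = ln((k + 1)!)."""
    return n0 * sum(orbit_size_pmf(k, n, n0) * math.lgamma(k + 2) * LOG2E
                    for k in range(n + 1))


def leading_terms(n, n0):
    """n log n - n H_{n0} log e + (3 n0 / 2) log n, in bits."""
    harmonic = sum(1 / j for j in range(1, n0 + 1))
    return (n * math.log2(n) - n * harmonic * LOG2E
            + 1.5 * n0 * math.log2(n))


for n in (1000, 2000, 4000):
    # The difference should stay bounded as n grows (the O(1) term).
    print(n, expected_log_aut(n, 3) - leading_terms(n, 3))
```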

Graph Compression: arithmetic encoding.


Arithmetic coding
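As an illustration of how the urn model could drive such a coder, the sketch below computes the ideal code length of a labeled graph, assuming the coder encodes, for each new vertex, the orbit it joins with the adaptive probability (orbit size) / (current number of vertices). This coding scheme and the function name are assumptions for illustration, not the talk's stated algorithm; an arithmetic coder fed these probabilities would come within a small constant number of bits of the ideal length.

```python
import math


def ideal_code_length_bits(orbit_sequence, n0):
    """Sum of -log2 of the adaptive probabilities of the joined orbits.

    orbit_sequence[t] is the 0-based index of the orbit joined by the
    (t + 1)-th new vertex. Returns (bits, final orbit sizes).
    """
    counts = [1] * n0
    bits = 0.0
    for i in orbit_sequence:
        bits -= math.log2(counts[i] / sum(counts))
        counts[i] += 1
    return bits, counts


# Running example: v1 joins the orbit of u2, v2 that of u1, v3 that of u2.
bits, counts = ideal_code_length_bits([1, 0, 1], n0=6)
print(bits, counts)
```

Here the three probabilities are 1/6, 1/7 and 2/8, so the ideal length is log2(168) bits, and the final orbit sizes are (2, 3, 1, 1, 1, 1).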


Dirichlet-multinomial distribution

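The Dirichlet-multinomial distribution is a multinomial distribution whose parameters are distributed according to a Dirichlet distribution.

Definition (Probability mass function for DM(n, ᾱ))

Let X̄ ∼ DM(n, ᾱ) for ᾱ = (α1, ..., αm), where αi > 0 for i = 1, ..., m. Then for a value x̄ = (x1, ..., xm), xi ∈ {0, 1, ..., n}, such that $\sum_{k=1}^{m} x_k = n$, it holds that:

$$\Pr(\bar X = \bar x) = \int_{[0,1]^m} \Pr(\bar p)\, \Pr(\bar X = \bar x \mid \bar p)\, d\bar p$$

$$= \int_{[0,1]^m} \frac{\Gamma(\alpha_0)}{\prod_{k=1}^{m} \Gamma(\alpha_k)} \prod_{k=1}^{m} p_k^{\alpha_k - 1} \binom{n}{x_1 \ldots x_m} \prod_{k=1}^{m} p_k^{x_k}\, d\bar p$$

$$= \frac{\Gamma(n + 1)\,\Gamma(\alpha_0)}{\Gamma(n + \alpha_0)} \prod_{k=1}^{m} \frac{\Gamma(x_k + \alpha_k)}{\Gamma(x_k + 1)\,\Gamma(\alpha_k)} = \frac{n\, B(n, \alpha_0)}{\prod_{k : x_k > 0} x_k\, B(x_k, \alpha_k)},$$

where Γ(x) is the Euler gamma function, B is the beta function, and $\alpha_0 = \sum_{k=1}^{m} \alpha_k$.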
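The closed form of the probability mass function is easy to evaluate with log-gamma functions. The sketch below checks that it sums to 1 over all outcomes for a small n; the helper names are hypothetical.

```python
import math


def dm_log_pmf(x, alpha):
    """Natural log of Pr(X = x) for X ~ DM(n, alpha), with n = sum(x)."""
    n, a0 = sum(x), sum(alpha)
    out = math.lgamma(n + 1) + math.lgamma(a0) - math.lgamma(n + a0)
    for xk, ak in zip(x, alpha):
        out += math.lgamma(xk + ak) - math.lgamma(xk + 1) - math.lgamma(ak)
    return out


def compositions(n, m):
    """Yield all m-tuples of nonnegative integers summing to n."""
    if m == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, m - 1):
            yield (first,) + rest


alpha = (3, 4, 5)
total = sum(math.exp(dm_log_pmf(x, alpha)) for x in compositions(6, 3))
print(total)
```

For ᾱ = (1, ..., 1) the same function returns the uniform distribution over all outcomes, which matches the urn model with one initial ball per urn.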


Beta-binomial distribution

Definition (Probability mass function for BBin(n, α, β))

Let X ∼ BBin(n, α, β) for α, β > 0. Then for a value x ∈ {0, 1, ..., n} it holds that:

$$\Pr(X = x) = \int_0^1 \pi(\alpha, \beta, p)\, \Pr(X = x \mid p)\, dp = \int_0^1 \frac{p^{\alpha - 1} (1 - p)^{\beta - 1}}{B(\alpha, \beta)} \binom{n}{x} p^{x} (1 - p)^{n - x}\, dp.$$

The beta-binomial distribution is both a special case of the Dirichlet-multinomial distribution (for m = 2) and a marginal distribution of DM(n, ᾱ).

Its entropy was analyzed by Cheraghchi (2017), but only through a non-asymptotic expression.


Entropy of Dirichlet-multinomial distribution

The entropy of the Dirichlet-multinomial distribution can be expressed as

$$H(\bar X) = -E \log \Pr(\bar X) = \log \Gamma(n + \alpha_0) - \log \Gamma(n + 1) - \log \Gamma(\alpha_0) + \sum_{k=1}^{m} \log \Gamma(\alpha_k) + \sum_{k=1}^{m} E \log \Gamma(X_k + 1) - \sum_{k=1}^{m} E \log \Gamma(X_k + \alpha_k),$$

where $X_k \sim \mathrm{BBin}(n, \alpha_k, \alpha_0 - \alpha_k)$.

Thus we only need to compute E log Γ(X + t) for some t and X ∼ BBin(n, α1, α2).
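The reduction to beta-binomial marginals makes the exact entropy computable in O(mn) time, without enumerating all outcomes. A minimal sketch, with hypothetical function names, logs in base 2, and the beta-binomial probabilities taken in their standard closed form C(n, x) B(x + a, n − x + b) / B(a, b):

```python
import math

LN2 = math.log(2)


def bbin_log_pmf(x, n, a, b):
    """Natural log of Pr(X = x) for X ~ BBin(n, a, b)."""
    return (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
            + math.lgamma(x + a) + math.lgamma(n - x + b)
            - math.lgamma(n + a + b)
            - (math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)))


def expected_log_gamma(n, a, b, t):
    """E[ln Gamma(X + t)] for X ~ BBin(n, a, b)."""
    return sum(math.exp(bbin_log_pmf(x, n, a, b)) * math.lgamma(x + t)
               for x in range(n + 1))


def dm_entropy_bits(n, alpha):
    """Exact entropy of DM(n, alpha) in bits, via the marginals X_k."""
    a0 = sum(alpha)
    h = math.lgamma(n + a0) - math.lgamma(n + 1) - math.lgamma(a0)
    for ak in alpha:
        h += math.lgamma(ak)
        h += expected_log_gamma(n, ak, a0 - ak, 1)
        h -= expected_log_gamma(n, ak, a0 - ak, ak)
    return h / LN2


print(dm_entropy_bits(100, (3, 4, 5)))
```

For ᾱ = (1, 1, 1) the distribution is uniform over C(n + 2, 2) outcomes, so the function should return log2 of that count; this gives a quick correctness check.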


Main Result: Asymptotic Formula for Entropy

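Theorem

If X̄ ∼ DM(n, ᾱ), then

$$H(\bar X) = (m - 1) \log n - \log \Gamma(\alpha_0) + \sum_{k=1}^{m} \log \Gamma(\alpha_k) - \log e \sum_{k=1}^{m} (\alpha_k - 1)\big(\psi(\alpha_k) - \psi(\alpha_0)\big) + \sum_{s=1}^{\lceil \min\{\alpha_i\} \rceil - 1} e_s n^{-s} + O\!\left(\frac{\mathrm{polylog}(n)}{n^{\min\{\alpha_i\}}}\right),$$

where ψ is the digamma function and the coefficients $e_s$ are explicitly computable.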


Example

Let ᾱ = (3, 4, 5). Then, from Theorem 4 we have

$$H(\bar X) = 2 \log n + \left(\frac{95593}{9240} \log e + 5 + 2 \log 3 - \log(11!)\right) + n^{-1} \cdot 12 \log e - n^{-2} \cdot \frac{1477}{18} \log e + O\!\left(\frac{\mathrm{polylog}(n)}{n^{3}}\right).$$

Table: Exact values and approximations for the entropy

n        exact value    approximation    absolute error
100      11.29480883    11.29392204      8.8 · 10^−4
500      15.81065166    15.81064409      7.5 · 10^−6
1000     17.79368785    17.79368690      9.5 · 10^−7
5000     22.42380687    22.42380686      7.7 · 10^−9
10000    24.42207918    24.42207918      9.6 · 10^−10
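The approximation column can be reproduced from the expansion above. In this sketch the logarithms are taken in base 2, the constant term is computed from the identity ψ(a) − ψ(b) = H_{a−1} − H_{b−1} for integers a, b (H_k being harmonic numbers), and the coefficients 12 and 1477/18 are taken from the example as given; the function names are hypothetical.

```python
import math
from fractions import Fraction

LOG2E = math.log2(math.e)


def harmonic(k):
    """k-th harmonic number as an exact fraction (0 for k = 0)."""
    return sum(Fraction(1, j) for j in range(1, k + 1))


def digamma_weighted_sum(alpha):
    """sum_k (alpha_k - 1)(psi(alpha_k) - psi(alpha_0)), integer alpha."""
    a0 = sum(alpha)
    return sum((a - 1) * (harmonic(a - 1) - harmonic(a0 - 1)) for a in alpha)


def constant_term_bits(alpha):
    """Constant term of the expansion, in bits."""
    a0 = sum(alpha)
    log_gammas = sum(math.lgamma(a) for a in alpha) - math.lgamma(a0)
    return (log_gammas - float(digamma_weighted_sum(alpha))) * LOG2E


def approx_entropy_bits(n):
    """Two-correction-term approximation for alpha = (3, 4, 5)."""
    alpha = (3, 4, 5)
    return (2 * math.log2(n) + constant_term_bits(alpha)
            + 12 * LOG2E / n - Fraction(1477, 18) * LOG2E / n ** 2)


for n in (100, 500, 1000, 5000, 10000):
    print(n, round(approx_entropy_bits(n), 8))
```

The weighted digamma sum evaluates to −95593/9240 for ᾱ = (3, 4, 5), which is where the rational coefficient in the example comes from.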


Roadmap

Taylor’s theorem

Central moments estimation + Stirling approximation

Euler integrals

Kummer solutions

