Post on 21-Apr-2022
Outline: Introduction, Evolution of Density Functions, Convergence to Distribution, Outlook
Areas of Countries and Benford’s Law
Model of a Dynamical System Proposed by V. Arnold
Alex Janke (1), Xiangyu Wang (2)
(1) University of Michigan, Ann Arbor, MI
(2) Peking University, Beijing, China
August 12, 2011
A. Janke, X. Wang Areas of Countries and Benford’s Law
Benford’s Law
Numbers from many real-life data sets, particularly those dominated by exponential processes, have leading digits distributed in a non-uniform way.
Let $d \in \{1, 2, \dots, 9\}$ be the leading digit of a number. A set of numbers (or random variable) satisfies Benford’s law if $d$ occurs with frequency (probability) given by the following:
$$P(d) = \log_{10}(d + 1) - \log_{10}(d) = \log_{10}\left(1 + \frac{1}{d}\right)$$
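As a quick numerical illustration of the formula above, the following sketch tabulates $P(d)$ for each leading digit. The names `benford_probability` and `benford_table` are illustrative and do not come from the slides.

```python
import math

def benford_probability(d: int) -> float:
    """Probability that the leading digit equals d under Benford's law."""
    if d not in range(1, 10):
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)

# Table of P(d) for d = 1..9; the nine values telescope to log10(10) = 1.
benford_table = {d: benford_probability(d) for d in range(1, 10)}

for digit, probability in benford_table.items():
    print(f"{digit}: {probability:.4f}")
```

The probabilities sum to one because the logarithms telescope, and the digit 1 is the most likely leading digit.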
Benford’s Law
The following shows the frequency of leading digits predicted by Benford’s law.
[Figure: frequency of each leading digit predicted by Benford’s law.]
Benford’s Law
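The following shows the leading-digit frequencies for the populations of 237 countries. The dots denote the frequencies predicted by Benford’s law.
[Figure: leading-digit frequencies of country populations, with Benford’s law marked by dots.]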
Rationalizing Benford’s Law
The set $\{\alpha^n \mid n \in \mathbb{Z}\}$ with $\log_{10}(\alpha) \notin \mathbb{Q}$ satisfies Benford’s law. This is a consequence of the equidistribution theorem.
A continuous random variable $X$ whose logarithm has fractional parts uniformly distributed on $[0, 1)$ will satisfy Benford’s law.
A continuous random variable $X$ with a lognormal distribution will satisfy Benford’s law increasingly well as its second moment approaches infinity.
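The first statement can be checked numerically. The sketch below takes $\alpha = 2$ (an illustrative choice, since $\log_{10} 2$ is irrational) and computes the leading digit of $2^n$ from the fractional part of $n \log_{10} 2$, which avoids forming the huge powers themselves. The function names are hypothetical.

```python
import math
from collections import Counter

def leading_digit_from_log(log10_value: float) -> int:
    """Leading digit of a positive number, given its base-10 logarithm."""
    fractional = log10_value - math.floor(log10_value)
    # 10**fractional lies in [1, 10); min() guards against rounding up to 10.
    return min(9, int(10 ** fractional))

def leading_digit_frequencies(alpha: float, count: int) -> dict:
    """Observed leading-digit frequencies of alpha**n for n = 1..count."""
    log_alpha = math.log10(alpha)
    digits = Counter(
        leading_digit_from_log(n * log_alpha) for n in range(1, count + 1)
    )
    return {d: digits[d] / count for d in range(1, 10)}

freqs = leading_digit_frequencies(2.0, 20000)
for d in range(1, 10):
    print(d, round(freqs[d], 4), round(math.log10(1 + 1 / d), 4))
```

Each printed row compares the observed frequency with the Benford prediction for that digit.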
The Dynamical System Proposed by V. Arnold
Consider $N$ countries with areas $A_1, \dots, A_N$ drawn from some distribution such that $\sum_{i=1}^{N} A_i = 1$.
At each iteration $1, \dots, n$, randomly select two countries to merge together and one country to split into two equal parts.
Experimental evidence suggests that for large $N$ and $n$ the areas of countries satisfy the first digit law, irrespective of the initial distribution.
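The dynamics described above can be sketched in a few lines. This is a minimal simulation under stated assumptions, not the authors' code: the three countries are drawn uniformly without replacement, the merged country takes the first slot, and the two halves take the remaining two slots so that the number of countries stays at $N$. The seed and function names are illustrative.

```python
import math
import random

def simulate_countries(areas, iterations, rng):
    """Run the merge/split dynamics in place and return the list of areas."""
    for _ in range(iterations):
        i, j, k = rng.sample(range(len(areas)), 3)  # three distinct countries
        merged = areas[i] + areas[j]                # countries i and j merge
        half = areas[k] / 2                         # country k splits in two
        areas[i], areas[j], areas[k] = merged, half, half
    return areas

def fractional_log_parts(areas):
    """Fractional parts of log10(area), whose uniformity signals Benford's law."""
    return [math.log10(a) - math.floor(math.log10(a)) for a in areas]

rng = random.Random(0)  # fixed seed so the run is reproducible
N = 1000
areas = simulate_countries([1.0 / N] * N, 10000, rng)
parts = fractional_log_parts(areas)
```

Each step keeps both the number of countries and the total area unchanged, which is what allows the process to run indefinitely.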
Experimental Evidence
Let $N = 1000$ and let every initial entry of $A$ equal $1/N$. We generate the entries of $A$ after $n = 10000$ iterations.
[Figure: fractional part of log10 of country area (vertical axis, 0 to 1) against country index (horizontal axis, 0 to 1000).]
Experimental Evidence
Let $N = 1000$ and draw the initial entries of $A$ from an exponential distribution with $\lambda = 1000$, normalized so that the entries sum to one. We generate the entries of $A$ after $n = 10000$ iterations.
[Figure: log10 of country area (vertical axis, -11 to -2) against country index (horizontal axis, 0 to 1000).]
[Figure: fractional part of log10 of country area (vertical axis, 0 to 1) against country index (horizontal axis, 0 to 1000).]
Formalizing the Model
Let $A \in \mathbb{R}^N$ be a vector such that $\sum_{i=1}^{N} A_i = 1$. At each iteration, three distinct entries $(A_i, A_j, A_k)$ from $A$ are randomly chosen to form $V = (A_i, A_j, A_k)^T$. This vector is multiplied on the left by the following matrix:
$$M = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & \tfrac{1}{2} \\ 0 & 0 & \tfrac{1}{2} \end{bmatrix}$$
We are interested in the evolution of the distribution function for the coordinates of $A$. Brackets on the matrix are in honor of Bif.
Density Functions
Suppose $A = (x_1, \dots, x_N)^T$ initially. Then the initial density function is given by the following formula:
$$f_0(t) = \frac{1}{N} \sum_{i=1}^{N} \delta(t - x_i)$$
Suppose that coordinates $x_{j_1}, x_{j_2}, x_{j_3}$ are chosen. The new coordinates are given by $x'_{j_i} = \sum_{k=1}^{3} m_{ik} x_{j_k}$. Then the density function after a single iteration is given by the following formula:
$$f_1(t) = \frac{1}{N} \sum_{i=1}^{N} \delta(t - x_i) + \frac{1}{N} \sum_{i=1}^{3} \delta(t - x'_{j_i}) - \frac{1}{N} \sum_{i=1}^{3} \delta(t - x_{j_i})$$
Expectation of Discrete Transitions
By linearity of the Laplace transform, the expectation of the moment generating function is just the moment generating function of the expected distribution.
$$\mathcal{L}\{f_0(t)\} = \frac{1}{N} \sum_{i=1}^{N} e^{-x_i s}$$
$$E[\mathcal{L}\{f_1(t)\}] = \frac{1}{N^3} \sum_{j_1, j_2, j_3} \left( \frac{1}{N} \sum_{i=1}^{N} e^{-x_i s} + \frac{1}{N} \sum_{i=1}^{3} \left( e^{-\sum_{k=1}^{3} m_{ik} x_{j_k} s} - e^{-x_{j_i} s} \right) \right)$$
We then consider the expected discrete transition as the difference $E[\mathcal{L}\{f_1(t)\}] - \mathcal{L}\{f_0(t)\}$. We can identify a stable distribution for this by setting the difference equal to zero.
Stable Solution
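Let $\mathcal{L}\{f_0(t)\} = G_0(s)$. Some algebraic magic reveals:
$$E[G_1(s) \mid G_0(s)] - G_0(s) = \frac{1}{N} \sum_{i=1}^{3} \left( \prod_{j=1}^{3} G_0(m_{ij} s) - G_0(s) \right) \qquad [1.1]$$
Here we will substitute in the values from our matrix.
$$E[G_1(s) \mid G_0(s)] - G_0(s) = \frac{1}{N} \left( G_0^2(s) + 2 G_0\left(\frac{s}{2}\right) - 3 G_0(s) \right) \qquad [1.2]$$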
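Equation [1.2] can be verified by brute force on a small example. The sketch below averages the change in the Laplace transform over all $N^3$ index triples and compares it with the closed form. It assumes the three indices are drawn independently and uniformly (with replacement), which is what the $1/N^3$ weighting amounts to; the sample areas and the value of $s$ are arbitrary illustrative choices.

```python
import itertools
import math

def laplace(xs, s):
    """Laplace transform of the empirical density (1/N) * sum of delta(t - x_i)."""
    return sum(math.exp(-x * s) for x in xs) / len(xs)

def expected_change_bruteforce(xs, s):
    """Average of L{f1} - L{f0} over all N^3 index triples."""
    n = len(xs)
    total = 0.0
    for j1, j2, j3 in itertools.product(range(n), repeat=3):
        # New entries M @ (x_j1, x_j2, x_j3): one merged area, two equal halves.
        new_terms = math.exp(-(xs[j1] + xs[j2]) * s) + 2 * math.exp(-xs[j3] * s / 2)
        old_terms = sum(math.exp(-xs[j] * s) for j in (j1, j2, j3))
        total += (new_terms - old_terms) / n
    return total / n ** 3

def expected_change_formula(xs, s):
    """Right-hand side of [1.2]."""
    g = laplace(xs, s)
    return (g * g + 2 * laplace(xs, s / 2) - 3 * g) / len(xs)

sample_areas = [0.2, 0.5, 1.0, 1.3, 2.0]
lhs = expected_change_bruteforce(sample_areas, 0.7)
rhs = expected_change_formula(sample_areas, 0.7)
```

Under the independence assumption the two values agree up to floating-point rounding; with strictly distinct indices the agreement would only hold up to corrections of order $1/N$.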
Stable Solution
Suppose at each iteration we follow the conditional expectation, with $E[G_1(s) \mid G_0(s)] = G_1(s)$. Then the stable solution to [1.2] is:
$$G_\infty(s) = \sum_{i=0}^{\infty} a_i s^i, \quad a_0 = 1, \quad a_1 = -1, \quad a_n = \frac{\sum_{i=1}^{n-1} a_i a_{n-i}}{1 - 2^{1-n}} \qquad [1.3]$$
Theorem 1: $G_\infty(s)$ has a positive radius of convergence.
Proof: Coefficients of this power series grow no faster than exponentially. The coefficients grow as the Catalan numbers do. Thus the power series has a positive radius of convergence.
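The recurrence in [1.3] is easy to evaluate exactly. The following sketch computes the coefficients with rational arithmetic and then checks, for a small illustrative value of $s$, that the truncated series satisfies the stationarity condition $G^2(s) + 2G(s/2) - 3G(s) = 0$ obtained by setting [1.2] to zero. The function names and the truncation order are assumptions made for the example.

```python
from fractions import Fraction

def stable_coefficients(order: int) -> list:
    """Coefficients a_0..a_order of the stable solution via recurrence [1.3]."""
    a = [Fraction(1), Fraction(-1)]
    for n in range(2, order + 1):
        convolution = sum(a[i] * a[n - i] for i in range(1, n))
        a.append(convolution / (1 - Fraction(2) ** (1 - n)))
    return a[: order + 1]

def evaluate_series(coeffs, s: float) -> float:
    """Evaluate the truncated power series at s."""
    return sum(float(c) * s ** k for k, c in enumerate(coeffs))

def fixed_point_residual(coeffs, s: float) -> float:
    """G(s)^2 + 2 G(s/2) - 3 G(s), which vanishes for the stable solution."""
    g = evaluate_series(coeffs, s)
    return g * g + 2 * evaluate_series(coeffs, s / 2) - 3 * g

coeffs = stable_coefficients(30)
print([str(c) for c in coeffs[:5]])
print(fixed_point_residual(coeffs, 0.02))
```

The coefficients alternate in sign, as expected for the Laplace transform of a distribution on the positive half-line.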
Continuous Approximation
We define a continuous approximation for these discrete transitions. Define $G(s, t)$ by $G(s, 0) = G_0(s)$ and $t = n/N$. Let $N \to \infty$ and $n \to \infty$ to define the evolution equation by the following:
$$\frac{\partial}{\partial t} G(s, t) = G^2(s, t) + 2 G\left(\frac{s}{2}, t\right) - 3 G(s, t) \qquad [1.4]$$
Theorem 2: $\| G_n(s) - G(s, t) \| < C_t / N$ for some $C_t > 0$, where $G_n(s)$ denotes the discrete iterate after $n = tN$ steps.
Proof: The difference is bounded like the error of the Riemann approximation for the integral of the evolution equation.
Convergence to Stable Solution
Theorem 3: The stable solution $G_\infty(s)$ [1.3] is an attractor of the evolution equation [1.4] with a basin containing all analytic $G_0(s)$.
Proof: $G(s, t) = \sum_{n=0}^{\infty} (a_n - b_n(t)) s^n$, where $a_n$ is defined as in the stable solution. It can be shown that $b_n(t)$ decays exponentially in $t$. This follows from the contractive properties of this mapping.
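The attraction claimed in Theorem 3 can be observed numerically on the first few power-series coefficients. Writing $G(s, t) = \sum_k c_k(t) s^k$, equation [1.4] gives $\dot{c}_k = \sum_{i=0}^{k} c_i c_{k-i} + 2^{1-k} c_k - 3 c_k$, and each $c_k$ depends only on lower-order coefficients, so truncation is exact. The sketch below integrates this system with a plain forward Euler scheme. The initial condition $G_0(s) = e^{-s}$ (all areas equal, rescaled to mean 1), the step size and the final time are illustrative assumptions.

```python
import math

def coefficient_derivative(c):
    """Right-hand side of [1.4] written on power-series coefficients."""
    rhs = []
    for k in range(len(c)):
        square_term = sum(c[i] * c[k - i] for i in range(k + 1))  # from G^2
        rhs.append(square_term + 2.0 ** (1 - k) * c[k] - 3.0 * c[k])
    return rhs

def evolve(c0, t_end, dt=0.01):
    """Forward Euler integration of the coefficient system up to t_end."""
    c = list(c0)
    for _ in range(int(round(t_end / dt))):
        rhs = coefficient_derivative(c)
        c = [ck + dt * rk for ck, rk in zip(c, rhs)]
    return c

# Assumed initial condition: G0(s) = exp(-s), so c_k(0) = (-1)^k / k!.
c0 = [(-1.0) ** k / math.factorial(k) for k in range(6)]
c_final = evolve(c0, t_end=60.0)
print(c_final)
```

The coefficients $c_0$ and $c_1$ stay fixed at 1 and -1, while the higher ones relax towards the values given by the recurrence in [1.3].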
Random Discrete Transitions
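Now assume that at each step the areas of countries change randomly, instead of following the expectation. From [1.2]:
$$E[G_n(s) \mid G_{n-1}(s)] = G_{n-1}(s) + \frac{1}{N} \left( G_{n-1}(s)^2 + 2 G_{n-1}\left(\frac{s}{2}\right) - 3 G_{n-1}(s) \right) \qquad [2.1]$$
If we fix the value of $s$, $\{G_n(s)\}$ is a sequence of random variables. Note that $E(E[X \mid Y]) = E(X)$ and $E(X^2) = E(X)^2 + \mathrm{Var}(X)$.
$$E(G_n(s)) = E(G_{n-1}(s)) + \frac{1}{N} \left( E(G_{n-1}(s))^2 + \mathrm{Var}(G_{n-1}(s)) + 2 E\left(G_{n-1}\left(\frac{s}{2}\right)\right) - 3 E(G_{n-1}(s)) \right) \qquad [2.2]$$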
Random Discrete Transitions
If this procedure really converges to a unique distribution $F$ with Laplace transform $L_F(s)$, then
$$G_n(s) \to L_F(s) \text{ a.s. and hence } E(G_n(s)) \to L_F(s) \text{ a.s.}$$
Noting that $L_F(s)$ is a constant, by the Continuous Mapping theorem, we know
$$\mathrm{Var}(G_n(s)) \to 0 \text{ a.s.} \qquad [2.3]$$
Then equation [2.2] becomes identical in the limit to the discrete equation [1.2] derived previously, whose stable solution is [1.3].
$$E(G_n(s)) \to G_\infty(s) \text{ a.s. and therefore } L_F(s) = G_\infty(s) \text{ a.s.}$$
In other words, if this random procedure converges, it must converge to a distribution whose Laplace transform is identical to the stable solution obtained before.
Convergence
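Let $X^{(i)}_{N,n}$ represent the area of country $i$ at the $n$th step, where we have $N$ countries total. Via more algebraic magic, we have:
$$E[X^{(1)}_{N,n+1} \mid \mathcal{F}_{N,n}] = \left(1 - \frac{2}{N}\right) X^{(1)}_{N,n} + \frac{2}{N} \qquad [2.4]$$
Let $\mathcal{F}_{N,n}$ be a filtration defined by:
$$\mathcal{F}_{N,n} = \sigma\left(\mathcal{F}_{N,n-1}, X^{(1)}_{N,n}, \dots, X^{(N)}_{N,n}\right)$$
Then by the following transformation:
$$Z^{(i)}_{N,n} = \frac{X^{(i)}_{N,n} - 1}{\left(1 - \frac{2}{N}\right)^n}$$
We have by [2.4] that:
$$E[Z^{(1)}_{N,n+1} \mid \mathcal{F}_{N,n}] = Z^{(1)}_{N,n} \qquad [2.5]$$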
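For completeness, the step from [2.4] to [2.5] can be written out as follows. This is a worked derivation using only the definition of $Z^{(1)}_{N,n}$ and [2.4]; it is not taken from the slides.

```latex
\begin{aligned}
E\big[Z^{(1)}_{N,n+1} \mid \mathcal{F}_{N,n}\big]
  &= \frac{E\big[X^{(1)}_{N,n+1} \mid \mathcal{F}_{N,n}\big] - 1}{(1 - 2/N)^{n+1}} \\
  &= \frac{(1 - 2/N)\, X^{(1)}_{N,n} + 2/N - 1}{(1 - 2/N)^{n+1}} \\
  &= \frac{(1 - 2/N)\big(X^{(1)}_{N,n} - 1\big)}{(1 - 2/N)^{n+1}}
   = Z^{(1)}_{N,n}.
\end{aligned}
```

The normalizing factor $(1 - 2/N)^n$ tends to $e^{-2t}$ when $n = tN$ and $N \to \infty$, which is where the factor $e^{2t}$ in the limit statements comes from.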
Convergence
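Fix the ratio $t = n/N$ and let $n \to \infty$, $N \to \infty$. This implies:
$$E\left(Z^{(1)+}_{N,n}\right) < +\infty \quad \text{if } n = t \times N$$
Then by the Martingale Convergence theorem, we know
$$Z^{(1)}_{N,n} \xrightarrow{P} Z_t \quad \text{and} \quad X^{(1)}_{N,n} \xrightarrow{P} 1 + \frac{Z_t}{e^{2t}}, \qquad n \to +\infty,\ N \to +\infty,\ n = Nt$$
where $Z_t$ is a random variable with finite mean. We can then establish that the family $\{Z_t / e^{2t}\}$ will converge in probability to a unique distribution $Z_\infty$ as $t \to \infty$.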
Convergence
If we can show that, as $t \to \infty$, the areas become pairwise independent or only weakly dependent, then we can consider the empirical distribution given by:
$$F_N(x) = \sum_{i=1}^{N} \frac{1}{N} I_{\{X^{(i)}_{N,n} < x\}}(x)$$
By application of the Law of Large Numbers, we can then show that:
$$F_N(x) \xrightarrow{d} Z_\infty$$
This implies that the areas of countries will converge to some distribution.
Conjecture: Final Distribution is Exponential to a Power
Our conjecture is that the final distribution function is:
$$F(x) = 1 - e^{-\lambda x^{1/b}}$$
where the density function is given by:
$$f(x) = \frac{\lambda}{b} e^{-\lambda x^{1/b}} x^{\frac{1}{b} - 1}$$
The coefficients of the moment generating function are:
$$a_n = (-1)^n \frac{\Gamma(nb + 1)}{\lambda^{nb}\, n!}$$
The numerically approximated value for $b$ is 1.64677.
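The slides do not say how the value of $b$ was obtained. One possible route, offered here only as an assumption, is to match the conjectured coefficients to the first terms of the stable solution: $a_1 = -1$ fixes $\lambda^b = \Gamma(b + 1)$, and $a_2 = 2$ then requires $\Gamma(2b + 1) / \Gamma(b + 1)^2 = 4$. The sketch below solves that equation by bisection; the function names and the bracketing interval $[1, 2]$ are illustrative.

```python
import math

def log_moment_ratio(b: float) -> float:
    """log of Gamma(2b + 1) / Gamma(b + 1)^2, i.e. log of E[X^2] / E[X]^2."""
    return math.lgamma(2 * b + 1) - 2 * math.lgamma(b + 1)

def solve_b(target_ratio: float = 4.0, lo: float = 1.0, hi: float = 2.0) -> float:
    """Bisection for b with Gamma(2b + 1) / Gamma(b + 1)^2 = target_ratio."""
    target = math.log(target_ratio)
    for _ in range(100):
        mid = (lo + hi) / 2
        if log_moment_ratio(mid) < target:
            lo = mid  # ratio too small: the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

def conjectured_coefficient(n: int, b: float) -> float:
    """(-1)^n Gamma(nb + 1) / (lambda^(nb) n!), assuming lambda^b = Gamma(b + 1)."""
    log_value = math.lgamma(n * b + 1) - n * math.lgamma(b + 1) - math.lgamma(n + 1)
    return (-1) ** n * math.exp(log_value)

b_estimate = solve_b()
print(b_estimate)
print([conjectured_coefficient(n, b_estimate) for n in range(1, 6)])
```

The printed coefficients for larger $n$ can be compared with those produced by the recurrence in [1.3] to judge how well the conjectured family fits.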
Conjecture: Final Distribution is Exponential to a Power
The red line is composed of points drawn from our conjectured distribution. The blue line shows the country areas from an experiment with $n = 500000$ and $N = 10000$, starting from a uniform initial distribution.
[Figure: conjectured distribution (red) against simulated country areas (blue).]
Outlook
Can we formalize our procedure to establish weak pairwise dependence between countries for $t \to \infty$?
Can it be shown that this limiting distribution is in fact exponential with the parameter we estimated numerically?