An Invitation to Classical and Quantum Information Geometry
Attila Andai
September 4, 2017
These slides were presented at the following conference.
XXVI International Fall Workshop on Geometry and Physics
Universidade do Minho, Braga, Portugal
4-7 September 2017
The slides may contain minor errors and typos. Use at your own risk.
Outline
Classical information geometry
1 Basic ideas
2 Parametric probability distributions
3 Fisher information
4 Divergences
5 Differential geometry
6 Duality
Quantum information geometry
1 Introduction to noncommutative information geometry
2 Preparations for Petz theorem
3 Means
4 Petz theorem
5 Operator monotone functions
6 Computing monotone metrics
Advanced topics
1 Relative entropy
2 Duality
3 About volume of the state space
4 Uncertainty relations
Classical information geometry
Basic ideas
Information Geometry
Statistical model ≈ Parametric probability distribution
Information geometry ≈ Riemannian metric on statistical model
Parametric probability distributions
Statistical model
Definition
Statistical model: S = (X, B(X), S, Ξ)
1 X ≠ ∅ is a set, B(X) is a σ-algebra on X,
2 the elements of S are probability measures on B(X),
3 there exists a bijection i : Ξ → S, ϑ ↦ µ_ϑ.
Ξ: parameter space
(This setting is too general.)
We make more assumptions.
1 ∃ n ∈ N+: Ξ ⊆ R^n, moreover Ξ is a connected open set (n-dimensional statistical model).
2 If X is finite, then B(X) = P(X).
3 If X is infinite, then X ⊆ R^m, X is a connected open set, B(X) contains the Borel sets, and for every ϑ ∈ Ξ the probability distribution µ_ϑ ∈ S has a density function p_ϑ (with respect to the Lebesgue measure).
4 We refer to the elements of S as density functions and denote them by p(x, ϑ) = p_ϑ(x).
5 Every function p_ϑ ∈ S has first, second, and third moments.
6 For every x ∈ X the function
Ξ → R, ϑ ↦ p(x, ϑ)
is smooth. We use the notation
∂_i p(x, ϑ) = ∂p(x, ϑ)/∂ϑ_i, i = 1, . . . , n.
7 We assume that
∫_X ∂_{i_1} · · · ∂_{i_k} p(x, ϑ) dx = ∂_{i_1} · · · ∂_{i_k} ∫_X p(x, ϑ) dx = 0.
8 For all ϑ ∈ Ξ and all x ∈ X: p(x, ϑ) > 0.
The statistical model is denoted by (X, S, Ξ).
Example (Discrete distribution)
X = {0, 1, . . . , n}
Ξ = {(ϑ_1, . . . , ϑ_n) ∈ R^n | ϑ_i > 0, ∑_{k=1}^n ϑ_k < 1}
p(x, ϑ) = ϑ_x if 1 ≤ x ≤ n, and p(0, ϑ) = 1 − ∑_{k=1}^n ϑ_k.
The space of distributions:
P_n = {(p_0, p_1, . . . , p_n) ∈ ]0, 1[^{n+1} | ∑_{i=0}^n p_i = 1}.
Example (Normal distribution)
X = R
Ξ = R × R+
p(x, µ, σ) = 1/(√(2π) σ) exp( −(x − µ)²/(2σ²) )
Fisher information
Fisher information matrix
For an n-dimensional statistical model (X, S, Ξ) the Fisher information is an n × n matrix at every parameter ϑ ∈ Ξ.
Definition
Assume that (X, S, Ξ) is an n-dimensional statistical model. For every point ϑ ∈ Ξ the Fisher information matrix is given by
g^(F)(ϑ)_ik = ∫_X (1/p(x, ϑ)) (∂_i p(x, ϑ)) (∂_k p(x, ϑ)) dx.
The Fisher matrix is denoted by g^(F)(ϑ).
We will use the following representations of the Fisher matrix:
g^(F)(ϑ)_ik = ∫_X p(x, ϑ) (∂_i log p(x, ϑ)) (∂_k log p(x, ϑ)) dx
g^(F)(ϑ)_ik = 4 ∫_X (∂_i √p(x, ϑ)) (∂_k √p(x, ϑ)) dx
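To make the definition concrete, here is a minimal numerical sketch (my own, not from the slides) that estimates the Fisher matrix of the normal family from the log-derivative representation, using finite differences in the parameters and trapezoidal quadrature in x. For N(µ, σ²) it returns approximately diag(1/σ², 2/σ²).

    import numpy as np

    def log_p(x, theta):
        mu, sigma = theta
        return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

    def fisher_matrix(theta, eps=1e-5):
        # quadrature grid in x, wide enough to capture the density
        x = np.linspace(theta[0] - 12 * theta[1], theta[0] + 12 * theta[1], 20001)
        p = np.exp(log_p(x, theta))
        grads = []
        for i in range(2):
            e = np.zeros(2)
            e[i] = eps
            # central difference for d_i log p(x, theta)
            grads.append((log_p(x, theta + e) - log_p(x, theta - e)) / (2 * eps))
        return np.array([[np.trapz(p * grads[i] * grads[k], x) for k in range(2)]
                         for i in range(2)])

    print(fisher_matrix(np.array([0.0, 1.0])))  # ≈ [[1, 0], [0, 2]]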
Theorem
Assume that (X, S, Ξ) is an n-dimensional statistical model. If the functions (∂_i p(·, ϑ))_{i=1,...,n} are linearly independent at a point ϑ ∈ Ξ, then the Fisher matrix g^(F)(ϑ) is positive definite.
Proof.
For every c ∈ R^n
⟨c, g^(F)(ϑ) c⟩ = ∫_X p(x, ϑ) ( ∑_{i=1}^n c_i ∂_i log p(x, ϑ) )² dx ≥ 0.
Equality forces ∑_{i=1}^n c_i ∂_i p(·, ϑ) = 0, so by the linear independence c = 0.
Induced statistical models
Assume that (X, B(X), S, Ξ) is a statistical model and
f : X → Y, x ↦ f(x)
is a surjective map.
Let us define B(Y) = {A ⊆ Y | f⁻¹(A) ∈ B(X)}.
For every ϑ ∈ Ξ, µ_ϑ is a probability measure on X with density function p_ϑ.
Now define the induced measure ν_ϑ as
ν_ϑ(A) = µ_ϑ(f⁻¹(A)) for all A ∈ B(Y),
and denote its density function by q_ϑ.
Define Q = {ν_ϑ | ϑ ∈ Ξ}.
After these steps, we have an induced statistical model (Y, B(Y), Q, Ξ).
Monotonicity of Fisher matrix
If we measure less precisely, we have less information.
Definition
Assume that (X, S, Ξ) is a statistical model and f : X → Y is a measurable surjective map. Let us define
r(·, ·) : X × Ξ → R, (x, ϑ) ↦ r(x, ϑ) = p(x, ϑ)/q(f(x), ϑ).
f is a sufficient statistic for S if for every x ∈ X the function
r(x, ·) : Ξ → R, ϑ ↦ r(x, ϑ)
is constant.
Theorem
Assume that (X, S, Ξ) is a statistical model, f : X → Y is a measurable surjective map and (Y, Q, Ξ) is the induced statistical model. Denote the Fisher information matrix in S by g^(F)_S(ϑ) and in Q by g^(F)_Q(ϑ). For every ϑ ∈ Ξ
g^(F)_Q(ϑ) ≤ g^(F)_S(ϑ). (⋆)
Information loss: ∆g(ϑ) = g^(F)_S(ϑ) − g^(F)_Q(ϑ),
∆g_ik(ϑ) = ∫_X p(x, ϑ) (∂ log r(x, ϑ)/∂ϑ_i) (∂ log r(x, ϑ)/∂ϑ_k) dx.
Equality holds in (⋆) iff f is a sufficient statistic for S.
Monotonicity under Markov kernel
Definition
Assume that X ⊆ R^n and Y ⊆ R^m are connected open sets. The map
κ : X × Y → R, (x, y) ↦ κ(y|x)
is a Markov kernel (or transition probability) if κ(y|x) ≥ 0 for all x ∈ X and y ∈ Y, and
∫_Y κ(y|x) dy = 1 for all x ∈ X.
Theorem
Assume that (X, S, Ξ) is a statistical model and
κ : X × Y → R, (x, y) ↦ κ(y|x)
is a Markov kernel. Define q(y, ϑ) = ∫_X κ(y|x) p(x, ϑ) dx, and denote the resulting set of distributions by (Y, Q, Ξ). Then for every ϑ ∈ Ξ we have
g^(F)_Q(ϑ) ≤ g^(F)_S(ϑ).
The information loss ∆g(ϑ) = g^(F)_S(ϑ) − g^(F)_Q(ϑ) is
∆g_ik(ϑ) = ∫_X p(x, ϑ) (∂ log r(x, ϑ)/∂ϑ_i) (∂ log r(x, ϑ)/∂ϑ_k) dx.
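A toy check (my own example, with sums in place of integrals): a one-parameter family on {0, 1, 2} is coarse-grained by merging outcomes 0 and 1, a deterministic Markov kernel. The merge is not sufficient for this family, so the inequality is strict.

    def fisher(p, dp):
        # Fisher information sum_i (dp_i)^2 / p_i of a finite one-parameter family
        return sum(d * d / w for w, d in zip(p, dp))

    t = 0.3
    p  = [t, t * t, 1 - t - t * t]        # p(theta) on {0, 1, 2}
    dp = [1.0, 2 * t, -1 - 2 * t]         # d p / d theta
    q  = [p[0] + p[1], p[2]]              # induced family after merging 0 and 1
    dq = [dp[0] + dp[1], dp[2]]

    print(fisher(q, dq), "<=", fisher(p, dp))   # 10.76... <= 11.53...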
Cramér–Rao inequality
We consider the problem of estimating an unknown parameter.
Assume that data is randomly generated according to a probability distribution which is unknown, but is assumed to lie in an n-dimensional statistical model.
Assume that (X, S, Ξ) is a statistical model. A measurement is a map X : X → R^m (m = 1 is the real valued measurement).
After k measurements we estimate the parameter ϑ with an estimator
ϑ̂ : (R^m)^k → Ξ, (x_1, . . . , x_k) ↦ ϑ̂(x_1, . . . , x_k).
Assume that we have independent measurements. The expected value of ϑ̂ with respect to p^(k)(x, ϑ) is
E_ϑ(ϑ̂) = ∫_{X^k} p^(k)(x, ϑ) ϑ̂(x) dx.
The estimator ϑ̂ is unbiased if for every ϑ ∈ Ξ
E_ϑ(ϑ̂) = ϑ.
The variance of the estimator is
V_ϑ(ϑ̂)_ij = E_ϑ( (ϑ̂ − E_ϑ(ϑ̂))_i (ϑ̂ − E_ϑ(ϑ̂))_j )
= ∫_{X^k} p^(k)(x, ϑ) (ϑ̂(x) − E_ϑ(ϑ̂))_i (ϑ̂(x) − E_ϑ(ϑ̂))_j dx.
Theorem (Cramér–Rao)
Assume that (X, S, Ξ) is a statistical model, k ∈ N+, g^(F) is the Fisher information of (X^k, S^(k), Ξ), ϑ̂ is an unbiased estimator of ϑ and V_ϑ(ϑ̂) is its variance. For every ϑ ∈ Ξ we have
V_ϑ(ϑ̂) ≥ (g^(F)(ϑ))^{-1}.
Example (Cramér–Rao inequality)
Define X = {0, 1}, Ξ = ]0, 1[ and S as the set of functions
p : X × Ξ → R, (x, ϑ) ↦ 1 − ϑ if x = 0, ϑ if x = 1.
Then (X, S, Ξ) is a statistical model. Assume that we have independent measurements x_1, . . . , x_k. Consider the estimator for ϑ
ϑ̂ : X^k → Ξ, (x_1, . . . , x_k) ↦ (1/k) ∑_{i=1}^k x_i.
ϑ̂ is unbiased:
E_ϑ(ϑ̂) = ∑_{i=0}^k (k choose i) ϑ^{k−i} (1 − ϑ)^i (k − i)/k = ϑ.
Example (Cramér–Rao inequality (cont.))
The variance of ϑ̂ is
V_ϑ(ϑ̂) = ∑_{i=0}^k (k choose i) ϑ^{k−i} (1 − ϑ)^i ((k − i)/k − ϑ)² = ϑ(1 − ϑ)/k.
The Fisher information is g_S(ϑ) = 1/(ϑ(1 − ϑ)); for k measurements it is g^(F)(ϑ) = k g_S(ϑ). The Cramér–Rao inequality in this setting is
ϑ(1 − ϑ)/k ≥ ϑ(1 − ϑ)/k.
So ϑ̂ has the least possible variance.
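A quick Monte Carlo sanity check (my own, not from the slides) that the sample mean attains the bound ϑ(1 − ϑ)/k:

    import random

    theta, k, runs = 0.3, 50, 100_000
    estimates = [sum(random.random() < theta for _ in range(k)) / k for _ in range(runs)]
    mean = sum(estimates) / runs
    var = sum((e - mean) ** 2 for e in estimates) / runs
    print(var, "~", theta * (1 - theta) / k)   # both ≈ 0.0042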
Entropy and Fisher information
Fisher information of a density function
Consider a density function f : R^n → R and the shift as a parameter:
f̄ : R^n × R^n → R, (x, y) ↦ f̄(x, y) = f(x + y).
The Fisher information of f̄ is
g_ik(y) = ∫_{R^n} (1/f̄(x, y)) (∂f̄(x, y)/∂y_i) (∂f̄(x, y)/∂y_k) dx.
It does not depend on y, so it is reasonable to define
g_ik = ∫_{R^n} (1/f(x)) (∂f(x)/∂x_i) (∂f(x)/∂x_k) dx
as the Fisher information of f.
Entropy
Definition
The entropy of a density function f : X → R is
S(f) = −∫_X f(x) log f(x) dx
(with the convention 0 log 0 = 0).
Fisher information vs. Entropy
1 Fisher information is defined both for families of distributions and for single distributions; entropy is defined for single distributions.
2 Fisher information is strictly positive; entropy can be any real number.
3 There is a maximum entropy principle and a minimum Fisher information principle.
4 The Fisher information of a density function p of a single variable is
g = 4 ∫_R ( d√p(x)/dx )² dx.
Fisher defined the probability amplitude q(x) = √p(x). He also studied the Lagrange density
L = 4 (q′(x))²
and gave an information theoretical background for potential energy. Fisher studied complex probability amplitudes too and examined the Lagrange function with a kinetic energy term of the form
L_m = C ∇ψ · ∇ψ*.
(This was written down half a year later, in 1926, by Schrödinger for the function ψ.)
Distance of coins
What is the distance between the coins (p_1, 1 − p_1) and (p_2, 1 − p_2)?
In 1925 Fisher suggested, on theoretical grounds, the angle between the vectors (√p_1, √(1 − p_1)) and (√p_2, √(1 − p_2)).
The measurement based consideration is the following. Assume that p_1 < p_2. If we can make n measurements, then the uncertainty of a measurement is the typical fluctuation
∆p = √( p(1 − p)/n ).
The distributions (p_1, 1 − p_1) and (p_2, 1 − p_2) are said to be distinguishable in n measurements if
|p_1 − p_2| ≥ ∆p_1 + ∆p_2.
Define k(n, p_1, p_2) as the number of probability distributions (p_i, 1 − p_i) with p_1 < p_i < p_2 and p_i < p_{i+1} such that (p_i, 1 − p_i) is distinguishable in n measurements from (p_{i+1}, 1 − p_{i+1}). Let the distance between (p_1, 1 − p_1) and (p_2, 1 − p_2) be
d(p_1, p_2) = lim_{n→∞} k(n, p_1, p_2)/√n.
This gives us for the distance
d(p_1, p_2) = (1/2) ∫_{p_1}^{p_2} 1/√(p(1 − p)) dp = arccos( √(p_1 p_2) + √((1 − p_1)(1 − p_2)) ).
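A numerical cross-check (my own; the antiderivative of 1/√(p(1 − p)) is 2 arcsin √p, so half the integral has a closed form):

    import math

    def closed_form(p1, p2):
        return math.acos(math.sqrt(p1 * p2) + math.sqrt((1 - p1) * (1 - p2)))

    def half_integral(p1, p2):
        # (1/2) * integral_{p1}^{p2} dp / sqrt(p(1-p))
        return math.asin(math.sqrt(p2)) - math.asin(math.sqrt(p1))

    print(closed_form(0.2, 0.7), half_integral(0.2, 0.7))  # both ≈ 0.5276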
Divergences
General contrast function
Definition
Let (X, S, Ξ) be a statistical model. A general contrast function is a function
D : S × S → R, (p, q) ↦ D(p, q)
such that for all p, q ∈ S: D(p, q) ≥ 0, and D(p, q) = 0 iff p = q.
The dual divergence is given as D∗(p, q) = D(q, p).
Let us consider some examples.
Kullback–Leibler: D_KL(p, q) = ∫_X p(x) log( p(x)/q(x) ) dx
Hellinger: D_H(p, q) = ∫_X ( √p(x) − √q(x) )² dx
χ²: D_{χ²}(p, q) = ∫_X p(x) [ ( q(x)/p(x) )² − 1 ] dx
α ∈ ]−1, 1[: D_α(p, q) = 4/(1 − α²) [ 1 − ∫_X p(x)^{(1−α)/2} q(x)^{(1+α)/2} dx ]
Harmonic: D_Ha(p, q) = 1 − ∫_X 2 p(x) q(x)/( p(x) + q(x) ) dx
Triangle: D_∆(p, q) = ∫_X ( p(x) − q(x) )²/( p(x) + q(x) ) dx
These distance-like functions are used in many areas of mathematics and its applications.
For example, D_KL(p, q):
– is often called the information gain achieved if P is used instead of Q, in the context of machine learning;
– can be interpreted as the expected number of extra bits required to code samples from P using a code optimized for Q rather than the code optimized for P, in the context of coding theory.
Csiszár divergence
These quantities can be handled as special cases of the Csiszár divergence.
Definition
Assume that f : R+ → R is a strictly convex function with f(1) = 0. The Csiszár divergence is
D_f(p, q) = ∫_X p(x) f( q(x)/p(x) ) dx.
For the function f^♮(u) = u f(u^{-1}) we have
D_f(p, q) = D_{f^♮}(q, p).
α-divergence
If α ∈ R and
f_α : R → R, x ↦
4/(1 − α²) (1 − x^{(1+α)/2}) if α ≠ ±1,
x log x if α = 1,
−log x if α = −1,
then D_{f_{−1}} = D_KL, D_{f_0} = 2 D_H, and in the α ≠ ±1 case D_{f_α} = D_α.
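A short self-contained check (my own, with sums in place of integrals) of two of these special cases:

    import math

    p = [0.2, 0.5, 0.3]
    q = [0.4, 0.4, 0.2]

    def csiszar(p, q, f):
        return sum(pi * f(qi / pi) for pi, qi in zip(p, q))

    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    hellinger = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))

    print(csiszar(p, q, lambda x: -math.log(x)), kl)                        # D_{f_{-1}} = D_KL
    print(csiszar(p, q, lambda x: 4 * (1 - math.sqrt(x))), 2 * hellinger)   # D_{f_0} = 2 D_H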
The Csiszár divergence D_f is monotone and jointly convex.
Theorem
For probability densities p, q : X → R and a Markov kernel κ : X × Y → R define p̄(y) = ∫_X κ(y|x) p(x) dx and q̄(y) = ∫_X κ(y|x) q(x) dx. For the Csiszár divergences we have
D_f(p̄, q̄) ≤ D_f(p, q).
Theorem
For density functions p_1, p_2, q_1, q_2 : X → R and parameters 0 ≤ λ_1 ≤ 1, λ_2 = 1 − λ_1,
D_f(λ_1 p_1 + λ_2 p_2, λ_1 q_1 + λ_2 q_2) ≤ λ_1 D_f(p_1, q_1) + λ_2 D_f(p_2, q_2)
holds.
Contrast function
A general contrast function D (in some cases) has a series expansion. From now on assume that for every ϑ ∈ Ξ the function y ↦ D(p(x, ϑ + y), p(x, ϑ)) has a series expansion with respect to y:
D(p(x, ϑ + y), p(x, ϑ)) = ∑_{i,k=1}^n g^(D)_ik(p) y_i y_k/2 + ∑_{i,j,k=1}^n h^(D)_ijk y_i y_j y_k/6 + o(‖y‖³)
Definition
We call D a divergence or contrast function if for every ϑ ∈ Ξ the function D(p(x, ϑ + y), p(x, ϑ)) has a series expansion with respect to y and the second order term g^(D)_ik is positive definite.
Theorem
We have the following equalities for the series expansions of divergences.
g^(D_KL) = g^(F)    g^(D_H) = (1/2) g^(F)    g^(D_{χ²}) = 2 g^(F)
g^(D_α) = g^(F)     g^(D_B) = (1/4) g^(F)    g^(D_Ha) = (1/2) g^(F)
g^(D_J) = 2 g^(F)   g^(D_∆) = g^(F)          g^(D_LW) = (1/4) g^(F)
g^(D_f) = f″(1) g^(F)
Differential geometry
Riemannian metric
Definition
(M, A) is an n-dimensional manifold if
1 M is a Hausdorff topological space with a countable base,
2 A is countable and its elements are homeomorphisms φ_i : U_i → V_i, where U_i ⊆ M and V_i ⊆ R^n are open sets,
3 for every pair of functions φ_i, φ_j ∈ A the map
φ_i ∘ φ_j^{-1} : φ_j(U_i ∩ U_j) → φ_i(U_i ∩ U_j)
is C^∞,
4 every point x ∈ M is contained in some U_i.
Assume that M is an n-dimensional manifold and p ∈ M.
Denote by F_p the set of smooth functions defined in a neighbourhood of p.
A derivation is a map
D : F_p → R
such that for every a, b ∈ R and all functions f, g ∈ F_p
D(af + bg) = aD(f) + bD(g),  D(fg) = f(p)D(g) + D(f)g(p)
holds. The set of derivations is denoted by T_pM and is called the tangent space.
The tangent bundle is TM = ⋃_{p∈M} {p} × T_pM.
A vector field is a map
X : M → ⋃_{p∈M} T_pM, p ↦ X(p)
such that
1 for every p ∈ M: X(p) ∈ T_pM,
2 for every p ∈ M and f ∈ F_p the function
Xf : Dom(X) ∩ Dom(f) → R, p ↦ X(p)f
is smooth.
The set of vector fields is denoted by X(M).
Definition
A map
g : M → ⋃_{p∈M} Lin(T_pM × T_pM, R)
is a Riemannian metric if
1 for every p ∈ M the map g_p : T_pM × T_pM → R is a scalar product,
2 for every vector field X ∈ X(M) the function
g(X, X) : M → R, p ↦ g_p(X_p, X_p)
is smooth.
The pair (M, g) is called a Riemannian geometry or Riemannian manifold.
Assume that p ∈ M and ϕ : U → R^n is a local coordinate system around p. For every f ∈ F_p define (i = 1, . . . , n)
∂_i f = ∂(f ∘ ϕ^{-1})/∂x_i (ϕ(p)).
We consider (∂_1, . . . , ∂_n) as a basis of T_pM. The Riemannian metric in this coordinate system can be described by the matrix
g_ij = g(∂_i, ∂_j).
Covariant derivative
The map
∇ : X(M) × X(M) → X(M), (X, Y) ↦ ∇_X Y
is a covariant derivative if
1 for all vector fields X, Y, Z ∈ X(M)
∇_{X+Y} Z = ∇_X Z + ∇_Y Z,  ∇_X(Y + Z) = ∇_X Y + ∇_X Z,
2 for all vector fields X, Y ∈ X(M) and every function f ∈ F(M)
∇_{fX} Y = f ∇_X Y,  ∇_X(fY) = (Xf)Y + f ∇_X Y.
Assume that p ∈ M and ϕ : U → R^n is a local coordinate system around p. The covariant derivative can be described by the Christoffel symbols of the first kind
Γ_ijk = g(∇_{∂_i} ∂_j, ∂_k)
and by the Christoffel symbols of the second kind
Γ_ij^k ∂_k = ∇_{∂_i} ∂_j.
Levi-Civita covariant derivative
The pair (M, ∇) is called an affine manifold.
The affine manifold (M, ∇) is called torsion free if Γ_ij^k = Γ_ji^k holds in every local coordinate system.
The covariant derivative ∇ on a Riemannian manifold (M, g) is called a Riemannian covariant derivative if for all vector fields X, Y, Z ∈ X(M)
Xg(Y, Z) = g(∇_X Y, Z) + g(Y, ∇_X Z).
The covariant derivative ∇ on a Riemannian manifold (M, g) is called the Levi-Civita covariant derivative if it is a torsion free Riemannian covariant derivative.
Theorem
For every Riemannian manifold (M, g) there exists a unique Levi–Civita covariant derivative ∇, which in local coordinate systems can be expressed as
Γ_ij^m = (1/2) g^{km} (∂_i g_jk + ∂_j g_ik − ∂_k g_ij).
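As an illustration, a small sympy sketch (my own) evaluates this formula for the Fisher metric g = diag(1/σ², 2/σ²) of the normal family N(µ, σ²) from the earlier example:

    import sympy as sp

    mu, sigma = sp.symbols('mu sigma', positive=True)
    coords = [mu, sigma]
    g = sp.Matrix([[1 / sigma**2, 0], [0, 2 / sigma**2]])
    ginv = g.inv()

    def christoffel(m, i, j):
        # Gamma^m_ij = (1/2) g^{km} (d_i g_jk + d_j g_ik - d_k g_ij)
        return sp.simplify(sum(ginv[k, m] * (sp.diff(g[j, k], coords[i])
                                             + sp.diff(g[i, k], coords[j])
                                             - sp.diff(g[i, j], coords[k])) / 2
                               for k in range(2)))

    for m in range(2):
        for i in range(2):
            for j in range(2):
                c = christoffel(m, i, j)
                if c != 0:
                    print(f'Gamma^{m}_{i}{j} = {c}')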
Curvature
Definition
For an affine manifold (M, ∇) define the curvature as
R : X(M) × X(M) × X(M) → X(M), (X, Y, Z) ↦ R(X, Y)Z,
R(X, Y)Z = ∇_X ∇_Y Z − ∇_Y ∇_X Z − ∇_{[X,Y]} Z.
The affine manifold (M, ∇) is flat if R = 0.
In a local coordinate system the curvature tensor can be handled via the quantities
R(∂_i, ∂_j)∂_k = R_ijk^l ∂_l,  g(R(∂_i, ∂_j)∂_k, ∂_l) = R_ijkl.
The curvature tensor has the symmetries
R_ijkl = −R_jikl,  R_ijkl = −R_ijlk,  R_ijkl = R_klij.
One can compute the curvature tensor as
R_ijk^l = ∂_i Γ_jk^l − ∂_j Γ_ik^l + Γ_jk^m Γ_im^l − Γ_ik^m Γ_jm^l.
Definition
For an affine manifold (M, ∇) with curvature R the function
Ric : X(M) × X(M) → F(M), (X, Y) ↦ Tr(Z ↦ R(Z, X)Y)
is called the Ricci curvature.
In a local coordinate system the matrix
Ric_ij = Ric(∂_i, ∂_j)
can be computed as Ric_jk = R_ijk^i.
Length and volume
Assume that (M, g) is a Riemannian manifold and γ : ]a, b[ → M is a smooth curve. The length of the curve is defined as
l_γ(a, b) = ∫_a^b √( g(γ̇(t), γ̇(t)) ) dt.
The volume of a set U ⊆ Dom(φ) is
V(U) = ∫_{φ(U)} √(det g).
Geodesic line
A smooth curve γ : ]a, b[ → M is called a geodesic line if in local coordinate systems
d²γ^k/dt² + ∑_{i,j=1}^{dim M} (Γ_ij^k ∘ γ) (dγ^i/dt) (dγ^j/dt) = 0
holds.
Information geometry basics
Consider a statistical model (X, S, Ξ).
The manifold is M = Ξ, an open connected subset of R^n.
The Riemannian metric g = g^(F) is the Fisher information.
We can compute the Levi–Civita covariant derivative, or define new ones.
In 1945, Rao suggested considering the Fisher information as a Riemannian metric.
In 1975, Efron was the first to study the curvature of statistical manifolds.
In 1979, Ruppeiner claimed that thermodynamic systems can be represented by Riemannian geometry, and that statistical properties can be derived from the model. (For example, he found a connection between the behaviour of correlation functions and curvature at second order phase transitions.)
In 1999, Brody and Ritz studied the curvature of the statistical model of Ising chains.
Alpha covariant derivatives
Definition
Consider the set P_n. For every −1 ≤ α ≤ 1 define
Γ^(α)_ijk = ∑_{l=0}^n p(l, ϑ) ( ∂_i ∂_j log p(l, ϑ) + ((1 − α)/2) (∂_i log p(l, ϑ)) (∂_j log p(l, ϑ)) ) (∂_k log p(l, ϑ)),
which defines the α-covariant derivative.
Theorem
The 0-covariant derivative is the Levi-Civita covariant derivative.
Examples
Example (Geodesic line in P_1)
In the space (P_1, ∇) the curve γ is a geodesic line iff
d²γ(t)/dt² − (1 − 2γ(t))/(2γ(t)(1 − γ(t))) (dγ(t)/dt)² = 0.
The solution (with initial values γ(0) = a and γ̇(0) = b) is
γ(t) = cos²( bt/(2√a √(1 − a)) + arccos √a ).
Example (Normal distribution)
Let us define the base set X = R, the parameter space Ξ = R × R+ and the elements of S as
p(x, µ, σ) = 1/(√π σ) exp( −(x − µ)²/σ² ), (µ, σ) ∈ Ξ.
Using the coordinate system (µ, σ), the Fisher information of the statistical model (X, S, Ξ) is
(g^(F)_ik) = ( 2/σ²  0 ; 0  2/σ² ).
The pair (Ξ, g^(F)) is a special Riemannian geometry, called the hyperbolic plane.
Example (Normal distribution cont.)
The geodesic curves are the semicircles whose centre lies on the µ-axis, together with the half lines µ = constant.
Example (Normal distribution cont.)
Consider the distributions given by parameters (µ_1, σ_1) and (µ_2, σ_2), where µ_1 ≤ µ_2. If µ_1 < µ_2 then define the parameters
R = √( ((µ_2 − µ_1)/2)² + (σ_1² + σ_2²)/2 + ( (σ_2² − σ_1²)/(2(µ_2 − µ_1)) )² ),
C = (µ_1 + µ_2)/2 + (σ_2² − σ_1²)/(2(µ_2 − µ_1)).
The geodesic curve connecting the points (µ_1, σ_1) and (µ_2, σ_2) is the semicircle (µ − C)² + σ² = R² (σ > 0).
Example (Normal distribution cont.)
The geodesic distance between the points is the following.
1 If (µ_1 − µ_2)² ≤ |σ_2² − σ_1²| then
d((µ_1, σ_1), (µ_2, σ_2)) = √2 | arcosh(R/σ_1) − arcosh(R/σ_2) |.
2 If (µ_1 − µ_2)² ≥ |σ_1² − σ_2²| then
d((µ_1, σ_1), (µ_2, σ_2)) = √2 ( arcosh(R/σ_1) + arcosh(R/σ_2) ).
3 If µ_1 = µ_2 then
d((µ_1, σ_1), (µ_2, σ_2)) = √2 | log(σ_1/σ_2) |.
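A direct transcription (my own) of these case formulas, for the parameterization used in this example:

    import math

    def normal_geodesic_distance(m1, s1, m2, s2):
        if m1 == m2:
            return math.sqrt(2) * abs(math.log(s1 / s2))
        R = math.sqrt(((m2 - m1) / 2) ** 2 + (s1**2 + s2**2) / 2
                      + ((s2**2 - s1**2) / (2 * (m2 - m1))) ** 2)
        a1, a2 = math.acosh(R / s1), math.acosh(R / s2)
        if (m1 - m2) ** 2 <= abs(s2**2 - s1**2):
            return math.sqrt(2) * abs(a1 - a2)
        return math.sqrt(2) * (a1 + a2)

    print(normal_geodesic_distance(0.0, 1.0, 1.0, 1.0))  # ≈ 1.36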
Pull-back metric
Assume that ϕ : M → N is a smooth map between differentiable manifolds. For every p ∈ M we have the maps
ϕ¹ : F^N_{ϕ(p)} → F^M_p, f ↦ f ∘ ϕ
and
ϕ_* : T_pM → T_{ϕ(p)}N, v ↦ v ∘ ϕ¹.
Definition
If (N, g) is a Riemannian manifold then we can define the pull-back metric on M as
g^M_p(x, y) = g^N_{ϕ(p)}(ϕ_*(x), ϕ_*(y)).
Theorem
The pull-back of the Euclidean metric by the map
P_n → R^{n+1}, (p_1, . . . , p_n) ↦ ( √(1 − ∑_{k=1}^n p_k), √p_1, . . . , √p_n )
is the Fisher metric.
Theorem
The volume of the space P_n equals the surface of the (n + 1)-dimensional ball divided by 2^{n+1}, that is
V(P_n) = π^{(n+1)/2} / ( 2^n Γ((n + 1)/2) ).
Uniqueness of Fisher metric
Theorem
Let us define X_n = {0, 1, . . . , n} (n ∈ N+). Assume that for every n a Riemannian metric g_n is given on P_n. For a transition probability κ : X_n × X_m → R denote by κ̄ : P_n → P_m the induced map. If for every transition probability κ : X_n × X_m → R, every point p ∈ P_n and every tangent vector X ∈ T_pP_n
g_{κ̄(p)}(κ̄_*(X), κ̄_*(X)) ≤ g_p(X, X)
holds, then there exists a unique positive number c such that for every n ∈ N+: g_n = c g^(F)_n.
Duality
Duality on Riemannian manifolds
Definition
For a Riemannian geometry (M, g) the covariant derivatives ∇ and ∇* are called dual covariant derivatives if for all vector fields X, Y, Z ∈ X(M)
Zg(X, Y) = g(∇_Z X, Y) + g(X, ∇*_Z Y)
holds. We call (M, g, ∇, ∇*) a dual Riemannian geometry.
Theorem
Consider a statistical model (X, S, Ξ) with Fisher metric g. For all α ∈ [−1, 1] the covariant derivatives ∇^(α) and ∇^(−α) are torsion free and dual.
Theorem
Assume that (M, g, ∇, ∇*) is a torsion free dual geometry with curvatures R and R*. In this case R = 0 iff R* = 0.
In this case we call (M, g, ∇, ∇*) a flat dual Riemannian geometry.
From divergence to duality
Assume that M is an n-dimensional manifold, D : M × M → R is a divergence, ϑ ∈ M, and φ is a local coordinate system in a neighbourhood of ϑ. Consider the function
D_(ϑ,φ) : R^n → R, y ↦ D(ϑ, φ^{-1}(φ(ϑ) + y))
and its series expansion
D_(ϑ,φ)(y) = (1/2) ∑_{i,k=1}^n g^(D)_ik(ϑ) y_i y_k + (1/6) ∑_{i,j,k=1}^n h^(D)_ijk(ϑ) y_i y_j y_k + o(‖y‖³).
At every point ϑ ∈ M the matrix g^(D)(ϑ) is positive definite, so (M, g^(D)) is a Riemannian geometry. From the third order term define
Γ^(D)_ijk = h^(D)_ijk − ∂_k g^(D)_ij,  i, j, k ∈ {1, 2, . . . , n}.
Theorem (From divergence to duality)
Assume that M is an n-dimensional manifold, D is a divergence on M, and we have the induced quantities g^(D), Γ^(D)_ijk and Γ^(D*)_ijk. In this case Γ^(D)_ijk and Γ^(D*)_ijk can be considered as Christoffel symbols of the first kind of torsion free covariant derivatives ∇^(D) and ∇^(D*). Moreover, (M, g^(D), ∇^(D), ∇^(D*)) is a torsion free dual geometry.
Theorem
If (M, g, ∇, ∇*) is a torsion free dual geometry then there exists a divergence D which induces the same duality.
From duality to divergence
Definition
Let (M, ∇) be an affine manifold, x ∈ M, and let ϑ and η be local coordinate systems on a neighbourhood of x. We call ϑ an affine coordinate system if for all 1 ≤ i, j ≤ dim M
∇_{∂_i} ∂_j = 0
holds, and we call ϑ and η dual coordinate systems if
g(x)(∂^(ϑ)_i, ∂^(η)_j) = δ_ij.
Theorem (From duality to divergence)
Assume that (M, g, ∇, ∇*) is a flat dual n-dimensional geometry. Then every point x ∈ M has a neighbourhood U ⊆ M with dual coordinate systems ϑ and η. Assume that U = M.
1 In this case there exists a function ψ : M → R such that for every 1 ≤ i ≤ n
∂^(ϑ)_i ψ = η_i.
2 For the function
φ : M → R, x ↦ φ(x) = ∑_{i=1}^n ϑ_i(x) η_i(x) − ψ(x)
we have
∂^(η)_i φ = ϑ_i, 1 ≤ i ≤ n.
Theorem (From duality to divergence cont.)
3 For all indices 1 ≤ i, j ≤ n
g(∂^(ϑ)_i, ∂^(ϑ)_j) = ∂^(ϑ)_i ∂^(ϑ)_j ψ,  g(∂^(η)_i, ∂^(η)_j) = ∂^(η)_i ∂^(η)_j φ.
4 The functions ψ and φ satisfy, for every x ∈ M, the extremal relations
φ(x) = max_{y∈M} ( ∑_{i=1}^n ϑ_i(y) η_i(x) − ψ(y) ),
ψ(x) = max_{y∈M} ( ∑_{i=1}^n ϑ_i(x) η_i(y) − φ(y) ).
Theorem (From duality to divergence cont.)
5 The functions φ and ψ are strictly convex functions of (η_1, . . . , η_n) and (ϑ_1, . . . , ϑ_n), respectively.
6 We have a canonical divergence D : M × M → R,
D^(g,∇)(p, q) = ψ(p) + φ(q) − ∑_{i=1}^n ϑ_i(p) η_i(q).
Example for duality
Example (Duality for discrete distributions)
The base space is X = {0, 1, . . . , n} and the parameter space is Ξ = {(p_1, . . . , p_n) ∈ (R+)^n | ∑_{k=1}^n p_k < 1}. The Fisher metric is g. The covariant derivatives ∇^(−1) and ∇^(1) are torsion free and (Ξ, g, ∇^(1), ∇^(−1)) is a flat dual geometry.
Let us define the following coordinate systems:
η : Ξ → R^n, p ↦ η(p) = (p_1, . . . , p_n),
ϑ : Ξ → R^n, p ↦ ϑ(p) = ( log(p_1/p_0), . . . , log(p_n/p_0) ),
where p_0 = 1 − ∑_{k=1}^n p_k.
Example (Duality for discrete distributions cont.)
The coordinate systems η and ϑ are affine for (Ξ, ∇^(−1)) and (Ξ, ∇^(1)).
(∇^(1) is called the exponential covariant derivative and ∇^(−1) the mixture covariant derivative.)
If we use the potential function
ψ : Ξ → R, p ↦ −log p_0
then we have
∂^(ϑ)_i ψ = η_i.
Example (Duality for discrete distributions cont.)
The function φ is the following:
φ(p) = ∑_{i=0}^n p_i log p_i = −S(p).
The canonical divergence of the flat dual geometry (Ξ, g, ∇^(1), ∇^(−1)) is
D^(g,∇)(p, q) = ψ(p) + φ(q) − ∑_{i=1}^n ϑ_i(p) η_i(q) = ∑_{i=0}^n q_i log(q_i/p_i) = D_KL(q, p).
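A numeric verification (my own) of this identity on a point pair of the simplex:

    import math

    def canonical_divergence(p, q):
        # p, q are full distributions (p_0, ..., p_n); psi, phi, theta, eta as above
        psi_p = -math.log(p[0])
        phi_q = sum(qi * math.log(qi) for qi in q)
        cross = sum(math.log(p[i] / p[0]) * q[i] for i in range(1, len(p)))
        return psi_p + phi_q - cross

    p = [0.5, 0.2, 0.3]
    q = [0.3, 0.3, 0.4]
    print(canonical_divergence(p, q),
          sum(qi * math.log(qi / pi) for qi, pi in zip(q, p)))  # equal: D_KL(q, p)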
Pythagorean theorem
Theorem
Assume that (M, g, ∇, ∇*) is a flat dual geometry, a, b, c ∈ M, γ_1 is a ∇-geodesic curve connecting a and b, and γ_2 is a ∇*-geodesic curve connecting b and c such that g(b)(γ̇_1(b), γ̇_2(b)) = 0. Then
D^(g,∇)(a, c) = D^(g,∇)(a, b) + D^(g,∇)(b, c).
Projection theorem
Theorem
Assume that (M, g, ∇, ∇*) is a flat dual geometry, N is a submanifold of M and x ∈ M \ N. The point y ∈ N is a critical point of the function
N → R, y ↦ D^(g,∇)(x, y)
iff the geodesic line between x and y is perpendicular to N.
Introduction to noncommutative information geometry
Quantum mechanical setting
In the quantum setting we use an n-dimensional Hilbert space.
A self-adjoint, positive semidefinite, trace one operator is called a state.
The set of states is called the state space.
The interior of the state space is denoted by M⁺_n.
The extremal points of the state space are the pure states.
A self-adjoint operator is called an observable.
The expected value of an observable A in a state D is Tr(DA).
Example (2-dimensional Hilbert space (qubit))
Every state D ∈ M_2 can be written uniquely in the form
D = (1/2) ( 1 + z   x + iy ; x − iy   1 − z ). (⋆⋆)
For states we have x² + y² + z² ≤ 1, and for parameters (x, y, z) ∈ R³ equation (⋆⋆) defines a state iff x² + y² + z² ≤ 1.
Therefore the state space of a two dimensional quantum system can be identified with the closed unit ball in R³.
(x, y, z) are called the Stokes parameters.
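A minimal sketch (my own) of this parameterization: build D from (x, y, z) and check the trace and positivity conditions.

    import numpy as np

    def qubit_state(x, y, z):
        return 0.5 * np.array([[1 + z, x + 1j * y],
                               [x - 1j * y, 1 - z]])

    D = qubit_state(0.3, 0.2, 0.5)
    print(np.trace(D).real)        # 1.0
    print(np.linalg.eigvalsh(D))   # nonnegative iff x^2 + y^2 + z^2 <= 1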
Entropy
The entropy of a state D can be defined as in the classical case:
S(D) = −Tr D log D,
called the von Neumann entropy. The entropy is a concave function.
Theorem
For all states D_1, D_2 ∈ M⁺_n and every parameter λ ∈ [0, 1]
λS(D_1) + (1 − λ)S(D_2) ≤ S(λD_1 + (1 − λ)D_2).
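A self-contained spot check (my own) of the entropy and its concavity:

    import numpy as np

    def entropy(D):
        ev = np.linalg.eigvalsh(D)
        ev = ev[ev > 1e-12]                  # convention 0 log 0 = 0
        return float(-np.sum(ev * np.log(ev)))

    D1, D2, lam = np.diag([0.9, 0.1]), np.diag([0.4, 0.6]), 0.3
    print(lam * entropy(D1) + (1 - lam) * entropy(D2)
          <= entropy(lam * D1 + (1 - lam) * D2))  # True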
Riemannian metric on state space
We will refer to M⁺_n as an open convex subset of R^k with its canonical coordinate system. At a given point D_0 ∈ M⁺_n we identify the tangent space with the set M_n of n × n self-adjoint trace zero operators. For a given smooth function f : M⁺_n → R at a state D_0 ∈ M⁺_n the effect of the tangent vector X ∈ M_n is
(Xf)(D_0) = d f(D_0 + tX)/dt |_{t=0}.
We denote by T_D M⁺_n the tangent space of M⁺_n at a point D ∈ M⁺_n.
We can define Riemannian metrics on M⁺_n; for example
K_D(X, Y) = Tr DXY, D ∈ M⁺_n, X, Y ∈ T_D M⁺_n
is a Riemannian metric.
Problems with the Fisher metric: how to generalise equations like the ones below?
g^(F)(ϑ)_ik = ∫_X p(x, ϑ) (∂_i log p(x, ϑ)) (∂_k log p(x, ϑ)) dx
g^(F)(ϑ)_ik = 4 ∫_X (∂_i √p(x, ϑ)) (∂_k √p(x, ϑ)) dx
There were earlier the concepts of the left and right logarithmic derivative:
dD_ϑ/dϑ = D_ϑ · L_{r,ϑ},  dD_ϑ/dϑ = L_{l,ϑ} · D_ϑ.
The second derivative of the entropy generates a Riemannian metric too.
The pull-back of the Euclidean metric by
M⁺_n → R^k, D ↦ √D
defines a Riemannian metric too.
Preparations for Petz theorem
Extending some classical concepts to the quantum setting
Let us denote by M_n the space of n × n matrices and by M_m(M_n) the m × m matrices whose elements are n × n matrices.
Definition
A linear map T : M_n → M_m is called positive if it maps every positive operator to a positive operator.
A linear map T : M_n → M_m is called completely positive if for every k ∈ N the operator
T^(k) : M_k(M_n) → M_k(M_m), [A_ij] ↦ T^(k)([A_ij]) = [T(A_ij)]
is positive.
A linear map T : M_n → M_m is called a stochastic map if it is completely positive and trace preserving.
Theorem
A linear map T : M_n → M_m is completely positive iff there exist operators V_i : C^n → C^m such that
T(A) = ∑_{i=1}^N V_i A V_i* for all A ∈ M_n.
The map T is trace preserving iff ∑_{i=1}^N V_i* V_i = I.
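A small sketch (my own example, not from the slides) of a stochastic map in this Kraus form: the qubit amplitude damping channel, with the trace preserving condition checked.

    import numpy as np

    g = 0.3  # damping parameter
    V1 = np.array([[1.0, 0.0], [0.0, np.sqrt(1 - g)]])
    V2 = np.array([[0.0, np.sqrt(g)], [0.0, 0.0]])
    kraus = [V1, V2]

    print(sum(V.conj().T @ V for V in kraus))  # identity: trace preserving

    def T(A):
        return sum(V @ A @ V.conj().T for V in kraus)

    D = np.array([[0.5, 0.5], [0.5, 0.5]])     # a pure qubit state
    print(np.trace(T(D)).real)                 # 1.0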
Definition
Consider a family of Riemannian manifolds (M⁺_n, K^(n))_{n∈N}. If for every n, m ∈ N, every stochastic map T : M_n → M_m, every state D ∈ M⁺_n and every tangent vector X ∈ M_n
K^(m)_{T(D)}(T(X), T(X)) ≤ K^(n)_D(X, X)
holds, then we call (M⁺_n, K^(n))_{n∈N} a family of monotone metrics.
Consider a function f : R → R and a self-adjoint matrix X. How to compute f(X):
– X can be diagonalized by some unitary matrix U, that is X = UDU*; then
f(X) := U f(D) U*.
– X can be written as X = ∑_{i=1}^n λ_i E_i, where (λ_i)_{i=1,...,n} are the eigenvalues and (E_i)_{i=1,...,n} are the corresponding projections; then
f(X) = ∑_{i=1}^n f(λ_i) E_i.
Definition
A function f : R → R is called operator monotone if for every n ∈ N and all self-adjoint matrices A, B ∈ M_n, A ≤ B implies f(A) ≤ f(B).
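A one-function sketch (my own) of the diagonalization recipe:

    import numpy as np

    def matrix_function(f, X):
        w, U = np.linalg.eigh(X)           # X = U diag(w) U*
        return (U * f(w)) @ U.conj().T     # U f(D) U*

    X = np.array([[2.0, 1.0], [1.0, 2.0]])
    S = matrix_function(np.sqrt, X)
    print(S @ S)                           # recovers X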
Denote by Lin(M_n) the set of linear maps A : M_n → M_n and consider the Hilbert–Schmidt scalar product
⟨·, ·⟩ : M_n × M_n → C, (A, B) ↦ Tr A*B.
For D ∈ M_n define the left and the right multiplication operators
L_{n,D}(A) = DA,  R_{n,D}(A) = AD.
If D ∈ M⁺_n then L_{n,D} and R_{n,D} are self-adjoint operators:
⟨L_{n,D}A, B⟩ = ⟨DA, B⟩ = Tr(DA)*B = Tr A*D*B = Tr A*DB = ⟨A, DB⟩ = ⟨A, L_{n,D}B⟩,
⟨R_{n,D}A, B⟩ = ⟨AD, B⟩ = Tr(AD)*B = Tr D*A*B = Tr A*BD = ⟨A, BD⟩ = ⟨A, R_{n,D}B⟩.
Means
Basic property of means
What is a mean? A function M : R+ × R+ → R+ is a mean if the properties below hold for all x, y, x_0, y_0, t ∈ R+. Writing M(x, y) = x f(y/x), each property of M corresponds to the property of f listed next to it:
M(x, x) = x  ⟷  f(1) = 1
M(x, y) = M(y, x)  ⟷  f(t) = t f(t⁻¹)
x < y ⇒ x < M(x, y) < y  ⟷  f(t) > 1 for t > 1, and f(t) < 1 for 0 < t < 1
x < x_0, y < y_0 ⇒ M(x, y) < M(x_0, y_0)  ⟷  f increasing
M(x, y) is continuous  ⟷  f continuous
M(tx, ty) = t M(x, y)  ⟷  M(x, y) = x f(y/x)
We have
means = { f ∈ C(R+, R+) | f increasing, f(1) = 1, ∀t ∈ R+ : f(t) = t f(t⁻¹) },
M(x, y) = x f(y/x).
arithmetic mean: f(t) = (1 + t)/2
geometric mean: f(t) = √t
logarithmic mean: f(t) = (t − 1)/log t
Means of matrices
Define means on the n × n positive definite matrices M⁺_n:
X ∈ M⁺_n ⟺ X = X* and ⟨v, Xv⟩ > 0 for all v ∈ C^n \ {0} (equivalently, every eigenvalue of X is positive).
We write X ≤ Y if Y − X ∈ M⁺_n.
M is a mean of matrices if for every X, Y ∈ M⁺_n:
– X ≤ X_0, Y ≤ Y_0 ⇒ M(X, Y) ≤ M(X_0, Y_0);
– if (X_n)_{n∈N} and (Y_n)_{n∈N} are decreasing sequences (X_{n+1} ≤ X_n, Y_{n+1} ≤ Y_n) in M⁺_n with limits X and Y, then M(X_n, Y_n) is decreasing and
lim_{n→∞} M(X_n, Y_n) = M(X, Y);
– T* M(X, Y) T ≤ M(T*XT, T*YT) for all T;
– M(X, X) = X.
Theorem (Kubo–Ando)
If M is a matrix mean, then there exists an operator monotone function f with the properties f(t) = t f(t⁻¹) and f(1) = 1 such that for every X, Y ∈ M⁺_n
M(X, Y) = X^{1/2} f(X^{-1/2} Y X^{-1/2}) X^{1/2}.
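A sketch (my own) of the Kubo–Ando formula with f(t) = √t, which yields the matrix geometric mean:

    import numpy as np

    def matrix_function(f, A):
        w, U = np.linalg.eigh(A)
        return (U * f(w)) @ U.conj().T

    def kubo_ando_mean(X, Y, f):
        Xh = matrix_function(np.sqrt, X)       # X^{1/2}
        Xmh = np.linalg.inv(Xh)                # X^{-1/2}
        return Xh @ matrix_function(f, Xmh @ Y @ Xmh) @ Xh

    X = np.diag([2.0, 8.0])
    Y = np.diag([8.0, 2.0])
    print(kubo_ando_mean(X, Y, np.sqrt))       # diag(4, 4) for these commuting X, Y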
Petz theorem
Looking for monotone metrics
Looking for monotone metrics:
monotonicity: g_{T(D)}(T(X), T(X)) ≤ g_D(X, X) for all D ∈ M⁺_n and all X ∈ T_D M⁺_n.
Write g_D(X, Y) = ⟨X, J_D^{-1}(Y)⟩ = Tr(X J_D^{-1}(Y)), where J_D : Mat(n, C) → Mat(n, C) is a linear map. Then
g_{T(D)}(T(X), T(X)) = ⟨T(X), J_{T(D)}^{-1}(T(X))⟩ = ⟨X, T* J_{T(D)}^{-1} T(X)⟩,
g_D(X, X) = ⟨X, J_D^{-1}(X)⟩,
so monotonicity means
T* J_{T(D)}^{-1} T ≤ J_D^{-1}, equivalently T J_D T* ≤ J_{T(D)}.
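As an illustration of the ansatz g_D(X, Y) = Tr(X J_D^{-1}(Y)), here is a sketch (my own) for the choice J_D = (L_D + R_D)/2, which is one known monotone metric (the SLD, or Bures, metric); in the eigenbasis of D the inverse acts entrywise as (J_D^{-1} X)_ij = 2 X_ij/(d_i + d_j).

    import numpy as np

    def sld_metric(D, X):
        d, U = np.linalg.eigh(D)
        Xt = U.conj().T @ X @ U                     # X in the eigenbasis of D
        Jinv = 2 * Xt / (d[:, None] + d[None, :])   # (J_D^{-1} X)_ij
        return float(np.trace(Xt @ Jinv).real)      # Tr(X J_D^{-1}(X))

    D = np.diag([0.7, 0.3])                         # a state
    X = np.array([[0.1, 0.2], [0.2, -0.1]])         # self-adjoint, trace zero
    print(sld_metric(D, X))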
Attila Andai Information Geometry
Information Geometry
Petz theorem
Looking for monotone metrics

What can $J_D(X)$ be?
"$D$ can act on the left, $\varphi_1(D)X$, and on the right, $X\varphi_2(D)$"; in general $\varphi_1(D)X\varphi_2(D)$ gives the idea:
$$J_D(X) = M(L_D, R_D)(X),$$
where $L_D(X) = DX$ and $R_D(X) = XD$.
We have $M(L_D,R_D) = M(R_D,L_D)$, and the monotonicity $T J_D T^* \le J_{T(D)}$ gives
$$T\, M(L_D,R_D)\, T^* \le M(T L_D T^*,\ T R_D T^*).$$
$M$ is a mean!

Attila Andai Information Geometry
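A step worth spelling out (my addition; it follows directly from the definitions): $L_D$ and $R_D$ commute, and in an eigenbasis of $D$ the matrix units $E_{ij}$ are joint eigenvectors, so $J_D$ and its inverse act entrywise:
$$L_D(E_{ij}) = \lambda_i E_{ij}, \qquad R_D(E_{ij}) = \lambda_j E_{ij},$$
$$J_D(E_{ij}) = M(\lambda_i,\lambda_j)\, E_{ij} = \lambda_j f\Big(\frac{\lambda_i}{\lambda_j}\Big) E_{ij} \quad (\text{for symmetric } f), \qquad J_D^{-1}(E_{ij}) = \frac{1}{\lambda_j f(\lambda_i/\lambda_j)}\, E_{ij}.$$
The function $(x,y) \mapsto 1/(y f(x/y))$ is exactly the Cencov–Morozova function introduced below.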
Information Geometry
Petz theorem
A variant of Petz theorem

Theorem (Petz)
Assume that for every $n \in \mathbb{N}$ the pair $(\mathcal{M}_n, g_n)$ is a Riemannian manifold. If for every stochastic map $T$ the monotonicity
$$g_{T(D)}(T(X),T(X)) \le g_D(X,X) \qquad \forall D \in \mathcal{M}_n,\ \forall X \in T_D\mathcal{M}_n$$
holds, then there exists an operator monotone function $f : \mathbb{R}^+ \to \mathbb{R}$ with the property $f(x) = x f(x^{-1})$, such that
$$g_D(X,Y) = \mathrm{Tr}\Big( X \big( R_{n,D}^{1/2}\, f(L_{n,D} R_{n,D}^{-1})\, R_{n,D}^{1/2} \big)^{-1}(Y) \Big).$$

Attila Andai Information Geometry
Information Geometry
Petz theorem
A variant of Petz theorem

Classical case:
$$\mathcal{P}_n = \Big\{ (p_1,\dots,p_n) \ \Big|\ 0 < p_i < 1,\ \sum_{i=1}^n p_i = 1 \Big\}.$$

Theorem (Cencov)
Assume that for every $n \in \mathbb{N}$, $(\mathcal{P}_n, g_n)$ is a Riemannian manifold. If for every transition probability $\kappa : X_n \times X_m \to \mathbb{R}$ the monotonicity
$$g_{\kappa(p)}(\kappa_*(X),\kappa_*(X)) \le g_p(X,X) \qquad \forall p \in \Delta_{n-1},\ \forall X \in T_p\Delta_{n-1}$$
holds, then the family of metrics $(g_n)_{n\in\mathbb{N}}$ is unique up to a positive factor: it is the Fisher information metric.

Attila Andai Information Geometry
Information Geometry
Petz theorem
A variant of Petz theorem

Quantum case:
$$\mathcal{M}_n = \big\{ D \in \mathrm{Mat}(n,\mathbb{C}) \ \big|\ D = D^*,\ D > 0,\ \mathrm{Tr}\, D = 1 \big\}.$$

Theorem (Petz)
Assume that for every $n \in \mathbb{N}$, $(\mathcal{M}_n, g_n)$ is a Riemannian manifold. If for every stochastic map $T : \mathcal{M}_n^+ \to \mathcal{M}_n^+$ the monotonicity
$$g_{T(D)}(T(X),T(X)) \le g_D(X,X) \qquad \forall D \in \mathcal{M}_n,\ \forall X \in T_D\mathcal{M}_n$$
holds, then the family of metrics $(g_n)_{n\in\mathbb{N}}$ is given by the equation
$$g_D(X,Y) = \mathrm{Tr}\Big( X \big( R_{n,D}^{1/2}\, f(L_{n,D} R_{n,D}^{-1})\, R_{n,D}^{1/2} \big)^{-1}(Y) \Big),$$
where $f : \mathbb{R}^+ \to \mathbb{R}^+$ is an operator monotone function such that $f(x) = x f(x^{-1})$ ($\forall x \in \mathbb{R}^+$).

Attila Andai Information Geometry
Information Geometry
Petz theorem
A variant of Petz theorem

Definition
Consider the Riemannian manifold $(\mathcal{M}^+_n, K^{(n)})$. The metric $K^{(n)}$ is called a monotone metric if there exists an operator monotone function $f : \mathbb{R}^+ \to \mathbb{R}$ such that $f(x) = x f(x^{-1})$ for every positive number $x$ and $K^{(n)}$ is generated by $f$:
$$g_D(X,Y) = \mathrm{Tr}\Big( X \big( R_{n,D}^{1/2}\, f(L_{n,D} R_{n,D}^{-1})\, R_{n,D}^{1/2} \big)^{-1}(Y) \Big).$$

Attila Andai Information Geometry
Information Geometry
Operator monotone functions
Properties of operator monotone functions

Properties of operator monotone functions

Definition
Assume that $f : \mathbb{R}^+_0 \to \mathbb{R}$ is an operator monotone function.
$f^{\backslash}(x) = x f(x^{-1})$ is called the transpose of $f$;
$f^{\perp}(x) = \frac{x}{f(x)}$ is called the dual of $f$.
$f$ is symmetric if $f = f^{\backslash}$;
$f$ is normalized if $f(1) = 1$.

Theorem
If $f : \mathbb{R}^+_0 \to \mathbb{R}$ is symmetric operator monotone, then its dual is symmetric and operator monotone too.

Attila Andai Information Geometry
Information Geometry
Operator monotone functions
Representation theorems for operator monotone functions

Representations of operator monotone functions
Denote by $\mathcal{F}_{\mathbb{R}^+_0}$ the set of operator monotone functions defined on $\mathbb{R}^+_0$ and by $\mathcal{F}^{(S,n)}_{\mathbb{R}^+_0}$ the symmetric normalized ones.
Denote by $\mathcal{G}_I$ the set of positive Radon measures on the interval $I \subseteq \mathbb{R}$.
A measure $\mu \in \mathcal{G}_I$ is said to be normalized if $\mu(I) = 1$.
Denote by $\mathcal{G}^{(n)}_I$ the set of normalized measures.

Attila Andai Information Geometry
Information Geometry
Operator monotone functions
Representation theorems for operator monotone functions

Theorem (Löwner)
There is a bijective correspondence
$$\varphi : \mathcal{G}_{\mathbb{R}^+_0} \to \mathcal{F}_{\mathbb{R}^+_0} \qquad \mu \mapsto f_\mu$$
$$f_\mu(x) = \int_0^\infty \frac{x(1+t)}{x+t}\, \mathrm{d}\mu(t).$$

Attila Andai Information Geometry
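A quick worked example (mine, not from the slides) to make the correspondence concrete — point measures already produce familiar functions:
$$\mu = \delta_0:\quad f_\mu(x) = \frac{x(1+0)}{x+0} = 1, \qquad \mu = \delta_1:\quad f_\mu(x) = \frac{x(1+1)}{x+1} = \frac{2x}{1+x},$$
the latter being the function $f_{LA}$ that appears among the examples later in the talk.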
Information Geometry
Operator monotone functions
Representation theorems for operator monotone functions

Theorem
There is a bijective correspondence
$$\varphi : \mathcal{G}_{[0,1]} \to \mathcal{F}_{\mathbb{R}^+_0} \qquad \mu \mapsto f_\mu$$
$$f_\mu(x) = \int_0^1 \frac{x}{(1-t)x + t}\, \mathrm{d}\mu(t).$$
The function $f_\mu$ is symmetric iff $\mu([0,s]) = \mu([1-s,1])$ holds for every $0 \le s \le 1$.

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Cencov-Morozova function

Cencov-Morozova function

Definition
The function $c : (\mathbb{R}^+_0)^2 \to \mathbb{R}$ is called a Cencov–Morozova function if there exists an $f \in \mathcal{F}_{\mathbb{R}^+_0}$ such that for every positive $x, y$
$$c(x,y) = \frac{1}{y\, f\big(\frac{x}{y}\big)}.$$

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Cencov-Morozova function

If $f : \mathbb{R}^+ \to \mathbb{R}^+$ is operator monotone then it is smooth; moreover, it can be extended analytically to a horizontal strip in the complex plane around the positive real axis. So if $f$ is operator monotone then for every $\rho \in \mathbb{R}^+$ we have
$$f(\rho) = \frac{1}{2\pi\mathrm{i}} \oint_\Gamma f(\xi)(\xi - \rho)^{-1}\, \mathrm{d}\xi$$
by the Cauchy integral formula, where $\Gamma$ is a smooth closed curve around $\rho$ with counter-clockwise orientation.

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Riesz–Dunford operator calculus

The Riesz–Dunford operator calculus states that this can be done for operators too. If $A$ is a self-adjoint operator then
$$f(A) = \frac{1}{2\pi\mathrm{i}} \oint_\Gamma f(\xi)(\xi\mathrm{I} - A)^{-1}\, \mathrm{d}\xi,$$
where the interior of $\Gamma$ contains all the eigenvalues of $A$.
[Figure: the contour $\Gamma$ in the complex plane, encircling the eigenvalues of $A$ on the real axis.]

Attila Andai Information Geometry
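The contour formula is also easy to test numerically. The sketch below (my own, not the talk's code) discretizes a circular contour around the spectrum with the trapezoidal rule and recovers $A^{1/2}$ for $f(x) = \sqrt{x}$; the contour is kept in the right half-plane so the principal square root is analytic inside it.

import numpy as np

def riesz_dunford(f, A, center, radius, m=400):
    # f(A) = (1 / 2 pi i) oint f(xi) (xi I - A)^{-1} d xi,
    # the circle discretized by the trapezoidal rule (spectrally accurate)
    n = A.shape[0]
    total = np.zeros((n, n), dtype=complex)
    for k in range(m):
        theta = 2.0 * np.pi * k / m
        xi = center + radius * np.exp(1j * theta)
        dxi = 1j * radius * np.exp(1j * theta) * (2.0 * np.pi / m)
        total += f(xi) * np.linalg.inv(xi * np.eye(n) - A) * dxi
    return total / (2.0j * np.pi)

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = B @ B.T + 4.0 * np.eye(4)                  # positive definite
w = np.linalg.eigvalsh(A)
center = (w.min() + w.max()) / 2.0
radius = (w.max() - w.min()) / 2.0 + 1.0       # contour encloses the spectrum
F = riesz_dunford(np.sqrt, A, center, radius).real
print(np.allclose(F @ F, A, atol=1e-8))        # F is indeed A^{1/2}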
Information Geometry
Computing monotone metrics
Petz theorem with Cencov-Morozova functions

We have seen that for a state $D \in \mathcal{M}^+_n$ the multiplications $L_{n,D}$ and $R_{n,D}$ are self-adjoint operators, so we have
$$f(L_{n,D}) = \frac{1}{2\pi\mathrm{i}} \oint_\Gamma f(\xi)(\xi\mathrm{I} - L_{n,D})^{-1}\, \mathrm{d}\xi \qquad f(R_{n,D}) = \frac{1}{2\pi\mathrm{i}} \oint_\Gamma f(\xi)(\xi\mathrm{I} - R_{n,D})^{-1}\, \mathrm{d}\xi.$$
This leads us to
$$f(L_{n,D})(X) = \frac{1}{2\pi\mathrm{i}} \oint_\Gamma f(\xi)(\xi\mathrm{I} - D)^{-1} X\, \mathrm{d}\xi \qquad f(R_{n,D})(X) = \frac{1}{2\pi\mathrm{i}} \oint_\Gamma f(\xi)\, X(\xi\mathrm{I} - D)^{-1}\, \mathrm{d}\xi.$$

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Petz theorem with Cencov-Morozova functions

These expressions can be extended to the multivariate case, such as
$$c(L_{n,D}, R_{n,D}) = \frac{1}{(2\pi\mathrm{i})^2} \oint\!\!\oint c(\xi,\eta)\,(\xi\mathrm{I} - L_{n,D})^{-1}(\eta\mathrm{I} - R_{n,D})^{-1}\, \mathrm{d}\xi\, \mathrm{d}\eta,$$
whose effect can be computed as
$$c(L_{n,D}, R_{n,D})(X) = \frac{1}{(2\pi\mathrm{i})^2} \oint\!\!\oint c(\xi,\eta)\,(\xi\mathrm{I} - D)^{-1} X (\eta\mathrm{I} - D)^{-1}\, \mathrm{d}\xi\, \mathrm{d}\eta.$$

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Petz theorem with Cencov-Morozova functions

Theorem
If $K^{(n)}$ is a monotone metric on $\mathcal{M}^+_n$ generated by an operator monotone function $f$, then for every state $D \in \mathcal{M}^+_n$ and tangent vectors $X, Y \in T_D\mathcal{M}^+_n$ we have
$$K^{(n)}_D(X,Y) = \mathrm{Tr}\, \frac{1}{(2\pi\mathrm{i})^2} \oint\!\!\oint c(\xi,\eta)\, X(\xi\mathrm{I} - D)^{-1}\, Y(\eta\mathrm{I} - D)^{-1}\, \mathrm{d}\xi\, \mathrm{d}\eta.$$

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Examples for monotone metrics

Examples for functions in $\mathcal{F}^{(S,n)}_{\mathbb{R}^+_0}$:
$$f_{SM}(x) = \frac{1+x}{2} \qquad f_{LA}(x) = \frac{2x}{1+x} \qquad f_{KM}(x) = \frac{x-1}{\log x}$$
$$f_{P1}(x) = \frac{2x^{\alpha+1/2}}{1+x^{2\alpha}} \quad 0 \le \alpha \le 1/2$$
$$f_{P2}(x) = \frac{\beta(1-\beta)(x-1)^2}{(x^{\beta}-1)(x^{1-\beta}-1)} \quad \beta \in [-1,2]\setminus\{0,1\}$$
$$f_{WYD}(x) = \frac{1-\alpha^2}{4}\cdot\frac{(x-1)^2}{\big(1-x^{\frac{1-\alpha}{2}}\big)\big(1-x^{\frac{1+\alpha}{2}}\big)} \quad \alpha \in [-3,3]\setminus\{-1,1\}$$
$$f_{WY}(x) = \frac{1}{4}\big(\sqrt{x}+1\big)^2 \qquad f_{P3}(x) = \Big(\frac{1+x^{1/\nu}}{2}\Big)^{\nu} \quad \nu \in [1,2]$$

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Explicit expression for operator monotone metrics

Consider the matrix units $E_{ij}$ ($(E_{ij})_{ab} = \delta_{ia}\delta_{jb}$) and the matrices $F_{ij} = E_{ij} + E_{ji}$ and $H_{ij} = \mathrm{i}E_{ij} - \mathrm{i}E_{ji}$. (These form a basis of the tangent space.)

Theorem
If the monotone metric $K^{(n),f}$ on $\mathcal{M}^+_n$ is generated by $f$, then at a state of the form $D = \sum_{k=1}^n \lambda_k E_{kk}$ we have:
for $1 \le i < j \le n$, $1 \le k < l \le n$:
$$G_D(H_{ij},H_{kl}) = \delta_{ik}\delta_{jl}\,2c(\lambda_i,\lambda_j), \qquad G_D(F_{ij},F_{kl}) = \delta_{ik}\delta_{jl}\,2c(\lambda_i,\lambda_j), \qquad G_D(H_{ij},F_{kl}) = 0;$$
for $1 \le i < j \le n$, $1 \le k \le n$: $G_D(H_{ij},F_{kk}) = G_D(F_{ij},F_{kk}) = 0$;
for $1 \le i \le n$, $1 \le k \le n$: $G_D(F_{ii},F_{kk}) = \delta_{ik}\,4c(\lambda_i,\lambda_i)$.

Attila Andai Information Geometry
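Combining this theorem with the entrywise action of $J_D^{-1}$ noted earlier gives a very compact recipe for computing monotone metrics. A numerical sketch (mine, not from the talk), checked against the stated value $2c(\lambda_i,\lambda_j)$:

import numpy as np

def monotone_metric(D, X, Y, f):
    # K_D(X, Y) = sum_{ij} conj(X_ij) c(lam_i, lam_j) Y_ij in an eigenbasis
    # of D, with Cencov-Morozova function c(x, y) = 1 / (y f(x/y))
    lam, U = np.linalg.eigh(D)
    Xb = U.conj().T @ X @ U
    Yb = U.conj().T @ Y @ U
    n = len(lam)
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            C[i, j] = 1.0 / (lam[j] * f(lam[i] / lam[j]))
    return np.sum(Xb.conj() * C * Yb).real

f_sm = lambda x: (1.0 + x) / 2.0           # the smallest (Bures) metric
lam = np.array([0.5, 0.3, 0.2])
D = np.diag(lam)
H12 = np.array([[0, 1j, 0], [-1j, 0, 0], [0, 0, 0]], dtype=complex)  # i E_12 - i E_21
c = lambda x, y: 1.0 / (y * f_sm(x / y))
print(monotone_metric(D, H12, H12, f_sm), 2.0 * c(lam[0], lam[1]))   # equal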
Information Geometry
Computing monotone metrics
Examples

Example (Smallest metric)
The metric $K^{(n)}_{SM}$ generated by the function $f_{SM}(x) = \frac{1+x}{2}$ is called the smallest metric, since $f_{SM}$ is maximal among functions in $\mathcal{F}^{(S,n)}_{\mathbb{R}^+_0}$ with respect to the pointwise order
$$f \le_{[0,1]} g \iff f(x) \le g(x) \quad \forall x \in [0,1].$$
The corresponding Cencov–Morozova function is
$$c_{SM}(x,y) = \frac{2}{x+y}.$$

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Examples

Example (Smallest metric cont.)
The inner product of vectors $X, Y$ can be written in the form
$$K^{(n)}_{SM,D}(X,Y) = \mathrm{Tr}\, XZ,$$
where $Z$ is the solution of the equation
$$DZ + ZD = 2Y.$$
The geodesic distance between states $D_1$ and $D_2$ according to this metric is
$$d_{SM}(D_1,D_2) = \sqrt{2\Big(1 - \mathrm{Tr}\big(D_1^{1/2} D_2 D_1^{1/2}\big)^{1/2}\Big)}.$$
(1992, Uhlmann, studying the Berry phase.)

Attila Andai Information Geometry
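Both ingredients are one-liners in scipy: scipy.linalg.solve_sylvester solves $AX + XB = Q$, which with $A = B = D$ is exactly the equation for $Z$, and the distance uses the fidelity $\mathrm{Tr}\,(D_1^{1/2} D_2 D_1^{1/2})^{1/2}$. A minimal sketch (my own, with made-up names):

import numpy as np
from scipy.linalg import sqrtm, solve_sylvester

def k_sm(D, X, Y):
    # smallest-metric inner product: Tr X Z, where D Z + Z D = 2 Y
    Z = solve_sylvester(D, D, 2.0 * Y)
    return np.trace(X @ Z).real

def d_sm(D1, D2):
    # geodesic (Bures) distance via the fidelity
    s = sqrtm(D1)
    fid = np.trace(sqrtm(s @ D2 @ s)).real
    return np.sqrt(2.0 * (1.0 - fid))

D1 = np.diag([0.7, 0.3])
D2 = np.array([[0.6, 0.05], [0.05, 0.4]])
Y = np.array([[0.1, 0.02], [0.02, -0.1]])
print(k_sm(D1, Y, Y))          # squared length of the tangent vector Y
print(d_sm(D1, D2))            # small positive number for nearby states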
Information Geometry
Computing monotone metrics
Examples

Example (Largest metric)
The metric $K^{(n)}_{LA}$ generated by the function $f_{LA}(x) = \frac{2x}{1+x}$ is called the largest metric, since $f_{LA}$ is minimal among functions in $\mathcal{F}^{(S,n)}_{\mathbb{R}^+_0}$.
In this case the metric can be written in a simple form:
$$K^{(n)}_{LA,D}(X,Y) = \mathrm{Tr}\, X D^{-1} Y.$$

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Examples

Example (Kubo-Mori metric)
The metric generated by the function $f_{KM}(x) = \frac{x-1}{\log x}$.
Its Cencov–Morozova function is
$$c_{KM}(x,y) = \frac{\log x - \log y}{x - y}.$$
Using the integral representation
$$c_{KM}(x,y) = \int_0^\infty (t+x)^{-1}(t+y)^{-1}\, \mathrm{d}t$$
we have for the metric
$$K^{(n)}_{KM,D}(X,Y) = \mathrm{Tr} \int_0^\infty X(t+D)^{-1}\, Y(t+D)^{-1}\, \mathrm{d}t.$$
(Linear response theory: Fick, Sailer.)

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
A simple consequence of ordering

A simple consequence of ordering

Theorem
For every $f \in \mathcal{F}^{(S,n)}_{\mathbb{R}^+_0}$ we have
$$f_{SM} \ \ge_{[0,1]}\ f \ \ge_{[0,1]}\ f_{LA}.$$

Theorem
Assume that $f \in \mathcal{F}^{(S,n)}_{\mathbb{R}^+_0}$. For every state $D \in \mathcal{M}^+_n$ and tangent vector $X \in T_D\mathcal{M}^+_n$ we have
$$K^{(n)}_{LA,D}(X,X) \ \ge\ K^{(n),f}_D(X,X) \ \ge\ K^{(n)}_{SM,D}(X,X).$$

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
A simple consequence of ordering

We have a continuous path in $\mathcal{F}^{(S,n)}_{\mathbb{R}^+_0}$ from the smallest to the largest.
$$f_{SM} = f_{P3}^{(\nu=1)} \ \ge_{[0,1]}\ f_{P3}^{(1\le\nu\le 2)} \ \ge_{[0,1]}\ f_{P3}^{(\nu=2)} = f_{WY}$$
$$f_{WY} = f_{WYD}^{(\alpha=0)} \ \ge_{[0,1]}\ f_{WYD}^{(0\le\alpha\le 3)} \ \ge_{[0,1]}\ f_{WYD}^{(\alpha=3)} = f_{LA}$$

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Monotone metric from entropy

Monotone metric from entropy
Consider the integral representation of the log function
$$\log x = \int_0^\infty (1+t)^{-1} - (x+t)^{-1}\, \mathrm{d}t.$$
For the entropy $S(D) = -\mathrm{Tr}\, D \log D$ we get
$$S(D) = \mathrm{Tr}\, D \int_0^\infty (D+t)^{-1} - (\mathrm{I}+t)^{-1}\, \mathrm{d}t.$$
The first derivative of the entropy is $\mathrm{d}S(D)(A) = -\mathrm{Tr}\, A \log D$.

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Monotone metric from entropy

The second derivative is
$$\mathrm{d}^2 S : \mathcal{M}^+_n \to \mathrm{Lin}\big(T\mathcal{M}_n, \mathrm{Lin}(T\mathcal{M}_n,\mathbb{R})\big)$$
$$\mathrm{d}^2 S(D)(A)(B) = -\mathrm{Tr}\int_0^\infty (D+t)^{-1} A\, (D+t)^{-1} B\, \mathrm{d}t,$$
which is $(-1)$ times the Kubo-Mori metric.

Attila Andai Information Geometry
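This relation is easy to confirm numerically: a second-order finite difference of the entropy along a traceless self-adjoint direction should match minus the Kubo-Mori metric computed from $c_{KM}$. A sketch (mine):

import numpy as np
from scipy.linalg import logm

def entropy(D):
    # von Neumann entropy S(D) = -Tr D log D
    return -np.trace(D @ logm(D)).real

def km_metric(D, A, B):
    # Kubo-Mori metric via c_KM(x, y) = (log x - log y) / (x - y)
    lam, U = np.linalg.eigh(D)
    Ab = U.conj().T @ A @ U
    Bb = U.conj().T @ B @ U
    n = len(lam)
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            if np.isclose(lam[i], lam[j]):
                C[i, j] = 1.0 / lam[i]                      # limit on the diagonal
            else:
                C[i, j] = (np.log(lam[i]) - np.log(lam[j])) / (lam[i] - lam[j])
    return np.sum(Ab.conj() * C * Bb).real

D = np.diag([0.5, 0.3, 0.2])
A = np.array([[0.02, 0.01, 0.0], [0.01, -0.02, 0.0], [0.0, 0.0, 0.0]])
h = 1e-4
hess = (entropy(D + h * A) - 2.0 * entropy(D) + entropy(D - h * A)) / h**2
print(hess, -km_metric(D, A, A))    # approximately equal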
Information Geometry
Computing monotone metrics
Monotone metric from euclidean metric

Monotone metric from euclidean metric
For the complex state space $\mathcal{M}^+_n$ denote by $S^{n^2-1}_1$ the unit sphere in the euclidean space $\mathbb{R}^{n\times n}$ and consider the map
$$\varphi : \mathcal{M}^+_n \to S^{n^2-1}_1 \qquad D \mapsto \sqrt{D}.$$
Using the derivative of $\varphi$,
$$(\mathrm{d}_D\varphi)(A) = \big(L_D^{1/2} + R_D^{1/2}\big)^{-1}(A),$$
we can deduce that the pull-back metric in this case is
$$(\varphi^* g)(A,B) = \big\langle (\mathrm{d}_D\varphi)(A), (\mathrm{d}_D\varphi)(B)\big\rangle = \mathrm{Tr}\, A\big(L_D^{1/2} + R_D^{1/2}\big)^{-2}(B) = \frac{1}{4}\,\mathrm{Tr}\, A\, c_{WY}(L_D,R_D)(B).$$

Attila Andai Information Geometry
Information Geometry
Computing monotone metrics
Monotone metric from euclidean metric

So in this case it is easy to compute the geodesic distance between the states $D_1$ and $D_2$:
$$d_{WY}(D_1,D_2) = 2\arccos \mathrm{Tr}\, D_1^{1/2} D_2^{1/2}.$$

Attila Andai Information Geometry
Information Geometry
Relative entropy
Relative entropy from operator convex functions

First relative entropy
The first version of relative entropy in the quantum setting was given by Umegaki in 1962. He defined the relative entropy of states $D_1, D_2 \in \mathcal{M}^+_n$ as
$$S(D_1,D_2) = \mathrm{Tr}\, D_1(\log D_1 - \log D_2).$$
This relative entropy is called the Umegaki relative entropy.

Attila Andai Information Geometry
Information Geometry
Relative entropy
Relative entropy from operator convex functions

Relative entropy from operator convex functions

Definition
A continuous function $f : \mathbb{R} \to \mathbb{R}$ is called operator convex if for every $n \in \mathbb{N}$, all $n\times n$ self-adjoint operators $A, B$ and every parameter $\lambda \in [0,1]$
$$f(\lambda A + (1-\lambda)B) \le \lambda f(A) + (1-\lambda) f(B)$$
holds.
The set of operator convex functions $g$ with the property $g(1) = 0$ defined on the interval $I \subseteq \mathbb{R}$ is denoted by $\mathcal{K}_I$.

Attila Andai Information Geometry
Information Geometry
Relative entropy
Relative entropy from operator convex functions

Representation theorem for operator convex functions

Theorem
If $g : \mathbb{R}^+ \to \mathbb{R}$ is an operator convex function then there exist parameters $a \in \mathbb{R}$, $b, c \in \mathbb{R}^+_0$ and a positive finite measure $\mu_g$ on the interval $\mathbb{R}^+_0$ such that
$$g(x) = a(x-1) + b(x-1)^2 + c\,\frac{(x-1)^2}{x} + \int_0^\infty (x-1)^2\,\frac{1+t}{x+t}\, \mathrm{d}\mu_g(t).$$
Conversely, for every parameter $a \in \mathbb{R}$, $b, c \in \mathbb{R}^+_0$ and finite measure $\mu$ the equation above defines an operator convex function.

Attila Andai Information Geometry
Information Geometry
Relative entropy
Relative entropy from operator convex functions

Definition (Petz)
If $g \in \mathcal{K}_{\mathbb{R}^+}$ then the function $H_g(\cdot,\cdot) : \mathcal{M}^+_n \times \mathcal{M}^+_n \to \mathbb{R}$,
$$H_g(D_1,D_2) = \mathrm{Tr}\Big( D_1^{1/2}\, g\big(L_{D_2} R_{D_1}^{-1}\big)\big(D_1^{1/2}\big)\Big),$$
is called the $g$-relative entropy.

Attila Andai Information Geometry
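In eigenbases $D_1 = \sum_i a_i |\varphi_i\rangle\langle\varphi_i|$, $D_2 = \sum_j b_j |\psi_j\rangle\langle\psi_j|$, the operator $L_{D_2}R_{D_1}^{-1}$ acts on $|\psi_j\rangle\langle\varphi_i|$ with eigenvalue $b_j/a_i$, which gives $H_g(D_1,D_2) = \sum_{ij} a_i\, |\langle\varphi_i|\psi_j\rangle|^2\, g(b_j/a_i)$ — a short derivation of my own, offered as a computational sketch below and sanity-checked against the Umegaki case $g(x) = -\log x$.

import numpy as np
from scipy.linalg import logm

def h_g(D1, D2, g):
    # H_g(D1, D2) = sum_{ij} a_i |<phi_i|psi_j>|^2 g(b_j / a_i)
    a, U = np.linalg.eigh(D1)
    b, V = np.linalg.eigh(D2)
    W = np.abs(U.conj().T @ V) ** 2        # overlap matrix |<phi_i|psi_j>|^2
    return float(np.sum(a[:, None] * W * g(b[None, :] / a[:, None])))

# Sanity check: g(x) = -log x reproduces the Umegaki relative entropy.
D1 = np.diag([0.6, 0.4])
D2 = np.array([[0.5, 0.1], [0.1, 0.5]])
print(h_g(D1, D2, lambda x: -np.log(x)))
print(np.trace(D1 @ (logm(D1) - logm(D2))).real)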
Information Geometry
Relative entropy
Properties of the relative entropy

Theorem (Properties of g-relative entropy)
Assume that $H$ is a $g$-relative entropy.
1. For every state $D_1, D_2$: $H(D_1,D_2) \ge 0$, and $H(D_1,D_2) = 0$ iff $D_1 = D_2$.
2. $H$ is jointly convex, that is, for every state $D_1, D_2, D_3, D_4$ and parameter $\lambda \in [0,1]$ we have
$$H(\lambda D_1 + (1-\lambda)D_2,\ \lambda D_3 + (1-\lambda)D_4) \le \lambda H(D_1,D_3) + (1-\lambda)H(D_2,D_4).$$

Attila Andai Information Geometry
Information Geometry
Relative entropy
Properties of the relative entropy

Theorem (Properties of g-relative entropy cont.)
3. $H$ is monotone: for every stochastic map $T : \mathcal{M}^+_n \to \mathcal{M}^+_n$
$$H(T(D_1),T(D_2)) \le H(D_1,D_2) \qquad \forall D_1, D_2 \in \mathcal{M}^+_n.$$
4. $H$ is differentiable: for every state $D_1, D_2 \in \mathcal{M}^+_n$ and tangent vectors $A \in T_{D_1}\mathcal{M}^+_n$, $B \in T_{D_2}\mathcal{M}^+_n$ the map $\mathbb{R}\times\mathbb{R} \to \mathbb{R}$, $(x,y) \mapsto H(D_1 + xA,\ D_2 + yB)$, is differentiable at the origin.

Attila Andai Information Geometry
Information Geometry
Relative entropy
Properties of the relative entropy

The quantity $H_g(D_1,D_2)$ depends mainly on $D_1 - D_2$.

Theorem
If $g \in \mathcal{K}_{\mathbb{R}^+}$ then for every state $D_1, D_2 \in \mathcal{M}^+_n$
$$H_g(D_1,D_2) = \mathrm{Tr}\Big( (D_1-D_2)\, R_{D_1}^{-1}\Big( \hat g\big(L_{D_2}R_{D_1}^{-1}\big)(D_1-D_2) \Big)\Big), \qquad \hat g(x) = \frac{g(x)}{(x-1)^2}.$$

Attila Andai Information Geometry
Information Geometry
Relative entropy
Properties of the relative entropy

For an operator convex function $g$ define its transpose as $g^{\backslash}(x) = x g(x^{-1})$ and its dual as $g^{\perp}(x) = \frac{x}{g(x)}$.
$g$ is said to be symmetric if $g^{\backslash} = g$.
$g$ is said to be normalised if $g''(1) = 1$.
The effect of the transpose is interchanging the arguments:
$$H_g(D_1,D_2) = H_{g^{\backslash}}(D_2,D_1).$$

Attila Andai Information Geometry
Information Geometry
Relative entropy
Riemannian metric from relative entropy

Theorem (Riemannian metric from relative entropy)
Assume that $g \in \mathcal{K}_{\mathbb{R}^+_0}$. Then
$$K^{g,(n)} : \mathcal{M}^+_n \to \mathrm{Lin}(T\mathcal{M}_n \times T\mathcal{M}_n, \mathbb{R})$$
$$K^{g,(n)}_D(X,Y) = -\frac{\partial^2}{\partial s\,\partial t}\, H_g(D + tX,\ D + sY)\Big|_{t=s=0}$$
is a Riemannian metric on $\mathcal{M}^+_n$.
Define an equivalence relation on $\mathcal{K}_{\mathbb{R}^+}$ as
$$f \sim g \iff f + f^{\backslash} = g + g^{\backslash}.$$

Theorem
The functions $g_1, g_2 \in \mathcal{K}_{\mathbb{R}^+}$ generate the same metric iff $g_1 \sim g_2$.

Attila Andai Information Geometry
Information Geometry
Relative entropy
Relative entropy and monotone metrics

Relative entropy and monotone metrics

Theorem
The map $\varphi : \mathcal{K}^S_{\mathbb{R}^+} \to \mathcal{F}^S_{\mathbb{R}^+_0}$,
$$\varphi(g)(x) = \begin{cases} \dfrac{(x-1)^2}{g(x) + x g(x^{-1})} & \text{if } x > 0,\ x \ne 1,\\[1ex] \dfrac{1}{g''(1)} & \text{if } x = 1,\\[1ex] \dfrac{1}{\lim_{x\to 0}\big(g(x) + x g(x^{-1})\big)} & \text{if } x = 0, \end{cases}$$
is well-defined and
$$K^{g,(n)}_D(X,Y) = K^{(n),\varphi(g)}_D(X,Y) \qquad \forall D \in \mathcal{M}^+_n,\ \forall X,Y \in T\mathcal{M}_n.$$

Attila Andai Information Geometry
Information Geometry
Relative entropy
Relative entropy and monotone metrics

Relative entropy and monotone metrics

Theorem
The map $\varepsilon : \mathcal{F}^{(S)}_{\mathbb{R}^+_0} \to \mathcal{K}^{(S)}_{\mathbb{R}^+}$,
$$f(x) \mapsto \varepsilon(f)(x) = \frac{(x-1)^2}{2f(x)},$$
is well-defined and $K^{(n),f} = K^{\varepsilon(f),(n)}$ holds.

Attila Andai Information Geometry
Information Geometry
Relative entropy
Relative entropy and monotone metrics

Relative entropy and monotone metrics
Combining these we have the following theorem.

Theorem
There is a simple bijective correspondence between
1. the set of monotone metrics,
2. $\mathcal{F}^{(S)}_{\mathbb{R}^+_0}$,
3. $\mathcal{K}^{(S)}_{\mathbb{R}^+}$.

Attila Andai Information Geometry
Information Geometry
Relative entropy
Examples

Example (Smallest metric)
The corresponding operator monotone function is $f(x) = \frac{1+x}{2}$ and the generated operator convex function is
$$g(x) = \frac{(x-1)^2}{1+x},$$
and the relative entropy is
$$H_{SM} : \mathcal{M}^+_n \times \mathcal{M}^+_n \to \mathbb{R} \qquad (D_1,D_2) \mapsto H_{SM}(D_1,D_2)$$
$$H_{SM}(D_1,D_2) = \mathrm{Tr}\,(D_1 - D_2)\big(L_{D_2} + R_{D_1}\big)^{-1}(D_1 - D_2).$$
Bures relative entropy.

Attila Andai Information Geometry
Information Geometry
Relative entropy
Examples

Example (Largest metric)
The corresponding operator monotone function is $f(x) = \frac{2x}{1+x}$ and the generated operator convex function is
$$g(x) = (x-1)^2\,\frac{1+x}{4x},$$
and the relative entropy (written with an equivalent representative $g_1 \sim g$) is
$$H_{g_1}(D_1,D_2) = \frac{1}{2}\,\mathrm{Tr}\,(D_1 - D_2)\, D_1^{-1}\, (D_1 - D_2).$$
Quadratic relative entropy.

Attila Andai Information Geometry
Information Geometry
Relative entropy
Examples

Example (Kubo-Mori metric)
The corresponding operator monotone function is $f(x) = \frac{x-1}{\log x}$ and the generated operator convex function is
$$g(x) = \frac{(x-1)\log x}{2},$$
and the generated relative entropy (again via an equivalent representative $g_1 \sim g$) is
$$H_{g_1}(D_1,D_2) = \mathrm{Tr}\, D_1\big(\log D_1 - \log D_2\big).$$
Umegaki relative entropy.

Attila Andai Information Geometry
Information Geometry
Duality
Basic definitions

Assume that $f \in \mathcal{F}^{(n)}_{\mathbb{R}^+_0}$ and $h \in \mathcal{K}^{(n)}_{\mathbb{R}^+}$. We say that $h$ is compatible with $f$ if $h \sim g$ holds for the function
$$g(x) = \frac{(x-1)^2}{2f(x)}.$$
For a monotone metric $K^{(n),f}$ and a compatible function $h$ we define a covariant derivative $\nabla^{f,h} : T\mathcal{M}_n \times T\mathcal{M}_n \to T\mathcal{M}_n$ as
$$K^{(n),f}_D\big(\nabla^{f,h}_X Y,\ Z\big) = -\frac{\partial^3}{\partial s\,\partial t\,\partial u}\, H_h(D + sX + tY,\ D + uZ)\Big|_{s,t,u=0},$$
where $X, Y, Z \in T_D\mathcal{M}^+_n$.
(Gibilisco, Isola, Uhlmann, Dabrowski, Jadczyk, Hübner)

Attila Andai Information Geometry
Information Geometry
Duality
Main theorem of duality

Main theorem of duality

Theorem
For a function $f \in \mathcal{F}^{(S,n)}_{\mathbb{R}^+_0}$ and a compatible function $h \in \mathcal{K}^{(n)}_{\mathbb{R}^+}$ the quadruplet $(\mathcal{M}^+_n, K^{(n),f}, \nabla^{f,h}, \nabla^{f,h^{\backslash}})$ is a torsion-free dual geometry.

Attila Andai Information Geometry
Information Geometry
Duality
A characterization of the Kubo-Mori metric

A characterization of the Kubo-Mori metric

Theorem
If $(\mathcal{M}^+_n, g, \nabla^{(1)}, \nabla^{(-1)})$ is a dual geometry for some Riemannian metric $g$, then $g$ equals the Kubo-Mori metric $g^{(KM)}$ up to a positive multiplicative factor.

Attila Andai Information Geometry
Information Geometry
Duality
Pythagorean theorem

Pythagorean theorem

Theorem
Consider states $D_1, D_2, D_3 \in \mathcal{M}^+_n$, a $\nabla^{(1)}$-geodesic curve $\gamma_1$ connecting $D_1$ and $D_2$, and a $\nabla^{(-1)}$-geodesic curve $\gamma_2$ connecting $D_2$ and $D_3$. If
$$K^{(n)}_{KM,D_2}\big(\dot\gamma_1(D_2),\ \dot\gamma_2(D_2)\big) = 0$$
holds then
$$H_{\log}(D_1,D_3) = H_{\log}(D_1,D_2) + H_{\log}(D_2,D_3).$$

Attila Andai Information Geometry
Information Geometry
Duality
Pythagorean theorem

[Figure: the geodesics $\gamma_1$ and $\gamma_2$ meeting orthogonally at $D_2$, illustrating the Pythagorean relation.]

Attila Andai Information Geometry
Information Geometry
About volume of the state space
Hilbert-Schmidt measure

Hilbert-Schmidt measure
The Hilbert-Schmidt measure on $\mathcal{M}^+_n$ is defined by the Euclidean metric
$$d(D_1,D_2) = \sqrt{\mathrm{Tr}(D_1-D_2)^2}.$$
We can consider $\mathcal{M}^+_n$ as a manifold with the metric
$$g_D(X,Y) = \mathrm{Tr}(XY) \qquad D \in \mathcal{M}^+_n,\ X,Y \in T_D\mathcal{M}^+_n.$$
It induces the flat, Euclidean geometry on the set of states.

Attila Andai Information Geometry
Information Geometry
About volume of the state space
Hilbert-Schmidt measure

The invariant volume measure is
$$\rho(D) = \sqrt{\det g_D} = 1.$$
(This is the simplest prior on $\mathcal{M}^+_n$.) The volume of the state space is
$$\mathrm{Volume} = \int_{\mathcal{M}^+_n} 1\ \mathrm{d}D,$$
where $\mathrm{d}D = \mathrm{d}a_{11}\, \mathrm{d}a_{12} \dots \mathrm{d}a_{22}\, \mathrm{d}a_{23} \dots \mathrm{d}a_{n-1,n}$.

Attila Andai Information Geometry
Information Geometry
About volume of the state space
A decomposition of the state space

Some notation (shown for $n = 4$; on the slides the leading principal blocks $A_1$, $A_2$, $A_3$ and the column vectors $x_1$, $x_2$, $x_3$ beside them are highlighted step by step):
$$A_4 = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14}\\ a_{12}^* & a_{22} & a_{23} & a_{24}\\ a_{13}^* & a_{23}^* & a_{33} & a_{34}\\ a_{14}^* & a_{24}^* & a_{34}^* & a_{44} \end{pmatrix}$$
$$T_n := \det(A_n)\times(A_n)^{-1}, \qquad \det T_n = (\det A_n)^{n-1}.$$
Lemma: $\det A_n = a_{nn}\det(A_{n-1}) - \big\langle x_{n-1},\, T_{n-1}\, x_{n-1}\big\rangle$.

Attila Andai Information Geometry
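The lemma is the Schur-complement identity in disguise and is trivial to verify numerically; a sketch of my own:

import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B @ B.conj().T + np.eye(4)            # Hermitian positive definite

A3 = A[:3, :3]                            # leading principal block A_{n-1}
x = A[:3, 3]                              # the column x_{n-1} above a_nn
T3 = np.linalg.det(A3) * np.linalg.inv(A3)
lhs = np.linalg.det(A).real
rhs = (A[3, 3] * np.linalg.det(A3) - x.conj() @ T3 @ x).real
print(np.isclose(lhs, rhs))               # True: the lemma holds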
Information Geometry
About volume of the state space
A decomposition of the state space

Decomposition of the state space, 3×3 real case:
[Figure series: the admissible diagonal elements form a simplex; over a fixed diagonal, the admissible off-diagonal entries $a_{12}$ and then $(a_{13}, a_{23})$ form bounded fibres determined by positivity. A final panel shows the analogous 4×4 real case with the fibre $(a_{14}, a_{24}, a_{34})$.]

Attila Andai Information Geometry
Information Geometry
About volume of the state space
The volume

Theorem
For every $n \in \mathbb{N}$ the volume of the state space $\mathcal{M}^+_n$ is
$$V(\mathcal{M}^+_n) = \frac{\pi^{d n(n-1)/4}}{\Gamma\big(\frac{d\, n(n-1)}{2} + n\big)}\ \prod_{i=1}^{n-1} \Gamma\Big(\frac{id}{2} + 1\Big)$$
and the integral of the function $\det^{\alpha}$ with respect to the normalized Hilbert–Schmidt measure is
$$\int_{\mathcal{M}^+_n} \det{}^{\alpha} = \frac{\Gamma\big(\frac{dn(n-1)}{2} + n\big)}{\Gamma\big(\frac{dn(n-1)}{2} + n + n\alpha\big)}\ \prod_{i=1}^{n} \frac{\Gamma\big(d\,\frac{i-1}{2} + 1 + \alpha\big)}{\Gamma\big(d\,\frac{i-1}{2} + 1\big)},$$
where $d = 1$ in the real and $d = 2$ in the complex case.

Attila Andai Information Geometry
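A short sketch (mine) evaluating the closed form; the qubit case $n = 2$, $d = 2$ gives $\pi/6$, which agrees with integrating the positivity condition $a_{11}(1-a_{11}) \ge |a_{12}|^2$ directly.

import math

def volume(n, d=2):
    # V(M_n^+) = pi^{d n(n-1)/4} / Gamma(d n(n-1)/2 + n) * prod Gamma(i d/2 + 1)
    num = math.pi ** (d * n * (n - 1) / 4.0)
    den = math.gamma(d * n * (n - 1) / 2.0 + n)
    prod = 1.0
    for i in range(1, n):
        prod *= math.gamma(i * d / 2.0 + 1.0)
    return num / den * prod

print(volume(2), math.pi / 6.0)   # the complex qubit volume, both pi/6
print(volume(3))                  # volume of the complex 3-level state space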
Information Geometry
About volume of the state space
The qubit case

In the space of qubits we use the Stokes parametrization
$$D = \frac{1}{2}\begin{pmatrix} 1+x & y - \mathrm{i}z\\ y + \mathrm{i}z & 1-x \end{pmatrix}.$$
Thus $\mathcal{M}_2$ can be identified with the unit ball in $\mathbb{R}^3$ (complex case) and in $\mathbb{R}^2$ (real case). The Riemannian metric $g^{(f)}$ in this coordinate system, at the state with Stokes parameters $(x,0,0)$ and eigenvalues $\lambda_1, \lambda_2$, is
$$g_f(x,y,z) = \frac{1}{2}\begin{pmatrix} \frac{1}{2\lambda_1\lambda_2} & 0 & 0\\ 0 & \frac{1}{\lambda_1 f(\lambda_2/\lambda_1)} & 0\\ 0 & 0 & \frac{1}{\lambda_1 f(\lambda_2/\lambda_1)} \end{pmatrix} \qquad g_f(x,y) = \frac{1}{2}\begin{pmatrix} \frac{1}{2\lambda_1\lambda_2} & 0\\ 0 & \frac{1}{\lambda_1 f(\lambda_2/\lambda_1)} \end{pmatrix}.$$

Attila Andai Information Geometry
Information Geometry
About volume of the state space
The qubit case

The volume is an integral over the unit ball, which can be expressed as
$$V\big(\mathcal{M}^{(\mathbb{C})}_2\big) = 2\pi \int_0^1 \Big(\frac{1-t}{1+t}\Big)^2 \frac{1}{\sqrt{t}\, f(t)}\ \mathrm{d}t$$
$$V\big(\mathcal{M}^{(\mathbb{R})}_2\big) = \sqrt{2}\,\pi \int_0^1 \frac{1-t}{1+t}\cdot \frac{1}{\sqrt{t+t^2}\,\sqrt{f(t)}}\ \mathrm{d}t.$$
For general $n$ the volume of the state space with respect to a monotone metric is unknown.

Attila Andai Information Geometry
Information Geometry
About volume of the state space
The qubit case

Some operator monotone functions and the corresponding volumes:

f(x)                          V(M_2^(C))     V(M_2^(R))
(1+x)/2                       π²             2π
2x/(1+x)                      ∞              ∞
(x−1)/log x                   2π²            ∼ 8.298
√x                            ∞              4π
(√x+1)²/4                     4π(π−2)        4π(2−√2)
2√x(x−1)/((1+x) log x)        ∞              ∼ 19.986

Attila Andai Information Geometry
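The table entries can be reproduced by numerical quadrature of the two integrals above. A sketch of mine (scipy's quad handles the integrable $t^{-1/2}$ singularity at the origin):

import numpy as np
from scipy.integrate import quad

def vol_complex(f):
    integrand = lambda t: ((1 - t) / (1 + t))**2 / (np.sqrt(t) * f(t))
    val, _ = quad(integrand, 0.0, 1.0, limit=200)
    return 2.0 * np.pi * val

def vol_real(f):
    integrand = lambda t: (1 - t) / (1 + t) / (np.sqrt(t + t * t) * np.sqrt(f(t)))
    val, _ = quad(integrand, 0.0, 1.0, limit=200)
    return np.sqrt(2.0) * np.pi * val

f_sm = lambda t: (1.0 + t) / 2.0
f_km = lambda t: (t - 1.0) / np.log(t)       # fine for 0 < t < 1
print(vol_complex(f_sm), np.pi**2)           # both ~ 9.8696
print(vol_real(f_km))                        # ~ 8.298, as in the table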
Information Geometry
Uncertainty relations
Brief history of uncertainty relations

Brief history of uncertainty relations
1927, Heisenberg: it is not possible to measure the position and the momentum at the same time. (An idea, not a theorem.)
Heisenberg studied Gaussian distributions $f(q)$, where the "uncertainty" was the width $D_f$ of $f$.
[Figure: a Gaussian density $f(q)$; the width $D_f$ is marked where the curve drops to $f(0)/e$.]
If $\mathcal{F}(f)$ denotes the Fourier transform of $f$ then the first equation for uncertainty was
$$D_f\, D_{\mathcal{F}(f)} = \text{constant}.$$

Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Brief history of uncertainty relations

1927, Kennard: For observables $A, B$, if $[A,B] = -\mathrm{i}$ then
$$\mathrm{Var}_D(A)\,\mathrm{Var}_D(B) \ge \frac{1}{4},$$
where $\mathrm{Var}_D(A) = \mathrm{Tr}(DA^2) - (\mathrm{Tr}(DA))^2$.

1929, Robertson: For all observables $A, B$
$$\mathrm{Var}_D(A)\,\mathrm{Var}_D(B) \ge \frac{1}{4}\,\big|\mathrm{Tr}(D[A,B])\big|^2.$$

Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Brief history of uncertainty relations

1930, Schrödinger: For all observables $A, B$
$$\mathrm{Var}_D(A)\,\mathrm{Var}_D(B) - \mathrm{Cov}_D(A,B)^2 \ge \frac{1}{4}\,\big|\mathrm{Tr}(D[A,B])\big|^2,$$
where
$$\mathrm{Cov}_D(A,B) = \frac{1}{2}\big(\mathrm{Tr}(DAB) + \mathrm{Tr}(DBA)\big) - \mathrm{Tr}(DA)\,\mathrm{Tr}(DB).$$
Or in a slightly different form:
$$\det\begin{pmatrix} \mathrm{Cov}_D(A,A) & \mathrm{Cov}_D(A,B)\\ \mathrm{Cov}_D(B,A) & \mathrm{Cov}_D(B,B) \end{pmatrix} \ge \det\left[-\frac{\mathrm{i}}{2}\begin{pmatrix} \mathrm{Tr}(D[A,A]) & \mathrm{Tr}(D[A,B])\\ \mathrm{Tr}(D[B,A]) & \mathrm{Tr}(D[B,B]) \end{pmatrix}\right].$$

Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Brief history of uncertainty relations

1934, Robertson: For a finite set of observables $(A_i)_{i\in I}$
$$\det\Big(\big[\mathrm{Cov}_D(A_h,A_j)\big]_{h,j\in I}\Big) \ \ge\ \det\Big(\Big[-\frac{\mathrm{i}}{2}\,\mathrm{Tr}(D[A_h,A_j])\Big]_{h,j\in I}\Big).$$
∼2000–: Furuichi, Gibilisco, Hansen, Imparato, Isola, Kosaki, Kuriyama, Luo, Petz, Yanagi, Q. Zhang, Z. Zhang

Attila Andai Information Geometry
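Robertson's determinant inequality is easy to probe numerically with random states and observables. A sketch of my own (names are made up):

import numpy as np

rng = np.random.default_rng(2)

def rand_herm(n):
    B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (B + B.conj().T) / 2.0

n, N = 4, 2
W = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
D = W @ W.conj().T
D = D / np.trace(D).real                  # random faithful state
A = [rand_herm(n) for _ in range(N)]

def cov(A1, A2):
    s = 0.5 * np.trace(D @ (A1 @ A2 + A2 @ A1))
    return (s - np.trace(D @ A1) * np.trace(D @ A2)).real

C = np.array([[cov(A[h], A[j]) for j in range(N)] for h in range(N)])
K = np.array([[(-0.5j * np.trace(D @ (A[h] @ A[j] - A[j] @ A[h]))).real
               for j in range(N)] for h in range(N)])
print(np.linalg.det(C) >= np.linalg.det(K) - 1e-12)   # True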
Information Geometry
Uncertainty relations
Covariances

New concepts
For observables $A, B$, a state $D \in \mathcal{M}^+_n$ and an operator monotone function $f$:
$$\mathrm{Cov}_D(A,B) = \frac{1}{2}\big(\mathrm{Tr}(DAB) + \mathrm{Tr}(DBA)\big) - \mathrm{Tr}(DA)\,\mathrm{Tr}(DB)$$
$$\mathrm{Cov}^f_D(A,B) = \langle A, B\rangle_{D,f} \qquad \text{(2002, Petz)}$$
$$\mathrm{qCov}^{as}_{D,f}(A,B) = \frac{f(0)}{2}\,\big\langle \mathrm{i}[D,A],\ \mathrm{i}[D,B]\big\rangle_{D,f}$$
$$\mathrm{qCov}^{s}_{D,f}(A,B) = \frac{f(0)}{2}\,\big\langle \{D,A\},\ \{D,B\}\big\rangle_{D,f},$$
where $[\cdot,\cdot]$ is the commutator and $\{\cdot,\cdot\}$ is the anticommutator.
For an observable $A$ and state $D$ define $A_0 = A - \mathrm{Tr}(DA)\,\mathrm{I}$; then $\mathrm{Tr}\, DA_0 = 0$.

Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Covariances

For observables $(A^{(k)})_{k=1,\dots,N}$ with zero mean at a state $D$ define
$$[\mathrm{Cov}_D]_{ij} = \mathrm{Cov}_D(A^{(i)},A^{(j)}), \qquad [\mathrm{Cov}^f_D]_{ij} = \mathrm{Cov}^f_D(A^{(i)},A^{(j)}),$$
$$[\mathrm{qCov}^{as}_{D,f}]_{ij} = \mathrm{qCov}^{as}_{D,f}(A^{(i)},A^{(j)}), \qquad [\mathrm{qCov}^{s}_{D,f}]_{ij} = \mathrm{qCov}^{s}_{D,f}(A^{(i)},A^{(j)}).$$
2006, Gibilisco: Conjecture: $\det(\mathrm{Cov}_D) \ge \det(\mathrm{qCov}^{as}_{D,f})$.
2008, Andai: The conjecture is true.

Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results

Up to date results

Theorem (2016, Lovas, Andai)
$$\det(\mathrm{Cov}_D) \ \ge\ \det(\mathrm{qCov}^{s}_{D,f}) \ \ge\ \det(\mathrm{qCov}^{as}_{D,f}),$$
$$2f(0)\,\mathrm{Cov}^{f_{RLD}}_D(A_0,B_0) \ \le\ \mathrm{qCov}^{s}_{D,f}(A_0,B_0) - \mathrm{qCov}^{as}_{D,f}(A_0,B_0) \ \le\ \mathrm{Cov}^{f_{RLD}}_D(A_0,B_0),$$
$$\det(\mathrm{qCov}^{s}_{D,f}) - \det(\mathrm{qCov}^{as}_{D,f}) \ \ge\ (2f(0))^N \det(\mathrm{Cov}^{f_{RLD}}_D).$$
2017, Lovas, Andai: further extensions of symmetric and antisymmetric covariant derivatives and a simplified proof of the original Robertson inequality.
2018: ???

Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References I
[1] P. M. Alberti and A. Uhlmann. Stochasticity and partial order, volume 18 of Mathematische Monographien [Mathematical Monographs]. VEB Deutscher Verlag der Wissenschaften, Berlin, 1981. Doubly stochastic maps and unitary mixing.
[2] S. Amari. Differential-geometrical methods in statistics, volume 28 of Lecture Notes in Statistics. Springer-Verlag, New York, 1985.
[3] S. Amari and H. Nagaoka. Methods of information geometry, volume 191 of Translations of Mathematical Monographs. American Mathematical Society, Providence, RI, 2000. Translated from the 1993 Japanese original by Daishi Harada.
[4] S.-I. Amari, O. E. Barndorff-Nielsen, R. E. Kass, S. L. Lauritzen, and C. R. Rao. Differential geometry in statistical inference. Institute of Mathematical Statistics Lecture Notes—Monograph Series, 10. Institute of Mathematical Statistics, Hayward, CA, 1987.
[5] T. Ando. Concavity of certain maps on positive definite matrices and applications to Hadamard products. Linear Algebra Appl., 26:203–241, 1979.
[6] W. B. Arveson. Subalgebras of C*-algebras. Acta Math., 123:141–224, 1969.
[7] R. Balian, Y. Alhassid, and H. Reinhardt. Dissipation in many-body systems: a geometric approach based on information theory. Phys. Rep., 131(1-2):1–146, 1986.
[8] M. B. Bassat. f-entropies, probability of error and feature selection. Inform. Control, 39:227–242, 1978.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References II
[9] V. P. Belavkin and P. Staszewski. C*-algebraic generalization of relative entropy and entropy. Ann. Inst. H. Poincaré Sect. A (N.S.), 37(1):51–58, 1982.
[10] R. Bhatia. Matrix analysis, volume 169 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1997.
[11] A. Bhattacharyya. On a measure of divergence between two statistical populations defined by their probability distributions. Bull. Calcutta Math. Soc., 35:99–109, 1943.
[12] F. J. Bloore. Geometrical description of the convex sets of states for systems with spin-1/2 and spin-1. J. Phys. A, 9(12):2059–2067, 1976.
[13] J. Bognar, F. Gondocs, L. Kaszonyi, A. Kovats, Gy. Michaletzky, T. Mori, A. Somogyi, L. Szeidl, and J. G. Szekely. Matematikai statisztika. Nemzeti Tankonyvkiado, Budapest, 1995. Edited by J. Mogyorodi and Gy. Michaletzky.
[14] M. Bolla, I. Csiszar, I. Gaudi, O. Gulyas, B. Hajtman, A. Kun, T. Lengyel, G. Michaletzky, L. Rejto, T. Rudas, G. Szekely, L. Telegdi, and G. Tusnady. Tobbvaltozos statisztikai analizis. Muszaki Konyvkiado, Budapest, 1986. Edited by T. Mori and G. Szekely.
[15] D. C. Brody and A. Ritz. Geometric phase transitions. cond-mat/9903168, 1999.
[16] N. N. Cencov. Statistical decision rules and optimal inference, volume 53 of Translations of Mathematical Monographs. American Mathematical Society, Providence, R.I., 1982. Translation from the Russian edited by Lev J. Leifman.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References III
[17] Man Duen Choi. Completely positive linear maps on complex matrices. Linear Algebra and Appl., 10:285–290, 1975.
[18] J. Conway. A Course in Functional Analysis. Graduate Texts in Mathematics. Springer Verlag, 1990. Second Edition.
[19] I. Csiszar. Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hungar., 2:299–318, 1967.
[20] I. Csiszar. On topology properties of f-divergences. Studia Sci. Math. Hungar., 2:329–339, 1967.
[21] I. Csiszar. I-divergence geometry of probability distributions and minimization problems. Ann. Probability, 3:146–158, 1975.
[22] I. Csiszar and G. Tusnady. Information geometry and alternating minimization procedures. Statist. Decisions, (suppl. 1):205–237, 1984. Recent results in estimation theory and related topics.
[23] L. Dabrowski and A. Jadczyk. Quantum statistical holonomy. J. Phys. A, 22(15):3167–3170, 1989.
[24] C. Davis. Notions generalizing convexity for functions defined on spaces of matrices. In Proc. Sympos. Pure Math., Vol. VII, pages 187–201. Amer. Math. Soc., Providence, R.I., 1963.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References IV
[25] J. Dittmann. On the curvature of monotone metrics and a conjecture concerning the Kubo-Mori metric. Linear Algebra Appl., 315(1-3):83–112, 2000.
[26] J. Dittmann and A. Uhlmann. Connections and metrics respecting purification of quantum states. J. Math. Phys., 40(7):3246–3267, 1999.
[27] B. Efron. Defining the curvature of a statistical problem (with applications to second order efficiency). Ann. Statist., 3(6):1189–1242, 1975. With a discussion by C. R. Rao, Don A. Pierce, D. R. Cox, D. V. Lindley, Lucien LeCam, J. K. Ghosh, J. Pfanzagl, Niels Keiding, A. P. Dawid, Jim Reeds and with a reply by the author.
[28] S. Eguchi. Second order efficiency of minimum contrast estimators in a curved exponential family. Ann. Statist., 11(3):793–803, 1983.
[29] S. Eguchi. A characterization of second order efficiency in a curved exponential family. Ann. Inst. Statist. Math., 36(2):199–206, 1984.
[30] S. Eguchi. A geometric look at nuisance parameter effect of local powers in testing hypothesis. Ann. Inst. Statist. Math., 43(2):245–260, 1991.
[31] E. Fick and G. Sauermann. The quantum statistics of dynamic processes, volume 86 of Springer Series in Solid-State Sciences. Springer-Verlag, Berlin, 1990. Translated from the German by W. D. Brewer.
[32] R. A. Fisher. On the mathematical foundations of theoretical statistics. Phil. Trans. R. Soc., A, 222:309–368, 1922.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References V
[33] R. A. Fisher. The statistical utilization of multiple measurements. Annals of Eugenics, 8:376–386, 1938.
[34] R. A. Fisher. Statistical methods and scientific inference. 2nd ed., revised. Hafner Publishing Company, New York, 1959.
[35] B. R. Frieden. Fisher information, disorder, and the equilibrium distributions of physics. Phys. Rev. A (3), 41(8):4265–4276, 1990.
[36] A. Fujiwara and H. Nagaoka. Quantum Fisher metric and estimation for pure state models. Phys. Lett. A, 201(2-3):119–124, 1995.
[37] A. Fujiwara and H. Nagaoka. An estimation theoretical characterization of coherent states. J. Math. Phys., 40(9):4227–4239, 1999.
[38] P. Gibilisco and T. Isola. Connections on statistical manifolds of density operators by geometry of noncommutative Lp-spaces. Infin. Dimens. Anal. Quantum Probab. Relat. Top., 2(1):169–178, 1999.
[39] P. Gibilisco and T. Isola. A characterisation of Wigner-Yanase skew information among statistically monotone metrics. Infin. Dimens. Anal. Quantum Probab. Relat. Top., 4(4):553–557, 2001.
[40] P. Gibilisco and T. Isola. On characterization of dual statistically monotone metrics. math.PR/0303059, 2003. Probability Theory.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References VI
[41] P. Gibilisco and T. Isola. Wigner-Yanase information on quantum state space: the geometric approach. math.PR/0304170, 2003. Probability Theory.
[42] P. Gibilisco and G. Pistone. Connections on non-parametric statistical manifolds by Orlicz space geometry. Infin. Dimens. Anal. Quantum Probab. Relat. Top., 1(2):325–347, 1998.
[43] P. Gibilisco and T. Isola. Monotone metrics on statistical manifolds of density matrices by geometry of non-commutative l2-spaces. ???, pages 129–140, 2001.
[44] M. R. Grasselli and R. F. Streater. On the uniqueness of the Chentsov metric in quantum information geometry. Infin. Dimens. Anal. Quantum Probab. Relat. Top., 4(2):173–182, 2001.
[45] F. Hansen and G. K. Pedersen. Jensen's inequality for operators and Löwner's theorem. Math. Ann., 258(3):229–241, 1981/82.
[46] H. Hasegawa. Dual geometry of the Wigner–Yanase–Dyson information content.
[47] H. Hasegawa. α-divergence of the noncommutative information geometry. In Proceedings of the XXV Symposium on Mathematical Physics (Torun, 1992), volume 33, pages 87–93, 1993.
[48] H. Hasegawa. Non-commutative extension of the information geometry. In Quantum communications and measurement (Nottingham, 1994), pages 327–337. Plenum, New York, 1995.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References VII
[49] M. Hayashi. Asymptotic estimation theory for a finite-dimensional pure state model. J. Phys. A, 31(20):4633–4655, 1998.
[50] M. Hayashi. Corrigenda: "Asymptotic estimation theory for a finite-dimensional pure state model". J. Phys. A, 31(41):8405, 1998.
[51] E. Hellinger. Neue Begründung der Theorie quadratischer Formen von unendlich vielen Veränderlichen. J. für reine und angew. Math., 36:210–271, 1909.
[52] F. Hiai and D. Petz. The semicircle law, free random variables and entropy, volume 77 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 2000.
[53] M. Hübner. Computation of Uhlmann's parallel transport for density matrices and the Bures metric on three-dimensional Hilbert space. Phys. Lett. A, 179(4-5):226–230, 1993.
[54] R. S. Ingarden, H. Janyszek, A. Kossakowski, and T. Kawaguchi. Information geometry of quantum statistical systems. Tensor (N.S.), 37(1):105–111, 1982.
[55] H. Jeffreys. An invariant form for the prior probability in estimation problems. Proc. Roy. Soc. London. Ser. A., 186:453–461, 1946.
[56] A. Jencova. Dualistic properties of the manifold of quantum states. In Disordered and complex systems (London, 2000), volume 553 of AIP Conf. Proc., pages 147–152. Amer. Inst. Phys., Melville, NY, 2001.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References VIII
[57] A. Jencova. Geometry of quantum states: dual connections and divergence functions. Rep. Math. Phys., 47(1):121–138, 2001.
[58] A. Jencova. Quantum information geometry and standard purification. J. Math. Phys., 43(5):2187–2201, 2002.
[59] A. Jencova. Flat connections and Wigner-Yanase-Dyson metrics. math-ph/0307057, 2003. Mathematical Physics.
[60] K. Kraus. States, effects, and operations, volume 190 of Lecture Notes in Physics. Springer-Verlag, Berlin, 1983. Fundamental notions of quantum theory, Lecture notes edited by A. Bohm, J. D. Dollard and W. H. Wootters.
[61] F. Kubo and T. Ando. Means of positive linear operators. Math. Ann., 246(3):205–224, 1979/80.
[62] S. Kullback and R. A. Leibler. On information and sufficiency. Ann. Math. Statistics, 22:79–86, 1951.
[63] S. Kullback. Information theory and statistics. Dover Publications Inc., Mineola, NY, 1997. Reprint of the second (1968) edition.
[64] A. Lesniewski and M. B. Ruskai. Monotone Riemannian metrics and relative entropy on noncommutative probability spaces. J. Math. Phys., 40(11):5702–5724, 1999.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References IX
[65] J. Lin and S. K. M. Wong. A new directed divergence measure and its characterization. Int. J. General Systems, 17:73–81, 1990.
[66] G. Lindblad. Entropy, information and quantum measurements. Comm. Math. Phys., 33:305–322, 1973.
[67] G. Lindblad. Expectations and entropy inequalities for finite quantum systems. Comm. Math. Phys., 39:111–119, 1974.
[68] G. Lindblad. Completely positive maps and entropy inequalities. Comm. Math. Phys., 40:147–151, 1975.
[69] K. Löwner. Über monotone Matrixfunktionen. Math. Z., 38:177–216, 1934.
[70] T. Matumoto. Any statistical manifold has a contrast function—on the C3-functions taking the minimum at the diagonal of the product manifold. Hiroshima Math. J., 23(2):327–332, 1993.
[71] M. Nei. The theory of genetic distance and evaluation of human races. Japan J. Human Genetics, 23:341–369, 1978.
[72] E. A. Morozova and N. N. Chentsov. Markov invariant geometry on state manifolds. In Current problems in mathematics. Newest results, Vol. 36 (Russian), Itogi Nauki i Tekhniki, pages 69–102, 187. Akad. Nauk SSSR Vsesoyuz. Inst. Nauchn. i Tekhn. Inform., Moscow, 1989. Translated in J. Soviet Math. 56 (1991), no. 5, 2648–2669.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References X
[73] H. Nagaoka. On the parameter estimation problem for quantum statistical models. In Proceedings of 12th Symposium on Information Theory and Its Applications, pages 577–582. Society of Information Theory and Its Applications in Japan, 1989.
[74] H. Nagaoka and S. Amari. Differential geometry of smooth families of probability distributions. Technical Report 82-7, Dept. of Math. Eng. and Instr. Phys., Univ. of Tokyo, 1982.
[75] J. von Neumann. Thermodynamik quantenmechanischer Gesamtheiten. Gött. Nachr., pages 273–291, 1927.
[76] M. Ohya and D. Petz. Quantum entropy and its use. Texts and Monographs in Physics. Springer-Verlag, Berlin, 1993.
[77] M. Ohya and D. Petz. Notes on quantum entropy. Studia Sci. Math. Hungar., 31(4):423–430, 1996.
[78] D. Petz. Quasi-entropies for finite quantum systems. Rep. Math. Phys., 23(1):57–65, 1986.
[79] D. Petz. On certain properties of the relative entropy of states of operator algebras. Math. Z., 206(3):351–361, 1991.
[80] D. Petz. Geometry of canonical correlation on the state space of a quantum system. J. Math. Phys., 35(2):780–795, 1994.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References XI
[81] D. Petz. Monotone metrics on matrix spaces. Linear Algebra Appl., 244:81–96, 1996.
[82] D. Petz. Information-geometry of quantum states. In Quantum probability communications, QP-PQ, X, pages 135–157. World Sci. Publishing, River Edge, NJ, 1998.
[83] D. Petz. Covariance and Fisher information in quantum mechanics. J. Phys. A, 35(4):929–939, 2002.
[84] D. Petz. Monotonicity of quantum relative entropy revisited. Rev. Math. Phys., 15(1):79–91, 2003.
[85] D. Petz and H. Hasegawa. On the Riemannian metric of α-entropies of density matrices. Lett. Math. Phys., 38(2):221–225, 1996.
[86] D. Petz and M. B. Ruskai. Contraction of generalized relative entropy under stochastic mappings on matrices. Infin. Dimens. Anal. Quantum Probab. Relat. Top., 1(1):83–89, 1998.
[87] D. Petz and G. Toth. The Bogoliubov inner product in quantum statistics. Lett. Math. Phys., 27(3):205–216, 1993.
[88] E. C. Pielou. Ecological diversity. Wiley, New York, 1975.
[89] C. R. Rao. Information and accuracy attainable in the estimation of statistical parameters. Bulletin of the Calcutta Mathematical Society, 37:81–91, 1945.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References XII
[90] C. R. Rao. Diversity and dissimilarity coefficients: a unified approach. Theoretic Population Biology, 21:24–43, 1982.
[91] G. Ruppeiner. Riemannian geometry in thermodynamic fluctuation theory. Rev. Modern Phys., 67(3):605–659, 1995.
[92] M. B. Ruskai. Beyond strong subadditivity? Improved bounds on the contraction of generalized relative entropy. Rev. Math. Phys., 6(5A):1147–1161, 1994. With an appendix on applications to logarithmic Sobolev inequalities, Special issue dedicated to Elliott H. Lieb.
[93] K. Sailer. Nemegyensulyi statisztikus fizika. Kossuth Lajos Tudomanyegyetem, Debrecen, 1994. Special lectures.
[94] W. F. Stinespring. Positive functions on C*-algebras. Proc. Amer. Math. Soc., 6:211–216, 1955.
[95] R. F. Streater. Statistical dynamics and information geometry. In Geometry and nature (Madeira, 1995), volume 203 of Contemp. Math., pages 117–131. Amer. Math. Soc., Providence, RI, 1997.
[96] T. Tanaka. Information geometry of mean-field approximation. In Advanced mean field methods (Birmingham, 1999), Neural Inf. Process. Ser., pages 259–273. MIT Press, Cambridge, MA, 2001.
[97] I. J. Taneja. Generalised information measures and their applications. Preprint. http://www.mtm.ufsc.br/∼taneja/bhtml/bhtml.html.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
References XIII
[98] F. Topsoe. Some inequalities for information divergence and related measures of discrimination. Res. Rep. Coll., RGMIA, 2(1):85–98, 1999.
[99] A. Uhlmann. Relative entropy and the Wigner-Yanase-Dyson-Lieb concavity in an interpolation theory. Comm. Math. Phys., 54(1):21–32, 1977.
[100] A. Uhlmann. Parallel transport and "quantum holonomy" along density operators. Rep. Math. Phys., 24(2):229–240, 1986.
[101] A. Uhlmann. Density operators as an arena for differential geometry. In Proceedings of the XXV Symposium on Mathematical Physics (Torun, 1992), volume 33, pages 253–263, 1993.
[102] A. Uhlmann. Geometric phases and related structures. In Proceedings of the XXVII Symposium on Mathematical Physics (Torun, 1994), volume 36, pages 461–481, 1995.
[103] A. Uhlmann. Spheres and hemispheres as quantum state spaces. J. Geom. Phys., 18(1):76–92, 1996.
[104] H. Umegaki. Conditional expectation in an operator algebra. IV. Entropy and information. Kodai Math. Sem. Rep., 14:59–85, 1962.
[105] E. P. Wigner and Mutsuo M. Yanase. Information contents of distributions. Proc. Nat. Acad. Sci. U.S.A., 49:910–918, 1963.
[106] W. K. Wootters. Statistical distance and Hilbert space. Physical Review D, 23:357–362, 1981.
Attila Andai Information Geometry
Information Geometry
Uncertainty relations
Up to date results
Thank you for your attention!
Attila Andai Information Geometry