
(Co)-Algebraic and Analytical Aspects of Weighted Automata Minimisation and Equivalence

(Part 2)

Borja Balle

[CMCS Tutorial, Apr 2 2016]

Minimisation vs Approximation

[Figure: illustration contrasting minimisation with approximation of an automaton]

• Minimisation is naturally related to (co-)algebraic properties

• Approximation questions need analytical tools (e.g. metrics)


Motivations for Approximation

1. Efficient (approximate) computation with WA

• Compute faster and pay a controlled price in terms of accuracy

2. Machine learning for WA

• Understand learning bias, design better algorithms

3. Interesting mathematical questions

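WA and Rational Functions

[Figure: two-state weighted automaton over $\{a, b\}$ with edge labels "a, 1 / b, 0", "a, −1 / b, −2", "a, 3 / b, 5" and "a, −2 / b, 0", and state weights −1 and −2. One weight vector with entries of magnitude 1 and 2 is also shown; its name and signs are not recoverable from the extraction.]

$A = \langle \alpha, \beta, \{T_a\}_{a \in \Sigma} \rangle$

$\alpha, \beta \in \mathbb{R}^n$, $T_a \in \mathbb{R}^{n \times n}$

$$T_a = \begin{bmatrix} 1 & -1 \\ -2 & 3 \end{bmatrix}, \qquad T_b = \begin{bmatrix} 0 & -2 \\ 0 & 5 \end{bmatrix}$$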

$f_A : \Sigma^\star \to \mathbb{R}$ is the sum of the weights of all paths labelled by the input string:

$$f_A(x) = \alpha T_{x_1} \cdots T_{x_t} \beta = \alpha T_x \beta$$
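The evaluation rule $f_A(x) = \alpha T_{x_1} \cdots T_{x_t} \beta$ can be sketched in Python with NumPy. The transition matrices below are the ones from the two-state example. The vectors `alpha` and `beta` are placeholder values assumed for illustration (not the ones on the slide), and `evaluate` is a hypothetical helper name.

```python
import numpy as np

# Transition matrices of the two-state example.
T = {
    "a": np.array([[1.0, -1.0], [-2.0, 3.0]]),
    "b": np.array([[0.0, -2.0], [0.0, 5.0]]),
}
# ASSUMPTION: placeholder initial and final weights, chosen for illustration.
alpha = np.array([1.0, 0.0])  # used as a row vector
beta = np.array([0.0, 1.0])   # used as a column vector


def evaluate(alpha, beta, T, x):
    """Return f_A(x) = alpha T_{x_1} ... T_{x_t} beta."""
    v = alpha
    for symbol in x:
        v = v @ T[symbol]  # push the row vector through one transition
    return float(v @ beta)


print(evaluate(alpha, beta, T, "ab"))  # -7.0
```

Multiplying left to right keeps the cost at one vector-matrix product per input symbol, instead of forming the full matrix $T_x$.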

(Pseudo-)Metrics for Approximation

How do we measure the quality of WA approximations?

$$\|f\|_p = \Big( \sum_{x \in \Sigma^\star} |f(x)|^p \Big)^{1/p}$$

$\mathrm{dist}_p(A, B) = \|f_A - f_B\|_p$ (pseudo-metric)

Important linear subspaces of $\mathbb{R}^{\Sigma^\star}$:

• $\ell^p = \{ f : \Sigma^\star \to \mathbb{R} \mid \|f\|_p < \infty \}$ (Banach/Hilbert)

• $\mathrm{Rat} = \{ f : \Sigma^\star \to \mathbb{R} \mid \text{rational} \}$

• $\ell^p_{\mathrm{Rat}} = \mathrm{Rat} \cap \ell^p = \{ f : \Sigma^\star \to \mathbb{R} \mid \text{rational}, \|f\|_p < \infty \}$ (not complete!)
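As a rough illustration of $\mathrm{dist}_p$, the sketch below approximates $\|f\|_p$ by summing only over strings up to a fixed length. The truncation length, the helper names and the one-state automata are all assumptions made for the example. The exact norm needs the whole of $\Sigma^\star$.

```python
from itertools import product

import numpy as np


def evaluate(wa, x):
    """f_A(x) for wa = (alpha, beta, T)."""
    alpha, beta, T = wa
    v = alpha
    for symbol in x:
        v = v @ T[symbol]
    return float(v @ beta)


def truncated_norm(f, alphabet, p=2, max_len=6):
    """Approximate ||f||_p using only strings x with |x| <= max_len."""
    total = 0.0
    for length in range(max_len + 1):
        for x in product(alphabet, repeat=length):
            total += abs(f(x)) ** p
    return total ** (1.0 / p)


def truncated_dist(wa_a, wa_b, alphabet, p=2, max_len=6):
    """Approximate dist_p(A, B) = ||f_A - f_B||_p."""
    diff = lambda x: evaluate(wa_a, x) - evaluate(wa_b, x)
    return truncated_norm(diff, alphabet, p, max_len)


# Hypothetical one-state automata: f_A(x) = 0.5^(#a in x) * 0.25^(#b in x).
T = {"a": np.array([[0.5]]), "b": np.array([[0.25]])}
wa_A = (np.array([1.0]), np.array([1.0]), T)
wa_B = (np.array([2.0]), np.array([1.0]), T)  # computes 2 * f_A

norm_A = truncated_norm(lambda x: evaluate(wa_A, x), "ab")
dist_AB = truncated_dist(wa_A, wa_B, "ab")
```

For this one-state example the truncated sum is a geometric series, so `norm_A ** 2` equals $(1 - 0.3125^7)/(1 - 0.3125)$, and `dist_AB` coincides with `norm_A` because $f_B - f_A = f_A$.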

An Approximation Result

There is an algorithm that, given a minimal WA $A$ with $n$ states and $k < n$, runs in time $O(|\Sigma| n^6)$ and returns a WA $\hat{A}$ with $k$ states such that

$$\mathrm{dist}_2(A, \hat{A}) \le \sqrt{\sigma_{k+1}^2 + \cdots + \sigma_n^2}$$

where the $\sigma_i$ are the singular values of the Hankel matrix of $f_A$.

[Balle, Panangaden, Precup 2015, Kulesza, Jiang, Singh 2015]

The Plan

1. Hankel Matrices and Singular Value Automata (SVA)

2. Algorithms for Computing SVA

3. WA Approximation via SVA Truncation

The Hankel Matrix

$f : \Sigma^\star \to \mathbb{R}$, $H_f \in \mathbb{R}^{\Sigma^\star \times \Sigma^\star}$, $H_f(p, s) = f(ps)$

$$H_f = \begin{array}{c|cccccc}
 & \varepsilon & a & b & \cdots & s & \cdots \\ \hline
\varepsilon & f(\varepsilon) & f(a) & f(b) & \cdots & & \\
a & f(a) & f(aa) & f(ab) & \cdots & & \\
b & f(b) & f(ba) & f(bb) & \cdots & & \\
\vdots & \vdots & \vdots & \vdots & & & \\
p & \cdots & \cdots & \cdots & \cdots & f(ps) & \\
\vdots & & & & & &
\end{array}$$

Theorem [Kronecker 1881, Carlyle & Paz 1971, Fliess 1974]: $\mathrm{rank}(H_f) = n$ iff $f$ is computed by a minimal WA with $n$ states

Proof (upper bound $\mathrm{rank}(H_{f_A}) \le |A|$):

$$H_f(p, s) = f(ps) = \alpha T_{p_1} \cdots T_{p_t} T_{s_1} \cdots T_{s_{t'}} \beta = \alpha_p \cdot \beta_s$$

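Structure of Low-rank Hankel Matrices

$H_{f_A} \in \mathbb{R}^{\Sigma^\star \times \Sigma^\star}$, $P \in \mathbb{R}^{\Sigma^\star \times n}$, $S \in \mathbb{R}^{n \times \Sigma^\star}$

$$f_A(p_1 \cdots p_t s_1 \cdots s_{t'}) = \underbrace{\alpha T_{p_1} \cdots T_{p_t}}_{\alpha_A(p)} \; \underbrace{T_{s_1} \cdots T_{s_{t'}} \beta}_{\beta_A(s)}$$

$\alpha_A(p) = P(p, \cdot)$, $\beta_A(s) = S(\cdot, s)$

$H_f = P \cdot S$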
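The factorisation $H_f = P \cdot S$ can be checked on a finite block of the Hankel matrix. The sketch below uses the transition matrices of the two-state example with placeholder vectors `alpha` and `beta` (assumed values, not taken from the slides). The choice of prefixes and suffixes of length at most 2, and the helper names, are also assumptions.

```python
from itertools import product

import numpy as np

T = {
    "a": np.array([[1.0, -1.0], [-2.0, 3.0]]),
    "b": np.array([[0.0, -2.0], [0.0, 5.0]]),
}
alpha = np.array([1.0, 0.0])  # ASSUMPTION: placeholder initial weights
beta = np.array([0.0, 1.0])   # ASSUMPTION: placeholder final weights


def strings(alphabet, max_len):
    """All strings over the alphabet of length at most max_len."""
    return ["".join(w) for n in range(max_len + 1)
            for w in product(alphabet, repeat=n)]


def forward(p):
    """alpha_A(p) = alpha T_{p_1} ... T_{p_t}, a row of P."""
    v = alpha
    for c in p:
        v = v @ T[c]
    return v


def backward(s):
    """beta_A(s) = T_{s_1} ... T_{s_t'} beta, a column of S."""
    v = beta
    for c in reversed(s):
        v = T[c] @ v
    return v


basis = strings("ab", 2)                       # 7 prefixes and 7 suffixes
P = np.array([forward(p) for p in basis])      # shape (7, 2)
S = np.array([backward(s) for s in basis]).T   # shape (2, 7)
H = np.array([[float(forward(p + s) @ beta) for s in basis] for p in basis])

rank_H = np.linalg.matrix_rank(H, tol=1e-6)
```

Every finite block of $H_f$ factors through $\mathbb{R}^n$ in this way, so its rank cannot exceed the number of states. Here the block already has rank 2.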

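Proof (equality $\mathrm{rank}(H_f) = \min_{f_A = f} |A|$)

1. Suppose $H_f = P \cdot S$ is a rank factorisation

2. For $a \in \Sigma$ define $H_f^a(p, s) = f(pas)$ and note $H_f^a = R_a H_f$ where $(R_a f)(x) = f(xa)$

3. Observe that because $S$ has full row rank, $\mathrm{colspan}(R_a P) = \mathrm{colspan}(H_f^a) \subseteq \mathrm{colspan}(H_f) = \mathrm{colspan}(P)$

4. Hence there exists $T_a$ such that $R_a P = P T_a$

5. Let $A = \langle \alpha = P(\varepsilon, \cdot), \beta = S(\cdot, \varepsilon), \{T_a\} \rangle$

6. Note $f_A = f$ since $P(\varepsilon, \cdot) T_x S(\cdot, \varepsilon) = (R_x P)(\varepsilon, \cdot) S(\cdot, \varepsilon) = P(x, \cdot) S(\cdot, \varepsilon) = H_f(x, \varepsilon)$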

A Useful Correspondence

$A = \langle \alpha, \beta, \{T_a\}_{a \in \Sigma} \rangle$, $Q \in \mathbb{R}^{n \times n}$ invertible

$A^Q = \langle \alpha Q, Q^{-1} \beta, \{Q^{-1} T_a Q\}_{a \in \Sigma} \rangle$

$$(\alpha Q)(Q^{-1} T_{x_1} Q) \cdots (Q^{-1} T_{x_t} Q)(Q^{-1} \beta) = \alpha T_{x_1} \cdots T_{x_t} \beta$$

Theorem: For any rational $f : \Sigma^\star \to \mathbb{R}$ there is a bijection between

• minimal WA $A$ computing $f$, and

• rank factorizations of the form $H_f = P \cdot S$

[Balle, Carreras, Luque, Quattoni 2014]
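The invariance of $f_A$ under a change of basis can be checked numerically. This sketch applies an arbitrary invertible $Q$ to the two-state example. The vectors `alpha` and `beta`, the matrix `Q` and the helper names are assumptions chosen for illustration.

```python
import numpy as np


def evaluate(alpha, beta, T, x):
    """f_A(x) = alpha T_{x_1} ... T_{x_t} beta."""
    v = alpha
    for symbol in x:
        v = v @ T[symbol]
    return float(v @ beta)


def change_basis(alpha, beta, T, Q):
    """Return A^Q = <alpha Q, Q^{-1} beta, {Q^{-1} T_a Q}>."""
    Q_inv = np.linalg.inv(Q)
    return alpha @ Q, Q_inv @ beta, {a: Q_inv @ Ta @ Q for a, Ta in T.items()}


T = {
    "a": np.array([[1.0, -1.0], [-2.0, 3.0]]),
    "b": np.array([[0.0, -2.0], [0.0, 5.0]]),
}
alpha = np.array([1.0, 0.0])            # ASSUMPTION: placeholder weights
beta = np.array([0.0, 1.0])             # ASSUMPTION: placeholder weights
Q = np.array([[2.0, 1.0], [1.0, 1.0]])  # ASSUMPTION: any invertible matrix

alpha_Q, beta_Q, T_Q = change_basis(alpha, beta, T, Q)
for x in ["", "a", "ab", "bba"]:
    print(x, evaluate(alpha, beta, T, x), evaluate(alpha_Q, beta_Q, T_Q, x))
```

The two printed values agree on every string up to rounding, although the transition matrices of $A^Q$ look quite different from those of $A$.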

SVD of Hankel Matrices

$$H_f = U D V^\top = \begin{bmatrix} \vdots & & \vdots \\ u_1 & \cdots & u_n \\ \vdots & & \vdots \end{bmatrix} \begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \end{bmatrix} \begin{bmatrix} \cdots & v_1^\top & \cdots \\ & \vdots & \\ \cdots & v_n^\top & \cdots \end{bmatrix}$$

Key Idea: see the Hankel matrix as an operator in a Hilbert space

$H_f : \ell^2 \to \ell^2$, $g \mapsto H_f g$

$$(H_f g)(x) = \sum_{y \in \Sigma^\star} f(xy) g(y)$$

Theorem: $H_f$ with $f$ rational admits an SVD iff $\|f\|_2 < \infty$

[Balle, Panangaden, Precup 2015, 201?]

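The Singular Value Automaton

From Correspondence Theorem: If $H_f$ admits an SVD $H_f = U D V^\top$, then there exists a WA $A = \langle \alpha, \beta, \{T_a\}_{a \in \Sigma} \rangle$ for $f$ inducing $P = U D^{1/2}$ and $S = D^{1/2} V^\top$. This $A$ is the singular value automaton (SVA).

[Balle, Panangaden, Precup 2015]

WA computing the singular vectors can be obtained from the SVA:

$U_i = \langle \alpha, e_i / \sqrt{\sigma_i}, \{T_a\} \rangle$, with $f_{U_i} = u_i$

$V_i = \langle e_i / \sqrt{\sigma_i}, \beta, \{T_a\} \rangle$, with $f_{V_i} = v_i$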

Consequences of SVA

1. For every $f \in \ell^2_{\mathrm{Rat}}$ the SVA provides a "unique" canonical WA representation

2. The vector subspace $\ell^2_{\mathrm{Rat}} \subset \mathbb{R}^{\Sigma^\star}$ is closed under taking left/right singular vectors of Hankel matrices


Two Important Matrices

Reachability Gramian: $G_P = P^\top \cdot P$

Observability Gramian: $G_S = S \cdot S^\top$

Theorem: Assuming $A$ is minimal, $G_P$ and $G_S$ are well-defined iff $\|f_A\|_2 < \infty$

For SVA:

$P = U D^{1/2} \Rightarrow G_P = D$

$S = D^{1/2} V^\top \Rightarrow G_S = D$

Under a change of basis $Q$:

$G_P^Q = Q^\top G_P Q$

$G_S^Q = Q^{-1} G_S Q^{-\top}$

Computing the SVA

Given a minimal WA $A$ computing $f \in \ell^2_{\mathrm{Rat}}$ do:

1. Compute Gramians $G_P$ and $G_S$

2. Diagonalize $M = G_S G_P$ (i.e. find $Q$ s.t. $Q^{-1} M Q = D^2$)

3. Obtain SVA as $A^Q$

note: Step 2 is a finite eigenvalue problem :-)

Step 1 involves products of infinite matrices :-o
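The three steps can be sketched as follows. Two choices here are assumptions of the sketch, not something the slides prescribe. The Gramians are obtained by solving their fixed-point equations as linear systems. The matrix $Q$ is built from a Cholesky factor of $G_S$ and a symmetric eigendecomposition, a standard balancing construction that yields a $Q$ with $Q^{-1} M Q = D^2$ and the scaling that makes both Gramians equal to $D$. The 3-state automaton and all function names are hypothetical.

```python
import numpy as np


def gramians(alpha, beta, T):
    """Step 1: solve G_S = beta beta^T + sum_a T_a G_S T_a^T (and dually G_P)."""
    n = alpha.size
    eye = np.eye(n * n)
    K_S = sum(np.kron(Ta, Ta) for Ta in T.values())
    K_P = sum(np.kron(Ta.T, Ta.T) for Ta in T.values())
    G_S = np.linalg.solve(eye - K_S, np.kron(beta, beta)).reshape(n, n)
    G_P = np.linalg.solve(eye - K_P, np.kron(alpha, alpha)).reshape(n, n)
    return G_P, G_S


def singular_value_automaton(alpha, beta, T):
    G_P, G_S = gramians(alpha, beta, T)
    # Step 2: find Q with Q^T G_P Q = Q^{-1} G_S Q^{-T} = D.
    R = np.linalg.cholesky(G_S)                 # G_S = R R^T
    eigvals, W = np.linalg.eigh(R.T @ G_P @ R)  # eigenvalues are sigma_i^2
    order = np.argsort(eigvals)[::-1]           # largest singular value first
    sigma = np.sqrt(eigvals[order])
    Q = (R @ W[:, order]) / np.sqrt(sigma)      # column i scaled by sigma_i^(-1/2)
    # Step 3: the SVA is A^Q.
    Q_inv = np.linalg.inv(Q)
    sva = (alpha @ Q, Q_inv @ beta, {a: Q_inv @ Ta @ Q for a, Ta in T.items()})
    return sva, sigma, Q


# Hypothetical minimal 3-state WA over {a, b} with small weights (so f is in l2).
alpha = np.array([1.0, 0.5, -0.5])
beta = np.array([0.5, -1.0, 1.0])
T = {
    "a": np.array([[0.2, -0.1, 0.0], [0.1, 0.15, -0.2], [0.0, 0.1, 0.1]]),
    "b": np.array([[-0.1, 0.2, 0.1], [0.0, 0.1, 0.2], [0.15, -0.1, 0.05]]),
}

sva, sigma, Q = singular_value_automaton(alpha, beta, T)
G_P_sva, G_S_sva = gramians(*sva)  # both should be close to diag(sigma)
```

Recomputing the Gramians of the result is a convenient sanity check: for an SVA both are diagonal and equal to $D$.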

Computing the Gramians

Fixed-point Equations for Gramians

$$G_S = \beta \cdot \beta^\top + \sum_{a \in \Sigma} T_a \cdot G_S \cdot T_a^\top$$

$$G_P = \alpha^\top \cdot \alpha + \sum_{a \in \Sigma} T_a^\top \cdot G_P \cdot T_a$$

Closed-form Algorithm

$$\mathrm{vec}(G_S) = \Big( I - \sum_{a \in \Sigma} T_a \otimes T_a \Big)^{-1} (\beta \otimes \beta)$$

Total time $O(n^6 + |\Sigma| n^4)$

note: there is also an iterative algorithm with $O(|\Sigma| n^{2.4})$ flops per iteration
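Both routes to the Gramians can be compared on a small example. The closed form below follows the Kronecker-product formula. The iterative version is a plain fixed-point iteration with ordinary matrix products, which is an assumption of this sketch and not necessarily the iterative algorithm the note refers to. The 3-state automaton, the iteration count and the function names are hypothetical.

```python
import numpy as np


def gramians_closed_form(alpha, beta, T):
    """Solve the fixed-point equations as linear systems in vec(G)."""
    n = alpha.size
    eye = np.eye(n * n)
    K_S = sum(np.kron(Ta, Ta) for Ta in T.values())
    K_P = sum(np.kron(Ta.T, Ta.T) for Ta in T.values())
    G_S = np.linalg.solve(eye - K_S, np.kron(beta, beta)).reshape(n, n)
    G_P = np.linalg.solve(eye - K_P, np.kron(alpha, alpha)).reshape(n, n)
    return G_P, G_S


def gramians_iterative(alpha, beta, T, iters=200):
    """Iterate the fixed-point equations, starting from the zero matrix."""
    n = alpha.size
    G_P = np.zeros((n, n))
    G_S = np.zeros((n, n))
    for _ in range(iters):
        G_S = np.outer(beta, beta) + sum(Ta @ G_S @ Ta.T for Ta in T.values())
        G_P = np.outer(alpha, alpha) + sum(Ta.T @ G_P @ Ta for Ta in T.values())
    return G_P, G_S


# Hypothetical 3-state WA over {a, b}; weights are small so both methods converge.
alpha = np.array([1.0, 0.5, -0.5])
beta = np.array([0.5, -1.0, 1.0])
T = {
    "a": np.array([[0.2, -0.1, 0.0], [0.1, 0.15, -0.2], [0.0, 0.1, 0.1]]),
    "b": np.array([[-0.1, 0.2, 0.1], [0.0, 0.1, 0.2], [0.15, -0.1, 0.05]]),
}

G_P, G_S = gramians_closed_form(alpha, beta, T)
G_P_it, G_S_it = gramians_iterative(alpha, beta, T)
norm_sq = float(alpha @ G_S @ alpha)  # equals ||f_A||_2^2
```

The last line uses the identity $\|f_A\|_2^2 = \alpha G_S \alpha^\top = \beta^\top G_P \beta$, which follows from expanding $\sum_x (\alpha T_x \beta)^2$.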


Approximation (Attempt 1)

$$H_f = \begin{bmatrix} \vdots & & \vdots \\ u_1 & \cdots & u_n \\ \vdots & & \vdots \end{bmatrix} \begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \end{bmatrix} \begin{bmatrix} \cdots & v_1^\top & \cdots \\ & \vdots & \\ \cdots & v_n^\top & \cdots \end{bmatrix} \approx \begin{bmatrix} \vdots & & \vdots \\ u_1 & \cdots & u_k \\ \vdots & & \vdots \end{bmatrix} \begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_k \end{bmatrix} \begin{bmatrix} \cdots & v_1^\top & \cdots \\ & \vdots & \\ \cdots & v_k^\top & \cdots \end{bmatrix} = \hat{H}$$

$\hat{H}$ is the best rank $k$ approximation of $H_f$

Claim: If $\hat{f} : \Sigma^\star \to \mathbb{R}$ is such that $H_{\hat{f}} = \hat{H}$, then $\mathrm{rank}(\hat{f}) = k$ and $\|f - \hat{f}\|_2 \le \|H_f - \hat{H}\|_{\mathrm{op}} = \sigma_{k+1}$

But $\hat{H}$ is in general not Hankel!

Approximation (Attempt 2)

The first column of $H_f = (U D^{1/2})(D^{1/2} V^\top)$ is $f$ itself:

$$\begin{bmatrix} \vdots \\ f \\ \vdots \end{bmatrix} = \begin{bmatrix} \vdots & & \vdots \\ \sqrt{\sigma_1} u_1 & \cdots & \sqrt{\sigma_n} u_n \\ \vdots & & \vdots \end{bmatrix} \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_n \end{bmatrix}, \qquad f = \sum_{i=1}^n \beta_i \sqrt{\sigma_i} u_i$$

Claim: Taking $\hat{f} = \sum_{i=1}^k \beta_i \sqrt{\sigma_i} u_i$ we have a rational function (a sum of $k$ orthogonal rational functions) and

$$\|f - \hat{f}\|_2^2 = \sum_{i=k+1}^n \beta_i^2 \sigma_i \le \sum_{i=k+1}^n \sigma_i^2$$

But $\hat{f}$ has rank $n$, and we want $k$!

SVA Truncation

Keep the states corresponding to the $k$ largest singular values and remove the rest.

Each transition matrix of the SVA splits into four blocks:

$$T_\sigma = \begin{bmatrix} \text{keep to keep} & \text{keep to remove} \\ \text{remove to keep} & \text{remove to remove} \end{bmatrix}$$

Error in SVA Truncation

Original: $A$, $f$, $n$ states, with $T_\sigma \in \mathbb{R}^{n \times n}$

Truncated: $\hat{A}$, $\hat{f}$, $k$ states, with $\hat{T}_\sigma \in \mathbb{R}^{k \times k}$ the "keep to keep" block of $T_\sigma$

Error Bounds:

$$\|f - \hat{f}\|_2 \le \begin{cases} C_f (\sigma_{k+1} + \cdots + \sigma_n)^{1/4} \\ (\sigma_{k+1}^2 + \cdots + \sigma_n^2)^{1/2} \end{cases}$$

[Balle, Panangaden, Precup 2015, 201?, Kulesza, Jiang, Singh 2015]
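SVA truncation can be tried end to end on a small example. The sketch recomputes the SVA so that it is self-contained, using a Cholesky-based balancing step that is an assumption of the sketch. It then keeps the leading $k \times k$ blocks and measures $\|f - \hat{f}\|_2$ exactly through the observability Gramian of a difference automaton. The 3-state automaton and all helper names are hypothetical.

```python
import numpy as np


def gramians(alpha, beta, T):
    n = alpha.size
    eye = np.eye(n * n)
    K_S = sum(np.kron(Ta, Ta) for Ta in T.values())
    K_P = sum(np.kron(Ta.T, Ta.T) for Ta in T.values())
    G_S = np.linalg.solve(eye - K_S, np.kron(beta, beta)).reshape(n, n)
    G_P = np.linalg.solve(eye - K_P, np.kron(alpha, alpha)).reshape(n, n)
    return G_P, G_S


def to_sva(alpha, beta, T):
    """Change basis so that both Gramians equal diag(sigma), sigma descending."""
    G_P, G_S = gramians(alpha, beta, T)
    R = np.linalg.cholesky(G_S)
    eigvals, W = np.linalg.eigh(R.T @ G_P @ R)
    order = np.argsort(eigvals)[::-1]
    sigma = np.sqrt(eigvals[order])
    Q = (R @ W[:, order]) / np.sqrt(sigma)
    Q_inv = np.linalg.inv(Q)
    return (alpha @ Q, Q_inv @ beta, {a: Q_inv @ Ta @ Q for a, Ta in T.items()}), sigma


def truncate(wa, k):
    """Keep the first k states, i.e. the 'keep to keep' block of each T_a."""
    alpha, beta, T = wa
    return alpha[:k], beta[:k], {a: Ta[:k, :k] for a, Ta in T.items()}


def l2_distance(wa1, wa2):
    """||f_1 - f_2||_2 via the Gramian of an automaton computing f_1 - f_2."""
    (a1, b1, T1), (a2, b2, T2) = wa1, wa2
    n1, n2 = a1.size, a2.size
    alpha = np.concatenate([a1, -a2])
    beta = np.concatenate([b1, b2])
    T = {}
    for s in T1:
        M = np.zeros((n1 + n2, n1 + n2))
        M[:n1, :n1] = T1[s]
        M[n1:, n1:] = T2[s]
        T[s] = M
    _, G_S = gramians(alpha, beta, T)
    return float(np.sqrt(max(alpha @ G_S @ alpha, 0.0)))


# Hypothetical minimal 3-state WA over {a, b} with small weights.
alpha = np.array([1.0, 0.5, -0.5])
beta = np.array([0.5, -1.0, 1.0])
T = {
    "a": np.array([[0.2, -0.1, 0.0], [0.1, 0.15, -0.2], [0.0, 0.1, 0.1]]),
    "b": np.array([[-0.1, 0.2, 0.1], [0.0, 0.1, 0.2], [0.15, -0.1, 0.05]]),
}

k = 2
sva, sigma = to_sva(alpha, beta, T)
hat = truncate(sva, k)
error = l2_distance((alpha, beta, T), hat)
bound = float(np.sqrt(np.sum(sigma[k:] ** 2)))  # second bound on the slide
print(error, bound)
```

Printing `error` next to `bound` gives a quick empirical comparison with the second error bound. The first bound involves the constant $C_f$, which the slide does not specify, so it is not computed here.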

Conclusion

• Tools from functional analysis must play a role in a theory of automata approximation

• Approximation theories can lead to useful algorithms for large-scale problems

• Change of basis in WA doesn't change the function but yields automata with different properties (e.g. for truncation)

Open Problems

• How can this be extended to other types of automata, state machines, co-algebras, etc.?

• How important is it that WA are closed under reversal for the error analysis? [see Rabusseau, Balle, Cohen 2016]

• Can the AAK theorems in control theory be generalised to WA?

• Can we obtain efficient approximation algorithms with extra constraints (e.g. sparsity, positivity)?
