LINEAR PROGRAMMING AND CIRCUIT IMBALANCES
IPCO Summer School, Georgia Tech, May 2021
László Végh
Slides available at https://personal.lse.ac.uk/veghl/ipco
LP in Operations Research
§ L: total bit-complexity of the rational input (A, b, c)
§ Simplex method: Dantzig, 1947
§ Interior point method: Karmarkar, 1984

min c^T x   s.t.   Ax = b,   x ≥ 0
Weakly vs strongly polynomial algorithms for LP
§ n variables, m constraints, total encoding length L.
§ Strongly polynomial algorithm: poly(n, m) elementary arithmetic operations, independent of L.
§ PSPACE: the bit-lengths of the numbers during the algorithm remain polynomially bounded in the size of the input.
§ Can also be defined in the real model of computation.

min c^T x   s.t.   Ax = b,   x ≥ 0
Is there a strongly polynomial algorithm for Linear Programming?
Strongly polynomial algorithms for some classes of Linear Programs
§ Systems of linear inequalities with at most two nonzero variables per inequality: Megiddo ’83
§ Network flow problems
§ Min-cost flow: Tardos ’85, Fujishige ’86, Goldberg-Tarjan ’89, Orlin ’93, …
§ Generalized flow: V ’17, Olver-V ’20
§ Discounted Markov Decision Processes: Ye ’05, Ye ’11, …
Dependence on the constraint matrix only
§ Algorithms with running time dependent only on A, but not on b and c.
§ Combinatorial LPs: integer matrix A ∈ Z^{m×n}, Δ_A = max{ |det B| : B square submatrix of A }
  Tardos ’86: poly(n, m, log Δ_A) black-box LP algorithm
§ Layered-least-squares (LLS) Interior Point Method, Vavasis-Ye ’96: poly(n, m, log χ̄_A) LP algorithm in the real model of computation; χ̄_A: condition number
§ Dadush-Huiberts-Natura-V ’20: poly(n, m, log χ̄*_A); χ̄*_A: optimized version of χ̄_A

min c^T x   s.t.   Ax = b,   x ≥ 0
Outline of the lectures
1. Tardos’s algorithm for min-cost flows
2. The circuit imbalance measure κ and the condition measure χ̄
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
5. Interior point methods: basic concepts
6. Layered-least-squares interior point methods
§ Dadush-Huiberts-Natura-V ’20: A scaling-invariant algorithm for linear programming whose running time depends only on the constraint matrix
§ Dadush-Natura-V ’20: Revisiting Tardos’s framework for linear programming: Faster exact solutions using approximate solvers
Part 1: Tardos’s algorithm for min-cost flows (circuits, proximity, and variable fixing)
The minimum-cost flow problem
§ Directed graph G = (V, E), node demands b: V → R with Σ_{v∈V} b(v) = 0, costs c: E → R; n = |E| arcs.

min c^T f   s.t.   Bf = b,   f ≥ 0      (B: node-arc incidence matrix)

§ The form with arc capacities can be reduced to this form.
§ The constraint matrix B is totally unimodular (TU).
The minimum-cost flow problem: optimality
§ Directed graph G = (V, E), node demands b: V → R with Σ_{v∈V} b(v) = 0, costs c: E → R.

min c^T f   s.t.   Bf = b,   f ≥ 0

§ Optimality: f_{ij} > 0 ⇒ π_j − π_i = c_{ij}

Dual solutions: potentials

max b^T π   s.t.   π_j − π_i ≤ c_{ij}   ∀(i, j) ∈ E

§ Residual cost: c^π_{ij} = c_{ij} + π_i − π_j ≥ 0
§ Residual graph: E_f = E ∪ { (j, i) : f_{ij} > 0 }, with c_{ji} = −c_{ij}

LEMMA: The primal feasible f is optimal ⟺ ∃π: c^π_{ij} ≥ 0 for all (i, j) ∈ E and c^π_{ij} = 0 if f_{ij} > 0 ⟺ ∃π: c^π_{ij} ≥ 0 for all (i, j) ∈ E_f.
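The optimality LEMMA above suggests a direct computational check: compute potentials by Bellman-Ford over the residual graph and look for a negative residual cycle. A minimal pure-Python sketch; the graph, costs, and flows are made-up illustrative data, not from the lecture:

```python
def residual_arcs(E, f, c):
    """Arcs of E_f: all forward arcs, plus (j,i) with cost -c[i,j] whenever f[i,j] > 0."""
    arcs = [(i, j, c[(i, j)]) for (i, j) in E]
    arcs += [(j, i, -c[(i, j)]) for (i, j) in E if f[(i, j)] > 0]
    return arcs

def is_optimal(V, E, f, c):
    arcs = residual_arcs(E, f, c)
    pi = {v: 0.0 for v in V}                      # Bellman-Ford from a virtual source
    for _ in range(len(V) - 1):
        for (i, j, cij) in arcs:
            pi[j] = min(pi[j], pi[i] + cij)
    # one extra pass: any improvement means a negative residual cycle => f not optimal
    return all(pi[j] <= pi[i] + cij + 1e-9 for (i, j, cij) in arcs)

# Example: route 1 unit from a to c; the direct arc costs 5, the path a-b-c costs 2.
V = ['a', 'b', 'c']
E = [('a', 'b'), ('b', 'c'), ('a', 'c')]
c = {('a', 'b'): 1, ('b', 'c'): 1, ('a', 'c'): 5}
f_bad  = {('a', 'b'): 0, ('b', 'c'): 0, ('a', 'c'): 1}   # uses the expensive arc
f_good = {('a', 'b'): 1, ('b', 'c'): 1, ('a', 'c'): 0}
print(is_optimal(V, E, f_bad, c), is_optimal(V, E, f_good, c))  # → False True
```

f_bad admits the negative residual cycle c→a→b→c of cost −5 + 1 + 1 = −3, so no valid potentials exist; for f_good the final potentials certify c^π ≥ 0 on all residual arcs.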
Variable fixing by proximity
§ If for some (i, j) ∈ E we can show that f*_{ij} = 0 in every optimal solution, then we can remove (i, j) from the graph.
§ Overall goal: in a strongly polynomial number of steps, guarantee that we can infer this for at least one arc.

PROXIMITY THEOREM: Let π̃ be an optimal dual potential for costs c̃, and f* an optimal primal solution for the original costs c. Then,

c̃^π̃_{ij} > n · ||c − c̃||_∞   ⇒   f*_{ij} = 0
Circulations and cycle decompositions
§ For the node-arc incidence matrix B, ker(B) ⊆ R^E is the set of circulations: in-flow = out-flow at every node.
§ Every circulation decomposes as a nonnegative combination of incidence vectors of directed cycles.

§ LEMMA: Let f and f′ be two feasible flows for the same demand vector b. Then, we can write

f′ = f + Σ_k λ_k g_{C_k},   λ_k > 0

for sign-consistent directed cycles C_k:
§ If f′_{ij} > f_{ij} then cycles may only contain (i, j) but not (j, i).
§ If f_{ij} > f′_{ij} then cycles may only contain (j, i) but not (i, j).
§ If f_{ij} = f′_{ij} then no cycle contains (i, j) or (j, i).
Every cycle is moving from f towards f′.
PROOF:

PROXIMITY THEOREM: Let π̃ be an optimal dual potential for costs c̃, and f* an optimal primal solution for the original costs c. Then,

c̃^π̃_{ij} > n · ||c − c̃||_∞   ⇒   f*_{ij} = 0
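The LEMMA's decomposition can be carried out explicitly: walk along arcs in the direction of g = f′ − f (forward where f′ > f, backward where f′ < f) until a node repeats, peel off that cycle, and repeat. A pure-Python sketch on made-up data, not from the slides:

```python
def cycle_decomposition(E, f1, f2):
    """Decompose f2 - f1 (a circulation) into sign-consistent directed cycles."""
    g = {e: f2[e] - f1[e] for e in E}
    cycles = []
    while True:
        support = [e for e in E if abs(g[e]) > 1e-9]
        if not support:
            return cycles
        (i, j) = support[0]
        path, node, seen = [], i, {}
        while node not in seen:
            seen[node] = len(path)
            # conservation of g guarantees a support arc leaves `node` in residual direction
            arc = next(e for e in support
                       if (g[e] > 0 and e[0] == node) or (g[e] < 0 and e[1] == node))
            path.append(arc)
            node = arc[1] if g[arc] > 0 else arc[0]
        cycle = path[seen[node]:]                  # the closed part of the walk
        eps = min(abs(g[e]) for e in cycle)
        for e in cycle:
            g[e] -= eps if g[e] > 0 else -eps      # subtract eps * g_C, sign-consistently
        cycles.append((eps, cycle))

E = [('a', 'b'), ('b', 'c'), ('a', 'c')]
f1 = {('a', 'b'): 0, ('b', 'c'): 0, ('a', 'c'): 1}
f2 = {('a', 'b'): 1, ('b', 'c'): 1, ('a', 'c'): 0}
print(cycle_decomposition(E, f1, f2))
```

Here the difference of the two flows is a single cycle traversing (a,b) and (b,c) forward and (a,c) backward, with weight 1.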
Rounding the costs
§ Rescale c such that ||c||_∞ = n²
§ Round costs as c̃_{ij} = ⌊c_{ij}⌋
§ For c̃ we can find optimal primal and dual solutions in strongly polynomial time, e.g. with the Out-of-Kilter method by Ford and Fulkerson, 1962.
§ For the optimal dual π̃, fix to 0 all arcs that have

c̃^π̃_{ij} > n ≥ n · ||c − c̃||_∞

§ QUESTION: Why would such an arc exist?
Minimum-norm projections
§ Residual cost: c^π = c + B^T π, i.e. c^π_{ij} = c_{ij} + π_i − π_j
§ The cost vectors C = {c^π : π ∈ R^V} ⊂ R^E form an affine subspace.
§ For any feasible flow f and any residual cost c^π,

(c^π)^T f = c^T f + b^T π

§ Solving the problem for c and for c^π is equivalent.
§ If 0 ∈ C, i.e. ∃π: c^π ≡ 0, then every feasible flow is optimal.
§ IDEA: Replace the input cost c by the minimum-norm projection to the affine subspace C:

ĉ = argmin { ||c^π||_2 : c^π ∈ C }
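The min-norm projection is a plain least-squares problem: minimize ||c + B^T π||₂ over the potentials π. A small numpy sketch on an illustrative triangle graph, taking B as the node-arc incidence matrix; the min-norm point ĉ is the component of c in ker(B), i.e. the "cyclic part" of the costs:

```python
import numpy as np

# node-arc incidence matrix for arcs a->b, b->c, a->c (rows: a, b, c)
B = np.array([[ 1.,  0.,  1.],
              [-1.,  1.,  0.],
              [ 0., -1., -1.]])
c = np.array([1.0, 1.0, 5.0])

# minimize ||B^T pi + c||_2  =>  least squares with matrix B^T and target -c
pi, *_ = np.linalg.lstsq(B.T, -c, rcond=None)
c_hat = c + B.T @ pi

# c_hat is the min-norm point of C = {c + B^T pi}: it is orthogonal to
# range(B^T), which is exactly the condition B @ c_hat = 0 (a circulation).
print(np.round(c_hat, 4), round(float(np.linalg.norm(c_hat)), 4))
```

For this instance the cycle space is spanned by (1, 1, −1), so ĉ = (−1, −1, 1) with norm √3: the projection forgets the "potential part" of c, which is irrelevant for optimality.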
Rounding the costs
§ Assume c is chosen as a minimum-norm projection:

||c^π||_2 ≥ ||c||_2   ∀π ∈ R^V

§ Rescale c such that ||c||_∞ = n²
§ Round costs as c̃_{ij} = ⌊c_{ij}⌋
§ For the optimal dual π̃, fix to 0 all arcs that have c̃^π̃_{ij} > n ≥ n · ||c − c̃||_∞
§ LEMMA: There exists at least one such arc.
PROOF: combine c̃^π̃ ≥ 0 with the norm bound ||c̃^π̃||_∞ ≥ ||c̃^π̃||_2 / √n and the minimality of ||c||_2.
Summary of Tardos’s algorithm
§ Variable fixing based on proximity, which can be shown by cycle decomposition.
§ Round to small integer costs c̃.
§ Find an optimal dual π̃ for c̃ with a simple classical method.
§ Identify variables with f*_{ij} = 0 as those where c̃^π̃_{ij} is large, and remove all such arcs.
§ Iterate.
Part 2: The circuit imbalance measure κ_A and the condition measure χ̄_A
The circuit imbalance measure
§ The matrix A ∈ R^{m×n} defines a linear matroid on [n] = {1, 2, …, n}: a set I ⊆ [n] is independent if the columns {A_i : i ∈ I} are linearly independent.
§ C ⊆ [n] is a circuit if {A_i : i ∈ C} is a linearly dependent set that is minimal for containment.
§ For a circuit C, there exists a vector g^C ∈ R^C, unique up to a scalar multiplier, such that

Σ_{i∈C} g^C_i A_i = 0

§ The circuit imbalance measure is defined as

κ_A = max { |g^C_j| / |g^C_i| : C circuit of A, i, j ∈ C }

[Figure on slide: an example circuit vector g^C = (5, 2.4, −3, −1).]
§ This measure depends only on the linear subspace W = ker(A): if ker(A) = ker(A′) then κ_A = κ_{A′}
§ We will use κ_W = κ_A for W = ker(A)
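For tiny matrices the definition can be evaluated by brute force: enumerate candidate column sets, keep those that are circuits (rank |C| − 1 with a full-support kernel vector), and take the worst entry ratio. An exponential-time numpy sketch, intended only to make the definition concrete:

```python
import numpy as np
from itertools import combinations

def circuit_imbalance(A, tol=1e-9):
    n = A.shape[1]
    kappa = 1.0
    for k in range(2, n + 1):
        for C in combinations(range(n), k):
            sub = A[:, C]
            # C is a circuit iff it is dependent with a 1-dim kernel (rank |C| - 1)
            # and every proper subset independent (kernel vector has full support)
            if np.linalg.matrix_rank(sub) != k - 1:
                continue
            g = np.linalg.svd(sub)[2][-1]          # kernel vector of the columns
            if np.min(np.abs(g)) < tol:            # a proper subset is already dependent
                continue
            kappa = max(kappa, np.max(np.abs(g)) / np.min(np.abs(g)))
    return kappa

# TU node-arc incidence matrix of a triangle: all circuit vectors are +-1, kappa = 1
A = np.array([[1., 0., 1.], [-1., 1., 0.], [0., -1., -1.]])
# kernel spanned by (2, -1, 1), so kappa = 2
A2 = np.array([[1., 2., 0.], [0., 1., 1.]])
print(circuit_imbalance(A), circuit_imbalance(A2))
```

The supports with rank deficiency exactly one and no zero kernel entry are precisely the circuits; everything else is either independent or a non-minimal dependent set.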
Connection to subdeterminants:
§ For an integer matrix A ∈ Z^{m×n}, Δ_A = max{ |det B| : B square submatrix of A }
§ For a circuit C ∈ C_A with |C| = k, let M ∈ R^{(k−1)×k} be a submatrix of the columns A_C with linearly independent rows.
§ Let M^{(i)} ∈ R^{(k−1)×(k−1)} be obtained by removing the i-th column from M. By Cramer’s rule,

g^C = ± ( det M^{(1)}, −det M^{(2)}, …, (−1)^{k−1} det M^{(k)} )
Properties of κ_A
§ LEMMA: For an integer matrix A ∈ Z^{m×n}, κ_A ≤ Δ_A. For a totally unimodular matrix A, κ_A = 1.
§ EXERCISE:
  i. If A is the node-edge incidence matrix of an undirected graph, then κ_A ∈ {1, 2}.
  ii. For the incidence matrix of a complete undirected graph on n nodes, Δ_A ≥ 2^{Ω(n)}.

THEOREM (Cederbaum, 1958): If A ∈ R^{m×n} is a TU matrix, then κ_A = 1. Conversely, if κ_W = 1 for a linear subspace W ⊂ R^n, then there exists a TU matrix A such that W = ker(A).
PROOF (Ekbatani & Natura):
Duality of circuit imbalances
κ_W = κ_{W^⊥}
Circuits in optimization § Appear in various LP algorithms directly or indirectly
§ IPCO summer school 2020: Laura Sanità’s lectures discussed circuit augmentation algorithms and circuit diameter
§ …
The condition number χ̄_A

χ̄_A = sup { ||A^T (A D A^T)^{−1} A D|| : D positive diagonal matrix }

§ Measures the norm of oblique projections
§ Introduced by Dikin 1967, Stewart 1989, Todd 1990
§ THEOREM (Vavasis-Ye 1996): There exists a poly(n, m, log χ̄_A) LP algorithm for min c^T x, Ax = b, x ≥ 0, A ∈ R^{m×n}
§ LEMMA:
  i. If A is an integer matrix with bit encoding length L, then χ̄_A ≤ 2^{O(L)}
  ii. χ̄_A = max { ||B^{−1}A|| : B nonsingular m×m submatrix of A }
  iii. χ̄_A only depends on the subspace W = ker(A)
  iv. χ̄_W = χ̄_{W^⊥}
The lifting operator
§ For a linear subspace W ⊂ R^n and index set I ⊆ [n], we let

π_I : R^n → R^I

denote the coordinate projection, and W_I = {x_I : x ∈ W}
§ The lifting operator L^W_I : W_I → R^n is defined as

L^W_I(p) = argmin { ||x||_2 : x ∈ W, x_I = p }

§ This is a linear operator; we can efficiently compute a matrix M such that L^W_I(p) = Mp.
§ LEMMA:

κ_W = max { ||L^W_I(p)||_∞ / ||p||_1 : I ⊆ [n], p ∈ W_I \ {0} }

PROOF:
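The lifting operator is computable by least squares once W is given by a column basis V: parametrize x = Vα, match x_I = p, and minimize ||Vα||₂ over the remaining freedom. A numpy sketch, using as W the illustrative subspace span{(0,1,1,M), (1,0,M,1)} from Part 4 with M = 3:

```python
import numpy as np

def lift(V, I, p):
    """Min-norm x in W = col(V) with x[I] = p (assumes p lies in W_I)."""
    VI = V[I, :]
    alpha, *_ = np.linalg.lstsq(VI, p, rcond=None)   # one solution of VI @ alpha = p
    # remaining freedom: alpha + N @ t, with N a basis of the nullspace of VI;
    # minimizing ||alpha|| is NOT the same as minimizing ||V @ alpha||, so
    # optimize over t explicitly.
    _, s, Vh = np.linalg.svd(VI)
    r = int(np.sum(s > 1e-9))
    N = Vh[r:].T
    if N.size:
        t, *_ = np.linalg.lstsq(V @ N, -V @ alpha, rcond=None)
        alpha = alpha + N @ t
    return V @ alpha

M = 3.0
V = np.array([[0, 1], [1, 0], [1, M], [M, 1]], dtype=float)  # columns span W
x = lift(V, [1], np.array([1.0]))
print(np.round(x, 4))   # min-norm vector of W whose second coordinate equals 1
```

For this data the lift is (−6/11, 1, −7/11, 27/11), and ||x||_∞ = 27/11 ≤ κ_W · ||p||_1 is consistent with the LEMMA above.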
The condition numbers κ and χ̄
Approximability of κ and χ̄:

LEMMA (Tunçel 1999): It is NP-hard to approximate χ̄_A by a factor better than 2^{poly(size(A))}

THEOREM: For every matrix A ∈ R^{m×n} with n ≥ 2,

√(1 + κ_A²) ≤ χ̄_A ≤ n · κ_A
Part 3 Solving LPs: from approximate to exact
Fast approximate LP algorithms
§ Approximately optimal: c^T x ≤ OPT + ε
§ Finding an approximate solution with log(1/ε) running time dependence implies a weakly polynomial exact algorithm.

min c^T x   s.t.   Ax = b,   x ≥ 0
§ n variables, m equality constraints; randomized vs. deterministic
§ Significant recent progress, with running times of roughly matrix-multiplication order and log(1/ε) dependence:
  Lee–Sidford ’13–’19
  Cohen, Lee, Song ’19
  van den Brand ’20
  van den Brand, Lee, Liu, Saranurak, Sidford, Song, Wang ’21
§ Some important techniques:
  § weighted and stochastic central paths
  § fast approximate linear algebra
  § efficient data structures

min c^T x   s.t.   Ax = b,   x ≥ 0
Fast exact LP algorithms with χ̄_A dependence
§ n variables, m equality constraints

THEOREM (Dadush, Natura, V. ’20): There exists a poly(n, m, log χ̄_A) algorithm for solving LP exactly.
§ Feasibility: m calls to an approximate solver
§ Optimization: mn calls to an approximate solver with ε = 1/poly(n, m, χ̄_A). Using van den Brand ’20, this gives a deterministic exact LP optimization algorithm whose running time carries a log(χ̄_A + n) factor in place of log(1/ε).

§ Generalization of Tardos ’86 to real constraint matrices, working directly with approximate solvers.
§ Main difference: the arguments in Tardos ’86 heavily rely on integrality assumptions.

min c^T x   s.t.   Ax = b,   x ≥ 0
Hoffman’s proximity theorem
Polyhedron P = {x ∈ R^n : Ax ≤ b}, point x₀ ∉ P, norms ||·||_p, ||·||_q

THEOREM (Hoffman, 1952): There exists a constant H_{p,q}(A) such that

∃x ∈ P : ||x − x₀||_p ≤ H_{p,q}(A) · ||(Ax₀ − b)⁺||_q

§ Subspace form: W = ker(A), d ∈ R^n s.t. Ad = b

min c^T x   s.t.   Ax = b,  x ≥ 0        ⇔   min c^T x   s.t.   x ∈ W + d,  x ≥ 0
max b^T y   s.t.   A^T y + s = c,  s ≥ 0  ⇔   max d^T(c − s)   s.t.   s ∈ W^⊥ + c,  s ≥ 0
Proximity theorem with κ_W
THEOREM: For A ∈ R^{m×n}, d ∈ R^n, consider the system

x ∈ W + d,   x ≥ 0.

There exists a feasible solution x such that

||x − d||_∞ ≤ κ_W · ||d⁻||_1

PROOF:
Linear feasibility algorithm
Linear feasibility problem:

x ∈ W + d,   x ≥ 0.

§ Recursive algorithm using a stronger problem formulation: find x with

x ∈ W + d,   x ≥ 0,   ||x − d||_∞ ≤ O(κ_W²) · ||d⁻||_1

§ Black box oracle for ε = 1/poly(n, κ_W): returns x ∈ W + d whose negative part is bounded by ε times the scale of the instance.

The linear feasibility algorithm
1. Call the black box solver to find a solution x for ε = 1/poly(n, κ_W).
2. Set I = {i ∈ [n] : x_i below a κ_W-dependent threshold}; assume I ≠ [n].
3. Recursively obtain x̃ ∈ π_I(W) + x_I from the projected instance (π_I(W), x_I), and lift the correction back to R^n via the lifting operator.

§ If I = [n], then we replace d by its projection onto W^⊥.
§ Bound on the number of recursive calls; can be decreased further.
§ Using van den Brand ’20, this yields a feasibility algorithm with a log(κ_W + n) factor in place of log(1/ε).
The linear feasibility algorithm
§ In case of infeasibility, we return an exact Farkas certificate.
§ κ_W is hard to approximate within 2^{poly(m)} (Tunçel 1999)
§ We use an estimate κ̂ in the algorithm.
§ The algorithm may fail if ||L^W_I(x̃ − x_I)||_∞ > κ̂ · ||x̃ − x_I||_1
§ In this case, we restart with

κ̂ := max { 2κ̂, ||L^W_I(x̃ − x_I)||_∞ / ||x̃ − x_I||_1 }

§ Our estimate never overshoots κ_W by much, but can be significantly better.
Proximity for optimization

THEOREM: Let s ∈ W^⊥ + c, s ≥ 0 be a feasible dual solution, and assume the primal is also feasible. Then there exists a primal optimal x* ∈ W + d, x* ≥ 0 such that

||x* − d||_∞ ≤ κ_W · ( ||d⁻||_1 + ||d_{supp(s)}||_1 )

min c^T x   s.t.   x ∈ W + d,  x ≥ 0
max d^T(c − s)   s.t.   s ∈ W^⊥ + c,  s ≥ 0
Optimization algorithm
§ At most n Outer Loops, each comprising at most m Inner Loops
§ Each Outer Loop finds c̃ with ||c − c̃|| "small", and (x̃, s̃) primal and dual optimal solutions to

min c̃^T x   s.t.   x ∈ W + d,  x ≥ 0

§ Using proximity, we can use this to conclude x*_J > 0 for a certain variable set J ⊆ [n] and recurse.

min c^T x   s.t.   x ∈ W + d,  x ≥ 0
max d^T(c − s)   s.t.   s ∈ W^⊥ + c,  s ≥ 0
Part 4 Optimizing circuit imbalances
Diagonal rescaling of LP
For a positive diagonal matrix D, substituting x′ = D^{−1}x, s′ = Ds transforms

min c^T x   s.t.   Ax = b,  x ≥ 0          →   min (Dc)^T x′   s.t.   ADx′ = b,  x′ ≥ 0
max b^T y   s.t.   A^T y + s = c,  s ≥ 0    →   max b^T y   s.t.   (AD)^T y + s′ = Dc,  s′ ≥ 0

§ Natural symmetry of LPs and many LP algorithms.
§ The Central Path is invariant under diagonal scaling.
§ Most “standard” interior point methods are invariant.
Optimizing κ_A and χ̄_A by rescaling
D = set of n×n positive diagonal matrices

κ*_A = inf{ κ_{AD} : D ∈ D }        χ̄*_A = inf{ χ̄_{AD} : D ∈ D }

§ A scaling invariant algorithm with κ_A dependence automatically yields κ*_A dependence.
§ Recall √(1 + κ_A²) ≤ χ̄_A ≤ n · κ_A.

THEOREM (Tunçel 1999): It is NP-hard to approximate χ̄_A (and thus κ_A) by a factor better than 2^{poly(size(A))}

THEOREM (Dadush-Huiberts-Natura-V ’20): Given A ∈ R^{m×n}, in O(n²m² + n^ω) time, one can
• approximate the value κ*_A within a factor n(κ*_A)², and
• compute a rescaling D ∈ D satisfying κ_{AD} ≤ n(κ*_A)³.
Approximating κ*_A
D = set of n×n positive diagonal matrices
κ*_A = inf{ κ_{AD} : D ∈ D }

§ EXAMPLE: Let ker(A) = span( (0, 1, 1, M), (1, 0, M, 1) )
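A numerical companion to the EXAMPLE (assumes numpy; brute-force circuit enumeration, so tiny instances only). For this kernel the circuits give κ_A = M² − 1, while rescaling the last two coordinates by M brings the imbalance down to M; the grid search below is a crude sketch, not the algorithm of the theorem:

```python
import numpy as np
from itertools import combinations

def kappa(A, tol=1e-9):
    n, best = A.shape[1], 1.0
    for k in range(2, n + 1):
        for C in combinations(range(n), k):
            sub = A[:, C]
            if np.linalg.matrix_rank(sub) != k - 1:   # circuits: rank |C| - 1 ...
                continue
            g = np.linalg.svd(sub)[2][-1]             # ... with 1-dim kernel
            if np.min(np.abs(g)) < tol:               # ... of full support
                continue
            best = max(best, np.max(np.abs(g)) / np.min(np.abs(g)))
    return best

M = 8.0
V = np.array([[0, 1], [1, 0], [1, M], [M, 1]], dtype=float)  # columns span W
A = np.linalg.svd(V.T)[2][2:]        # rows span W-perp, hence ker(A) = W

k0 = kappa(A)                        # the circuit on {1,2,4} has entries (1, M, M^2 - 1)
best = min(kappa(A @ np.diag([1.0, 1.0, t, t])) for t in [1, 2, 4, 8, 16])
print(round(k0, 3), round(best, 3))  # the grid attains kappa = M at t = M
```

The cycle lemma of this part certifies optimality here: the 2-cycle on coordinates {3, 4} has κ_34 · κ_43 = M², so κ*_A ≥ M, matching the rescaled value.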
Pairwise circuit imbalances
§ For a circuit C, there exists a vector g^C ∈ R^C, unique up to a scalar multiplier, such that

Σ_{i∈C} g^C_i A_i = 0

§ Pairwise imbalance: κ_{ij} = max { |g^C_j| / |g^C_i| : C circuit with i, j ∈ C }
§ The circuit imbalance measure is κ_A = max_{i,j∈[n]} κ_{ij}

Circuit imbalance min-max formula
LEMMA: For any directed cycle H on {1, 2, …, n},

(κ*_A)^{|H|} ≥ Π_{(i,j)∈H} κ_{ij}

§ BUT: computing the κ_{ij} values is NP-hard…
§ LEMMA: For any circuit C ∈ C_A s.t. i, j ∈ C,

|g^C_j| / |g^C_i| ≥ κ_{ij} / (κ*_A)²
Part 5 Interior point methods: basic concepts
Primal and dual LP
§ Matrix form: A ∈ R^{m×n}, b ∈ R^m, c ∈ R^n
§ Subspace form: W = ker(A), d ∈ R^n s.t. Ad = b

min c^T x   s.t.   Ax = b,  x ≥ 0
max b^T y   s.t.   A^T y + s = c,  s ≥ 0
max d^T(c − s)   s.t.   s ∈ W^⊥ + c,  s ≥ 0

§ Complementary slackness: primal and dual solutions (x, s) are optimal iff x^T s = 0: for each i ∈ [n], either x_i = 0 or s_i = 0.
§ Optimality gap: c^T x − d^T(c − s) = x^T s.
The central path
§ For each μ > 0, there exists a unique solution w(μ) = (x(μ), y(μ), s(μ)) such that

x_i(μ) s_i(μ) = μ   ∀i ∈ [n]

the central path element for μ.
§ The central path is the algebraic curve formed by {w(μ) : μ > 0}
§ For μ → 0, the central path converges to an optimal solution w* = (x*, y*, s*).
§ The optimality gap is x(μ)^T s(μ) = nμ.
§ Interior point algorithms: walk down along the central path with μ decreasing geometrically.
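For a tiny LP the central path can be computed directly. Take the made-up example min x₁ + 2x₂ s.t. x₁ + x₂ + x₃ = 1, x ≥ 0: the dual is a single scalar y < 0, s = c − A^T y, and x_i(μ) = μ/s_i(y), so the primal constraint Σ_i x_i = 1 becomes a monotone scalar equation in y, solvable by bisection. A pure-Python sketch:

```python
def central_path_point(mu):
    """Central path point of: min x1 + 2*x2, x1 + x2 + x3 = 1, x >= 0."""
    c = [1.0, 2.0, 0.0]
    def primal_sum(y):
        return sum(mu / (ci - y) for ci in c)     # = x1 + x2 + x3, increasing in y
    lo, hi = -1e9, -1e-12                         # y < 0 keeps all s_i > 0
    for _ in range(200):                          # bisection on primal_sum(y) = 1
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if primal_sum(mid) < 1 else (lo, mid)
    y = (lo + hi) / 2
    s = [ci - y for ci in c]
    x = [mu / si for si in s]                     # x_i s_i = mu by construction
    return x, y, s

x, y, s = central_path_point(0.01)
print([round(v, 4) for v in x])                   # near the optimum (0, 0, 1)
```

As μ shrinks, x(μ) drifts toward the optimal vertex (0, 0, 1), and the gap x^T s equals nμ = 3μ exactly.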
The Mizuno-Todd-Ye Predictor-Corrector Algorithm
§ Start from a point w⁰ = (x⁰, y⁰, s⁰) 'near' the central path at some μ⁰ > 0.
§ Alternate between
  § Predictor steps: 'shoot down' the central path, decreasing μ by a factor at least 1 − β/√n. May move slightly 'farther' from the central path.
  § Corrector steps: do not change the parameter μ, but move back 'closer' to the central path.
Within O(√n) iterations, μ decreases by a factor 2.
The predictor step
§ Step direction Δw = (Δx, Δy, Δs) obtained from

AΔx = 0
A^TΔy + Δs = 0
s_iΔx_i + x_iΔs_i = −x_i s_i   ∀i ∈ [n]

§ Pick the largest α ∈ [0, 1] such that w′ is still “close enough” to the central path:

w′ = w + αΔw = (x + αΔx, y + αΔy, s + αΔs)

§ Long step: |Δx_iΔs_i| small for every i ∈ [n]. The new optimality gap is (1 − α) · x^T s.

The predictor step: subspace view
§ Assume the current point w = (x, y, s) is on the central path. The steps can be found as minimum-norm projections in the (1/x) and (1/s) rescaled norms:

Δx = argmin { Σ_{i=1}^n ((x_i + Δx_i)/x_i)² : Δx ∈ W }
Δs = argmin { Σ_{i=1}^n ((s_i + Δs_i)/s_i)² : Δs ∈ W^⊥ }

with s_iΔx_i + x_iΔs_i = −x_i s_i for all i ∈ [n].
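The predictor system is a structured linear solve. A numpy sketch at a made-up strictly feasible interior point of a tiny LP; since Δx ∈ ker(A) and Δs ∈ range(A^T), we have Δx^TΔs = 0, so the gap shrinks by exactly the factor (1 − α):

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0]])            # single constraint: x1 + x2 + x3 = 1
c = np.array([1.0, 2.0, 0.0])
x = np.array([0.2, 0.3, 0.5])              # strictly feasible primal
y = np.array([-1.0])
s = c - A.T @ y                            # strictly feasible dual slack (2, 3, 1)

# KKT system for (dx, dy, ds): A dx = 0; A^T dy + ds = 0; s*dx + x*ds = -x*s
K = np.zeros((7, 7))
K[0, :3] = A
K[1:4, 3:4] = A.T
K[1:4, 4:7] = np.eye(3)
K[4:7, :3] = np.diag(s)
K[4:7, 4:7] = np.diag(x)
rhs = np.concatenate([np.zeros(4), -x * s])
z = np.linalg.solve(K, rhs)
dx, dy, ds = z[:3], z[3:4], z[4:7]

alpha = 0.5
gap0 = x @ s
gap1 = (x + alpha * dx) @ (s + alpha * ds)
print(round(float(gap0), 6), round(float(gap1), 6))   # → 1.8 0.9
```

The exact halving of the gap at α = 1/2 is the algebraic identity behind the "new optimality gap is (1 − α)·x^T s" claim above.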
Some recent progress on interior point methods
§ Tremendous recent progress on fast approximate variants: LS’14–’19, CLS’19, vdB’20, vdBLSS’20, vdBLLSSSW’21
§ Fast approximate algorithms for combinatorial problems (flows, matching, and MDPs): DS’08, M’13, M’16, CMSV’17, AMV’20, vdBLNPTSSW’20, vdBLLSSSW’21
Part 6 Layered-least-squares interior point methods
Layered-least-squares (LLS) Interior Point Methods: dependence on the constraint matrix only

χ̄*_A = inf{ χ̄_{AD} : D ∈ D }

§ Monteiro-Tsuchiya ’03: O(n^{3.5} log(χ̄*_A + n) + n² log log(1/ε)) iterations
§ Lan-Monteiro-Tsuchiya ’09: O(n^{3.5} log(χ̄*_A + n)) iterations, but the running time of the iterations depends on b and c
§ Dadush-Huiberts-Natura-V ’20: scaling invariant LLS method with O(n^{2.5} log(n) log(χ̄*_A + n)) iterations
Near monotonicity of the central path

LEMMA: For w = (x, y, s) on the central path, and for any feasible solution w′ = (x′, y′, s′) s.t. (x′)^T s′ ≤ x^T s, we have

Σ_{i=1}^n ( x′_i/x_i + s′_i/s_i ) ≤ 2n

The IPM gradually learns improved upper bounds on the optimal solution.
Variable fixing… or not?
LEMMA: After every iteration, there exist variables x_i and s_j such that

1/poly(n) ≤ x_i/x*_i ,  s_j/s*_j ≤ poly(n)

for the optimal w* = (x*, y*, s*). Thus, x_i and s_j have “converged” to their final values.
§ PROOF: Can be shown using the form of the predictor step:

Δx = argmin { Σ_{i=1}^n ((x_i + Δx_i)/x_i)² : Δx ∈ W }
Δs = argmin { Σ_{i=1}^n ((s_i + Δs_i)/s_i)² : Δs ∈ W^⊥ }

and bounds on the stepsize.
§ We cannot identify these indices, just show their existence.
Layered least squares methods
§ Instead of the standard predictor step, split the variables into layers.
§ Force new primal and dual variables that must have converged.
Recap: the lifting operator and κ_W
§ For a linear subspace W ⊂ R^n and index set I ⊆ [n], we let

π_I : R^n → R^I

denote the coordinate projection.
§ The lifting operator L^W_I : W_I → R^n is defined as

L^W_I(p) = argmin { ||x||_2 : x ∈ W, x_I = p }

§ LEMMA: For every p ∈ W_I there exists z ∈ W such that

z_I = p   and   ||z||_∞ ≤ κ_W · ||p||_1
Motivating the layering idea: final rounding step in standard IPM

min c^T x   s.t.   Ax = b,  x ≥ 0
max b^T y   s.t.   A^T y + s = c,  s ≥ 0

§ Limit optimal solution (x*, y*, s*), and optimal partition [n] = B ∪ N s.t. B = supp(x*), N = supp(s*).
§ Given (x, y, s) near the central path with ‘small enough’ μ = x^T s/n such that for every i ∈ [n], either x_i or s_i is very small.
§ Assume that we can correctly guess B = {i : x_i > s_i}, N = {i : s_i > x_i}.
§ Goal: move to x′ = x + Δx, y′ = y + Δy, s′ = s + Δs s.t. supp(x′) ⊆ B, supp(s′) ⊆ N. Then (x′)^T s′ = 0: an optimal solution.
§ Choice: Δx = −L^W_N(x_N),   Δs = −L^{W^⊥}_B(s_B)
Layered-least-squares step
Assume we have a partition B, N as above.
Standard primal predictor step:

Δx = argmin { Σ_{i=1}^n ((x_i + Δx_i)/x_i)² : Δx ∈ W }

Layered step: first solve the least-squares problem on N,

Δx_N = argmin { Σ_{i∈N} ((x_i + Δx_i)/x_i)² : Δx_N ∈ π_N(W) }

then extend it over the layer B.
Layered-least-squares step (Vavasis-Ye ’96)
§ Order the variables decreasingly as x_1 ≥ x_2 ≥ … ≥ x_n
§ Arrange the variables into layers (J_1, J_2, …, J_k); start a new layer whenever the gap between consecutive values x_i, x_{i+1} exceeds a poly(n)·χ̄_A threshold
§ Primal step direction obtained by least-squares problems solved from the back, layer-by-layer

Not scaling invariant!
Progress measure: crossover events (Vavasis-Ye ’96)
§ DEFINITION: The variables x_i and x_j cross over between μ and μ′, μ > μ′, if
  § g(n, χ̄_A) · x_j(μ) ≥ x_i(μ)
  § g(n, χ̄_A) · x_j(μ″) < x_i(μ″) for any μ″ ≤ μ′
§ LEMMA: In the Vavasis-Ye algorithm, a crossover event happens every O(n^{1.5} log(χ̄_A + n)) iterations, totalling O(n^{3.5} log(χ̄_A + n)).
Scaling invariant layering (Dadush-Huiberts-Natura-V ’20)
§ Instead of the ratios x_i/x_j, we consider the rescaled circuit imbalance measures κ_{ij} · x_i/x_j
§ Layers: strongly connected components of the arc set

{ (i, j) : κ_{ij} · x_i/x_j > 1 }

Scaling invariant crossover events (cf. Vavasis-Ye ’96)
§ DEFINITION: as in the Vavasis-Ye definition, with the ratios x_i(μ)/x_j(μ) replaced by the rescaled measures κ_{ij} · x_i(μ)/x_j(μ)
§ Amortized analysis, resulting in an improved O(n^{2.5} log(n) log(χ̄*_A + n)) iteration bound.
Limitation of IPMs
§ THEOREM (Allamigeon–Benchimol–Gaubert–Joswig ‘18): No standard path following method can be strongly polynomial.
§ Proof using tropical geometry: studies the tropical limit of a family of parametrized linear programs.
Future directions
§ Circuit imbalance measure: a key parameter for strongly polynomial solvability.
§ LP classes where the existence of strongly polynomial algorithms is open:
  § LPs with 2 nonzeros per column in the constraint matrix, equivalently: min-cost generalized flows
  § Undiscounted Markov Decision Processes
§ Extend the theory of circuit imbalances more generally, to convex programming and integer programming.
Thank you!
Slides available at https://personal.lse.ac.uk/veghl/ipco
LP Operations Research
§ L: total bit-complexity of the rational input (, , )
§ Simplex method: Dantzig, 1947
§ Interior point method: Karmarkar, 1984
min ! = ≥ 0
Weakly vs strongly polynomial algorithms for LP § variables, constraints, total encoding L.
§ Strongly polynomial algorithm:
§ PSPACE: The bit-length of numbers during the algorithm remain polynomially bounded in the size of the input.
§ Can also be defined in the real model of computation
min ! = ≥ 0
Is there a strongly polynomial algorithm for Linear
Programming?
Strongly polynomial algorithms for some classes of Linear Programs
§ Systems of linear inequalities with at most two nonzero variables per inequality: Megiddo ’83
§ Network flow problems
§ Min-cost flow: Tardos ’85, Fujishige ’86, Goldberg-Tarjan ’89, Orlin ’93, …
§ Generalized flow: V ’17, Olver-V ’20
§ Discounted Markov Decision Processes: Ye ’05, Ye ’11, …
Dependence on the constraint matrix only
§ Algorithms with running time dependent only on , but not on and .
§ Combinatorial LP’s: integer matrix ∈ !×#. Δ$ = max{| det |: submatrix of }
Tardos ’86: poly ,, log Δ$ black box LP algorithm
§ Layered-least-squares (LLS) Interior Point Method Vavasis-Ye ’96: poly ,, log $ LP algorithm in the real model of computation $: condition number
§ Dadush-Huiberts-Natura-V ’20: poly ,, log $∗ $∗ : optimized version of $
min !, = ≥ 0
Outline of the lectures
2. The circuit imbalance measure " and the condition measure "
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
6. Layered-least-squares interior point methods
§ Dadush-Huiberts-Natura-V ’20: A scaling-invariant algorithm for linear programming whose running time depends only on the constraint matrix
§ Dadush-Natura-V ’20: Revisiting Tardos’s framework for linear programming: Faster exact solutions using approximate solvers
Part 1 Tardos’s algorithm for min-cost flows circuits, proximity, and variable fixing
§ Directed graph = (, ), node demands : → with = 0, costs : → .
min !
§ Form with arc capacities can be reduced to this form.
§ Constraint matrix is totally unimodular (TU)
The minimum-cost flow problem
arcs
§ Directed graph = (, ), node demands : → with = 0, costs : → .
min !
§ Optimality: #" > 0 "−# = #"
The minimum-cost flow problem: optimality
!
max ! s. t. " − # ≤ #" ∀ ∈
§ Residual cost: #"$ = #" + # − " ≥ 0
§ Residual graph: % = ∪ , : #" > 0 "# = −#"
Dual solutions: potentials
1
-1
LEMMA: The primal feasible f is optimal ∃: +,- ≥ 0 for all , ∈ and +,- = 0 if +, > 0 ∃: +,- ≥ 0 for all , ∈ .
#
Variable fixing by proximity § If for some (, ) ∈ we can show that +,∗ = 0 in every
optimal solution, then we can remove (, ) from the graph.
§ Overall goal: in strongly polynomial number of steps, guarantee that we can infer this for at least one arc.
PROXIMITY THEOREM: Let Q be the optimal dual potential for costs , and ∗ an optimal primal solution for the original costs . Then,
+, /- > ⋅ − 0 ⇒ +,∗ = 0
§ For the node-arc incidence matrix , ker ⊆ ) is the set of circulations:
in-flow=out-flow
=7 #
Circulations and cycle decompositions
§ LEMMA: Let and + be two feasible flows for the same demand vector . Then, we can write
′ = +7 #
for sign-consistent directed cycles # in :
§ If #"+ > #" then cycles may only contain but not .
§ If #" > #"+ then cycles may only contain but not .
§ If #" = #"+ then no cycle contains or .
Every cycle is moving from towards ′.
Circulations and cycle decompositions
PROOF:
PROXIMITY THEOREM: Let R be the optimal dual potential for costs , and ∗ an optimal primal solution for the original costs . Then,
#"-. > ⋅ − / ⇒ #"∗ = 0
Rounding the costs § Rescale such that 0 = ||
§ Round costs as +, = ⌊+,⌋
§ For we can find optimal primal and dual solutions in strongly polynomial time, e.g. the Out-of-Kilter method by Ford and Fulkerson 1962.
§ For the optimal dual Q, fix all arcs to 0 that have
+, /- > > ⋅ − 0
§ QUESTION: Why would such an arc exist?
Minimum-norm projections § Residual cost:
§ The cost vectors = {.: ∈ 0} ⊂ )
form an affine subspace.
§ For any feasible flow and any residual cost ., . ! = ! + !
§ Solving the problem for and . is equivalent.
§ If 0 ∈ , i.e. ∃: c. ≡ 0, then every feasible flow is optimal
§ IDEA: Replace the input by the min norm projection to the affine subspace :
. = arg min .∈$
. 2 #
Rounding the costs § Assume is chosen as a min norm projection:
≥ ∀ ∈
§ Rescale such that / = ||
§ Round costs as #" = ⌊#"⌋
§ For the optimal dual R, fix all arcs to 0 that have #" -. > > ⋅ − /
§ LEMMA: There exist at least one such arc. PROOF:
-. / ≥ -. 2
-. ≥ 0
Summary of Tardos’s algorithm § Variable fixing based on proximity that can be shown by
cycle decomposition.
§ Round to small integer costs
§ Find optimal dual R for with simple classical method
§ Identify a variable #"∗ = 0 as one where #"-. is large and remove all such arcs.
§ Iterate
2. The circuit imbalance measure " and the condition measure "
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
6. Layered-least-squares interior point methods
Part 2 The circuit imbalance measure ! and the condition measure !
The circuit imbalance measure § The matrix ∈ !×# defines a linear matroid on
= {1,2, … , }: a set ⊆ [] is independent if the columns {$: ∈ } are linearly independent.
§ ⊆ [] is a circuit if $: ∈ is a linearly dependent set minimal for containment.
§ For a circuit , there exists a vector % ∈ % unique up to a scalar multiplier such that
5 $∈%
§ The circuit imbalance measure is defined as
' = max (%
5 2.4 -3 -1
"*
# * : ∈ 6, , ∈
§ This measure depends only on the linear subspace = ker : if ker = ker() then 6 = 7
§ We will use 8 = 6 for = ker
Connection to subdeterminants:
§ For an integer matrix ∈ 9×;, Δ6 = max{| det |: submatrix of }
§ For a circuit ∈ 6, with = let = <,* ∈ (=>?)×= be a submatrix with linearly independent rows.
(#) ∈ (=>?)×(=>?) remove the -th column from . By Cramer’s rule * = det ? , det 2 , … , det =
Properties of ! § LEMMA: For an integer matrix ∈ 9×;,
6 ≤ Δ6 For a totally unimodular matrix , 6 = 1
§ EXERCISE: i. If is the node-edge incidence matrix of
an undirected graph, then 6 ∈ 1,2 ii. For the incidence matrix of a complete
undirected graph on nodes,
Δ6 ≥ 2 % &
PROOF (Ekbatani & Natura):
THEOREM (Cederbaum, 1958): If ∈ !×# is a TU- matrix, then $ = 1. Conversely, if 5 = 1 for a linear subspace ⊂ # then there exists a TU-matrix such that = ker .
Duality of circuit imbalances
5 = 5)
Circuits in optimization § Appear in various LP algorithms directly or indirectly
§ IPCO summer school 2020: Laura Sanità’s lectures discussed circuit augmentation algorithms and circuit diameter
§ …
The condition number ! 6 = sup ! ! >? : is positive diagonal matrix
§ Measures the norm of oblique projections
§ Introduced by Dikin 1967, Stewart 1989, Todd 1990
§ THEOREM (Vavasis-Ye 1996): There exists a poly ,, log 6 LP algorithm for min ! , = , ≥ 0, ∈ 9×;
§ LEMMA i. If is an integer matrix with bit encoding length , then
6 ≤ 2@(A)
ii. 6 = max >? : nonsingular × submatrix of iii. 6 only depends on the subspace = ker() iv. 8 = 8'
The lifting operator § For a linear subspace ⊂ # and index set ⊆ , we let
*: # → *
denote the coordinate projection, and * = {*: ∈ }
§ The lifting operator *+: * → # is defined as *+ = argmin ,: ∈ , * =
§ This is a linear operator; we can efficiently compute a projection matrix ∈ #×* such that *+ = .
§ LEMMA:
' = max *⊆[#]
: ⊆ , ∈ * {0}
The lifting operator
The lifting operator LEMMA:
: ⊆ , ∈ B 0
PROOF:
The condition numbers ! and !
Approximability of and e:
LEMMA (Tunçel 1999): It is NP-hard to approximate e by a factor better than 29:;<(>?@A($))
THEOREM: For every matrix ∈ !×#, ≥ 2 1 + $7 ≤ $ ≤ $
Outline of the lectures
2. The circuit imbalance measure " and the condition measure "
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
6. Layered-least-squares interior point methods
Part 3 Solving LPs: from approximate to exact
Fast approximate LP algorithms
§ Approximately optimal: ! ≤ OPT +
§ Finding an approximate solution with log ? D
running time dependence implies a weakly polynomial exact algorithm.
min ! = ≥ 0
§ variables, equality constraints, Randomized vs. Deterministic § Significant recent progress:
§ R nnz + ( log) * log + ,
Lee–Sidford ’13–’19
Cohen, Lee, Song ’19
van den Brand ’20
van den Brand, Lee, Liu, Saranurak, Sidford, Song, Wang ’21
Some important techniques: § weighted and stochastic central paths § fast approximate linear algebra § efficient data structures
min ! = ≥ 0 Fast approximate LP algorithms
Fast exact LP algorithms with ! dependence
§ variables, equality constraints
THEOREM (Dadush, Natura, V. ‘20) There exists a poly(,, log 6) algorithm for solving LP exactly. § Feasibility: m calls to an approximate solver § Optimization: mn calls to an approximate solver with = 1/(poly , 6 ). Using van den Brand ’20, this gives a
deterministic exact EF? log2 log(6+) time LP optimization algorithm
§ Generalization of Tardos ’86 for real constraint matrices and with directly working with approximate solvers.
§ Main difference: arguments in Tardos ’86 heavily rely on integrality assumptions
min ! = ≥ 0
Hoffman’s proximity theorem Polyhedron = ∈ ;: ≤ , point G ∉ , norms . H, . I
THEOREM (Hoffman, 1952): There exists a constant H,I() such that
∃ ∈ : − G H ≤ H,I() G − F I
$
§ Subspace form: W = ker , ∈ ; s.t. =
min D = ≥ 0
max D D + =
≥ 0
max D( − ) ∈ E + ≥ 0
Proximity theorem with ! THEOREM: For ∈ 9×;, ∈ ;, consider the system
x ∈ + , ≥ 0.
There exists a feasible solution such that − / ≤ 8 > ?
PROOF:
Linear feasibility algorithm Linear feasibility problem
∈ + , ≥ 0. § Recursive algorithm using a stronger problem formulation:
∈ + , ≥ 0. − / ≤ ′82 > ?
§ Black box oracle for = 1/(poly , 6 )
∈ + − % ≤ & ! '
! % ≤ ! '
The linear feasibility algorithm
1. Call the black box solver to find a solution for = 1/ 8 J
2. Set = { ∈ : # < 8 > ?}; assume ≠ [].
3. Recursively obtain R ∈ F < from
(<(), <)
∈ + − % ≤ & ! '
! % ≤ ! '
≥ 0
& ! '
1. Call the black box solver to find a solution for = 1/ 8 J
2. Set = { ∈ : # < 8 > ?}; assume ≠ [].
3. Recursively obtain R ∈ F < from
(<(), <)
∈ + − % ≤ & ! '
! % ≤ ! '
≥ 0
= { ∈ : + < 5 S T};
§ If = [], then we replace by its projection to E
§ Bound on the number of recursive calls; can be decreased to
§ (UVW T log(5 + )) feasibility algorithm using van den Brand '20.
The linear feasibility algorithm
§ In case of infeasibility we return an exact Farkas certificate
§ 8 is hard to approximate within 2@ ; Tunçel 1999
§ We use an estimate in the algorithm
§ The algorithm may fail if <8( R − <) / > R − < ?
§ In this case, we restart with
max 2, <8( R − <) / R − < ?
§ Our estimate never overshoots 8 by much, but can be significantly better.
Proximity for optimization
THEOREM: Let ∈ D + , ≥ 0 be a feasible dual solution, and assume the primal is also feasible. Then there exists a primal optimal ∗ ∈ + , ∗ ≥ 0 such that
∗ − 0 ≤ 5 S T + XY99 Z T .
min D ∈ + ≥ 0
max D( − ) ∈ E +
≥ 0
Optimization algorithm
§ ≤ Outer Loops, each comprising ≤ Inner Loops
§ Each Outer Loop finds with − "small", and (, ) primal and dual optimal solutions to
min ! . . ∈ + , ≥ 0
§ Using proximity, we can use this to conclude B > 0 for a certain variable set ⊆ and recurse.
min D ∈ + ≥ 0
max D( − ) ∈ E + ≥ 0
Outline of the lectures
2. The circuit imbalance measure " and the condition measure "
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
6. Layered-least-squares interior point methods
Part 4 Optimizing circuit imbalances
Diagonal rescaling of LP
min D = ≥ 0
max D D + =
≥ 0
max D′ ()D′ + ′ =
′ ≥ 0
min ()D′ ′ = ′ ≥ 0
max D′ ()D′ + ′ =
′ ≥ 0
§ Natural symmetry of LPs and many LP algorithms.
§ The Central Path is invariant under diagonal scaling.
§ Most “standard” interior point methods are invariant.
Dependence on the constraint matrix only
§ Algorithms with running time dependent only on A, but not on b and c.
§ Combinatorial LPs: integer matrix A ∈ Z^{m×n}. Δ_A = max{|det B| : B submatrix of A}
Tardos '86: poly(n, m, log Δ_A) LP algorithm
§ Layered-least-squares (LLS) interior point method, Vavasis-Ye '96: poly(n, m, log χ̄_A) LP algorithm in the real model of computation; χ̄_A: condition number
§ Dadush-Huiberts-Natura-V '20: poly(n, m, log χ̄*_A); χ̄*_A: optimized version of χ̄_A
min c^T x, Ax = b, x ≥ 0
Optimizing κ_A and χ̄_A by rescaling
D = set of n × n positive diagonal matrices
κ*_A = inf{κ_{AD} : D ∈ D}    χ̄*_A = inf{χ̄_{AD} : D ∈ D}
§ A scaling invariant algorithm with κ_A dependence automatically yields κ*_A dependence.
§ Recall √(1 + κ_A²) ≤ χ̄_A ≤ n κ_A.
THEOREM (Tunçel 1999): It is NP-hard to approximate χ̄_A (and thus κ_A) by a factor better than 2^{poly(rank(A))}.
THEOREM (Dadush-Huiberts-Natura-V '20): Given A ∈ R^{m×n}, in O(n²m² + n³) time, one can
• approximate the value κ_A within a factor n(κ*_A)², and
• compute a rescaling D ∈ D satisfying κ_{AD} ≤ n(κ*_A)³.
Approximating κ*_A
D = set of n × n positive diagonal matrices; κ*_A = inf{κ_{AD} : D ∈ D}
§ EXAMPLE: Let W = ker(A) = span{(0, 1, 1, M), (1, 0, M, 1)}.
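For a small subspace the circuits can be enumerated directly, giving κ exactly. A sketch in exact rational arithmetic on the example kernel with M = 100: the original basis gives κ = M² − 1 = 9999, while dividing coordinates 3 and 4 by M (one illustrative rescaling, not claimed optimal) brings it down to M = 100:

```python
from fractions import Fraction as F
from itertools import combinations

def nullspace(rows, k):
    """Exact basis of {c in Q^k : rows @ c = 0} by Gaussian elimination."""
    M = [list(map(F, r)) for r in rows]
    piv, r = [], 0
    for col in range(k):
        pr = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        M[r] = [v / M[r][col] for v in M[r]]
        for i in range(len(M)):
            if i != r and M[i][col] != 0:
                M[i] = [a - M[i][col] * b for a, b in zip(M[i], M[r])]
        piv.append(col)
        r += 1
    basis = []
    for fc in (c for c in range(k) if c not in piv):
        v = [F(0)] * k
        v[fc] = F(1)
        for ri, pc in enumerate(piv):
            v[pc] = -M[ri][fc]
        basis.append(v)
    return basis

def circuits(B):
    """Support-minimal nonzero vectors of W = span(B), up to scaling."""
    n, found = len(B[0]), []
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            if any(set(c) <= set(S) for c, _ in found):
                continue  # a smaller circuit fits inside S
            rows = [[B[k][i] for k in range(len(B))]
                    for i in range(n) if i not in S]
            ns = nullspace(rows, len(B))
            if len(ns) != 1:
                continue  # dim 0: no vector; dim >= 2: S not minimal
            g = [sum(c * B[k][i] for k, c in enumerate(ns[0]))
                 for i in range(n)]
            if {i for i in range(n) if g[i] != 0} == set(S):
                found.append((S, g))
    return found

def kappa(B):
    """Circuit imbalance: max |g_j / g_i| over circuits g, i, j in supp(g)."""
    best = F(0)
    for _, g in circuits(B):
        nz = [abs(v) for v in g if v != 0]
        best = max(best, max(nz) / min(nz))
    return best

M = 100
B = [[0, 1, 1, M], [1, 0, M, 1]]       # kernel basis from the example
# rescaling A by D = diag(1, 1, M, M) divides kernel coordinates 3, 4 by M
B2 = [[v[0], v[1], F(v[2], M), F(v[3], M)] for v in B]
```

The four circuits have supports of size three; the worst ratio 9999 comes from the circuit vector (1, −M, 0, 1 − M²).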
Pairwise circuit imbalances
§ For a circuit C, there exists a vector g^C ∈ ker(A) with supp(g^C) = C, unique up to a scalar multiplier.
κ_{ij} = max{|g^C_j / g^C_i| : C circuit, i, j ∈ C}
§ The circuit imbalance measure is κ_A = max_{i,j ∈ [n]} κ_{ij}.
LEMMA: For any directed cycle H on {1, 2, …, n},
(κ*_A)^{|H|} ≥ ∏_{(i,j)∈H} κ_{ij}.
Circuit imbalance min-max formula
§ BUT: Computing the κ_{ij} values is NP-hard…
§ LEMMA: For any circuit C ∈ C_A s.t. i, j ∈ C,
κ_{ij} ≥ |g^C_j / g^C_i|.
§ THEOREM: κ*_A = max{(∏_{(i,j)∈H} κ_{ij})^{1/|H|} : H a directed cycle}.
Outline of the lectures
2. The circuit imbalance measure κ_A and the condition measure χ̄_A
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
5. Interior point methods: basic concepts
6. Layered-least-squares interior point methods
Part 5 Interior point methods: basic concepts
Primal and dual LP
§ Matrix form: A ∈ R^{m×n}, b ∈ R^m, c ∈ R^n
min c^T x, Ax = b, x ≥ 0    max b^T y, A^T y + s = c, s ≥ 0
§ Subspace form: W = ker(A), d ∈ R^n s.t. Ad = b
min c^T x, x ∈ W + d, x ≥ 0    max d^T(c − s), s ∈ W^⊥ + c, s ≥ 0
§ Complementary slackness: primal and dual solutions (x, s) are optimal iff x^T s = 0: for each i ∈ [n], either x_i = 0 or s_i = 0.
§ Optimality gap: c^T x − d^T(c − s) = x^T s.
The central path
§ For each μ > 0, there exists a unique solution w(μ) = (x(μ), y(μ), s(μ)) such that
x_i(μ) s_i(μ) = μ  ∀i ∈ [n]: the central path element for μ.
§ The central path is the algebraic curve formed by {w(μ) : μ > 0}.
§ For μ → 0, the central path converges to an optimal solution w* = (x*, y*, s*).
§ The optimality gap is x(μ)^T s(μ) = nμ.
§ Interior point algorithms: walk down along the central path with μ decreasing geometrically.
The Mizuno-Todd-Ye Predictor-Corrector Algorithm
§ Start from a point w⁰ = (x⁰, y⁰, s⁰) 'near' the central path at some μ⁰ > 0.
§ Alternate between
§ Predictor steps: 'shoot down' the central path, decreasing μ by a factor of at least 1 − 1/O(√n). May move slightly 'farther' from the central path.
§ Corrector steps: do not change the parameter μ, but move back 'closer' to the central path.
Within O(√n) iterations, μ decreases by a factor 2.
The predictor step
§ Step direction Δw = (Δx, Δy, Δs), given by the system
AΔx = 0
A^TΔy + Δs = 0
s_iΔx_i + x_iΔs_i = −x_i s_i  ∀i ∈ [n]
§ Pick the largest α ∈ [0, 1] such that w′ = w + αΔw = (x + αΔx, y + αΔy, s + αΔs) is still "close enough" to the central path.
§ Long step: α can be chosen large if |Δx_iΔs_i| is small for every i ∈ [n]. The new optimality gap is (1 − α) x^T s.
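The predictor direction is just the solution of this linear system, and Δx ⟂ Δs (Δx ∈ ker(A), Δs in the row space of A), so the duality gap shrinks exactly by the factor 1 − α. A sketch that solves the system in exact arithmetic on a made-up instance (A = [1 1], b = 2, c = (1, 2), strictly feasible point x = (1, 1), y = 0, s = c):

```python
from fractions import Fraction as F

def solve(M, rhs):
    """Exact dense Gaussian elimination with nonzero pivoting."""
    n = len(M)
    A = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        pr = next(i for i in range(col, n) if A[i][col] != 0)
        A[col], A[pr] = A[pr], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for i in range(n):
            if i != col and A[i][col] != 0:
                A[i] = [a - A[i][col] * b for a, b in zip(A[i], A[col])]
    return [A[i][n] for i in range(n)]

def predictor_direction(A, x, y, s):
    """Solve A dx = 0, A^T dy + ds = 0, s_i dx_i + x_i ds_i = -x_i s_i."""
    m, n = len(A), len(A[0])
    N = n + m + n                      # unknowns ordered (dx, dy, ds)
    rows, rhs = [], []
    for i in range(m):                 # A dx = 0
        rows.append([A[i][j] for j in range(n)] + [F(0)] * (m + n))
        rhs.append(F(0))
    for j in range(n):                 # A^T dy + ds = 0
        row = [F(0)] * N
        for i in range(m):
            row[n + i] = A[i][j]
        row[n + m + j] = F(1)
        rows.append(row)
        rhs.append(F(0))
    for j in range(n):                 # s_j dx_j + x_j ds_j = -x_j s_j
        row = [F(0)] * N
        row[j] = s[j]
        row[n + m + j] = x[j]
        rows.append(row)
        rhs.append(-x[j] * s[j])
    z = solve(rows, rhs)
    return z[:n], z[n:n + m], z[n + m:]

A = [[F(1), F(1)]]                     # one constraint: x1 + x2 = 2
x, y, s = [F(1), F(1)], [F(0)], [F(1), F(2)]
dx, dy, ds = predictor_direction(A, x, y, s)
gap = sum(xi * si for xi, si in zip(x, s))        # x^T s = 3
alpha = F(1, 2)
new_gap = sum((xi + alpha * dxi) * (si + alpha * dsi)
              for xi, dxi, si, dsi in zip(x, dx, s, ds))
# dx . ds == 0, so new_gap == (1 - alpha) * gap exactly.
```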
The predictor step – subspace view
§ Assume the current point w = (x, y, s) is on the central path. The steps can be found as minimum-norm projections in the (s⁄x)- and (x⁄s)-rescaled norms:
Δx = argmin_{Δx ∈ W} Σ_{i∈[n]} ((x_i + Δx_i)/x_i)²
Δs = argmin_{Δs ∈ W^⊥} Σ_{i∈[n]} ((s_i + Δs_i)/s_i)²
x_iΔs_i + s_iΔx_i = −x_i s_i  ∀i ∈ [n]
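On a central-path point the two views agree: the Δx from the Newton system equals the minimizer of the rescaled least-squares problem over W. A sketch on a made-up instance with a one-dimensional kernel (A = [1 1], c = (1, 2), y = 0, giving the central path point x = (4/3, 2/3), s = (1, 2) at μ = 4/3), where both directions have closed forms:

```python
from fractions import Fraction as F

# Central-path point for A = [1 1], b = 2, c = (1, 2), y = 0:
# x_i * s_i = 4/3 = mu for both coordinates.
x, s, mu = [F(4, 3), F(2, 3)], [F(1), F(2)], F(4, 3)
w = [F(1), F(-1)]                  # basis vector of W = ker([1 1])

# Least-squares view: minimize sum_i ((x_i + t*w_i)/x_i)^2 over t.
# Quadratic a*t^2 + 2*b*t + const, minimized at t = -b/a.
a = sum((wi / xi) ** 2 for wi, xi in zip(w, x))
b = sum(wi / xi for wi, xi in zip(w, x))
t_ls = -b / a
dx_ls = [t_ls * wi for wi in w]

# Newton view: dx = t*w and ds = (-dy, -dy) from A^T dy + ds = 0; then
# s_i dx_i + x_i ds_i = -x_i s_i = -mu for i = 0, 1 gives two equations
# in (t, dy); eliminating dy:
t_newton = mu * (x[0] - x[1]) / (s[0] * x[1] + s[1] * x[0])
```

Both computations give t = 4/15, i.e. Δx = (4/15, −4/15).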
Some recent progress on interior point methods
§ Tremendous recent progress on fast approximate variants: LS '14–'19, CLS '19, vdB '20, vdBLSS '20, vdBLLSSSW '21
§ Fast approximate algorithms for combinatorial problems (flows, matching, and MDPs): DS '08, M '13, M '16, CMSV '17, AMV '20, vdBLNPTSSW '20, vdBLLSSSW '21
Outline of the lectures
2. The circuit imbalance measure κ_A and the condition measure χ̄_A
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
5. Interior point methods: basic concepts
6. Layered-least-squares interior point methods
Part 6 Layered-least-squares interior point methods
Layered-least-squares (LLS) Interior Point Methods: dependence on the constraint matrix only
χ̄*_A = inf{χ̄_{AD} : D ∈ D}
§ Monteiro-Tsuchiya '03: O(n^{3.5} log(χ̄*_A + n) + n² log log(1/ε)) iterations
§ Lan-Monteiro-Tsuchiya '09: O(n^{3.5} log(χ̄*_A + n)) iterations, but the running time of the iterations depends on b and c
§ Dadush-Huiberts-Natura-V '20: scaling invariant LLS method with O(n^{2.5} log(n) log(χ̄*_A + n)) iterations
Near monotonicity of the central path
LEMMA: For w = (x, y, s) on the central path, and for any feasible solution w′ = (x′, y′, s′) s.t. x′^T s′ ≤ x^T s, we have
Σ_{i∈[n]} (x′_i/x_i + s′_i/s_i) ≤ 2n.
The IPM gradually learns improved upper bounds on the optimal solution.
Variable fixing…—or not?
LEMMA: After every iteration, there exist variables x_i and s_j such that
1/poly(n) ≤ x_i/x*_i ≤ poly(n)  and  1/poly(n) ≤ s_j/s*_j ≤ poly(n)
for the optimal (x*, y*, s*). Thus, x_i and s_j have "converged" to their final values.
§ PROOF: Can be shown using the least-squares form of the predictor step:
Δx = argmin_{Δx ∈ W} Σ_{i∈[n]} ((x_i + Δx_i)/x_i)²
Δs = argmin_{Δs ∈ W^⊥} Σ_{i∈[n]} ((s_i + Δs_i)/s_i)²
and bounds on the step size.
Variable fixing…—or not?
LEMMA: After every iteration, there exist variables x_i and s_j that remain within a poly(n) factor of x*_i and s*_j. Thus, x_i and s_j have "converged" to their final values.
We cannot identify these indices, just show their existence.
Layered least squares methods
§ Instead of the standard predictor step, split the variables into layers.
§ Force new primal and dual variables that must have converged.
Recap: the lifting operator and κ_W
§ For a linear subspace W ⊆ R^n and index set I ⊆ [n], let π_I: R^n → R^I denote the projection to the coordinates in I, and W_I = π_I(W).
§ The lifting operator L_I^W: W_I → W is defined as
L_I^W(z) = argmin{‖x‖ : x ∈ W, π_I(x) = z}
§ LEMMA: κ_W = max{‖(L_I^W(z))_{[n]∖I}‖_∞ / ‖z‖₁ : I ⊆ [n], z ∈ W_I ∖ {0}}
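The lift is a small equality-constrained least-norm problem; with a kernel basis it reduces to the Lagrange linear system G c = P λ, Pᵀ c = z with G = B Bᵀ. A sketch in exact arithmetic on the example subspace from Part 4, lifting z = 1 on I = {0}; optimality of the min-norm lift is certified by orthogonality to every w ∈ W vanishing on I:

```python
from fractions import Fraction as F

def solve(M, rhs):
    """Exact Gaussian elimination for small dense rational systems."""
    n = len(M)
    A = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        pr = next(i for i in range(col, n) if A[i][col] != 0)
        A[col], A[pr] = A[pr], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for i in range(n):
            if i != col and A[i][col] != 0:
                A[i] = [a - A[i][col] * b for a, b in zip(A[i], A[col])]
    return [A[i][n] for i in range(n)]

def lift(B, I, z):
    """Min-norm x with x in rowspan(B) and x_I = z, via the Lagrange
    system G c = P lam, P^T c = z, where G = B B^T and P = B[:, I]."""
    d, k = len(B), len(I)
    G = [[sum(F(a) * F(b) for a, b in zip(B[i], B[j])) for j in range(d)]
         for i in range(d)]
    P = [[F(B[i][j]) for j in I] for i in range(d)]
    rows = [[G[i][j] for j in range(d)] + [-P[i][t] for t in range(k)]
            for i in range(d)]
    rows += [[P[i][t] for i in range(d)] + [F(0)] * k for t in range(k)]
    rhs = [F(0)] * d + [F(zt) for zt in z]
    c = solve(rows, rhs)[:d]
    n = len(B[0])
    return [sum(c[i] * B[i][j] for i in range(d)) for j in range(n)]

B = [[0, 1, 1, 100], [1, 0, 100, 1]]   # basis of W
x = lift(B, [0], [1])                  # lift z = (1) on I = {0}
# x agrees with z on I, lies in W, and is orthogonal to every w in W
# with w_I = 0 (here: w = B[0]), the optimality condition of the lift.
```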
Motivating the layering idea: final rounding step in standard IPMs
min c^T x, Ax = b, x ≥ 0    max b^T y, A^T y + s = c, s ≥ 0
§ Limit optimal solution w* = (x*, y*, s*), and optimal partition [n] = B ∪ N s.t. B = supp(x*), N = supp(s*).
§ Given (x, y, s) near the central path with 'small enough' μ = x^T s/n, such that for every i ∈ [n], either x_i or s_i is very small.
§ Assume that we can correctly guess the partition (B, N): x_i large and s_i small for i ∈ B; x_i small and s_i large for i ∈ N.
§ Goal: move to x̄ = x + Δx, ȳ = y + Δy, s̄ = s + Δs s.t. supp(x̄) ⊆ B, supp(s̄) ⊆ N. Then x̄^T s̄ = 0: optimal solution.
§ Choice: Δx = −L^W_N(x_N), Δs = −L^{W^⊥}_B(s_B)
Layered-least-squares step
Assume we have a partition (B, N): x_i large and s_i small for i ∈ B; x_i small and s_i large for i ∈ N.
Standard primal predictor step:
Δx = argmin_{Δx ∈ W} Σ_{i∈[n]} ((x_i + Δx_i)/x_i)²
Layered step, solving on N first:
Δx^N = argmin_{Δx ∈ W} Σ_{i∈N} ((x_i + Δx_i)/x_i)²
Layered-least-squares step (Vavasis-Ye '96)
§ Order the variables decreasingly as x_1 ≥ x_2 ≥ … ≥ x_n.
§ Arrange variables into layers (J_1, J_2, …, J_p); start a new layer when x_i > poly(n) χ̄_A x_{i+1}.
§ The primal step direction is obtained from least squares problems solved backwards, layer by layer.
Not scaling invariant!
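The layering rule is a single pass over the sorted values. A sketch (the threshold 100 stands in for poly(n)·χ̄_A, and the values are made up); rescaling a coordinate changes both the order and the gaps, illustrating why this layering is not scaling invariant:

```python
def vy_layers(x, threshold):
    """Sort indices by value (decreasing); start a new layer whenever the
    previous value exceeds threshold times the next one."""
    order = sorted(range(len(x)), key=lambda i: -x[i])
    layers = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if x[prev] > threshold * x[cur]:
            layers.append([cur])
        else:
            layers[-1].append(cur)
    return layers

x = [1.0, 0.9, 1e-6, 2e-7]
layers = vy_layers(x, 100)             # two layers: {0, 1} and {2, 3}
# Rescaling coordinate 1 by 1e-7 moves it into the tail layer instead:
x_rescaled = [1.0, 0.9e-7, 1e-6, 2e-7]
layers_rescaled = vy_layers(x_rescaled, 100)
```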
Progress measure: crossover events (Vavasis-Ye '96)
§ DEFINITION: The variables x_i and x_j cross over between μ and μ′, μ > μ′, if
§ poly(n) χ̄_A^n x_j(μ) ≥ x_i(μ), and
§ poly(n) χ̄_A^n x_j(μ″) < x_i(μ″) for any μ″ ≤ μ′.
§ LEMMA: In the Vavasis-Ye algorithm, a crossover event happens every O(n^{1.5} log(χ̄_A + n)) iterations, totalling O(n^{3.5} log(χ̄_A + n)).
Scaling invariant layering (DHNV '20)
§ Instead of the ratios x_i/x_j, we consider the rescaled circuit imbalances κ_{ij} x_i/x_j.
§ Layers: strongly connected components of the arc set
{(i, j) : κ_{ij} x_i/x_j > 1}
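Forming the layers then amounts to computing strongly connected components of the arc set. A minimal Kosaraju-style sketch (the arc set below is hypothetical, not derived from an actual κ_{ij} matrix):

```python
def sccs(n, arcs):
    """Strongly connected components (Kosaraju): DFS finish order on the
    graph, then DFS on the reversed graph."""
    fwd = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for u, v in arcs:
        fwd[u].append(v)
        rev[v].append(u)

    seen, order = [False] * n, []
    def dfs1(u):
        seen[u] = True
        for v in fwd[u]:
            if not seen[v]:
                dfs1(v)
        order.append(u)
    for u in range(n):
        if not seen[u]:
            dfs1(u)

    comp = [-1] * n
    def dfs2(u, c):
        comp[u] = c
        for v in rev[u]:
            if comp[v] == -1:
                dfs2(v, c)
    c = 0
    for u in reversed(order):
        if comp[u] == -1:
            dfs2(u, c)
            c += 1
    groups = [[] for _ in range(c)]
    for u in range(n):
        groups[comp[u]].append(u)
    return groups

# Hypothetical arc set on 5 variables: {0, 1, 2} mutually reachable,
# {3, 4} mutually reachable, one-way arcs from the first group to the second.
arcs = [(0, 1), (1, 2), (2, 0), (1, 3), (3, 4), (4, 3)]
layers = sccs(5, arcs)
```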
Scaling invariant crossover events
§ DEFINITION: The variables x_i and x_j cross over between μ and μ′, μ > μ′, if
§ poly(n) κ_A^n x_j(μ) ≥ κ_{ij} x_i(μ), and
§ poly(n) κ_A^n x_j(μ″) < κ_{ij} x_i(μ″) for any μ″ ≤ μ′.
§ Amortized analysis, resulting in an improved O(n^{2.5} log(n) log(χ̄*_A + n)) iteration bound.
Limitation of IPMs
§ THEOREM (Allamigeon-Benchimol-Gaubert-Joswig '18): No standard path-following method can be strongly polynomial.
§ Proof using tropical geometry: it studies the tropical limit of a family of parametrized linear programs.
Future directions
§ Circuit imbalance measure: a key parameter for strongly polynomial solvability.
§ LP classes for which the existence of a strongly polynomial algorithm is open:
§ LPs with two nonzeros per column in the constraint matrix; equivalently, min-cost generalized flows
§ Undiscounted Markov Decision Processes
§ Extend the theory of circuit imbalances more generally, to convex programming and integer programming.
Thank you!