Mixed Integer Programming: Analyzing 12 Years of Progress

Mixed Integer Programming:

Analyzing 12 Years of Progress

© 2013 IBM Corporation

Background

2001: Manfred Padberg’s 60th birthday– Bixby et al., “Mixed-Integer Programming: A Progress Report”, in: Grötschel (ed.) The

Sharpest Cut: The Impact of Manfred Padberg and His Work, MPS-SIAM Series on Optimization, pp.309-325, SIAM, Philadelphia (2004)

• Analysis of the relative contributions of the key ingredients of Branch-and-Cut Algorithms for solving MIPs

2013: Martin Grötschel’s 65th birthday– T. Achterberg and R.W., “Mixed Integer Programming: Analyzing 12 Years of

Progress”, in: Jünger and Reinelt (eds.) Facets of Combinatorial Optimization, Festschrift for Martin Grötschel, pp.449-481, Springer, Berlin-Heidelberg (2013)

• What stayed?• What changed?• Why?

2


Agenda

Methodology– How to Benchmark– How to Measure Importance of Features

Analysis– Presolving– Cuts– Heuristics– Parallelism

Summary

3


Benchmarking

Run competing algorithms on set of problem instances– Measure and compare runtime, timeouts– Use geometric mean for aggregation

Performance Variability– Seemingly performance neutral changes (random seed, platform, permutation of

variables, …) have drastic impact on solution time– Has been observed for a long time, e.g.

• Emilie Danna: Performance variability in mixed integer programming,Presentation at Workshop on Mixed Integer Programming 2008

– Friend or foe for Solvers?• Tuesday Oct 08, 08:00 - 09:30, Andrea Lodi:

Performance Variability in Mixed-integer Programming• Wednesday Oct 09, 15:30 - 17:00, Andrea Tramontani:

Concurrent root cut loops to exploit random performance variability– Definitely foe for Benchmarking

4


A benchmarking myth

Compare Solver S against reference Solver R– Solver S claimed to be faster than R on “hard problems”

Model Set M– Solution times tR(m), m M, for S and R are in (0sec, 100sec]

Classify Models in to hard models H and easy models E– m H, if tR(m) > 80sec– m E, otherwise

Computational confirmation of speedup:– 1.8x faster on hard models– 0.8x slower on easy models

5

1.80

0.80

0.00

1.00S

peedup


A benchmarking myth





Times for S and R uniform random numbers:– <tR(E)> = 40 <tR(H)> = 90– <tS(E)> = 50 <tS(H)> = 50– Speedup: 4/5 9/5

6

1.80

0.80

0.00

1.00S

peedup


A benchmarking myth





Times for S and R uniform random numbers:– <tR(E)> = 40 <tR(H)> = 90– <tS(E)> = 50 <tS(H)> = 50– Speedup: 4/5 9/5

7

1.80

0.80

0.00

1.00S

peedup


Avoiding the bias

The problem is real:2 different random seedsfor CPLEX 12.5

Problem comes from using timesfrom one solver to define subsetsof problems

8

biased subsets # of problems geomean

All 3159 1.00

[0,10k] 3082 1.00

[1,10k] 1848 0.99

[10,10k] 1074 0.95

[100,10k] 552 0.87

[1k,10k] 207 0.76


Avoiding the bias



Solution–Use (max) times from all solvers

to define subsets of problems

9


All 3159 1.00

[0,10k] 3082 1.00

[1,10k] 1848 0.99

[10,10k] 1074 0.95

[100,10k] 552 0.87

[1k,10k] 207 0.76

unbiased subsets # of problems geomean

All 3159 1.00

[0,10k] 3082 1.00

[1,10k] 1879 1.00

[10,10k] 1121 1.01

[100,10k] 604 1.01

[1k,10k] 238 1.08


Avoiding the bias



Solution–Use (max) times from all solvers

to define subsets of problems

Note–250 models can not measure

performance difference of lessthan 10%

–Will use [10,10k] bracket

10


All 3159 1.00

[0,10k] 3082 1.00

[1,10k] 1848 0.99

[10,10k] 1074 0.95

[100,10k] 552 0.87

[1k,10k] 207 0.76

unbiased subsets # of problems geomean

All 3159 1.00

[0,10k] 3082 1.00

[1,10k] 1879 1.00

[10,10k] 1121 1.01

[100,10k] 604 1.01

[1k,10k] 238 1.08


Measuring Impact

MIP is a bag of tricks–Presolving–Cutting planes–Branching–Heuristics– ...

How important is each trick?Compare runs with feature turned on and off

–Solution time degradation(geometric mean)

–# of solved models• Essential or just speedup?

–Number of affected models• General or problem specific?

11

Bixby et al. 2001

Feature Degradation

No cuts 53.7x

No presolve 10.8x

Trivial branching 2.9x

No heuristics 1.4x


Component Impact CPLEX 12.5 Summary

Benchmarking setup

• 1769 models• 12 core Intel Xenon 2.66 GHz• Unbiased: At least one of all thetest runs took at least 10sec

99% 82% 91% 26%93%91% 46%83% 65%% affected

12



Fundamental Features

• Lots of models unsolvablewithout

• Apply to most models

99% 82% 91% 26%93%91% 46%83% 65%% affected

13



Important Features

• Many models unsolvablewithout

• Apply to most models

99% 82% 91% 26%93%91% 46%83% 65%% affected

14



Parallelism is not important

• turning off == 12x fewer cyclesi.e. just a tighter time limit

• Hardware cannot defeatcombinatorial explosion

99% 82% 91% 26%93%91% 46%83% 65%% affected

15



Special Features

• Few models unsolvablewithout

• Apply to few models

99% 82% 91% 26%93%91% 46%83% 65%% affected

16



• Only 106 models• Biased towards Cuts due toselection of “hard” modelsfor CPLEX 5.0

• Impact of other componentsmatches

99% 82% 91% 26%93%91% 46%83% 65%% affected

17



99% 82% 91% 26%93%91% 46%83% 65%% affected

18


Component Impact CPLEX 12.5 – Presolve

19



99% 82% 91% 26%93%91% 46%83% 65%% affected

20


Component Impact CPLEX 12.5 – Cutting Planes

21


Component Impact CPLEX 12.5 – Cutting Planes

2.52 1.40 1.19 1.22 1.041.021.83 1.02

Bixby et al. 2001

22



99% 82% 91% 26%93%91% 46%83% 65%% affected

23


Component Impact CPLEX 12.5 – Primal Heuristics

24



99% 82% 91% 26%93%91% 46%83% 65%% affected

25


Parallelism in CPLEX

Types of Parallelism–Opportunistic Parallelism–Deterministic Parallelism

• Identical runs produce same solution path and results• Use deterministic locks• Based on deterministic time• Implemented via counting memory accesses

Parallel Tasks–Root node parallelism

• Concurrent LP solve• Heuristics concurrent to cutting phase• Parallel Cut loop

–Parallel Processing of B&C Tree

26


The Cost of Determinism

27


0

200

400

600

800

1000

1200

1400

1998

1999

2000

2001

2002

2003

2004

2005

2006

2007

2008

2009

2010

2011

2012

2013

nu

mb

er o

f ti

meo

uts

0

50

100

150

200

250

tota

l sp

eed

up

10 sec

100 sec

1000 sec

Date: 28 September 2013Testset: 3147 models (1792 in 10sec, 1554 in 100sec, 1384 in 1000sec)Machine: Intel X5650 @ 2.67GHz, 24 GB RAM, 12 threads (deterministic since CPLEX 11.0)Timelimit: 10,000 sec


Related presentations

Recent Developments in CPLEX Monday, 08:00 - 09:30 Tobias Achterberg

Non-convex Quadratic Programming in CPLEX Tuesday, 11:00 - 12:30 Christian Bliek and Pierre Bonami

Tutorial: Performance Variability in Mixed-integer Programming Tuesday, 8:00 – 9:30 Andrea Lodi and Andrea Tramontani

Software Tutorial: Expert Tips and Tricks for Using CPLEX Optimization Studio Tuesday Oct 08, 08:00 - 09:30

Lift-and-Project Cuts in CPLEX 12.5.1 Wednesday, 13:30 - 15:00 Andrea Tramontani

Concurrent root cut loops to exploit random performance variability Wednesday, 15:30 - 17:00 A. Tramontani, M. Fischetti, A. Lodi, D. Salvignin, M. Monaci

Interesting Use Cases for the CPLEX Remote Object Wednesday, 15:30 - 17:00 Lazlo Ladanyi and Daniel Junglas

29



31


Component Impact CPLEX 12.5 – Branching

32

Mixed Integer Programming: Analyzing 12 Years of Progress

Software

Transcript of Mixed Integer Programming: Analyzing 12 Years of Progress