Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. ·...

49
Jan Hubička and Martin Liška SUSElabs [email protected], [email protected] Building openSUSE with link-time optimizations

Transcript of Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. ·...

Page 1: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Jan Hubička and Martin LiškaSUSElabs

[email protected], [email protected]

Building openSUSE with link-time optimizations

Page 2: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Outlilne

● What is link-time optimization? ● Link-time optimization and GCC● Benchmarks● Can we build openSUSE with link-time optimization by

default?

Page 3: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

What is link-time optimization?

Page 4: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Per-file compilation

File1.c

File2.c

File3.c

File4.c

File1.o

File2.o

File3.o

File4.o

GCC

GCC

GCC

GCC

ld / gold a.out

Page 5: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Link-time optimization

File1.c

File2.c

File3.c

File4.c

File1.oIL

File2.oIL

File3.oIL

File4.oIL

GCC

GCC

GCC

GCC

ld / gold a.out

LTO plug-inlink-timecompiler

Page 6: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Benefits of LTO● Symbol promotion

(from linker’s resolution data most symbols become “static”)● Cross-module inlining, constant propagation● Aggressive unreachable code removal● Profile propagation● EH optimization (propagation of “nothrow”)● Identical code folding● Optimized code layout● and more :)

Page 7: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Problems of LTO

● Whole toolchain has to be restructured● Slow compile-edit cycle● Harder bugreports

(often whole program needed to reproduce issue)● Not 100% transparent to user, but in most cases all one needs

to do is to add -flto

Page 8: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Link-time optimization and GCC

Page 9: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Modernizing GCC (LTO perspective)● 1999 (GCC 2.95): Function-at-a-time● 2001 (GCC 3.0): New inliner (first high-level opt . in gcc)● 2004 (GCC 3.4): Unit-at-a-time; intermodule compilation for C; Inter-procedural

optimization framework● 2005 (GCC 4.0): New SSA optimization framework● 2006 (GCC 4.1): Inter-procedural optimizations: profile guided inlining, pure/const

discovery, mod/ref, inter-procedural constant propagation, --fwhole-program –combine

● 2008 (GCC 4.4): Inter-procedural optimization on SSA; early optimization and inlining.● 2010 (GCC 4.5): Basic LTO framework (5 years in development)● 2011 (GCC 4.6): WHOPR (parallel link-time optimization); Firefox builds

Page 10: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Link-time optimization

File1.c

File2.c

File3.c

File4.c

File1.oIL

File2.oIL

File3.oIL

File4.oIL

GCC

GCC

GCC

GCC

ld / gold a.out

LTO plug-inlink-timecompiler

Page 11: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Parallelized Link-time optimization (WHOPR)

File1.c

File2.c

File3.c

File4.c

File1.oIL

File2.oIL

File3.oIL

File4.oIL

GCC

GCC

GCC

GCC

ld / gold a.out

LTO plguin

Whole ProgramAnalysis

Local opt.

Local opt

Local opt.

Page 12: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Modernizing GCC (LTO perspective)● 2012 (GCC 4.7.0): Memory use optimizations, new inliner heuristics, new inter-procedural

constant propagation with clonning● 2013 (GCC 4.8.0): symbol table; propagation of values passed through aggregates● 2014 (GCC 4.9.0): slim LTO objects by default; on demand loading of functions; devirtualization

pass; feedback directed code layout● 2015 (GCC 5): Identical Code Folding; COMDAT optimization; One Definition Rule for C++;

alignment propagation; correct command line options handling with LTO● 2016 (GCC 6): Linker-plugin now detects type of output binary. C&Fortran type merging. Better

alias anaysis● 2017 (GCC 7): Inter-procedural value range propagaion; bitwise propagation● 2018 (GCC 8): Early debug info. Profile representation rewrite. Function splitting now by default.

Reworked runtime estimation; Malloc attribute propagation

Page 13: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

GCC optimization pipeline

Parser

IL generation

Early opts:

Early InlinerConstant prop.Forwward prop.Jump threading

Scalar repl. of aggr.Alias analysis

Redundancy ellim.Dead store ellim.Dead code ellim.

Tail recursionSwitch conversionpure/const/nothrow

EH optimizationProfile guessing

Compile time Link-time serial

IP analysisstreaming out

Symbol & typestreaming in+ merging

Inter-procedural(whole program)

Opts:

Dead symbol ellim.Symbol promotion

profile analysisIdentical code folding

devirtualizationConstant propagationconst/destr merging

Inliningpure/const/nothrow

mod/refcomdat

Partitioningstreaming out

Streaming in symbols, types and declarations

& link

Stream inand apply

transformations

High level opts:

Constant prop.Complette unrollForward prop.Alias analysis

Return slot opt.Redudancy ellim.Jump threading

Dead code ellim.Conditional store ellim.

Copy prop.If combine

Tail recursionCopy loop headersScalar repl. of aggr.

Dead store ellim.Dead code ellim.

ReassociationSincos, bswap opt.

Loop invariant motionPartial redundancy ellim.

Loop splittingUnroll and jam

Loop dsitributionLoop interchange ...

Low level opts:

Common subexpression ellim.Forward propagation

Copy propagationPartial Redundancy Ellim.

Code hoistingCopy propagation

Store motionIf conversion

Loop invariant motionLoop unrolling

Doloop optimizationWeb constructionCopy propagation

Common subexpression ellim.Dead store ellim.

Instruction combineFunction partitioningInstruction splitting

Live range shrinkingScheduling

Register allocationGlobal common subexpr. Ellim.

Shrink wrappingStack adjustment opt.

Register renamingConstant prop.

Code reorderingScheduling

X87 register stackCode/data alignment

Machine dependent reorg.Code output

Link-time parallel

Page 14: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Benchmarks

Page 15: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

0

0.5

1

1.5

2

2.5

generic

native

Page 16: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

0

2

4

6

8

10

12

14

generic

native

Page 17: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

0

2

4

6

8

10

12

14

generic

native

429.

mcf

458

.sjen

g

445

.gob

mk

403

.gcc

462

.libqu

antu

m

473

.ast

ar

464

.h26

4ref

471

.om

netp

p

401

.bzip

2

483

.xala

ncbm

k

400

.per

lbenc

h

456

.hm

mer

Geo

mea

n

-20

0

20

40

60

80

100

120

GCC -Ofast relative to GCC 6 -O2

GCC 6

GCC 7

GCC 8

Page 18: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

0

2

4

6

8

10

12

14

generic

native

Page 19: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

clang

/flan

g 6

-Ofa

st

ICC 1

8 -O

fast

0

2

4

6

8

10

12

14

generic

native

Page 20: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

clang

/flan

g 6

-Ofa

st

ICC 1

8 -O

fast

0

2

4

6

8

10

12

14

generic

native

400.

perlb

ench

401

.bzip

2

403

.gcc

429

.mcf

445

.gob

mk

456

.hm

mer

458

.sjen

g

462

.libqu

antu

m

464

.h26

4ref

471

.om

netp

p

473

.ast

ar

483

.xala

ncbm

k

Geo

mea

n

-50

-40

-30

-20

-10

0

10

Clang & ICC -Ofast relative to GCC 8

clang/flang 6

ICC 18

Page 21: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

clang

/flan

g 6

-Ofa

st

ICC 1

8 -O

fast

0

2

4

6

8

10

12

14

generic

native

Page 22: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

clang

/flan

g 6

-Ofa

st

ICC 1

8 -O

fast

GCC 8 -O

2 -fl

to

GCC 8 -O

fast

lto

0

2

4

6

8

10

12

14

generic

native

Page 23: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

clang

/flan

g 6

-Ofa

st

ICC 1

8 -O

fast

GCC 8 -O

2 -fl

to

GCC 8 -O

fast

lto

0

2

4

6

8

10

12

14

generic

native

400.

perlb

ench

401

.bzip

2

403

.gcc

429

.mcf

445

.gob

mk

456

.hm

mer

458

.sjen

g

462

.libqu

antu

m

464

.h26

4ref

471

.om

netp

p

473

.ast

ar

483

.xala

ncbm

k

Geo

mea

n

-4

-2

0

2

4

6

8

GCC -Ofast -flto relative to -Ofast

Page 24: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

clang

/flan

g 6

-Ofa

st

ICC 1

8 -O

fast

GCC 8 -O

2 -fl

to

GCC 8 -O

fast

lto

0

2

4

6

8

10

12

14

generic

native

Page 25: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

clang

/flan

g 6

-Ofa

st

ICC 1

8 -O

fast

GCC 8 -O

2 -fl

to

GCC 8 -O

fast

lto

clang

/fllan

g 6

-Ofa

st -f

lto

ICC 1

8 -O

fast

-flto

0

5

10

15

20

25

generic

native

Page 26: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

clang

/flan

g 6

-Ofa

st

ICC 1

8 -O

fast

GCC 8 -O

2 -fl

to

GCC 8 -O

fast

lto

clang

/fllan

g 6

-Ofa

st -f

lto

ICC 1

8 -O

fast

-flto

0

5

10

15

20

25

generic

native

400.

perlb

ench

401

.bzip

2

403

.gcc

429

.mcf

445

.gob

mk

456

.hm

mer

458

.sjen

g

462

.libqu

antu

m

464

.h26

4ref

471

.om

netp

p

473

.ast

ar

483

.xala

ncbm

k

Geo

mea

n

-40

-20

0

20

40

60

80

100

120

Clang/flang 6 and ICC 18 relative to GCC 8 (-O2 -flto)

clang/flang 6

ICC 18

Page 27: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

clang

/flan

g 6

-Ofa

st

ICC 1

8 -O

fast

GCC 8 -O

2 -fl

to

GCC 8 -O

fast

lto

clang

/fllan

g 6

-Ofa

st -f

lto

ICC 1

8 -O

fast

-flto

0

5

10

15

20

25

generic

native

Page 28: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Profile feedback & LTO works well together

Hmmer benchmark has problems; was excluded.

To build with profile feedback often you can use: ./configure ; CFLAGS=”-O2 -fprofile-generate” make ; make check ; make clean ; CFLAGS=”-O2 -fprofile-use” make

400.

perlb

ench

401

.bzip

2

403

.gcc

429

.mcf

445

.gob

mk

458

.sjen

g

462

.libqu

antu

m

464

.h26

4ref

471

.om

netp

p

473

.ast

ar

483

.xala

ncbm

k

Geo

mea

n

-10

-5

0

5

10

15

20

25

Performance relative to GCC 8 -Ofast

LTO

FDO

FDO+LTO

Page 29: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECint 2006 code size

GCC 7 -O2 GCC 8 -O2 GCC 7 -Ofast GCC 8 -OfastGCC 8 -Ofast + FDO Clang 6 -O2 Clang 6 -Ofast Icc 18 -Ofast0

2000000

4000000

6000000

8000000

10000000

12000000

Non-LTO

LTO

Page 30: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

SPECfp 2006 performance (relative to GCC 6)

GCC 7 -O

2

GCC 8 -O

2

GCC 6 -O

fast

GCC 7 -O

fast

GCC 8 -O

fast

ICC 1

8 -O

fast

GCC 8 -O

2 -fl

to

GCC 8 -O

fast

lto

ICC 1

8 -O

fast

-flto

-10

0

10

20

30

40

50

60

generic

native

Page 31: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Firefox performance summary

resp

onsiv

enes

s tp

5o

drom

aeo

dom

Display

list m

utat

e

tap

paint

spee

dom

eter

a11y

r

svgr

_opa

city

tp5o

ARES6

style

benc

hta

rt

drom

aeo

css

tsvg

x

star

tup

time

-5

0

5

10

15

20

25

30

35

Firefox performance relative to non-LTO build

static

Profile feedback

Page 32: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Firefox performance – dromaeo DOM

Page 33: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Firefox performance – tp5o page responsiveness

Page 34: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Firefox binary size

clang6 -Oz -fltoclang6 -Oz -flto=thin

clang6 -O2 -fltoclang6 -O3 -flto=thin + FDO

clang6 O3 -flto + FDOclang6 -O3 -flto

clang6 -O3 -flto=thin

clang6 -Ozclang6 -Osclang6 -O2

clang6 -O3 + FDOclang6 -O3

gcc8 -Os -fltogcc8 -O3 -flto + FDO

gcc8 -O2 -fltogcc8 -O3 -flto

gcc8 -Osgcc8 -O3 + FDO

gcc8 -O2gcc8 -O3

gcc8 -O3 -flto + FDOgcc7 -O3 -flto + FDOgcc6 -O3 -flto + FDO

0 20000000 40000000 60000000 80000000 100000000 120000000 140000000

EH

data

relocations

text

Page 35: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Firefox binary size

clang6 -Oz -fltoclang6 -Oz -flto=thin

clang6 -O2 -fltoclang6 -O3 -flto=thin + FDO

clang6 O3 -flto + FDOclang6 -O3 -flto

clang6 -O3 -flto=thin

clang6 -Ozclang6 -Osclang6 -O2

clang6 -O3 + FDOclang6 -O3

gcc8 -Os -fltogcc8 -O3 -flto + FDO

gcc8 -O2 -fltogcc8 -O3 -flto

gcc8 -Osgcc8 -O3 + FDO

gcc8 -O2gcc8 -O3

gcc8 -O3 -flto + FDOgcc7 -O3 -flto + FDOgcc6 -O3 -flto + FDO

0 20000000 40000000 60000000 80000000 100000000 120000000 140000000

EH

data

relocations

text

Page 36: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Firefox binary size

clang6 -Oz -fltoclang6 -Oz -flto=thin

clang6 -O2 -fltoclang6 -O3 -flto=thin + FDO

clang6 O3 -flto + FDOclang6 -O3 -flto

clang6 -O3 -flto=thin

clang6 -Ozclang6 -Osclang6 -O2

clang6 -O3 + FDOclang6 -O3

gcc8 -Os -fltogcc8 -O3 -flto + FDO

gcc8 -O2 -fltogcc8 -O3 -flto

gcc8 -Osgcc8 -O3 + FDO

gcc8 -O2gcc8 -O3

gcc8 -O3 -flto + FDOgcc7 -O3 -flto + FDOgcc6 -O3 -flto + FDO

0 20000000 40000000 60000000 80000000 100000000 120000000 140000000

EH

data

relocations

text

Page 37: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Firefox binary size

clang6 -Oz -fltoclang6 -Oz -flto=thin

clang6 -O2 -fltoclang6 -O3 -flto=thin + FDO

clang6 O3 -flto + FDOclang6 -O3 -flto

clang6 -O3 -flto=thin

clang6 -Ozclang6 -Osclang6 -O2

clang6 -O3 + FDOclang6 -O3

gcc8 -Os -fltogcc8 -O3 -flto + FDO

gcc8 -O2 -fltogcc8 -O3 -flto

gcc8 -Osgcc8 -O3 + FDO

gcc8 -O2gcc8 -O3

gcc8 -O3 -flto + FDOgcc7 -O3 -flto + FDOgcc6 -O3 -flto + FDO

0 20000000 40000000 60000000 80000000 100000000 120000000 140000000

EH

data

relocations

text

Page 38: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Firefox binary size

clang6 -Oz -fltoclang6 -Oz -flto=thin

clang6 -O2 -fltoclang6 -O3 -flto=thin + FDO

clang6 O3 -flto + FDOclang6 -O3 -flto

clang6 -O3 -flto=thin

clang6 -Ozclang6 -Osclang6 -O2

clang6 -O3 + FDOclang6 -O3

gcc8 -Os -fltogcc8 -O3 -flto + FDO

gcc8 -O2 -fltogcc8 -O3 -flto

gcc8 -Osgcc8 -O3 + FDO

gcc8 -O2gcc8 -O3

gcc8 -O3 -flto + FDOgcc7 -O3 -flto + FDOgcc6 -O3 -flto + FDO

0 20000000 40000000 60000000 80000000 100000000 120000000 140000000

EH

data

relocations

text

Page 39: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Summary

● LTO now works and scale for large applications (Firefox, Libreoffice,…)

● Important size optimization especially for large C++ programs● Often important performance optimization especially when

combined with profile feedback● Space for future improvements

(both on GCC side and re-optimization of programs to LTO model)

Page 40: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

LTO in openSUSE Factory

Page 41: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

LTO in openSUSE Factory

● A new staging project (openSUSE:Factory:Staging:N)● Setting: Optflags: * -flto● ~80 packages failed (out of all ~2300)● Branch all failing packages and disable LTO● LTO Factory (w/ some disabled packages) can run openQA test-

suite (with a small fallout)

Page 42: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Package statistics

● Total ELF files in distribution: ~6700● Reduction: 1839 MB to 1736MB (-5.6%)● libmergedlo.so: reduction 10MB (-16%) ● Biggest reduction:

– mysql_upgrade (-92%)

– Innochecksum (-94%)

– mbstream (-94%)

Page 43: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Issues

● Argyllcms: lto1: fatal error: multiple prevailing defs for 'xcal_read_icc': LD PR: https://sourceware.org/PR23079

● GCC LTO miscompilation: http://gcc.gnu.org/PR85248● .symver in shared libraries: https://gcc.gnu.org/PR48200:

– __asm__(".symver old_foo,foo@VERS_1.1")

– New symver function attribute must be added (GCC 9.1.0)

Page 44: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Issues (cont.)

● static libraries (*.la):– e2fsprogs, btrfsprogs, …– Fat LTO objects must be used (-ffat-lto-objects)– LTO elf sections should be stripped– OBS sanitizer should be extended– LTO mode can combine both LTO objects and assembly objects

Page 45: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Issues (cont. 2)

● LTO warnings (-Wodr, ...):– ltrace: error: type of 'filter_matches_symbol' does not match

original declaration [-Werror=lto-type-mismatch]– gdb: error: type 'struct ipa_sym_addresses' violates the C++ One

Definition Rule [-Werror=odr]● mpir: weak configure script that scans *.o file as a blob● weak support for LTO debug info in dwz tool● Maybe higher memory constaints for selected packages

Page 46: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Issues (cont. 3)

● Usage of top-level assembler (https://gcc.gnu.org/PR57703):– Example of syscall.cc in Chromium project:

asm volatile(".text\n"

".align 16, 0x90\n"

".type SyscallAsm, @function\n"

"SyscallAsm:.cfi_startproc\n"

– Error:nacl_helper.ltrans1.ltrans.o: In function `playground2::SandboxSyscall(int, long, long, long, long, long, long)':

nacl_helper.ltrans1.o:(.text+0x4503): undefined reference to `SyscallAsm'

Page 47: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Histogram of text segment sizes

0 5,000,000 10,000,000 15,000,000 20,000,000 25,000,000 30,000,000 35,000,000 40,000,0000

20

40

60

80

100

120

140

Page 48: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

Conclusion

● LTO looks mature enough to be used by default● Do we want it for openSUSE Factory?● For the future, can we offer it also for SLE?

Page 49: Building openSUSE with link-time optimizationshubicka/slides/opensuse2018-e.pdf · 2018. 5. 27. · Building openSUSE with link-time optimizations . Outlilne ... 2018 (GCC 8): Early

LicenseThis slide deck is licensed under the Creative Commons Attribution-ShareAlike 4.0 International license. It can be shared and adapted for any purpose (even commercially) as long as Attribution is given and any derivative work is distributed under the same license.

Details can be found at https://creativecommons.org/licenses/by-sa/4.0/

General DisclaimerThis document is not to be construed as a promise by any participating organisation to develop, deliver, or market a product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. openSUSE makes no representations or warranties with respect to the contents of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The development, release, and timing of features or functionality described for openSUSE products remains at the sole discretion of openSUSE. Further, openSUSE reserves the right to revise this document and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. All openSUSE marks referenced in this presentation are trademarks or registered trademarks of SUSE LLC, in the United States and other countries. All third-party trademarks are the property of their respective owners.

Credits

TemplateRichard Brown

[email protected]

Design & InspirationopenSUSE Design Team

http://opensuse.github.io/branding-guidelines/