Verifying the LLVM · 2017-11-05 · • Open-Source Compiler Infrastructure – see llvm.org for...

Verifying the LLVM Steve Zdancewic DeepSpec Summer School 2017


Page 1

Verifying the LLVM

Steve Zdancewic
DeepSpec Summer School 2017

Page 2

Thanks To
•  Dmitri Garbuzov
•  Nicolas Koh
•  Olek Gierczak

And… collaborators on Vellvm
•  Jianzhou Zhao
   –  developed the "legacy" Vellvm Coq framework
•  Santosh Nagarakatte
•  Milo Martin
•  William Mansky
•  Christine Rizkallah

Page 3

Low-Level Virtual Machine (LLVM)

•  Open-Source Compiler Infrastructure
   –  see llvm.org for full documentation
•  Created by Chris Lattner (advised by Vikram Adve) at UIUC
   –  LLVM: An Infrastructure for Multi-Stage Optimization, 2002
   –  LLVM: A Compilation Framework for Lifelong Program Analysis and Transformation, 2004
•  2005: Adopted by Apple for XCode 3.1
•  Frontends:
   –  llvm-gcc (drop-in replacement for gcc)
   –  Clang: C, Objective-C, C++ compiler supported by Apple
   –  various languages: ADA, Scala, Haskell, …
•  Backends:
   –  x86 / Arm / Power / etc.
•  Used in many academic/research projects
   –  Here at Penn: SoftBound, Vellvm

Page 4

LLVM Compiler Infrastructure

[Figure: Front Ends → LLVM (Typed SSA IR, Analysis, Optimizations/Transformations) → Code Gen/JIT]

[Lattner et al.]

Page 5

Motivation: SoftBound/CETS

•  Buffer overflow vulnerabilities.
•  Detect spatial/temporal memory safety violations in legacy C code.
•  Implemented as an LLVM pass.
•  What about correctness?

[Nagarakatte, et al. PLDI '09, ISMM '10]

http://www.cis.upenn.edu/acg/softbound/

Context:
   Penn's POPLMark challenge: using Coq was becoming cool
   Xavier Leroy's CompCert: provided inspiration!

Page 6

LLVM Compiler Infrastructure

[Figure: Front Ends → LLVM (Typed SSA IR, Analysis, Optimizations/Transformations) → Code Gen/JIT]

[Lattner et al.]

Page 7

The Vellvm Project

[Figure: the Typed SSA IR, Analysis, and Optimizations/Transformations parts of the LLVM diagram]

•  Formal semantics
•  Facilities for creating simulation proofs
•  Implemented in Coq
•  Extract passes for use with LLVM compiler
•  Example: verified memory safety instrumentation

[Zhao et al. POPL 2012, CPP 2012, PLDI 2013]

Page 8

Vellvm Framework

[Figure, labels only:
   Compilation pipeline: C Source Code, LLVM IR, Transform, LLVM IR, Other Optimizations, Target
   LLVM OCaml Bindings: Parser, Printer
   Coq: Syntax, Operational Semantics, Memory Model, Type System and SSA, Proof Techniques & Metatheory
   Extract]

Page 9

Vellvm Framework

[Figure, labels only:
   Compilation pipeline: C Source Code, LLVM IR, Verified Transform, LLVM IR, Other Optimizations, Target
   LLVM OCaml Bindings: Parser, Printer
   Coq: Syntax, Operational Semantics, Memory Model, Type System and SSA, Proof Techniques & Metatheory
   Extract]

Page 10

Plan

•  Introduction to LLVM
   –  static single-assignment
•  Vminus: simplified SSA IR
   –  Operational Semantics
   –  SSA Properties
   –  Static Properties
•  Verified Compilation: Imp to Vminus
   –  Parallels Xavier's Imp to stack-machine compiler
   –  Case study for QuickChick
   –  Monotonic state (freshness!)
•  Scaling up: Vellvm
   –  Taste of the full LLVM IR
   –  Operational Semantics
   –  Metatheory + Proof Techniques
•  Case studies:
   –  SoftBound memory safety
   –  mem2reg
•  Conclusion:
   –  challenges & research directions

Page 11

Example LLVM Code

•  LLVM offers a textual representation of its IR
   –  files ending in .ll

factorial64.c

    #include <stdio.h>
    #include <stdint.h>

    int64_t factorial(int64_t n) {
      int64_t acc = 1;
      while (n > 0) {
        acc = acc * n;
        n = n - 1;
      }
      return acc;
    }

factorial-pretty.ll

    define @factorial(%n) {
      %1 = alloca
      %acc = alloca
      store %n, %1
      store 1, %acc
      br label %start

    start:
      %3 = load %1
      %4 = icmp sgt %3, 0
      br %4, label %then, label %else

    then:
      %6 = load %acc
      %7 = load %1
      %8 = mul %6, %7
      store %8, %acc
      %9 = load %1
      %10 = sub %9, 1
      store %10, %1
      br label %start

    else:
      %12 = load %acc
      ret %12
    }

Page 12

Real LLVM

•  Decorates values with type information
      i64
      i64*
      i1
•  Permits numeric identifiers
•  Has alignment annotations
•  Keeps track of entry edges for each block:
      preds = %5, %0

factorial64-pretty.ll

    ; Function Attrs: nounwind ssp
    define i64 @factorial(i64 %n) #0 {
      %1 = alloca i64, align 8
      %acc = alloca i64, align 8
      store i64 %n, i64* %1, align 8
      store i64 1, i64* %acc, align 8
      br label %2

    ; <label>:2                                       ; preds = %5, %0
      %3 = load i64* %1, align 8
      %4 = icmp sgt i64 %3, 0
      br i1 %4, label %5, label %11

    ; <label>:5                                       ; preds = %2
      %6 = load i64* %acc, align 8
      %7 = load i64* %1, align 8
      %8 = mul nsw i64 %6, %7
      store i64 %8, i64* %acc, align 8
      %9 = load i64* %1, align 8
      %10 = sub nsw i64 %9, 1
      store i64 %10, i64* %1, align 8
      br label %2

    ; <label>:11                                      ; preds = %2
      %12 = load i64* %acc, align 8
      ret i64 %12
    }

Page 13

Basic Blocks

•  A sequence of instructions that is always executed starting at the first instruction and always exits at the last instruction.
   –  Starts with a label that names the entry point of the basic block.
   –  Ends with a control-flow instruction (e.g. branch or return), the “link”.
   –  Contains no other control-flow instructions.
   –  Contains no interior label used as a jump target.
•  Basic blocks can be arranged into a control-flow graph
   –  There is a directed edge from node A to node B if the control-flow instruction at the end of basic block A might jump to the label of basic block B.

Page 14

Example Control-flow Graph

    define @factorial(%n) {

    entry:
      %1 = alloca
      %acc = alloca
      store %n, %1
      store 1, %acc
      br label %start

    loop:
      %3 = load %1
      %4 = icmp sgt %3, 0
      br %4, label %then, label %else

    body:
      %6 = load %acc
      %7 = load %1
      %8 = mul %6, %7
      store %8, %acc
      %9 = load %1
      %10 = sub %9, 1
      store %10, %1
      br label %start

    post:
      %12 = load %acc
      ret %12

    }

[Figure note: the node names entry, loop, body, and post are the figure's labels; the branch targets %start, %then, and %else inside the nodes are the names used in factorial-pretty.ll and appear to correspond to loop, body, and post.]

Page 15

OPTIMIZED LLVM CODE

See factorial64.ll

Page 16

Static Single Assignment (SSA)

•  Compiler intermediate representation developed in the late 1980s and early 1990s:
   –  Detecting Equality of Variables in Programs [Alpern, Wegman, Zadeck 1988]
   –  Global Value Numbers and Redundant Computations [Rosen, Wegman, Zadeck 1988]
   –  An Efficient Method of Computing Static Single Assignment Form [Cytron, Ferrante, + RWZ, 1989]
   –  Efficiently Computing Static Single Assignment Form and the Control Dependence Graph [Cytron, et al., TOPLAS 1991]
•  Makes optimizing imperative programming languages clean and efficient… by making them more purely functional
   –  Used in gcc, clang, intel, Jikes, HotSpot, Open64, …

Page 17

INTUITION ABOUT SEMANTICS

See factorial.ml
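
factorial.ml itself is not reproduced in this transcript. One common intuition for SSA semantics, offered here only as an assumption about what such a file might illustrate, is that each basic block behaves like a function, phi-defined identifiers become its parameters, and a branch becomes a tail call. A Python sketch of the factorial loop in that style (all function names are hypothetical):

```python
def factorial(n):
    # entry block: jump to the loop header, supplying the values that the
    # header's phi nodes would select for the edge coming from entry.
    return loop(n, 1)


def loop(n, acc):
    # loop header: n and acc play the role of phi-defined identifiers.
    if n > 0:
        return body(n, acc)
    return done(acc)


def body(n, acc):
    # loop body: every name is bound exactly once, as in SSA.
    acc1 = acc * n
    n1 = n - 1
    return loop(n1, acc1)   # back edge: the arguments feed the phi nodes


def done(acc):
    # exit block: return the accumulated product.
    return acc
```

Python does not eliminate tail calls, so this form is only suitable for small inputs; it is meant to show the correspondence, not to be an efficient implementation.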

Page 18

SSA IRs in Practice

•  SSA yields an efficient representation
   –  Simplifies Def-Use information needed in dataflow analysis
   –  Imperative data structure to map a definition to its uses
•  SSA enables good register allocation:
   –  Good register allocation is (arguably) the most important optimization for performance on modern processors
   –  The left-hand sides of SSA "assignments" can be thought of as “registers”
   –  Register promotion: move stack-allocated data into registers
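
Because each SSA identifier has exactly one definition, def-use information reduces to a map from an identifier to the instructions that mention it. The sketch below assumes a hypothetical instruction encoding `(dest, op, operands)` in which identifiers are strings and constants are integers; it is not the data structure LLVM uses.

```python
from collections import defaultdict


def def_use(instrs):
    """Return (defs, uses): where each name is defined and where it is used."""
    defs = {}
    uses = defaultdict(list)
    for i, (dest, _op, operands) in enumerate(instrs):
        for x in operands:
            if isinstance(x, str):       # identifiers are strings, constants ints
                uses[x].append(i)
        if dest is not None:
            defs[dest] = i               # SSA: exactly one definition per name
    return defs, dict(uses)


# A small straight-line example: r4 = r1 * r2; r5 = r3 + r4; r6 = r5 >= 100
instrs = [
    ("r4", "mul", ["r1", "r2"]),
    ("r5", "add", ["r3", "r4"]),
    ("r6", "ge",  ["r5", 100]),
]
defs, uses = def_use(instrs)
```

An identifier that appears in `defs` but not in `uses` (here `r6`) has no recorded uses, which is the kind of fact a dataflow pass would look up in such a map.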

Page 19

LLVM IR ⇒ Vminus

•  Vastly Simplify! (For now…)
•  Throw out:
   –  types, complex & structured data
   –  local storage allocation, complex pointers
   –  functions
   –  undefined values & nondeterminism
•  What’s left?
   –  basic arithmetic
   –  control flow
   –  global, preallocated state (as in Imp)

Page 20

Vminus by Example

Control-flow Graphs:
+ Labeled blocks

    entry:
      r0 = ...
      r1 = ...
      r2 = ...

    loop:
      r3 = ...
      r4 = r1 * r2
      r5 = r3 + r4
      r6 = r5 ≥ 100

    exit:
      r7 = ...
      r8 = r1 * r2
      r9 = r7 + r8

Page 21

Vminus by Example

Control-flow Graphs:
+ Labeled blocks
+ Binary Operations

    entry:
      r0 = ...
      r1 = ...
      r2 = ...

    loop:
      r3 = ...
      r4 = r1 * r2
      r5 = r3 + r4
      r6 = r5 ≥ 100

    exit:
      r7 = ...
      r8 = r1 * r2
      r9 = r7 + r8

Page 22

Vminus by Example

Control-flow Graphs:
+ Labeled blocks
+ Binary Operations
+ Branches/Return

    entry:
      r0 = ...
      r1 = ...
      r2 = ...
      br r0 loop exit

    loop:
      r3 = ...
      r4 = r1 * r2
      r5 = r3 + r4
      r6 = r5 ≥ 100
      br r6 loop exit

    exit:
      r7 = ...
      r8 = r1 * r2
      r9 = r7 + r8
      ret r9

Page 23

Vminus by Example

Control-flow Graphs:
+ Labeled blocks
+ Binary Operations
+ Branches/Return
+ Static Single Assignment (each local identifier assigned only once, statically)
  local identifier a.k.a. uid or SSA variable

    entry:
      r0 = ...
      r1 = ...
      r2 = ...
      br r0 loop exit

    loop:
      r3 = ...
      r4 = r1 * r2
      r5 = r3 + r4
      r6 = r5 ≥ 100
      br r6 loop exit

    exit:
      r7 = ...
      r8 = r1 * r2
      r9 = r7 + r8
      ret r9
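
The "assigned only once, statically" condition is a property of the program text, so it can be checked without running anything. A minimal sketch, assuming a hypothetical encoding in which each block is a list of `(dest, rhs)` pairs and terminators have `dest` set to `None`:

```python
def check_single_assignment(blocks):
    """Return the identifiers that are defined more than once in the text."""
    seen = set()
    duplicates = []
    for _label, instrs in blocks.items():
        for dest, _rhs in instrs:
            if dest is None:             # terminators define nothing
                continue
            if dest in seen:
                duplicates.append(dest)
            seen.add(dest)
    return duplicates


# The example CFG: every uid r0..r9 is defined in exactly one place.
ssa_program = {
    "entry": [("r0", "..."), ("r1", "..."), ("r2", "..."),
              (None, "br r0 loop exit")],
    "loop":  [("r3", "..."), ("r4", "r1 * r2"), ("r5", "r3 + r4"),
              ("r6", "r5 >= 100"), (None, "br r6 loop exit")],
    "exit":  [("r7", "..."), ("r8", "r1 * r2"), ("r9", "r7 + r8"),
              (None, "ret r9")],
}

# A variant that is not in SSA form: the loop updates r3 in place.
non_ssa_program = {
    "entry": [("r3", "0"), (None, "br loop")],
    "loop":  [("r3", "r3 + 1"), (None, "br loop")],
}
```

`check_single_assignment(ssa_program)` returns an empty list, while the second program reports `r3`. Note that the definition of `r5` in `loop` may still execute many times at run time; the restriction is on definitions in the text, not on dynamic executions.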

Page 24

Vminus by Example

Control-flow Graphs:
+ Labeled blocks
+ Binary Operations
+ Branches/Return
+ Static Single Assignment
+ φ nodes

    entry:
      r0 = ...
      r1 = ...
      r2 = ...
      br r0 loop exit

    loop:
      r3 = φ[0;entry][r5;loop]
      r4 = r1 * r2
      r5 = r3 + r4
      r6 = r5 ≥ 100
      br r6 loop exit

    exit:
      r7 = φ[0;entry][r5;loop]
      r8 = r1 * r2
      r9 = r7 + r8
      ret r9

Page 25

Vminus by Example

Control-flow Graphs:
+ Labeled blocks
+ Binary Operations
+ Branches/Return
+ Static Single Assignment
+ φ nodes (choose values based on predecessor blocks)

    entry:
      r0 = ...
      r1 = ...
      r2 = ...
      br r0 loop exit

    loop:
      r3 = φ[0;entry][r5;loop]
      r4 = r1 * r2
      r5 = r3 + r4
      r6 = r5 ≥ 100
      br r6 loop exit

    exit:
      r7 = φ[0;entry][r5;loop]
      r8 = r1 * r2
      r9 = r7 + r8
      ret r9
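
To make "choose values based on predecessor blocks" concrete, here is a small interpreter sketch for this example. Everything about the encoding is an assumption for illustration: instructions are tuples, the elided definitions `r0 = ...`, `r1 = ...`, `r2 = ...` are replaced by constants passed in as parameters, and a `fuel` counter bounds the number of blocks executed. It is not the Vminus semantics as defined in the Coq development.

```python
import operator

BINOPS = {"mul": operator.mul, "add": operator.add, "ge": operator.ge}


def value(env, x):
    """Look up an identifier; integer constants evaluate to themselves."""
    return env[x] if isinstance(x, str) else x


def run(blocks, entry, fuel=1000):
    """Execute from `entry`, tracking the current and predecessor labels."""
    env, label, pred = {}, entry, None
    while fuel > 0:
        fuel -= 1
        entry_env = dict(env)            # snapshot of the state on block entry
        for instr in blocks[label]:
            op = instr[0]
            if op == "const":            # ("const", dest, n)
                env[instr[1]] = instr[2]
            elif op == "phi":            # ("phi", dest, {pred_label: operand})
                env[instr[1]] = value(entry_env, instr[2][pred])
            elif op in BINOPS:           # (op, dest, lhs, rhs)
                env[instr[1]] = BINOPS[op](value(env, instr[2]),
                                           value(env, instr[3]))
            elif op == "br":             # ("br", cond, then_label, else_label)
                target = instr[2] if value(env, instr[1]) else instr[3]
                pred, label = label, target
                break
            elif op == "ret":            # ("ret", operand)
                return value(env, instr[1])
    raise RuntimeError("out of fuel")


def example(r0, r1, r2):
    """The CFG from the slide, with hypothetical constants for r0, r1, r2."""
    return {
        "entry": [("const", "r0", r0), ("const", "r1", r1),
                  ("const", "r2", r2), ("br", "r0", "loop", "exit")],
        "loop": [("phi", "r3", {"entry": 0, "loop": "r5"}),
                 ("mul", "r4", "r1", "r2"),
                 ("add", "r5", "r3", "r4"),
                 ("ge", "r6", "r5", 100),
                 ("br", "r6", "loop", "exit")],
        "exit": [("phi", "r7", {"entry": 0, "loop": "r5"}),
                 ("mul", "r8", "r1", "r2"),
                 ("add", "r9", "r7", "r8"),
                 ("ret", "r9")],
    }
```

With `r0` nonzero, control reaches `exit` from `loop`, so `r7` takes the value of `r5`; with `r0` zero, control reaches `exit` directly from `entry`, so `r7` is 0. The same φ node therefore yields different values depending on the edge taken.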

Page 26

VMINUS SYNTAX

Vminus.v
CFG.v
ListCFG.v

Page 27

Vminus Operational Semantics

•  Only 5 kinds of instructions:
   –  Binary arithmetic
   –  Memory Load
   –  Memory Store
   –  Terminators
   –  Phi nodes
•  What is the state of a Vminus program?

Page 28

Subtlety of Phi Nodes

•  Phi-Nodes admit “cyclic” dependencies:

    pred:
      ...
      br loop

    loop:
      %x = φ[0;pred][%y;loop]
      %y = φ[1;pred][%x;loop]
      %b = %x ≤ %y
      br %b loop exit

Page 29

Semantics of Phi Nodes

•  The value of the RHS of a phi-defined uid is relative to the state at the entry to the block.
•  Option 1:
   –  Require all phi nodes to be at the beginning of the block
   –  Execute them “atomically, in parallel”
   –  (Original Vellvm followed this model)
•  Option 2:
   –  Keep track of the state upon entry to the block
   –  Calculate the RHS of phi nodes relative to the entry state
   –  (Vminus follows this model)
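
The two options can be compared on the cyclic x/y example with a short Python sketch. The encoding (a phi node as `(dest, {pred_label: operand})`, environments as dicts) and the function names are hypothetical; the naive sequential version is included only to show what goes wrong without either option.

```python
def phis_sequential(phis, pred, env):
    """Naive (incorrect) reading: run phi nodes one after another in place."""
    env = dict(env)
    for dest, choices in phis:
        src = choices[pred]
        env[dest] = env[src] if isinstance(src, str) else src
    return env


def phis_parallel(phis, pred, env):
    """Option 1: compute every RHS first, then perform all updates at once."""
    new_values = []
    for dest, choices in phis:
        src = choices[pred]
        new_values.append((dest, env[src] if isinstance(src, str) else src))
    env = dict(env)
    env.update(new_values)
    return env


def phis_entry_state(phis, pred, env):
    """Option 2: remember the entry state and read every RHS from it."""
    entry_env = dict(env)          # state upon entry to the block
    env = dict(env)
    for dest, choices in phis:
        src = choices[pred]
        env[dest] = entry_env[src] if isinstance(src, str) else src
    return env


# loop: %x = phi [0; pred] [%y; loop]
#       %y = phi [1; pred] [%x; loop]
phis = [("x", {"pred": 0, "loop": "y"}),
        ("y", {"pred": 1, "loop": "x"})]

first = phis_parallel(phis, "pred", {})        # arriving from pred
swapped = phis_parallel(phis, "loop", first)   # arriving from loop
```

Arriving from `pred` gives x = 0 and y = 1. On the back edge, both options swap the two values (x = 1, y = 0), whereas the sequential reading overwrites `x` before `y` reads it and ends with x = 1, y = 1.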