CS 671 Compilers

22
CS 671 Compilers Prof. Kim Hazelwood Spring 2008

description

CS 671 Compilers. Prof. Kim Hazelwood Spring 2008. High-Level Programming Languages. Compiler. Machine Code. Error Messages. Then What?. Machine Code. Program Inputs. Program Outputs. What is a Compiler?. Source Code. Interpreter. Program Outputs. Program Inputs. - PowerPoint PPT Presentation

Transcript of CS 671 Compilers

Page 1: CS 671 Compilers

CS 671Compilers

Prof. Kim HazelwoodSpring 2008

Page 2: CS 671 Compilers

2 CS 671 – Spring 2008

What is a Compiler?

CompilerHigh-Level

Programming Languages

Machine Code

Error Messages

Then What?

MachineCode

Program Inputs

Program Outputs

Page 3: CS 671 Compilers

3 CS 671 – Spring 2008

What is an Interpreter?

Interpreter

Source CodeProgram OutputsProgram

Inputs

Page 4: CS 671 Compilers

4 CS 671 – Spring 2008

What is a Just-in-Time Compiler?

IR Generator

Source Code

Program Outputs

Program Inputs

VirtualMachine

Intermediate Program

Page 5: CS 671 Compilers

5 CS 671 – Spring 2008

Why Study Compilers?

Fundamental tool in computer science since 1952

Remains a vibrant research topic• Machines keep changing

• Languages keep changing

• Applications keep changing

• When to compile keeps changing

Challenging!• Must correctly handle an infinite set of legal programs

• Is itself a large program → “Where theory meets practice”

Page 6: CS 671 Compilers

6 CS 671 – Spring 2008

Goals of a Compiler

A compiler’s job is to • Lower the abstraction level• Eliminate overhead from language abstractions• Map source program onto hardware efficiently

– Hide hardware weaknesses, utilize hardware strengths• Equal the efficiency of a good assembly programmer

Optimizing compilers should improve the code• Performance*• Code size• Security• Reliability• Power consumption

Page 7: CS 671 Compilers

7 CS 671 – Spring 2008

An Interface to High-Level Languages

Programmers write in high-level languages• Increases productivity• Easier to maintain/debug• More portable

HLLs also protect the programmer from low-level details• Registers and caches – the register keyword• Instruction selection• Instruction-level parallelism

The catch: HLLs are less efficient

CompilerHigh-Level

Programming

Languages

Machine Code

Page 8: CS 671 Compilers

8 CS 671 – Spring 2008

High-Level Languages and Features

C (80’s) … C++ (Early 90’s) … Java (Late 90’s)

Each language had features that spawned new research

C/Fortran/COBOL• User-defined aggregate data types (arrays, structures)• Control-flow and procedures• Prompted data-flow optimizations

C++/Simula/Modula II/Smalltalk• Object orientation (more, smaller procedures)• Prompted inlining

Java• Type safety, bounds checking, garbage collection• Prompted bounds removal, dynamic optimization

Page 9: CS 671 Compilers

9 CS 671 – Spring 2008

An Interface to Computer Architectures

Parallelism• Instruction level

– multiple operations at once– want to minimize dependences• Processor level

– multiple threads at once– want to minimize synchronization

Memory Hierarchies• Register allocation (only portion explicitly managed in SW)• Code and data layout (helps the hardware cache manager)

Designs driven by how well compilers can leverage new features!

CompilerHigh-Level

Programming Languages

Machine Code

Page 10: CS 671 Compilers

10 CS 671 – Spring 2008

How Can We Translate Effectively?

?

High-Level Source Code

Low-Level Machine Code

Page 11: CS 671 Compilers

11 CS 671 – Spring 2008

Idea: Translate in Steps

• Series of program representations

• Intermediate representations optimized for various manipulations (checking, optimization)

• More machine specific, less language specific as translation proceeds

Page 12: CS 671 Compilers

12 CS 671 – Spring 2008

Simplified Compiler Structure

Lexical Analysis

Parsing

Intermediate Code Generation

Optimization

Source code(character stream)if (b==0) a = b;

Assembly code(character stream)CMP CX, 0CMOVZ CX, DX

Token stream

Abstract syntax tree

Intermediate code

Front End

Back EndMachine dependent

Machine independent

Register Allocation

LIR

Page 13: CS 671 Compilers

13 CS 671 – Spring 2008

Why Separate the Front and Back Ends?Recall: An interface between HLLs and architectures

Option: X*Y compilers or X-front ends + Y-back ends

Compiler Hello alpha

Hello x86

Hello sparc

hello.c

hello.f

hello.cc

hello.ada

X F

Es Y

BEs

Front End Hello alpha

Hello x86

Hello sparc

hello.c

hello.f

hello.cc

hello.ada

Back EndIR

Page 14: CS 671 Compilers

14 CS 671 – Spring 2008

Internal Compiler Structure –Front End

Series of filter passes

•Source program – Written in a HLL

•Lexical analysis – Convert keywords into “tokens”

•Parser – Forms a syntax “tree” (statements, expressions, etc.)

•Semantic analysis – Type checking, etc.

•Intermediate code generator – Three-address code, interface for back end

Lexical Analyzer

ParserSemantic Analyzer

Source Program

Token Stream

Syntax Tree

Syntax Tree

Intermediate Code Gen

IR

Page 15: CS 671 Compilers

15 CS 671 – Spring 2008

Internal Compiler Structure –Back End

Code optimization – “improves” the intermediate code (most time is spent here)• Consists of machine independent & dependent

optimizations

Code generation• Register allocation, instruction selection

Code Optimizer

Code Generator

IR IR Target program

Page 16: CS 671 Compilers

16 CS 671 – Spring 2008

GCC Demo

Page 17: CS 671 Compilers

17 CS 671 – Spring 2008

Traditional Compiler Infrastructures

GNU GCC• Targets: everything (pretty much)• Strength: Targets everything (pretty much)• Weakness: Not as extensible as research infrastructures, poor

optimization

Stanford SUIF Compiler with Harvard MachSUIF• Targets: Alpha, x86, IPF, C• Strength: high level analysis, parallelization on scientific codes

Intel Open Research Compiler (ORC)• Targets: IPF• Strength: robust with OK code quality• Weakness: Many IR levels

Page 18: CS 671 Compilers

18 CS 671 – Spring 2008

Modern Compiler Infrastructures

IBM Jikes RVM• Targets Linux, x86 or AIX• Strengths: Open-source, wide user base• Weaknesses: In maintenance mode

Microsoft Phoenix• Targets Windows• Strengths: Actively developed• Weaknesses: Closed source, extensive API

Page 19: CS 671 Compilers

19 CS 671 – Spring 2008

What will we learn in this course?

• Structure of Compilers • The Front End• The Back End• Advanced Topics

– Just-in-time compilation – Dynamic optimization– Power and size optimizations

Page 20: CS 671 Compilers

20 CS 671 – Spring 2008

Required Work

Two Non-Cumulative Exams (15% each)– February 21 and April 10

Homework (30%)– About 4 assignments– Some are pencil/paper; some are implementation-based

Semester Project (40%)– Groups of 2– Staged submission

– proposal 30%– report 55%– presentation 15%

Late Policy2 ^ (days late) points off (where days late > 0)

Page 21: CS 671 Compilers

21 CS 671 – Spring 2008

Course Materials

We will use the Dragon book

Course Website– www.cs.virginia.edu/kim/courses/cs671/

Lecture Slides– On the course website– I will try to post them before class– Attending class is in your best interest

Other Helpful Books

Page 22: CS 671 Compilers

22 CS 671 – Spring 2008

Next Time…

Read Dragon Chapter 1

We will begin discussing lexical analysis

Look out for HW1