COMP 348 Principles of Programming Languages

Lecture Notes

Peter Grogono

These notes may be copied for students who are taking either COMP 348 Principles of Programming Languages or COMP 6411 Comparative Study of Programming Languages.

© Peter Grogono, 2002, 2003, 2004

Department of Computer Science and Software Engineering, Concordia University

Montreal, Quebec

Contents

1 Introduction

1.1 How important are programming languages?

1.2 Features of the Course

2 Anomalies

3 The Procedural Paradigm

3.1 Early Days

3.2 FORTRAN

3.3 Algol 60

3.4 COBOL

3.5 PL/I

3.6 Algol 68

3.7 Pascal

3.8 Modula–2

3.9 C

3.10 Ada

4 The Functional Paradigm

4.1 LISP

4.1.1 Basic LISP Functions

4.1.2 A LISP Interpreter

4.1.3 Dynamic Binding

4.2 Scheme

4.3 SASL

4.4 SML

4.5 Other Functional Languages

5 Haskell

5.1 Getting Started

5.1.1 Hugs Commands

5.1.2 Literate Haskell

5.1.3 Defining Functions

5.1.4 Syntax

5.1.5 Tracing

5.2 Details

5.2.1 Lists

5.2.2 Lazy Evaluation

5.2.3 Pattern Matching

5.2.4 Operators

5.2.5 Building Haskell Programs

5.2.6 Guards

5.3 Types

5.3.1 Using Types

5.3.2 Failure

5.3.3 Reporting Errors

5.3.4 Standard Types

5.3.5 Standard Classes

5.3.6 Defining New Types

5.3.7 Defining Data Types

5.3.8 Types with Parameters

5.4 Input and Output

5.4.1 String Formatting

5.4.2 Interactive Programs

5.5 Reasoning about programs

6 The Object Oriented Paradigm

6.1 Simula

6.2 Smalltalk

6.3 CLU

6.4 C++

6.5 Eiffel

6.5.1 Programming by Contract

6.5.2 Repeated Inheritance

6.5.3 Exception Handling

6.6 Java

6.6.1 Portability

6.6.2 Interfaces

6.6.3 Exception Handling

6.6.4 Concurrency

6.7 Kevo

6.8 Other OOPLs

6.9 Issues in OOP

6.9.1 Algorithms + Data Structures = Programs

6.9.2 Values and Objects

6.9.3 Classes versus Prototypes

6.9.4 Types versus Objects

6.9.5 Pure versus Hybrid Languages

6.9.6 Closures versus Classes

6.9.7 Inheritance

6.10 A Critique of C++

6.11 Evaluation of OOP

7 Backtracking Languages

7.1 Prolog

7.2 Alma-0

7.3 Other Backtracking Languages

8 Structure

8.1 Block Structure

8.2 Modules

8.2.1 Encapsulation

8.3 Control Structures

8.3.1 Loop Structures.

8.3.2 Procedures and Functions

8.3.3 Exceptions.

9 Names and Binding

9.1 Free and Bound Names

9.2 Attributes

9.3 Early and Late Binding

9.4 What Can Be Named?

9.5 What is a Variable Name?

9.6 Polymorphism

9.6.1 Ad Hoc Polymorphism

9.6.2 Parametric Polymorphism

9.6.3 Object Polymorphism

9.7 Assignment

9.8 Scope and Extent

9.8.1 Scope

9.8.2 Are nested scopes useful?

9.8.3 Extent

10 Abstraction

10.1 Abstraction as a Technical Term

10.1.1 Procedures.

10.1.2 Functions.

10.1.3 Data Types.

10.1.4 Classes.

10.2 Computational Models

11 Implementation

11.1 Compiling and Interpreting

11.1.1 Compiling

11.1.2 Interpreting

11.1.3 Mixed Implementation Strategies

11.1.4 Consequences for Language Design

11.2 Garbage Collection

12 Conclusion

References

A Abbreviations and Glossary

B Theoretical Issues

B.1 Syntactic and Lexical Issues

B.2 Semantics

B.3 Type Theory

B.4 Regular Languages

B.4.1 Tokens

B.4.2 Context Free Grammars

B.4.3 Control Structures and Data Structures

B.4.4 Discussion

C Haskell Implementation

List of Figures

1 The list structure (A (B C))

2 A short session with Hugs

3 Hugs commands

4 Parameters for :set

5 A Literate Haskell program

6 Pattern matching

7 Parsing precedence

8 Operator precedence

9 Standard Class Inheritance Graph

10 The Class Frac

11 Trying out type Frac

12 Class BTree

13 The fringe of a tree

14 Key-value tree

15 Proof of ancestor(pam, jim)

16 Effect of cuts

17 REs and Control Structures

18 REs and Data Structures

19 The graph for succ

1 Introduction

In order to understand why programming languages (PLs) are as they are today, and to predict how they might develop in the future, we need to know something about how they evolved. We consider early languages, but the main focus of the course is on contemporary and evolving PLs.

The treatment is fairly abstract. We do not discuss details of syntax, and we try to avoid unproductive “language wars”. Here is a useful epigram for the course:

C and Pascal are the same language. Dennis Ritchie.

We can summarize our goals as follows: if the course is successful, it should help us to find useful answers to questions such as these.

• What is a good programming language?

• How should we choose an appropriate language for a particular task?

• How should we approach the problem of designing a programming language?

1.1 How important are programming languages?

As the focus of practitioners and researchers has moved from programming to large-scale software development, PLs no longer seem so central to Computer Science (CS). Brooks (1987) and others have argued that, because coding occupies only a small fraction of the total time required for project development, developments in programming languages cannot lead to dramatic increases in software productivity.

The assumption behind this reasoning is that the choice of PL affects only the coding phase of the project. For example, if coding requires 15% of total project time, we could deduce that reducing coding time to zero would reduce the total project time by only 15%. Maintenance, however, is widely acknowledged to be a major component of software development. Estimates of maintenance time range from “at least 50%” (Lientz and Swanson 1980) to “between 65% and 75%” (McKee 1984) of the total time. Since the choice of PL can influence the ease of maintenance, these figures suggest that the PL might have a significant effect on total development time.

Another factor that we should consider in studying PLs is their ubiquity: PLs are everywhere.

• The “obvious” PLs are the major implementation languages: Ada, C++, COBOL, FORTRAN, . . . .

• PLs for the World Wide Web: Java, JavaScript, . . . .

• Scripting PLs: JavaScript, Perl, Tcl/Tk, Python, . . . .

• “Hidden” languages: spreadsheets, macro languages, input for complex applications, . . . .

The following scenario has occurred often in the history of programming. Developers realize that an application requires a format for expressing input data. The format increases in complexity until it becomes a miniature programming language. By the time that the developers realize this, it is too late for them to redesign the format, even if they had principles that they could use to help them. Consequently, the notation develops into a programming language with many of the bad features of old, long-since rejected programming languages.

There is an unfortunate tendency in Computer Science to re-invent language features without carefully studying previous work. Brinch Hansen (1999) points out that, although safe and provably correct methods of programming concurrent processes were developed by himself and Hoare in the mid-seventies, Java does not make use of these methods and is insecure.

We conclude that PLs are important. However, their importance is often under-estimated, sometimes with unfortunate consequences.

1.2 Features of the Course

We will review a number of PLs during the course, but the coverage is very uneven. The time devoted to a PL depends on its relative importance in the evolution of PLs rather than the extent to which it was used or is used. For example, C and C++ are both widely-used languages but, since they did not contribute profound ideas, they are discussed briefly. To provide insight into functional programming, we consider Haskell in some depth.

With hindsight, we can recognize the mistakes and omissions of language designers and criticize their PLs accordingly. These criticisms do not imply that the designers were stupid or incompetent. On the contrary, we have chosen to study PLs whose designers had deep insights and made profound contributions. Given the state of the art at the time when they designed the PLs, however, the designers could hardly be expected to foresee the understanding that would come only with another twenty or thirty years of experience.

We can echo Perlis (1978, pages 88–91) in saying:

[Algol 60] was more racehorse than camel . . . . a rounded work of art . . . . Rarely has a construction so useful and elegant emerged as the output of a committee of 13 meeting for about 8 days . . . . Algol deserves our affection and appreciation.

and yet

[Algol 60] was a noble begin but never intended to be a satisfactory end.

The approach of the course is conceptual rather than technical. We do not dwell on fine lexical or syntactic distinctions, semantic details, or legalistic issues. Instead, we attempt to focus on the deep and influential ideas that have developed with PLs.

Exercises are provided mainly to provoke thought. In many cases, the solution to an exercise can be found (implicitly!) elsewhere in these notes. If we spend time to think about a problem, we are better prepared to understand and appreciate its solution later. Some of the exercises may appear in revised form in the examination at the end of the course.

Many references are provided. The intention is not that you read them all but rather that, if you are interested in a particular aspect of the course, you should be able to find out more about it by reading the indicated material.

2 Anomalies

There are many ways of motivating the study of PLs. One way is to look at some anomalies: programs that look as if they ought to work but don’t, or programs that look as if they shouldn’t work but do.

Example 1: Assignments. Most PLs provide an assignment statement of the form v := E in which v is a variable and E is an expression. The expression E yields a value. The variable v is not necessarily a simple name: it may be an expression such as a[i] or r.f. In any case, it yields an address into which the value of E is stored. In some PLs, we can use a function call in place of E but not in place of v. For example, we cannot write a statement such as

top(stack) = 6;

to replace the top element of a stack. □

Exercise 1. Explain why the asymmetry mentioned in Example 1 exists. Can you see any problems that might be encountered in implementing a function like top? □
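The difference between a value and a location can be made concrete in C. The following is a minimal sketch under stated assumptions: the Stack type, its fixed capacity, and the functions push and top are hypothetical and are not taken from these notes. Here top returns the address of the top element rather than its value, so the dereferenced call may stand on the left of an assignment.

```c
#include <stddef.h>

/* Hypothetical fixed-capacity stack of ints, for illustration only. */
#define STACK_CAPACITY 16

typedef struct {
    int items[STACK_CAPACITY];
    size_t count;               /* number of elements currently stored */
} Stack;

/* Push a value; returns 0 on success, -1 if the stack is full. */
int push(Stack *s, int value)
{
    if (s->count == STACK_CAPACITY)
        return -1;
    s->items[s->count++] = value;
    return 0;
}

/* Return the ADDRESS of the top element, or NULL if the stack is empty.
   Because the result is a pointer, *top(&s) denotes a location and may
   appear on the left-hand side of an assignment. */
int *top(Stack *s)
{
    if (s->count == 0)
        return NULL;
    return &s->items[s->count - 1];
}
```

With these definitions, a caller that has pushed at least one element can write *top(&s) = 6; to replace the top element. The caller must check for NULL before dereferencing, which is one of the costs of this design.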

Example 2: Parameters and Results. PLs are fussy about the kinds of objects that can be passed as parameters to a function or returned as results by a function. In C, an array can be passed to a function, but only by reference, and a function cannot return an array. In Pascal, arrays can be passed by value or by reference to a function, but a function cannot return an array. Most languages do not allow types to be passed as parameters.

It would be useful, for example, to write a procedure such as

void sort (type t, t a[]) {. . . .}

and then call sort(float, scores). □

Exercise 2. Suppose you added types as parameters to a language. What other changes would you have to make to the language? □

Exercise 3. C++ provides templates. Is this equivalent to having types as parameters? □
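Standard C cannot pass a type, but its library function qsort approximates the idea by taking the two facts about the element type that sorting needs: the size of an element and a comparison function. The sketch below shows that approach; the names compare_float and sort_floats are assumptions made for illustration, and only qsort itself comes from the standard library.

```c
#include <stdlib.h>

/* Comparison callback for floats, in the form qsort expects. */
static int compare_float(const void *a, const void *b)
{
    float x = *(const float *)a;
    float y = *(const float *)b;
    return (x > y) - (x < y);   /* yields -1, 0 or 1 */
}

/* Hypothetical wrapper: sorts n floats in ascending order.
   The "type parameter" is passed as an element size plus a comparator. */
void sort_floats(float *a, size_t n)
{
    qsort(a, n, sizeof a[0], compare_float);
}
```

Calling sort_floats(scores, count) plays the role of sort(float, scores). The price is that the compiler cannot check that the comparator matches the element type, since both travel through void pointers.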

Example 3: Data Structures and Files. Conceptually, there is not a great deal of difference between a data structure and a file. The important difference is the more or less accidental feature that data structures usually live in memory and files usually live in some kind of external storage medium, such as a disk. Yet PLs treat data structures and files in completely different ways. □

Exercise 4. Explain why most PLs treat data structures and files differently. □

Example 4: Arrays. In PLs such as C and Pascal, the size of an array is fixed at compile-time. In other PLs, even ones that are older than C and Pascal, such as Algol 60, the size of an array is not fixed until run-time. What difference does this make to the implementor of the PL and the programmers who use it?

Suppose that we have a two-dimensional array a. We access a component of a by writing a[i,j] (except in C, where we have to write a[i][j]). We can access a row of the array by writing a[i]. Why can’t we access a column by writing a[,j]?

It is easy to declare rectangular arrays in most PLs. Why is there usually no provision for triangular, symmetric, banded, or sparse arrays? □

Exercise 5. Suggest ways of declaring arrays of the kinds mentioned in Example 4. □
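In the absence of language support, programmers usually build such arrays by hand. The following sketch, with the hypothetical names TriMatrix, tri_index, tri_create, tri_at and tri_destroy, packs a lower-triangular matrix into a one-dimensional block and maps the pair (i, j) to an offset. It illustrates one possible representation only.

```c
#include <stdlib.h>

/* Hypothetical lower-triangular matrix: only entries with j <= i are stored. */
typedef struct {
    size_t n;        /* number of rows (and columns) */
    double *data;    /* n*(n+1)/2 elements, stored row by row */
} TriMatrix;

/* Map (i, j) with j <= i to an offset in the packed storage.
   Rows 0..i-1 hold 1 + 2 + ... + i = i*(i+1)/2 elements in total. */
size_t tri_index(size_t i, size_t j)
{
    return i * (i + 1) / 2 + j;
}

/* Allocate a zero-filled n-by-n lower-triangular matrix.
   The data member is NULL if allocation fails. */
TriMatrix tri_create(size_t n)
{
    TriMatrix m;
    m.n = n;
    m.data = calloc(n * (n + 1) / 2, sizeof *m.data);
    return m;
}

/* Return a pointer to element (i, j); the caller must ensure j <= i < n. */
double *tri_at(TriMatrix *m, size_t i, size_t j)
{
    return &m->data[tri_index(i, j)];
}

/* Release the storage owned by the matrix. */
void tri_destroy(TriMatrix *m)
{
    free(m->data);
    m->data = NULL;
}
```

The index arithmetic that a compiler performs silently for rectangular arrays is here written out by the programmer, and nothing prevents a caller from passing j > i.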

Example 5: Data Structures and Functions. Pascal and C programs contain declarations for functions and data. Both languages also provide a data aggregation: record in Pascal and struct in C. A record looks rather like a small program: it contains data declarations, but it cannot contain function declarations. □

Exercise 6. Suggest some applications for records with both data and function components. □

Example 6: Memory Management. Most C/C++ programmers learn quite quickly that it is not a good idea to write functions like this one.

char *money (int amount)
{
    char buffer[16];
    sprintf(buffer, "%d.%d", amount / 100, amount % 100);
    return buffer;
}

Writing C functions that return strings is awkward and, although experienced C programmers have ways of avoiding the problem, it remains a defect of both C and C++. □

Exercise 7. What is wrong with the function money in Example 6? Why is it difficult for a C function to return a string (i.e., char *)? □
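One of the conventional ways of avoiding the problem is to make the caller own the storage. The sketch below assumes that convention; the extra parameters buf and size are additions for illustration and do not appear in the original function. It also assumes a non-negative amount and pads the cents to two digits with %02d, a formatting change that goes beyond the original.

```c
#include <stdio.h>
#include <stddef.h>

/* Format a non-negative amount in cents as "dollars.cents" into a buffer
   owned by the caller.  Returns buf on success, or NULL if the text
   (including its terminating null character) did not fit. */
char *money(char *buf, size_t size, int amount)
{
    int written = snprintf(buf, size, "%d.%02d", amount / 100, amount % 100);
    if (written < 0 || (size_t)written >= size)
        return NULL;
    return buf;
}
```

A caller declares char text[16]; and calls money(text, sizeof text, 1234). The result remains valid for as long as text does, at the cost of pushing a size decision onto every caller.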

Example 7: Factoring. In algebra, we are accustomed to factoring expressions. For example, when we see the expression Ax + Bx we are strongly tempted to write it in the form (A + B)x. Sometimes this is possible in PLs: we can certainly rewrite

z = A * x + B * x;

as

z = (A + B) * x;

Most PLs allow us to write something like this:

if (x > PI) y = sin(x) else y = cos(x)

Only a few PLs allow the shorter form:

y = if (x > PI) then sin(x) else cos(x);

Very few PLs recognize the even shorter form:

y = (if (x > PI) then sin else cos)(x);

□
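C's conditional operator allows all three forms to be written, although with ?: in place of if-then-else. The sketch below is illustrative only: the functions twice and half and the constant LIMIT are hypothetical stand-ins for sin, cos and PI, chosen so that the example does not depend on the mathematics library.

```c
/* Two hypothetical functions of the same type, standing in for sin and cos. */
double twice(double x) { return 2.0 * x; }
double half(double x)  { return x / 2.0; }

#define LIMIT 3.0   /* stands in for PI in the notes */

/* Form 1: the assignment is repeated in both branches. */
double choose_statement(double x)
{
    double y;
    if (x > LIMIT)
        y = twice(x);
    else
        y = half(x);
    return y;
}

/* Form 2: the assignment is factored out; the conditional is an expression. */
double choose_expression(double x)
{
    return (x > LIMIT) ? twice(x) : half(x);
}

/* Form 3: the argument is factored out too; the conditional selects a function. */
double choose_function(double x)
{
    return ((x > LIMIT) ? twice : half)(x);
}
```

All three functions compute the same result. The third works because a function name in a C expression yields a pointer to the function, and the two alternatives have the same type.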

Example 8: Returning Functions. Some PLs, such as Pascal, allow nested functions, whereas others, such as C, do not. Suppose that we were allowed to nest functions in C, and we defined the function shown in Listing 1.

Listing 1: Returning a function

typedef int intint(int);
intint addk (int k)
{
    int f (int n)
    {
        return n + k;
    }
    return f;
}

The idea is that addk returns a function that “adds k to its argument”. We would use it as in the following program, which should print “10”:

int add6 (int) = addk(6);
printf("%d", add6(4));

□

Exercise 8. “It would be fairly easy to change C so that a function could return a function.” True or false? (Note that you can represent the function itself by a pointer to its first instruction.) □
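The behaviour intended by Listing 1 can be emulated in standard C by returning a record that pairs a code pointer with the value of k. This is a sketch of one emulation, not a proposal from the notes; the names Adder, add_k and call are hypothetical.

```c
/* A hand-made closure: a code pointer together with the captured variable k. */
typedef struct Adder {
    int (*code)(const struct Adder *self, int n);
    int k;
} Adder;

/* The body of f from Listing 1; k is read from the record, not from a
   stack frame that has already been discarded. */
static int add_k(const Adder *self, int n)
{
    return n + self->k;
}

/* Counterpart of addk: build a closure that adds k to its argument. */
Adder addk(int k)
{
    Adder a;
    a.code = add_k;
    a.k = k;
    return a;
}

/* Apply a closure to an argument. */
int call(const Adder *a, int n)
{
    return a->code(a, n);
}
```

With Adder add6 = addk(6); the expression call(&add6, 4) yields 10. The caller must use call rather than ordinary function-call syntax, which shows how far this emulation is from the notation of Listing 1.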

Example 9: Functions as Values. Example 8 is just one of several oddities about the way programs handle functions. Some PLs do not permit this sequence:

if (x > PI) f = sin else f = cos;
f(x);

□

Exercise 9. Why are statements like this permitted in C (with appropriate syntax) but not in Pascal? □

Example 10: Functions that work in more than one way. The function Add(x,y,z) attempts to satisfy the constraint x + y = z. For instance:

• Add(2,3,n) assigns 5 to n.

• Add(i,3,5) assigns 2 to i.

□
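A procedural language can only approximate this behaviour, because the caller has to state which argument is the unknown one. The sketch below makes that explicit with a hypothetical Unknown tag and a function named add; both are assumptions for illustration. The Add of Example 10 would instead work out for itself which argument lacks a value.

```c
/* Hypothetical tag saying which of the three variables is to be computed. */
typedef enum { SOLVE_X, SOLVE_Y, SOLVE_Z } Unknown;

/* Enforce x + y = z by assigning to whichever variable is marked unknown.
   The other two variables are read and left unchanged. */
void add(int *x, int *y, int *z, Unknown unknown)
{
    switch (unknown) {
    case SOLVE_X: *x = *z - *y; break;
    case SOLVE_Y: *y = *z - *x; break;
    case SOLVE_Z: *z = *x + *y; break;
    }
}
```

Every argument has to be passed by address, since any one of them may be the result, and constants such as 2 and 3 must first be stored in variables.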

Example 11: Logic computation. It would be convenient if we could encode logical assertions such as

∀ i • 0 < i < N ⇒ a[i − 1] < a[i]

in the form

all (int i = 1; i < N; i++)
    a[i-1] < a[i]

and

∃ i • 0 ≤ i < N ∧ a[i] = k

in the form

exists (int i = 0; i < N; i++)
    a[i] == k

□

Example 12: Implicit Looping. It is a well-known fact that we can rewrite programs that use loops as programs that use recursive functions. It is less well-known that we can eliminate the recursion as well: see Listing 2.

Listing 2: The Factorial Function

typedef TYPE . . . .
TYPE fac (int n, TYPE f)
{
    if (n <= 1)
        return 1;
    else
        return n * f(n-1, f);
}
printf("%d", fac(3, fac));

This program computes factorials using neither loops nor recursion:

fac(3, fac) = 3 × fac(2, fac)
            = 3 × 2 × fac(1, fac)
            = 3 × 2 × 1
            = 6

In early versions of Pascal, functions like this actually worked. When type-checking was improved, however, they would no longer compile. □

Exercise 10. Can you complete the definition of TYPE in Listing 2? □
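For comparison, the self-application idea can be expressed in standard C if the function pointer is wrapped in a struct, because a struct type can be named before it is defined. This is a sketch of one workaround, not a completion of TYPE as Listing 2 writes it; the names Self, FacFn and factorial are hypothetical.

```c
/* Wrapping the function pointer in a struct gives the type a name that
   it can mention in its own definition. */
struct Self;
typedef int (*FacFn)(int n, const struct Self *f);
struct Self {
    FacFn fn;
};

/* fac never mentions its own name: it only calls the function it is given. */
int fac(int n, const struct Self *f)
{
    if (n <= 1)
        return 1;
    else
        return n * f->fn(n - 1, f);
}

/* Counterpart of the call fac(3, fac) in Listing 2. */
int factorial(int n)
{
    struct Self self = { fac };
    return fac(n, &self);
}
```

The call factorial(3) follows the same expansion as the one shown above for fac(3, fac). The body of fac contains no loop and no reference to fac itself; the repetition comes entirely from the argument that is passed along.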

The point of these examples is to show that PLs affect the way we think. It would not occur to most programmers to write programs in the style of the preceding sections because most PLs do not permit these constructions. Working with low-level PLs leads to the belief that “this is all that computers can do”.

Garbage collection provides an example. Many programmers have struggled for years writing programs that require explicit deallocation of memory. Memory “leaks” are one of the most common forms of error in large programs and they can do serious damage. (One of the Apollo missions almost failed because of a memory leak that was corrected shortly before memory was exhausted, when the rocket was already in lunar orbit.) These programmers may be unaware of the existence of PLs that provide garbage collection, or they may believe (because they have been told) that garbage collection has an unacceptably high overhead.

3 The Procedural Paradigm

The introduction of the von Neumann architecture was a crucial step in the development of electronic computers. The basic idea is that instructions can be encoded as data and stored in the memory of the computer. The first consequence of this idea is that changing and modifying the stored program is simple and efficient. In fact, changes can take place at electronic speeds, a very different situation from earlier computers that were programmed by plugging wires into panels. The second, and ultimately more far-reaching, consequence is that computers can process programs themselves, under program control. In particular, a computer can translate a program from one notation to another. Thus the stored-program concept led to the development of programming “languages”.

The history of PLs, like the history of any technology, is complex. There are advances and setbacks; ideas that enter the mainstream and ideas that end up in a backwater; even ideas that are submerged for a while and later surface in an unexpected place.

With the benefit of hindsight, we can identify several strands in the evolution of PLs. These strands are commonly called “paradigms” and, in this course, we survey the paradigms separately although their development was interleaved.

Sources for this section include (Wexelblat 1981; Williams and Campbell-Kelly 1989; Bergin and Gibson 1996).

3.1 Early Days

The first PLs evolved from machine code. The first programs used numbers to refer to machine addresses. One of the first additions to programming notation was the use of symbolic names rather than numbers to represent addresses.

Briefly, it enables the programmer to refer to any word in a programme by means of a label or tag attached to it arbitrarily by the programmer, instead of by its address in the store. Thus, for example, a number appearing in the calculation might be labelled ‘a3’. The programmer could then write simply ‘A a3’ to denote the operation of adding this number into the accumulator, without having to specify just where the number is located in the machine. (Mutch and Gill 1954)

The key point in this quotation is the phrase “instead of by its address in the store”. Instead of writing

Location  Order
100       A 104
101       A 2
102       T 104
103       H 24
104       C 50
105       T 104

the programmer would write

      A a3
      A 2
      T a3
      H 24
a3)   C 50
      T a3

systematically replacing the address 104 by the symbol a3 and omitting the explicit addresses. This establishes the principle that a variable name stands for a memory location, a principle that influenced the subsequent development of PLs and is now known (perhaps inappropriately) as value semantics.
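The translation from the symbolic form back to the numeric form is mechanical, which is why it could be handed over to the computer. The sketch below resolves labels to addresses for a program such as the one above. The Line record and the functions lookup and resolve are hypothetical names, and the representation is an assumption for illustration rather than a description of any historical assembler.

```c
#include <string.h>

/* One line of a hypothetical symbolic program: an optional label, an
   operation letter, and an operand that is either a number or a label. */
typedef struct {
    const char *label;     /* NULL if the line is unlabelled */
    char op;
    const char *symbol;    /* NULL if the operand is numeric */
    int number;            /* used only when symbol is NULL */
} Line;

/* Return the address of the line carrying the given label, or -1 if there
   is none.  Lines occupy consecutive addresses starting at 'base'. */
int lookup(const Line *prog, int count, int base, const char *label)
{
    for (int i = 0; i < count; i++)
        if (prog[i].label != NULL && strcmp(prog[i].label, label) == 0)
            return base + i;
    return -1;
}

/* Compute the numeric operand of each line, replacing labels by addresses.
   Returns 0 on success, -1 if some label is undefined. */
int resolve(const Line *prog, int count, int base, int *operands)
{
    for (int i = 0; i < count; i++) {
        if (prog[i].symbol == NULL) {
            operands[i] = prog[i].number;
        } else {
            int address = lookup(prog, count, base, prog[i].symbol);
            if (address < 0)
                return -1;
            operands[i] = address;
        }
    }
    return 0;
}
```

Applied to the six symbolic lines above with a base address of 100, resolve produces the operands 104, 2, 104, 24, 50 and 104, which are the operands of the numeric version.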

The importance of subroutines and subroutine libraries was recognized before high-level programming languages had been developed, as the following quotation shows.

The following advantages arise from the use of such a library:

1. It simplifies the task of preparing problems for the machine;

2. It enables routines to be more readily understood by other users, as conventions are standardized and the units of a routine are much larger, being subroutines instead of individual orders;

3. Library subroutines may be used directly, without detailed coding and punching;

4. Library subroutines are known to be correct, thus greatly reducing the overall chance of error in a complete routine, and making it much easier to locate errors.

. . . . Another difficulty arises from the fact that, although it is desirable to have subroutines available to cover all possible requirements, it is also undesirable to allow the size of the resulting library to increase unduly. However, a subroutine can be made more versatile by the use of parameters associated with it, thus reducing the total size of the library.

We may divide the parameters associated with subroutines into two classes.

EXTERNAL parameters, i.e. parameters which are fixed throughout the solution of a problem and arise solely from the use of the library;

INTERNAL parameters, i.e. parameters which vary during the solution of the problem.

. . . . Subroutines may be divided into two types, which we have called OPEN and CLOSED. An open subroutine is one which is included in the routine as it stands whereas a closed subroutine is placed in an arbitrary position in the store and can be called into use by any part of the main routine. (Wheeler 1951)

Exercise 11. This quotation introduces a theme that has continued with variations to the present day. Find in it the origins of concern for:

• correctness;

• maintenance;

• encapsulation;

• parameterization;

• genericity;

• reuse;

• space/time trade-offs.

□

Machine code is a sequence of “orders” or “instructions” that the computer is expected to execute. The style of programming that this viewpoint developed became known as the “imperative” or “procedural” programming paradigm. In these notes, we use the term “procedural” rather than “imperative” because programs resemble “procedures” (in the English, non-technical sense) or recipes rather than “commands”. Confusingly, the individual steps of procedural PLs, such as Pascal and C, are often called “statements”, although in logic a “statement” is a sentence that is either true or false.

By default, the commands of a procedural program are executed sequentially. Procedural PLs provide various ways of escaping from the sequence. The earliest mechanisms were the “jump” command, which transferred control to another part of the program, and the “jump and store link” command, which transferred control but also stored a “link” to which control would be returned after executing a subroutine.

The data structures of these early languages were usually rather simple: typically primitive values (integers and floats) were provided, along with single- and multi-dimensioned arrays of primitive values.

3.2 FORTRAN

FORTRAN was introduced in 1957 at IBM by a team led by John Backus. The “Preliminary Report” describes the goal of the FORTRAN project:

The IBM Mathematical Formula Translation System or briefly, FORTRAN, will comprise a large set of programs to enable the IBM 704 to accept a concise formulation of a problem in terms of a mathematical notation and to produce automatically a high-speed 704 program for the solution of the problem. (Quoted in (Sammet 1969).)

This suggests that the IBM team’s goal was to eliminate programming! The following quotation seems to confirm this:

If it were possible for the 704 to code problems for itself and produce as good programs as human coders (but without the errors), it was clear that large benefits could be achieved. (Backus 1957)

It is interesting to note that, 20 years later, Backus (1978) criticized FORTRAN and similar languages as “lacking useful mathematical properties”. He saw the assignment statement as a source of inefficiency: “the von Neumann bottleneck”. The solution, however, was very similar to the solution he advocated in 1957, namely that programming must become more like mathematics: “we should be focusing on the form and content of the overall result”.

Although FORTRAN did not eliminate programming, it was a major step towards the elimination of assembly language coding. The designers focused on efficient implementation rather than elegant language design, knowing that acceptance depended on the high performance of compiled programs.

FORTRAN has value semantics. Variable names stand for memory addresses that are determined when the program is loaded.

The major achievements of FORTRAN are:

. efficient compilation;


. separate compilation (programs can be presented to the compiler as separate subroutines, but the compiler does not check for consistency between components);

. demonstration that high-level programming, with automatic translation to machine code, is feasible.

The principal limitations of FORTRAN are:

Flat, uniform structure. There is no concept of nesting in FORTRAN. A program consists of a sequence of subroutines and a main program. Variables are either global or local to subroutines. In other words, FORTRAN programs are rather similar to assembly language programs: the main difference is that a typical line of FORTRAN describes evaluating an expression and storing its value in memory whereas a typical line of assembly language specifies a machine instruction (or a small group of instructions in the case of a macro).

Limited control structures. The control structures of FORTRAN are IF, DO, and GOTO. Since there are no compound statements, labels provide the only indication that a sequence of statements forms a group.

Unsafe memory allocation. FORTRAN borrows the concept of COMMON storage from assembly language programming. This enables different parts of a program to share regions of memory, but the compiler does not check for consistent usage of these regions. One program component might use a region of memory to store an array of integers, and another might assume that the same region contains reals. To conserve precious memory, FORTRAN also provides the EQUIVALENCE statement, which allows variables with different names and types to share a region of memory.

No recursion. FORTRAN allocates all data, including the parameters and local variables of subroutines, statically. Recursion is forbidden because only one instance of a subroutine can be active at one time.
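The effect of static allocation on recursion can be imitated in a modern language. The following Python sketch is only a model, not FORTRAN: the names FACT_N, fact_static, and fact_stack are invented for illustration, and a one-element list stands in for the single storage location that a FORTRAN compiler would reserve for a subroutine parameter.

```python
# Hypothetical model: one statically allocated slot for the parameter n,
# shared by every activation of fact_static (as early FORTRAN would do).
FACT_N = [0]


def fact_static(n):
    FACT_N[0] = n                        # store the argument in the static slot
    if FACT_N[0] <= 1:
        return 1
    inner = fact_static(FACT_N[0] - 1)   # the inner call overwrites FACT_N
    return FACT_N[0] * inner             # reads the overwritten value, not n


def fact_stack(n):
    # Each activation has its own n, held in a stack-allocated frame.
    return 1 if n <= 1 else n * fact_stack(n - 1)


print(fact_static(4), fact_stack(4))     # prints: 1 24
```

Because every activation shares one slot, the value of n seen after the recursive call returns is the value left behind by the innermost call, so the static version computes 1 instead of 24.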

Exercise 12. The FORTRAN 1966 Standard stated that a FORTRAN implementation may allow recursion but is not required to do so. How would you interpret this statement if you were:

. writing a FORTRAN program?

. writing a FORTRAN compiler?


3.3 Algol 60

During the late fifties, most of the development of PLs was coming from industry. IBM dominated, with COBOL, FORTRAN, and FLPL (FORTRAN List Processing Language), all designed for the IBM 704. Algol 60 (Naur et al. 1960; Naur 1978; Perlis 1978) was designed by an international committee, partly to provide a PL that was independent of any particular company and its computers. The committee included both John Backus (chief designer of FORTRAN) and John McCarthy (designer of LISP).

The goal was a “universal programming language”. In one sense, Algol was a failure: few complete, high-quality compilers were written and the language was not widely used (although it was used more in Europe than in North America). In another sense, Algol was a huge success: it became the standard language for describing algorithms. For the better part of 30 years, the ACM required submissions to the algorithm collection to be written in Algol.

Listing 3: An Algol Block

    begin
        integer x;
        begin
            function f(x) begin ... end;
            integer x;
            real y;
            x := 2;
            y := 3.14159;
        end;
        x := 1;
    end

The major innovations of Algol are discussed below.

Block Structure. Algol programs are recursively structured. A program is a block. A block consists of declarations and statements. There are various kinds of statement; in particular, one kind of statement is a block. A variable or function name declared in a block can be accessed only within the block: thus Algol introduced nested scopes. The recursive structure of programs means that large programs can be constructed from small programs. In the Algol block shown in Listing 3, the two assignments to x refer to two different variables.

The run-time entity corresponding to a block is called an activation record (AR). The AR is created on entry to the block and destroyed after the statements of the block have been executed. The syntax of Algol ensures that blocks are fully nested; this in turn means that ARs can be allocated on a stack. Block structure and stacked ARs have been incorporated into almost every language since Algol.
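The scoping of Listing 3 can be approximated in Python, where nested functions play the role of nested blocks. This is an analogy rather than a translation: the function names outer_block and inner_block are invented, and the initial value given to the outer x is an assumption needed only because Python has no declarations.

```python
def outer_block():
    x = 0                      # outer x (Algol: integer x)

    def inner_block():
        x = 2                  # a new, inner x that hides the outer one
        y = 3.14159
        return x, y            # inner x and y disappear when the block exits

    inner_values = inner_block()
    x = 1                      # assigns to the outer x
    return x, inner_values


print(outer_block())           # prints: (1, (2, 3.14159))
```

The inner activation record is created when inner_block is entered and discarded when it returns, so the two assignments to x never interfere.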

Dynamic Arrays. The designers of Algol realized that it was relatively simple to allow the size of an array to be determined at run-time. The compiler statically allocates space for a pointer and an integer (collectively called a “dope vector”) on the stack. At run-time, when the size of the array is known, the appropriate amount of space is allocated on the stack and the components of the “dope vector” are initialized. The following code works fine in Algol 60.

    procedure average (n); integer n;
    begin
        real array a[1:n];
        . . . .
    end;

Despite the simplicity of the implementation, successor PLs such as C and Pascal dropped this useful feature.

Call By Name. The default method of passing parameters in Algol was “call by name” and it was described by a rather complicated “copy rule”. The essence of the copy rule is that the program behaves as if the text of the formal parameter in the function is replaced by the text of the actual parameter. The complications arise because it may be necessary to rename some of the variables during the textual substitution. The usual implementation strategy was to translate the actual parameter into a procedure with no arguments (called a “thunk”); each occurrence of the formal parameter inside the function was translated into a call to this procedure.

Listing 4: Call by name

    procedure count (n); integer n;
    begin
        n := n + 1
    end

Listing 5: A General Sum Function

    integer procedure sum (max, i, val); integer max, i, val;
    begin
        integer s;
        s := 0;
        for i := 1 step 1 until max do
            s := s + val;
        sum := s
    end

The mechanism seems strange today because few modern languages use it. However, the Algol committee had several valid reasons for introducing it.

. Call by name enables procedures to alter their actual parameters. If the procedure count is defined as in Listing 4, the statement

    count(widgets)

has the same effect as the statement

    begin
        widgets := widgets + 1
    end

The other parameter passing mechanism provided by Algol, call by value, does not allow a procedure to alter the value of its actual parameters in the calling environment: the parameter behaves like an initialized local variable.

. Call by name provides control structure abstraction. The procedure in Listing 5 provides a form of abstraction of a for loop. The first parameter specifies the number of iterations, the second is the loop index, and the third is the loop body. The statement

    sum(3, i, a[i])

computes a[1]+a[2]+a[3].

. Call by name evaluates the actual parameter exactly as often as it is accessed. (This is in contrast with call by value, where the parameter is usually evaluated exactly once, on entry to the procedure.) For example, if we declare the procedure try as in Listing 6, it is safe to call try(x > 0, 1.0/x), because, if x ≤ 0, the expression 1.0/x will not be evaluated.
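The thunk implementation described above can be sketched in Python by passing parameterless lambdas in place of actual parameters. This is a model under stated assumptions, not Algol: the class Ref (a mutable cell standing in for the variable i) and the functions sum_by_name and try_by_name are invented names, and a dictionary stands in for the array a[1:3].

```python
class Ref:
    """A mutable cell, used so the callee can assign to the caller's variable."""

    def __init__(self, value=0):
        self.value = value


def sum_by_name(max_thunk, i_ref, val_thunk):
    # Mirrors Listing 5: val_thunk is re-evaluated on every iteration,
    # after the loop index has been updated.
    s = 0
    i_ref.value = 1
    while i_ref.value <= max_thunk():
        s = s + val_thunk()
        i_ref.value += 1
    return s


def try_by_name(b_thunk, x_thunk):
    # Mirrors Listing 6: x_thunk is evaluated only if b_thunk() is true.
    return x_thunk() if b_thunk() else 0.0


a = {1: 10, 2: 20, 3: 30}
i = Ref()
total = sum_by_name(lambda: 3, i, lambda: a[i.value])

x = 0.0
safe = try_by_name(lambda: x > 0, lambda: 1.0 / x)

print(total, safe)             # prints: 60 0.0
```

The call that computes safe never divides by zero, because the thunk for 1.0/x is never invoked when x > 0 is false.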

Own Variables. A variable in an Algol procedure can be declared own. The effect is that the variable has local scope (it can be accessed only by the statements within the procedure) but global extent (its lifetime is the execution of the entire program).
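Python has no own declaration, but a closure gives a similar separation of scope and extent. In the sketch below, which uses the invented names make_next_id, next_id, and last, the variable last is visible only inside next_id yet keeps its value between calls.

```python
def make_next_id():
    last = 0                   # plays the role of an own variable

    def next_id():
        nonlocal last
        last += 1              # the value survives from one call to the next
        return last

    return next_id


next_id = make_next_id()
first, second = next_id(), next_id()
print(first, second)           # prints: 1 2
```

Unlike an Algol own variable, last here is initialized explicitly when make_next_id runs, which sidesteps the initialization question raised in the exercises.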

Exercise 13. Why does i appear in the parameter list of sum?


Listing 6: Using call by name

    real procedure try (b, x); boolean b; real x;
    begin
        try := if b then x else 0.0
    end

Exercise 14. Discuss the initialization of own variables.

Algol 60 and most of its successors, like FORTRAN, have value semantics. A variable name stands for a memory address that is determined when the block containing the variable declaration is entered at run-time.

With hindsight, we can see that Algol made important contributions but also missed some very interesting opportunities.

. An Algol block without statements is, in effect, a record. Yet Algol 60 does not provide records.

. The local data of an AR is destroyed after the statements of the AR have been executed. If the data was retained rather than destroyed, Algol would be a language with modules.

. An Algol block consists of declarations followed by statements. Suppose that declarations and statements could be interleaved in a block. In the following block, D denotes a sequence of declarations and S denotes a sequence of statements.

    begin
        D1
        S1
        D2
        S2
    end

A natural interpretation would be that S1 and S2 are executed concurrently.

. Own variables were in fact rather problematic in Algol, for various reasons including the difficulty of reliably initializing them (see Exercise 14). But the concept was important: it is the separation of scope and extent that ultimately leads to objects.

. The call by name mechanism was a first step towards the important idea that functions can be treated as values. The actual parameter in an Algol call, assuming the default calling mechanism, is actually a parameterless procedure, as mentioned above. Applying this idea consistently throughout the language would have led to high order functions and paved the way to functional programming.

The Algol committee knew what they were doing, however. They knew that incorporating the “missed opportunities” described above would have led to significant implementation problems. In particular, since they believed that the stack discipline obtained with nested blocks was crucial for efficiency, anything that jeopardized it was not acceptable.

Algol 60 was simple and powerful, but not quite powerful enough. The dominant trend after Algol was towards languages of increasing complexity, such as PL/I and Algol 68. Before discussing these, we take a brief look at COBOL.


3.4 COBOL

COBOL (Sammet 1978) introduced structured data and implicit type conversion. When COBOL was introduced, “programming” was more or less synonymous with “numerical computation”. COBOL introduced “data processing”, where data meant large numbers of characters. The data division of a COBOL program contained descriptions of the data to be processed.

Another important innovation of COBOL was a new approach to data types. The problem of type conversion had not arisen previously because only a small number of types were provided by the PL. COBOL introduced many new types, in the sense that data could have various degrees of precision, and different representations as text. The choice made by the designers of COBOL was radical: type conversion should be automatic.

The assignment statement in COBOL has several forms, including

    MOVE X TO Y.

If X and Y have different types, the COBOL compiler will attempt to find a conversion from one type to the other. In most PLs of the time, a single statement translated into a small number of machine instructions. In COBOL, a single statement could generate a large amount of machine code.

Example 13: Automatic conversion in COBOL. The Data Division of a COBOL program might contain these declarations:

    77 SALARY PICTURE 99999, USAGE IS COMPUTATIONAL.
    77 SALREP PICTURE $$$,$$9.99

The first indicates that SALARY is to be stored in a form suitable for computation (probably, but not necessarily, binary form) and the second provides a format for reporting salaries as amounts in dollars. (Only one dollar symbol will be printed, immediately before the first significant digit.) The Procedure Division would probably contain a statement like

    MOVE SALARY TO SALREP.

which implicitly requires the conversion from binary to character form, with appropriate formatting.
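To give a feeling for how much work the single MOVE in Example 13 hides, the conversion can be written out explicitly. The Python sketch below is an approximation only: the function name move_salary_to_salrep is invented, and it assumes that the picture $$$,$$9.99 denotes a 10-character, right-justified field with one floating dollar sign, comma grouping, and two decimal places.

```python
def move_salary_to_salrep(salary: int) -> str:
    # Assumed reading of PICTURE $$$,$$9.99 (see the lead-in above).
    text = f"${salary:,.2f}"   # binary integer -> grouped decimal text
    return text.rjust(10)      # pad on the left to the width of the picture


print(repr(move_salary_to_salrep(12345)))   # prints: '$12,345.00'
print(repr(move_salary_to_salrep(42)))      # prints: '    $42.00'
```

In COBOL the programmer writes none of this; the compiler derives the conversion from the two PICTURE clauses.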

Exercise 15. Despite significant advances in the design and implementation of PLs, it remains true that FORTRAN is widely used for “number crunching” and COBOL is widely used for data processing. Can you explain why?

3.5 PL/I

During the early 60s, the dominant languages were Algol, COBOL, and FORTRAN. The continuing desire for a “universal language” that would be applicable to a wide variety of problem domains led IBM to propose a new programming language (originally called NPL but changed, after objections from the UK’s National Physical Laboratory, to PL/I) that would combine the best features of these three languages. Insiders at the time referred to the new language as “CobAlgoltran”.

The design principles of PL/I (Radin 1978) included:

. the language should contain the features necessary for all kinds of programming;

. a programmer could learn a subset of the language, suitable for a particular application, without having to learn the entire language.


An important lesson of PL/I is that these design goals are doomed to failure. A programmer who has learned a “subset” of PL/I is likely, like all programmers, to make a mistake. With luck, the compiler will detect the error and provide a diagnostic message that is incomprehensible to the programmer because it refers to a part of the language outside the learned subset. More probably, the compiler will not detect the error and the program will behave in a way that is inexplicable to the programmer, again because it is outside the learned subset.

PL/I extends the automatic type conversion facilities of COBOL to an extreme degree. For example, the expression (Gelernter and Jagannathan 1990)

(’57’ || 8) + 17

is evaluated as follows:

1. Convert the integer 8 to the string ’8’.

2. Concatenate the strings ’57’ and ’8’, obtaining ’578’.

3. Convert the string ’578’ to the integer 578.

4. Add 17 to 578, obtaining 595.

5. Convert the integer 595 to the string ’595’.

The compiler’s policy, on encountering an assignment x = E, might be paraphrased as: “Do everything possible to compile this statement; as far as possible, avoid issuing any diagnostic message that would tell the programmer what is happening”.
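The five implicit conversions listed above can be made visible by writing each one explicitly. The Python sketch below does so; the function name pl1_style_eval is invented, and Python is used only as a contrast, since it refuses to evaluate "57" + 8 and raises a TypeError instead of converting silently.

```python
def pl1_style_eval():
    step1 = str(8)             # 1. integer 8 -> string '8'
    step2 = "57" + step1       # 2. concatenate -> '578'
    step3 = int(step2)         # 3. string '578' -> integer 578
    step4 = step3 + 17         # 4. add -> 595
    return str(step4)          # 5. integer 595 -> string '595'


print(pl1_style_eval())        # prints: 595
```

Every line of the function corresponds to a decision that the PL/I compiler makes without telling the programmer.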

PL/I did introduce some important new features into PLs. They were not all well-designed, but their existence encouraged others to produce better designs.

. Every variable has a storage class: static, automatic, based, or controlled. Some of these were later incorporated into C.

. An object associated with a based variable x requires explicit allocation and is placed on the heap rather than the stack. Since we can execute the statement allocate x as often as necessary, based variables provide a form of template.

. PL/I provides a wide range of programmer-defined types. Types, however, could not be named.

. PL/I provided a simple, and not very safe, form of exception handling. Statements of the following form are allowed anywhere in the program:

    ON condition
    BEGIN;
        . . . .
    END;

If the condition (which might be OVERFLOW, PRINTER OUT OF PAPER, etc.) becomes TRUE, control is transferred to whichever ON statement for that condition was most recently executed. After the statements between BEGIN and END (the handler) have been executed, control returns to the statement that raised the exception or, if the handler contains a GOTO statement, to the target of that statement.
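The dynamic binding of handlers described above can be modelled in a few lines of Python. This is a hypothetical sketch, not PL/I semantics in full: the names handlers, on, signal, and checked_add are invented, the overflow limit is arbitrary, and GOTO out of a handler is not modelled.

```python
handlers = {}   # condition name -> handler from the most recently executed ON
log = []


def on(condition, handler):
    # Executing an ON statement replaces any earlier handler for the condition.
    handlers[condition] = handler


def signal(condition):
    # Run the current handler, then return to the point that raised the condition.
    handler = handlers.get(condition)
    if handler is not None:
        handler()


def checked_add(a, b, limit=100):
    total = a + b
    if total > limit:
        signal("OVERFLOW")
    return total               # execution resumes here after the handler runs


on("OVERFLOW", lambda: log.append("first handler"))
on("OVERFLOW", lambda: log.append("second handler"))
result = checked_add(60, 70)
print(result, log)             # prints: 130 ['second handler']
```

Which handler runs depends on the order in which ON statements happened to execute, not on where the failing statement appears in the program text, and the computation then carries on with the out-of-range value.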

Exercise 16. Discuss potential problems of the PL/I exception handling mechanism.


3.6 Algol 68

Whereas Algol 60 is a simple and expressive language, its successor Algol 68 (van Wijngaarden et al. 1975; Lindsey 1996) is much more complex. The main design principle of Algol 68 was orthogonality: the language was to be defined using a number of basic concepts that could be combined in arbitrary ways. Although it is true that lack of orthogonality can be a nuisance in PLs, it does not necessarily follow that orthogonality is always a good thing.

The important features introduced by Algol 68 include the following.

. The language was described in a formal notation that specified the complete syntax and semantics of the language (van Wijngaarden et al. 1975). The fact that the Report was very hard to understand may have contributed to the slow acceptance of the language.

. Operator overloading: programmers can provide new definitions for standard operators such as “+”. Even the priority of these operators can be altered.

. Algol 68 has a very uniform notation for declarations and other entities. For example, Algol 68 uses the same syntax (mode name = expression) for types, constants, variables, and functions. This implies that, for all these entities, there must be forms of expression that yield appropriate values.

. In a collateral clause of the form (x, y, z), the expressions x, y, and z can be evaluated in any order, or concurrently. In a function call f(x, y, z), the argument list is a collateral clause. Collateral clauses provide a good, and early, example of the idea that a PL specification should intentionally leave some implementation details undefined. In this example, the Algol 68 report does not specify the order of evaluation of the expressions in a collateral clause. This gives the implementor freedom to use any order of evaluation and hence, perhaps, to optimize.

. The operator ref stands for “reference” and means, roughly, “use the address rather than the value”. This single keyword introduces call by reference, pointers, dynamic data structures, and other features to the language. It appears in C in the form of the operators “*” and “&”.

. A large vocabulary of PL terms, some of which have become part of the culture (cast, coercion, narrowing, . . . .) and some of which have not (mode, weak context, voiding, . . . .).

Like Algol 60, Algol 68 was not widely used, although it was popular for a while in various parts of Europe. The ideas that Algol 68 introduced, however, have been widely imitated.

Exercise 17. Algol 68 has a rule that requires, for an assignment x := E, the lifetime of the variable x must be less than or equal to the lifetime of the object obtained by evaluating E. Explain the motivation for this rule. To what extent can it be checked by the compiler?

3.7 Pascal

Pascal was designed by Wirth (1996) as a reaction to the complexity of Algol 68, PL/I, and other languages that were becoming popular in the late 60s. Wirth made extensive use of the ideas of Dijkstra and Hoare (later published as (Dahl, Dijkstra, and Hoare 1972)), especially Hoare’s ideas of data structuring. The important contributions of Pascal included the following.

. Pascal demonstrated that a PL could be simple yet powerful.

. The type system of Pascal was based on primitives (integer, real, bool, . . . .) and mechanisms for building structured types (array, record, file, set, . . . .). Thus data types in Pascal form a recursive hierarchy just as blocks do in Algol 60.


. Pascal provides no implicit type conversions other than subrange to integer and integer to real. All other type conversions are explicit (even when no action is required) and the compiler checks type correctness.

. Pascal was designed to match Wirth’s (1971) ideas of program development by stepwise refinement. Pascal is a kind of “fill in the blanks” language in which all programs have a similar structure, determined by the relatively strict syntax. Programmers are expected to start with a complete but skeletal “program” and flesh it out in a series of refinement steps, each of which makes certain decisions and adds new details. The monolithic structure that this idea imposes on programs is a drawback of Pascal because it prevents independent compilation of components.

Pascal was a failure because it was too simple. Because of the perceived missing features, supersets were developed and, inevitably, these became incompatible. The first version of “Standard Pascal” was almost useless as a practical programming language and the Revised Standard described a usable language but appeared only after most people had lost interest in Pascal.

Like Algol 60, Pascal missed important opportunities. The record type was a useful innovation (although very similar to the Algol 68 struct) but allowed data only. Allowing functions in a record declaration would have paved the way to modular and even object oriented programming.

Nevertheless, Pascal had a strong influence on many later languages. Its most important innovations were probably the combination of simplicity, data type declarations, and static type checking.

Exercise 18. List some of the “missing features” of Pascal.

Exercise 19. It is well-known that the biggest loop-hole in Pascal’s type structure was the variant record. How serious do you think this problem was?

3.8 Modula–2

Wirth (1982) followed Pascal with Modula–2, which inherits Pascal’s strengths and, to some extent, removes Pascal’s weaknesses. The important contribution of Modula–2 was, of course, the introduction of modules. (Wirth’s first design, Modula, was never completed. Modula–2 was the product of a sabbatical year in California, where Wirth worked with the designers of Mesa, another early modular language.)

A module in Modula–2 has an interface and an implementation. The interface provides information about the use of the module to both the programmer and the compiler. The implementation contains the “secret” information about the module. This design has the unfortunate consequence that some information that should be secret must be put into the interface. For example, the compiler must know the size of the object in order to declare an instance of it. This implies that the size must be deducible from the interface which implies, in turn, that the interface must contain the representation of the object. (The same problem appears again in C++.)

Modula–2 provides a limited escape from this dilemma: a programmer can define an “opaque” type with a hidden representation. In this case, the interface contains only a pointer to the instance and the representation can be placed in the implementation module.

The important features of Modula–2 are:

. Modules with separated interface and implementation descriptions (based on Mesa).

. Coroutines.


3.9 C

C is a very pragmatic PL. Ritchie (Ritchie 1996) designed it for a particular task, systems programming, for which it has been widely used. The enormous success of C is partly accidental. UNIX, after Bell released it to universities, became popular, with good reason. Since UNIX depended heavily on C, the spread of UNIX inevitably led to the spread of C.

C is based on a small number of primitive concepts. For example, arrays are defined in terms of pointers and pointer arithmetic. This is both the strength and weakness of C. The number of concepts is small, but C does not provide real support for arrays, strings, or boolean operations.

C is a low-level language by comparison with the other PLs discussed in this section. It is designed to be easy to compile and to produce efficient object code. The compiler is assumed to be rather unsophisticated (a reasonable assumption for a compiler running on a PDP-11 in the early seventies) and in need of hints such as register. C is notable for its concise syntax. Some syntactic features are inherited from Algol 68 (for example, += and other assignment operators) and others are unique to C and C++ (for example, postfix and prefix ++ and --).

3.10 Ada

Ada (Whitaker 1996) represents the last major effort in procedural language design. It is a large and complex language that combines then-known programming features with little attempt at consolidation. It was the first widely-used language to provide full support for concurrency, with interactions checked by the compiler, but this aspect of the language proved hard to implement.

Ada provides templates for procedures, record types, generic packages, and task types. The corresponding objects are: blocks and records (representable in the language); and packages and tasks (not representable in the language). It is not clear why four distinct mechanisms are required (Gelernter and Jagannathan 1990). The syntactic differences suggest that the designers did not look for similarities between these constructs. A procedure definition looks like this:

    procedure procname ( parameters ) is
        body

A record type looks like this:

    type recordtype ( parameters ) is
        body

The parameters of a record type are optional. If present, they have a different form than the parameters of procedures.

A generic package looks like this:

    generic ( parameters ) package packagename is
        package description

The parameters can be types or values. For example, the template

    generic
        max: integer;
        type element is private;
    package Stack is
        . . . .

might be instantiated by a declaration such as

    package intStack is new Stack(20, integer)


Finally, a task template looks like this (no parameters are allowed):

    task type templatename is
        task description

Of course, programmers hardly notice syntactic differences of this kind: they learn the correct incantation and recite it without thinking. But it is disturbing that the language designers apparently did not consider possible relationships between these four kinds of declaration. Changing the syntax would be a minor improvement, but uncovering deep semantic similarities might have a significant impact on the language as a whole, just as the identity declaration of Algol 68 suggested new and interesting possibilities.
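One way to see the similarity between these templates is to express the generic Stack package shown earlier as an ordinary function that takes the generic parameters and returns the instantiated component. The Python sketch below is an analogy, not Ada semantics: the names make_stack_package and IntStack are invented, and the run-time type check is a stand-in for Ada's compile-time checking.

```python
def make_stack_package(max_size, element_type):
    """Return a new stack class specialised to one size limit and element type."""

    class Stack:
        def __init__(self):
            self._items = []

        def push(self, item):
            if not isinstance(item, element_type):
                raise TypeError("wrong element type")
            if len(self._items) >= max_size:
                raise OverflowError("stack is full")
            self._items.append(item)

        def pop(self):
            return self._items.pop()

    return Stack


# Roughly: package intStack is new Stack(20, integer)
IntStack = make_stack_package(20, int)
s = IntStack()
s.push(1)
s.push(2)
print(s.pop())                 # prints: 2
```

Here instantiating a generic is just calling a function, which is the kind of unification that the four separate Ada notations obscure.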

Exercise 20. Propose a uniform style for Ada declarations.

4 The Functional Paradigm

Procedural programming is based on instructions (“do something”) but, inevitably, procedural PLs also provide expressions (“calculate something”). The key insight of functional programming (FP) is that everything can be done with expressions: the commands are unnecessary.

This point of view has a solid foundation in theory. Turing (1936) introduced an abstract model of “programming”, now known as the Turing machine. Kleene (1936) and Church (1941) introduced the theory of recursive functions. The two theories were later shown (by Kleene) to be equivalent: each had the same computational power. Other theories, such as Post production systems, were shown to have the same power. This important theoretical result shows that FP is not a complete waste of time but it does not tell us whether FP is useful or practical. To decide that, we must look at the functional programming languages (FPLs) that have actually been implemented.

Most functional languages support high order functions. Roughly, a high order function is a function that takes another function as a parameter or returns a function. More precisely:

. A zeroth order expression contains only variables and constants.

. A first order expression may also contain function invocations, but the results and parameters of functions are variables and constants (that is, zeroth order expressions).

. In general, in an n-th order expression, the results and parameters of functions are (n−1)-th order expressions.

. A high order expression is an n-th order expression with n ≥ 2.
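The distinction can be illustrated with a small Python sketch. The function names square, twice, and fourth_power are invented for the example.

```python
def square(x):
    # First order: the parameter and the result are plain values.
    return x * x


def twice(f):
    # High order: the parameter is a function and so is the result.
    return lambda x: f(f(x))


fourth_power = twice(square)
print(fourth_power(3))         # prints: 81
```

The expression twice(square)(3) cannot be written in a language restricted to first order expressions, because the result of twice is itself a function.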

The same conventions apply in logic with “function” replaced by “function or predicate”. In first-order logic, quantifiers can bind variables only; in a high order logic, quantifiers can bind predicates.

4.1 LISP

Anyone could learn LISP in one day, except that if they already knew FORTRAN, it would take three days. (Marvin Minsky)

Functional programming was introduced in 1958 in the form of LISP by John McCarthy. The following account of the development of LISP is based on McCarthy’s (1978) history.

The important early decisions in the design of LISP were:

. to provide list processing (which already existed in languages such as Information Processing Language (IPL) and FORTRAN List Processing Language (FLPL));

. to use a prefix notation (emphasizing the operator rather than the operands of an expression);

. to use the concept of “function” as widely as possible (cons for list construction; car and cdr for extracting list components; cond for conditional, etc.);

. to provide higher order functions and hence a notation for functions (based on Church’s (1941) λ-notation);

. to avoid the need for explicit erasure of unused list structures.

McCarthy (1960) wanted a language with a solid mathematical foundation and decided that recursive function theory was more appropriate for this purpose than the then-popular Turing machine model. He considered it important that LISP expressions should obey the usual mathematical laws allowing replacement of expressions and:


Figure 1: The list structure (A (B C)) [box-and-pointer diagram of cons-cells]

Another way to show that LISP was neater than Turing machines was to write a universal LISP function and show that it is briefer and more comprehensible than the description of a universal Turing machine. This was the LISP function eval [e, a], which computes the value of a LISP expression e, the second argument a being a list of assignments of values to variables. . . . Writing eval required inventing a notation for representing LISP functions as LISP data, and such a notation was devised for the purpose of the paper with no thought that it would be used to express LISP programs in practice. (McCarthy 1978)

After the paper was written, McCarthy’s graduate student S. R. Russell noticed that eval could be used as an interpreter for LISP and hand-coded it, thereby producing the first LISP interpreter. Soon afterwards, Timothy Hart and Michael Levin wrote a LISP compiler in LISP; this is probably the first instance of a compiler written in the language that it compiled.

The function application f(x, y) is written in LISP as (f x y). The function name always comes first: a + b is written in LISP as (+ a b). All expressions are enclosed in parentheses and can be nested to arbitrary depth.

There is a simple relationship between the text of an expression and its representation in memory. An atom is a simple object such as a name or a number. A list is a data structure composed of cons-cells (so called because they are constructed by the function cons); each cons-cell has two pointers and each pointer points either to another cons-cell or to an atom. Figure 1 shows the list structure corresponding to the expression (A (B C)). Each box represents a cons-cell. There are two lists, each with two elements, and each terminated with NIL. The diagram is simplified in that the atoms A, B, C, and NIL would themselves be list structures in an actual LISP system.

The function cons constructs a list from its head and tail: (cons head tail). The value of (car list) is the head of the list and the value of (cdr list) is the tail of the list. Thus:

(car (cons head tail)) → head

(cdr (cons head tail)) → tail
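The cons-cell representation can be sketched outside LISP. The following Python fragment is an illustration only: Python is not one of the languages discussed in these notes, and the names cons, car, cdr, and NIL are hypothetical helpers modelled on the LISP functions, with a two-element tuple standing in for a cons-cell and None standing in for NIL. It builds the structure of Figure 1.

```python
# A cons-cell is modelled as a 2-tuple (head, tail); NIL is modelled as None.
NIL = None

def cons(head, tail):
    return (head, tail)

def car(cell):
    return cell[0]

def cdr(cell):
    return cell[1]

# The list (B C), and then the list (A (B C)) of Figure 1.
inner = cons("B", cons("C", NIL))
outer = cons("A", cons(inner, NIL))

print(car(outer))            # A
print(car(car(cdr(outer))))  # B
```

A real LISP system would also represent atoms as structures; the sketch treats them as plain strings.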

The names car and cdr originated in IBM 704 hardware; they are abbreviations for “contents of the address part of register” and “contents of the decrement part of register”. The address and decrement parts were two 15-bit fields of the 36-bit machine word.

It is easy to translate between list expressions and the corresponding data structures. There is a function eval (mentioned in the quotation above) that evaluates a stored list expression. Consequently, it is straightforward to build languages and systems “on top of” LISP and LISP is often used in this way.

It is interesting to note that the close relationship between code and data in LISP mimics the von Neumann architecture at a higher level of abstraction.

LISP was the first in a long line of functional programming (FP) languages. Its principal contributions are listed below.

Names. In procedural PLs, a name denotes a storage location (value semantics). In LISP, a name is a reference to an object, not a location (reference semantics). In the Algol sequence

int n;
n := 2;
n := 3;

the declaration int n; assigns a name to a location, or “box”, that can contain an integer. The next two statements put different values, first 2 then 3, into that box. In the LISP sequence

(progn
  (setq x (car structure))
  (setq x (cdr structure)))

x becomes a reference first to (car structure) and then to (cdr structure). The two objects have different memory addresses. A consequence of the use of names as references to objects is that eventually there will be objects for which there are no references: these objects are “garbage” and must be automatically reclaimed if the interpreter is not to run out of memory. The alternative, requiring the programmer to explicitly deallocate old cells, would add considerable complexity to the task of writing LISP programs. Nevertheless, the decision to include automatic garbage collection (in 1958!) was courageous and influential.

A PL in which variable names are references to objects in memory is said to have reference semantics. All FPLs and most OOPLs have reference semantics.

Note that reference semantics is not the same as “pointers” in languages such as Pascal and C. A pointer variable stands for a location in memory and therefore has value semantics; it just so happens that the location is used to store the address of another object.
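Reference semantics can be observed directly in any language that has it. The following sketch uses Python (not a language covered in these notes, chosen only because its names are also references to objects); the variable structure is a hypothetical stand-in for a cons-cell whose two parts are distinct objects.

```python
# Python names, like LISP names, are references to objects.
structure = (["head"], ["tail"])   # stand-in for a cons-cell with two distinct parts

x = structure[0]          # x refers to the "car" object
first_id = id(x)
x = structure[1]          # x now refers to a different object, the "cdr"
second_id = id(x)

print(first_id != second_id)   # True: one name, two different objects

# Two names can refer to the same object; a change made through one
# name is visible through the other.
y = x
y.append("more")
print(x)                       # ['tail', 'more']
```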

Lambda. LISP uses “lambda expressions”, based on Church’s λ-calculus, to denote functions. For example, the function that squares its argument is written

(lambda (x) (* x x))

by analogy to Church’s $f = \lambda x \,.\, x^2$. We can apply a lambda expression to an argument to obtain the value of a function application. For example, the expression

((lambda (x) (* x x)) 4)

yields the value 16.

However, the lambda expression itself cannot be evaluated. Consequently, LISP had to resort to programming tricks to make higher order functions work. For example, if we want to pass the squaring function as an argument to another function, we must wrap it up in a “special form” called function:

(f (function (lambda (x) (* x x))) . . . .)

Similar complexities arise when a function returns another function as a result.
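For contrast, here is a sketch in Python (used only as neutral notation; it is not one of the languages in these notes) of a language in which a lambda expression is itself an ordinary value, so that no wrapper such as function is needed. The name twice is a hypothetical higher order function introduced for the illustration.

```python
square = lambda x: x * x          # analogous to (lambda (x) (* x x))

print((lambda x: x * x)(4))       # 16, as in the application shown above

def twice(f, x):
    # A higher order function: the parameter f is itself a function.
    return f(f(x))

print(twice(square, 3))           # 81
```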

Dynamic Scoping. Dynamic scoping was an “accidental” feature of LISP: it arose as a side-effect of the implementation of the look-up table for variable values used by the interpreter. The C-like program in Listing 7 illustrates the difference between static and dynamic scoping. In C, the variable x in the body of the function f is a use of the global variable x defined in the first line of the program. Since the value of this variable is 4, the program prints 4. (Do not confuse dynamic scoping with dynamic binding!)

Listing 7: Static and Dynamic Binding

int x = 4; // 1
void f()
{
    printf("%d", x);
}
void main ()
{
    int x = 7; // 2
    f();
}

A LISP interpreter constructs its environment as it interprets. The environment behaves like a stack (last in, first out). The initial environment is empty, which we denote by 〈〉. After interpreting the LISP equivalent of the line commented with “1”, the environment contains the global binding for x: 〈x = 4〉. When the interpreter evaluates the function main, it inserts the local x into the environment, obtaining 〈x = 7, x = 4〉. The interpreter then evaluates the call f(); when it encounters x in the body of f, it uses the first value of x in the environment and prints 7.
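The behaviour of the environment can be simulated in a few lines. The sketch below is written in Python purely as notation; lookup, global_env, f_dynamic, f_static, and main are hypothetical names, and the environment is modelled as a list of (name, value) pairs with the most recent binding first. It is an illustration of the idea, not a model of any particular LISP system.

```python
def lookup(name, env):
    # env is a list of (name, value) pairs; the most recent binding comes first.
    for n, v in env:
        if n == name:
            return v
    raise NameError(name)

global_env = [("x", 4)]                 # environment after line 1: x = 4

def f_dynamic(env):
    # Dynamic scoping: f searches the environment of its caller.
    return lookup("x", env)

def f_static(env):
    # Static scoping: f searches the environment in which it was defined.
    return lookup("x", global_env)

def main(f):
    env = [("x", 7)] + global_env       # environment after line 2: x = 7, x = 4
    return f(env)

print(main(f_dynamic))   # 7
print(main(f_static))    # 4
```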

Although dynamic scoping is natural for an interpreter, it is inefficient for a compiler. Interpreters are slow anyway, and the overhead of searching a linear list for a variable value just makes them slightly slower still. A compiler, however, has more efficient ways of accessing variables, and forcing it to maintain a linear list would be unacceptably inefficient. Consequently, early LISP systems had an unfortunate discrepancy: the interpreters used dynamic scoping and the compilers used static scoping. Some programs gave one answer when interpreted and another answer when compiled!

Exercise 21. Describe a situation in which dynamic scoping is useful. □

Interpretation. LISP was the first major language to be interpreted. Originally, the LISP interpreter behaved as a calculator: it evaluated expressions entered by the user, but its internal state did not change. It was not long before a form for defining functions was introduced (originally called define, later changed to defun) to enable users to add their own functions to the list of built-in functions.

A LISP program has no real structure. On paper, a program is a list of function definitions; the functions may invoke one another with either direct or indirect recursion. At run-time, a program is the same list of functions, translated into internal form, added to the interpreter.

The current dialect of LISP is called Common LISP (Steele et al. 1990). It is a much larger and more complex language than the original LISP and includes many features of Scheme (described below). Common LISP provides static scoping with dynamic scoping as an option.

Exercise 22. The LISP interpreter, written in LISP, contains expressions such as the one shown in Listing 8. We might paraphrase this as: “if the car of the expression that we are currently evaluating is car, the value of the expression is obtained by taking the car of the cadr (that is, the second term) of the expression”. How much can you learn about a language by reading an interpreter written in the language? What can you not learn? □


Listing 8: Defining car

(cond
  ((eq (car expr) 'car)
   (car (cadr expr)) )
  . . . .

4.1.1 Basic LISP Functions

The principal (and only) data structure in classical LISP is the list. Lists are written (a b c). Lists also represent programs: the list (f x y) stands for the function f applied to the arguments x and y.

The following functions are provided for lists.

• cons builds a list, given its head and tail.

• first (originally called car) returns the head (first component) of a list.

• rest (originally called cdr) returns the tail of a list.

• second (originally called cadr, short for car cdr) returns the second element of a list.

• null is a predicate that returns true if its argument is the empty list.

• atom is a predicate that returns true if its argument is not a list and must therefore be an “atom”, that is, a variable name or other primitive object.

A form looks like a function but does not evaluate its arguments. The important forms are:

• quote takes a single argument and does not evaluate it.

• cond is the conditional construct. It has the form

(cond (p1 e1) (p2 e2) ... (t e))

and works like this: if p1 is true, return e1; if p2 is true, return e2; . . . ; else return e.

• def is used to define functions:

(def name (lambda parameters body ))

4.1.2 A LISP Interpreter

The following interpreter is based on McCarthy’s LISP 1.5 Evaluator from Lisp 1.5 Programmer’s Manual by McCarthy et al., MIT Press 1962.

An environment is a list of name/value pairs. The function pairs builds an environment from a list of names and a list of expressions.

(def pairs (lambda (names exps env)
  (cond
    ((null names) env)
    (t (cons
         (cons (first names) (first exps))
         (pairs (rest names) (rest exps) env) )) ) ))

The function lookup finds the value of a name in a table. The table is represented as a list of pairs.

(def lookup (lambda (name table)
  (cond
    ((eq name (first (first table))) (rest (first table)))
    (t (lookup name (rest table))) ) ))

The heart of the interpreter is the function eval which evaluates an expression exp in an environment env.

(def eval (lambda (exp env)
  (cond
    ((null exp) nil)
    ((atom exp) (lookup exp env))
    ((eq (first exp) (quote quote)) (second exp))
    ((eq (first exp) (quote cond)) (evcon (second exp) env) )
    (t (apply (first exp) (evlist (rest exp) env) env)) ) ))

The function evcon is used to evaluate cond forms; it takes a list of pairs of the form (cnd exp) and returns the value of the expression corresponding to the first true cnd.

(def evcon (lambda (tests env)
  (cond
    ((null tests) nil)
    (t
      (cond
        ((eval (first (first tests)) env) (eval (second (first tests)) env))
        (t (evcon (rest tests) env)) ) ) ) ))

The function evlist evaluates a list of expressions in an environment. It is a straightforward recursion.

(def evlist (lambda (exps env)
  (cond
    ((null exps) nil)
    (t (cons (eval (first exps) env)
             (evlist (rest exps) env) )) ) ))

The function apply applies a function fun to arguments args in the environment env. The cases it considers are:

• Built-in functions: first, second, cons, and other built-in functions not considered here.

• Any other function with an atomic name: this is assumed to be a user-defined function, and lookup is used to find the lambda form corresponding to the name.

• A function which is a lambda form.

(def apply (lambda (fun args env)
  (cond
    ((eq fun (quote first)) (first (first args)))
    ((eq fun (quote second)) (second (first args)))
    ((eq fun (quote cons)) (cons (first args) (second args)))
    ((atom fun) (apply (eval fun env) args env))
    ((eq (first fun) (quote lambda))
      (eval (third fun) (pairs (second fun) args env)) ) ) ))
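To experiment with the structure of this interpreter without a LISP system, it can be transcribed loosely into another language. The sketch below uses Python as neutral notation and makes several assumptions that are not in the text above: expressions are nested Python lists, atoms are strings, the environment is a list of (name, value) pairs, evcon receives all the clauses of a cond form, and the trailing underscores in eval_ and apply_ merely avoid clashing with Python built-ins. It is a reading aid, not a faithful LISP 1.5 implementation.

```python
def pairs(names, exps, env):
    # Extend env with one (name, value) pair per parameter.
    return list(zip(names, exps)) + env

def lookup(name, table):
    for n, v in table:
        if n == name:
            return v
    raise NameError(name)

def eval_(exp, env):
    if exp == []:
        return []
    if isinstance(exp, str):                  # an atom: look up its value
        return lookup(exp, env)
    if exp[0] == "quote":
        return exp[1]
    if exp[0] == "cond":
        return evcon(exp[1:], env)
    return apply_(exp[0], [eval_(e, env) for e in exp[1:]], env)

def evcon(tests, env):
    # Each test is a two-element clause [cnd, exp].
    for cnd, e in tests:
        if eval_(cnd, env):
            return eval_(e, env)
    return []

def apply_(fun, args, env):
    if fun == "first":
        return args[0][0]
    if fun == "second":
        return args[0][1]
    if fun == "cons":
        return [args[0]] + args[1]
    if isinstance(fun, str):                  # user-defined: find its lambda form
        return apply_(eval_(fun, env), args, env)
    if fun[0] == "lambda":
        return eval_(fun[2], pairs(fun[1], args, env))
    raise ValueError(fun)

env = [("t", True)]
print(eval_(["cons", ["quote", "a"], ["quote", ["b", "c"]]], env))              # ['a', 'b', 'c']
print(eval_([["lambda", ["x"], ["first", "x"]], ["quote", ["p", "q"]]], env))   # p
```

Note that apply_ evaluates the body of a lambda form in the environment of the call, extended with the parameter bindings; this is the same design decision as in the LISP version.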


4.1.3 Dynamic Binding

The classical LISP interpreter was implemented while McCarthy was still designing the language. It contains an important defect that was not discovered for some time. James Slagle defined a function like this

(def testr (lambda (x p f u)
  (cond
    ((p x) (f x))
    ((atom x) (u))
    (t (testr (rest x) p f (lambda () (testr (first x) p f u)))) ) ))

and it did not give the correct result.

We use a simpler example to explain the problem. Suppose we define:

(def show (lambda ()
  x ))

Calling this function generates an error because x is not defined. (The interpreter above does not incorporate error detection and would simply fail with this function.) However, we can wrap show inside another function:

(def try (lambda (x)
  (show) ))

We then find that (try t) evaluates to t. When show is called, the binding (x.t) is in the environment, and so that is the value that show returns.

In other words, the value of a variable in LISP depends on the dynamic behaviour of the program (which functions have been called) rather than on the static text of the program.

We say that LISP uses dynamic binding whereas most other languages, including Scheme, Haskell, and C++, use static binding.

Correcting the LISP interpreter to provide static binding is not difficult: it requires a slightly more complicated data structure for environments (instead of having a simple list of name/value pairs, we effectively build a list of activation records). However, the dynamic binding problem was not discovered until LISP was well-entrenched: it would be almost 20 years before Guy Steele and Gerald Sussman introduced Scheme, with static binding, in 1975.

People wrote LISP compilers, but it is hard to write a compiler with dynamic binding. Consequently, there were many LISP systems that provided dynamic binding during interpretation and static binding for compiled programs!

4.2 Scheme

Scheme was designed by Guy L. Steele Jr. and Gerald Jay Sussman (1975). It is very similar to LISP in both syntax and semantics, but it corrects some of the errors of LISP and is both simpler and more consistent.

The starting point of Scheme was an attempt by Steele and Sussman to understand Carl Hewitt’s theory of actors as a model of computation. The model was object oriented and influenced by Smalltalk (see Section 6.2). Steele and Sussman implemented the actor model using a small LISP interpreter. The interpreter provided lexical scoping, a lambda operation for creating functions, and an alpha operation for creating actors. For example, the factorial function could be represented either as a function, as in Listing 9, or as an actor, as in Listing 10. Implementing the interpreter brought an odd fact to light: the interpreter’s code for handling alpha was identical to the code for handling lambda! This indicated that closures (the objects created by evaluating lambda) were useful for both higher order functional programming and object oriented programming (Steele 1996).

Listing 9: Factorial with functions

(define factorial
  (lambda (n) (if (= n 0)
                  1
                  (* n (factorial (- n 1))))))

Listing 10: Factorial with actors

(define actorial
  (alpha (n c) (if (= n 0)
                   (c 1)
                   (actorial (- n 1) (alpha (f) (c (* f n)))))))

LISP ducks the question “what is a function?” It provides lambda notation for functions, but a lambda expression can only be applied to arguments, not evaluated itself. Scheme provides an answer to this question: the value of a function is a closure. Thus in Scheme we can write both

(define num 6)

which binds the value 6 to the name num and

(define square (lambda (x) (* x x)))

which binds the squaring function to the name square. (Scheme actually provides an abbreviated form of this definition, to spare programmers the trouble of writing lambda all the time, but the form shown is accepted by the Scheme compiler and we use it in these notes.)

Since the Scheme interpreter accepts a series of definitions, as in LISP, it is important to understand the effect of the following sequence:

(define n 4)
(define f (lambda () n))
(define n 7)
(f)

The final expression calls the function f that has just been defined. The value returned is the value of n, but which value, 4 or 7? If this sequence could be written in LISP, the result would be 7, which is the value of n in the environment when (f) is evaluated. Scheme, however, uses static scoping. The closure created by evaluating the definition of f includes all name bindings in effect at the time of definition. Consequently, a Scheme interpreter yields 4 as the value of (f). The answer to the question “what is a function?” is “a function is an expression (the body of the function) and an environment containing the values of all variables accessible at the point of definition”.

Closures in Scheme are ordinary values. They can be passed as arguments and returned by functions. (Both are possible in LISP, but awkward because they require special forms.)


Listing 11: Differentiating in Scheme

(define derive (lambda (f dx)
  (lambda (x)
    (/ (- (f (+ x dx)) (f x)) dx))))
(define square (lambda (x) (* x x)))
(define Dsq (derive square 0.001))

Listing 12: Banking in Scheme

(define (make-account balance)
  (define (withdraw amount)
    (if (>= balance amount)
        (sequence (set! balance (- balance amount))
                  balance)
        "Insufficient funds"))
  (define (deposit amount)
    (set! balance (+ balance amount))
    balance)
  (define (dispatch m)
    (cond
      ((eq? m 'withdraw) withdraw)
      ((eq? m 'deposit) deposit)
      (else (error "Unrecognized transaction" m))))
  dispatch)

Example 14: Differentiating. Differentiation is a function that maps functions to functions. Approximately (Abelson and Sussman 1985):

$$ D f(x) = \frac{f(x + dx) - f(x)}{dx} $$

We define the Scheme functions shown in Listing 11. After these definitions have been evaluated, Dsq is, effectively, the function

$$ f(x) = \frac{(x + 0.001)^2 - x^2}{0.001} = 2x + \cdots $$

We can apply Dsq like this (“->” is the Scheme prompt):

-> (Dsq 3)
6.001

□
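The same construction carries over to any language in which functions can return functions. The following sketch uses Python as neutral notation (it is not a language covered in these notes); the names derive, square, and dsq mirror those of Listing 11, and the inner name df is a hypothetical helper.

```python
def derive(f, dx):
    # Returns a new function that approximates the derivative of f.
    def df(x):
        return (f(x + dx) - f(x)) / dx
    return df

def square(x):
    return x * x

dsq = derive(square, 0.001)
print(dsq(3))   # close to 6.001; the exact digits depend on floating-point rounding
```

The returned function df is a closure: it remembers the values of f and dx that were in effect when derive was called.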

Scheme avoids the problem of incompatibility between interpretation and compilation by being statically scoped, whether it is interpreted or compiled. The interpreter uses a more elaborate data structure for storing values of local variables to obtain the effect of static scoping. In addition to full support for higher order functions, Scheme introduced continuations.

Although Scheme is primarily a functional language, side-effects are allowed. In particular, set! changes the value of a variable. (The “!” is a reminder that set! has side-effects.)


Listing 13: Using the account

-> ((acc 'withdraw) 50)
50
-> ((acc 'withdraw) 100)
Insufficient funds
-> ((acc 'deposit) 100)
150

Example 15: Banking in Scheme. The function shown in Listing 12 shows how side-effects can be used to define an object with changing state (Abelson and Sussman 1985, page 173). The following dialog shows how banking works in Scheme. The first step is to create an account.

-> (define acc (make-account 100))

The value of acc is a closure consisting of the function dispatch together with an environment in which balance = 100. The function dispatch takes a single argument which must be withdraw or deposit; it returns one of the functions withdraw or deposit which, in turn, takes an amount as argument. The value returned is the new balance. Listing 13 shows some simple applications of acc. The quote sign (') is required to prevent the evaluation of withdraw or deposit. □
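For readers who want to try the idea in a language they may already have at hand, here is a sketch of the same object in Python (used only as neutral notation). The function names follow Listing 12; the use of strings for the messages and of ValueError for an unrecognized message are assumptions of the sketch.

```python
def make_account(balance):
    def withdraw(amount):
        nonlocal balance                 # balance lives in the enclosing environment
        if balance >= amount:
            balance = balance - amount
            return balance
        return "Insufficient funds"

    def deposit(amount):
        nonlocal balance
        balance = balance + amount
        return balance

    def dispatch(m):
        # Select one of the local functions according to the message m.
        if m == "withdraw":
            return withdraw
        if m == "deposit":
            return deposit
        raise ValueError("Unrecognized transaction: " + str(m))

    return dispatch

acc = make_account(100)
print(acc("withdraw")(50))    # 50
print(acc("withdraw")(100))   # Insufficient funds
print(acc("deposit")(100))    # 150
```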

This example demonstrates that with higher order functions and control of state (by side-effects) we can obtain a form of OOP. The limitation of this approach, when compared to OOPLs such as Simula and Smalltalk (described in Section 6), is that we can define only one function at a time. This function must be used to dispatch messages to other, local, functions.

4.3 SASL

SASL (St Andrews Static Language) was introduced by David Turner (1976). It has an Algol-like syntax and is of interest because the compiler translates the source code into a combinator expression which is then processed by graph reduction (Turner 1979). Turner subsequently designed KRC (Kent Recursive Calculator) (1981) and Miranda (1985), all of which are implemented with combinator reduction.

Combinator reduction implements call by name (the default method for passing parameters in Algol 60) but with an optimization. If the parameter is not needed in the function, it is not evaluated, as in Algol 60. If it is needed one or more times, it is evaluated exactly once. Since SASL expressions do not have side-effects, evaluating an expression more than once will always give the same result. Thus combinator reduction is (in this sense) the most efficient way to pass parameters to functions. Evaluating an expression only when it is needed, and never more than once, is called call by need or lazy evaluation.
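Call by need can be imitated in an eagerly evaluated language by wrapping each argument in a delayed computation that remembers its result. The sketch below uses Python as neutral notation; the class Thunk and the functions expensive, ignore, and double are hypothetical names, and the sketch says nothing about how combinator reduction itself is implemented.

```python
class Thunk:
    """A delayed computation that is evaluated at most once."""
    def __init__(self, compute):
        self.compute = compute
        self.done = False
        self.value = None

    def force(self):
        if not self.done:
            self.value = self.compute()
            self.done = True
            self.compute = None   # the computation is no longer needed
        return self.value

calls = []                        # records each evaluation of the argument

def expensive():
    calls.append("evaluated")
    return 21

def ignore(arg):
    return 0                            # the parameter is never needed

def double(arg):
    return arg.force() + arg.force()    # the parameter is needed twice

print(ignore(Thunk(expensive)), len(calls))   # 0 0
print(double(Thunk(expensive)), len(calls))   # 42 1
```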

The following examples use SASL notation. The expression x::xs denotes a list with first element (head) x and remaining elements (tail) xs. The definition

nums(n) = n::nums(n+1)

apparently defines an infinite list:

nums(0) = 0::nums(1) = 0::1::nums(2) = . . . .

The function second returns the second element of a list. In SASL, we can define it like this:

second (x::y::xs) = y

Although nums(0) is an “infinite” list, we can find its second element in SASL:


second(nums(0)) = second(0::nums(1)) = second(0::1::nums(2)) = 1

This works because SASL evaluates a parameter only when its value is needed for the calculation to proceed. In this example, as soon as the argument of second is in the form 0::1::. . . ., the required result is known.
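The “infinite” list can be imitated in an eager language by delaying the tail explicitly. In the following Python sketch (Python being used only as neutral notation), a list is modelled as a pair of a head and a function that produces the tail on demand; this representation is an assumption of the sketch, and unlike SASL it does not remember a tail once it has been computed.

```python
def nums(n):
    # A lazy list: the head, and a function that builds the tail when called.
    return (n, lambda: nums(n + 1))

def second(lst):
    head, tail = lst
    return tail()[0]      # forces only one step of the tail

print(second(nums(0)))    # 1
```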

Call by need is the only method of passing arguments in SASL but it occurs as a special case in other languages. If we consider if as a function, so that the expression

if P then X else Y

is a fancy way of writing if(P,X,Y), then we see that if must use call by need for its second and third arguments. If it did not, we would not be able to write expressions such as

if x = 0 then 1 else 1/x

In C, the functions && (AND) and || (OR) are defined as follows:

X && Y ≡ if X then Y else false

X || Y ≡ if X then true else Y

These definitions provide the effect of lazy evaluation and allow us to write expressions such as

if (p != NULL && p->f > 0) . . . .

4.4 SML

SML (Milner, Tofte, and Harper 1990; Milner and Tofte 1991) was designed as a “metalanguage” (ML) for reasoning about programs as part of the Edinburgh Logic for Computable Functions (LCF) project. The language survived after the rest of the project was abandoned and became “standard” ML, or SML. The distinguishing feature of SML is that it is statically typed in the sense of Section B.3 and that most types can be inferred by the compiler.

In the following example, the programmer defines the factorial function and SML responds with its type. The programmer then tests the factorial function with argument 6; SML assigns the result to the variable it, which can be used in the next interaction if desired. SML is run interactively, and prompts with “-”.

- fun fac n = if n = 0 then 1 else n * fac(n-1);
val fac = fn : int -> int
- fac 6;
val it = 720 : int

SML also allows function declaration by cases, as in the following alternative declaration of the factorial function:

- fun fac 0 = 1
= | fac n = n * fac(n-1);
val fac = fn : int -> int
- fac 6;
val it = 720 : int

Since SML recognizes that the first line of this declaration is incomplete, it changes the prompt to “=” on the second line. The vertical bar “|” indicates that we are declaring another “case” of the declaration.

Each case of a declaration by cases includes a pattern. In the declaration of fac, there are two patterns. The first, 0, is a constant pattern, and matches only itself. The second, n, is a variable pattern, and matches any value of the appropriate type. Note that the definition fun sq x = x * x; would fail because SML cannot decide whether the type of x is int or real.

- fun sq x:real = x * x;
val sq = fn : real -> real
- sq 17.0;
val it = 289.0 : real

Listing 14: Function composition

- infix o;
- fun (f o g) x = g (f x);
val o = fn : ('a -> 'b) * ('b -> 'c) -> 'a -> 'c
- val quad = sq o sq;
val quad = fn : real -> real
- quad 3.0;
val it = 81.0 : real

Listing 15: Finding factors

- fun hasfactor f n = n mod f = 0;
val hasfactor = fn : int -> int -> bool
- hasfactor 3 9;
val it = true : bool

We can pass functions as arguments to other functions. The function o (intended to resemble the small circle that mathematicians use to denote functional composition) is built-in, but even if it wasn’t, we could easily declare it and use it to build the fourth power function, as in Listing 14. The symbols 'a, 'b, and 'c are type variables; they indicate that SML has recognized o as a polymorphic function.

The function hasfactor defined in Listing 15 returns true if its first argument is a factor of its second argument. All functions in SML have exactly one argument. It might appear that hasfactor has two arguments, but this is not the case. The declaration of hasfactor introduces two functions, as shown in Listing 16. Functions like hasfactor take their arguments one at a time. Applying the first argument, as in hasfactor 2, yields a new function. The trick of applying one argument at a time is called “currying”, after the American logician Haskell Curry. It may be helpful to consider the types involved:

hasfactor : int -> int -> bool
hasfactor 2 : int -> bool
hasfactor 2 6 : bool
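Currying can be written out by hand in a language that does not curry automatically. The Python sketch below (Python being used only as neutral notation) makes the two functions explicit; the inner name test is a hypothetical helper.

```python
def hasfactor(f):
    # Takes one argument and returns a function of one argument.
    def test(n):
        return n % f == 0
    return test

even = hasfactor(2)       # corresponds to the type int -> bool

print(hasfactor(3)(9))    # True
print(even(6))            # True
print(even(7))            # False
```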

The following brief discussion, adapted from (Åke Wikström 1987), shows how functions can be used to build a programmer’s toolkit. The functions here are for list manipulation, which is a widely used example but not the only way in which a FPL can be used. We start with a list generator, defined as an infix operator.

- infix --;
- fun (m -- n) = if m <= n then m :: (m+1 -- n) else [];
val -- = fn : int * int -> int list
- 1 -- 5;
val it = [1,2,3,4,5] : int list

Listing 16: A function with one argument

- val even = hasfactor 2;
val even = fn : int -> bool
- even 6;
val it = true : bool

Listing 17: Sums and products

- fun sum [] = 0
= | sum (x::xs) = x + sum xs;
val sum = fn : int list -> int
- fun prod [] = 1
= | prod (x::xs) = x * prod xs;
val prod = fn : int list -> int
- sum (1 -- 5);
val it = 15 : int
- prod (1 -- 5);
val it = 120 : int

Listing 18: Using reduce

- fun sum xs = reduce add 0 xs;
val sum = fn : int list -> int
- fun prod xs = reduce mul 1 xs;
val prod = fn : int list -> int

The functions sum and prod in Listing 17 compute the sum and product, respectively, of a list of integers. We note that sum and prod have a similar form. This suggests that we can abstract the common features into a function reduce that takes a binary function, a value for the empty list, and a list. We can use reduce to obtain one-line definitions of sum and prod, as in Listing 18. The idea of processing a list by recursion has been captured in the definition of reduce.

- fun reduce f u [] = u
= | reduce f u (x::xs) = f x (reduce f u xs);
val reduce = fn : ('a -> 'b -> 'b) -> 'b -> 'a list -> 'b

We can also define a sorting function as

- val sort = reduce insert nil;
val sort = fn : int list -> int list

where insert is the function that inserts a number into an ordered list, preserving the ordering:

- fun insert (x:int) [] = x::[]
= | insert x (y::ys) = if x <= y then x::y::ys else y::insert x ys;
val insert = fn : int -> int list -> int list
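The toolkit can be tried out in other languages as well. The following sketch transcribes reduce, insert, and the three derived functions into Python, which is used only as neutral notation. It assumes Python lists in place of SML lists and two-argument functions in place of curried ones, and it uses the name total rather than sum to avoid hiding Python’s built-in function.

```python
def reduce(f, u, xs):
    # Right fold: f(x1, f(x2, ... f(xn, u))).
    if not xs:
        return u
    return f(xs[0], reduce(f, u, xs[1:]))

def insert(x, ys):
    # Insert x into the ordered list ys, preserving the ordering.
    if not ys:
        return [x]
    if x <= ys[0]:
        return [x] + ys
    return [ys[0]] + insert(x, ys[1:])

def total(xs):
    return reduce(lambda x, acc: x + acc, 0, xs)

def prod(xs):
    return reduce(lambda x, acc: x * acc, 1, xs)

def sort(xs):
    return reduce(insert, [], xs)

print(total([1, 2, 3, 4, 5]))   # 15
print(prod([1, 2, 3, 4, 5]))    # 120
print(sort([3, 1, 2]))          # [1, 2, 3]
```

Each derived function is a single application of reduce; only the binary function and the value for the empty list change.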

4.5 Other Functional Languages

We have not discussed FP, the notation introduced by Backus (1978). Although Backus’s speech (and subsequent publication) turned many people in the direction of functional programming, his ideas for the design of FPLs were not widely adopted.

Since 1978, FPLs have split into two streams: those that provide mainly “eager evaluation” (that is, call by value), following LISP and Scheme, and those that provide mainly “lazy evaluation”, following SASL. The most important language in the first category is still SML, while Haskell appears to be on the way to becoming the dominant FPL for teaching and other applications.

5 Haskell

COMP 348 is about the principles of programming languages. In order to understand the principles, you need to know more than one programming language. Even this is not enough: you really need to know more than one style of programming. Accordingly, the course begins with an introduction to a language that is quite different to languages that you may already be familiar with such as C++, Java, and Basic.

Haskell is a functional programming language in the sense of Section 4. Although it has many of the features that we associate with other languages, such as variables, expressions, and data structures, its foundation is the mathematical concept of a function.

You can find information about Haskell at www.haskell.org. Haskell gets its name from the American logician Haskell Curry (1900–82; see http://www.haskell.org/bio.html).

5.1 Getting Started

There are several Haskell compilers and all of them are free. One of the first Haskell implementations was called “Gofer”; we will be using a system derived from Gofer and called HUGS (Haskell Users’ Gofer System). At Concordia, Hugs has been installed on both Solaris 8 machines and Linux hosts in pkg/hugs98. Executables are in /site/bin and can be run without any special preparation. Documentation is in pkg/hugs98/docs.

If you want to run Hugs on your own PC, download it from http://www.haskell.org/hugs; installation is straightforward.

Hugs provides an interactive programming environment. It is based on a “read/eval/print” loop: the system reads an expression, evaluates it, and prints the value. Figure 2 shows Hugs working out 2+2. The prompt Prelude> indicates that the only module that has been loaded is the “prelude”. As well as expressions, the environment recognizes various commands with the prefix “:”. For instance, if you enter :browse (or :b for short), Hugs will display the functions and structures defined in the current environment.

Haskell computes with integers and floating-point numbers. Integers have unlimited size and conversions are performed as required.

Prelude> 1234567890987654321 / 1234567
1.0e+012 :: Double
Prelude> 1234567890987654321 `div` 1234567
1000000721700 :: Integer

Strings are built into Haskell. A string literal is enclosed in double quotes, as in C++. Haskell uses the same escape sequences as C++ (e.g., ’\n’). However, Haskell strings are not terminated by ’\0’. The infix operator ++ concatenates strings.

Prelude> "Hallo" ++ " " ++ "everyone!"
"Hallo everyone!" :: [Char]

The basic data structure of almost all functional programming languages, including Haskell, is the list. In Haskell, lists are enclosed in square brackets with items separated by commas. The concatenation operator for lists is ++. This is the same as the concatenation operator for strings, because Haskell considers a string to be simply a list of characters.

Prelude> [1,2,3] ++ [4,5,6]
[1,2,3,4,5,6] :: [Integer]

The prelude provides a number of basic operations on lists.


Hugs 98: Based on the Haskell 98 standard
Copyright (c) 1994-2001
World Wide Web: http://haskell.org/hugs
Report bugs to: [email protected]
Version: December 2001

Haskell 98 mode: Restart with command line option -98 to enable extensions

Reading file "C:\Program Files\Hugs98\lib\Prelude.hs":

Hugs session for:
C:\Program Files\Hugs98\lib\Prelude.hs
Type :? for help
Prelude> 2+2
4 :: Integer
Prelude>

Figure 2: A short session with Hugs

Prelude> [1..20]
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20] :: [Integer]
Prelude> sum [1..20]
210 :: Integer
Prelude> product [1..20]
2432902008176640000 :: Integer
Prelude> take 5 [1..20]
[1,2,3,4,5] :: [Integer]
Prelude> drop 10 [1..20]
[11,12,13,14,15,16,17,18,19,20] :: [Integer]
Prelude> filter even [1..20]
[2,4,6,8,10,12,14,16,18,20] :: [Integer]
Prelude> take 5 "Peter Piper pickled a pepper"
"Peter" :: [Char]

Evaluating expressions is fun for a few minutes, but it quickly gets boring. To do anything interesting, you have to write your own programs, which in Haskell means mainly writing functions (although later we will also introduce new data types).

The environment does not allow you to introduce any new definitions:

Prelude> n = 99
ERROR - Syntax error in input (unexpected ‘=’)

Fortunately, it does allow you to read definitions from a file. The command is :load and it is followed by a path name. Here is how the definitions for this lecture are loaded. Note that the prompt changes to reflect the newly loaded module:

Prelude> :load "e:/courses/comp348/src/lec1.hs"
Reading file "e:/courses/comp348/src/lec1.hs":


Command                     Brief description                                       Note
:?                          List all Hugs commands
:! 〈command〉               Execute a shell command
:quit                       Quit the current Hugs session
:load 〈filename〉 . . .     Load Haskell source from a file                         (1)
:also 〈filename〉 . . .     Load additional files                                   (1)
:reload                     Repeat last load command                                (1)
:browse 〈module〉 . . .     Display functions exported by module(s)
:set 〈options〉             Set flags and environment parameters (see Figure 4)
:cd 〈directory〉            Change working directory
:edit 〈file〉               Start the editor                                        (2)
:find 〈name〉               Start the editor with the cursor on the object named    (2)
:gc                         Force garbage collection
:info 〈name〉 . . .         Display information about files, classes, etc.
:module 〈module〉           Change module
:names 〈pattern〉           Find names that match the pattern                       (2)
:project 〈project file〉    Load a project (obsolete)                               (3)
:type 〈expression〉         Print the type of 〈expression〉 without evaluating it
:version                    Display Hugs version

Figure 3: Hugs commands

Hugs session for:
C:\Program Files\Hugs98\lib\Prelude.hs
e:/courses/comp348/src/lec1.hs

The file lec1.hs looks like this (the “....” indicates material that is discussed below):

module Lec1 where
fc1 n = if n == 0 then 1 else n * fc1(n - 1)
....

This code introduces a Haskell module called Lec1 that contains a number of definitions. The first definition introduces a function called fc1.

In the examples that follow, we will show Haskell code as it appears in the source file. When we need to show the effect of evaluating an expression, we will include the prompt, as in the examples above.

5.1.1 Hugs Commands

Before describing the Haskell language, we digress briefly to explain the commands provided by the Hugs environment. Figure 3 provides a brief summary of each command. Note that any command can be abbreviated: for example, you can enter :b instead of :browse. Figure 4 shows options for the :set command; these options can also be specified on the command line. The WinHugs environment has icons for most of these commands.

Notes for Figure 3


±s          Print # of reductions and cells after evaluation
±t          Print type after evaluation
±f          Terminate evaluation after first error
±g          Print cells recovered by garbage collector
±l          Use literate modules as default
±e          Warn about errors in literate modules
±.          Print dots to show progress (not recommended!)
±q          Print nothing to show progress
±w          Always show which modules are loaded
±k          Show “kind” errors in full
±u          Use show to display results
±i          Chase imports when loading modules
p〈string〉  Set prompt to 〈string〉
r〈string〉  Set repeat last expression string to 〈string〉
P〈string〉  Set search path for modules to 〈string〉
E〈string〉  Use editor setting given by 〈string〉
F〈string〉  Set preprocessor filter to 〈string〉
h〈num〉     Set heap size to 〈num〉 bytes (has no effect on current session)
c〈num〉     Set constraint cutoff limit to 〈num〉

Figure 4: Parameters for :set. The symbol ± indicates a toggle: +x turns x on and -x turns it off. The other commands take a string or a number as argument, as shown.

1. :load removes all definitions (except those belonging to the Prelude) and loads new definitions from the named file(s). :also is similar but it does not remove previous definitions. :reload repeats the previous :load command.

One easy way to develop a Hugs program is to:

• Write the first version in a file fun.hs

• Leaving the editor running, load this file into Hugs and try the functions: :l fun.hs

• Make corrections in the editor window

• Reload the file in the Hugs window: :r

2. The editor commands assume that Hugs has been linked to an editor. For example, you may like to use emacs with Hugs. :edit suspends Hugs and starts the editor; when you close the editor, Hugs resumes. :find is similar, but it positions the editor cursor at the chosen object.

The default editor for Hugs in Windows is notepad. You may find it more convenient to ignore these commands and simply run Hugs and your favourite editor as independent processes. In this case, all you have to do is save the corrected file in the editor window and enter :r in Hugs to reload the modified source code.

3. The :project command reads a project file containing information about the files that Hugs should load. Recent versions of Hugs use a smarter method of loading projects, called “import chasing”, that makes the :project command unnecessary in most cases.

You can obtain the information in Figure 4 by entering :set without any arguments. The output also includes the current settings which, for a Windows session, will look something like this:


Current settings: +tfewuiAR -sgl.qQkI -h250000 -p"%s> " -r$$ -c40
Search path     : -P{Hugs}\lib;{Hugs}\lib\hugs;{Hugs}\lib\exts;{Hugs}\lib\win32
Project Path    :
Editor setting  : -EC:\WINNT\notepad.exe
Preprocessor    : -F
Compatibility   : Haskell 98 (+98)

5.1.2 Literate Haskell

Donald Knuth proposed the idea that programs should be readable documents with detailed explanations and commentary built in, rather than just code listings with occasional (and often useless) comments. A program written in this way is called a literate program and the art of writing such programs is called literate programming.

The first step towards literate programming is to make “comments” very easy to write and read. A consequence of this step is that the code itself becomes slightly harder to write and read. The overhead for code in literate Haskell (at least the Hugs version) is relatively light. Here are the rules:

• Literate Haskell source files use the extension .lhs instead of .hs.

• A line beginning with > is interpreted as Haskell code.

• The lines before and after lines containing code must be either blank or lines containing code (but see the e switch below).

• Any other lines are ignored by the compiler.

The check for blank lines can be suppressed by turning off the e switch in Hugs. Here’s an example to illustrate how this works. Suppose that the source file vector.lhs contains this text:

Vector is an instance of the type class Show.
This enables easy conversion of vectors to strings.
> instance Show Vector where
>   show (V x y z) = "(" ++ show x ++ ", " ++ show y ++ ", " ++ show z ++ ")"

By default, the e switch in the compiler is on (as if we had entered :set +e). When Hugs loads this file, it will report:

Prelude> :l e:/courses/comp348/src/vector.lhs
Reading file "e:/courses/comp348/src/vector.lhs":
Parsing
ERROR "e:/courses/comp348/src/vector.lhs":14 - Program line next to comment

There are two ways to fix this problem. One is to turn off the e switch (its default setting is +e):

Prelude> :set -e

and then reload the source file, which will now load without errors. The other way is to put blank lines in the source file:


Vector is an instance of the type class Show.
This enables easy conversion of vectors to strings.

> instance Show Vector where
>   show (V x y z) = "(" ++ show x ++ ", " ++ show y ++ ", " ++ show z ++ ")"

In general, it is a good idea to put blank lines in the source file anyway, because they make the code stand out more clearly.

Figure 5 shows a small but complete example of a literate Haskell program.

5.1.3 Defining Functions

Since Haskell is a functional language, defining functions is fundamental and there are, in fact, several ways to define functions. We will start by considering the favourite function of functional programmers, the factorial function:

0! = 1
n! = 1 × 2 × 3 × ... × n   for n ≥ 1

The first point to note is that, since everything is a value in Haskell, there is no need for a return statement: you write just the value that you want returned. Our first definition of the factorial function uses recursion:

fc1 n = if n == 0 then 1 else n * fc1(n - 1)

This definition works fine, but it is not the preferred style for Haskell. Many functions, including this one, are defined by cases. The cases for factorial are n = 0 and n > 0. Haskell has a case expression (analogous to switch in C++ and Java) and we use it here for the second definition of factorial:

fc2 n = case n of
          0 -> 1
          n -> n * fc2(n - 1)

This is the first definition that is written on more than one line. It can be written on one line but, to make the syntax legal, there must be a semicolon between the two case clauses (see Section 5.1.4):

fc2 n = case n of 0 -> 1 ; n -> n * fc2(n - 1)

Even this, however, is not idiomatic Haskell. The preferred form of definition is a set of equations:

fc3 0 = 1
fc3 n = n * fc3(n - 1)

Although we have written these equations together, they are in fact independent assertions for Haskell. If we wrote the first but not the second, fc3 0 would evaluate correctly to 1, but fc3 4 would cause an error.

There are also “closed form” expressions for some functions. We can compute n! by constructing the list [1,2,3,...,n] and then using the Prelude function product to multiply all of the list items:

fc4 n = product [1..n]

This works correctly when n = 0 because the “product” of the empty list is defined to be 1:

Lec1> product []
1 :: Integer

Author:      Peter Grogono
Date:        5 October 2003
Description: Example of a Literate Haskell source file

> module Vector where

This is a module for vectors in 3D Euclidean space.
We define V to be the constructor for vectors.

> data Vector = V Double Double Double

Vector is an instance of the type class Show.
This enables easy conversion of vectors to strings.

> instance Show Vector where
>   show (V x y z) = "(" ++ show x ++ ", " ++ show y ++ ", " ++ show z ++ ")"

Vector is an equality type. Vectors are equal if all of their components
are equal. It might be an improvement to consider vectors to be equal if
their components are approximately equal.

> instance Eq Vector where
>   V x y z == V x' y' z' = x == x' && y == y' && z == z'

Vector is a numeric class. The following definitions provide sums,
differences, and outer (cross) products of vectors.

> instance Num Vector where
>   V x y z + V x' y' z' = V (x + x') (y + y') (z + z')
>   V x y z - V x' y' z' = V (x - x') (y - y') (z - z')
>   V x y z * V x' y' z' =
>     V (y * z' - y' * z) (z * x' - z' * x) (x * y' - x' * y)
>   negate (V x y z) = V (negate x) (negate y) (negate z)

A vector can be scaled by a real number. Haskell does not allow us to
overload '*' and '/' for vector/scalar operations.

> scale (V x y z) d = V (x * d) (y * d) (z * d)

These auxiliary functions define the dot product and length of a vector.
The expression 'norm v' yields a unit vector with the same direction as v.

> dot (V x y z) (V x' y' z') = x * x' + y * y' + z * z'
> len v = sqrt(dot v v)
> norm v = if l == 0
>            then error "Norm of zero vector"
>            else scale v (1/l)
>          where l = len v

Figure 5: A Literate Haskell program
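The four factorial definitions can be checked against each other. The sketch below simply collects them in one place; the helper name agree and the type signatures are additions for illustration and are not part of lec1.hs.

```haskell
-- The four definitions from the text, collected so they can be compared.
fc1, fc2, fc3, fc4 :: Integer -> Integer
fc1 n = if n == 0 then 1 else n * fc1 (n - 1)

fc2 n = case n of
          0 -> 1
          m -> m * fc2 (m - 1)

fc3 0 = 1
fc3 n = n * fc3 (n - 1)

fc4 n = product [1..n]

-- True if all four definitions give the same result for 0..limit.
-- (agree is a hypothetical helper.) Negative arguments are avoided on
-- purpose: fc1, fc2 and fc3 would not terminate for them.
agree :: Integer -> Bool
agree limit = all same [0..limit]
  where same n = fc1 n == fc2 n && fc2 n == fc3 n && fc3 n == fc4 n
```

Evaluating agree 20 compares the definitions on every argument from 0 to 20.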

To illustrate some other styles of function definition, we will use an even simpler function: inc adds 1 to its argument. The “obvious” definition is a single equation:

inc1 n = n + 1

Purists have a problem with this definition: they would prefer definitions to have the form x = e in which x is the name of the thing being defined and e is the defining expression. We can write the definition in this way in Haskell:

inc2 = \n -> n + 1

We can read this definition as “inc2 is the function that maps n to n + 1”. The backslash indicates that n is a parameter and the “arrow” -> links the parameter to the body of the function. Backslash looks a bit like λ and the Haskell notation suggests the way in which a function is written in λ-calculus: λ n . n + 1.

The expression \n -> n + 1 can be used alone: it is an anonymous function.

Lec1> (\n -> n + 1) 7
8 :: Integer

There is another way of writing this anonymous function. Its form is ? + 1 in which ? stands for the unknown parameter. Putting parentheses around +1 makes this expression into a function:

Lec1> (+1) 6
7 :: Integer

In general, given a binary operator, we can create functions from it by giving it either a left or a right operand. For example, (/3) is a function that divides its argument by 3 and (3/) is a function that divides its argument into 3:

Lec1> (/3) 9
3.0 :: Double

Lec1> (12/) 5
2.4 :: Double

This provides us with another definition of inc:

inc3 = (+1)

Operators with one argument missing are called sections. The general rules are:

• (op expr) is equivalent to \x -> (x op expr)

• (expr op) is equivalent to \x -> (expr op x)

There is an important exception to this rule: (-1) is not a section: it is just −1. However, (subtract expr) is a function equivalent to \x -> x - expr and can be used in its place.
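Sections are most useful as arguments to other functions. The following sketch applies a few sections to every item of a list using the Prelude function map, which these notes have not yet introduced; all of the names defined here are hypothetical.

```haskell
-- Right section: (/2) is \x -> x / 2.
halves :: [Double]
halves = map (/2) [1, 2, 3]

-- Left section: (1/) is \x -> 1 / x.
reciprocals :: [Double]
reciprocals = map (1/) [1, 2, 4]

-- (-1) is the number minus one, so subtract is used to build "minus one".
decremented :: [Integer]
decremented = map (subtract 1) [1, 2, 3]

-- With a left operand, - does form a section: (10-) is \x -> 10 - x.
fromTen :: [Integer]
fromTen = map (10-) [1, 2, 3]
```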

So far, all of our functions have had exactly one argument. In fact, all Haskell functions take exactly one argument. Nevertheless, we can write definitions such as

add x y = x + y

and use the functions defined:

Lec1> add 347589 23874
371463 :: Integer

How do we account for this? The function add is actually a second order function that expects one argument and returns a function that expects another argument. To see this, define


inc4 = add 1

and try it:

Lec1> inc4 5
6 :: Integer

Thus add 1 is “the function that adds 1 to its argument” and add 99 is “the function that adds 99 to its argument”, and so on. Haskell interprets the expression add 2 3 as “apply the function add 2 to the argument 3”.
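A short sketch of this reading of add follows. The type signatures and the names addTen and five are additions for illustration.

```haskell
-- The type says: given an Integer, return a function from Integer to Integer.
add :: Integer -> Integer -> Integer
add x y = x + y

-- Partial application: add 10 is itself a function of one argument.
addTen :: Integer -> Integer
addTen = add 10

-- The explicit parentheses show how Haskell reads add 2 3.
five :: Integer
five = (add 2) 3
```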

5.1.4 Syntax

Syntax conventions for functional programming are different from “traditional” programming styles. You will have noticed, for example, that we write inc n rather than inc(n). Here are some of the rules.

• The “blank” operator denotes function application. Thus f x means “the result of applying the function f to the argument x”. The technical name for the blank operator is juxtaposition. Juxtaposition binds higher than any other operation.

• Because application binds strongly, we must put parentheses around arguments that are not simple. For example, the Haskell expression f n+1 denotes f(n) + 1; if we want the effect of f(n + 1), we must write f(n+1).

• Haskell interprets f x y as ((f x) y). First, the function f is applied to the argument x; then the result, which must be a function, is applied to the argument y.

• The application f(g(x)) applies g to x and then applies f to the result. We can write this in Haskell either as f(g(x)) or as f (g x).
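The rules above can be seen at work in a small sketch. The function double and the names a, b and c are hypothetical.

```haskell
double :: Integer -> Integer
double n = 2 * n

a, b, c :: Integer
a = double 3 + 1        -- parsed as (double 3) + 1
b = double (3 + 1)      -- parentheses force the addition first
c = double (double 3)   -- nested application, written f (g x)
```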

Compared to programs in most other languages, Haskell programs are remarkably free from “syntactic clutter” such as braces and semicolons. The trade-off is that Haskell programmers must respect layout rules. Unlike C++, for example, in Haskell indentation and line breaks make a difference. The rules are simple and intuitive and, if you use existing source code as a model, you may never have to learn them explicitly.

It is important to know that the rules exist, however, because otherwise Haskell’s error messages can be quite confusing. For example, inserting this definition into lec1.hs

fc6 n = if
n == 0 then 1 else n * fc6(n - 1)

gives this error:

Reading file "e:/courses/comp348/src/lec1.hs":
Parsing
ERROR "e:/courses/comp348/src/lec1.hs":27 - Syntax error in expression (unexpected ‘;’, possibly due to bad layout)

Haskell allows semicolons but does not require them. Instead, the parser inserts a semicolon before every line that starts in the same column as the definitions around it. Consequently, if you break a line in the wrong place, you get the mysterious diagnostic “unexpected ‘;’ ”.

There’s nothing wrong with line breaks, provided that they occur in sensible places. The following definition is fine:

fc6 n =
  if n == 0
    then 1
    else n * fc6(n - 1)

We have already seen the Haskell case expression. Cases must appear on separate lines:

case e of
  p1 -> e1
  p2 -> e2
  p3 -> e3

or separated by semicolons:

case e of p1 -> e1 ; p2 -> e2 ; p3 -> e3
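Haskell 98 also lets the programmer switch layout off for a group of clauses by writing braces and semicolons explicitly. The sketch below shows the same case expression in both styles; the functions sign1 and sign2 are hypothetical, and compare with its results LT, EQ and GT comes from the Prelude.

```haskell
-- Layout version: the alternatives line up in one column.
sign1 :: Integer -> String
sign1 n = case compare n 0 of
            LT -> "negative"
            EQ -> "zero"
            GT -> "positive"

-- Explicit version: inside the braces, line breaks no longer matter.
sign2 :: Integer -> String
sign2 n = case compare n 0 of { LT -> "negative"
  ; EQ -> "zero" ; GT -> "positive" }
```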

There are a few other Haskell constructs for which layout is important, and we will state the rules when we introduce the constructs.

5.1.5 Tracing

Debugging functional programs is difficult. One problem is that we cannot interleave print statements with other statements because functional programs don’t have statements. Why? Because statements have side-effects and functional programs are not supposed to have side-effects.

The Hugs Trace module provides a way around the problem of no side-effects. It defines a function trace with two arguments: a string and an expression. When trace is called, it prints the message as a side-effect and returns the expression.

To use the Trace module, write

import Trace

at the beginning of your program. To trace an expression, replace any expression e by

(trace (s) (e))

in which s is a string. The parentheses are not always necessary (depending on the context) but they do no harm.

Since trace returns the value of e, the meaning of your program should not be affected. However, you will be able to monitor its progress by looking at what trace prints.

Here is a demonstration of Trace in use. We import the Trace module and define our old friend, the factorial function. We keep the original version for later use and define another version with a call to trace.

module Demo where

import Trace

{-
fac 0 = 1
fac n = n * fac (n - 1)
-}

fac 0 = 1
fac n = trace ("fac " ++ show n ++ "\n") (n * fac (n - 1))

Calling the traced version of fac gives:


Demo> fac 6
fac 6
fac 5
fac 4
fac 3
fac 2
fac 1
720 :: Integer
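Readers using GHC rather than Hugs can try the same experiment with the module Debug.Trace, which exports a trace function of type String -> a -> a. The following sketch assumes GHC; apart from the import and the dropped "\n", the definition is the one shown above.

```haskell
import Debug.Trace (trace)

-- Same function as in the text; only the import and the message differ.
-- trace emits its message as a side-effect and then returns its second
-- argument unchanged, so the value of fac is not affected.
fac :: Integer -> Integer
fac 0 = 1
fac n = trace ("fac " ++ show n) (n * fac (n - 1))
```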

5.2 Details

5.2.1 Lists

We have already seen (in Section 5) that Haskell supports lists. Here are the basic built-in operations for lists:

• Lists are enclosed in square brackets: [...].

• The empty list is [].

• Items within lists are separated by commas: [1,2,3].

• A list may contain items of any type, but all of the items in a given list must have the same type (note the use of -- to separate comments from code):

Lec1> [1,2,3] -- integer list - OK
[1,2,3] :: [Integer]

Lec1> ["abc","def"] -- string list - OK
["abc","def"] :: [[Char]]

Lec1> ['a','b','c'] -- a character list is a string
"abc" :: [Char]

Lec1> [1, 1.5] -- integers are promoted if necessary
[1.0,1.5] :: [Double]

Lec1> [1,'a'] -- mixed types - BAD!
ERROR - Illegal Haskell 98 class constraint in inferred type
*** Expression : [1,'a']
*** Type : Num Char => [Char]

• The infix operator : builds lists: it takes an item on the left and a list on the right. The operator : is right-associative, so parentheses are not required in the second example below:

Lec1> 1:[]
[1] :: [Integer]

Lec1> 1:2:[]
[1,2] :: [Integer]

A very common idiom in Haskell programming is x:xs (read “x followed by xes”). This is a list consisting of an item x and a list xs. The initial letter may vary: for example, a:as, y:ys, and so on.


• The expression [x..y] constructs a list, provided that x and y are members of the same enumerated type:

Lec1> [2..4]
[2,3,4] :: [Integer]

Lec1> [3,6..20]
[3,6,9,12,15,18] :: [Integer]

Lec1> ['a'..'z']
"abcdefghijklmnopqrstuvwxyz" :: [Char]

Lec1> [5..2]
[] :: [Integer]

• The infix operator ++ concatenates lists.
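The operations listed above combine freely. Here is a small sketch that uses only built-in list syntax; the names digits, oddDigits and alphabetEnds are hypothetical.

```haskell
-- Cons an item onto a range.
digits :: [Integer]
digits = 0 : [1..9]

-- The first two items of a range fix the step.
oddDigits :: [Integer]
oddDigits = [1,3..9]

-- A String is a list of Char, so ++ works on strings too.
alphabetEnds :: String
alphabetEnds = ['a'..'c'] ++ "xyz"
```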

There are functions for extracting the first item of a list (head) and the remaining items of the list (tail) but we don’t often need them. The reason is that we can define list functions by using the constructors as patterns. Most function definitions need only two cases: what is the result for an empty list, []? and what is the result for a non-empty list x:xs?

For example, to find the length of a list, we define:

len [] = 0
len (x:xs) = 1 + len xs

The parentheses are required: since juxtaposition has higher precedence than colon, Haskell would parse len x:xs as (len x):xs and report a type error.

If the append operator ++ did not exist, we could append lists using our own function app defined by

app [] ys = ys
app (x:xs) ys = x:app xs ys
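The same two-case shape covers many other list functions. As an illustration, the sketch below repeats app and adds two functions that are not in the notes: total, which sums a list, and rev, which reverses a list using app. Both names are hypothetical.

```haskell
app :: [a] -> [a] -> [a]
app []     ys = ys
app (x:xs) ys = x : app xs ys

-- Sum of a list: one equation per list constructor.
total :: [Integer] -> Integer
total []     = 0
total (x:xs) = x + total xs

-- Reverse of a list, built from app. Simple, but the repeated
-- appends make it quadratic in the length of the list.
rev :: [a] -> [a]
rev []     = []
rev (x:xs) = app (rev xs) [x]
```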

Any Haskell function with two arguments [1] can be used as an infix operator by putting back quotes (grave accents) around it. The second example below demonstrates this. The third example shows what happens if we try to build a list with mixed types.

Lec1> app [1,2,3] [4,5,6]
[1,2,3,4,5,6] :: [Integer]

Lec1> [1,2,3] `app` [4,5,6]
[1,2,3,4,5,6] :: [Integer]

Lec1> app ['a','b'] [1,2]
ERROR - Illegal Haskell 98 class constraint in inferred type
*** Expression : app ['a','b'] [1,2]
*** Type : Num Char => [Char]

A common operation on lists is to select items that satisfy a predicate. For example, to obtain the even items in a list of integers, we could use the Prelude function even:

evens [] = []
evens (x:xs) = if even x then x:evens xs else evens xs

[1] Of course, it hasn’t really got two arguments; see Section 5.1.3.


In functional programming, however, we usually want to generalize, using functions to do so. The function filt takes a predicate p and a list, and returns the items in the list that satisfy the predicate. (A better name would be filter but a function with this name is defined in the Prelude: it does the same as filt.)

filt p [] = []
filt p (x:xs) = if p x then x:filt p xs else filt p xs

Now we can write:

Lec1> evens [1..20]
[2,4,6,8,10,12,14,16,18,20] :: [Integer]
Lec1> filt even [1..20]
[2,4,6,8,10,12,14,16,18,20] :: [Integer]

Using filt, we can write a better definition for evens:

evens = filt even
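Other selections follow the same recipe: supply a different predicate to filt. In this sketch the names oddsOf, small and multiplesOf are hypothetical, and the predicates are written as a Prelude function, a section, and an anonymous function respectively.

```haskell
filt :: (a -> Bool) -> [a] -> [a]
filt p []     = []
filt p (x:xs) = if p x then x : filt p xs else filt p xs

-- Predicate given as a Prelude function.
oddsOf :: [Integer] -> [Integer]
oddsOf = filt odd

-- Predicate given as a section.
small :: [Integer] -> [Integer]
small = filt (< 10)

-- Predicate given as an anonymous function.
multiplesOf :: Integer -> [Integer] -> [Integer]
multiplesOf k = filt (\n -> n `mod` k == 0)
```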

5.2.2 Lazy Evaluation

Haskell evaluates expressions lazily: this means that it never evaluates an expression until the value is actually needed. It is perfectly all right to introduce a non-terminating definition into a Haskell program, but things go wrong when we try to use it. Here is a non-terminating definition

loop = loop

and here is an attempt to use it (the message {Interrupted!} appears when the user interrupts the program, for example, by entering ctrl-C):

Lec1> loop + 3
{Interrupted!}
Lec1>

There are other things that fail:

Lec1> 1/0
Program error: primDivDouble 1.0 0.0

But now define a constant function that doesn’t use its argument:

one n = 1

Since Haskell never evaluates the argument of this function, we can give it any argument at all:

Lec1> one 77
1 :: Integer
Lec1> one (1/0)
1 :: Integer
Lec1> one loop
1 :: Integer
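The same effect can be observed with the Prelude value undefined, which raises an error as soon as it is evaluated. The sketch below is an illustration only; the names safe1, safe2 and safe3 are hypothetical.

```haskell
one :: a -> Integer
one n = 1

-- Neither of the next two definitions evaluates undefined,
-- so both return normally.
safe1 :: Integer
safe1 = one undefined

safe2 :: Integer
safe2 = fst (1, undefined)

-- && does not look at its right operand when the left one is False.
safe3 :: Bool
safe3 = False && undefined
```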

Lazy evaluation allows us to work with infinite lists. The lists do not have an infinite number of items: what “infinite” means in this context is “unbounded”, that is, the list appears to have as many items as we need. For example, if we define

from n = n:from (n+1)
nats = from 1


then from n yields an unbounded list starting at n and nats yields as many natural numbers as we might need: [1,2,3,...]. We can use the built-in function take to bite off finite chunks from the beginning of an infinite list:

Lec1> take 6 nats
[1,2,3,4,5,6] :: [Integer]

Lec1> take 6 (evens nats)
[2,4,6,8,10,12] :: [Integer]

Here is an alternative definition of nats:

nats = [1..]

As an application of infinite lists, the function nomul returns the items in list l that are not multiples of n. The function sieve takes a list, and returns its first item n followed by the list obtained by applying itself to the rest of the list with the multiples of n removed.

nomul n l = filter (\m -> mod m n /= 0) l
sieve (n:ns) = n:sieve (nomul n ns)

This is what happens when we apply sieve to the list [2,3,4,...].

Lec1> sieve (from 2)
[2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89
{Interrupted!}
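Rather than interrupting the computation, we can ask for a finite prefix. The sketch below repeats from, nomul and sieve and adds a hypothetical function primesBelow that relies on the Prelude function takeWhile.

```haskell
from :: Integer -> [Integer]
from n = n : from (n + 1)

nomul :: Integer -> [Integer] -> [Integer]
nomul n l = filter (\m -> mod m n /= 0) l

sieve :: [Integer] -> [Integer]
sieve (n:ns) = n : sieve (nomul n ns)

-- takeWhile stops at the first item that fails the test, so only a
-- finite prefix of the infinite list is ever computed.
primesBelow :: Integer -> [Integer]
primesBelow limit = takeWhile (< limit) (sieve (from 2))
```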

5.2.3 Pattern Matching

It is time to look more closely at some of the simple definitions we have already seen:

fac 0 = 1
fac n = n * fac(n - 1)

len [] = 0
len (x:xs) = 1 + len xs

The object following the function name on the left of = is a pattern. When the function is called, the argument of the function is matched against the available patterns. If a match succeeds, the expression on the right side of = is evaluated. Pattern matching may include binding, which assigns values to variables in the expression.

Examples:

• In the call fac 0, the argument 0 matches the pattern 0. The expression 1 is evaluated, yielding 1.

• In the call fac 3, the argument 3 matches the pattern n, yielding the binding {n = 3}. The expression n * fac(n - 1) is evaluated with this binding, yielding 3 * fac(2).

• In the call len [], the argument [] matches the pattern []. The expression 0 is evaluated, yielding 0.

• In the call len [27..30], the argument is equivalent to 27:28:29:30:[]. This matches the pattern x:xs, yielding the bindings {x = 27, xs = [28..30]}. The expression 1 + len xs is evaluated with these bindings, yielding 1 + len [28..30].

Pattern matching occurs in a number of contexts in Haskell. All of these contexts can be expressed in terms of the case expression:


Kind          Pattern    Expression   Condition for match         Binding
Wild card     _          e            none                        none
Constant      k          e            k == e                      none
Variable      v          e            none                        v = e
Constructor   C p1 p2    C′ e1 e2     C = C′, p1 ∼ e1, p2 ∼ e2    combined from p1 ∼ e1 and p2 ∼ e2

Figure 6: Pattern matching

case e of p1 -> e1 ; p2 -> e2 ; ...; pn -> en

Here is how other kinds of expression map to case expressions:

\p1 p2 ... pn -> e                        =  \x1 x2 ... xn -> case (x1, x2, ..., xn) of (p1, p2, ..., pn) -> e
                                             where the xi are new variables

if e1 then e2 else e3                     =  case e1 of True -> e2 ; False -> e3

let p1 = e1; p2 = e2; ...; pn = en in e   =  case (e1, e2, ..., en) of (p1, p2, ..., pn) -> e

The case expression

case e of p1 -> e1 ; p2 -> e2 ; ...; pn -> en

is evaluated as follows:

• The expression e is matched against p1, p2, ..., pn in turn.

• If e matches pi, then the value of the expression is obtained by evaluating ei with any bindings obtained by matching.

• If none of the patterns match, the expression fails.

Note that the order of patterns is significant. For example, the following definition of the factorial function does not work, because the first pattern always succeeds and evaluation never terminates:

fac n = n * fac(n - 1)
fac 0 = 1

Figure 6 summarizes the important kinds of pattern and the way in which they match. The expression p ∼ e stands for “pattern p matches expression e”.
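The effect of ordering is easy to see in a case expression with a catch-all variable pattern. The two functions below are hypothetical examples. A compiler such as GHC may warn that the last alternative of describeBad can never be reached, but the definition is still accepted.

```haskell
-- Alternatives are tried from top to bottom, so the general case goes last.
describe :: Integer -> String
describe n = case n of
               0     -> "zero"
               1     -> "one"
               other -> "many"

-- With the variable pattern first, the later alternative is never reached.
describeBad :: Integer -> String
describeBad n = case n of
                  other -> "many"
                  0     -> "zero"
```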

• The underscore character, _, is called a “wild card” because it matches anything but produces no binding. Use the wild card when you don’t need the value of the argument in the function. For example, the head and tail selectors for lists can be defined as

head (x:_) = x
tail (_:xs) = xs

• A constant matches a constant with the same value but produces no binding. For example, [] matches the empty list but nothing else.

• A pattern consisting of a variable name matches any expression and binds that expression to the variable.

• If the pattern is a constructor with patterns as arguments, the expression must be an application of the same constructor to expressions. The corresponding expressions are matched and the bindings from the matches are combined. Since “:” is the list constructor, the pattern (3:xs) matches any list whose first element is 3.


Haskell patterns are linear. This means that each variable in a pattern must be distinct. Suppose that we are defining a function merge that has two lists as arguments and that we need to consider the case in which the first item in both lists is the same. We are not allowed to write

merge (x:xs) (x:ys) = ... -- BAD!

Instead, we must write

merge (x:xs) (y:ys) = if x == y then ... else ...
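The notes leave the right-hand sides of merge open. Purely as an illustration, here is one possible completion under the assumption that both arguments are sorted and that an item occurring at the head of both lists should appear once in the result. The name mergeUnique and this behaviour are assumptions, not the definition intended in the text.

```haskell
-- Merge two sorted lists, keeping one copy of items that head both lists.
-- (Hypothetical completion of the merge equation shown above.)
mergeUnique :: [Integer] -> [Integer] -> [Integer]
mergeUnique [] ys = ys
mergeUnique xs [] = xs
mergeUnique (x:xs) (y:ys) =
  if x == y then x : mergeUnique xs ys
  else if x < y then x : mergeUnique xs (y:ys)
  else y : mergeUnique (x:xs) ys
```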

Here are some examples of the pattern matching rules. First, the two most common cases for a list function are the empty list pattern, [], and the non-empty list pattern, (x:xs). The pattern (x:xs) also matches a list with one item, in which case xs is bound to [].

Sometimes a third pattern is necessary. For example, if we define

swap [] = []
swap (x:y:ys) = y:x:swap ys

the expression “swap [3]” gives an error because the list with one element does not match any pattern. The solution is to include this case in the definition:

swap [] = []
swap [x] = [x]
swap (x:y:ys) = y:x:swap ys

The tuple is another kind of constructor in Haskell. (A tuple corresponds roughly to a C++ struct.) Tuples are enclosed in parentheses and the items inside them are separated by commas. The items need not be of the same type. Unlike lists, tuples of different lengths have different types.

A tuple with two items is usually called a pair. We can define selectors for pairs like this:

first (x,_) = x
second (_,y) = y

However, these definitions are not really necessary, because the Prelude defines functions fst and snd that do the same thing.

We can also define

plus (x,y) = x + y

This is not the same function as add, which we defined in Section 5.1.3:

add x y = x + y

The argument of plus must be a pair; there is no equivalent to “add 1”.
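The two styles can be converted into one another with the Prelude functions curry and uncurry. In the following sketch the primed names plus' and add' are hypothetical.

```haskell
add :: Integer -> Integer -> Integer
add x y = x + y

plus :: (Integer, Integer) -> Integer
plus (x, y) = x + y

-- uncurry turns a curried function into one that takes a pair.
plus' :: (Integer, Integer) -> Integer
plus' = uncurry add

-- curry goes the other way, so partial application becomes possible again.
add' :: Integer -> Integer -> Integer
add' = curry plus
```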

Functions can return tuples. The following function breaks a list into two roughly equal parts and returns the two parts as a pair:

brk xs = (take (len xs `div` 2) xs, drop (len xs `div` 2) xs)

Obviously, we would like to avoid the repeated evaluation of “len xs `div` 2”. In a procedural language, we would introduce a local variable. The Haskell equivalent is:

brk xs = (take m xs, drop m xs)
  where m = len xs `div` 2

This may not look “functional” but in fact it is. It is “syntactic sugar” for

brk xs = (\m -> (take m xs, drop m xs)) (len xs `div` 2)
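Two further ways of writing the same function are sketched below. Both use the Prelude function length in place of len, and the names brkLet and brkSplit are hypothetical.

```haskell
-- The same function written with let instead of where.
brkLet :: [a] -> ([a], [a])
brkLet xs = let m = length xs `div` 2
            in (take m xs, drop m xs)

-- The Prelude function splitAt performs the take and the drop in one step.
brkSplit :: [a] -> ([a], [a])
brkSplit xs = splitAt (length xs `div` 2) xs
```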


Description                          Syntax   Associativity
simple term                          t
parenthesized term                   (t)
as-patterns                          a@p      right
application                          f x      left
do, if, let, \, case (leftwards)              right
infix operators (see Figure 8)
function types                       ->       right
contexts                             =>
type constraints                     ::
do, if, let, \, case (rightwards)             right
sequences                            ..
generators                           <-
grouping                             ,        n-ary
guards                               |
cases                                ->
definitions                          =
separation                           ;

Figure 7: Parsing precedence, from highest (at the top) to lowest (at the bottom).

                                Associativity
Precedence   Left                            None                               Right
9            !!                                                                 .
8                                                                               ^ ^^ **
7            * / `div` `mod` `rem` `quot`
6            + -
5                                                                               : ++
4                                            == /= < <= > >= `elem` `notElem`
3                                                                               &&
2                                                                               ||
1            >> >>=
0                                                                               $ $! `seq`

Figure 8: Operator precedence, from highest (at the top) to lowest (at the bottom).

5.2.4 Operators

Figures 7 and 8 show the precedence rules that Haskell uses for parsing expressions and operators. We have not encountered all of them yet.

• Some constructs (do, if, let, \, and case) apparently have two precedences: one leftwards and one rightwards. This is because of a metarule which says that each of these constructs extends as far to the right as possible. In the first line in the table below, let seems to have lower precedence than +, because the whole expression “a + a” is part of it. In the second line, let seems to have higher precedence than +, because its components group together. The second line is an application of the metarule: the let expression extends as far as possible. Note that parentheses override the metarule, as shown in the third line of the table. Note also that all three parsings are what we might intuitively expect.

The expression               parses as
let a = 3 in a + a           let a = 3 in (a + a)
k + let a = 3 in a + a       k + (let a = 3 in (a + a))
k + (let a = 3 in a) + a     (k + (let a = 3 in a)) + a
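A few of the entries in Figure 8 can be checked directly. In this sketch the names p1 to p5 are hypothetical; the comments show the parse that the figure predicts.

```haskell
p1, p2, p3 :: Integer
p1 = 2 + 3 * 4      -- * (level 7) binds tighter than + (level 6): 2 + (3 * 4)
p2 = 2 ^ 3 ^ 2      -- ^ is right-associative: 2 ^ (3 ^ 2)
p3 = 10 - 3 - 2     -- - is left-associative: (10 - 3) - 2

-- : and ++ share level 5 and both associate to the right.
p4 :: [Integer]
p4 = 1 : [2] ++ [3]  -- 1 : ([2] ++ [3])

-- $ has the lowest precedence, so it can replace trailing parentheses.
p5 :: Integer
p5 = negate $ 2 + 3  -- negate (2 + 3)
```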

5.2.5 Building Haskell Programs

In procedural programming, we have a collection of control structures, or abstraction mechanisms (assignment, sequence, conditional, loop, procedure call, etc.), and we use them to build programs. In Functional Programming (FP), we have only one abstraction mechanism: function definition. This makes the task of programming both simpler and harder: simpler because there are fewer things to worry about, harder because we have a more limited toolkit.

The tendency in FP is to define a lot of small functions and then to build programs using these functions. This can be confusing for the beginner, because the number of small functions is large. But, with experience, this approach to programming becomes quite manageable.

The Prelude is a collection of functions that many programmers have found useful. It is often easier to define a new function using Prelude functions than it is to write the function “from scratch”. Here are some of the Prelude functions and their definitions.

Since Haskell does not have an explicit loop construct, recursion is used frequently. The key step in writing a function is often to choose the appropriate parameter for recursion.

Recall that “take n xs” returns the first n items of the list xs. If n is zero, the result is the empty list. Otherwise, we use recursion on the value of n. To be safe, we define the appropriate response when the list is empty.

take 0 xs = []
take _ [] = []
take n (x:xs) = x:take (n-1) xs

Recall also that “drop n xs” throws away the first n items of the list xs and returns the rest. The definition is similar.

drop 0 xs = xs
drop _ [] = []
drop n (_:xs) = drop (n-1) xs
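These equations assume that n is not negative. With a negative n the first equation never matches, so the recursion runs to the end of the list: take would return the whole list and drop the empty list. The sketch below shows one way of treating any n ≤ 0 like zero, which is, as far as we know, how the Haskell 98 Prelude versions behave. The primed names are hypothetical and are chosen to avoid a clash with the Prelude.

```haskell
-- Like take, but any n <= 0 yields the empty list.
take' :: Int -> [a] -> [a]
take' _ []     = []
take' n (x:xs) = if n <= 0 then [] else x : take' (n - 1) xs

-- Like drop, but any n <= 0 leaves the list unchanged.
drop' :: Int -> [a] -> [a]
drop' _ []     = []
drop' n (x:xs) = if n <= 0 then x:xs else drop' (n - 1) xs
```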

The functions takeWhile and dropWhile are only a little more complicated.

takeWhile p [] = []
takeWhile p (x:xs) = if p x then x : takeWhile p xs else []

dropWhile p [] = []
dropWhile p (x:xs) = if p x then dropWhile p xs else x:xs

Haskell provides another way of writing the last line. The infix operator @, read “as”, allows us to write a pattern in two equivalent ways. In this case, we want xs and x:xs' to stand for the same list. Note that one or more primes may be added to a variable name to give a new name, as in mathematics.

dropWhile p xs@(x:xs') = if p x then dropWhile p xs' else xs

The functions break and span are complementary. They split a list into two parts depending on the value of a predicate.

Prelude> span (<5) [1..20]
([1,2,3,4],[5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])

Prelude> break (>=5) [1..20]
([1,2,3,4],[5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])

In particular, note:

Prelude> break isSpace "Here is a sentence."
("Here"," is a sentence.") :: ([Char],[Char])

The definitions of span and break are as follows:

span p [] = ([],[])
span p xs@(x:xs') = if p x
                      then (x:ys, zs)
                      else ([],xs)
                    where (ys,zs) = span p xs'

break p = span (not . p)

With these functions working, we can write words, which splits a string into words separated by blanks, and unwords, which puts the words together again.

words s = case dropWhile isSpace s of
            "" -> []
            s' -> w : words s''
                  where (w,s'') = break isSpace s'

unwords [] = ""
unwords [w] = w
unwords (w:ws) = w ++ ' ' : unwords ws

Now we are ready for something more elaborate. Suppose we have a very long string of words, perhaps an entire file, and we want to format it so that each line is no longer than 80 characters. The first step is to break it up into a list of words using the function words.

Here is the first version of a function that does the rest. The idea is to put words into a line until the next word won’t fit; that is, it would make the line more than 80 characters long. When this happens, we put the line into the output and carry on.

The problem is that we have no local variables and no state. Where do we store the information in this “line”? The answer is that we need an accumulating parameter and recursion. We define brk1 as

brk1 [] line = [line]
brk1 (w:ws) line = if length w + 1 + length line <= 80
                   then brk1 ws (line ++ " " ++ w)
                   else line : brk1 (w:ws) []

If there is something in our word string w:ws, this function calculates how long the line will become if we append a blank and the word w to it. If the length is at most 80 characters, the word is moved into the line. Otherwise, the result is a list consisting of the line and the result of processing the rest of the word list. We start it off with an empty line:

brk1 sentence []

This function has one error: the first character of every line that it outputs is a blank. The next version corrects that error.

brk2 [] line = [line]
brk2 (w:ws) line = if length w + 1 + length line <= 80
                   then if line == ""
                        then brk2 ws w
                        else brk2 ws (line ++ " " ++ w)
                   else line : brk2 (w:ws) []

Finally, to make our function presentable to users, we wrap its parts up into a single definition:

format sen = brk (words sen) [] where
  brk [] line = [line]
  brk ws@(w:ws') line = if length w + 1 + length line <= 80
                        then if line == ""
                             then brk ws' w
                             else brk ws' (line ++ " " ++ w)
                        else line : brk ws []
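To experiment with short lines, it helps to make the width a parameter. The following sketch is a variant of format under that assumption; the name formatTo and its width argument are ours, not part of the notes. Like the definition above, it assumes that every word is strictly shorter than the width: a longer word never fits on an empty line, and the recursion would not terminate.

```haskell
-- formatTo is a hypothetical variant of format with the line width as a parameter.
-- Assumption: every word is strictly shorter than the requested width.
formatTo :: Int -> String -> [String]
formatTo width sen = brk (words sen) []
  where
    brk [] line = [line]
    brk ws@(w:ws') line =
      if length w + 1 + length line <= width
      then if line == ""
           then brk ws' w
           else brk ws' (line ++ " " ++ w)
      else line : brk ws []
```

For example, formatTo 10 "the quick brown fox" gives ["the quick","brown fox"], and the result can be turned back into text with unlines.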

5.2.6 Guards

The pattern-matching mechanism is handy but rather inflexible because it only allows for exact matching. For example, if we want a factorial function that does not blow up for negative numbers, we have to write

fac n = if n <= 0 then 1 else n * fac(n-1)

We can do better with guards. A guard is a condition that qualifies a pattern. As we know, the pattern

n

matches any number. The pattern

n | n < 0

matches any negative number. Once we have provided a pattern, we can specify several guards, each with its own definition. This version of factorial works for all integers:

fac n
  | n <= 0 = 1
  | n > 0 = n * fac(n-1)

The guard otherwise includes any conditions not covered by previous guards. For example, the actual definitions of take and takeWhile in the Prelude are:

take n _ | n <= 0 = []
take _ [] = []
take n (x:xs) = x : take (n-1) xs

takeWhile p [] = []
takeWhile p (x:xs)
  | p x = x : takeWhile p xs
  | otherwise = []
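Guards are tried from top to bottom, and the first one that holds selects the definition. The following sketch illustrates this with a hypothetical function sign, and restates factorial with otherwise as the final guard.

```haskell
-- sign is a hypothetical example: the guards are tried in order.
sign :: Int -> Int
sign n
  | n < 0     = -1
  | n == 0    = 0
  | otherwise = 1

-- Factorial with otherwise as the final guard; it returns 1 for n <= 0.
fac :: Integer -> Integer
fac n
  | n <= 0    = 1
  | otherwise = n * fac (n - 1)
```

Using otherwise for the last case makes it evident that the guards cover every argument.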

5.3 Types

Haskell is a strongly typed language. This means that, if a program compiles without error, it will either give an answer of the expected type or it will fail. Semantically, every type includes a value ⊥ (pronounced “bottom” because it is the minimal element of a lattice) that denotes failure. For example, if the result type of a function is Int, the function will return either an integer or ⊥. Failure is discussed in Section 5.3.2. For the rest of this section, we will discuss programs that do not fail.

Haskell uses the Hindley-Milner type system. An expression is written without type declarations but, if it is type correct, the compiler can find a principal type or most general type for it. There are a small number of situations in which the type checker needs a type declaration to help it.

The following examples illustrate principal types. First, declare i to be the identity function. In all of these examples, we follow the convention of declaring the type before the function. The type declaration is accepted in this form by the Haskell compiler. It can also be obtained using the :type command in the Hugs interpreter.

i :: a -> a
i x = x

The Haskell type is a -> a which we can read as “the type of a function that takes an argument of any type and returns a result of the same type”. The letter a is called a type variable. In the formal type system, this type is written ∀a. a → a and is read “for all types a, the type a to a”.

The next example is a constant function: it takes two arguments and throws away the second one.

k :: a -> b -> a
k x y = x

There are two points to note. First, the type is a → b → a although you might have expected a × b → a. This is because Haskell functions take their arguments “one at a time”. The right arrow associates to the right and the type a → b → a is parsed as a → (b → a). In other words, k is a function that takes one argument of type a and returns a function with type b → a. The following tests verify this:

Types> k 'x'
k 'x' :: a -> Char

Types> k 'x' 5
'x' :: Char

The second point is simply that the type contains two type variables, a and b. Since there is nothing in the definition that connects the types, the compiler assumes that they are different (although they don’t have to be).

In the next example, the two arguments are used to build a list. Consequently, they must have the same type. The type [a] is the type of a list whose elements have type a.

two :: a -> a -> [a]
two x y = [x,y]

When we apply a function to an actual argument, the types become specialized. Providing a Char as the first argument of two forces the second argument to be Char as well.

Types> two 'a'
two 'a' :: Char -> [Char]

The function two 'a' accepts an argument of type Char but an argument of another kind gives a type error.

Types> two 'a' 'b'
"ab" :: [Char]

Types> two 'a' 3
ERROR - Illegal Haskell 98 class constraint in inferred type
*** Expression : two 'a' 3
*** Type : Num Char => [Char]

The next function introduces a new feature: it uses the equality predicate ==. Haskell allows this even without further type information, but adds a requirement to the type: it must be an equality type. The expression Eq a => is called a context and it says that the type a must be an instance of the class Eq.

cmp :: Eq a => a -> a -> [a]
cmp x y = if x == y then [x,y] else []

If we change == to >, we get a different requirement: the type a must be an instance of the class Ord, also called an ordinal type. The class Ord is a subclass of the class Eq: this means that any type that provides > must also provide ==.

gt :: Ord a => a -> a -> [a]
gt x y = if x > y then [x,y] else []

Something else happens if we change one argument of > to the digit “1”. The context becomes (Ord a, Num a), asserting that a must be both an ordinal type and a numeric type.

n1 :: (Ord a, Num a) => a -> a -> a
n1 x y = if x > 1 then x + y else 0

If we write 1.5 instead of 1, we obtain yet another context: a must be an ordinal type and a fractional type.

n2 :: (Ord a, Fractional a) => a -> a -> a
n2 x y = if x > 1.5 then x + y else 0

As stated above, Haskell finds the most general type. We can specialize types in two ways. First, we can add a type annotation to any variable or expression in the definition. Note that we are not allowed to annotate a pattern, only an expression. If we specify that x is a Double, the compiler infers that y must be a Double too, because x and y both occur in the expression x + y.

n3 :: Double -> Double -> Double
n3 x y = if (x::Double) > 1.5 then x + y else 0

We can also specify the type of the entire function. Here is the same function again, but with the type Int specified by the programmer. Asking Haskell for the type of this function shows that it has used the type we provided rather than the more general type.

n4 :: Int -> Int -> Int
n4 x y = if x > 1 then x + y else 0

Types> :t n4
n4 :: Int -> Int -> Int

Of course, if we do provide a type, we’d better get it right. In the next example, we provide an incorrect type for the same function:

n5 :: Int -> Char -> Int
n5 x y = if x > 1 then x + y else 0

This attempt does not fool the compiler:

Reading file "e:/courses/comp348/src/types.hs":
Type checking
ERROR "e:/courses/comp348/src/types.hs":31 - Type error in application
*** Expression : x + y
*** Term : x
*** Type : Int
*** Does not match : Char

It is often easier to write functions than to figure out their types, which can get quite complicated. In this example, we apply a function to a list. Haskell infers that the most general type of the function argument f is a → b, that the second argument of apply must be a list of as, and that the result is a list of bs.

apply :: (a -> b) -> [a] -> [b]
apply f [] = []
apply f (x:xs) = f x : apply f xs

Types can get even longer, as the following example shows. The individual types are f :: a → b, g :: c → a, h :: d → c; the type of x is d, and the type of the result is b.

compose :: (a -> b) -> (c -> a) -> (d -> c) -> d -> b
compose f g h x = f (g (h x))

The type of a tuple is written in the same way as values of the type. In this example, the type (a, b) is “the type of pairs whose first element has type a and whose second element has type b”. Similar notation is used for longer tuples.

cp :: a -> b -> (a,b)
cp x y = (x, y)

Summarizing what we have seen so far:

• All valid Haskell functions and expressions have a type.

• The type inferred by the compiler is the most general type.

• The most general type may be specialized by type annotations provided by the user.

• A context indicates that a type is an instance of a class.
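The points of this summary can be seen together in a few lines. The functions below are hypothetical examples of our own (swapPair, swapInts and allSame do not appear elsewhere in these notes): the first has no declaration and receives the most general type, the second is specialized by a declaration, and the third carries a context.

```haskell
-- With no declaration, the compiler infers the most general type,
-- here  swapPair :: (a, b) -> (b, a).
swapPair (x, y) = (y, x)

-- A declaration may specialize the inferred type.
swapInts :: (Int, Int) -> (Int, Int)
swapInts (x, y) = (y, x)

-- The context Eq a is required because the definition uses ==.
allSame :: Eq a => [a] -> Bool
allSame []     = True
allSame (x:xs) = all (== x) xs
```

Asking the interpreter for the type of swapPair with :type shows the inferred type.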

5.3.1 Using Types

Since Haskell infers types, it is possible to virtually ignore types while programming. This is unwise, for two reasons:

• It is often useful to have a clear idea of the types of your variables when writing functions.

• Haskell errors are very often caused by type mismatches. If you don’t understand the type system, you will find these errors very hard to understand. If you do understand the type system, you will find them quite helpful.

For example, one version of the function brk1 in Section 5.2.5 looked like this:

brk1 [] line = [line]
brk1 (w:ws) line = if length w + 1 + length line <= 80
                   then brk1 ws (line ++ ' ' ++ w)
                   else line : brk1 (w:ws) []

Haskell did not accept this version, reporting:

Reading file "e:/courses/comp348/src/lec1.hs":
Type checking
ERROR "e:/courses/comp348/src/lec1.hs":9 - Type error in application
*** Expression : ' ' ++ w
*** Term : ' '
*** Type : Char
*** Does not match : [a]

To make sense of this error:

• In the line beginning ERROR, the :9 after the path name is the line number. It identifies then brk1 ws (line ++ ' ' ++ w) as the line with the error.

• The expression which cannot be typed is ' ' ++ w.

• The term (that is, part of the expression) which is causing the problem is ' '. It has type Char, but the type checker is expecting something of type [a], that is, a list.

• The correction is to replace ' ' by something which is a list; in this case, [' '] or simply " ".

When you are reading someone else’s function definitions, knowing the types of the functions often makes it easier to understand them. For this reason, it is a common convention in Haskell (and other functional languages) to write the type declaration, as inferred by the compiler, above the function declaration. Haskell accepts these type declarations, provided they are correctly formatted, and reports an error if there is an inconsistency between the declared type and the inferred type. However, you are allowed to declare a more specialized type than the type that Haskell infers. For example, if your module includes the declaration

app :: [Int] -> [Int] -> [Int]
app [] ys = ys
app (x:xs) ys = x:app xs ys

Haskell will accept the declaration, record the type you have specified, and give an error if you try to use it incorrectly:

Lec1> :info app
app :: [Int] -> [Int] -> [Int]

Lec1> app "abc" "def"
ERROR - Type error in application
*** Expression : app "abc" "def"
*** Term : "def"
*** Type : String
*** Does not match : [Int]

There are some occasions when the type system doesn’t work as you would expect. You may have already discovered this problem when representing sets as lists:

Lec1> union [] []
ERROR - Unresolved overloading
*** Type : Eq a => [a]
*** Expression : union [] []

In fact, even [] gives a type error:

Prelude> []
ERROR - Cannot find "show" function for:
*** Expression : []
*** Of type : [a]

This is a deficiency of the type system, but it is not an important one. If the expression union [] [] arises in a context, the empty lists will be the result of some computation and their types will be known. Here is a contrived example: we define

comb n = union [1,3..n] [1,5..n]

and evaluate comb n with n = 0. This calls union [] [], but Haskell does not complain:

Lec1> comb 0
[] :: [Integer]
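When the element type really is unknown, a type annotation resolves the overloading. The sketch below assumes that union is imported from the standard library module Data.List; the name shown is ours.

```haskell
import Data.List (union)

-- The annotation fixes the element type, so the constraint Eq a
-- (and the Show constraint needed to print the result) can be resolved.
shown :: String
shown = show (union [] [] :: [Int])
```

The same annotation works at the interpreter prompt, for example [] :: [Int].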

5.3.2 Failure

There are only two main causes of failure:

• A function is given an argument that is of the correct type but outside the domain of the function. E.g., head [] or 1/0.

• A function fails to terminate.

Although the semantics says that the value of a failed program is always ⊥, the symptoms of failure can vary quite a lot.

• Domain errors are usually trapped by the runtime system:

Prelude> 1/0
Program error: primDivDouble 1.0 0.0

Prelude> head []
Program error: head []

• A non-terminating program may run until it is interrupted, whether or not it is producing output:

Prelude> [1..]
[1,2,3,4,5,6,Interrupted!

Prelude> f where f = f
Interrupted!

• Some implementations of Hugs, including WinHugs, do not detect stack overflow, with the result that programs may crash:

Prelude> g 1 where g n = g(n+1)

5.3.3 Reporting Errors

There is a built-in function

error :: String -> a

The result returned by this function is the only value that is shared by all types: ⊥ (pronounced “bottom”). As a side-effect, it displays the string.

Here is an example of its use. The definition of head in the Prelude is:

head (x:xs) = x
head [] = error "headPreludeList: head[]"
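We can use error in the same way in our own functions. The following sketch uses a hypothetical function average, which is not part of the Prelude, to report an argument outside its domain.

```haskell
-- average is a hypothetical example: the empty list is outside its domain.
average :: [Double] -> Double
average [] = error "average: empty list"
average xs = sum xs / fromIntegral (length xs)
```

Because error has type String -> a, the first equation is type correct even though the result type of average is Double.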

5.3.4 Standard Types

There are a number of types that are either built-in, defined in the Prelude, or defined in libraries. The first letter of a type is always upper-case. This is not a convention: the compiler uses the case of the first letter to distinguish type names from type variables, ordinary variables, and function names.

• The type Bool has values True and False.

• The type Char has the ASCII characters as its values. The type String is equivalent to [Char].

• There are a number of numeric types. These include Int (fixed size integers); Integer (integers of any size); Float (single-precision floats); Double (double precision floats); and Rational (ratios of Integers).

• The type () has only two values: (), which is the empty tuple, and ⊥ (bottom). This type corresponds roughly to void in C++.

5.3.5 Standard Classes

The Standard Classes of Haskell are defined in the Prelude. Here are a couple of examples. The equality class specifies that a type instance must provide the functions == and /=. However, if the user supplies one of these functions, the compiler will provide the other one automatically, as shown by the mutually recursive definitions.

class Eq a where
  (==), (/=) :: a -> a -> Bool
  -- Minimal complete definition: (==) or (/=)
  x == y = not (x/=y)
  x /= y = not (x==y)

The ordinal class inherits from the equality class and provides additional functions for comparison. An instance type must provide at least <=; all of the other functions can be inferred by the compiler.

class (Eq a) => Ord a where
  compare :: a -> a -> Ordering
  (<), (<=), (>=), (>) :: a -> a -> Bool
  max, min :: a -> a -> a
  -- Minimal complete definition: (<=) or compare
  -- using compare can be more efficient for complex types
  compare x y | x==y = EQ
              | x<=y = LT
              | otherwise = GT
  x <= y = compare x y /= GT
  x < y = compare x y == LT
  x >= y = compare x y /= LT
  x > y = compare x y == GT
  max x y | x <= y = y
          | otherwise = x
  min x y | x <= y = x
          | otherwise = y

Figure 9: Standard Class Inheritance Graph (classes shown: Eq, Show, Ord, Num, Enum, Real, Fractional, Integral, RealFrac, Floating, RealFloat)

Figure 9 shows the class inheritance graph for the standard classes. There is no need to memorize this graph: you will need it only if you are designing your own numeric type and, even then, Haskell’s error messages will guide you. What is more important is to know the functions that a class is supposed to implement; you can get this information from the Prelude, as shown above. The class Show must provide a function that translates a value of the type into a string.

5.3.6 Defining New Types

As an illustration of how programmers can define new Haskell types, we discuss the type Frac, shown in Figure 10. This type is actually unnecessary, because there is already a type Rational, but it is not defined in the Prelude.

• The first step is to declare the type and constructors for it. The name of the type is Frac and the name of the only constructor is F. These names do not have to be different: we could have used Frac for both. To construct a fraction, we must provide two integers:

Frac> F 9 12
9/12 :: Frac

module Frac where

data Frac = F Integer Integer

instance Show Frac where
  show (F n d) = if d == 1
                 then show n
                 else show n ++ "/" ++ show d

instance Eq Frac where
  F n d == F n' d' = n * d' == n' * d

instance Ord Frac where
  F n d <= F n' d' = n * d' <= n' * d

instance Num Frac where
  F n d + F n' d' = normalize( F (n * d' + n' * d) (d * d') )
  F n d - F n' d' = normalize( F (n * d' - n' * d) (d * d') )
  F n d * F n' d' = normalize( F (n * n') (d * d') )
  negate (F n d) = normalize( F (negate n) d )
  fromInteger n = F n 1

instance Fractional Frac where
  recip (F n d) = normalize( F d n )

normalize :: Frac -> Frac
normalize (F n d) =
  if d == 0
  then error ("Bad fraction: " ++ show (F n d))
  else if n == 0
  then F 0 1
  else if d < 0
  then normalize (F (negate n) (negate d))
  else F (n `div` g) (d `div` g)
  where g = gcd n d

-- Harmonic series: 1/1 + 1/2 + 1/3 + ...
harm :: Integer -> [Frac]
harm n
  | n <= 0 = []
  | otherwise = harm (n - 1) ++ [F 1 n]

Figure 10: The Class Frac

We could introduce new function names for operations on fractions. For example, addFrac might add two fractions. There are many advantages, however, to making the type Frac an instance of numeric classes. If we do this, we can overload function names and, for example, add fractions using +.

• From Figure 9, we can see that any numeric type must be an instance of class Show. We provide a version of the required function show that writes a fraction in the form "2/3". As a nice touch, it shows fractions with denominator equal to 1 as integers.

• The next step is to make Frac an equality type. We define ==, and Haskell automatically provides /=.

• We can provide an ordering for fractions. The function that we choose is not arbitrary: if we provide <=, as here, Haskell will provide a number of other functions, including <, >, >=, compare, min, and max.

Caution: providing a different comparison, such as >=, may lead to looping!

• The next step is to make Frac an instance of Num, which requires overloaded definitions of +, -, and *. It is useful also to define negate, because Haskell treats -x as (negate x), and a function that converts integers to Fracs. Note that the Num class does not contain division operations. This enables us to define types like Vector, which can be added, subtracted, and multiplied, but not divided.

• Since we do intend Fracs to be divided, we make Frac an instance of Fractional. All that is needed is a function recip defined so that recip x = 1/x. Haskell will then define x/y = x × recip y.

• We want to work with fractions in “normal form”. For example, 4/6 should be converted to 2/3 and 5/(−16) should be converted to −5/16. We also want to check for zero denominators and put the fraction zero into the normal form 0/1. All of these tasks are performed by normalize, which uses the standard function gcd to cancel common factors. We use normalize for every operation that constructs a new fraction.

• Finally and for fun, we introduce the function harm to construct a list of the fractions 1/1, 1/2, 1/3, . . . , 1/n.

Figure 11 shows a few simple examples of the new type. The functions work as expected. The last example is of interest, because we did not define sum. However, sum is defined for all instances of Num, so it works correctly for Frac. The result is exact because the components of a Frac are Integer (“large integers”) rather than Int (32-bit integers).

5.3.7 Defining Data Types

A type may have more than one constructor. Enumerations have several constructors but no data, as in these declarations:

data Bool = False | True
data Colour = Red | Green | Blue

Haskell provides special syntax (if/then/else) for conditional expressions because programmers are familiar with the English words. The special syntax is actually unnecessary, because we could define

if True x y = x
if False x y = y

relying on lazy evaluation to ensure that only one of x or y is actually evaluated.
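Since if is a reserved word, the definition above cannot be entered exactly as written. The following sketch uses the hypothetical name cond instead and shows that the branch which is not selected is never evaluated.

```haskell
-- cond is a hypothetical name, used because if is a reserved word.
cond :: Bool -> a -> a -> a
cond True  x _ = x
cond False _ y = y

-- Only the selected branch is evaluated, so the call to error is never reached.
lazyDemo :: Int
lazyDemo = cond True 1 (error "never evaluated")

-- The division is attempted only when x is not zero.
safeRecip :: Double -> Double
safeRecip x = cond (x == 0) 0 (1 / x)
```

In a language with strict evaluation, both branches would be evaluated before cond was called, and lazyDemo would fail.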

Frac> F 3 4 > F 5 6
False :: Bool

Frac> F 3 4 + F 7 12
4/3 :: Frac

Frac> $$ > F 6 5
True :: Bool

Frac> F 7 4 / F 2 5
35/8 :: Frac

Frac> harm 5
[1,1/2,1/3,1/4,1/5] :: [Frac]

Frac> sum (harm 50)
13943237577224054960759/3099044504245996706400 :: Frac

Figure 11: Trying out type Frac

Binary Search Trees

Constructors may also define data components. In Figure 12, we define a type for binary trees with integers at the nodes. A BTree is either Empty, or consists of an Int and two subtrees of type BTree. Here are some things to note about this definition:

• Function show uses a “helper” function with an additional argument to display trees. Using the tree t defined at the bottom of Figure 12:

BTree> t
1
2
3
4
5
6
8 :: BTree

• The functions size, height, and treeToList are trivial to define by cases on the constructors.

• The function insert assumes that the trees are binary search trees. A new entry is inserted in the left or right subtree unless it is equal to an entry already in the tree, in which case it is ignored (an alternative would be to report an error).

• Comparing trees for equality, using ==, is straightforward. The important thing to note is that the apparently unnecessary case Empty == Empty = True must be included; otherwise the function loops!

Constructors may define data components. As an example, we will develop a type for binary trees with two constructors: Empty constructs an empty tree, and Tree n l r constructs a tree with integer n at the root, a left subtree l, and a right subtree r. Note that this is a recursive type definition because BTree occurs on both sides of =. Just as with recursion in functions, we need a base case (in this example the base case is Empty) and an inductive case.

data BTree = Empty | Tree Int BTree BTree

module BTree where
import Trace

data BTree = Empty | Tree Int BTree BTree

instance Show BTree where
  show t = sh t ""
    where
      sh Empty sp = ""
      sh (Tree n l r) sp = let sps = sp ++ " "
                           in sh r sps ++ sp ++ show n ++ "\n" ++ sh l sps

instance Eq BTree where
  Empty == Empty = True
  (Tree n l r) == (Tree n' l' r') = n == n' && l == l' && r == r'
  _ == _ = False    -- an empty tree never equals a non-empty one

size Empty = 0
size (Tree n l r) = 1 + size l + size r

height Empty = 0
height (Tree n l r) = 1 + max (height l) (height r)

treeToList Empty = []
treeToList (Tree n l r) = treeToList l ++ treeToList r ++ [n]

listToTree [] = Empty
listToTree (n:ns) = insert n (listToTree ns)

insert m Empty = Tree m Empty Empty
insert m t@(Tree n l r) =
  if m < n then Tree n (insert m l) r
  else if m > n then Tree n l (insert m r)
  else t

find Empty m = False
find (Tree n l r) m =
  if m == n then True
  else if m < n then find l m
  else find r m

merge x y = listToTree(treeToList x ++ treeToList y)

Figure 12: Class BTree

We can build trees using the constructors. This is a bit tedious, and we will discuss easier ways later.

t = Tree 1
      ( Tree 2
          ( Tree 3
              Empty
              ( Tree 4
                  Empty
                  ( Tree 5
                      Empty
                      Empty ) ) )
          Empty )
      ( Tree 6 Empty Empty )

The simple functions below illustrate how we use the constructors as patterns in functions that operate on trees. The size of a tree is the number of nodes that it contains and the height of a tree is the length of the longest path from the root to a leaf node.

size Empty = 0
size (Tree n l r) = 1 + size l + size r

height Empty = 0
height (Tree n l r) = 1 + max (height l) (height r)

The function treeToList converts a tree to a list. The ordering of the list is significant: it would be natural to use inorder (left/root/right), because this gives an ordered list of integers. However, we want to be able to use the list to recreate the tree, and for this purpose postorder (left/right/root) is better.

Here are these functions in use:

BTree> size t
6 :: Integer

BTree> height t
5 :: Integer

BTree> treeToList t
[5,4,3,2,6,1] :: [Int]

As with previous types, we make the type BTree an instance of class Show so that we can convert trees to strings. The implementation of show in Figure 12 uses a “helper function” sh with a second argument giving the amount of indentation to use when printing a node. In recursive calls, the indentation is controlled by the depth of the node in the tree. For example:

BTree> t
 6
1
 2
    5
   4
  3
:: BTree

We can easily define an equality relation on trees: trees are equal if the integers at their root nodes are equal and their subtrees are (recursively) equal. Surprisingly, we must declare that Empty is equal to Empty; without this component of the definition, the function loops.

instance Eq BTree where
  Empty == Empty = True
  (Tree n l r) == (Tree n' l' r') = n == n' && l == l' && r == r'
  _ == _ = False    -- an empty tree never equals a non-empty one

Let us assume that our binary trees are going to be used as binary search trees (BST). That is, we must ensure that all entries in every left subtree are less than the root and that all entries in every right subtree are greater than the root. The function insert in Figure 12 inserts an integer into a BST, maintaining the BST property. We have to decide what to do with duplicate entries, which are not allowed in a binary search tree. This version of function insert ignores them.

It is important to “rebuild” the parts of the tree that we have taken apart. The following code for the case m < n returns only part of the tree:

if m < n
then insert m l

The function insert provides another way of building trees. Note that the last node in the expression is the first node to be inserted in the tree, and it becomes the root.

t1 = insert 1
       (insert 8
          (insert 3
             (insert 2
                (insert 5
                   (insert 2
                      (insert 6
                         (insert 4 Empty)))))))

BTree> t1
  8
 6
  5
4
  3
 2
  1

Although this is a slight improvement over calling constructors, it is still rather clumsy. Here is a function that takes integers from a list and builds a BST from them.

listToTree [] = Empty
listToTree (n:ns) = insert n (listToTree ns)

t2 = listToTree [5, 2, 7, 0, 8, 3, 6, 1, 4]

BTree> t2
  8
   7
 6
  5
4
  3
   2
 1
  0

We have written treeToList and listToTree so that they are inverses.

BTree> t2 == listToTree (treeToList t2)
True :: Bool

It is easy to merge trees using treeToList and listToTree.

merge x y = listToTree (treeToList x ++ treeToList y)

Then, with

t3 = listToTree [14, 9, 13, 16, 11, 17, 19, 18, 10, 20, 12]

we have

BTree> merge t2 t3
 20
   19
  18
   17
    16
      14
     13
12
  11
 10
  9
     8
      7
    6
     5
   4
     3
      2
    1
     0
:: BTree

The function find uses the familiar algorithm to decide whether a given integer is stored in the tree. To make trees useful, we should store a key/value pair at each node. Then find would match the key and return the corresponding value.

The fringe of a tree is a list of the leaf nodes as they would be visited in a left to right scan. Assuming that we can recognize a leaf, it is straightforward to define fringe, as in Figure 13.

We can test these functions with the trees that we already have.

BTree> fringe t1
[1,3,5,8] :: [Int]

BTree> fringe t2
[0,2,5,7] :: [Int]

isLeaf (Tree _ Empty Empty) = True
isLeaf _ = False

fringe Empty = []
fringe t@(Tree n l r) =
  if isLeaf t
  then trace ("Leaf " ++ show n ++ "\n") (n : (fringe l ++ fringe r))
  else fringe l ++ fringe r

sameFringe x y = fringe x == fringe y

Figure 13: The fringe of a tree

There is a famous problem in Computer Science called the “same fringe problem”. Given two trees, how do we decide whether they have the same fringe? Since list comparison is provided for any equality type, and Int is an equality type, we have immediately:

sameFringe x y = fringe x == fringe y

BTree> sameFringe t1 t2
False :: Bool

Let us investigate this more closely. Here is a modified version of fringe that traces each discovery of a leaf node:

fringe Empty = []
fringe t@(Tree n l r) =
  if isLeaf t
  then trace ("Leaf " ++ show n ++ "\n") (n : (fringe l ++ fringe r))
  else fringe l ++ fringe r

This is what happens when we call sameFringe again with the traced version of fringe:

BTree> sameFringe t1 t2
Leaf 1
Leaf 0
False :: Bool
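The trace shows that only the first leaf of each tree was examined. The lists are built lazily, and list equality stops at the first pair of elements that differ (here 1 and 0). The same behaviour can be observed without trees; in this sketch the names differAtHead and tailsUntouched are our own.

```haskell
-- Comparing two infinite lists terminates, because the first elements differ.
differAtHead :: Bool
differAtHead = [1 ..] == (0 : [2 ..] :: [Integer])

-- The tails are never evaluated, so undefined does no harm here.
tailsUntouched :: Bool
tailsUntouched = (1 : undefined) == (0 : undefined :: [Int])
```

Both values are False. If the first elements were equal, the comparison would go on to examine the tails.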

5.3.8 Types with Parameters

In this section, we revisit trees, but with a new twist. The trees in Section 5.3.7 were defined with integers at the nodes. In the declaration below, we define trees with an arbitrary type, indicated by the type variable a, at the nodes. On the left side of the declaration, Tree a indicates that the type variable a is a parameter of the type. Whenever we refer to the type on the right side, we must specify it as Tree a again.

module Tree where

data Tree a = Empty | T a (Tree a) (Tree a)

In practice, trees with integer nodes are not very useful. We need to store key/value pairs at the nodes so that we can store useful information in the tree, as shown in Figure 14. The type variables k and v stand for “key” and “value” respectively. We require that k is an ordered type so that we can compare keys.

module Tree where

data (Ord k, Show k, Show v) => Tree k v =
    Empty
  | T k v (Tree k v) (Tree k v)

instance (Ord k, Show k, Show v) => Show (Tree k v) where
  show Empty = ""
  show (T key value left right) =
    show left ++
    leftjust (show key) 12 ++ show value ++ "\n" ++
    show right

leftjust :: [Char] -> Int -> [Char]
leftjust s n
  | length s >= n = s
  | otherwise = leftjust (s ++ " ") n

out t = putStr (show t)

insert :: (Ord a, Show a, Show b) => a -> b -> Tree a b -> Tree a b
insert newKey newValue Empty = T newKey newValue Empty Empty
insert newKey newValue t@(T key value left right) =
  if newKey == key then T key newValue left right
  else if newKey < key then T key value (insert newKey newValue left) right
  else T key value left (insert newKey newValue right)

build :: (Ord a, Show a, Show b) => [(a,b)] -> Tree a b
build [] = Empty
build ((key,value):ks) = insert key value (build ks)

t = build [("Charles", 4163548695),
           ("Anne", 5144831733),
           ("George", 8147586977),
           ("Edward", 4504657464),
           ("Diana", 2124756843),
           ("Feridoon", 6063647586),
           ("Bill", 4508982212) ]

Figure 14: Key-value tree

When we construct a non-empty tree, we must now provide four arguments: the key, the value, the left subtree, and the right subtree. The insertion function insert can use the operators ==, <, and > with keys because the type of the keys is an instance of class Ord. When we supply a new pair with a key that already exists in the tree, the new value replaces the old value.

As before, we can reduce the amount of writing required by providing a function, build, that builds a tree from a list of pairs.

Here is a test:

t = build [ (6,"Anne"), (3,"Bill"), (8,"Charles"), (1,"Diana") ]

There is a catch with parameterized types: when we use them as instances of a class, we must repeat all of the constraints. For example, to make Tree an instance of Show, we must write:

instance (Ord k, Show k, Show v) => Show (Tree k v) where
  ....

In the source file:

t = build [("Charles", 4163548695),
           ("Anne", 5144831733),
           ("George", 8147586977),
           ("Edward", 4504657464),
           ("Diana", 2124756843),
           ("Feridoon", 6063647586),
           ("Bill", 4508982212) ]

A Haskell session:

Tree> t
"Anne"      5144831733
"Bill"      4508982212
"Charles"   4163548695
"Diana"     2124756843
"Edward"    4504657464
"Feridoon"  6063647586
"George"    8147586977
:: Tree [Char] Integer

Formulas It is often useful to define types with many constructors. The following declaration introduces a type suitable for formulas in propositional calculus, with constants true and false, propositional symbols (or names), conjunctions, disjunctions, negations, and implications.

module Logic where

data Formula =
      TT
    | FF
    | Var String
    | Con [Formula]
    | Dis [Formula]
    | Neg Formula
    | Imp Formula Formula

When we define a function for a type like this one, we must usually include cases for all of the constructors. The following definition of show is incomplete because it does not indicate operator precedence.

instance Show Formula where
    show TT = "true"
    show FF = "false"
    show (Var name) = name
    show (Con []) = ""
    show (Con [f]) = show f
    show (Con (f:fs)) = show f ++ " and " ++ show (Con fs)
    show (Dis []) = ""
    show (Dis [f]) = show f
    show (Dis (f:fs)) = show f ++ " or " ++ show (Dis fs)
    show (Neg f) = "not " ++ show f
    show (Imp p q) = show p ++ " => " ++ show q

As before, we can build formulas using the constructors, although this is rather tedious.

f = (Imp p (Imp p (Imp p p))) where p = Var "p"
g = Imp (Var "p") (Con [Var "q", Neg (Var "r")])

Logic> f
p => p => p => p :: Formula

Logic> g
p => q and not r :: Formula
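The remark that a function on such a type needs a case for every constructor can be illustrated with an evaluator. The following is a sketch only: the names Env and eval are hypothetical, the data declaration is repeated so that the fragment stands alone, and the treatment of unknown symbols as false is an assumption made for simplicity.

```haskell
data Formula
    = TT
    | FF
    | Var String
    | Con [Formula]
    | Dis [Formula]
    | Neg Formula
    | Imp Formula Formula

-- An environment lists the truth value of each propositional symbol.
type Env = [(String, Bool)]

-- Hypothetical evaluator: one equation per constructor.
-- A symbol that is missing from the environment is treated as False.
eval :: Env -> Formula -> Bool
eval _   TT         = True
eval _   FF         = False
eval env (Var name) = maybe False id (lookup name env)
eval env (Con fs)   = all (eval env) fs
eval env (Dis fs)   = any (eval env) fs
eval env (Neg f)    = not (eval env f)
eval env (Imp p q)  = not (eval env p) || eval env q

g :: Formula
g = Imp (Var "p") (Con [Var "q", Neg (Var "r")])
```

For example, eval [("p",True),("q",True),("r",False)] g is True, whereas changing the value of r to True makes it False. In this sketch an empty conjunction evaluates to True and an empty disjunction to False, which is the usual convention.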

5.4 Input and Output

Haskell handles input and output using monads. Monads are pure functions that provide the effect of managing state changes. In this course, we will not discuss monads in detail; instead we will just provide some simple examples of their use.

The first program reads a character and echoes it. The keyword do is followed by a sequence of actions. Each action must have the same indentation. The function getChar performs an action and returns a value, in this case a Char. The function putChar takes a Char as its argument and performs an action: writing the character.

test1 = do
    c <- getChar
    putChar c

The next example shows how to write functions that perform actions. The function ready performs the following sequence of actions: output a prompt; read a character; return a Bool value that depends on the character read.

ready = do
    putStr "Ready? "
    char <- getChar
    return (char == 'y')

test2 = do
    t <- ready
    if t
        then putStr "\nReady"
        else putStr "\nNot ready"

The function getline in the next example is recursive. The else part requires an action sequence that is introduced by a second do. The program test3 reads a line, applies a function (switch) to it, and then writes the line.

getline = do
    char <- getChar
    if char == '\n'
        then return ""
        else do
            line <- getline
            return (char:line)

switch s = map sw s
    where
    sw char =
        if isLower char
            then toUpper char
            else if isUpper char
                then toLower char
                else char

test3 = do
    line <- getline
    putStr (switch line)

The next example uses files instead of console functions.

process s = filter (not . isSpace) s

test4 = do
    s <- readFile "e:/temp/test.dat"
    writeFile "e:/temp/test.res" (process s)

Since it is not good practice to build file names into programs, the next example reads the input and output file names from the console.

main = do
    putStr "Input file: "
    iFile <- getLine
    putStr "Output file: "
    oFile <- getLine
    s <- readFile iFile
    writeFile oFile s
    putStr (iFile ++ " copied to " ++ oFile ++ "\n")


5.4.1 String Formatting

By default, Hugs does not interpret formatting characters from strings. The following exchange with the interpreter illustrates this (the standard function unlines converts a list of strings into a single string with the list items separated by newline characters):

Prelude> unlines ["first line", "second line", "third line"]
"first line\nsecond line\nthird line\n"

The IO function putStr interprets the formatting characters correctly:

Prelude> putStr(unlines ["first line", "second line", "third line"])
first line
second line
third line

Since putStr :: String -> IO () is an IO function, you can use it only at the “top level”, that is, as a command that you enter or as part of a do sequence. This does not matter in practice, because there is no need to format strings until you actually want to see them at the end of a calculation.

5.4.2 Interactive Programs

The standard IO functions include interact, a function that enables you to write interactive programs easily. The argument for interact is a function of type String -> String. A simple example of such a function is map toUpper. If you write

interact (map toUpper)

the interpreter will echo each character as you type it, converting lower case letters to upper case letters.

The effect of lazy evaluation is important here. The effect is not to read a string of characters and then output the same string with lower case letters converted. Instead, it reads each character and then immediately echoes the converted character.

The following example provides a more elaborate version of the same idea. The function machine :: String -> String simulates a vending machine. The argument string is a sequence of “commands” interpreted as follows:

Command   Effect
'n'       Insert a nickel (5 cents)
'd'       Insert a dime (10 cents)
'q'       Insert a quarter (25 cents)
'r'       Demand a refund

The output from the machine is a string defined as follows:

String                       Condition
""                           the amount entered is less than 25 cents
"candy"                      the amount entered was exactly 25 cents
"candy and n cents change"   the amount entered was 25 + n cents
"Refund: n"                  response to 'r' with n cents stored


Here is the definition of the function machine:

machine commands = mac commands 0
    where
    mac "" stored = "Amount left: " ++ show stored
    mac ('q':cs) stored = respond (stored+25) cs
    mac ('d':cs) stored = respond (stored+10) cs
    mac ('n':cs) stored = respond (stored+5) cs
    mac ('r':cs) stored = "Refund: " ++ show stored ++ "\n" ++ mac cs 0
    mac (_:cs) stored = mac cs stored
    respond stored cs =
        if stored == 25
            then "candy\n" ++ mac cs 0
            else if stored > 25
                then "candy and " ++ show (stored-25) ++ " cents change\n" ++ mac cs 0
                else mac cs stored

The following dialog with Hugs shows the result when the user enters the string "qdddnnr". To end the input string, and hence the dialogue, the user enters ctrl-z. The second example shows the response to "dddd" followed by ctrl-z, showing that the machine reports the amount it has left before terminating.

VM> interact machine
candy
candy and 5 cents change
Refund: 10
Amount left: 0

VM> interact machine
candy and 5 cents change
Amount left: 10

VM> ....
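Because machine is an ordinary function of type String -> String, its behaviour can also be examined without interact by applying it to a fixed command string. The sketch below repeats the definition (written with guards and with type signatures added) so that it stands alone; the helper session is hypothetical.

```haskell
machine :: String -> String
machine commands = mac commands 0
  where
    mac :: String -> Int -> String
    mac "" stored       = "Amount left: " ++ show stored
    mac ('q':cs) stored = respond (stored + 25) cs
    mac ('d':cs) stored = respond (stored + 10) cs
    mac ('n':cs) stored = respond (stored + 5) cs
    mac ('r':cs) stored = "Refund: " ++ show stored ++ "\n" ++ mac cs 0
    mac (_:cs) stored   = mac cs stored

    respond :: Int -> String -> String
    respond stored cs
        | stored == 25 = "candy\n" ++ mac cs 0
        | stored > 25  = "candy and " ++ show (stored - 25) ++ " cents change\n" ++ mac cs 0
        | otherwise    = mac cs stored

-- Hypothetical helper: run the machine on a complete command string
-- and split the output into lines.
session :: String -> [String]
session = lines . machine
```

Here session "qdddnnr" yields the four lines of the first dialogue, and session "dddd" yields ["candy and 5 cents change", "Amount left: 10"]. Testing the pure function in this way is usually easier than testing the interactive program.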

5.5 Reasoning about programs

Haskell function definitions not only look like equations but, in an important sense, they are equations. We will illustrate this with a simple example of program manipulation. The following definitions of append (append two lists) and length (length of a list) should be familiar by now. The only difference is that we have numbered the equations for future reference.

append [] ys = ys                        (1)
append (x:xs) ys = x : append xs ys      (2)

length [] = 0                            (3)
length (x:xs) = 1 + length xs            (4)

Suppose that we want to define a new function len2 such that len2 xs ys returns the sum of the lengths of the lists xs and ys. Here is a simple definition:

len2 xs ys = length (append xs ys) (5)


The obvious problem with this definition is that it is inefficient: it constructs a new list just for the purpose of finding its length. How can we avoid this inefficiency? As a first step, we instantiate (5) with xs = [] and then use (1) to simplify the resulting expression:

len2 [] ys = length (append [] ys)             (6)
           = length ys                         (7)

Next, we instantiate (5) with xs = (x:xs) and then use (2), (4), and (5) to simplify the resulting expression:

len2 (x:xs) ys = length (append (x:xs) ys)     (8)
               = length (x : append xs ys)     (9)
               = 1 + length (append xs ys)     (10)
               = 1 + len2 xs ys                (11)

We can now combine (7) and (11) to obtain a definition of len2 that does not use append:

len2 [] ys = length ys

len2 (x:xs) ys = 1 + len2 xs ys
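As a sanity check on the derivation, the specification (5) and the derived equations can be compared on a few inputs. This is a sketch rather than a proof: the names len2Spec, agree, and samples are hypothetical, and the Prelude operator ++ stands in for append.

```haskell
-- Specification (5): build the appended list, then measure it.
len2Spec :: [a] -> [a] -> Int
len2Spec xs ys = length (xs ++ ys)

-- Derived definition, from equations (7) and (11): no intermediate list.
len2 :: [a] -> [a] -> Int
len2 []     ys = length ys
len2 (_:xs) ys = 1 + len2 xs ys

-- Hypothetical check: the two definitions agree on a few small lists.
agree :: Bool
agree = and [ len2 xs ys == len2Spec xs ys | xs <- samples, ys <- samples ]
  where
    samples = [[], [1], [1, 2, 3], [4, 5, 6, 7]] :: [[Int]]
```

The value of agree is True. Agreement on samples does not replace the equational argument, which shows that the two definitions are equal for all finite lists.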

We could, of course, have obtained this definition of len2 with a little thought and no formal deduction. Here is a somewhat more complicated example in which the final answer is less obvious. Suppose we would like a function isSubList such that isSubList xs ys returns True if and only if xs is a sublist of ys. The following definition is clearly correct but not, as it stands, implementable (think of bs as the “before list” and as as the “after list”):

isSubList xs ys = ∃ bs,as • bs ++ xs ++ as == ys

To make progress, we assume that ys is not empty (and so can be written y:ys') and that there are two cases: either bs = [] or bs = y:bs' (clearly, if bs is not empty, its first element must be y). Then we have:

isSubList xs (y:ys') = ∃ as • [] ++ xs ++ as == y:ys'
                       ∨ ∃ bs',as • y:bs' ++ xs ++ as == y:ys'

                     = ∃ as • xs ++ as == y:ys'
                       ∨ ∃ bs',as • bs' ++ xs ++ as == ys'

                     = ∃ as • xs ++ as == y:ys'
                       ∨ isSubList xs ys'

The second disjunct is a straightforward recursion. The first requires another function which, on the basis of its required behaviour, we will call isPrefix. The definition of isPrefix is

isPrefix xs ys = ∃ as • xs ++ as == ys

Assuming both lists are not empty, we can write this as

isPrefix (x:xs') (y:ys') = ∃ as • (x:xs') ++ as == (y:ys')

                         = x == y ∧ ∃ as' • xs' ++ as' == ys'

                         = x == y ∧ isPrefix xs' ys'

It is straightforward to add the base cases to obtain the following definition:


isSubList [] _ = True
isSubList _ [] = False
isSubList xs (y:ys) = isPrefix xs (y:ys) || isSubList xs ys
    where
    isPrefix [] _ = True
    isPrefix _ [] = False
    isPrefix (x:xs) (y:ys) = x == y && isPrefix xs ys
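The derived definition can be compared with the library function isInfixOf from Data.List, which tests for a contiguous sublist and so matches the specification bs ++ xs ++ as == ys. The sketch below assumes that reading of "sublist"; the names agreesWithLibrary and samples are hypothetical, and the local variables of isPrefix are renamed to avoid shadowing.

```haskell
import Data.List (isInfixOf)

isSubList :: Eq a => [a] -> [a] -> Bool
isSubList [] _ = True
isSubList _ [] = False
isSubList xs (y:ys) = isPrefix xs (y:ys) || isSubList xs ys
  where
    isPrefix [] _ = True
    isPrefix _ [] = False
    isPrefix (p:ps) (q:qs) = p == q && isPrefix ps qs

-- Hypothetical check against the library function isInfixOf.
agreesWithLibrary :: Bool
agreesWithLibrary =
    and [ isSubList xs ys == (xs `isInfixOf` ys) | xs <- samples, ys <- samples ]
  where
    samples = ["", "a", "ab", "ba", "abc", "cab", "abab"]
```

For example, isSubList "ab" "cab" is True and isSubList "ba" "abc" is False, and agreesWithLibrary is True for the sample strings.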

6 The Object Oriented Paradigm

A fundamental feature of our understanding of the world is that we organize our experience as a number of distinct objects (tables and chairs, bank loans and algebraic expressions, . . .) . . . . When we wish to solve a problem on a computer, we often need to construct within the computer a model of that aspect of the real or conceptual world to which the solution of the problem will be applied. (Hoare 1968)

Object Oriented Programming (OOP) is the currently dominant programming paradigm. For many people today, “programming” means “object oriented programming”. Nevertheless, there is a substantial portion of the software industry that has not adopted OOP, and there remains widespread misunderstanding about what OOP actually is and what, if any, benefits it has to offer.

6.1 Simula

Many programs are computer simulations of the real world or a conceptual world. Writing such programs is easier if there is a correspondence between objects in the world and components of the program.

Simula originated in the Norwegian Computing Centre in 1962. Kristen Nygaard proposed a language for simulation to be developed by himself and Ole-Johan Dahl (1978). Key insights were developed in 1965 following experience with Simula I:

. the purpose of the language was to model systems;

. a system is a collection of interacting processes;

. a process can be represented during program execution by multiple procedures each with its own Algol-style stacks.

The main lessons of Simula I were:

. the distinction between a program text and its execution;

. the fact that data and operations belong together and that most useful programming constructs contain both.

Simula 67 was a general purpose programming language that incorporated the ideas of Simula I but put them into a more general context. The basic concept of Simula 67 was to be “classes of objects”. The major innovation was “block prefixing” (Nygaard and Dahl 1978).

Prefixing emerged from the study of queues, which are central to discrete-event simulations. The queue mechanism (a list linked by pointers) can be split off from the elements of the queue (objects such as trucks, buses, people, and so on, depending on the system being simulated). Once recognized, the prefix concept could be used in any context where common features of a collection of classes could be abstracted in a prefixed block or, as we would say today, a superclass.

Dahl (1978, page 489) has provided his own analysis of the role of blocks in Simula.

. Deleting the procedure definitions and final statement from an Algol block gives a pure data record.

. Deleting the final statement from an Algol block, leaving procedure definitions and data declarations, gives an abstract data object.

. Adding coroutine constructs to Algol blocks provides quasi-parallel programming capabilities.


Listing 19: A Simula Class

class Account (real balance);
begin
    procedure Deposit (real amount)
        balance := balance + amount;
    procedure Withdraw (real amount)
        balance := balance - amount;
end;

. Adding a prefix mechanism to Algol blocks provides an abstraction mechanism (the class hierarchy).

All of these concepts are subsumed in the Simula class.

Listing 19 shows a simple Simula-style class, with modernized syntax. Before we use this class, we must declare a reference to it and then instantiate it on the heap:

ref (Account) myAccount;
myAccount := new Account(1000);

Note that the class is syntactically more like a procedure than a data object: it has an (optional) formal parameter and it must be called with an actual parameter. It differs from an ordinary Algol procedure in that its activation record (AR) remains on the heap after the invocation.

To inherit from Account we write something like:

Account class ChequeAccount (real amount);
    . . . .

This explains the term “prefixing”: the declaration of the child class is prefixed by the name of its parent. Inheritance and subclasses introduced a new programming methodology, although this did not become evident for several years.

Simula provides:

. coroutines that permit the simulation of concurrent processes;

. multiple stacks, needed to support coroutines;

. classes that combine data and a collection of functions that operate on the data;

. prefixing (now known as inheritance) that allows a specialized class to be derived from a general class without unnecessary code duplication;

. a garbage collector that frees the programmer from the responsibility of deallocating storage.

Simula itself had an uneven history. It was used more in Europe than in North America, but it never achieved the recognition that it deserved. This was partly because there were few Simula compilers and the good compilers were expensive.

On the other hand, the legacy that Simula left is considerable: a new paradigm of programming.

Exercise 23. Coroutines are typically implemented by adding two new statements to the language: suspend and resume. A suspend statement, executed within a function, switches control to a scheduler which stores the value of the program counter somewhere and continues the execution of another function. A resume statement has the effect of restarting a previously suspended function. At any one time, exactly one function is executing. What features might a concurrent language provide that are not provided by coroutines? What additional complications are introduced by concurrency? □


6.2 Smalltalk

Smalltalk originated with Alan Kay’s reflections on the future of computers and programming in the late 60s. Kay (1996) was influenced by LISP, especially by its one-page metacircular interpreter, and by Simula. It is interesting to note that Kay gave a talk about his ideas at MIT in November 1971; the talk inspired Carl Hewitt’s work on his Actor model, an early attempt at formalizing objects. In turn, Sussman and Steele wrote an interpreter for Actors that eventually became Scheme (see Section 4.2). The cross-fertilization between OOPL and FPL that occurred during these early days has, sadly, not continued.

The first version of Smalltalk was implemented by Dan Ingalls in 1972, using BASIC (!) as the implementation language (Kay 1996, page 533). Smalltalk was inspired by Simula and LISP; it was based on six principles (Kay 1996, page 534).

1. Everything is an object.

2. Objects communicate by sending and receiving messages (in terms of objects).

3. Objects have their own memory (in terms of objects).

4. Every object is an instance of a class (which must be an object).

5. The class holds the shared behaviour for its instances (in the form of objects in a program list).

6. To [evaluate] a program list, control is passed to the first object and the remainder is treated as its message.

Principles 1–3 provide an “external” view and remained stable as Smalltalk evolved. Principles 4–6 provide an “internal” view and were revised following implementation experience. Principle 6 reveals Kay’s use of LISP as a model for Smalltalk: McCarthy had described LISP with a one-page meta-circular interpreter; one of Kay’s goals was to do the same for Smalltalk.

Smalltalk was also strongly influenced by Simula. However, it differs from Simula in several ways:

. Simula distinguishes primitive types, such as integer and real, from class types. In Smalltalk, “everything is an object”.

. In particular, classes are objects in Smalltalk. To create a new instance of a class, you send a message to it. Since a class object must belong to a class, Smalltalk requires metaclasses.

. Smalltalk effectively eliminates passive data. Since objects are “active” in the sense that they have methods, and everything is an object, there are no data primitives.

. Smalltalk is a complete environment, not just a compiler. You can edit, compile, execute, and debug Smalltalk programs without ever leaving the Smalltalk environment.

The “block” is an interesting innovation of Smalltalk. A block is a sequence of statements that can be passed as a parameter to various control structures and behaves rather like an object. In the Smalltalk statement

10 timesRepeat: [Transcript nextPutAll: ' Hi!']

the receiver is 10, the message is timesRepeat:, and [Transcript nextPutAll: ' Hi!'] is the parameter of the message.

Blocks may have parameters. One technique for iteration in Smalltalk is to use a do: message which executes a block (its parameter) for each component of a collection. The component is passed to the block as a parameter. The following loop determines whether 'Mika' occurs in the current object (assumed to be a collection):

self do: [:item | item = 'Mika' ifTrue: [^true]].
^false

The first occurrence of item introduces it as the name of the formal parameter for the block. The second occurrence is its use. Blocks are closely related to closures (Section 4.2). The block [:x | E] corresponds to λx .E.
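The correspondence between blocks and lambda expressions can be made concrete in Haskell, the language of Section 5. This is a rough analogue only, since Smalltalk blocks may have side effects; the names square and containsMika are hypothetical.

```haskell
-- The block [:x | x * x] corresponds to the lambda expression \x -> x * x.
square :: Int -> Int
square = \x -> x * x

-- Smalltalk:  self do: [:item | item = 'Mika' ifTrue: [^true]]. ^false
-- The block becomes a lambda passed to a library function. Like the
-- Smalltalk loop, any stops at the first matching item.
containsMika :: [String] -> Bool
containsMika collection = any (\item -> item == "Mika") collection
```

For example, containsMika ["Anna", "Mika"] is True and containsMika [] is False. In both languages the iteration is supplied by the collection (do: or any) and the test is supplied by the caller.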

The first practical version of Smalltalk was developed in 1976 at Xerox Palo Alto Research Center (PARC). The important features of Smalltalk are:

. everything is an object;

. an object has private data and public functions;

. objects collaborate by exchanging “messages”;

. every object is a member of a class;

. there is an inheritance hierarchy (actually a tree) with the class Object as its root;

. all classes inherit directly or indirectly from the class Object;

. blocks;

. coroutines;

. garbage collection.

Exercise 24. Discuss the following statements (Gelernter and Jagannathan 1990, page 245).

1. [Data objects are replaced by program structures.] Under this view of things, particularly under Smalltalk’s approach, there are no passive objects to be acted upon. Every entity is an active agent, a little program in itself. This can lead to counter-intuitive interpretations of simple operations, and can be destructive to a natural sense of hierarchy.

2. Algol 60 gives us a representation for procedures, but no direct representation for a procedure activation. Simula and Smalltalk inherit this same limitation: they give us representations for classes, but no direct representations for the objects that are instantiated from classes. Our ability to create single objects and to specify (constant) object “values” is compromised. □

Exercise 25. In Smalltalk, an object is an instance of a class. The class is itself an object, and therefore is a member of a class. A class whose instances are classes is called a metaclass. Do these definitions imply an infinite regression? If so, how could the infinite regression be avoided? Is there a class that is a member of itself? □

6.3 CLU

CLU is not usually considered an OOPL because it does not provide any mechanisms for incremental modification. However:

The resulting programming methodology is object-oriented: programs are developed by thinking about the objects they manipulate and then inventing a modular structure based on these objects. Each type of object is implemented by its own program module. (Liskov 1996, page 475)

Parnas (1972) introduced the idea of “information hiding”: a program module should have a public interface and a private implementation. Liskov and Zilles (1974) introduced CLU specifically to support this style of programming.


An abstract data type (ADT) specifies a type, a set of operations that may be performed on instances of the type, and the effects of these operations. The implementation of an ADT provides the actual operations that have the specified effects and also prevents programmers from doing anything else. The word “abstract” in CLU means “user defined”, or not built in to the language:

I referred to the types as “abstract” because they are not provided directly by a programming language but instead must be implemented by the user. An abstract type is abstract in the same way that a procedure is an abstract operation. (Liskov 1996, page 473)

This is not a universally accepted definition. Many people use the term ADT in the following sense:

An abstract data type is a system consisting of three constituents:

1. some sets of objects;
2. a set of syntactic descriptions of the primitive functions;
3. a semantic description, that is, a sufficiently complete set of relationships that specify how the functions interact with each other.

(Martin 1986)

The difference is that an ADT in CLU provides a particular implementation of the type (and, in fact, the implementation must be unique) whereas Martin’s definition requires only a specification. A stack ADT in CLU would have functions called push and pop to modify the stack, but a stack ADT respecting Martin’s definition would define the key property of a stack (the sequence push(); pop() has no effect) but would not provide an implementation of these functions.
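The two views can be contrasted in a small Haskell sketch. The law pushPopLaw plays the role of the specification in Martin’s sense, while the list-based definitions are one possible implementation. All names here are hypothetical, and the behaviour of pop on an empty stack is an arbitrary choice made for this sketch.

```haskell
-- One possible representation: a list with the top element first.
newtype Stack a = Stack [a] deriving (Eq, Show)

empty :: Stack a
empty = Stack []

push :: a -> Stack a -> Stack a
push x (Stack xs) = Stack (x : xs)

-- pop discards the top element; popping an empty stack leaves it empty
-- (an arbitrary choice made for this sketch).
pop :: Stack a -> Stack a
pop (Stack [])       = Stack []
pop (Stack (_ : xs)) = Stack xs

-- The key property from the specification: push followed by pop has no effect.
pushPopLaw :: Eq a => a -> Stack a -> Bool
pushPopLaw x s = pop (push x s) == s
```

Any other representation would be acceptable to a client provided that pushPopLaw holds for every element and every stack; for this representation it holds by the definitions of push and pop.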

Although Simula influenced the design of CLU, it was seen as deficient in various ways. Simula:

. does not provide encapsulation: clients could directly access the variables, as well as the functions, of a class;

. does not provide “type generators” or, as we would say now, “generic types” (“templates” in C++);

. associates operations with objects rather than types;

. handles built-in and user-defined types differently; for example, instances of user-defined classes are always heap-allocated.

CLU’s computational model is similar to that of LISP: names are references to heap-allocated objects. The stack-only model of Algol 60 was seen as too restrictive for a language supporting data abstractions. The reasons given for using a heap include:

. Declarations are easy to implement because, for all types, space is allocated for a pointer. Stack allocation breaks encapsulation because the size of the object (which should be an implementation “secret”) must be known at the point of declaration.

. Variable declaration is separated from object creation. This simplifies safe initialization.

. Variable and object lifetimes are not tied. When the variable goes out of scope, the object continues to exist (there may be other references to it).

. The meaning of assignment is independent of type. The statement x := E means simply that x becomes a reference (implemented as a pointer, of course) to E. There is no need to worry about the semantics of copying an object. Copying, if needed, should be provided by a member function of the class.


Listing 20: A CLU Cluster

intset = cluster is create, insert, delete, member, size, choose
    rep = array[int]

    create = proc () returns (cvt)
        return (rep$new())
        end create

    insert = proc (s: intset, x: int)
        if ∼member(s, x) then rep$addh(down(s), x) end
        end insert

    member = proc (s: cvt, x: int) returns (bool)
        return (getind(s, x) ≤ rep$high(s))
        end member
    . . . .
    end intset

. Reference semantics requires garbage collection. Although garbage collection has a run-time overhead, efficiency was not a primary goal of the CLU project. Furthermore, garbage collection eliminates memory leaks and dangling pointers, notorious sources of obscure errors.

CLU is designed for programming with ADTs. Associated with the interface is an implementation that defines a representation for the type and provides bodies for the functions.

Listing 20 shows an extract from a cluster that implements a set of integers (Liskov and Guttag 1986, page 63). The components of a cluster, such as intset, are:

1. A header that gives the name of the type being implemented (intset) and the names of the operations provided (create, insert, . . . .).

2. A definition of the representation type, introduced by the keyword rep. An intset is represented by an array of integers.

3. Implementations of the functions mentioned in the header and, if necessary, auxiliary functions that are not accessible to clients.

The abstract type intset and the representation type array[int], referred to as rep, are distinct. CLU provides special functions up, which converts the representation type to the abstract type, and down, which converts the abstract type to the representation type. For example, the function insert has a parameter s of type intset which is converted to the representation type so that it can be treated as an array.

It often happens that the client provides an abstract argument that must immediately be converted to the representation type; conversely, many functions create a value of the representation type but must return a value of the abstract type. For both of these cases, CLU provides the keyword cvt as an abbreviation. In the function create above, cvt converts the new instance of rep to intset before returning, and in the function member, the formal parameter s:cvt indicates that the client will provide an intset which will be treated in the function body as a rep.

All operations on abstract data in CLU contain the type of the operator explicitly, using the syntax type$operation.

CLU draws a distinction between mutable and immutable objects. An object is mutable if its value can change during execution, otherwise it is immutable. For example, integers are immutable but arrays are mutable (Liskov and Guttag 1986, pages 19–20).


The mutability of an object depends on its type and, in particular, on the operations provided by the type. Clearly, an object is mutable if its type has operations that change the object. After executing

x: T = intset$create()
y: T
y := x

the variables x and y refer to the same object. If the object is immutable, the fact that it is shared is undetectable by the program. If it is mutable, changes made via the reference x will be visible via the reference y. Whether this is desirable depends on your point of view.
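The effect of sharing a mutable object can be imitated in Haskell with a mutable reference cell from Data.IORef. This is only an analogy to CLU’s reference semantics, not a model of it; the name sharedUpdate is hypothetical.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Two names for one mutable cell: a change made through x is visible through y.
sharedUpdate :: IO [Int]
sharedUpdate = do
    x <- newIORef [1, 2]      -- x refers to a mutable object
    let y = x                 -- like y := x, this copies the reference only
    modifyIORef x (3 :)       -- change the object via x
    readIORef y               -- the change is observed via y
```

Running sharedUpdate returns [3, 1, 2]. The list stored in the cell is itself immutable, so sharing it would be undetectable; only the cell, which has an update operation, reveals the sharing.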

The important contributions of CLU include:

. modules (“clusters”);

. safe encapsulation;

. distinction between the abstract type and the representation type;

. names are references, not values;

. the distinction between mutable and immutable objects;

. analysis of copying and comparison of ADTs;

. generic clusters (a cluster may have parameters, notably type parameters);

. exception handling.

Whereas Simula provides tools for the programmer and supports a methodology, CLU provides the same tools and enforces the methodology. The cost is an increase in complexity. Several keywords (rep, cvt, down, and up) are needed simply to maintain the distinction between abstract and representation types. In Simula, three concepts (procedures, classes, and records) are effectively made equivalent, resulting in significant simplification. In CLU, procedures, records, and clusters are three different things.

The argument in favour of CLU is that the compiler will detect encapsulation and other errors. But is it the task of a compiler to prevent the programmer doing things that might be completely safe? Or should this role be delegated to a style checker, just as C programmers use both cc (the C compiler) and lint (the C style checker)?

The designers of CLU advocate defensive programming:

Of course, checking whether inputs [i.e., parameters of functions] are in the permitted subset of the domain takes time, and it is tempting not to bother with the checks, or to use them only while debugging, and suppress them during production. This is generally an unwise practice. It is better to develop the habit of defensive programming, that is, writing each procedure to defend itself against errors. . . .

Defensive programming makes it easier to debug programs. . . . (Liskov and Guttag 1986, page 100)

Exercise 26. Do you agree that defensive programming is a better habit than the alternative suggested by Liskov and Guttag? Can you propose an alternative strategy? □

6.4 C++

C++ was developed at Bell Labs by Bjarne Stroustrup (1994). The task assigned to Stroustrup was to develop a new systems PL that would replace C. C++ is the result of two major design decisions: first, Stroustrup decided that the new language would be adopted only if it was compatible with C; second, Stroustrup’s experience of completing a Ph.D. at Cambridge University using Simula convinced him that object orientation was the correct approach but that efficiency was essential for acceptance. C++ is widely used but has been sharply criticized (Sakkinen 1988; Sakkinen 1992; Joyner 1992).

C++:

. is almost a superset of C (that is, there are only a few C constructs that are not accepted by a C++ compiler);

. is a hybrid language (i.e., a language that supports both imperative and OO programming), not a “pure” OO language;

. emphasizes the stack rather than the heap, although both stack and heap allocation are provided;

. provides multiple inheritance, genericity (in the form of “templates”), and exception handling;

. does not provide garbage collection.

Exercise 27. Explain why C++ is the most widely-used OOPL despite its technical flaws. □

6.5 Eiffel

Software Engineering was a key objective in the design of Eiffel.

Eiffel embodies a “certain idea” of software construction: the belief that it is possible to treat this task as a serious engineering enterprise, whose goal is to yield quality software through the careful production and continuous development of parameterizable, scientifically specified, reusable components, communicating on the basis of clearly defined contracts and organized in systematic multi-criteria classifications. (Meyer 1992)

Eiffel borrows from Simula 67, Algol W, Alphard, CLU, Ada, and the specification language Z. Like some other OOPLs, it provides multiple inheritance. It is notable for its strong typing, an unusual exception handling mechanism, and assertions. By default, object names are references; however, Eiffel also provides “expanded types” whose instances are named directly.

Eiffel does not have: global variables; enumerations; subranges; goto, break, or continue statements; procedure variables; casts; or pointer arithmetic. Input/output is defined by libraries rather than being built into the language.

The main features of Eiffel are as follows. Eiffel:

. is strictly OO: all functions are defined as methods in classes;

. provides multiple inheritance and repeated inheritance;

. supports “programming by contract” with assertions;

. has an unusual exception handling mechanism;

. provides generic classes;

. may provide garbage collection.

6.5.1 Programming by Contract

The problem with writing a function to compute square roots is that it is not clear what to do if the argument provided by the caller is negative. The solution adopted in Listing 21 is to allow for the possibility that the argument might be negative.


Listing 21: Finding a square root

double sqrt (double x){

if (x ≥ 0.0)return . . . . //

√x

else??

}

Listing 22: A contract for square roots

sqrt (x: real): real isrequire x ≥ 0.0Result := . . . . --

√x

ensure abs(Result * Result / x - 1.0) ≤ 1e-6end

. This approach, applied throughout a software system, is called defensive programming (Section 6.3).

. Testing arguments adds overhead to the program.

. Most of the time spent testing is wasted, because programmers will rarely call sqrt with a negative argument.

. The failure action, indicated by “??” in Listing 21, is hard to define. In particular, it is undesirable to return a funny value such as −999.999, or to send a message to the console (there may not be one), or to add another parameter to indicate success/failure. The only reasonable possibility is some kind of exception.

Meyer’s solution is to write sqrt as shown in Listing 22. The require/ensure clauses constitute a contract between the function and the caller. In words:

If the actual parameter supplied, x, is positive, the result returned will be approximately √x.

If the argument supplied is negative, the contract does not constrain the function in any way at all: it can return −999.999 or any other number, terminate execution, or whatever. The caller therefore has a responsibility to ensure that the argument is valid.

The advantages of this approach include:

. Many tests become unnecessary.

. Programs are not cluttered with tests for conditions that are unlikely to arise.

. The require conditions can be checked dynamically during testing and (perhaps) disabled during production runs. (Hoare considers this policy to be analogous to wearing your life jacket during on-shore training but leaving it behind when you go to sea.)

. Require and ensure clauses are a useful form of documentation; they appear in Eiffel interfaces.
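As a rough illustration, the contract of Listing 22 can be imitated in C++ with assertions. This is only a sketch under stated assumptions: checked_sqrt is a hypothetical name, the body simply delegates to std::sqrt, and the special case for zero is an addition made here because the postcondition of Listing 22 divides by x. Compiling with NDEBUG defined removes the checks, which corresponds to disabling contract checking in production runs.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical C++ counterpart of Listing 22 (not Eiffel's own mechanism).
double checked_sqrt(double x)
{
    assert(x >= 0.0);                 // require: the caller's obligation
    double result = std::sqrt(x);     // stand-in for the real computation
    // ensure: the function's obligation. x == 0 is treated separately
    // because the relative-error test divides by x.
    assert(x == 0.0 ? result == 0.0
                    : std::fabs(result * result / x - 1.0) <= 1e-6);
    return result;
}
```

Unlike Eiffel's require and ensure clauses, these assertions are buried in the function body, so they do not appear in the interface seen by the caller.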

Listing 23: Eiffel contracts

    class Parent
    feature
        f() is
            require Rp
            ensure Ep
        end
        . . . .

    class Child inherit Parent
    feature
        f() is
            require Rc
            ensure Ec
        . . . .

Functions in a child class must provide a contract at least as strong as that of the corresponding function in the parent class, as shown in Listing 23. We must have Rp ⇒ Rc and Ec ⇒ Ep. Note the directions of the implications: the ensure clauses are “covariant” but the require clauses are “contravariant”. The following analogy explains this reversal.

A pet supplier offers the following contract: “if you send at least $200, I will send you an animal”. The pet supplier has a subcontractor who offers the following contract: “if you send me at least $150, I will send you a dog”. These contracts respect the Eiffel requirements: the contract offered by the subcontractor has a weaker requirement ($150 is less than $200) but promises more (a dog is more specific than an animal). Thus the contractor can use the subcontractor when a dog is required and make a profit of $50.

Eiffel achieves the subcontract specifications by requiring the programmer to define

    Rc ≡ Rp ∨ · · ·
    Ec ≡ Ep ∧ · · ·

The ∨ ensures that the child’s requirement is weaker than the parent’s requirement, and the ∧ ensures that the child’s commitment is stronger than the parent’s commitment.
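The rule can be made concrete with a small C++ sketch of the pet-supplier analogy. All names here (Supplier, DogSupplier, order, and so on) are hypothetical, and the explicit checks in order() are an assumption standing in for what Eiffel does automatically. The child forms its precondition by or-ing onto the parent's and its postcondition by and-ing onto the parent's, so it cannot violate Rp ⇒ Rc or Ec ⇒ Ep.

```cpp
#include <stdexcept>
#include <string>

// All names are hypothetical; they follow the pet-supplier analogy.
class Supplier {
public:
    virtual ~Supplier() = default;

    // Checks the contract around the actual work done by deliver().
    std::string order(int dollars) const
    {
        if (!precondition(dollars))
            throw std::invalid_argument("precondition violated by caller");
        std::string delivered = deliver();
        if (!postcondition(delivered))
            throw std::logic_error("postcondition violated by supplier");
        return delivered;
    }

protected:
    // Rp: at least $200.
    virtual bool precondition(int dollars) const { return dollars >= 200; }
    // Ep: some animal is delivered.
    virtual bool postcondition(const std::string& d) const { return !d.empty(); }
    virtual std::string deliver() const { return "animal"; }
};

class DogSupplier : public Supplier {
protected:
    // Rc = Rp or ...: the requirement can only become weaker.
    bool precondition(int dollars) const override
    {
        return Supplier::precondition(dollars) || dollars >= 150;
    }
    // Ec = Ep and ...: the commitment can only become stronger.
    bool postcondition(const std::string& d) const override
    {
        return Supplier::postcondition(d) && d == "dog";
    }
    std::string deliver() const override { return "dog"; }
};
```

A client written against Supplier can be handed a DogSupplier without noticing: every order the client is entitled to place is accepted, and every delivery satisfies what the client was promised.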

In addition to require and ensure clauses, Eiffel has provision for writing class invariants. A class invariant is a predicate over the variables of an instance of the class. It must be true when the object is created and after each function of the class has executed.

Eiffel programmers are not obliged to write assertions, but it is considered good Eiffel “style” to include them. In practice, it is sometimes difficult to write assertions that make non-trivial claims about the objects. Typical contracts say things like: “after you have added an element to this collection, it contains one more element”. Since assertions may contain function calls, it is possible in principle to say things like: “this structure is a heap (in the sense of heap sort)”. However:

. this does not seem to be done much in practice;

. complex assertions increase run-time overhead, increasing the incentive to disable checking;

. a complex assertion might itself contain errors, leading to a false sense of confidence in the correctness of the program.
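To make the heap example concrete, here is a sketch in C++ of a class whose invariant is exactly “this structure is a heap”. The class MinHeap and its member names are hypothetical, and the standard library function std::is_heap is assumed to be an acceptable stand-in for a hand-written check. The invariant is tested when the object is created and after each operation, as described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical min-heap of ints with a checked class invariant.
class MinHeap {
public:
    MinHeap() { assert(invariant()); }           // must hold on creation

    void insert(int value)
    {
        items.push_back(value);
        std::push_heap(items.begin(), items.end(), std::greater<int>());
        assert(invariant());                     // and after each operation
    }

    int removeMin()
    {
        assert(!items.empty());                  // require
        std::pop_heap(items.begin(), items.end(), std::greater<int>());
        int smallest = items.back();
        items.pop_back();
        assert(invariant());
        return smallest;
    }

    std::size_t size() const { return items.size(); }

private:
    // "This structure is a heap": a non-trivial claim, but the check
    // visits every element each time it is evaluated.
    bool invariant() const
    {
        return std::is_heap(items.begin(), items.end(), std::greater<int>());
    }

    std::vector<int> items;
};
```

The sketch also shows the cost mentioned in the list above: each insertion does logarithmic work but the invariant check is linear, so the check dominates the running time until it is disabled.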

6.5.2 Repeated Inheritance

Listing 24: Repeated inheritance in Eiffel

    class House
    feature
        address: String
        value: Money
    end

    class Residence inherit House
        rename value as residenceValue
    end

    class Business inherit House
        rename value as businessValue
    end

    class HomeBusiness inherit Residence Business
        . . . .
    end

Listing 24 shows a collection of classes (Meyer 1992, page 169). The features of HomeBusiness include:

. address, inherited once but by two paths;

. residenceValue, inherited from value in class House;

. businessValue, inherited from value in class House.

Repeated inheritance can be used to solve various problems that appear in OOP. For example, it is difficult in some languages to provide collections with more than one iteration sequence. In Eiffel, more than one sequence can be obtained by repeated inheritance of the successor function.

C++ has a similar mechanism, but it applies to entire classes (via the virtual mechanism) rather than to individual attributes of classes.
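The difference can be seen in a short C++ sketch of the classes of Listing 24. The names prefixed with Plain are hypothetical and are introduced only to show the non-virtual case. With virtual inheritance the whole of House is shared, so there is one address but also only one value; with ordinary inheritance the whole of House is duplicated, so there are two values but also two addresses. Neither choice gives the Eiffel result of one address and two values.

```cpp
#include <string>

struct House {
    std::string address;
    double value = 0.0;
};

// Virtual inheritance: HomeBusiness contains one shared House subobject.
struct Residence : virtual House {};
struct Business  : virtual House {};
struct HomeBusiness : Residence, Business {};

// Ordinary inheritance: the whole of House is duplicated, address included.
struct PlainResidence : House {};
struct PlainBusiness  : House {};
struct PlainHomeBusiness : PlainResidence, PlainBusiness {};
```

In the shared case, assigning to value through a Residence reference changes what a Business reference sees; in the duplicated case the two paths lead to distinct members.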

6.5.3 Exception Handling

An Eiffel function must either fulfill its contract or report failure. If a function contains a rescue clause, this clause is invoked if an operation within the function reports failure. A return from the rescue clause indicates that the function has failed. However, a rescue clause may perform some cleaning-up actions and then invoke retry to attempt the calculation again. The function in Listing 25 illustrates the idea. The mechanism seems harder to use than other, more conventional, exception handling mechanisms. It is not obvious that there are many circumstances in which it makes sense to “retry” a function.
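For comparison, the control flow of rescue and retry can be approximated in C++ with a loop around a try block. This is a sketch, not a translation of Eiffel semantics: the function name tryIt follows Listing 25, but passing the two methods as parameters and returning the number of attempts are assumptions made here so that the example is self-contained.

```cpp
#include <functional>
#include <stdexcept>

// Runs firstMethod; if it fails, retries once with secondMethod.
// Returns the number of attempts made. If the second attempt also
// fails, the failure is reported to the caller by rethrowing.
int tryIt(const std::function<void()>& firstMethod,
          const std::function<void()>& secondMethod)
{
    bool triedFirst = false;          // initially false, as in Listing 25
    int attempts = 0;
    for (;;) {                        // each iteration is one "retry"
        try {
            ++attempts;
            if (!triedFirst)
                firstMethod();
            else
                secondMethod();
            return attempts;          // contract fulfilled
        } catch (...) {               // plays the role of the rescue clause
            if (triedFirst)
                throw;                // report failure
            triedFirst = true;        // then go round the loop again
        }
    }
}
```

The loop makes explicit what retry hides: the body is re-entered from the top with the local state left by the rescue clause.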

Exercise 28. Early versions of Eiffel had a reference semantics (variable names denoted references to objects). After Eiffel had been used for a while, Meyer introduced expanded types. An instance of an expanded type is an object that has a name (value semantics). Expanded types introduce various irregularities into the language, because they do not have the same properties as ordinary types. Why do you think expanded types were introduced? Was the decision to introduce them wise?

6.6 Java

Java (Arnold and Gosling 1998) is an OOPL introduced by Sun Microsystems. Its syntax bears some relationship to that of C++, but Java is simpler in many ways than C++. Key features of Java include the following.


Listing 25: Exceptions in Eiffel

    tryIt is
        local
            triedFirst: Boolean   -- initially false
        do
            if not triedFirst
                then firstmethod
                else secondmethod
            end
        rescue
            if not triedFirst
            then
                triedFirst := true
                retry
            else
                restoreInvariant
                -- report failure
            end
    end

. Java is compiled to byte codes that are interpreted. Since any computer that has a Java byte code interpreter can execute Java programs, Java is highly portable.

. The portability of Java is exploited in network programming: Java byte codes can be transmitted across a network and executed by any processor with an interpreter.

. Java offers security. The byte codes are checked by the interpreter and have limited functionality. Consequently, Java byte codes do not have the potential to penetrate system security in the way that a binary executable (or even a MS-Word macro) can.

. Java has a class hierarchy with class Object at the root and provides single inheritance of classes.

. In addition to classes, Java provides interfaces with multiple inheritance.

. Java has an exception handling mechanism.

. Java provides concurrency in the form of threads.

. Primitive values, such as int, are not objects in Java. However, Java provides wrapper classes, such as Integer, for each primitive type.

. A variable name in Java is a reference to an object.

. Java provides garbage collection.

6.6.1 Portability

Compiling to byte codes is an implementation mechanism and, as such, is not strongly relevant to this course. The technique is quite old (Pascal P achieved portability in the same way during the mid-seventies) and has the disadvantage of inefficiency: Java programs are typically an order of magnitude slower than C++ programs with the same functionality. For some applications, such as simple Web “applets”, the inefficiency is not important. More efficient implementations are promised. One interesting technique is “just in time” compilation, in which program statements are compiled immediately before execution; parts of the program that are not needed during a particular run (this may be a significant fraction of a complex program) are not compiled.


Listing 26: Interfaces

    interface Comparable
    {
        Boolean equalTo (Comparable other);
        // Return True if ‘this’ and ‘other’ objects are equal.
    }

    interface Ordered extends Comparable
    {
        Boolean lessThan (Ordered other);
        // Return True if ‘this’ object precedes ‘other’ object in the ordering.
    }

The portability of Java has been a significant factor in the rapid spread of its popularity, particularly for Web programming.

6.6.2 Interfaces

A Java class, as usual in OOP, is a compile-time entity whose run-time instances are objects. An interface declaration is similar to a class declaration, but introduces only abstract methods and constants. Thus an interface has no instances.

A class may implement an interface by providing definitions for each of the methods declared in the interface. (The class may also make use of the values of the constants declared in the interface.)

A Java class can inherit from at most one parent class but it may inherit from several interfaces provided, of course, that it implements all of them. Consequently, the class hierarchy is a rooted tree but the interface hierarchy is a directed acyclic graph. Interfaces provide a way of describing and factoring the behaviour of classes.

Listing 26 shows a couple of simple examples of interfaces. The first interface ensures that instances of a class can be compared. (In fact, all Java classes inherit the function equals from class Object that can be redefined for comparison in a particular class.) If we want to store instances of a class in an ordered set, we need a way of sorting them as well: this requirement is specified by the second interface.

Class Widget is required to implement equalTo and lessThan:

    class Widget implements Ordered
    {
        . . . .
    }

6.6.3 Exception Handling

The principal components of exception handling in Java are as follows:

. There is a class hierarchy rooted at class Exception. Instances of a subclass of class Exception are used to convey information about what has gone wrong. We will call such instances “exceptions”.

. The throw statement is used to signal the fact that an exception has occurred. The argument of throw is an exception.

. The try statement is used to attempt an operation and handle any exceptions that it raises.

. A try statement may use one or more catch statements to handle exceptions of various types.

The following simple example indicates the way in which exceptions might be used. Listing 27 shows the declaration of an exception for division by zero. The next step is to declare another exception for negative square roots, as shown in Listing 28.

Listing 27: Declaring a Java exception class

    public class DivideByZero extends Exception
    {
        DivideByZero()
        {
            super("Division by zero");
        }
    }

Listing 28: Another Java exception class

    public class NegativeSquareRoot extends Exception
    {
        public float value;
        NegativeSquareRoot(float badValue)
        {
            super("Square root with negative argument");
            value = badValue;
        }
    }

Next, we suppose that either of these problems may occur in a particular computation, as shown in Listing 29. Finally, we write some code that tries to perform the computation, as shown in Listing 30.

Listing 29: Throwing exceptions

    public void bigCalculation ()
        throws DivideByZero, NegativeSquareRoot
    {
        . . . .
        if (. . . .) throw new DivideByZero();
        . . . .
        if (. . . .) throw new NegativeSquareRoot(x);
        . . . .
    }

Listing 30: Using the exceptions

    {
        try
        {
            bigCalculation();
        }
        catch (DivideByZero d)
        {
            System.out.println("Oops! divided by zero.");
        }
        catch (NegativeSquareRoot n)
        {
            System.out.println("Argument for square root was " + n.value);
        }
    }

6.6.4 Concurrency

Java provides concurrency in the form of threads. Threads are defined by a class in the standard library. Java threads have been criticized on (at least) two grounds.

. The synchronization methods provided by Java (based on the early concepts of semaphores and monitors) are out of date and do not benefit from more recent research in this area.

. The concurrency mechanism is insecure. Brinch Hansen (1999) argues that Java’s concurrency mechanisms are unsafe and that such mechanisms must be defined in the base language and checked by the compiler, not provided separately in a library.

Exercise 29. Byte codes provide portability. Can you suggest any other advantages of using byte codes?

6.7 Kevo

All of the OOPLs previously described in this section are class-based languages. In a class-based language, the programmer defines one or more classes and, during execution, instances of these classes are created and become the objects of the system. (In Smalltalk, classes are run-time objects, but Smalltalk is nevertheless a class-based language.) There is another, smaller, family of OOPLs that use prototypes rather than classes.

Although there are several prototype-based OOPLs, we discuss only one of them here. Antero Taivalsaari (1993) has given a thorough account of the motivation and design of his language, Kevo. Taivalsaari (1993, page 172) points out that class-based OOPLs, starting with Simula, are based on an Aristotelian view in which

the world is seen as a collection of objects with well-defined properties arranged [according] to a hierarchical taxonomy of concepts.

The problem with this approach is that there are many classes which people understand intuitively but which are not easily defined in taxonomic terms. Common examples include “book”, “chair”, “game”, etc.

Prototypes provide an alternative to classes. In a prototype-based OOPL, each new type of object is introduced by defining a prototype or exemplar, which is itself an object. Identical copies of a prototype (as many as needed) are created by cloning. Alternatively, we can clone a prototype, modify it in some way, and use the resulting object as a prototype for another collection of identical objects.

Listing 31: Object Window

    REF Window
    Object.new → Window
    Window ADDS
        VAR rect
        Prototypes.Rectangle.new → rect
        VAR contents
        METHOD drawFrame . . . .;
        METHOD drawContents . . . .;
        METHOD refresh
            self.drawFrame
            self.drawContents;
    ENDADDS;

Listing 32: Object Title Window

    REF TitleWindow
    Window.new → TitleWindow
    TitleWindow ADDS
        VAR title
        "Anonymous" → title
        METHOD drawFrame . . . .;
    ENDADDS;

The modified object need not be self-contained. Some of its methods will probably be the same as those of the original object, and these methods can be delegated to the original object rather than replicated in the new object. Delegation thus plays the role of inheritance in a prototype-based language and, for this reason, prototype-based languages are sometimes called delegation languages. This is misleading, however, because delegation is not the only possible mechanism for inheritance with prototypes.
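Delegation can be sketched in C++ with objects that hold named method slots and an optional parent. This is an illustration of the general idea only, not a rendering of any particular prototype language: the type Proto, the function derive, and the slot names are all hypothetical. A message that an object cannot answer itself is passed to its parent, while self continues to denote the original receiver.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal prototype object: named slots plus an optional
// parent to which unknown messages are delegated.
struct Proto {
    using Method = std::function<std::string(const Proto& self)>;

    std::map<std::string, Method> slots;
    std::shared_ptr<const Proto> parent;    // delegation target, may be null

    // Look for the method here first, then along the delegation chain.
    std::string send(const std::string& name) const { return send(name, *this); }

private:
    std::string send(const std::string& name, const Proto& self) const
    {
        auto it = slots.find(name);
        if (it != slots.end())
            return it->second(self);         // run it on the original receiver
        if (parent)
            return parent->send(name, self); // delegate
        return "doesNotUnderstand:" + name;
    }
};

// Create a new, initially empty object that delegates to `prototype`.
inline std::shared_ptr<Proto> derive(std::shared_ptr<const Proto> prototype)
{
    auto child = std::make_shared<Proto>();
    child->parent = std::move(prototype);
    return child;
}
```

If a window object has slots drawFrame and refresh, and refresh sends drawFrame to self, then an object derived from the window that replaces only drawFrame will use its own frame when refresh is delegated to the window. No class is involved at any point.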

Example 16: Objects in Kevo. This example illustrates how new objects are introduced in Kevo; it is based on (Taivalsaari 1993, pages 187–188). Listing 31 shows the introduction of a window object. The first line, REF Window, declares a new object with no properties. The second line defines Window as a clone of Object. In Kevo terminology, Window receives the basic printing and creation operations defined for Object. In the next block of code, properties are added to the window using the module operation ADDS.

The window has two components: rect and contents. The properties of rect are obtained from the prototype Rectangle. The properties of contents are left unspecified in this example. Finally, three new methods are added to Window. The first two are left undefined here, but the third, refresh, draws the window frame first and then the contents.

Listing 32 shows how to add a title to a window. It declares a new object, TitleWindow, with Window as a prototype. Then it adds a new variable, title, with initial value "Anonymous", and provides a new method drawFrame that overrides Window.drawFrame and draws a frame with a title.


6.8 Other OOPLs

The OOPLs listed below are interesting, but lack of time precludes further discussion of them.

Beta. Beta is a descendant of Simula, developed in Denmark. It differs in many respects from “mainstream” OOPLs.

Blue. Blue (Kolling 1999) is a language and environment designed for teaching OOP. The language was influenced to some extent by Dee (Grogono 1991b). The environment is interesting in that users work directly with “objects” rather than program text and generated output.

CLOS. The Common LISP Object System is an extension to LISP that provides OOP. It has a number of interesting features, including “multi-methods”: the choice of a function is based on the class of all of its arguments rather than its first (possibly implicit) argument. CLOS also provides “before” and “after” methods: code defined in a superclass that is executed before or after the invocation of the function in a class.

Self. Self is perhaps the best-known prototype language. Considerable effort has been put into efficient implementation, to the extent that Self programs run at up to half of the speed of equivalent C++ programs. This is an impressive achievement, given that Self is considerably more flexible than C++.

6.9 Issues in OOP

OOP is the programming paradigm that currently dominates industrial software development. In this section, we discuss various issues that arise in OOP.

6.9.1 Algorithms + Data Structures = Programs

The distinction between code and data was made at an early stage. It is visible in FORTRAN and reinforced in Algol 60. Algol provided the recursive block structure for code but almost no facilities for data structuring. Algol 68 and Pascal provided recursive data structures but maintained the separation between code and data.

LISP brought algorithms (in the form of functions) and data structures together, but:

. the mutability of data was downplayed;

. subsequent FPLs reverted to the Algol model, separating code and data.

Simula started the move back towards the von Neumann Architecture at a higher level of abstraction than machine code: the computer was recursively divided into smaller computers, each with its own code and stack. This was the feature of Simula that attracted Alan Kay and led to Smalltalk (Kay 1996).

Many OOPLs since Smalltalk have withdrawn from its radical ideas. C++ was originally called “C with classes” (Stroustrup 1994) and modern C++ remains an Algol-like systems programming language that supports OOP. Ada has shown even more reluctance to move in the direction of OOP, although Ada 9X has a number of OO features.


6.9.2 Values and Objects

Some data behave like mathematical objects and other data behave like representations of physical objects (MacLennan 1983).

Consider the sets { 1, 2, 3 } and { 2, 3, 1 }. Although these sets have different representations (the ordering of the members is different), they have the same value because in set theory the only important property of a set is the members that it contains: “two sets are equal if and only if they contain the same members”.

Consider two objects representing people:

    [name="Peter", age=55]
    [name="Peter", age=55]

Although these objects have the same value, they might refer to different people who have the same name and age. The consequences of assuming that these two objects denote the same person could be serious: in fact, innocent people have been arrested and jailed because they had the misfortune to have the same name and social security number as a criminal.
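The two notions of sameness can be written down directly. The following C++ sketch is illustrative only: the type Person and the function names sameValue and sameObject are hypothetical. The first function compares attributes; the second compares identity, which in C++ can be approximated by the address of the object.

```cpp
#include <string>

struct Person {
    std::string name;
    int age;
};

// Value comparison: equal attributes.
inline bool sameValue(const Person& a, const Person& b)
{
    return a.name == b.name && a.age == b.age;
}

// Identity comparison: the very same object in memory.
inline bool sameObject(const Person& a, const Person& b)
{
    return &a == &b;
}
```

For two separately created records with name "Peter" and age 55, sameValue holds but sameObject does not; which of the two an application should mean by “equal” is a design decision, not something the language can settle.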

In the following comparison of values and objects, each statement about values is paired with the corresponding statement about objects.

Values: Integers have values. For example, the integer 6 is a value.
Objects: Objects possess values as attributes. For example, a counter might have 6 as its current value.

Values: Values are abstract. The value 7 is the common property shared by all sets with cardinality 7.
Objects: Objects are concrete. An object is a particular collection of subobjects and values. A counter has one attribute: the value of the count.

Values: Values are immutable (i.e., do not change).
Objects: Objects are usually, but not necessarily, mutable (i.e., may change).

Values: It does not make sense to change a value. If we change the value 3, it is no longer 3.
Objects: It is meaningful, and often useful, to change the value of an object. For example, we can increment a counter.

Values: Values need representations. We need representations in order to compute with values. The representations 7, VII, sept, sju, and a particular combination of bits in the memory of a computer are different representations of the same abstract value.
Objects: Objects need representations. An object has subobjects and values as attributes.

Values: It does not make sense to talk about the identity of a value. Suppose we evaluate 2 + 3, obtaining 5, and then 7 − 2, again obtaining 5. Is it the same 5 that we obtain from the second calculation?
Objects: Identity is an important attribute of objects. Even objects with the same attributes may have their own, unique identities: consider identical twins.

Values: We can copy a representation of a value, but copying a value is meaningless.
Objects: Copying an object certainly makes sense. Sometimes, it may even be useful, as when we copy objects from memory to disk, for example. However, multiple copies of an object with a supposedly unique identity can cause problems, known as “integrity violations” in the database community.

Values: We consider different representations of the same value to be equal.
Objects: An object is equal to itself. Whether two distinct objects with equal attributes are equal depends on the application. A person who is arrested because his name and social insurance number are the same as those of a wanted criminal may not feel “equal to” the criminal.

Values: Aliasing is not a problem for values. It is immaterial to the programmer, and should not even be detectable, whether two instances of a value are stored in the same place or in different places.
Objects: All references to a particular object should refer to the same object. This is sometimes called “aliasing”, and is viewed as harmful, but it is in fact desirable for objects with identity.

Values: Functional and logical PLs usually provide values.
Objects: Object oriented PLs should provide objects.

Values: Programs that use values only have the property of referential transparency.
Objects: Programs that provide objects do not provide referential transparency, but may have other desirable properties such as encapsulation.

Values: Small values can be held in machine words. Large values are best represented by references (pointers to the data structure representing the value).
Objects: Very small objects may be held in machine words. Most objects should be represented by references (pointers to the data structure holding the object).

Values: The assignment operator for values should be a reference assignment. There is no need to copy values, because they do not change.
Objects: The assignment operator for objects should be a reference assignment. Objects should not normally be copied, because copying endangers their integrity.

The distinction between values and objects is not as clear as these comparisons suggest. When we look at programming problems in more detail, doubts begin to arise.

Exercise 30. Should sets be values? What about sets with mutable components? Does the model handle objects with many to one abstraction functions? What should a PL do about values and objects?

Conclusions:

. The CM must provide values, may provide objects.

. What is “small” and what is “large”? This is an implementation issue: the compiler must decide.

. What is “primitive” and what is “user defined”? The PL should minimize the distinction as much as possible.

. Strings should behave like values. Since they have unbounded size, many PLs treat them as objects.

. Copying and comparing are semantic issues. Compilers can provide default operations and “building bricks”, but they cannot provide fully appropriate implementations.

6.9.3 Classes versus Prototypes

The most widely-used OOPLs (C++, Smalltalk, and Eiffel) are class-based. Classes provide a visible program structure that seems to be reassuring to designers of complex software systems.

Prototype languages are a distinct but active minority. The relation between classes and prototypes is somewhat analogous to the relation between procedural (FORTRAN and Algol) and functional (LISP) languages in the 60s.

Prototypes provide more flexibility and are better suited to interactive environments than classes. Although large systems have been built using prototype languages, these languages seem to be more often used for exploratory programming.

6.9.4 Types versus Objects

In most class-based PLs, functions are associated with objects. This has the effect of making binary operations asymmetric. For example, x = y is interpreted as x.equal(y) or, approximately, “send the message equal with argument y to the object x”. It is not clear whether, with this interpretation, y = x means the same as x = y. The problem is complicated further by inheritance: if class C is a subclass of class P, can we compare an instance of C to an instance of P, and is this comparison commutative?

Some of the problems can be avoided by associating functions with types rather than with objects. This is the approach taken by CLU and some of the “functional” OOPLs. An expression such as x = y is interpreted in the usual mathematical sense as equal(x,y). However, even this approach has problems if x and y do not have the same type.
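The asymmetry is easy to reproduce. In the following C++ sketch the class names P and C follow the text, while the attributes x and y and the function symmetricEqual are hypothetical. Each class defines equal in what looks like a reasonable way, yet p.equal(c) and c.equal(p) disagree. The free function shows one possible policy for a comparison attached to the types rather than to a receiver; it is an assumption made for illustration, not the only defensible choice.

```cpp
// Hypothetical classes P and C, following the names used in the text.
class P {
public:
    explicit P(int x) : x(x) {}
    virtual ~P() = default;
    // "Send the message equal with argument other to this object."
    virtual bool equal(const P& other) const { return x == other.x; }
protected:
    int x;
};

class C : public P {
public:
    C(int x, int y) : P(x), y(y) {}
    // A C is equal only to another C with the same x and y.
    bool equal(const P& other) const override
    {
        const C* c = dynamic_cast<const C*>(&other);
        return c != nullptr && P::equal(other) && y == c->y;
    }
private:
    int y;
};

// A comparison associated with the types: symmetric by construction.
inline bool symmetricEqual(const P& a, const P& b)
{
    return a.equal(b) && b.equal(a);
}
```

With p constructed as P(1) and c as C(1, 2), the parent's method sees only x and answers true for p.equal(c), while the child's method requires its argument to be a C and answers false for c.equal(p).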

6.9.5 Pure versus Hybrid Languages

Smalltalk, Eiffel, and Java are pure OOPLs: they can describe only systems of objects and programmers are forced to express everything in terms of objects.

C++ is a hybrid PL in that it supports both procedural and object styles. Once the decision was made to extend C rather than starting afresh (Stroustrup 1994, page 43), it was inevitable that C++ would be a hybrid language. In fact, its precursor, “C with classes”, did not even claim to be object oriented.

A hybrid language can be dangerous for inexperienced programmers because it encourages a hybrid programming style. OO design is performed perfunctorily and design problems are solved by reverting to the more general, but lower level, procedural style. (This is analogous to the early days of “high level” programming when programmers would revert to assembler language to express constructs that were not easy or possible to express at a high level.)

For experienced programmers, and especially those trained in OO techniques, a hybrid language is perhaps better than a pure OOPL. These programmers will use OO techniques most of the time and will only resort to procedural techniques when these are clearly superior to the best OO solution. As OOPLs evolve, these occasions should become rarer.

Perhaps “pure or hybrid” is the wrong question. A better question might be: suppose a PL provides OOP capabilities. Does it need to provide anything else? If so, what? One answer is: FP capabilities. The multi-paradigm language LEDA (Budd 1995) demonstrates that it is possible to provide functional, logic, and object programming in a single language. However, LEDA provides these in a rather ad hoc way. A more tightly-integrated design would be preferable, but is harder to accomplish.

6.9.6 Closures versus Classes

The most powerful programming paradigms developed so far are those that do not separate code and data but instead provide “code and data” units of arbitrary size. This can be accomplished by high order functions and mutable state, as in Scheme (Section 4.2) or by classes. Although each method has advantages for certain applications, the class approach seems to be better in most practical situations.
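The two approaches can be placed side by side in a small C++ sketch. The names makeCounter and Counter are hypothetical. Both versions package a piece of mutable state together with the code that operates on it; the closure does so anonymously, while the class gives the unit a name and an obvious place for further operations.

```cpp
#include <functional>

// Closure: the captured variable `count` is private, mutable state.
inline std::function<int()> makeCounter()
{
    int count = 0;
    return [count]() mutable { return ++count; };
}

// Class: the same state and behaviour, with a name and room for
// additional operations such as current().
class Counter {
public:
    int next() { return ++count; }
    int current() const { return count; }
private:
    int count = 0;
};
```

Each call of makeCounter yields an independent counter, just as each Counter instance has its own count. Adding a second operation to the closure version requires returning several functions that share state, which is one reason the class form tends to scale better in practice.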

6.9.7 Inheritance

Inheritance is one of the key features of OOP and at the same time one of the most troublesome. Conceptually, inheritance can be defined in a number of ways and has a number of different uses. In practice, most (but not all!) OOPLs provide a single mechanism for inheritance and leave it to the programmer to decide how to use the mechanism. Moreover, at the implementation level, the varieties of inheritance all look much the same, a fact that discourages PL designers from providing multiple mechanisms.

One of the clearest examples of the division is Meyer’s (1988) “marriage of convenience” in which the class ArrayStack inherits from Array and Stack. The class ArrayStack behaves like a stack (it has functions such as push, pop, and so on) and is represented as an array. There are other ways of obtaining this effect (notably, the class ArrayStack could have an array object as an attribute) but Meyer maintains that inheritance is the most appropriate technique, at least in Eiffel.

Whether or not we agree with Meyer, it is clear that two different relations are involved. The relation ArrayStack ∼ Stack is not the same as the relation ArrayStack ∼ Array. Hence the question: should an OOPL distinguish between these relations or not?

Based on the “marriage of convenience” the answer would appear to be “yes”: two distinct ends are achieved and the means by which they are achieved should be separated. A closer examination, however, shows that the answer is by no means so clear-cut. There are many practical programming situations in which code in a parent class can be reused in a child class even when interface consistency is the primary purpose of inheritance.

6.10 A Critique of C++

C++ is both the most widely-used OOPL and the most heavily criticized. Further criticism might appear to be unnecessary. The purpose of this section is to show how C++ fails to fulfill its role as the preferred OOPL.

The first part of the following discussion is based on a paper by LaLonde and Pugh (1995) in which they discuss C++ from the point of view of a programmer familiar with the principles of OOP, exemplified by a language such as Smalltalk, but who is not familiar with the details of C++.

We begin by writing a specification for a simple BankAccount class with an owner name, a balance, and functions for making deposits and withdrawals: see Listing 33.


. Data members are prefixed with “p” to distinguish them from parameters with similar names.

. The instance variable pTrans is a pointer to an array containing the transactions that have taken place for this account. Functions that operate on this array have been omitted.

We can create instances of BankAccount on either the stack or the heap:

    BankAccount account1("John"); // on the stack
    BankAccount *account2 = new BankAccount("Jane"); // on the heap

The first account, account1, will be deleted automatically at the end of the scope containing the declaration. The second account, account2, will be deleted only when the program executes a delete operation. In both cases, the destructor will be invoked.

The notation for accessing functions depends on whether the variable refers to the object directly or is a pointer to the object.

    account1.balance()
    account2->balance()

A member function can access the private parts of another instance of the same class, as shown in Listing 34. This comes as a surprise to Smalltalk programmers because, in Smalltalk, an object can access only its own components.

The next step is to define the member functions of the class BankAccount: see Listing 35.

Listing 36 shows a simple test of the class BankAccount. This test introduces two bugs.

. The array account1.pTrans is lost because it is over-written by account2.pTrans.

. At the end of the test, account2.pTrans will be deleted twice, once when the destructor for account1 is called, and again when the destructor for account2 is called.
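The aliasing behind both bugs can be reproduced in a few lines. The following is a minimal sketch, not the BankAccount class itself: Shallow and aliased_after_assignment are hypothetical names, and the struct deliberately has no destructor so that the example stays free of the double deletion it illustrates.

```cpp
#include <cassert>

// Hypothetical stand-in for a class that has a raw pointer member and
// no user-defined assignment operator.
struct Shallow {
    int *data;
};

// Returns true if default assignment leaves both objects sharing one array.
bool aliased_after_assignment() {
    Shallow a{new int[4]};
    Shallow b{new int[4]};
    int *original = a.data;   // remember a's array; a is about to forget it
    a = b;                    // memberwise copy: only the pointer is copied
    bool shared = (a.data == b.data);
    // Clean up by hand. A destructor in Shallow would have deleted b.data
    // twice and would never have deleted 'original'.
    delete [] original;
    delete [] b.data;
    return shared;
}
```

Calling aliased_after_assignment() reports whether the two objects ended up pointing at the same array, which is exactly the situation of account1 and account2 after the assignment in Listing 36.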

Although most introductions to C++ focus on stack objects, it is more interesting to consider heap objects: Listing 37.

The code in Listing 37, which creates three special accounts, is rather specialized. Listing 38 generalizes it so that we can enter the account owners’ names interactively.

Listing 33: Declaring a Bank Account

class BankAccount
{
public:
    BankAccount(char *newOwner = "");
    ~BankAccount();
    void owner(char *newOwner);
    char *owner();
    double balance();
    void credit(double amount);
    void debit(double amount);
    void transfer(BankAccount *other, double amount);

private:
    char *pOwner;
    double pBalance;
    Transaction *pTrans[MAXTRANS];
};


Listing 34: Accessing Private Parts

void BankAccount::transfer(BankAccount *other, double amount)
{
    pBalance += amount;
    other->pBalance -= amount;
}

Listing 35: Member Functions for the Bank Account

BankAccount::BankAccount(char *newOwner)
{
    pOwner = newOwner;
    pBalance = 0.0;
}

BankAccount::~BankAccount()
{
    // Delete individual transactions
    delete [] pTrans; // Delete the array of transactions
}

void BankAccount::owner(char *newOwner)
{
    pOwner = newOwner;
}

char *BankAccount::owner()
{
    return pOwner;
}
. . . .

Listing 36: Testing the Bank Account

void test()
{
    BankAccount account1("Anne");
    BankAccount account2("Bill");
    account1 = account2;
}

Listing 37: Using the Heap

BankAccount **accounts = new BankAccount * [3];
accounts[0] = new BankAccount("Anne");
accounts[1] = new BankAccount("Bill");
accounts[2] = new BankAccount("Chris");
// Account operations omitted.
// Delete everything.
for (int i = 0; i < 3; i++)
    delete accounts[i];
delete [] accounts;


Listing 38: Entering a Name

BankAccount **accounts = new BankAccount * [3];
char * buffer = new char[100];
for (int i = 0; i < 3; i++)
{
    cout << "Please enter your name: ";
    cin.getline(buffer, 100);
    accounts[i] = new BankAccount(buffer);
}
// Account operations omitted.
// Delete everything.
for (int i = 0; i < 3; i++)
    delete accounts[i];
delete [] accounts;
delete [] buffer;

Listing 39: Adding buffers for names

// char * buffer = new char[100]; (deleted)
for (int i = 0; i < 3; i++)
{
    cout << "Please enter your name: ";
    char * buffer = new char[100];
    cin.getline(buffer, 100);
    accounts[i] = new BankAccount(buffer);
}
// Account operations omitted.
// Delete everything.
for (int i = 0; i < 3; i++)
{
    delete [] accounts[i]->owner();
    delete accounts[i];
}
delete [] accounts;

The problem with this code is that all of the bank accounts have the same owner: that of the last name entered. This is because each account contains a pointer to the unique buffer. We can fix this by allocating a new buffer for each account, as in Listing 39.
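The shared-buffer effect is easy to observe in isolation. The sketch below is an illustration under stated assumptions rather than the code of the notes: PointerAccount and CopyAccount are hypothetical types, and CopyAccount uses std::string to take its own copy of the characters, which is a different remedy from the per-account buffers of Listing 39.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical account that stores the caller's pointer, as in Listing 35.
struct PointerAccount {
    const char *owner;
};

// Hypothetical account that copies the characters into its own storage.
struct CopyAccount {
    std::string owner;
};

// Two accounts created from one reused buffer end up with the same name.
bool pointer_accounts_share_name() {
    char buffer[100];
    std::strcpy(buffer, "Anne");
    PointerAccount first{buffer};
    std::strcpy(buffer, "Bill");   // overwrites the name 'first' points to
    PointerAccount second{buffer};
    return std::strcmp(first.owner, second.owner) == 0;
}

// Accounts that copy the characters keep the name they were given.
bool copy_accounts_keep_names() {
    char buffer[100];
    std::strcpy(buffer, "Anne");
    CopyAccount first{buffer};
    std::strcpy(buffer, "Bill");
    CopyAccount second{buffer};
    return first.owner == "Anne" && second.owner == "Bill";
}
```

Both functions reuse one buffer for two names; only the version that copies the characters preserves the first name.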

This version works, but is poorly designed. Deleting the owner string should be the responsibility of the object, not the client. The destructor for BankAccount should delete the owner string. After making this change, we can remove the deletion of the owner string from the client code above: Listing 40.

Unfortunately, this change introduces another problem. Suppose we use the loop to create two accounts and then create the third account with this statement:

accounts[2] = new BankAccount("Tina");

The account destructor fails because "Tina" was not allocated using new. To correct this problem, we must modify the constructor for BankAccount. The rule we must follow is that data must be allocated with the constructor and deallocated with the destructor, or controlled entirely outside the object. Listing 41 shows the new constructor.

Listing 40: Correcting the Destructor

BankAccount::~BankAccount()
{
    delete [] pOwner;
    // Delete individual transactions
    delete [] pTrans; // Delete the array of transactions
}

Listing 41: Correcting the Constructor

BankAccount::BankAccount(char *newOwner)
{
    pOwner = new char[strlen(newOwner) + 1];
    strcpy(pOwner, newOwner);
}

The next problem occurs when we try to use the overloaded function owner() to change the name of an account owner.

accounts[2]->owner("Fred");

Once again, we have introduced two bugs:

. The original owner of accounts[2] will be lost.

. When the destructor is invoked for accounts[2], it will attempt to delete "Fred", which was not allocated by new.

To correct these problems, we must rewrite the member function owner(), as shown in Listing 42.

We now have bank accounts that work quite well, provided that they are allocated on the heap. If we try to mix heap objects and stack objects, however, we again encounter difficulties: Listing 43. The error occurs because transfer expects the address of an account rather than an actual account. We correct it by introducing the address-of operator, &.

account1->transfer(&account3, 10.0);

We could make things easier for the programmer who is using accounts by declaring an overloaded version of transfer(), as shown in Listing 44. Note that, in this version, we must use the operator “.” rather than “->”. This is because other behaves like an object, although in fact an address has been passed.

The underlying problem here is that references (&) work best with stack objects and pointers (*) work best with heap objects. It is awkward to mix stack and heap allocation in C++ because two sets of functions are needed.

Listing 42: Correcting the Owner Function

void BankAccount::owner(char *newOwner)
{
    delete [] pOwner;
    pOwner = new char[strlen(newOwner) + 1];
    strcpy(pOwner, newOwner);
}

Listing 43: Stack and Heap Allocation

BankAccount *account1 = new BankAccount("Anne");
BankAccount *account2 = new BankAccount("Bill");
BankAccount account3("Chris");
account1->transfer(account2, 10.0); // OK
account1->transfer(account3, 10.0); // Error!

Listing 44: Revising the Transfer Function

void BankAccount::transfer(BankAccount &other, double amount)
{
    pBalance += amount;
    other.pBalance -= amount;
}

If stack objects are used without references, the compiler is forced to make copies. To implement the code shown in Listing 45, C++ uses the (default) copy constructor for BankAccount twice: first to make a copy of account2 to pass to test() and, second, to make another copy of account2 that will become myAccount. At the end of the scope, these two copies must be deleted.

The destructors will fail because the default copy constructor makes a shallow copy: it copies the pTrans pointers of the accounts but not the arrays they point to. The solution is to write our own copy constructor and assignment overload for class BankAccount, as in Listing 46. The function pDuplicate() makes a deep copy of an account; it must make copies of all of the components of a bank account object. The function pDelete() destroys all components of the old object; it has the same effect as the destructor. It is recommended practice in C++ to write these functions for any class that contains pointers.
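To make the recommendation concrete, here is a small sketch of a class that owns one heap-allocated string and supplies all three functions. It is an illustration only: Name, duplicate, release and copies_are_independent are hypothetical names, with duplicate and release playing the roles that pDuplicate() and pDelete() play in Listing 46.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class that owns one heap-allocated, null-terminated string.
class Name {
public:
    explicit Name(const char *source) { duplicate(source); }
    Name(const Name &other) { duplicate(other.chars); }   // copy constructor
    Name &operator=(const Name &rhs) {                    // assignment
        if (this == &rhs)
            return *this;        // self-assignment: nothing to do
        release();
        duplicate(rhs.chars);
        return *this;
    }
    ~Name() { release(); }                                // destructor

    const char *text() const { return chars; }
    void set_first(char c) { chars[0] = c; }

private:
    char *chars;

    // Allocate fresh storage and copy the characters (deep copy).
    void duplicate(const char *source) {
        chars = new char[std::strlen(source) + 1];
        std::strcpy(chars, source);
    }
    // Free the storage owned by this object.
    void release() { delete [] chars; }
};

// Copies made by construction and by assignment must not share storage.
bool copies_are_independent() {
    Name a("Anne");
    Name b(a);           // deep copy via the copy constructor
    Name c("Chris");
    c = a;               // deep copy via assignment
    a.set_first('E');    // must not affect b or c
    return std::strcmp(a.text(), "Enne") == 0 &&
           std::strcmp(b.text(), "Anne") == 0 &&
           std::strcmp(c.text(), "Anne") == 0;
}
```

Because every Name owns its own array, each destructor deletes a different block and the double deletion described above cannot occur.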

LaLonde and Pugh continue with a discussion of infix operators for user-defined classes. If the objects are small (complex numbers, for example), it is reasonable to store the objects on the stack and to use call by value. C++ is well-suited to this approach: storage management is automatic and the only overhead is the copying of values during function call and return.

If the objects are large, however, there are problems. Suppose that we want to provide the standard algebraic operations (+, −, ∗) for 4 × 4 matrices. (This would be useful for graphics programming, for example.) Since a matrix occupies 4 × 4 × 8 = 128 bytes, copying a matrix is relatively expensive and the stack approach will be inefficient. But if we write functions that work with pointers to matrices, it is difficult to manage allocation and deallocation. Even a simple function call such as

project(A + B);

will create a heap object corresponding to A + B but provides no opportunity to delete this object.

Listing 45: Copying

BankAccount test(BankAccount account)
{
    return account;
}

BankAccount myAccount;
myAccount = test(account2);


Listing 46: Copy Constructors

BankAccount::BankAccount(BankAccount &account)
{
    pDuplicate(account);
}

BankAccount & BankAccount::operator = (BankAccount &rhs)
{
    if (this == &rhs)
        return *this;
    pDelete();
    pDuplicate(rhs);
    return *this;
}

Listing 47: Using extended assignment operators

A = C;
A *= D;
A += B;

(The function cannot delete it, because it cannot tell whether its argument is a temporary.) The same problem arises with expressions such as

A = B + C * D;

It is possible to implement the extended assignment operators, such as +=, without losing objects. But this approach reduces us to a kind of assembly language style of programming, as shown in Listing 47.
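The following sketch shows what the extended assignment operators might look like for 4 × 4 matrices. It is an assumption-laden illustration: Mat4, scaled_identity and b_plus_c_times_d are hypothetical names, and the only scratch storage is a local variable inside operator*=, so no heap object is ever created or lost.

```cpp
#include <array>
#include <cassert>

// Hypothetical 4 x 4 matrix of doubles, stored row by row.
struct Mat4 {
    std::array<double, 16> v{};   // all elements start at zero

    double &at(int r, int c) { return v[r * 4 + c]; }
    double at(int r, int c) const { return v[r * 4 + c]; }

    // Add rhs into this matrix in place.
    Mat4 &operator+=(const Mat4 &rhs) {
        for (int i = 0; i < 16; i++)
            v[i] += rhs.v[i];
        return *this;
    }

    // Multiply this matrix by rhs; the product is built in a stack local.
    Mat4 &operator*=(const Mat4 &rhs) {
        Mat4 product;
        for (int r = 0; r < 4; r++)
            for (int c = 0; c < 4; c++)
                for (int k = 0; k < 4; k++)
                    product.at(r, c) += at(r, k) * rhs.at(k, c);
        v = product.v;
        return *this;
    }
};

// Helper for the example: k times the identity matrix.
Mat4 scaled_identity(double k) {
    Mat4 m;
    for (int i = 0; i < 4; i++)
        m.at(i, i) = k;
    return m;
}

// Computes B + C * D in the style of Listing 47.
Mat4 b_plus_c_times_d(const Mat4 &B, const Mat4 &C, const Mat4 &D) {
    Mat4 A = C;
    A *= D;
    A += B;
    return A;
}
```

The three statements in b_plus_c_times_d correspond line by line to Listing 47: the expression A = B + C * D has to be unrolled by hand into a sequence of updates to A.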

These observations by LaLonde and Pugh seem rather hostile towards C++. They are making a comparison between C++ and Smalltalk based only on ease of programming. The comparison is one-sided because they do not discuss factors that might make C++ a better choice than Smalltalk in some circumstances, such as when efficiency is important. Nevertheless, the discussion makes an important point: the “stack model” (with value semantics) is inferior to the “heap model” (with reference semantics) for many OOP applications. It is significant that most language designers have chosen the heap model for OOP (Simula, Smalltalk, CLU, Eiffel, Sather, Dee, Java, Blue, amongst others) and that C++ is in the minority on this issue.

Another basis for criticizing C++ is that it has become, perhaps not intentionally, a hacker’s language. A sign of good language design is that the features of the language are used most of the time to solve the problems that they were intended to solve. A sign of a hacker’s language is that language “gurus” introduce “hacks” that solve problems by using features in bizarre ways for which they were not originally intended.

Input and output in C++ is based on “streams”. A typical output statement looks something like this:

cout << "The answer is " << answer << '.' << endl;

The operator << requires a stream reference as its left operand and an object as its right operand. Since it returns a stream reference (usually its left operand), it can be used as an infix operator in compound expressions such as the one shown. The notation seems clear and concise and, at first glance, appears to be an elegant way of introducing output and input (with the operator >>) capabilities.
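As an illustration of the convention just described, the sketch below overloads << for a user-defined type. Money and describe are hypothetical names introduced for the example, the amount is assumed to be non-negative, and a string stream is used in place of cout so that the result can be inspected.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical value type: an amount of money held as a count of cents.
struct Money {
    long cents;   // assumed non-negative in this sketch
};

// The inserter takes a stream reference on the left and returns it,
// which is what allows several << operations to be chained.
std::ostream &operator<<(std::ostream &os, const Money &m) {
    os << m.cents / 100 << '.';
    if (m.cents % 100 < 10)
        os << '0';            // pad to two digits after the point
    os << m.cents % 100;
    return os;
}

// Builds a sentence with chained insertions into a string stream.
std::string describe(const Money &m) {
    std::ostringstream out;
    out << "The answer is " << m << '.';
    return out.str();
}
```

For example, describe(Money{1205}) produces the text "The answer is 12.05.".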


Listing 48: A Manipulator Class

class ManipulatorForInts
{
    friend ostream & operator << (ostream &, const ManipulatorForInts &);
public:
    ManipulatorForInts (int (ios::*)(int), int);
    . . . .
private:
    int (ios::*memberfunc)(int);
    int value;
};

Obviously, << is heavily overloaded. There must be a separate function for each class for which instances can be used as the right operand. (<< is also overloaded with respect to its left operand, because there are different kinds of output streams.) In the example above, the first three right operands are a string, a double, and a character.

The first problem with streams is that, although they are superficially object oriented (a stream is an object), they do not work well with inheritance. Overloading provides static polymorphism, which is incompatible with dynamic polymorphism (recall Section 9.6). It is possible to write code that uses dynamic binding but only by providing auxiliary functions.

The final operand, endl, writes end-of-line to the stream and flushes it. It is a global function with signature

ostream & endl (ostream & s)

A programmer who needs additional “manipulators” of this kind has no alternative but to add them to the global name space.

Things get more complicated when formatting is added to streams. For example, we could change the example above to:

cout << "The answer is " << setprecision(8) << answer << '.' << endl;

The first problem with this is that it is verbose. (The verbosity is not really evident in such a small example, but it is very obvious when streams are used, for example, to format a table of heterogeneous data.) The second, and more serious problem, is the implementation of setprecision. The way this is done is (roughly) as follows (Teale 1993, page 171). First, a class for manipulators with integer arguments is declared, as in Listing 48, and the inserter function is defined:

ostream & operator << (ostream & os, const ManipulatorForInts & m)
{
    (os.*m.memberfunc)(m.value);
    return os;
}

Second, the function that we need can be defined:

ManipulatorForInts setprecision (int n)
{
    return ManipulatorForInts (& ios::precision, n);
}
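The manipulator-object idea can be tried out with the standard library as follows. This is a simplified sketch and not the library's own definition: IntManipulator, apply_precision, set_prec and format_pi are hypothetical names, and an ordinary function pointer is used instead of the pointer to member function shown above.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical manipulator object: an action to perform on a stream
// together with the integer argument for that action.
struct IntManipulator {
    void (*action)(std::ios_base &, int);
    int value;
};

// The action used in this sketch: set the stream's precision.
void apply_precision(std::ios_base &s, int n) { s.precision(n); }

// Inserting the manipulator performs its action and writes nothing.
std::ostream &operator<<(std::ostream &os, const IntManipulator &m) {
    m.action(os, m.value);
    return os;
}

// Counterpart of setprecision(n) for this sketch.
IntManipulator set_prec(int n) { return IntManipulator{&apply_precision, n}; }

// Formats a constant with three significant digits.
std::string format_pi() {
    std::ostringstream out;
    out << set_prec(3) << 3.14159;
    return out.str();
}
```

The call set_prec(3) does no output itself; it only builds an object whose insertion into the stream changes the stream's state, which is why format_pi() yields "3.14".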

It is apparent that a programmer must be quite proficient in C++ to understand these definitions, let alone design similar functions. To make matters worse, the actual definitions are not as shown here, because that would require class declarations for many different types, but are implemented using templates.

Listing 49: File handles as Booleans

ifstream infile ("stats.dat");
if (infile) . . . .
if (!infile) . . . .

There are many other tricks in the implementation of streams. For example, we can use file handles as if they had boolean values, as in Listing 49. It seems that there is a boolean value, infile, that we can negate, obtaining !infile. In fact, infile can be used as a conditional because there is a function that converts infile to (void *), with the value NULL indicating that the file is not open, and !infile can be used as a conditional because the operator ! is overloaded for class ifstream to return an int which is 0 if the file is open.
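The same idiom can be exercised without a file by using a string stream. The sketch below is illustrative; count_ints and failed_stream_is_false are hypothetical names. Note that in standard C++ since C++11 the conversion is declared as an explicit operator bool rather than a conversion to (void *), but the visible behaviour in a condition is the same.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Counts the integers at the start of a text by using the stream itself
// as the loop condition: the loop ends when an extraction fails.
int count_ints(const std::string &text) {
    std::istringstream in(text);
    int value;
    int count = 0;
    while (in >> value)
        count++;
    return count;
}

// After a failed extraction, !stream is true.
bool failed_stream_is_false() {
    std::istringstream in("abc");
    int value = 0;
    in >> value;       // fails: "abc" is not an integer
    return !in;
}
```

In both functions the stream object appears where a boolean is expected, exactly as infile does in Listing 49.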

These are symptoms of an even deeper malaise in C++ than the problems pointed out by authors of otherwise excellent critiques such as (Joyner 1992; Sakkinen 1992). Most experienced programmers can get used to the well-known idiosyncrasies of C++, but few realize that frequently-used and apparently harmless constructs are implemented by subtle and devious tricks. Programs that look simple may actually conceal subtle traps.

6.11 Evaluation of OOP

. OOP provides a way of building programs by incremental modification.

. Programs can often be extended by adding new code rather than altering existing code.

. The mechanism for incremental modification without altering existing code is inheritance. However, there are several different ways in which inheritance can be defined and used, and some of them conflict.

. The same identifier can be used in various contexts to name related functions. For example, all objects can respond to the message “print yourself”.

. Class-based OOPLs can be designed so as to support information hiding (encapsulation), that is, the separation of interface and implementation. However, there may be conflicts between inheritance and encapsulation.

. An interesting, but rarely noted, feature of OOP is that the call graph is created and modified during execution. This is in contrast to procedural languages, in which the call graph can be statically inferred from the program text.

Section 6.9 contains a more detailed review of issues in OOP.

Exercise 31. Simula is 33 years old. Can you account for the slow acceptance of OOPLs by professional programmers? □

7 Backtracking Languages

The PLs that we discuss in this section differ in significant ways, but they have an important feature in common: they are designed to solve problems with multiple answers, and to find as many answers as required.

In mathematics, a many-valued expression can be represented as a relation or as a function returning a set. Consider the relation “divides”, denoted by | in number theory. Any integer greater than 1 has at least two divisors (itself and 1) and possibly others. For example, 1 | 6, 2 | 6, 3 | 6, and 6 | 6. The functional analog of the divides relation is the set-returning function divisors:

divisors(6) = { 1, 2, 3, 6 }
divisors(7) = { 1, 7 }
divisors(24) = { 1, 2, 3, 4, 6, 8, 12, 24 }
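A set-returning function of this kind is straightforward to write in a conventional language. The sketch below is one possible version, assuming a positive argument and using a sorted std::vector to stand for the set; it computes all of the answers at once rather than one at a time.

```cpp
#include <cassert>
#include <vector>

// Returns the divisors of n in increasing order (n is assumed positive).
std::vector<int> divisors(int n) {
    std::vector<int> result;
    for (int d = 1; d <= n; d++)
        if (n % d == 0)         // d divides n
            result.push_back(d);
    return result;
}
```

The caller receives the complete set even if it needs only the first answer, which is the cost that the one-at-a-time approach avoids.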

We could define functions that return sets in a FPL or we could design a language that works directly with relations. The approach that has been most successful, however, is to produce solutions one at a time. The program runs until it finds the first solution, reports that solution or uses it in some way, then backtracks to find another solution, reports that, and so on until all the choices have been explored or the user has had enough.

7.1 Prolog

The computational model (see Section 10.2) of a logic PL is some form of mathematical logic. Propositional calculus is too weak because it does not have variables. Predicate calculus is too strong because it is undecidable. (This means that there is no effective procedure that determines whether a given predicate is true.) Practical logic PLs are based on a restricted form of predicate calculus that has a decision procedure.

The first logic PL was Prolog, introduced by Colmerauer (1973) and Kowalski (1974). The discussion in this section is based on (Bratko 1990).

We can express facts in Prolog as predicates with constant arguments. All Prolog textbooks start with facts about someone’s family: see Listing 50. We read parent(pam, bob) as “Pam is a parent of Bob”. Thus Bob’s parents are Pam and Tom; similarly for the other facts. Note that, since constants in Prolog start with a lower case letter, we cannot write Pam, Bob, etc.

Prolog is interactive; it prompts with ?- to indicate that it is ready to accept a query. Prolog responds to simple queries about facts that it has been told with “yes” and “no”, as in Listing 51. If the query contains a logical variable (an identifier that starts with an upper case letter), Prolog attempts to find a value of the variable that makes the query true.

If it succeeds, it replies with the binding.

Listing 50: A Family

parent(pam, bob).
parent(tom, bob).
parent(tom, liz).
parent(bob, ann).
parent(bob, pat).
parent(pat, jim).


Listing 51: Prolog queries

?- parent(tom, liz).
yes
?- parent(tom, jim).
no

?- parent(pam, X).
X = bob

If there is more than one binding that satisfies the query, Prolog will find them all. (The exact mechanism depends on the implementation. Some implementations require the user to ask for more results. Here we assume that the implementation returns all values without further prompting.)

?- parent(bob, C).
C = ann
C = pat

Consider the problem of finding Jim’s grandparents. In logic, we can express “G is a grandparent of Jim” in the following form: “there is a person P such that P is a parent of Jim and there is a person G such that G is a parent of P”. In Prolog, this becomes:

?- parent(P, jim), parent(G, P).
P = pat
G = bob

The comma in this query corresponds to conjunction (“and”): the responses to the query are bindings that make all of the terms valid simultaneously. Disjunction (“or”) is provided by multiple facts or rules: in the example above, Bob is the parent of C is true if C is Ann or C is Pat.

We can express the “grandparent” relation by writing a rule and adding it to the collection of facts (which becomes a collection of facts and rules). The rule for “grandparent” is “G is a grandparent of C if G is a parent of P and P is a parent of C”. In Prolog:

grandparent(G, C) :- parent(G, P), parent(P, C).
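To see what work such a rule implies, it can be imitated by a search over a table of facts. The C++ sketch below is only an analogy, not a description of how Prolog is implemented: Fact, parent_facts and grandparents_of are hypothetical names, and the two nested loops stand for the two goals in the body of the rule.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A fact parent(P, C) is stored as the pair (P, C).
using Fact = std::pair<std::string, std::string>;

// The facts of Listing 50.
const std::vector<Fact> &parent_facts() {
    static const std::vector<Fact> facts = {
        {"pam", "bob"}, {"tom", "bob"}, {"tom", "liz"},
        {"bob", "ann"}, {"bob", "pat"}, {"pat", "jim"}};
    return facts;
}

// Collects every G for which parent(G, P) and parent(P, c) both hold.
std::vector<std::string> grandparents_of(const std::string &c) {
    std::vector<std::string> result;
    for (const Fact &gp : parent_facts())          // goal parent(G, P)
        for (const Fact &pc : parent_facts())      // goal parent(P, C)
            if (pc.first == gp.second && pc.second == c)
                result.push_back(gp.first);
    return result;
}
```

With the facts of Listing 50, grandparents_of("jim") contains only bob, and grandparents_of("ann") contains pam and tom.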

Siblings are people who have the same parents. We could write this rule for siblings:

sibling(X, Y) :- parent(P, X), parent(P, Y).

Testing this rule reveals a problem.

?- sibling(pat, X).
X = ann
X = pat

We may not expect the second response, because we do not consider Pat to be a sibling of herself. But Prolog simply notes that the predicate parent(P, X), parent(P, Y) is satisfied by the bindings P = bob, X = pat, and Y = pat. To correct this rule, we would have to require that X and Y have different values:

sibling(X, Y) :- parent(P, X), parent(P, Y), different(X, Y).

It is clear that different(X, Y) must be built into Prolog somehow, because it would not be feasible to list every true instantiation of it as a fact.

Prolog allows rules to be recursive: a predicate name that appears on the left side of a rule may also appear on the right side. For example, a person A is an ancestor of a person X if either A is a parent of X or A is an ancestor of a parent P of X:

ancestor(A, X) :- parent(A, X).
ancestor(A, X) :- ancestor(A, P), parent(P, X).

parent(pam, bob)
------------------
ancestor(pam, bob)    parent(bob, pat)
--------------------------------------
ancestor(pam, pat)    parent(pat, jim)
--------------------------------------
ancestor(pam, jim)

Figure 15: Proof of ancestor(pam, jim)

We confirm the definition by testing:

?- ancestor(pam, jim).
yes

We can read a Prolog program in two ways. Read declaratively, ancestor(pam, jim) is a statement and Prolog’s task is to prove that it is true. The proof is shown in Figure 15, which uses the convention that P written above a horizontal line with Q below it means “from P infer Q”. Read procedurally, ancestor(pam, jim) defines a sequence of fact-searches and rule-applications that establish (or fail to establish) the goal. The procedural steps for this example are as follows.

1. Using the first rule for ancestor, look in the data base for the fact parent(pam, jim).

2. Since this search fails, try the second rule with A = pam and X = jim. This gives two new subgoals: ancestor(pam, P) and parent(P, jim).

3. The second of these subgoals is satisfied by P = pat from the database. The goal is now ancestor(pam, pat).

4. The goal expands to the subgoals ancestor(pam, P) and parent(P, pat). The second of these subgoals is satisfied by P = bob from the database. The goal is now ancestor(pam, bob).

5. The goal ancestor(pam, bob) is satisfied by applying the first rule with the known fact parent(pam, bob).
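The steps above can be sketched as a recursive search. The code below is an illustration under assumptions, not Prolog's actual execution mechanism: Fact, parent_facts, is_parent and is_ancestor are hypothetical names, and, like the steps just listed, the sketch resolves the subgoal parent(P, X) before the recursive subgoal.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A fact parent(P, C) is stored as the pair (P, C).
using Fact = std::pair<std::string, std::string>;

// The facts of Listing 50.
const std::vector<Fact> &parent_facts() {
    static const std::vector<Fact> facts = {
        {"pam", "bob"}, {"tom", "bob"}, {"tom", "liz"},
        {"bob", "ann"}, {"bob", "pat"}, {"pat", "jim"}};
    return facts;
}

// Fact search: is parent(a, x) in the database?
bool is_parent(const std::string &a, const std::string &x) {
    for (const Fact &f : parent_facts())
        if (f.first == a && f.second == x)
            return true;
    return false;
}

// Rule application with backtracking over the choices for P.
bool is_ancestor(const std::string &a, const std::string &x) {
    if (is_parent(a, x))                      // first rule
        return true;
    for (const Fact &f : parent_facts())      // second rule: pick P with parent(P, x)
        if (f.second == x && is_ancestor(a, f.first))
            return true;                      // this choice of P worked
    return false;                             // every choice of P failed
}
```

Each iteration of the loop that does not lead to success corresponds to backtracking: the binding for P is abandoned and the next fact is tried. The search terminates here because the family facts contain no cycles.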

In the declarative reading, Prolog finds multiple results simply because they are there. In the procedural reading, multiple results are obtained by backtracking. Every point in the program at which there is more than one choice of a variable binding is called a choice point. The choice points define a tree: the root of the tree is the starting point of the program (the main goal) and each leaf of the tree represents either success (the goal is satisfied with the chosen bindings) or failure (the goal cannot be satisfied with these bindings). Prolog performs a depth-first search of this tree.

Sometimes, we can tell from the program that backtracking will be useless. Declaratively, the predicate minimum(X,Y,Min) is satisfied if Min is the minimum of X and Y. We define rules for the cases X ≤ Y and X > Y.

minimum(X, Y, X) :- X =< Y.
minimum(X, Y, Y) :- X > Y.

It is easy to see that, if one of these rules succeeds, we do not want to backtrack to the other, because it is certain to fail. We can tell Prolog not to backtrack by writing “!”, pronounced “cut”.


Program                        Meaning
p :- a,b.    p :- c.           p ⇐⇒ (a ∧ b) ∨ c
p :- a,!,b.  p :- c.           p ⇐⇒ (a ∧ b) ∨ (∼a ∧ c)
p :- c.      p :- a,!,b.       p ⇐⇒ c ∨ (a ∧ b)

Figure 16: Effect of cuts

minimum(X, Y, X) :- X =< Y, !.
minimum(X, Y, Y) :- X > Y, !.

The introduction of the cut operation has an interesting consequence. If a Prolog program has no cuts, we can freely change the order of goals and clauses. These changes may affect the efficiency of the program, but they do not affect its declarative meaning. If the program has cuts, this is no longer true. Figure 16 shows an example of the use of cut and its effect on the meaning of a program.

Recall the predicate different introduced above to distinguish siblings, and consider the following declarations and queries.

different(X, X) :- !, fail.
different(X, Y).
?- different(tom, tom).
no
?- different(ann, tom).
yes

The query different(tom, tom) matches different(X, X) with the binding X = tom, so the match succeeds. The Prolog constant fail then reports failure. Since the cut has been passed, no backtracking occurs, and the query fails. The query different(ann, tom) does not match different(X, X) but it does match the second clause, different(X, Y), and therefore succeeds. (The arguments must be written in lower case: Tom and Ann would be variables, not constants.)

This idea can be used to define negation in Prolog:

not(P) :- P, !, fail ; true.

However, it is important to understand what negation in Prolog means. Consider Listing 52, assuming that the entire database is included. The program has stated correctly that crows fly and penguins do not fly. But it has made this inference only because the database has no information about penguins. The same program can produce an unlimited number of (apparently) correct responses, and also some incorrect responses. We account for this by saying that Prolog assumes a closed world. Prolog can make inferences from the “closed world” consisting of the facts and rules in its database.

Now consider negation. In Prolog, proving not(P) means “P cannot be proved” which means “P cannot be inferred from the facts and rules in the database”. Consequently, we should be sceptical about positive results obtained from Prolog programs that use not. The response

?- not human(peter).
yes

means no more than “the fact that Peter is human cannot be inferred from the database”.

The Prolog system proceeds top-down. It attempts to match the entire goal to one of the rules and, having found a match, applies the same idea to the parts of the matched formulas. The “matching” process is called unification.


Listing 52: Who can fly?

flies(albatross).
flies(blackbird).
flies(crow).
?- flies(crow).
yes
?- flies(penguin).
no
?- flies(elephant).
no
?- flies(brick).
no
?- flies(goose).
no

Suppose that we have two terms A and B built up from constants and variables. A unifier of A and B is a substitution that makes the terms identical. A substitution is a binding of variable names to values. In the following example we follow the Prolog convention, using upper case letters X, Y, . . . to denote variables and lower case letters a, b, . . . to denote constants. Let A and B be the terms

A ≡ left(tree(X, tree(a, Y)))
B ≡ left(tree(b, Z))

and let

σ = (X = b, Z = tree(a, Y))

be a substitution. Then σA = σB = left(tree(b, tree(a, Y))) and σ is a unifier of A and B. Note that the unifier is not necessarily unique. However, we can define a most general unifier which is unique.

The unification algorithm contains a test called the occurs check that is used to reject a substitution of the form X = f(X) in which a variable is bound to a formula that contains the variable. If we changed A and B above to

A ≡ left(tree(X, tree(a, Y)))
B ≡ left(tree(b, Y))

they would not unify, because Y would have to be bound to tree(a, Y).
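Unification with the occurs check can be sketched in under a hundred lines. The following is one possible formulation, offered as an illustration rather than as the algorithm used by any particular Prolog system; Term, Subst, var, fn, walk, occurs, unify and the two example functions are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A term is a variable, or a functor applied to zero or more arguments
// (a constant is a functor with no arguments).
struct Term {
    std::string name;
    std::vector<Term> args;
    bool is_var = false;
};

using Subst = std::map<std::string, Term>;   // variable name -> bound term

Term var(const std::string &n) { Term t; t.name = n; t.is_var = true; return t; }
Term fn(const std::string &n, std::vector<Term> a = {}) {
    Term t; t.name = n; t.args = std::move(a); return t;
}

// Follow bindings until a non-variable or an unbound variable is reached.
Term walk(Term t, const Subst &s) {
    while (t.is_var) {
        auto it = s.find(t.name);
        if (it == s.end()) break;
        t = it->second;
    }
    return t;
}

// Occurs check: does variable v appear inside t under substitution s?
bool occurs(const std::string &v, const Term &t, const Subst &s) {
    Term r = walk(t, s);
    if (r.is_var) return r.name == v;
    for (const Term &a : r.args)
        if (occurs(v, a, s)) return true;
    return false;
}

// Extends s so that a and b become identical; returns false if impossible.
// On failure s may contain bindings made before the failure was detected.
bool unify(const Term &a, const Term &b, Subst &s) {
    Term x = walk(a, s);
    Term y = walk(b, s);
    if (x.is_var && y.is_var && x.name == y.name) return true;
    if (x.is_var) {
        if (occurs(x.name, y, s)) return false;   // reject X = f(X)
        s[x.name] = y;
        return true;
    }
    if (y.is_var) return unify(y, x, s);
    if (x.name != y.name || x.args.size() != y.args.size()) return false;
    for (std::size_t i = 0; i < x.args.size(); i++)
        if (!unify(x.args[i], y.args[i], s)) return false;
    return true;
}

// First example from the text: expects X = b and Z = tree(a, Y).
bool example_unifies() {
    Term A = fn("left", {fn("tree", {var("X"), fn("tree", {fn("a"), var("Y")})})});
    Term B = fn("left", {fn("tree", {fn("b"), var("Z")})});
    Subst s;
    return unify(A, B, s) && s["X"].name == "b" &&
           s["Z"].name == "tree" && s["Z"].args.size() == 2;
}

// Second example from the text: Y against tree(a, Y) must be rejected.
bool occurs_check_rejects() {
    Term A = fn("left", {fn("tree", {var("X"), fn("tree", {fn("a"), var("Y")})})});
    Term B = fn("left", {fn("tree", {fn("b"), var("Y")})});
    Subst s;
    return !unify(A, B, s);
}
```

Removing the call to occurs in unify gives the cheaper behaviour of systems that omit the check: the second example would then bind Y to a term containing Y.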

You can find out whether a particular version of Prolog implements the occurs check by trying the following query:

?- X = f(X).

If the reply is “no”, the occurs check is implemented. If, as is more likely, the occurs check is not performed, the response will be

X = f(f(f(f( . . . .

Prolog syntax corresponds to a restricted form of first-order predicate calculus called clausal form logic. It is not practical to use the full predicate calculus as a basis for a PL because it is undecidable. Clausal form logic is semi-decidable: there is an algorithm that will find a proof for any formula that is true in the logic. If a formula is false, the algorithm may fail after a finite time or may loop forever. The proof technique, called SLD resolution, is a refinement of the resolution principle introduced by Robinson (1965). SLD stands for “Selecting a literal, using a Linear strategy, restricted to Definite clauses”.

Listing 53: Restaurants

good(les_halles).
expensive(les_halles).
good(joes_diner).
reasonable(Restaurant) :- not expensive(Restaurant).
?- good(X), reasonable(X).
X = joes_diner.
?- reasonable(X), good(X).
no

The proof of validity of SLD resolution assumes that unification is implemented with the occurs check. Unfortunately, the occurs check is expensive to implement and most Prolog systems omit it. Consequently, most Prolog systems are technically unsound, although problems are rare in practice.

A Prolog program apparently has a straightforward interpretation as a statement in logic, but the interpretation is slightly misleading. For example, since Prolog works through the rules in the order in which they are written, the order is significant. Since the logical interpretation of a rule sequence is disjunction, it follows that disjunction in Prolog does not commute.

Example 17: Failure of commutativity. In the context of the rules

wife(jane).
female(X) :- wife(X).
female(X) :- female(X).

we obtain a valid answer to a query:

?- female(Y).
Y = jane

But if we change the order of the rules, as in

wife(jane).
female(X) :- female(X).
female(X) :- wife(X).

the same query leads to endless applications of the recursive rule female(X) :- female(X), which is now tried first. □

Exercise 32. Listing 53 shows the results of two queries about restaurants. In logic, conjunction is commutative (P ∧ Q ≡ Q ∧ P). Explain why commutativity fails in this program. □

The important features of Prolog include:

. Prolog is based on a mathematical model (clausal form logic with SLD resolution).

. “Pure” Prolog programs, which fully respect the logical model, can be written but are often inefficient.

. In order to write efficient programs, a Prolog programmer must be able to read programs both declaratively (to check the logic) and procedurally (to ensure efficiency).

. Prolog implementations introduce various optimizations (the cut, omitting the occurs check) that improve the performance of programs but compromise the mathematical model.

. Prolog has garbage collection.

7.2 Alma-0

One of the themes of this course is the investigation of ways in which different paradigms of programming can be combined. The PL Alma-0 takes as its starting point a simple imperative language, Modula–2 (Wirth 1982), and adds a number of features derived from logic programming to it. The result is a simple but highly expressive language. Some features of Modula–2 are omitted (the CARDINAL type; sets; variant records; open array parameters; procedure types; pointer types; the CASE, WITH, LOOP, and EXIT statements; nested procedures; modules). They are replaced by nine new features, described below. This section is based on Apt et al. (1998).

BES. Boolean expressions may be used as statements. The expression is evaluated. If its value is TRUE, the computation succeeds and the program continues as usual. If the value of the expression is FALSE, the computation fails. However, this does not mean that the program stops: instead, control returns to the nearest preceding choice point and resumes with a new choice.

Example 18: Ordering 1. If the array a is ordered, the following statement succeeds. Otherwise, it fails; the loop exits, and i has the smallest value for which the condition is FALSE.

FOR i := 1 TO n - 1 DO a[i] <= a[i+1] END

□

SBE. A statement sequence may be used as a boolean expression. If the computation of a sequence of statements succeeds, it is considered equivalent to a boolean expression with value TRUE. If the computation fails, it is considered equivalent to a boolean expression with value FALSE.

Example 19: Ordering 2. The following sequence decides whether a precedes b in the lexicographic ordering. The FOR statement finds the first position at which the arrays differ. If they are equal, the entire sequence fails. If there is a difference at position i, the sequence succeeds iff a[i] < b[i].

NOT FOR i := 1 TO n DO a[i] = b[i] END;
a[i] < b[i]

□

EITHER–ORELSE. The construct EITHER-ORELSE is added to the language to permit backtracking. An EITHER statement has the form

EITHER <seq> ORELSE <seq> . . . . ORELSE <seq> END

Computation of an EITHER statement starts by evaluating the first sequence. If it succeeds, computation continues at the statement following END. If this computation subsequently fails, control returns to the second sequence in the EITHER statement, with the program in the same state as it was when the EITHER statement was entered. Each sequence is tried in turn in this way, until one succeeds or the last one fails.


Listing 54: Creating choice points

EITHER y := x
ORELSE x > 0; y := -x
END;
x := x + b;
y < 0

Listing 55: Pattern matching

SOME i := 0 TO n - m DO
    FOR j := 0 TO m - 1 DO
        s[i+j] = p[j]
    END
END

Example 20: Choice points. Assume that the sequence shown in Listing 54 is entered with x = a > 0. The sequence of events is as follows:

. The assignment y := x assigns a to y. This sequence succeeds and transfers control to x := x + b.

. The assignment x := x + b assigns a + b to x.

. The test y < 0 fails because y = a > 0.

. The original value of x is restored. The test x > 0 succeeds because x = a > 0, and y receives the value −a.

. The final sequence is performed and succeeds, leaving x = a + b and y = −a.

□

SOME. The SOME statement permits the construction of statements that work like EITHER statements but have a variable number of ORELSE clauses. The statement

SOME i := 1 TO n DO <seq> END

is equivalent to

EITHER i := 1; <seq>
ORELSE SOME i := 2 TO n DO <seq> END
END

Example 21: Pattern matching. Suppose that we have two arrays of characters: a pattern p[0..m-1] and a string s[0..n-1]. The sequence shown in Listing 55 sets i to the first occurrence of p in s, or fails if there is no match. □
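A conventional C rendering of Listing 55 makes the two levels of control explicit. The sketch below is an illustration only: the name match and the convention of returning -1 for failure are assumptions. The outer loop plays the role of SOME (try each i until one works) and the inner loop plays the role of FOR (every j must succeed).

```c
#include <assert.h>

/* Returns the smallest i such that p[0..m-1] occurs in s at offset i,
   or -1 if there is no match (where the Alma-0 sequence would fail). */
int match(const char *s, int n, const char *p, int m)
{
    assert(n >= 0 && m >= 0);
    for (int i = 0; i + m <= n; i++) {      /* SOME i := 0 TO n - m */
        int j = 0;
        while (j < m && s[i + j] == p[j])   /* FOR j := 0 TO m - 1 */
            j++;
        if (j == m)                         /* the inner FOR succeeded */
            return i;
    }
    return -1;
}
```

Unlike the Alma-0 sequence, this function cannot be resumed to find the next occurrence; a caller that wants later matches has to restart the search from a new offset.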

COMMIT. The COMMIT statement provides a way of restricting backtracking. (It is similar to the cut in Prolog.) The statement

COMMIT <seq> END

evaluates the statement sequence <seq>; if <seq> succeeds (possibly by backtracking), all of the choice points created during its evaluation are removed.

Example 22: Ordering 3. Listing 56 shows another way of checking lexicographic ordering. With the COMMIT, this sequence yields the value of a[i] < b[i] for the smallest i such that a[i] <> b[i]. Without the COMMIT, it succeeds if a[i] < b[i] is TRUE for some value of i such that a[i] <> b[i]. □

Listing 56: Checking the order

COMMIT
    SOME i := 1 TO n DO
        a[i] <> b[i]
    END
END;
a[i] < b[i]

FORALL. The FORALL statement provides for the exploration of all choice points generated by a statement sequence. The statement

FORALL <seq1> DO <seq2> END

evaluates <seq1> and then <seq2>. If <seq1> generates choice points, control backtracks through these, even if the original sequence succeeded. Whenever <seq2> succeeds, its choice points are removed (that is, there is an implicit COMMIT around <seq2>).

Example 23: Matching. Suppose we enclose the pattern matching code above in a function Match. The following sequence finds the number of times that p occurs in s.

count := 0;
FORALL k := Match(p, s) DO count := count + 1 END;

□

EQ. Variables in Alma-0 start life as uninitialized values. After a value has been assigned to a variable, it ceases to be uninitialized and becomes initialized. Expressions containing uninitialized values are allowed, but they do not have values. For example, the comparison s = t is evaluated as follows:

. if both s and t are initialized, their values are compared as usual;

. if one side is an uninitialized variable and the other is an expression with a value, this value is assigned to the variable;

. if both sides are uninitialized variables, an error is reported.

Example 24: Remarkable sequences. A sequence with 27 integer elements is remarkable if it contains three 1’s, three 2’s, . . . , three 9’s arranged in such a way that for i = 1, 2, . . . , 9 there are exactly i numbers between successive occurrences of i. Here is a remarkable sequence:

(1, 9, 1, 2, 1, 8, 2, 4, 6, 2, 7, 9, 4, 5, 8, 6, 3, 4, 7, 5, 3, 9, 6, 8, 3, 5, 7).

Problem 1. Write a program that tests whether a given array is a remarkable sequence.

Problem 2. Find a remarkable sequence.

Due to the properties of “=”, the program shown in Listing 57 is a solution for both of these problems! If the given sequence is initialized, the program tests whether it is remarkable; if it is not initialized, the program assigns a remarkable sequence to it. □


Listing 57: Remarkable sequences

TYPE Sequence = ARRAY [1..27] OF INTEGER;
PROCEDURE Remarkable (VAR a: Sequence);
VAR i, j: INTEGER;
BEGIN
    FOR i := 1 TO 9 DO
        SOME j := 1 TO 25 - 2 * i DO
            a[j] = i;
            a[j+i+1] = i;
            a[j+2*i+2] = i
        END
    END
END;
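To see how much work “=” is doing in Listing 57, it may help to write the two problems out in C, where testing and searching need separate code. The sketch below is only an illustration: the function names, the 0-based indexing, the use of 0 to mark a free cell, and the decision to place the largest values first are all assumptions of the sketch.

```c
#include <assert.h>
#include <string.h>

#define LEN 27   /* three copies of each of 1..9 */

/* Problem 1: returns 1 if a[0..26] is a remarkable sequence, else 0. */
int is_remarkable(const int a[LEN])
{
    for (int i = 1; i <= 9; i++) {
        int count = 0, last = -1;
        for (int j = 0; j < LEN; j++) {
            if (a[j] != i)
                continue;
            /* exactly i numbers lie between successive occurrences of i */
            if (last >= 0 && j - last != i + 1)
                return 0;
            last = j;
            count++;
        }
        if (count != 3)
            return 0;
    }
    return 1;
}

/* Places the values i, i-1, ..., 1 into free (zero) cells by backtracking. */
static int place(int a[LEN], int i)
{
    assert(i >= 0 && i <= 9);
    if (i == 0)
        return 1;                            /* every value is placed */
    for (int j = 0; j + 2 * i + 2 < LEN; j++) {
        int k1 = j + i + 1, k2 = j + 2 * i + 2;
        if (a[j] || a[k1] || a[k2])
            continue;                        /* a cell is already taken */
        a[j] = a[k1] = a[k2] = i;            /* choice point */
        if (place(a, i - 1))
            return 1;
        a[j] = a[k1] = a[k2] = 0;            /* undo the choice and backtrack */
    }
    return 0;
}

/* Problem 2: fills a with a remarkable sequence; returns 1 on success. */
int find_remarkable(int a[LEN])
{
    memset(a, 0, LEN * sizeof a[0]);
    return place(a, 9);
}
```

The explicit undo step in place corresponds to the state restoration that Alma-0 performs automatically when it returns to a choice point.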

Listing 58: Compiling a MIX parameter

VAR v: INTEGER;
BEGIN
    v := E;
    P(v)
END

MIX. Modula–2 provides call by value (the default) and call by reference (indicated by writing VAR before the formal parameter). Alma-0 adds a third parameter passing mechanism: call by mixed form. The parameter is passed either by value or by reference, depending on the form of the actual parameter.

We can think of the implementation of mixed form in the following way. Suppose that procedure P has been declared as

PROCEDURE P (MIX n: INTEGER); BEGIN ... END;

The call P(v), in which v is a variable, is compiled in the usual way, interpreting MIX as VAR. The call P(E), in which E is an expression, is compiled as if the programmer had written the code shown in Listing 58.

Example 25: Linear Search. The function Find in Listing 59 uses a mixed form parameter.

Listing 59: Linear search

TYPE Vector = ARRAY [1..N] OF INTEGER;
PROCEDURE Find (MIX e: INTEGER; a: Vector);
VAR i: INTEGER;
BEGIN
    SOME i := 1 TO N DO e = a[i] END
END;

. The call Find(7, a) tests if 7 occurs in a.

. The call Find(x, a), where x is an integer variable, assigns each component of a to x by backtracking.

. The sequence

Find(x, a); Find(x, b)

tests whether the arrays a and b have an element in common; if so, x receives the value of the common element.

. The statement

FORALL Find(x, a) DO Find(x, b) END

tests whether all elements of a are also elements of b.

. The statement

FORALL Find(x, a); Find(x, b) DO WRITELN(x) END

prints all of the elements that a and b have in common.

□

Listing 60: Generalized Addition

PROCEDURE Plus (MIX x, y, z: INTEGER);
BEGIN
    IF KNOWN(x); KNOWN(y) THEN z = x + y
    ELSIF KNOWN(y); KNOWN(z) THEN x = z - y
    ELSIF KNOWN(x); KNOWN(z) THEN y = z - x
    END
END;

KNOWN. The test KNOWN(x) succeeds iff the variable x is initialized.

Example 26: Generalized Addition. Listing 60 shows the declaration of a function Plus. The call Plus(x,y,z) will succeed if x+y = z provided that at least two of the actual parameters have values. For example, Plus(2,3,5) succeeds and Plus(5,n,12) assigns 7 to n. □
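The behaviour of Plus can be imitated in C by making “initialized or not” an explicit part of each value. The following is a rough sketch under stated assumptions: the type MaybeInt, the helper unify (standing in for “=”), and the choice to report failure when fewer than two values are known are all inventions of the sketch, not Alma-0 features.

```c
#include <assert.h>
#include <stdbool.h>

/* A value that may still be uninitialized (hypothetical stand-in for a
   variable passed as a MIX parameter). */
typedef struct {
    bool known;
    int  value;
} MaybeInt;

/* Imitates "=": assigns to an unknown variable, compares a known one. */
static bool unify(MaybeInt *v, int value)
{
    if (!v->known) {
        v->known = true;
        v->value = value;
        return true;
    }
    return v->value == value;
}

/* Returns true if x + y = z holds or can be made to hold. */
bool plus(MaybeInt *x, MaybeInt *y, MaybeInt *z)
{
    assert(x && y && z);
    if (x->known && y->known) return unify(z, x->value + y->value);
    if (y->known && z->known) return unify(x, z->value - y->value);
    if (x->known && z->known) return unify(y, z->value - x->value);
    return false;   /* assumption: fewer than two known values is a failure */
}
```

With all three values known the function acts as a test; with one unknown it acts as an assignment, which is the two-way use that Listing 60 obtains from KNOWN and “=”.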

The language Alma-0 has a declarative semantics in which the constructs described above correspond to logical formulas. It also has a procedural semantics that describes how the constructs can be efficiently executed, using backtracking. Alma-0 demonstrates that, if appropriate care is taken, it is possible to design PLs that smoothly combine paradigms without losing the advantages of either paradigm.

The interesting feature of Alma-0 is that it demonstrates that it is possible to start with a simple, well-defined language (Modula-2), remove features that are not required for the current purpose, and add a small number of simple, new features that provide expressive power when used together.

Exercise 33. The new features of Alma-0 are orthogonal, both to the existing features of Modula-2 and to each other. How does this use of orthogonality compare with the orthogonal design of Algol 68? □

7.3 Other Backtracking Languages

Prolog and Alma-0 are by no means the only PLs that incorporate backtracking. Others include Icon (Griswold and Griswold 1983), SETL (Schwartz, Dewar, Dubinsky, and Schonberg 1986), and 2LP (McAloon and Tretkoff 1995).

Constraint PLs are attracting increasing attention. A program is a set of constraints, expressed as equalities and inequalities, that must be satisfied. Prolog is strong in logical deduction but weak in numerical calculation; constraint PLs attempt to correct this imbalance.

8 Structure

The earliest programs had little structure. Although a FORTRAN program has a structure (it consists of a main program and subroutines), the main program and the subroutines themselves are simply lists of statements.

8.1 Block Structure

Algol 60 was the first major PL to provide recursive, hierarchical structure. It is probably significant that Algol 60 was also the first language to be defined, syntactically at least, with a context-free grammar. It is straightforward to define recursive structures using context-free grammars; specifically, a grammar written in Backus-Naur Form (BNF) was used for Algol 60. The basis of Algol 60 syntax can be written in extended BNF as follows:

PROG → BLOCK
BLOCK → "begin" { DECL } { STMT } "end"
STMT → BLOCK | · · ·
DECL → HEADER BLOCK | · · ·

With blocks come nested scopes. Earlier languages had local scope (for example, FORTRAN subroutines have local variables) but scopes were continuous (no “holes”) and not nested. It was not possible for one declaration to override another declaration of the same name. In Algol 60, however, an inner declaration overrides an outer declaration. The outer block in Listing 61 declares an integer k, which is first set to 6 and then incremented. The inner block declares another integer k which is set to 1 and then disappears without being used at the end of the block.

See Section 9.8.1 for further discussion of nesting.
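C inherited nested scopes from the Algol family, so the structure of Listing 61 carries over almost unchanged. The sketch below is an illustration only; the function name nested_scopes is an assumption, introduced so that the final value of the outer k can be observed.

```c
#include <assert.h>

/* Returns the final value of the outer k. */
int nested_scopes(void)
{
    int k = 6;            /* outer declaration */
    {
        int k = 1;        /* inner declaration hides the outer k */
        assert(k == 1);   /* only the inner k is visible here */
    }                     /* the inner k disappears at the end of its block */
    k = k + 1;            /* refers to the outer k again */
    return k;
}
```

Between the inner declaration and the closing brace the outer k still exists but cannot be named: this is the “hole” in its scope.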

8.2 Modules

By the early 70s, people realized that the “monolithic” structure of Algol-like programs was too restricted for large and complex programs. Furthermore, the “separate compilation” model of FORTRAN (and later C) was seen to be unsafe.

The new model that was introduced by languages such as Mesa and Modula divided a large program into modules.

Listing 61: Nested Declarations

begin
    integer k;
    k := 6;
    begin
        integer k;
        k := 1;
    end;
    k := k + 1;
end;


. A module is a meaningful unit (rather than a collection of possibly unrelated declarations) that may introduce constants, types, variables, and functions (called collectively features).

. A module may export some or all of its features.

. In typical modular languages, modules can export types, functions, and variables.

Example 27: Stack Module. One part of a program might declare a stack module:

module Stack
{
    export initialize, push, pop, ...
    ...
}

Elsewhere in the program, we could use the stack module:

import Stack;
initialize();
push(67);
...

□

This form of module introduces a single instance: there is only one stack, and it is shared by all parts of the program that import it. In contrast, a stack template introduces a type from which we can create as many instances as we need.

Example 28: Stack Template. The main difference between the stack module and the stack template is that the template exports a type.

module Stack
{
    export type STACK, initialize, push, pop, ...
    ...
}

In other parts of the program, we can use the type STACK to create stacks.

import Stack;
STACK myStack;
initialize(myStack);
push(myStack, 67);
...
STACK yourStack;
...

□
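C has no module construct, but the stack template can be approximated by a type together with functions that take the instance as their first argument. The following is a minimal sketch, not a definitive design: the names Stack, stack_initialize, stack_push, and stack_pop, the fixed capacity, and the use of assert for overflow and underflow are all assumptions.

```c
#include <assert.h>

#define STACK_CAPACITY 16    /* arbitrary limit chosen for the sketch */

/* The exported type: each variable of type Stack is a separate instance. */
typedef struct {
    int items[STACK_CAPACITY];
    int size;
} Stack;

void stack_initialize(Stack *s)
{
    s->size = 0;
}

void stack_push(Stack *s, int value)
{
    assert(s->size < STACK_CAPACITY);   /* overflow is a caller error here */
    s->items[s->size++] = value;
}

int stack_pop(Stack *s)
{
    assert(s->size > 0);                /* underflow is a caller error here */
    return s->items[--s->size];
}
```

Because the representation of Stack is visible to every user, this sketch provides multiple instances but not information hiding.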

The fact that we can declare many stacks is somewhat offset by the increased complexity of the notation: the name of the stack instance must appear in every function call. Some modular languages provide an alternative syntax in which the name of the instance appears before the function name, as if the module were a record. The last few lines of the example above would be written as:

import Stack;
STACK myStack;
myStack.initialize();
myStack.push(67);
...

8.2.1 Encapsulation

Parnas (1972) wrote a paper that made a number of important points.

. A large program will be implemented as a set of modules.

. Each programmer, or programming team, will be the owner of some modules and a user of other modules.

. It is desirable that a programmer should be able to change the implementation of a module without affecting (or even informing) other programmers.

. This implies that a module should have an interface that is relatively stable and an implementation that changes as often as necessary. Only the owner of a module has access to its implementation; users do not have such access.

Parnas introduced these ideas as the “principle of information hiding”. Today, it is more usual to refer to the goals as encapsulation.

In order to use the module correctly without access to its implementation, users require a specification written in an appropriate language. The specification answers the question “What does this module do?” and the implementation answers the question “How does this module work?”

Most people today accept Parnas’s principle, even if they do not apply it well. It is therefore important to appreciate how radical it was at the time of its introduction.

[Parnas] has proposed a still more radical solution. His thesis is that the programmer is most effective if shielded from, rather than exposed to, the details of construction of system parts other than his own. This presupposes that all interfaces are completely and precisely defined. While that is definitely sound design, dependence upon its perfect accomplishment is a recipe for disaster. A good information system both exposes interface errors and stimulates their correction. (Brooks 1975, page 78)

Brooks wrote this in the first edition of The Mythical Man-Month in 1975. Twenty years later, he had changed his mind.

Parnas was right, and I was wrong. I am now convinced that information hiding, today often embodied in object-oriented programming, is the only way of raising the level of software design. (Brooks 1995, page 272)

Parnas’s paper was published in 1972, only shortly after Wirth’s (1971) paper on program development by stepwise refinement. Yet Parnas’s approach is significantly different from Wirth’s. Whereas Wirth assumes that a program can be constructed by filling in details, perhaps with occasional backtracking to correct an error, Parnas sees programming as an iterative process in which revisions and alterations are part of the normal course of events. This is a more “modern” view and certainly a view that reflects the actual construction of large and complex software systems.


8.3 Control Structures

Early PLs, such as FORTRAN, depended on three control structures:

. sequence;

. transfer (goto);

. call/return.

These structures are essentially those provided by assembly language. The first contribution of PLs was to provide names for variables and infix operators for building expressions.

The “modern” control structures first appeared in Algol 60. The most significant contribution of Algol 60 was the block: a program unit that could contain data, functions, and statements. Algol 60 also contributed the familiar if-then-else statement and a rather complex loop statement. However, Algol 60 also provided goto statements and most programmers working in the sixties assumed that goto was essential.

The storm of controversy raised by Dijkstra’s (1968) letter condemning the goto statement indicates the extent to which programmers were wedded to the goto statement. Even Knuth (1974) entered the fray with arguments supporting the goto statement in certain circumstances, although his paper is well-balanced overall. The basic ideas that Dijkstra proposed were as follows.

. There should be one flow into, and one flow out of, each control structure.

. All programming needs can be met by three structures satisfying the above property: the sequence; the loop with preceding test (while-do); and the conditional (if-then-else).

Dijkstra’s proposals had previously been justified in a formal sense by Böhm and Jacopini (1966). This paper establishes that any program written with goto statements can be rewritten as an equivalent program that uses sequence, conditional, and loop structures only; it may be necessary to introduce Boolean variables. Although the result is interesting, Dijkstra’s main point was not to transform old, goto-ridden programs but to write new programs using simple control structures only.
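A small C example may make the role of the Boolean variables concrete. The two functions below are a sketch written for these notes, not an example from Böhm and Jacopini; the names are assumptions. Both return the index of the first negative element of an array, or -1 if there is none. The first uses goto; the second uses only sequence, while-do, and if-then-else, with a flag taking over the job of the jump out of the loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Version with goto: index of the first negative element, or -1. */
int first_negative_goto(const int *a, int n)
{
    int i = 0;
    assert(n >= 0);
loop:
    if (i >= n) goto not_found;
    if (a[i] < 0) goto found;
    i++;
    goto loop;
found:
    return i;
not_found:
    return -1;
}

/* Same computation with sequence, while-do, and if-then-else only.
   The Boolean variable records what the jump to "found" expressed. */
int first_negative_structured(const int *a, int n)
{
    int i = 0;
    int result;
    bool found = false;
    assert(n >= 0);
    while (i < n && !found) {
        if (a[i] < 0)
            found = true;
        else
            i++;
    }
    if (found)
        result = i;
    else
        result = -1;
    return result;   /* one flow in, one flow out */
}
```

The structured version is slightly longer, which is part of what made the original controversy possible.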

Despite the controversy, Dijkstra’s arguments had the desired effect. In languages designed after 1968, the role of goto was down-played. Both C and Pascal have goto statements but they are rarely used in well-written programs. C, with break, continue, and return, has even less need for goto than Pascal. Ada has a goto statement because it was part of the US DoD requirements but, again, it is rarely needed. C++ inherited goto statements from C. Java does not have a goto statement, although some uses of continue look rather like jumps.

The use of control structures rather than goto statements has several advantages.

. It is easier to read and maintain programs. Maintenance errors are less common.

. Precise reasoning (for example, the use of formal semantics) is simplified when goto statements are absent.

. Certain optimizations become feasible because the compiler can obtain more precise information in the absence of goto statements.

8.3.1 Loop Structures.

The control structure that has elicited the most discussion is the loop. The basic loop has a single test that precedes the body:

while E do S

It is occasionally useful to perform the test after the body: Pascal provides repeat/until and C provides do/while.

This is not quite enough, however, because there are situations in which it is useful to exit from a loop from within the body. Consider, for example, the problem of reading and adding numbers, terminating when a negative number is read without including the negative number in the sum. Listing 62 shows the code in Pascal, in which the statement read(n) must be written twice. In C and C++, we can use break to avoid the repeated code, as shown in Listing 63.

Listing 62: Reading in Pascal

var n, sum : integer;
sum := 0;
read(n);
while n >= 0 do
begin
    sum := sum + n;
    read(n);
end

Listing 63: Reading in C

int sum = 0;
while (1)
{
    int n;
    scanf("%d", &n);
    if (n < 0)
        break;
    sum += n;
}

Some recent languages, recognizing this problem, provide only one form of loop statement consisting of an unconditional repeat and a conditional exit that can be used anywhere in the body of the loop.

8.3.2 Procedures and Functions

Some languages, such as Ada and Pascal, distinguish the terms procedure and function. Both names refer to “subroutines”: program components that perform a specific task when invoked from another component of the program. A procedure is a subroutine that has some effect on the program data but does not return a result. A function is a subroutine that returns a result; a function may have some effect on program data, called a “side effect”, but this is often considered undesirable.

Languages that make a distinction between procedures and functions usually also make a distinction between “actions” and “data”. For example, at the beginning of the first Pascal text (Jensen and Wirth 1976), we find the following statement:

An algorithm or computer program consists of two essential parts, a description of actions which are to be performed, and a description of the data, which are manipulated by the actions. Actions are described by so-called statements, and data are described by so-called declarations and definitions.

In other languages, such as Algol and C, the distinction between “actions” and “data” is less emphasized. In these languages, every “statement” has a value as well as an effect. Consistently, the distinction between “procedure” and “function” also receives less emphasis.

C, for example, does not have keywords corresponding to procedure and function. There is a single syntax for declaring and defining all functions and a convention that, if a function does not return a result, it is given the return type void.

We can view the progression here as a separation of concerns. The “concerns” are: whether a subroutine returns a value; and whether it has side-effects. Ada and Pascal combine the concerns but, for practical reasons, allow “functions with side-effects”. In C, there are two concerns, “having a value” and “having an effect”, and they are essentially independent.

Example 29: Returning multiple results. Blue (Kölling 1999) has a uniform notation for returning either one or more results. For example, we can define a routine with this heading:

findElem (index: Integer) -> (found: Boolean, elem: Element) is ...

The parameter index is passed to the routine findElem, and the routine returns two values: found to indicate whether something was found, and elem as the object found. (If the object is not found, elem is set to the special value nil.)

Adding multiple results introduces the problem of how to use the results. Blue also provides multi-assignments: we can write

success, thing := findElem("myKey")

Multi-assignments, provided they are implemented properly, provide other advantages. For example we can write the standard “swap sequence”

t := a;
a := b;
b := t;

in the more concise and readable form

a, b := b, a

□
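For contrast, a language without multiple results has to simulate them. The C sketch below is an illustration only: the type FindResult, the function find_elem (a stand-in for Blue's findElem that searches a plain integer table), and the helper swap are all assumptions. It bundles the two results into a structure and needs a temporary variable and pointers for the swap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Two results bundled into one value (field names follow the Blue example). */
typedef struct {
    bool found;
    int  elem;     /* meaningful only when found is true */
} FindResult;

/* Looks up table[index]; hypothetical stand-in for Blue's findElem. */
FindResult find_elem(const int *table, size_t size, size_t index)
{
    FindResult r = { false, 0 };
    assert(table != NULL);
    if (index < size) {
        r.found = true;
        r.elem = table[index];
    }
    return r;
}

/* Without multi-assignment, swapping needs a temporary and pointers. */
void swap(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}
```

The caller must then unpack the structure field by field, which is the step that a multi-assignment such as success, thing := findElem(...) makes unnecessary.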

Exercise 34. Describe a “proper” implementation of the multi-assignment statement. Can you think of any other applications of multi-assignment? □

Exercise 35. Using Example 29 and other examples, discuss the introduction of new features into PLs. □

8.3.3 Exceptions.

The basic control structures are fine for most applications provided that nothing goes wrong. Although in principle we can do everything with conditions and loops, there are situations in which these constructs are inadequate.


Example 30: Matrix Inversion. Suppose that we are writing a function that inverts a matrix. In almost all cases, the given matrix is non-singular and the inversion proceeds normally. Occasionally, however, a singular matrix is passed to the function and, at some point in the calculation, a division by zero will occur. What choices do we have?

. We could test the matrix for singularity before starting to invert it. Unfortunately, the test involves computing the determinant of the matrix, which is almost as much work as inverting it. This is wasteful, because we know that most of the matrices that we will receive are non-singular.

. We could test every divisor before performing a division. This is wasteful if the matrix is unlikely to be singular. Moreover, the divisions probably occur inside nested loops: each level of loop will require Boolean variables and exit conditions that are triggered by a zero divisor, adding overhead.

. We could rely on the hardware to signal an exception when division by zero occurs. The exception can be caught by a handler that can be installed at any convenient point in the calling environment.

□

PL/I (Section 3.5) was the first major language to provide exception handling. C provided primitives (setjmp/longjmp) that made it possible to implement exception handling. For further descriptions of exception handling in particular languages, see Section 3.
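The following sketch shows how the C primitives can be used to leave nested computation in one step. It is a simplified illustration, not a full exception mechanism: the names handler, checked_divide, and sum_reciprocals are assumptions, there is a single global handler, and the zero divisor is detected by an explicit test rather than by the hardware. The call to setjmp installs the handler; longjmp transfers control back to it.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf handler;          /* where control resumes on an "exception" */

/* Divides a by b; "raises" by jumping to the installed handler if b is 0. */
static double checked_divide(double a, double b)
{
    if (b == 0.0)
        longjmp(handler, 1);     /* does not return */
    return a / b;
}

/* Computes the sum of 1/d[i]. Returns 0 on success, or 1 if a zero
   divisor was met, in which case *result is left unchanged. */
int sum_reciprocals(const double *d, int n, double *result)
{
    assert(n >= 0);
    if (setjmp(handler) != 0)
        return 1;                /* reached via longjmp: the handler */
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += checked_divide(1.0, d[i]);
    *result = sum;
    return 0;
}
```

The loop itself contains no exit condition for the error case, which is the advantage claimed for the third choice in Example 30.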

9 Names and Binding

Names are a central feature of all programming languages. In the earliest programs, numbers were used for all purposes, including machine addresses. Replacing numbers by symbolic names was one of the first major improvements to program notation (Mutch and Gill 1954).

9.1 Free and Bound Names

The use of names in programming is very similar to the use of names in mathematics. We say that “x occurs free” in an expression such as $x^2 + 2x + 5$. We do not know anything about the value of this expression because we do not know the value of x. On the other hand, we say that “n occurs bound” in the expression

$$\sum_{n=0}^{5} (2n + 1). \qquad (12)$$

More precisely, the binding occurrence of n is in n = 0. The n in the expression (2n + 1) is a reference to, or use of, this binding.

There are two things to note about (12). First, it has a definite value, 36, because n is bound to a value (actually, the set of values { 0, 1, . . . , 5 }). Second, the particular name that we use does not change the value of the expression. If we systematically change n to k, we obtain the expression $\sum_{k=0}^{5} (2k + 1)$, which has the same value as (12).

Similarly, the expression

$$\int_{0}^{\infty} e^{-x}\,dx,$$

which contains a binding occurrence of x (in dx) and a use of x (in $e^{-x}$), has a value that does not depend on the particular name x.

The expression

$$\int e^{-x}\,dx$$

is interesting because it binds x but does not have a numerical value; the convention is that it denotes a function, $-e^{-x}$. The conventional notation for functions, however, is ambiguous. It is clear that $-e^{-x}$ is a function of x because, by convention, e denotes a constant. But what about $-e^{-ax}$: is it a function of a or a function of x? There are several ways of resolving this ambiguity. We can write

$$x \mapsto -e^{-ax},$$

read “x maps to $-e^{-ax}$”. Or we can use the notation of the λ-calculus:

$$\lambda x \,.\, -e^{-ax}.$$

In predicate calculus, predicates may contain free names, as in the following expression, whose truth value depends on the value of n:

$$n \bmod 2 = 0 \;\land\; n \bmod 3 = 1. \qquad (13)$$

Names are bound by the quantifiers ∀ (for all) and ∃ (there exists). We say that the formula

$$\forall n \,.\; n \bmod 2 = 0 \Rightarrow n^2 \bmod 2 = 0$$

is closed because it contains no free variables. In this case, the formula is also true because the implication holds for all values of n. Strictly, we should specify the range of values that n is allowed to assume. We could do this implicitly: for example, it is very likely that the formula refers to integers. Alternatively, we could define the range of values explicitly, as in

$$\forall n \in \mathbb{Z} \,.\; n \bmod 2 = 0 \Rightarrow n^2 \bmod 2 = 0.$$

Precisely analogous situations occur in programming. In the function

int f(int n)
{
    return k + n;
}

k occurs free and n occurs bound. Note that we cannot tell the value of n from the definition of the function but we know that n will be given a value when the function is called.

In most PLs, a program with free variables will not compile. A C compiler, for example, will accept the function f defined above only if it is compiled in a scope that contains a declaration of k. Some early PLs, such as FORTRAN and PL/I, provided implicit declarations for variables that the programmer did not declare; this is now understood to be error-prone and is not a feature of recent PLs.

9.2 Attributes

In mathematics, a variable normally has only one attribute: its value. In (12), n is bound to the values 0, 1, . . . , 5. Sometimes, we specify the domain from which these values are chosen, as in n ∈ Z.

In programming, a name may have several attributes, and they may be bound at different times. For example, in the sequence

int n;
n = 6;

the first line binds the type int to n and the second line binds the value 6 to n. The first binding occurs when the program is compiled. (The compiler records the fact that the type of n is int; neither this fact nor the name n appears explicitly in the compiled code.) The second binding occurs when the program is executed.

This example shows that there are two aspects of binding that we must consider in PLs: the attribute that is bound, and the time at which the binding occurs. The attributes that may be bound to a name include: type, address, and value. The times at which the bindings occur include: compile time, link time, load time, block entry time, and statement execution time.

Definition. A binding is static if it occurs before the program is executed: during compilation or linking. A binding is dynamic if it occurs while the program is running: during loading, block entry, or statement execution.

Example 31: Binding. Consider the program shown in Listing 64. The address of k is bound when the program starts; its value is bound by the scanf call. The address and value of n are bound only when the function f is called; if f is not called (because k ≤ 0), then n is never bound. □


Listing 64: Binding

#include <stdio.h>

void f()
{
    int n = 7;
    printf("%d", n);
}

int main()
{
    int k;
    scanf("%d", &k);
    if (k > 0)
        f();
    return 0;
}

9.3 Early and Late Binding

An anonymous contributor to the Encyclopedia of Computer Science wrote:

Broadly speaking, the history of software development is the history of ever-later binding time.

As a rough rule of thumb:
Early binding ⇒ efficiency;
Late binding ⇒ flexibility.

Example 32: Variable Addressing. In FORTRAN, addresses are bound to variable names at compile time. The result is that, in the compiled code, variables are addressed directly, without any indexing or other address calculations. (In reality, the process is somewhat more complicated. The compiler assigns an address relative to a compilation unit. When the program is linked, the address of the unit within the program is added to this address. When the program is loaded, the address of the program is added to the address. The important point is that, by the time execution begins, the “absolute” address of the variable is known.)

FORTRAN is efficient, because absolute addressing is used. It is inflexible, because all addresses are assigned at load time. This leads to wasted space, because all local variables occupy space whether or not they are being used, and also prevents the use of direct or indirect recursion.

In languages of the Algol family (Algol 60, Algol 68, C, Pascal, etc.), local variables are allocated on the run-time stack. When a function is called, a stack frame (called, in this context, an activation record or AR) is created for it and space for its parameters and local variables is allocated in the AR. The variable must be addressed by adding an offset (computed by the compiler) to the address of the AR, which is not known until the function is called.

Algol-style addressing is slightly less efficient than FORTRAN because addresses must be allocated at block-entry time and indexed addressing is required. It is more flexible than FORTRAN because inactive functions do not use up space and recursion works.

Languages that provide dynamic allocation, such as Pascal and C and most of their successors, have yet another way of binding an address to a variable. In these languages, a statement such as new(p) allocates space on the “heap” and stores the address of that space in the pointer p. Accessing the variable requires indirect addressing, which is slightly slower than indexed addressing, but greater flexibility is obtained because dynamic variables do not have to obey stack discipline (last in, first out). □
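All three ways of binding an address to a variable are available in C, so they can be compared in one short sketch. The function names next_id, sum_to, and new_cell are assumptions made for illustration; the comments describe the usual implementation rather than anything the C standard guarantees about addresses.

```c
#include <assert.h>
#include <stdlib.h>

/* Static storage: one location whose address is fixed before execution
   begins, as for a FORTRAN variable. The value persists between calls. */
int next_id(void)
{
    static int counter = 0;
    return ++counter;
}

/* Stack storage: every activation gets its own n and partial, addressed
   relative to its activation record, so recursion works. */
int sum_to(int n)
{
    assert(n >= 0);
    if (n == 0)
        return 0;
    int partial = sum_to(n - 1);
    return partial + n;
}

/* Heap storage: the variable is reached through a pointer and outlives
   the call that created it. Returns NULL if allocation fails; the caller
   is responsible for calling free on the result. */
int *new_cell(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;
}
```

The heap variable is the only one of the three whose lifetime is independent of both program start and function activation, which is the sense in which it does not obey stack discipline.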

Example 33: Function Addressing. When the compiler encounters a function definition, it binds an address to the function name. (As above, several steps must be completed before the absolute address of the function is determined, but this is not relevant to the present discussion.) When the compiler encounters a function invocation, it generates a call to the address that it has assigned to the function.

This statement is true of all PLs developed before OOP was introduced. OOPLs provide “virtual” functions. A virtual function name may refer to several functions. The actual function referred to in a particular invocation is not in general known until the call is executed.

Thus in non-OO PLs, functions are statically bound, with the result that calls are efficiently executed but no decisions can be made at run-time. In OOPLs, functions are dynamically bound. Calls take longer to execute, but there is greater flexibility because the decision as to which function to execute is made at run-time.

The overhead of a dynamically-bound function call depends on the language. In Smalltalk, the overhead can be significant, because the CM requires a search. In practice, the search can be avoided in up to 95% of calls by using a cache. In C++, the compiler generates “virtual function tables” and dynamic binding requires only indirect addressing. Note, however, that this provides yet another example of the principle: Smalltalk binds later than C++, runs more slowly, but provides greater flexibility. □

9.4 What Can Be Named?

The question “what can be named?” and its converse “what can be denoted?” are important questions to ask about a PL.

Definition. An entity can be named if the PL provides a definitional mechanism that associates a name with an instance of the entity. An entity can be denoted if the PL provides ways of expressing instances of the entity as expressions.

Example 34: Functions. In C, we can name functions but we cannot denote them. The definition

int f(int x)
{
    B
}

introduces a function with name f, parameter x, and body B. Thus functions in C can be named. There is no way that we can write a function without a name in C. Thus functions in C cannot be denoted.

The corresponding definition in Scheme would be:

(define f (lambda (x) E))

This definition can be split into two parts: the name of the function being defined (f) and the function itself ((lambda (x) E)). The notation is analogous to that of the λ-calculus:

f = λx .E

FPLs allow expressions analogous to λx .E to be used as functions. LISP, as mentioned in Section 4.1, is not one of these languages. However, Scheme is, and so are SML, Haskell, and other FPLs. □

Example 35: Types. In early versions of C, types could be denoted but not named. For example, we could write

int *p[10];

but we could not give a name to the type of p (“array of 10 pointers to int”). Later versions of C introduced the typedef statement, making types nameable:

typedef int *API[10];
API p;

□

9.5 What is a Variable Name?

We have seen that most PLs choose between two interpretations of a variable name. In a PL with value semantics, variable names denote memory locations. Assigning to a variable changes the contents of the location named by the variable. In a PL with reference semantics, variable names denote objects in memory. Assigning to a variable, if permitted by the language, changes the denotation of the name to a different object.
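The two interpretations can be contrasted in a short C++ sketch. C++ itself uses value semantics; reference semantics is only simulated here with std::shared_ptr, which is an assumption of this illustration rather than a claim about how any particular PL implements it. The type Account and both function names are hypothetical.

```cpp
#include <memory>

struct Account { int balance; };  // hypothetical type for illustration

// Value semantics: assignment copies the contents of one location into another.
int valueDemo() {
    Account a{100};
    Account b = a;      // b is a separate copy
    b.balance = 50;     // does not affect a
    return a.balance;   // still 100
}

// Reference semantics, simulated with shared_ptr: both names denote one object.
int referenceDemo() {
    auto a = std::make_shared<Account>(Account{100});
    auto b = a;         // b denotes the same object as a
    b->balance = 50;    // visible through a as well
    return a->balance;  // now 50
}
```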

Reference semantics is sometimes called “pointer semantics”. This is reasonable in the sense that the implementation of reference semantics requires the storage of addresses, that is, pointers. It is misleading in that providing pointers is not the same as providing reference semantics. The distinction is particularly clear in Hoare’s work on PLs. Hoare (1974) has this to say about the introduction of pointers into PLs.

Many language designers have preferred to extend [minor, localized faults in Algol 60 and other PLs] throughout the whole language by introducing the concept of reference, pointer, or indirect address into the language as an assignable item of data. This immediately gives rise in a high-level language to one of the most notorious confusions of machine code, namely that between an address and its contents. Some languages attempt to solve this by even more confusing automatic coercion rules. Worse still, an indirect assignment through a pointer, just as in machine code, can update any store location whatsoever, and the damage is no longer confined to the variable explicitly named as the target of assignment. For example, in Algol 68, the assignment

x := y

always changes x, but the assignment

p := y + 1

may, if p is a reference variable, change any other variable (of appropriate type) in the whole machine. One variable it can never change is p! . . . . References are like jumps, leading wildly from one part of a data structure to another. Their introduction into high-level languages has been a step backward from which we may never recover.


One year later, Hoare (1975) provided his solution to the problem of references in high-level PLs:

In this paper, we will consider a class of data structures for which the amount of storage can actually vary during the lifetime of the data, and we will show that it can be satisfactorily accommodated in a high-level language using solely high-level problem-oriented concepts, and without the introduction of references.

The implementation that Hoare proposes in this paper is a reference semantics with types. Explicit types make it possible to achieve

. . . . a significant improvement on the efficiency of compiled LISP, perhaps even a factor of two in space-time cost for suitable applications. (Hoare 1975)

All FPLs and most OOPLs (the notable exception, of course, being C++) use reference semantics. There are good reasons for this choice, but the reasons are not the same for each paradigm.

. In a FPL, all (or at least most) values are immutable. If X and Y have the same value, the program cannot tell whether X and Y are distinct objects that happen to be equal, or pointers to the same object. It follows that value semantics, which requires copying, would be wasteful because there is no point in making copies of immutable objects. (LISP provides two tests for equality. (eq x y) is true if x and y are pointers to the same object. (equal x y) is true if the objects x and y have the same extensional value. These two functions are provided partly for efficiency and partly to cover up semantic deficiencies of the implementation. Some other languages provide similar choices for comparison.)

. One of the important aspects of OOP is object identity. If a program object X corresponds to a unique entity in the world, such as a person, it should be unique in the program too. This is most naturally achieved with a reference semantics.
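The difference between the two LISP equality tests, and hence between identity and value, can be imitated in C++ with shared pointers. This is a rough analogy only; the function names sameObject and sameValue are hypothetical.

```cpp
#include <memory>
#include <string>

// Identity test, analogous to (eq x y): do both names denote the same object?
bool sameObject(const std::shared_ptr<std::string>& x,
                const std::shared_ptr<std::string>& y) {
    return x == y;    // compares the stored pointers
}

// Equality test, analogous to (equal x y): do the objects have the same value?
bool sameValue(const std::shared_ptr<std::string>& x,
               const std::shared_ptr<std::string>& y) {
    return *x == *y;  // compares the strings themselves
}
```

Two separately created strings with the same characters satisfy sameValue but not sameObject; two names for one string satisfy both.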

The use of reference semantics in OOP is discussed at greater length in (Grogono and Chalin 1994).

9.6 Polymorphism

The word “polymorphism” is derived from Greek and means, literally, “many shapes”. In PLs, “polymorphism” is used to describe a situation in which one name can refer to several different entities. The most common application is to functions. There are several kinds of polymorphism; the terms “ad hoc polymorphism” and “parametric polymorphism” are due to Christopher Strachey.

9.6.1 Ad Hoc Polymorphism

In the code

int m, n;
float x, y;
printf("%d %f", m + n, x + y);

the symbol “+” is used in two different ways. In the expression m + n, it stands for the function that adds two integers. In the expression x + y, it stands for the function that adds two floats.

In general, ad hoc polymorphism refers to the use of a single function name to refer to two or more distinct functions. Typically the compiler uses the types of the arguments of the function to decide which function to call. Ad hoc polymorphism is also called “overloading”.


Almost all PLs provide ad hoc polymorphism for built-in operators such as “+”, “−”, “*”, etc.

Ada, C++, and other recent languages also allow programmers to overload functions. (Strictly, we should say “overload function names” but the usage “overloaded functions” is common.) In general, all that the programmer has to do is write several definitions, using the same function name, but ensuring that either the number or the types of the arguments are different.

Example 36: Overloaded Functions. The following code declares three distinct functions in C++.

int max (int x, int y);
int max (int x, int y, int z);
float max (float x, float y);

□
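To make the declarations of Example 36 concrete, here is one possible set of definitions. The bodies are assumptions added for illustration; only the three signatures come from the example.

```cpp
// Three distinct functions sharing the name max; the compiler selects one
// from the number and types of the arguments at each call.
int max(int x, int y) { return x > y ? x : y; }
int max(int x, int y, int z) { return max(max(x, y), z); }
float max(float x, float y) { return x > y ? x : y; }
```

A call such as max(2, 5) resolves to the first definition, max(1, 9, 4) to the second, and max(1.5f, 0.5f) to the third. The choice is made entirely at compile time.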

9.6.2 Parametric Polymorphism

Suppose that a language provides the type list as a parameterized type. That is, we can make declarations such as these:

list(int)
list(float)

Suppose also that we have two functions: sum computes the sum of the components of a given list, and len computes the number of components in a given list. There is an important difference between these two functions. In order to compute the sum of a list, we must be able to choose an appropriate “add” function, and this implies that we must know the type of the components. On the other hand, there seems to be no need to know the type of the components if all we need to do is count them.

A function such as len, which counts the components of a list but does not care about their type, has parametric polymorphism.

Example 37: Lists in SML. In SML, functions can be defined by cases, “[]” denotes the empty list, and “::” denotes list construction. We write definitions without type declarations and SML infers the types. Here are functions that sum the components of a list and count the components of a list, respectively:

fun sum [] = 0
  | sum (x::xs) = x + sum(xs)
fun len [] = 0
  | len (x::xs) = 1 + len(xs)

SML infers the type of sum to be list(int) → int: since the sum of the empty list is (integer) 0, the type of the operator “+” must be int × int → int, and all of the components must be integers.

For the function len, SML can infer nothing about the type of the components from the function definition, and it assigns the type list(α) → int to the function. α is a type variable, acting as a type parameter. When len is applied to an actual list, α will implicitly assume the type of the list components. □
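A comparable pair of functions can be sketched in C++ using a template. This is only an approximation of the SML situation: the type parameter must be declared explicitly rather than inferred from the definition, and the compiler generates a separate instance of len for each component type.

```cpp
#include <cstddef>
#include <list>

// len never inspects the components, so it works for any component type T,
// much as SML assigns len the type list(α) → int.
template <typename T>
std::size_t len(const std::list<T>& xs) {
    std::size_t n = 0;
    for (auto it = xs.begin(); it != xs.end(); ++it) ++n;
    return n;
}

// sum must add components, so this version is fixed to lists of int,
// mirroring the type list(int) → int inferred for the SML sum.
int sum(const std::list<int>& xs) {
    int total = 0;
    for (int x : xs) total += x;
    return total;
}
```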


9.6.3 Object Polymorphism

In OOPLs, there may be many different objects that provide a function called, say, f. However, the effect of invoking f may depend on the object. The details of this kind of polymorphism, which are discussed in Section 6, depend on the particular OOPL. This kind of polymorphism is a fundamental, and very important, aspect of OOP.

9.7 Assignment

Consider the assignment x := E. Whatever the PL, the semantics of this statement will be something like:

. evaluate the expression E;

. store the value obtained in x.

The assignment is unproblematic if x and E have the same type. But what happens if they have different types? There are several possibilities.

. The statement is considered to be an error. This will occur in a PL that provides static type checking but does not provide coercion. For example, Pascal provides only a few coercions (subrange to integer, integer to real, etc.) and rejects other type differences.

. The compiler will generate code to convert the value of expression E to the type of x. This approach was taken to extremes in COBOL and PL/I. It exists in C and C++, but C++ is stricter than C.

. The value of E will be assigned to x anyway. This is the approach taken by PLs that use dynamic type checking. Types are associated with objects rather than with names.
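The second possibility, conversion inserted by the compiler, can be observed directly in C++. The function names in this sketch are hypothetical, and some compilers may warn about the narrowing conversion.

```cpp
// Implicit conversion on assignment: the compiler inserts code that converts
// the value of the expression to the type of the variable.
int truncated(double e) {
    int x = 0;
    x = e;      // double converted to int; the fractional part is discarded
    return x;
}

// In the other direction nothing is lost for small integers.
double widened(int e) {
    double x = 0.0;
    x = e;      // int converted to double
    return x;
}
```

The conversion from double to int silently loses information, which is one reason why stricter languages reject such assignments.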

9.8 Scope and Extent

The two most important properties of a variable name are scope and extent.

9.8.1 Scope

The scope of a name is the region of the program text in which a name may be used. In C++, for example, the scope of a local variable starts at the declaration of the variable and ends at the end of the block in which the declaration appears, as shown in Listing 65. Scope is a static property of a name that is determined by the semantics of the PL and the text of the program. Examples of other scope rules follow.

FORTRAN. Local variables are declared at the start of subroutines and at the start of the main program. Their scope starts at the declarations and ends at the end of the program unit (that is, subroutine or main program) within which the declaration appears. Subroutine names have global scope. It is also possible to make variables accessible in selected subroutines by means of COMMON statements, as shown in Listing 66. This method of “scope control” is insecure because the compiler does not attempt to check consistency between program units.

Algol 60. Local variables are declared at the beginning of a block. Scopes are nested: a declaration of a name in an inner block hides a declaration of the same name in an outer block, as shown in Listing 67.


Listing 65: Local scope in C++

{
    // x not accessible.
    t x;
    // x accessible
    . . . .
    {
        // x still accessible in inner block
        . . . .
    }
    // x still accessible
}
// x not accessible

Listing 66: FORTRAN COMMON blocks

      subroutine first
      integer i
      real a(100)
      integer b(250)
      common /myvars/ a, b
      . . . .
      return
      end
      subroutine second
      real x(100), y(100)
      integer k(250)
      common /myvars/ x, k, y
      . . . .
      return
      end

Listing 67: Nested scopes in Algol 60

begin
    real x; integer k;
    begin
        real x;
        x := 3.0
    end;
    x := 6.0;
end


The fine control of scope provided by Algol 60 was retained in Algol 68 but simplified in other PLs of the same generation.

Pascal. Control structures and nested functions are nested but declarations are allowed only at the start of the program or the start of a function declaration. In early Pascal, the order of declarations was fixed: constants, types, variables, and functions were declared in that order. Later dialects of Pascal sensibly relaxed this ordering.

Pascal also has a rather curious scoping feature: within the statement S of the statement with r do S, a field f of the record r may be written as f rather than as r.f.

C. Function declarations cannot be nested and all local variables must appear before executable statements in a block. The complete rules of C scoping, however, are complicated by the fact that C has five overloading classes (Harbison and Steele 1987):

. preprocessor macro names;

. statement labels;

. structure, union, and enumeration tags;

. component names;

. other names.

Consequently, the following declarations do not conflict:

{
    char str[200];
    struct str { int m, n; } x;
    . . . .
}

Inheritance. In OOPLs, inheritance is, in part, a scoping mechanism. When a child class C inherits a parent class P, some or all of the attributes of P are made visible in C.

C++. There are a number of additional scope control mechanisms in C++. Class attributes may be declared private (visible only within the class), protected (visible within the class and derived classes), and public (visible outside the class). Similarly, a class may be derived as private, protected, or public. A function or a class may be declared as a friend of a class, giving it access to the private attributes of the class. Finally, C++ follows C in providing directives such as static and extern to modify scope.

Global Scope. The name is visible throughout the program. Global scope is useful for pervasive entities, such as library functions and fundamental constants (π = 3.1415926) but is best avoided for application variables.

FORTRAN does not have global variables, although programmers simulate them by overusing COMMON declarations. Subroutine names in FORTRAN are global.

Names declared at the beginning of the outermost block of Algol 60 and Pascal programs have global scope. (There may be “holes” in these global scopes if the program contains local declarations of the same names.)

The largest default scope in C is “file scope”. Global scope can be obtained using the storage class extern.


Listing 68: Declaration before use

#include <iostream>

int main ()
{
    const int i = 6;
    {
        int j = i;          // refers to the outer i
        const int i = 7;    // the scope of the inner i starts here
        std::cout << j << std::endl;
    }
}

Listing 69: Qualified names

class Widget
{
public:
    void f();
};
Widget w;
f();    // Error: f is not in scope
w.f();  // OK

Local Scope. In block-structured languages, names declared in a block are local to the block. There is a question as to whether the scope starts at the point of definition or is the entire block. (In other words, does the PL require “declaration before use”?) C++ uses the “declare before use” rule and the program in Listing 68 prints 6.

In Algol 60 and C++, local scopes can be as small as the programmer needs. Any statement context can be instantiated by a block that contains local variable declarations. In other languages, such as Pascal, local declarations can be used only in certain places. Although a few people do not like the fine control offered by Algol 60 and C++, it seems best to provide the programmer with as much freedom as possible and to keep scopes as small as possible.

Qualified Scope. The components of a structure, such as a Pascal record or a C struct, have names. These names are usually hidden, but can be made visible by the name of the object. Listing 69 provides an example in C++. The construct w. opens a scope in which the public attributes of the object w are visible.

The dot notation also appears in many OOPLs, where it is used to select attributes and functions of an object. For an attribute, the syntax is the same: o.a denotes attribute a of object o. If the attribute is a function, a parameter list follows: o.f(x) denotes the invocation of function f with argument x in object o.

Import and Export. In modular languages, such as Modula-2, a name or group of names can be brought into a scope with an import clause. The module that provides the definitions must have a corresponding export clause.

Typically, if module A exports name X and module B imports name X from A, then X may be used in module B as if it were declared there.

Namespaces. The mechanisms above are inadequate for very large programs developed by large teams. Problems arise when a project uses several libraries that may have conflicting names.


Namespaces, provided by PLs such as C++ and Common LISP, provide a higher level of name management than the regular language features.
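A minimal sketch of the idea in C++ follows. The two “libraries”, graphics and audio, and their init functions are hypothetical names invented for this illustration.

```cpp
// Two hypothetical libraries that both define a function called init.
namespace graphics {
    int init() { return 1; }
}
namespace audio {
    int init() { return 2; }
}

// Qualified names select the intended definition without any conflict.
int startAll() {
    return graphics::init() + audio::init();
}
```

Without the namespaces, the two definitions of init would clash as soon as both libraries were used in one program.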

In summary:

. Nested scopes are convenient in small regions, such as functions. The advantage of nested scopes for large structures, such as classes or modules, is doubtful.

. Nesting is a form of scope management. For large structures, explicit control by name qualification may be better than nesting.

. Qualified names work well at the level of classes and modules, when the source of names is obvious.

. The import mechanism has the problem that the source of a name is not obvious where it appears: users must scan possibly remote import declarations.

. Qualified names are inadequate for very large programs.

. Library designers can reduce the potential for name conflicts by using distinctive prefixes. For example, all names supplied by the commercial graphics library FastGraph have fg_ as a prefix.

. Namespaces provide a better solution than prefixes.

Scope management is important because the programmer has no “work arounds” if the scoping mechanisms provided by the PL are inadequate. This is because PLs typically hide the scoping mechanisms.

If a program is compiled, names will normally not be accessible at run-time unless they are included for a specific purpose such as debugging. In fact, one way to view compiling is as a process that converts names to numbers.

Interpreters generally have a repository that contains the value of each name currently accessible to the program, but programmers may have limited access to this repository.

9.8.2 Are nested scopes useful?

As scope control mechanisms became more complicated, some people began to question the need for nested scopes (Clarke, Wilden, and Wolf 1980; Hanson 1981).

. On a small scale, for example in statements and functions, nesting works fairly well. It is, however, the cause of occasional errors that could be eliminated by requiring all names in a statement or function to be unique.

. On a large scale, for example in classes and modules, nesting is less effective because there is more text and a larger number of names. It is probably better to provide special mechanisms to provide the scope control that is needed rather than to rely on simple nesting.

C++ allows classes to be nested (this seems inconsistent, given that functions cannot be nested). The effect is that the nested class can be accessed only by the enclosing class. An alternative mechanism would be a declaration that states explicitly that class A can be accessed only by class B. This mechanism would have the advantage that it would also be able to describe constraints such as “class A can be accessed only by classes B, C, and D”.


9.8.3 Extent

The extent, also called lifetime, of a name is the period of time during program execution during which the object corresponding to the name exists. Understanding the relation between scope and extent is an important part of understanding a PL.

Global names usually exist for the entire lifetime of the execution: they have global extent.

In FORTRAN, local variables also exist for the lifetime of the execution. Programmers assume that, on entry to a subroutine, its local variables will not have changed since the last invocation of the subroutine.
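C and C++ offer a similar combination of local scope and global extent through static local variables. The sketch below contrasts a static local with an ordinary local; the function names are hypothetical.

```cpp
// A local variable with static storage duration: its scope is the function
// body, but its extent is the whole execution, so it keeps its value
// between calls.
int callsSoFar() {
    static int count = 0;   // initialized once, not on every call
    count = count + 1;
    return count;
}

// An ordinary (automatic) local is created afresh on each call.
int alwaysOne() {
    int count = 0;
    count = count + 1;
    return count;
}
```

Successive calls of callsSoFar return 1, 2, 3, and so on, whereas alwaysOne returns 1 every time.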

In Algol 60 and subsequent stack-based languages, local variables are instantiated whenever control enters a block and they are destroyed (at least in principle) when control leaves the block. It is an error to create an object on the stack and to pass a reference to that object to an enclosing scope. Some PLs attempt to detect this error (e.g., Algol 68), some try to prevent it from occurring (e.g., Pascal), and others leave it as a problem for the programmer (e.g., C and C++).

In PLs that use a reference model, objects usually have unlimited extent, whether or not their original names are accessible. Examples include Simula, Smalltalk, Eiffel, CLU, and all FPLs. The advantage of the reference model is that the problems of disappearing objects (dangling pointers) and inaccessible objects (memory leaks) do not occur. The disadvantage is that GC and the accompanying overhead are more or less indispensable.

The separation of extent from scope was a key step in the evolution of post-Algol PLs.

10 Abstraction

Abstraction is one of the most important mental tools in Computer Science. The biggest problems that we face in Computer Science are concerned with complexity; abstraction helps us to manage complexity. Abstraction is particularly important in the study of PLs because a PL is a two-fold abstraction.

A PL must provide an abstract view of the underlying machine. The values that we use in programming (integers, floats, booleans, arrays, and so on) are abstracted from the raw bits of computer memory. The control structures that we use (assignments, conditionals, loops, and so on) are abstracted from the basic machine instructions. The first task of a PL is to hide the low-level details of the machine and present a manageable interface, preferably an interface that does not depend on the particular machine that is being used.

A PL must also provide ways for the programmer to abstract information about the world. Most programs are connected to the world in one way or another, and some programs are little more than simulations of some aspect of the world. For example, an accounting program is, at some level, a simulation of tasks that used to be performed by clerks and accountants. The second task of a PL is to provide as much help as possible to the programmer who is creating a simplified model of the complex world in a computer program.

In the evolution of PLs, the first need came before the second. Early PLs emphasize abstraction from the machine: they provide assignments, loops, and arrays. Programs are seen as lists of instructions for the computer to execute. Later PLs emphasize abstraction from the world: they provide objects and processes. Programs are seen as descriptions of the world that happen to be executable.

The idea of abstracting from the world is not new, however. The design of Algol 60 began in the late fifties and, even at that time, the explicit goal of the designers was a “problem oriented language”.

Example 38: Sets and Bits. Here is an early example that contrasts the problem-oriented and machine-oriented approaches to PL design.

Pascal provides the problem-oriented abstraction set. We can declare types whose values are sets; we can define literal constants and variables of these types; and we can write expressions using set operations, as shown in Listing 70. In contrast, C provides bit-wise operations on integer data, as shown in Listing 71.

These features of Pascal and C have almost identical implementations. In each case, the elements of a set are encoded as positions within a machine word; the presence or absence of an element is represented by a 1-bit or a 0-bit, respectively; and the operations for and-ing and or-ing bits included in most machine languages are used to manipulate the words.

Listing 70: Sets in Pascal

type SmallNum = set of 1..10;
var s, t: SmallNum;
var n: integer;
begin
    s := [1, 3, 5, 7, 9];
    t := s + [2, 4, 6, 8];
    if (s <= t) . . . .
    if (n in s) . . . .
end

Listing 71: Bit operations in C

int n = 0x10101010101;
int mask = 0x10000;
if (n & mask) . . . .

The appearance of the program, however, is quite different. The Pascal programmer can think at the level of sets, whereas the C programmer is forced to think at the level of words and bits. Neither approach is necessarily better than the other: Pascal was designed for teaching introductory programming and C was designed for systems programmers. The trend, however, is away from machine abstractions such as bit strings and towards problem domain abstractions such as sets. □
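The common implementation behind the two listings can be made explicit by wrapping the bit operations in functions named after the set operations. This is a sketch under assumptions: the type name SmallSet, the function names, and the choice of a 32-bit word holding elements 0 to 31 are all invented here.

```cpp
#include <cstdint>

// A set of small numbers in the range 0..31, encoded as one machine word:
// element k is present exactly when bit k is 1.
using SmallSet = std::uint32_t;

SmallSet singleton(unsigned k)            { return SmallSet{1} << k; }
SmallSet setUnion(SmallSet s, SmallSet t) { return s | t; }                     // Pascal: s + t
bool contains(SmallSet s, unsigned k)     { return (s & singleton(k)) != 0; }   // Pascal: k in s
bool isSubset(SmallSet s, SmallSet t)     { return (s & ~t) == 0; }             // Pascal: s <= t
```

Each function compiles to one or two machine instructions, which is essentially what a Pascal compiler generates for the corresponding set operation.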

10.1 Abstraction as a Technical Term

“Abstraction” is a word with quite general meaning. In Computer Science in general, and in PLs in particular, it is often given a special sense. An abstraction mechanism provides a way of naming and parameterizing a program entity.

10.1.1 Procedures.

The earliest abstraction mechanism was the procedure, which exists in rudimentary form even in assembly language. Procedural abstraction enables us to give a name to a statement, or group of statements, and to use the name elsewhere in the program to trigger the execution of the statements. Parameters enable us to customize the execution according to the particular requirements of the caller.

10.1.2 Functions.

The next abstraction mechanism, which also appeared at an early stage, was the function. Functional abstraction enables us to give a name to an expression, and to use the name to trigger evaluation of the expression.

In many PLs, procedures and functions are closely related. Typically, they have similar syntax and similar rules of construction. The reason for this is that an abstraction mechanism that works only for simple expressions is not very useful. In practice, we need to abstract from arbitrarily complex programs whose main purpose is to yield a single value.

The difference between procedures and functions is most evident at the call site, where a procedure call appears in a statement context and a function call appears in an expression context.

FORTRAN provides procedures (called “subroutines”) and functions. Interestingly, COBOL does not provide procedure or function abstraction, although the PERFORM verb provides a rather restricted and unsafe way of executing statements remotely.

LISP provides function abstraction. The effect of procedures is obtained by writing functions with side-effects.

Algol 60, Algol 68, and C take the view that everything is an expression. (The idea seems to be that, since any computation leaves something in the accumulator, the programmer might as well be allowed to use it.) Thus statements have a value in these languages. It is not surprising that this viewpoint minimizes the difference between procedures and functions. In Algol 68 and C, a procedure is simply a function that returns a value of type void. Since this value is unique, it requires log 1 = 0 bits of storage and can be optimized away by the compiler.

10.1.3 Data Types.

Although Algol 68 provided a kind of data type abstraction, Pascal was more systematic and more influential. Pascal provides primitive types, such as integer, real, and boolean, and type expressions (C denotes an integer constant and T denotes a type):

C .. C
(C, C, . . . ., C)
array [T] of T
record v: T, . . . . end
set of T
file of T

Type expressions can be named and the names can be used in variable declarations or as components of other types. Pascal, however, does not provide a parameter mechanism for types.

10.1.4 Classes.

Simula introduced an abstraction mechanism for classes. Although this was an important and profound step, with hindsight we can see that two simple ideas were needed: giving a name to an Algol block, and freeing the data component of the block from the tyranny of stack discipline.

10.2 Computational Models

There are various ways of describing a PL. We can write an informal description in natural language (e.g., the Pascal User Manual and Report (Jensen and Wirth 1976)), a formal description in natural language (e.g., the Algol report (Naur et al. 1960)), or a document in a formal notation (e.g., the Revised Report on Algol 68 (van Wijngaarden et al. 1975)). Alternatively, we can provide a formal semantics for the language, using the axiomatic, denotational, or operational approaches.

There is, however, an abstraction that is useful for describing a PL in general terms, without the details of every construct, and that is the computational model (also sometimes called the semantic model). The computational model (CM) is an abstraction of the operational semantics of the PL and it describes the effects of various operations without describing the actual implementation.

Programmers who use a PL acquire intuitive knowledge of its CM. When a program behaves in an unexpected way, the implementation has failed to respect the CM.

Example 39: Passing parameters. Suppose that a PL specifies that arguments are passed by value. This means that the programs in Listing 72 and Listing 73 are equivalent. The statement T a = E means “the variable a has type T and initial value E” and the statement x ← a means “the value of a is assigned (by copying) to x”.

A compiler could implement this program by copying the value of the argument, as if executing the assignment x ← a. Alternatively, it could pass the address of the argument a to the function but not allow the statement S to change the value of a. Both implementations respect the CM. □


Listing 72: Passing a parameter

void f (T x)
{
    S;
}
T a = E;
f(a);

Listing 73: Revealing the parameter passing mechanism

T a = E;
{
    T x;
    x ← a;
    S;
}
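The behaviour predicted by this CM can be checked with a concrete instance in C++, which passes arguments by value unless told otherwise. The function names are hypothetical, and int stands in for the type T.

```cpp
// Pass by value: x is a fresh variable initialized with a copy of the argument.
int incrementCopy(int x) {
    x = x + 1;      // changes only the local copy
    return x;
}

// Calls incrementCopy and reports whether the caller's variable was left unchanged.
bool callerUnchanged() {
    int a = 5;
    int result = incrementCopy(a);
    return a == 5 && result == 6;
}
```

Whichever implementation the compiler chooses, callerUnchanged must return true if the CM is respected.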

Example 40: Parameters in FORTRAN. If a FORTRAN subroutine changes the value of one of its formal parameters (called “dummy variables” in FORTRAN parlance) the value of the corresponding actual argument at the call site also changes. For example, if we define the subroutine

      subroutine p (count)
      integer count
      . . . .
      count = count + 1
      return
      end

then, after executing the sequence

      integer total
      total = 0
      call p(total)

the value of total is 1. The most common way of achieving this result is to pass the address of the argument to the subroutine. This enables the subroutine to both access the value of the argument and to alter it.

Unfortunately, most FORTRAN compilers use the same mechanism for passing constant arguments. The effect of executing

      call p(0)

is to change all instances of 0 (the constant “zero”) in the program to 1 (the constant “one”)! The resulting bugs can be quite hard to detect for a programmer who is not familiar with FORTRAN’s idiosyncrasies.

The programmer has a mental CM for FORTRAN which predicts that, although variable arguments may be changed by a subroutine, constant arguments will not be. The problem arises because typical FORTRAN compilers do not respect this CM. □
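For comparison, here is a rough C++ analogue of the subroutine p, using a reference parameter in place of a FORTRAN dummy variable. The name totalAfterCall is hypothetical. Unlike the FORTRAN compilers described above, a C++ compiler rejects the call with a constant argument.

```cpp
// Pass by reference: count is another name for the caller's variable,
// which is roughly how a FORTRAN dummy variable behaves.
void p(int& count) {
    count = count + 1;
}

int totalAfterCall() {
    int total = 0;
    p(total);       // total is now 1
    // p(0);        // rejected by a C++ compiler: a literal cannot bind to int&
    return total;
}
```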

A simple CM does not imply easy implementation. Often, the opposite is true. For example, C is easy to compile, but has a rather complex CM.


Example 41: Array Indexing. In the CM of C, a[i] ≡ ∗(a + i). To understand this, we must know that an array can be represented by a pointer to its first element and that we can access elements of the array by adding the index (multiplied by an appropriate but implicit factor) to this pointer.

In Pascal’s CM, an array is a mapping. The declaration

var A : array [1..N] of real;

introduces A as a mapping: A : {1, . . . , N} → ℝ.

A compiler can implement arrays in any suitable way that respects this mapping. □
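The C equivalence can be checked directly, since C++ inherits the same rule for built-in arrays and pointers. The function names in this sketch are hypothetical.

```cpp
#include <cstddef>

// a[i] and *(a + i) denote the same element: the index is scaled by the
// element size and added to the address of the first element.
bool sameElement(const double* a, std::size_t i) {
    return &a[i] == a + i && a[i] == *(a + i);
}

// Because addition commutes, i[a] is accepted as well and means the same thing.
double oddIndex(const double* a, std::size_t i) {
    return i[a];
}
```

The second function shows how much of the implementation leaks into the CM of C: a notation such as i[a] would be meaningless if an array were simply a mapping, as in Pascal.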

Example 42: Primitives in OOPLs. There are two ways in which an OOPL can treat primitive entities such as booleans, characters, and integers.

. Primitive values can have a special status that distinguishes them from objects. A language that uses this policy is likely to be efficient, because the compiler can process primitives without the overhead of treating them as full-fledged objects. On the other hand, it may be confusing for the user to deal with two different kinds of variable: primitive values and objects.

. Primitive values can be given the same status as objects. This simplifies the programmer’s task, because all variables behave like objects, but a naive implementation will probably be inefficient.

A solution for this dilemma is to define a CM for the language in which all variables are objects. An implementor is then free to implement primitive values efficiently provided that the behaviour predicted by the CM is obtained at all times.

In Smalltalk, every variable is an object. Early implementations of Smalltalk were inefficient, because all variables were actually implemented as objects. Java provides primitives with efficient implementations but, to support various OO features, also has classes for the primitive types. □

Example 43: λ-Calculus as a Computational Model. FPLs are based on the theory of partial recursive functions. They attempt to use this theory as a CM. The advantage of this point of view is that program text resembles mathematics and, to a certain extent, programs can be manipulated as mathematical objects.

The disadvantage is that mathematics and computation are not the same. Some operations that are intuitively simple, such as maintaining the balance of a bank account subject to deposits and withdrawals, are not easily described in classical mathematics. The underlying difficulty is that many computations depend on a concept of mutable state that does not exist in mathematics.

FPLs have an important property called referential transparency. An expression is referentially transparent if its only attribute is its value. The practical consequence of referential transparency is that we can freely substitute equal expressions.

For example, if we are dealing with numbers, we can replace E + E by 2 × E; the replacement would save time if the evaluation of E requires a significant amount of work. However, if E were not referentially transparent (perhaps because its evaluation requires reading from an input stream or generating random numbers), the replacement would change the meaning of the program. □
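The substitution of 2 × E for E + E can be tried out directly. The following Python sketch is an illustration only; the functions pure_e and impure_e are hypothetical names, and an iterator stands in for the input stream.

```python
import itertools


def pure_e():
    """Referentially transparent: the only attribute of a call is its value."""
    return sum(range(10))


# A counter stands in for an input stream (an assumption of this sketch).
_stream = itertools.count(1)


def impure_e():
    """Not referentially transparent: each call consumes the next input."""
    return next(_stream)


# For the pure expression, E + E and 2 * E are interchangeable.
pure_sum = pure_e() + pure_e()
pure_double = 2 * pure_e()

# For the impure expression they are not: E + E reads 1 and then 2,
# whereas 2 * E reads only the next item, 3.
impure_sum = impure_e() + impure_e()
impure_double = 2 * impure_e()
```

The two pure results agree, so an optimizer could safely make the replacement; the two impure results differ, so the same rewrite would change the program's behaviour.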

Example 44: Literals in OOP. How should an OOPL treat literal constants, such as 7? The following discussion is based on (Kolling 1999, pages 71–74).

There are two possible ways of interpreting the symbol “7”: we could interpret it as an object constructor, constructing the object 7; or as a constant reference to an object that already exists.

Dee (Grogono 1991a) uses a variant of the first alternative: literals are constructors. This introduces several problems. For example, what does the comparison 7 = 7 mean? Are there two 7-objects or just one? The CM of Dee says that the first occurrence of the literal “7” constructs a new object and subsequent occurrences return a reference to this (unique) object.

Smalltalk uses the second approach. Literals are constant references to objects. These objects, however, “just magically exist in the Smalltalk universe”. Smalltalk does not explain where they come from, or when and how they are created.

Blue avoids these problems by introducing the new concept of a manifest class. Manifest classes are defined by enumeration. The enumeration may be finite, as for class Boolean with members true and false. Infinite enumerations, as for class Integer, are allowed, although we cannot see the complete declaration.

The advantage of the Blue approach, according to Kolling (1999, page 72), is that:

. . . . the distinction is made at the logical level. The object model explicitly recognizes these two different types of classes, and differences between numbers and complex objects can be understood independently of implementation concerns. They cease to be anomalies or special cases at a technical level. This reduces the dangers of misunderstandings on the side of the programmer. [Emphasis added.]

□

Exercise 36. Do you think that Kolling’s arguments for introducing a new concept (manifest classes) are justified? □

Summary. Early CMs:

. modelled programs as code acting on data;

. structured programs by recursive decomposition of code.

Later CMs:

. modelled programs as packages of code and data;

. structured programs by recursive decomposition of packages of code and data.

The CM should:

. help programmers to reason about programs;

. help programmers to read and write programs;

. constrain but not determine the implementation.

11 Implementation

This course is concerned mainly with the nature of PLs. Implementation techniques are a separate, and very rich, topic. Nevertheless, there are a few interesting things that we can say about implementation, and that is the purpose of this section.

11.1 Compiling and Interpreting

11.1.1 Compiling

As we have seen, the need for programming languages at a higher level than assembly language was recognized early. People also realized that the computer itself could perform the translation from a high level notation to machine code. In the following quote, “conversion” is used in roughly the sense that we would say “compilation” today.

On the other hand, large storage capacities will lead in the course of time to very elaborate conversion routines. Programmes in the future may be written in a very different language from that in which the machines execute them. It is to be hoped (and it is indeed one of the objects of using a conversion routine) that many of the tiresome blunders that occur in present-day programmes will be avoided when programmes can be written in a language in which the programmer feels more at home. However, any mistake that the programmer does commit may prove more difficult to find because it will not be detected until the programme has been converted into machine language. Before he can find the cause of the trouble the programmer will either have to investigate the conversion process or enlist the aid of checking routines which will ‘reconvert’ and provide him with diagnostic information in his own language. (Gill 1954)

A compiler translates source language to machine language. Since practical compilation systems allow programs to be split into several components for compiling, there are usually several phases.

1. The components of the source program are compiled into object code.

2. The object code units and library functions required by the program are linked to form an executable file.

3. The executable file is loaded into memory and executed.

Since machine code is executed directly, compiled programs ideally run at the maximum possible speed of the target machine. In practice, the code generated by a compiler is usually not optimal. For very small programs, a programmer skilled in assembly language may be able to write more efficient code than a compiler can generate. For large programs, a compiler can perform global optimizations that are infeasible for a human programmer and, in any case, hand-coding such large programs would require excessive time.

All compilers perform some checks to ensure that the source code that they are translating is correct. In early languages, such as FORTRAN, COBOL, PL/I, and C, the checking was quite superficial and it was easy to write programs that compiled without errors but failed when they were executed. The value of compile-time checking began to be recognized in the late 60s, and languages such as Pascal, Ada, and C++ provided more thorough checking than their predecessors. The goal of checking is to ensure that, if a program compiles without errors, it will not fail when it is executed. This goal is hard to achieve and, even today, there are few languages that achieve it totally.

Compiled PLs can be classified according to whether the compiler can accept program components or only entire programs. An Algol 60 program is a block and must be compiled as a block. Standard Pascal programs cannot be split up, but recent dialects of Pascal provide program components.

The need to compile programs in parts was recognized at an early stage: even FORTRAN was implemented in this way. The danger of separate compilation is that there may be inconsistencies between the components that the compiler cannot detect. We can identify various levels of checking that have evolved.

. A FORTRAN compiler provides no checking between components. When the compiled components are linked, the linker will report subroutines that have not been defined but it will not detect discrepancies in the number or type of arguments. The result of such a discrepancy is, at best, run-time failure.

. C and C++ rely on the programmer to provide header files containing declarations and implementation files containing definitions. The compiler checks that declarations and definitions are compatible.

. Overloaded functions in C++ present a problem because one name may denote several functions. A key design decision in C++, however, was to use the standard linker (Stroustrup 1994, page 34). The solution is “name mangling”: the C++ compiler alters the names of functions by encoding their argument types in such a way that each function has a unique name.

. Modula-2 programs consist of interface modules and implementation modules. When the compiler compiles an implementation module, it reads all of the relevant interfaces and performs full checking. In practice, the interface modules are themselves compiled for efficiency, but this is not logically necessary.

. Eiffel programs consist of classes that are compiled separately. The compiler automatically derives an interface from the class declaration and uses the generated interfaces of other classes to perform full checking. Other OOPLs that use this method include Blue, Dee, and Java.
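The idea behind name mangling, mentioned in the list above, can be shown with a toy encoding. The scheme below is invented for illustration; it loosely resembles the style used by some C++ compilers but ignores namespaces, pointers, templates, and everything else that a real scheme must handle. TYPE_CODES and mangle are hypothetical names.

```python
# One-letter codes for a few argument types (an assumption of this sketch).
TYPE_CODES = {"int": "i", "double": "d", "char": "c", "bool": "b"}


def mangle(name, arg_types):
    """Return a linker-level name that encodes the argument types."""
    # "v" marks an empty argument list.
    encoded = "".join(TYPE_CODES[t] for t in arg_types) or "v"
    # The length prefix keeps the function name separate from the type codes.
    return f"_Z{len(name)}{name}{encoded}"


int_max = mangle("max", ["int", "int"])
double_max = mangle("max", ["double", "double"])
no_args = mangle("reset", [])
```

The two overloads of max receive different linker names, so a standard linker that knows nothing about overloading can still tell them apart.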

11.1.2 Interpreting

An interpreter accepts source language and run-time input data and executes the program directly. It is possible to interpret without performing any translation at all, but this approach tends to be inefficient. Consequently, most interpreters do a certain amount of translation before executing the code. For example, the source program might be converted into an efficient internal representation, such as an abstract syntax tree, which is then “executed”.
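As a rough illustration of "executing" an abstract syntax tree, the following Python sketch evaluates arithmetic expressions by walking a tree. The node classes Num, Var, and BinOp and the function evaluate are hypothetical names chosen for this example; parsing is omitted, so the tree is built by hand.

```python
import operator
from dataclasses import dataclass


@dataclass
class Num:
    value: float


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: object
    right: object


OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate(node, env):
    """Walk the tree, using env for the run-time values of variables."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    if isinstance(node, BinOp):
        return OPS[node.op](evaluate(node.left, env),
                            evaluate(node.right, env))
    raise TypeError(f"unknown node: {node!r}")


# The tree for (x + 2) * 3.
tree = BinOp("*", BinOp("+", Var("x"), Num(2)), Num(3))
result = evaluate(tree, {"x": 4})
```

The same tree can be evaluated repeatedly with different environments, which is why building it once is cheaper than re-reading the source text each time.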

Interpreting is slower than executing compiled code, typically by a factor of 10 or more. Since no time is spent in compiling, however, interpreted languages can provide immediate response. For this reason, prototyping environments often provide an interpreted language. Early BASIC and LISP systems relied on interpretation, but more recent interactive systems provide an option to compile and may even provide compilation only.

Most interpreters check correctness of programs during execution, although some interpreters may report obvious (e.g., syntactic) errors as the program is read. Languages such as LISP and Smalltalk are often said to be untyped, because programs do not contain type declarations, but this is incorrect. The type checking performed by both of these languages is actually more thorough than that performed by C and Pascal compilers; the difference is that errors are not detected until the offending statements are actually executed. For safety-critical applications, this may be too late.
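Python checks types at run time in the same way, so it can be used to illustrate the point. In the sketch below (describe is a hypothetical function), the faulty line is accepted when the program is read and is reported only when control actually reaches it.

```python
def describe(flag):
    if flag:
        # Type error: a string cannot be added to an integer.
        # Nothing complains until this line is executed.
        return "length: " + 3
    return "nothing to report"


# The faulty branch is never reached, so the call succeeds.
ok = describe(False)

# The check is thorough, but it happens only at the offending statement.
try:
    describe(True)
    failed = False
except TypeError:
    failed = True
```

A test suite that never passes a true flag would not reveal the error, which is the risk for safety-critical applications.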

11.1.3 Mixed Implementation Strategies

Some programming environments, particularly those for LISP, provide both interpretation and compilation. In fact, some LISP environments allow source code and compiled code to be freely mixed.

Byte Codes. Another mixed strategy consists of using a compiler to generate byte codes and an interpreter to execute the byte codes. The “byte codes”, as the name implies, are simply streams of bytes that represent the program: a machine language for an “abstract machine”. One of the first programming systems to be based on byte codes was the Pascal “P” compiler.

The advantage of byte code programs is that they can, in principle, be run on any machine. Thus byte codes provide a convenient way of implementing “portable” languages.
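A minimal sketch of the byte code strategy follows. The three-instruction stack machine is invented for this example and is not P-code or Java byte code; PUSH, ADD, MUL, compile_expr, and run are hypothetical names.

```python
# Opcodes of a tiny abstract stack machine (invented for this sketch).
PUSH, ADD, MUL = 0, 1, 2


def compile_expr(expr):
    """Compile a nested tuple such as ("+", 2, 3) into a stream of bytes."""
    if isinstance(expr, int):
        if not 0 <= expr <= 255:
            raise ValueError("operands must fit in one byte")
        return bytes([PUSH, expr])
    op, left, right = expr
    opcode = {"+": ADD, "*": MUL}[op]
    # Postfix order: both operands first, then the operator.
    return compile_expr(left) + compile_expr(right) + bytes([opcode])


def run(code):
    """Interpret the byte codes on a stack."""
    stack = []
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        pc += 1
        if opcode == PUSH:
            stack.append(code[pc])
            pc += 1
        elif opcode == ADD:
            right = stack.pop()
            stack.append(stack.pop() + right)
        elif opcode == MUL:
            right = stack.pop()
            stack.append(stack.pop() * right)
        else:
            raise ValueError(f"bad opcode {opcode}")
    return stack.pop()


program = compile_expr(("*", ("+", 2, 3), 4))   # (2 + 3) * 4
answer = run(program)
```

Only run depends on the host machine; the bytes in program could be shipped unchanged to any platform that has an equivalent interpreter.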

The Pascal “P” compiler provided a simple method for porting Pascal to a new kind of machine. The “Pascal P package” consisted of the P-code compiler written in P-code. A Pascal implementor had to perform the following steps:

1. Write a P-code interpreter. Since P-code is quite simple, this is not a very arduous job.

2. Use the P-code interpreter to interpret the P-code compiler. This step yields a P-code program that can be used to translate Pascal programs to P-code. At this stage, the implementor has a working Pascal system, but it is inefficient because it depends on the P-code interpreter.

3. Modify the compiler source so that the compiler generates machine code for the new machine. This is a somewhat tedious task, but it is simplified by the fact that the compiler is written in Pascal.

4. Use the compiler created in step 2 to compile the modified compiler. The result of this step is a P-code program that translates Pascal to machine code.

5. Use the result of step 4 to compile the modified compiler. The result is a machine code program that translates Pascal to machine code: in other words, a Pascal compiler for the new machine.

Needless to say, all of these steps must be carried out with great care. It is important to ensure that each step is completed correctly before beginning the next.

With the exercise of due caution, the process described above (called bootstrapping) can be simplified. For example, the first version of the compiler, created in step 2, does not have to compile the entire Pascal language, but only those parts that are needed by the compiler itself. The complete sequence of steps can be used to generate a Pascal compiler that can compile itself, but may be incomplete in other respects. When that compiler is available, it can be extended to compile the full language.
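The bookkeeping in the five bootstrapping steps is easy to lose track of, so it may help to trace it mechanically. The sketch below records, for each translator, its input language, its output language, and the form in which the translator itself exists. The Translator class and the translate function are hypothetical; no real compilation takes place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Translator:
    source: str   # language accepted as input
    target: str   # language produced as output
    form: str     # language in which the translator itself is expressed


def translate(compiler, program, runnable):
    """Run compiler on program; the result is program in compiler.target form."""
    if compiler.form not in runnable:
        raise ValueError(f"cannot run a program in {compiler.form} form")
    if program.form != compiler.source:
        raise ValueError(f"compiler expects {compiler.source} input")
    return Translator(program.source, program.target, compiler.target)


# Step 1: with a P-code interpreter, P-code can be run as well as machine code.
runnable = {"machine code", "P-code"}

# Step 2: the P-code compiler, executed by the interpreter.
p_compiler = Translator("Pascal", "P-code", "P-code")

# Step 3: the compiler source, modified to generate machine code.
modified_source = Translator("Pascal", "machine code", "Pascal")

# Step 4: compile the modified compiler with the step 2 compiler.
step4 = translate(p_compiler, modified_source, runnable)

# Step 5: compile the modified compiler again, with the result of step 4.
step5 = translate(step4, modified_source, runnable)
```

After step 4 the translator generates machine code but still exists only as P-code; only after step 5 does it both generate and consist of machine code.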

Java uses byte codes to obtain portability in a different way. Java was designed to be used as an internet language. Any program that is to be distributed over the internet must be capable of running on all the servers and clients of the network. If the program is written in a conventional language, such as C, there must be a compiled version for every platform (a platform consists of a machine and operating system). If the program is written in Java, it is necessary only to provide a Java byte code interpreter for each platform.


Just In Time. “Just in time” (JIT) compilation is a recent strategy that is used for Java and a few research languages. The idea is that the program is interpreted, or perhaps compiled rapidly without optimization. As each statement is executed, efficient code is generated for it. This approach is based on the following assumptions:

. in a typical run, much of the code of a program will not be executed;

. if a statement has been executed once, it is likely to be executed again.

JIT provides fast response, because the program can be run immediately, and efficient execution, because the second and subsequent cycles of a loop are executed from efficiently compiled code. Thus JIT combines some of the advantages of interpretation and compilation.
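The following Python sketch imitates the JIT idea on a very small scale. It is a toy under stated assumptions: expressions are nested tuples, and "efficient code" is a Python function generated with eval rather than real machine code. JitExpression, interpret, and to_source are hypothetical names.

```python
def interpret(expr, env):
    """Slow path: walk the expression tree on every evaluation."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    a, b = interpret(left, env), interpret(right, env)
    if op == "+":
        return a + b
    if op == "*":
        return a * b
    raise ValueError(f"unknown operator {op!r}")


def to_source(expr):
    """Generate Python source text equivalent to the expression tree."""
    if isinstance(expr, int):
        return repr(expr)
    if isinstance(expr, str):
        return f"env[{expr!r}]"
    op, left, right = expr
    if op not in ("+", "*"):
        raise ValueError(f"unknown operator {op!r}")
    return f"({to_source(left)} {op} {to_source(right)})"


class JitExpression:
    def __init__(self, expr):
        self.expr = expr
        self.compiled = None
        self.interpreted_runs = 0
        self.compiled_runs = 0

    def run(self, env):
        if self.compiled is None:
            # First execution: interpret immediately, then compile for next time.
            result = interpret(self.expr, env)
            self.interpreted_runs += 1
            self.compiled = eval("lambda env: " + to_source(self.expr))
            return result
        # Later executions use the generated code.
        self.compiled_runs += 1
        return self.compiled(env)


body = JitExpression(("+", ("*", "x", "x"), 1))     # x * x + 1
results = [body.run({"x": i}) for i in range(4)]    # a loop of four cycles
```

Code that is never executed is never compiled, and only the first cycle of the loop pays the cost of interpretation, which matches the two assumptions listed above.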

Interactive Systems. A PL can be designed for writing large programs that run autonomously or for interacting with the user. The categories overlap: PLs that can be used interactively, such as BASIC, LISP, SML, and Smalltalk, can also be used to develop large programs. The converse is not true: an interactive environment for C++ would not be very useful.

An interactive PL is usually interpreted, but interpretation is not the only possible means of implementation. PLs that start out with interpreters often acquire compilers at a later stage; BASIC, LISP, and a number of other interactive PLs followed this pattern of development.

A PL is suitable for interactive use if programs can be described in small chunks. FPLs are often supported interactively because a programmer can build up a program as a collection of function definitions, where most functions are fairly small (e.g., they fit on a screen). Block structured languages, and modular languages, are less suitable for this kind of program development.

11.1.4 Consequences for Language Design

Any PL can be compiled or interpreted. A PL intended for interpretation, however, is usually implemented interactively. This implies that the user must be able to enter short sequences of code that can be executed immediately in the current context. Many FPLs are implemented in this way (often with a compiler); the user can define new functions and invoke any function that has previously been defined.

11.2 Garbage Collection

There is a sharp distinction: either a PL is implemented with garbage collection (GC) or it is not. The implementor does not really have a choice. An implementation of LISP, Smalltalk, or Java without GC would be useless: programs would exhaust memory in a few minutes at most. Conversely, it is almost impossible to implement C or C++ with GC, because programmers can perform arithmetic on pointers.

The imperative PL community has been consistently reluctant to incorporate GC: there is no widely-used imperative PL that provides GC. The functional PL community has been equally consistent, but in the opposite direction. The first FPL, LISP, provides GC, and so do all of its successors.

The OOP community is divided on the issue of GC, but most OOPLs provide it: Simula, Smalltalk, CLU, Eiffel, and Java. The exception, of course, is C++, for reasons that have been clearly explained (Stroustrup 1994, page 219).


I deliberately designed C++ not to rely on [garbage collection]. I feared the very significant space and time overheads I had experienced with garbage collectors. I feared the added complexity of implementation and porting that a garbage collector would impose. Also, garbage collection would make C++ unsuitable for many of the low-level tasks for which it was intended. I like the idea of garbage collection as a mechanism that simplifies design and eliminates a source of errors. However, I am fully convinced that had garbage collection been an integral part of C++ originally, C++ would have been stillborn.

It is interesting to study Stroustrup’s arguments in detail. In his view, the “fundamental reasons” that garbage collection is desirable are (Stroustrup 1994, page 220):

1. Garbage collection is easiest for the user. In particular, it simplifies the building and use of some libraries.

2. Garbage collection is more reliable than user-supplied memory management schemes for some applications.

The emphasis is added: Stroustrup makes it clear that he would have viewed GC in a different light if it provided advantages for all applications.

Stroustrup provides more arguments against GC than in favour of it (Stroustrup 1994, page 220):

1. Garbage collection causes run-time and space overheads that are not affordable for many current C++ applications running on current hardware.

2. Many garbage collection techniques imply service interruptions that are not acceptable for important classes of applications, such as hard real-time applications, device drivers, control applications, human interface code on slow hardware, and operating system kernels.

3. Some applications do not have the hardware resources of a traditional general-purpose computer.

4. Some garbage collection schemes require banning several basic C facilities such as pointer arithmetic, unchecked arrays, and unchecked function arguments as used by printf().

5. Some garbage collection schemes impose constraints on object layout or object creation that complicates interfacing with other languages.

It is hard to argue with Stroustrup’s conclusion (Stroustrup 1994, page 220):

I know that there are more reasons for and against, but no further reasons are needed. These are sufficient arguments against the view that every application would be better done with garbage collection. Similarly, these are sufficient arguments against the view that no application would be better done with garbage collection. (Stroustrup 1994, page 220) [Emphasis in the original.]

Stroustrup’s arguments do not preclude GC as an option for C++ and, in fact, Stroustrup advocates this. But it remains true that it is difficult to add GC to C++ and the best that can be done is probably “conservative” GC. In the meantime, most C++ programmers spend large amounts of time figuring out when to call destructors and debugging the consequences of their wrong decisions.

Conservative GC depends on the fact that it is possible to guess with a reasonable degree of accuracy whether a particular machine word is a pointer (Boehm and Weiser 1988). In typical modern processors, a pointer is a 4-byte quantity, aligned on a word boundary, and its value is a legal address. By scanning memory and noting words with these properties, the GC can identify regions of memory that might be in use. The problem is that there will occasionally be words that look like pointers but actually are not: this results in about 10% of garbage being retained. The converse, and more serious problem, of identifying live data as garbage cannot occur: hence the term “conservative”.
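The pointer-guessing test can be sketched as follows. This is a simplified model, not the Boehm and Weiser collector: the heap is a Python dictionary of fixed-size blocks, words are plain integers, and all names (looks_like_pointer, conservative_mark, the addresses) are assumptions of the example.

```python
WORD = 4   # bytes per word, as in the description above


def looks_like_pointer(word, heap_start, heap_end):
    """Guess: aligned on a word boundary and a legal heap address."""
    return word % WORD == 0 and heap_start <= word < heap_end


def conservative_mark(roots, heap, heap_start, heap_end, block_size):
    """Return the start addresses of all blocks that might be in use."""
    marked = set()
    pending = list(roots)
    while pending:
        word = pending.pop()
        if not looks_like_pointer(word, heap_start, heap_end):
            continue
        # Round down to the start of the block containing the address.
        block = word - (word - heap_start) % block_size
        if block in heap and block not in marked:
            marked.add(block)
            pending.extend(heap[block])   # scan the block's own words
    return marked


heap = {
    0x1000: [0x1010, 7, 0, 0],   # live; points to the block at 0x1010
    0x1010: [0, 0, 0, 0],        # live
    0x1020: [0x1030, 0, 0, 0],   # garbage, but see the roots below
    0x1030: [0, 0, 0, 0],        # garbage reachable from 0x1020
    0x1040: [0, 0, 0, 0],        # garbage
}
# The second root is an integer counter whose value happens to be 0x1020.
roots = [0x1000, 4128, 3]
marked = conservative_mark(roots, heap, 0x1000, 0x1050, 16)
```

The counter value 4128 is mistaken for a pointer, so the blocks at 0x1020 and 0x1030 are retained unnecessarily; the block at 0x1040 is correctly left unmarked, and no live block is ever missed.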

The implementation of Eiffel provided by Meyer’s own company, ISE, has a garbage collector that can be turned on and off. Eiffel, however, is a public-domain language that can be implemented by anyone who chooses to do so; there are in fact a small number of companies other than ISE that provide Eiffel. Implementors are encouraged, but not required, to provide GC. Meyer (1992, page 335) recommends that an Eiffel garbage collector should have the following properties.

. It should be efficient, with a low overhead on execution time.

. It should be incremental, working in small, frequent bursts of activity rather than waiting for memory to fill up.

. It should be tunable. For example, programmers should be able to switch off GC in critical sections and to request a full clean-up when it is safe to do so.
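Tunability in this sense is available in several current systems. As an illustration in Python rather than Eiffel, the standard gc module allows the cyclic collector to be switched off around a critical section and a full collection to be requested afterwards. The helper critical_section is a hypothetical name; note that in CPython, reference counting continues to reclaim most objects even while the cyclic collector is disabled.

```python
import gc


def critical_section(work):
    """Run work() with the cyclic garbage collector switched off."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        return work()
    finally:
        # Restore the previous setting even if work() raises.
        if was_enabled:
            gc.enable()


gc.enable()                                    # known starting state
state_inside = critical_section(gc.isenabled)  # collector state during the work
state_after = gc.isenabled()                   # collector state afterwards

# Request a full clean-up at a moment when a pause is acceptable.
unreachable_found = gc.collect()
```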

The concern about the efficiency of GC is the result of the poor performance of early, mark-sweep collectors. Modern garbage collectors have much lower overheads and are “tunable” in Meyer’s sense. Larose and Feeley (1999), for example, describe a garbage collector for a Scheme compiler. Programs written in Erlang, translated into Scheme, and compiled with this compiler are used in ATM switches. (Erlang is a proprietary functional language developed by Ericsson.)

GC has divided the programming community into two camps: those who use PLs with GC and consider it indispensable, and those who use PLs without GC and consider it unnecessary. Experience with C++ is beginning to affect the polarization, as the difficulties and dangers of explicit memory management in complex OO systems become apparent. Several commercial products that offer “conservative” GC are now available and their existence suggests that developers will increasingly respect the contribution of GC.

12 Conclusion

Values and Objects. Mathematically oriented views of programming favour values. Simulation oriented views favour objects. Both views are useful for particular applications. A PL that provides both in a consistent and concise way might be simple, expressive, and powerful.

The “logical variable” (a variable that is first unbound, then bound, and later may be unbound during backtracking) is a third kind of entity, after values and objects. It might be fruitful to design a PL with objects, logical variables, unification, and backtracking. Such a language would combine the advantages of OOP and logic programming.

Making Progress. It is not hard to design a large and complex PL by throwing in numerous features. The hard part of design is to find simple, powerful abstractions: to achieve more with less. In this respect, PL design resembles mathematics. The significant advances in mathematics are often simplifications that occur when structures that once seemed distinct are united in a common abstraction. Similar simplifications have occurred in the evolution of PLs: for example, Simula (Section 6.1) and Scheme (Section 4.2). But programs are not mathematical objects. A rigorously mathematical approach can lead all too rapidly to the “Turing tar pit”.

Practice and Theory. The evolution of PLs shows that, most of the time, practice leads theory. Designers and implementors introduce new ideas, then theoreticians attempt to explain what they did and how they could have done it better. There are a number of PLs that have been based on purely theoretical principles but few, if any, are in widespread use.

Theory-based PLs are important because they provide useful insights. Ideally, they serve as testing environments for ideas that may eventually be incorporated into new mainstream PLs. Unfortunately, it is often the case that they show retroactively how something should have been done when it is already too late to improve it.

Multiparadigm Programming. PLs have evolved into several strands: procedural programming, now more or less subsumed by object oriented programming; functional programming; logic programming and its successor, constraint programming. Each paradigm performs well on a particular class of applications. Since people are always searching for “universal” solutions, it is inevitable that some will try to build a “universal” PL by combining paradigms. Such attempts will be successful to the extent that they achieve overall simplification. It is not clear that any future universal language will be accepted by the programming community; it seems more likely that specialized languages will dominate in the future.

References

Abelson, H. and G. Sussman (1985). Structure and Interpretation of Computer Programs. MIT Press.

Ake Wikstrom (1987). Functional Programming Using Standard ML. Prentice Hall.

Apt, K. R., J. Brunekreef, V. Partington, and A. Schaerf (1998, September). Alma-0: an imperative language that supports declarative programming. ACM Trans. Programming Languages and Systems 20 (5), 1014–1066.

Arnold, K. and J. Gosling (1998). The Java Programming Language (Second ed.). The Java Series. Addison Wesley.

Ashcroft, E. and W. Wadge (1982). Rx for semantics. ACM Trans. Programming Languages and Systems 4 (2), 283–294.

Backus, J. (1957, February). The FORTRAN automatic coding system. In Proceedings of the Western Joint Computer Conference.

Backus, J. (1978, August). Can programming be liberated from the von Neumann style? A functional style and its algebra of programs. Communications of the ACM 21 (8), 613–641.

Bergin, Jr., T. J. and R. G. Gibson, Jr. (Eds.) (1996). History of Programming Languages–II. ACM Press (Addison Wesley).

Boehm, H.-J. and M. Weiser (1988). Garbage collection in an uncooperative environment. Software: Practice & Experience 18 (9), 807–820.

Bohm, C. and G. Jacopini (1966, May). Flow diagrams, Turing machines, and languages with only two formation rules. Communications of the ACM 9 (5), 366–371.

Bratko, I. (1990). PROLOG: Programming for Artificial Intelligence (Second ed.). Addison Wesley.

Brinch Hansen, P. (1999, April). Java’s insecure parallelism. ACM SIGPLAN 34 (4), 38–45.

Brooks, F. P. (1975). The Mythical Man-Month. Addison Wesley.

Brooks, F. P. (1987, April). No silver bullet: Essence and accidents for software engineering. IEEE Computer 20 (4), 10–18.

Brooks, F. P. (1995). The Mythical Man-Month (Anniversary ed.). Addison Wesley.

Budd, T. A. (1995). Multiparadigm Programming in Leda. Addison Wesley.

Church, A. (1941). The calculi of lambda-conversion. In Annals of Mathematics Studies, Volume 6. Princeton University Press.

Clarke, L. A., J. C. Wilden, and A. L. Wolf (1980). Nesting in Ada programs is for the birds. In Proc. ACM SIGPLAN Symposium on the Ada programming Language. Reprinted as ACM SIGPLAN Notices, 15(11):139–145, November 1980.

Colmerauer, A., H. Kanoui, P. Roussel, and R. Pasero (1973). Un systeme de communication homme-machine en francais. Technical report, Groupe de Recherche en Intelligence Artificielle, Universite d’Aix-Marseille.

Dahl, O.-J., E. W. Dijkstra, and C. A. R. Hoare (1972). Structured Programming. Academic Press.

Dijkstra, E. W. (1968, August). Go to statement considered harmful. Communications of the ACM 11 (8), 538.

Gelernter, D. and S. Jagannathan (1990). Programming Linguistics. MIT Press.

Gill, S. (1954). Getting programmes right. In International Symposium on Automatic Digital Computing. National Physical Laboratory. Reprinted as pp. 289–292 of (Williams and Campbell-Kelly 1989).

Griswold, R. E. and M. T. Griswold (1983). The Icon Programming Language. Prentice Hall.

Grogono, P. (1991a, January). The Dee report. Technical Report OOP–91–2, Department of Computer Science, Concordia University. http://www.cs.concordia.ca/~faculty/grogono/dee.html.

Grogono, P. (1991b, January). Issues in the design of an object oriented programming language. Structured Programming 12 (1), 1–15. Available as www.cs.concordia/~faculty/grogono/oopissue.ps.

Grogono, P. and P. Chalin (1994, May). Copying, sharing, and aliasing. In Colloquium on Object Orientation in Databases and Software Engineering (ACFAS’94), Montreal, Quebec. Available as www.cs.concordia/~faculty/grogono/copying.ps.

Hanson, D. R. (1981, August). Is block structure necessary? Software: Practice and Experience 11 (8), 853–866.

Harbison, S. P. and G. L. Steele, Jr. (1987). C: A Reference Manual (second ed.). Prentice Hall.

Hoare, C. A. R. (1968). Record handling. In Programming Languages, pp. 291–347. Academic Press.

Hoare, C. A. R. (1974). Hints on programming language design. In C. Bunyan (Ed.), Computer Systems Reliability, Volume 20 of State of the Art Report, pp. 505–34. Pergamon/Infotech. Reprinted as pages 193–216 of (Hoare 1989).

Hoare, C. A. R. (1975, June). Recursive data structures. Int. J. Computer and Information Science 4 (2), 105–32. Reprinted as pages 217–243 of (Hoare 1989).

Hoare, C. A. R. (1989). Essays in Computing Science. Prentice Hall. Edited by C. B. Jones.

Jensen, K. and N. Wirth (1976). Pascal: User Manual and Report. Springer-Verlag.

Joyner, I. (1992). C++? a critique of C++. Available from [email protected]. Second Edition.

Kay, A. C. (1996). The early history of Smalltalk. In History of Programming Languages–II, pp. 511–578. ACM Press (Addison Wesley).

Kleene, S. C. (1936). General recursive functions of natural numbers. Mathematische Annalen 112, 727–742.

Knuth, D. E. (1974). Structured programming with GOTO statements. ACM Computing Surveys 6 (4), 261–301.

Kolling, M. (1999). The Design of an Object-Oriented Environment and Language for Teaching. Ph. D. thesis, Basser Department of Computer Science, University of Sydney.

Kowalski, R. A. (1974). Predicate logic as a programming language. In Information processing 74, pp. 569–574. North Holland.

LaLonde, W. and J. Pugh (1995, March/April). Complexity in C++: a Smalltalk perspective. Journal of Object-Oriented Programming 8 (1), 49–56.

Landin, P. J. (1965, August). A correspondence between Algol 60 and Church’s lambda-notation. Communications of the ACM 8, 89–101 and 158–165.

Larose, M. and M. Feeley (1999). A compacting incremental collector and its performance in a production quality compiler. In ACM SIGPLAN International Symposium on Memory Management (ISMM’98). Reprinted as ACM SIGPLAN Notices 34(3):1–9, March 1999.

Lientz, B. P. and E. B. Swanson (1980). Software Maintenance Management. Addison Wesley.

Lindsey, C. H. (1996). A history of Algol 68. In History of Programming Languages, pp. 27–83. ACM Press (Addison Wesley).

Liskov, B. (1996). A history of CLU. In History of Programming Languages–II, pp. 471–496. ACM Press (Addison Wesley).

Liskov, B. and J. Guttag (1986). Abstraction and Specification in Program Development. MIT Press.

Liskov, B. and S. Zilles (1974, April). Programming with abstract data types. ACM SIGPLAN Notices 9 (4), 50–9.

MacLennan, B. J. (1983, December). Values and objects in programming languages. ACM SIGPLAN Notices 17 (12), 70–79.

Martin, J. J. (1986). Data Types and Data Structures. Prentice Hall International.

McAloon, K. and C. Tretkoff (1995). 2LP: Linear programming and logic programming. In P. V. Hentenryck and V. Saraswat (Eds.), Principles and Practice of Constraint Programming, pp. 101–116. MIT Press.

McCarthy, J. (1960, April). Recursive functions of symbolic expressions and their computation by machine, Part I. Communications of the ACM 3 (4), 184–195.

McCarthy, J. (1978). History of LISP. In R. L. Wexelblat (Ed.), History of Programming Languages, pp. 173–197. Academic Press.

McKee, J. R. (1984). Maintenance as a function of design. In Proc. AFIPS National Computer Conference, pp. 187–93.

Meyer, B. (1988). Object-oriented Software Construction. Prentice Hall International. Second Edition, 1997.

Meyer, B. (1992). Eiffel: the Language. Prentice Hall International.

Milne, R. E. and C. Strachey (1976). A Theory of Programming Language Semantics. John Wiley.

Milner, B. and M. Tofte (1991). Commentary on Standard ML. MIT Press.

Milner, B., M. Tofte, and R. Harper (1990). The Definition of Standard ML. MIT Press.

Mutch, E. N. and S. Gill (1954). Conversion routines. In International Symposium on Automatic Digital Computing. National Physical Laboratory. Reprinted as pp. 283–289 of (Williams and Campbell-Kelly 1989).

Naur, P. (1978). The European side of the last phase of the development of Algol. In History of Programming Languages. ACM Press. Reprinted as pages 92–137 of (Wexelblat 1981).

Naur, P. et al. (1960, May). Report on the algorithmic language Algol 60. Communications of the ACM 3 (5), 299–314.

Nygaard, K. and O.-J. Dahl (1978). The development of the SIMULA languages. In R. L. Wexelblat (Ed.), History of Programming Languages, pp. 439–478. Academic Press.

Parnas, D. L. (1972, December). On the criteria to be used in decomposing systems into modules. Communications of the ACM 15 (12), 1053–1058.

Perlis, A. J. (1978). The American side of the development of Algol. In History of Programming Languages. ACM Press. Reprinted as pages 75–91 of (Wexelblat 1981).

Radin, G. (1978). The early history and characteristics of PL/I. In History of Programming Languages. ACM Press. Reprinted as pages 551–574 of (Wexelblat 1981).


Ritchie, D. M. (1996). The development of the C programming language. In History of Programming Languages, pp. 671–686. ACM Press (Addison Wesley).

Robinson, J. A. (1965). A machine-oriented logic based on the resolution principle. Journal of the ACM 12, 23–41.

Sakkinen, M. (1988). On the darker side of C++. In S. Gjessing and K. Nygaard (Eds.), Proceedings of the 1988 European Conference of Object Oriented Programming, pp. 162–176. Springer LNCS 322.

Sakkinen, M. (1992). The darker side of C++ revisited. Structured Programming 13.

Sammet, J. (1969). Programming Languages: History and Fundamentals. Prentice Hall.

Sammet, J. E. (1978). The early history of COBOL. In History of Programming Languages. ACM Press. Reprinted as pages 199–241 of (Wexelblat 1981).

Schwartz, J. T., R. B. K. Dewar, E. Dubinsky, and E. Schonberg (1986). Programming with sets: an introduction to SETL. Springer Verlag.

Steele, Jr., G. L. (1996). The evolution of LISP. In History of Programming Languages, pp. 233–308. ACM Press (Addison Wesley).

Steele, Jr., G. L. et al. (1990). Common LISP: the Language (second ed.). Digital Press.

Steele, Jr., G. L. and G. J. Sussman (1975). Scheme: an interpreter for the extended lambda calculus. Technical Report Memo 349, MIT Artificial Intelligence Laboratory.

Strachey, C. (1966). Towards a formal semantics. In T. B. Steel (Ed.), Formal Language Description Languages for Computer Programming. North-Holland.

Stroustrup, B. (1994). The Design and Evolution of C++. Addison Wesley.

Taivalsaari, A. (1993). A Critical View of Inheritance and Reusability in Object-oriented Programming. Ph. D. thesis, University of Jyvaskyla.

Teale, S. (1993). C++ IOStreams Handbook. Addison Wesley. Reprinted with corrections, February 1994.

Turing, A. M. (1936). On computable numbers with an application to the Entscheidungsproblem. Proc. London Math. Soc. 2 (42), 230–265.

Turner, D. (1976). The SASL Language Manual. Technical report, University of St. Andrews.

Turner, D. (1979). A new implementation technique for applicative languages. Software: Practice & Experience 9, 31–49.

Turner, D. (1981). KRC language manual. Technical report, University of Kent.

Turner, D. (1985, September). Miranda: a non-strict functional language with polymorphic types. In Proceedings of the Second International Conference on Functional Programming Languages and Computer Architecture. Springer LNCS 301.

van Wijngaarden, A. et al. (1975). Revised report on the algorithmic language Algol 68. Acta Informatica 5, 1–236. (In three parts.)

Wexelblat, R. L. (Ed.) (1981). History of Programming Languages. Academic Press. From the ACM SIGPLAN History of Programming Languages Conference, 1–3 June 1978.

Wheeler, D. J. (1951). Planning the use of a paper library. In Report of a Conference on High Speed Automatic Calculating-Engines. University Mathematical Laboratory, Cambridge. Reprinted as pp. 42–44 of (Williams and Campbell-Kelly 1989).

Whitaker, W. A. (1996). Ada - the project. In History of Programming Languages, pp. 173–228. ACM Press (Addison Wesley).


Williams, M. R. and M. Campbell-Kelly (Eds.) (1989). The Early British Computer Conferences, Volume 14 of The Charles Babbage Institute Reprint Series for the History of Computing. The MIT Press and Tomash Publishers.

Wirth, N. (1971, April). Program development by stepwise refinement. Communications of the ACM 14 (4), 221–227.

Wirth, N. (1976). Algorithms + Data Structures = Programs. Prentice Hall.

Wirth, N. (1982). Programming in Modula-2. Springer-Verlag.

Wirth, N. (1996). Recollections about the development of Pascal. In History of Programming Languages, pp. 97–110. ACM Press (Addison Wesley).

A Abbreviations and Glossary

Actual parameter. A value passed to a function.

ADT. Abstract data type. A data type described in terms of the operations that can be performed on the data.

AR. Activation record. The run-time data structure corresponding to an Algol 60 block. Section 3.3.

Argument. A value passed to a function; synonym of actual parameter.

Bind. Associate a property with a name.

Block. In Algol 60, a syntactic unit containing declarations of variables and functions, followed by code. Successors of Algol 60 define blocks in similar ways. In Smalltalk, a block is a closure consisting of code, an environment with bindings for local variables, and possibly parameters.

BNF. Backus-Naur Form (originally Backus Normal Form). A notation for describing a context-free grammar, introduced by John Backus for describing Algol 60. Section B.4.2.

Call by need. The argument of a function is not evaluated until its value is needed, and then only once. See lazy evaluation.

CBN. Call by name. A parameter passing method in which actual parameters are evaluated each time they are needed by the called function. Section 3.3.

CBV. Call by value. A parameter passing method in which the actual parameter is evaluated before the function is called.

Closure. A data structure consisting of one (or more, but usually one) function and its lexical environment. Section 4.2.

CM. Computational Model. Section 10.2.

Compile. Translate source code (program text) into code for a physical (or possibly abstract) machine.

CS. Computer Science.

Delegation. An object that accepts a request and then forwards the request to another object is said to “delegate” the request.

Dummy parameter. Formal parameter (FORTRAN usage).

Dynamic. Used to describe something that happens during execution, as opposed to static.

Dynamic Binding. A binding that occurs during execution.

Dynamic Scoping. The value of a non-local variable is the most recent value that has been assigned to it during the execution of the program. Cf. Lexical Scoping. Section 4.1.

Dynamic typing. Performing type checking “on the fly”, while the program is running.

Eager evaluation. The arguments of a function are evaluated at the time that it is called, whether or not their values are needed.


Early binding. Binding a property to a name at an early stage, usually during compilation.

Exception handling. A PL feature that permits a computation to be abandoned; control is transferred to a handler in a possibly non-local scope.

Extent. The period of execution during which a variable is active. It is a dynamic property, as opposed to scope.

Formal parameter. A variable name that appears in a function heading; the variable will receive a value when the function is called. “Parameter” without qualification usually means “formal parameter”.

FP(L). Functional Programming (Language). Section 4.

Garbage collection. Recycling of unused memory when performed automatically by the implementation rather than coded by the programmer. (The term “garbage collection” is widely used but inaccurate. The “garbage” is thrown away: it is the useful data that is collected!)

GC. Garbage collection.

Global scope. A name that is accessible throughout the (static) program text has global scope.

Heap. A region for storing variables in which no particular order of allocation or deallocation is defined.

High order function. A function that either accepts a function as an argument, or returns a function as a value, or both.

Higher order function. Same as high order function.

Inheritance. In OOPLs, when an object (or class) obtains some of its features from another object (or class), it is said to inherit those features.

Interpret. Execute a program one statement at a time, without translating it into machine code.

Lambda calculus. A notation for describing functions, using the Greek letter λ, introduced by Church.

Late binding. A binding (that is, attachment of a property to a name) that is performed at a later time than would normally be expected. For example, procedural languages bind functions to names during compilation, but OOPLs “late bind” function names during execution.

Lazy evaluation. If the implementation uses lazy evaluation, or call by need, a function does not evaluate an argument until the value of the argument is actually required for the computation. When the argument has been evaluated, its value is stored and it is not evaluated again during the current invocation of the function.

Lexical Scoping. The value of a non-local variable is determined in the static context as determined by the text of the program. Cf. Dynamic Scoping. Section 4.1.

Local scope. A name that is accessible in only a part of the program text has local scope.

OOP(L). Object Oriented Programming (Language). Section 6.

Orthogonal. Two language features are orthogonal if they can be combined without losing any of their important properties. Section 3.6.


Overloading. Using the same name to denote more than one entity (the entities are usually functions).

Own variable. A local variable in an Algol 60 program that has local scope (it can be accessed only within the procedure in which it is declared) but global extent (it is created when the procedure is first called and persists until the program terminates). Section 3.3.

Parameter. Literally “unknown” or “unmeasurable” (from Greek). Used in PLs for the objects passed to functions.

Partial function. A function that is defined over only a part of its domain. √x is partial over the domain of reals because it has no real value if x < 0.

PL. Programming Language.

Polymorphism. Using one name to denote several distinct objects. (Literally “many shapes”).

Recursion. A process or object that is defined partly in terms of itself without circularity.

Reference semantics. Variable names denote objects. Used by all functional and logic languages, and most OOPLs.

RE. Regular Expression. Section B.4.

Scope. The region of text in which a variable is accessible. It is a static property, as opposed to extent.

Semantic Model. An alternative term for Computational Model (CM). Section 10.2.

Semantics. A system that assigns meaning to programs.

Stack. A region for storing variables in which the most recently allocated unit is the first one to be deallocated (“last-in, first out”, LIFO). The name suggests a stack of plates.

Static. Used to describe something that happens during compilation (or possibly during linking but, anyway, before the program is executed), as opposed to dynamic.

Static Binding. A binding that is established before the program is executed. Cf. Dynamic Binding.

Static typing. Type checking performed during compilation.

Strong typing. Detect and report all type errors, as opposed to weak typing.

Template. A syntactic construct from which many distinct objects can be generated. A template is different from a declaration that constructs only a single object. (The word “template” has a special sense in C++, but the usage is more or less compatible with our definition.)

Thunk. A function with no parameters used to implement the call by name mechanism of Algol 60 and a few other languages.

Total function. A function that is defined for all values of its arguments in its domain. e^x is total over the domain of reals.

Value semantics. Variable names denote locations. Most imperative languages use value semantics.

Weak typing. Detect and report some type errors, but allow some possible type errors to remain in the program.
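The glossary entries for thunk, call by name, and call by need can be illustrated with a small Python sketch. The names call_by_name, call_by_need, and expensive are hypothetical, and Python itself uses neither mechanism natively; the parameterless function passed in plays the role of the thunk.

```python
from typing import Callable

# Counts how many times the argument expression is evaluated.
counter = {"evaluations": 0}

def expensive() -> int:
    """The thunk: a parameterless function wrapping the argument expression."""
    counter["evaluations"] += 1
    return 21

def call_by_name(thunk: Callable[[], int]) -> int:
    # Every use of the parameter re-evaluates the argument.
    return thunk() + thunk()

def call_by_need(thunk: Callable[[], int]) -> int:
    # The argument is evaluated at most once; its value is then cached.
    cache = []
    def force() -> int:
        if not cache:
            cache.append(thunk())
        return cache[0]
    return force() + force()

by_name = call_by_name(expensive)
count_by_name = counter["evaluations"]   # two evaluations

counter["evaluations"] = 0
by_need = call_by_need(expensive)
count_by_need = counter["evaluations"]   # one evaluation
```

Both calls compute the same value; they differ only in how often the argument is evaluated, which matters when the argument is costly or has side effects.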

B Theoretical Issues

As stated in Section 1, theory is not a major concern of the course. In this section, however, we briefly review the contributions that theory has made to the development of PLs and discuss the importance and relevance of these contributions.

B.1 Syntactic and Lexical Issues

There is a large body of knowledge concerning formal languages. There is a hierarchy of languages (the “Chomsky hierarchy”) and, associated with each level of the hierarchy, formal machines that can “recognize” the corresponding languages and perform other tasks. The first two levels of the hierarchy are the most familiar to programmers: regular languages and finite state automata; and context-free languages and push-down automata. Typical PLs have context-free (or almost context-free) grammars, and their tokens, or lexemes, can be described by a regular language. Consequently, programs can be split into lexemes by a finite state automaton and parsed by a push-down automaton.

It might appear, then, that the problem of PL design at this level is “solved”. Unfortunately, there are exceptions. The grammar of C++, for example, is not context-free. This makes the construction of a C++ parser unnecessarily difficult and accounts for the slow development of C++ programming tools. The preprocessor is another obstacle that affects program development environments in both C and C++.

Exercise 37. Give examples to demonstrate the difficulties that the C++ preprocessor causes in a program development environment. □

B.2 Semantics

There are many kinds of semantics, including: algebraic, axiomatic, denotational, operational, and others. They are not competitive but rather complementary. There are many applications of semantics, and a particular semantic system may be more or less appropriate for a particular task.

For example, a denotational semantics is a mapping from program constructs to abstract mathematical entities that represent the “meaning” of the constructs. Denotational semantics is an effective language for communication between the designer and implementors of a PL but is not usually of great interest to a programmer who uses the PL. An axiomatic semantics, on the other hand, is not very useful to an implementor but may be very useful to a programmer who wishes to prove the correctness of a program.

The basic ideas of denotational semantics are due to Landin (1965) and Strachey (1966). The most complete description (Milne and Strachey 1976) was completed by Milne after Strachey’s death. Denotational semantics is a powerful descriptive tool: it provides techniques for mapping almost any PL feature into a high order mathematical function. Denotational semantics developed as a descriptive technique and was extended to handle increasingly arcane features, such as jumps into blocks.
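The idea of mapping program constructs to mathematical functions can be sketched in Python for a toy expression language. The tuple-based abstract syntax and the name meaning are assumptions made for this illustration; a real denotational semantics would be written in mathematical notation and would cover far more than arithmetic.

```python
from typing import Callable, Dict

Env = Dict[str, int]                 # an environment maps variable names to values
Denotation = Callable[[Env], int]    # the "meaning" of an expression

# Hypothetical abstract syntax:
#   ("num", 3)  ("var", "x")  ("add", e1, e2)  ("mul", e1, e2)

def meaning(expr) -> Denotation:
    """Map a piece of syntax to a function from environments to integers."""
    tag = expr[0]
    if tag == "num":
        value = expr[1]
        return lambda env: value
    if tag == "var":
        name = expr[1]
        return lambda env: env[name]
    if tag in ("add", "mul"):
        # The meaning of a compound expression is built from the
        # meanings of its parts (compositionality).
        left, right = meaning(expr[1]), meaning(expr[2])
        if tag == "add":
            return lambda env: left(env) + right(env)
        return lambda env: left(env) * right(env)
    raise ValueError(f"unknown construct: {tag}")

program = ("add", ("var", "x"), ("mul", ("num", 2), ("var", "y")))
denotation = meaning(program)
result = denotation({"x": 1, "y": 3})   # x + 2 * y in the given environment
```

Note that meaning is itself a higher order function: it returns a function, and the denotation of each compound construct depends only on the denotations of its components.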

Ashcroft and Wadge published an interesting paper (1982), noting that PL features that were easy to implement were often hard to describe and vice versa. They suggested that denotational semantics should be used prescriptively rather than descriptively. In other words, a PL designer should start with a simple denotational semantics and then figure out how to implement the language, or perhaps leave that task to implementors altogether. Their own language, Lucid, was designed according to these principles.


To some extent, the suggestions of Ashcroft and Wadge have been followed (although not necessarily as a result of their paper). A description of a new PL in the academic literature is usually accompanied by a denotational semantics for the main features of the PL. It is not clear, however, that semantic techniques have had a significant impact on mainstream language design.

B.3 Type Theory

There is a large body of knowledge on type theory. We can summarize it concisely as follows. Here is a function declaration.

T f(S x) { B; return y; }

The declaration introduces a function called f that takes an argument, x of type S, performs calculations indicated as B, and returns a result, y of type T. If there is a type theory associated with the language, we should be able to prove a theorem of the following form:

If x has type S then the evaluation of f(x) yields a value of type T.

The reasoning used in the proof is based on the syntactic form of the function body, B.

If we can do this for all legal programs in a language L, we say that L is statically typed. If L is indeed statically typed:

• A compiler for L can check the type correctness of all programs.

• A program that is type-correct will not fail because of a type error when it is executed.

A program that is type-correct may nevertheless fail in various ways. Type checking does not usually detect errors such as division by zero. (In general, most type checking systems assume that functions are total: a function with type S → T, given a value of type S, will return a value of type T. Some commonly used functions are partial: for example x/y ≡ divide(x,y) is undefined for y = 0.) A program can be type-correct and yet give completely incorrect answers. In general:

• Most PLs are not statically typed in the sense defined above, although the number of loopholes in the type system may be small.

• A type-correct program may give incorrect answers.

• Type theories for modern PLs, with objects, classes, inheritance, overloading, genericity, dynamic binding, and other features, are very complex.
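The gap between type correctness and total functions can be seen in a short Python sketch. Python does not enforce annotations at run time, so the sketch assumes that an external static checker is used; the names divide and safe_divide are chosen for this illustration only.

```python
from typing import Optional

def divide(x: float, y: float) -> float:
    # The annotation claims a total function (float, float) -> float,
    # but the function is partial: it is undefined for y == 0.
    return x / y

def safe_divide(x: float, y: float) -> Optional[float]:
    # One way to make the function total: widen the result type so that
    # "undefined" is itself a value that the caller must handle.
    return None if y == 0 else x / y

# The call below is consistent with the annotations, yet it fails when executed.
try:
    divide(1.0, 0.0)
    failed = False
except ZeroDivisionError:
    failed = True
```

Widening the result type moves the problem into the type system, at the cost of forcing every caller to deal with the extra case.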

Type theory leads to attractive and interesting mathematics and, consequently, tends to be overvalued by the theoretical Computer Science community. The theory of program correctness is more difficult and has attracted less attention, although it is arguably more important.

Exercise 38. Give an example of an expression or statement in Pascal or C that contains a type error that the compiler cannot detect. □

B.4 Regular Languages

Regular languages are based on a simple set of operations: sequence, choice, and repetition. Since these operations occur in many contexts in the study of PLs, it is interesting to look briefly at regular languages and to see how the concepts can be applied.

In the formal study of regular languages, we begin with an alphabet, Σ, which is a finite set of symbols. For example, we might have Σ = { 0, 1 }. The set of all finite strings, including the empty string, constructed from the symbols of Σ is written Σ∗.

We then introduce a variety of regular expressions, which we will refer to as RE here. (The abbreviation RE is also used to stand for “recursively enumerable”, but in this section it stands for “regular expression”.) Each form of RE is defined by the set of strings in Σ∗ that it generates. This set is called a “regular language”.

1. ∅ is RE and denotes the empty set.

2. ε (the string containing no symbols) is RE and denotes the set { ε }.

3. For each symbol x ∈ Σ, x is RE and denotes the set { x }.

4. If r is RE with language R, then r∗ is RE and denotes the set R∗ (where

R∗ = { x_1 x_2 . . . x_n | n ≥ 0 ∧ x_i ∈ R } ).

5. If r and s are RE with languages R and S respectively, then (r + s) is RE and denotes the set R ∪ S.

6. If r and s are RE with languages R and S respectively, then (rs) is RE and denotes the set RS (where

RS = { rs | r ∈ R ∧ s ∈ S } ).

We add two further notations that serve solely as abbreviations.

1. The expression a^n represents the string aa . . . a containing n occurrences of the symbol a.

2. The expression a^+ is an abbreviation for the RE aa∗ that denotes the set { a, aa, aaa, . . . }.
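The six forms of RE can be read directly as set constructions. The following Python sketch represents each RE as a function that returns the strings of its language up to a given length (the languages themselves may be infinite). The constructor names empty_set, epsilon, sym, alt, seq, and star are assumptions of this sketch, not standard notation.

```python
def empty_set():
    return lambda n: set()                       # rule 1: denotes the empty set

def epsilon():
    return lambda n: {""}                        # rule 2: denotes { ε }

def sym(x):
    return lambda n: {x} if n >= 1 else set()    # rule 3: denotes { x }

def star(r):                                     # rule 4: denotes R*
    def lang(n):
        base = r(n) - {""}
        result = {""}
        frontier = {""}
        # Keep appending strings of R until nothing new of length <= n appears.
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= n} - result
            result |= frontier
        return result
    return lang

def alt(r, s):
    return lambda n: r(n) | s(n)                 # rule 5: denotes R ∪ S

def seq(r, s):                                   # rule 6: denotes RS
    return lambda n: {u + v for u in r(n) for v in s(n) if len(u + v) <= n}

# (0 + 1)*: all binary strings, here listed up to length 2.
binary = star(alt(sym("0"), sym("1")))
binary_up_to_2 = binary(2)

# a^+ as the abbreviation a a*.
a_plus = seq(sym("a"), star(sym("a")))
a_plus_up_to_3 = a_plus(3)
```

The length bound is only a device for making the sets finite and printable; it is not part of the definition of a regular language.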

B.4.1 Tokens

We can use regular languages to describe the tokens (or lexemes) of most PLs. For example:

UC = A + B + · · · + Z

LC = a + b + · · · + z

LETTER = UC + LC

DIGIT = 0 + 1 + · · · + 9

IDENTIFIER = LETTER ( LETTER + DIGIT )∗

INTCONST = DIGIT^+

FLOATCONST = DIGIT^+ . DIGIT∗

. . .

We can use a program such as lex to construct a lexical analyzer (scanner) from a regular expression that defines the tokens of a PL.
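As a rough illustration of what such a tool produces, the token definitions above can be transcribed into Python's re module. This is a sketch, not a description of lex: the names TOKEN_SPEC and tokenize are hypothetical, white space is assumed to separate tokens, and Python's alternation picks the first alternative that matches rather than the longest match, which is why FLOATCONST is listed before INTCONST.

```python
import re

# Token classes from the definitions above, written as Python regular expressions.
TOKEN_SPEC = [
    ("FLOATCONST", r"[0-9]+\.[0-9]*"),          # DIGIT+ . DIGIT*
    ("INTCONST",   r"[0-9]+"),                  # DIGIT+
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),    # LETTER (LETTER + DIGIT)*
    ("SKIP",       r"\s+"),                     # white space (an assumption)
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})"
                             for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Split text into (token class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = tokenize("x1 42 3.14 7.")
```

Here "7." is classified as FLOATCONST because the definition allows zero digits after the point.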

Exercise 39. Some compilers allow “nested comments”. For example, { . . . { . . . } . . . } in Pascal or /* . . . /* . . . */ . . . */ in C. What consequences does this have for the lexical analyzer (scanner)? □


RE       Control Structure
x        statement
rs       statement; statement
r∗       while expression do statement
r + s    if condition then statement else statement

Figure 17: REs and Control Structures

B.4.2 Context Free Grammars

Context free grammars for PLs are usually written in BNF (Backus-Naur Form). A grammar rule, or production, consists of a non-terminal symbol (the symbol being defined), a connector (usually ::= or →), and a sequence of terminal and non-terminal symbols. The syntax for the assignment, for example, might be defined:

ASSIGNMENT → VARIABLE “ := ” EXPRESSION

Basic BNF provides no mechanism for choice: we must provide one rule for each possibility. Also, BNF provides no mechanism for repetition; we must use recursion instead. The following example illustrates both of these limitations.

SEQUENCE → EMPTY

SEQUENCE → STATEMENT SEQUENCE

BNF has been extended in a variety of ways. With a few exceptions, most of the extended forms can be described simply: the sequence of symbols on the right of the connector is replaced by a regular expression. The extension enables us to express choice and repetition within a single rule. In grammars, the symbol | rather than + is used to denote choice.

STATEMENT = ASSIGNMENT | CONDITIONAL | · · ·

SEQUENCE = ( STATEMENT )∗

Extended BNF (EBNF) provides a more concise way of describing grammars than BNF. Just as parsers can be constructed from BNF grammars, they can be constructed from EBNF grammars.
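The difference between the two styles shows up in a hand-written parser: the BNF rules for SEQUENCE suggest a recursive function, while the EBNF rule suggests a loop. The Python sketch below makes strong simplifying assumptions: the input is a list of token strings, and a STATEMENT is a single token, either "assign" or "cond". All function names are hypothetical.

```python
def parse_statement(tokens, pos):
    """Return (statement, next position), or (None, pos) if none starts here."""
    if pos < len(tokens) and tokens[pos] in ("assign", "cond"):
        return tokens[pos], pos + 1
    return None, pos

def parse_sequence_bnf(tokens, pos=0):
    # SEQUENCE -> EMPTY
    # SEQUENCE -> STATEMENT SEQUENCE      (repetition expressed by recursion)
    statement, next_pos = parse_statement(tokens, pos)
    if statement is None:
        return [], pos
    rest, end = parse_sequence_bnf(tokens, next_pos)
    return [statement] + rest, end

def parse_sequence_ebnf(tokens, pos=0):
    # SEQUENCE = ( STATEMENT )*           (repetition expressed by a loop)
    statements = []
    while True:
        statement, next_pos = parse_statement(tokens, pos)
        if statement is None:
            return statements, pos
        statements.append(statement)
        pos = next_pos

source = ["assign", "cond", "assign", "end"]
bnf_result = parse_sequence_bnf(source)
ebnf_result = parse_sequence_ebnf(source)
```

Both parsers accept the same language and stop at the first token that cannot begin a statement.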

Exercise 40. Discuss the difficulty of adding attributes to an EBNF grammar. □

B.4.3 Control Structures and Data Structures

Figure 17 shows a correspondence between REs and the control structures of Algol-like languages. Similarly, Figure 18 shows a correspondence between REs and data structures.

Exercise 41. Why might Figures 17 and 18 be of interest to a PL designer? □

B.4.4 Discussion

These analogies suggest that the mechanisms for constructing REs (concatenation, alternation, and repetition) are somehow fundamental. In particular, the relation between the standard control structures and the standard data structures can be helpful in programming. In Jackson Structured Design (JSD), for example, the data structures appropriate for the application are selected first and then the program is constructed with the corresponding control structures.

RE       Data Structure
x        int n;
rs       struct { int n; float y; }
r^n      int a[n];
r∗       int a[]
r + s    union { int n; float y; }

Figure 18: REs and Data Structures

The limitations of REs are also interesting. When we move to the next level of the Chomsky hierarchy, Context Free Languages (CFLs), we obtain the benefits of recursion. The corresponding control structure is the procedure and the corresponding data structures are recursive structures such as lists and trees (Wirth 1976, page 163).
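The correspondence between the shape of the data and the shape of the code that processes it can be sketched in Python. The data layout and the names total, Tree, and tree_sum are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

Item = Union[int, Tuple[int, int]]      # choice (r + s): an int or a pair

def total(items: List[Item]) -> int:
    result = 0
    for item in items:                  # repetition (r*) in the data: a loop
        if isinstance(item, int):       # choice (r + s) in the data: if/else
            result += item
        else:
            first, second = item        # sequence (rs) in the data:
            result += first             # consecutive statements
            result += second
    return result

@dataclass
class Tree:                             # recursive data structure
    value: int
    left: Optional["Tree"] = None
    right: Optional["Tree"] = None

def tree_sum(tree: Optional[Tree]) -> int:
    # Recursive data calls for a recursive procedure.
    if tree is None:
        return 0
    return tree.value + tree_sum(tree.left) + tree_sum(tree.right)

flat_total = total([1, (2, 3), 4])
nested_total = tree_sum(Tree(1, Tree(2), Tree(3, Tree(4))))
```

The first function needs only the regular operations; the second cannot be written with loops and conditionals alone unless an explicit stack is introduced.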

C Haskell Implementation

Several modern functional programming languages, including Haskell, are implemented using combinators and graph reduction. The technique described below was introduced into mathematical logic by M. Schoenfinkel in 1924 and was rediscovered as an implementation method by D. A. Turner in 1979.

We start with three observations.

• Operators are really just functions. For example, we can write x + y as add x y. Thus we only have to consider prefix functions.

• Functions of more than one variable can be reduced to functions of one variable. We understand add x y as (add x) y: we evaluate (add x), obtaining a function of one variable, which we then apply to y.

• In a definition such as

succ x = add 1 x

we can distinguish between bound variables, or simply variables, such as x in this example, and constants, such as 1 and add. Although add might be defined elsewhere, it is a constant as far as this definition is concerned.
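The second observation (functions of one variable suffice) can be tried out in Python, which is not curried by default. The definitions below are a sketch that uses the names add and succ from the text.

```python
def add(x):
    # add takes one argument and returns a function of one argument,
    # so that add(x)(y) plays the role of (add x) y.
    return lambda y: x + y

# succ is defined without mentioning its argument: it is simply (add 1).
succ = add(1)

five = add(2)(3)
also_five = succ(4)
```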

We would like to write the definition of succ in the form

succ = ???

so that the distinction between the name being defined and the definition is clear. Schoenfinkel showed how to do this using the following abstraction algorithm. If e is an expression possibly containing x, the abstraction of e with respect to x is written [x] e. It is an expression that does not contain x. The rules for abstraction follow. In the second rule, we assume x ≠ y, because otherwise the first rule would apply.

[x] x = I

[x] y = K y

[x] (f g) = S ([x] f) ([x] g)

Applying these rules to the function succ, defined above, gives

succ = [x] (add 1 x)
     = S ([x] (add 1)) ([x] x)
     = S (S (K add) (K 1)) I
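The three abstraction rules translate almost line for line into a program. In the Python sketch below, an expression is either an atom (a string) or an application represented as a pair (f, g); this representation and the names abstract and show are assumptions of the sketch.

```python
def abstract(x, e):
    """Compute [x] e using the three rules above."""
    if e == x:
        return "I"                                       # [x] x = I
    if isinstance(e, tuple):
        f, g = e
        return (("S", abstract(x, f)), abstract(x, g))   # [x] (f g) = S ([x] f) ([x] g)
    return ("K", e)                                      # [x] y = K y

def show(e):
    """Print an expression, treating application as left associative."""
    if isinstance(e, str):
        return e
    f, g = e
    argument = show(g)
    if isinstance(g, tuple):
        argument = "(" + argument + ")"
    return show(f) + " " + argument

# add 1 x is the application ((add 1) x).
succ_body = (("add", "1"), "x")
succ_ski = show(abstract("x", succ_body))
```

For this input the program reproduces the expression derived by hand above.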

These expressions quickly become quite long. If we abstract n from

fac n = if (n == 0) 1 (n × fac (n − 1))

(where we treat if as a function of three arguments), we obtain

fac = S (S (S (K if) (S (S (K ==) (K 0)) I)) (K 1)) (S (S (K ×) I) (S (K fac) (S (S (K −) I) (K 1))))

In general, the size of the SKI expression increases exponentially with the size of the original expression. Turner introduced new functions B and C and optimizations to overcome the size problem. Using these, the definition of fac becomes:

fac = S (C (B if (== 0)) 1) (S × (B fac (C − 1)))
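The text does not list the optimizations. The Python sketch below assumes the four rewrite rules usually attributed to Turner, applied whenever an S node is built: S (K p) (K q) becomes K (p q), S (K p) I becomes p, S (K p) q becomes B p q, and S p (K q) becomes C p q. Expressions are atoms (strings) or pairs (f, g), and the names combine, abstract, and show are hypothetical.

```python
def is_k(e):
    # True for expressions of the form K p.
    return isinstance(e, tuple) and e[0] == "K"

def combine(f, g):
    """Build S f g, simplifying with the assumed optimization rules."""
    if is_k(f) and is_k(g):
        return ("K", (f[1], g[1]))       # S (K p) (K q) => K (p q)
    if is_k(f) and g == "I":
        return f[1]                      # S (K p) I     => p
    if is_k(f):
        return (("B", f[1]), g)          # S (K p) q     => B p q
    if is_k(g):
        return (("C", f), g[1])          # S p (K q)     => C p q
    return (("S", f), g)

def abstract(x, e):
    if e == x:
        return "I"
    if isinstance(e, tuple):
        return combine(abstract(x, e[0]), abstract(x, e[1]))
    return ("K", e)

def show(e):
    if isinstance(e, str):
        return e
    f, g = e
    argument = show(g)
    if isinstance(g, tuple):
        argument = "(" + argument + ")"
    return show(f) + " " + argument

# if (== 0 n) 1 (* n (fac (- n 1))), with every operator in prefix form.
condition = (("==", "0"), "n")
recursive_call = ("fac", (("-", "n"), "1"))
product = (("*", "n"), recursive_call)
fac_body = ((("if", condition), "1"), product)

fac_optimized = show(abstract("n", fac_body))
```

Under these assumptions the program yields the same expression as the text, with * and - written in ASCII.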


[Figure: a binary graph whose root S has the subtree S (K +) (K 1) on the left and I on the right; + denotes add.]

Figure 19: The graph for succ

To make use of these expressions, we apply an argument and then make use of the following reduction rules:

I x = x

K x y = x

S f g x = (f x) (g x)

B f g x = f (g x)

C f g x = f x g

Using the simpler example, succ, we have:

succ 4 = S (S (K add) (K 1)) I 4
       = (S (K add) (K 1) 4) (I 4)
       = ((K add 4) (K 1 4)) 4
       = add 1 4
       = 5
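The reduction rules can be checked by writing each combinator as a curried Python function. This is only a sketch of the rules, not of graph reduction: Python evaluates arguments eagerly, so an expression such as the combinator form of fac, which relies on if not evaluating both branches, would not terminate if transcribed this way.

```python
def I(x):
    return x                                     # I x = x

def K(x):
    return lambda y: x                           # K x y = x

def S(f):
    return lambda g: lambda x: f(x)(g(x))        # S f g x = (f x) (g x)

def B(f):
    return lambda g: lambda x: f(g(x))           # B f g x = f (g x)

def C(f):
    return lambda g: lambda x: f(x)(g)           # C f g x = f x g

def add(x):
    return lambda y: x + y

def sub(x):
    return lambda y: x - y

# succ = S (S (K add) (K 1)) I, as derived above.
succ = S(S(K(add))(K(1)))(I)
succ_of_4 = succ(4)

# C - 1 from the optimized definition of fac: a function that subtracts 1.
pred = C(sub)(1)
pred_of_5 = pred(5)
```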

For practical implementation, these expressions are represented as graphs: Figure 19 shows the graph corresponding to succ. To evaluate an expression such as succ 4, we add a third subtree to the top node, giving S three arguments, and transform the graph, always using the rule for whatever function is at the root of the tree.

The basic technique of graph reduction is quite slow. When Turner presented one of his functional programming languages at a conference in 1982, he said that it was about 500 times slower than conventional languages such as C and FORTRAN. Twenty years of research, however, have led to steady improvements, and a modern implementation of a functional language has performance comparable to that of a compiled procedural language.
