TWO-LEVEL FUNCTIONAL LANGUAGES
Cambridge Tracts in Theoretical Computer Science Managing Editor Professor C.J. van Rijsbergen, Department of Computing Science, University of Glasgow Editorial Board S. Abramsky, Department of Computing Science, Imperial College of Science and Technology P.H. Aczel, Department of Computer Science, University of Manchester J.W. de Bakker, Centrum voor Wiskunde en Informatica, Amsterdam J.A. Goguen, Programming Research Group, University of Oxford J.V. Tucker, Department of Mathematics and Computer Science, University College of Swansea
Titles in the series 1. G. Chaitin Algorithmic Information Theory 2. L.C. Paulson Logic and Computation 3. M. Spivey Understanding Z 4. G. Revesz Lambda Calculus, Combinators and Functional Programming 5. A. Ramsay Formal Methods in Artificial Intelligence 6. S. Vickers Topology via Logic 7. J-Y. Girard, Y. Lafont & P. Taylor Proofs and Types 8. J. Clifford Formal Semantics & Pragmatics for Natural Language Processing 9. M. Winslett Updating Logical Databases 10. K. McEvoy & J.V. Tucker (eds) Theoretical Foundations of VLSI Design 11. T.H. Tse A Unifying Framework for Structured Analysis and Design Models 12. G. Brewka Nonmonotonic Reasoning 15. S. Dasgupta Design Theory and Computer Science 17. J.C.M. Baeten (ed) Applications of Process Algebra 18. J.C.M. Baeten & W.P. Weijland Process Algebra 23. E.-R. Olderog Nets, Terms and Formulas 27. W.H. Hesselink Programs, Recursion and Unbounded Choice 29. P. Gardenfors (ed) Belief Revision 30. M. Anthony & N. Biggs Computational Learning Theory 34. F. Nielson & H.R. Nielson Two Level Functional Languages
TWO-LEVEL FUNCTIONAL LANGUAGES
FLEMMING NIELSON & HANNE RIIS NIELSON Department of Computer Science Aarhus University, Denmark
CAMBRIDGE UNIVERSITY PRESS
CAMBRIDGE UNIVERSITY PRESS Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, Sao Paulo Cambridge University Press The Edinburgh Building, Cambridge CB2 2RU, UK Published in the United States of America by Cambridge University Press, New York www. Cambridge. org Information on this title: www.cambridge.org/9780521403849 © Cambridge University Press 1992 This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published 1992 This digitally printed first paperback version 2005 A catalogue recordfor this publication is available from the British Library ISBN-13 978-0-521-40384-9 hardback ISBN-10 0-521-40384-7 hardback ISBN-13 978-0-521-01847-0 paperback ISBN-10 0-521-01847-1 paperback
Contents 1 Introduction 2 Types Made Explicit 2.1 The Typed A-Calculus 2.2 Type Analysis 2.2.1 Polytypes 2.2.2 The algorithm 2.2.3 Syntactic soundness and completeness 3 Binding Time Made Explicit 3.1 The 2-Level A-Calculus 3.2 Binding Time Analysis 3.2.1 Binding time analysis of types 3.2.2 Binding time analysis of expressions 3.2.3 Binding time analysis of programs 3.3 Improving the Binding Time Analysis 4
Combinators Made Explicit 4.1 Mixed A-Calculus and Combinatory Logic 4.1.1 Well-formedness 4.1.2 Combinator expansion 4.2 Combinator Introduction 4.3 Improving the Combinator Introduction
5 Parameterized Semantics 5.1 Types 5.1.1 Domain theory 5.1.2 Interpretation of types 5.2 Expressions 5.3 Programs
1 7 8 12 13 16 22 33 33 . 46 48 55 68 72 79 80 83 86 88 102 107 107 108 116 122 133
vi
Contents
6
Code Generation 6.1 The Coding Interpretation 6.1.1 First order aspects of the abstract machine 6.1.2 First order aspects of the coding interpretation 6.1.3 Recursion 6.1.4 Higher-order aspects 6.2 The Substitution Property 6.3 Well-behavedness of the Code 6.3.1 Definition of the well-behavedness predicates 6.3.2 Operations on execution sequences 6.3.3 Well-behavedness of operators 6.4 Correctness of the Code 6.4.1 Definition of the correctness predicates 6.4.2 Correctness of operators
139 139 143 . . . . . . . 144 148 152 154 163 165 168 170 185 185 188
7
Abstract Interpretation 7.1 Strictness Analysis 7.1.1 Types 7.1.2 The safety property 7.1.3 Expressions 7.1.4 Proof of safety 7.2 Improved Strictness Analysis 7.2.1 The Case construct 7.2.2 Tensor products
207 207 208 214 219 222 232 232 237
8
Conclusion 8.1 Optimized Code Generation 8.1.1 Using local strictness information 8.1.2 Using right context information 8.1.3 Using left context information 8.1.4 Pre-evaluation of arguments 8.2 Denotational Semantics 8.2.1 The language Imp 8.2.2 Transformations on the semantic specification 8.2.3 Towards a compiler 8.3 Research Directions
253 253 254 257 261 266 268 269 272 278 280
Bibliography
285
Summary of Transformation Functions
293
Index
295
List of Figures 1.1 1.2
Overview of Chapters 2, 3 and 4 Overview of Chapters 5, 6, 7 and 8
3.1 3.2
The structure of 2 binding times Comparison of the i?-level types for reduce
8.1
Compiler generation
5 6 35 48 279
VII
List of Tables 2.1 2.2 2.3 2.4 2.5
The typed A-calculus The untyped A-calculus Well-formedness of the typed A-calculus £^A: Type analysis of expressions (part 1) £TA : Type analysis of expressions (part 2)
9 9 11 18 19
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11
The jg-level A-calculus Well-formed £-level types for reduce Well-formedness of the i?-level types Well-formedness of the 2-level A-calculus (part 1) Well-formedness of the 2-level A-calculus (part 2) TBTA- Binding time analysis of types £BTA : Binding time analysis of expressions (part 1) £BTA : Binding time analysis of expressions (part 2) £BTA : Binding time analysis of expressions (part 3) £BTA : Binding time analysis of expressions (part 4) £BT A : Binding time analysis of expressions (part 5)
34 37 38 41 42 49 57 60 62 65 66
4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8
The mixed A-calculus and combinatory logic 81 Well-formedness of the mixed A-calculus and combinatory logic (1) . 84 Well-formedness of the mixed A-calculus and combinatory logic (2) . 85 SQI™'- Combinator introduction for expressions (part 1) 93 EQ™: Combinator introduction for expressions (part 2) 94 95 £Q^V: Combinator introduction for expressions (part 3) £CJWV: Combinator introduction for expressions (part 4) 96 £Q™ : Combinator introduction for expressions (part 5) 97
5.1
Operators and their actual types
125
6.1 6.2 6.3 6.4 6.5
Configurations of the abstract machine Transition relation for the abstract machine The coding interpretation K (part 1) The coding interpretation K (part 2) The coding interpretation K (part 3)
140 141 145 146 147
IX
List of Tables
7.1 7.2 7.3 7.4 7.5
Type part of A Safety predicate for types of kind r Expression part of A (part 1) Expression part of A (part 2) Expression part of A (part 3)
208 214 219 220 221
8.1 8.2 8.3 8.4 8.5 8.6
The optimizing interpretation O (part 1) The optimizing interpretation O (part 2) The optimizing interpretation O (part 3) Semantics of Imp-expressions Semantics of Imp-statements Semantics of Imp-declarations and programs
254 255 256 271 272 273
Preface The subject area of this book concerns the implementation of functional languages. The main perspective is that part of the implementation process amounts to making computer science concepts explicit in order to facilitate the application, and the development, of general frameworks for program analysis and code generation. This is illustrated on a specimen functional language patterned after the Acalculus: • Types are made explicit in Chapter 2 by means of a Hindley/Milner/Damas type analysis. • Binding times are made explicit in Chapter 3 using an approach inspired by the one for type analysis. The binding times of chief interest are compile-time and run-time. • Combinators are made explicit in Chapter 4 but only for run-time computations whereas the compile-time computations retain their A-calculus syntax. The advantages of this approach are illustrated in the remainder of the book where the emphasis also shifts from a 'syntactic perspective' to a more 'semantic perspective': • A notion of parameterized semantics is defined in Chapter 5 and this allows a wide variety of semantics to be given. • It is illustrated for code generation in Chapter 6. Code is generated for a structured abstract machine and the correctness proof exploits Kripke-logical relations and layered predicates. • It is illustrated for abstract interpretation in Chapter 7. We generalize Wadler's strictness analysis to general lists, show the correctness using logical relations, and illustrate the similarity between tensor products and Wadler's case analysis.
XI
xii
Preface
Finally, Chapter 8 discusses possible ways of extending the development. This includes the use of abstract interpretation to obtain an improved code generation that may still be proved correct. We also illustrate the role of the mixed A-calculus and combinatory logic as a metalanguage for denotational semantics; this allows a systematic approach to compiler generation from semantic specifications.
Notes for the Reader This book is intended for researchers and for students who already have some formal training. Much of the work reported here has been documented elsewhere in the scientific literature and we have therefore aimed at a style of exposition where we concentrate on the main insights and methods, including proofs and proof techniques, but where we feel free to refer to the literature for technically complex generalizations and details of tedious proofs. To facilitate this, we provide bibliographic notes covering variations of the technical development. Our notation is mostly standard but we find 4c->' a more readable notation for 'partial functions' than '—'.
Acknowledgements The research reported here has been supported by The Danish Natural Sciences Research Council. The presentation has benefited from comments from our students and colleagues, in particular Torben Amtoft, Poul Christiansen, Fritz Henglein, Torben Lange, Jens Mikkelsen, Thorleif Nielson, Jens Palsberg, Hans J. Pedersen, Anders Pilegaard, Kirsten L. Solberg, Bettina B. S0rensen and Phil Wadler. Finally, David Tranah made a number of suggestions concerning how to improve the presentation.
Aarhus, January 1992
Flemming Nielson Hanne Riis Nielson
Chapter 1 Introduction The functional programming style is closely related to the use of higher-order functions. In particular, it suggests that many function definitions are instances of the same general computational pattern and that this pattern is defined by a higher-order function. The various instances of the pattern are then obtained by supplying the higher-order function with some of its arguments. One of the benefits of this programming style is the reuse of function definitions and, more importantly, the reuse of properties proved to hold for them: usually a property of a higher-order function carries over to an instance by verifying that the arguments satisfy some simple properties. One of the disadvantages is that the efficiency is often rather poor. The reason is that when generating code for a higher-order function it is impossible to make any assumptions about its arguments and to optimize the code accordingly. Furthermore, conventional machine architectures make it rather costly to use functions as data. We shall therefore be interested in transforming instances of higher-order functions into functions that can be implemented more efficiently. The key observation in the approach to be presented here is that an instance of a higher-order function is a function where some of the arguments are known and others are not. To be able to exploit this we shall introduce an explicit distinction between known and unknown values or, using traditional compiler terminology, between compile-time entities and run-time entities.
The functional paradigm To motivate the notation to be used we begin by reviewing the reduce function. In Miranda 1 it may be written as reduce f u = g 1
Miranda is a trademark of Research Software Limited.
1
Introduction
where g [] = u g (x:xs) = f x (g xs) Here the left hand side of an equation specifies the name of the function and a list of patterns for its parameters. The right hand side is an expression being the body of the function. If more than one equation is given for the same function (as is the case for g) there is an implicit conditional matching the argument with the patterns in the parameter list. Recursion is left implicit because a function name on the right hand side of an equation that defines that function name indicates recursion (as is the case for g). Finally, function application is left implicit as the function is just juxtaposed with its argument(s). By supplying reduce with some of its arguments we can define a number of well-known functions. Some examples are sum = reduce (+) 0 append xs ys = reduce (:) ys xs reverse = reduce h [] where h x xs = append xs [x] map f = reduce h [] where h x ys = (f x) : ys
A similar equational definition of functions is allowed in Standard ML. Slightly rewriting the definition of reduce above we obtain the Standard ML program fun reduce f u = l e t fun g [] = u I g (x::xs) = f x (g xs) in g end; An alternative formulation is val reduce = fn f => fn u => l e t val rec g = fn xs => if xs = [] then u e l s e f (hd xs) (g ( t l xs)) in g end; Here function abstraction is expressed explicitly by fn •••=>••• and the recursive structure of g is expressed by the occurrence of rec. Also the test on the form of the list argument is expressed directly whereas function application is still implicit. In short, we have obtained a somewhat more explicit, but perhaps less readable, definition of reduce.
The enriched A-calculus The development to be performed in this book will require all the operations to be expressed explicitly. We shall define a small language, an enriched \-calculus,
that captures a few of the more important constructs present in modern functional languages. The formal development will then be performed for that language and in many cases it will be straightforward to extend it to richer languages. The enriched A-calculus will only have explicit operations and, in particular, there will be an explicit function application •••(•••). Additionally we shall use parentheses of the form (• • •) to indicate parsing of the concrete syntax. In the enriched A-calculus the program sum may be written DEF reduce = Af.Au.fix (Ag.Axs. if i s n i l xs then u e l s e f (hd xs) (g ( t l xs))) VAL reduce (Ax.Ay.+((x,y))) (0) HAS I n t l i s t -> Int A program in the enriched A-calculus is a sequence of definitions together with an expression and a type: DEF di • • • DEF dn VAL e HAS t Each definition d\ has the form x\ = e\ where x\ is the name of the entity and t\ is its defining expression. The name x\ can be used in the expressions of the definitions given after its definition and in e. The type of e is supposed to be t. The basic types in the enriched A-calculus include amongst others Int, denoting the type of integers, and Bool, denoting the type of truth values. A function type is written as t —>• t', a product type as t x t' and a list type as t l i s t . We follow the convention that the function type constructor binds less tightly than the product type constructor which in turn binds less tightly than the list type constructor. Furthermore, the function type constructor associates to the right and the product type constructor to the left. As an example Int —> Int x Int x Int l i s t —» Int should be read as I n t -» (((Int x Int) x (Int l i s t ) ) —> Int). We shall assume that we have a number of constants. For example, constants representing values of the basic types such as t r u e and f a l s e of type Bool and 0, 1, - 1 , • • • of type Int In addition there are constants for the primitive operations such as + and * of type Int X Int —> Int, = of type Int x Int —> Bool, A and V of type Bool x Bool —» -i of type Bool —> Bool.
Bool and
1
Introduction
Functions are constructed using A-abstraction as in A.r.e, where x is the bound variable and e is the body of the abstraction. Function application is denoted by the explicit operation e(e') where e is the operator and e' the operand. Pairs are constructed using angle brackets as in (e,e') and the components of a pair are selected by f st e and snd e. Lists are built using the two constructs n i l and e:e' where the first gives the empty list and the second prefixes the list e' with the element e. The head and the tail of a (non-empty) list can be selected by hd e and t l e and the construct i s n i l e tests whether a list is empty. The conditional of the language is written as if e then e' else e" and recursion is expressed by the fixed point operator f ix e.
Overview The next three chapters follow the approach sketched above: underlying notions and ideas which are implicit (in some language) are made explicit in order to construct a stable platform from which to study the implementation of functional languages. • In Chapter 2 the types are made explicit: We show how an untyped program plus an overall type for that program may be transformed into a typed program. Even if the original program was typed it would most likely not have all the type information in an explicit form at every place where it might be useful in the implementation, and so some variant of this development would still be needed. • In Chapter 3 the binding times are made explicit. This is done by introducing a notation that allows an explicit distinction between the compile-time and run-time binding times. It is then shown how to propagate partial binding time annotations throughout the program. • In Chapter 4 the combinators are made explicit but only for the run-time computations. The key observation is that run-time computations should give rise to code but that compile-time computations should not. To facilitate code generation, and program analyses intended to aid in this, we need the dependency on free variables to be made explicit and this is achieved by an algorithm for introducing the combinators. This approach is summarized in Figure 1.1. The remainder of the book exploits the, by now fully explicit, notation to develop and illustrate a flexible notion of semantics. • In Chapter 5 the framework of parameterized semantics is developed. Here run-time types and combinators may be interpreted in virtually any way desired and this interpretation is then extended to all well-formed programs. Example interpretations include lazy and eager 'standard interpretations'.
untyped A-expression with overall type and binding time information 1 type analysis: Section 2.2 typed A-expression with overall binding time information 1 binding time analysis: Section 3.2 typed A-expression with binding time annotation I combinator introduction: Section 4.2 mixed A-calculus and combinator expression
Figure 1.1: Overview of Chapters 2, 3 and 4 • In Chapter 6 the more interesting example of code generation is studied, again through the definition of an interpretation. This involves the definition of a structured abstract machine. A subset of the translation is proved correct using Kripke-logical relations and layered predicates.
• In Chapter 7 it is shown how to formulate abstract interpretation (a compiletime program analysis technique) by means of an interpretation. We formulate a generalized version of Wadler's strictness analysis for lists and prove its correctness using logical relations. Tensor products are introduced as a companion to Wadler's notion of case analysis.
Throughout this development we shall concentrate on non-strict, i.e. lazy, semantics although most of the development could be modified to apply to a language with a strict, i.e. eager, semantics. Finally, in Chapter 8 we conclude with a discussion about how to extend this development by incorporating the results of abstract interpretation in order to perform improved code generation. We also show how the development may be used to generate compilers, or compiling specifications, from language specifications written in the style of denotational semantics.
1
Introduction
Chapter 5 parameterized semantics Chapter 7
Chapter 6
abstract interpretation
code generation
Chapter 8 ideas for optimized code generation
Figure 1.2: Overview of Chapters 5, 6, 7 and 8
Bibliographical Notes The programming language Standard ML is defined in [56] and Miranda is described in [102, 103]. Introductions to functional programming may be found in [107] and [11].
Chapter 2 Types Made Explicit Both Miranda and Standard ML enjoy the property that a programmer need not specify the types of all the entities defined in the program. This is because the implementations are able to infer the remaining types assuming that the program is well-formed. The benefit for the functional programming style is that higher-order functions can be defined in a rather straightforward manner. As an example, implementations of Miranda and Standard ML will infer that the type of the reduce function considered in Chapter 1 is (a-+/?->/?)->/?-> a list-*/? Here a and (3 are so-called type variables that may be instantiated (or replaced) with arbitrary types. The occurrence of reduce in the definition of sum (in Chapter 1) has the type (lnt->Int-»Int)-»Int-»Int list->Int because it is applied to arguments of type Int—>Int—>Int and Int. On the other hand, the occurrence of reduce in the definition of append (in Chapter 1) is applied to arguments of type 7 —» 7 l i s t —> 7 l i s t and 7 l i s t so its type will be (7 —>7 list—> 7 l i s t ) —»7 list—>7 list—>7 l i s t In this chapter the enriched A-calculus is equipped with a type inference system closely related to that found in Miranda and Standard ML. The type inference is based upon a few rules for how to build well-formed expressions. As an example, the function application e(e') is only well-formed if the type of e has the form t—>t f and if the type of e' is t and then the type of the application is t'. Similar rules exist for the other composite constructs of the language. For the constants we have axioms stating, for example, that + has type Intxlnt—»Int and that 0 has type Int. Based upon such axioms and rules we can infer that the program sum is well-formed and we can determine the types of the various subexpressions.
8
2
Types Made Explicit
Of course the results obtained are the same as those mentioned above for Miranda and Standard ML. The details are provided in Section 2.1. The next step is to annotate the program with the inferred type information: we shall add the actual types to the constants and the bound variables of A-abstractions. This means that the sum program will be transformed into the following program, to be called sumt: DEF reduce = Af [Int—>Int—>Int].Au[lnt]. fix (Ag[lnt l i s t -> Int]. Axs[lnt l i s t ] . if isnil xs then u else f(hdxs)(g(tl xs))) VAL reduce (Ax[lnt].Ay[lnt].+[lntxInt->Int]((x,y))) (0[lnt]) HAS Int l i s t -^ Int Here the type of reduce has been fixed and this is possible because there is only one application of reduce in the program. (In general we may have to duplicate the definition of some of the functions, see Exercise 2.) The details are presented in Section 2.2 where the Hindley/Milner/Damas type inference algorithm for the A-calculus is reviewed. As we shall see, it frees the user from having to worry too much about types.
2.1
The Typed A-Calculus
The base language underlying our development is the typed A-calculus. It has types, £ET, and expressions, e^E, and its syntax is displayed in Table 2.1. The Aj are base types where i ranges over an unspecified index set / and we shall write Bool for the type Abool of booleans, Int for the type Ajnt of integers and Void for the type Avoid that intuitively only has a dummy element. Product types, function types and list types are constructed using the type constructors x, —> and l i s t , respectively. The f\[i\ are primitives of type t as indicated and we shall write eq[lntxlnt—>Bool] for f eq[lntxlnt—>Bool] etc. Related to products we have notation for forming pairs and for selecting their components. Associated with function space we have notation for A-abstraction, application and variables. As for constants we shall write env for xenv etc.1 Related to lists we have notation for constructing lists, for selecting their components and for testing for emptiness of lists. Finally, we have the truth values, conditional and notation for recursive definitions. We shall consider typed A-expressions together with their overall types and to smooth the explanation we shall use the term programs for such specifications. To obtain readable examples it is desirable that a program allows the definition l
A less desirable consequence of this choice is that it depends on the context whether an identifier like ide means fide or Xide; however, in practice no confusion is likely to arise.
2.1
The Typed A-Calculus
teT t :: =Aj | ^ x ^ | ^t —> | t list
eeE e ::=
m e:e
(e,e) fst e snd e | Axi[i].e e(e) nil[£] hd e | tl e isnil e | true false if e then e else e |fix e
Xi
|
Table 2.1: The typed A-calculus
ueEUE ue ::= fj | (we,we) | fst we snd we AXJ.we | we (we) Xj | we:we nil | hd we |tl we | isnil we true false if we then we else we fix we
Table 2.2: The untyped A-calculus of 'global' functions in much the same way that the functional programming languages considered in Chapter 1 do. This motivates the definition of the syntactic category P(E,T) of programs. Actually, it is more convenient to give a general definition of a syntactic category P(£ I*,T*) of programs over expressions E* and types T*. It is given by: eeE* PeP(E*,T*)
p ::= DEF Xi=e p \ VAL e HAS t So P(E,T) is the syntactic category of programs where the expressions are typed A-expressions and the types are as above. If UE is the syntactic category of expressions in the untyped A-calculus, as given in Table 2.2, the programs in P(UE,T) have expressions that are untyped and types as above. Example 2.1.1 The program sum in Chapter 1 is in P(UE,T) whereas the program sumt above is in P(E,T). • When we allow the user to specify a typed A-expression and its corresponding type, by giving an untyped expression and its intended type, we pretend that we have a function of functionality
P(UEJ) -> P{E,T)
10
2
Types M a d e Explicit
that can automatically introduce the required type information into the untyped expression. This transformation will be studied in Section 2.2. E x a m p l e 2.1.2 Consider the program double given by DEF + = +[lnt—>Int—>Int] VAL Av[Int].(+(v))(v) HAS I n t - a n t Here we have abbreviated the constant f + to + and the variables x + and xv to + and v so that the unabbreviated program is DEF x+ = f+[Aint->Aint->Aint] VAL Ax v [A int ].(x + (x v )) (x v ) HAS Aint->Aint In practice it might increase the readability to abbreviate x + to plus rather than +. Also note that we have used the parentheses ( and ) to indicate grouping whereas ( and ) are used for function application. • Well-formedness We need to be precise about when an expression in the typed A-calculus has a given type. The details are given in Table 2.3 where we define a relation tenv h e : t for when the expression e has type t. Here tenv is a type environment, i.e. a map from a finite set dom(tenv) of variables to types, so it gives information about the type of any (free) variable in e. We use the notation tenv[t/xi] for an environment that is like tenv except that it maps the variable Xj to the type t. We shall sometimes write graph(£ent;) for the set {(x'l,tenv(xi))\x'l£dom(tenv)} and we shall write 0 for the environment whose domain is empty. Well-formedness of a program is then given by 0 h d : tx
0[
- , Xj_i.
2.1
The Typed A-Calculus
11
tenv h fi[t] : t tenv h e-i : t-i tenv h e 2 : tenv h (ei,e2) : i ff s +]
^ent; h e : ^ x £ 2
tenv h f s t e : ffent; e n t ; l h ee : : ^ XX " *1 ^ tenv h snd e : £2 *env[f7xi] h e : f
[snd] m
'M
r/\i • •
i i—\—rm IT—i cent; r Axj[t J.e : c—>t tenv h ei : t'—*t tenv h e2 : tf ^ent; h ei (e2) : ^
[x]
tenv h xi : ^
r.-i [nil]
tenv \~ e\ : t tenv \- e2 : t l i s t h ei:e 2 : t l i s t ^enr; h nil[^] : f l i s t
rh ji
^ent; h e : f l i s t
L J
inaj
^env h hd e : t ^ent; h e : £ l i s t
r+-,i 1
if (xi,£)Egraph(£ent;)
J
tenv \- tl e : t l i s t
fisnill ^en?; h e : £ l i s t • ^ ^env h i s n i l e : Bool [true] tenv h t r u e : Bool [false]
tenv h false : Bool
r- ^1 • ^
tenv h e\ : Bool ^eni; V e2 : t tenv \- e% : t ^env h if ei then e2 else e3 : ^
ffixl *• J
ferct; H g : ^-^^ ^env h f i x e : £
Table 2.3: Well-formedness of the typed A-calculus Example 2.1.3 The well-formedness of double (of Example 2.1.2) amounts to hdouble and to prove this it suffices to show that 0 h f+[lnt->Int->Int] : Int->Int->Int 0[lnt->Int-+Int/x + ] h Av[lnt].(x + (v))(v) : Int->Int where we have used a mixture of abbreviated and unabbreviated notation. The first result amounts to a single application of the axiom [f ]. To prove the second result it is sufficient to show that 0[lnt->Int->Int/x + ][lnt/v] V (x+(v))(v) : Int
(•)
since then the result can be obtained by a single application of the rule [A]. To show (•) we note that two applications of the axiom [x] give
12
2
Types Made Explicit
0[lnt->lnt->lnt/x + ][lnt/v] h x+ : Int->Int->Int 0[lnt->lnt->lnt/x + ][lnt/v] h v : Int Using the rule [()] we then have
0[lnt->lnt->lnt/x + ][lnt/v] h x + (v) : Int-> Int and one more application of the rule [()] gives 0[lnt->Int->Int/x + ][lnt/v] h (x+(v))(v) : Int as desired.
•
In the syntax of the typed A-calculus we have included sufficient type information in the expressions for the type of an expression to be determined by the types of its free variables: Fact 2.1.4 The relation tenv\~e:t is functional in tenv and e; this means that tenv\~e:ti and tenvhe:t2 imply ti — t2. D Proof: This is a straightforward structural induction on e and is left to Exercise 4. • We could thus define a (partial) function type-of such that type-oj'(tenv,e) = t if and only if tenv h e : t. However, we shall prefer a phrase like 'let t be given by tenv h e : V or 'let tenv h e : t determine t\
2.2
Type Analysis
Let us now consider how to propagate type information from a type into an untyped A-expression so as to obtain an expression in the typed A-calculus. In other words we consider the transformation
VTA: P(UE,T)^P(E,T) However, we have to realize that so far we have made too few assumptions about the number of, and nature of, primitives of form f[[t]. To remedy this we shall define a constraint C as a partial map from {fj|iE/} to information about the permissible types. Taking an equality predicate f eq as an example we intend that it may be used as feq[t] whenever the type t has the form £'—^'-^Bool 2 . We model this by setting 2
It might be argued that t1 should not be allowed to contain function spaces as the extensional equality on function types is not computable in general and the intentional equality is of limited use. To express this we should modify C(f eq ) to use a form of bounded quantification rather than the general quantification indicated by V. However, this goes beyond the standard Hindley/Milner/Damas type inference and we shall abstain from this.
2.2
Type Analysis
13
q) = VXa. and to explain this notation we need some concepts from the literature on 'polymorphic type inference'; this will also serve as a useful preparation for constructing the translation
2.2.1
Polytypes
A syntactic category of type variables may be introduced by
tveTV tv ::= Xi | X2 | ...
(infinite)
Often in the literature type variables are denoted by Greek letters a, a x , f3 etc. or ; a, 'b etc. but this is entirely a matter of taste. A polytype is then like a type except that it may contain type variables and this motivates defining ptePT pt ::= A, | ptxpt
| pt—>pt | pt l i s t | tv
An example polytype is Xi—>Xi—>Bool. We shall write FTV(pt) for the set of (free) type variables in pt. If FTV(p£)=0 we shall say that pt is a monotype and we shall not distinguish between monotypes in PT and types in T. A type scheme then is a polytype where some of the type variables may be universally bound. More precisely
tseTS ts ::= pt | Wtv.ts Thus VXi.Xi—>Xi—>Bool is a type scheme as is VXi.Xi—>X2—*Bool. The free type
variables FTV(ts) of a type scheme ts = Vtoi.- • -\/tvn.pt are FTV(pt)\{tvu- • -,tvn}. A type scheme with an empty set of free type variables is said to be closed, for example the type scheme displayed for C(f eq) above. From now on we shall assume that the constraint C maps constants to closed type schemes. Closely related to polytypes is the notion of a substitution. There are several ways to explain substitutions and we prefer to model them as total functions from TV to PT. However, for a total function S:TV—>PT to qualify as a substitution we require that the set
Dom(S) = { tveTV | S{tv)^tv } is finite. Hence one could represent a substitution S as [pti/Xi,...,ptn/Xn] where Dom(S') = {X l v ..,X n } and 5(Xi) = pt\. We may extend a substitution to work on polytypes by an obvious structural induction and we shall feel free to write simply
14
2
Types Made Explicit
S(pt) for the result of applying the substitution S to the polytype pt. We may thus regard a substitution 5 as a total function S:PT—>PT and this allows us to define composition of substitutions as merely functional composition. We shall write FTV(S) - U { FTV(S(
X 1-->Bool (using a tion S. As an example Int—>Int—»Bool substitution 5 = [Int/Xi]). We shall say that 5 covers pt if Dom(S)DFTV(pt). When S covers pt and is a ground substitution the polytype S(pt) will be a monotype because all type variables in pt will be replaced by monotypes. Turning to a closed type scheme ts = Vtvi.- • -\/tvn.pt, an instance of ts is simply an instance of pt. A generic instance of the closed type scheme ts is an instance pt1 of pt such that pt is also an instance of pt'. So X2-+X2—>Bool is a generic instance of C(f eq ) whereas Int—>Int—>Bool is only an instance. Example 2.2.1 In this notation the general type scheme for the reduce function is ts = VXi. VX2. (Xx —> X2 —> X2) —> X2 —»
Xi l i s t —>•
X2
and the type of the reduce function in the sum program is t — (Int —> I n t —> Int) -> Int -> Int l i s t -> Int To see that the concrete type t is an instance of the general type scheme ts we define the polytype pt = (Xx -> X2 -> X2) -> X2 -» X! l i s t -> X2 and the substitution S by Int Xj
if Xj is Xi or X2 otherwise
We may note that Dom(S) = {Xi,X2} and that FTV(S) = 0 so that 5 is a ground substitution. Then t = S(pt) so that t is an instance of pt and ts. However, t is not a generic instance of ts because pt is not an instance of t. n We can then define a modified typing relation tenv \~c e : t
which is like tenv h e : t except that
2.2
Type Analysis
15
tenv hc fi[i] : t if fjEdom(C) and t is an instance of C(f\) Similarly, we shall write
if a program p is well-formed under this assumption. So if C(f eq) is as displayed above we are sure that feq[£] is allowed if and only if t is of the form t'—tt'—->Bool. Example 2.2.2 Returning to Examples 2.1.2 and 2.1.3 we still have he double provided that C(f+) is as usual, i.e. C(f+) = Int —> Int -+ Int Note that Int —> interesting one.
Int —*
Int is indeed a closed type scheme, albeit not a very •
The main purpose of substitutions is to make distinct polytypes equal. So if pti and pt2 are distinct polytypes we want a unifying substitution 5 such that S(pti) = S(pt2). Such a substitution need not exist, e.g. if p£i=Bool and p£ 2 =Int, and if it exists it need not be unique, e.g. if pti=Xx and pt2—X 2. When a unifying substitution S exists we want it to be as 'small' as possible, i.e. whenever Sf is a substitution such that Sf(pti)=Sf(pt2) there should exist a substitution S" such that Sf=S/foS. That this is possible in general follows from: Lemma 2.2.3 (Robinson [90]) There exists an algorithm U which when supplied with a set {pti,...,ptn} °f P°lytypes, either fails or produces a substitution S: • It fails if and only if there exists no substitution 5 such that S(pti) = ... = S(Ptn). • If it produces a substitution S then — S unifies {ptu...,ptn}, i.e. S(pti) = ... = S(ptn), — S is a most general unifier for {p£lv..,p£n}, i.e. whenever S'(pti) = ... = S'(ptn) for a substitution Sf there exists a substitution S" such that S'=S"oS, — S only involves type variables in the pt\\ a bit more formally this may be expressed as Dom(5)UFTV(5) C (JLi FTV(pfi). • We shall sketch a construction of an algorithm U in Exercise 5. Note, however, that the lemma does not guarantee the existence of a unique substitution but merely a most general unifier (if any unifier exists). So writing I for the identity substitution one is free to let i/({X1,X2}) be either Ip^/X^ or I[Xi/X2].
16
2
2.2.2
Types Made Explicit
The algorithm
We now have sufficient background information for approaching the central ingredient in the transformation from 'untyped programs' in P(UE,T) to 'typed programs' in P(E,T). This amounts to inferring a typed A-expression e given an untyped A-expression ue. For this we shall allow the types in the A-expression e to be polytypes. For technical reasons we shall further request that variables be annotated with their types in the same way constants are. In other words, rather than producing an expression e G E we shall produce a polytyped expression pe G PE where the syntactic category of polytyped expressions is given by pe G PE |.. .| Xxilptl .pe |.. .| xj [pf] |..
pe ::= filpO
Clearly, if we remove the polytypes in pe we must obtain the untyped A-expression ue that we started with. We may formalise this by the condition ue = 6TA{pe) for an obvious type erasing function e^A'PE—^UE. Example 2.2.4 One may calculate that ^TA(AX![X!
l i s t -> I n t ] . XiCXa l i s t - • I n t ] ( n i l C X J ) )
amounts to Axi.Xi(nil).
•
The functionality of the type analysis algorithm thus is £$A: UE ^
PE x PT
because it is helpful to obtain also the overall polytype of the polytyped expression. This function is partial, i.e. is allowed to /az7, because it will need to use the algorithm U that is also allowed to fail. The functionality of £%A is slightly simpler here than in some other presentations in the literature. This is because the explicit typing of the polytyped variables obviates the need for producing a set of assumptions A G / Pfi n ({xi|iG/}xPT), that is a finite subset A of {xjJiG/} X PT. Instead we must define a function
ATA:
PE -> PfinUxiliel} x PT)
that extracts the required information. The definition of this function is by analogy with the definition of the free (expression) variables FEV(e) of an expression e and is left to Exercise 7. Example 2.2.5 Consider the expression
2.2
T y p e Analysis
17
e = Ax 1.((x1,x2),(x2,x1)) The set of free variables in e is FEV(e) = {x 2 }. Similarly consider the poly typed expression pe = A X l [ I n t ] . (( X l [Int],x 2 [Bool]), (x2 [ I n t ] , X l [Bool])) The assumptions that are free in pe are
ATA{P^)
=
{(x 2 ,Bool),(x 2 ,Int)}.
a
Before presenting the definition of £%A we need to introduce the following notation. For A G Pfm({xi | i G / } x PT) we shall write YTV{A) = U { FTV(pt) I {xupt)eA
}
for the type variables that A 'uses', SoA = { (xi,5(p«)) I (*,pt)eA
}
for the result of applying a substitution 5 to an assumption A, Ax = { (xi,pt)eA
| Xi^x }
for the part of A that does not involve the variable x, and A(x) = {Pt\
{x,pt)eA
}
for the set of polytypes that A associates with x. We shall say that A is functional if each A(x) is empty or a singleton. When A is functional we may identify it with the partial function, fun(yl), that maps xj to pt whenever (x-^pt)EA. With respect to the notation graph(-• •) introduced earlier, we observe that i4=graph(fun(>l)) for any functional set and that tenv—(\in(gra,ph(tenv)) for any partial function tenv. The type analysis algorithm £%A is defined in Tables 2.4 and 2.5 by a structural induction in which we ask for 'fresh' type variables, i.e. type variables that have not been used before3. Also, we extend substitutions to work on polytyped expressions in much the same way they were extended to work on polytypes. The intention is that £pA fa^s ^ an Y °f the invocations of U fail or if an f \ is encountered that is not in the domain of the constraint C. Basically £%A inspects its argument in a bottom-up manner. For each occurrence of a variable it introduces a 'fresh' type variable. These type variables may be replaced by other polytypes as a result of unifying the polytypes of various sub-expressions. However, only when the enclosing A-abstraction is encountered will the polytypes of the variable occurrences be unified. This is illustrated in the following example. 3
This could be made more precise by adding the set of 'fresh' type variables as an extra argument to £J?A ^u^ following tradition we shall dispense with this.
18
2
Types M a d e Explicit
(fi\pt],pt) if
an
^ P^ is a generic instance of ts a n d F T V ( p ^ ) are all 'fresh' otherwise
cc ir -c n J TAl f i 1 = \
fc
fail
(ue1,ue2) ] = let (peuptx) = £ let {pe2,pt2) = ^ in ((pei,pe 2 ), p*i f s t we
I =
let
(?e>P*) = ^TAiweI let ^ ! and tv2 be 'fresh'
let S =U({pt, tvxxtv2}) in (f st S(pe)9 S(tvx)) snd ^e ] = let (pe,pO = E^A{uej let ^ i and ^ 2 be 'fresh' let S =U{{pt, *t7ix*v2}) in (snd S(pe), S(tv2)) Xxi.ue ] = let {pe.pt) = ^ A [ M e l let ^ be 'fresh' let 5 = W({«v}U^ T in (Axi[5(tt;)].5(pc), = let (pe x ,pii) = fi let (pe2,pt2) = £$A[ue2] let ^ be 'fresh'
let S =U({ptu £TA! XI
I =
pt2-^tv})
l e t tv b e
' in (xiCfv], tv) let (pe2)pf2) = let 5 = W({p*i l i s t , pi 2 }) in (S(pei):S(pe 2 ), 5(p^2))
hd we ] = let (pe,p^ = £$A[ue] let ^ be 'fresh' let S = W({pf,
2.2
Type Analysis c c IT 4-n T A II
19 "n —
7/P
II —
let (pe,pt) = £^A\ueJ let to be 4fresh' let 5 = U({pt, tv list}) in (tl S(pe), S(pt))
^TA! n i l I — le^ ^ be 'fresh' in (nil[to], to l i s t ) T STL i 1
i
cC
IT
J_
7/ P
] = let (pe,pt) = £^Aliie} let to be 'fresh' let S = W({pt, to list}) in (isnil S(pe), Bool) [true, Bool)
11
(false, Bool)
^TA! i^ ^ i "then we2 else ue3 ] = let (pei,pti) = £^A{iiei]
let (pe2,pt2) = ^TA[we2] let (pe3,pt3) = £$Alue3]
let 5i = W({pii, Bool}) let 5 2 = U({j)t2, pt3}) in (if 5i(pei) then S2(pe2) else S2(pe3),S2(pt2))
5 T A [ f x x W e ] - = let (pe.pt) = £rAl we I
let to be cfresh' let S — U({pt, to—>to})
in (fix S(pe), S(tv))
Table 2.5: £%A: Type analysis of expressions (part 2) Example 2.2.6 Consider the (untyped) expression Ag.Ax.g(g(x)) and the computation of £^A[\g.\x.g(g(x))}. these we obtain
This initiates a series of calls. In
where Xi, X2 and X3 are distinct type variables. Using £/({X2,X3—>X4}) = I[X 3 ^X 4 /X 2 ] we get X 4 ] ( x [ X 3 ] ) , X4)
20
2
Types Made Explicit
Repeating the process we use ZY({X1,X4—>X 5}) = I[X4—>X 5/Xi] and we get £xAlg(g(*))I = (g[X 4 ^X 5 3 (g[X 3 ^X 4 ] (x[X 3 ])), X5) To compute £^A[Ax.g(g(x))] we observe that
and U{{X3,X6}) = I[X3/X6] so that £?A[Ax.g(g(x))] = (Ax[X3].g[X4-+X5] (g[X3->X4] (x[X 3 ])),X 3 ^X 5 ) Note that the two occurrences of g are annotated with different polytypes. It is only when encountering the enclosing Ag that the necessary identifications will be made. This is illustrated when computing £%Al\g.\x.g(g(x))}. Here we observe that {(g,X4->X5),(g,X3^X4)} and that ^({X 7 ,X 4 ^X 5 ,X 3 ^X 4 }) = I[X3/X4][X3/X5][X3^X3/X7] so that ^ A [Ag.Ax.g(g(x))] =
This completes the calculations.
•
We can now define the translation function VjA(ov programs. It has functionality V$A : P{UE,T) ^
P{E,T)
and it is partial because it invokes £%A and U and these might fail. The definition is DEF xl = ue1 • • • DEF xn=uen VAL ue0 HAS t ] = let ((Axi[p«i].--- (Xx n[ptn].pe0)(pen) --)(pei), ^TAI (AXI.- • • (\xn.ue0)(uen) •
pi) =
• -)(nei) ]
let 5i = U{{pt,i}) let S2 = (Atv.Void) o 5i in DEF xi=e'TA{S2{pei)) • • • DEF x n =4 A (5 2 (pe n )) VAL e'TA{S2{peo))
HAS t
The general idea is to use the close relationship between programs, e.g.
2.2
Type Analysis
21
DEF Xx = uei VAL ue0 HAS t and expressions containing a A-abstraction that is immediately applied to an argument, e.g.
(\x1.ue0)(uei) To obtain the desired result we need to unify the overall polytype produced by £TA with the monotype, £, supplied. Also we need to get rid of any remaining type variables and this motivates the substitution 4 Atv.Void that replaces type variables with the uninteresting type Void. Finally, the expressions S^pej) are not yet in E because variables in $2{pei) will be annotated with their types and this is not the case for expressions in E. We therefore need to use the type erasing function £^A that has
but otherwise behaves as the identity. Example 2.2.7 If we used S2=Si instead of S' 2=(A^.Void)oS'1 we would get V$A[ VAL (Axi.true) (Ax2.x2) HAS Bool ] to be VAL (Axi[Xi->Xi].true) (Ax 2 [Xi].x 2 ) HAS Bool but with 5 2 =(A^.Void)o5'i we get VAL (AxaCVoid-*Void] .true) (Ax2 [Void] .x2) HAS Bool which is a program in P(E,T). been
If we had dispensed with £^A the result would have
VAL (AxiCVoid-^Void] .true) (Ax2 [Void] .x 2 [Void]) HAS Bool which does not conform to the syntax of P(E,T). 4
•
Actually, Dom(Ata.Void) is the infinite set TV so in order to satisfy the conditions on substitutions we should use \iv. if tv£{J[FTV(Si(pe[)) then Void else tv.
22
2
2.2.3
Types Made Explicit
Syntactic soundness and completeness
To express soundness and completeness of £%A we need two auxiliary sets. The set we) = { (tenv,e,t) \ tenv\-ce:t A ue= A dom(«ent;)=FEV(ttc) } contains the set of triples of the form (tenv,e,t) such that e has type t, that is tenv he e : t, and e equals ue when the types are erased. For technical reasons it is assumed that the domain of the type environment equals the set of free (expression) variables in ue and hence in e. Analogously, the set INSc(ue) = { (SoATA(pe), e'TA(S(pe)), S(pt)) | figjue] = (pe.pt) A Dom(S) D FTV(pe)UFTV(p<) A S is a ground substitution A SOATA{P^) is functional } contains a set of triples obtained from £pA.I[weI = {Pe->pt)- The type is obtained as S(pt), that is a ground substitution applied to the polytype pt. The expression would similarly be S(pe), except that the poly typed variables are annotated with their types so that we have to use the erasing function efTA. Finally, the analogue of the type environment is SoAT?A(pe) where it is required that this set may be regarded as a function. It should be stressed that INSc(we) is the empty set, 0, if £pA[i/e] fails. Furthermore, it should be noted that for the two sets WFFc(ue) and to be comparable we should really have replaced tenv by graph(£env), or by fun(5^*4.TA(PC))•> but for readability we shall retain this imprecision. Proposition 2.2.8 (Soundness and Completeness of WFFc(ue)
= INSc(ue)
for all expressions ue£ UE and all constraints C.
D
Discussion: This proposition is really a conjunction of two results. One is the soundness (or correctness) result WFFc(ue)
D INSc(we)
which says that £%A only specifies well-formed expressions. The other result is the completeness result WFFc(ue)
C INSc(ue)
which says that each and every well-formed expression is obtainable from the result that £J?A specifies. Proof: We proceed by structural induction on ue using the cfresh'-ness of the
2.2
Type Analysis
23
type variables generated in £%A and leaving some of the less interesting cases as an exercise. The case ue::=fi. If fi^dom(C) both WFFc(ue) assume that fiEdom(C). We have WFFc(ue)
and INSc(we) are empty so
= { (0,fi[i],t) | i is an instance of C(fi) }
INSc(we) = { (0,fi[5(pi)]^(pO) I S c o v e r s P*i S i s ground, and pt is a generic instance of C(fi) with FTV(pi) all being 'fresh' } The result then follows because the set of types that are instances of a closed type scheme equals the set of types that are ground instances of a polytype that is a generic instance of the same closed type scheme. The case ue::=\xi.ueo. We assume first that xj £ FEV(we 0). We then have WFFc(ue)
={ (tenv, Axi[f].e 0 , *-»*o) | (ient7[t/xi],eo,«o) e WFF c (we 0 ) A dom(teni;) = FEV(ue) }
because the inference system in Table 2.3 is such that [A] must be the last rule used in a proof of well-formedness of Ax].e0. Writing tenv\X for the restriction of an environment tenv to a subset X of its domain &om(tenv) we have WFFc(ue)
={ (tenv\FEV(ue), Axi[^eni;(xi)] .e 0 , } o,*o) £ WFFc{ue0)
Turning to the other set we have
= { ((SoS')oATA(pe0)Xi, A (SoS'){tv)->{SoS')(pt0))
I ^ T A I ^ O ] = {Peo,pto) A tv 'fresh' A ^({^}U^ T A (peo)(x i ))=5 / A S covers all of 5r/o%4TA(pe0)Xi, S'(tv),
and S'(pto)
S'(pe0)
A 5 is ground A (SoSf)oATA{peo)xi is functional } where we have used
(SoSf)oATA{peo)xi = So(S (SoS')(pe0)=S(Sf(pe0)), (SoS')(pto)=S(S'(pt0)),
and
24
set or, we we
2 Types Made Explicit Next observe that ((5 r o5 f/ )o^ TA (pe 0 ))(x i ) is a singleton because S' unifies the ATA(P^O)(^I) and hence (SoS')(tv) may be replaced by ((5o5")o^4 TA(pe0))(xi) to be precise, the single element in that set. Since FEV(^e) = FEV(weo)\{xi} may replace (*So5'/)o^4TA(peo)xi by ((5'o5'/)ov4TA(peo))[FEV(we). Using this have
= { (((SoSf)oATA(peoWEV(ue),
Axi[((5o5')o^TA(pe0))(xi)]. e'TA((SoS')(pe0)), 5 / )o^ TA (pco))(x i ) -* (SoS')(pt0)) = (peo,pto) A ((5ro5r/)o^4TA(pe0))(xi) is a singleton A (SoSf) covers all of . A T A ^ O ) ^ , ATA{P^O){^I), peo and pto
A (SoS') is ground A ((SoS')oATA{peo))\FEV(ue)
is functional }
because the need for tv has vanished, given that ((SoSf)oATA(peo)){*i) has replaced (SoS')(tv), and that it follows from Lemma 2.2.3 that the (SoSf) ranged over in the previous equation for INSc equal the (SoSf) ranged over in the present equation for INSc- We then have INSc(^e) = { (tenv\FEV(ue), Xx[[tenv(xi)}.e0, tenv(x[)—>t 0) | (tenv,eo,to) G INSc(we0) } The desired result then follows from the induction hypothesis. Finally, we observe that if xi 0 FEV(weo) the result may be obtained in much the same manner. The case ue::=uei(ue2).
We have
A (tenv[FEV(tte 2),e2,J2) € WFFc (^e 2 ) A dom(*eni;)-= FEV(tte) } because the inference system of Table 2.3 is such that [()] must be the last rule used in a proof of well-formedness of ei(e2). Furthermore | (tenvueut2-*t) G A (tenv2,e2,t2) € WFFc (we 2 ) A VxGFEV(wei)nFEV(ue2): tenv1(x)=tenv2{x)
}
Here we write tenviUtenv2 for the partial function that maps xEFEV(i/ei) to tenv^x) and xGFEV(^e 2 ) to tenv2(x)\ this does not give rise to any confusion as tenvi(x)=tenv2(x) whenever xGFEV(wei)flFEV(we2). (If we wanted to be more precise about the distinction between a partial function and a functional set we could have written fun(graph(feni;1)Ugraph(fen^2)) instead of tenvi{Jtenv2.) Turning to the other set we have
2.2
Type Analysis
25
= { ((5o5'o4 T A (pe 1 )) U (SoS'o.4 TA (pe 2 )), ((5o5')(pe 2 ))), (S A £xAl ue 2l = (p^2,pt2)
A *» 'fresh' A U({ptu
pt2^tv})=S'
A 5 covers all of 5'o>4TA(pe1), 5'o 5'(pei), S'{pe2) and S'(*») A 5 is ground A (SoS'oATA(pei)) U {SoS'oATA{pe2)) is functional } where we have used
ATA(P^I
(p^)) =
(ue) = { ((SnoATA(pei))
A ^TA[W62I =
^TAIP^I)
U V4TA(P^2)- Furthermore
U (S"o,4 TA (pe 2 )),
(pe2,pt2)
A 5"(^i) - S»{pt 2)^t A 5 / ; covers all of - 4 T A ( P C I ) , AIA{P^2)->
P^U Pe2,
pti and pt2 A 5 ; / is ground A 5//o(w4TA(pei)U*4TA(pe2)) is functional } and since the 'fresh' type variables generated by ££^1^!] are disjoint from those generated by ££^[we2] we have INSo(tie) = { ((5i'o^TA(pei)) U
(S2'OATA(P^)),
£TA(5r(pei)(5^(pe 2 ))),0 I £rAl« eiI = {PeuPh) A £r A [ue 2 ] = (pe2,pt2) A S'Kph) = S%(Pt3)^t A S" covers all of *4TA(P^I), pei, and pt\ A ^2 covers all of AIA{P£2)'>
P^2-> and pt2
A S" is ground A S^OATA(P^I) is functional A 5j is ground A Sf2foATA{pe2) is functional A VxGFEV(we1)nFEV(^e2): (^o^ T A (pe 1 ))(x) = (Sf2'oATA(pe2))(x)
A (tenv2,e2,t2) G A VxGFEV(we!)nFEV(we2):
tenv1(x)=tenv2(x)
and the desired result then follows from the induction hypothesis. The case ue::=x\. We have
26
2 Types Made Explicit WFFc(ue) = { (0[*/xi],Xi,*) \ teT } WSc(ue) = { {{(Ti,S(tv))},Xi,S{tv)) | tv 4fresh', S is ground, and Dom(S)={tv} }
and clearly these sets are equal. The remaining cases will be left as an exercise as they follow the general pattern of the previous ones. In fact we may express this similarity in the following strong way. We shall take ue ::= if ue\ then ue2 else ue3 as our example since this is one of the harder cases. We may assume that a.
Bool->X1->X1->X1
and it then easily follows that (tenv, if ei then e2 else e3, t) G WFFc(if ue\ then ue2 else (tenv, fif[Bool->t->f->f](ei)(e 2)(e3), t) G WFFc(fif(t£c1)(i£c2)(Me3)) We also have
(tenv, if ei then e2 else e3, t) G INSc(if uei then ue2 else ue3)
t (tenv, f i f[Bool^i^^i](e 1 )(e 2 )(e 3 ), t) G INSc(fif(«ci)(«e 2)(tte3))
because e
2 else ue3 ] = (if pe! then pe2 else pe3, p^) if and only if
for a bijection S'ITF—>TF (which is needed because the different invocations of £PA n e e d not generate the same type variables in corresponding places). Hence the desired result follows from the cases that we have proved already. • The correctness of the translation of programs now follows. In the formulation we shall make use of the type erasing function
2.2
T y p e Analysis
27
defined by 7TTA[ DEF x1 = e1 ... DEF x n =e n VAL e HAS t ] = DEF X ^ C T A I C I ] .- DEF x n -£ T A[e n ] VAL e TA [e] HAS t T h e o r e m 2.2.9 (Soundness and Completeness of V^A) Consider a program up = DEF xi = uei ... DEF x n =ue n VAL ue HAS £ and consider the outcome of 'PXAI^PI- ^ ^ produces the program p = DEF xi = ex ... DEF x n =e n VAL e HAS t then • p is well-formed, i.e. he p, • the underlying program of p is up, i.e. up=7r^A(p). lfV$Alup}
fails then
• there is no well-formed program p in P(E,T) that has up as its underlying • program. Proof: This is a straightforward corollary of Proposition 2.2.8.
•
If we compare the theorem with Proposition 2.2.8 we note that the soundness part, that ^ X A I ^ P ] ^S well-formed, is as we would expect because we have arranged it such that there are no type variables left in T ^ A ! ^ ] and hence there is no need to consider ground substitutions. The completeness part, that T^XAI^PI on^y ^ a ^ s if there are no well-formed programs, is slightly weaker than in Proposition 2.2.8 because we do not claim that every well-formed program may be obtained from ^ T A I ^ I - This is the price to pay for having replaced type variables by Void; however, many properties of programs will be 'polymorphic invariant' (in the sense of [1]) and for these the 'arbitrariness' does not matter.
Bibliographical Notes The original type analysis algorithm was developed by Milner in [57] and is related to work by Hindley [37]. The type analysis algorithm presented here is inspired by Chapter 1 in L. Damas' Ph.D.-thesis [23]. The formulation of soundness and completeness has much in common with the treatment in [1].
28
2
Types Made Explicit
It is worth observing that the type inference system of Table 2.3 ensures that the expressions are monotyped; the use of polytypes (in the type analysis algorithm) is merely an aid in calculating the desired monotypes. It would be relatively easy to adapt the type inference algorithm so as to allow the DEF construct to introduce polymorphism. However, profound changes would then have to be made in the type inference system (not to speak of the development of the following chapters). Our treatment of soundness and completeness is purely syntactic in that we did not consider any (denotational) semantics of the untyped or typed A-calculus. This is contrary to the development of [57] where the untyped A-calculus is given a denotational semantics and it is proved that 'well-typed programs do not go wrong'. Continuing along these lines, a series of papers have studied how to map types to the sets of semantic values described; this centers around non-empty and Scott-closed sets (called ideals in [53]). However, in this book we have taken the perspective that type analysis, as well as binding time analysis (Chapter 3) and combinator introduction (Chapter 4), is merely a preprocessing stage before developing our notion of parameterized semantics in Chapter 5. A very brief appraisal of the material of this chapter may be found in [22].
Exercises 1. Use Table 2.3 to infer the type of the sum program. 2. Consider the following program computing the length of string lists DEF reduce = Af .Au.f ix(Ag.Axs. if i s n i l xs then u e l s e f (hd xs) (g ( t l xs))
DEF sum = reduce (Ax.Ay.+ ((x,y))) (0) DEF map = Af.reduce (Ax.Axs.f (x):xs) (nil) VAL Axs.sum (map (Ax.l) (xs)) HAS String l i s t —> Int
• Infer the type of the various subexpressions, in particular determine the type of the two occurrences of reduce. Is the program well-formed according to Section 2.1? • Duplicate the definition of reduce so that the program gets the form DEF reducel = • • • DEF reduce2 DEF sum = reducel • • • DEF map = Af .reduce2 • • • VAL Axs.--- HAS ••• Annotate this program with type information. Is it well-formed?
2.2
Type Analysis
29
3. (*) It is undesirable that the original length program from Exercise 2 is not well-formed. As suggested in Exercise 2 one approach might be to duplicate function definitions. Try to modify the development in Section 2.2 such that function definitions are duplicated as the need arises. How would you formulate soundness and completeness? 4. Prove that tenv\~e:t is functional in tenv and e, i.e. if tenv\~e:ti and tenv\-e:t2 then ti = t2. Observe that this holds for tenv\~ce:t as well. (Hint: use structural induction one.) 5. Consider the algorithm INPUT: two polytypes pt\ and pt2 OUTPUT: a substitution S or failure METHOD:
5:=I; Xl:=ptl] X2:=pt2; WHILE X1^X2 DO D := the first node in a preorder traversal where XI differs from X2\ Yl := the subtype of XI starting at D\ Y2 := the subtype of X2 starting at D\
IF Y2eTV A YlgTV TREN (Y1,Y2) := {Y2,Y1)\ IF Y1(£TV V FTV(F2)nFTV(Fi)^0 THEN fail, S:=(l[Y2/Yl})oS; X2:=(l[Y2/Yl]){X2)] Prove that this algorithm behaves as the algorithm U guaranteed by Lemma 2.2.3 (in the case of two inputs). To do so it may be helpful to show that exists if and only if U({X1 ,X2}) exists) and
(U({ptupt2}) exists implies U({ptupt2}) = U{{X1,X2}) o S) is an invariant of the WHILE loop. Deduce that the algorithm behaves as U provided that it terminates. To show that the algorithm always terminates it suffices to show that FTV(Xi)UFTV(A^) decreases on each iteration. 6. Give an inductive definition of SJA'-PE^UE to e^A'.E-^UE. Similarly define efTA.
and note that this carries over
7. Give an inductive definition of the free (expression) variables FEV(ue) in ue. Note that FEV(e) and FEV(pe) may be defined in much the same way. Give an inductive definition of *4
30
2
Types Made Explicit
8. Let tenv\X denote the type environment whose domain is XCdom(tenv). Prove that if tenv h e : t and tenv' — tenv\FEV(e)
then tenv' h e : t.
9. Use the type analysis algorithm V^pio compute the type of the sum program. 10. Prove that if ££ A[ue] = (pe.pt) then FTV(pt) C FTV(pe). (Hint: use structural induction on ue.) 11. Consider the clause for £XAI ^ ue^ "then ue2 else ue3 ] in Table 2.5. A possible modification is to replace the last two lines by:
let
S2=U({S1(pt2),S1(pt3)})
in (if S2{pei) then S2(pe2) else S2(pe3), S2(pt2)) Discuss whether or not this makes a difference (modulo a bijective renaming of type variables). 12. In this exercise we extend the typed A-calculus with notation for binary trees. Types are given by t : : = • • • | t tree and expressions are given by e ::= • • • | value e | l e f t e \ right e \ isatom e \ atom e | ei::e2 For us a binary tree may be a leaf (or atom) e with some data value value e, or it may be an internal node e with a left son l e f t e, and a right son right e; the expression isatom e indicates which of the two cases that applies. A tree with just one node is constructed by atom e whereas two trees are grafted together using the notation ei'.:e2. • Extend the well-formedness rules of Table 2.3. • Extend the type analysis algorithm of Tables 2.4 and 2.5. 13. In this exercise we extend the typed A-calculus with a monomorphic l e t expression. The syntax is e ::= \ l e t X[=ei i n e2
and the intention is that l e t x\=ei in e2 behaves as (AXJ[- • -].e2)(ei). • Extend the well-formedness rules of Table 2.3. • Extend the type analysis algorithm of Tables 2.4 and 2.5. Note that all occurrences of X] in e2 must have the same monotype; briefly discuss the complications that would arise from lifting this restriction.
2.2
Type Analysis
31
14. As a modification of the previous exercise consider the construct l e t r e c X[=ei in t2 where any occurrence of xj in e\ recursively refers to the Xj being defined. Repeat Exercise 13 for this construct.
Chapter 3 Binding Time Made Explicit Neither Miranda, Standard ML nor the enriched A-calculus has an explicit distinction between kinds of binding times. However, for higher-order functions we can distinguish between the parameters that are available and those that are not. In standard compiler terminology this corresponds to the distinction between compile-time and run-time. The idea is now to capture this implicit distinction between binding times and then to annotate the operations of the enriched A-calculus accordingly. In this chapter we present such a development for the enriched A-calculus by defining a so-called 2-level X-calculus. To be precise, Section 3.1 first presents the syntax of the 2-level A-calculus and the accompanying explanations indicate how it may carry over to more than two levels or a base language different from the typed A-calculus. Next we present well-formedness conditions for i?-level Aexpressions and again we sketch the more general setting. Of the many wellformedness definitions that are possible we choose one that interacts well with the approach to combinator introduction to be presented in Chapter 4. Section 3.2 then studies how to transform binding time information (in the form of £-level types) into an already typed A-expression. This transformation complements the transformation developed in Section 2.2.
3.1
The 2-Level A-Calculus
We shall use the types of functions to record when their parameters will be available and their results produced. For the program sumt of Chapter 2 it is clear that the list parameter is not available at compile-time and we shall record this by underlining the corresponding component of the type: Sumt = I n t l i s t —> i n t The fact that the argument list is not available at compile-time will have consequences for when the parameters are available for reduce. Again we shall record
33
34
3 Binding Time Made Explicit
teT2 t : : = Ai Ai
txt
txt
t->t
t list
t list
eeE2 e ::= ±\t\ ti[t] ( e , e ) (e,e) f st e fst e snd e snd e A Xi [* ].e Axi[f].e e ( e ) c(e) Xi e:e e:e | nil[f] nil[f] hd e | hd e t l e t l e isnil e i s n i l e | true true false false I if e then e e l s e e if e then e else e fix e fix e Table 3.1: The 2-level A-calculus this by underlining parts of the type: Reducet = (Int—>Int^>Int)-^Int-^ Int list—>Int Both Sumt and Reducet are examples of what we shall call 2-level types. This motivates defining the syntax of the IMevel A-calculus as indicated in Table 3.1. Here we have two constructs for each construct in the typed A-calculus of Table 2.1 on which we are based: one underlined construct and one construct that is not underlined1. Underlining indicates the non-availability of data, that is run-time entities, and non-underlining indicates availability of data, that is compile-time entities. One exception to this rule is that we have only one kind of variables; this is because a variable merely acts as a placeholder for the enclosing (underlined or non-underlined) A-abstraction. To prepare for the well-formedness rules and for the binding time analysis in Section 3.2, it is helpful to expand on the methodology behind the construction of the 2-level A-calculus. It should be clear that it is based upon two underlying notions: One is the underlying base language which in our case is the typed Acalculus as presented in Chapter 2. In general we just imagine a typed language L, that is a language with types and expressions. The other general notion is the distinction between when values are available. We have used the terminology 4compile-time' versus 4run-time' but other presentations of essentially the same material sometimes use the terminology 'early' versus 'late 7, 'known' versus 'unknown' or 'static' versus 'dynamic'. Writing c for compile-time and r for run-time we have the picture in Figure 3.1. Here r is below c because compile-time takes place before run-time and we imagine that this intuition is formalised by a partial order ^ with r ^ c 2 . In general, the distinction between when values are available is a distinction between binding times. So if we 1
In some papers overlining is used instead of 'non-underlining'. One is of course free to use a dual ordering instead; however, in terms of Scott's notion of information content it seems fair to say that 'compile-time' entities yield more information than 'run-time' entities, hence H e . 2
3.1
The 2-Level A-Calculus
35
Figure 3.1: The structure of 2 binding times are to consider a more general setup we will assume that there is a partially ordered set B=(B^) of binding times b^B with 6i^&2 whenever 62 takes place before b^. Using the common mathematical practice of writing 2 for any two-element set 3 we shall write 2 for the partially ordered set depicted in Figure 3.1, i.e. the elements are r and c and the partial order -< is given by r ^ r , r ^ c and c^c. One may then define the following general procedure for constructing the syntax of the B-level language L. The £-level A-calculus is obtained by taking B to be the partial order in Figure 3.1 and by letting L be the typed A-calculus. For types, the procedure is: For each type formation rule t::=(j)(ti,...,tn) in L and for each binding time b€B we add the rule i::=^*(ilv ..,f n ) to the types TB of the 5-level language L. In Table 3.1 we have written <j> for (f)c and for >r in order to obtain a more readable syntax. For expressions, the procedure is: For each expression formation rule e::=^(£i,...,£ n,ei,...,e m ) in L and for each binding time b£B we add the rule e::=<^ 6 (£ lv ..,£ n ,e lv ..,e m ) to the expressions TE of the 5-level language L. This would result in two kinds of variables, written Xj and xj, but in Table 3.1 we have written both as xj. As we are mostly interested in closed expressions, that is expressions without free variables, each occurrence of a variable Xj will have a unique defining occurrence of the form \n-\i\ or Axji]. The intention is that Xj has the same binding time information as the corresponding A has, just as the intention is that xj has the type t listed in connection with the corresponding A. Thus little or no generality is lost by having only one kind of variables. As we will also be using programs with the £-level A-calculus we should remark that we view the programs as something built on top of the i?-level A-calculus and, in general, on top of the £-level language L. As a consequence, if the language L already has a notion of programs then we are only interested in the programs corresponding to the binding time b£B given by the requirement that b is the greatest element in B (assuming that such an element exists). 3
In some approaches to numbers (e.g. [34]) a non-negative number n is built as the set of all the non-negative numbers below it. So 0 is the empty set, i.e. 0=0, 1 is the set having 0 as its only element, i.e. _/ = {0}, 2 is the set having 0 and 1 as elements, i.e. #={0,{0}}, etc.
36
3
Binding Time Made Explicit
To relate the 2-leve\ types, 2-\eve\ expressions and £-level programs to the types, expressions and programs of the previous chapter, it is helpful to have the transformation functions TBTA: T2
->
T
SETA: E2
-
E
P(E,T) which simply remove the underlinings. Similarly, we shall occasionally need the transformation functions T
BTA : T
-f
T2
BTA-
E -> E2
*"BTA:
P(E, T)
e
P{E2,T2)
which annotate each (relevant) symbol with the binding time information b. As usual, we may record annotations with r by underlinings and annotations with c by the absence of underlining.
Well-formedness of i?-level types The well-formedness conditions of the I?-level A-calculus are based on the wellformedness conditions of the underlying typed A-calculus (as presented in Chapter 2). In addition to ensuring that the type information agrees, we must now also ensure that the binding time information agrees, and a consequence of this is that well-formedness of the types themselves becomes of interest. To motivate our definition of a well-formed 2-leve\ type we need to take a closer look at the interplay between the compile-time level and the run-time level. Thinking in terms of a compiler it is quite clear that at compile-time we can manipulate pieces of code (to be executed at run-time) but we cannot manipulate entities computed at run-time. Hence at compile-time we cannot directly manipulate objects of type Int l i s t whereas we can manipulate objects of type Int l i s t —» Int because the latter type may be regarded as the type of code for functions (to be executed at run-time). So Int —> Int l i s t —> Int will be the type of a function that given an integer at compile-time will produce a piece of code that has to be executed at run-time. Similarly Int z± Int l i s t z± Int will be the type of a function to be executed at run-time whereas Int —> Int l i s t —> Int will be the type of a function to be executed at compile-time. Informally, the idea is that an 'all underlined' function type will always be wellformed; it will be called a frontier type. Furthermore, a type with no underlinings whatsoever will also be well-formed; it will be called a pure type. The well-formed compile-time types are then built from frontier types and pure types using the
3.1
T h e 2-Level A-Calculus
37
Reduce! = (int—»Int—>Int) —> Int —> Int l i s t —> Int Reduce 2 = (Int—»Int —*Int) —> Int —> Int l i s t —> Int Reduce 3 = (Int—> Int—> Int) -> Int -> Int l i s t —» Int Reduce 4 = (Int—»Int —>Int) —> Int —> Int l i s t —» Int Reduces = (Int—>Int —•>Int) —» Int —-> Int l i s t —» Int Reduce 6 = (Int—>Int->Int) —> Int z± Int l i s t ^± Int Reduce 7 = (Int—»Int—»Int) —» Int —> Int l i s t ^> Int Reduces = (Int—»Int—»Int) —» Int z± Int l i s t ^» Int Reduce 9 = (Int—>Int—»Int) —> Int ^ Int l i s t ^ Int Reduce 10 = (Int—•>Int —»Int) ^ Int ^» Int l i s t ^± Int
Table 3.2: Well-formed S-level types for reduce type constructors —^, x and l i s t . As an example consider the type of sum t; here we have two well-formed #-level types ! = Int l i s t —> Int Sum2 = Int l i s t —> Int
For the type of reduce we have the ten well-formed j?-level types shown in Table 3.2. Formally, well-formedness of types is given by a well-formedness relation h t :b saying that t is a well-formed type of binding time b. The typed A-calculus might be regarded as having just a single binding time and so it may seem surprising that it had no well-formedness relation h t. The reason is that the natural rules and axioms would be h *ix£ 2
I" h^t2
h t list
and so h t is always true. In the case of i?-level types, well-formedness is more discriminating as we shall see and thus there is a real need for a well-formedness relation. The first eight rules and axioms in Table 3.3 simply follow the structure of the type t. The last rule then allows types of different binding times to be mixed. It simply says that a well-formed run-time function type may be manipulated at compile-time. To motivate the rule we shall mention two potential generalizations that we decided not to incorporate. One is h t : r h *: c
38
3
h
[A]
Ai
: r
h U : r
h
\
h 1
:r
r
[A]
h A; : c
1V1
1- U : c
: r h r y :r h• t : i 1 list
h
-»*?
Binding Time Made Explicit
h
\- h • c
c h \- tl—*t 2 t 1
C C
c
c
•r :c
Table 3.3: Well-formedness of the ^-level types where t is not restrained to be of the form ^mi^- We reject this because £i—>£?:c unlike e.g. tiXt2:c, may be regarded as code for some machine and using the analogies from compiler construction it is clear that a compiler may (indeed should) manipulate code, i.e. entities of type £nz^2 :c 5 but that it may not manipulate other entities 'living' at run-time, e.g. entities of type tiXt 2:c. Another rule we decided not to incorporate is of the form h ...«..-:
c
h .••*•.• : r
This may be a more debatable choice but we shall see in Chapters 5 and 7 that compile-time types will be interpreted as structures called domains and that it will often be useful to interpret run-time types as enriched structures called algebraic lattices; since not all domains are algebraic lattices it would seem justified not to regard all compile-time types as being run-time types. Example 3.1.1 Returning to the types considered for the reduce function, we shall show that Reduce3 = (Int—»Int-»Int)-»Int->Int list—>Int is a well-formed type and that Reducet = (Int—»Int—>Int)—>Int—>Int list—>Int is not. The well-formedness of Reduce3, i.e. h Reduce 3 : c is a straightforward application of the rules in Table 3.3 using the rule [up] just once. It is harder to show the non-well-formedness of Reduce t , i.e. c}: h Reducet : b
3.1
The 2-Level A-Calculus
39
because one has, in principle, to consider an infinite set of deductions using the rules in Table 3.3. However, the only rule that would allow us to prove a statement of the form Hi—>t 2:b is [—»] and we see that then b=c. So if hReduce t:c then also hint—>>Int—>Int:c and hint—>Int list—>Int:c. Repeating this argument we see that hlnt:c and H i n t list)—>Int:c. Repeating the argument once more we see the need for H i n t l i s t ) : c . However, none of [A], [x], [z±] ov [list] are applicable because of the c, and none of [A], [x], [—>], [list] or [up] are applicable because of the form of Int l i s t . We thus have a contradiction and conclude that Reduce t is not well-formed. • The more we get into the actual details of the i?-level A-calculus the harder it is to give a similar development at the level of a 5-level language L and in particular the harder it gets to motivate the decisions made. However, we do wish to stress that our definition of the £-level A-calculus is a variation on a theme and that other variations may be studied in contexts where other intuitions or concepts from computer science need to be taken into account. So to stress this view we shall continue, in this section, to sketch the form of the general construction. For each well-formedness rule or axiom [PI
th
LU
tin
for types in L, and for each binding time &ES, we add the rule or axiom 1
J
l-^(*i,-",*n):ft
(This is harder if side conditions are present.) Finally, we add the rule for relating the binding times
[up] ltlj!U'"H:bhl
if *€* A b+b' A b
where $ is a fixed set of type constructors. In our case $={—>} and b^b 1 f\b
40
3
Binding Time Made Explicit
This says that the expression e has type t and that the type t has binding time b. In most cases the binding time b is uniquely determined by the type t but when t is of the form Jmt^-botlr b=c and b=r may be possible and the explicit occurrence of the intended b will then be used to restrict the use of the type t as we shall see below. Intuitively, the idea is to regard tenv h e : t : 6 as a judgement in a '6-context', i.e. a 4compile-time context' when b=c and a 4run-time context' when b=r. To make the technical definitions possible the type environment tenv is a map from a finite set of variables to pairs consisting of a type and a binding time. The definition is given in Tables 3.4 and 3.5 and all but the last two rules in Table 3.5 simply follow the structure of the expression e much as in Table 2.3 for the typed A-calculus. In other words, except for the last two rules in Table 3.5, we just have two copies of the rules in Table 2.3. The last two rules are of interest whenever we have a judgement of the form tenv h e : ti=±t2 : b because they allow to change b=c to b—x and (under certain conditions) to change b—r to b=c. Rule [down] thus allows to transfer a judgement in a compile-time context to the same judgement in a run-time context. Intuitively, this may be regarded as allowing a piece of code to be executed at runtime. Rule [up] then allows to transform a judgement in a run-time context back into a compile-time context. Intuitively, this means that a computation at runtime may be encapsulated as a piece of code that the compiler may manipulate. For this to be sensible the piece of code may not contain explicit references to other run-time data and this is expressed by the side-condition to rule [up]. Example 3.1.2 To illustrate the rules [up] and [down] we shall consider the 2level A-expressions tf"1 - Ax[(Az±A)z±(Az±A)]. Ay[Az±A]. x(y) A)->(A->A)]. Ay[A->A]. x(y) of intended type <*-i = ((A=±A)z±(Az±A)) ((A->AWA->A)) First we show that ^r~1 is a well-formed expression of type ty-i. To be precise we show that 0 h ^"1 : ^-i : c By using [A] twice we see that it is sufficient to show that tenv h x(y)_ : A—>A : c for tenv = 0[((A—>A)—>(A—>A):c)/x][(A—>A:c)/y]. Using [x] and [down] we get
3.1
The 2-Level A-Calculus
41
[1]
tenv KliW : f : r
if h f : r
[*]
tenv H f|[i] : £ : c
if h t : c
w
tenv h e2 : t 2 : r tenv r e\ '. ti : r tenv h (ei,e 2 ) : tiX.<2 : r tenv h e\ : t L : c ^env h e2 : t 2 tenv h (e a ,e 2 ) : ^ x ^ 2 : c tenv h c : ii X t2 ' T tenv h fst e : ti : r tenv h e : *! xt2 : c tenv 1- fst e : ti : c tenv h e : i ! x£ 2 : r tenv h snd e : *2 : r tenv \- e : Ux£ 2 : c tenv h snd e : t2 : c tenv[(t':r)/y :i] h e : < : r . f ^ t t tenv h Axi[^; \.e : ^^>^ : r tenv[(tf:c)/x :;] h e : t : c . r , f : c i t r~ t tenv h Axi[^' \.e : r —>c : c tenv \- ex : t '—>t : r ^env he?t : t': r ter iv h ex(e2) : £ : r tenv h d : t —' >t : c ^env h : *': c ter w h ei(e 2 ) : ^ : c if (xi,(^:6))Ggraph(^e7ii;) A h^:6 tenv hxi : i : 6 :
L\/J [fst] [f Qtl [XStj
[snd] [snd]
[A] [A]
c
J
LQ] [0]
W
[nil]
: r £env h e2 : t l i s t : r tenv h ei:e9 : ^ l i s t : r tenv h ei : *: c tenv h e2 : t l i s t : c tenv h ei:e2 : t l i s t : c if h t : r tenv h nil[<] : t l i s t : r
[nil]
tenv h nil[«; : t l i s t : c
tenv
[l]
M l-J
IMJ r
i
[ti] [tl]
if h t : c
tenv list : r tenv h hd e : t : r tenv h e : f l i s t : c tenv h hd e : t : c tenv h e : £ l i s t : r tenv h t l e : t l i s t : r tenv h e : * l i s t : c tenv h t l e : t l i s t : c
Table 3.4: Well-formedness of the i?-level A-calculus (part 1)
tenv h x : (A—>A)—>(A—>A)
: r
3
42
[true]
tenv F e : t l i s t : r tenv F i s n i l e : Bool : r tenv F e : t l i s t : c tenv F i s n i l e : Bool : c tenv F true : Bool : r
[true]
tenv F true : Bool : c
[false]
tenv F false : Bool : r
[false]
tenv F false : Bool : c
[isnil] [isnil]
Binding Time Made Explicit
tenv F e^ : t: r tenv F e-i : B o o l : r tenv F e2:t:r tenv F if d then e? else e^ : t : r tenv F ex : B o o l : c tenv F e2:t:c tenv F e3 : t: c h1Tf l i \ tenv F i f e\ then 62 e l s e e3 : t : c tenv F e : £—»£ : r \ '1 tenv F f i x e : t : r [^*] tenv F e : £—>£ : c [fix] tenv F fix e : t : c tenv F e: t: c i f , . . [down] tenv F e : * : r l J h * ' r tenv1 F e : f : r -r . / 1 A [up] tenv F 6 I t I C where graph (tenv r ) :::::: i(X' (t 'b))^ccraphftenv) b~rii\
[if]
Table 3.5: Well-formedness of the £-level A-calculus (part 2) and similarly tenv F y : A—>A : r Using [()] we get tenv F x(y)_ : A—>A : r
and the desired result then follows using [up] as tenv is already on the required form. (Intuitively this means that there are no A's having x(j)_ in its scope.) Next we show that ^ is not well-formed. Again this is a negative result and so a bit harder to substantiate. However, it should be reasonably clear that to show F
t :b
we would need t o show tenv F x ( y ) : A—>A : b for tenv = 0[((A—»A)-»(A—»A):c)/x][(A—>A:r)/y]. As [()] is the only rule that can introduce a •••(•••) we see that we may take b=c above and thus have to show
3.1
The 2-Level A-Calculus
43
tenv \~ y : A—>A : c Clearly [x] allows us to deduce tenv h y : A—>A : r but the side condition to rule [up] is not fulfilled and no other rule would allow us to show the desired result. We may thus conclude from our informal argument that ^ is not well-formed. This ends the technical explanation for why Vl/"1 is allowed but \I> is not. However, there is also a more intuitive explanation. The combinator ^ r~1 takes two pieces of code (x and y) as arguments and then returns a piece of code (x^y)J that views y as a closure and runs x on it to obtain a new closure. This would seem to be a perfectly sensible thing to do. Turning to the combinator $, it takes a first argument x that is a compile-time transformation on code, and a second argument y that is a closure. It then applies x at compile-time to the closure y which is not known until run-time, in order to produce a closure that can further be manipulated at run-time. This violates the idea that all compile-time computations take place before any run-time computation. • As we did for types we shall again sketch the ideas underlying the general construction of the well-formedness relation for expressions in the 5-level language L. The well-formedness rules for L, e.g. the typed A-calculus, may have various side conditions so we do not claim that the following explanation captures all the considerations needed. For each well-formedness rule or axiom • - - tenvn \~ e n : tn [PI i h ej : h W
i
tenv h
for expressions in L and for each binding time b£B we add the rule or axiom r R 6i
ey.ty.b
•••
tenvn\-en:tn:b
Finally, we add two rules for relating the binding times
[down] 1 J
lenvl
e :
l
:
tenv h e : t : 6
1 l if b^b A 6 —^ 6 A H : b ' '
h e' \ -X
if h h
^'
A h h
'-
A ht h A
where graph(£eni/] b') = {(x^(t\:b'1))£gralph.(tenv)
''
\
t™v'=tenv]b' ^{bi^b')}
(At this point it is helpful to assume that B is linearly ordered.) Comparing Tables 3.4 and 3.5 with the above general procedure we note the following features that are not fully accounted for. In the rules [f], [f], [A], [A], [x], [nil] and [nil] we have added a side condition; no analogue was present in Table
44
3
Binding Time Made Explicit
2.2 because there all types were well-formed. In the rules [A] and [A] the general procedure does not precisely characterize the dependency of tenv\ on tenv. Finally, we have only one rule [x] for variables in accord with the decision earlier in this section to have only one kind of variable. For later reference it is helpful to state a few simple properties of the wellformedness relation for the £-level A-calculus. For the statement of the first of these it is helpful to extend TBTA- T2 —± T to operate on type environments. For this we define rBTA(tenv)
- (A(<:6).T B TA(0) °
tenv
so that, for example, TRTA(0[(A—>A:C)/X]) = 0[A—»A/x]. Fact 3.1.3 If tenv h e : t : b then TBTA(^TW) h ^BTA(C) : TBTA(0-
D
Proof: This is a straightforward induction on the deduction tenv h e : t : b.
•
Fact 3.1.4 If tenv h e : t : b then h t : b.
•
Proof: This is a straightforward induction on the deduction tenv h e : t : b.
•
Finally, we have a (weak) analogue of Fact 2.1.4 about the well-formedness relation being functional: Fact 3.1.5 If tenv h e : ti : &i and tenv h e : t2 : b2 then £i = t2.
•
Proof: This is a straightforward structural induction on e.
•
Because of the rules [down] and [up] we clearly do not always have bi—b 2 in the above fact. However, if tx (and hence t2) is not a run-time function type (i.e. is not a frontier type) we will indeed have bi = b2. Finally, we must define the well-formedness relation tenv \~c e : t : b where C is a constraint as in Chapter 2. Again this amounts to supplying new definitions for the axioms [f ] and [f ] and these new axioms will be of the form [f]
tenv hc f\[t] : t : r
if h t : r, fiGdom(C) and t is a ,2-level instance of C(fj)
[f]
tenv he f\[t] : t : c
if h t : c, fjEdom(C) and t is a iMevel instance of C(fi)
To explain the notion of 2-level instance we observe that the notion of polytype generalizes to the notion of £-level polytype, i.e. pt2 G PT2 pt2 ::= Ai | A; | pt2 x. p*2 | pt2 x pfjg | pt2 z± pt2 l i s t | pfjg l i s t | tv
3.1
The 2-Level A-Calculus
45
Furthermore, the notion of substitution generalizes to both 5 : TV —» PT2 and S : PT2 —» PTS and the notion of instance therefore generalizes to 2-level polytypes as well. By a 2-level instance of a closed type scheme ts = VXi.- • -WXn.pt we then mean an instance of a £-level polytype pt2 for which Clearly a 2-leve\ instance t£T2 of a closed type scheme ts, as above, will have to be an instance of ts but the condition ensures more than just this. As an example, the i?-level instances of
TBTA(0
C(feq) = VXi. X ^ X x - ^ O O l include Int—>Int—>Bool and Int-*Int—»Bool but not Int—>Int—>Bool although TRTA (Int—>Int—»Bool) is an Int—*Int—>Bool is a well-formed 2-level type and instance of C(f eq ). In other words we ensure that the type variables must be the same #-level type. Well-formedness of 2-level programs For a program of the form DEF xi = ei • • • DEF x n =e n VAL e0 HAS t in P(E2,T2) we need to decide the binding times for the types of the t\. In accordance with the intention, expressed in Section 3.1, that the programs do not participate in the binding time distinction and thus correspond to the greatest binding time (assuming that such a binding time exists), we shall demand that the types of the e\ have binding time c. This motivates the rule 0 h e1 : t x : c ^ n _ 1 :c)/x n _ 1 ] h e n : ( n : c .[(t n :c)/x n ]h e o : t: c h DEF xi = e! • • • DEF x n =e n VAL e0 HAS t where again we use h p to express the well-formedness of the program p. Finally, we write he p for the well-formedness relation where all occurrences of f\[t] or fj[t] must have t to be a 5-level instance of C(f\). Example 3.1.6 Returning to the sumt program of Chapter 2 there are a couple of well-formed £-level programs that have sumt as the underlying program, i.e. that are equal to sumt when underlinings have been removed. One of the more interesting programs is sum9 given by
46
3
Binding Time M a d e Explicit
DEF reduce 9 = Af [lnt->Int->Int]. Att[lnt]. f i x (Ag[lnt l i s t z± Int]. Axs[lnt l i s t ] . if i s n i l xs then u e l s e f(hd x s K g ( t l xs)_)J VAL reduce 9 (Ax[lnt].Ay[lnt].+[lnt><.Int->Int].({x ?y)).) lO HAS I n t l i s t z± I n t where reduceg has type Reduce 9. Note that we cannot use Au[lnt].- • • instead of Au[lnt].- • • as the well-formedness conditions on types only allow us to manipulate run-time function types at compile-time and not run-time data types like I n t . The remaining well-formed #-level programs are less interesting. Corresponding to the 5-level type Reducei we have a program with no underlinings at all and corresponding to Reduce^ we have a program where all operations are underlined. These are the only well-formed 2-level programs with sumt as the underlying program. Note that in each of these programs the binding time for reduce is fixed and this does not cause problems because there is only one application of reduce in the program. In general, we may have to duplicate the definition of the function (see Exercise 15). •
3,2
Binding Time Analysis
In Section 2.2 we developed an algorithm for transferring type information into an otherwise untyped A-expression. We showed that when it succeeded the resulting program would be well-formed and that it failed only when it had to. This transformation naturally extends to an algorithm V$A: P{UE,T2)
--> P{E,T2)
for transferring the underlying type of a #-level type into an otherwise untyped A-expression. To be precise the extension is given by Xup. let DEF xi = wei • • • DEF xn=uen VAL ue HAS t2 = up let t = rBTA{t2) let DEF xi = ei • • • DEF x n =e n VAL e HAS t = 7>£A[DEF xi = ttex • • • DEF x n =ue n VAL ue HAS t] in DEF X! = ei • • • DEF x n =e n VAL e HAS t2 and this algorithm fails if and only if the explicit invocation of V^A fails. When the algorithm succeeds we know that the resulting program, or rather the result of applying TTBTA^O it, will be well-formed when regarded as a program in P(E,T). In this section we now extend the transformation process by developing an algorithm
3.2
Binding Time Analysis ^
47
P(E2J2)
for transferring binding time information (in the form of i?-level types) into a typed A-expression. By combining V^TA and V^A we thus have an algorithm for transferring type and binding time information into an untyped A-expression. By analogy with V^A we require the program produced by ^ETA ^° ^e well-formed as a program in P(E2,T2). This is not all, however, as this would be possible if V^A just underlined every basic construct and thereby transferred all computations to run-time. Rather, we insist that as many computations as possible are performed at compile-time in order to achieve greater run-time efficiency, as then code has to be generated for fewer computations. To formalise the condition that as many computations as possible are performed at compile-time we shall extend the partial order ^ from binding times to apply also to i?-level types, 2-\eve\ expressions and 2-level programs. Taking i?-level types as an example the idea is that
if ti and t2 are the same types once we forget about underlining and if every symbol underlined in t2 is also underlined in ^ . So the intention will be that, for example, Int—>Int—>Bool ^ Int—>Int—>Bool
•< Int->Int—>Bool
This can be expressed more formally by giving a structural definition of ^ but we shall dispense with the details; it is important to note, however, that ti^t 2 implies that TBTA(^I) = T BTA(^2)- In a similar way we can define
and
Example 3.2.1 The ordering between the ten well-formed i?-level types for reduce is shown in Figure 3.2. • A subset Y of a partially ordered set may have a greatest lower bound F\Y: a lower bound yo of Y is an element such that Vo 1^ V f°r aU elements y of Y and a greatest lower bound l~l Y is a lower bound such that Vo di n Y for all lower bounds y0 of Y
48
3
Binding Time Made Explicit
Reducei Reduce2 Reduce4
Reduce3
Reduce5
Reduce7
Reduce6
Reduces
Reduceq Reduceio Figure 3.2: Comparison of the iMevel types for reduce Since a partial order is antisymmetric the greatest lower bound is unique if it exists. We usually write yir\y2 for \l{yi,y2}. For the partially ordered set 2 any subset has a greatest lower bound: fl Y = c unless r G Y in which case fl Y = r. A subset Y of T2 is said to be consistent if {TBTx(y)\y £ Y] is a singleton. This notion is of interest because all consistent subsets of T2 have greatest lower bounds: simply underline a symbol if it is underlined in any of the elements of the (non-empty) set Y. Similar remarks apply to consistent subsets of E2 and of P(E2,T2).
3.2.1
Binding time analysis of types
We begin with developing a transformation algorithm T2X2
-> T2x2
for types. Here, as in Section 3.1, we write 2 for the partially ordered set with elements c and r and partial order given by r ^ r , r ^ c and c^c. We shall write the elements in T2x2 in the form t:b rather than in the more usual form (£,&) and we shall write ti'.bi
3.2
Binding Time Analysis
TBTAI
TBTA[
49
k\l:b2 ] = let b = blUb2 in Af:6 t1xblt2:b2 ] = = let
t[:bl'
= TBTAI h:
let t'2:b2' = TBTAI h- blV\b2
]
in if 67' = b2'
then t[xbl'tf2:blf where b = bl' TBTAI
tx^t2:b2\
= let ^ r & i ' = T^TAI *I :6in6^ let ^ifcjg' = TBTAI *2 :bmb2
in if 6 i ; = 6^ ; then tf1-*hl't'2:b2 else -/BTA! i—^ 2 : ^ ] where b = bl TBTAI
t l i s t 6 ' :b2] = let *':&' = T B T A [ t:binb2
Table 3.6:
in i' Iist 6 ':&'
TBTA-
] ]
]
Binding time analysis of types
it is by cases on the structure of the underlying type. Proposition 3.2.2 (Correctness and Optimality of TBTA) The equations of Table 3.6 define a total function TBTA: T2X2
-> T2x2
and it satisfies, for t:b£T2x2 arbitrary:
]
and that
1 t:b
TJBTAI^I
is greatest with this property, that is
t':b'^t:b A H':b' => t':b'±TBTAlt:b}
•
Note that the last condition expresses that the annotation produced by TBTA is optimal in the sense that any other well-formed annotation complying with the original annotation must be smaller in the ordering. The proof of this proposition needs the principle of complete induction. First, for a partial order < we shall write <, or strictly <, for the irreflexive part of it, so u
50
3
Binding Time Made Explicit
if W: ( Vw: u P(v)
then \/v: P(v) This means that if P(v) holds whenever P(u) holds for all u
if and only if h''bi^t2:b2
or TBTA(^I) is a proper subtype of
Examples include Intrr < Int:c and Int:c < Int—»Int:r but not Int:c < Int:r. We observe that ^ is a partial order and that it is well-founded because for a type t of length n there are at most 2 n + 1 pairs t':b' such that t':bf- i-e- by c a s e s o n which of the four clauses that applies to t:b, but the reasoning is much the same in all cases. The general observation is that for an equation
3.2
Binding Time Analysis
51
any recursive call on the right hand side will have tl\bt
tx^hlt2.b2.
Clearly
t1:blHb2 < t:b t2:binb2
< t:b
because t\ and t2 are actually proper subtypes of t. Furthermore, it follows from the inductive assumption that t'1:bl' =
TBTA[t1:binb2]
t'2:b2' =
TBTA{t2:binb2j
are both defined and bl' r< *i:6./n&£ b2' r<
If bl i
b2' we; then have —»
c2
and this shows the result in this case. If bl' that
b2f we set b = bl'r\b2'
and observe
t:b (because b^bl assumption.
must be the case) and the result then follows from the inductive
S t a g e 2: T h a t TBTA only produces well-formed pairs amounts to yet another complete induction with respect to the well-founded order < but this time using the predicate P(t:b)
= KTBTAI t:b
Again it is straightforward to inspect the equations one by one and to use Table 3.3 to show the desired result and we shall illustrate this for function space. Let t:b be of the form tx-*hl t2:b2 and write t[:bl' for T B T A [ h:blV\b2 ] and t'2:b2' for T B T A [ t2:blDb2
] . As hiblHbS
< t:b and t2:blHb2
< t:b we know from
the inductive assumption that h t'^.bl' and h tf2:b2f. If bl' = b2' it follows from [->*''] (in Table 3.3) that h t[ -> 6 i ' t'2 : bl'. If bl' = b2 this is the desired result; otherwise the desired result follows from rule [up] and the observation that bl' ^<
52
3
Binding Time Made Explicit
binb2 ^ b2 so that H i -> 61' t'2:b2. It remains to consider the case where bl' ^ b2f. Here t[ ->6 t'2\b2 < tx -^bl t2:b2 where b = bl'HbS' and the result follows from the inductive assumption. Stage 3: Finally we must show that the result of 7BTA[ t'-b ] is greatest among those pairs t"\b" with the properties \~t":b" and t"\b"
( H":b" A t":b"^t:b =* t":b"±TBTk{
t:b
Again the method of proof is to inspect the equations one by one and to use Table 3.3 to show the desired result. We shall illustrate this for function space. Let t:b be of the form h^hl t2:b2 and write t[:blf for T B T A[ k:binb2 J and t'2:b2' for T B T A [ t2:binb2
] . Next let H":b"
and t":b"
Then t":b"
must
be of the form t'{—> bl"t2:b2" and furthermore bl"-bl" t'^b2" ^ *i ->6 <;2:6jg. If 6 i ; = 6jg; this is the desired result and if bl' ^ 6^' the desired result follows from the inductive assumption. This completes the proof of Proposition 3.2.2. • Example 3.2.3 Recall the types Reducet = (Int—*Int—>Int)-*Int—>Int list—>Int Reduce3 = (Int—>Int—»Int) —>Int —>Int l i s t ^ I n t We shall show that [ Reduce t:c ] =
thus validating the claim made earlier that Reduce3 is the greatest well-formed type that obeys all the run-time annotations of Reducet. We calculate : c
] —
(as 7BTA[ Int l i s t : c ] = Int l i s t : r and [
l i s t z± Int : c ] =
(as TBTAI I n t : r I = M : r ) Int l i s t z± Int : c Next we note that
3.2
Binding Time Analysis
7BTA[
Int—> Int list—>Int : c ] = Int
53 —> Int l i s t ^± Int : c
and it follows that the desired result holds.
•
Proposition 3.2.2 has a number of consequences that we shall need later on. Corollary 3.2.4 If h * : b then T B T A [ t:b ] = t:b.
•
Corollary 3.2.5 TBTA is monotonic with respect to ^ .
•
Next let YC.T2x2 be a non-empty subset that is consistent in the sense that { T B T A ( 0 I ^ E Y} is a singleton. Then every consistent subset Y of T2 x 2 has a greatest lower bound. It is given by HY =
t:b
where b is n{6i|^:6iGF} and t = n{^ i|^i:6iE Y}. Corollary 3.2.6 For a non-empty and consistent set Y, as above, the pair
T B T A [ny] is well-formed, is less than or equal to (wrt. •<) all the elements t\'.b\ of Y, and is the greatest such pair. • Proof: Clearly l~l Y is the greatest lower bound of Y wrt. -<. Therefore 7BTA[ l~l Y ] is well-formed and a lower bound of Y. If some t'\b' is well-formed and a lower • bound of Y then t'lb'^HY so that t':b'^TBTAI n Y ] by Proposition 3.2.2. The final consequence of Proposition 3.2.2 to be studied here concerns how to transform a 2-\evel type to one that is a £-level instance of some closed type scheme. (This will be relevant when we consider the f j below.) So let ts be a closed type scheme of the form VX^ • -\/Xn.pt and where, without loss of generality, we assume that each Xj does occur in pt so that the set FTV(p£) of free type variables of pt is {Xi,- • *,Xn}. Next consider the following definition of a function t:b J = if TBTA(0 ^S n ° t a n instance of pt then fail else let t':b'
= TBTA[ t:b ]
let Y\ — { t\\c | £j occurs in V where X\ occurs in pt } let ti:6i = TBTA[nYi] let t" = tf with t\ substituted into those occurrences where X, occur in pt if *"=*' then t':b' else Example 3.2.7 Returning to the equality predicate of the previous chapter we consider the closed type scheme
54
3
Binding Time Made Explicit
tS = C(feq) = VXa. Xi-^X where the polytype pt = Xi->Xi-^Bool has FTV (pt) = {Xi,- • -,Xn} for n = l . Next consider the type t = Int—*Int—>Bool and the call TBTAI ^ : C I- We note that TBTA(0
= Int—*Int—>Bool
is an instance of pt=Xi—>Xi —>Bool
t':b' = T B TA[ t'-c ] = Int-»Int--»Bool:c Yi = { Int:c, Int:c } ti'bi = TBTAI Int:c ] = Int:r t" = Int-»Int-»Bool As t"^t' we have TBTAI ^ : C I — ^BTA! Int—»Int—»Bool:c ] and next consider the call TB^A[ Int-»Int->Bool:c ]. We note that T
BTA(0
is an instance of pt
f
t';b = TBTA[ Int->Int-»Bool:c ] = Int-»Int-»Bool:r Y1 = { Int:c } h'bi = TBTA[ Int:c ] = Int:r t" = Int-^Int-^Bool Since t"—t' we conclude that [ Int-^Int—>Bool:c ] = Int-»Int—>Bool:r Note that Int-^Int—^Bool is a ^-level instance of pt whereas the S-level type ] is not. • Int—>Int—>Bool resulting from TBTAJ Int—>Int—»Bool:c Lemma 3.2.8 Let ts=VXy • -VXn.pt be as above. If TBTA(0 is an instance of ts then TBXAI
t'-b I terminates with a result t'\b'
The result satisfies
as well as the property
3.2
Binding Time Analysis
55
\-t':b' A t'\b'
•
Proof: A recursive call of T^A is only performed on t":b' when t":b' -< tf:bf -< t\b. Hence complete induction with respect to the well-founded order X (not < ) shows t h a t 7g^ A terminates without failure on arguments of the form t:b where t is an instance of pt. (Recall that t":bf -< t:b implies TBTA(^") — r BTA(0-) Clearly the result t':b' is T B T A [ t":b" ] for some t"\b" and by Corollary 3.2.4 we have that f T B T A [ t':b' ] = t':b so that also T ^ A [ t'\b' ] = t':b'. Much as in the proof of Proposition 3.2.2 we can then show by complete induction on ^ (not < ) that t h e result of 7g^ A satisfies the property above and by another complete induction that it is greatest with this property. •
3.2.2
Binding time analysis of expressions
The main task in the binding time analysis is to develop a transformation algorithm for expressions. Previous formulations of such a transformation have been rather lengthy and fairly complicated. To overcome this we shall pattern the transformation after the type analysis algorithm presented in Section 2.2. This calls for replacing variables x; with variables xf[£] that are explicitly annotated with their (2-level) type and binding time information. Thus variables are treated much like constants. To be precise about this approach we define a modified class of £-level expressions where variables are as indicated above: e e E2' e::=f?[i] | . . . | A6Xi[t].e | e( 6 e) | x?[t] | . . . The function £BTA: E2' -> E2
then removes the annotation of variables, i.e. replaces x\[t] by Xj. Then the function £BTA°£BTA: E2'—*E removes annotations of variables as well as underlinings; to allow for a more readable notation we shall allow to write it simply as
By analogy with our definition for type analysis we also define E2' -> P f i n ({ X i |ie/} x
(T2x2))
which records the annotations given to free variables in a £-level expression. As the definition is much the same as for -4TA w e omit the details.
56
3
E x a m p l e 3.2.9 The definition of SBTA*13
Binding Time M a d e Explicit
sucn
£/ BTA(AcXi[lnt c l i s t c -> c Int c ]. x^[lnt c l i s t c -> c Int c ] ( c n i l c [ l n t c ] ) ) amounts to A c X l [lnt c l i s t c ^ c Int c ].
Xl
( c nil c [lnt c ])
Similarly £;BTA(AcXi[lntc]. ( c ( c x^[lnt c ],x-[Bool r ]), (c x5[lnt c ],xf[Bool r ]))) amounts to A c X l [lnt c ]. ( c ( c
Xl,Xl),
(c x2,x2))
The definition of ^4BTA is such that ^BTA^Xxpnt 0 ]. ( c ( c xJ[lnt c ],x^[Bool r ]), (c x<[lnt c],x£[Boolr]))) amounts to {(x 2 ,(lnt c ,c)),(x 2 ,(Bool r ,r))} We shall sometimes write this set as {(x2,Int c :c),(x 2 ,Bool r :r)}.
a
For finite subsets A of {xi|i£/} x (T2 x 2) we shall use much the same notation that was used in Section 2.2 for finite subsets of {xj|iE/} x PT. In particular
and A is functional if each A{ni) is empty or a singleton. There is an obvious bijective correspondence between finite and functional sets A and type environments tenv: if A is finite and functional then fun(A) is a type environment and if tenv is a type environment then graph(£eni;) is finite and functional. Finally, our definition of a partial order upon 5-level expressions in E2 carries over to expressions in E2f as does the definition of consistent sets and the characterization of greatest lower bounds. The transformation function then has the functionality
£gTA: E2'xT2x2
<-> E2'xT2x2
We shall take care only to apply £BT A ^°a n argument e:t:b if the underlying typed expression is well-formed; this means that is functional, and : T B TA(O
3.2
Binding Time Analysis
57
let t'o:b0' = 1rBTA[ to:b0 I r<
i) t
let t\:bl' = 1 BTA l( i:hl)n(t'0:c)} in UD( fi6i'[i [] : t[ : bl', bO')
(1) (2) (3)
Table 3.7: £BTA: Binding time analysis of expressions (part 1) and we shall say that e:t:b is 1 -well-formed when this is the case. We shall make use of this assumption when we encounter constructs where too little type information is explicitly given and where we thus have to rely on Fact 2.1.4 to infer the [1 -level!) type information that is missing. For brevity we shall write TBTA(-4BTA(^)) for {(xi?TBTA(^i))|(xi^i:^i)^^BTA(e)} i n the presentation of the transformation function below. The result of £BTAI e'-t'-b I should then be the greatest among the triples ef:t':bf that satisfy the well-formedness conditions: -4BTA(V)
is functional
inn(ABTA(e'))
KJ e'BTA(e'):t':b'
and that satisfy the condition of being less than the argument to £BTA: e':t':b'
^ e:t:b
Here we have once more extended the partial order ^ in a componentwise manner. A more informal wording of the last inequality is that e':t':b' respects the annotations in e:t:b. Interspersed with the definition of the transformation function we shall motivate and explain the clauses. We begin with the clause for constants as given in Table 3.7. The intention with the first two lines and the first argument to UD in the third line is to obtain the best result corresponding to the well-formedness rules of Tables 3.4 and 3.5 but with rules [up] and [down] excluded. For this reason neither bO nor bOf should be allowed to influence the annotation blf of fj. The intention with UD is then to investigate whether rules [up] or [down] may be used to change the overall binding time bl' to the binding time bO'. Ignoring rules [up] and [down] the procedure is thus as follows. In line (1) we ensure that the type t0 is consistent with the binding time bO. This may change the type (to t'o) but may also change the binding time (to bOf). In line (2)^we ensure that the types supplied for fj agree with one another and with the constraints expressed by the type scheme C(f\). Here we do not use the binding time bO1 (or bO) for tf0 as the applicability of bOf may depend on rules [up] and [down]. For example even if the argument to £g TA is f^ [lnt-*Int-*Bool]:Int-*Int—»Bool:r there is no need to change the annotation of feq from c to r. The best result then is f^'lt'il'.t'iibl* provided that rules [up] and [down] are ignored.
58
3
Binding Time M a d e Explicit
Taking rules [up] and [down] into account we must investigate whether or not they could be used to change the overall binding time bl1 to bOf which is closer to what was originally desired (i.e. bO). This is accomplished by the auxiliary function UD whose general definition is UD( e : t : b, b ') = if b=c A b1'=r A \~t:b' then e:t:b'
else if b=r A 6'=c A H :b' A
ABTA(e)
C {xj|\ieI}x(T2x{c})
then e :t:b' else e: t:b
(a) (b) (c) (d) (e)
Line (a) tests for the applicability of rule [down] and if it is applicable the result is produced in line (b). Line (c) tests for the applicability of rule [up] and if it is applicable the result is produced in line (d). If neither [down] nor [up] is applicable the result must be equal to the first parameter to UD and this is produced in line (e). — There is one snag in this definition, however, because to prove the correctness of S^TA we should like UD(e :£:&,&') •< e:t:bf to hold in order to establish £gTA[e:£:&] •< e:t:b. This may fail if b=c and b'=r but \~t:b' does not hold. We shall therefore verify that (&=c) A {b'=r) =» H:b' whenever we use UD(e:£:6,6'). In the clause for constants this amounts to verifying that \-t[:bO' holds when bl'=c and bO'—r. But line (1) ensures that t' o is all underlined in this case, hence t'^t^ and h ^ i r as required. E x a m p l e 3.2.10 The call £BTA!
eq[lnt->Int->Bool] : Int->Int-*Bool : c ]
amounts to ^BTA! f^Jlnt—>Int->Bool] : Int->Int->Bool : c ] We get t'Q:b0' = Int->Int->Bool : c t[:bl' = Int-->Int->Bool : r so that the result is UD( f c r q [lnt-^Int->Bool] : Int--»Int-»Bool : r, c) Now clearly we may use rule [up] to deduce that
3.2
Binding Time Analysis
59
0 he fcrq[lnt->Int->Bool] : Int-»Int->Bool : r implies 0 hc fcrq[lnt->Int—>Bool] : Int-»Int—>Bool : c Thus the result is ff q [lnt-»Int->Bool] : Int->Int—>Bool : c which amounts to eq[lnt—>Int—>Bool] : Int—>Int—>Bool : c in the 'usual' notation. Note that eq is underlined even though the resulting binding time is c. • Turning to products the clauses for tupling and projection are given in Table 3.8. The intention with the first two lines of the clause for tupling is to obtain the desired result for each of the components. Here the desired binding time is indicated by all of bl, b2 and bO and we therefore use their greatest lower bound. If the resulting binding times agree we then have the desired result. If they do not agree we have to perform yet another recursive call. This is a phenomenon not present in the type analysis in Section 2.2 because there we would have constructed a substitution that would equate all that needed to be equated. We cannot do so here and thus have to introduce the additional recursive call. This also motivates Finally, why the domain of £BT A e c l u a l s its codomain (rather than e.g. ExT2x2). we observe that the rules [up] and [down] are not applicable so that there is no need to include a call to UD. Next consider the clause for the projection on first components. In the first line it is ensured that the overall type and binding time agree. This is necessary in order to verify the condition for using UD in the fifth line: HQ'I&O' when bf=c and bO'=r. For the verification, note that t'o is all underlined in this case so that tfQ equals t'o. In the absence of information about the type of the other component of e we obtain the (jf-level!) type in the second line using Fact 2.1.4 and
In the third line all symbols are annotated with c so as to produce the most conservative 2-level type. This then facilitates the recursive call in the fourth line. Here we have taken the liberty of using the binding time &' twice in the pattern on the left hand side because the rules [down] and [up] are not applicable to product types and therefore the two occurrences must always agree (given the correctness of £BTA t° be shown in Proposition 3.2.13). The clause for the projection on the second component of a pair is analogous.
60
3
Binding Time Made Explicit
{hl^2) : * i x " i 2 : bO ] = let e'^.bl' = £gTA[ Cl : ^ : let e'2:t'2:b2f = ££TA[ e2 : *2 : if 61 / =6J8 /
then ( " ' e ^ ) : ^x*''^ : bl1 d s e ^ T A [ C ' c i , C ' 2 ) : tix 6 V 2 : where b1 = bl'V\b2' ] let ^ 6 0 ' = TBTAI to'bO ] let i 2 be given by fun(TBTA(*4BTA(e))) h) let *; = TiT let e':t'£xb't'{:b' = SgTA[ e:t'oxbl 6 f i n U D ( f s t ' e : tg : 6', 60;)
t[:bl ]
sndfci e : ^0 : bO ] = let t'o:b0'= TBTAI «O=W I let tx be given by fun(TBTA(-ABTA(e))) \~c let t[ = r§ TA (i!) let c / :ti / x*'tg:6 / = fg r A [ e:t[xbl t'0:bl i n U D ( s n d i ' e ' : tg : 6', bO') Table 3.8: £BTA: Binding time analysis of expressions (part 2) Example 3.2.11 The call £BTA[
f st
(eq[A->A->Bool], eq[A-^A-^Bool]) : A-+A->Bool : r ]
amounts to £BTA!
f stC
( C f eq[ A -^ A -^ B o o l ] 5 feqtA-^A-^B001])
:
A->A->Bool i r
Using the equation for f s t we get, in the notation with underlining, t'o:b0' = A->A-+Bool : r
We shall shortly see that A->Bool) : c ]
3.2
Binding Time Analysis
61
equals (A->A->Bool)x(A->A-->Bool) : c so that the result is UD( f s t c (cfj:q[A-»A->Bool]. fJ: q[A-+A->Bool]) : (A-»A--»Bool) : c, r) and, in the notation with underlining, this amounts to f st (eq[A-»A-»Bool], eq[A->A-»Bool]) : A->A-*Bool : r Returning to the call £BTAI
A->Bool) : c ]
we get e[:t[:br
= f*q[A->A->Bool] : A->A-*Bool : c
e'2:t'2:b2' = f£q[A->A->Bool] : A->A->Bool : c As bl1 — b2' we get the result (Az±A=±Bool)x(A->A-»Bool) : c as was claimed above.
•
Turning to function space the clauses are given in Table 3.9. First consider the simplest clause which is the one for variables. As may be seen this is analogous to the clause for constants. The only difference is in the second line where we use f TBTA instead of T^TA . ^he verification condition for UD is that h t[:bO when bl'=c and bO'=r and we may verify this as we did for constants. Next consider the clause for A-abstraction. Intuitively, the first line ensures that the result type and binding time are consistent. The second line is the recursive call and in the third line all assumptions about the type and binding time of Xi are collected in the set Y. This set is consistent in the sense that { T B T A ( 0 I ^ : ^ ^ } is a singleton as follows from our assumption that £^TA is only applied to 5-level expressions that are well-formed regarded as typed expressions in the sense of Chapter 2. It follows that fl Y exists and the fourth line therefore tests whether Y is a singleton, i.e. whether all assumptions about Xi are consistent. If this is the case we produce the result in the fifth line. The verification condition for UD is that h ti -^bl t' : bO' when bl=c and bO'=r and this follows from the first line
62
3 Binding Time Made Explicit
let t'o:b0' = T B T A [
in UD( x^'fi'j : ti r ^ r , 60'°) Sc
2 [ A *7xi[i1].e : t2 i 3 : 60 ] = b2 let t'2^ 't'3 : bO' = T B T A [[i2-+ 6 2 <3: 6 0 ] let e':t':b' = i BTA [ e:i3:6^' ] let Y = {(ti:b i ),( t'2,b')} U ^BTA(e')(xi)
if Y = {nY}
thenUD( A">Ci[*X ].e' : * ! - > " * ' : fti, 60' ) el8e£grA[A*x e'[xfW/ii] : « - » » * ' : 6 0 ' ] where t:b
cC If B T A 0.
1
(bl >•
\ 2 /
60] =
. f • ^0 •
let ^ be given by fun(r BTA (^ BTA (e 2 ))) hc £BTA(e2):
in5^A[e1(e2 )
:
i i ^ ' ^ o : bO]
^ A [ d ( c 2 ) : ^i-^6 H2 : 6 0 ] = let t'2:b0' = TEJTA! i2:60 ] — lp/ R cT A ir t^„1, •. v/ 1, _ .' *i v+O' •• t// ,J./ n let e[:t[ -*hl> t%:l ,'-*•i n ~~~ D 1/V IL •• A JJ x
] let c^:*i/:6jg/ = r A [ e2:ti:bl if {t[:bl') = (t \\b> then UD( e[ ( b l ' e i'2) : t j : fti',600 else 5 B ^ A [ e[( e'2) where t'\b (fi:6i ; ) n (t'{:b2f)
Table 3.9: ^BTA: Binding time analysis of expressions (part 3) (because bl=c and 60'=r is in fact impossible). If Y is not a singleton we must perform a new recursive call where we have replaced all assumptions about Xj with the new assumption t:b = F\Y. In the body of e' this is expressed by overloading the notation for substitution4, so that e'fx-'fij/xi] equals x\[t] when e' is of the form xf [*']. The equation for application is somewhat more intricate as it uses an auxiliary function. The problem is that we shall also need the type of the operand part of the application. In the first line of the body for £BTA w e u s e ^ ac * 2.1.4 to compute the missing ('i-level'!) type information and in the second line we obtain a 24
Without this 'overloading' it might be argued that e'[xf [/]/xi] equals xf [i]6 [t'] when e' is of the form xf'[<'].
3.2
Binding Time Analysis
63
level type by annotating each symbol with c. In the third line we obtain the desired result using £BTA which is supplied with full information about the type of the operator. In the body for £BTA the ^TS^ line ensures the consistency between the overall type and binding time. The second and third lines then contain the recursive calls. The fourth line tests whether or not the assumptions about the type and binding time of the operand agree. If this is the case we produce the result in the fifth line. The verification condition for UD is that h t'^'.bO' when bl'—c and bO'=r and this is enforced by the first line. If the assumptions do not agree we perform a recursive call. Since it might well be the case that t[ differs from t'{ this shows the need for the auxiliary function £*TA that allows also to record the type of the operand as otherwise we would start with T"BTA(^I) in once again and this might lead to a non-terminating computation. Example 3.2.12 Returning to Example 3.1.2 we write * = Ax[(Az±A)->(Azz>A)]. Ay[Az±A]. x(y)
We intend to perform the call
in order to obtain the best well-formed expression that is consistent with ^ , which is not itself well-formed. Before we can do this, however, we must transform the expression \I> from E2 to E2', i.e. all variables in \I> must be annotated with their type and binding time. So the call we do perform is ^ C t0]. \ry[t0}. *c[to ^ c to] ( c where t0 abbreviates A—>A, i.e. A £BTA!
r
—> T Ar. This call gives rise to the call
A r y N - xc[f0 ^ c t0] ( c f[t0]
):t
o
^t
o
:c]
which yields the call £BTAI
* % -*° to] (c y r [ i o ] ) : t0: r ] = ^BTAI XC[
C
t0] ( y r M
) : (Ac -+ c Ac) -+* t0 : r ]
We have ^ c *o] : (Ac -^ c Ac) -^ c tQ : c ] = -^ c to] : tQ -> c t0 : c : Ac -> c Ac : c ] = y r [i 0 ] : t0 : r Note the final r since the clause in UD that corresponds to [up] is not applicable. As £ 0 :c differs from £ 0:r the above call to £*%A gi ye s rise to the call
64
3
-+C*O]
(yr[io]) :
Binding Time Made Explicit
to^to-.rj
We have ^
^
C
r
: t 0 ^ r t0 : r ] = to] : t0 -+ r t0 : r
*O]
: r io] ( r Yr[
Ar
y[*o]. x c [i 0 -> c t0] ( c yr[^0] ) : h ^ r t0 : c ] = Ary[*0]. xc[
because the set F turns out to be a singleton and the clause in UD that corresponds to [up] is applicable. However, the original call to £BTA gives rise to the call £BTA!
Acx[i0 ~+r t0]. Ary[i0]. x c [i 0 ^ r t0] ( r f[t0] ) : {t0 - » r to) ^ c {to ^ r to) : c ]
because the set Y becomes {{t0 —> c t0 : c), ( r t0]. Ary[*0]. x c [t 0 -> r «o] ( r Yr[^o] ) : (*o ~+r *o) ~^c (*o ~^r to) : c (which is equal to the argument of £BTA)- Thus $ gets transformed to ) = i(A =± A)]. Ay[Az±A]. x which is well-formed but unfortunately is nothing but the identity. However, given that $ was not well-formed one could hardly hope for better. • We now turn to the equations for the constructs associated with lists. As no new phenomena arise we cut down on the explanations and examples. The clauses are given in Table 3.10. Turning to conditional, the truth values and fixed points we have the clauses of Table 3.11. It remains to verify that these clauses define a function £BTA with the desired properties. Recall that a triple e:t:b in E2' x T2 x 2 is 1-well-formed if the underlying expression is well-formed in the sense of Section 2.1: is functional
: rmA(t)
3.2
Binding T i m e Analysis
65
: bO ] = i2 let t\:t\:bl' = ££ TA [ tx : t : blF\b2nbO ] let e'2:t'2:b2' = £gTA[ e2 : i l i s t " ™ 2 ™ 0 : blUb2UbO ] if*; l i s t " ' : bl' = t'2: b2' then e\:hl'c2 : i'j l i s t " ' : bl'
^ J e ' ^ V , , : t': b'}
where *':&' = (t\ l i s t " ' : W ) n (t'2:b2') n i l " [ t x ] : *2 ] let *':&' = TBTAI {W.bl) V\ (t2:b2nb0) ] innil 6 '[i'] : f'list*' : b' hd" e : t0 : bO ] = let t'o:b0' = TBTAI h'hO \ let e' : tg l i s t " ' : 6J' = £gj A [ e : i{, l i s t " : bl ] in UD( h d " ' e' : ig : M', 60')
60
]
let e' : i ' l i s t ' : bO' = ££ TA [ e : t i i s t " n J 2 n 4 ° : i n t l * 0 ' e' : i' l i s t * 0 ' : bO' i s n i l " e : Bool 62 : bO ] = let ii be given by fun(TBTA(.4BTA(e))) hc let t[ = ^ T A (
let e':i':6' = £gTA[ e : *i : binb2nbO ] in i s n i l ' ' e' : Bool*' : 6'
Table 3.10: £BTA- Binding time analysis of expressions (part 4) Similarly, we shall say that e:t:b is 2-well-foiined if the underlying expression is well-formed in the sense of Section 3.1: «4BTA(C)
is functional
fun(.4BTA(e)) He e'BTA(e):t:b With this terminology we may rephrase Fact 3.1.3 as saying that every ^-wellformed triple is /-well-formed. We then have Proposition 3.2.13 (Correctness and Optimality of £"BTA) If e:t:b is 1 -well-formed then 5"BTAI e:^:& ] defines a result e':t':b' •< e:t:b that is 5-well-formed if
66
3
Binding Time M a d e Explicit
: Bool' 2 : bO ] = : Bool" n42n *° : blUb2V\bO £gTAl f a l s e " : Bool' 2 : bO ] = f a l s e " n h z n i 0 : Bool"™ 2™ 0 : blV\b2UbO i f hl e
i t h e n e2 e l s e e3 : t0 : bO ] = let t'o:b0' = TBTA[ M O ] let ei:Bool 4i ':6i' = £g TA [ e ^ B o o l " : ^ I let e'2:t'2:b2' = £gTA{ e2:t'0:bl ]
let e^:^:63' = £g TA [ e3:tj,:61 I if {t'2:b2') = (^:65 ; ) A (bl'=b2'=b3') then UD( i f " ' c^ then e'2 e l s e e^ : i'2 : b2\ bO') else f^ TA [ if*' e[ then e'2 e l s e e'3 : t' : bO' ] where t':6' = (t'2:b2')n(t'3:b3')n(t'3:bl') f | x A [ f i x " e : t0 : bO ] = let *{,:&0' = TBTA[ <0: let e':t'1^bl't'2:bl" = if (<' i: W) = (i' 2 :W) thenUD( f i x " ' e' : i'j : bl', bO') else £gr A [ fix*' e' : t' : bO' ] where t':b' = {t'^
Table 3.11: £BXA: Binding time analysis of expressions (part 5) is functional and that satisfies H J : 6 ; for all (xi,£j,6j) G ABTAW)e":t":b" ± e:t:b is jg-well-formed then e":t":b" X e':i':6'.
Furthermore, if •
Proof: As in the proof of Proposition 3.2.2 the overall proof strategy is to use complete induction. The well-founded order < to be used is defined by
if and only if ei:ti:bi < e2:t2:b2, or £BTA(CI)
is a proper subexpression of
For the first claim about £BTA
we ma
y
use
Pi(e:t:b) = e:t:b is 1 -well-formed
the predicate
3.2 Binding Time Analysis
67
e :
^ ] d e f i n e s a r e s u l t e':tf:bf r< e:t:b A
e':t':b'
A
(ABTA(^') is functional =>• e':t':b' is £-well-formed) A V (xi,ti:6i) G
and for the second claim about £BTA we may use the predicate P2(e:t:b)
= e:£:6 is i-well-formed A e'-.t'-.b1 is 5-well-formed A
e':t':b'
^ e:t:b
e':t':br
X
The formal proof that Pi and P 2 hold on all arguments amounts to formalising the explanations we gave in presenting the equations for £j$TA. There are no profound problems in this, but as may be guessed from the length of the definition of ^BTA^here are very many minor points to be made. For this reason we shall not present the formal proof but refer to the Bibliographical Notes for a reference to a full proof for a version of the 5-level A-calculus where the notion of well-formedness is only slightly different from here. We should point out however, the complication due to the auxiliary function ^BTA- For each of Pi and P 2 we then have to define properties P\ and P 2 that express the intention of Pi and P 2 on the arguments of £BTA- Similarly, we must define an analogue, <*, of the well-founded order. To allow for a clean presentation of the interaction between £BTA a n d ^BTA ^ ^s helpful to regard them as the same function, 5, which behaves like £BTA u P o n arguments of the form e:t:b and like u on £BTA P arguments of the form ei(e 2 )^'^. Similarly, we may amalgamate each Pi and P* to a predicate P\. We then define a well-founded order, <, which behaves like < on pairs of arguments that are both of the form e:t:b] which behaves like <" on pairs of arguments that are both of the form ei(e 2 ):£:6; and otherwise is given by c i ( c 2 ) : h ^bl
t0 : bO < ei(bl
to allow £BTA t ° call £BTA>
an<
e 2 ) : t0 : bO
^
ei : t : b < e i ( e 2 ) : t0 : bO to allow £BTA t ° call £BTA-
n
Corollary 3.2.14 If e:t:b is ^-well-formed then EgTA[ e:t:b j = e:t:b.
•
68
3 Binding Time Made Explicit
Corollary 3.2.15 £BTA is monotonic (with respect to ^) on 1-well-formed arguments. • Proof: For a 1-well-formed argument with no free variables the result of £BTA will be ^-well-formed. The corollary therefore follows from Proposition 3.2.13 for 1-well-formed arguments with no free variables. For the general statement a proof by complete induction is needed. •
3.2.3
Binding time analysis of programs
We now have the tools needed to define the transformation ^BTA- P{E,T2) *-> P(E2,T2) for programs. We only intend to apply V£TA to programs p£P(E,T2) for which is well-formed. Using TBTA and £gTA the underlying program KBTA(P)€P(E,T) and the general notation from the 5-level language L this allows us to define ^BTA! D E F x i = e i • • • DEF x n =e n VAL e0 HAS t ] = let 0 he t\ : t\ determine t\ let 0[
if bl'='..=bn'=b'=c
then DEF x1=e/BTA(e'1) • • • DEF x n =e / BTA (e;) VAL £BTA(eo) HAS *' else fail Here we first use Fact 2.1.4 to determine the types £l9 •••, t n of the typed Aexpressions ei, • • •, e n . We then construct the typed expression e and note that 0 he e : TBTA(0 holds if and only if TTBTA(P) is well-formed, where p is the argument to 'PBTA* TO apply ^BTAwe ^vs^ u s e £BTA ^° transform e in E to an expression in E2. Next we use a function £BTA f°r transforming an expression in E2 into one in E21. The argument 0 is an empty type environment and the interesting clauses are
where
«cn»(xi) = i : b
3.2
Binding Time Analysis
69
We leave the remaining clauses as an exercise (Exercise 13) but note that £BTA was already implicitly used in Example 3.2.12. Turning to the result of £BXA ^ must have the form indicated and if all the binding times equal c we can produce a program in P(E2,T2). If one or more binding times differ from c we have to fail due to the definition of well-formedness of programs in P{E2,T2). To express the correctness of ^BTA w e shall first extend 7TgTA: P{E,T) —>
P(E2,T2) to
TT|TA:
P{E,T2) -> P{E2,T2) by
i = ei • • • x n =e n VAL e0 HAS t) = DEF xi=e| T A (ei) • • • x n = 4 T A ( e n ) VAL e^TA(e0) HAS t Theorem 3.2.16 (Correctness and Optimality of 'PBTA) Consider some program p<EP(E,T2) of the form DEF x1 = e1 • • • DEF x n =e n VAL e0 HAS t that satisfies hc DEF xi = ei • • • DEF x n =e n VAL e0 HAS T B T A ( 0 in the typed A-calculus P(E,T). Then /PgTA[p] fails only if there is no wellformed program p" that respects the run-time annotations in p, i.e. there is no p"eP(E2,T2) such that (hc p") A ( p / / ^ | T A ( p ) ) . If ^BTAIP] d o e s n o t f a i l t h e result satisfies the property
and it is the greatest program with this property, i.e. hc p" A P" X
ff|TA(p)
=• p" X
Proof: Let p be as given in the Theorem. From he TTBTA(P) and the definition of he in the typed A-calculus, i.e. P(E,T), it follows that there are types f1? • • •, t n in T such that 0 h c d : tx - [^n-i/xn-i] hc en : *n - [Wxn]h c e 0 : rBTA(t) From Fact 2.1.4 it follows that £i, • • •, £n are uniquely determined. It follows that 0 hc e : T B T A(*)
where e is as in the definition of 'PBTA- Hence ^BTA[4TA(£BTA(C)): t : c ]
is well-defined and produces a result of the form ( A » i ' x 1 [ * ' 1 ] . . . . ( A » » # x n [ ^ ] . e i ) ( * » ' e ; ) . . . ) ( * i ' c i ) : t ' : 6'
70
3
Binding Time Made Explicit
Writing • • • DEF x n =e ; B T A (c;) VAL e'BTA(e'o) HAS t'
p' = DEF x1=e'BTA(e'1)
we have P ' I ^ B T A ( P ) because the result produced by £B T A will be less than its argument. Also the result of £gTA is £-well-formed and if bl '=• • -=bn'= b'=c this implies that also p' is well-formed, that is he p'. Next let p" = DEF xi = ei; • • • DEF x n =e£ VAL eg HAS t" be a well-formed program satisfying the run-time annotations in p, i.e. (he p") A (p" r< TTBTA(P))- W e then h a v e types *ii * * •> *n i n ^ s u c h t h a t 0 h c e'{ : *? : c
and by Fact 3.1.5 the types £", • • •, t'^ are uniquely determined. Hence 1 [i' 1 '],--(A
c
xnK].^')( CO---)( Cei')): *" : c
is 5-well-formed and less than or equal to (wrt. •<) the argument to £BTAthe correctness of £gxA w e then have 4T A ((A c x i ra---(A c x n K].e2)( c e^)---)( c ei')) : i" : c ^ (A»'x 1 [*' 1 ]....(A»- l x n [ty.ei)(»- l <)-)(" < e / 1 ) : t': b' so that (p'^pO A (bl'=---=bn'=b'=c) It then follows that Pg TA only fails when it is allowed to and that when it does not fail it produces the greatest program that is both well-formed and respects the run-time annotations in the argument. • Example 3.2.17 As a simple example consider the programs p1 — DEF x = f ix Ay[A].y VAL Az[A->A]. (z,x) HAS ( A - > A W ( A - * A ) X A )
p2 = DEF x = f ix Ay[A].y VAL Az[A->A]. (z,x) HAS (A-»A)-»((A-»A)jjA)
3.2
Binding T i m e Analysis
71
Note that TTBTA(PI) equals TTBTA(P2) a n d that it is a well-formed program in P(E,T). When calculating ^BTAIPII w e perform the call (A C X[A].A C Z[A-,A].(Z C [A^A],X C [A]))
(fix Acy[A].yc[A]) :
(A-»A)-»((A-->A)xA) : c ] (Acx[A].Acz[AIiA].(zc[A=±A],xc[A])) (fix Acy[A].yc[A]) : :c and get =DEFx = fixAy[A].y VAL Az[A->A]. (z,x) HAS (A-»AW(A-+A)xA) When calculating 'PBTA[P2]
we
perform the call
(A C X[A].A C Z[A->A].(Z C [A-^A],X C [A]))
(fix Acy[A].yc[A]) :
: c] r
r
r
(A x[A].A z[Az±A]. (z [Az±A],xr[A])) Cfix Ary[A].yr[A])_ : : c
However, the condition bl'—- • •=bn t=b'=c is not satisfied so rather than getting the program DEF x = f i x Ay[A].y VAL Az[A->A]. (Z,X)_ HAS (A^A)->( we let PBTAIP2I =/<«/ This is due to the definition of the well-formedness predicate for programs in the £-level A-calculus where we insisted on using the binding time c for the type associated with x. • Ideally we would have liked VgTAoV%A: P{UEJ2) <-• P(E2,T2) to fail only when V^A fails. However, in the example above we saw that this cannot be arranged. It is therefore worth pointing out that the call to £gTA in the body of wm< n ^BTAIPI ° t fail if l~c TTBTA(P)- Hence, to arrange for ^ B T A ^ T A ^° af^ on^Y when V%A does we might consider generalizing the well-formedness condition for programs in P(E2,T2) but we shall not go into this here.
72
3 Binding Time Made Explicit
3*3
Improving the Binding Time Analysis
The annotation obtained from the binding time analysis is optimal in the sense that as few computations as possible are postponed until run-time. However, it is often the case that a slight rewriting of the program will produce an even better distinction between the binding times. As an example, the order of the parameters of a function may be changed or the representation of data types may be modified. To illustrate this consider the function lookup of type (Name x I n t ) l i s t —» Name —*
Int
where we assume that the second components of all pairs in the first argument are unknown at compile-time. We then have a situation where known and unknown data are mixed and the binding time analysis will return a function with the annotated type (Name _x_ Int) l i s t z± Name z± Int so that all computations will be postponed until run-time. Alternatively, we may split the list of pairs into two lists and rearrange the order of the parameters so that the type becomes Name l i s t —» Name —»
Int l i s t —•>
Int
We then get a much better distinction between the binding times because now only the elements of the second list will be unknown at compile-time and the binding time analysis will return a function with the annotated type Name l i s t —» Name
—> Int l i s t z± Int
Hence some of the computations can be performed at compile-time and this idea is further explored in [81]. The above example is rather involved in that it changes the overall type of the function. This is not necessary for a transformation to be useful. Consider the program sum9 from Example 3.1.6. Here the fixed point operation of reduce 9 is underlined and intuitively this means that we cannot use the recursive structure of its body at compile-time, for example during abstract interpretation or when generating code. We may therefore want to replace the run-time fixed point by a compile-time fixed point. Since sum9 is the best completion of sumt we cannot obtain this effect by simply changing the annotation. The idea is therefore first • to transform the underlying program, and then • to repeat the binding time analysis. In our case we shall apply the transformation
3.3
Improving t h e Binding Time Analysis
Xx[tx}.fix (Xf[t2^t3].e)
73
> fix
that replaces the first pattern with the second. Here e[g(x)//] is e with all occurrences of/ replaced by g(x) and g is assumed to be a fresh identifier. Intuitively the transformation moves the fixed point operation to an outer level by passing the first parameter as an additional parameter during the computation of the fixed point. We then repeat the binding time analysis with Sumt as the 'goal type' and get the program sum9a: DEF reduce 9 a = Af [Int—»Int-»Int]. f i x (Ag'[lnt z± Int l i s t z± Int]. Au[lnt]. Axs[Int l i s t ] . if i s n i l xs then u e l s e f (hd xsKg'OOCtl xs))_) VAL reduce 9 a (Ax[lnt].Ay[lnt].-i-[lnt><_Int-»Int]jC{x ?y))_) HAS Int l i s t z± Int where the fixed point now is computed at compile-time and the function g' is bound at compile-time. As a side effect the functionality of the fixed point has been changed and, in particular, g' has become a higher-order function. For some stack-based implementations it is expensive to handle higher-order functions and we may therefore want to transform the program further to change the functionality of g'. To do that we first apply the following transformation to the underlying program of sum9a: fix
( f [
1 2 3
] [ i ] y [ 2 ] )
i].Xy[t2].(fix(Xg[tiXt2->t3].Xz[tiXt2\. z/x)[snd
z/y]))
Here g and z are assumed to be fresh identifiers. Intuitively this transformation will uncurry the parameter of the fixed point operation and in order to keep the overall type unchanged, the fixed point itself will be curried at the outermost level. Next we apply the /^-transformation
(Xx[t].e)(ef) > e[e'/x] (recalling that our semantics is supposed to be non-strict). The binding time analysis is then applied to the resulting program with Sumt as the overall annotated type and we get the program sum9b: DEF reduce 9 b = Af [Int-»lnt—>Int]. Au[lnt]. Axs[lnt l i s t ] , (fix (Ag[lnt x_ Int l i s t => Int]. Az[lnt 21 Int l i s t ] .
74
3 Binding Time Made Explicit if isnil (snd z) then f st z else f ^hd (snd z))_ (g i(fst z, t l (snd z)[)))) VAL reduce 9b (Ax[lnt].Ay[lnt]. (O[lnt]l
HAS Int l i s t -> Int
Bibliographical Notes The iMevel formalism as presented in this and the next chapter dates back to [66]. A sketch of the 2-level A-calculus of this chapter was given in [66, Chapter 6] and the complete account may be found in [73] (in the context of partial evaluation). Also [74, 75, 76, 80] consider versions of the ^-level A-calculus; the notation used in this chapter has incorporated the explicit distinction between f iff] and fjft] suggested by Mycroft [64]. Relaxing the need for run-time constructs to be well-formed, or perhaps by assuming a run-time recursive type as in [75], one may develop variations of the ^-level A-calculus. One such line of development is illustrated by Gomard and Jones [32] and by Henglein [36]. The notion of well-formed types is much as in [73, 74, 75, 76, 80] and apart from presentational details (see Exercise 6) also as in [66, 69, 72, 78]. We should stress that the rule rrfr- only holds for t of the form f=flz=><2 an< i that there is no rule frfr^. A slightly more permissive well-formedness notion is considered in [45]. Rules of the form rrrr- are likely to be needed for serious work on partial evaluation; this observation was made already in [73, Section 5] and is made in a strong way by Gomard and Jones [32]. We do not regard this as a criticism of the well-formedness relation studied here; rather we take the point of view that a wellformedness relation formalises a particular intuition and that different intuitions may be appropriate for different tasks. Despite this potential difference in the formulation of the well-formedness relation we believe that the techniques used to study it (e.g. for binding time analysis, see below) are rather similar. The notion of well-formed expressions is much as for TMLq of [76]. Apart from a different representation of type environments the main difference is that in TMLi there is no condition \~t\c in the rule [if]. The well-formedness rules studied here are superficially close to the well-formedness rules of [73, 74, 75, 80] for TML e . The difference is that for TML e there is no overall binding time associated with each type. This corresponds to weakening the side condition of rule [up] to H:c. While being a very local change this has far reaching consequences. As should be expected, the notation of well-formedness of [32, 36] differs in many ways from the notion studied here, not least because the run-time level of [32, 36] needs not be
3.3
Improving the Binding Time Analysis
75
fully typed. Binding time analysis may be approached in many different ways. The approach of [73] was to take least upper bounds in a structure that was shown to be a complete lattice. The approach of [75, 74] was to develop a binding time analysis based on Milner's original type inference algorithm W and [32] takes a similar approach. The present approach is based on Damas' type inference algorithm T. A main difference is that we have no counterpart of substitutions and that we therefore need to perform some extra recursive calls to 'unify' the binding time annotations. Other approaches include [47, 48, 59, 58, 92] and several of these are based on abstract interpretation. The approach of [48, 59] performs binding time analysis in conjunction with certain program transformations; in our approach program transformation, for example using fold/unfold [16] or partial evaluation [47], is a separate issue. The approach of [36] breaks new ground by reformulating binding time analysis as a constraint problem and then using more or less standard techniques to obtain an efficient solution. As in Chapter 2 our treatment of soundness and completeness is purely syntactic and for the same reasons as mentioned in the Bibliographical Notes of Chapter 2. If a semantic characterization is desired there are several ways to go. One is to use the notion of faithfulness of [71]. Another is to use the notion of partial equivalence relations, PER's, of [44]. A third is to use the notion of projection analysis as in [58]. Actually, these approaches are not so dissimilar as one might expect. Recent work has shown the ability of PER's to encapsulate the power of projections and the faithfulness of [71] amounts to a study of PER's restricted to a setting where they turn out to be equivalence relations. Semantic correctness is also studied in [32], An obvious extension of the work carried out here would be to allow £-level polytypes. A problem to be overcome is how to allow the i?-level polymorphic type of the identity function to include X i ^ X i as well as Xi—>Xi, depending on the kind of £-level type that X1 ranges over. A brief appraisal of the material of this chapter may be found in [74].
Exercises 1. Discuss the relationship between 'constant expressions' in PASCAL and our expressions of binding time c on the one hand, and between 'expressions' in PASCAL and our expressions of binding time r on the other hand. 2. Consider the programming language Standard ML and try to classify its ingredients with respect to two binding times. 3. Consider designing a 5-level A-calculus, e.g. with binding times 'run-time' (r), 'link-time' (1) and 'compile-time' (c) and with r ^ l ^ c .
76
3
Binding Time Made Explicit
4. Discuss whether the partially ordered set 5 = ( 5 , ^ ) of binding times should always be a totally ordered set or whether it makes sense to have incomparable elements. 5. Use Table 3.1 to prove the well-formedness of the types Reduce^ ..., Reduce^. Try to argue that these are all the well-formed types with TBTA(Reciucei) as the underlying type. 6. Consider the following piece of syntax: ct ::= Aj | ctxct | ct—*ct \ ct l i s t | ft ft ::= rtz±rt rt ::= Aj | rtx_rt \ rt=±rt \ rt l i s t Let Lrt be the context free language generated by the nonterminal rt and let Lct be the context free language generated by the nonterminal ct. Similarly, let Lt be the context free language generated by the nonterminal t of Table 2.3, and let Lt6 = {wELt|l-w:&} be the subset of those types in Lt that have binding time b. Show that Lrt = Ltr and that Lct = Ltc. 7. Give an inductive definition of
ti^t2.
8. Try to sketch a version of TBTA where the formulation is independent of the actual choice of binding times B and underlying language L. (Hint: it may be easier if you assume that B = (B^) is finite and linearly ordered. If you do not adopt this simplification you will probably have to impose other conditions on ii, e.g. that B is a fl-semilattice with a greatest element.) 9. Calculate T^A[Ubj when ts = VXi.VXo. X 1 -^X 2 -^X 1 xX 2 t = (Int—>Int)—»(Bool—»Bool)—>(lnt—>Int) x (Bool—>Bool) and 6 = c. 10. Verify that < in the proof of Proposition 3.2.2, is a partial order, i.e. is reflexive, transitive and antisymmetric. Also verify that < is well-founded. 11. Give detailed proofs of the cases _x. and x in Proposition 3.2.2. 12. Prove Corollaries 3.2.4 and 3.2.5. 13. Define 4 T A : {eeE2\FEV(e) underlying expression.
= 0} -> E2' using structural induction on the
3.3
Improving the Binding Time Analysis
77
14. Annotate the program from Exercise 2 of Chapter 2 with the best possible binding time information, assuming that the overall type is String l i s t —> Int. 15. In Exercise 2 of Chapter 2 we saw a program where one of the DEF-clauses needed to be duplicated in order for the type analysis to succeed. Give an example of a program where duplication is needed in order for the binding time analysis to succeed but where the type analysis succeeds without duplication. 16. Extend the types of the typed A-calculus with sum types, recursive types, type variables and type synonyms. Types are then given by t ::= • • • | t + * | rec Xj = t | Xj | l e t X{ = t in t With this notation we could use rec Xx = Unit + (tx x Xi) instead of ti list Our primary interest will be in closed types, that is types without free type variables. • Define the syntax of iMevel types. • Define the well-formedness relation for iMevel types. (Hint: take care to account for the possibility of free type variables.) • Try to define a binding time analysis of types. 17. Extend the expressions of the typed A-calculus with l e t and l e t rec. Expressions are then given by e ::= • • • | l e t xi = e in e | letrec Xi = e in e As usual, our primary interest will be in closed expressions, that is expressions without free variables. • Define the syntax of iMevel expressions. • Define the well-formedness relation for iMevel expressions. • Try to define a binding time analysis of expressions.
Chapter 4 Combinators Made Explicit The binding time information of a 2-\eve\ program clearly indicates which computations should be carried out at compile-time and which should be carried out at run-time. The compile-time computations should be executed by a compiler and it is well-known how to do this. The run-time computations should give rise to code instead. We may also want to perform some data flow analyses in order to validate some program transformations or to improve the efficiency of the code generated. It is important to observe that it is the run-time computations, not the compile-time computations, that should be analysed, just as it is the run-time computations, not the compile-time computations, that should give rise to code. This then calls for the ability to interpret the run-time constructs in different ways depending upon the task at hand. This is not straightforward when the run-time computations are expressed in the form of A-expressions. As an example, the usual meaning of Ax[lntxlnt]. f (g 0 0 1 is Xv.f(g v). However, we may be interested in an analysis which determines whether both components of x are needed in order to compute the result. This is an example of a backward data flow analysis and the natural interpretation of the expression will then be Xv.g(f v). It is not straightforward to interpret function abstraction and function application so as to be able to obtain both meanings. The idea is therefore to focus on functions and functionals (expressed as combinators) rather than values and functions. We then write f Dg for the expression above and the effect of both Xv.f (g v) and Xv.g(f v) can be obtained by suitably reinterpreting the functional D. This observation calls for transforming the run-time computations into combinator form. Similar considerations have motivated the use of combinators in 79
80
4
Combinators M a d e Explicit
the implementation of functional languages [101, 19, 41] and the use of categorical combinators when interpreting the typed A-calculus in an arbitrary cartesian closed category [49]. However, we shall stress once more that we leave the compiletime computations in the form of A-expressions and only transform the run-time computations into combinator form. In Section 4.1 we prepare for this transformation by defining the mixed Acalculus and combinatory logic. The types are as in the 2-level A-calculus, and the well-formedness conditions also have much in common with the well-formedness conditions of Section 3.1. The actual transformation is accomplished by the algorithm developed in Section 4.2.
4.1
Mixed A-Calculus and Combinatory Logic
The syntax of the mixed A-calculus and combinatory logic is given by Table 4.1. The difference, with respect to the syntax of the £-level A-calculus given by Table 3.1, is that combinators are used to express the computations at the run-time level. The meaning of the combinators is best expressed by sketching a run-time A-expression with the same meaning:
Tuple(e!,e 2 ) = Ax[---]. (ejOO, e 2 (x)} Fst[J'x.i"] = \x[t'xt"].
fst x
Snd[t'xt"] = Xx.[t'xt"]. sndx Curry e = Ax[- • •]. Ay[- • •]. e(.(x, y)J_ ex • e 2 = Ax[- • •].
el(e2(x))_
Id[t] = Xx[t]. x
Cons(ei,e 2 ) = Ax[- • •]. (ex(.x)
= \x[t'}. nil[t"] Hd[t] = Ax[£ l i s t ] , h d x Tl[i] = Ax[* l i s t ] , t l x Isnil[^] = Ax[£ l i s t ] , i s n i l x True[^] = Xx[t}. t r u e False[«] = Xx[t]. f a l s e Cond(e 1 ,e 2 ,e 3 ) = Ax[- • •]. if ei(x)_ then e2(x)_ e l s e e3(x)_ Fix[t] = Ax[*z±*]. f i x x
4.1
Mixed A-Calculus and Combinatory Logic
81
t ::= ki\ txt \ t-*t | t l i s t | Aj | txt \ t^±t \ t l i s t eeCE2 e ::= fi[t] | (e,e) | f s t e \ snd e | AXjlCl.6
C\C)
Xi
e:e | nil[^] | hd e | t l e | isnil e | true | false | if e then e else e | fix e \ Fi[t] | Tuple(e,e) | Fst[t] \ Snd[t] \ Curry e | Apply[^] | e • c | Id[i] | Cons(e,e) | Nil[i] | Hd[f] | Tl[t] \ Isnil[^] | True[<] | False[i] | Cond(e,e,e) | Fix[t] Table 4.1: The mixed A-calculus and combinatory logic Here we have indicated the general form of the type parameters associated with the combinators and we have used ellipses (• • •) to indicate missing type information in the run-time A-expressions. We shall provide a more precise relationship later in this section. E x a m p l e 4.1.1 As an example of a program in combinator form consider the given by program SWI9B€P(CE2,T2) DEF reduce 9 B = Af [ ]. Curry (f ix(Ag[ ]. Cond(Isnil[ ] • Snd[ ], Fst[ ], Apply[ ] • Tuple(f • Hd[ ] • Snd[ ], g D Tuple(Fst[ ], Tl[ ] • Snd[ ]))))) VAL Apply[ ] • Tuple((reduce 9B (Curry + [ ] ) ) • (Zero[ ]), Id[ ]) HAS Int. l i s t z± Int. For the sake of readability we have omitted the type information (in square brackets). Note that • behaves like functional composition in that an expression eiDe 2 should be read backwards. • The methodology behind the construction of the combinators is not entirely arbitrary. Concerning f s t , snd, hd? J i 5 i s n i l and fix of the 2-level A-calculus, we have proceeded as follows: instead of underlining we now use capitals and instead of an expression argument we provide explicit type information. The intention has been to supply as little type information as possible while ensuring that any expression without free variables still has a unique type. (This will become clearer when we present the well-formedness rules below.) A similar procedure has been used for
82
4
Combinators Made Explicit
nil, true, false and fj except that there was no expression argument to remove. For the constructs I, ( ) and if which take more than one expression argument, a new combinator name has been introduced. (This is a minor point, due to the fact that in the i?-level A-calculus one writes €iie2 rather than cons(ei,e9).) However, the 'functionality' of the expression arguments have changed: the combinators now take 'functions' as arguments rather than 'values'. Finally, the constructs A, (_ )_ and Xj related to function space, have been replaced by four new combinators: Curry, Apply, • and Id. There are some subtleties involved in motivating the combinators for function space and in deciding which combinators should take expression arguments and which should not. To give a detailed motivation requires a somewhat technical study of how to interpret typed A-calculi in so-called cartesian closed categories [40, 49] and this incorporates a study of why the approach to parameterized semantics in Chapter 5 works. We shall not go into these details, however, and in Chapter 5 we merely show that the approach to parameterized semantics works. Concerning the more general setting of the B-level language L we therefore do not have much to say in general, except to note that the intention is to eliminate variables associated with some binding time in B. This is done by replacing constructs operating on values of that binding time with combinators operating on functions whose domain contain information about the values of the eliminated variables, e.g. as shown in the equation for Tuple. Remark 4.1.2 The essence of the categorical motivation is as follows. The combinators • and Id express the basic categorical data. The combinators Fst, Snd and Tuple arise in the categorical characterization of cartesian product. In this Fst and Snd are morphisms, i.e. functions, that depend on no other morphism while Tuple depends on two. Hence Fst and Snd are used without any argument expression and Tuple is used with two. The combinators Hd, Tl, Isnil, Nil and Cons are intended to mimic this although they don't quite tally with a categorical characterization. (An alternative is to use a Case construct as discussed in Exercise 9.) Finally, the combinators Apply and Curry are the key morphisms involved in the characterization of cartesian closedness, i.e. function spaces. We refer to [40] for a quite readable introduction to such explanations. •
4.1
Mixed A-Calculus and Combinatory Logic
4.1.1
83
Well-formedness
The well-formedness relation for 2-\eve\ types is as in Table 3.3 and will not be repeated here. For expressions in combinator form, i.e. in CE2, the details are given by Tables 4.2 and 4.3. The form of the well-formedness relation is tenv h e : t where e£CE2 is a i?-level expression in combinator form, t£T2 is a £-level type and tenv is a type environment. Unlike in Chapter 3 the type environment now only maps variables to £-level types rather than pairs consisting of a type and a binding time. The reason simply is that we have only two binding times and since we are in the process of eliminating run-time variables, the type environments would have to associate all variables with the binding time c. For much the same reasons we do not explicitly indicate the intended binding time of the type t as this would also always equal c. The appearance of the well-formedness rules thus has much in common with the rules presented in Chapter 2. The well-formedness rules are presented in two tables. In Table 4.2 we have the fragment of the well-formedness rules from the 5-level A-calculus that still apply. In Table 4.3 we have the new rules that relate to the combinators. No rules analogous to [up] or [down] are present because removing one of only two binding times leaves us with just one binding time. However, we still have analogues of Facts 3.1.4 and 3.1.5. Fact 4.1.3 If tenv h e : t then h t : c. Fact 4.1.4 If tenv h e : t\ and tenv h e : t2 then t\ = t2.
• D
This also holds for the variation of the well-formedness rules where the types of the primitives are constrained. In this case the well-formedness rules must be changed to [f]
tenv he fi[t] : t
if H:c, fiGdom(C) and t is a 5-level instance of C(f\)
[F]
tenv he F-^t'^t] : t'z±t
if H'z+Uc, fiGdom(C) and t is a i?-level instance of C(f\)
Note in the axiom [F] that it is f, not t'z±t, that should be a £-level instance of C(fi). This has already been motivated by the equation Fj[f;z±^] = Ax[£']. ii[t]. For programs the well-formedness relation is much as in the previous chapters: 0h d : h
: t h DEF x1 = e1 - • • DEF x n =e n VAL e0 HAS t
84
4 Combinators Made Explicit
[f ]
tenv h fi[i] : t
r
if M : c
tenv h (ei,e 2 ) : h e : ^x£2 h f s t e : t\ tenv h e : ^ x l 2 tenv h snd e : t2
,i
\\]
e2 : t2 txxt2
tenv[t'/x{\ \~ e : t . , , ,/ . ^eni; h A X i [^]e : *'->*
L J
h e2 : f ;
h ei(e2) : i if (xi,£)£graph(£en?;) A H : c
[x]
tenv h xi : ^
r.i '•*•'
tenv \~ ei : t tenv \- e2 : t l i s t tenv h ei'.e2 : ^ l i s t
[nil]
Jent; h nil[*] : < l i s t
tenv \- e : t l i s t "T^FhdTTT" "Ten^FhdTTT h e : t list
M It'll 1
TrTTTTI
J
[true]
\ t list 7 r- \- er^ =—=tenv h i s n i l e : Bool tenv h t r u e : Bool
[false]
tenv h f a l s e : Bool
[isnill L
J
LXJ J
"
rf3_xi
if h f : c
- e^ : B o o l
tenv V e2\t
tenv h e 3 : t
^env h if e\ then e2 e l s e e?> : ^ Jenv_h_e_LJ_"r>i.
Table 4.2: Well-formedness of the mixed A-calculus and combinatory logic (1) A similar definition applies when constraints are considered. Example 4.1.5 The sum9B program of Example 4.1.1 is well-formed: h sum9B. D Example 4.1.6 Consider the function twice = Ag[lnt—>Int]. Ax[lnt]. g(g(x)) which applies its first argument twice to its second argument. This is a typed A-expression, i.e. twiceEi?, and it is well-formed, i.e. 0 h twice : (lnt->Int)->(lnt-*Int)
4.1
Mixed A-Calculus and Combinatory Logic
85
[F]
tenv h Filt'^t] : t'^t
if h t'z±t : c
[Fst]
tenv h t\ : t—*t\ tenv h e2 : £~»^2 £env h Tuple(ei,e2) : tz±tiXt2 *eni; h F s t ^ ' x T ] : fxt"^1 if h t'yA"^'
[Snd]
*enu h Sndf^^"] : t'xt'^t"
if h t'xt"^"
:c :c
r 1 ^ent; h e : t'xt"z±t [ UrryJ ° ^eni; h Curry e : t'^f'^t) [Apply] Jent; h Applyf^^''] : ((t1 zM'^xt')^" if I- ( ( ^ " ) x ^ ' ) _ ± £ " : c
[Id]
h e1 D e 2 : *z±*;/ tenv h Id[^] : t=±t ii^t^Uc
[Nil]
h Cons(e1,e2) : t—>r list tenv h N i l ^ ^ ^ l i s t ] : i ^ ^ ' l i if h t'^flist: c
[Hd]
tenv h Hd[<] : flist--»f
[Tl]
Jent; h Tl[<] : flist->aist
[Isnil]
^en?; h Isnil[f] : Hist—>Bool if h flist-->Bool : c
[True]
tenv h True[^] : £—>Bool
[False]
Jenv h False[£] : ^->Bool
\r A] ^ °n ^ [Fix]
tenv l~ e i : ^~»Bool tenv h e? : ^—>t f tenv h e tenv h Cond(ei,e2,e3) : f^f7 tenv h Fix[«] : (t^t)^ if h (t^t)^ :c
if h t l i s t - » t : c if h flist^flist : c
if h f->Bool : c if h £->Bool : c
Table 4.3: Well-formedness of the mixed A-calculus and combinatory logic (2) There are five well-formed £-level types with (Int—»Int)—>>(lnt—>Int) underlying type and three of them are *i = (Int->lnt)->(lnt-»lnt) t2 = (Int-^lnt)->(lnt->lnt)
as their
86
4
Combinators Made Explicit
Corresponding to these types we have 2-\eve\ expressions in combinator form. For t\ it is twice! = Ag[lnt->Int]. Ax[lnt]. g(g(x)) and one may check that 0htwicei:£i. For t2 it is twice 2 = Ag[lntand one may check that 0htwice2:^2- For t3 it is twice 3 = Curry ( Apply[lnt-»Int] • Tuple(Fst[(lntz=>Int)>Int] • Tuple(Fst[(lnt-»Int)x^Int], Snd[(Int^Int)xInt]))) In order to understand this formula it may help to write twice as twice = Ag[lnt-+Int]. Ax[lnt]. g( g( x)) and to check that 0htwice3:£3.
4.1.2
O
Combinator expansion
In the previous chapters we showed how the expressions of a newly introduced language could be transformed back into the more well-known language upon which it was based. In Chapter 2 we provided the functions 6TA and TTTA for erasing the type information, and in Chapter 3 the functions TBTA, £BTA and TTBTA for erasing the binding time information. We shall now perform a similar development that expands the combinators into £-level expressions. This amounts to reconsidering the equations above that show the intended meaning of the combinators. However, we need to be more precise in order to be able to fill in the type information left implicit above. We therefore define transformation functions e^™ : CE2 —> E2 and TTCI : P(CE2,T2) -> P(E2,T2). There is no need for a function rC\ for types as the 2-\e\e\ A-calculus and the mixed A-calculus and combinatory logic have the same type systems. It is, however, convenient to write Ta(tenv) for the type environment that maps Xj to ^:c whenever tenv maps Xj to t\. The superscript parameter to SQ™ is a type environment in the sense of this chapter and together with Fact 4.1.4, this will suffice for filling in the required type information. The intended functionality of SQ™ is therefore e£T
:
{ eeCE2 \ 3t: tenv h e : t } -> E2
4.1
Mixed A-Calculus and Combinatory Logic
87
i.e. 6QIV only applies to well-formed expressions, unlike £BTA a nd £TA- (One could define a function e : CE2—+UE that works for all expressions but this would not be what we need in the sequel.) The inductive definition is fairly straightforward, however, so we shall leave most of the cases as an exercise and only illustrate a few of the more interesting ones:
^ vl
Curry e ] = let tenv h e : t'xt'^t determine t' and f in Axa[*']. Axb[i"]. 4 e f v W l(xa,x b [2 '=i*"] I - \xa[(t'=±t")xtf].
(fst xa)_(snd xa)_
] = let tenv h e2 : t z l ^ determine t
in Axa[*]. e^rbi] i e^rl^l
i xa )).
In this definition we have taken the liberty of assuming that x a and x b are not in dom(£e?it;). To be precise we should have used some enumeration of variables, e.g. Xi, x 2 , • • •, and then let x a and x b correspond to the smallest and second-smallest index not in dom(tenv). By analogy with Fact 3.1.3 we then have: Fact 4.1.7 If tenv h e : t using the well-formedness rules of Tables 4.2 and 4.3 then rci(tenv) h e%?v[e] : t : c using the well-formedness rules of Tables 3.4 and 3.5. A similar fact holds for the well-formedness relation he where the types of constants are constrained. • Example 4.1.8 Returning to the functions twicei and twice 2 from Example 4.1.6 it is straightforward to verify that ] = Ag[lnt->Int]. Ax[lnt]. g(g(x)) twice 2 ] = Ag[lnt->Int]. Ax a [lnt].
88
4
Combinators Made Explicit
Both of these 2-level expressions have twice as their underlying expression (modulo renaming of variables). For twice 3 , however, £ei[twice 3 ] will be a fairly large • expression. We return to this in Exercise 4. It remains to define the combinator expansion for programs. We define TTCI
: { PeP(CE2,T2)
| \-p } -> P(E2,T2)
by 7Tci[ DEF x1 = e1 •" DEF x n =e n VAL e0 HAS t ] = let 0 h €\ : £i determine t\ let 0[^i/xi]- • -[^n-i/xn-i] h en : £n determine tn in DEF xi = elAeA DEF xn = It follows from Fact 4.1.7 that h 7rCi[p] whenever h p and similarly that he whenever he p.
4.2
Combinator Introduction
The developments of Sections 2.2 and 3.2 constitute a definition of a function -> P(E2,T2) that transforms programs in the form of untyped A-expressions together with an overall S-level type into programs that are in the form of £-level A-expressions together with a (possibly different) overall type. In other words, the type and binding time information has been propagated into the untyped expressions. In this section we will complement these developments by defining a function
Vci : P{E2,T2) ^
P(CE2,T2)
that transforms the run-time A-constructs into combinators in order to alleviate run-time variables. The definition of T^ci wiU be quite straightforward once a similar function has been defined for expressions. (Unlike the development in Section 3.2 we need no function for types as the types are the same in the i?-level A-calculus and the mixed A-calculus and combinatory logic.) To facilitate the definition of combinator introduction for expressions we shall work with a linearised version of type environments. Intuitively, the reason is that now the run-time parameters will be implicit parameters so rather than referencing
4.2
Combinator Introduction
89
them by their name we shall reference them by their position as determined by the position environment. A position environment, penv, is a finite list of triples consisting of a variable name, a £-level type and a binding time: penv G ({xi|iel} x T2 x 2) l i s t To obtain the type environment contained in a position environment we introduce functions ^BTA and p^v The function PBTA is defined by
{
t:b
if the rightmost (XJ,£J,6J) in penv
with Xi=Xj has t=t} and b=b} undefined if no (xj,tj,6j) in penv has Xj=Xj and gives the type environment in the sense of the £?-level A-calculus of Chapter 3. Similarly, the function /?QJ is defined by if
otherwige
and gives the type environment in the sense of the mixed A-calculus and combinatory logic studied in this chapter. By analogy we may define a function pTcl by Pci{ptnv){x.{) = t if and only if pBTA{penv)(xi) = t:r. Special interest centers around those triples in a position environment that relate to run-time variables. In particular it will frequently be of interest whether there are any such triples and we may define the function /? by f cc ifif all all tripl triples (x;,£;,&;) in penv have b\—c n, x P(penv) = < 4.1 • v ' y r otherwise
in order to express this succinctly. Note that f3(penv)—c implies but that the converse implication need not hold since a variable may occur in more than one triple. Assuming that f3(penv)=r we may define the product U(penv) of the types of the run-time variables as follows: undefined U(penvo) H(penv) —
t[ U(penvo)xt1
if f3(penv)=c if penv = penvo:(xj,^,c) A (3(penvo)=r if penv = penvo:(xi,£j,r) A (3(penvo)=c if penv—penvo:(xi,^,r) A (3(penvQ)=r
To obtain the element of U(penv) that corresponds to some Xj in the domain of Pci(penv) we may define the projection function n?env by
4
90
undefined j
Combinators Made Explicit
if Xj ^d it penv=penvo:{X[,t,c) A Xi^Xj A XJ( A Xi^Xj A
Id[H(penv)}
if penv A /3(penvo)=r if penv=penvo:(xj,t,r) A (3(penvo)=c
When TT?67 is defined it will be of functionality U(penv)=±t-i where £j is given by
Example 4.2.1 Consider the position environment penv given by ((xi,Bool,r), (x 2 ,Int,r), (x2,Bool,c), (x3,Bool,r), (x a? Bool->Int ? r)) The type environment PBTA{penv) 1S defined on x l7 x2 and x 3 and maps x1 to Bool:r ? x2 to Bool:c and x3 to Bool—>Int:r. The type environment pQ^penv) is defined on x 2 , and it maps x2 to Bool. By analogy we may note that pci(pcnv) is defined on Xi and x 3 , and it maps xi to Bool and x3 to Bool—»Int. Clearly (3(penv) equals r so that Ti(penv) is defined and we have U(penv) = ((Boolx_Int)j<_Bool)>^(Bool—>Int) The only projection functions that are defined are ^[tnv and -Kv^nv and one may calculate 7r|env = Snd[((Boolj£lnt)2£Bool)x^(Bool->Int)] *{env = Id[Bool] D Fst[Bool_xInt] D Fst[(Boolx_Int)2<.Bool] D Fst[((Boolx_Int)><_Bool)j<_(Bool-^Int)]
•
We now have sufficient apparatus that we may state our intentions with the function £^tv that performs combinator introduction for expressions. It has functionality S&nv : E2 --> CE2 and we shall take care only to apply it to a iMevel expression e that is well-formed. More precisely we shall assume that pBTA(penv) \~ e : t : (3(penv)
4.2
Combinator Introduction
91
for some 2-level type t. Thus if penv contains run-time triples we shall insist that \-t:r and otherwise that H:c. Further we shall assume that the position environment penv is well-formed: that h^:6i holds whenever (xi,t\,bi) is some triple in penv. It is our intention that h S^nv[e] : t nV e
h ^ci I I
:
if (3(penv)=c
n{penv)z±t
if (3(penv)=r
This means that all run-time variables have been eliminated and that the eliminated 'values' are instead supplied explicitly to the combinator expression. We shall find it helpful here, and in similar situations later, to write this more succinctly. To do so we define \(
\ 4- _ / * y H(penv)z±t
\i j3{penv)—c if f3(penv)—r
and thus merely write
ph(penv) h Sgrie] : A(penv) t A problem to be encountered in the definition of £ci nv [e] ^s ^ e fact that the rules [up] and [down] may have been used to show the well-formedness of e, that is to show that 3t,b: pBTA(penv) I" e '• t • b A b = /3(penv) It is to handle this problem that we only consider the possibility that b equals /3(penv). We therefore need operations for shortening and enlarging position environments. A position environment penv may be shortened by removing all triples whose binding time is r:
i
penv
if penv — ()
5#(pem7o):(xi,£i,c) if penv<=penv'0:(xi,fi,c) One may note that /?(£• penv) always equals c. To overcome the effect of shortif ening we need c( _ j f\ 0{penv,t) e - |
e
if
Curry(eDSnd[n^e^)^/])
if
(3{penv)=c
For this to be well-defined we need to ensure that
0(penv)=r => 3t'J": t^t'^t" holds for all invocations of 5(penv,t) e. Example 4.2.2 To illustrate the combined effect of 8% and 8 we shall assume that
92
4
),
Combinators Made Explicit
f3(penv)=r,
and
t=t'-*t".
Provided that
phipenv) h SScTvle] : t we then have Pci(Penv) ^~ $(penv,t) £c\penvlel
:
A(penv) t
so that £ciWV|[eJ — fi(penv,t) £c\penvle] seems feasible. This will become clearer in the definition of £&\nv below. D Turning to enlarging a position environment, we may add a dummy variable of the binding time r: _ j penv penv - | penv:^tf^
if f3(penv)=r ifp(penv)=c A t=t'
Note that fi(io*(t) penv) always equals r provided that (jj*(t) penv is indeed defined. For this to be the case we need to ensure that f3(penv)=c => 3t\t":
t^t'z+t"
holds for all invocations of w(t) penv. Furthermore, we assume that x a does not occur already in penv. (This can be made more precise in the way indicated earlier.) Finally, to overcome the effect of enlargement we need v _ J e ( L0[penv,t) e - j A p p l y ^ D T u p l e ( e J d ^ / ] )
if f3(penv)=r p(penv)=c A t=t'=±t"
if
where again the condition P(penv)=c =» 3t'J": t^t'^t" needs to be verified for each invocation of u>(penv,t) e. Example 4.2.3 To illustrate the combined effect of u;# and UJ we shall assume that PBTA(penv)\-e:t:f3(penv),
f3(penv)=c,
Provided that p^ipenv)
h ^ ' ' " ' [ e ] : t'=±t
we then have h cu(penv,t) SQJ '?env[e]
: t
and t=t'—>t".
4.2
Combinator Introduction
93
£VcTl HA I = S(penv,t) ti[t]
cTV
I = "(penv,t) Fi[A{(v(t) penv) t]
Table 4.4: S]^™: Combinator introduction for expressions (part 1) so that ^ c i ^ H ~ w(penv,t) SQI ^ pewv [e| seems feasible. This will become clearer in the definition of £Q™ below. • The definition of £^v is by induction on the structure of its argument. When defining £ci nv [e] w e shall assume that penv is well-formed and that P&Tk{venv)
I" e '• t
:
/3{penv)
For primitives we then have the clauses of Table 4.4 and these will be explained below. We should stress that <$, a;, A etc. are all intended to be expanded according to the value of penv. A good reading guide to Table 4.4 (as well as subsequent tables) is first to read the equations in the case where no expansion of 6 or to is necessary. Thus if f3(penv)=c we have ^ [ ^ [ 4 ] ] = f{[t] and if (3{penv)=r we have 8&n [UAl = Fi[to=±t] where t0 equals U(penv). For the latter clause recall that Fj[£o:i±£] is to correspond to Axa[^o]« £i[^]Next consider the case where expansion is needed. When f3(penv)=r the idea will be to define ££ e ri f i[*]] as 6(penv,t) £$**"[£&]] and this is already accomplished by the definition given. We also note that t must have the form tfz±tff in this case so that our use of 6 is valid. Similarly, when f3(penv)=c the idea will be to define ^ci^Iiif^]] a s w(penvj) EQI env[ii[t]] and this is also accomplished by the definition given. We also note that t must have the form t'z±t" in this case so that our use of u and w is valid. The reasoning of Examples 4.2.2 and 4.2.3 shows that £cinV[fi[*]l a n d £cittV|tei[*]I h a v e t h e functionality A(penv) t in all cases. Turning to compile-time and run-time products the clauses are given in Table 4.5. In the clause for compile-time tupling we know by the well-formedness assumptions that (3(penv)=c and in the clause for run-time tupling we know that fi{penv)—T. This obviates the need for any use of 5, £•, a;, or w. Turning to the projection functions we need to use Fact 3.1.5 to obtain the types of the argument expression. As we do not know the value of (3(penv), i.e. whether or not the rules [up] or [down] have been used in inferring that the argument to ££jnv is well-formed, we have to be prepared for all possibilities and thus have to use £, <$•, LJ and u>9. In all cases our use of ), £•, u and UJ% is valid. Example 4.2.4 Consider the expressions ex = f s t
(fi[A =i A],f 2 [A z ±Al)
e2 = f s t ( fi[Az±A], f2[AI=+A] )
94
4
Combinators Made Explicit
II
)
II
r*penv
SQI^I fst e ] = l e t pBTA(penv) h e : ti x t2 : c determine t\ in 8(penv,ti) f s t Ssc\p"lv[e] £QI
[ fst e I =let PBTA(penv) h e : h xt2 : r determine ^1, t2 in
^risnde ] =let
st
2j
D
£Q 1
l)penv
[e]
h e: h x t2 : c determine t2
PBTA{P™V)
in 8(penv,t2) s nd s e7
V
£Q\ [
s cy
snd e ] =let pBTA{penv) h e : t\ X. ^2 ' r determine fi, t2 in Lo(penv,t2) S ndl[hxt 2] a ^c
Table 4.5: EQI™: Combinator introduction for expressions (part 2) and the well-formed position environments penvi = () penv2 = (xa,A,r) Clearly we have pBTA{penVi) h ej : A->A : /3(penvi)
for all choices of i£{l,2} and J G { 1 , 2 } . It is straightforward to calculate that leij = f s t £&»">[ ( fxtAziA], f 2[A=±A] ) ] = f s t ( fifAziA], f2[AziA] ) and that £&nV2b2] = Fst[(Az±A)x.(Az±A)] D ^ n ( fi[Az±A], fclAzlA] ) ] = Fst[(A->A)x/A-->A)] • Tuple(F! [A^tC Az±A)] ,F2 [Az±( Az±A)]) since there is no need for >, 8*, u or o;« in these cases. Turning to the harder cases we have k^k)
f s t £%{"*\ ( f ^ A ^ ] , f2[Az±A] ) ]
where Curry and Snd are needed to get rid of the value for x a . Similarly, Fst[(A=iA)x.(A=iA)] °
r"2 [c2],Id[A])
4.2
Combinator Introduction
95 ( C)
' Ie]
] = if f3(penv)=r then Curry £Q^ .:(*.^) [ e ]
env 1 e l ( e 2 ) ] = let pBTA{p ) l~
t~*V6JIV IT
£&
/
\
«S"[.•!«,!]
Tl
Si : £'—^^
in 8(penv,t) (£QI
V6
: c
IL 6 i-*- JJ v^ L/C*! ^ 1
determine t n \ \ 1^2] ))
env rr
si : f'zif : r determine t and V in u>(penv,t) (App^ [e 2 ]))
Tup!
:S=l£i:
8{penv,t-l) Xj if /9 B T; 7rfCnV if /?BT7
= U:c
Table 4.6: £QT V : Combinator introduction for expressions (part 3) where Apply, Tuple and Id are needed to construct the artificial parameter for x . It is vital for the operation of 8, 8%, to and w that \~t:b holds for b—r as wella as b—c if and only if t is of the form t'z±tf/. • Turning to compile-time and run-time function spaces the clauses are given in Table 4.6. In the clause for compile-time A-abstraction we know by the wellformedness assumptions that f3(penv)=c and so can perform a rather straightforward translation. In the clause for run-time A-abstraction we do not know the value of /3(penv). If it is r then we have a straightforward expression involving Curry. If it is c we simply use £Q™ V * {e} as xi will be the only run-time variable in penv:(x\^t\r). This explains why w and w were not used. For the applications we again need to use Fact 3.1.5 to infer type information about the type of the argument and we do not know the value of (3(penv) and so have to use 5, £•, to and LO%. These uses are valid as may easily be checked. For variables we have two cases depending on whether the variable is retained or eliminated. Example 4.2.5 It is straightforward to calculate 4 \ [ Ag[lnt->Int]. Ax[lnt]. g(g(x)) ] Ag[lnt->Int]. Ax[lnt]. g(g(x)) which equals the function twice! from Example 4.1.6. Next
96
4 Combinators Made Explicit
£vc™\ nil[J] ] = Nil[A(pem;) (t list)] £cinv{ hd e ] — let pBTkipenv) \- e \ t list : cdetermine £ in 8(penv,t) hd £c\penv £Q^V\ hd e ] = let pBTk{venv) I" e : £list : r determine £ in u>(penv,t) (Hd[£] • ^ c£i I t l e 1 = t l cci lej
: r determine £ £cinV[ "tl e | = let pBTk{venv) \~ e '- list t inTl[<] D ^ i n l e l
^ci I i s n - i l
e
I — i s n i l ^QJ | e |
^ci^I i s n i l e ] = let /9pTA(^er^v) ^ e : t list : r determine t
in Isnil[t] • ^ r M
Table 4.7:
SQI™'-
Combinator introduction for expressions (part 4)
4 i [ Ag[lnt-^Int]. Axflnt]. g l g U ) l I = Ag[lnt—»Int].
Apply[Int—»Int] • Tuple(Curry(gDSnd[lntx.IM]), Apply[lnt--»Int] • Tuple(Curry(gDSnd[lnt2<.Int]),
which is somewhat more complicated than the function twice 2 from Example 4.1.6. Finally, 4 i [ Ag[lntz±In£]. Ax[lnt]. gIgU)l I = Curry ( Apply[lnt-»Int] • Tuple(ld[lnt->Int]DFst[(Int-»Int)j<_Int], Apply[lnt-»Int] D Tuple(ld[lnt-»Int]DFst[(Iiit-^Int)j<.Int] ? Snd[(Int-»Int)_xInt]))) which is almost equal to the function twice3 from Example 4.1.6.
O
The clauses for compile-time and run-time lists are mostly straightforward because it is only when taking the head (hd or hd) of a list that we cannot determine the value of j3(penv). The clauses are given in Table 4.7. The clauses for the remaining constructs are given in Table 4.8 and exhibit no new features.
4.2
C o m b i n a t o r Introduction
97
CQ! [ t r u e I = t r u e £vciv\ t r u e ] = True[II(peni;)]
^ r i false] = false £ciVl false ] = False[II(pent;)] 5g;nv[ if ci then e2 else e3 ] = let pBTA(penv) \~ e2 : t : f3(penv) determine t in S(penvJ) if £ci PenV I e i] "then £ciPeriV[e2l e l s e ci I if ei then e2 e l s e e3 | = let /?BTA(p^nt;) \- e2 : t : f3(penv) determine t m Lj{penv,t) Cond(SclK )V [ c j , 5 C I l ;p [c 2 j, 5CIV ;p
[e 3 ]
] = let /?BTA(P C ^ V ) I" e : ^^^ : c determine t in 6(penv,t) f i x ^ci^ ent/ [ e ] WV
[ f i x e ] = let /?BTA(PC^^) ^~ e : ^z±^ • r determine t
Table 4.8: £^v:
Combinator introduction for expressions (part 5)
Proposition 4.2.6 (Correctness of £Q\UV) If the position environment penv is wellformed and the 5-level expression e satisfies h e : t : f3(penv) then 5ci nV [ e I ^s defined and
^^(pent;) h S^v[e]
: A(penv) t
A similar result holds for he.
D
Proof: The proof is by structural induction on the argument expression e. In each case the general strategy will be as follows: (i) First consider the subcase where the value of f3(penv) is such that the potential occurrences of £, £•, u and u;# have no effect. For compile-time constructs this is when (3(penv)=c and for run-time constructs this is when /3(penv)=r. (ii) Next consider the subcase where f3(penv) has the opposite value. In some cases this conflicts with the assumption
98
4
PBTk{penv)
Combinators Made Explicit
\~ e : t : (3(pe7iv)
and then the proposition holds vacuously. When there is no conflict we proceed as follows: (a) Verify that the type arguments to 6, u and UJ% satisfy the conditions. (b) Show that the equation for £ci nv [e] amounts to 8(penv,- • •) £ C i ?env [e] v [e] when (3{penv)=c. when [3(penv)=r and u>(penv,- • •) ££i (c) Combine (i) and (iib) using the insights about the combined effects of 8% and 8 and of w and UJ that were presented in Examples 4.2.2 and 4.2.3. This proof is mostly straightforward as it only amounts to formalising the explanations given when motivating the definition of £Q™. We shall therefore only consider the cases corresponding to those also considered in the proof of Proposition 2.2.8. The case e::=fi[f]. If f3(penv)=c we have
and clearly
h f\[t] : A(penv) t
If /3(penv)=r we have
where t0 is H(penv) and t is of the form tiz±t2. To see that t is indeed of the form tiz±t2 note first that H : r follows from ^{penv)—T and pBTkipenv) h fi[<] : t : f3(penv) and second that \~t:c follows because rule [f] must have been used in order to obtain this. We then clearly have Pci(Penv) 1~ Curry(fi[£]DSnd[£ox^i]) : A(penv) t as was desired. The case e::=fj[<]. If f3(penv)=r we have
where to=H(penv) and clearly h Fi[£Oiii£] • A(penv) t
4.2
Combinator Introduction
99
If (3(penv)=c we have
where £ is of the form ti^_t2. To see that t is indeed of the form ti^_t2 note first that \~t\c follows from 0{penv)—c and PBTA{penv)^~f±[t]'t''l3{penv) and second that H : r follows because rule [f ] must have been used in order to obtain this. We then clearly have pccl(penv) h Apply[*] • Tuple(F i [^ 1 ^],Id[^ 1 ]) : A(penv) t (By way of digression, it might be tempting to use F\[t] for £cinv[£i[£]] *n ^ however, then Fj[- • •] would no longer correspond to Ax[* * *]• fi[* * *]•) The case e::=\xi[t'].e0.
s
case
If f3(penv)=c we have
and it follows from the well-formedness of e that t must be of the form t'—^t". Using the induction hypothesis we then have p-cl(penv:(xht',c)) \- £g"'**i't'fi)[e0] : Aipenv.fat',c))
t"
since we know from the well-formedness assumptions about e and penv that penv:(xi,t',c) is well-formed PBTA(penv:(xiJ\c))
h e0 : t" : (3(penv:(xi,tf,c))
We then have p&ipenv) h AXi[i']. ^r : ( X i > < '' C ) |eo] : A(pent;) i Since f3(penv) must equal c this completes the present case. The case e::=Axi[£'].e0. If f3(penv)=r we have £STU»[f].eo] = Curry £ST" !(lilt'ir)[eo] Much as above, it follows from the induction hypothesis that p^ipenv.^t',!))
h ^ " " ^ ' ' ' ^ [ e o ] : A(pent;:(xi,i',r)) *"
where t is of the form t'=yt". We then have p^penv)
h Curry £^nv-(Xi'* 'r^[e0] : A(penv) t
If (3(penv)=c we have ^ci [AXi[c J.e o | — c C I
|eo|
5
100
4
Combinators M a d e Explicit
and the result follows immediately from the induction hypothesis. T h e c a s e e::=ei(e2).
If f3(penv)=c we have
It follows from the induction hypothesis that pccl(penv) h S^nv[et]
:
pccl(penv) h Sl\nvle2j
: t'
t'^t
for a suitable type t' and it is then immediate that phipenv) h 5S e r[ci(e 2 )) : A(pent;) t If f3(penv)=r we have
where t is of the form tiz±t2- This is because H : r comes from the well-formedness assumption about e, and h^:c follows because rule [()] must have been used to obtain this. We saw above that the induction hypothesis guarantees pccl(8%penv) h ££ I pcni '[e 1 (e 2 )] : A(*#pent;) t and we then have Pci(penv) h Curry(fciPCWV[ as desired. T h e case e::=e 1 ^e 2 i- If f3(penv)=r we have
where t'^±t is the type of t\. It follows from the induction hypothesis that h h
and it is then immediate that phipenv) h Apply[«'^i] D T u p l e ( ^ n e i ] / c r [ e 2 ] ) : A(penv) t If (3(penv)=c we have £ S r [ e i i e 2 U = Apply[<] D
^
^
where t is of the form ^iz±^2- That f must be of this form follows much as in the previous case. We saw above that the induction hypothesis guarantees
4.2
Combinator Introduction
101
pccl(to%(t) penv) h £ci W ? e W l e i.( e 2)] : A(u;#(J) penv) t and we then have pcCI(penv) \- Apply[<]DTuple(^ I #(0pen le 1 ( e2 )J,Id[< 1 ]) : A(penv) t as desired. The case e::=Xi. If pBTA{penv)(xi)=t:c and /3(penv)=c we have X
^CI FiJ -
i
and clearly Pci(Penv)
^~ x i
:
A(penv) £
If PBTA(p^^^)(xi)=^:c and f3(penv)=r we have
where t0 is H(penv) and ^ is of the form tiz±t2. That ^ must be of this form follows much as in the previous cases. It is then immediate that Pci(penv) h Curry(xiDSnd[£ox^i]) : A(penv) t If PBTA(p^nv){^i)=t:^ *X TJ _
l il - ^i
When defining 7r?env
then (3(penv)=r and we have penv We
argued that it would have type U(penv)=±t and hence
Pci(penv) h Tr/'671^ : A(penv) t as desired.
•
We can now define the function 'Pci that performs combinator introduction for programs. It has functionality
: P{E2,T2) <-* P(CE2,T2) and we shall take care only to apply it to well-formed programs. The definition is Veil
DEF x
i = e i • • • DEF x n =e n VAL e0 HAS t ] = let 0 h ei : ^i : c determine £i let 0[
in DEF x1 = e[ • • • DEF x n = e ^ VAL e'o HAS t
102
4 Combinators Made Explicit
Here we have used Fact 3.1.5 to determine the types of the expressions in the well-formed program. The correctness is given by Theorem 4.2.7 (Correctness of VCi) If pEP(E2,T2) is well-formed, i.e. h p, then VCilp]eP{CE2,T2) is defined and is well-formed, i.e. h VcilpY A similar Q result holds for h c . Proof: This is a straightforward consequence of Proposition 4.2.6.
4.3
•
Improving the Combinator Introduction
The transformations presented for combinator expansion (SQIV) and combinator introduction (SQ6^) have the disadvantage that they often produce rather large expressions that no human would have produced. One way to explain this difference is that the algorithms do not take account of certain identities, or simplifications, that humans expect to hold. In this section we shall present two classes of such simplifications and suggest that they could be used to develop improved versions of combinator introduction and combinator expansion. One class is often called partial evaluation [27, 47, 73] and amounts to performing simplifications like DEF x = e VAL e' HAS t t> VAL e'[e/x] HAS * (Ax.e')(e) t> e'[e/x] Applied to the suites program of Example 4.1.1 we may obtain VAL Apply[ ] • Tuple((Curry (f ix(Ag[ ]. Cond(Isnil[ ] • Snd[ ], Fst[ ], Applyf ]DTuple((Curry +[ ])DHd[ ]DSnd[ ], g D T u p l e ( F s t [ ] , T l [ ] n S n d [ ]))))) • Zero[], Id[]) HAS I n t l i s t z± I n t Here partial evaluation supplies reduce 9 e with its first parameter, that is (Curry +[ ]), which is already known at compile-time. Since these simplifications only change the compile-time structure, they could equally well be performed on the program sum9 from Example 3.1.6. The run-time counterpart of partial evaluation is called algebraic transformation [7]. An example is the transformation Apply[ ] • Tuple((Curry e) • e',e") > e • Tuple(e',e") If we apply this transformation twice to the above program we get the program defined by
4.3
Improving the Combinator Introduction
103
VAL f ix(Ag[ ].Cond(Isnil[ ] • Snd[ ], Fst[ ], +[ ] • Tuple(Hd[ ] • Snd[ ], gDTuple(Fst[],Tl[]nSnd[])))) D Tuple(Zero[ ], Id[ ]) HAS Int l i s t z± Int Note that by now all higher-order run-time functions have disappeared. Algebraic transformations play an important role in simplifying programs. The algorithm for combinator introduction proceeds by structural induction and algebraic transformations can be used to reduce certain unnecessarily complicated expressions that arise in this process. This is illustrated in the exercises.
Bibliographical Notes The mixed A-calculus and combinatory logic of this chapter dates back to [66]. In its present form it is closely related to the combinator language TML m of [76] (with close cousins in [72] and [78]). The major difference is that we have no general Const[t] e construct that transforms an expression e of type t' to one of type t=±tf. Instead we have used a formula with Curry, as is demonstrated in 8, and we have translated a constant of the form f\[tf] to one of the form F,[£—>£']. The algorithm £Q\ for combinator introduction is much as in [76] where a language called TMLi is translated into TML m . In short, SQJ is an extension of the usual algorithm for translating the typed A-calculus into categorical combinators (as in [21]). A more general algorithm is presented in [80] where a language called TML e is translated into a version of TML m . In this algorithm the type of Abound variables may change during the translation. A brief comparison of TMLi and TML e w a s given in the Bibliographical Notes of Chapter 3. Concerning the present version of £QJ it is worth observing that 8(- • •) e roughly corresponds to e> in [21] and that similarly LO(- • •) e roughly corresponds to e<. The use of combinators (for part of the notation) is best motivated by the development of the following chapters. In essence, it amounts to providing a flexible framework that allows interpreting the constructs for different purposes (including code generation and abstract interpretation). Given the use of categorical combinators for the interpretation of typed A-calculi in cartesian closed categories it should not be surprising that they crop up here as well. Intuitively it is important that free variables are handled in an explicit way. Formally the combinator notation must be functionally complete in the sense that the expressive power is equivalent to that of the typed A-calculus. It is not unlikely that the development of this and the following chapters can be performed for another fixed set of sufficiently well-behaved (i.e. functionally complete) combinators. However, it is important for the development of the following chapters that there is only a fixed selection of combinators. This means
104
4
Combinators Made Explicit
that supercombinators [41, 86], a very popular tool for the implementation of lazy functional languages, would not be immediately usable; the reason is that the number of and form of supercombinators depend on the actual program at hand. Despite the problems of using supercombinators for the development of the following chapters it might still be worth investigating whether the development of the present chapter could be modified so as to produce supercombinators for the run-time level; we leave this as an open problem. As in the previous chapters we have not considered any semantic implications of the transformation function. One possibility would be to define an operational semantics of the 2-level A-calculus, to define an operational semantics of the mixed A-calculus and combinatory logic, and to show that the transformation functions preserve 'reducibility'; we refer to [21] for an approach based on these ideas (but in the 1 -level case). For our purposes it might be more natural to define a denotational semantics of each of the two languages and to show that the transformation functions preserve semantics; the problem with this approach is that our ultimate semantics for the mixed A-calculus and combinatory logic is the notion of parameterized semantics of Chapter 5 and that this notion of semantics is not definable for the IMevel A-calculus except by using considerations that boil down to the transformation functions of the present chapter. A brief appraisal of the material of this chapter may be found in [76].
Exercises 1. List the five well-formed ^-level types that have (Int—>Int)—»(lnt—*Int) as their underlying type. Use the development of Chapter 3 to show that only three of these correspond to £-level A-expressions that have the function twice of Example 4.1.6 as their underlying expression. Deduce from this and the development in this chapter that Example 4.1.6 lists all iMevel expressions in combinator form that 'correspond' to twice. 2. One might consider adding a combinator Uncurry that is to be the 'inverse' of Curry. Its well-formedness rule would be
tenv h e : t'z±(t"z±i) tenv h Uncurry e : tlxt"-*t Show that this is not necessary by defining Uncurry e in terms of the combinators of Table 4.1. 3. Complete the definition of SQ™. 4. Evaluate SQjJtwicea] where twice3 is as in Example 4.1.6. Use the rule (AXi[*].e)_(e'I > e[e'/ Xi ]
4.3
Improving the Combinator Introduction
105
to simplify the expression. Can you achieve Ag[lnt—>Int].Ax[lnt].g(g_(x)_)_? If not, suggest additional simplification rules. 5. Consider the twice 2 functions of Example 4.1.6 and the twice' function resulting from 4i[Ag[lnt->Int]. Ax[lnt]. gigOOU in Example 4.2.5. Suggest rewriting rules on combinator forms that may be used to simplify twice' to twice 2 and perform these simplifications. 6. (*) Try to use the insights of Exercises 4 and 5 to define improved versions of SQI and SQJ. 7. Evaluate / Pci[ sum 9] where the program sum9 is defined in Example 3.1.6. Try to simplify the program using rewriting rules of the kind considered in Exercise 5. 8. (*) Consider a version of the iMevel A-calculus of Chapter 3 where the side condition to rule [up] is strengthened to include also the condition that e is of the form Axi[£].e0. Does sum9 belong to this version? Develop a simplified translation function S^v for this version of the £-level A-calculus. Modify ^BTA s o a s t ° produce 5-level A-expressions in this version of the 5-level A-calculus. 9. (*) For the type t l i s t we have used the combinators Cons(e,e), Nil[f], Hd[f], Tl[i], Instead of this it is possible to use Cons(e,e), Nil[t], Case(e,e) where the idea is that Case(ei,e 2 ) = Ax[£ l i s t ] . if i s n i l x then ei(void) e l s e e2((hd x, t l x)) where void is the only element of the type Void (one of the A]). Try to modify Chapter 4 so as to change to this new set of combinators.
Chapter 5 Parameterized Semantics We want to interpret the run-time constructs of our language in different ways depending on the task at hand and at the same time we want the meaning of the compile-time constructs to be fixed. To make this possible we shall parameterize the semantics on an interpretation that specifies the meaning of the run-time level. The interpretation will define • the meaning of the run-time function types, and • the meaning of the combinators. Relative to an interpretation one can then define the semantics of all well-formed 2level types, of all well-formed £-level expressions in combinator form and of all wellformed S-level programs in combinator form. In this chapter we shall provide the detailed development of this framework and illustrate it by definitions of various forms of eager and lazy semantics. In the following chapters we shall use the framework to specify various forms of code generation and abstract interpretation; this will substantiate the claim that the development of parameterized semantics gives the desired flexibility. In Section 5.1 we concentrate on the IMevel types. This begins with covering the required domain theory, defining the semantics of ,2-level types relative to an interpretation and then providing examples of eager and lazy interpretations. In Section 5.2 we perform an analogous development for IMevel expressions in combinator form. We conclude with a treatment of i?-level programs and a discussion of our approach to semantics.
5.1
Types
The syntax of 5-level types was introduced in Chapter 3 and is given by t e T2 t\\=ki\ki\txt\tyLt\t-+t\tz±t\t
list | t list 107
108
5
Parameterized Semantics
Our task is to associate a set of values with each of these types. More precisely, the sets of values will be equipped with a partial order so that they become domains. We therefore begin by expounding the domain theory that we need in order to account for the types A, and the type constructors x, —> and l i s t .
5.1.1
Domain theory
We already encountered partially ordered sets in Section 3.1, that is structures (D,[I) where D is a set and C is a partial order on D. In this section we shall consider certain partially ordered sets where, intuitively, ^ C ^ means that d 2 is at least as informative as d\. We shall provide examples to justify this later but it is worth pointing out that we shall not feel constrained to only considering partially ordered sets that can meaningfully be understood in this way. To prepare for the definition of domains we need the concept of a least upper bound. A subset Y of a partially ordered set (.D,E) is s a id to be consistent if there is an element d&D such that yQd for all yG Y and if this is the case we say that d is an upper bound of Y. An element d£D is the least upper bound of the subset Y if it is an upper bound and if d^-d1 holds for all upper bounds d1 of Y. Since a partial order is antisymmetric, the least upper bound of Y is unique if it exists and is written \_\Y. Intuitively, the idea is that \JY combines all the information expressed by the elements of Y but without adding additional information. A partially ordered set (.D,E) is said to be a complete lattice if every subset Y of D has a least upper bound \JY. When the partial order E associated with (Z?,E) is obvious from the context we sometimes write D rather than (Z),E)Example 5.1.1 Let Pa; be the set of subsets of natural numbers: Pu> =
{K\KC{0,l,2,---}}
Then (Po;,C) and (Pa;,I)) are partially ordered sets where C is subset inclusion and D is superset inclusion, i.e. KDKf means K'C-K. In (Po;,C) every subset Y has a least upper bound |J Y given by (J Y where [}Y = { ne{0,l,2,.-.} | 3KeY: neK } Thus (Po?,C) is a complete lattice. In (Pa;,3) every subset Y has a least upper bound |_JF given by f]Y where O F = { ne{0,l,2,---} \VKeY:
neK }
In particular, (J0=0 and p|0={O,l,2,- • •}. Thus also (Pa;,I)) is a complete lattice. In Chapter 3 we defined the notion of a greatest lower bound of a subset Y. In (Po?,C) every subset Y has a greatest lower bound given by f]Y. In (Pa;,D) every subset Y has a greatest lower bound given by (JY. Thus least upper bounds in (Po;,C) correspond to greatest lower bounds in (Po;,D) and vice versa. For this reason (Po;,C) and (Pa;,13) are often said to be dual of one another. •
5.1
Types
109
Example 5.1.2 Given a set 5 we note that (S,=) is a partially ordered set, where si=s2 means that si is equal to s2- We shall say that (5,=) is a discrete partial • order. For the purpose of semantics it is too demanding to require all subsets to have least upper bounds. Rather we restrict our attention to least upper bounds of two classes of subsets. One class is very trivial since it contains only the empty set; it is of interest because the least upper bound of the empty set (if it exists) will be less than or equal to (i.e. C) any other element. It is said to be the least element and is denoted J_. The other class of subsets are the chains: a subset Y of (£),E) is a chain if it is of the form Y = { dn | n>0 } where the elements dn£D satisfy the condition n<m => dnQdm It is commonplace to write (dn)n for {d n |n>0} and \Jndn for LJ{^n|n>0}. We then define a cpo (complete partially ordered set) to be a partially ordered set that has a least element and that has least upper bounds of all chains. Clearly any complete lattice is a cpo. If all chains are finite the cpo is said to have finite height and if all chains have at most two elements the cpo is said to be flat. Example 5.1.3 Given a set S and an element * not in S we may define a flat cpo (Sj.,Q by
S±= S\J{*} dQd' if and only if d=* V d=d' If we take care of the possibility that • was already in 5 we would take S±= {(1,S)\SES}\J{(0,*)} and define C. accordingly but this gives rise to a rather cumbersome notation. Since • is the least element it is common to write _L instead of • in the above definition. Examples of flat cpo's include the truth values B = {true,false}j_ which may be depicted as
true
and the integers Z = {•• -,-1,0,1,-
false
110
5
Parameterized Semantics
which may be depicted as ...
-2-10
1
2
•••
In these examples one may explain _L as modelling an evaluation of a truth value or an integer that never succeeds in producing a value. Thus d 1Cc!2 means that d2 is at least as informative as d\. • An element d£D is compact if the condition d E LMn =» 3n: d C d n holds for all chains (dn)n. The set of compact elements is usually denoted BD = { b£D | b is compact } It equals D when D is of finite height (or finite) but is, in general, a proper subset of D. The cpo (J9,E) is algebraic if BD is countable and each element d^D is the least upper bound of a chain of compact elements: d — Lilian where (b n)nCBj) is a chain The cpo (Z),E) is consistently complete if each consistent subset (i.e. subset with an upper bound) has a least upper bound. We then define a Scott-domain to be a consistently complete and algebraic cpo. E x a m p l e 5.1.4 The partially ordered set (Pi<;,C) is a Scott-domain and the compact elements are the finite sets. The partially ordered set (Pu,I)) is a Scottdomain and the compact elements are the cofinite sets, that is, those sets that equal {0,1,2,- • •} once a finite number of elements have been added. The cpo's B and Z are Scott-domains and in general (5j_,E) is a Scott-domain if and only if S is countable and in that case every element is compact. • A domain is usually taken to be a Scott-domain and this will also be the reading in this book. However, large parts of domain theory may be developed by simply taking the domains to be the cpo's; this includes virtually all of the domain theory developed in this book since we do not have recursive types. This means that little is lost if the remainder of this book is read with domain meaning cpo and ignoring all mention of compact elements, algebraicity and consistent completeness. Construction 5.1.5 Given partially ordered sets (D,Q) their cartesian product (DxE,C.) by
an
d (£",E) we may define
5.1
Types
111
DxE = { (d,e) | deD A (d,e) C (d',e') if and only if ((dCd') A (eCc') This is a partially ordered set. It is a cpo if D and E are; the least element is (_L,JL) and the formula for least upper bounds of chains is Un (d n ,e n ) = (UiA, U n e n ) It is a domain if D and E are, and the compact elements are the pairs of compact elements:
B DXE = Bn X Bi Finally, it is a complete lattice if D and E are, and the formula for least upper bounds is
= (U{d\3e:{d,e)eY},
U{e\3d:(d,e)eY})
•
(much like the formula for chains above).
Example 5.1.6 The cartesian product of the domain B of booleans and the domain Z of integers may partially be depicted as
(true,-l)
(false,-l)
(true,O) (false,O)
(true,l) (false,l)
Intuitively, ^ C ^ means that d2 is at least as informative as d\. This holds for the partial orders of B, Z and BxZ and we note that (6,z)C(6 / ,z / ) indicates that (b',zf) is at least as informative as (6,z) because zf and bf are at least as informative as z and 6, respectively. If (b',zf) is strictly more informative than (6,2) it may be • because b1 and/or z1 is strictly more informative than 6 and/or z. A function /:(/),E)~K£\E) from a partially ordered set (D,Q) to a partially ordered set (E,Q) is a total function f:D^>E from D to E. It is monotonic if
112
5
Parameterized Semantics
» f(d%f{d') (for all d and df in D)\ it is s£n'c£ if d is least in D => f(d) is least in E (for all d&D)', and it is continuous if (^n)n is a chain in D with least upper bound d
JJ (f(dn))n is a chain in # with least upper bound f(d) (for all chains (dn)n of D). Every continuous function is monotonic. If/ is monotonic and (dn)n is a chain then (f(dn))n is a chain in E; if d is the least upper bound of (dn)n then f(d) is an upper bound of (f(dn))n but not necessarily the least upper bound. Lemma 5.1.7 If/:(D,CI)—>>(Z),[I) is a continuous function from the cpo (I?,C) to itself the formula
defines the least (pre-) fixed point of /, that is f(d)Qd =• FlX(f)Qd Proof: Since ±Qf{±) induction gives/ n (±)C/ n+1 (±) and it follows that (/n(-L))n is a chain. Hence FIX(/) is well-defined and furthermore FIX(/) = Un / n+1 (-L) since (/n(J_))n and (/ n + 1 (±)) n have the same upper bounds. By continuity of/ we then have /(FIX(/)) = /(Un /"(!)) = Un / n+1 (-L) = FIX(/) Next let f(d)C.d. Since A-Qd it follows by induction that fn(±)C.d so that also Corollary 5.1.8 Under the assumptions of Lemma 5.1.7 we have FlX(f)Qd for every fixed point d of/, that is for every d such that f(d)=d. • Construction 5.1.9 Given partially ordered sets (D,C.) and (£*,E) w e m a v define their function space (D—>E,Q) by
D—>E = { / | / is a continuous function from (D,Q) to (£\E) }
: f(d)Qf'(d))
5.1
Types
113
Since we have already used the notation f:D—>E to indicate that / is a total function from the set D to the set E we must be careful about the newly defined notation in order to avoid confusion. Sometimes this leads to writing [D—>£], D—> CE or (.D,E)—>(£\E) for what we defined as D—>E above. However, the risk of confusion is only slight. To see this recall that the discrete partial order (D,=) only has constant chains, so if (dn)n is a chain in (D,=) then 3d£D: Vn: dn—d. Therefore any total function from D to E will be a continuous function from (D,=) to ( £ , Q . Thus f:D-^E amounts to nothing but fe(Dy=)-*(E,Q. When D and E are partial orders then so is D—>E as denned above. When D and E are cpo's then so is D—>E. The least element LD^E in D—*E is Ad.J_£ where _L# is the least element in E and the formula for least upper bounds of chains is Un/n = Ad. |Jn(/n(<0) (that is, LJn / n is a continuous function and is the point-wise least upper bound of the chain (/ n )n)- When D and E are complete lattices so is D—>E and
UY = \d. UU(d)\feY} is the formula for least upper bounds. Recall that diQd2 is supposed to express that d2 is at least as informative as di and similarly for e x Ce 2 . Monotonicity of functions in D-^E then says that a function cannot retract information already given for less informative arguments. The partial ordering of functions reflects the fact that a function is more informative the more informative the results it produces. — That functions should not only be monotonic but also continuous may be motivated in a similar sort of way and we refer to [93] and [94] for this. • Remark 5.1.10 It is slightly more complicated to describe the compact elements in D—+E, For d£D and e£E the step-function (dy->e) : D—>E is defined by if d'Dd , A 7/ | e dv-*e = \d'. { . ". y J_ otherwise and it is continuous if d is compact. It is a compact element of D^E if e is also compact. The general form of a compact element in D—>E is a least upper bound of a finite subset of such step functions; this means that
n>0 A ViE{l,- • -,n}: dxeBD A e ^ B ^ A en)} is consistent } It makes no harm to take n>0 above as [J0 equals (J_H->J_) and the least element of a cpo is always compact. One can show that if D and E are domains (not merely algebraic cpo's!) then D^E is an algebraic cpo as well as a domain. •
114
5
Parameterized Semantics
A predicate on a cpo D is a total function P : D —• {true,false}. It is an admissible predicate if • P(-L) — true, and • for every chain (d n ) n , if P(dn) = true holds for n>0 then P(\Jdn) = true Then we have the following fixed point principle: Lemma 5.1.11 Let f^D—^D be a continuous function on the cpo D and let P be an admissible predicate on D. If for all
P(d) = true => P{f(d)) = true then P(FIX / ) = true.
•
Proof: Since P is admissible we have P(JL) = true. Induction on n shows that P(/ n (_L)) = true and hence P(FIX / ) = true follows from the admissibility of the predicate P. • Construction 5.1.12 Given a partially ordered set ( D , Q w e n e x t define the partially ordered set (JD°°,E) of potentially infinite lists. To prepare for this we shall say that a set of positive numbers is convex if it equals {1,2,- ••} or if it is of the form {1,2,- • -,n} for some n>0. We shall say that the set has supremum n exactly when it equals {1,2,- • -,n}. Assuming that • is an element not in D we define D°° = {1:K—>D\J{*} | (K is a convex set of positive integers) A (VnGA: /(n)=* => n is the supremum of K) } We shall feel free to write dom(/) = K when l:K—»Z)U{*} and we shall also write dom*(/) = {i£dom(/)|/(i)^*}. To allow for a more convenient notation for the elements of D°° we shall write given by l(i) = [dud2,- • - A ]
for /:{1,2,3,. • -,n+l}->ZXJ{*} given by l(i)=di€D when i
[dud2,. • • ,d n 0
for /:{1,2,. • -,n}-^Z)UW given by l(\)=d\£D when i
For a concrete example consider the potentially infinite lists of booleans, B°°. Here /i = [true,true,- • •] denotes the infinite list of true's, and / 2 =[true,true,true] denotes a finite list of length 3, and finally / 3=[true,true,true<9 denotes the list where the first 3 elements are true but where the remainder of the list is undefined. In Miranda one may define
5.1
Types
115
11 = True:11 12 = True:True:True:[] 13 = True:True:True:14 where 14=14 and then 11 evaluates to /x etc. Next define IU' if and only if ( (dom(/)Cdom(/')) A (VnEdom(/): /(n)U'(n)) ) where /(n)CI/'(n) implies that if one of /(n) or /'(n) is • then so is the other. Intuitively, the partial order on B means that diQd2 amounts to d2 being at least as informative as dx and this carries over to the partial order B°°. In particular, [di,* • *,^n] E [^i?* * '^ml holds if and only if n=m and each d-^d^ so that a finite list is at least as informative as another when they have the same length and each element in one list is at least as informative as the corresponding element in the other. Similarly, [e/i,« • -,dnd Q [d[,- • -,drmd holds if n<m and each diQdi for i
D\j{*}
(Un U ( i ) = U{/ j (i)|dom(/ j )9i} When D is a domain also D°° is and BDoo = { leD°° | (dom(/) is finite) A (Vi£dom(/): /(i) is compact (or •)) } This means that the compact elements in D°° are the finite lists [c^,-• -,dn] and • the partial lists [di,- • -,dnd where each d\ is compact in D. C o n s t r u c t i o n 5.1.13 To prepare for the semantics of expressions we need some auxiliary notation for lists in D°°. For prepending an element rftoa list / we need the function CONSps defined by CONSPS(d,/) = /' where f d if i=l /'(i) = J /(i-1) if i-ledom(/) [ undefined otherwise Here we have identified V:K-*D\J{*} with a partial function /;:{1,2,- • - J ^ D U J i } that is defined on the convex subset K of the positive integers. The empty list is denoted by NILps and is given by
116
5
Parameterized Semantics
NILps = /' where *
if i=l
undefined
otherwise
The first element of the list / is denoted by HDps(/) and is given by f 1(1) i f ( l e d o m ( / ) ) A ( / ( l # * ) _L otherwise
HD
and the remainder of the list / is denoted by TLps(/) and is given by TLps(/) = /' where /(i+1
) undefined
ifi+lGdom(/) otherwise
Thus NILPS=[], HDpS([])=-L/) and TLPS([])=[d. Finally, ISNILPS(/) tests whether or not the list / is empty and is given by
{ 5.1.2
true if /=NILPS false if /^NILps A ledom(/) if /^NILps A l£dom(/) _L
Interpretation of types
We now return to the task of associating domains with well-formed 5-level types of compile-time kind. Our goal will be to define a function [• •-](J) : { t | hi:c } -> { D \ D is a domain } where X provides the interpretation of the run-time function types. The definition of |i] (X) is by induction on the structure of t and will be explained below: [Ai](J) = Ai x[*2](X) [t [t1=±t2](J)
= I(t1=±t2)
The domains Ai are assumed to be fixed once and for all for each index 'i' in the unspecified index set / introduced in Chapter 2. We shall assume that Abooi = B Aim = Z Avoid = Void = {void} x
5.1
Types
117
but otherwise we shall not further enunciate the choices of the A]. For the product type constructor we use the cartesian product introduced in Construction 5.1.5; for the function space type constructor we use the continuous function space introduced in Construction 5.1.9; and for the list type constructor we use the potentially infinite lists introduced in Construction 5.1.12. Finally, the meaning of the run-time functions is left to the parameter X. To ensure the well-definedness of [• • -](X) we must clarify our assumptions about X. We shall say that t is a frontier type if and only if H : r A h£:c; this just means that t is a well-formed run-time function type (and hence of form ti-*t). Definition 5.1.14 An interpretation X of types is a mapping from frontier types to domains: I : { t | HIT A H:c } -> { D \ D is a domain } Proposition 5.1.15 If X is an interpretation of types then [• • -](J) : { t | H:c } -+ { D | D is a domain } is well-defined; this means that [f](X) is a domain whenever H:c.
•
Proof: We prove by structural induction on a type t that if H:c then {t}(X) defines a domain. The case t::=Aj. This is a consequence of the assumptions about the Ai. The case t::=Aj. This is immediate as H:c does not hold. The case t::=tiXt2. struction 5.1.5.
This is a consequence of the induction hypothesis and Con-
The case t::=t\X_t2- This is immediate as h£:c does not hold. The case t::=ti^t2struction 5.1.9.
This is a consequence of the induction hypothesis and Con-
The case t::=ti=*_t2. This is a consequence of the assumption that X is an interpretation of types. The case t::=t0 l i s t . This is a consequence of the induction hypothesis and Construction 5.1.12. The case t::= t0 l i s t . This is immediate as h£:c does not hold.
•
Example interpretations In the remainder of this section we now provide some example interpretations of types.
118
5
Parameterized Semantics
Example 5.1.16 The interpretation S\\\ is defined structurally by Sin(Ai) = A; i2i*2) = S m (i 1 )xS m (f 2 ) l = t i 2 ) = Sm( i ! ) —5- Sin ( *2 )
Sm(*o l i s t ) = S1n(to)00 This is all very natural given the interpretation of the compile-time types above. (The use of the subscript '111' is to indicate that all of 2<_, i± and l i s t are interpreted in a lazy manner.) That this defines a function S m : { t | H : r A H:c } -> { D \ D is a domain } follows much as in the proof of Proposition 5.1.15.
•
To illustrate additional interpretations of types we need to introduce new ways of constructing domains. Construction 5.1.17 Given cpo's (D,C.) and (E,Q) we may define their smash product (D*E,Q) by D*E = { (
}
(d,c) C (d',e') if and only if ((dQd') A (eCe')) This is a cpo with least element (i-,-L) and least upper bounds of chains given by Un (d n ,e n ) = (Un ^n, LJn Cn) The compact elements are BD*E
= { (d,e)eD*E
| (d£BD) A (eeBE)
and D*E is a domain if D and E are.
} •
Example 5.1.18 The smash product of the domain B of booleans and the domain Z of integers may partially be depicted as follows (where t abbreviates true and f abbreviates false):
5.1
Types
119
In B and Z one may read d^d2 as saying that d2 is at least as informative as di] this also holds for B*Z. Compared with B x Z one notes the absence of partly evaluated tuples like (true,J_) or (J-,0). This suggests that * models the notion of product in an eager language like Standard ML whereas X models the notion of product in a lazy language like Miranda. • Construction 5.1.19 Given cpo's (D,Q) and (E,C.) we may define their strict function space (D—> SE,Q) by D—* SE = { / | / is a strict and continuous function from D to E }
/ C / ' if and only if (VdeD: f{d)Qf'{d)) This is a cpo: the least element is Xd.JL and the formula for least upper bounds of chains is as for D^E. If (D,Q) and (E^C.) are domains then so is D^SE and the compact elements are the strict compact elements of D—>E. (A step function (d\—*e) turns out to be strict if e—1. whenever d=_L) Assume next that diQd2 means that d2 is at least as informative as d\ and similarly for eiC.e2. As with the continuous function space this way of reading the partial order also carries over to D-*SE: a function is more informative than another if it produces more informative results than the other. The requirement that functions in D—> SE must be strict may be read as saying that the function cannot produce any information (or value) before it has been supplied with some. Thus the strict function space would seem to model 'functions' that exploit the call-by-value parameter mechanism whereas the continuous function space would seem to model 'functions' that exploit the call-by-name parameter mechanism. This motivates the use of continuous function space for a language like Miranda and strict function space for a language like Standard ML. • Construction 5.1.20 Given a cpo (D,Q) we shall next define the partially ordered set (D*,Q) of finite lists. We adapt the definition of D°° and thus assume that • is an element not in D. We define £>* = { /:{l,-..,n}->DU{*} | (n>0) A (VmG{l,-•-,n}: /(m)^J.) A / p ' if and only if ( (dom(/)Cdom(/ / )) A (Vn£dom(/): /(n)C/ / (n)) ) To allow for a convenient notation we shall write [du- • -,dn] for /:{1,- • ., n given by l(\) = di£D\{*} when i
120
5
Parameterized Semantics
This defines (D*,[I) as a cpo: the least element is _L as indicated above and the formula for least upper bounds of chains is as for D°°. Furthermore,(JD*,C) is a domain if (Z)£) is, and the compact elements are _L as well as the finite lists [e/i,- • -^n] where each d\ is compact. O Example 5.1.21 For a concrete example consider the finite lists B* of truth values. This domain may be depicted as
[true]
[false]
[true,false]
Recall that in B we have d1C.d2 when d2 is at least as informative as d\. This also applies to B* and compared with B°° we note that all lists now have finite length (or are _L). This suggests that (• • •)* models the notion of lists in an eager language like Standard ML whereas (• • -)°° models the notion of lists in a lazy language like Miranda. • Construction 5.1.22 By analogy with the definition of S± in Example 5.1.3 we may define the lifted partially ordered set (Dj_,E) whenever (JD,C) is a partially ordered set. For this we assume that there is an element * not in D and define
d\Zd' if and only if ((<*=•) V ( ( ( # • ) A ( d y * ) A {dQd'))) Writing (D,[Z)± for (D±^) as defined above we note that (*S_i_,E), as defined in Example 5.1.3, amounts to (5,=)_L, where (5,=) is the discrete partial order of Example 5.1.2. It is straightforward to verify that D± is a cpo if D is, that D± is a domain if D is, and that the compact elements are those of D together with *. If cf1Crf2 in D means that d2 is at least as informative as d\ then the same holds for d\Qd2 in D±. The difference is that the least element J_£> of D is now regarded as providing some useful information whereas *, the least element of D±, is the element regarded as providing no useful information whatsoever. • Example 5.1.23 The interpretation S eee is defined structurally by ^eeevAij ~ A-i
S e ^ l X . M = Seee(<1)*Seee(<2) Seee(^lZ±^2) = (S e ee(^l)~ > sS e ee(^2))x Seee(*0 l i g t ) = S eee (« o )*
5.1
Types
121
That this defines a function S e e e • { t \ \~t:r A \~t:c } -^ { D \ D is & domain }
follows much as in the proof of Proposition 5.1.15. The use of the subscript 4eee' is to indicate that all of _x., z± and l i s t are interpreted in an eager manner. We already argued that the use of * for 2£ and (•••)* for l i s t models the behaviour of products and lists in an eager language like Standard ML. We also argued that Seee(ti)—> sSeee(t2) corresponds to 'functions' from Seee(^i) to Seee(t2) that exploit the call-by-value parameter mechanism. But the totally undefined function, that is Ad._L, is still to be regarded as a bona fide data element, and this motivates adding a new least element, that is to use Seee(^iz±^2) = (Seee(^i)— >sSeee(^2))±- In this way the function Xd.A. may be an element in tuples or lists, in accord with what is possible in an eager language like • Standard ML. E x a m p l e 5.1.24 To illustrate that eager and lazy components may be mixed rather freely consider the interpretation Siei defined by Slel(Ai) = A; £*2) = (S W (*l)xSid(*2))jL
Siei(*o l i s t ) = SM(t0)°° Here functions are assumed to be strict corresponding to the parameter mechanism call-by-value as used in an eager language. We have used lifting as well for much the same reasons put forward in Example 5.1.23, for example to allow the totally undefined function to be passed as a bona fide data element to a strict function. Products and lists are modelled by the 'lazy operators' also used in Sm. One difference is that we also use lifting for products; in the presence of strict function spaces this is needed to allow (-L,-L) as a bona fide data element. — We believe that this interpretation of types comes pretty close in modelling Hope [30], which • employs eager evaluation but has lazy data constructors. In the remainder of this book we will mostly be concerned with an interpretation called S. Its effect on types is given by S(Ai) = A; S{hxt2) = (S(t1)xS{t2))± S{tQ l i s t ) = S{to)°° We shall motivate this shortly. But first we note that well-definedness of
122
5
Parameterized Semantics
S : { t | H : r A H : c } -» { D \ D is a domain } may be proved much as in the proof of Proposition 5.1.15. There is little to be said about the use of Aj for Aj. The use of • • • x • • •, • ••—>•••• and • • -°°all suggest that we are modelling a lazy language rather than an eager language. In this respect S is close to S\\\ of Example 5.1.16 and distinctly different from S eee of Example 5.1.23 and S\e\ of Example 5.1.24. However, S differs from Sm in the use of lifting, • - -±, for products and functions much as in Siei and to some extent as in S eee . This means that we may model the difference between _L and (_L,_L) and between J_ and Xd.l. This is going to be vital for our simple-minded code generation of Chapter 6. However, there is some debate as to whether the use of 'lifting' should be present when modelling a lazy functional language. One may read the Miranda manual [103] as saying that there should be no difference between _L and (JL,_L) and between J_ and Xd.l. However, when experimenting with Miranda it is possible to construct examples where a difference seems to arise. Turning to Haskell [39] it would seem that one should distinguish between _L and (-L,_L) but that there is no need to distinguish between _L and Xd.l. Finally, the lazy A-calculus [3] has a clear distinction between _L and (JL,_L) and between _L and Xd.l.. Given the decision to use lifting for run-time products and run-time functions it may very well be asked whether we should also use lifting for compile-time products and compile-time functions. This is a feasible suggestion but in the absence of a clear solution to the above debate we shall stick to the 'more traditional' use of cartesian products and continuous functions without using lifting.
5.2
Expressions
In general an expression e of the mixed A-calculus and combinatory logic will have a type of compile-time kind and may have some free variables whose types are also of compile-time kind. To express this succinctly we write tenv he e : t where tenv is a type environment as in Section 4.1 and t is a £-level type of compiletime kind. From Fact 4.1.4 we see that the type t is uniquely determined by the expression e and the type environment tenv. Much as in Section 4.2 we shall find it convenient to work with a linearised form of type environment. This motivates defining a position environment, penv, to be a list of pairs of variable names and 5-level types. (Compared with Section 4.2 we have dispensed with the binding time component as it is always going to be c.) The type environment, /? corresponding to a position environment, penv, may be defined as follows:
5.2
Expressions
123
if the rightmost (XJ,£J) in penv with xj—Xj has t—t } if no (xj,tj) in penv has xj=Xj
t undefined
Much as in Section 4.2 we shall say that a position environment is well-formed when all 5-level types contained in it are, in fact, well-formed £-level types of compiletime kind. When penv is well-formed all pps(penv)(x[) will be well-formed 5-level types of compile-time kind, and by Fact 4.1.3 the type t will also be a well-formed £-level type of compile-time kind. Semantics of expressions then amounts to defining a function [e]CT)pe», : \penv]{I) -> [«](!) whenever penv is a well-formed position environment and t is the unique type such that pps(penv)\~ce:t. The domain [penv](I) is defined by it penv=() i£penv=penv':{x'J') It follows from Proposition 5.1.15, Construction 5.1.9, Fact 4.1.3 and the welldefinedness of the position environment, that |pent;](J) —> [i](X) is indeed a domain. By stretching our notation a little bit we may summarize the functionality of each [e](X)peTlv as the definition of a function [ • • • ] ( T ) , e » , : { e | pPS(penv)
h c : ( } ^
(\penv](I)^t](l))
In the definition of [e](X) penv below we shall be rather precise about the position environment, penv, but later we shall allow ourselves to dispense with it. Also we shall need to impose stronger demands on the parameter J than merely being an interpretation of types; we shall return to this afterwards. Finally, we shall use the type t below to denote the overall type of the expression between the semantic brackets '[' and ' ] ' . [ *[*] }(l)penv
= Aenv.
[ (ei,e 2 ) I (I) pcnV = Aenv. ([ei](J) pcnv (env), [e2](X)pCnv(env)) [ f s t e ](I)Penv = Aenv. vx where (vi,v 2 ) =
{e](l)penv(env)
I snd e ]{l)penv
[e](l)penv(env)
= Aenv. v 2 where (vi,v 2 ) =
I \xi[t'}.e ](l)penv
= Aenv. Av. [e](I) pcnv:(Xii< /)(env,v)
[ ei(e 2 ) ](I)Penv = Aenv. [eil(I) pcn t,(env) [ xj }(l)Penv = Aenv. 7rPS(xi,penv)(env) where 7rps(xi,pen'y/:(xj,^))((env/,v)) = v if xi=Xj / / 7rps(xi,pen?; )(env ) otherwise
124
5
Parameterized Semantics
[ ex:e2 ](I}penv = Aenv. C0NSPS([ei](J)pent,(env), [e2](T)peBV(env)) I nil[t] \{l)penv
= Aenv. NILPS
I hd e ](J)Pen» = Aenv. HD PS([[eJ(Z)j,en,,(env)) [ t l e ](X)penv = Aenv. TLPS([e](J)pens(env)) [ i s n i l e }(l)penv = Aenv. lSNlLPS([e](Z),eBV(env)) [ true \{l)ptnv
= Aenv. true
[ false }(J)Penv = Aenv. false [ if ei then e2 else e3 ](J)pen« = [e2](I)Peni;(env) if v=true [e3i(I)peBV(env) if v=false ± if v = l where v = [e!](I)pent,(env)
{
[ fix e }{l)penv = Aenv. J(fix[«])([c](J), e,,(env)) [ Fi[«] ]{l),env = Aenv. Tuple( ei ,e 2 ) ](I), e », = Aenv.
([e 2 ](I) pe ,,(env))
, e », = Aenv. I(Fst[t']) ^ ](!),,„ = Aenv. J(Snd[f]) [ Curry e }(J)penv = Aenv. J(Curry[i]) ([c](J),eBV(env)) e».
- Aenv.
](T)pnv = Aenv. !(•[
env(env))
5.2
Expressions
125
TYPE(f) (to^t1)^{t0^t2)-^(to=tt1xt2) (hxt2)=±h SiLd[tiXt2] {tiXt2)z±t2 Cons[in->(£i l i s t ) ] (*o->*i)->(*n->(*i list)H(* n -K*i l i s t ) ) Nil[*n->(*i l i s t ) ] tn-*(U l i s t ) ff l i s t W Hd[*] ff listWft list) Tl[i] Cf listWBool Isnil[<] t *iW to=th Fi[Bool True[i] f^Bool False[<] Cond[£0—>£i] «n^B00l)->ff
Tuple^on^iX.^] Fst[ii_x*2]
Table 5.1: Operators and their actual types [ True[t'} }{l)penv = Aenv. [ False[i'] ](I) pe ». = Aenv. I(False[t']) [ Cond(e1,e2,e3) J(J), e », = Aenv. J(Cond[<]) ([e 3 ](r) pe »,(env)) [ Fix[i']
= Aenv. J(Fix[i'])
The definition of [• • •](X)pCnv i s mostly straightforward. Several of the clauses demand that I interprets certain 'operators' (like f\ or Cond). We shall therefore define a notion of 'an interpretation of operators' and demand that 1 is an interpretation of operators (as well as of types). The 'operators' of concern are those candidate <£'s that are listed in Table 5.1. Each operator (f) includes a type component in square brackets and generally this is the minimal amount of information needed to define the 'type', TYPE(>), of the operator. One may note from the definition of TYPE and [• • -\(T)venv that operators are regarded as being curried (e.g. Cond). In the examples we shall often dispense with the type component of operators as an aid to readability. Finally, the semantic clause for variables makes use of an auxiliary function ?rps for locating a value in an environment.
126
5
Parameterized Semantics
Example 5.2.1 Consider the expression fix(Af[lnt->(Int list)].Cons(ld[lnt],f)) Given an interpretation X and an empty position environment () the meaning [fix(Af[Int-»(Int list)].Cons(Id[lnt] ? f))](J) () will be an element of the domain Void -> J(Int-»(lnt l i s t ) ) It is given by Av.J(fix[])(Af.J(Cons)(J(Id[]))(f)) where we have omitted the type information from the square brackets.
•
Definition 5.2.2 An interpretation X of operators (or just interpretation) is an interpretation X of types together with a designated element
for every operator , such that TYPE(^) is defined according to the definitions given in Table 5.1 and such that TYPE(^) is a well-formed 2-\eve\ type of compiletime kind, i.e. hTYPE((/>):c. A C-restricted interpretation of operators is defined analogously except that X(fj[£]) and X(F-1[tf=±t]) only need to be defined if t is a iMevel instance of C(fi). • Proposition 5.2.3 If X is an interpretation, penv is a well-formed position environment and h e: t then the clauses for [e](I) pcnv define a continuous function from [penv](X) to
If pps{penv)hje'-t w e need only assume that X is a C-restricted interpretation rather than a general interpretation. • Proof: The proof is by structural induction on expressions and makes use of Proposition 5.1.15. It amounts to showing that: (i) each clause for [• • '](X)penv defines an element [e](J) periv (env) of [f](2") when env is an element of [penv](I);
5.2
Expressions
127
(ii) the interpretation J is never applied to an operator <j> such that Table 5.1 contains no definition of TYPE(<^) or such that TYPE(>) is not a well-formed i?-level type of compile-time kind; (iii) |e](X) penv (env) depends continuously on env. The latter involves showing the continuity of the functions CONSps, HDps, TLps and ISNILps introduced in Construction 5.1.13. As the proof is largely routine we shall dispense with the details. • E x a m p l e interpretations We next illustrate this set-up by defining some example interpretations. These all have the flavour of so-called 'standard interpretations' in that they prescribe the input-output semantics. This is quite unlike the 'non-standard interpretations' to be encountered in Chapters 6, 7 and 8 where we will study code generation and abstract interpretation. E x a m p l e 5.2.4 We begin by extending the interpretation Sm of types (Example 5.1.16) to an interpretation of operators. When considering an operator <j> with type component [t] we shall assume that t is such that TYPE(>) is defined and is a well-formed 2-leve\ type of compile-time kind. * 0 ^iX_*2]) = A/!.A/2.Av. (/i(v),/ 2 (v)) Sui(Fst[iiX_t2]) = Av. v i where (vi,v 2 ) = v Sui(Snd[^i><_<2]) — Av. V2 where (v!,v 2 ) = v Sm(Cons[i0z±(«i l i s t ) ] ) = A/i.A/ 2.Av. C O N S P S ( / I ( V ) , / 2 ( V ) ) S m (Nil[i O z±*i l i s t ] ) = Av. NILPS Sm(Hd[<]) = Av. HDPS(V) Sm(Tl[«]) = Av. TLpg(v) Sm(Isnil[<]) = Av. iSNlLps(v) Siu(fi[*]) = f/ where f/ are unspecified elements in [£](Sm) Sni(Fi[<0=±*i]) = F* 0 -' 1 where F;0—
5
are unspecified elements in [tn—>tt](Sm)
Sin(a[(i 1= y* 2 )x(«o=t
128
5
Parameterized Semantics
S m (False[i]) = Av. false / 2 (v) / 3 (v)
±
if/!(v)=true if/ 1 (v)=false
if/i(v)=X
Sm(f ix[t]) = FIX where FIX(F) = \_\n Fn(±) Sm(Curry[to^lt1=±t2)])
= A/.Avi.Av2. /(v) where v = (v!,v 2 )
S m (Apply[i 0 ^i]) = Aw. / ( v ) where (/,v) = w S m (Fix[i]) = A/. FIX(/) It is straightforward, but tedious, to verify that this defines an interpretation of operators. If we intend to define Sm as a C-restricted interpretation, J ( f \[t]) need only be defined when t is a £-level instance of C(f\) and X(Fi[^Oz±^i]) need only be defined when t\ is a £-level instance of C(f\). Using the interpretation Sm the meaning of the expression fix(Af[lnt-+(lnt list)].Cons(ld[lnt].f)) of Example 5.2.1 is an element of Void -* (Z -> Z°°) and it is given by Av.FIX(Af.Au.coNSPS(u,f(u))) which is equivalent to Av.Au.[u,u,- • •] Applying this function to a dummy argument and the integer 1 will thus produce an infinite list of l's. Q To complete the definition of the interpretation S of Section 5.1 we need the auxiliary functions up:D—^D± and dn:D±—*D defined by up(d) = d d
if <#*
l
Here we have used the definition Dj_=DU{*} from Construction 5.1.22; had we used the definition D±=({1}xD)U({0}X{*}) we would have had up(d) = (l,d), dn((l,d)) = d and d?i((0,*)) = J_ instead. We then have the following definitions (leaving an explanation of S(f ix[£]) until afterwards):
5.2
Expressions
129
oi±*i.x*2]) = A/i.A/ 2 . if/i ^ otherwise S(Fst[£i><_£2]) = up(Xv. Vi where (v 1? v 2 ) = S(Snd[<121<2]) = wp(Av. v 2 where (v!,v 2 ) = dn(v)) S(Cons[i 0 =t(*i l i s t ) ] ) = A/x.A/2. up(Av.CONSPS(dn(/i)(v), -L
otherwise
S(Nil[
= A/j.A/2.
uP(Xv.dn(f1)(dn(f2)(v))) if A ^ l a n d / ± otherwise
2
^l
S(ld[i]) = up(Av.v) S(True[<]) = «p(Av. true) S(False[i]) = up(Av. false) S(Cond[<0=>iii]) = A/1.A/2.A/3. d«(/2)(v) if rfn(/ 1)(v)=true dn(/ 3 )(v) if dn(/ 1 )(v)=false ) ± if dn(/,)(v)=± otherwise S(f ix[t]) = FIX where FIX(F) = Un if t is pure (see below)
Fn(l)
130
5
^t2])
Parameterized Semantics
= XF. Un>i
S(fix[
where
F x =S(f ix[<1])(Av1.w1 where (w 1 ,w 2 )=F((v 1 ,F 2 (v 1 )))) F 2 =Av 1 .S(fix[i 2 ])(Av 2 .w 2 where (wi,w 2 )=F((vi,v 2 ))) and t\ x i 2 is composite but not pure (see below) S(Curry[*0=»:(*i=»*2)]) = Mup(\vi. wp(Av2. dn(f)(v) where v = wp(v1?v2))) if/^J. _L otherwise ^Oz±^i]) = up(Xw. dn(f)(v) where (/,v) = dn(w)) S(Fix[*]) = up(Xf.
FlX(dn(f)))
where FIX(F) - Un ^ Apart from the definition of S(f ix[£]) it is straightforward but tedious to verify that this is indeed an interpretation. We shall return to the well-definedness of S(f ix[*]) shortly. Example 5.2.5 To illustrate the difference between Sm and S consider the expression Tuple(Fst[lntj<_Int] ?Snd[lntj<_Int]) It has type Intj^Int —> Intj^Int and thus Sm will produce an element of the domain Void -> (ZxZ -> ZxZ) and S one of Void -> ((ZxZ) ± -> ( Z x Z ) ± ) ± . In the case of Sm it is Av.Au.u so whenever the void environment is supplied this is the identity function. In the case of S we get \v.up(\u.up(dn u)) Note that this does not correspond to the identity function once the void environment is supplied. Experiments with Miranda show that Tuple(Fst[ ],Snd[ ]) does indeed behave differently from the identity function. To be explicit consider the Miranda definitions idl v = v id2 v = ( f s t v, snd v) bot = bot
5.2
Expressions
131
Then the behaviours of i d l bot and id2 bot are slightly different. Since similar phenomena arise in Chapter 6 we decide to concentrate on interpretation S rather than interpretation Sm, but as discussed in Section 5.1 this decision may be subject to debate as far as Miranda is concerned. Finally we note that a similar example can be constructed using Curry and Apply to show that S and Sm differ here as well, and again S is close to the way the Miranda implementation behaves. • The definition of S(f ix[f]) requires some clarification. The most natural thing would be to set S(f ix[t])(F) = FIX(F)
(*)
for all types t. However, we have arranged that operators are in fact strict in their arguments, e.g. S(Tuple)(J_)(/)=_L So when t is a frontier type it is very likely that F _L = _L which means that FIX(F) would always be _L This relates to the decision to use
and to distinguish between the least element, _L, and the second least element, ^p(JL), where the latter corresponds to the 'undefined' function that may freely be passed around but that never terminates when called. Similar distinctions will be made in Chapter 6 when specifying code generation. This motivates retaining (jj) but to be more careful about (*). In particular, S(f ix[t1=±t2])(F)
= UnM
Fn(up(±))
would seem to give the required effect. If F(up(l)) = _L then F(JL) = _L by monotonicity so S(fi-x.[t1=±t2])(F) is defined and equals _L. If F(up(l.)) ^ _L then S(fix[t1=±t2])(F) F(up(±)) D up(±) so (F n (^(_L))) n >! is a chain and therefore is well-defined. This leaves us with the remaining types. If no frontier types are involved we should still be able to make do with (•). To make this precise we define the predicate pure by pure(Aj) holds for all A\ if pure(^i) and pure(£2) then pme(tiXt2) if pure(^) and pure(£2) then pure^—>£ 2) if pure(£) then pure(£ l i s t ) and note that pure(f) would seem to formalise the notion of t not involving any frontier types. We thus define S(fix[t]) - FIX if pure(t)
132
5
Parameterized Semantics
and clearly this is well-defined. We are left with the more complex combinations of pure and frontier types. As a simple example, consider t = (^iz±^) x (*iz±*2)- Here it would be natural to define S(f ix[£])(F) to be |Jn Fn(up(±.),up(A.)). However, well-definedness is a problem. To see this let d be such that d^up(l) and d^up(-L). Then it is conceivable that F(up(±),up(±.)) = (d,-L), F(d,±) = (_L,d), F(±,d) = (d,J_) and F(d,d) ^ (d,d). This means that no fixed point arises 1. We will thus have to take a different route and for this we shall be content with only considering cartesian products; the reason is that lists and function spaces pose some problems when we come to code generation (as discussed in [72]). We then say that a type t is composite if • it is pure, or • it has the form £iz±^2> ov • it is a product of composite types. In this case we define
S(f ixlhxh})
= XF.(F1,F2(F1))
where F1=S(fix[t1])(\v1.w1
where (w 1 ,w 2 )=F((v 1 ,F 2 (v 1 ))))
F 2 =A V l .S(f ix[« 2])(Av2.w2 where (w 1 ,w 2 )=F((v 1 ,v 2 ))) Clearly we have a well-defined definition. To show that it is sensible we note Lemma 5.2.6 If F : DxE -> DxE and • Fi is a fixed point of (Avi.Wi where (wi,w 2 )=F((vi,F 2 (vi)))) • ^ ( v i ) is a fixed point of (Av2.w 2 where (wi,w 2 )=F((vi,v 2 ))) for all values of then • (F 1 ,F 2 (F 1 )) is a fixed point of F
•
Proof: It is convenient to write
Alternatively one might reconsider the use of cartesian product for x and use smash product instead. (In a similar vein one might use lifted strict and bottom-reflecting functions for —*.) However, the use of smash product is not to be recommended for pure components so this approach would lead to a rather complex mixture of smash and cartesian product.
5.3
Programs
133
for wj where (wi,w 2)=* • •
The assumptions then yield that F1 = F(FUF2(F1))H from which the desired result (F1,F2{F1)) = F(F1,F2(F1)) •
easily follows. This result is adapted after the following much more well-known result. Lemma 5.2.7 ('Bekic's Theorem') If F : DxE -> DxE then FIX(F) = (F 1 ,F 2 (F 1 )) where Fi=FIX(Avi.W! where (wi,w 2 )=F((v 1 ,F 2 (vi)))) F2=Avi.FIX(Av2.w2 where (w1,w2)=F((v1,v2))) Proof: We shall leave the proof to Exercise 17.
• •
The restriction to composite types t when interpreting S(f ix[i]) means that technically we have not provided an interpretation of operators. To amend this we modify rule [fix] of Table 4.2 and use [f ixK]
tenv h e : t—*t
if t is composite tenv h fix e : t We shall write h for the well-formedness relation h when [f ix K] is used instead of [fix]. We shall thus be content with only defining an interpretation of operators for the designated subset of the mixed A-calculus and combinatory logic.
5.3
Programs
To define the semantics of programs we shall once more use the relationship between a program DEF x1 = e1 - • • DEF x n =e n VAL e0 HAS t
134
5
Parameterized Semantics
and a nested A-expression
that has been used in the definitions of V%A, ' P ^ a n d Vci- We thus define [ DEF x1 = e1 • • • DEF x n =e n VAL e0 HAS t ](I) G [t}(I) whenever h DEF x1 = e1 • • • DEF x n =e n VAL e0 HAS < by [ DEF xi = ei • • • DEF x n =e n VAL e0 HAS t ](I) = let 0he!:^i determine ^ let 0[*i/xi]- • '[tn^i/xn^i]hen:tn determine tn in [ (Axi[^i].« • •(Ax n [^ n ].e 0 )(e n )-•-)(ei) ](J)(j(void) where void is the unique element of Void that is not _L Theorem 5.3.1 If 2 is an interpretation and peP{CE2,T2) program (that is \~p) of the form
is a well-formed
DEF X! = ei • • • DEF x n =e n VAL e0 HAS t the above equation defines e If \~cp we need only assume that J is a C-restricted interpretation. Proof: The result is straightforward from Propositions 5.2.3 and 5.1.15. Example 5.3.2 Consider the program suiting defined in Section 4.3: VAL f ix(Ag[ ].Cond(lsnil[ ] • Snd[ ], Fst[ ], +[ ] • Tuple(Hd[ ] • Snd[ ], gDTuple(Fst[],Tl[] • • Tuple(Zero[ ], Id[ ]) HAS Int l i s t z± Int The semantics in the lazy interpretation Sm amounts to
I
,,,,
/
11/0
\
= XL FIX(G)(O,/) v if /=[] where G(g)(v,l) — ^ v' if i. if/=±
• •
5.3
Programs
135
and v' abbreviates PLUS(HDpS(/),#(v,TLpS(/)))
{
-+vn
if (l=[vw • ;vn\) A (Vi: v
JL if (/=[«!,- •-,»„]) A (3i: _L if/=[»!,---,i; na -L if/=[»!,•••] assuming t h a t Sm(+) = Av.PLUSif a n d
PLUS(, ^ v 1,,2)y = I ] [ J_
Vi=±)
' ("i^)A("^-L) otherwise
On the role of semantics In the previous chapters we have presented type inference, binding time analysis and combinator introduction as purely syntactic manipulations of programs. We have only referred to a vague and unspecified notion of semantics when considering program transformations and the extent to which they are correctness-preserving. Having introduced parameterized semantics and the 'standard interpretation' S, we can now take at least two approaches. One approach is to view the parameterized semantics as the ultimate semantics for the original enriched A-calculus of Chapter 2. Taking sum as an example we thus have to • transform it into an explicitly typed program, • then into a program that is annotated with explicit binding time information, and • finally into combinator form. Calling the resulting program sumc we thus define the semantics of sum to be the parameterized semantics of sumc. We can then express the correctness of the program transformations, partial evaluations and algebraic transformations as the condition that they do not change the final semantics regardless of the 'standard interpretation' used. Another approach is to give a direct definition of the semantics of the enriched A-calculus. Then one must ensure that the semantics of an untyped program (like sum) corresponds to the parameterized semantics of the transformed program (that is sumc) with respect to some 'standard interpretation'. We mentioned some possibilities in the Bibliographical Notes of Chapters 2, 3 and 4. However, we shall not pursue this approach since our ultimate interest is in non-standard interpretations which cannot be defined directly in terms of the untyped A-calculus.
136
5 Parameterized Semantics
Bibliographical Notes Standard references to domain theory include [93], [94] and [98] and text books on the subject include [96] and [91]. A less detailed account may be found in [83] but it should suffice for understanding the material of this chapter. The notion of parameterized semantics used here is based on [72] and [78]. A brief appraisal of the material of this chapter, with a view towards Chapter 7, may be found in [70].
Exercises 1. Prove the claims made about (Pu;,C) and (Po;,D) in Examples 5.1.1 and 5.1.4. 2. Prove that the cartesian product x of Construction 5.1.5 preserves the following properties: is a cpo; is a complete lattice; is a cpo of finite height; is a domain. (That is prove that if D and E are cpo's then so is DxE; that if D and E are complete lattices then so is DxE, etc.) 3. Prove that every continuous function is monotonic. (Hint: if dC-d' then {d,d1} is a chain.) 4. Prove that if/:(£),£)—>(£",C) is a monotonic function, and (D,E) and (E,Q) are cpo's, then / is continuous provided that
/(Un 4 ) C|J holds for all chains (dn)n of D. 5. Show that a chain in a partially ordered set D amounts to a monotonic function from (u;,<) to Z), where u;={0,l,- • •} is the set of natural numbers and < is the usual order relation for natural numbers. 6. Prove that the continuous function space —> of Construction 5.1.9 preserves the following properties: is a cpo; is a complete lattice; is a finite cpo. 7. (*) Prove that D-+E is a domain if D and E are. 8. (*) Prove that D°°is a cpo when D is, and that Z?°°is a domain when D is. 9. Prove that the smash product * of Construction 5.1.17 preserves the following properties: is a cpo; is a complete lattice; is aflat cpo; is a cpo of finite height; is a domain. 10. Prove that the strict function space —» s of Construction 5.1.19 preserves the following properties: is a cpo; is a complete lattice; is a finite cpo.
5.3
Programs
137
11. (*) Prove that D—> SE is a domain if D and E are. 12. Prove that the finite lists (• • •)* of Construction 5.1.20 preserves the following properties: is a cpo; is a flat cpo; is a cpo of finite height; is a domain. 13. Prove that the lifting (• • -)j_ of Construction 5.1.22 preserves the following properties: is a cpo; is a complete lattice; is a cpo of finite height; is a domain. 14. Define (at least partially) S eee as an interpretation of operators. 15. Define (at least partially) Siei as an interpretation of operators. 16. Repeat the development of Chapter 5 so as to interpret compile-time constructs in the manner of interpretation S (rather than in the manner of interpretation Sm). 17. Prove 'Bekic's Theorem' (Lemma 5.2.7). (Hint: consult [72] or [9].) 18. (*) When changing [fix] of Table 4.2 to [f ix K ] we may want also to change [fix] of Table 3.5 in a similar way. Investigate whether or not this influences the algorithm for combinator introduction as developed in Chapter 4. When changing [fix] of Table 3.5 we might consider changing [fix] of Table 2.3. Try to show that this is not necessary by modifying the binding time analysis as developed in Chapter 3.
Chapter 6 Code Generation The previous chapter developed the notion of parameterized semantics for the mixed A-calculus and combinatory logic. This was applied to showing that the run-time part of the language could be equipped with various mixtures of lazy and eager features. In this chapter we shall stick to one of these: the lazy semantics S. The power of parameterized semantics will then be used to specify code that describes how to compute the results specified by the lazy semantics. The abstract machine and the code generation are both developed in Section 6.1 as it is hard to understand the details of the instructions in the abstract machine without some knowledge of how they are used for code generation and vice versa. The abstract machine is a variant of the categorical abstract machine and its semantics is formulated as a transition system on configurations consisting of a code component and a stack of values. The code generation is specified as an interpretation K in the sense of Chapter 5. The remainder of the chapter is devoted to demonstrating the correctness of the code generation, K, with respect to the lazy semantics, S. To cut down on the overall length of the proof we shall exclude lists from our consideration. Section 6.2 then begins by showing that the code generation function behaves in a way that admits substitution. Next, Section 6.3 shows that the code generated is 'wellbehaved' in that it operates in a stack-like manner. This is a useful preparation for Section 6.4 where it is shown that the results produced by the code agree with the semantics.
6,1
The Coding Interpretation
The configurations of the abstract machine have the form (C,ST) where C is the sequence of instructions to be executed and ST is the stack of intermediate results. The instructions, values and configurations of the machine are summarized in Table 6.1 and will be explained in the remainder of this section. We shall write i\C for the code sequence with i as its first instruction and C as the remaining code 139
140
6
Code Generation
/ G Labels = N i G Ins i ::= CONST bj | PRIM OY | ENTER | SWITCH | BRANCH( C , C ) TUPLE | FST | SND | CONS | NIL | HD | TL | ISNIL | C U R R Y ( C ) I APPLY | DELAY(C) | RESUME I
CALLREC(/,C) | CALL / | REC C G Code = { i1:i2:- • -li^ie | 0 < k , i^elns for l < j < k } v G Val t ; : : = b i | (v,v) \ [} \ [v:v] \ {C;v} \ {C,v}
ST e Stack = { [vu- • .,vn] | 0} eVal for l < j < n } Config = { {C,ST) \ C G Code, ST G Stack } Table 6.1: Configurations of the abstract machine sequence and similarly, we write v:ST for the stack with v as its top element and ST as the remaining stack. The concatenation of two code fragments C\ and C^ is written C{C2 and similarly for stacks. Finally, the code sequence zx:z2:- • -'.i^ie is written ii'.i^-' - -:ik if k>0. The transition relation is a binary relation —> on configurations. The intuition is that
(C,ST) -> {C\SV) means that the first instruction of C is executed and changes the configuration from (C,ST) to (C',STf). The definition of the relation is summarized in Table 6.2 and will be explained in the remainder of this section. We shall mostly be interested in finite execution sequences which are sequences, A, of the form
((C o ,57'o),(C7 1 ,5r 1 ),...,((7 m ,5r in )) where • (Ci,STi) —> (Ci+i,5jTi+i) for all i<m, and • there is no configuration (C,ST) such that (C m ,£T m ) -> (C,ST). To aid readability we often write A in the form (Co,STo) - ( C L S T X ) ->
> (Cm,STm)
and we shall write A(i) for (Cj,5Ti), and if 0
6.1
The Coding Interpretation
(CONST
141
bi-.C, v.ST) -f (C, ty.ST)
(PRIM O,:C, V.ST) -> (C, 6{(v):ST) if o;(v) is defined (PRIM O\:C, V.ST) —> (PRIM O\:C, V.ST) if b-x(v) is not defined (ENTER:C, V.ST) -* (C, v:u: (SWITCH:C, V1.V2-.ST) -» (C, » (BRANCH(Ci,C2):C, true:ST) (BRANCH(C1,C2):C, false:5T) -» (C 2AC, 571) (TUPLE:C, w i: t; 2 :57) -» (C, (FST:C,
( Wl ,t; 2 ):57) - . (C, W
(SND:C, (n,t; 2 ):57) -» (C, (CONS:C, vx:v2:ST) -» (C, [ (NIL:C, v:57) -» (C, [ (HD:C, [ » I : » 2 ] : 5 7 ) -» (C, (HD:C, []:57) -* (HD:C, [ (TL:C, K:t; 2 ]:57) ^ (C,« 2 :57) (TL:C, []:57) -» (TL:C, []:57) (ISNIL:C, [w1:w2]:57) - • (C, false:5T) (ISNIL:C, []:57) -» (C, true:5T) ( C U R R Y ( C ' ) : C , v:57)
-» (C,
{C';
(APPLY:C, {C';» 1 }:» 2 :57) -+ (C, (DELAY(C"):C, »:57) -> (C, {C",
(RESUME:C, {C',v}:5T) -» (C""RESUME"C, W: (RESUME:C, v : 5 r ) -> (C, v:5T)
otherwise
(CALLREC(/,(7'):C, 5 7 ) -» (C /[CALLREC(/,C/)//TC. (REC:C, { C » : 5 T ) -> (C, where v' = {REC:RESUME,{C";u}} Table 6.2: Transition relation for the abstract machine free to write # A for m and to call this the length of A even though there are in fact m+1 configurations in the sequence A. Later we shall be interested in infinite execution sequences which are sequences,
142
6
Code Generation
A, of the form ((Co,5To),(C1,5T1),...,(Cm,5rm),..-) where • ( C i , 5 T 0 - > (C i + 1 ,5T i + 1 ) for alii. To aid readability we often write A in the form (Co,STo) -> {C and we shall write A(i) for (C-^STi), and if 0
=
{ A | A is a finite execution sequence }
ExSeq(m)
-
{ A G ExSeq(*) | # A - m }
ExSeq(u;)
=
{ A | A is an infinite execution sequence }
ExSeq(oo)
=
ExSeq(*) U ExSeq(io)
Often it is convenient with notation for the execution sequences that start with a specific code component or stack component. To this end we define ExSeq(£,C)
=
{ A e ExSeqO?) | 3ST: A(0) - (C,ST) }
ExSeq(AC»
=
{ A e ExSeq(AC) | A(0) - (C,[u]) }
where £ is any one of *, m, LO or oo. In later developments we shall exploit that the machine defined by Table 6.2 is in fact deterministic: Fact 6.1.1 For all code sequences C G Code and values v £ Va/, the set ExSeq(oo,C,i;) is a singleton; this means that there is precisely one (finite or infinite) execution sequence A with A(0) = (C,[t>]). •
6.1
The Coding Interpretation
6.1.1
143
First order aspects of the abstract machine
The machine has an unspecified set of base values bj (iG/). We shall assume that we have base values b t r u e and bfaise, written true and false, for the booleans and b 0 , bi, b_!, • • •, written 0, 1, — 1, • • •, for the integers. The base values are used to build composite values like • pairs of the form (^1,^2), • lists which may be empty, written [], or non-empty, written [^1:^2] where V\ is the head and v2 the tail of the list, • closures of the form {C',v} where C is a code sequence — these values represent functions and will be discussed in Subsection 6.1.4, and • thunks of the form {C,v} where C is a code sequence — these values represent the eventual outcome of executing C on a stack with v on top; however, the actual execution has been postponed. We shall assume that the representation of the various values in the machine allow to determine whether a value is a base value, a pair, a list, a closure or a thunk. In particular, if bi = bj then i^j and no bj has the form (• • •,• • •), [], [• • •:• • •], {...-,..}
ov {•••,••}.
For each base value b, the machine has an instruction CONST bj that pops the stack and pushes the value b\ on top of it. Furthermore, there is an unspecified set of primitive operations ox (i G / ) . An example is o+, written +, and its semantics is given by + (v) = ni + n2 if v = (^i,n 2 ) and % and n2 are integers. Note that if the top of the stack does not have the expected form then PRIM o\ will cause the machine to enter a loop (rather than simply stop). The machine has instructions for building composite values on the stack and for taking them apart. As an example TUPLE builds a pair from two values on the stack and FST and SND extract components from a pair. Similar remarks hold for CONS, HD and TL with the addition that if the stack top is the special value [] then the machine will loop rather than simply stop. The instruction NIL replaces the stack top with [] and ISNIL tests the form of the stack top. In addition to this there are a few general instructions: ENTER and SWITCH rearrange the stack and BRANCH(Ci,C2) is a conditional whose outcome depends on the value on top of the stack. The instructions CALLREC(/,C), CALL / and REC handle recursion and will be discussed in Subsections 6.1.3 and 6.1.4. The instructions CURRY(C) and APPLY cope with closures and will be discussed in Subsection 6.1.4. The thunks are central for achieving laziness in that they represent computations that are postponed. The instruction DELAY(C) will construct a thunk from
144
6 Code Generation
the code C and the value on top of the stack. The postponed computation may be initiated by the RESUME instruction and it will repeat itself until the postponed computation eventually terminates with a 4non-thunk' value. This is illustrated in the following example. Example 6.1.2 The instruction PRIM + will add the two components of a pair of integers on top of the stack. However, the addition may appear in a context where the top of the stack is a thunk, for example {ENTER:CONST 1:TUPLE,7}, and the machine will enter a loop when started in the configuration (PRIM +, [{ENTERlCONST 1:TUPLE,7}]). To overcome this we will have to use the RESUME instruction to evaluate the thunk and we obtain the finite execution sequence (RESUME:PRIM +, [{ENTER:CONST 1:TUPLE,7}]) —> (ENTERiCONST 1:TUPLE:RESUME:PRIM +, [7]) -» (CONST 1:TUPLE:RESUME:PRIM +, [7,7]) -> (TUPLE:RESUME:PRIM +, [1,7]) —> (RESUME:PRIM +, [(1,7)]) -> (PRIM + , [(1,7)]) -> In general, the top of the stack may be a thunk that evaluates to a pair of thunks and each component must then be evaluated before the addition can take place. So the general code for addition will be RESUME:ENTER:SND:RESUME:SWITCH:FST:RESUME:TUPLE:PRIM +
6A.2
First order aspects of the coding interpretation
We shall follow the approach of Chapter 5 and specify the code generation as an interpretation K. At first sight one may expect the type part of K to have K(£i z± £2) = Codex,forall frontier types t\ z± ^2 because this reflects the generation of code for the computations to be performed at run-time. However, when coming to recursion we have to generate relocatable code in order to get fresh labels. Instead of Code± we shall therefore use the domain RelCode = N -> {Code±) where N is the set of natural numbers or the discrete partial order (iV,=), and where the ordering C is defined by
6.1
The Coding Interpretation
145
K(Tuple[*Oi±tiX*2]) = XRCx.XRC2.Xd. ENTER:DELAY(/?C2(rf)):SWITCH:DELAY(iiCi(cf)):TUPLE
_L
iiRd(d) + _L and i?C2(d) + JL otherwise
i2i'2]) = Ad.RESUME:FST:RESUME i21*2]) = Ad.RESUME:SND:RESUME
K(Cons[^oz±*ilist]) = XRCx.XRC2.Xd. ENTER:DELAY(JRCf2(rf)):SWITCH:DELAY(i?Ci(cf)):CONS
if Rd{d) ^ ± and RC2(d) ^ ±
JL
otherwise
K(Nil[fn-+filist]) = Arf.NIL K(Hd[i]) = Ad.RESUME:HD:RESUME K(Tl[t]) = Ad.RESUME:TL:RESUME K ( l s n i l [ < ] ) = Arf.RESUME:ISNIL
Table 6.3: The coding interpretation K (part 1) RCi C RC2 if and only if \/deN: Rd(d)=±
V RCi{d)=RC2(d)
To see that RelCode is indeed a domain it suffices to observe that it is isomorphic to (N±) —> s (Code±) which is a domain (Construction 5.1.19). We shall then define K(*i =± t2) = RelCode for all frontier types t\ z± hThe interpretation of the operators is specified in Tables 6.3, 6.4 and 6.5 and will be explained in the remainder of this section. The code generation will obey the following rules: A: The code makes no assumptions about whether the initial value on top of the stack is a thunk or not. B: If the execution of the code terminates then the top of the stack will never be a thunk, and except for the top value, the stacks in the initial and final configurations will be the same. In Section 6.3 we shall formally prove that this is indeed the case. But for now we turn towards explaining the code generation. First consider the clause for = Ad.RESUME:FST:RESUME
146
6
Code Generation
where f/ are unspecified elements in [ t](K) K(Fi[tOz±ii]) - Ad. F ' 0 - ' 1 where Fj0— * are unspecified elements in Code±_
K(n[(t1=tt2)x(t0=±t1)]) = XRd.XRC2.Xd. ( DELAY(RC2{d)):(Rd{d)) I \{Rd{d) ^ ± and RC2(d) i-- ± \ J_ otherwise K(ld[^]) = Ad.RESUME
K(True[t]) = Ad.CONST true K(False[t]) = Ad.CONST false K(Cond[t0z±*i]) =
XRCx.XRC2.XRCz.Xd.
( ENTER:(jSCi(d)):BRANCH( JRCf2(d),RC3(d))
I
{ J_
iiRd(d) ^ i., RC2{d) ^ 1 and RC3(d) £ ±
otherwise
Table 6.4: The coding interpretation K (part 2) To fulfill condition A we first execute a RESUME instruction. If necessary this will replace the top of the stack with a value that is not a thunk. If it is a pair we can execute FST to select its first component. This may or may not be a thunk and in order to fulfill condition B we execute yet another RESUME instruction. The code generated for Snd, Hd and Tl is similar. In the case of I s n i l we can dispense with the last RESUME instruction because the instruction ISNIL never produces a thunk as result. In the case of True, False and Nil we must dispense with the initial RESUME instruction in order to behave in a lazy way and we can dispense with the last RESUME instruction because CONST true, CONST false and NIL never leave a thunk on top of the stack. Also note that the code generated for Id is a single RESUME instruction so that condition B will be fulfilled. The interpretation of Tuple, • , Cons and Cond shows how code fragments supplied as parameters may be composed to produce new code fragments. Consider the clause for Tuple: [*Oz±*i2i*2]) =
XRd.XRC2.Xd.
ENTER:DELAY(i2C2(d)):SWITCH:DELAY(i?(7i(d)):TUPLE
if Rd(d)^ _L
otherwise
± and RC2(d) ^ J_
First note that we only generate 'proper' code if RCi(d) and RC2(d) are indeed code fragments. The DELAY instructions are used because RC\(d) and RC2(d)
6.1
The Coding Interpretation
147
K(f ix[t}) = FIX where FIX(H) = Un Hn{±) if t is pure K(f ix[ti=±t2]) = XH.Xd. CALLREC(d,#(Ad'.CALL d)(d+l)) if #(Ad'.CALL d)(d+l) ^ -L _L otherwise K(fix[i!Xt 2 ]) - \H.(HUH2{H1)) where Hl=K(fix[t1])(Xv1.w1 where (w 1 ,w 2 )= J ff((v 1 ,ff 2 (v 1 )))), H2=Xvi.K(f ix[£2])(Av2.w2 where (wi,w 2 )=/y((vi,v 2 ))) and tiXt2 is composite but not pure K(Curry[tOz±(*iz±<2)]) = XRC.Xd. ' cvRRY(RC(d)) if RC(d) / JL _L otherwise K(Apply[iOz±*i]) = Ad. RESUME:ENTER:SND:SWITCH: FST:RESUME:APPLY:RESUME - Ad. RESUME:REC:RESUME Table 6.5: The coding interpretation K (part 3) are supposed to fulfill condition B so that they will execute to produce non-thunk values. However, in a lazy setting this is not permissible and we postpone the computations using the DELAY instruction. The ENTER, SWITCH and TUPLE instructions ensure that the postponed computations are given the same argument and that the pair is constructed on top of the stack. Note that it is not necessary to start with a RESUME instruction in order to fulfill condition A nor is it necessary to end with a RESUME instruction in order to fulfill condition B. The interpretation of Cons is similar to that for Tuple. Next consider the clause for • : K(n[(t 1 = ± t 2 )x(i 0 z±*i)]) = XRd.XRC2.Xd. DELAY(RC 2(d)):(RC1{d)) if Rd{d) ^ ± and RC2{d) ^ ± _L otherwise At first sight one might expect that the correct code is RC2(d):RCi(d) but in a lazy setting we must avoid executing RC2(d) if the result is not used by RC\(d). Since RC2(d) is assumed to fulfill condition B we therefore postpone the execution of the code fragment using the DELAY instruction. Because RC\(d) fulfills condition A it will initiate the postponed computation if necessary. The combined code will fulfill condition A as well as B.
148
6
Code Generation
The interpretation of Cond is straightforward as there is no need to postpone parts of the computations of a conditional. The interpretation K will specify how to generate code for the various constants of type £oi±£i- An example is +[lntxjnt—»Int] where we use K(+[lntX_Int->Int]) = Ad.RESUME:ENTER:SND:RESUME: SWITCH:FST:RESUME:TUPLE:PRIM +
as explained in Example 6.1.2. The code will fulfill conditions A and B. Another example is Zero[lnt—>Int] where we use K(Zero[Int-»Int]) = Ad.CONST 0 Example 6.1.3 Consider the expression Tuple(Fst[lntx^Int],Snd[lntx_Int]) The interpretation K gives the element Aenv.Ad.ENTER:DELAY(RESUME:SND:RESUME): SWITCH:DELAY(RESUME:FST:RESUME): TUPLE
of the domain Void —> RelCode. Given a void environment and a relocation parameter we see that the code generated will always terminate when executed. This is contrary to the code generated for Id which in certain cases may loop. Thus the code generated by K behaves in much the same way as the interpretation S of Chapter 5. •
6.1,3
Recursion
The interpretation of f ix[t] depends on the actual form of t. We already saw this happening in Chapter 5 when specifying S but the reasons here will be even more compelling. In the case of pure types, that is types that do not contain any run-time constituents, we simply have K(f ix[t]) = FIX where FlX(ff) - Un Hn(±) as in the standard interpretations of Chapter 5. The case where t — t\z±ti is more interesting because we are to generate code. The idea is that the instruction CALLREC(/,C") defines a label / and its associated piece of code C. Whenever the instruction CALL / appears in C it is interpreted as a recursive call to the label /. This is reflected in the operational semantics of the machine where we have (CALLREC(/,C"):C, ST) -> (C/[CALLREC(/,C/)//rC, ST)
6.1
The Coding Interpretation
149
In general the effect of the substitution E = [Cu- • -,C n //i,- • -,/n] (where the are all distinct) is defined by
(CONST
bi)[E] = CONST bi
(BRANCH(C,C"))[S] = BRANCH(C[£],C"[S]) (CALLREC(/,C))[S] = CALLREC(/,C[£']) if / = k and where CALLREC(/,C[£]) if / ^ /i for all i, l < i < n
f C(CALL /)[S]
= <
*
if / = /• /
-f / . / / f
li •
Note that the effect of the substitution is very similar to that of the A-calculus except that CALLREC (/,C)[£] does not consider the situation where some C\ contains the instruction CALL /. The reason is that this situation will never arise in the code generation. The actual code generated for K(f i x ^ i ^ t ^ ] ) is given by K(f ix[tlz±t2]) = XH.Xd. CALLREC(
• •/• • -f ix(\g[t=tt].- • -g- • •/• • •)• • •)
150
6
Code Generation
The overall code will have the form CALLREC(d,F(Ad'.CALL The idea is that the function F corresponds to the argument of the outermost fixed point and has the form
XRCf.Xd'.- - ".(RCf(d')):' • .:G(d> • • where G(df) is the code for the inner fixed point but relocated from dl. Thus the idea is that G(df) has the form CALLREC(d',G(Ad".CALL d')(d' where G has the form XRCg.Xd".- • -:(RCg(d'')):- • -:(RCf(d")):- • • Rewriting the earlier clause for G(d') we see that is has the form CALLREC(d',- • :CALL d':- • -:(RCf(df+!)):• • •) Returning to F we see that it has the form XRCf.Xd'.->>:(RCf(d')):-->:
CALLREC(dV • -:CALL d'\- • ".(RCf(df'+!)):• • •)•• * '
and thereby the overall code generated for the outermost fixed point will have the form CALLREC(d,- • -:CALL d:- • •: CALLREC(d+l,- • -:CALL d+l:- • •: CALL d:-••):•••) Example 6.1.5 Consider the expression of the sum^B program of Section 4.3. The semantics under the interpretation K is the function Xenv.Xd. DELAY(ENTER:DELAY(RESUME): SWITCH:DELAY(CONST 0): TUPLE): CALLREC(d, ENTER:DELAY(RESUME:SND:RESUME):RESUME:ISNIL: BRANCH(RESUME:FST:RESUME, DELAY(ENTER: DELAY(DELAY(ENTER:
6.1
The Coding Interpretation
151
DELAY(DELAY(RESUME:SND:RESUME): RESUME:TL:RESUME): SWITCH: DELAY(RESUME:FST:RESUME): TUPLE):CALL d): SWITCH: DELAY(DELAY(RESUME:SND:RESUME): RESUME:HD:RESUME): TUPLE): RESUME:ENTER:SND:RESUME: SWITCH:FST:RESUME:TUPLE:PRIM
where we have used the definition of K(+) given in Subsection 6.1.2.
•
Example 6.1.6 A somewhat simpler sum program would be the following: VAL f ix(Ag[lnt l i s t =± Int]. Cond(Isnil[lnt], Zero[lnt l i s t z± Int], +[lntxlnt-»lnt] • Tuple (Hd[lnt], g • Tl[lnt]))) HAS Int l i s t z± l i s t Here our code generation K is able to generate better code than in Example 6.1.5: Aenv.Ad. CALLREC(D, ENTER:RESUME:ISNIL: BRANCH(CONST 0, DELAY(ENTER: DELAY(DELAY(RESUME:TL:RESUME):
CALL d): SWITCH: DELAY(RESUME:HD:RESUME): TUPLE): RESUME:ENTER:SND:RESUME: SWITCH:FST:RESUME:TUPLE:PRIM
We shall return to this particular example in Chapters 7 and 8 and show how an even more substantial improvement can be obtained. • Consider now the case where the type t of f ix[£] has the form t\ X t2 but is not pure. Assuming that K(f ix[^]) and K(f ix[£2]) are well-defined we define K(f ix[hxt2]) = where
152
6
Code Generation
# ! = K(fix[^ 1 ])(Av 1 .w 1 where (w 1 ,w 2 ) H2 = Avi.K(f ix[i 2 ])(Av 2 .w 2 where (w 1 ,w 2 )=fl r ((v 1 ,v 2 ))) As in Chapter 5 we may remark that this is motivated by Lemma 5.2.6 (an 'analogue' of Bekic's Theorem). To appreciate the definition consider the case where both t\ and t2 have the form t^±t. Then the expression would typically have the form fix(Af[«iXt 2 ].(--- f s t f ••• sndf • • • , • • • f s t f ••• s n d f •••)) and the function H is
\{RC1,RC2).{\d.- • .Rd{dy - -RC2{d)' •., Xd.- • -RC^d)- • -RC2{dy • •) Thus H1 — Ac/.CALLREC(rf,---:CALL d\- • •:/f 2(Arf'.CALL d)(d+l):- • •) H2 = \RCi.\d.CALLREC(d,-
• -:(RCi(d+l)):- • -CALL d:- • •)
and thereby ,
dl- • •: CALLREC(d+l,- * *:CALL d:- • •: CALL d + 1 : - • • ) : • • • )
(
( CALLREC(d+l 5 - • -CALL d+1:- • •: CALLREC(d+2,- • -:CALL d+ll- • •: CALL d+2:---): •••:CALL d:->)
6.1.4
Higher-order aspects
Closures of the form {C\v} are used to represent functions as data objects. The closures differ from the thunks in that they must be supplied with an additional argument in order to be ready for execution. One builds closures using the CURRY instruction; the APPLY instruction then incorporates the additional argument and thereby transforms the closure into a thunk. The CURRY instruction corresponds directly to the Curry construct of the language and this is reflected in CURRY(i2C(d)) if RC(d) _L otherwise
6.1
The Coding Interpretation
153
The clause for Apply[£ lz= ^ 2 ] is slightly more complicated for much the same reason that K(+) is not simply Ad.PRIM +. In the standard semantics Apply is given a pair as argument where the first component must be a function. Thus the code generated for Apply must manipulate the top of the stack such that its first component is a closure and not a thunk (that would evaluate to a closure, once RESUME'd). This motivates defining: ±*i]) = RESUME:ENTER:SND:SWITCH: FST:RESUME:APPLY:RESUME
where the final RESUME instruction ensures that condition B is fulfilled. Example 6.1.7 Consider the expression twice 3 of Example 4.1.6: twice 3 = Curry ( Apply[lnt->Int] D Tuple(Fst[(Int-»Int)_xInt] ? Apply[lnt-»Int] • Tuple(Fst[(Int-^Int)j^Int], Snd[(Int->Int)x_Int]))) We then get |twice 3 ](K) = Aenv.Ad. CURRY(DELAY(ENTER: DELAY(DELAY(ENTER: DELAY(RESUME:SND:RESUME): SWITCH: DELAY(RESUME:FST:RESUME): TUPLE): RESUME:ENTER:SND:SWITCH: FST:RESUME:APPLY:RESUME): SWITCH: DELAY(RESUME:FST:RESUME): TUPLE): RESUME:ENTER:SND:SWITCH: FST:RESUME:APPLY:RESUME)
as the code of twice 3 under the interpretation K.
•
Finally, we consider the code generated for the construct Fix[£]. The idea is that the top of the stack contains a closure and we must compute the fixed point of the corresponding function. We define = Xd. RESUME:REC:RESUME
154
6
Code Generation
where the first RESUME instruction ensures that the top of the stack is a closure (and not a thunk that may evaluate to a closure) and the second RESUME instruction ensures that condition B is fulfilled. To explain the use of the REC instruction we first recall that its operational semantics is (REC:C, {C';v}:ST) -> (C, {C',(vJ)}:ST) where v' = {REC:RESUME,{C;T;}} The idea is then that C must be supplied with a pair (v,i/) such that v is the argument (corresponding to the free variables) and vf is a fixed point of the function represented by the closure. This fixed point can be written {REC:RESUME,{C";v}} and the purpose of the RESUME instruction is to ensure that condition B is fulfilled. — One may regard this RESUME instruction as being superfluous because whenever v' is RESUME'd the code will have the form RESUME: C and is thus transformed to REC:RESUME:RESUME:C. However, to simplify the proofs in the remainder of this chapter we shall retain the superfluous RESUME instruction.
6.2
The Substitution Property
In the remainder of this chapter we shall formulate and verify the claim that the code generated agrees with the semantics. This is a rather laborious task which we shall approach in stages and where we ignore lists in order to cut down on the complexity. The first stage aims at establishing a substitution property which will ensure that K(f ix[f^.^]) will only be applied to functions that may indeed be regarded as relocatable code sequences with holes in them. To motivate the need for this stage it is helpful to temporarily ignore the relocation parameter, i.e. to temporarily assume that RelCode equals Code. Consider now some frontier type tz±t and some F G [ ( < Z ± 0 ~ ^ ( ^ ^ 0 I ( ^ ) - We then have K(f ix)(F) = CALLREC(«,F(CALL •)) where the intention is that F is a code sequence with holes in it. It would then be helpful if F(K(f ix)(F)) = F(CALLREC(#,F(CALL and F(CALL F(CALL •)[CALLREC(#,F(CALL •))/•] were to agree. To see this note that if we execute (K(f ix)(F),ST) for one step we obtain
6.2
The Substitution Property
155
(F(CALL •)[K(f i: and since K(f ix)(F) is to be a 'kind of fixed point' of F it would be helpful if this configuration was equal to (F(K(fix)(F)),5T) so that the effect of CALLREC(«,« • •) indeed is to unfold the fixed point one level. However, this need not be the case as is illustrated by setting P( \ — / F S T tf x ls °f ^e f°r m CALLREC(«,- • •) ^ ' ~ y SND otherwise Luckily this is a contrived example. Even though F is a genuine element1 of [(tz±t)-^(tz±t)}{K) it should be intuitively clear that F cannot arise during code generation, that is F does not equal [e](K)(env) for any expression e. Our task in the remainder of this section is to make this clear in a formal setting. To this end, the set FreeLab(C) of free labels in the code sequence C may be defined inductively on the structure of code sequences. The more important clauses are: FreeLab(CYC2) = FreeLab(Ci) U FreeLab(C2) FreeLab(DELAY(C)) = FreeLab(C) FreeLab(CALL(rf)) = {d} FreeLab(CALLREC(d,C)) = FreeLab(C)\{d} Recall that in the previous section we wrote a substitution S in the form
In this section we shall regard a substitution S as a partial function from numbers to code sequences subject to certain conditions. The set Subst of substitutions is then defined by Subst = {S:iV<->CWe | dom(S) is finite A V/Edom(£): FreeLab(S(/)) C dom(S) } It may be turned into a partially ordered set by defining Si C S 2 whenever dom(Si) C dom(S2) and V^/Edom^): Ei(/) - S 2(/). When we write Si C £ 2 in the sequel it will be the tacit assumption that both Si and S 2 are elements of Subst. We often identify SeSubst with its graph: {(d,C)|dedom(S)AS(d)=C'}. We now have the apparatus needed to define the 'substitution predicate' compS[Z]t: [i](K) x [*](K) -> {true,false} x
We ignore here the omission of ensuring that F(l) = _L.
156
6
Code Generation
that is indexed by a well-formed type t of compile-time kind and that is parameterized on a substitution EESubst. The intention with compS[T,]t(xo^x) is that #o[S] equals x. The formal definition is by induction on the type t: compS[£\Ai(xo,x)
= x0 = x
compS[Y,}tlXt2{(x01,x02),(x1,x2)) = i) A compS[Y]t2(xm,x2)
W>max(dom(£)):
compSf[Z}tl=±t2(RC0{d),RC(d))
where corapSr'[E]t1_1t2(C'o,C') =
(C=± =» C o=±) A (C^I. =» C o ^± A C0[E]=C
A FreeLab(C 0 )Cdom(S))
This is an instance of what is sometimes called a Kripke-logical relation. Mostly it is the obvious structural definition with an equality in the case of compile-time base types. For compile-time function space we have made use of an important trick: to consider all S ; G Subst such that S ; 13 E. This will be of importance in Lemmas 6.2.1 and 6.2.2 below. For frontier types we express the desired substitution property and only restrain the relocation parameter d to be outside dom(S). As S e S u b s t we have FreeLab(#C(d))Cdom(£) as well. L e m m a 6.2.1 ('Parameter monotonicity') If SGSubst and H:c the above clauses define an admissible predicate compS\T*\t. Furthermore E 2 ^ E i A compS[T>i]t(x0^x) = Proof: The proof is by structural induction on the type t and is mostly straightforward. In the case t = t\ —> £ 2 we use the fact that
implies
vsas2: whenever We are now able to show that the definition of compS pays off in that it allows to solve the problem discussed in the beginning of this section.
6.2
The Substitution Property
157
Lemma 6.2.2 ('Substitution property') Assume that compS[E]t—+t{FoiF)i that the type t is of the form t = t\ ^± ^2> that d > max(dom(S)) and that F(\d'.CALL(d))(d+l) ^ JL Then F(\d'.CALL(d))(d+l) [CALLREC(d,F{\d'.CALL(d)){d+l))/d] =
F(\d'.CALLREC(d,F{\d'.CALL(d))(d+l)))(d+l)
and FreeLab(CALLREC(^F(A^CALL(d))(d+l))) C dom(E).
•
Proof: It is convenient to write C = F(\df.CALL(d))(d+l) C = CALLREC(d,C) C" = F(\d'.C')(d+\) and similarly for Co, C'Q and CQ. We assume that C^A. and must show that C'yj-, that C[C'/d] = C" and that FreeLab(C') C dom(E). The proof is in two stages that both proceed by extending S with a further pair. Stage 1: Consider Si = SU{(d,CALL(d))} We have SaGSubst because SESubst and FreeLab(CALL(d)) C {d} and it follows that Si I] E. It is immediate to verify that '.CALL(d),\d'.CALL(d)) so by compS[Ti](F0^F) we have Since d+l>max(dom(E 1)) this yields
It follows that Co^X and that FreeLab(C) C dom(Si) C dom(E)U{J} Copa] = C0[E] = C Stage 2: Consider E2 =
158
6
Code Generation
We have S 2 GSubst because SGSubst and FreeLab(C') = FreeLab(C)\{d} C dom(S) so clearly X^^S- It is immediate to verify that compS[X2}(\d'.CALL(d),\d'-C') so by compS[Ei](Fo,F) we have
Since d+l > max(dom(S 2 )) this yields compS[Y>2](\d' .CQ,\d' -C") Since COT^-L we get C"^J_ and C" = C0[SU{(d,C")}] Since E e S u b s t so that FreeLab(E(/)) C dom(E) when / 6 dom(S) we have C" = (C 0 [S])[C'/d] = C[C'/d} as was to be shown. Note that this lemma shows C[C'/d] = C" rather than the equality of C[C'/d] which is different and F(K(f ix)(F))(d+l), because Xd'.C = \df.K(fix)(F)(d) • from K(f ix)(F). To show that the premises of Lemma 6.2.2 do hold when we need to use the substitution property we shall show that K interprets all operators in an acceptable way and that this then carries over to all expressions of the mixed A-calculus and combinatory logic. Lemma 6.2.3 We have compS[S](K(<£),K(<£)) for all S e S u b s t and for all operators (j> of Table 5.1, except those of form fj or Fj and provided that the type t indexing f ix[t] is always composite and does not involve lists. • Proof: Clearly we cannot make a general claim about the fi or Fj as we have not specified the effect of K on all of these. However, concrete examples may be found among the exercises. Also we cannot make any claims about f ix[£] if t is not composite, as then K(f ix[£]) has not been defined, or if t involves (compile-time) lists, as then compS[H]t has not been defined. So let be one of the remaining operators of Table 5.1 and let us temporarily assume that it is not f ix[£]. From the 'parameter monotonicity' it follows that we may, without loss of generality, concentrate on S=0. Since the code K(<£) does not explicitly mention any CALLREC or CALL it is fairly straightforward to show
6.2
The Substitution Property
159
To be more specific, the type of <j> is of the form t^-^ >t0 for k>0 and frontier types t\. If k=0 the result is indeed obvious. If k>0 one may observe that the assumptions about the arguments of K(<;6) immediately carry over to the result. Finally consider f ix[t]. The proof is by induction on the structure of the type t. The base cases are when t is pure and when t is a frontier type. The inductive step arises when t is a product of composite types. The case t is pure. It is straightforward to show if pure(£) and SGSubst then compS\Yj]t{xo^x) <$ x0 — x by induction on t. It is then immediate that corap5[0](K(f ix[£]),K(f ix[£])) holds. The case t — tiz±t
2
is a frontier type. We consider EESubst, assume
V S a S : \/(RCo,RC): compS[Z'}{RCo,RC) => compS[Z'}{F0{RC0),F(RC)) and show comp5[S](K(fix)(F 0 ),K(fix)(F)) 50 let rf>max(dom(E)). If F(\d'.CALL(d))(d+l) F0(\d'.CALL(d))(d+l)
= l w e also have
= ±
and the result is immediate. Otherwise, write C =
F{\d'.CALL(d))(d+l)
and similarly for C o . The proof mimics Stage 1 in the proof of Lemma 6.2.2. So consider Si = SU{(d,CALL(d))} We have SiGSubst because SGSubst and FreeLab(CALL(d)) C {d} so clearly 51 3 S. It is immediate that compS[Tli}(\d'.CALL(d),Xdf.CALL(d)) so that Since rf+l>max(dom(S!)) this yields
160
6
Code Generation
and it follows that FreeLab(C) C dom(S)U{} Co[S] = C It is then easy to obtain FreeLab(CALLREC(d,C)) C dom(S) CALLREC(,C0)[S] = CALLREC(d,C) and this is the desired result. T h e case t is a product of composite types. To exploit the induction hypothesis one has to show that compS[Yi'](F'Q,F) carries over to the functions supplied as argument to K(f ix[£i]) and K(f ix[£2])- We refer to [72, Lemma 7.13] for an example of a proof along these lines. • L e m m a 6.2.4 ('Structural induction') Let penv be a well-formed position environment (in the sense of Section 5.2) such that pps{penv)
K
h e : t
and assume that neither e nor penv involves any lists. If EESubst is such that compS[Z](K(),K(<j>))
holds for all operators <j> that occur in e, then coropS[S]([e](K),[e](K))
•
N o t e that we have dispensed with explicitly indexing compS with type information. If we were to do so it would be helpful to define a 5-level type rps(pem;) such that [pewv](Z) = [rps(pent;)](X). Proof: The proof is by structural induction on the expression e. T h e case e ::= f\[t). We consider £' 3 S, assume compS[T,'](env0,env) and must show comp5r[S/](K(fi),K(fj)). This is straightforward using the premises of the lemma and the 'parameter monotonicity'. T h e cases Fj, Fst, Snd, Apply, Id, True, False and Fix are similar. T h e case e ::= t r u e . We consider £' I] E, assume comp5[S / ](env 0 ,env) and must show corapS[E'](true,true). This is straightforward using the definition of compS.
The case false is similar.
6.2
The Substitution Property
161
The case e ::= fix e0. We consider E' I] S, assume comp5[E;](env0,env) and must show corapS[S']([e](K)(env0),[e](K)(env)). The induction hypothesis and the 'parameter monotonicity' give comp5[S/]([eol(K)(env0),[eo](K)(env)) and the premises of the lemma give corop5[E](K(fix),K(fix)) Using the definition of compS we then obtain comp5[E/](K(fix)([c0](K)(env0)),K(fix)([e0](K)(env))) which is the desired result. The cases Tuple, Curry, • and Cond are mostly similar. The case e ::= (ei,e 2). We consider E' I] E, assume comp5[S;](envo,env) and must show comp5f[S/]([e](K)(env0),[e](K)(env)). The induction hypothesis and the 'parameter monotonicity' give cOmp5[S']([ei](K)(env0),[ei](K)(env)) for i = 1, 2 and the desired result then follows immediately from the definition of compS. The cases f st, snd and if are mostly similar. The case e ::= Axi[£'].e0. We consider S' 3 E, assume compS[S^envo^nv) and must show comp5[S/](|e](K)(env0),[e](K)(env)). This amounts to Comp5[S'](Av.[eoI(K)((env0,v)),
Av.[eo](K)((env,v)))
so consider S;/ 3 E;, assume compS[Yt"](vo)V) and show comp5[S"]([eo](K)((envo,Vo)),[eo](K)((env,v))) Using the 'parameter monotonicity' and the definition of compS we have compS[£"]((env0,v0),(env,v)) and the desired result then follows from the induction hypothesis and the 'parameter monotonicity'. The case e ::= ei(e2). We consider E; 3 E, assume comjp5[S;](envo,env) and must show comp5[S/]([e](K)(env0),[e](K)(env)). The induction hypothesis and the 'parameter monotonicity' give comP5[S']([ei](K)(env0),[ei](K)(env))
162
6
Code Generation
for i = 1,2 and the desired result is then immediate from the definition of compS. The case e ::= Xj. We consider E ' 3 E , assume compS[Ef](env0,env) and must show compSr[S'](7rps(xi,^eni;)(enVo),7rps(xi,penv)(env)) where we use ?rps of Section 5.2. We may write penv = (x i l ^ 1 )---(x i k ^ k ) env 0 = (• • -((void,v 0 i),- • -,vok)) env = (•••((void,vi),-..,v k )) and from the well-formedness assumption we know that we may define an index j by j = max{j| Xi. = Xi} Then 7rps(xi,pen^)(env0) = vOj and 7Tps(xi,penv)(env) = Vj so from the definition of compS we obtain
which is the desired result.
•
In the statements of the previous lemma we have used compS in a context where the two 'syntactic arguments' are always identical. This motivates defining compSt : [£j(K) —• {true,false} by compSt(x) = VEeSubst:
compS[Y]t{x,x)
which by the 'parameter monotonicity' is equivalent to compSt(x) = compS[$]t(x,x) Then lemma 6.2.3 asserts compS{K((l>)) for all 'acceptable' operators > of Table 5.1 and Lemma 6.2.4 uses this to assert corap5([e](K)) for all 'acceptable' expressions e of the mixed A-calculus and combinatory logic. Having introduced compSt it is then natural to consider whether we could have dispensed with the compS[- • •]*(• • •,• • •) predicate and instead have given a direct inductive definition of compSt(- • •) in such a way that analogues of Lemmas 6.2.2, 6.2.3 and 6.2.4 could still be proved. So far we have been unable to do this and we suspect that no substitution property (that is analogue of Lemma 6.2.2) could be obtained if such an approach is taken. However, it is worth observing the following properties of compSf L e m m a 6.2.5 For appropriate elements x, x\, # 2 , F and RC we have
6.3
Well-behavedness of the Code compSAi(z)
163
<=> true
compStlXt2((x1,x2))
<£> compStl(xi) A compSt2(x2)
compStl-+t2(F) A compStl(x) => compSt2(F(x)) compStl=±t2(RC)
& (Vd: RC{d)^±_ => FreeLab(#C(d))=0)
Proof: The first two double implications are straightforward. For the third implication assume that VSGSubst: compS[Y]tl-+t2(F\F) VSGSubst: compS\TJ\tl(x,x)
and consider SGSubst. That compS[E}t2(F(x),F(x)) is then immediate from the assumptions and the definition of compS[H]tl-+t2- For the final double implication we calculate compStl:i±t2{RC) & compS[Q]tl=tt2{RC,RC) &Vd: compS'[Q]tl=tt2(RC{d),RC(d)) <* \/d: (RC(d)^± ^ RC{d)^± A RC(d)[iH\=RC(d) A &\/d: where the first step is using the 'parameter monotonicity'.
6.3
•
Well-behavedness of the Code
We now begin by clarifying what well-behavedness of the code generated by K is supposed to mean, and later in this section we then show that the code is indeed well-behaved. The basic idea is that the code operates in a stack-like fashion in that it transforms the value on top of the stack and leaves the remainder of the stack unchanged. However, as the following example shows there are some pitfalls. Example 6.3.1 The code sequence
d = [Tuple(Id,Id)](K)(void)(l) = ENTER:DELAY(RESUME):SWITCH:DELAY(RESUME):TUPLE
is well-behaved. To see this let v:ST be any non-empty stack and note that
164
6
Code Generation
(Cuv:ST) ->5 (e, w:ST) where w = ({RESUME,?;},{RESUME,v}) (and that by determinacy this is the only execution sequence). The code sequence C2 = TUPLE is not well-behaved. To see this consider a stack Vi'.v2:ST and the execution sequence
(C2,v1:v2:ST) -> (e,w:ST) where w — (^1,^2) Here v2 has been removed from the resulting stack so it is not just the top element that has been transformed. For the same reason neither of the code sequences C 3 = ENTER, nor C 4 = SWITCH are well-behaved. However, if the code sequence C in question contains a RESUME instruction it is less clear whether or not C is well-behaved. As an example the execution sequence (RESUME,{CONST
bhv}:ST)
-> 3 (e,b{:ST)
would seem to suggest that RESUME is well-behaved whereas the execution sequence (RESUME,{TVPLE,v1}:v2:ST) -> 3
(e,(vuv2):ST)
would seem to suggest that RESUME is not well-behaved. The safe solution is of course to decree that RESUME is not well-behaved but this is totally unacceptable given our (frequent) use of RESUME instructions in the code generation K. In particular, even the code for Id would not be well-behaved. • To allow a code sequence containing RESUME instructions to be well-behaved we need to consider the element on top of the stack. Our second attempt at well-behavedness might then be as follows: a code sequence is well-behaved if its effect on a stack with a well-behaved top element only is to transform that top element into another well-behaved element; further, an element on the stack is well-behaved provided all code sequences contained in it are indeed well-behaved. This would seem to overcome the pitfalls exposed in Example 6.3.1: the code sequence CONST bj is well-behaved and therefore RESUME is well-behaved when
6.3
Well-behavedness of the Code
165
{CONST bj,v} is on top of the stack; however, the code sequence TUPLE is not well-behaved and therefore RESUME is not well-behaved when {TUPLE,^i} is on top of the stack. It remains to ensure the well-definedness of the well-behavedness predicate. To see that there is a problem note that well-behavedness is 'circularly defined' in that it presupposes well-definedness of elements on the stack and this amounts to well-behavedness of the code sequences contained in these elements. The usual method for 'breaking' such circularity is by appeal to structural induction but this does not work here: if the code sequence of interest is RESUME it is still possible (and indeed very likely) that the element on top of the stack will contain much larger code sequences. Another method that is sometimes applicable is to regard the definition of the predicate as a recursive definition and to use fixed point theory to obtain the solution. This method fails here because the recursive definition violates the monotonicity requirement. Yet another method is to index the predicate with a counter that expresses the maximum length of execution sequences considered and to regard longer execution sequences as 'infinite'. This method works but leads to a rather 'messy' calculation of new values for the counter (see Exercise 6). The method we shall adopt uses a well-founded relation defined on a 'measure' that depends on the 'type' of the element on top of the stack and whether or not this element is a thunk. In doing so we shall exploit condition B of Section 6.1. Having defined the well-behavedness predicate for elements of the stack it is then rather routine to define well-behavedness for code sequences and other entities.
6.3.1
Definition of the well-behavedness predicates
We first consider the formal definition of the well-behavedness predicate valWt for elements on the stack. The type t is supposed to be a well-formed 5-level type of run-time kind that indicates the 'type' of the element on top of the stack. This index is, in principle, dispensable at this stage since the machine has no explicit notion of type. However, it is hardly possible to dispense with the type index in the next stage (Section 6.4) and already in this stage it is helpful in reducing the number of cases to be considered in the proofs that follow. Also it is central for our chosen method for well-definedness to work. So we propose the following definition of valW: valWx (bj) = true for all basic values bj of type Aj e.g. true and false are all the basic values of type Bool valWtlya2((v1,v2))
= valWtl(vi)
valWtl=±t2({C;v0})=VVl:
A valWi2(v2)
166
6
valWt({C,v})
= VAeExSeq(*,C»: postW t(A)
Code Generation
A nothunk(A)
where postWt(A(0..m)) nothunk(A(0..m))
= 3v: A(ra) = (e,[v]) A valWt(v) = -.3C,C", vf, ST: A(ro) -
(C,{C'J}:ST)
and for later usage preWt(A(0..m))
= 3C,v: A(0) - (C,[v]) A valWt(v)
To motivate this definition note that it mostly proceeds by structural induction on the type subscript and perhaps with a case analysis on the form of the stack-element given as parameter. So it should be clear that for example valWt-Lt((^>i^2)) is intended to be false. The clause for thunks is applicable for all run-time types and simply considers the effect of running the code component of the thunk upon the value component. Intuitively there are four possible 'outcomes' of running a code sequence upon some value: • the computation may loop forever, or • the computation may produce a value as result, or • the computation may end with a 'dynamic' error, e.g. division by 0, or • the computation may end with a 'static' error, e.g. that the stack is too short. In this section we only consider finite execution sequences and so disregard the first 'outcome'. Also, we do not need to explicitly consider the third 'outcome' because the semantics in Table 6.2 has been designed so that 'dynamic' errors lead to looping forever. This is clearly demonstrated by the transitions for HD and TL upon empty elements and by the transition for PRIM(o]) upon a value v such that di(v) is not defined. This may be motivated by an analogy with the treatment of errors in the standard semantics where (in the absence of an 'error' element in the domain) one produces _L as result. Another more pragmatic motivation is that it reduces the number of cases that needs explicit attention in the proofs that follow. This leaves us with the second and fourth 'outcome'. The formulation of valWt({C,v}) explicitly considers the second 'outcome' and formulates the desired condition. This is done for initial and final stacks of length 1 but we shall see shortly that these stacks can always be extended and that the extensions will be left untouched. The absence of any explicit consideration of the fourth 'outcome' then amounts to the (major) claim that no 'static' errors can arise when executing well-behaved code.
6.3
Well-behavedness of the Code
167
L e m m a 6.3.2 The clauses for valWt define a predicate valWt : Val —> {true,false} whenever t is a well-formed S-level type of run-time kind that does not involve any list types. • Proof: We begin by introducing a bit of notation. When v£Val is a value we write t;::thunk to express that it is a thunk, that is 3C,u: v={C,u}, and we write t>::nothunk to express that it is not a thunk, that is -i(v::thunk). We then introduce a partial order on pairs of types and values by (
-< (t2,v2)
where (t\)Vi)
-< (t2,v2)
if and only if
(ti is a proper subtype of t2) or (ti = t2 A vi::nothunk A t^ It is straightforward to check that this defines a well-founded order. We then show that the clauses for valWt(v) are well-defined using the principle of complete induction (as in Section 3.2). This amounts to investigating each clause for valWt(v) and verifying that each valWtt(vf) on the right hand side has (tf,v')~<(t,v). This is immediate except when v is a thunk. In this case there is an occurrence of valWv(v') implicit in postWt(A). It has t=tf but also i/::nothunk • due to nothunk(A). Hence (t',v')-<(t)V) and this completes the proof. Turning to code sequences the idea is to define a relation compW much like compS of Section 6.2. Since we will need the substitution property in order to prove the required result for the fix operator we shall need to let compW include compS. There is no need, however, to let compW be parameterized on a substitution, nor is there a need to duplicate the 'syntactic' argument. We thus define compWt: {t](K) -> {true,false} as follows: compW Ai(z) = true compWtlXt2((xux2))
= compWt^x^
A compWt2{x2)
= compSh-+t2(F) A compSWtl^t compWtl^t2{F) where compSWtl-+t2(F) = V#: compWtl(x) =$> compWt2(F(x))
168
6 Code Generation compWtl=±i2(RC) = compStl=±t2(RC) A compSW tl^t2(RC) compSW'tl=ti2(RC(d)) where compSWtlz±t2{RC) = W : where compSW'il=Li2(C) =
Fact 6.3.3 The definition of corap5Wr51_>t2(C) is equivalent to > (VAEExSeq(*,C): prePf t l (A) =* (jwwrfW,2(A) A nothunk(A))) L e m m a 6.3.4 The above clauses define an admissible predicate compWu whenever t is a well-formed 5-level type of compile-time kind that does not involve any list types. • Proof: This is a simple structural induction.
•
The relationship with compS is given by Lemma 6.3.5 ('Layered predicates') We have compWt(x) => compSt{x) for all £€[£](K) and all well-formed iMevel types t of compile-time kind that do not involve lists. • Proof: This is a simple structural induction.
•
To motivate the name, 'layered predicates', consider the set {5,H^} partially ordered as depicted in • W
is Then compS and compW constitute a 'layer of predicates' where the stronger predicate is higher in the partial ordering. Technically this is intimately connected to the 'parameter monotonicity' of Lemma 6.2.1; in particular one notes the explicit inclusion of compStl-+t2 m the clause for compWtl-+t2 much as compS[E']tl-+t2 was included in compS[E]t1-+t2 whenever E ' ^ S .
6.3.2
Operations on execution sequences
To assist in the proofs about the well-behavedness and correctness of code we need to establish some notation for decomposing and combining execution sequences. In doing so we shall take care that the notation applies to finite as well as infinite execution sequences.
6.3
Well-behavedness of the Code
169
We begin by considering an execution sequence A and defining the prefix AocC of A that 'protects' the code sequence C. To be more specific consider A G ExSeq(oo,CiAC2) and define 11 = { * | 3ST,C: A(i)={CC2,ST)} as the set of indices where C2 is still present, 12 = { i | 3ST,C: A(t)=((7C 2 ,Sr) A C^e} as the set of indices where C2 is still present and is also 'untouched', and la = {«'eii I Vj
(C'fC2,STi) and we set
(AocC2)(t) - (C'uSTi) for all i
(if I3 is infinite)
If I3 is finite it must be of the form {0,- • -,m} and for i<m each A(i) is of the form (C'fC2jSTi) and we set (AocC 2 )(0 = (C'uSTi) for all i<m
(if I3={0,- • -,m})
Fact 6.3.6 If A e ExSeq(oo,C7C2) then (AocC2) e ExSeq(oo,d).
•
In an analogous way we may define the prefix AocST of A that 'protects' the stack ST. For this we shall write length(ST) for the length of the stack ST and we shall write arity(C) for the number of elements that needs to be on the stack for the first instruction to execute according to Table 6.2. As an example, arity(swiTCH)=2. Next let A <E ExSeq(oo,C) have A(0) = (C,ST{ST2) and consider defining (AocST2). We define Ji = { t I 3ST,C: A(i)=(C,SrST2)} as the set of indices where ST2 is still present, J 2 = { i I 3ST,C: A{i)={C,SrST2)
A arity(C)
as the set of indices where ST2 is still present and is also 'untouched', and J 3 = { t e J i I Y?
for all i
(if J 3 is infinite)
170
6 Code Generation
If J 3 is finite it must be of the form {0,- • -,ra} and for i<m each A(i) is of the form (Ci,ST'iST2) and we set (AocST2)(i) = (CuST'i) for all i<m
(if J3={0,- • -,m})
Fact 6.3.7 If A E ExSeq(oo,C) is such that A(0) = (C,ST{ST2) then (AocST2) is defined and (AocST2) E ExSeq(oo,C). • It is also helpful with notation for combining execution sequences. The basic observation is that if A = (Co,STo)-*
>(Cm,STm)
and CeCode, STeStack then (CoC,SToST)
->...->
{C^CST^ST)
However, even if A is an execution sequence the modified sequence need not be an execution sequence because when C^e and Cm=e it is likely that we can find some (C',ST') such that (Cm*C,STm*ST) -> (C',ST'). To overcome this obstacle consider Ai E ExSeq(rai,Ci) and A2 E ExSeq(oo,C2) and assume that = (e,ST) A2(0) = (C2,ST) for some stack ST. Then Ai & A2 is defined as (A &A Rr A)(t) \d\ = |/ ( ^ ^ 2 , 5 7 0 (A 1 2
if Aj(0 = (C'^STi) and ,-
Fact 6.3.8 Let Ai G ExSeq(m1,Ci) and A2 E ExSeq(oo,C2) be chosen such that Ai(mi) = (e,5T) and A2(0) = (C2,ST) for some stack STeStack. It then follows that (Ai & A2) E ExSeq(oo,CrC2). •
6.3.3
Well-behavedness of operators
We now embark on the long series of results that together constitute an analogue of Lemma 6.2.3, namely that compW holds for each operator. The general strategy in these proofs is to consider an execution sequence A E ExSeq(m,Cr---A(7k) and then decompose it into execution sequences Aj E ExSeq(rai,Ci) for ie{l,- • -,k}
6.3
Well-behavedness of the Code
171
such that A = A2 & • • • & A k For each i£{l,- • -,k} the proof strategy will then be • to apply the induction hypothesis to A, and lift the result to Aj+i h • • • h Ak, or • to simulate the transition that C\ gives rise to, i.e. write Aj out in detail. It is therefore helpful with a few facts about how decomposition and combination of execution sequences affect the predicates preW, postW and nothunk on execution sequences. Fact 6.3.9 ('Properties of preW) If AeExSeq(m,C"*C) then (a)
preWt(A)
<£> preWt(A<xC)
Fact 6.3.10 ('Properties of postW) If AeExSeq(ra,C"AC) then (a)
postWt(AocC)
Am' = #(AocC) =>
(b)
Vra'<m: postWt(A)
<£>
preWt(A(m'..m))
postWt{A(m'..m))
Fact 6.3.11 ('Properties of nothunk') If AeExSeq(ra, C'C) then (a)
If m' = #(AocC) then nothunk(AocC) <=> nothunk(A(0..m'))
(b)
Vmf<m: nothunk(A) <& nothunk(A(mf..m))
Due to the frequent use of RESUME instructions it is also helpful with the following lemma that characterizes their behaviour. Lemma 6.3.12 If AeExSeq(ra,RESUME) and then postWt(A) A nothunk (A).
preWt(A) •
Proof: If A(0) = (RESUME,[V]) and v::nothunk we know that A amounts to the execution sequence (RESUME,[v]) —> (e,[v]). Hence postWt(A) follows and nothunk (A) is immediate. If A(0) = (RESUME,[{CI,UI}]) we have A(l) = (CiARESUME,[vi]) so that A(l..m) £ ExSeq(ra-l,CYRESUME). Let now A1 G ExSeq(m 1 ,C 1 ) be given by Ai = A(l..m)ocRESUME From preWt(A) postWt(Ai)
we have valWt({Ci,vi}) A
nothunk(Ai)
and thus
172
6
Code Generation
We then obtain preWt(A(l
+ m1..m)) A nothunk(A(O.A + m1))
by the properties of postW and nothunk. In other words A(l + rai) = (RESUME,[v2]) where v2::nothunk and valWt(v2). It follows that A = A(0..1 + m! + l), that = (£,[^2]) a n d that valWt(v2). Hence postWt(A)
A nothunk(A)
has been established.
•
As our first result about an operator we now consider Fst. L e m m a 6.3.13 compWtlXt2=ttl(K(Fst)) that do not involve lists.
holds for all well-formed types Uxt<>.-+U •
Proof: We must show compS as well as compSW. The first result is a consequence of Lemma 6.2.3 so we concentrate on showing compSW'(K(Fst)). For this we shall rely heavily on Fact 6.3.3. So let d>0, write C = K(Fst)(d) = RESUME:FST:RESUME, let AEExSeq(ra,C) and assume that preWtlxt2{A). nothunk(A).
We must show postWtl(A)
A
Stage 1: Let AiGExSeq^,RESUME) be given by Ax = AocFST:RESUME. The properties of preW give preWtlxt2(Ai) s o by Lemma 6.3.12 and the properties of postW and nothunk we have preWt1xt2{^{mi"m)) A nothunk(A(rai)). Stage 2: Let A2GExSeq(m2,FST) be given by A2 = A(rai..ra)ocRESUME. We know that A 2 (0) is of the form (FST,[v2]) where valWtlxt2(v2) and v2::nothunk. By inspection of the clauses for valW we observe that v2 must be of the form valWtl(v2i). (^21^22)- Hence m2 = 1 and A 2 (m 2 ) = (£,[v2i]) and we know that Stage 3: Let A3GExSeq(m3,RESUME) be given by A 3 = A(mi+m2..m). Lemma 6.3.12 we have postWh(A)
By
A nothunk(A)
and the desired result follows.
•
Corollary 6.3.14 A similar result holds for Snd, Id, True and False.
•
L e m m a 6.3.15 compW(t^i)—>(t z±t2)—>(*z±*ixt 2)(K(Tuple)) holds for well-formed iMevel types t, ti and t2 of run-time kind that do not involve lists. •
6.3
Well-behavedness of t h e Code
173
Proof: Let us begin by expanding the statement that needs to be proved and then simplify it to something manageable. So using the definition of compW for compile-time function space we must prove corap5(K(Tuple)) compSW(K(Tuple)) The first of these follows from Lemma 6.2.3 so we concentrate on the second. For this we assume
and must show compW(tz±t2)->(t:=Ltlxt2)
(K(Tuple)(#Ci))
Proceeding as above this amounts to proving
Concerning the first of these the result follows from Lemma 6.2.3, the assumption about RC\ and the 'layered predicates' (Lemma 6.3.5), and using Lemma 6.2.5. Concentrating on the second result we assume compW(t=±t2){RC2) and must show
Using the definition of comp W for run-time function space we must prove
compS(K(Tuvle){RC1)(RC2)) compSW(K(Tuple){RC1)(RC2)) Concerning the first of these the result follows from Lemma 6.2.3, the assumption about RC\ and the 'layered predicates' (Lemma 6.3.5), and using Lemma 6.2.5. We are thus left with the second result. In conclusion, to prove the lemma it suffices to assume
compSW^t^RCi) for i=l,2 and to show
174
6
Code Generation
For this we shall rely heavily on Fact 6.3.3. So let d>0 and note that the result is trivial unless RCi(d)^A. and RC2(d)^±. in which case also RC(d)^l. and RC(d) = ENTER:DELAY(/2C2(rf)):SWITCH:DELAY(i?Ci(rf)):TUPLE Further, AGExSeq(m,/?Cf(c?),/y) and preWt(A). m—5 and that A(m) =
It is immediate to verify that
{e\({RC,{d),v},{RC2{d),v})})
Hence nothunk(A) follows and to show postWtlxt2(^) ^ suffices to show valWt.({RCi(d),v}) for i=l,2 and this result is immediate from the assumptions about the RC\.
O
Lemma 6.3.16 compW{t2-±tz)-^(tlz±t2)^{tlz±to>)(}^{u)) holds for all well-formed 2level types ti, t2 and £3 of run-time kind that do not involve lists. • Proof: As in the proof of Lemma 6.3.15 it suffices to consider RC\ and RC2 and assume compSW(RCi) for i = 1,2 and show compSW(RC) where RC = K(n)(fiCfi)(/2C2). We shall use Fact 6.3.3 for this. So let d>0 and note that the result is trivial unless RC\(d)^L and RC2(d)^± in which case also and then RC(d) Further let AeExSeq(m,RC(d)) and preWtl{A). It is immediate to verify that ra>l so that A(l..m)GExSeq(m—1, RCi(d)). From the definition of compSW we also have preWt2(A(l..m)). We may now use compSW(RCi) to obtain m)) A nothunk(A(l..m)) and the desired property follows from the properties of postW and nothunk.
•
Lemma 6.3.17 compW(K(Cond)) provided that the type of Cond does not involve lists. •
6.3
Well-behavedness of the Code
175
Note that we dispense with type subscripts in the statement and proof of this lemma as no confusion is likely to arise. Proof: Much as in the proof of Lemma 6.3.15 it suffices to consider RC\, RC2 and RC3 and assume compSW(RCi) for i=l,2,3 and show compSW{RC) where RC = K(Cond)(RC1)(RC2)(RC3). We shall use Fact 6.3.3 for this. So let d>0 and note that the result is trivial unless RCi{d)^L, RC2(d)^A. and in which case also RC(d)^± and RC(d) = ENTER: RC1(d):BRAKCR{RC2{d),RC3{d)) Further let AeExSeq(ra,#C(d)) and preW(A). Stage 1: It is immediate to verify that ra>l and that A(l) may be written as A(l) - (RC1{d):BRAXCH(RC2(d),RC3{d)),[v,v]) Stage 2: Let A2eExSeq(m2,RCi(d))
be given by
A2 = (A(l..m)ocBRANCH(i2Cf2(d),/2C3(d)))oc[t;] It is straightforward to verify (but not merely by using Fact 6.3.9) that preW(A2). Using compSW(RCi) we obtain postW(A2) A nothunk(A2). Hence = (BRAKCR(RC2(d),RC3(d)),[w,v}) where valW(v), valW-Q00\{w) and w::nothunk. Stage 3: By inspection of the clauses for valW we observe that w must be either true or false as it is not a thunk. Thus A(2+m2) = for JG{2,3} depending on the value of w G {true,false}. Regardless of the value of j we then have preW(A(2+m2..m)). Stage 4: Let A4eExSeq(m4,RCj(d)) be given by A4 = A(2+ra2..ra). Using compSW(RCj) we obtain postW(A) A nothunk(A) from which the desired result follows from the properties of postW and nothunk. • Lemma 6.3.18 compW(K(Curry)) provided that the type of Curry does not involve lists. •
176
6
Code Generation
Proof: Much as in the proof of Lemma 6.3.15 it suffices to consider RC\ and assume
compSWiRd) and show
compSW(RC) where RC = K(Curry)(/2Ci). We shall use Fact 6.3.3 for this. So let d>0 and note that the result is trivial unless RC\(d)^L in which case also RC(d)^± and RC(d) = CURRY(#Ci(d))
Further let AeExSeq(m,RC(d),v0)
and preW(A).
We know that m = 1 and that A(l) = (e,[{RCi(d);v0}]). Clearly nothunk(A) To do so let ^i and to show postW(A) it suffices to show valW({RC\(d)\vo}). be given such that valW(vi) and show valW({RCi(d), (^o^i)})- From preW(A) we have valW(v0) and hence valW((vo,Vi)) so by compSW(RCi) we have the required valW({RC1{d),(v0,v1)}). • L e m m a 6.3.19 compW ((tl:±t2)xtl)^t2(K(kipiply)) { ( ) ) i ± t 2 that do not involve lists.
holds for all well-formed types •
Proof: Much as in the proof of Lemma 6.3.13 it suffices to prove
and we use Fact 6.3.3 for this. So let d>0 and write K(Apply)(d) = RESUME: C" where C" = ENTERlSNDlSWITCHlFSTlRESUMElC'" C" — APPLY:RESUME
Further let AGExSeq(m,K(Apply)(rf)) and preW{tlz±t Stage 1: Let A1GExSeq(m1,RESUME) be given by Ax = AocC". The properties of preW give preW(t1-±t2)xt1{^i) s o by Lemma 6.3.12 we have nothunk(Ai) and postW(tl=±t2)}a1{A1). Stage 2: Let v be given by A(mi) = (C",[v]). From valW(tl=±t2)xti(v) a n ( i ^he fact that v is not a thunk it follows from inspection of the clauses for valW that v is of the form (u,w) where valWtl=±t2(u) and valWtl(w). We then have m>mx+4 and
6.3
Well-behavedness of the Code
177
A(mi+4) = (RESUME: C"\[u,w]) Stage 3: Let A3EExSeq(ra3,RESUME,w) be given by A 3 = (A(m 1 +4..)ocC // )oc[H It is straightforward to verify that preWtl=1t2( A3) (but not merely by using Fact 6.3.9). Using Lemma 6.3.12 we get nothunk(As) and postWt1=±t2(^s)' Stage 4: It follows that A(m!+4+m 3 ) = {CN \u\w\) where u1 is not a thunk, valWtl-tt2{u/) a n d valWtl(w). By inspection of the clauses for valW it follows that u' is of the form {Cyu}. Then m>rrii+5+m3 and A(m1+5+m3)
= (RESUME, [{C,(u,w)}])
Furthermore, we have valWt2({C,(~u,w)}) from the assumptions about {Cyu} and w. Stage 5: Let A5eExSeq(ra5,RESUME) be given by A 5 = A(mi+5+m 3 ..m) We have preWt2(As) so by Lemma 6.3.12 we get nothunk(A5) and postWt2(As). From this nothunk(A) A postWt2(A) follows using the properties of postW and nothunk. • The proofs for the remaining operators of Table 5.1, that is Fix and fix, require a new technique. In both cases the difficulty is that we need to show the wellbehavedness of code that works by 'unfolding' itself and therefore we need some way of getting an induction going. There are several ways that can be explored and we shall choose one that will also be useful in the next section. The general idea is to be able to control the number of unfoldings allowed for the REC and CALLREC instructions. The most convenient way in which to do this is to allow indexing these instructions with a counter n that is decreased every time an unfolding takes place and that gives rise to looping if the index is 0. To be more precise we have the following extension of Table 6.2:
(RECn+1:C, {C';v}:ST) - (C,
{C't(vJa)}:ST)
where v'n = {RECn:RESUME,{C";t;}}
(REC0:C, {C';v}:ST)
-» (REC 0:C,
{C';V}:ST)
(CALLRECn+1(/,C"):C, ST) -» (C'[CALLREC n(/,C")/TC, ST) (CALLREC0(/,C"):C, ST) -+ (CALLREC0(/,C"):C, ST)
178
6
Code Generation
The proof then proceeds by first replacing REC by RECn in K(Fix) and CALLREC by CALLRECn in K(f ix). Next an induction on n is performed, and the basis, n=0, is straightforward in both cases. Finally, the results for RECn and CALLRECn for all n must be lifted to results for REC and CALLREC. To facilitate the last step we introduce a 'syntactic' ordering. The key relationship is that RECn •< RECm if n < m RECn -< REC CALLRECn(/,Cn) < CALLRECm(/,Cm) if n < m and Cn ^ C m CALLRECn(/,Cn) •< CALLREC(/,C) if Cn •< C and this is then extended to elements of Code and Val in the obvious way, for example {REC7:RESUME,8} -< {REC8:RESUME,8} but {REC7:RESUME,8} 2< {REC8:RESUME,9} This ordering carries over to elements of RelCode by setting RC < RC if and only if Vd: RC(d) ^
RC'(d)
and to configurations of the machine by setting
if and only if C -< C and u\ •< u[ for each i Taking elements of Code as an example, the idea is that the least upper bound of a sequence (C n ) n of instruction sequences with indexed REC and CALLREC instructions will have a least upper bound C which is similar to each Cn but where (some) indices have been removed. We leave the details to the lemmas below. L e m m a 6.3.20 If ((C n ,i/ n )) n is a chain with least upper bound (C,u) and if ExSeq(*,C,^)^0 then there exists n 0 such that ExSeq(*,C n , / u n )^0 when n>n o . D
Proof: We shall prove a stronger result and to state it we shall write ExSeq(*)(C,5T) - { AeExSeq(*) | A(0)=(C,Sr) }
6.3
Well-behavedness of the Code
179
and similarly for ExSeq(oo)(C,ST) and ExSeq(m)(C,ST). We then claim that if ((C n ,ST n )) n is a chain of configurations with least upper bound (C,ST) and if ExSeq(*)( C,ST)^0 then there exists n o such that ExSeq(*)(C n ,5T n )^0 when n>n 0 . The proof is by contradiction. Without loss of generality we may assume that m is the minimal value for which it is possible to have A G ExSeq(m)(C,ST) (C,ST) is the least upper bound of a chain
((Cn,STn))n
ExSeq(*)(C n ,Sr n )=0 for infinitely many n From an obvious analogue of Fact 6.1.1 we may determine A n by A n G ExSeq(oo)(C n ,ST n ) Clearly ra>0 as otherwise C—e in which case Cn=e and ExSeq(*)(C n ,ST n )^0 holds for all n. Then C must be of the form i:C and each C n must be of the form in:C'n where i is the least upper bound of (in)n. Since A is finite we know that i cannot be of the form REC0 or CALLREC0(/,C//). It follows that there exists n 0 such that zn cannot be of the form REC0 or CALLREC0(/,Cn) when n>no- Then the same transition rule of Table 6.2 (augmented with RECn and CALLRECn) must be used in the first step of all A n (with n>n 0 ) as well as A. It follows that also (A n (l)) n > n o is a chain and that A(l) is not only an upper bound but the least upper bound. However, infinitely many A n (l..) are infinite and this contradicts the minimality of m. • L e m m a 6.3.21 If (^n)n is a chain of values with v as their least upper bound and if Vn: valWt(vn) then also valWt(v). • Proof: We proceed by structural induction on t. In each case we first consider the situation where v is not a thunk and we then finally consider the situation where v is a thunk. Note that vn^v implies that vn is a thunk if and only if v is. The case t::=Aj and v::nothunk. We must have vn=v for all n and from
we obtain valWt(v).
valWt{vn)
The case t:: — tiXt 2 and v::nothunk. We must have v to be of the form (u,w) and each vn to be of the form (un,wn) such that u is the least upper bound of (^ n)n and w is the least upper bound of (wn)n- From Vn: valWt(vn) we then obtain Vn: valWtl(un) and Vn: valWt2(wn). The induction hypothesis then gives valWtl(u) and valWt2(w) from which the desired valWt(v) follows. The case t::=tiz±t2 and v::nothunk. We must have v to be of the form {C;u} and each vn to be of the form {Cn',un} such that C is the least upper bound of (Cn)n and u is the least upper bound of (un)n. To show valWt(v) consider w such that valWtl(w). We then have
180
6 Vn:
Code Generation
valWta({Cn,{un,w)})
and from the induction hypothesis we obtain valWt3({C,{u,w)}) which is the desired result. The case v::thunk. We must have v to be of the form {C,u} and each vn to be of the form {Cn,un} and such that C is the least upper bound of the chain (C n ) n and u is the least upper bound of the chain (wn)n. Now valWt{v) holds vacuously unless there is A E ExSeq(ra,C» in which case we have to show post Wt(A) A nothunk(A). We now choose n0 such that ExSeq(*,Cn,un)^0 when n>n o . This is possible using Lemma 6.3.20. We then determine An for n>n0 by An e ExSeq(*,Cn,ttn) We shall show by induction on j<m that (An(j))n>no is a chain with least upper bound A(j) The base case, j=0, is immediate from the assumptions. The induction step follows much as in the proof of Lemma 6.3.20: since Ano is finite we cannot encounter any REC0 or CALLREC0(/,C^0) in going from A ^ j ) to Ano(j+1) and then we cannot either in going from An(j) to An(j+1) for n>n0. This completes the numerical induction on j<m. Since ((An(m))n>no is a chain with least upper bound A(ra) and since A^ is finite it follows that for n>n 0 all An have length m. From valWt(vn) we then have postWt(An) A nothunk(An) for all n>n 0 . This amounts to postWt(An(m)) A nothunk(An(m)) and we then have postWt(A(m)) A nothunk(A(m)) because we have already proved the result for non-thunk values of type t. We then obtain the desired result and this completes the proof. • Lemma 6.3.22 compW'(t=tt)I±t(K(Fix[f])) holds whenever (t^t)z±t formed type that does not involve lists.
is a well•
6.3
Well-behavedness of the Code
181
Proof: Much as in the proof of Lemma 6.3.13 it suffices to prove
Since K(Fix[£]) = Xd. RESUME:REC:RESUME this amounts to proving compS W(^t)_^(RESUME:REC:RESUME) and using Lemma 6.3.12 it suffices to prove valWi=±i({C\v})
=> valWt({REC:RESVME,{C;v}})
(P)
For this we begin by proving valWt=Lt({C;v})
=> valW t({RECn:RESVME,{C;v}})
(P n )
by induction on n. The basis case, n=0, is immediate as ExSeq(*,REC0:RESUME,{C>}) = 0 For the inductive step we assume (P n ) and prove (P n + i). So assume that we have
valWtl±t({C;v}) and consider
A e ExSeq(ra,RECn+1:RESUME,{C>}) We know that ra>l and that A(L.ra) G ExSeq(ra-l,RESUME,{C,(v,*;„)}) where v'n = {RECn:RESUME,{C>}} From the induction hypothesis we have valWt(v'n), and valWt=tt{{C;v})
then gives
valWt({C,(v,v'B)}) Lemma 6.3.12 and the properties of postW and nothunk then give
postWt(A) A nothunk(A) This establishes (P n +i) and completes the numerical induction. Finally, we obtain (P) using Lemma 6.3.21.
•
L e m m a 6.3.23 compW(t—>t)—n(K(f ix[£])) holds for all well-formed and composite types t that do not involve lists. • Proof: The proof is by induction on the structure of the type t. The base cases are when t is pure and when t is a frontier type. The case t is pure. As in the proof of Lemma 6.2.3 it is straightforward to show if pure(£) then compWt(x)
= true
182
6
Code Generation
by structural induction on t. It is then immediate that compW(t-*t)->t{K{fix[t])). The case t = t\z±t2 is a frontier type. We assume that
and must show compWt(K{fix[t])(F)) since Lemma 6.2.3 ensures compS(K(fix[t])). d>0 and to show
It suffices to consider an arbitrary
compSW't(K(fix[t])(F){d)) because the compS part of the result follows from Lemma 6.2.3 and the 'layered predicates' (Lemma 6.3.5); but it is more convenient to imagine showing compWt(\d'.K{fix[t])(F)(d))
(P)
If F(Xdf.CALL d)(d+l) — L this is immediate so assume that F(\d'.CALL d)(d+l) ^ _L and write C = F(\d'.CALL
d)(d+l)
To prove (P) we begin by proving comp W t(\d' .CALLRECn(d,C))
(P n )
by numerical induction on n. In the base case, n=0, we immediately have compS Wt(\d' ,CALLREC0(d,C)) because ExSeq(*,CALLRECo(rf,C),v)=0 for all values v. Furthermore, we have compS t(\d' .CALLREC0(d,C)) much as in Section 6.2. To be more precise it suffices by Lemma 6.2.5 to show that FreeLab(CALLREC0(d,C)) = 0 Setting S = {(d,CALL d)} we have compS\T]t(\d'.CALL
d,Ad'.CALL d)
6.3
Well-behavedness of the Code
183
so that compSt-+t(F) gives compS[S]t(F(Ad'.CALL d), F(\d'.CALL d)) and it follows that FreeLab(C) = FreeLab(F(Ad'.CALL d)(d+l)) C {d} so that FreeLab(CALLRECo(d,C))=0 follows. For the induction step we assume (P n ) and show (P n +i). It suffices to prove the compSW part, which boils down to compSW't(c
ALLRECn+i(d,C)),
as the compS part follows as above. For this we note that (P n ) a n d the assumptions about F give comp W t{F (\df
.CALLRECn(d,C)))
from which compSW't(F(\d'.CALLRECn(d,C))(d+l)) follows. Using the 'substitution property' (Lemma 6.2.2, or rather, an obvious analogue) we have compSWft(C[CALLRECn(d,C)/ d]) We then claim that this establishes compSWft(CALLRECn+1(d,C)) To see this note that for all values v, perhaps such that valWtl(v), A G ExSeq(*,CALLRECn+1(d,C),i;) if and only if A(0) = (CALLRECn+i(c/,C),[v]) and A(l..) G ExSeq(*,C[CALLRECn(d,C)/d],*;) and note that the properties of nothunk and postW ensure that postWt2(A)
A nothunk(A)
holds if and only if postWt2(A(l..))
A nothunk(A(l..))
we have that
184
6
Code Generation
This ends the proof by numerical induction. To be able to conclude (P) it suffices to show compSW't(C ALLREC(d,C)) This amounts to assuming
valWtl(v) and showing valWt2({CALLREC(d,C),v}) Using (P n ) we already have valWt2({CALLRECn(d,C),v})
for all n
and the result then follows using Lemma 6.3.21. The case t is a product of composite types. This case is analogous to the similar case in the proof of Lemma 6.2.3 in that one must show that compW(F) carries over to the functions supplied as argument to K(f ix[^]) and K(f ix[£2]); as in the proof of Lemma 6.2.3 we simply refer to [72, Lemma 7.13] for an example of a proof along these lines. •
Summary of the well-behavedness properties The lemmas proved above together yield the following analogue of Lemma 6.2.3: Corollary 6.3.24 We have comp W(K(<j>)) for all operators <j> of Table 5.1 whose type does not involve lists, except those of form fj or Fj and provided that the type t indexing f ix[f] is always composite. • By analogy with Lemma 6.2.4 we have: L e m m a 6.3.25 ('Structural induction') Let penv be a well-formed position environment (in the sense of Section 5.2) such that pps(penv)
K
h e:t
and assume that neither e nor penv involves any lists. If compW(K{(j>)) holds for all operators (j) that occur in e, then compW{\e\(K))
•
Proof: The proof is by structural induction on e much as in the proof of Lemma 6.2.4. Apart from using Lemma 6.2.4 to establish the compS part of the result, the proof of the compSW part proceeds along the same lines as in the proof of Lemma 6.2.4. We dispense with the details. •
6.4
Correctness of the Code
6.4
185
Correctness of the Code
We now have the apparatus needed for showing the correctness of the code generated. There are two ingredients to this, depending on whether or not the execution of a piece of code gives rise to termination. So suppose that some value of the abstract machine represents some semantic value. We shall then show that • if the execution of the code upon that value gives rise to a terminating computation that produces some new value, then this value represents the result of applying the semantic function to the original semantic value; and • if the execution of the code upon the original value gives rise to a nonterminating computation, then the result of applying the semantic function to the original semantic value gives _L. Given the determinacy of the abstract machine, and of the mixed A-calculus and combinatory logic, this constitutes the desired correctness property. In the remainder of this section we shall formalise this notion of correctness and then establish the required correctness properties for the operators and expressions of the mixed A-calculus and combinatory logic. This turns out to be a rather systematic extension of the development for showing well-behavedness.
6.4.1
Definition of the correctness predicates
We shall begin by adapting the val and comp predicates of the previous section so as to express the correctness. This calls for adding an additional parameter, namely the corresponding semantic value or function. The predicate on values is valCt: Val x S(t) -* {true,false} where t is a well-formed 2-level type of run-time kind. It is then natural to extend the partial ordering on { ^ , 5 } to one on { W,S,C} such that •
C
i,
W
i,
S
and with the understanding that there is no valS predicate 2. Also note that even though the type index was, in principle, dispensable in the definition of valW 2
Simply because the substitution property relates to the 'compile-time level' only.
186
6
Code Generation
it seems necessary to include it in the definition of valC in order to be able to express the domain in which the semantic entity is an element. We now propose the following definition of valC (leaving an explanation of the B\ function until afterwards). valCAt(bbx)
= valW^i)
valCtlxt2((v1,v2),x)
A ^[bj] - x
= 3xux2:
x = up((x1,x2)) A valCtl(vi,xi) A valCt2(v2,x2)
valCtl=ti2({C;v0}J)
= valWtl=tt2({C]v0}) A valWCil=±t2({C;v0}J) where valWCil=Lt2({C;v0}J) = (/ ^ J.) A ( (valCtl(vux) => * 2 ( { C , ( ^
valCt({C,v},x) = valWt({C,v}) A valWCt{{C,v},x) where valWCt({C,v},x) = VAe ExSeq(oo,C,t;): ( AeExSeq(*) => postCt{A,x) A nothunk{A) ) A ( AGExSeq(cj) =» x=±) where postC, and by analogy preC, is defined from valC in much the same way that postW', and preW, was defined from valW. We need to explain the role of the B\ functions. Each B\ function has functionality Bi : Val -> S(Ai) and purports to connect a base value bj of type A] with the appropriate element in the standard semantics. Thus #booi[true] = t r u e #booi [false J = false etc., but we shall leave the details of the remaining B\ functions unspecified. By analogy with Lemma 6.3.2 we have: L e m m a 6.4.1 The clauses for valCt define a predicate valCt : Val x S(t) -> {true,false} whenever t is a well-formed ^-level type of run-time kind that does not involve any list types. • Proof: The proof is by complete induction using the same well-founded order that was used in the proof of Lemma 6.3.2. We therefore dispense with the details. • The relationship to valW is expressed by:
6.4
Correctness of the Code
187
Lemma 6.4.2 ('Layered predicates') For all well-formed i?-level types t of runtime kind, vE Val and xGS(t) we have valCt{v,x) =$> valWt(v) Proof: The proof is by induction on the shape of the inference tree for valCt(v,x), that is, by induction on the well-founded order of the previous lemma. If the first, third or fourth clause of valC is used, the result is immediate from the valW conjunct that is present in the definition of valC. If the second clause is used, the result follows from the induction hypothesis. • Turning to code sequences we define a predicate compCt: [t}(K) x \t](S) -* {true,false} by structural induction on well-formed 2-\eve\ types t of compile-time kind: compCAi(x,y)
= x=y
compCtlXt2((x1,x2),(y1,y2))
= compCtl(xi,yi)
A compCt2(x2,y2)
_^ 2 (F,G) = compWtl^t2{F) A compWCtl^t2{F,G) where compWCt1->t2(F ,G) = Vx,y: compCtl(x,y) =» compC t2{F(x),G(y)) compCtl-±t2(RC\g) = compWtlz±t2(RC) A compWC\1:=Lt2(RC\g) comPWC'tl=Lt2(RC{d),g) where compWCix=Li2{RC,g) = Vrf>0: where compWC'tl_+t2(C,g) = (C= ±^ g = ±) A (C^ ±=> g^±A Vv,y: {valCtl{v,y) =>
valCt2({C,v},dn(9)y)))
By analogy with Lemma 6.3.4 we have L e m m a 6.4.3 The above clauses define an admissible predicate compCt, whenever t is a well-formed £-level type of compile-time kind that does not involve any list types. • Proof: This is a simple structural induction.
•
The relationship to compW is expressed by the following analogue of Lemma 6.3.5: Lemma 6.4.4 ('Layered predicates') If t is a well-formed £-level type of compiletime kind, se[<](K) and yG[t](S) then compCt(x,y)
=> compWt(x)
Proof: This is a simple structural induction.
•
188
6 Code Generation
6.4.2
Correctness of operators
Having defined the predicates we can next confront the task of proving that they hold for the operators. As has already been said this will turn out to be a rather systematic extension of the proofs of the previous section and we therefore only provide some of the more interesting details. We begin by observing that one may formulate properties of preC, postC (and nothunk) much as those formulated for preW, postW (and nothunk). This amounts to little more than extending preW and postW with an additional parameter denoting the semantic value. We therefore dispense with the formulation of these facts. The characterization of RESUME leads to the following analogue of Lemma 6.3.12. Lemma 6.4.5 If AeExSeq(oo,RESUME) and preCt(A,x)
then
• AeExSeq(*) => postCt(Ajx) A nothunk(A) • AeExSeq(u>) => x=± Note that this clearly states the intention with RESUME: it will not change the semantics of the entity on top of the stack but will massage it so that it is not a thunk. Proof: Let A(0) = (RESUME,[v]). If v is not a thunk we know that A = (RESUME,[V]) -> (e,[v]) so AEExSeq(*), nothunk(A) and postCt(A,x). If v is a thunk, that is v::thunk, we can write v — {Ci,Vi} so that = (CVRESUME,K])
Let now AiGExSeq(oo,Ci,Vi) be determined by Ax = A(l..) oc RESUME From valCt({Ci>Vi},x) we have • AxGExSeq(*) => postCt(Aux)
A
nothun^Ax)
• Ai£ExSeq(u;) => x=l If AxEExSeq^) also A£ExSeq(u;) and we have x=l. as desired. If AiGExSeq(*), that is AiEExSeq(rai) for some mi, we have = (RESUME,^])
6.4
Correctness of the Code
189
for some v2-:n°thunk and valCt(v2jx). Then
and we have AEExSeq(*), nothunk(A) and postCt(A,x)
as desired.
•
We now turn to the operators. The proofs will mimic those of Section 6.3 but with two differences. One is that we have to take the semantic entity into account when proving the result. The other is that the proofs proceed by decomposing execution sequences AEExSeq(oo) rather than AEExSeq(*); this means that we must take care to properly handle infinite execution sequences. By analogy with Lemma 6.3.13 we have L e m m a 6.4.6 compCtlxt2l±t1 (K(Fst),S(Fst)) holds whenever tiX_t2z±t\ is a wellformed type that does not involve lists. • Proof: We must show compW as well as compWC. The first result is a consequence of Lemma 6.3.13 so we may concentrate on showing compWC. For this let , write C = K(Fst)(d) = RESUME:FST:RESUME and note that C^=_L as well as S(Fst)^_L Next let v and y be given such that valCtlxt2(v,y) and consider showing valCtl({C ,v},dn(S(Fst))(y)). Using the compW part of the result, and that valC(v,y) implies valW(v), it actually suffices to prove
For this let AGExSeq(oo,C,v). As in the proof of Lemma 6.3.13 we now proceed in three stages. Stage 1: Let AiEExSeq(oo,RESUME,z;) be given by A i = A OC FST:RESUME
If AiGExSeq(u;) also A£ExSeq(u;) and we have y=J_ using Lemma 6.4.5; it then follows that dn(S(Fst))(y) = J_. Otherwise, AiEExSeq(*) and there exists m1 such that AiGExSeq(mi); using Lemma 6.4.5 we then have
postCt1xt2(A(mi),y) A nothunk(A(mi)) Stage 2: We now know that A(rai) is of the form (FST:RESUME,[v2]) where v2::nothunk and valCtlxt2{v2',y)' By inspection of the clauses for valC it follows that
190
6
V=
Code Generation
up(yuy2)
for appropriate t;2i, ^22 £ Val and y\ G S(^i). It is then immediate that valCtl(v2i,yi) A(mi + 1) = (RESUME, V2I) and postCtl(A(mi+l),
dn(S(Fst))(y)) follows.
Stage 3: Let A3eExSeq(oo,RESUME,v2\) be given by A 3 = A(mi+1..). One case is when A3EExSeq(u>) and then also AGExSeq(u;) and we have dn(S(f st))(y) = _L as required using Lemma 6.4.5. Otherwise, A3£ExSeq(*) and Lemma 6.4.5 ensures postCtl(A,
dn(S(fst))(y))
A nothunk(A)
This concludes the proof of valWCtl({C,v},dn(S(Fst))(y)).
•
Corollary 6.4.7 A similar result holds for Snd, Id, True and False.
•
Lemma 6.4.8 corapCt/(K(Tuple),S(Tuple)) holds for all well-formed iMevel types t' that are of the form ("izl'i) —> (*z±*2) —*• {tz±tiXt2) and that do not involve lists. • Proof: Lemma 6.3.15 proves the compW part of the result and proceeding along the lines of the proof of Lemma 6.3.15 it actually suffices to assume that i) for i=l,2 and to show that compWC(RC,g) where RC =
K{Tu?le)(RC1){RC2)
g = S(Tuple)(^)(sf2) So let d>0 and consider RC(d). If RC(d)=± there is some i such that Rd(d)=±; from compWC(RCi,gi) we then have g'{=•!. and it follows that g=A. which then establishes compWCf(RC(d),g). If RC(d)^± then also RC1{d)^l. and R and we have RC(d) = ENTER:DELAY(i2C2(rf)):SWITCH:DELAY(JRCfi(c?)):TUPLE Furthermore, also gi^-L and g2^-L so that g^L and we have
6.4
Correctness of the Code
191
g = up(Xy. up{dn(gl){y), To show compWC'(RC(d),g) we next consider v and y such that valCt(v,y) and we must show valCt1xt2({RC(d),v}, dn(g)(y)). Using Lemma 6.3.15 (and the 'layered predicates', Lemmas 6.4.4 and 6.4.2) this boils down to showing valWCtl*t2({RC(d),v},
dn(g)(y))
So let A£ExSeq(oo,RC(d)yv). It is immediate to see that AEExSeq(5) and that A(5) = (e,[({RC!(d),v}, {RC2(d),v})}) As nothunk(A) is immediate it remains to show valCtllLti(({RC1(d),v},{RC2(d),v}),up{dn(g1)(y),dn{92)(y))) and given the definition of i;a/CtlXt2 ^his follows from valCtdRCiid^v},
dn(9i)(y)) for i=l,2
which is a consequence of valCt(v,y) and compC(RC\^g1) for i=l,2.
•
Lemma 6.4.9 corapCV(K(D),S(n)) holds for all well-formed i?-level types t' that are of the form (^2z±^3) —* (^iz±^) —* (^11=^^3) and that do not involve lists. • Proof: Lemma 6.3.16 proves the compW part of the result and proceeding along the lines of the proof of Lemma 6.3.16 it actually suffices to assume that
and to show that compWCtl^t3(RC,g) where RC = K(u)(RC1){RC2) and g = S(n)(gi)(g2). So let d>0 and consider RC(d), If RC(d)=±. also RCi(d)=±. for some i so that g\=L and hence g=-L as desired. Otherwise, RC(d)^L in which case RC\(d)^l. and RC2{d)^l. and we have RC(d) = Furthermore, also gi^A- and g2^-L so that g^-L and we have g = up(Xy. dn(gl)(dn(g2)(y)))
192
6
Code Generation
To show compWC (RC (d) ,g) we next consider v and y such that valCtl(v,y) and must show valC({RC(d),v},dn(g)(y)). Using Lemma 6.3.16 (and the 'layered predicates', Lemmas 6.4.4 and 6.4.2) this boils down to showing valWCt3({RC(d),v},dn(9)(y)) So let AeExSeq(oo,RC(d),v). We know that so that ()
€
From cornpC{RC2i92) we have valCh({RC2(d),v},
dn(g2)(y))
and using compC(RCi,gi) we then have • A(l..)€ExSeq(*) => nothunk(A(l..)) A postCt3(A(l..),dn(9)(y)) • A(l..)eExSeq(a;) ^ dn{g){y)=JL In both cases the desired result, with A replacing A(l..), follows immediately. • Lemma 6.4.10 corapC(K(Cond),S(Cond)) provided that the type of Cond does not involve lists. • Proof: Lemma 6.3.17 proves the compW part of the result and proceeding along the lines of the proof of Lemma 6.3.17 it actually suffices to assume that
compCtl=±t2(RC3,g3)
and to show that compWCil=Li3{RC,g)
where RC = K(Cond)(#Ci)(#C 2)(#C3) and g = S{Cond)(g1)(g2){g3)' So let d>0 and consider RC(d). If RC(d)=± also RC[(d)=± for some i so that g\=±- and hence g=-L as desired. Otherwise, RC(d)^±. in which case RC\^A. for all i, and we have
6.4
Correctness of the Code
193
RC(d) = EKTER:RC1(d):BRAKCK(RC2(d),RC3(d)) Furthermore, also g^A- for all i, so that g^A. and we have
{
dn{g2){y) if dn(g1)(y)=true dn(gs)(y) if ^(^)(y)=false ) J_ if dn(gi)(y)=±
To show compWC'(RC(d),g) we next consider v and y such that valCtl{v,y) and must show valCt2({RC(d),v},dn(g)(y)). Using Lemma 6.3.17 (and the 'layered predicates', Lemmas 6.4.4 and 6.4.2) this boils down to showing valWCt2({RC(d),v},dn(9)(y)) So let AEExSeq(oo, RC(d),v) and let us proceed in stages, as in the proof of Lemma 6.3.17. Stage 1: We know that A(l) = (RC1(d):BRANCR(RC2{d),RC3(d)),[v,v]) Stage 2: Let A2eExSeq(oo,JRCi(d),v) be given by A2 = (A(l..)<xBRAKcn(RC2(d),RC3(d)))(x[v} We have preCtl{A2,y) and now have two cases depending on whether A2 is finite or not. If A2GExSeq(cj) we have dn(gi)(y)=A. from compC(RCi,gi)] it follows that also A£ExSeq(u;) and that dn(g)(y)=A. as desired in this case. Otherwise, A2GExSeq(*) and we have A nothunk(A2) from Stage 3: Writing A2GExSeq(m2) we have A(l+m 2 ) = (BRANCH(/2C2(d),i2 for some wE. Val such that w::nothunk and By inspection of the clauses for valC it follows that w is either true or false. It follows that A(2+m2) = (RC}(dUv]) dn{g){y) =
194
6
Code Generation
for a value jE{2,3} depending on w. Stage 4: Let A4EExSeq(oo,i2Cj(d),i;) be given by A 4 = A(2+ra 2 ..) Again we have two cases depending on whether A 4 is finite or not. If A4EExSeq(u>) we have dn(gj)(y)=±. from compC'(RCj,<7j); it follows that also AEExSeq(i<;) and that dn(g)(y)=A. as desired in this case. Otherwise, A 4 £ExSeq(*) and we have
postCt2(A4,dn(gj)(y)) A nothunk(^) from compC(RCvgj). to
But using the properties of postC and nothunk this amounts
postCt2{A,dn(gi)(y))
A nothunk(A)
which is the desired result as also AEExSeq(*).
•
L e m m a 6.4.11 corapC(K(Curry),S(Curry)) provided that the type of Curry does not involve lists. • Proof: Lemma 6.3.18 proves the compW part of the result and proceeding along the lines of the proof of Lemma 6.3.18 it actually suffices to assume that
and to show that
where RC — K(Curry)(RCi) and g = S(Curry)(#i). So let d>0 and consider RC(d). If RC(d)=± also RCi(d)=± so that gi=-L and hence g=-L as desired. Otherwise, RC(d)^± in which case RC\{d)^L and we have RC{d) = CURRY(#Ci()) Furthermore, also ^17^-L so that g^-L and we have g = up(\y0.
up(Xy1.
To show compWCf(RC(d),g)
dn(g)(up(yo,y1)))) we next consider v0 and y0 such that
valCto(vo,yo) dn(g)(y0)). Using Lemma 6.3.18 (and the and must show valCtl^t2({RC{d),v0}^ 'layered predicates', Lemmas 6.4.4 and 6.4.2) this boils down to showing
vatWCtl=ttt({RC(d),vo},dn(g)(yo))
6.4
Correctness of the Code
195
For this let AGExSeq(oo,i?C(rf),i;o). It is immediate that AEExSeq(l) and that
Then nothunk(A) is immediate and to show postC(A,dn(g)(y0)) valCtlz±t2({RC1{d)]v0},
we must show
dn{g)(y0))
Using Lemma 6.3.18 (and the 'layered predicates') this boils down to showing
It is evident that dn(g)(yo)^±_ so consider Vi and yi such that
valCt^Vi^) and consider showing
From valCto(vo,yo) and valCtl(vi,yi) valCtoxtAi^o^i),
we have
up(yo,y1))
and using compC ^0xt1)=tt2(RCi><7i)
valCt2({RCi(d),(v0,vi)},
we
obtain
dn(gi)(up(yo,yi)))
which is the desired result.
•
L e m m a 6.4.12 compC((<1_±<2)xtl)_l.i2(K(Apply),S(Apply)) holds for all well-formed that do not involve lists. • types ((tiz±t2)xti)^t2 Proof: Lemma 6.3.19 proves the compW part of the result so it suffices to prove compWC
((tlz±t2)xt1)I±t
where RC = K(Apply) and g = S(Apply). So let d>0 and note that as well as g^A- and that RC(d) = RESUME:C" g = up(Xy. dw(/)(j/!) where (/,l/i)=dn(y)) where C" = ENTERlSNDlSWITCHlFSTlRESUMElC'' C" = APPLYlRESUME
To show compWCf(RC(d),g)
we next consider v and y such that
196
6
Code Generation
and must show valCt2{{RC(d),v}, dn(g)(y)). Using Lemma 6.3.19 (and the 'layered predicates', Lemmas 6.4.4 and 6.4.2) this boils down to showing valWCt2({RC(d),v},
dn(g)(y))
So let AEExSeq(oo,/2C(d),i;). We proceed in stages as in the proof of Lemma 6.3.19. Stage 1: Let AiEExSeq(oo,RESUME,!;) be given by Ai = AocC". We now have two cases depending on whether Ai is finite or not. If AiGExSeq(cj) we have y=± using Lemma 6.4.5; then also AEExSeq(u>) and dn(g)(y)=±. which is the desired result in this case. Otherwise, AiEExSeq^) and there exists m1 such that A 1 EExSeq(m 1 ); using Lemma 6.4.5 we then have postC{tl=tt2)xt1{^{mi),y)
A
nothunk(A(m1))
Stage 2: Writing A(rrii) = (C\[v]) it follows by inspection of the clauses for valC that v must be of the form v = (u,w) It then follows from the definition of valC(tl=±t2)xt1 that also y must be of the form y=
up(f,yi)
and where valCtl=±t2(uJ) valCtl(w,yi) Furthermore, A(mi+4) = (RESUME: C",[u,w]) Stage 3: Let A3EExSeq(oo,RESUME,u) be given by A 3 - (A(m 1 +4..)ocC // )ocW If A3EExSeq(u>) we have /=-L using Lemma 6.4.5; then also AEExSeq(u;) and dn(g)(y)=A. which is the desired result in this case. If A 3EExSeq(*) there is m 3 such that A 3 EExSeq(m 3 ) and Lemma 6.4.5 gives t2(^3j)
A nothunk(A3)
Stage 4: It follows that
A(m!+4+m3) = (C",[u>])
6.4
Correctness of the Code
197
where valCtl=Lt2(u'J) Since ^'::nothunk it follows by inspection of the clauses for valC that u1 is of the form u1 = {C;U} Then A(rai+5+m 3 ) = (RESUME,[{(7,(¥,w)}]) and valCt2({C,(H,w)},
dn(f)(y1))
follows from valC({C;u}J)
and
valC(w,y1).
Stage 5: Let A5eExSeq(oo,RESUME) be given by A 5 = A(m 1 +5+m 3 ..) If A5EExSeq(u;) we have dn(f)(yi)=±. by Lemma 6.4.5; then also AEExSeq(u;) and dn(g)(y)=A. which is the desired result in this case. If A 5EExSeq(*) we have )A
nothunk(A5)
by Lemma 6.4.5. Using the properties of postC and nothunk this yields postC't2(A,dn(g)(y))
A nothunk(A)
which is the desired result in this case.
•
For the remaining operators, Fix and fix, we follow the approach of the previous section. This calls for first establishing a lemma about the behaviour of valC when taking least upper bounds. L e m m a 6.4.13 Let (vn)n be a chain of values with v as their least upper bound and let (yn)n be a chain of semantic values with y — [_| n yn as their least upper Q bound. If Vn: valCt(vn,yn) then also valCt(v,y). Proof: As in the proof of Lemma 6.3.21 we proceed by structural induction on t. In each case we first consider the situation where v is not a thunk and then finally consider the situation where v is a thunk. As in the proof of Lemma 6.3.21 we have that vn is a thunk if and only if v is. The case t::=Aj and v::nothunk. As in the proof of Lemma 6.3.21 we must have vn=v for all n. It then follows that also yn=y for all n and the desired result is immediate. The case t::=tiXt2 and t;::nothunk. We must have v to be of the form (u,w) and each vn to be of the form (un,wn) such that u is the least upper bound of (u n)n and w is the least upper bound of (wn)n. In a similar way we must have y to be of the form up(x,z) and each yn to be of the form up(xn,zn) such that x = |J n xn and z = LJn zn- From Vn: valCt(vn,yn) we obtain
198
6 Vn:
valCtl(unjxn)
Vn:
valCt2(wn,zn)
Code Generation
and using the induction hypothesis we obtain valCtl(u,x)
A valCt2(w^z)
from which the desired result follows. The case t::=tiz±t2 and v::nothunk. We must have v to be of the form {C;u} and each vn to be of the form {Cn;un} such that C is the least upper bound °f (Cn)n a n d u is the least upper bound of (un)n. From valCt(vn,yn) it follows that t/n^-L from which y^A- is immediate. Concerning valCt{v,y) we note that the valW part follows from Lemma 6.3.21 (and the 'layered predicates', Lemma 6.4.2). To prove (what remains of) the valWC part, consider w and z such that
valCtl(w,z) We then have Vn:
valCt2{{Cn,(un,w)},dn(yn)(z))
from which
valCt2({C,(u,w)},dn(y)(z)) follows using the induction hypothesis. This establishes the desired result in this case. The case v::thunk. We must have v to be of the form {C,w} and each vn to be of the form {C n ,^ n } and such that C is the least upper bound of the chain (C n ) n and u is the least upper bound of the chain (un)n. Concerning valCt(v^y) note that the valW part follows from Lemma 6.3.21 (and the 'layered predicates', Lemma 6.4.2). To show valWCt({C,u},y) we consider A e ExSeq(oo,C» and have two cases depending on whether A is finite or not. If AEExSeq(*) we proceed as in the proof of Lemma 6.3.21. Using Lemma 6.3.20 there exists a natural number n 0 such that A n G ExSeq(*,Cn,wn) for n>n 0 uniquely determines execution sequences A n . As in the proof of Lemma 6.3.21 we have a natural number m such that A G ExSeq(ra) A n G ExSeq(ra,C n ,w n ) for n>n 0
6.4
Correctness of the Code
199
and such that (A n (m)) n > no is a chain with least upper bound A(ra) Using valCt({Cn,un},yn) postCt{An,yn)
we now have
A nothunk(An) for n>n 0
from which
postCt(A,y) A nothunk(A) follows using the induction hypothesis (for the same type t but for non-thunk values). This establishes the result in this case. If AEExSeq(u;) it is easy to adapt the proof of Lemma 6.3.20 to show also that all A n G ExSeq(oo,C n ,u n ) have AnGExSeq(u;). Using valCt({Cn^un}yyn) it follows that yn—± hence y=A- as is the desired result in this case.
for all n and •
L e m m a 6,4.14 compC^^^t (K(Fix[£]),S(Fix[J])) holds for all well-formed types (tz±t)=±t that do not involve lists. • Proof: Much as in the proof of Lemma 6.3.22 it suffices to prove
as the compW part of the result follows from Lemma 6.3.22. For this let d>0 and show compW^C'(t_^t(RESUME:REC:RESUME, up(\f
.FIX(dn(f))))
Using Lemma 6.4.5 it suffices to prove valCiz±t({C;v},up{f))
=> va/C,({REC:RESUME,{C;i;}},FIX(/))
(P)
For this we begin by proving valCi=±i({C;v},up(f))
=> valCt{{RECn:RESVME,{C',v}}jn(±))
(P n )
by induction on n. The basis case, n=0, is immediate as ExSeq(*,REC0:RESUME,{C>}) = 0
For the inductive step we assume (P n ) and prove (P n +i). So assume that C, v and / are such that valCt=±t{{C',v},up(f)) and consider
200
6
Code Generation
A G ExSeq(oo,RECn+1:RESUME,{C>}) We have A(L.) e ExSeq(oo,RESUME,{C,(v,^)}) where v'n = {RECn:RESUME,{C>}} From the induction hypothesis we have valCt(v'nJn(±)) and valCt-Lt({C;v},up(f)) then gives
Lemma 6.4.5 then gives postCt{A,fn+1(L))
A nothunk{A)
if A(l..)eExSeq(*) and
if A(l..)EExSeq(u;). This establishes (Pn+i) and completes the numerical induction. Finally, we obtain (P) using Lemma 6.4.13. • Lemma 6.4.15 compC(t—^—^(K^f ix[£]),S(f ix[£])) holds for all well-formed and composite types t that do not involve lists. • Proof: The proof is by induction on the structure of the type t. The base cases are when t is pure and when t is a frontier type. The case t is pure. As in the proof of Lemma 6.3.23 it is straightforward to show if pure(£) then compCt(x,y) = x=y by structural induction on t. It is then immediate that compC(t->t)->t{K{fix[t])£{fix[t])). The case t = ti^_t2 is a frontier type. We assume that compCt->t(F,G) and must show compCt(K(tix[t])(F),S{fix[t])(G))
6.4
Correctness of the Code
since Lemma 6.3.23 ensures compWt{K(fix[t])). trary d>0 and to show
201
It suffices to consider an arbi-
compWC't(K(fix[t])(F)(d)£(fix[t])(G)) because the compW part of the result follows from Lemma 6.3.23 (and the 'layered predicates', Lemma 6.4.4); but it is more convenient to imagine showing compCt(\d\K(fix[t])(F){d),S(fix[t])(G))
(P)
We have two cases depending on whether F(Ad'.CALL d)(d+l) equals _L or not. The first subcase: Consider the situation where F(\d'.CALL d)(d+l) = JL
(1)
To proceed we should like to deduce that F(Ad'.CALLREC(d,CALL d))(d+l) = _L
(2)
This does not follow immediately from the 'substitution property' (Lemma 6.2.2) but may be proved by amending the proof of Lemma 6.2.2. To do so let S = {(d,CALLREC(d,CALL d))} and note that EESubst. Clearly CompS\T](\d' .CALL d,Ad'.CALLREC(d,CALL d)) and from corapS[0](F,F) we have COmpS[T,](F(\df .CALL d), F(Ac//.CALLREC(d,CALL d))) Since (1) holds it then follows that (2) holds. — We next claim that CompCt(\d'.CALLREC(d,CALL d),up(±)) holds. To prove this it suffices to consider v and y such that valCh(v,y) and to show va/C*2({CALLREC(d,CALL d),w}, ±(y)) But this is immediate since any
(3)
202
6
Code Generation
A G ExSeq(oo,CALLREC(d,CALL d),v) will be infinite and indeed J_(j/)=_L. — Using (3) and compCt-+t{F,G) we then have compCt{F(\df.CALLREC(d,CALL
d)), G(up(±)))
From (2) it follows that
G(up(±)) = J_ so that
As also K(f ix[J])(F)(d) = _L this establishes (P) in this case. The second subcase: Consider the situation where F{\d'.CALL so that K(f ix[i])(d) = CALLREC(d,C) where C = F(Xdf.CALL d)(d+l) To establish (P) we begin by proving compCt(\d'.CALLRECn{d,C),
Gn(up(±)))
(P n )
by numerical induction on n. In the base case, n=0, we immediately have compW t{\df .CALLREC0(d,C)) using the proof of Lemma 6.3.23 and compWCt(\df.CALLREC0{d,C),
up(±))
follows because any A E ExSeq(oo,CALLREC0(d,C),i;) is infinite and dn(up(A.))(y) = JL(j/) = _L regardless of the values v and y. For the induction step we assume (P n ) and show (P n + i). It suffices to prove the compWC part, which boils down to complVC;(CALLRECn+i(d,C),
Gn+1{up(±)))
6.4
Correctness of the Code
203
as the compW part follows from the proof of Lemma 6.3.23. For this we note that (P n ) and compCt->t(F,G) give compCt(F(\df.CALLRECn(d,C)),
Gn+1(up{±)))
from which compWC't(F(\d'.CALLRECn(d,C))(d+l),
Gn+1Mi.)))
follows. Using the 'substitution property' (Lemma 6.2.2, or rather, an obvious analogue) we have Gn+1(up(±)))
compWC't(C[CALLRECn(d,C)/d], We then claim that this establishes ,C),
Gn+1{up{±)))
To see this note that for all values v, perhaps such that valCtl(v,y) we have that
for some y,
A E ExSeq(oo,CALLRECn+i(rf,C),t;) if and only if A(0) = (CALLRECn+1(rf,C),[v]) and A(l..) e ExSeq(cx>,C[CALLRECn(^C)/d],t;) This ends the proof by numerical induction. To be able to conclude (P) it suffices to show compWC't(CALLREC(d,C),
S(fix[t])(G))
This amounts to assuming valCtl(v,y) and showing valCt2({CALLREC(d,C),v},
dn(S(fix[t])(G))(y))
Using (P n ) we already have valCt2({CALLRECn(d,C),v},
dn(Gn(up(±)))(y))
for all n
and the result then follows using Lemma 6.4.13. The case t is a product of composite types. This case is analogous to the similar case in the proof of Lemma 6.2.3 in that one must show that compC(F^G) carries over to the functions supplied as argument to K(fix[^]) and K(fix[f 2 ]); as in the proof of Lemma 6.2.3 we simply refer to [72, Lemma 7.13] for an example of a proof along these lines. •
204
6 Code Generation
Summary of the correctness properties The lemmas proved above together yield the following analogue of Lemma 6.2.3 and Corollary 6.3.24: Corollary 6.4.16 We have corapC(K(>),S(<£)) for all operators <j> of Table 5.1 whose type does not involve lists, except those of form f j or Fj and provided that the type t indexing f ix[£] is always composite. • By analogy with Lemmas 6.2.4 and 6.3.25 we have: L e m m a 6.4.17 ('Structural induction') Let penv be a well-formed position environment (in the sense of Section 5.2) such that K
Pps{penv) I" e:i and assume that neither e nor penv involves any lists. If compC(K(),S((l>))
holds for all operators (j) that occur in e, then
compC([e](K),[e](S))
•
Proof: The proof is analogous to that of Lemma 6.3.25 and is omitted.
•
K
Corollary 6.4.18 Suppose that 0 h e:t1=±t2, that compC(K((j)),S((f))) holds for all operators <j> that occur in e, and that valCtl(v,y). Then the execution of [eJ(K)(void)(l) on the stack [v] will either • loop, in which case dn([eJ(S)(void))(y) = ±, or • produce a stack [w] such that valCt2(w,dn(le](S)(void))(y)).
•
This corollary immediately carries over to apply also to programs of the mixed A-calculus and combinatory logic; as no difficulties are incurred we dispense with the formal statement.
Bibliographical Notes The present proof is mostly inspired by [72] where a similar proof is conducted for a metalanguage called TMLSC and for an abstract machine. However, there are a number of differences. In TMLSC there are no run-time functions allowed, and the standard semantics is eager rather than lazy; on the other hand TMLSC allows recursive types and sums at compile-time and run-time. Overall this means
6.4
Correctness of the Code
205
that the complexity of TMLSC is slightly smaller than that of the mixed X-calculus and combinatory logic. Concerning the abstract machines the one used here has to support more features, in particular thunks, and so is more expressive; on the other hand it uses structured code which considerably eases the proof effort. In summary we believe that the proof carried out here considers a more complex correctness problem. More importantly, we believe that the techniques of [72] have been further refined, not least the use of 'layered predicates' and 'parameter monotonicity', so that the proof should be more readable. The proof carried out in this chapter, as well as that of [72], relates to many proof efforts conducted by others. In a sense [52], which is based on [97], is closest in spirit to our approach but it considers the G-machine and graph reduction rather than an abstract machine like ours. A standard reference is [55] and a more recent approach is [88]; the latter does not consider the translation (or compilation) between two syntactically different notations and we believe that this considerably simplifies the burden of proof. Approaches based on an algebraic perspective include [26], [60] and [100]. An often cited operational approach is [87] that relates the reduction rules of the A-calculus to the SECD-machine. We should like to offer the conjecture that the proof techniques used in this chapter would seem to carry over to a nondeterministic machine: the choice of a particular AGExSeq essentially means that we regard nondeterminism as a choice between a number of deterministic futures. Thus in any one of these we have no difficulty in talking about what will happen in the next configuration. — Such a conjecture can hardly be proved in general. Suffice it to say that the present proof technique was originally conceived during a proof of compiler correctness for a nondeterministic and imperative language; it was only abandoned in that setting because it was felt that parallelism did not easily lend itself to this proof technique. A brief appraisal of the material in Section 6.1 may be found in [69].
Exercises 1. Consider the expression fix(Af[lnt->(Int list)].Cons(ld[lnt],f)) Determine the semantics under the interpretation K. Simulate the code on the abstract machine to illustrate that it does not loop although the standard semantics specifies the meaning to be a function returning an infinite list. 2. The code generated in Example 6.1.5 is obviously rather inefficient. However it can be simplified by applying a few peephole optimizations. As an example it is always safe to replace the code fragment DELAY(C):RESUME with CARESUME. Suggest a couple of such optimizations and apply them to
206
6
Code Generation
the code generated in Example 6.1.5. How does the resulting code compare with that of Example 6.1.6? 3. Extend the definition of compS to allow compile-time lists. 4. Using the result of Exercise 3, extend the proof of Lemmas 6.2.1, (6.2.2), 6.2.3 and 6.2.4 so as to allow compile-time lists. 5. Give an inductive definition of the ordering ^ on values, code sequences and configurations. 6. (*) Extend the definition of valW and compW so as to allow lists. To ensure well-definedness of valW one idea is to introduce a natural number n > 0 as a counter on the length of execution sequences considered when making the statements about well-definedness. Well-definedness will then be ensured by judicious modification of the counter n. For code sequences, wellbehavedness now means n-well-behavedness for all n > 0. A code sequence C is n-well-behaved if its effect in m<no no is a chain with A(l) as its least upper bound. 8. Let K(Zero) = Ac/.CONST 0 and show
compW(K(Zero)).
9. Let K(+) be defined as in Example 6.1.2 and show compW(K(+)). note the similarities between K(+) and K(Apply)).
(Hint:
10. Carry out the details needed (in the proof of Lemma 6.4.13) for adapting the proof of Lemma 6.3.20 to show that all A n are infinite if A is infinite. 11. Extend Exercise 8 to show corapC(K(Zero),S(Zero)). 12. Extend Exercise 9 to show corapC(K(+),S(+)).
Chapter 7 Abstract Interpretation The rationale behind the development of parameterized semantics in Chapter 5 is that it facilitates a multitude of interpretations of the mixed A-calculus and combinatory logic. We saw examples of 'standard semantics' in Chapter 5 and a code generation example in Chapter 6 and in this chapter we shall give examples of static program analyses. We shall follow the approach of abstract interpretation but will only cover a rather small part of the concepts, techniques and tools that have been developed. The Bibliographical Notes will contain pointers to some of those that are not covered; in particular the notions of liveness (as opposed to safety), inducing (a best analysis) and expected forms (for certain operators). We cover a basic strictness analysis in Section 7.1. It builds on Wadler's fourpoint domain for lists of base types but generalizes the formulation to lists of arbitrary types. In Section 7.2 we then illustrate the precision obtained by Wadler's notion of case analysis. We then review the tensor product which has been put forward as a way of modelling so-called relational program analyses (as opposed to the independent attribute program analyses). Finally we show that there is a rather intimate connection between these two ideas.
7*1
Strictness Analysis
Strictness of a function means that 1_ is mapped to J_. In practical terms this means that if a function is strict it is safe to evaluate the argument to the function before beginning to evaluate the body of the function. Since this method usually leads to a more efficient implementation there has been a great deal of interest in strictness analysis for the implementation of lazy functional languages 1. Our approach to strictness analysis is via abstract interpretation. The key idea is to 'simulate' the execution of the program (or function) by using abstract values rather than the concrete values of the standard semantics. For a strictness x
It is not of interest for eager functional languages because here all functions are strict.
207
208
7
A(Ai) A(t 1 X t2) A(t A(t list)
= = =
Abstract Interpretation
2 (A(«i) x A(i2 )U (A(ii) -• A(i 2 ) ) ± 0)±)l Table 7.1: Type part of A
analysis to be useful in practice it must terminate on all programs, including those that may diverge in the standard semantics. A sufficient condition for achieving this is that there are only a finite number of abstract values pertaining to any one type. Furthermore, a strictness analysis will only be useful if its results are safe (or correct) with respect to the standard semantics. We shall formalise this notion later in this section but the key idea is that one must not be able to reach false conclusions from the results of the analysis. For well-known computability reasons this means that in general one cannot hope for a strictness analysis that is optimal, that is one that is safe and misses no instances of strictness. In the absence of perfect information we shall allow the strictness analysis to miss instances of strictness. It is therefore convenient to say that a function is definitely strict if the strictness analysis detects this and not definitely strict otherwise. So safety of a strictness analysis amounts to saying that a definitely strict function is also strict (with respect to the standard semantics S); optimality would then additionally claim that a not definitely strict function is also not strict but we shall not be able to obtain this for the analyses developed here. A trivially safe strictness analysis would say that all functions are not definitely strict.
7.1.1
Types
We shall follow the approach of Chapter 5 and specify the strictness analysis as an interpretation A. The type part of this is given in Table 7.1 and will be explained and motivated in the sequel. For the base types, Aj, we use the two-point domain t 1 2=
lo with elements 0 and 1 and the partial ordering 0 Q 1. The intention is that 0 means that the corresponding value in the standard semantics is definitely _L whereas 1 means that it can be any value (possibly also _L). The interpretation of the type constructors x_ and ^± essentially are as in the standard semantics. Concerning A(tiXt2) it is natural that a property of a pair of concrete values is a pair of properties, hence the use of A(ti)xA(^). But an
7.1
Strictness Analysis
209
element of S(tiXt2) is not just a pair of values but may also be the new _L-element introduced by the use of lifting for S(tiX_t2). It is feasible to let the property (_L,_L) of A(ti)xA(t2) describe ± as well as wp(J_,±) of S(tiXt2) but some lack of precision results: if it is known from the strictness analysis that some function call results in (J-,-L) of A(ti)xA(t2) we would not be able to determine whether the function is really strict (as when it results _l_ of S(tiXt2)) o r n °t (as when it results up(±,±.) of S(tiXt2)). Hence to obtain a useful analysis we use (A(ti)xA(t2))± for A(^i^2) so that the strictness analysis has the possibility of saying that the result of a function call is _L rather than wp(±,_L). Similar remarks apply to the definition of A(t-i —>t). Example 7.1.1 The expression part of the interpretation A will be defined in a subsequent subsection but a few examples may be helpful now in order to appreciate Table 7.1. The function Id[lnt] is a strict function (under the standard semantics S) and it is thus natural to model A(ld[lnt]) as up(Xa.a) This records that S(Id[lnt]) is a genuine function, i.e. of the form up(- • •), and that the strictness property of its result equals that of its argument. Consider next the function Zero[t—»Int] that is intended to ignore its argument and always return the number 0. Since this is not a strict function it is natural to model A(Zero[£->Int]) as up(Xa.l) Again this records that S(Zero[£—>Int]) is a genuine function and that the strictness property of its result is always 1, irrespective of the strictness property of its argument. For functions involving composite types strictness is a more complex concept. As an example the function Fst[lntj<Jnt] will return J_ (in the standard semantics S) if its argument is _L or is of the form up(A.,v) for some concrete value v. By using (2x2)j_ as the domain of strictness properties it is therefore natural to model A(Fst[lntx_Int]) as up(\a.a,i where (a,i,a2)=dn(a)) Consider next the function +[lntx_Int—•>Int] that is intended to add a pair of integers. It needs both components of the pair and it is therefore natural to model A(+[lntjant—>Int]) as up(Xa.ai\la2 where (di,a2)=dn(a)) where fl denotes the binary greatest lower bound operator. (Its existence is guaranteed by Lemma 7.1.6 below.) •
210
7
Abstract I n t e r p r e t a t i o n
There are many possibilities for how to interpret lists. We cannot directly copy the definition for S and use A(t)°° for A(t l i s t ) as this would not allow describing lists of different lengths. This could be amended by adding a new greatest element, but a disadvantage is that the domain for lists then is infinite. Another approach is to interpret lists as A(t)nx2 for some value of n, possibly n=0. Here we have 'perfect' information for the first n elements of the list but for the remainder of the list we only know whether it is definitely _L or not. This gives a finite domain but it turns out not to be overly useful in practice. The approach we shall pursue is to generalize Wadler's construction of a fourpoint domain for lists of base types, i.e. lists of type Aj l i s t . We shall exploit the fact that all A(t) will turn out to be finite complete lattices and we note in passing that continuity is then equivalent to monotonicity 2. To appreciate the general definition it may be helpful first to explain Wadler's construction. It sets A(Ai l i s t ) = (A(Ai) ± ) ± and since A(Aj) = 2 this may be depicted as follows le Oe n 1
nO For readability we write 0 for _L, 1 for up(JL), Oe for up(up(0)) and le for up(up(l)). The intention with these strictness properties are as follows: 0 1
describes the _L-list, i.e. [9, additionally describes all infinite lists, i.e. [vi,i>2?' * ']•> and all partial lists, i.e. [ui,t>2># * *$? Oe additionally describes all finite lists that have a -L-element, i.e. [vi,-•-,!.,•--,t; k ], le additionally describes all finite lists.
Many studies have shown that this domain is useful in practice and many efforts have been vested in finding a generalization to general lists that is equally useful. To prepare for our generalization we need some terminology. Given a partially ordered set D and a subset YC.D we define the right-closure of Y, or upwards closure of Y, as 2
In categorical terms all of Chapter 7 takes place within the category of finite complete lattices and monotonic maps.
7.1
Strictness Analysis
211
3yeY: y\Zd } In the literature this is sometimes written \Y. A subset YCD is Scott-open, or open in the Scott-topology3, if and only if • Y is right-closed, and • for all chains (dn)n: if \Jn dn 6 Y then dn £ Y for some n. Given our restriction to finite domains the second condition is trivial and Scottopen just means right-closed throughout this section. It is immediate that RC( Y) is the least right-closed set that contains Y. We now define O(D) = ({ YCD | Y=RC{Y) A F ^ 0 } , D) as the partially ordered set of non-empty right-closed sets. Lemma 7.1.2 If D is a finite complete lattice then O(D) is a finite complete lattice with least element D, greatest element {Tp} where Tp is the greatest element of D, and least upper bounds and greatest lower bounds given by f) and U, respectively. • Proof: It is immediate that O(D) is a finite partially ordered set and that the least and greatest elements are as stated. (For the greatest element we need that D is a complete lattice rather than just a domain.) For the greatest lower bound f\y of a non-empty collection y of right-closed sets it suffices to note that \jy is right-closed and non-empty (since y is) and hence ny=\jy. For ^ ne l eas t upper bound \jy of a non-empty collection y of right-closed sets it suffices to note that f)y is right-closed and non-empty since all Yey have TG Y (and y ^ 0). • Example 7.1.3 O{2) has elements {0,1} and {1} with {O,1}5{1}. Hence (9(2) is isomorphic to 2 with {0,1} corresponding to 0 and {1} corresponding to 1. It follows that the definition of A(t l i s t ) given in Table 7.1 agrees with the above definition of A(Aj l i s t ) when t=kj. • The general definition of A(t l i s t ) given in Table 7.1 amounts to (O(A(t))±)±. For readability we shall write 0 1 Ye
for ±, for ?/p(_L), for up(up(Y))
much as we did for A(Aj l i s t ) above; furthermore, we shall write ye for RC({y})e. It is straightforward to describe the intended meaning of 0, 1 and Te whereas a little machinery is required for Ye when F^RC({T}). So 3
We shall not go into any topological considerations here.
7
212
0 1 Te
Abstract Interpretation
describes the _L-list, i.e. [<9, additionally describes all infinite lists and all partial lists, describes all lists.
Next consider YeEA(t l i s t ) where 7^RC({T}). We may write Y = {au--,ak} and we know that k>0. Then Ye describes all infinite lists and all partial lists and some finite lists; a finite list [vi,- • -,t;n] is described if there are k values ji?* • *dk s u c h that the property a\ describes the element vyi. Example 7.1.4 For a more complex example concerning lists consider A(i l i s t ) with <=lntxlnt. Here
Aft) =
<10
>01
4 _L
so that i lie
/
A(t l i s t ) =
<10e > Ole \ / \ HOenOle i
• OOe
1
' ±e
(i
4
1
• 0
The element lO^nOle really is RC({10,01})e = {10,01,ll}e but as this equals the greatest lower bound of 10e and Ole we shall write it as KtenOls. The concrete lists described by these properties may be exemplified as follows:
7.1
Strictness Analysis
0
describes [9,
1
additionally describes
le
additionally describes [_L],
00£
additionally describes
OlenlOe
additionally describes [ttp(J_,27),ttp(27,_L)],
Ole
additionally describes [up(_L,27)],
213
additionally describes (wrt. 01d~ll0e) [up(27,J_)], additionally describes [ttp(27,27)]. This will all be formalised in the next subsection when the safety predicates are formally defined. • Example 7.1.5 Turning to operations on lists of integers we recall the strictness properties 0, 1, Oe and le and consider the following functions: hd = Hd[lnt] length = f ix(Af [int l i s t z± Int]. Cond(lsnil[lnt], Zeroflnt l i s t z± Int], -t-[lntx_Int-»Int]DTuple(One[lnt l i s t z± Int], f DTI [Int]))) sum = f ix(Af [Int l i s t z± Int]. Cond(lsnil[lnt], Zero[lnt l i s t z± Int], +[lntx_Int-»Int]DTuple(Hd[lnt], fDTl[lnt]))) The function hd computes the head of a list and terminates (under the standard semantics S) on all lists except [d and []. The function length computes the length and terminates on all finite lists. Finally the function sum computes the sum of the integers and terminates on all finite lists that do not contain a J_-element. We can thus summarize the optimal analysis functions hd, length and sum by hd length sum
0 0 0 0
1 Oe le 1 1 1 0 1 1 0 0 1
As we shall see in the remainder of this chapter it costs considerable effort in order to obtain an analysis producing this result. D We complete this subsection by noting that each A(£) is a finite complete lattice as promised. L e m m a 7.1.6 For each well-formed type t of run-time kind Table 7.1 defines a finite complete lattice A(£). •
214
7
Abstract Interpretation
val^(v^a) = (a=0 => v=A.) valtl}a2(v,a)
= (a=A. =*• v = ± ) A valtl(vi) where (i 1? v 2 ) = c?n(i;) and («i, a2) = dn(a) > /=_L) A
valt] (w,a) =» '
val t2(dn(f)(v),dn(h)(a))
=> v/=±) A (a/=l => ViGdom(t;/): vl(i)^*) A ),l,Te} A (3iGdom(v/): u/(i)=*)
\/aedn(dn(al)): valt(vl(j(a)),a)) Table 7.2: Safety predicate for types of kind r Proof: This is a straightforward structural induction. Clearly 2 is a finite complete lattice and it follows from Chapter 5 that product, function space and lifting preserve the property. From Lemma 7.1.2 it follows that O also does. •
7.1.2
The safety property
So far we have only given informal explanations of the intended meaning of the strictness properties in A(t). Since the definition of the expression part of the interpretation presupposes a clear understanding of these meanings we shall begin by defining two safety predicates: valt for well-formed types of run-time kind and compt for well-formed types of compile-time kind. The predicate val t has functionality valt
:
S(£) X A(t) —>• {true,false}
and is defined by structural induction on t in Table 7.2. The predicate compt has functionality compt : [i](S) x [i](A) -> {true,false} and is defined from valt using techniques similar to, but somewhat simpler than, those used in Chapter 6; hence it is called a logical relation rather than a Kripkelogical relation. The definition is by structural induction on the well-formed compile-time type £, and is given by
7.1
Strictness Analysis
215
comp^(v,a) = (v — a) comptlXt2(v,a) = compt^vi.ax) A compt2(v2,a2) () and (ai,a2) = a V*e[til(S): VaG[*i](A): comptl(v,a) =» comp t2(f(v),h(a)) compt list(v/,a/) = dom(t;/) = dom(a/) A /) = dom*(a/) A m*(v/): compt(vl(i),al(i))) comptl=±t2(v,a) = va/^^^Ujfl) Example 7.1.7 Consider the expression length and its strictness property length as described in Example 7.1.5. One can prove that complnt ii s t _> int ([length](S)(void),Mp(/enjf/i)) and we shall see how this can be used to infer certain strictness properties of length without actually calculating [length](S)(void) on any arguments. As an example consider the list v/=[l,2 v . .,275. It is easy to verify that and from the definition of vahnt list -> int we then get that wa/^t(dn([length](S)(void))(t;/), length 1) Since length 1 = 0 this amounts to t;a/fet(dn([lengthI(S)(void))(t;/), 0) Inspection of the definition of valjj^ then guarantees that dn([length](S)(void))(t;/) = 1 showing that [length](S)(void) cannot terminate on the list vl. For another example consider the same list as above but now use the weaker assumption that Then we obtain va/fet(dn([length](S)(void))(t;/), 1) but this does not suffice for guaranteeing that [length](S)(void) cannot terminate on the list vl. This should not be surprising because the list v/'=[_L] is also described by 0e, that is valt(vlf,0e), and [length](S)(void) does indeed terminate on vV. •
216
7
Abstract Interpretation
The safety predicates valt and compt enjoy a number of properties that are indicative of what one would expect to hold for an arbitrary analysis. Lemma 7.1.8 For each well-formed type t of kind r the clauses of Table 7.2 define an admissible predicate valt : S(<) x A(t) -> {true,false} that enjoys the following properties: VaeA(t): valt{-Ls(t),a) VveS{t): valt(vJA(t)) ): valt{v,ai) A a1Qa2 => valt(v,a2) ): valt{v,ai) A valt(v,a2) => valt(v,a1F\a2) ViQv2 A valt(v2ia)
=£• val t(vi,a)
•
One way to motivate this result is to imagine that valt(v,a) = Pt(v)Qa where (3t : S(t)-*A(t) is a strict and continuous function. Proof: The lemma may be proved by structural induction on the type t. The case t=k± is straightforward. The cases t=tiXt2 and t=t1=±t2 are rather straightforward consequences of the induction hypothesis. We therefore restrict our attention to the case £=£'lis£. Here we have 7 proof obligations: 5 are explicitly listed and 2 are buried in the admissibility statement (namely, that the predicate holds on (±,_L) and that the predicate holds on the least upper bound of a chain when it holds on all elements of the chain). We shall approach these in order of increasing difficulty. (i) That valt'iist([d,O) holds follows by simple inspection of the definition of val t'list-
(ii) That valt'Hstivl^e) holds for all vl£S(t'list) follows by simple inspection of the definition of valtfi±st(Hi) That valt'Ust([9^1) holds for all a/GA(^/l_ist) is immediate for 3iEdom([d): [3(i)=*. (iv) That vliQvl2 and valt'i±st{vh-,a^) imply valt/iist(vh')aI) may be shown by cases on al. If al=0 it is immediate as then vli = vl2. If al=l we have that vl2 is infinite or partial; it follows that vli also is and hence val tii±st(vli^al). Consider next the case where al—Ys for some Y. If vl\ is not finite, or if F={T}, the result is immediate so assume that vl\ is finite and that Y^{T}. Since vl\ is finite also vl2 is and dom*(iy/1) = dom*(i;/2). From Y^{T} and valtiiist(vl2,Ye) we then have a mapping j: Y—>dom*(vl 2) such that
7.1
Strictness Analysis :
217
valt,(vl2(j(a)),a)
From the induction hypothesis it follows that
and this establishes the result. an (v) That valtfiist(vhah) d aliQal2 imply valt'i±st{v^ah) may be shown by cases on vl. If vl=[d the result follows from a previous result. If vl is not [d but is infinite or partial we have al{3\ and hence al2^l', the result is then straightforward. Finally suppose that vl is finite. Then al\ is of the form Y\e and al2 is of the form Y2e. We may assume that Y2^{T}, and hence 5^^{T}, as otherwise the result is straightforward. From valt'i±st(v^ah) w e then have a mapping j \ \ Fi—>dom*(i;/) such that Vaeyi:
valt,{vl(3X{a)),a)
From aliQal2 we have YiDY2. Defining j2 as the restriction of j \ to Y2 we then have a mapping j2: Y2—^dom*(v/) such that \/aeY2:
valtf(vl(j2(a)),a)
and this establishes the desired result. (vi) That valt'iist(vl'>ah) a n c l vo>ltfi±st(^^h) imply valt'i±st(vh a>h^ah) may be shown by cases on aliV\al2. If ali\lal2=0 then ali=Q for some i and the result is immediate. If aliUal2=\ then al\=\ for some i and the result is immediate. If al\r\al2—Ys for some Y we can find Y\ and Y 2 such that al\=Y\e, a!2=Y2e^ and Y=YiL\Y2. If one of Y\ or y2 is {T} then F = F j for some i and the result is immediate; so assume that y ^ { T } and that Y2^{T}. If vl is infinite or partial the result is immediate; so assume that vl is finite. We then have mappings ji : yi->dom*(t;/) such that valt'(vl(ji(a)),a)
(for i=l,2) holds for all a£Y\ (for i=l,2). We may then define
by otherwise and clearly valt>(vl(j(a)),a) holds for all a E F . (vii) Finally we must show that Vn: valt>iist(vln,aln) implies val*'iist(|Jn ^n? Un aln) where (vln)n and (aln)n are chains. By Lemma 7.1.6 it follows that {a/ n |n>0} is finite. Hence there exists n' such that |J n aln = aln>. Using a previous result, i.e. (v), we may set al=aln>, assume that
218
7
Abstract Interpretation
Vn: and it then suffices to show that ^fl/^iist(l_ln vln,al). We proceed by cases on al. If al—Q we have vln=± for all n hence |J n vln = _L as desired. If al=l all vln are infinite or partial and it follows that [Jn vln also is. If al—Te the result is immediate. So assume that al—Ye and that F ^ { T } . If Un vln is infinite or partial there is nothing to prove and the result is immediate. If Un ^n is finite there is n" such that vln is finite (and of the same length as |J n vln) whenever n>n". For each n>n" we therefore have a mapping j n : Y —> dom*(|Jn v/n) such that
VaeY: valti(vln(jn(a)),a) There are only finitely many candidates for mappings from Y to dom*(|Jn v/ n) as both Y and dom*(|J n vln) are finite. Hence there exists a mapping j : Y -> dom*(Un vln) such that j=jn for infinitely many n>n". We now claim that VaE Y: valt,{vln{j{a)),a)
(for all n>n")
This is immediate if jn=j\ if this is not the case we know that there exists n w >n such that jn,»=j. We then have :
valti(vlnm(j(a)),a)
and using the induction hypothesis and vln£lvlnin we also obtain
VaEY: valt'(vln(j(a)),a) From the induction hypothesis it then follows that
and this establishes the desired result.
•
L e m m a 7.1.9 For each well-formed type t of kind c the clauses for cornyt define an admissible predicate compt : [t](S) x [t](A) -> {true,false}
•
Proof: This is a rather straightforward structural induction and we omit the details. • The safety predicate valt also enjoys another property that only holds because we were careful to use lifting when interpreting x_ and z±. In the case where valt(v,a) amounts to f3t(v)C.a this result says that f3t is ^-reflecting: (3t(v)=±. implies V=JL.
7.1
Strictness Analysis
219
= '\hi.Xh2.
A(Twple[to=±tiXt2]) (
up(Xa.up(dn(hi
I
if h ^ _L and h2 ^
[ _L
0(«)))
otherwise
A(Fst[*i2ii2]) = up(Xa. a i where («1,«2) A(Snd[^!^2]) — up(Xa. a2 where
==
(«i,a2) =-
dn(a)) dn(a))
A(Cons[tn^>(U l i s t ) ] ) == Xhi.Xh2 f
if dn(h2){a)Ql
M / !
U Xa
^ '\Yen(a h(hi)(a)) s if dn(h2)(a)=Ye
'
and h2 ^ L
[ ±
otherwise
AfNil[« n ->fii l i s t ) ] ) = up(Xa. Te) A(Hd[*]) = up(Xa. I
f°
A(T1[*]) = up(Xa. { 1
We A(Isnil[t]) = up(Xa. I
£ Si] if a=0 if a = l if a=Ye
)
\ ) 1 if oDI
Table 7.3: Expression part of A (part 1) L e m m a 7.1.10 For each well-formed type t of kind r the predicate valt enjoys the property
\/veS(t):
valt(v,±A(t))
D
Proof: This is a rather straightforward structural induction and we omit the details. • A special instance of this result was already used in Example 7.1.7. It is a key result for the analysis A to be useful for optimizations based on strictness analysis.
7.1.3
Expressions
The expression part of the interpretation A is specified in Tables 7.3, 7.4 and 7.5. The overall shape of these definitions have much in common with the definition of the interpretation S in Chapter 5; major differences arise for run-time lists due
220
7
Abstract Interpretation
A(f i[i]) = f/ where f/ are unspecified elements in where F ^ " 1 are unspecified elements in {to—> up(Xa.dn(h1)(dn(h2)(a))) if hi ^ ± and /i2 / -L J_ otherwise A(Id[t]) = wp(Aa.a) A(True[£]) = up(Xa. 1) A(False[t]) = up(Aa. 1) i^oz±^i]) — Xhi.Xh2.Xh3.
±
_L if dn(Ai)(a)=0 rfn(/i2)(«)Urfn(/i 3)(a) if if hi ^ JL, /12 7^ -L, ^3 7^ -L otherwise
Table 7.4: Expression part of A (part 2) to the very different nature of S(£ l i s t ) and A(£ l i s t ) . The clause for Nil should be obvious as only Te can describe the empty list. Concerning Hd it should be clear that 0 is mapped to ±£A(£) because S(Hd) applied to [d yields _L; all other strictness properties are mapped to T because they can describe a list [vd where v may be any value whatsoever. For Tl we have to map Ye to T~£ because the tail of a list described by Ye could be the empty list and this is only described by Te. For I s n i l we map 0 to 0 and all other strictness properties to 1 because S ( l s n i l ) will give JL on [d and false on any infinite list. For Cons we have two cases depending on whether dn(h2)(a) is of the form Ye or not. If not the resulting list must be partial or infinite and we use the strictness property 1 for this. If it is of the form Ye we use the strictness property Y'e where
Y'e = YeU {dn^h^a^e or equivalently
Y' = Y\J This records the fact that the resulting list must also have a component (the first actually) that is described by dn(hi)(a).
7.1
Strictness Analysis
221
A(f ix[*]) = FIX where FlX(ff) - LJn Hn(±) if t is pure A(f ix[tlz±t2\) = XH. Un>i # n M - L ) ) A(fix[t1xt2]) = XH.(H1,H2(H1)) where ffi=A(f ix[
= up(Xa. dn(h)(a') where (h,af) = dn(a))
Table 7.5: Expression part of A (part 3) The clause for Cond in Table 7.4 is also different from the one for S(Cond) and the reason is again that S(Bool) and A (Bool) are rather different. If the test of the conditional results 0 we know that the result of the conditional in the standard semantics will be i-S(i H»(up(±)) stnct(H) = Xh. < , y±
i l
r
.
otherwise
Using the definitions of Example 7.1.1 we then have [length](A)(void) = FIX'(strict (Xh. up (X a. case dn(A(lsnil))(a) of 0: 0 1: 1 U (1 n dn(h)(dn(A(Tl))(a))))))
222
7 =
Abstract Interpretation
FlX'(strict(Xh.up(Xa. case a of 0: 0 1: 1 Oe: 1 le: 1)))
= up(\a.case a of 0: 0 1: 1 Oe: 1 The reason for this result is that the Isnil-test will give 1 except on the argument 0. Comparing the result with the optimal behaviour, up(length), expressed in Example 7.1.5 we note that we have not been able to capture the optimal behaviour on the element 1. — A similar phenomenon would arise for sum whereas hd is so simple that we have optimal behaviour simply because A(Hd) was defined to be optimal. •
7,1.4
Proof of safety
To have faith in the strictness analysis we must show that it is safe with respect to the standard semantics. As in Chapter 6 this will amount to a result stating comp t ([c](S),[e](A)) for appropriate expressions e and types t. We shall begin by establishing a similar result for each operator of Table 5.1. Lemma 7.1.12 compt(S(Tuple[toz±tiXt2]), A(Tuple[£Oz±£i_x^2])) holds for all n well-formed types t = (^ozzt^i) —» (£01=^2) -* (to=±tiXt2)Proof: We may assume that valto=±tl(flih1) valt0=tt2(f2,h2) and must show
where / = S(Tuple[*Oz±*iX.*2])(/i)(/2) h =
A(Twple[tOz±t1xt2])(h1)(h2)
7.1
Strictness Analysis
223
If h—1. we know that h\—l. for some i, hence /i=_L and /=_L as required; if/=_L the result is immediate from Lemma 7.1.8. So we may assume that h^A. and Next consider v and a such that valto(v,a) and show
which amounts to valtlxt2(up((dn(f1)(v),dn(f2){v))),
up((dn(hi)(a),dn(h2)(a))))
Since valti(dn(f \)(v)^dn(h[)(a)) follows from the assumptions this easily establishes the result. • L e m m a 7,1.13 compt1xt2=±tl(S(Fst[tiXt2]), formed types tiX_t2=±ti.
A(Fst[tiXt2]))
holds for all welln
Proof: Assume that
In particular, this means that valtl(vua1) where (^1,^2) = dn(v) and (a l5 a 2 ) = dn(a). It follows that
and this establishes the result.
•
Corollary 7.1.14 comptlxt2::±t2(S(Snd[^iX_^2])5 A(Snd[^i^2])) holds for all well• formed types t1xt2=±t2. L e m m a 7.1.15 corapf(S(Cons[£0z±£i l i s t ] ) , A(Cons[^0i^:^i l i s t ] ) ) holds for all • well-formed types t = (^oz±^i) —> (^o^l^i l i s t ) —> (^oz±^i l i s t ) . Proof: We may assume that valt0=ttl(fuhi) and must show
224
7 Abstract Interpretation
where
h= If h=A. we know that h\=L for some i, hence /i=_L and /=_L as required; i f / = the result is immediate from Lemma 7.1.8. So we may assume that h^± and Next consider v and a such that valto(v,a) and show valtli±st(dn(f)(v),dn(h)(a)) where we may use that our assumptions ensure that
We now have two cases: The case dn(h2)(a)Ql. In this case dn(f2)(v) must be a partial (possibly [d) or infinite lists and so is dn(f)(v) = COKSFS(dn(f1)(v),dn{f2){v)) It follows that
and since dn(h)(a)=l this establishes the result. The case dn(h2)(a)=Ye for some Y. We have dn(h){a) = Ye\l
(dn^ia^e
If dn(f2)(v) is partial or infinite so is dn(f)(v) and as above the result is immediate. So assume that dn(f2)(v) is finite. From valui±st(dn(f2)(v),Ye) we then obtain a mapping J2-- (Y\{T})^dom*(dn(f2)(v)) such that V«'G(r\{T}):
val tl(dn(f2)(v)(j2(a')),a')
7.1
Strictness Analysis
225
Note that this holds regardless of whether Ye=Te or not. Now define j:
YURC({dn{h1){a)})^dom*(dn(f)(v))
by \
otherwise
We then claim that Va'GFURC({rfn(ft1)(a)}):
valtl(dn(f)(v){j(a')),a')
The proof is by inspection of a'E Y\jRC({dn(hi)(a)}) and using Lemma 7.1.8. • Lemma 7.1.16 comp tn ^ 1 ii st (S(Nil[^ 0I= >(^list)]), A(Nil[^ 0I ^(^list)])) holds for all well-formed types ^oz±(^i l i s t ) . • Proof: It suffices to prove that
and this is immediate.
•
Lemma 7.1.17 comp(t ;yLstj_±i(S(Hd[£]),A(Hd[£])) holds for all well-formed types (t l i s t W . • Proof: We may assume that valt i±st(v,a) and then have to show that
Using Lemma 7.1.8 this is immediate when dn(A(}ld[t]))(a)=T so assume that dn(A(Kd[t]))(a)=±. Then a=0 so that v=[d and dn(S(Hd[i]))(w)=J. and the result is then immediate using Lemma 7.1.8. • Lemma 7.1.18 comp(t l i s t ) ^ iist)(S(Tl[£]),A(Tl[£])) holds for all well-formed types (t list)-*(t l i s t ) . • Proof: We may assume that valt list{v,a) and then have to show that valt iiBt(TLpS(i;),dn(A(Tl[f1))(a))
226
7
Abstract Interpretation
This is immediate when dn(A(Tl[t]))(a)=Te. The cases where dn(A(Tl[t]))(a) is 0 or 1 amounts to the cases where a is 0 or 1. If a=0 we have v=[d so that TLps(i>) = [d and valt iist(TLps(v),O) is then immediate. If a=l we have that v is partial or infinite and so is TLps(v) and valt iist(TLps(^),l) is then immediate. • L e m m a 7.1.19 comp(t ii8t)-+Booi (S(Isnil[f]),A(Isnil[i])) holds for all well-formed types (t list)—>Bool. O Proof: We may assume that valt and then have to show that (v),
dn(A(lsnil[t]))(a))
This is immediate when dn(A(lsnil[t]))(a) equals 1 so consider the case where it equals 0. Then a=0 so that v=[d and ISNILps(v)=:-L and the result follows. • A(n[(t1=±t2)x(to=±t1)])) Lemma 7.1.20 compt(S(n[(t1=±t2)x(to=±t1)]), for all well-formed types t = (tiz±t2) —> (toz±ti) —> (^oz±^)-
holds
Proof: We may assume that valtlI±t2(f1,h1) valto=±tl(f2,h2) and must show valt0z±t2(f,h) where
h = If h=± we know that h\=± for some i, hence /i=JL and /=JL as required; if /=-L the result is immediate from Lemma 7.1.8. So we may assume that h^l. an Next consider v and a such that valt0(v,a) From valtQz±tl(f2,h2)
we then get
valtl(dn(f2)(v),dn(h2)(a)) and from valtl:=lt2(fi,hi)
we then get
a
7.1
Strictness Analysis
227
valt2(dn(f1)(dn(f2)(v)),dn(h1)(dn(h2)(a))) and this establishes the result. Lemma 7.1.21 comptz±t(S(ld[t]),A(Id[t])) Proof: Straightforward.
• holds for all well-formed t^t.
D •
Lemma 7.1.22 compt_^Booi(S(True[^),A(True[^)) holds for all well-formed types
•
Proof: Straightforward.
•
Corollary 7.1.23 comfft-+Booi(S(Falsef£i),A(False[£])) holds for all well-formed types £—>Bool. • Lemma 7.1.24 compi(S(Cond[^oz±^i])? A(Cond[£0z±£i])) holds for all well-formed D types t = (£n->Bool) - • (*oz±*i) -> (*oz±*i) -» (
valtQ=±tl(f2,h2) valto^tl(f3,h3) and must show valto=Ltl{f,h) where / = S(Cond[toz±*i])(/i)(/2)(/3) h = If h=± we know that h\=A. for some i, hence /i=_L and /=J_ as required; if /=_L the result is immediate from Lemma 7.1.8. So we may assume that h^A. and Next consider v and a such that valt0(v,a) From vahn^BooiifiM
we get
If dn(hi)(a)=0 we have dn(h)(a)=±\ furthermore dn(fi)(v)=±. so that dn(f)(v)=± and hence
228
7
Abstract Interpretation
valtl(dn(f)(v),dn(h)(a)) follows from Lemma 7.1.8. If dn(hi)(a)=l we have dn(h)(a) = dn(h2){a) U dn(h3)(a) From valto-±t1(f2)h2), w^o^iC/^^)
an
d Lemma 7.1.8 we then get
valil(dn(f2)(v),dn(h)(a)) valtl (dn(f3)(v),dn(h)(a)) As valtl(l.,dn(h)(a)) is immediate from Lemma 7.1.8 we then have valtl (dn(f)(v),dn(h)(a)) regardless of the value of
rfn(/i)(/y)G{true,false,_L}.
•
Lemma 7.1.25 comp(t-+t)-*t(S(f ix[t]),A(f ix[<])) holds for all well-formed and composite types t. • Proof: The proof is by induction on the type t. The case t is pure. As in Chapter 6 it is straightforward to show if pure(£) then compt(x,y) = (x=y) by structural induction on t. It is then immediate that
The case t=t\z±t>2 is a frontier type. Let F and H be given such that
We begin by showing valt(Fn(up(JL)),Hn{up(L)))
(for all n>0)
by induction on n>0. The basis step, n=0, follows because valt,(dn{up{±))(v),dn(M-L))(i and (i7n(up(_L)))n>i are chains so by admissibility of valt (Lemma 7.1.8) we have
7.1
Strictness Analysis
229
which is the desired result. The case t is a product of composite types. This is analogous to the similar case in the proof of Lemma 6.2.3 in that one must show that comp(F,H) carries over to the functions supplied as argument to S(f ix[^]) and A(f ix[^]); as in the proof of Lemma 6.2.3 we simply refer to [72, Lemma 7.13] for an example of a proof along these lines. O Lemma 7.1.26 compt(S(C\irry[to=±(ti=±t2)]), A(CurTy[to=±(t1=rt2)])) holds for D all well-formed types t = (^o^int^) —* (^ozl^i^l^))Proof: We may assume that
and must show
where / = S(Curry[tOz±(*iz±i2)])(/i) h= If h=A. we know that /H=-L, hence /i=-L and f=A- as required; if/=_L the result is immediate from Lemma 7.1.8. So we may assume that h^l. and / ^ - L Next consider v0 and a0 such that valto(vo,ao) and show that valtl=±t2(dn(f)(vo),dn(h)(ao)) Since dn(h)(ao):j£l. it suffices to consider vx and ax such that valtl(vuax) and show that
But we clearly have
and since
230
7
dn(dn(h)(ao))(ai)
Abstract Interpretation
=
the result follows from valtoXtl:=tt2(fi,hi).
•
L e m m a 7.1.27 compt(S(Apply[^0z±^i]),A(APP1y[^oz±^i])) holds for all well-formed types t = (to=±ti) x_t0 z± h. • Proof: Assume that
and write (f,vf)=dn(v)
and {h^ar)—dn{a). We then have
so that valtl(dn(f)(v'),dn(h)(a')) follows. This establishes the desired result.
•
L e m m a 7.1.28 comp(<_Lt)_±t(S(Fix[£]),A(Fix[£])) holds for all well-formed types (tz±t)=tt. • Proof: Assume that valt=±t(f,h) We next claim that valt((dn(f))n(±Udn(h))n(±))
(for all n>0)
The proof is by induction on n: the basis step is immediate because valt is admissible and the induction step follows from valt=±t(f,h). Using admissibility of valt once more we have valt(FIX(dn(f)),FlX(dn{h))) and this establishes the desired result.
•
7.1
Strictness Analysis
231
Summary of the safety properties The lemmas proved above may be summarized as follows: Corollary 7.1.29 We have corap(S(^),A(>)) for all operators (f> of Table 5.1, except those of form f i or Fj and provided that the type t indexing f ix[t] is always composite. • By analogy with the results of Chapter 6 we have: L e m m a 7.1.30 ('Structural induction') Let penv be a well-formed position environment (in the sense of Section 5.2) such that
If comp(S(),A())
holds for all operators <j> that occur in e, then D
Note that we have dispensed with explicitly indexing comp with type information. If we were to do so it would be helpful to define a £-level type Tps(perct;) as discussed in Section 6.2. Proof: The proof is by structural induction on the expression e much as in the proof of Lemma 6.2.4. We dispense with the details. • K
Corollary 7.1.31 Suppose that 0 h e : tiz±t2 and that corap(S(<^),A(>)) holds for all operators (j) that occur in e. Then dn([e](S)(void)) is strict provided that dn([e](A)(void)) is (definitely) strict. Proof: From Lemma 7.1.30 we have t>a/tl=tla([eJ(S)(void), [e](A)(void)) and
is immediate. We then have t;a/ la (dn([e](S)(void))(J.s(« 1 )).
•
232
7 Abstract Interpretation
and by assumption dn([e](A)(void))(_L A(tl) ) - -LA(« 2) From Lemma 7.1.10 we then have dn([el(S)(void))(± S (t 1 )) = -Ls(ta) •
as required.
This corollary immediately carries over to programs of the mixed A-calculus and combinatory logic; as no difficulties are incurred we dispense with the formal statement.
7.2
Improved Strictness Analysis
The behaviour of a strictness analysis on lists is often important when assessing its quality and the archetypical functions on lists are the hd, length and sum functions of Example 7.1.5. In our approach, which is based on suitably interpreting the operators of the mixed A-calculus and combinatory logic, it is straightforward to obtain the optimal analysis of hd since it is equivalent to the operator Hd and we are not constrained in the analysis prescribed for operators. Concerning length and sum the compositional approach does present some complications and in Example 7.1.11 we concluded that the analysis A did not obtain optimal results for these functions. In this section we shall investigate two approaches that may rectify this for the functions length and sum although in general they cannot be optimal for all functions for reasons of computability. One method was already considered by Wadler (for lists of base types) and the other method exploits the tensor product; as we shall see they obtain much the same effect although one is more 'global' in approach and the other is more 'local'. Throughout this section we shall adhere to the simplifying assumptions of Section 7.1: all domains of strictness properties of a given type are finite and hence continuity means no more than monotonicity.
7.2.1
The Case construct
In the original formulation of his strictness analysis for lists of base types, Wadler made use of a Case construct. It replaces our use of I s n i l , Hd and Tl and ensures that the result of a test (using I s n i l ) is not separated from the result of a decomposition (using Hd and Tl). As was already illustrated in Exercise 4.9 the idea is that Case(ei,e 2 )
(*)
7.2
Improved Strictness Analysis
233
is 'equivalent' to Cond(lsnil,ei,e2DTuple(Hd,Tl))
(**)
By incorporating Case as a new operator we will be able to specify the strictness properties of Case freely so that the analysis of (*) is much more precise than the analysis of (**). To appreciate the analysis of Case it may be helpful to explain its standard semantics using the 'equivalence' between (*) and (**). This leads to: S(Case[*0 l i s t ^ *i])(/i)(/ 2) = ±
itfl=±ovf2=± S(Case[*0 l i s t z± ii])(/!)(/ 2 ) = up(Xv. dn(f2)(up(v',v"))
if v=
For the analysis we then propose the following definition to be motivated below: A(Case[i0 l i s t z± ^(ftiX&a) = ± if hi=A- or h2=AA(Case[£0 l i s t -> *i])(Ai)(fe2) = up(Xa. dn(h2)(up(TA(to),l)) \J{dn{h2)(up(n Y\ ( 7 0 Y')e)) \ Y'C Y} dn(h2)(up(TA(to)iTe)) U dn(hi)(Te) if hi±±. and
if a=0
if a=l if a=Ye ± Te if a=Te
where we use the notation YeY'
= RC(Y\RC(7')) U {T}
To motivate the definition of A(Case) we shall give an informal proof of the safety of A(Case) with respect to S(Case). Lemma 7.2.1 compf(S(Case[fn l i s t —» U])< A(Case[fn l i s t —» U])) holds whenever t = (t0 l i s t z± h) —> (^o2i(^o l i s t ) z± ti) —> (t0 l i s t nt tx) is a well-formed type. • Proof: For the non-trivial part of the proof we assume that
234
7
Abstract Interpretation
and that none of / ^ / 2 , hi or h2 equals J_. The definition of A(CdLse)(hi)(h2) applied to a then amounts to a case analysis upon the strictness property a. If a=0 we know that the list v is [d so that S(Case)(/i)(/ 2 ) applied to v gives J-S(ti)- It is therefore correct to use the strictness property J-A^J)If a=l we know that the list v is infinite or partial. Hence any element v' of S(t0) may be the head of v (unless v is [d) and the tail v" of v will still be infinite or partial. Hence TA(* 0 ) correctly describes vf and 1 correctly describes v" so that dn(h2)(up(TA(to),l)) as well as i-S(ti) (i n c a s e v 1S [d)correctly describes dn(f2)(up(v',vff)) If a—Te we know nothing about the list v, it may be the empty list [], its head vf may be any element of S(t0) and its tail v" may be any list of S(t0 l i s t ) . Thus
correctly describes dn(f1)([]) and dn(h2)(up(T,Te)) correctly describes dn(f2)(up(v',v")) as well as J-S(*i)- By using the least upper bound we obtain a strictness property that correctly describes both possibilities. Finally consider the case where a—Ye and Ye ^ Te\ we then know that the list v cannot be []. It therefore might be natural to use the strictness property dn(h2)(up(T,Te)) since indeed the head v' of the list v may be any element of S(£o)- However, the snag is that the tail v" cannot necessarily be any list of S(t0 l i s t ) because there are certain constraints from Y that may still have to be satisfied. Thus while dn(h2)(up(T,Te)) would not be incorrect we shall be able to do better. Consider the situation where v is a finite list; since v is not [d it will be of the form v=CONSps(^/,^//). We then have a mapping j : Y —> dom*(i;) such that valto(v(j(a)),a) concerning
holds for all a£ Y. We now have a number of possibilities
Y'= {aeY\j(a)=l} For each of these we shall argue that VaGF':
valt0{v',a)
valtoU§1(v",(YeY')e) The first of these is immediate and gives
7.2
Improved Strictness Analysis
235
valt0(v',nY>) using Lemma 7.1.8 where we set F10=T. The second of these is immediate if YQ Y' = {T}; so assume that YQ Y'^{T} and note that RC( Y') then is a proper subset of Y. For each ae Y\RC{ Y') we have agY' and hence j(a)^l. Thus
defines a mapping f : (y\RC(7')) -* dom*(t;//) such that valtQ(v"(/(a)),a) holds for all aEY\RC(F'). This mapping may be extended (in at least one way) to a mapping
such that valto(v"(f(a)),a) holds for all a Returning to each choice of Y'C. Y we now have a contribution dn(h2)(up(nY',(YeY')e)) and by taking the least upper bound of all of these we correctly describe all possibilities. • In the definition of A(Case[ ]) we may assume that Yf is non-empty as f~10=n{T} and y © 0 = y 0 { T } , and therefore no contributions will be missed. Furthermore one may assume that Y' is right-closed as n 7'=nRC( Y1) and YQ Y'= F9RC( Y'), and therefore no contributions will be missed. In summary we only need to consider those Y'eO(A(t0)) such that Y'CY. Example 7.2.2 In the case of lists of base types the above definition of A(Case) amounts to the following: A(Case[Ai l i s t z± h])^)^)
=±
if h\=A- or /i2=JL A(Case[Ai l i s t -» *i])(fei)(fe2) = up(\a. 0 dn(h2){up(l,l)) dn(h2)(up(lfie)) U dn(h2){up(0,le)) dn(h2)(up(l,le)) U dn(fei)(le)
if a=0 ifa=l if a=0e if a=U
if hx^L and We shall motivate the definition in the case where a=0e. Here we use that a=0e really stands for a=Ye with y={0,l}. The subsets Y' of Y are 0, {0}, {1} and {0,1} but we only need to consider {1} and {0,1}. Since
236
7
Abstract Interpretation
n{0,l} - 0 and {0,l}6{0,l} = {1} and n{l} - 1 and {O,1}0{1} = {0,1} this gives the contribution dn(h2)(up(lfie)) U dn(h2){up(0,le)) as stated. — This shows that our general definition of A(Case) specializes to Wadler's notion of case analysis for lists of base types. • Example 7.2.3 Using Case we may now consider the following definitions of length and sum: lengthx = f ix(Af [Int. l i s t =± Int]. Case(Zero[lnt l i s t z± Int], +[lntjant-*Int]DTuple(One[(lnt _x Int l i s t ) ^± Int], fDSnd[lntx.(Int list)]))) sum! = f ix(Af [Int l i s t z± Int]. Case(Zero[lnt l i s t =± Int], +[lnt^Int^Int]DTuple(Fst[lntx_(lnt list)], fDSnd[lntx_(lnt list)]))) As we have already said there is no need to redefine hd and thus no need to analyse it once again. Using the notation FIX' and strict of Example 7.1.11 we may then perform the following analysis of length: [length!](A)(void) =
FlX'(strict(\h.up(\a. case a of 0: 0
1: 1 n dn(h)(l) Qe: (1 n dn(h)(0e)) U (1 n dn(h)(le)) le: (1 n dn{h){le)) U 1))) = FlXf(strict(Xh.up(Xa. case a of 0: 0
1: dn(h)(l) Oe: dn(h)(le) le: 1))) = up(Aa.case a of 0: 0 1: 0 Oe: 1
7.2
Improved Strictness Analysis
237
Thus dn([lengthx](A)(void)) equals the optimal result of Example 7.1.5. Turning to sum we may perform the following analysis:
[sum1](A)(void) =
FlX'(strict(\h.up(\a. case a 0: 0 1: 1 n Oe: (1 U: (1
=
of dn(h){l) n dn(h)(0e)) U (0 n dn{h){le)) n dn(h)(le)) U 1)))
FlX'(strict(Xh.up(Xa. case a of 0: 0 1: dn{h){l) Oe: dn(h){0e) le: 1)))
= ^(Aa.case a of 0: 0 1: 0 Oe: 0 le: 1) Thus also rfn([sum1](A)(void)) equals the optimal result of Example 7.1.5.
7.2.2
•
Tensor products
The use of the Case construct allowed us to consider various combinations of the head and tail of a list and we saw in Example 7.2.3 that this sufficed for an optimal analysis of (slightly modified versions of) the length and sum functions. We shall now see that the same effect can be obtained in a different way through the use of tensor products. As a first step we shall improve the precision of the tests made in the strictness analysis. To this end we define a new analysis A' that has
A'(Bool) =
It is now necessary to reconsider the interpretation of the constructs I s n i l , True, False and Cond that relate to the type Bool. The motivation for the choice of A'(Bp_oJL) is that it enables us to define
238
7 0 A'(Isnil[f]) = up(Xa. { F 1
Abstract Interpretation
if a=0 if a^O A a if a=Te
so that we may consider interesting lists that are definitely not NILps. The interpretation of True, False and Cond is then rather straightforward: A'(True[*]) = up(Xa. T)
A'(False[i]) = up(Xa. F) if hi=±. or fi2=A. or h3=±. _L dn(h2 )(«) dn(h^ )(«)
if dn[h 1 )(a)=0 if dn{h i)(a)=T if dn(h x)(a)=F
if hi^±, h2^± and Example 7.2.4 Assume for the moment that A' agrees with A except as specified above. Using the abbreviations FIX' and strict of Example 7.1.11 we may then perform the following analysis of the length function:
[length](A')(void) = FIX! (strict(\h.up(\a. case dn(A'(lsiLil))(a) of 0: 0 T: 1 F: 1 n dn(A)(dn(A'(Tl))(a)) 1: 1 U (1 n dn{h)(dn(A'(Tl))(a)))))) =
FIX'(strict(\h.up(\a. case dn(A'(lsnil))(a) of 0: 0 T: 1 F: dn(A)(
1: dn{h)(dn(A'{Tl))(l)) Oe: dn{h)(dn{A'{Tl))(0e)) Is: 1)))
7.2
Improved Strictness Analysis
239
= FIX! (strict(\h.up{\a. case a of 0: 0 1: dn(h)(l) Oe: dn(h){le) le: 1))) = up(Xa.case a of 0: 0 1: 0 Oe: 1 le: 1) We note that the change from A to A' suffices for obtaining the optimal analysis for length. — Turning to sum we may perform the following analysis: [sum] (A') (void) = FIX'(strict(Xh.up(Xa. case dn(A'(lsnil))(a) of 0: 0 T: 1 F: dn(A'(Hd))(a) n dn{h)(dn(A'(Tl))(a)) 1: 1 ))) = FlX'(strict(Xh.up(Xa. case a of 0: 0 1: 1 n dn(h)(l) Oe: 1 n dn(h)(le) le: 1))) = up(Xa.c&se a of 0: 0 1: 0 Oe: 1
We note that [sum](A')(void) is slightly better than [sum](A)(void) but still not optimal (compared to sum of Example 7.1.5). • The analysis A' is not yet as powerful as the analysis A when the latter is augmented with a Case construct and the former is not. The crux of the problem arises when decomposing the strictness property 0e; here the head can be 1 and the tail 0e or the head can be 0 and the tail le. However, up(lfie) U up(0,le) = up(l,le)
240
7 Abstract Interpretation
and so our decomposition degrades to the (obviously correct) observation that the head can be anything and the tail can be anything. Our solution will be to interpret A'(tiXt2) as lifted tensor product rather than lifted cartesian product. This will enable us to achieve up(cross(Ifie)) U up^crossfo^le)) ^
up(cross(l,le))
for a suitable function cross. To conduct this development we need a few auxiliary notions. A function f:L-*M is (binary) additive if /(/iU/ 2 ) = f(h)Uf(l2) holds for all h and l2 in L. A function f:LxL'—>M is separately (binary) additive if
for all choices of /i, / 2, /, /J, l2 and /'. It is easy to show that if f:LxL'—>M is additive then it is also separately additive but the converse does not hold. The tensor product may then be regarded as a way of turning separately additive functions into additive ones. To be more precise consider finite complete lattices L and V. A pair (L®Z/,cross) is a tensor product of L and V (with respect to additivity and among the finite complete lattices) provided that • L®L' is a finite complete lattice, • cross: LxL'—^L®L' is a continuous (i.e. monotonic) function that is separately additive, • for all finite complete lattices M and for all continuous (i.e. monotonic) functions f:LxLf—>M that are separately additive the following universal property holds: there exists precisely one continuous (i.e. monotonic) function /® : L®L'—»M that is additive and satisfies the equation f®ocross=f. This may all be illustrated by the following commuting diagram: LxV cross L®V
M
We now have a 'definition' of a tensor product of finite complete lattices L and M; however, we have not guaranteed that a tensor product does exist nor that it is unique if it does. Actually the tensor product exists in quite a general setting (see the Bibliographical Notes) and is unique 'up to isomorphism' (see Exercise 7).
7.2
Improved Strictness Analysis
241
Our next task will be to give a concrete construction of a tensor product of finite complete lattices L and V. We shall arrange for the elements of L®L' to be certain subsets Y of LxL'. We shall say that a set Y is left-closed when Y = LC(Y) and where LC(Y) = {d\3yeY: dQy} denotes the left-closure (or lower-closure) of Y. We shall say that a set Y is closed in both components when Y = CCi(F) and Y = CC2(Y) and where
denote the closure in the first and second components, respectively. Lemma 7.2.5 For each subset YCLxL' the set TC(Y) = f){Y'CLxL' | FCF'A Y'=LC(Y') A
r=cci(r)
A Y'=CC2(Y')
}
is the least left-closed set that contains Y and is closed in both components.
•
Proof: The intersection is not taken over an empty set since LxL' is a candidate for Y'. Hence TC(Y) is defined and it is immediate that FCTC(F). To see that TC( y) is left-closed consider yETC(y) and dC.y, then d£Y* whenever Y' is one of the subsets over which the intersection is taken and hence c/£TC( Y). In a similar way it can be shown that TC( Y) is closed in both components. • We now construct a tensor product by the following data: L®L' = ({YCLXL' |
y^Ay
cross = A(/,/'). LC({(/,/')})
Proposition 7.2.6 The above data constructs a tensor product (with respect to additivity and among the finite complete lattices). • Proof: Since L and Lf are finite complete lattices it is immediate that L®L' is a finite partial order. Its least element is {(J_,_L)} and its greatest element is LxL1. If yCLL®L' is a non-empty set of elements the formula
242
7
Abstract Interpretation
defines a non-empty set, since all YEy have (_L,_L)E F, and it is left-closed, because each Y^y is, and closed in both components, because each Y^y is. Hence \iy is the greatest lower bound of the non-empty collection y. It follows that L®L' is a complete lattice. The formula for least upper bounds is [jy = TC(U^) where in general one cannot dispense with TC. That cross: LxL'-*L®L' is a function follows because LC({(/,/')}) is not only left-closed but also closed in each component. Clearly cross is monotonic and since LxL' is finite it follows that it is also continuous. For separate additivity we calculate cross{hUl2,U) =
= TC{cross{h,l')Ucross(l2,l')) = cross(li,l')\Jcross(l2,l') and similarly for the other component. Given a function f:LxL'-+M that is continuous (i.e. monotonic) and separately additive we may define a function
by the formula displayed. It is clearly monotonic and by finiteness of L®L' also continuous. Next we calculate f®{cross(l,l')) =
= /(W) showing that f®ocross — / . For additivity we observe that f®(Y1UY2) =
f®(Y1)Uf®(Y2)
and since YXUY2 = TC(FiUF 2 ) it suffices to show that /®(y) = /®(TC(r))
(for all YCLxL')
Since TC(F) and Y will always be finite we may prove the result by numerical induction on the number of elements of TC(F)\F; we shall denote this number by |TC(y)\F|. The basis step is immediate because |TC(y)\y| = 0 amounts to y=TC( y). For the inductive step we know that Y is not left-closed or not closed in both components. If Y is not left-closed we have dC.y^Y such that d^Y; by setting y ; =yu{d} we have TC(Y) = TC(y ; ) and
7.2
Improved Strictness Analysis
243
If Y is left-closed but not closed in the first component we have ( / I , I ' ) E F and (hJ')eY such that (hUhJ'WY; by setting Y'=Y\J{(h\Jl2J')} we have TC(F) = TC(Y') and
We proceed in a similar way if Y is not closed in the second component (but is left-closed and closed in the first component). In all cases we obtain a set Y 1 such that f*(Y) = /®(F'), TC(7) = TC(F') and | T C ( 7 ' ) \ r | < |TC(7)\F|. It follows that where the second equality holds because of the induction hypothesis. Finally we must show the uniqueness of /®. So let f:L®Lr-*M be a continuous and additive function such that f'ocross=f. Let Y^L®Lf be an arbitrary element of L®L'\ we can then find (/i,/i), • • *,(Wn) suc h that It follows that
y = ur=i {(/»/o} = ur=i so that
/ ; m = ur=i /'(c^c/i,/? This shows that f'=f® and hence /® is unique.
•
We can now return to the definition of the interpretation A'. We have already hinted at the interpretation of A'(£i2<^2) where we use A'(*i_x'2) = (A'(^) ® A; (t 2 )) ± We now have to consider the operators Tuple, Fst and Snd associated with runtime products: up(\a.up(cross(dn(hi)(a), dn(h2)(a)))) A.
if /h ^ _L and h2 ^ 1 otherwise
^ 2 ] ) = up(\a. ^ 2 ] ) = up(Xa.
244
7
Abstract Interpretation
Example 7.2.7 In Example 7.1.1 we considered the strictness analysis of the function +[lntj<Jnt—»Int] but now the interpretation of run-time products has changed. It is therefore natural to model A'(+[lnt2<_Int—>Int]) as up(Xa. LJ{/n//|(/,//)Gdn(a)}) In this way A;(+[lntj<Jnt-->Int]) will give 1 when applied to up (cross (I,I)) but will give 0 when applied to up (cross (0,1)) or up (cross (0,l)U cross (1,0)). • The analysis of Fst and Snd does not exploit the additional precision of the tensor product and this can hardly be expected otherwise. Concerning the analysis of + we now exploit the additional precision of the tensor product and one may therefore hope that A' will be better than A. However, the weak point is that Tuple is the only operator that constructs an element of the tensor product and that this element is of the form cross(- • •,• • •). This can be rectified by letting the interpretation of Tuple consider the atoms or the irreducible elements of the argument a (or dn(a)). References to approaches following these ideas may be found in the Bibliographical Notes. Here we shall take a shortcut and introduce special operators for exploiting the tensor product. One is Split[i] that is supposed to be 'equivalent' to Tuple(Hd[t],Tl[i]) and the other is Prod(e!,c2) that is supposed to be 'equivalent' to Tuple(eiDFst[*],e2nSnd[*]) for a suitable type t. In the standard semantics we thus have S(Split[i]) = up(Xv. ttp(HDps(v),TLpS(t;))) C/PTW,\ \f \f S(Prod) = AA.A/2.
up(Xv.up(dn(f1)(v1),dn(f2)(v2)) where (vuv2)=dn(v)) if/l9«±and/ 29fc± J_
For the analysis we then have
otherwise
7.2
Improved Strictness Analysis
245
A'(Split[*]) = up(Xa.up( cross(-Lfi) cross(T,l) U{cross(nY', (YeY')e) cross(T,Ts)
\ Y'CY)
if a=0 if a=l if a=Ys ^ Te if a=Te
A'(Prod) = Xh1.Xh2. a.up(\_\{cross(dn(hi)(l), dn(h2)(l'))\ (lj')£dn(< if h1 ^ _L and h2 ^ i. otherwise The definition of A'(Split) has many similarities to the definition of A(Case). Finally for types and operators not considered so far we shall assume that A' behaves as A. Example 7.2.8 In the case of lists of base types the above definition of A'(Split) amounts to the following: ' cross(0fi) if a=0 cross(1,1) if a=l xx } \ . )) ( cross{{j,le)\Jcross{lfle) 11 a=0e if a=le < cross(l,le) Naturally this has many similarities to the simplification of A(Case) obtained in Example 7.2.2. • A// . r,i\ ,\ / A'(Split [t \) = up(Xa.up(
Example 7.2.9 Using Split and Prod we may now consider the following definitions of length and sum: length2 = f ix(Af [Int l i s t z± Int]. Cond(lsnil[lnt], Zero [int l i s t z± Int], +[lntxlnt--»lnt] a Prod(One[lnt => Int], f) • Split [Int])) sum2 = f ix(Af [Int l i s t z± Int]. Cond(Isnil[lnt], Zero[lnt l i s t z± Int], •i-[lntx,Int-»Int] • Prod(ld[lnt], f) • Split[lnt])) Again there is no need to redefine hd and thus no need to analyse it once again. Using the notation FIX' and strict of Example 7.1.11 we may then perform the following analysis of length: [length2] (A7) (void)
=
FlX'(strict(\h.up(\a. case dn(A'(Isnil))(a) of 0: 0
246
7
Abstract Interpretation
T: 1 F: |J{1 l~l dn{h){l') | (/,Z')edn(dn(A'(Split))(a))} 1: 1 U (LJ{indn(/»)(/')|(/,/')edn(dn(A'(Split))(a))})))) = FIX'{strict (\h.up(X a. case a of 0: 0 1: U{1 n dn(h)(l') \ (l,V)€cross{l,l)} Oe: LJ{1 n dn(h)(V) \ (l,l')ecross(0,le)Ucross(l,Oe)} le: 1 ))) = FIX'(strict(Xh.up(X case a of 0: 0 1: dn{h){l) Oe: dn(h)(le) le: 1 )))
a.
= up(Aa.case a of 0: 0 1: 0 Oe: 1 le:l) Thus
FIX'(strict(\h.up(\a. case dn(A'(lsnil))(a) of 0: 0 T: 1 F: LJ{/ n dn{h)(l') \ (/,/')edn(dn(A'(Split))(o))} 1: 1 U
=
FIX.'(strict(Xh.up{\a. case a of 0: 0 1: LJO n dn{h)(V) | (l,l') Oe: U{/ n dn(h)(l') | (/,/')ecross(0,le)Ucross(l,0e)} le: 1 )))
=
FlX'(strict(\h.up{\a. case a of 0: 0
7.2
Improved Strictness Analysis
247
1: indn(h)(l) Oe: (Ondn{h)(le)) U le: 1 ))) = wp(\a.case a of 0: 0 1: 0 Oe: 0 le: 1) Thus also dn([sum 2 ](A / )(void)) equals the optimal result of Example 7.1.5. This is in contrast to what happened in Example 7.2.4 and is due to our use of tensor product. • Judging from Examples 7.2.3 and 7.2.9 we can obtain the optimal results for the hd, length and sum functions using either Case-analysis or the tensor product (with a few additional operators that could be dispensed with at the price of a more complex theory). One should take care, however, to note that there is a certain 'duality' in the sets considered. For run-time lists we are using right-closed sets whereas for tensor products we are using left-closed sets (that are additionally closed in each component). The use of left-closed sets is rather natural for abstract interpretation as is evidenced by the central role the lower powerdomain plays in many formulations of abstract interpretation. The use of right-closed sets for lists seems to be necessary to capture the essence of Wadler's insight: the ability to describe long finite lists that may have arbitrary elements except that one of these has to be undefined, that is J_. In the terminology of [2] one might say that the Wadler-like analysis of lists necessitates a formulation of liveness aspects in addition to the safety aspects. It remains to demonstrate the correctness of the analysis A'. For this we shall follow the approach of Section 7.1 (and Subsection 7.2.1) but we have to change the correctness predicates to reflect the differences between the type parts of A and A'. So we set > a~3T) A » a3F) valtl(vi,ai) (3(aua2)edn(a): where (^1,^2) = dn(v))
A valt2(v2,a2)
and assume that the remaining clauses for val' are as for val. The latter clause clearly demonstrates the intention with the tensor product: an abstract property is a set of pairs of properties and for a given value only one of these pairs needs to be applicable. Just as we defined the predicate comp from val we may define the predicate comp' from val'. We thus obtain an admissible predicate
248
7
Abstract Interpretation
val[ : S(t) x A'(t) -> {true,false} for all well-formed types t of run-time kind, and an admissible predicate comp't : [*](S) x {t](Af) -> {true,false} for all well-formed types t of compile-time kind. Our main 'local correctness result' then is the following analogue of Corollary 7.1.29: L e m m a 7.2.10 We have comp'(S (())), Af((f))) for every operator (j) that is either Prod or S p l i t or is one of the operators of Table 5.1, except those of form fi or Fi and provided that the type t indexing f ix[t] is always composite. • Proof: Given Corollary 7.1.29 it is only necessary to consider the operators Cond, True, False, Tuple, Fst, Snd, Prod and Split. The proof is rather straightforward and we dispense with the details. • It hardly comes as a surprise that we also have the following analogue of Lemma 7.1.30: L e m m a 7.2.11 ('Structural induction') Let penv be a well-formed position environment (in the sense of Section 5.2) such that K
h e:t
If
holds for all operators that occur in e, then
comp'(le](S)t[e](A'))
•
Proof: The proof is by structural induction on the expression e much as in the proof of Lemma 6.2.4. We dispense with the details. • We also have an analogue of Corollary 7.1.31 and as in Section 7.1 this also carries over to programs of the mixed A-calculus and combinatory logic; this then ends the demonstration of the correctness of A'.
Bibliographical Notes There is a wealth of literature on program analysis whether in the form of data flow analysis (as surveyed in [54]) or abstract interpretation (as surveyed in [4]). Strictness analysis, in the form of abstract interpretation, was originally conceived by Mycroft [62] and a useful extension to lists was first given by Wadler [104]. Since
7.2
Improved Strictness Analysis
249
then many authors have tried to extend strictness analysis to larger fragments of functional languages (which is 'easy') in a way which maintains the naturality of Wadler's approach (which is 'hard'); we believe the twist used here to be new but some other references are [29, 105, 14]. On the subject of abstract interpretation we have been rather modest in our treatment of techniques. We have only considered the safety aspects and [2] is a good reference on the dual notion of liveness aspects. Also we have only dealt with safety rather than techniques for inducing the best safe analysis and how the study of expected forms may make this more applicable in practice; we refer to [78] and [71] for references on this. Another topic not dealt with is the distinction between first order and second order analyses and the relation to partial equivalence relations which is again related to projection analysis [105]. There is a discussion of the relationship between abstract interpretation and projection analysis in [14] but it is not entirely satisfactory. For functional languages one often formulates abstract interpretation in 'BHAstyle' [15]. Our work is closer to the 'TML-style' [66, 70, 71, 45, 78] whose notation is inspired by the 'Cousot-style' [20] (after the 'inventors' of abstract interpretation). As a guide to the literature we shall briefly compare some of the salient features of the different styles, but we do not have the space to explain the concepts fully. One issue is that of notation: BHA-style Abs Cone abs
TML-style a (or abs) 7 (or con)
Cousot-style a 7
Here a is the abstraction map; it maps a set of values in the standard semantics to the single property that best describes all of them. (For strictness analysis of the integers it would for example map a set {3,27,- • •} of values to the strictness property 1 whereas the set {_!_} would be mapped to the strictness property 0.) Next 7 is the concretization map; it maps a property from the analysis to the set of values in the standard semantics that are described by that property. (For strictness analysis of the integers it would map 1 to {••-,-1,0,1,-•-,±} and 0 to {_]_}.) Finally /? is the representation map (called abstraction map in [15]); it maps a single value in the standard semantics to the property that best describes it. It should thus come as no surprise that a(Y) = \J{/3(y)\y€ Y}. Another issue is how to define maps a*, f3t and jt in a structural way over types t. Here the 'TML-style' allows much greater freedom than the other approaches. One example is the ability to use the tensor product in one analysis and the cartesian product in another. Another example is the ability to consider more complex type constructors like recursive types. Finally the study of expected forms is important for implementations of the analyses. A third issue is the weakest set of assumptions that can be used to obtain
250
7
Abstract Interpretation
some kind of theory. (Clearly the stronger the assumptions the more interesting the theory.) Here the cCousot-style' focuses on the concretization map 7 without requiring (0^,7) to form an adjunction (also known as a Galois-connection). In the cBHA-style' and the (later versions of the) cTML-style' focus is placed on a predicate like vol. This predicate is quite often of the form val(v,a) = f3(v)
Exercises 1. Depict the complete lattice A((Int l i s t ) l i s t ) and describe the meaning of its six elements. 2. Write bott — J-A(t) a n d topt = T ^ t ) . Give an inductive definition of bott and topt that is consistent with this. 3. Prove that co7npt->int(S(Zero),A(Zero)) where S(Zero) = up(Xv. 0) and A(Zero) is as in Example 7.1.1. 4. Prove that comft I n t x I n t _ I n t (S(+),A(+)) where S(+) = up(\v.
t;i + ^2 where (viJV2)=dn(v))
(taking ±+v=±=v-\-±)
and A(+) is as in Example 7.1.1.
5. (*) Define the type part of an interpretation Am that more closely mirrors the semantics Sm of Chapter 5. Extend the interpretation with its expression part and formulate the safety predicate. Prove the safety of a few operators. 6. Let Mi and M2 be partially ordered sets. An isomorphism 0 from M1 to M2 is an injective and surjective function 0:Mi—>M2 such that Vm,ro'eMi: mQmf <£> 0(m%0(m') Show that an isomorphism is a continuous function and that its inverse is also an isomorphism.
7.2
Improved Strictness Analysis
251
7. Let (Mi,crossi) and (M2,cross2) be tensor products of L and V. Show that there exists an isomorphism 6 (as in Exercise 6) from Mi to M2 such that 9ocross\ — cross2. (Hint: consider crossf2 and crossf1 where/®1 denotes/® with respect to (M\,cross\).) 8. The lower powerdomain P L ( ^ ) of a Scott-domain D may be defined as
Show that is a complete lattice, and is a Scott-domain, and that Xd.LC({d})f]B]j
is a continuous function from D to P
9. Show that (PL(DxD% \(Y,Y').{(yyy')\yeYhy'eY'})
is a tensor product
of P L ( / } ) and PL(^D') and determine the formula for /®. (Hint: you may assume that D and Df are finite so that BD=D and B j r ^ D ' , but the result holds in general.) 10. (*) Define a function
by induction over well-formed run-time types £ such that valt(v,a) = /?* where va/t is the correctness predicate of Table 7.2. Show that fit is strict and continuous (and that it maps compact elements to compact elements).
Chapter 8 Conclusion In the previous chapters we have focused on the theoretical development of the language of the mixed A-calculus and combinatory logic (Chapters 2, 3 and 4) and on the different standard and non-standard semantics of the language (Chapters 5, 6 and 7). There are two immediate application areas for this work, one is in the efficient implementation of functional languages and the other is in denotational semantics.
8.1
Optimized Code Generation
Much work in the community of functional languages has been devoted to the development of efficient implementations. This is well documented in [86] which contains a number of techniques that may be used to improve the overall performance of a cnaive' implementation. However, the theoretical soundness of all these techniques has not been established (although [52] goes part of the way). We believe that the main reason for this is that it is not well-understood how to structure correctness proofs even for naive code generation schemes. So although we have a firm handle on how to prove the safety of large classes of program analyses it is less clear how to formally prove the correctness of exploiting the analyses to generate 'optimized' code. Before addressing the question on how to improve the code generation of Chapter 6 let us briefly review the techniques we have used. The code generation is specified as an interpretation K (in the sense of Chapter 5) and its correctness is expressed by means of Kripke-logical relations. Turning to the strictness analysis of Chapter 7 we take a similar approach: the analysis is specified as an interpretation A and its safety is expressed using logical relations. The idea is now to specify the optimizing code generation schemes as interpretations and the goal will be to adapt the technique of (Kripke-)logical relations to express the correctness. 253
254
8
Conclusion
O(Tuple[*0:z>*i><.
RC2, A(Tuple[ ]) In h2)
= (K(Fst[ ]), A(Fst[ ]))
O(Snd[t l2i < 2 ]) = (K(Snd[ ]), A(Snd[ ])) X(RC 1,hl)-KRC2M)Q(Cons[f n -»Mi8t]) = (K(Cons[ ]) Rd RC2, A(Cons[ ]) hx h2) O(Hd[*]) = (K(Hd[ ]), A(Hd[ ]))
[])) ), A ( I s n i l [ ])) Table 8.1: The optimizing interpretation O (part 1)
8.1.1
Using local strictness information
The code generated by K can be improved if we have information about strictness. To see this consider the code generated for et • e 2: [ex • e 2 ](K) = Xenv.Xd. DELAY([e 2 ](K)(env)(d)):([ Cl ](K)(env)(d)) Here the computations of e2 are postponed using the DELAY instruction because the result produced by e2 need not be required by t\. However, if we do know that t\ is strict then we also know that the value of e2 will be needed and then we can dispense with the DELAY instruction and thus improve the code. To obtain such optimizations we shall specify an interpretation O that, essentially, performs K and A 'in parallel'. This means that the strictness information will always be available so that it may be used to generate better code for operators like • . The type part of O has O(ti z± t2) = K(h =± t2) x A(*i =± t2) and the expression part is given in Tables 8.1, 8.2 and 8.3. The interpretation of most of the operators is fairly straightforward because they do not make any use of the strictness information. The interpretation of • in Table 8.2 consults the strictness properties of its first argument to see if the DELAY instruction for the code can be omitted. In the definition of O(f ix[^lz=>f2]) given in Table 8.3 we use that the strictness information is independent of the code generated so in the definition of the functional H2 we simply supply H with an 'arbitrary' first argument, in this case \d\]. Having obtained the strictness information for the fixed point we may use it when we define the functional Hi.
8.1
Optimized Code Generation
255
o(fi[*]) = fj where f; are appropriate elements of W(O) 0(Fi[*oz±*i]) = (K(F;[ ]), A(Fj[ ]))
O(n[(i 1 ^ 2 )x(i 0 ^i)]) = X(Rd, hJ.XV R.C h2). (RC, A(D[ ]) hi h2) where if dnihi) i. = J-, RCi(d) ^ ± and Rd (d)?± RC = Xd.< DELAY(RC2(d)):(RC\(c 0) if dn(hi) ± 7^ J_, Rd(d) ^ 1 and RC, >.(d) * -L _L otherwise O(ld[i]) = (K(Id[ ]), A(ld[ ])) O(True[*]) = (K(True[ ]), A(True[ ])) O(False[*]) = (K(False[ ]), A(False[ ])) Q(Coild[^n—*t-\ j) = XyRCy \Jl\) .XyRO 2?^2y -y\{RC
(K(Cond[ ]) RCX RC2 RC3, A(Cond[ ]) Table 8.2: The optimizing interpretation O (part 2) Example 8.1.1 Consider the sum expression of Example 7.1.5: f i x (Af[ ].Cond(lsnil[ ], Zero[ ] , + [ ] • Tuple(Hd[ ], f D Tl[ ]))) where we have omitted the type information. From Example 6.1.6 we have that [sum](K)(void)(d) equals CALLREC(D, ENTER:RESUME:ISNIL: BRANCH(CONST 0 , DELAY(ENTER: DELAY(DELAY(RESUME:TL:RESUME):
CALL d): SWITCH:
DELAY(RESUME:HD:RESUME): TUPLE):
RESUME:ENTER:SND:RESUME: SWITCH:FST:RESUME:TUPLE:PRIM
The interpretation O will give
256
8
Conclusion
O(f ix[t]) = FIX where FIX H = LJn Hn{±) if t is pure O(tix[tlz±t2])
= A#.(K(f ix[ ]) Hi, A(f ix[ ]) H2) where
tf x i?C = RC where (flC, fe') = H{RC, A(f ix[ ]) # 2 ) H2h = h' where (/?C, A') = ff(Ad.[], A) O(fix[i x xi 2 ]) = Ai/.(i/j, # 2 (#i)) where tfa = O(f ix[« 1])(Av1.w1 where (w b w 2 ) = tf((vlvff2(Vl)))) H2 = Av!.O(f ix[*2])(Av2.w2 where (w1,w2) = H{(vuv2))) and ijXi 2 is composite but not pure io^(ii=Li2)]) - X(RC,h). (K(Curry[ ]) RC, A(Curry[ ]) h) i 0 ^ i ] ) = (K(Apply[ ]), A(Apply[ ])) 0(Fix[*o=±*i]) = (K(Fix[ ]), A(Fix[ Table 8.3: The optimizing interpretation O (part 3) [sum](O)(void) - {RC, [sum](A)(void)) where RC(d) is as above but without the underlined DELAY instructions. To see this note that the outermost DELAY instruction is dispensed with because + is strict and the innermost one is dispensed with because f itself is strict - the definition of O(f ix[ ]) for frontier types ensures that this information can be used for the recursive call. • Correctness issues The correctness of O can be proved using Kripke-logical relations. Here we shall briefly outline one such approach; the details of a similar approach may be found in [50] (for a subset of the language without lists). Basically, the proof is in three stages: Stage 1: It is shown that the code component of O satisfies the substitution and well-formedness properties of Chapter 6. The properties are expressed by predicates compSt- [£](O) —> {true, false} compWt: [£](O) —> {true, false} The definitions and subsequent proofs are rather trivial modifications of those found in Chapter 6.
8.1
Optimized Code Generation
257
Stage 2: It is shown that the strictness component of O is safe (in the sense of Chapter 7) with respect to the standard semantics S. This result is crucial for the definition of O(fix[tiz±t2]) in Table 8.3. The property is expressed formally by a predicate compAt: [i](S) x [i](O) —> {true, false} The interesting clause in the definition of this predicate is compAil:±t2(g,(RC,h))
= comptl=1t2(g,h)
The proof of the property is a straightforward modification of the proof in Chapter 7. If desired it may be augmented to show that the strictness component of O actually 'equals' A. Stage 3: Finally, the correctness of O is expressed by the predicate compCt: [ {true, false} The interesting clause in the definition is compCil=±i2((RC,h),g)
= compW tl^t2{RC ,h) A compAtl:=Lt2(g,(RC,h)) A compWCtlz±t2{RC,g)
where compWCt1-±t2{RC^g) is defined much as in Section 6.4. The proofs follow those of Section 6.4 and only the cases of • and fix are non-trivial. We refer to [50] for the details. We conclude the discussion of the correctness of O by observing that the notion of layered predicates has been extended to include the safety property of the strictness analysis. The 'layer' may therefore be depicted as follows: C
w
8.1.2
Using right context information
Example 8.1.1 shows that the interpretation O gives rise to better code because fewer computations are delayed. However, local strictness information does not suffice for dispensing with all the DELAY instructions we could hope for. Also it does not allow us to remove any of the seemingly unnecessary RESUME instructions. To get such improvements we need information about the context of the expressions. We shall distinguish between two kinds of contexts:
258
8
Conclusion
• right context information is concerned with the remaining computations, i.e. how the result of the current expression is used, and • left context information is concerned with the computations that already have been performed, i.e. how the argument supplied to the current expression has been produced. We shall now sketch how the interpretation O can be extended to take right context information into account so as to generate even fewer DELAY-instructions. The aim will be to show that the concept of parameterized semantics does indeed allow us to specify a wide range of optimizing code generation schemes. Strictness continuations In the code generated for Tuple(e 1? e2) we have so far postponed the computations of ei and e2 because we do not know whether or not their results are needed in the remainder of the computation. However, sometimes we do know that the result is needed; an example is the sum program where we have +[ ] • Tuple(- • •,• • •) and where we know that + is strict in both components of its argument. When this is the case it will be safe to dispense with the DELAY instructions. To perform this optimization we need information about the future use of results of subexpressions. This information can be provided by a strictness continuation which, basically, is a function that tells whether the remainder of the computation is definitely strict or not. Formally, a strictness continuation K (for a type t\ z± t2) is a function in the domain A(i 2 ) -> 2 The interpretation O r will be an extension of O with strictness continuations as right context information. The type part of O r is O r (t x =t h) = ((A(i 2 ) -+ 2) -> K(*x =t t2)) x A(h =± t2) Most of the operators either ignore the strictness continuation or pass it on to their arguments. A couple of examples are Or(Hd[ ]) = (A«.K(Hd[ ]), A(Hd[ ])) O r (D[ ]) = A(/1,A1).A(/2,A2). (/, A(D[ ]) hx h2) where if2{Kodn(hl))(d)):(fi(K)(d)) if (Kodn(hi)) _L = 0, /,(«)(d) ^ J. and/ a («odn(A 1 ))(d) / JL f = \n.\d.\
DELAY(/2(/codn(&i))((f)):(/i(/e)(d))
if (Kodniht)) 1 / 0 ,
± J.
otherwise
8.1
Optimized Code Generation
259
Note that here the strictness continuation K for the complete construct is supplied unchanged to the first parameter (/i) whereas it is updated to (Kodn(hi)) before it is supplied to the second parameter (f2)- Also note that the strictness continuations are computed in a 'backward manner'. The real benefit of the strictness continuation is apparent for the Tuple operator where we have M
(/, A(Tuple[ ]) hx h2)
where/ = AAC.Ad.ENTER: f2(\a.K{up(T,a)))(d)
if K{UP{T,±))
=0
DELAY(/2(Aa./c(iy?(T,a)))(d)) otherwise SWITCH:
f1(Xa.K{up{aJ))){d) if K(up(±J)) DELAY(fi(\a.K(up(a,T)))(d)) otherwise TUPLE
=0
There are at least three different ways to handle the fixed point operator fix in the case of frontier types: • to ignore the strictness continuation in all calls — this is often called in£raprocedural analysis and corresponds to using the most conservative strictness continuation Aa.l, • to specialize the code generation to the actual strictness continuation for each (recursive or initial) call — this means that we may have several versions of the code for the same fixed point, and • to collect and approximate the strictness continuation for all calls — this is often called m£erprocedural analysis and corresponds to the effect obtained when the program is annotated as in a sticky analysis. Only the first two approaches can easily be specified in the framework of parameterized semantics as developed in Chapter 5. In the first case we simply have Or(f ix[t1=±t2]) = XH. (/, A(f ix[ ]) H2) where / = AAC. K(f ix[ ]) Hi Hx RC = f (Aa.l) where (/',/*') = H{\K'.RC,
A(f ix[ ]) H2)
H2h = h' where (/',/*') = ff(A*.Ad.[], h) Unfortunately, the effect of using this definition is that no optimizations take place inside the body of the fixed point. So alternatively we may use the second approach. The idea now is to introduce a table mapping strictness continuations to labels. Initially no labels have been generated so
260
8
Conclusion
O r (f ix[ ]) - \H. {spec (A/c.-L), A(f ix[ ]) H2) where H2 is as above and spec generates the appropriate code: spec tab = AACO.A / ^o ^o) where (f,h)
= H(\K.\d.if
tab[KQV->d$\ K ^ JL then CALL (tab[K,oi-^do] K) else CALLREC(d, spec (tab[KOy->do][K\-^d]) K ( d + 1 ) ) ,
A(f ix[ ]) H2) Example 8.1.2 Consider the expression sum. Applying the interpretation O r sketched above together with the initial strictness continuation Xa.a we get the code CALLREC(D, ENTER:RESUME:ISNIL: BRANCH(CONST 0,
ENTER: RESUME:TL:RESUME:CALL d: SWITCH: RESUME:HD:RESUME: TUPLE: RESUME:ENTER:SND:RESUME: SWITCH:FST:RESUME:TUPLE:PRIM
so that no DELAY instructions are generated. Note that the optimization has rendered the underlined instructions unnecessary but that this is not exploited by the code generation. • The idea of using strictness continuations as an additional parameter to the coding interpretation was introduced in [82]. The interpretation is specified in some detail in [50] which also contains a correctness proof for a first-order subset of the language without lists. The proof follows the same lines as the previous proof but is much more complicated because the strictness continuations are passed around as parameters.
Evaluation transformers Strictness continuations are closely related to the evaluators of Burn [13]. Consider for example an expression e of type Int l i s t z± Int and assume that the strictness continuation for the result of e is Xa.a. Then the domain of strictness continuations for the argument of e contains five elements as indicated below:
8.1
Optimized Code Generation
«1 «2 «3 K4
261
0 1 Oe le 1 1 1 1 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0
For the sum expression the strictness continuation will be /C3, for the length expression K2 and for the hd expression ACI. The strictness information may be used to determine the degree to which it is safe to evaluate the argument of the expression. This is quite analogous to what is expressed by the evaluators of [13]; and indeed K\ corresponds to & for 0
8.1.3
Using left context information
The code generated by K makes no assumptions about the argument on top of the stack. Consequently the code generated contains a lot of RESUME instructions that ensure that the argument is evaluated to the degree required. As an example, +[ ] requires that its argument is fully evaluated so the instruction PRIM + must be preceded by a sequence of instructions that ensure that this definitely will be the case. However, if we do know that the argument is fully evaluated (as in the case of the sum expression of Example 8.1.2) then there is no need to generate the sequence of instructions preceding PRIM +. (In terms of Example 8.1.2 one could thus dispense with the underlined instructions.) The idea is now to keep track of the degree to which the arguments have been evaluated and thereby avoid generating some of the RESUME instructions. This is a typical example of left context information as it is concerned with the computations that have taken place in the past. Below we shall sketch how the interpretations considered so far may be extended to take this information into account.
Evaluation degrees We shall be interested in properties expressing the degree to which values have been evaluated. So for each run-time type t we shall define an appropriate domain
E(Ai) = 2
262
8 E{t1xt2)
= {E(t1)
Conclusion
x
E(t! => i 2 ) = (E(<x) -> E(i 2 )) ±
E(i l i s t ) = (O(E(t))±)± where O(D) is the domain of non-empty right closed subsets of D as in Chapter 7. The rationale for using the two-point domain for the base types is that a base value either is fully evaluated or is a thunk: 1: the value is definitely fully evaluated, 0: the value may not be fully evaluated. The interpretation of the type constructors mimics that of the strictness analysis of Chapter 7. For products we use lifted cartesian products as this allows us to distinguish between: upfcify):
the value is definitely a pair with the components evaluated to degree t\ and e2, respectively,
±:
the value need not be evaluated to a pair.
For lists we have: Ye: the spine of the list is fully evaluated and for each element of the list there is an abstract value in Y describing its evaluation degree, 1:
the list is evaluated to head normal form, so it is either [] or it has the form •••:•• •, i.e. CONSps(- • v • •),
0:
the value need not be evaluated to head normal form.
The execution of the DELAY instruction will always result in a value that is a thunk; this means that the bottom element of E(f) precisely describes its evaluation degree. On the other hand the execution of the RESUME instruction will always result in a value that is evaluated to some extent. We shall write uj t (or just u) for the (unique!) value of E(£) just above J_; we thus know that RESUME always will return a value that is at least evaluated as specified by LOtLet us first see how the interpretation K can be modified to take evaluation degrees into account. The type part of the interpretation Kj will have Ki(t! => t2) = (E(h) -> K(*! =± t2)) x (E(
E(*2))
The intention is that if (f,g) € Ki(f i z± ^2)a n d ^ n e evaluation degree of the element on top of the stack is e then f(e) describes the (relocatable) code generated and g(e) describes the evaluation degree of the element on top of the stack when the execution of the code has terminated. Note that g(e) ^ J_ always holds. The interpretation of the operators is mostly straightforward. Some of the interesting clauses are:
8.1
Optimized Code Generation
263
(Ae.K(Tuple[ ]){fx e)(/ 2 e), A Ki(Fst[ ]) = (/, Ae.(u; U ei where (ci,c2) = dn(e))) where
{
RESUME:FST:RESUME if e = _L FST:RESUME
if e =
up(±,t2)
FST
otherwise
K 1 (+[]) = (/,Ac.l) where RESUME:ENTER:SND:RESUME:SWITCH: FST:RESUME:TUPLE:PRIM + ENTER:SND:RESUME:SWITCH:FST:
RESUME:TUPLE:PRIM +
f = \e.\d.
if e = _L
if e = up(0fi)
ENTER:SND:RESUME:SWITCH:FST:
TUPLE:PRIM +
if e = up(lfi)
ENTER:SND:SWITCH:FST:RESUME:
TUPLE:PRIM +
if e -
PRIM +
up(0,l)
if e = up(l,l)
±)(/ 2 e), Ae.(^ ±))) Note that in the last definition fi is applied to the evaluation degree ± because its argument always will be a thunk. The fixed point f i x for frontier types can be handled in different ways: one is to ignore the information about evaluation degrees as in Ki(f ix[ ]) = A#.(Ae.K(f ix[ ])#', Ae.u) where H' RC = / ' ± where (/',#') = H(\e.RC, Xe.uj) Another possibility is to generate specialized code for each evaluation degree. This will involve introducing a table associating labels with evaluation degrees much as we earlier introduced a table associating labels with strictness continuations. Also one can imagine a sticky variant of the analysis of evaluation degrees but as in the previous subsection it is not clear how to formulate this as an interpretation. Example 8.1.3 The effect of using Ki rather than K is minimal as it only allows us to remove rather few RESUME instructions. As an example consider the sum expression and assume that the evaluation degree of the argument is Oe meaning that the spine of the list is evaluated but that the elements need not be evaluated. If we specialize the code generated to the actual evaluation degree we get CALLREC(d,ENTER:ISNIL:
264
8
Conclusion
BRANCH(CONST 0, DELAY(ENTER: DELAY(DELAY(TL):CALLREC(d+l,SWITCH: DELAY(HD:RESUME): TUPLE): RESUME:ENTER:SND:RESUME:SWITCH: FST:RESUME:TUPLE:PRIM
where CALLREC(d+l,- • •) is the usual code generated for sum using K (see Example 8.1.1) but allocated from label d+1. Thus only at the outermost level can we make use of the fact that the argument is partly evaluated; when the recursive call is encountered the argument will definitely be a thunk because of the DELAY instruction and we consequently have to use the 'less optimal' code. •
Combining left and right contexts The effect of using evaluation degrees as left context is vastly improved when combined with strictness information. The reason is that when we dispense with some of the DELAY instructions also more RESUME instructions will be unnecessary. The interpretation Ori sketched below will combine left and right context information. The type part of Ori is Ori(*i =± t2)
= (E(*!) x (A(i 2 ) -+ 2) -+ K(h =± t2)) x (E(h) x (A(i 2 ) -> 2) -> E(i 2 )) X A(*! z± t2)
Compared with Ki we note that we need the evaluation degree of the argument as well as the strictness continuation in order to determine the evaluation degree of the result. The reason is that the code generated will depend upon both these kinds of information. The expression part of Ori combines the information of O r and Ki. As an example consider the definition of O r i(n[ ]):
Orl(D[ ]) = Xifl,glth1)^(f2,92M
(/, 5, A(D[ ]) h h2)
where / and g are defined below. First we have (f2(t,K°dn(h1))(d)):(f1(g2{e,Kodn(h1)),K)(d)) if (Kodn(hi)) J_ = J_,
i. and
if /I(_L,AC)(C/)
±
otherwise
^ _L and f2^1Kodn[hi))yd)
^ _L
8.1
Optimized Code Generation
265
Note that the strictness continuation is updated as in O r (n[ ]). In the case where a DELAY instruction is generated we supply / i with the evaluation degree _L exactly as in Ki but if no DELAY instruction is generated we use g2 to get more precise information. Turning to the definition of g we have 9= Note that g1 is supplied with the parameter <72(^°dn(/ii)) when the DELAY instruction is not generated. The combination of evaluation degrees and strictness continuations may also be used in the definition of the interpretation of the Tuple operator: Orl(Tuple[ ]) = \{fugiA).\{f2,92M)-
(/, <7, A(Tuple[ ]) hx h2)
where / = A(e,«;). Xd. ENTER: / 2 (c,Aa./c(tip(T,a)))(d)
if K(UP{T,±))
DELAY(/2(e,Aa.«(ttp(T,a)))(d))
otherwise
=0
SWITCH:
f f1(eMAM^J)W)
\ DELAY(/i(e,Aa.«(up(a,T)))(d)) TUPLE
if *{up(±J)) = 0 otherwise
Thus the strictness continuation is used to dispense with the DELAY instructions for the two components and the information about evaluation degrees will ensure that this is recorded in the evaluation degree of the result so that the subsequent computations can make use of it. The function g is defined by g = \(t,K,).up(ii Ac(wp(_L,T)) = 0 then gi(e,\a.K,(up(a,T))) if K,(up(T\-L)) = 0 then g2(eJ\a.K(up(T,a)))
else _L, else JL)
Concerning the fixed point operator f i x we have the same options for frontier types as in the previous subsections. However, we shall not go further into this here. Example 8.1.4 Assume that the evaluation degree of the argument of sum is 0. If the initial strictness continuation is Xa.a we then get the code CALLREC(FILENTER:RESUME:ISNIL: BRANCH(CONST 0,
ENTER: RESUME:TL:RESUME:CALL d:
266
8
Conclusion
SWITCH: RESUME:HD:RESUME: TUPLE:
PRIM +)) Compared with Example 8.1.2 we see that the superfluous DELAY and RESUME instructions are no longer generated. If the evaluation degree of the argument of sum is known to be (te, that is the spine of the list has been evaluated but not necessarily the elements, then O ri allows us to dispense with even more RESUME instructions, namely those that are underlined in the above code. •
8,1.4
Pre-evaluation of arguments
In the optimizing interpretations seen so far we have used local strictness information and strictness continuations to avoid generating superfluous DELAY instructions. However, knowing that an expression e is definitely strict in its argument (in a given context) also means that it is safe to evaluate its argument before evaluating e itself. So if C is the code for e then it will be safe to emit the code sequence RESUME: C. Such optimizations are regarded as very valuable in [86] and in our setting they are particularly interesting for operations like fix, Cond and Tuple where we may risk evaluating the argument more than once. The optimizations of f i x and Cond may be performed on the basis of local strictness information. First consider the following modification of the interpretation of f i x given in Table 8.3: O'(f ix[ti z=> t2)) = \H.(RC,
A(f ix[ ]) H2)
[ RESUME:(K(f ix[ ]) Hx d) if dn(A(f ix[ where RC — Xd.{ { K(f ix[ ]) Hi d otherwise
])H2)±=±
and Hi and H2 are as in Table 8.3. Here we simply use that if the overall expression is strict then it is safe to evaluate its argument before the expression itself. Next consider how to modify the interpretation of Cond: O'(Cond[ ]) = (RC, A(Cond[ ]) ^ h2 h3) where RESUME:(K(Cond[ ]) RCX RC2 RC3 d) if (dn{h1)±=±)
V (dn(h2)-L=±- A
dn(h3)±=±)
and RCt{d) / ±, RC2(d) ^ ±, RC3(d) ^ ± K(Cond[ ]) RCX RC2 RC3 d otherwise
8.1
Optimized Code Generation
267
It is safe to make this modification because the conditional is strict in its argument if the test is strict in its argument or both branches are strict in their argument (see Table 7.4). The optimization of Tuple requires that we have information about strictness continuations as it is now crucial that at least one of the components of the pair will be needed in the future computations. So we can modify O r to have O;(Tuple[ ]) = A(/ 1 ,/ ll ).A(/ 2 ,/ l2 ). (/, A(Tuple[ ]) hx h2) where RESUME:(/?C(d)) if ( K ( U P ( T , ± ) ) = 0 ) V («(«p(J.,T))=0)
= \K.\d.
and RC(d) ^ 1 RC(d)
=
otherwise
Ot(Tvple[])(f1,hl)(f2,h2)
Combining pre-evaluation and evaluation degrees So far the effect of the pre-evaluating RESUME instructions has been 'lost' because the subsequent code has not taken advantage of them. To overcome this deficiency we must record the effect of the pre-evaluating RESUME instructions and this is exactly what evaluation degrees do for us. As an example consider the following modification of O(n[ ]): 92M
O,'(n[ ]) =
(/, 9, A(D[ ]) ^ h2) where
RESUME:(/2 u d):(fi (g2 w) d) if dn(hl)(dn(h2)±) = ±, e = ±, / i (g2 u>) d ^ ± and f2ud^±
= \e.Xd.
(/ 2 c <*):(/! (g2 e) d) if (dnih^dnih^L) ^ i. V e ± ± ) , dn(h{) ± = ±, /x (g2 e) d ^ ± and f2 t d
i.
DELAY(/2 e d):(/x ± d)
if dn(hi)
_L
g — Ae.
gi{g2(t))
±?±,
otherwise if(dn(h and
_L = _L
We leave the details to Exercises 3, 4 and 5 and simply illustrate their intended effect in the following example.
268
8
Conclusion
Example 8.1.5 Consider the sum expression once again and assume that the evaluation degree of the argument is 0 and that the initial strictness continuation is Xa.a. The interpretation Ori extended to do pre-evaluation of the argument will then give rise to the code RESUME:CALLREC(D,ENTER:ISNIL: BRANCH(CONST 0,
ENTER: TL:RESUME:CALL d: SWITCH: HD:RESUME: TUPLE: PRIM + ) )
Here the list is evaluated to head normal form just before each recursive call.
8*2
•
Denotational Semantics
Much effort has been devoted to the construction of compiler-generating tools based on semantic formalisms, not least denotational semantics [6, 18, 46, 51, 85, 89, 95, 106]. None of these approaches have succeeded in generating compilers that have been shown to be correct and that additionally generate code of reasonable run-time efficiency compared with hand-crafted compilers. Naturally, this calls for research directed at understanding and automating those 'ingredients' in the construction of hand-crafted compilers that give them better performance than those generated by systems. We believe that the techniques developed in this book present a step forward in this direction. Usually, the metalanguage of denotational semantics is a typed A-calculus, more precisely an extension of the language of Chapter 2 with sumtypes and general recursive types (see e.g. [91, 96]) and fortunately most of the techniques developed in this book extend to larger languages (see e.g. [75, 72, 78]). In this section we shall illustrate how the techniques can be applied to a denotational specification of a toy programming language. One of the important characteristics of hand-crafted compilers is that they distinguish between those computations that are performed by the compiler and those that are performed by the code generated [5]. Efficiency is then obtained by ensuring that as many computations as possible are performed once and for all, that is by the compiler. To this end a number of techniques may be usable: • errors may be detected at compile-time by type checking or type inference, • the association of identifiers with values is a two-stage mapping: an environment maps identifiers to storage locations and a state maps storage locations
8.2
Denotational Semantics
269
to values; computations involving the environment may then be performed at compile-time, • the layout of run-time storage is known at compile-time and may be exploited during code generation, and • various data flow analyses may be performed to improve the quality of the code generated. The material presented in the previous chapters shows how such techniques can be applied directly to a functional language. In this section we show how they can be carried over to the semantic specification of an imperative language Imp.
8.2,1
The language Imp
We shall consider an extension of a simple while-language with the following features: • a block construct allows declarations of variables and parameterless recursive procedures, • variables are declared by var- and const-declarations and only the former can have their values updated by subsequent assignments, and • it is possible to read an input file and to write an output file. Formally, the abstract syntax of Imp is given by the syntactic categories: a;,pG
Ide
identifiers
n G Num
numerals
a G Aexp
arithmetic expressions a ::= x \ n | a>\ + a2 | * * *
b G Bexp
boolean expressions b ::= di = a2 \ - • -
S E
Stm
statements S ::= x := a \ Sx ; 5 2 | if b then Sx else 5 2 | while b do S \ read x \ write a \ begin D S end | call p
D G Dec
declarations D ::= var x; D \ const x — n\ D \ proc p is S; D \ e
P G Prog
programs P ::= program S
270
8
Conclusion
Semantic domains Turning to the semantics we shall need an environment mapping identifiers to their denotable values. For Imp there are three kinds of denotable values: • locations used for identifiers introduced by var-declarations, • natural numbers used for identifiers introduced by const-declarations, and • store transformations used for identifiers introduced by proc-declarations. The store contains a mapping from locations to storable values and for Imp the only storable values are the natural numbers. Also the store records the current value of the input and output files. We shall express these semantic domains as types in the language of Chapter 2. First we shall need three base types: Int
for natural numbers
Bool
for truth values
Loc
for locations
We then introduce the following shorthands for types:
In =
Int l i s t
Out =
Int l i s t
Store
=
Env =
(Loc—>Int) x (In X Out) (idexLoc) l i s t x ((Idexlnt) l i s t x (ldex(Store—>Store)) l i s t )
We shall use the meta-variables a and p to range over State and Env, respectively, when specifying the semantic clauses. Usually the environment is a single mapping from identifiers to denotable values, e.g. (ldex(Loc+Int+(Store—»Store)))list, but the language of Chapter 2 does not include the sum-type and in order to keep within the type system of Chapter 2 we therefore use the above definition of Env. However, we want to stress that the development to be performed below is equally feasible using the alternative definition.
Semantic functions Corresponding to the syntactic categories we have the following semantic functions:
8.2
Denotational Semantics
271
A[x] = \p.if isvar then ACT. ( f s tCT)(lOOkupEnv(fSt p)(x)) e l s e ACT. lookupEnv(f s t (snd p))(x)
•4[n] A[m
= Xp.A<7jV[nl + «2] = A^.Ao•.+ ((Ala,i}(p) (CT),^[a 2 ](/5)(CT))) = «2l
=
A,ACT •
=
((^[ai](/») (<7)>*4[fl2] (p)(a)))
Table 8.4: Semantics of Imp-expressions Af: Num —•
Int
A:
Aexp —> Env —> Store —> I n t
B:
Bexp -> Env -> Store —• Bool
S:
Stm —* Env —> Store —» Store
I>:
Dec —> Env -> Env
^:
Prog -> In -> Out
Here we have assumed that the syntactic categories Num, Aexp, • • • are encoded as base types in the metalanguage. The semantic clauses are given in Tables 8.4, 8.5 and 8.6. We use the following primitive functions: =[ldexlde—»Bool] =[lntxlnt—>Bool] +[lntxlnt->lnt] isvar[EnvxIde—>>Bool] new[Env—»Loc] init-Env[Env] init-Store[ln—> Store] The intended meaning of = and + should be evident. The function isvar takes an environment p and an identifier x as parameters and tests whether or not x is recorded in the first component of p, that is whether or not it has been introduced by a var-declaration. The function new takes an environment p as a parameter and returns a location that is unused in p. The constant init-Env produces an initial environment and the function init-Store produces an initial store when supplied with the input file.
272
8 5[&:=a] = Xp.Xc (updatestore(fst cr) (lookupEnv(fst
Conclusion p)(x))
snd cr) S{Si ; 5 2 ] = Xp.X tT.5[52](/9)(5[5iK/9)((7)) 5[if 6 then Si else 52] = Xp.test (B[bJ (/>)) (5[5iJ(/))) 5[while 6 do 5] = A/>.f ix(Af .test CB[6] (/?)) (Acr.f (5[5] (/?) (cr))) <S[read x] -= Xp.Xo.(updatestore(fst cr)(lookupEnv(fst
p)(x))
(hd (fst (sndd))), (tl (fst (snd
<J)),
snd (snd a)))
<S[write a] = Xp.Xa.(fst a, (fst (snd <J), <S[begin D S end] = Xp.S[S](V[D](p)) 5[call p] = A/?.lookupEnv(snd (snd /?)) (p)
Table 8.5: Semantics of Imp-statements We also use some auxiliary functions. Two functions operate on the environment: lookupEnv = Atab.Ax.if =((x, fst (hd tab))) then snd (hd tab) else lookup ( t l tab) (x) updateEnv = Atab.Ax.Ay.(x,y):tab One function operates on the store: updatestore = Atab.Ax.Ay.Ax'. if =((x,x;)) then y else tab(x') Finally, we use the function t e s t = Ap.Af1.Af2.Acr. if p(cr) then fi(cr) else f2(cr) in the clauses for conditional and iteration.
8.2.2
Transformations on the semantic specification
We shall now outline how the development of the earlier chapters may be applied to the semantic specification of the language Imp.
8.2
Denotational Semantics
273
Djvar x \D] = A/>.2>[£>]((update^(fst p) (a;) (new(/»)), snd p)) DJconst x=n; D\ = A/>.Z>[D]((fst />, (update Env (f st (snd /?)) (a;) CA/"[ra]), snd (snd />)))) X>[proc p is 5; D] =
\p.V{D}((fst
p, (fst (snd p),
updateEnv(snd (snd p)) (p) (fix(Af.S[S]((fst />, (fst (snd /?), updateEnv(snd (snd p))
Xp.p Pfprogram 51 = A^.snd(snd(5[5](init-Env) (init-Store(0))) Table 8.6: Semantics of Imp-declarations and programs Introducing types The semantic clauses of Tables 8.4, 8.5 and 8.6 are specified in the untyped Acalculus of Chapter 2. An analogue of the type inference algorithm of Chapter 2 can be used to transform the clauses into the typed A-calculus. However, in order for this to succeed we have to duplicate the definitions of the functions lookups™ and updateEnv (see Exercise 2 of Chapter 2) so that we get the following versions: i- (idexLoc)list—>Ide lookupEnv2- (Idexlnt)list—>Ide—>Int lookupEnv3* (ldex(Store^Store))list—>Ide-+(Store—»Store) i- (idexLoc)list—*Ide—>Loc—>(ldexLoc)list 2* (Idexlnt)list—»Ide—>Int—>(ldexlnt)list updateEnv3- (Idex(Store—>Store))list-^Ide—>(Store—>Store) —>(ldex (Store—>Store))list Having done this it is fairly straightforward (but tedious) to annotate the semantic clauses of Tables 8.4, 8.5 and 8.6 with their type information. A couple of examples are given below: <S[a::=a] = A^[Env].A(j[Store].(updatestore(fst a) (lookupEnvi ( f s t p) (x) )
274
8
Conclusion
snd a) <S[while b do S] = A/>[Env].fix(Af [Store->Store]. test (B{b](p)) (A(7[Store].f(5[5](/9)((T))) (Acr[Store].a)) P[var x; D] = A/}[Env].Dp]((updateEnvi(fst p)(x) (new[Env—>Loc] (p)), snd p)) Djproc p is 5; L>] = .D[D]((f st /?, (fst (snd p), updateEnV3(snd (snd p))(p) (fix(Af[Store->Store]. 5[5]((fst />, (fst (snd/?), updateEnv3(snd (snd p))
The metalanguage of semantic specifications So far we have been a bit vague about the relationship between the semantic specifications and the languages of Chapters 2 and 3 (and 4). An illustration of this is our earlier phrase that "an analogue of the type inference algorithm of Chapter 2 can be used to transform the clauses into the typed A-calculus". It should be clear that the right hand sides of the semantic clauses are expressions of the languages studied in Chapters 2, 3 (and 4) provided that we allow *4|ai] etc. as additional primitives. The proper perspective is then that the syntactic category of semantic specifications is built on top on the syntactic categories of expressions and types in much the same way as we have seen that the syntactic category of programs is. We claim that this is rather straightforward and that the adaption of the various algorithms for type inference etc. will not present a major obstacle. To make this claim credible — without going into the details — let us consider how we dealt with programs. In all of Chapters 2, 3 and 4 we • translated the program into an 'equivalent' expression, • performed the transformation on the expression, and
8.2
Denotational Semantics
275
• extracted a program out of the transformed expression. A similar approach is feasible for semantic specifications provided we have sufficient operations available on the primitive types that represent the syntactic categories. As we take the syntactic categories to be base types we simply assume enough primitive operations. As a simple example consider the semantic specification of B of Table 8.4. It may be 'coded' as B = f ix(A#'[Bexp-»Env->Store->Bool].A&[Bexp] . if is-equality(6) then A/>.A(j.=((^(fst-arg(6))(/))((7), ,4(snd-arg(&)) (/>) (a))) else •••) However, we shall not go further into this here. Introducing binding times In a hand-crafted compiler for Imp we would expect that all operations concerning the environment are performed at compile-time whereas those involving the store are postponed until run-time. We shall now see that the distinction between environment and store can be obtained from the semantic specification above by applying the binding time analysis of Chapter 3. The starting point for the binding time analysis is the semantic function V for programs. It is natural to annotate its functionality in the following way: V: Prog —> In —> Out This indicates that the input of an Imp program will not be known until run-time. Thus the minimal annotation of the type of V becomes Prog —» In z± Out. With In and Out as run-time types we then, intuitively, get the following annotations: In
=
Int l i s t
Out
=
Int l i s t
Store
=
(Loc->Int) _>£ (In _*. Out)
Env =
(idexLoc)list X ((Idexlnt)list x (Idex (Store—>Store))list)
Here we have used the convention that the abbreviation for a fully underlined type is also underlined itself. With these annotations we then get the following annotations of the functionalities of the semantic functions:
276
8
AT: A:
Num —>. Int
B: S: V: V:
Bexp —•> E n v --> (Store z± Bool)
Conclusion
Aexp —> Env --> (Store z± Int) Stm —> Env —> (Store —> Store) Dec —> Env —> Env Prog -
it Out)
These annotations indicate that all computations involving the environment can be performed at compile-time whereas those involving the store are postponed until run-time.
Transformations that enhance the binding time annotations In the above annotations we have pretended that the binding time analysis algorithm for types is immediately applicable to the functionalities of the semantic specifications. However, as was discussed above we must 'encode' the semantic specifications in the expressions studied in Chapters 2, 3 and 4. When we do this it turns out that the behaviour of the right hand sides of the clauses do influence the functionalities of the semantic specifications. We shall discuss two incarnations of this problem below. The base type Loc is used in the definition of Env whereas Loc is used in the definition of Store. When annotating the clause for x:=a we determine the location of a: from the environment and use it to update the store. Consequently the binding time analysis of Chapter 3 will insist on Loc being of run-time kind in the definition of the environment Env, and we no longer have the desired separation between environment and store. Looking for a solution we note that Loc refers to compile-time locations and Loc to run-time locations and in hand-crafted compilers these notions are distinct: often a run-time location is an address on the run-time stack whereas a compile-time location may be a pair of nesting depth and offset [5]. So rather than equating the two kinds of locations we shall need a function access that transforms compile-time locations into run-time locations: we shall say that it is a function that encodes the access path. The detailed definition of this function depends on the machine for which we generate code. Since the store will be an abstraction of the machine state we shall assume that access is a primitive function of the type indicated below: access[Loc—*(Store—>Loc)] The semantic clauses can now be modified to use this function whenever we need to pass from a compile-time location to a run-time location. An example is S[x:=a] = Xp[ ].Xa[ ].(update StO relfst
8.2
Denotational Semantics
277 [ ](lookup Env i (f s t
snd <J) Note that the binding time annotation reflects that first x is looked up in the environment at compile-time, then the resulting location is transformed to a runtime location using access, and finally this location is used at run-time to update the store. Next consider the base type Int which is used as a compile-time type in the definition of Env and as a run-time type in the definition of Store. In the clause for x the binding time analysis of Chapter 3 will force the two kinds of integers to be run-time entities. Thus once again the desired separation between environment and store fails. Looking for a solution we note that a hand-crafted compiler often distinguishes between compile-time and run-time data [5]. In particular the layout of run-time data may depend upon the machine we are generating code for: one machine may use 32 bits to represent an integer on the run-time stack whereas another machine may use 64 bits. So rather than forcing the two base types of integers to be the same we shall introduce a primitive function a l l o c that transforms compile-time integers to run-time integers. As the space allocation procedure depends on the machine at hand we shall let the function take the store as an additional parameter: alloc[lnt-+(Store->Int)] The semantic clauses can now be rewritten to use this function whenever we want to transform a compile-time integer into a run-time integer. Two examples are:
A[x] = \p[].if isvar[ then Xa[ ].(f st cr)^access[ ](lookupEnvi(^st p) (x)) (
278
8
Conclusion
the encoding of the access path using access). We regard this as a major virtue of our approach to compiler construction from semantic specifications: traditional compiler writing insights are used to improve the binding time annotations. This is in contrast to using a new 'magic' technique, like partial evaluation. As illustrated in [81] the necessary transformations may often be indicated by disagreement points showing where the binding time annotations 'fail' . Introducing combinators The final stage, in order to be able to apply the notions of parameterized semantics of Chapter 5, is to transform the 5-level A-expressions into expressions of the mixed A-calculus and combinatory logic. Again the overall approach follows the development outlined above. We shall not go into details but merely present a few 'idealized' equations: S[x:=a] = Xp[ ].Tuple(UpdateStOre
D
Tuple(Fst[ ], Tuple(access[ ](lookupEnvi(fst p Snd[ ]) A[x] = \p[].it
isvar[ ]((/>,*))
then Apply[ ] • Tuple(Fst[ ], access[ ](lookup E n v l (fst e l s e alloc[ ](lookupEnv2(f s t (snd p)) (x))
As should be evident these equations are much shorter than what would have been obtained directly from the algorithm of Chapter 4. To obtain the above results we have performed a few simplifications like changing the functionality of Updatestore to become less curried.
8,2.3
Towards a compiler
We have now shown how to apply • type inference (Chapter 2), • binding time analysis (Chapter 3), and
8.2
Denotational Semantics
semantic specification
279
expand on program
untyped expression type inference
typed semantic specification
typed expression binding time analysis
annotated semantic specification
jg-level expression combinator introduction
combinator semantic specification
mixed expression parameterized semantics
'compiling function'
semantic value
Figure 8.1: Compiler generation • combinator introduction (Chapter 4) to the semantic specification of a toy programming language. This line of development may be continued with
280
8
Conclusion
• parameterized semantics for code generation or abstract interpretation (Chapters 5, 6, 7 and 8). Rather than going into the (rather trivial) details of the remaining step we shall present a general picture of the approach that we have taken and of some other approaches that we could have taken. On the left hand side of Figure 8.1 we have semantic specifications of the kind studied in Subsection 8.2.2 and on the right hand side we have expressions of the languages of Chapters 2, 3 and 4 as well as semantic values as given by the parameterized semantics of Chapters 5, 6, 7 and 8. Each vertical arrow represents some 'transformation' on the semantic specification or expression in question. Each horizontal arrow represents the possibility of expanding the semantic specification on some given program. The naive approach we could have taken in Section 8.2 is to follow the topmost and rightmost route. Then all results of the previous chapters would be directly applicable. However, this represents a rather indirect way of constructing a compiler (assuming that our ultimate parameterized semantics is indeed a code generation like K, O, Ori or similar). To come closer to practical concerns one would want to take the leftmost and bottommost route. We illustrated the first three transformations in the previous subsection; the fourth transformation has little content to it — it merely avoids formulating the mixed term by directly interpreting the combinators — and the fifth transformation is quite standard. Thus the 'compiling function' will be a function that directly maps programs to code. It is not quite a compiler because it is an 'abstract function' and not written in any machine language; however, it may serve as a quite useful specification of a compiler. A natural question to ask is whether the two approaches agree. We do not wish to formally claim that all the small diagrams commute — indeed they probably do not — but surely they give, or should give, 'comparable results' in a vague and intuitive sense that we shall leave unspecified.
8*3
Research Directions
The previous sections have sketched how the techniques of the present book can be applied in the implementation of functional languages and semantic specifications. The following three issues are of fundamental importance when assessing the merits of this development: • correctness: the extent to which the implementation is faithful to the semantics, • automation: the extent to which the implementation is produced without human interaction, and
8.3
Research Directions
281
• efficiency: the extent to which the implementation is comparable in efficiency (e.g. time or space) to those obtained by other means. Below we briefly summarize our main achievements and the areas where further research is still needed.
Correctness One of the aims of the work reported in this book has been to ensure that the development is provably correct. As a consequence we have occasionally traded efficiency for ease of proving correctness. One example is the simple-minded code generation scheme presented in Chapter 6; it could be improved, e.g. by keeping track of the evaluation degree of the argument as in Section 8.1, but this will render the correctness considerations much more complicated. We believe that the main problem is to structure the correctness proofs in a proper way so that as few aspects as possible have to be dealt with at the same time. In the case of generating optimized code this means that the code generation should be specified in a number of stages where each stage is a refinement of some of the previous ones (much as in Section 8.1). The correctness proof of each stage can then rely on the correctness results of the previous stages. The use of Kripkelogical relations seems to provide a convenient basis for this. However, more work is needed to get a better understanding of this issue.
Automation Another aim of our work has been to develop algorithms to perform the various analyses and transformations. As an example, we have presented algorithms for type analysis, binding time analysis and combinator introduction. Also the concept of parameterized semantics is easily implementable so that the complete development can be implemented without too much trouble. However, in a number of cases we shall hardly be satisfied with the results obtained in this way. As illustrated in Chapters 3 and 4 and in Section 8.2 much more satisfactory results may be obtained if we rewrite the original A-expressions slightly and if we perform some simple transformations on the intermediate results. At the current stage it is not clear which transformations we should like to perform and in what order and this is an important area for further research.
Efficiency

Many of our considerations have been motivated by the desire to obtain efficient implementations of programming languages. One key ingredient in this is the explicit distinction between binding times, so that one can avoid having all 'compile-time computations' deferred until run-time. The other key ingredient is the ability to perform program analyses.
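The classical illustration of the first ingredient is a staged power function; the sketch below is not taken from this book, and residual code is represented as a string purely for readability. When the exponent is compile-time data the recursion can be unfolded during compilation, so that only the run-time multiplications survive in the residual program.

    -- Ordinary definition: everything is deferred until run-time.
    power :: Int -> Int -> Int
    power 0 _ = 1
    power n x = x * power (n - 1) x

    -- Staged definition: the exponent n is compile-time, x is run-time;
    -- the recursion on n is performed now and only code for the
    -- multiplications is generated.
    powerGen :: Int -> String -> String
    powerGen 0 _ = "1"
    powerGen n x = x ++ " * " ++ powerGen (n - 1) x

    -- powerGen 3 "x" = "x * x * x * 1"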
Traditionally, data flow analysis has been used to enable program transformations at the level of source programs as well as target programs. In this book we have only been concerned with the latter, but [79] illustrates how program analyses can be used to validate some well-known program transformations such as constant folding. The code optimizations obtained in Section 8.1 depend largely on strictness analysis, and more work is needed to exploit other kinds of analyses. Furthermore, it would be interesting to repeat the development for sequential and parallel versions of graph-reduction machines.
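A minimal sketch of how such a strictness analysis works may be given by abstract interpretation over the two-point domain, where 0 abstracts 'definitely undefined' and 1 'possibly defined'; the expression language and the functions below are inventions for this sketch. A function is strict in an argument precisely when feeding 0 in that position yields 0.

    data E = Var Int | Lit Int | Add E E | If E E E   -- arguments named by index

    -- Abstract evaluation: env gives the abstract value (0 or 1) of each argument.
    aeval :: [Int] -> E -> Int
    aeval env (Var i)    = env !! i
    aeval _   (Lit _)    = 1
    aeval env (Add a b)  = min (aeval env a) (aeval env b)   -- + is strict in both
    aeval env (If c t e) = min (aeval env c) (max (aeval env t) (aeval env e))

    -- Strictness of an n-argument body in argument i.
    strictIn :: Int -> E -> Int -> Bool
    strictIn n body i =
      aeval [ if j == i then 0 else 1 | j <- [0 .. n - 1] ] body == 0

    -- strictIn 2 (If (Var 0) (Lit 1) (Var 1)) 0 = True   (strict in the test)
    -- strictIn 2 (If (Var 0) (Lit 1) (Var 1)) 1 = False  (the branch may be skipped)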
Bibliographical Notes

The material of Section 8.1 is heavily based on [82] and some further details may be found in [50]. The observation that strictness and strictness as right context are important for avoiding DELAY instructions was already made in [82], as was the observation that evaluation degrees as left context are important for avoiding RESUME instructions. The observation of the relationship between strictness as right context and Burn's notion of evaluation transformers is new, as is the study of pre-evaluation in this framework. Combining all of the ideas into one interpretation still presents some problems, in particular for fixed points and the higher-order constructs, and for the associated correctness proofs.

The material of Section 8.2 presents the underlying philosophy behind many of our early papers on the subject matter of this book. The distinction between binding times as a tool in the implementation of programming languages is by no means new (see e.g. [5]) but it is not usually made an explicit notion in semantics. One exception is [99], where there is a distinction between static expression procedures and (dynamic) expression procedures; as discussed in [72] this corresponds quite naturally to our distinction between compile-time functions and run-time functions.

Our treatment of Denotational Semantics is largely based on Section 6 of [72]. It contains a rather detailed study of how to apply a number of heuristics from compiler construction (and semantics) in order to transform a traditional denotational semantics for the language SMALL of [33] into a subset of the mixed λ-calculus and combinatory logic that is amenable to our code generation scheme. This includes

• ensuring that fixed points have composite types, e.g. by transforming a continuation style semantics into direct style,

• ensuring that the desired binding time distinction between environments and stores is maintained, e.g. by using the notion of activation records to encode the access path.
Unlike the treatment in Chapter 6, the results of [72] do not allow run-time function types and so are mainly applicable to PASCAL-like languages.
Exercises

1. Consider the clause for O_r(Tuple) and investigate whether or not some of the occurrences of T could be replaced by dn(h₁)(T) or dn(h₂)(T).

2. Consider the clause for O_r(Π) as given in Section 8.1.2. Show that

    (strict(K) ∘ dn(h₁)) ⊥ = 0

is equivalent to

    (K ∘ dn(h₁)) ⊥ = 0  ∨  dn(h₁) ⊥ = ⊥

Discuss the consequences of replacing (K ∘ dn(h₁)) ⊥ = 0 by (strict(K) ∘ dn(h₁)) ⊥ = 0. Do you favour this replacement? Discuss whether or not the replacement has any consequences for the continuations supplied to f₂ and f₁.

3. Try to formulate an interpretation O_l that uses local strictness properties as well as evaluation degrees. Are there any constructs in the language that cannot be handled?

4. Try to extend the interpretation O_l of Exercise 3 to an interpretation O'_l that performs pre-evaluation of arguments. Are there any constructs in the language that cannot be handled?

5. Try to use the insights of Exercise 4 to define the interpretation O'_rl that is like O_rl but performs pre-evaluation of arguments. Verify that the code generated for sum is as stated in Example 8.1.5. Are there any constructs in the language that cannot be handled?

6. Consider a variant of the language Imp where the identifiers introduced by var-declarations are initialised as indicated by

    D ::= var x:=a; D | ...

Repeat the development of Section 8.2 and make sure that the computations involving the environment may still be performed at compile-time.
Bibliography

[1] S.Abramsky: Strictness Analysis and Polymorphic Invariance, Programs as Data Objects, Springer Lecture Notes in Computer Science 217 (1986) 1-23.
[2] S.Abramsky: Abstract Interpretation, Logical Relations and Kan Extensions, Journal of Logic and Computation 1 1 (1990) 5-40.
[3] S.Abramsky: The lazy lambda calculus, Research Topics in Functional Programming, D.Turner (ed.), Addison-Wesley (1990) 65-116.
[4] S.Abramsky, C.Hankin: Abstract Interpretation of Declarative Languages, Ellis Horwood (1987).
[5] A.V.Aho, R.Sethi, J.D.Ullman: Compilers - Principles, Techniques and Tools, Addison-Wesley (1986).
[6] A.W.Appel: Semantics-directed code generation, Proc. of the 12th ACM Conference on Principles of Programming Languages, ACM Press (1985) 315-324.
[7] J.Backus: Can Programming be Liberated from the von Neumann Style? A Functional Style and its Algebra of Programs, Communications of the ACM 21 (1978) 613-641.
[8] H.-J.Bandelt: The tensor product of continuous lattices, Mathematische Zeitschrift 172 (1980) 89-96.
[9] H.Bekić: Definable operations in general algebras, and the theory of automata and flowcharts, Springer Lecture Notes in Computer Science 177 (1984) 30-55.
[10] F.Bellegarde: Rewriting Systems on FP Expressions to Reduce the Number of Sequences Yielded, Science of Computer Programming 6 (1986) 11-34.
[11] R.S.Bird, P.L.Wadler: Introduction to Functional Programming, Prentice-Hall International (1988).
[12] B.Bjerner, S.Holmström: A compositional approach to time analysis of first order lazy functional programs, Proc. Functional Programming Languages and Computer Architectures, ACM Press (1989) 157-165.
[13] G.L.Burn: Evaluation transformers — a model for the parallel evaluation of functional languages (extended abstract), Proc. Functional Programming Languages and Computer Architecture, Springer Lecture Notes in Computer Science 274 (1987) 446-470.
[14] G.L.Burn: A Relationship Between Abstract Interpretation and Projection Analysis (Extended Abstract), Proc. ACM Symp. on Principles of Programming Languages, ACM Press (1990).
[15] G.L.Burn, C.Hankin, S.Abramsky: Strictness analysis for higher-order functions, Science of Computer Programming 7 (1986) 249-278.
[16] R.M.Burstall, J.Darlington: A Transformation System for Developing Recursive Programs, Journal of the ACM 24 (1977) 44-67.
[17] L.Cardelli: The functional abstract machine, Bell Labs. Technical Report TR-107 (1983).
[18] H.Christiansen, N.D.Jones: Control flow treatment in a simple semantics-directed compiler generator, in: Formal Description of Programming Concepts II, D.Bjørner (ed.), North-Holland (1982) 73-97.
[19] G.Cousineau, P.-L.Curien, M.Mauny: The Categorical Abstract Machine, Science of Computer Programming 8 (1987) 173-202.
[20] P.Cousot, R.Cousot: Systematic design of program analysis frameworks, Proc. 6th ACM Symp. on Principles of Programming Languages, ACM Press (1979).
[21] P.-L.Curien: Categorical Combinators, Sequential Algorithms and Functional Programming, Pitman (1986).
[22] L.Damas, R.Milner: Principal type-schemes for functional programs, Proc. ACM Symp. on Principles of Programming Languages, ACM Press (1982) 207-212.
[23] L.Damas: Type Assignment in Programming Languages, Ph.D.-thesis CST-33-85, University of Edinburgh, Scotland (1985).
[24] J.Darlington, H.Pull: A program development methodology based on a unified approach to execution and transformation, Partial Evaluation and Mixed Computation, D.Bjørner, A.P.Ershov and N.D.Jones (eds.), North-Holland (1988) 117-131.
[25] J.Despeyroux: Proof of translation in natural semantics, Symposium on Logic in Computer Science (1986).
[26] P.Dybjer: Using domain algebras to prove the correctness of a compiler, Proc. STACS 1985, Springer Lecture Notes in Computer Science 182 (1985) 98-108.
[27] A.P.Ershov: Mixed Computation: Potential Applications and Problems for Study, Theoretical Computer Science 18 (1982) 41-67.
[28] M.S.Feather: A System for Assisting Program Transformation, ACM Transactions on Programming Languages and Systems 4 (1982) 1-20.
[29] A.B.Ferguson, R.J.M.Hughes: An Iterative Powerdomain Construction, Functional Programming, Glasgow 1989, K.Davis and J.Hughes (eds.), Springer (1989) 41-55.
[30] A.J.Field, P.G.Harrison: Functional Programming, Addison-Wesley (1988).
[31] J.H.Gallier: Logic for Computer Science, Harper & Row (1986).
[32] C.K.Gomard, N.D.Jones: A partial evaluator for the untyped λ-calculus, Journal of Functional Programming 1 (1991) 21-69.
[33] M.J.C.Gordon: The Denotational Description of Programming Languages, Springer (1979).
[34] P.R.Halmos: Naive Set Theory, Springer (1974).
[35] P.G.Harrison: Linearisation: An Optimisation for Nonlinear Functional Programs, Science of Computer Programming 10 (1988) 281-318.
[36] F.Henglein: Efficient Type Inference for Higher-Order Binding Time Analysis, Functional Programming Languages and Computer Architecture, Springer Lecture Notes in Computer Science 523 (1991) 448-472.
[37] J.R.Hindley: The principal type-scheme of an object in combinatory logic, Trans. American Math. Soc. 146 (1969) 29-60.
[38] P.Hudak, J.Young: A Collecting Interpretation of Expressions (without Powerdomains), Proc. 15th ACM Symp. on Principles of Programming Languages, ACM Press (1988) 107-118.
[39] P.Hudak et al.: Report on the Programming Language Haskell — A Non-Strict, Purely Functional Language, Version 1.1, Yale University (1991).
[40] G.Huet: Cartesian Closed Categories and Lambda-Calculus, Springer Lecture Notes in Computer Science 242 (1986) 123-135.
[41] J.Hughes: Supercombinators: a new Implementation Method for Applicative Languages, Proc. of 1982 ACM Conf. on LISP and Functional Programming, ACM Press (1982) 1-10.
[42] J.Hughes: Strictness detection in non-flat domains, Proc. Programs as Data Objects, Springer Lecture Notes in Computer Science 217 (1986) 112-135.
[43] J.Hughes: Backwards Analysis of Functional Programs, Partial Evaluation and Mixed Computation, D.Bjørner, A.P.Ershov and N.D.Jones (eds.), North-Holland (1988) 187-208.
[44] S.Hunt, D.Sands: Binding Time Analysis: A New PERspective, Proc. ACM Symposium on Partial Evaluation and Semantics-Based Program Manipulation, ACM Press (1991) 154-165.
[45] N.D.Jones, F.Nielson: Abstract Interpretation: a Semantics-Based Tool for Program Analysis, invited paper (in preparation) for The Handbook of Logic in Computer Science, Oxford University Press.
[46] N.D.Jones, D.A.Schmidt: Compiler Generation from Denotational Semantics, in: Semantics Directed Compiler Generation, Springer Lecture Notes in Computer Science 94 (1980) 70-93.
[47] N.D.Jones, P.Sestoft, H.Søndergaard: An experiment in Partial Evaluation: The Generation of a Compiler Generator, Proc. of Rewriting Techniques and Applications, Springer Lecture Notes in Computer Science 202 (1985) 124-140.
[48] U.Jørring, W.L.Scherlis: Compilers and Staging Transformations, Proc. 13th ACM Symp. on Principles of Programming Languages, ACM Press (1986) 86-96.
[49] J.Lambek, P.J.Scott: Introduction to Higher Order Categorical Logic, Cambridge Studies in Advanced Mathematics 7 (1986).
[50] T.Lange: Correctness of Code Generations based on a Functional Programming Language, M.Sc.-thesis, Aarhus University, Denmark (to appear).
[51] P.Lee, U.Pleban: On the use of LISP in implementing denotational semantics, in: Proc. of the 1986 ACM Conference on LISP and Functional Programming, ACM Press (1986) 233-248.
[52] D.Lester: Combinator Graph Reduction: A Congruence and its Applications, Ph.D.-thesis PRG-73, Oxford University, England (1989).
[53] D.MacQueen, G.Plotkin, R.Sethi: An ideal model for recursive polymorphic types, Proc. ACM Symp. on Principles of Programming Languages, ACM Press (1984) 165-174.
[54] T.J.Marlowe, B.G.Ryder: Properties of data flow frameworks: A Unified Model, Acta Informatica 28 (1990) 121-163.
[55] R.Milne, C.Strachey: A Theory of Programming Language Semantics, Chapman and Hall (1976).
[56] R.Milner, M.Tofte, R.Harper: The Definition of Standard ML, MIT Press (1990).
[57] R.Milner: A Theory of Type Polymorphism in Programming, Journal of Computer and System Sciences 17 (1978) 348-375.
[58] T.Mogensen: Binding Time Analysis for Higher Order Polymorphically Typed Languages, TAPSOFT 1989, Springer Lecture Notes in Computer Science 352 (1989).
[59] M.Montenyohl, M.Wand: Correct flow analysis in continuation semantics, Proc. 15th ACM Symposium on Principles of Programming Languages, ACM Press (1988) 204-218.
[60] F.L.Morris: Advice on structuring compilers and proving them correct, Proc. ACM Conference on Principles of Programming Languages, ACM Press (1973) 144-152.
[61] P.D.Mosses, D.A.Watt: The use of action semantics, in: Proc. IFIP TC2 Working Conference on Formal Description of Programming Concepts III, North-Holland (1987).
[62] A.Mycroft: Abstract Interpretation and Optimizing Transformations for Applicative Programs, Ph.D.-thesis CST-15-81, University of Edinburgh, Scotland (1981).
[63] A.Mycroft: The theory and practice of transforming call-by-need into call-by-value, Proc. 4th International Symposium on Programming, Springer Lecture Notes in Computer Science 83 (1980).
[64] A.Mycroft: A Study on Abstract Interpretation and 'Validating Microcode Algebraically', in: Abstract Interpretation of Declarative Languages, S.Abramsky and C.Hankin (eds.), Ellis Horwood (1987) 199-218.
[65] A.Mycroft, F.Nielson: Strong abstract interpretation using powerdomains, Proc. ICALP 1983, Springer Lecture Notes in Computer Science 154 (1983) 536-547.
[66] F.Nielson: Abstract Interpretation using Domain Theory, Ph.D.-thesis CST-31-84, University of Edinburgh, Scotland (1984).
[67] F.Nielson: Program Transformations in a Denotational Setting, ACM Transactions on Programming Languages and Systems 7 (1985) 359-379.
[68] F.Nielson: Tensor Products Generalize the Relational Data Flow Analysis Method, Proc. of the 4th Hungarian Computer Science Conference (1985) 211-225.
[69] H.R.Nielson, F.Nielson: Semantics Directed Compiling for Functional Languages, Proc. 1986 ACM Conference on LISP and Functional Programming, ACM Press (1986) 249-257.
[70] F.Nielson: Towards a Denotational Theory of Abstract Interpretation, Abstract Interpretation of Declarative Languages, S.Abramsky and C.Hankin (eds.), Ellis Horwood (1987) 219-245.
[71] F.Nielson: Strictness Analysis and Denotational Abstract Interpretation, Information and Computation 76 1 (1988) 29-92. — Also see ACM Conference on Principles of Programming Languages, ACM Press (1987) 120-131.
[72] F.Nielson, H.R.Nielson: Two-Level Semantics and Code Generation, Theoretical Computer Science 56 (1988) 59-133.
[73] F.Nielson: A Formal Type System for Comparing Partial Evaluators, Partial Evaluation and Mixed Computation (Ebberup 1987), D.Bjørner, A.P.Ershov and N.D.Jones (eds.), North-Holland (1988) 349-384.
[74] H.R.Nielson, F.Nielson: Automatic Binding Time Analysis for a Typed λ-calculus (Extended Abstract), Proc. 15th ACM Symp. on Principles of Programming Languages, ACM Press (1988) 98-106. — The full version is [75].
[75] H.R.Nielson, F.Nielson: Automatic Binding Time Analysis for a Typed λ-calculus, Science of Computer Programming 10 (1988) 139-176. — Also see [74].
[76] F.Nielson, H.R.Nielson: 2-level λ-lifting, Proc. ESOP 1988, Springer Lecture Notes in Computer Science 300 (1988) 328-343.
[77] F.Nielson, H.R.Nielson: The TML-approach to Compiler-Compilers, ID-TR 1988-47, Department of Computer Science, Technical University of Denmark (1988).
[78] F.Nielson: Two-Level Semantics and Abstract Interpretation, Theoretical Computer Science — Fundamental Studies 69 2 (1989) 117-242.
[79] H.R.Nielson, F.Nielson: Transformations on higher-order functions, Proc. Functional Programming Languages and Computer Architecture, ACM Press (1989) 129-143.
[80] H.R.Nielson, F.Nielson: Functional Completeness of the Mixed λ-Calculus and Combinatory Logic, Theoretical Computer Science 70 (1990) 99-126.
[81] H.R.Nielson, F.Nielson: Eureka definitions for free!, Proc. European Symposium On Programming 1990, Springer Lecture Notes in Computer Science 432 (1990) 291-305.
[82] H.R.Nielson, F.Nielson: Context Information for Lazy Code Generation, Proc. LISP and Functional Programming 1990, ACM Press (1990) 251-263.
[83] H.R.Nielson, F.Nielson: Semantics with Applications — A Formal Introduction, Wiley (1992).
[84] F.Nielson, H.R.Nielson: The Tensor Product in Wadler's Analysis of Lists, Proc. European Symposium on Programming 1992, Springer Lecture Notes in Computer Science (to appear).
[85] L.Paulson: Compiler generation from denotational semantics, in: Methods and Tools for Compiler Construction, B.Lorho (ed.), Cambridge University Press (1984) 219-250.
[86] S.Peyton Jones: The Implementation of Functional Programming Languages, Prentice-Hall (1987).
[87] G.D.Plotkin: Call-by-name, call-by-value and the λ-calculus, Theoretical Computer Science 1 (1975) 125-159.
[88] G.D.Plotkin: LCF considered as a programming language, Theoretical Computer Science 5 (1977) 223-255.
[89] M.R.Raskovsky: Denotational semantics as a specification of code generators, in: Proc. of the SIGPLAN 1982 Symposium on Compiler Construction, ACM Press (1982) 230-244.
[90] J.A.Robinson: A Machine-Oriented Logic based on the Resolution Principle, Journal of the ACM 12 (1965) 23-41.
[91] D.A.Schmidt: Denotational Semantics — A Methodology for Language Development, Allyn & Bacon (1986).
[92] D.A.Schmidt: Static Properties of Partial Evaluation, Partial Evaluation and Mixed Computation, D.Bjørner, A.P.Ershov and N.D.Jones (eds.), North-Holland (1988) 465-483.
[93] D.S.Scott: Data types as lattices, SIAM J. Comput. 5 (1976) 522-587.
[94] D.S.Scott: Domains for denotational semantics, Proc. ICALP 1982, Springer Lecture Notes in Computer Science 140 (1982) 577-613.
[95] R.Sethi: Control flow aspects of semantics directed compiling, ACM TOPLAS 5 4 (1983) 554-595.
[96] J.E.Stoy: Denotational Semantics — The Scott-Strachey Approach to Programming Language Theory, MIT Press (1977).
[97] J.E.Stoy: The Congruence of two Programming Language Definitions, Theoretical Computer Science 13 (1981) 151-174.
[98] M.B.Smyth, G.D.Plotkin: The category-theoretic solution of recursive domain equations, SIAM J. Comput. 11 (1982) 761-783.
[99] R.D.Tennent: Principles of Programming Languages, Prentice-Hall (1981).
[100] J.W.Thatcher, E.G.Wagner, J.B.Wright: More on advice on structuring compilers and proving them correct, Theoretical Computer Science 15 (1981) 223-249.
[101] D.Turner: A New Implementation Technique for Applicative Languages, Software, Practice and Experience 9 (1979) 31-49.
[102] D.A.Turner: Miranda: A Non-strict Functional Language with Polymorphic Types, Proc. Functional Programming Languages and Computer Architectures, Springer Lecture Notes in Computer Science 201 (1985) 1-16.
[103] D.A.Turner: Miranda release 2, on-line manual (1989).
[104] P.Wadler: Strictness analysis on non-flat domains (by abstract interpretation over finite domains), Abstract Interpretation of Declarative Languages, S.Abramsky and C.Hankin (eds.), Ellis Horwood (1987) 266-275.
[105] P.Wadler, R.J.M.Hughes: Projections for Strictness Analysis, Proc. Functional Programming Languages and Computer Architecture, Springer Lecture Notes in Computer Science 274 (1987) 385-407.
[106] M.Wand: Deriving target code as a representation of continuation semantics, ACM TOPLAS 4 3 (1982) 496-517.
[107] A.Wikström: Functional Programming Using Standard ML, Prentice-Hall International (1987).
Summary of Transformation Functions

ε_TA, π_TA remove type information from expressions or programs.

E_TA, P_TA insert type information into untyped expressions or programs.

ε'_TA removes type annotation from variables.

A_TA collects the polytypes of free variables.

τ_BTA, ε_BTA, π_BTA remove binding time annotations from types, expressions or programs. The function τ_BTA may also be used on type environments and sets of assumptions.

T_BTA, E_BTA, P_BTA insert trivial binding time annotations into types (c or r), expressions (c or r) or programs (c or r).

𝒯_BTA, ℰ_BTA, 𝒫_BTA insert sensible binding time annotations into types, expressions or programs. The function 𝒯_BTA may be used to enforce agreement with a constraint set C and is then called 𝒯_BTA^C.

ε'_BTA removes type and binding time annotations from variables.

E'_BTA annotates variables with type and binding time annotations.

A_BTA collects the type and binding time annotations of free variables.

ε_CI, π_CI expand combinators in expressions or programs.

ℰ_CI, 𝒫_CI introduce combinators for the run-time constructs in expressions or programs.

UD takes care of rules [up] and [down] in the 2-level λ-calculus.

δ, δ* take care of shortening of position environments in order to simulate rule [up].

ω, ω* take care of enlargement of position environments in order to simulate rule [down].
Index

#A 141, 142
A(i) 140
A(i..) 140, 142
A(i..j) 140, 142
A₁ × A₂ 170
A(x) 17, 56
A_x 17
S(pt) 14
⟨C, ST⟩ 139, 140
(D∞, ⊑) 114
(D*, ⊑) 119
(D, ⊑) 108
(D⊥, ⊑) 120
(D ⊗ E, ⊑) 118
(D →_s E, ⊑) 119
(D → E, ⊑) 112
(D × E, ⊑) 110
(L ⊗ L', cross) 240
{C, v} 143
{C; v} 143
→ 140
v::nothunk 167
V::thunk 167
tenv[t/x_i] 10
⟦e⟧(I)penv 123
⟦penv⟧(I) 123
∅ 10
tenv\X 23, 30
(d ↦ e) 113
⊗ 240, 241
L ⊗ L' 241
e₁ ⊑ e₂ 47
p₁ ⊑ p₂ 47
t₁:b₁ ⊑ t₂:b₂ 48
t₁ ⊑ t₂ 47
r ⊑ c 34
⊔ 108
⊓ 47
⊢ p 10, 45
⊢_C p 15, 45
⊢ t : b 37
tenv ⊢ e : t 10, 83
tenv ⊢ e : t : b 39
tenv ⊢_C e : t : b 44
tenv ⊢_C e : t 14
ε_BTA 62, 63
⊥ 109
α 7, 13
β 7, 13, 89
δ(penv, t) e 91
δ* penv 91
Δ(penv) t 91
δ*_BTA 68, 76
ε_TA 16, 29
ε_CI 86, 87
π_BTA 36
π_CI 86, 88
π_TA 27
π_PS 123
ρ 89
ρ_BTA 89
ρ_PS 122
τ_BTA 36, 44, 57
τ_CI 86
ω(penv, t) e 92
ω(t) penv 92
A 208
abstract machine 140
additive 240
admissible 114
algebraic 110
algebraic transformation 102
arity() 169
automation 280, 281
B 109
b_i 143
base value 143
BD 110
Bekić's Theorem 133
binding time 34
C 12, 44, 140
c 34
call-by-name 119
call-by-value 119
cartesian product 110
Case 105, 232
CC₁() 241
CC₂() 241
CE2 81
chain 109
closed 13
closed in both components 241
closure 143
Code 140
combinator 80
compact 110
compile-time 34
complete induction 49
complete lattice 108
completeness 22
composite 132
compA 257
compC 187, 257
compS[] 155, 156
compS 162, 256
compW 167, 168, 256
comp' 247
comp 214, 215
configuration 139
consistent 48, 53, 56, 108
consistently complete 110
constraint 12, 44
continuous 112
convex 114
Config 140
CONS_PS 115
correctness 280, 281
covers 14
cpo 109
cross 241
definitely strict 208
deterministic 142
discrete 109
dn 128
Dom() 13
dom*() 114
dom() 10, 114
domain 110
[down] 42, 43
dual 108
E 261
E2' 55
E2 34
E 9
e 9
eager 119, 120
efficiency 281
enlarging 92
enriched λ-calculus 2
evaluation degree 261
evaluation transformer 260
evaluator 260
execution sequence 140, 141
ExSeq() 142
FEV() 16, 29
finite height 109
finite lists 119
FIX' 221
FIX 112
fixed point 112
fixed point principle 114
flat 109
FreeLab() 155
frontier type 36, 117
FTV() 13, 14, 17
fun() 17
function space 112
functional 17, 56
generic instance 14
graph() 10
greatest lower bound 47
ground substitution 14
hd 213
HD_PS 116
infinite lists 114
INS_C() 22
instance 14, 45
Ins 140
interpretation 126
interpretation of operators 126
interpretation of types 117
ISNIL_PS 116
K_i 262
K 144
Kripke-logical relation 156, 256
Labels 140
layered predicates 168, 187
lazy 119, 120
LC() 241
least element 109
least upper bound 108
left-closed 241
left context 258
length() 169
length 213
length 213, 236, 245
2-level λ-calculus 34
2-level instance 44, 45
2-level polytype 44
2-level type 34
2-level language L 35, 39, 43
lifted 120
list 143
logical relation 214
lower bound 47
mixed λ-calculus and combinatory logic 80
monotonic 111
monotype 13
NIL_PS 116
nothunk 166
O' 266
O'_l 267, 283
O'_rl 283
O_l 283
O_rl 264
O_r 258, 283
O 254
o_i 143
P 9
P(E, T) 9
pair 143
parameter monotonicity 156
partial evaluation 102
PE 16
pe 16
penv 89, 122
polytype 13
polytyped expression 16
position environment 89, 122
postC 186
postW 166
potentially infinite lists 114
pre-evaluation 266
predicate 114
preC 186
preW 166
Prod 244
PT2 44
PT 13
Pt2 44
pt 13
pure type 36
pure() 131
r 34
RC() 211
Reduce_t 34
reduce 3
RelCode 144
C-restricted 126
right-closure 210
right context 258
run-time 34
S_e 120
S_l 121
S_m 118, 127
S 121
S 13
safe 208
Scott-domain 110
Scott-open 211
separately additive 240
shortened 91
smash product 118
soundness 22
Split 244
ST 140
Stack 140
standard interpretation 127
step function 113
strict 112
strict 221
strict function space 119
structural induction 160, 184, 204, 231, 248
substitution 13, 45
substitution property 154, 157
Subst 155
sum 213
sum 3, 213, 236, 245, 255, 260
sum_2B 81, 102
sum_2a 73
sum_2b 73
sum_2 45
sum_t 8
Sum_t 33
T2 34
T 9
t 9
TC() 241
tensor product 240
tenv 10, 40
thunk 143
TL_PS 116
transition relation 140
TS 13
ts 13
TV 13
tv 13
TYPE() 125
type analysis 16
type environment 10
type scheme 13
type variable 7, 13
typed expression 8
UD() 58
UE 9
ue 9
unifying substitution 15
untyped expression 9
[up] 39, 42, 43
up 128
upper bound 108
val' 247
val 214
Val 140
valC 185, 186
valW 165
Wadler's construction 210
well-behaved code 163
well-formed 36, 39, 91, 123
1-well-formed 64
2-well-formed 65
well-founded 49
WFF_C() 22
Z 109