Introduction

This text was written for a first programming subject in the object-oriented language Eiffel. It assumes no previous computing experience, presents almost every part of the Eiffel language, and shows how to design good, reusable systems.

The Eiffel language is described in the first half of this text in 13 chapters, divided into three parts. The first part presents the basic concepts and constructs in the language, covering the topics of data flow, control flow, routines, objects, classes, and assertions. The second part describes the common data structures, arrays and lists. The third part covers inheritance and its sub-topics: simple, multiple, and repeated inheritance, plus file storage and generic classes.

An instruction in the language is presented in the following parts:

1. Look and feel
2. Syntax
3. Mechanism
4. Common errors
5. Example
The second half of the text presents a case study in 13 parts to show three things. First, the case study for each chapter shows how the topics described in that chapter are implemented in a working system. Second, each part of the case study shows how to design good, reusable Eiffel classes. Third, the case study shows that an Eiffel system is built by adding new code to an existing system; almost no code is re-written when the system is extended from one simple class to 16 interlocking classes.

A section of the case study is presented in the following parts:

1. Specification.
2. Analysis.
3. Design.
4. Charts: one or more of client, inheritance, and class diagrams.
5. Eiffel code that is new or changed in that section.
The complete, working code for each part of the case study may be examined and executed in the directory

    /pub/psda/oopie

The language described in the text is ISE Eiffel version 3.3.7 running under Solaris. Eiffel has a large library of reusable classes that may be examined in the directory

    /opt/Eiffel3/library

and its sub-directories

    base      classes used to build basic Eiffel systems
    lex       classes used to build and apply lexical analysers
    parse     classes used to build document processing systems
    vision    classes used to build graphical interfaces

The base directory contains the directories

    kernel        basic Eiffel classes, including files and arrays
    structures    other data structures
    support       mathematical and other supporting classes
© R. S. Rist, 1993
Table of contents

CHAPTER 1: LOOK AND FEEL
1.1 Programming languages
1.2 The key: data
1.3 The routine as a module
1.4 The class as a module
1.5 Code layout
1.6 Building a system from classes
1.7 Running an Eiffel system
    1.7.1 A simple class
    1.7.2 A simple Ace file
    1.7.3 eifstart
    1.7.4 The system file
    1.7.5 eif
1.8 Case study: the balance

CHAPTER 2: BASIC DATA TYPES
2.1 Class INTEGER
2.2 INTEGER declaration
2.3 INTEGER expressions
2.4 Assignment
2.5 Error messages
2.6 INTEGER input and output
2.7 Output formatting
2.8 Class REAL
    2.8.1 Declaration and numeric features
    2.8.2 Input and output
2.9 Class DOUBLE
2.10 Mathematical classes
2.11 Class CHARACTER
2.12 Case study: the BANK system

CHAPTER 3: ROUTINES
3.1 Look and feel
3.2 Routine syntax and mechanism
3.3 Procedure format and use
3.4 Local variables
    3.4.1 Example: a local amount
    3.4.2 Local or attribute?
3.5 Passing data to a routine
3.6 Functions
    3.6.1 Syntax and mechanism
    3.6.2 Function or attribute?
3.7 Comments
3.8 Cause and effect routines
3.9 Once routines
3.10 Listing order
3.11 Case study: the BANK system

CHAPTER 4: OBJECTS
4.1 Object creation
    4.1.1 Creation code
    4.1.2 Data structure
    4.1.3 Creation procedure
    4.1.4 Creating an object
    4.1.5 Using an object
4.2 Calling a feature from a client
4.3 Operators
4.4 Value and reference semantics
4.5 Reference assignment
4.6 Reference equality
4.7 Object copy
4.8 Deep versus shallow operators
4.9 Passing an object
4.10 Strings
4.11 Case study: the BANK system

CHAPTER 5: BEHAVIOUR
5.1 Look and feel
5.2 Routine behaviour
5.3 Behaviour versus implementation
5.4 Class behaviour
5.5 Listing order
5.6 System charts
5.7 Assertions
5.8 Class invariants
5.9 Documentation: the short form of a class
5.10 The Eiffel library class STRING
5.11 Errors
    5.11.1 Antibugging
    5.11.2 Debugging
5.12 Case study: export and assertions

CHAPTER 6: SELECTION
6.1 Sequence, selection, and iteration
6.2 BOOLEAN values
6.3 Relational operators
6.4 Boolean operators
6.5 Boolean functions
6.6 Selection: the if statement
6.7 Examples: the if statement
6.8 Selection: the inspect statement
6.9 Case study: selection

CHAPTER 7: REPETITION
7.1 Iteration: the loop statement
7.2 Examples: the loop statement
7.3 Input validation
7.4 Menu processing
7.5 Recursion
7.6 Case study: iteration and menu

CHAPTER 8: ARRAYS
8.1 The definition of an array
8.2 Using an array
8.3 The Eiffel library class ARRAY
8.4 Example: ARRAY [INTEGER]
8.5 Example: Insertion sort
8.6 Example: ARRAY [PERSON]
8.7 The strip operator

CHAPTER 9: LISTS
9.1 The definition of a list
9.2 The Eiffel library class LINKED_LIST
    9.2.1 Structure
    9.2.2 Features
9.3 Scanning a list
9.4 Cause and effect: matched routines
9.5 A local cursor
9.6 Array or list?
9.7 Class RANDOM
9.8 Case study: the BANK system

CHAPTER 10: INHERITANCE
10.1 Look and feel
10.2 Inheritance chart
10.3 Syntax and mechanism
10.4 Inherit or client?
10.5 Inherit example: class WORKER
10.6 Redefine
10.7 Redefine example: class WORKER
10.8 Redefine example: class CONTRACTOR
10.9 Rename
10.10 Rename example: class WORKER
10.11 The precursor of a feature
10.12 Export
10.13 Case study: inheritance

CHAPTER 11: POLYMORPHISM
11.1 The Eiffel type hierarchy
11.2 Conformance
11.3 Deferred features
11.4 A deferred example: class POLYGON
11.5 An effective example: class RECTANGLE
11.6 Dynamic types
11.7 Dynamic creation
11.8 Dynamic dispatch
11.9 Polymorphism
11.10 Polymorphism example: a list of polygons
11.11 Assignment attempt
11.12 Case study: the BANK system

CHAPTER 12: COMPLEX INHERITANCE
12.1 Multiple inheritance
12.2 File classes
12.3 Class STORABLE
12.4 A storable list
12.5 Joining features
12.6 Undefine
12.7 Repeated inheritance
12.8 Select
12.9 Dynamic dispatch
12.10 The inheritance clause

CHAPTER 13: GENERIC CLASSES
13.1 Generic class
13.2 Generic client
13.3 Generic parent
13.4 Constrained genericity
13.5 Reuse in Eiffel
13.6 Case study: class KEY_LIST [T]

CHAPTER 14: ASSERTIONS AND INHERITANCE
14.1 Look and feel
14.2 Class invariants
14.3 Pre- and post-conditions
14.4 An example class: ORDERED_LIST

CHAPTER 15: EXCEPTIONS
15.1 Look and feel
15.2 Rescue clauses
15.3 The retry instruction
15.4 An example: class NODUP_LIST
15.5 Discussion

CASE STUDY

PART 1: LOOK AND FEEL
1.1 Specification
1.2 Analysis
1.3 Solution design
1.4 Client chart
1.5 Ace file
1.6 Solution code

PART 2: DATA FLOW
2.1 Specification
2.2 Analysis
2.3 Solution design
2.4 Solution code

PART 3: ROUTINES
3.1 Specification
3.2 Analysis
3.3 Solution design
3.4 Solution code
3.5 Common error

PART 4: OBJECTS
4.1 Specification
4.2 Analysis
4.3 Design
4.4 Client chart
4.5 Solution code
4.6 Common errors

PART 5: BEHAVIOUR
5.1 Specification
5.2 Analysis
    5.2.1 Creation status
    5.2.2 Export policies
    5.2.3 Assertions
5.3 Design
5.4 Client chart and class diagrams
5.5 Solution code
5.6 Common errors

PART 6: SELECTION
6.1 Specification
6.2 Analysis
6.3 Design
6.4 Solution code
6.5 Common errors

PART 7: ITERATION
7.1 Specification
7.2 Analysis
7.3 Design
7.4 Charts
7.4 Solution code
7.5 Common errors

PART 8: ARRAYS
8.1 Specification
8.2 Analysis
8.3 Design
8.4 Charts
8.5 Solution code

PART 9: LISTS
9.1 Specification
9.2 Analysis
9.3 Design
9.4 Charts
9.5 Solution code

PART 10: INHERITANCE
10.1 Specification
10.2 Analysis
10.3 Design
10.4 Charts
10.5 Solution code

PART 11: POLYMORPHISM
11.1 Specification
11.2 Analysis
11.3 Types of account
    11.3.1 Focus: account balance
    11.3.2 Focus: account id
    11.3.3 Focus: interest rate
    11.3.4 Focus: an interactive account
    11.3.5 Focus: withdraw
11.4 Storing the accounts
11.5 Inheritance chart
11.6 Client chart
11.7 Class diagrams
11.8 Solution code

PART 12: COMPLEX INHERITANCE
12.1 Specification
12.2 Analysis
12.3 Design: list storage and retrieval
12.4 Design: an inherited MENU
12.5 Solution code

PART 13: CONSTRAINED GENERICITY
13.1 Specification
13.2 Analysis
13.3 Design: a keyed list
13.4 Charts
13.5 Design: a keyed, storable list
13.6 Solution code

PART 14: THE COMPLETE BANK SYSTEM
14.1 Specification
14.2 Inheritance charts
14.3 Client charts
14.4 Class diagrams
14.5 Class listings

APPENDIX A: RESERVED WORDS, SPECIAL CHARACTERS, OPERATOR PRECEDENCE
A.1 Reserved words
A.2 Special characters
A.3 Operator precedence order

APPENDIX B: EIFFEL SYNTAX
B.1 Class
B.2 Sequence
B.3 Selection
B.4 Iteration
B.5 Inheritance
B.6 Genericity
B.8 Naming conventions

APPENDIX C: ACE FILE
C.1 Structure
C.2 Assertions
C.3 Debug

APPENDIX D: CHARTS
D.1 Client chart
D.2 Inheritance chart
D.3 Class diagram
D.4 Data structure chart

APPENDIX E: DESIGN PRINCIPLES
E.1 Object-oriented programming
E.2 Eiffel
E.3 Design guidelines
E.4 The process of design

APPENDIX F: GLOSSARY OF EIFFEL TERMS
F.1 Data, routine, class, and object terms
F.2 Inheritance, genericity, and assertion terms

REFERENCES

INDEX

Chapter 1: Look and feel

Keywords: data, code, routine, class, system, object

This chapter presents the look and feel of object-oriented systems in Eiffel. Computer code or instructions are placed in a set of small routines in the class. A class contains both data and the routines that change and use this data, so a class encapsulates both data and code. The class is designed around the data it contains, and the routines define the behaviour of the class. An object is an instance of a class; each object has its own data but shares the routines for that class.
1.1 Programming languages

A programming language consists of a set of instructions that can store and retrieve data. A problem specification defines one or more goals to achieve, plus constraints on the methods or plans that can be used to achieve these goals. A program is thus a sequence of instructions that produce the data values defined in the problem specification, so we say that the program is a solution to the problem. When the program is run, the code (the instructions) in the program reads in data values if needed, calculates new values, and stores or displays the final values. This is true for any programming language.

There are three main types of programming language, called procedural, functional, and object-oriented (OO) languages. Each type of language is based on a different way of grouping actions (instructions in the language) into larger units, so each type provides a different way to cut a task into parts. Each approach provides a different way to see and analyze the problem, so each approach defines a different paradigm: a different way of looking at the world, a different way to carve the world up into pieces.

The basic way to define a “larger” action that contains other, detailed actions is to place the actions or instructions into a set of routines. A routine is a chunk of code that is executed as a single unit. A routine is given a name, and whenever that name is mentioned in the code, the routine definition is found and all the code inside the routine is executed. Routines are essential to building any solution, because they allow the programmer to divide a problem into pieces, solve each piece of the problem, and then combine the pieces to solve the overall problem. Routines provide a basic level of abstraction in problem-solving.

Procedural languages such as Pascal, COBOL, Basic, and C group instructions together into procedures. A procedure changes the value of one or more items of data, and procedures are linked together to define the control flow of the program; procedures are called in serial order and each procedure changes the data as needed, until the final procedure is executed and the final value has been calculated. Functional languages such as ML, Miranda and Lisp group their instructions into functions, where each function produces a new data value. The functions are connected via their data flow, where one function produces a value that is then used by another function. In both paradigms, all the data is defined in one part of the program and the routines are defined in another part of the program, so these paradigms divide the world into two primary parts, data and code; the code is then divided into a set of procedures or functions.

Object-oriented languages such as Eiffel, C++, Smalltalk and CLOS also have procedures and functions, so the programmer can group a set of instructions together and execute a routine as a single operation. In Eiffel, the distinction between procedures and functions is very precise: a procedure changes one or more values, and a function calculates a new value and changes nothing. Object-oriented languages go beyond the basic level of abstraction, however, because they group routines into a larger chunk called a class. A class consists of a small amount of data, and the routines that use or change this data. An OO system is built by defining the data in the problem, placing each element of data in a class, and defining the routines that use or change the data in the appropriate class. This approach divides the world into a set of classes and objects.

We use the idea of class every day, naturally, without thinking twice about it. A class is a way to describe a set of objects that all behave the same way and can thus be used in the same way. We can talk about the way that people behave, for example, without mentioning any particular person. We can talk about planets, moons, stars, ice cream and points without mentioning a particular instance of the class. The class provides a concise way to describe a set of objects. In this approach the world consists of objects and classes, and OO languages provide a formal model for the intuitive ideas of class and object.

A class consists of a set of data and the code that uses or changes that data. An object is an instance of a class, so each object has its own data but shares the behaviour (code). A point, for example, will have a location defined by its x and y values; say the point located at (1, 1). All points share the same behaviour and can be used in the same way; we can find the distance between points, display their location, and so on. All instances of a type or class behave in the same way; that is what we mean by the word “class”. The trick in building an OO system is to precisely define and capture that shared behaviour.

It is said that OO languages provide a different approach to programming, and this is true; OO languages provide the high level structure of a class, a level of organisation higher than the function or procedure. An OO language uses routines like procedural and functional languages, and then wraps the routines in with the data to define a set of classes. To understand and use an OO language, you need to understand the other approaches and then integrate them into a class structure. OO is not different; it is more.
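The distinction between a procedure and a function can be seen in a small sketch of the point class mentioned above. The class is illustrative only: the routine names make, move, and distance_squared, and their arguments, are assumptions made for this example, and routine arguments and local variables are constructs that the text has not yet introduced. The procedure move changes the data of the point; the function distance_squared calculates a new value and changes nothing.

```eiffel
class POINT

creation
    make

feature
    x, y: REAL

    make (new_x, new_y: REAL) is
            -- procedure: set the location of the point
        do
            x := new_x
            y := new_y
        end -- make

    move (x_shift, y_shift: REAL) is
            -- procedure: shift the point, so x and y are changed
        do
            x := x + x_shift
            y := y + y_shift
        end -- move

    distance_squared (other: POINT): REAL is
            -- function: the squared distance to other; no data is changed
        local
            x_gap, y_gap: REAL
        do
            x_gap := x - other.x
            y_gap := y - other.y
            Result := x_gap * x_gap + y_gap * y_gap
        end -- distance_squared

end -- class POINT
```

Every point created from this class stores its own x and y, but all points share the three routines.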
1.2 The key: data
A class is built around the data in that class. In system design, the first and most important decision is to place the data in a class. Once an item of data is defined in a class, the code that uses or changes that data is defined in that class; this is what is meant by the phrase “the code lives with the data”. There is no choice about code that changes a data value; in Eiffel, a data value can only be changed by code that is in the same class as the data. There is a great deal of choice about where to place code that uses a data value, however. Data is stored in variables. A variable consists of three parts: name, type, and value. The name and (basic) type of a variable do not change, but its value can change as different values are calculated and stored in the variable. The structure of a variable can be drawn like this:
    name    type    value
A simple two-dimensional point has a location, described by two real numbers. The Eiffel code to define the data of a point is shown first below. The code consists of a class header and a data declaration. The class header defines the name of the class, and the declarations define the form of the data in that class.

    class POINT

    feature
        x, y: REAL

The data structure of a point is shown next: two variables of type REAL, with names x and y, and values that define the location of the point (assume values of x = 3.0 and y = 4.5).

    name    type    value
    x       REAL    3.0
    y       REAL    4.5
When a point (an instance of the class POINT) is created, the computer reserves enough storage to store two real values, and gives the two areas of storage the names x and y. At some later time, values are then stored in these two variables, to define the location of the point. Exactly how the computer creates an object and assigns values to the variables is explained in later chapters.

A variable that is defined to store data for the class is called an attribute of the class. Data declarations are normally written at the top of the class, after the Eiffel keyword feature. In Eiffel, both data and routines are called features of the class, so a full class definition consists of both the attributes and the routines in the class.
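The rule that an attribute can only be changed by code in its own class can be illustrated with a sketch of a point and a class that uses it. The names set_x, DRAWING, and corner are hypothetical, and the creation instruction (!!) is assumed here from the version of Eiffel used in this text; object creation is covered in later chapters. Each class would be stored in its own text file.

```eiffel
-- file point.e
class POINT

creation
    make

feature
    x, y: REAL

    make is
            -- leave x and y at their default value of zero
        do
        end -- make

    set_x (new_x: REAL) is
            -- store new_x in x; this code may assign to x
            -- because it is written inside class POINT
        do
            x := new_x
        end -- set_x

end -- class POINT

-- file drawing.e
class DRAWING

creation
    make

feature
    corner: POINT

    make is
            -- create a point, then ask the point to change its own x value
        do
            !!corner.make
            io.putstring ("Enter the x value: ")
            io.readreal
            corner.set_x (io.lastreal)
            -- "corner.x := io.lastreal" is not allowed here, because
            -- x is an attribute of POINT and this code is in DRAWING
            io.putreal (corner.x)
            io.new_line
        end -- make

end -- class DRAWING
```

DRAWING can use the value of x freely, as in the call to io.putreal, but any change to x has to go through a routine of POINT.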
1.3 The routine as a module
A class consists of a small amount of data and a large number of routines that use or change that data. The data for a class are defined by a set of data declarations, and the routines are defined by a set of routine definitions. A routine is a chunk of code that has a name, and can be executed as a single unit; this is known as calling the routine. When the routine is called or executed, all the code inside the routine is executed. In Eiffel, every line of executable code is placed in some routine; there is no code outside of routines. Consider a simple bank account. A bank account has a balance, so the data consists (at least) of the balance of the account. The Eiffel code to define the class name and data structure is
    class ACCOUNT

    creation
        make

    feature
        balance: REAL

The routines in an ACCOUNT would then do such things as create an account and set its initial value, deposit money into the account, withdraw money from the account, and show the balance of the account. An Eiffel routine named make to read and store the initial balance of a bank account is shown below, then a routine named deposit that deposits money into the account.

        make is
                -- read the initial balance from the user and store it
            do
                io.putstring ("Enter the initial balance: $")
                io.readreal
                balance := io.lastreal
            end -- make

The name of this routine is make. The routine consists of a routine header (the name and is) followed by a routine comment, followed by a routine body. A comment in Eiffel starts with two minus signs "--"; after this comment marker, any text up to the end of the line is ignored by the computer, so comments are used to communicate with people. The routine body is then coded, enclosed in the keywords do and end. The make routine contains three lines of code to read in a value from the user and store that value in the variable balance.

        deposit is
                -- read an amount to deposit, add it to the balance
            do
                io.putstring ("Enter the amount to deposit: $")
                io.readreal
                balance := balance + io.lastreal
            end -- deposit

The name of this routine is deposit, because it contains the code to deposit money into the account. The deposit routine reads an amount to be deposited, and adds this amount to the current balance.
In Eiffel, routines are small and do a single thing; that is how you get code reuse. A routine may contain a single line of code, but usually contains several lines of code. Routines longer than a dozen lines are common in procedural programming but unusual in Eiffel, because large routines are often not reusable. If you want to execute only part of a routine, you have to rewrite the code; it is not possible to call part of a routine. Small routines are the key to software reuse, because they can be called and combined as needed. A large routine is a very visible and clear indicator that your code is not reusable, and should be re-written as a set of small, reusable routines.
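Because a small routine is called by name, a new routine can be built by combining existing ones. The sketch below assumes an ACCOUNT class with the make and deposit routines described in this section, plus a show routine that displays the balance; the routine deposit_and_show is a hypothetical name introduced only for this example.

```eiffel
class ACCOUNT

creation
    make

feature
    balance: REAL

    make is
            -- read the initial balance from the user and store it
        do
            io.putstring ("Enter the initial balance: $")
            io.readreal
            balance := io.lastreal
        end -- make

    deposit is
            -- read an amount to deposit, add it to the balance
        do
            io.putstring ("Enter the amount to deposit: $")
            io.readreal
            balance := balance + io.lastreal
        end -- deposit

    show is
            -- display the balance
        do
            io.putstring ("The account balance is $")
            io.putreal (balance)
            io.new_line
        end -- show

    deposit_and_show is
            -- deposit an amount, then display the new balance
        do
            deposit
            show
        end -- deposit_and_show

end -- class ACCOUNT
```

The new routine contains no input or output code of its own; it reuses deposit and show unchanged, which is only possible because each of them does a single thing.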
1.4 The class as a module
Both data and code are called features in Eiffel. A feature can be data, in which case it is called an attribute of the class and stores some value. A feature can be a routine, in which case it changes or uses an attribute's value; a routine may be a procedure or a function. A class is composed of a small amount of data and a large number of small routines. A class definition begins with the keyword class, followed by the name of the class. The next entry in the class definition is the creation clause, that names the routine used to set any initial values; by convention, the name of this routine is make. The rest of the class consists of a set of features, first the data and then the routines. The class is terminated by an end statement. A complete Eiffel class listing for a simple ACCOUNT is:
    class ACCOUNT

    creation
        make

    feature
        balance: REAL

        make is
                -- set the initial balance from user input
            do
                io.putstring ("Enter the initial account balance: ")
                io.readreal
                balance := io.lastreal
            end -- make

        deposit is
                -- read an amount to deposit, add it to the balance
            do
                io.putstring ("Enter the amount to deposit: $")
                io.readreal
                balance := balance + io.lastreal
            end -- deposit

        show is
                -- display the balance
            do
                io.putstring ("The account balance is $")
                io.putreal (balance)
            end -- show

    end -- class ACCOUNT

Note that this single class ACCOUNT is not a full system; it cannot be executed by itself. Some other class is needed to call the routines; the class ACCOUNT supplies these routines, but some other class (such as CUSTOMER or BANK) actually uses the account.

A class has the following basic mechanism. The make routine is executed to give the class variables (attributes) some useful value. Once the variables have been given useful values, the other routines in the class are called and executed to use or to change these values. When a system is run, the make routine for the root class is executed, and that routine calls other routines, in its own or in other classes, until the Eiffel system has completed its task. Control then returns to the operating system.

An Eiffel system is built from a set of classes. A class is named as the root class for the system, and the make routine in this class is executed first. That routine calls other routines in the same class or in other classes, and these routines call other routines, until the system has finished execution.

An Eiffel system is built by writing a set of classes, then compiling the code into an executable form and executing the compiled system. The class definition exists when the code has been written and compiled, so we say that the class exists at compile time; it is a compile-time entity. When the compiled code is executed, objects are created and used, so we say that an object is a run-time entity. Each object stores its own data, and uses the routines from its class.

This is the power of OO languages, that we can define the behaviour of a class of objects, and then create instances of the class - objects - as needed. Each instance has its own data, and all instances share the code of the class. We can thus define the behaviour of an object once, in the class definition, and reuse that code every time we create and use an instance of the class, an object.
A separate text file must be used for each class; the code for a class is stored in its own text file. The Eiffel compiler looks for a text file containing the string class to find the class definition, and assumes there is one class per file. By convention, the name of the text file uses the name of its class; the class ACCOUNT is stored in the file “account.e”. This is not necessary - ISE Eiffel searches all the text files in your directory for the class name - but calling the class by one name and the text file by another is sure to lead to confusion.

Common error: A text file contains several classes.
Effect: Eiffel can only find the first class in the text file.
What to do: Use a separate text file for each class definition.

Common error: The name of the class is different from the name of the text file.
Effect: In some versions, Eiffel uses the name of the file to match, so it cannot find the class.
What to do: When you have a class X, store it in a file named x.e.

Common error: Two classes in your directory have the same name, creating a name clash.
Error code: VSCN
Error: cluster has two classes with the same name
What to do: If both classes are needed, change the name of one of them.
1.5 Code layout
Eiffel was designed to support software engineering, and has developed a standard format for coding an Eiffel class. This format is used by Eiffel programmers world-wide. There is no need to invent your own conventions, and in fact it is dangerous to do so, just like deciding to drive on the nonconventional side of the road. No-one can stop you from doing this, and you may get away with it for some time, but you are sure to come to grief in the end. The Eiffel conventions for class layout are

1. The keywords class, creation, and feature are written at the left margin.
2. The name of a class is CAPITALISED. This is true both for classes that you write and for classes in the Eiffel library.
3. A blank line separates attributes from routines, and one routine from the next.
4. An attribute declaration is indented four spaces from the left margin.
5. A routine header is indented four spaces from the left margin; call this a step. A routine header consists of the routine name, followed by any arguments and terminated by is.
6. A routine header comment is indented three steps from the left margin, to the level of the code. A header comment describes what the routine does, not how it does it.
7. The keywords do and end are indented one step from the header.
8. The code in a routine is indented one step from the do and end.
9. The name of the routine is written as a comment at the end of the routine.

The code in a routine is thus indented on the code listing with the following format:
• routine header: 4 spaces
• routine comment: 12 spaces
• do: 8 spaces
• routine code: 12 spaces
• end: 8 spaces
In terms of steps, the format of a routine is 1, 3, 2, 3, 2 steps. This is the international convention.

10. A space is written after every punctuation mark (comma, colon, semi-colon).
11. A space is written before and after an assignment, and a comment marker.
12. The name of the class is written as a comment at the end of the class.
These conventions are the standard solution to the problem of code formatting, and are designed to make the code easy to read. Experienced Eiffel programmers expect Eiffel systems to follow this format, and find it difficult to understand code written in some other notation. Good indenting is the most important of these conventions; good indenting makes the structure of the code obvious to the reader. The typefaces in this book follow the standard Eiffel manuscript format (Meyer, 1992). Eiffel keywords are shown in bold face, comments are shown in plain text, and the executable code is shown in italics. This is the convention used to write about Eiffel; the conventions used to format Eiffel code are given above.
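The layout conventions listed above can be seen together in one small class. This is only a sketch: the class COUNTER and its features are invented for illustration and are not part of the case study.

```eiffel
class COUNTER

creation
    make

feature
    count: INTEGER

    make is
            -- set the counter to zero
        do
            count := 0
        end -- make

    increment is
            -- add one to the counter
        do
            count := count + 1
        end -- increment

end -- class COUNTER
```

Each routine follows the 1, 3, 2, 3, 2 step pattern: header at one step, header comment at three, do at two, code at three, and end at two.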
1.6 Building a system from classes
The basic interaction in an Eiffel system is captured in a client chart (Meyer, 1992). The client-supplier relationship is the most basic relation between classes, and reflects the common-sense idea of client and supplier. When you use the services of someone else, such as a lawyer, a doctor, or a plumber, then you are a client of that person. In the same way, one class is a client of another if it uses the services provided by that second class. In the example here, class CUSTOMER is a client of class ACCOUNT because it uses or calls the account features. This is known as a client-supplier relationship, in which class CUSTOMER is the client, that uses the services or features supplied by the class ACCOUNT. The “top” level of control in an Eiffel system is provided by the root class, the class that starts and then controls the execution of an Eiffel system. We can describe the class structure of a very simple banking system with a client chart such as that shown below. An oval contains a class name, and an arrow is drawn from left to right, from client to supplier, to indicate that the client calls or uses one or more features defined in the supplier.
[Client chart: three ovals labelled BANK, CUSTOMER, and ACCOUNT, with an arrow from BANK to CUSTOMER and an arrow from CUSTOMER to ACCOUNT]
A system of three classes is shown in the chart: BANK, CUSTOMER, and ACCOUNT. The BANK class is the root class for this system, shown to the far left of the diagram. When the system is run, code (instructions) in the BANK class calls routines in the CUSTOMER class, then code in these routines calls routines in the ACCOUNT class. The code in the ACCOUNT class is executed, then control returns to the CUSTOMER, then to the BANK; after all the code has been executed, the system terminates and returns control to the computer system.

Informally, a client relation means that the client "has", "uses", or "contains" the supplier; a bank has customers, and a customer uses a bank account. In terms of Eiffel code, we can say that a client-supplier relation is defined when a client class C uses features of a supplier class S, but this is still an informal definition. Formally, a client arrow is drawn from the client C to the supplier S if and only if class C declares a variable of type S.
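The formal definition can be sketched in code. In the class below, the declaration of account is what makes CUSTOMER a client of ACCOUNT. This is an illustration only, not the case study code: it assumes that ACCOUNT has a creation routine make that takes no arguments, and it uses the creation instruction !! to build the account object.

```eiffel
class CUSTOMER

creation
    make

feature
    account: ACCOUNT
            -- this declaration makes CUSTOMER a client of ACCOUNT

    make is
            -- create the account for this customer
            -- (assumes ACCOUNT has a creation routine make with no arguments)
        do
            !!account.make
        end -- make

    show is
            -- show the balance by calling a feature of the supplier
        do
            account.show
        end -- show

end -- class CUSTOMER
```

The call account.show uses a service supplied by ACCOUNT; on the client chart, this is the arrow from CUSTOMER to ACCOUNT.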
1.7 Running an Eiffel system
The code in a text file is a series of characters; a computer only speaks binary, so the text is not executable by the computer. It is converted to executable code by compiling the system. To compile an Eiffel system, you need to define a separate file, called an Ace file (Assembly of classes in Eiffel) in ISE Eiffel. In the Ace file, you tell the compiler the name of your root class, the name of the creation routine in that class, and the directory where you have stored your Eiffel text files. When you then run the Eiffel compiler, it looks in the Ace file in your current directory to get the starting information, finds the file for the root class, and from the root class automatically finds and links all the connected classes into a single executable file for the whole system. The name of this executable file is also specified in the Ace file.

The Eiffel compiler first checks the syntax of your system, then converts the Eiffel code to C code, and then converts the C code to executable code. The C code is kept in a special file created by the system, so a compilation can create many new files on your system. Eiffel has a smart compiler, so a class is recompiled only if it has been changed; any existing, unchanged classes are not recompiled. This is known as “melting ice” technology, where most of the system can be thought of as frozen (unchanged) and only a small part has to be recompiled.

The system is executed when the executable file is run on your computer. When you run the system, Eiffel creates an object for the root class and executes the creation routine of the root class, which then creates the other objects and calls their features as necessary.
1.7.1 A simple class

The simplest Eiffel system that does something visible is a single class with one output. The single class is, of course, the root class for the system; in fact, it is the whole system. In programming, the sample program that prints a message is usually called the "Hello world" program. Here is the Australian version:

class HELLO

creation
    make

feature
    make is
            -- say hello to the world
        do
            io.putstring ("%NG'day mate%N")
        end -- make

end -- class HELLO

When this system executes, Eiffel runs the creation routine of the root class. Usually, this routine then creates other objects and calls other routines; here, the creation routine outputs a message and calls no other routines. The effect of coding, compiling, and running this system is to produce the message G'day mate on the terminal screen.
1.7.2 A simple Ace file

To create an Ace file, first make a directory named eiffel to store your Eiffel files. To get an Ace file, copy the file /pub/psda/Ace into your eiffel directory as Ace. Edit this full Ace file so it looks like that shown below. This minimal Ace file tells the compiler that the name of the root class is HELLO, that the creation routine in that class is named make, and that the final executable file will be called “gday”. It also tells the compiler to find the root class and all your Eiffel classes in your working directory (eiffel: "./"), to look for any other Eiffel files in the Eiffel kernel library, and to use the precompiled Eiffel files. Make your Ace file look exactly like this: indent four spaces, no extra space lines.

system gday
root HELLO: "make"
default
    assertion (require);
    precompiled ("$EIFFEL3/precompiled/spec/$PLATFORM/base")
cluster
    eiffel: "./";
    kernel: "$EIFFEL3/library/base/kernel";
end

An Ace file contains the following information:
1. After the keyword system, write the name of your executable system file (lower case).
2. After the keyword root, write the name of your root class (upper case) and the name of its creation routine (lower case); by convention, a creation routine has the name "make".
3. After the keyword default, the Ace file lists the assertion checking status and any precompiled libraries.
4. After the keyword cluster, the Ace file lists where to find the files to compile; a cluster is a directory that contains a set of Eiffel files. The first line in the example above (eiffel: "./";) tells the compiler to look in your current directory for your files. The second line says to use the Eiffel kernel files found in the given directory; for the first lab, you only need the kernel Eiffel files. An Ace file should have only the clusters that you need, because more clusters means a longer compile.

The default and cluster sections can usually be ignored when you edit an Ace file. Basically, you need to tell the Eiffel compiler
• the name of your root class
• the name of the creation routine in the root class
• the name of the executable file you want to produce
and Eiffel can then find your code and compile it, when you tell Eiffel to compile using eifstart.
1.7.3 eifstart

The eifstart command is a special command written for the SoCS computer system. When you enter this command, it gets your Ace file (from your current directory) and uses it to compile a new system. Because Eiffel uses a "melting ice" approach to compilation, you only run eifstart to get an initial compilation. Once your system compiles and you have an executable file, you then use the SoCS command eif to re-compile a changed system, so almost all the time you will use eif to add new changes to your existing system. eifstart can take a very long time to run because it compiles every line of code in the system, whereas eif is very quick.

When you run a compile command (eifstart or eif), you start a long sequence of actions. The Eiffel compiler will write a display to your screen such as:

$ eifstart
SoCS Eiffel Compilation Suite - eifstart
Starting compile
Eiffel compilation manager (Version 3.3.7)
Degree 6: cluster eiffel
Degree 5: class HELLO
Degree 4: class HELLO
Degree 3: class HELLO
Degree 2: class HELLO
Degree 1: class HELLO
Melting changes
System recompiled.
Moving executable gday to current directory
Done.

The eifstart and eif commands look in your Ace file to find your root class and creation routine, and compile the rest of the system from there. The eifstart command creates a subdirectory EIFGEN in your working directory, and below that COMP, F_code and W_code. Eiffel puts various files in these directories; it puts the executable system file in the directory EIFGEN/W_code and then the SoCS command moves it from there to your current directory.

eifstart and eif only exist on the SoCS computer system; they are not a standard part of Eiffel. If you use other compilers or other computers, then you should go through similar actions and see similar output, but it will probably be more complex and take a lot longer because the SoCS system translates Eiffel directly into byte code and skips the normal C compilation.

Common errors

1. No Ace file. You are told:
No Ace file in current directory. Giving up.

2. No root class, or it is stored under some strange name so Eiffel can’t find it. You are told:
Error code: VD30
Lace error: Root clause lists improper identifier as root class name
You need to change either
• Your Ace file, so it uses the class name used in the class definition (text file),
• Your class definition, so it uses the class name used in the Ace file, or
• Your current directory, so it is the directory where your text files are stored.

3. Class with this name cannot be found. Your code has referred to a class named X, but there is no class X definition in your current directory, or in the Eiffel library. You are told:
Error code: VTCT
Error: Type is based on unknown class
What to do: use an identifier that is the name of a class in the universe.
“The universe” consists of the directories named in the Ace file; the class definitions are the universe for an Ace file. These are usually the current directory and the Eiffel library directories.

4. The Ace file specifies a class that is not the root class. Eiffel will do exactly what you have told it to do: compile a system starting at the specified class. The compilation may even succeed, but the system will be smaller than you expect when you execute it.
1.7.4 The system file

You can now run the system by typing the name of the executable file, here gday.

$ gday

The system will execute, ask for any input data (none in this case) and show the message

G'day mate

1.7.5 eif

To compile a changed system (new code, same root class), eif melts (re-compiles) the changed code and merges the old and new compiled code. A re-compilation has the same passes as eifstart, but it is quicker. After a lot of changes, eif becomes inefficient and you need to delete the system files (including EIFGEN), produce a new executable file (with eifstart) and then later add any new changes (with eif) to that new version.
1.8 Case study: look and feel
Each chapter in this text has a corresponding case study that shows how the ideas in the chapter are used to develop a working system. The first part of the case study sets and changes the balance of an account, to illustrate the look and feel of a class with simple data flow.

Main points in this chapter
• A traditional language cuts the world into two parts, data and code, and keeps the data separate from the code. An OO system cuts the world into a set of classes, where each class contains both data and code.
• A class is designed around its data (attributes), and contains both the data and the code (routines) that uses or changes that data. A routine may be a procedure or a function.
• A class consists of a small number of attributes, and a large number of small routines.
• The Eiffel code layout convention was designed to make code easier to read.
• The class defines the behaviour of all objects of that type. An object is an instance of the class. Each object stores its own data values, and all objects of that type share the code in the class definition.
• An Eiffel system is compiled from the Ace file in the directory. The Eiffel class (text) files are also in the current directory.
• An Eiffel system is run by typing in the name of the executable file. The system starts by executing the make routine for the root class, and following any routine calls from there.
Exercises
1. Define the following terms:
• code
• attribute
• routine
• class
• object
• supplier class
• client class
• root class
• client chart
• system
2. What is the format of a class definition?
3. What are the three parts of a variable?
4. What is the format of a data declaration?
5. What is the format of a routine definition?
6. Why is indentation important? How far is a header comment indented? What does a header comment describe?
7. What is the difference in behaviour between a procedure and a function? How are procedures connected in a program? How are functions connected in a program?
8. How is a routine called?
9. Run the “hello world” system.
Chapter 2: Basic data types

Keywords: INTEGER, REAL, DOUBLE, CHARACTER, declaration, input, output, assignment

This chapter presents the basic data types in Eiffel: INTEGER, REAL, DOUBLE, and CHARACTER. The syntax and mechanism for declaration, input, assignment, calculation, and output are given for each type. A data declaration reserves storage for a variable, and gives that variable an initial, default value. An input command gets a value from the keyboard, an assignment statement stores a value in a variable, and an output command displays a value on the terminal.
2.1 Class INTEGER
Integers in Eiffel are instances or objects of type INTEGER. In Eiffel, an INTEGER is stored in 32 bits, so an integer can take any value between 2^31 - 1 and - (2^31 - 1); that is, between +2,147,483,647 and -2,147,483,647. The behaviour of an integer is defined by what you can do with it; formally, the behaviour of a class is defined by the operations in that class. The table below shows the symbol, name, and an example expression for each numeric operator defined in class INTEGER:

symbol   name           example
+        unary plus     +6
-        unary minus    -42

^        exponent       3^2

*        times          hours * rate
/        divide         total / people
//       divisor        365 // 30
\\       modulus        hours \\ 12

+        binary plus    3 + total_cost
-        binary minus   wins - losses
The unary operators + and - take a single value or argument, and return a single value. The unary minus operator returns the negative of its argument, so the value of --3 is 3. All the other INTEGER numeric operators take two values, and return a single value.

The exponent or power operator takes two numbers, and returns a new number. Let us call the two integer arguments the number and the power, and the returned value the result. The value of the result is found by raising the number to the given power; several examples are given below. In Eiffel, a number raised to a power results in a real number, even if the two arguments are both integers.

2 ^ 0 = 1.0      1 ^ 43 = 1.0      0 ^ 28 = 0.0
2 ^ 1 = 2.0      3 ^ 3 = 27.0      -4 ^ 2 = 16.0
2 ^ 2 = 4.0      5 ^ 2 = 25.0      -4 ^ 3 = -64.0
The times operator takes two integers and returns an integer that is the product of its two arguments. The divide operator takes two integers, and returns a single real number that is the value of the first argument divided by the second argument. The divisor (symbol //, often named div) and modulus (symbol \\, often named mod) operators take two integers as arguments. The divisor returns the number of times that the second integer divides into the first. The modulus returns the remainder after an integer division. Examples of the div and mod operators are

12 // 10 = 1     10 goes into 12 1 time (there is 1 10 in 12)
47 // 10 = 4     10 goes into 47 4 times (there are 4 10s in 47)
53 // 25 = 2     25 goes into 53 2 times

12 \\ 10 = 2     12 is 10 * 1, with 2 left over
47 \\ 10 = 7     47 is 10 * 4, with 7 left over
53 \\ 25 = 3     53 is 25 * 2, with 3 left over
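The div and mod operators are often used as a pair, to split a total into whole units and a remainder. The class below is a minimal sketch of this idea; the class name MINUTES and the constant Total_minutes are invented for illustration, and the output command io.putint is assumed to be the integer counterpart of io.putreal.

```eiffel
class MINUTES

creation
    make

feature
    Total_minutes: INTEGER is 135

    make is
            -- show a number of minutes as hours and minutes
        do
            io.putstring ("Hours: ")
            io.putint (Total_minutes // 60)
                -- 60 goes into 135 2 times
            io.putstring ("%NMinutes: ")
            io.putint (Total_minutes \\ 60)
                -- 135 is 60 * 2, with 15 left over
            io.putstring ("%N")
        end -- make

end -- class MINUTES
```

Run as a root class, this should show 2 hours and 15 minutes.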
All INTEGER operators take one (unary operators) or two (binary operators) integer values as arguments. All but the divide and exponent operators return an INTEGER value; divide returns a REAL value, and exponent a DOUBLE precision real value. The integer operators, the types they use, and the types they produce, are shown below; a list of the argument and the returned types is called the signature of the routine.

symbol   name           use                 result
+        unary plus     INTEGER             INTEGER
-        unary minus    INTEGER             INTEGER

^        exponent       INTEGER, INTEGER    DOUBLE

*        times          INTEGER, INTEGER    INTEGER
/        divide         INTEGER, INTEGER    DOUBLE
//       divisor        INTEGER, INTEGER    INTEGER
\\       modulus        INTEGER, INTEGER    INTEGER

+        binary plus    INTEGER, INTEGER    INTEGER
-        binary minus   INTEGER, INTEGER    INTEGER

2.2 INTEGER declaration
A declaration reserves storage in the computer's memory to store the value of a variable. Formally, the declaration gives the variable a name, a type, and an initial value, to fill the three parts of a variable (shown below). A variable must be declared before it can be used in code.
[Figure: the three parts of a variable: name, type, value]
A variable declared as a feature of a class is called an attribute. An attribute value is a permanent part of an object, and exists as long as the object exists. If a value is defined in the declaration, the variable is a constant, and the value cannot be later changed; a constant must be declared as an attribute. Some example attributes and constants are shown below, first their data structure and then their declarations. The first declaration names a single variable of type INTEGER, the second names three INTEGER variables, and the third declaration defines an INTEGER constant.
            declaration                       name     type      value
variable    length: INTEGER                   length   INTEGER   0
variables   length, height, width: INTEGER    length   INTEGER   0
                                              height   INTEGER   0
                                              width    INTEGER   0
constant    length: INTEGER is 4              length   INTEGER   4
The syntax of a declaration is a name, followed by a colon, a space, and the type of the variable. Multiple variables of the same type can be declared in a single declaration by writing several names (separated by commas) before the colon; in this case, a comma then a space is placed between two identifiers. Eiffel follows the English convention for text layout, where a punctuation mark (colon, comma) is placed immediately after the word, then a space is placed between the mark and the next word. This makes Eiffel code easy to read, by adopting the natural language conventions that we all know.

A name (identifier) in Eiffel is a sequence of characters. The first character must be a letter ('a' to 'z', 'A' to 'Z'), but the other characters may be letters, digits ('0' to '9'), or the underscore character ('_'). All identifiers (variable, routine, or class name) obey this rule. The name of an attribute is a noun that describes that attribute. The name is almost always a single word, but there may be situations where a compound name is needed; the convention is to link simple names with an underscore such as tax_rate, my_name, and top_score.

Two attributes in a class cannot have the same name, because the computer needs a unique name for each variable; this situation is known as a name clash. A variable may have the same name as a class. This is not a name clash, because there are not two variables with the same name: there is a variable and a class with the same name. Eiffel has a set of reserved words, listed in Appendix B, that have a special meaning in the language and cannot be used as names.

The type of a variable is written after the name, separated from the name by a colon. Every type in Eiffel is a class, except for generic classes (see Chapter 11). The simple or basic types of variable in Eiffel are INTEGER, REAL, DOUBLE (double precision real numbers), CHARACTER, and BOOLEAN. Class names are always written in upper case in Eiffel.
Eiffel variables get a default value when they are created. Variables of type INTEGER are given an initial value of 0 (zero). The value is changed by an assignment statement.

A constant is declared by giving the name, type, and value in the attribute declaration. A constant is considered to be part of the class definition, so it has to be declared as an attribute. The name of a constant (such as Void) starts with a capital letter, by convention. Examples of integer constant declarations are

Days_in_year: INTEGER is 365
Days_in_week: INTEGER is 7

Unique constants represent a set of unique values, and are used to define enumerated data types. A set of values is defined or enumerated in a declaration, the order of these values is defined from left to right in the declaration, and we can then test and assign such values. The series of names in the declaration are given a series of INTEGER values that are guaranteed to be unique and increasing, but whose exact value is unknown. They represent enumerated types, when the set of possible values can be enumerated and no further detail is needed. The declaration has the form

Red, Orange, Yellow, Blue, Green, Indigo, Violet: INTEGER is unique

Each name is a constant, so it can be assigned to an INTEGER variable, and that value can be tested, as in if colour < Yellow then ....

Variable names begin with a lower case letter, constants with an upper case letter, and compound names are connected by an underscore. These are the standard, worldwide Eiffel conventions and should be followed; experienced Eiffel programmers assume that the author has followed the conventions when they read someone else’s code.

Great care should be taken to choose the correct name. Programmers spend most of their time reading other people’s code, not writing new code. Most of the money spent in computing is spent on changing existing code, not on writing new code. The “software crisis” has arisen because much of the existing code cannot be understood or modified, and has to be either thrown away or re-written from scratch. Eiffel was designed to build reusable software and to help solve the software crisis, so you should always write your code for the next person, and remember that words are very useful in communicating between people.

Common error: two attributes have the same name, generating a name clash
Error code: VMFN
Error: two or more features have the same name
What to do: If they must indeed be different features, choose different names or use renaming
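The declaration forms described in this section can be put together in one class. The sketch below is illustrative only: the class TRAFFIC_LIGHT and all of its features are invented, and the if instruction is used in the same simple form as the colour test shown earlier.

```eiffel
class TRAFFIC_LIGHT

creation
    make

feature
    Red, Amber, Green: INTEGER is unique
    Seconds_on_red: INTEGER is 30
    colour, seconds_waited: INTEGER

    make is
            -- start at red, and report whether the light is green
        do
            colour := Red
                -- a unique constant is assigned like any INTEGER value
            seconds_waited := Seconds_on_red
            if colour < Green then
                io.putstring ("The light is not green yet%N")
            end
        end -- make

end -- class TRAFFIC_LIGHT
```

The constants start with an upper case letter and the variables with a lower case letter, so a reader can tell them apart without looking at the declarations. The test colour < Green relies only on the left to right order of the names in the unique declaration, not on their exact values.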
2.3 INTEGER expressions
Expressions are used to calculate a value; that value may be stored in a variable using an assignment statement, or the value may be used immediately in the code. An expression is a sequence of values connected by operators. The values may be variables, constants, or literals; a literal is an explicit value written as part of the code, such as 0.5 in the expression price * 0.5. The basic way to evaluate an expression is to evaluate the operators in left to right order, but there are two exceptions to this basic rule: operator precedence and explicit bracketing.

Each operator has a specific precedence or priority. When a flat expression (an expression with no brackets) is evaluated, the operators are evaluated first in precedence order, then in left to right order. The precedence order for the INTEGER operators is shown below, from highest to lowest precedence:

+ -          unary plus, unary minus
^            exponent
* / // \\    times, divide, divisor, modulus
+ -          plus, minus
For a flat expression, all exponents are evaluated before any multiplication is done, all multiplication is done before any addition, and so on. Operators at the same level of precedence are executed from left to right in the expression. Brackets are used to force evaluation, because expressions within brackets are evaluated before unbracketed ones.

The rules that the computer uses to evaluate a complex expression are thus:
1. Select an expression in brackets, starting with the deepest brackets
2. In that expression, evaluate all operators at the top level of precedence, left to right
3. In that expression, repeat the evaluation for each level of operator precedence
4. When the flat expression is evaluated, find the next, most nested flat expression
5. Repeat 2-4 until the whole expression has been evaluated
Several example expressions, the order of evaluation, and their results, are shown below. The complete Eiffel operator precedence order is given in Appendix A.

1 + 2 * 3 ^ 3 - 4 / 5
-> 1 + 2 * 27.0 - 4 / 5
-> 1 + 54.0 - 4 / 5
-> 1 + 54.0 - 0.8
-> 55.0 - 0.8
= 54.2

1 + 2 ^ 2 * 3 / 4
-> 1 + 4.0 * 3 / 4
-> 1 + 12.0 / 4
-> 1 + 3.0
= 4.0

(1 + 2 ^ (2 * 3)) / 4
-> (1 + 2 ^ 6) / 4
-> (1 + 64.0) / 4
-> 65.0 / 4
= 16.25

-73 \\ 12 // 5 + 3 ^ 2 / 4
-> -73 \\ 12 // 5 + 9.0 / 4
-> -1 // 5 + 9.0 / 4
-> 0 + 9.0 / 4     (5 goes into -1 zero times)
-> 0 + 2.25
= 2.25
Brackets should be used to make the expression clear where there is any possibility of confusion. You should always write code with the next person in mind, who has to read and understand your code, and explicit brackets can make an expression clearer to understand.
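The effect of brackets can be checked with a very small system. The class below is a sketch only (the class name BRACKETS is invented, and io.putint is assumed to be the integer output command); it shows the same three values with and without brackets.

```eiffel
class BRACKETS

creation
    make

feature
    make is
            -- show the same values with and without brackets
        do
            io.putstring ("2 + 3 * 4 = ")
            io.putint (2 + 3 * 4)
                -- times is done before plus: 2 + 12 = 14
            io.putstring ("%N(2 + 3) * 4 = ")
            io.putint ((2 + 3) * 4)
                -- the brackets are evaluated first: 5 * 4 = 20
            io.putstring ("%N")
        end -- make

end -- class BRACKETS
```

Only integer operators that return an INTEGER are used here, so both results can be shown with the integer output command.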
2.4 Assignment
A variable gets a value when it is created. This value may be changed when the code is executed by an assignment statement. An assignment statement or instruction has the form

identifier := expression

A value is produced when the expression on the right hand side of this statement is executed or evaluated, and the value of the expression is then stored in the variable listed on the left hand side. The Eiffel convention for writing assignment statements is to place a space before and after the assignment operator. Some examples of integer assignments are shown below:

total_cost := air_fare + bus_fare + hotel
area := length * height
count := count + 1

Assignment creates the data flow in a system. Data in a variable on the right hand side of an assignment (the source) “flows” into the variable on the left hand side of the assignment (the target). To illustrate how data values flow through the variables, a code fragment is shown to the left below and the values of the variables at each step are shown to the right.
                            length   height   area
                            0        0        0
length := io.lastint        5        0        0
height := io.lastint        5        6        0
area := length * height     5        6        30
A data flow diagram shows the data flow, illustrated below. The values of length and height are used to calculate area, and the values of area and width are used to calculate the volume. A data flow chart shows the data flows in the system, not the order of the operations. We could swap the order of the first two inputs and still have a correct solution.

length := io.lastint
height := io.lastint
area := length * height
width := io.lastint
volume := area * width

[Data flow diagram: length and height flow into area; area and width flow into volume]

Assignment is a two step process that occurs in time. The assignment count := count + 1 is both correct and common; the variable count is incremented by 1, so the value of the variable is larger by one, after the statement has been executed. This is very different from an equality test; it is impossible for a value to be larger than itself. The assignment operator should thus be called something like assigns, gets, or is. The operator should never be called equals; it does not test for equality, it assigns a value to the variable.
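The data flow from length and height through area to volume can be traced in a complete class. This is a minimal sketch: the class BOX is invented, literal values stand in for keyboard input so the flow is easy to follow, and io.putint is assumed to be the integer output command.

```eiffel
class BOX

creation
    make

feature
    length, height, width: INTEGER
    area, volume: INTEGER

    make is
            -- calculate and show the volume of a box
        do
            length := 5
            height := 6
            width := 2
            area := length * height
                -- area now holds 30
            volume := area * width
                -- volume now holds 60
            io.putstring ("The volume is ")
            io.putint (volume)
            io.putstring ("%N")
        end -- make

end -- class BOX
```

The first three assignments could be written in any order, but area must be assigned before volume, because the value of area flows into volume.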
2.5 Error messages

If you write code that Eiffel cannot parse, you get an error message that describes
1. What the error is
2. Where the error is
3. How to fix the error
This is exactly the information you need to fix your mistake, so it is important to listen to what the Eiffel compiler tells you in the error message. Novices often ignore the content of an error message, however, possibly because it uses words with which they are not familiar. Consider the simple class shown below:

class X

creation
    make

feature
    a, b: INTEGER

    make is
            -- add two numbers together
        do
            sum := a + b
        end -- make

end -- class X

This code shows the most common novice error, to forget to declare a variable. You must declare a variable before it can be used; if not, Eiffel gives you the compile time error:

Error code: VEEN
Error: unknown identifier
What to do: <make sure that the identifier is defined>
Class: X
Feature: make
Identifier: sum

The last three lines give the location of the error. In this case, Eiffel could not work out what to do with the identifier sum in the routine make in the class X. By reading the last three lines of the error message, you know the class, the routine, and often the line where the error was found.

The first two lines give the error code (ignore this), and a short description of the error. In this case, the error was that Eiffel ran across an unknown identifier sum in the make routine of class X, so it says “unknown identifier”. An identifier can be the name of a variable, a routine, or a class, so you need to work out which of these is the case, by looking at the name of the identifier on the last line of the error message.

The hardest part of the error message is the “What to do” section, because an error message uses words that have a precise meaning in Eiffel, but that you may not understand. Often the rest of the message provides all the information you need to find and fix the error, as is the case here. The “fix” given in the message above (enclosed in < >) is not even close to the Eiffel error message (run the code to see the message), but the rest of the message is so good you do not need to understand the “What to do” message.
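Following the message, the fix for class X is to declare the missing identifier. One way to do this, sketched below, is to add sum to the existing attribute declaration; declaring it on a separate line would work equally well.

```eiffel
class X

creation
    make

feature
    a, b, sum: INTEGER
            -- sum is now declared, so it is a known identifier

    make is
            -- add two numbers together
        do
            sum := a + b
        end -- make

end -- class X
```

Because a and b both have the default value 0, sum is also 0 after make has been executed; the class now compiles, even though it does nothing visible.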
2.6 INTEGER input and output
The word “input” is used here to describe the act of typing data into the computer via the keyboard, and “output” is used to describe data that is shown on the terminal screen. Input and output operations are controlled in Eiffel by a special system object whose name is io, so every input and output command uses the object io, and has the form io.feature. Because Eiffel is a fully OO language, it uses a different command for each type of input and output.

To get a single, integer value from the user requires three lines of code. The first line is a prompt that writes a string to the terminal screen so the user knows what to do. The second line is a command that reads the value that the user types on the keyboard. The third line takes this value and stores it in a variable in your system. An example of this input code is

    io.putstring ("Please enter your age in years: ")
    io.readint
    age := io.lastint

To read an integer, you use a command of the form

    io.readint

This command gets a value from the keyboard and stores that value in the Eiffel integer buffer; a buffer is a temporary storage area in the computer. Each input type in Eiffel has its own buffer. After Eiffel executes this instruction, the value from the keyboard has been read and stored, and your Eiffel system can now use that value. To find out what the entered value was, you ask for the last integer read in using the query

    io.lastint

This function returns the last integer value that was read. Because it is a function, it can be called many times and will return the same value each time; by definition, a function changes nothing.

Every input from the user should be preceded by a prompt, so the user knows what to type. You don't need to use quotes to input values; the computer finds out how to treat the input by looking at the type given in your input command. Note that the input command io.readint does not mention a variable. Eiffel stores its input in a buffer, one buffer for each type of object. To access the buffer, you issue a query to get the last value read into the buffer. Getting the last value does not change the value of the buffer, it only reads that value, so you can get the last value as many times as you want. The input will only change when you read in a new value for that type of object. Reading in a new value changes the value stored in the buffer, so any input value must be stored or used by your system before the next value is read in from the user.

To output an integer, you write a label to explain the output to the user, followed by an output command. Few things are as puzzling as a number showing on the screen with no explanation of what the value means.
A label has the same form as a prompt (it is a string shown on the screen), so the code needed to output an integer is

    io.putstring ("Your age in years is now ")
    io.putint (age)

An output is a command, because it changes the state of the screen. The output command, the type it requires, and a sample screen output for the command, are

    command          argument type    sample output
    io.putint (x)    x: INTEGER       12
Each input command reads a single value, and each output command displays a single value; if you need to read or display several values, then you have to use a separate command each time. Output of a label and three integers thus needs four lines of code. At least one space should be placed between a prompt and its input value, and between a label and its output value.
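The sketch below puts these rules together for two integer inputs. It uses only the io features named in this section; the attribute names age and years, and the wording of the prompts, are invented for the illustration. Note that the first value is copied out of the integer buffer before the second io.readint overwrites it.

```eiffel
class X
creation
    make
feature
    age, years: INTEGER

    make is
            -- read two integers, then display a labelled result
        do
            io.putstring ("Please enter your age in years: ")
            io.readint
            age := io.lastint       -- store before the next read
            io.putstring ("How many years from now? ")
            io.readint
            years := io.lastint
            io.putstring ("Your age then will be ")
            io.putint (age + years)
        end -- make
end -- class X
```

Each prompt and the label end with a space, so the typed value and the displayed value are separated from the text before them.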
2.7 Output formatting
Prompts and labels are examples of STRING output, where a string is displayed on the screen to communicate with the user. String output uses the command io.putstring and takes a single argument, the string to be displayed. A string is enclosed in double quotes. A long string may be written on several lines by ending every line but the last with a '%' and starting every line but the first with a '%'. Any number of blanks or tabs may be placed between the '%' signs with no effect on the format of the output. An example of a long string output is

    io.putstring ("Having a wonderful time.%
                  % Wish you were here.%
                  % Please send money")

This command produces the screen output

    Having a wonderful time. Wish you were here. Please send money
A new line may be started on the terminal screen by issuing the command io.new_line; this forces the next output to start at the beginning of the next line on the screen. If the next output is a string, the two lines can be combined by using the new line indicator '%N'. The following two pieces of code have the same behaviour:

    io.new_line
    io.putstring ("Gday")

    io.putstring ("%NGday")

In the same way, a tab can be written to the screen using the special character '%T'; a character is enclosed in single quotes, where a string needs double quotes. Special characters are written as two symbols, the special symbol '%' and a following symbol; the new line character is denoted by '%N', the tab character by '%T', and the percent character by '%%'. The complete list of special characters is shown in Appendix B. An unprintable character can be specified as an ASCII value using the special character form '%/code/', where the code is the ASCII (decimal) value of the character. A new line character, for example, has an ASCII code of 10, so it can also be specified as '%/10/'.

When an integer is displayed on the screen, Eiffel shows all the digits in the integer, so the number of characters shown on the screen changes with the value of the number. If tabs are used to make the output appear in columns, then each number has to be shown in a fixed length field, so the columns line up on the screen. The class FORMAT_INTEGER is used to produce fixed length integer output by changing the integer into a fixed length string, and then displaying the string on the screen. An object of this class is created and given a field width as an argument. The feature formatted of this object is then used to print integers with width number of characters. The basic way to format integers for output is shown below, using a field width of 8:
    class X
    creation
        make
    feature
        test: INTEGER
        format: FORMAT_INTEGER

        make is
                -- show how integer formatting is done
            do
                !!format.make (8)                        -- create formatting object
                test := 43
                io.new_line                              -- start in column 1 of screen
                io.putstring ("Value:%T")                -- label and tab
                io.putstring (format.formatted (test))   -- display in width eight
            end -- make
    end -- class X

This code looks complex, but is easy to adapt as needed. To display in a different field size, all you need to change is the field width when the object format is created, in line 1 of the make routine here. To display a new variable, all you need to do is to write the name of that variable in brackets after the word formatted; this is the argument to formatted. The feature formatted takes the integer, converts it to a string of size 8 (here) by adding leading blanks if needed, and returns this string value. The string is then output using the normal io.putstring command. If the number of digits is larger than the field width, Eiffel takes as many characters as needed to print all the digits in the value. The class FORMAT_INTEGER has many other
formatting features to offer; to see these features, look at the class listing in /opt/Eiffel3/base/support or run short on the class (see Section 5.9).

2.8 Class REAL
A real number is a number with both an integer and a fraction part, and is implemented by class REAL in Eiffel. In Eiffel at SoCS, the integer part of a real number can take on a value between about -10^39 and +10^39, and a fraction value from 0 to about 10^-39.
2.8.1 Declaration and numeric features

Declaration of REAL variables is similar to INTEGER declarations, for both variables and constants. A real number is given the default value 0.0. Examples of REAL declarations are shown below, followed by their data structures:

    declaration               name        type    value
    pay: REAL                 pay         REAL    0.0
    hours, bonus: REAL        hours       REAL    0.0
                              bonus       REAL    0.0
    pay_rate: REAL is 12.0    pay_rate    REAL    12.0
A REAL constant is written with a decimal point and a fraction; if you simply write an integer value here, Eiffel will report a type mismatch error. The numeric operations defined on real numbers are similar to those for integers, except that mod and div are only defined for integer values. The REAL operators are shown in their precedence order below:

    symbol    name            example
    +         unary plus      +6.2
    -         unary minus     -42.7
    ^         exponent        3.6 ^ 2.4
    *         times           hours * rate
    /         divide          total / people
    +         binary plus     3.7 + total_cost
    -         binary minus    total_hours - overtime_hours
Now that we have two numeric types, we need to define how their values are used in an expression. Any numeric value in an expression is first converted into the "heaviest" type in the expression, then the expression is evaluated. A REAL value is heavier than an INTEGER value. If two INTEGERs are added, the result is an INTEGER; if an INTEGER and a REAL number are added then the result is of type REAL; minus, times and divide behave in the same way. Any number of operators can be combined within a single expression. Because an operator is defined on a type, the effect of the operator can differ depending on the type of its arguments; this technique is called overloading. We have already seen an overloaded operator, the operator make. This operator has the same name in each class, but does different things depending on the code defined in the class. The Eiffel class INTEGER contains the integer operators, and the Eiffel class REAL contains the real operators. Each class contains operators named “+”, “-”, “*”, “/”, and “^”; which operator is actually called depends on the type of the arguments. If the arguments are both integers, then Eiffel uses the integer operators. If at least one of the arguments is of type REAL, then the real operators are used. The signatures of the real operators are shown below; note that for the binary operators, if one argument is of type REAL then the other is converted to type REAL before the result is calculated.
    symbol    name            use           result
    +         unary plus      REAL          REAL
    -         unary minus     REAL          REAL
    ^         exponent        REAL, REAL    DOUBLE
    *         times           REAL, REAL    REAL
    /         divide          REAL, REAL    REAL
    +         binary plus     REAL, REAL    REAL
    -         binary minus    REAL, REAL    REAL
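The rule for mixed INTEGER and REAL expressions can be illustrated with a short sketch. It assumes the automatic light to heavy conversion described in this section; the attribute names hours, rate, and pay are invented for the example.

```eiffel
class X
creation
    make
feature
    hours: INTEGER
    rate: REAL is 12.5
    pay: REAL

    make is
            -- multiply an INTEGER by a REAL and store the REAL result
        do
            hours := 38
            pay := hours * rate     -- hours is converted to REAL first
            io.putstring ("Pay: ")
            io.putreal (pay)
        end -- make
end -- class X
```

The product here is 38 times 12.5, or 475. The target pay must be declared REAL, because the result of the expression is of type REAL.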
The most common novice error at this point is to use incompatible types; the two sides of an assignment statement, for example, must have compatible types. Two types are compatible if they are the same type, or one can be converted to, and thus stored in, the other. For the numeric types, an INTEGER value can be stored in a REAL variable, but a REAL value cannot be stored in an INTEGER variable. The code given below produces the following error message:
    feature
        a, b: REAL
        sum: INTEGER
        ...
        sum := a + b

    Error code: VJAR
    Error: source of assignment does not conform to target
    What to do: make sure that type of source (right hand side) conforms to type of target
    Class: X
    Feature: make
    Target name: sum
    Target type: INTEGER
    Source type:

Eiffel tells you where the error is located: in class X, in routine make, in the identifier sum. The fix part of the message then refers to the source and target of an assignment statement. The source is where the value comes from; the right hand side. The target is where the value goes; the left hand side. The type of the source is REAL, of the target INTEGER, so of course they are incompatible; you can’t store a real number in an integer. All of this information makes your task very simple; you simply need to declare sum to be REAL, and the types of the assignment statement are then compatible. The only tricky part of the message is the word “conform”; informally, “conform” means that two variables either have the same type, or are compatible.
2.8.2 Input and output

Input and output are similar to integer IO, but the operations have slightly different names. To read in a real number, we use io.readreal to get the value from the keyboard and store it in the REAL buffer, then io.lastreal to get the value from the buffer and store it in a variable. To output a real number, we use io.putreal to display the value on the screen. An example of REAL input and output is shown below:

    feature
        test: REAL
        make is
                -- illustrate REAL input and output
            do
                io.putstring ("Enter a number: ")
                io.readreal
                test := io.lastreal
                io.putstring ("%NThe value was ")
                io.putreal (test)
            end -- make

When Eiffel executes the input command io.readreal, it reads a value from the keyboard and stores it as a real value in the REAL value buffer. You do not have to type in a decimal part if the decimal part is zero, because you have told Eiffel that the input is a real number and it will automatically add a decimal part to the input value if none is explicitly typed. An input value of 3, for example, is stored as the real value 3.0.

When Eiffel executes the output command io.putreal, it displays as many digits as needed to show the value of the number. If the decimal part is zero, then only the integer part is displayed. If the decimal part has only two digits, then only two decimal digits are displayed. Several examples of real values and their screen output via io.putreal are shown below:

    assignment         output command       screen output
    test := 3.0        io.putreal (test)    3
    test := 3.1        io.putreal (test)    3.1
    test := 3.12345    io.putreal (test)    3.12345
The command io.putreal expects a real value, and that is the only constraint on io.putreal. The value could be a real literal, a real variable, a real constant, a real expression, or a real function. All that matters to io.putreal is that the value it is given as an argument is a real value; several examples are shown below:

    code                         screen output
    io.putreal (3.1)             3.1

    test := 3.1
    io.putreal (test)            3.1

    test := 3.0
    io.putreal (test + 0.1)      3.1

    io.putreal (sqrt (3.0))      1.73205

The output can be formatted nicely by converting a number to a string, and then printing the string; the features to do this are supplied by class FORMAT_DOUBLE (there is no class FORMAT_REAL). An object of type FORMAT_DOUBLE is created and used to control the output via a format specifier. The format specifier for real numbers gives an output field width and the number of decimal places, so the format specifier has the form (width, precision). The width is the total length of the string, with precision decimal places. A correct output value is always returned, so Eiffel takes as large a field as needed to print the integer part of the value, and uses the specified precision for the decimal part. If the number is too small, leading spaces are added before the output value.

Code to output a field of total length 8, with two decimal places, is shown below. The value displayed on the screen will be eight characters long: three spaces, followed by the two digit integer part, then the decimal point, then two digits for the decimal part.
    class X
    creation
        make
    feature
        test: REAL
        format: FORMAT_DOUBLE

        make is
                -- show how real formatting is done
            do
                !!format.make (8, 2)                     -- create formatting object
                test := 43.789
                io.new_line                              -- start in column 1 of screen
                io.putstring ("Value:%T")                -- label and tab
                io.putstring (format.formatted (test))   -- display in width eight
            end -- make
    end -- class X

The class offers many other formatting features.
2.9 Class DOUBLE
A double precision real number is implemented by class DOUBLE in Eiffel. This type of number has the same range of values as a REAL number for the integer part, and can store more precise values in the fraction part of the number. The integer part of a DOUBLE number in Eiffel at SoCS can vary from about -10^39 to +10^39, and the fraction part can have a value from 0 to about 10^-49. A double variable has a default value of 0.0. Three declarations of type DOUBLE are shown below, with the data structures of the variables. The value is shown with ellipses (“...”) to indicate the extra precision of the fraction part.
    declaration                 name        type      value
    pay: DOUBLE                 pay         DOUBLE    0.0000...
    hours, bonus: DOUBLE        hours       DOUBLE    0.0000...
                                bonus       DOUBLE    0.0000...
    pay_rate: DOUBLE is 12.0    pay_rate    DOUBLE    12.0000...
Input is done using the command io.readdouble and the function io.lastdouble. Output is done with the command io.putdouble. Output formatting can be controlled by features of the class FORMAT_DOUBLE. An example of double precision input and output is

    feature
        test: DOUBLE

        make is
                -- illustrate DOUBLE input and output
            do
                io.putstring ("Enter a number: ")
                io.readdouble
                test := io.lastdouble
                io.putstring ("%NThe value was ")
                io.putdouble (test)
            end -- make

The signature of each DOUBLE operator is shown below:

    symbol    name            use               result
    +         unary plus      DOUBLE            DOUBLE
    -         unary minus     DOUBLE            DOUBLE
    ^         exponent        DOUBLE, DOUBLE    DOUBLE
    *         times           DOUBLE, DOUBLE    DOUBLE
    /         divide          DOUBLE, DOUBLE    DOUBLE
    +         binary plus     DOUBLE, DOUBLE    DOUBLE
    -         binary minus    DOUBLE, DOUBLE    DOUBLE
In summary, the exponent operator always returns a value of type DOUBLE while the other operators may return a value of type INTEGER, REAL, or DOUBLE due to operator overloading.
2.10 Mathematical classes

The class REAL supplies the basic real number operations to do arithmetic, but there are also many other useful, but complex, operations we can do on real numbers. Sophisticated mathematical functions such as sqrt and sin are provided by the Eiffel class SINGLE_MATH. This class is inherited by the class that wants to use its features; to see the features in the class, you can ask for a short or a flat display by typing

    short SINGLE_MATH

or

    flat SINGLE_MATH

The short tool shows you all the features that are defined in a class. The flat tool shows you all the features that are offered by a class. A feature may be immediate (defined in the class) or inherited; inheritance is discussed in Chapter 10. To find the square root of a real number, for example, we use the class SINGLE_MATH with the code shown below:
    class X
    inherit
        SINGLE_MATH
    creation
        make
    feature
        test: REAL is 3.0

        make is
                -- show how to use the inherited function sqrt
            do
                io.putstring ("The square root of the test value 3.0 is ")
                io.putreal (sqrt (test))
            end -- make
    end -- class X
Inheritance is discussed in detail in Chapter 10, but for now all we need to know is to add the keyword inherit and the class to inherit (SINGLE_MATH here) after the class name and before the creation clause. When this is done, all features of the class (such as sqrt here) can be used within the class X. The feature sqrt receives a single argument in brackets (the number to use) and returns a real value that is the square root of the argument. This real value is then displayed using the output command io.putreal.

Conversion between numeric types is provided by operators in the REAL and DOUBLE classes (for heavy to light conversions), or automatically by the compiler (for light to heavy conversions). Note that the value is not changed; instead, a function is called on the value and the function returns a new value; the existing value is unaltered. The name and signature of the conversion operators are shown below.

    operator name           argument    result     example
    truncated_to_real       DOUBLE      REAL       d.truncated_to_real
    truncated_to_integer    DOUBLE      INTEGER    d.truncated_to_integer
    truncated_to_integer    REAL        INTEGER    r.truncated_to_integer
It is also possible to use these operators with expressions or functions by wrapping brackets around the expression, such as

    (3.6 + sqrt (2)).truncated_to_integer  =>  5

A larger example of type conversion is

    class X
    creation
        make
    feature
        test: REAL is 3.12345

        make is
                -- show how to truncate a real value
            do
                io.putstring ("The integer part of the test value ")
                io.putreal (test)
                io.putstring (" is ")
                io.putint (test.truncated_to_integer)
            end -- make
    end -- class X

The numeric classes provide many more features than are used here; you can see all the features in a class by running short or flat on that class.
2.11 Class CHARACTER

A printable character is a character with a value in the set 'a' to 'z', 'A' to 'Z', '0' to '9' as well as punctuation marks ('!', '.', ',', ';', ':', '?'), and logical ('<', '=', '>') and arithmetic ('+', '-', '*', '/') symbols. There are also other, less common printable values such as '@', '#', '$', '%' and so on. A printable character is written in Eiffel enclosed in single quotes.
A character may also have an unprintable value, such as the new line character that is named CR (Carriage Return), the backspace character (named BS), the end of transmission character (named EOT), and even the character that was used to ring the bell on a teletype (named BEL)! The common unprintable, but useful, characters are shown in Appendix B.2. The full character set is the standard ASCII (American Standard Code for Information Interchange) character set of 128 characters. A lexical order is defined on the set, based on the ASCII value of each character. For the common printable characters, '0' < '1' < ... < '9' < 'A' < 'B' < ... < 'Z' < 'a' < 'b' < ... < 'z'.

A character is declared like any variable. A variable of type CHARACTER is given an initial, default value of NUL, written as '', and named the null character. Several declarations of type CHARACTER are shown below, with the resulting data structures:

    declaration                  name      type         value
    gender: CHARACTER            gender    CHARACTER    ''
    choice, symbol: CHARACTER    choice    CHARACTER    ''
                                 symbol    CHARACTER    ''
    male: CHARACTER is 'm'       male      CHARACTER    'm'
A character is input in the normal way, by a procedure (io.readchar) to read the keyboard and store the value in a character buffer, followed by a function (io.lastchar) to return the value of that buffer. As always, every input should be preceded by a prompt to tell the user what to do. Character input thus uses the following basic method:
    choice: CHARACTER
    ...
    io.putstring ("Enter your menu choice: ")
    io.readchar
    choice := io.lastchar

Care must be taken with character input, because most of the time your system wants one character but the user actually types in two characters: the printable character and then the new line character CR. Consider the following wrong code:

    account, choice: CHARACTER
    ...
    io.putstring ("Enter your account id (S, C): ")
    io.readchar
    account := io.lastchar
    io.putstring ("Enter your choice for this account (D, W, S): ")
    io.readchar
    choice := io.lastchar

When the user sees the first prompt, they type in an account identifier, such as 'S' for a savings account and 'C' for a credit account; the quote marks are not actually typed into the keyboard, they are simply used here for clarity. The user then hits the Return key on the keyboard, and the command io.readchar takes the account choice character and stores it in the character buffer, then io.lastchar gets the value from the buffer and returns it so it can be assigned to the variable account. The second io.readchar then reads the next input character, which was a CR!
To overcome this problem, you need to “flush” the unwanted CR from the system. One way to do this would be with a dummy read, but this is an ugly solution; we really don’t want to read and then throw away the CR. Instead, we can tell the computer to start reading input from the next input line, after the CR; this is done with the command io.next_line. The correct way to read a single character from a line is thus
    io.putstring ("Enter your account id (S, C): ")
    io.readchar
    account := io.lastchar
    io.next_line

The final command, io.next_line, throws away the CR. The next input is then read from the start of the next line.

Character output is implemented in the normal way by a command io.putchar. A literal value can be output, such as io.putchar ('D'), but more often a character variable is output. Here is the code to display the user’s account choice on the screen:

    io.putstring ("Your account choice was ")
    io.putchar (choice)

Declaration, input, assignment, and output have now been covered for the four basic Eiffel types INTEGER, REAL, DOUBLE, and CHARACTER. The remaining basic type, BOOLEAN, is covered in Chapter Five.
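As an illustration, the wrong two-prompt code from this section can be repaired by following each character read with io.next_line. This is a sketch that uses only the io features named in the text; wrapping the code in a class X with a make routine is done here only to make the example complete.

```eiffel
class X
creation
    make
feature
    account, choice: CHARACTER

    make is
            -- read two single-character answers, one per input line
        do
            io.putstring ("Enter your account id (S, C): ")
            io.readchar
            account := io.lastchar
            io.next_line        -- skip the CR typed after the account id
            io.putstring ("Enter your choice for this account (D, W, S): ")
            io.readchar
            choice := io.lastchar
            io.next_line        -- skip the CR typed after the choice
            io.putstring ("Account ")
            io.putchar (account)
            io.putstring (", choice ")
            io.putchar (choice)
        end -- make
end -- class X
```

With the io.next_line calls in place, the second io.readchar reads the first character of the second input line rather than the CR left over from the first.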
2.12 Case study: data flow

The problem specification is "Money is deposited into and withdrawn from a bank account, and the balance can be displayed. Interest is added daily on the current balance; the interest rate is 4.5% a year."

Main points in this chapter

• The numeric types in Eiffel are INTEGER, REAL, and DOUBLE.
• An input command of the form io.read (such as io.readint) stores an input value in a buffer for each type of object. The last value of that type read from the user is returned from a query of the form io.last (such as io.lastint).
• An output command of the form io.put (such as io.putint) shows a data value of that type.
• An assignment statement evaluates the expression on the right-hand side, and stores the value in the variable on the left-hand side.
• Operators have a strictly defined precedence order, but this precedence order can be overridden by parentheses.
• Operators are defined on types, so an operator can have different effects by overloading.
• The basic type CHARACTER includes both printable and unprintable characters; the carriage return character CR, for example, is a valid character.
Exercises

1. How is an attribute declared? How is a constant declared? What is the difference between a literal and a constant? Can an attribute have the same name as a class?

2. What is the default or initial value of an attribute?
3. What are the names of the three numeric types in Eiffel? Is there a class for each type? Is there a text file for each type? How can you get a list of the operators defined for each type?

4. Write down a command to read in
   • an integer
   • a real number
   • a double precision real number
   • a character

5. An input command gets a value from the user, and stores it in a buffer. How do you get the value from the buffer? How many instructions are there to get a value from an input buffer?

6. Describe the general method for getting a value from the user. Write the declarations and code needed to read and store
   • two integers
   • two real numbers
   • an integer and a real number

7. Write down the command to output
   • an integer
   • a real number
   • a double precision real number
   • a character

8. When do you need a prompt? a label?

9. Explain, step by step, how an assignment statement works.

10. What is meant by operator precedence? What is the numeric operator precedence order?

11. Evaluate the following expressions, showing each step:
   • 1 + 2 * 3 / 4.0
   • 1 // 2
   • 34 // 4.5
   • 1 // 2 \\ 3
   • 1 \\ 2 * 3 // 4 / 5.0
   • -43 // 4 ^ 2
   • -((12 / 3.0) * (0 // 7) + 2)

12. Write a class X that consists of a single make routine, plus attributes. Write a make routine to read in the weight of an object in pounds, convert the weight to kilograms, and show the answer. The program should print out both the weight in pounds and in kilograms. One pound is equal to 0.453592 kilograms.

13. Write a class X that consists of a single make routine, plus attributes. Write a make routine that reads in an employee's hourly rate, the number of hours worked, and the tax rate. It then finds and shows the gross salary (before tax) and net salary (after tax) for the employee.

14. Write a class X that consists of a single make routine, plus attributes. Write a make routine that converts degrees in fahrenheit to degrees in centigrade. Fahrenheit degrees range from 32 degrees F (freezing point of water) to 212 degrees F (boiling point of water); Centigrade degrees range from 0 (freezing) to 100 (boiling).

15. Write a class X that consists of a single make routine, plus attributes. Write a make routine that finds the time and cost of a car trip. The input data is the distance covered on the trip, the average speed, the number of litres of petrol used per hundred kilometres, and the cost of a litre of petrol.

16. Write a class X that consists of a single make routine, plus attributes. The specification is:
"You have decided to become a rock concert entrepreneur and want to use your knowledge of computing to help with the accounting. Write a system that calculates your individual profit and the total attendance for a rock concert. There are three ticket prices, the cheap seats at $10, the standard seats at $20, and the special seats for $100 each. The special ticket holders get to sit in the front row, plus a pair of autographed sunglasses, plus a chance at a backstage pass. You must pay for the rent on the stadium, cost of the band, security and insurance. The security is calculated at 32 cents per person attending. The insurance is 3.6% of the income. You have two partners in this venture, and must split the profits evenly between all three partners. You must pay 12.5% tax on any profits made from the concert. Show the net (after tax) profit per person, and the total attendance at the concert."
Chapter 3: Routines

Keywords: routine, procedure, argument, function, Result

A routine is a named piece of code. When the name is encountered during code execution, the routine is called: data is passed to the routine, and the code in the routine is executed. After the routine has executed, control returns to the location where the routine was called.

Data can be passed to a routine from the caller through an argument list. The caller supplies actual values, and the formal arguments in the routine header are bound to the actual arguments, in serial order. Actual and formal arguments must agree in number, order, and type; names are irrelevant. The value of a formal argument cannot be changed.

A procedure changes one or more values and returns nothing. The only way to get data back from a routine is to use a function: a function returns a value and changes nothing.
3.1 Look and feel
A routine is a named piece of code. Consider the make routine; the name of the routine is make, and the routine contains one or more lines of code. When an Eiffel system executes, the make routine of the root class is found and executed. One routine can call other routines, so the flow of control moves from one routine to another, until all the called routines have been executed. Control then returns to the root make routine, that routine terminates, and control returns to the user. The calling diagram below shows three levels of routine calls. At the top level, the routine make is called (1). This routine contains the names of three other routines, each of which is called in turn. The read routine is called (2), the code in that routine is executed (3), it calls no other routines, so control returns to the location in make just after the read routine was called (4). The name update is then seen so the update routine is called (5), and in turn calls the routines simple and complex; neither of these call other routines. The simple routine is called, executed, and control returns to the update routine (6-8). The complex routine is then called (9), the code in that routine executes (10), control is returned to update (11), that routine finishes, and control returns to make (12).
[Calling diagram: make calls read, update, and display; update calls simple and complex. The numbers 1 to 16 on the diagram mark the order of the calls and returns described in the text.]
The make routine then calls display (13), that routine executes (14), and returns control to make (15). The make routine has now been completely executed, so the system has been fully executed and control returns to the user (16).
The code for this sequence of routine calls is shown below. Only the names of the routines, and the routine headers, are shown; routine code that does not contain a routine call is indicated by ... . The routines are shown below in a concise format; note that you cannot write Eiffel code in three columns as shown below in the illustration. When the make routine is executed, the following events occur. make calls read, then update, then display. read calls nothing. update calls simple and then complex. simple calls nothing, complex calls nothing. display calls nothing. The order of code execution is found by tracing out each routine call, and executing the code in each routine, in serial or listed order.

    make is                  update is                display is
        do                       do                       do
            read                     simple                   ...
            update                   complex              end -- display
            display              end -- update
        end -- make

    read is                  simple is                complex is
        do                       do                       do
            ...                      ...                      ...
        end -- read              end -- simple            end -- complex

3.2 Routine syntax and mechanism
A routine is called when the name or identifier of the routine is encountered in the code. Eiffel finds the routine definition in the class, and executes the code in the routine. When all the code has been executed, the routine exits and control is returned to the caller, immediately after the routine call. This code encapsulation is fundamental to code reuse, because the named code can be executed as a single instruction (the routine call). The code in the routine does not have to be re-written every time the user wants to execute it; the routine is written once, and then called as needed.
There are two kinds of routine, named a procedure and a function. The two types of routine behave differently, so care must be taken in deciding which to use. A routine definition consists of two main parts, the routine header and the routine body. The routine header defines the name and signature of the routine, by listing the type of each data value received and returned to the caller. The routine body provides the code that is executed in the routine. Comments and local variables are listed after the header and before the body. The standard layout of a routine in steps of four spaces is
• 1 step: header
• 3 steps: header comment
• 2 steps: do
• 3 steps: body (executable code)
• 2 steps: end
A routine can have several names; one name is often a shorter form of the other, more meaningful name. When this occurs, the names are simply listed in the routine header before the arguments, separated from each other by a comma. The routine can then be called by any of these names.
3.3 Procedure format and use
The format of a procedure is
    name (arguments) is
            -- description of change
        local
            declarations
        do
            routine code
        end -- name
• a name (possibly followed by other names for the procedure)
• any arguments to the procedure, enclosed in round brackets
• the keyword is
• a header comment
• any local variables
• the body of the procedure, enclosed in the keywords do and end
• the name of the procedure as a comment.
The procedure header consists of a name, followed by any input arguments in brackets, and the keyword is. The name of a procedure is a verb that describes what the procedure does; specifically, it describes the change made by the procedure. The header comment describes what the procedure does with no mention of how this behaviour is implemented in the routine body. Any local variables are then declared after the keyword local; a local variable exists only within the routine. The procedure body is then coded, first the keyword do, then the routine code, then the keyword end. The name of the procedure is then written as a comment after the end. The indentation for each part of the procedure is shown in the format above; note that the comment is indented to the level of the code, not the level of the do and end.
The following convention is used for names of procedures in the case study:
• get: read then set
• read: prompt then read with io.readX
• set: store a value, often from io.lastX
• show: show a value, possibly with label.
A procedure is a routine that changes one or more values. A procedure is used if and only if you change the value of an attribute in the procedure. If a new value is to be calculated, then a function is used to calculate and return that value. Input and output code must be placed in a procedure and not a function, because input changes the value of the relevant input buffer, and output changes the “value” of the screen.
Common error: Two routines have the same name, generating a name clash
Error code: VMFN
Error: two or more features have same name
What to do: if they must indeed be different features, choose different names or use renaming
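The get, read, set and show naming convention can be sketched as a small class. This is only an illustration in the dialect used in this text: the class ACCOUNT, its attribute balance and the prompt strings are assumptions made for the example, and io.new_line is assumed to be available on the standard io object.

```eiffel
class ACCOUNT

creation
    make

feature
    balance: REAL

    make is
            -- set the balance from user input, then show it
        do
            get_balance
            show_balance
        end -- make

    get_balance is
            -- read then set the balance
        do
            read_balance
            set_balance (io.lastreal)
        end -- get_balance

    read_balance is
            -- prompt for, then read, a balance
        do
            io.putstring ("Enter balance: ")
            io.readreal
        end -- read_balance

    set_balance (amount: REAL) is
            -- store amount as the balance
        do
            balance := amount
        end -- set_balance

    show_balance is
            -- show the balance with a label
        do
            io.putstring ("Balance: ")
            io.putreal (balance)
            io.new_line
        end -- show_balance

end -- class ACCOUNT
```

Every routine here is a procedure, because each one changes something: the input buffer, the attribute, or the screen.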
3.4 Local variables
A local variable is a variable that is declared in a routine. The format of a local declaration is simply the word local, followed by whatever declarations are needed. If there is only one local variable, the word local and the declaration are written on the same line. If there are multiple declarations, then the word local is written on a line by itself, and the declarations are written on the following lines, indented four spaces from the word local.
3.4.1 Example: a local amount
An example of a local declaration in a routine is shown below. The local variable amount is declared at the top of the routine, and used to store the input value. Once the value has been stored in this variable, it is passed as an argument to the deposit routine.
    example is
            -- show how to use a local variable
        local amount: REAL
        do
            io.putstring ("Enter amount to deposit: ")
            io.readreal
            amount := io.lastreal
            deposit (amount)
        end -- example
A local variable exists only while the routine is being executed. It is created when the routine is called, given its initial or default value, used in the routine, and destroyed when the routine exits. A local variable thus cannot be referenced outside of its routine. Compare this to an attribute. An attribute is a variable declared in a class, and exists while the system is being executed.
This notion of existence is made precise by the term scope. The scope of a variable is the part of a program where that variable can be seen or referenced. An attribute can be seen and used by any routine in its class, so the scope of an attribute is its class (but see Section 5.4). A local variable can only be used in its routine, so the scope of a local variable is its routine.
Two identifiers with the same name cannot have the same scope; this is a name clash. If you write code with a name clash, Eiffel cannot work out from the name which variable you want, so it gives up and your compilation fails. Two attributes in the same class cannot have the same name. A local variable cannot have the same name as an attribute, because they share the same scope in the routine body. Two local variables in the same routine cannot have the same name. Two local variables in different routines can have the same name, because their scopes do not overlap so there is never any confusion about which to use in which routine.
Common error: name clash between local and feature
Error code: VLRE (1)
Error: local entity has same name as feature of class
What to do: change the name of the local entity, or of the feature
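The scope rules above can be illustrated with a short sketch. The class SCOPE_DEMO and its routine names are hypothetical, chosen only for this example. Both routines declare a local variable called amount; this is legal because each amount exists only inside its own routine, while the attribute total is visible to every routine in the class.

```eiffel
class SCOPE_DEMO

creation
    make

feature
    total: REAL     -- scope: every routine in the class

    make is
            -- add a deposit to the total, then charge a fee
        do
            add_deposit
            charge_fee
        end -- make

    add_deposit is
            -- add a fixed deposit to the total
        local amount: REAL     -- scope: add_deposit only
        do
            amount := 100.0
            total := total + amount
        end -- add_deposit

    charge_fee is
            -- subtract a fixed fee from the total
        local amount: REAL     -- a different variable; scope: charge_fee only
        do
            amount := 2.5
            total := total - amount
        end -- charge_fee

end -- class SCOPE_DEMO
```

Starting from the default value 0.0, the attribute total holds 97.5 when make finishes. Renaming either local variable to total would produce the name clash described above.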
3.4.2 Local or attribute?
There are now two ways to store data, in an attribute and in a local variable, so we must ask “When should you use an attribute, and when should you use a local?”. The answer is simple: use a local variable whenever you can. When writing code, start by coding every data value as a local variable. If a value is used by two routines, then you are forced to store the value as an attribute. Attributes are there to store the state of the object, not to make the code efficient. The number of attributes should be kept as small as possible, and the way to do this is to use local variables wherever this is possible.
3.5 Passing data to a routine
When the name of a routine is encountered during code execution, control is transferred to that routine; we say the routine is called. It is possible to pass data values from the caller to the called routine, to be used inside the called routine. A data value passed to the routine is called an argument to the routine. The data supplied by the calling routine are the actual arguments, the actual values used in that routine call. The variables that store this data in the routine are called formal arguments, because they define the formal behaviour (signature) of the routine. A formal argument is a local variable. An example of the calling and the called code with arguments is shown below. The make routine (among other things) reads in a value from the user, stores the value in the local variable this, and passes this as an actual argument to the procedure add. The procedure receives the actual argument, and stores its value in the new, local variable new; new is the single formal argument to the routine add. The procedure then changes the value of sum, by adding the new number to it.
feature
    sum: REAL

    make is
        local this: REAL
        do
            ...
            this := io.lastreal
            add (this)
        end -- make

    add (new: REAL) is
            -- add the new number to the sum
        do
            sum := sum + new
        end -- add
Data is passed from the caller to the called routine through an argument list. The calling code supplies a set of values to be passed; call these the actual arguments. The routine header has a matching list to receive these values; call these the formal arguments. The formal argument list in the routine header consists of one or more variable declarations; declarations of different types are separated by semi-colons. Some examples of routine calls and the routine headers with arguments are
    deposit (43.60)                -- actual argument is the value 43.60
    deposit (number: REAL) is      -- one formal argument of type REAL

    gcd (45, 35)                   -- actual arguments are the values 45 and 35
    gcd (this, that: REAL) is      -- two formal arguments of type REAL

    do_something (42, 64, 83.7, 0.0001, 'y')
            -- actual arguments are the five values ...
    do_something (a, b: INTEGER; c, d: REAL; e: CHARACTER) is
            -- five formal arguments of types ...
When the routine is called, each formal argument is bound to the corresponding actual argument. Argument binding is simple: each formal argument is created as a local variable and given the value of the actual argument, when the routine is called. The first formal argument is bound to the first actual argument, the second formal argument is bound to the second actual argument, and so on until all the arguments are bound. Binding is done purely on the order in which the arguments occur. The name is irrelevant; only the shape, defined by the argument list, matters when the actual and formal arguments are bound. The actual and formal arguments must therefore agree in number, order and type or the two argument lists cannot be bound. Once a formal argument has been bound, you are not allowed to change its value; if you try, the system will not compile.
The calling code supplies values to the routine, to be used in the routine.
The type of the value that is passed is defined in the routine header by the formal argument, and this is the only constraint on the argument. The calling code can supply a literal, a constant, a variable, an expression, or a function as its value; all the called code cares about is that the supplied value be of the defined type. A routine header and several legal routine calls are shown below:
    add (number: REAL) is

    add (3)                        -- INTEGER literal, converted to heavier REAL
    add (3.6)                      -- literal of type REAL
    add (Pi)                       -- constant of type REAL
    add (this)                     -- variable of type REAL
    add (this + 32 - 4 * that)     -- expression that evaluates to a REAL value
    add (sqrt (this))              -- function that returns a REAL value
Common error: a formal argument has the same name as a feature in the class, a name clash
Error code: VRFA
Error: Formal argument has same name as feature of the class
What to do: Change the name of the argument, or that of the feature

Common error: a formal argument has the same name as a local in the feature, a name clash
Error code: VRLE (2)
Error: local entity has same name as formal argument of the same routine
What to do: Change name of local entity, or of argument

Common error: the number of actual and formal arguments do not match
Error code: VUAR (1)
Error: wrong number of actual arguments in feature call
What to do: make sure that number of actuals matches number of formals

Common error: the type of the actual and formal arguments does not match
Error code: VUAR (2)
Type error: non-conforming actual argument in feature call
What to do: make sure that type of actual argument conforms to type of corresponding formal argument.
Explanation: “conform” means roughly “of the same type”; for the moment, assume that it means “same or heavier type”. Two variables of the same type conform, INTEGER conforms to REAL, and REAL conforms to DOUBLE, so INTEGER conforms to DOUBLE. A more precise definition is given in Section 11.2.
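Binding by position, not by name, can be seen in a small sketch. The class BINDING_DEMO and the routine show_pair are hypothetical names used only for this illustration, and io.new_line is assumed to be available. The formal arguments of show_pair deliberately reuse the names of the locals in make, in the opposite order, to show that the names play no part in binding.

```eiffel
class BINDING_DEMO

creation
    make

feature
    make is
            -- call show_pair with actual arguments of several kinds
        local first, second: REAL
        do
            first := 12.6
            second := 3.0
            show_pair (first, second)      -- formal second = 12.6, formal first = 3.0
            show_pair (second, first)      -- formal second = 3.0, formal first = 12.6
            show_pair (first + 1.0, 42)    -- an expression, and an INTEGER literal
        end -- make

    show_pair (second, first: REAL) is
            -- show the two values in the order received
        do
            io.putreal (second)
            io.putstring (" ")
            io.putreal (first)
            io.new_line
        end -- show_pair

end -- class BINDING_DEMO
```

In every call the first actual argument is bound to the first formal argument (here called second), whatever the variables happen to be named in the caller. In the last call the INTEGER literal 42 conforms to REAL, so the call is legal.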
3.6 Functions
A function has a type and returns a value, like an attribute. A function calculates a value and changes nothing, so a function is used when you calculate a new value from existing values.
3.6.1 Syntax and mechanism
The format of a function is
    name (arguments): TYPE is
            -- description of value
        local
            declarations
        do
            routine code
            Result := expression
        end -- name
• a name (possibly followed by other names for the function)
• any arguments to the function, enclosed in round brackets
• the type of value returned by the function, written after a colon and a space
• the keyword is
• a header comment
• any local variables
• the body of the function, enclosed in the keywords do and end
• the name of the function as a comment.
A function header starts with the function name, followed by any input arguments. The name of a function is a noun that describes the value returned by the function. The type of the returned value is then given, preceded by a colon. In contrast, a procedure has no return type because a procedure does not return a value. The function header is terminated by the keyword is. The function comment is then written, followed by any local declarations and the function body enclosed in do and end. The indentation for each part of a function is shown in the format above.
When a function is called, its formal arguments are bound to the actual arguments. In addition, a special local variable called Result is created to contain the function’s value; the type of Result is given in the header as the returned type. On entry to the function, Eiffel creates a variable of that type with the appropriate initial value. At some point in the function body, this variable is usually given a more useful value. If the value is not changed, then Result still has its initial, default value. When the function returns control to the caller, the value of the function is whatever value is stored in Result. The variable Result is like any local variable, except that its value on exit is the value of the function. You can use the variable to store a value, or to provide a value; in particular, you can do such things as Result := Result + value.
Several functions have conventional names in the case study. These are
• valid: a BOOLEAN function that returns true if a value is valid
• finished: a BOOLEAN function that returns true if the user has chosen to finish
Many standard arithmetic functions, such as sqrt (square root) and sine (sine of an angle), are functions that receive a single value and return a single value. Examples of function calls are shown first below, followed by the function headers. Without knowing anything about how these functions are implemented, the function headers can be defined for these routines, because the signatures are known; a single REAL value is passed in, and a single REAL value is passed back.
    length := sine (30)
    answer := sqrt (36 / 7.4)
    io.putreal (sqrt (hypotenuse))

    sine (value: REAL): REAL is
    sqrt (value: REAL): REAL is
The routines are defined in the Eiffel library class SINGLE_MATH, which supplies features for single precision mathematics. Both arithmetic functions above return a REAL value, so the functions must include the code shown below, as well as any other code needed to actually calculate the result.
    sqrt (value: REAL): REAL is
            -- square root of value
        do
            ...
            Result :=
        end -- sqrt

    sine (angle: REAL): REAL is
            -- sine of the angle
        do
            ...
            Result :=
        end -- sine
When you code a routine, you can write the header name, the arguments and the outline body as shown above, without thinking about what happens inside the routine. Whatever actual code is written in the routine definition can then be added later.
Common error: Use Result in a routine with no return type listed in the routine header
Error code: VEEN (2)
Error: Illegal use of Result
What to do: Remove use of Result, or make sure that context is body, post-condition, or rescue clause of a function (not procedure, class invariant, or pre-condition).

Common error: Call a function without using the returned value
Error code: VKCN (1)
Error: Function call used as instruction
What to do: Call a procedure rather than a function, or keep the function but use the call as expression rather than instruction.
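The behaviour of Result described in this section can be sketched with two small functions. The class RESULT_DEMO and its feature names are hypothetical, and io.putint and io.new_line are assumed to be available on the standard io object. The first function both stores into and reads from Result; the second never assigns to Result, so it returns the default value for its type.

```eiffel
class RESULT_DEMO

creation
    make

feature
    make is
            -- show the values of the two functions
        do
            io.putreal (sum_of_squares (3.0, 4.0))
            io.new_line
            io.putint (untouched)
            io.new_line
        end -- make

    sum_of_squares (a, b: REAL): REAL is
            -- a squared plus b squared
        do
            Result := a * a
            Result := Result + b * b
        end -- sum_of_squares

    untouched: INTEGER is
            -- the default value of Result, because nothing is assigned
        do
        end -- untouched

end -- class RESULT_DEMO
```

Here sum_of_squares (3.0, 4.0) has the value 25.0, and untouched has the value 0, the default for an INTEGER. Both calls are used as expressions inside an output instruction, which avoids the VKCN error listed above.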
3.6.2 Function or attribute?
Now we have seen two ways to get a value, by storing it in an attribute and by calculating it in a function. When should you use which? The answer is simple: use a function whenever you can. If a value can be calculated from other values, then you should use a function. If you store it in an attribute and the input arguments later change, then the stored value is obsolete and incorrect. If you need to calculate a value and use it immediately, then use a function. Keep the number of attributes small, by using local variables and functions wherever possible. Only if you cannot use a local or a function should you create a new attribute.
One of the most powerful methods to keep the attributes hidden is to strictly define the behaviour. Often, what might at first glance look like a need for exporting an attribute is really a behaviour. In a class WALL, for example, there may be a test to see if the size of a door is greater than some value. The obvious code to do this is something like
    if height > limit then ...
The correct way to test if the door is high enough is shown below. Logically, the client wishes to know if the door is high enough for some use; this should be implemented as a test to see if the door is high enough, as a function in the class DOOR. With this technique, the height of the door is not known to the client, so the attribute can be hidden. As an added advantage, the meaning of the test is now obvious from reading the code.
class WALL
...
    if door.higher (limit) then ...
end -- class WALL

class DOOR
...
feature {NONE}
    length, height: REAL
...
feature {WALL}
    higher (limit: REAL): BOOLEAN is
            -- is the door higher than limit?
        do
            Result := height > limit
        end -- higher
3.7 Comments
A comment is used in a program to describe the code and thus to help the reader of the code understand it. The syntax of a comment is two minus signs, followed by a text string; the convention is to place a single space between the comment marker (“--”) and the text. The mechanism is that Eiffel ignores anything to the right of the comment marker, so a comment has no effect on the code execution. A comment adds information to the code, so it should mention nothing that can be seen by glancing at the code. The language of a comment is clear, simple, active, and present tense.
An attribute comment is placed on the same line as the attribute. It is unusual to comment an attribute, however, because the name of the attribute conveys all the meaning that is needed; good names reduce the need for comments. If you need a comment to describe an attribute, then don’t start it with “This attribute ...”; the convention is to place a comment on the same line and to the right of its attribute, so saying “This attribute” adds nothing and wastes space.
A routine header comment describes the effect of the routine. A header comment says what the routine does, and should say nothing about how it is done. The header comment describes routine behaviour, not implementation. In a header comment, don’t say:
1. “Try to ...”; a routine does something, it doesn’t “try”, “hope”, or “attempt”.
2. “ ... will ...”; a routine does something when it executes, so this is redundant.
The comment of a procedure describes the change made by the procedure. Don’t say:
1. “This procedure ...”; it is obvious at a glance that it’s a procedure, so this is redundant.
The comment of a function describes the value returned by the function. Don’t say:
1. “This function ...”; it is obvious at a glance that it’s a function, so this is redundant.
2. “Return”, “Find”, or “Calculate”; this is what a function does by definition.
A comment inside a routine is unusual in Eiffel, because the routines are small, and the routine and variable names are carefully chosen to carry a lot of the meaning. Many comments simply repeat the code information, such as “-- add 3 to sum”, so they are redundant, clutter up the listing, and can be deleted to improve the listing. Large comments reveal a flaw in design: if a chunk of code needs a long explanation, it has probably been designed and named badly and can be cut into smaller, simpler, more meaningful pieces.
3.8 Cause and effect routines
It is possible for a function to change an attribute; the code to do so can easily be written as part of the function's body. Avoid this. Don't do it. Such an action contradicts the whole idea of the function: a query (function or attribute) changes nothing. This style of programming is called programming by side-effect. You call a function to return some value, and the code in the function returns the value, but also changes something "on the side". This can lead to code that is immensely hard to test and understand, because you are changing a value without admitting that you are doing so. If you admitted it explicitly, the function would be split into a procedure and a function, and your code would be clean and wholesome.
Avoid side-effects in Eiffel. Other languages, especially C++, use side-effects as a basic programming tool, but it can be argued with some force that such a practice is error-prone, and creates complex and non-reusable code. The argument can be made formal when assertions are used to enforce programming by contract, discussed in the fourth section of this book; C++ does not use assertions.
A seductive piece of code to write in Eiffel is a function that accepts a string, uses the string as a prompt, reads in a value from the user, and returns that value. This small routine allows you to ignore the tedious sequence of events needed to get user input. Of course, you actually need several functions, because each will return only one type of value (real, integer, character, or string). The (bad and politically incorrect) code looks like
    get_real (prompt: STRING): REAL is
            -- get a value from the user and return it
        do
            io.putstring (prompt)
            io.readreal
            Result := io.lastreal
        end -- get_real
The problem is that this code is a procedure masquerading as a function. It is a function, because it returns a value. It is a procedure, because it changes the 'value' of the terminal screen, and also the value of the feature io.lastreal. Another routine might wish to use the last input value, and expects that the value has not changed; a user would expect the value to still be accessible, because a function was called and functions change nothing, so the value should be unchanged. This routine uses programming by side-effect, by changing a value inside a function, so it is misleading and error-prone.
A function with a side effect is actually two routines in disguise; a procedure to make a change, followed by a function to report the effect of the procedure. This is the solution that Eiffel uses for input; a command io.read<X> that reads a value from the user, and a function io.last<X> that returns the value read in by the procedure. Unfortunately (or fortunately, depending on your perspective), this is only slightly less clumsy than writing the straight code. A strong implication of no side-effects is that a function usually cannot call a procedure, because a procedure changes one or more values. Formally, a function can change the state of the world as long as it replaces the state afterward; the state of the world is then the same on entry to, and on exit from, the function. A function can set and change the values of local variables, because they do not exist outside of the function; looking from outside the function, nothing has changed. A procedure can call a function and use its value, because that is not a side-effect.
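The split into a procedure and a function can be sketched for the input example. The class INPUT_DEMO and its features amount, read_amount and doubled are hypothetical names chosen for this illustration, and io.new_line is assumed to be available. The procedure makes all the changes (screen, input buffer, attribute); the function only reports a value calculated from the attribute.

```eiffel
class INPUT_DEMO

creation
    make

feature
    amount: REAL

    make is
            -- read an amount, then show twice that amount
        do
            read_amount ("Enter amount: ")
            io.putreal (doubled)
            io.new_line
        end -- make

    read_amount (prompt: STRING) is
            -- prompt for a value and store it in amount
        do
            io.putstring (prompt)
            io.readreal
            amount := io.lastreal
        end -- read_amount

    doubled: REAL is
            -- twice the amount
        do
            Result := 2.0 * amount
        end -- doubled

end -- class INPUT_DEMO
```

Calling doubled any number of times changes nothing, so a reader can rely on the rule that a query has no effect. The value is stored in an attribute here because two routines use it.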
3.9 Once routines
The power of an OO system comes from encapsulating the data and the processing, so each object has its own data that is used and changed by the routines in the class. Sometimes, however, there is a need to use the same data across many objects; in a procedural language, this is done by global variables, that can be seen across the system. In Eiffel, global variables are implemented by once routines. A routine may be defined so that it executes only once, no matter how many times it is called. Once routines allow a value to be initialised once, and then shared across objects. A once routine is defined by replacing the keyword do with the keyword once at the start of the routine body. A once procedure may be called many times, but the code inside the procedure only executes the first time that it is called; subsequent calls have no effect. A once function executes its code the first time it is called and returns a value, and all subsequent calls return that same value. Once routines are useful for initialising values the first time a structure is used, and for shared information. An example of a once routine is given in Section 9.7. The input/output system uses a once routine so that all objects share the same I/O system. The object io is of type STANDARD_FILES. Every client that calls this object should use the same I/O system, so the creation routine for the I/O system is executed once, and the same I/O system is used by all subsequent calls.
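A once function that shares one object between many clients can be sketched as follows. This is a minimal illustration under stated assumptions, not the example from Section 9.7: the classes COUNTER, CLIENT and ONCE_DEMO and all their feature names are hypothetical, io.putint and io.new_line are assumed to be available, and in practice each class would sit in its own file.

```eiffel
class COUNTER

feature
    count: INTEGER

    increment is
            -- add one to the count
        do
            count := count + 1
        end -- increment

end -- class COUNTER

class CLIENT

feature
    shared: COUNTER is
            -- the one counter used by every CLIENT object
        once
            !!Result
        end -- shared

    use is
            -- record one use on the shared counter
        do
            shared.increment
        end -- use

end -- class CLIENT

class ONCE_DEMO

creation
    make

feature
    make is
            -- use the shared counter from two different objects
        local a, b: CLIENT
        do
            !!a
            !!b
            a.use
            b.use
            io.putint (a.shared.count)
            io.new_line
        end -- make

end -- class ONCE_DEMO
```

The body of shared executes on the first call only, so a and b both receive a reference to the same COUNTER object, and the value shown is 2. With do in place of once, every call to shared would create a fresh counter and the value shown would be 0.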
3.10 Listing order
Eiffel does not care what order features are listed in the class definition. To compile a system, it uses the complete feature name in the routine call to look up the feature definition. This is fine for a computer, but people need more support, so a set of conventions has arisen that makes code easier to read. The basic approach is to think local: provide the information needed to understand the code either in the code, or with the code. A good order to list the class features is to divide the class into attributes (with their set and show routines), followed by exported routines (in calling order); export policies are discussed in Chapter 5. In more detail, the convention is
1. List the attributes at the top of the class, under a feature clause feature {NONE}. Under each attribute, place the routines that set and show that attribute; these are usually private features, like the attributes.
2. List the other routines. For routine calls within a class, list the calling routine before the called definition, and list the called routines in calling order. For routine calls between classes, list the exported supplier features in called order. Under each exported routine, list the private routines that are called by that routine.
If both writer and reader use this convention, it is easy to find a routine in the code listing.
3.11 Case study: routines
The case study from last chapter is extended in Part 3 to show how routines are used to group and organise code. The problem specification is unchanged, but the solution is much improved by making the code modular and thus reusable.
Main points in this chapter
• A routine is called by listing the name of the routine, plus any arguments. When Eiffel encounters the name, it transfers control to the routine and executes the routine's code. When the code has been executed, control returns to the caller.
• A routine receives data values through the argument list.
• Actual and formal arguments must agree in number, order and type; names are irrelevant. A formal argument is a local variable, but its value cannot be changed once it is bound.
• A procedure returns nothing. A function returns a value by assigning it to the special variable Result in the function.
• A routine may be defined to execute once only. For a once procedure, subsequent calls have no effect. For a once function, subsequent calls return the same value as the first call.
• The order of routines in a code listing reflects the control flow, to help the next person read and understand the system.
Exercises
1. Describe the format of
• a procedure
• a function
2. Describe the signature of
• an attribute
• a procedure
• a function
3. How is a routine called? What happens when a routine is called?
4. How is data passed to a routine from its caller? How is data returned by a routine?
5. Describe the format of an argument list. Define what is meant by
• actual argument
• formal argument
• argument binding
6. How is a local variable declared? What is the scope of a local variable? What is the scope of an attribute? Can a local variable have the same name as an attribute of the class? Can a local constant be declared?
7. Is Result a local variable? What is the initial value of Result in a function? Can the value of Result be used inside a function?
8. What type of routine is used to read from a user? Why? What type of routine is used to show data to the user? Why? What is a side effect? Why are side effects bad for reuse?
9. When should you use a function, and when a procedure?
10. Can a function call a function? Can a function call a procedure? Can a procedure call a function? Can a procedure call a procedure? Why?
11. What are the values of the formal arguments in the following examples? Assume that the class contains the following declarations and code:
a)
        local a, b: REAL
        do
            a := 12.6
            do_it (a, a + 32)
            do_it (b, a)
            do_it (a, b)
        end --

    do_it (b, c: REAL) is ...

b)
        local
            a, b: REAL
            p, q: POINT
        do
            !!p.make_input
            do_that (p, q, a)
            !!q.make_input
            do_that (q, p, 134/6*3-1)
        end --

    do_that (a, b: POINT; p: REAL) is ...
12. What values are returned by each call to the routine wonder below?
    wonder (about, this: REAL): REAL is
            -- wonder what this does?
        local hero: INTEGER
        do
            hero := about.truncated_to_integer + 4
            Result := hero * this
        end -- wonder

    wonder (3, 4)
    wonder (96.2, 17.8)
    wonder (1, 2, 3)
    wonder ("about", "this")
    wonder (sqrt (3), abs (-12))
13. This is an exercise on syntax and mechanism; do not worry about style. The class X has a REAL attribute called number, a make routine, and four other routines. The first routine (set) receives the initial value of number as an argument. The second routine (add3) adds 3 to the number and displays the new value. The third routine (add) takes an integer as argument, adds this to number, and displays the new value. The fourth routine takes an integer and two strings, and displays the first string, then the sum of number and the integer, then the second string. Define the signature of every feature. Code the class.
14. This is an exercise on syntax and mechanism; do not worry about style. Add two new features to class X. The first feature (square_num) returns the square of number. The second feature (formula) accepts two integer values and a real value (call the arguments i, j, a) and returns the value ((number + i) / j) * a. Define the signatures of these routines. Code the header for both routines, then the body of each routine. Add code to the class X to call each function and to display the value returned by each feature.
15. Code a solution for the specification given in Chapter 2, question 12, using routines.
16. Code a solution for the specification given in Chapter 2, question 13, using routines.
17. Code a solution for the specification given in Chapter 2, question 14, using routines.
18. Code a solution for the specification given in Chapter 2, question 15, using routines.
Chapter 4: Objects
Keywords: creation, object, value, reference, equal, copy, clone
An object is an instance of a class, and a class is a set of variables and their routines. Each object has its own variables, and all objects share the routines of their class. An object is created by the creation command !!. A creation routine can be called if needed, to give the variables a more useful value; if the default values are useful, no creation routine is needed. The value of a variable may be a simple value that can be used immediately, or it may be a reference to an object. For two references, we must distinguish between the same reference value (point to the same object) and the same content value (different objects, same content); two copies of this book, for example, contain the same content but are not the same object.
4.1 Object creation
Objects are the second fundamental way to reuse code in Eiffel. A class is defined as a set of variables and a set of routines, and this class defines the template or appearance of an object. When an object is created, it has its own set of variables, and can use the routines defined in the class that set and use these variables. The class definition is written once, and then objects are created. Once a bank ACCOUNT class has been defined, for example, we can create and use 10,000 actual bank accounts, because they all behave in the same way. This is possible because of two things: object creation and class encapsulation. Object creation is presented below in several stages. First, the code in the client class used to create an object is shown, then the code in the supplier class that defines the object template (the class). Second, the data structure of the object, after it has been created, is shown. Finally, the mechanism for how the code creates this data structure at run-time is shown.
4.1.1 Creation code

The creation command is two exclamation marks !!. The term “exclamation mark” is cumbersome to say, so the term “bang” is often used in computing instead, and we can say the creation command as “bang bang”. The code in the client class declares an object and then creates it with a creation instruction, a creation command followed by the name of the object. The type of the object is found from the declaration, the class definition from the type, and the variables and routines are found in the class definition. To create two points, for example, we need the following code in the client class:
    class LINE
    creation
        make
    feature
        left, right: POINT

        make is
                -- create two points
            do
                !!left
                !!right
            end -- make
        ...
    end -- class LINE

This partial class definition shows that a LINE has two attributes of type POINT, which are the left and right ends of the line (we can’t name them start and end, because end is a reserved word; see Appendix B). The make routine in class LINE consists of two creation instructions (of the form !!name). When the make routine is executed, two objects of type POINT are created. Eiffel finds the definition of class POINT, creates two point objects and gives the basic variables in each point the relevant default values. A partial definition of the supplier class POINT is

    class POINT
    feature
        x, y: REAL
        ...
    end -- class POINT

An object of type POINT has two attributes, two real numbers with names x and y, that store the location of the point. The default value of a real number is 0.0. The internal data structure and initial values of a point are thus
    name    type    value
    x       REAL    0.0
    y       REAL    0.0
4.1.2 Data structure

Consider the first creation instruction in the make routine of class LINE, !!left. After the point has been created, the identifier left refers to a composite object, a point, where a point consists of two real numbers. The identifier left is thus a reference, because it refers to an object with its own internal structure. The initial or default value of a reference type is Void, a special value that indicates the identifier does not (yet) refer to any object. When an object is created, the location of that object is placed in the reference variable. Formally, the value of the identifier left is a pointer to some location in memory, the location where the object’s data is stored. After the first point has been created, the data structure of a line is as follows, where the value of the identifier left is a reference to memory, shown here in hexadecimal notation.

    data in a line
    name     type     value
    left     POINT    A032F440
    right    POINT    Void

    data in a point (stored at A032F440)
    name     type     value
    x        REAL     0.0
    y        REAL     0.0

A second point is created by the second creation instruction, and after both creation instructions have been executed the make routine terminates. The structure of a line after termination of the creation routine make in the client class LINE is shown below.
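    data in a line
    name     type     value
    left     POINT    A032F440
    right    POINT    A032F460

    data in the point stored at A032F440
    name     type     value
    x        REAL     0.0
    y        REAL     0.0

    data in the point stored at A032F460
    name     type     value
    x        REAL     0.0
    y        REAL     0.0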
All of this is initiated when a client of LINE creates a line, or LINE is the root class of the system. When Eiffel starts executing a system it creates an object for the root class, and then executes the creation routine make in the root class. The creation routine can then create other objects, and these in turn create other objects, as needed. While the final data structure of the system can be very complex, it can be found by drawing a data structure diagram from the data declarations in each class and then tracing out the client-supplier links for the reference types.
4.1.3 Creation procedure

The final part of the mechanism is the creation procedure. The name of the creation procedure for a class is written under the creation keyword, at the top of the class after the class name; by convention, the name of a creation routine is make. A creation routine has a creation policy that lists the classes that can call the routine as a creation routine. The format of a creation clause is

    creation {CLASS} name

If no class is listed after the keyword creation, the creation policy is ANY; like export, multiple classes may be listed in the policy. A class may have multiple creation routines, in which case they are all listed under the creation keyword, separated by commas. If different classes can create an object using different creation routines, then the creation clause is repeated for each creator class.

A creation routine is used if the default values are not enough. In the class POINT, for example, we might add a creation routine to read in the location of the point from the user when the point is created, instead of always creating a point at location (0.0, 0.0). If the default values are good enough, then no creation routine is needed. If there is no creation routine, then the creation keyword is omitted from the class listing. If there is a creation routine, then it must be used when the object is created.

A creation routine make has been added to class POINT in the listing below; it reads in two values from the user and stores them in the x and y attributes. A good convention in writing creation procedures is to have only routine calls in the procedure. This way the make routine can be changed as needed to use more or different routines, without re-writing any of these specific called routines. The POINT creation routine thus looks like:
    class POINT
    creation
        make
    feature
        x, y: REAL

        make is
                -- read the x and y values from the user
            do
                get_x
                get_y
            end -- make

        get_x is
                -- read the x value from the user, store it
            do
                io.putstring ("Enter the x value: ")
                io.readreal
                x := io.lastreal
            end -- get_x

        get_y is
                -- read the y value from the user, store it
            do
                io.putstring ("Enter the y value: ")
                io.readreal
                y := io.lastreal
            end -- get_y
        ...
    end -- class POINT

To use the creation routine, the name of the routine is added to the creation instruction. A client of class POINT calls the creation routine by using “dot notation”: a dot (fullstop, period) is written after the object identifier, followed immediately by the name of the routine (no intervening space). The code to create a line at a specific location, by creating two points at specific locations, is contained in the make routine of the client class LINE:
    class LINE
    creation
        make
    feature
        left, right: POINT

        make is
                -- create two points
            do
                !!left.make
                !!right.make
            end -- make

A creation routine can be used as a normal (non-creation) routine simply by omitting the creation command !!. In this case, for example, we could create a point (at some location) and then re-initialise the same object so it has another location:

    !!left.make    -- create a new object called left
    left.make      -- read in new values for x and y, same object

The same object (left) is here used in the second line with no creation command, so no new object is created; the x and y values of the object created by the first line are used and changed by the second line of code.
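The creation clause described in this section can list more than one routine, and the clause can be repeated with a different creation policy. The listing below is a minimal sketch of that idea only: the routine name make_at, its argument names, and the choice of {LINE} as the creator class are all hypothetical and are not taken from the text.

```eiffel
-- Sketch only: make_at and the {LINE} policy are assumptions for illustration.
class POINT

creation
    make                -- no class listed, so the creation policy is ANY

creation {LINE}
    make_at             -- hypothetical: only LINE may create a point this way

feature

    x, y: REAL

    make is
            -- read the x and y values from the user
        do
            get_x
            get_y
        end -- make

    make_at (new_x, new_y: REAL) is
            -- hypothetical creation routine: place the point at (new_x, new_y)
        do
            x := new_x
            y := new_y
        end -- make_at

    get_x is
            -- read the x value from the user, store it
        do
            io.putstring ("Enter the x value: ")
            io.readreal
            x := io.lastreal
        end -- get_x

    get_y is
            -- read the y value from the user, store it
        do
            io.putstring ("Enter the y value: ")
            io.readreal
            y := io.lastreal
        end -- get_y

end -- class POINT

class LINE

creation
    make

feature

    left, right: POINT

    make is
            -- create one point from user input and one at a fixed location
        do
            !!left.make
            !!right.make_at (3.0, 4.0)
        end -- make

end -- class LINE
```

Under these assumptions, any client could write !!p.make, but only LINE could write !!p.make_at (3.0, 4.0).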
4.1.4 Creating an object

When an object is created, a series of things happen. First an area of memory for the new object is allocated by the operating system, then the attributes in the new object are set to their default values. The object's creation routine is then executed if it exists, and finally a pointer to the allocated storage is attached to the object identifier in the client. An object has to be declared before it can be created, so Eiffel knows how much storage to allocate for an object of that type. Creating an object results in the following events:

1. Allocate storage for the attributes of the object. For the class POINT, there are two real-valued attributes, so enough storage for two REAL numbers is allocated.
2. Set the attributes to their default values. When attributes are created, they are set to a default value. INTEGER, REAL, and DOUBLE numbers are set to zero initially, a CHARACTER to the null value (''), a BOOLEAN to the value false, and a reference variable to Void.
3. Run the creation routine if it is defined. The creation routine usually sets the attributes of the object to some more specific value. In the POINT class, for example, the make routine sets the attributes to the initial location of the point, wherever that location might be.
4. Set a pointer from the name to the storage. Every attribute has a value. For the basic attributes (INTEGER, REAL, CHARACTER, BOOLEAN), this value is the stored value of that field. For objects, the value is a pointer to the allocated storage for that object.

There are three variants of this process:

    variant                                 calling code    events
    create object, no creation routine      !!name          1, 2, 4
    create object, with creation routine    !!name.make     1, 2, 3, 4
    change object, use creation routine     name.make       3

If there is a creation routine for a class, then the name of the creation routine is written under the creation keyword in that class, the creation routine has to be defined, and the creation routine must be called. When a creation routine exists, you cannot create an object without calling the creation routine.

To really understand the syntax and mechanism of object creation, you need to understand the words used to describe each part of the object creation code. The words and their code are:

    term                    code
    creation command        !!
    creation instruction    !!identifier  or  !!identifier.make
    creation keyword        creation
    creation routine        make is ...
If you try to use an identifier that does not refer to anything (usually because you forgot to create it) then Eiffel flags an error for using a "Void reference"; formally, a void reference is found when you try to use a reference type whose value is Void. An object of a basic type need not be created, but an object of a reference type has to be created or assigned a value.

Common error: Try to use !!object form when a creation routine is defined for the class.
Error code: VGCC (5)
Error: Creation instruction should include call, but does not
What to do: Since the corresponding base Class lists creation procedures, use form of Creation instruction which includes call to one of them

Common error: Try to use a creation routine without a creation export
Error code: VGCC (6)
Error: Creation instruction uses call to improper feature
What to do: Make sure that feature of call is a creation procedure, is not ‘once’, and is available for creation to enclosing class.
4.1.5 Using an object

To use an object, you must do four things:

    1. Define the supplier class          class POINT ...
    2. Declare the object in a client     p: POINT
    3. Create an object in the client     !!p.make
    4. Use the object in the client       p.move (1.0, 1.0)
Common error: A Void reference when you try to use the identifier (step 4) without creating the object. Your system compiles, but then crashes at run time when Eiffel tries to use the identifier. The value of the identifier is Void, so the identifier does not refer to any object.
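The four steps, and the Void value that exists between steps 2 and 3, can be sketched in one small system. This is an illustration under stated assumptions rather than code from the text: the class name DEMO is hypothetical, the make routine of POINT is simplified so that it needs no user input, and move is assumed to shift the point by the given amounts.

```eiffel
-- Step 1: define the supplier class.
class POINT

creation
    make

feature

    x, y: REAL

    make is
            -- simplified for this sketch: start at the origin
        do
            x := 0.0
            y := 0.0
        end -- make

    move (dx, dy: REAL) is
            -- assumed meaning of move: shift the point by dx and dy
        do
            x := x + dx
            y := y + dy
        end -- move

end -- class POINT

class DEMO      -- hypothetical client (and root) class

creation
    make

feature

    p: POINT    -- step 2: declare the object; its value is Void

    make is
        do
            if p = Void then
                io.putstring ("p is Void until it is created%N")
            end
            !!p.make            -- step 3: create the object
            p.move (1.0, 1.0)   -- step 4: use the object
        end -- make

end -- class DEMO
```

If the creation instruction in step 3 were left out, the call p.move (1.0, 1.0) would be made on a Void reference and the system would fail at run time, as described above.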
4.2 Calling a feature from a client
A feature is an attribute or a routine. A feature can be called from within the same class, by writing its name and arguments. A feature can be called from a client class, by writing the name of the object, a dot, and the feature name and arguments. The action is identical in both cases: if the feature is an attribute then its value is returned, if the feature is a function then its code is executed and its value returned, and if the feature is a procedure then its code is executed.

The general form of a feature call is object.feature (pronounced "object dot feature"). If there is a feature call without an object, then Eiffel assumes the current object is being used and looks inside the current object for the feature definition. A call within the same class thus has the implicit form Current.feature; the reserved word Current denotes the current object.

When a feature is called, the Eiffel compiler looks at the object on which it is called. The name of the object is located to the left of the feature call, on the left side of the dot, or is Current. Given the name, Eiffel can find the type of the object from the variable declaration. Given the type, it can find the class definition. Given the class definition, it looks inside the class for a feature with that name. There is never any confusion; a feature is called on an object of some type, and that class must contain a feature of the correct name. The sequence of events for the feature call object.feature is:

1. find the object to the left of the dot; if there is no dot, use Current
2. find the class of this object from the declaration
3. find the feature defined in that class definition
4. execute the feature on the object

For the call left.move (4.2, 6.9), the code involved at each step is:

    step 1      left.move (4.2, 6.9)                        the object is left
    step 2      left: POINT                                 the declaration gives the class POINT
    step 3, 4   class POINT feature move (dx, dy: REAL) is ...   the feature move is found in POINT and executed on left
It is possible to nest feature calls, so that a feature is called remotely by a class that is not a direct client; the general form of the call is "object.feature1.feature2...". A remote call such as this is evaluated left to right. The first feature is called on the object, and returns a value. The second feature is then called on this value, and returns another value. The third feature is then called on this value, and so on. Every feature in the sequence except the last must be a query, so that an object reference is returned by each feature. Each value is used to call the next feature, until the end of the chain; the last feature may be a query or a command.
The same result can be achieved by storing each value returned from successive calls, but temporary variables are then needed to store each returned object. A sequence of feature calls is shown below, followed by the equivalent nested feature call; both these code fragments have the same behaviour when executed.

    b := a.feature1
    c := b.feature2
    d := c.feature3
    d.feature4

    a.feature1.feature2.feature3.feature4

A remote call that returns the last character on the third line of the fourth page of a book, for example, could be done using the sequence of calls book.page (4).line (3).last. Remote calls should be treated very carefully, however, because a remote call often indicates a complex, hidden connection between client and supplier. Such a connection should be broken into a set of routines and the routines placed in a set of classes, so there are only simple, direct relations between two classes.

It is common to have features with the same name in different classes; this is known as overloading a name. This does not create a name clash, because Eiffel simply follows the procedure described above. As an example, we might define make routines for both a LINE and a POINT, so we have two different routines with the same name. Given the code

    line: LINE
    ...
    !!line.make

    point: POINT
    ...
    !!point.make

Eiffel finds the object from the instruction (object.feature), the class from the declaration, and then finds the correct routine in the class definition.

The name of an object is used to convey meaning about that object, and the name of the feature conveys the meaning of that feature. Consider a class ROOM that contains various heights. There is no need to call the attributes door_height, window_height, and wall_height, because they are features of the appropriate class. If an attribute is used with no client, the feature is obviously a feature of that class and can be called height. If an attribute is used externally, then the name of the object carries the meaning, such as room.height, door.height, window.height, and wall.height.
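The ROOM example can be sketched as code to show an unqualified use (height, meaning the height of Current) next to a qualified call (door.height) when both classes have a feature with the same name. The classes below are an illustration only: DOOR, set_height, clearance and the numeric values are assumptions, not listings from the text.

```eiffel
class DOOR

feature

    height: REAL

    set_height (new_height: REAL) is
            -- store the height of this door
        do
            height := new_height
        end -- set_height

end -- class DOOR

class ROOM

creation
    make

feature

    height: REAL    -- same feature name as in DOOR; no clash
    door: DOOR

    make is
            -- build a room with one door (illustrative values)
        do
            height := 2.7           -- no object named: the height of Current
            !!door
            door.set_height (2.0)   -- qualified call: a feature of DOOR
        end -- make

    clearance: REAL is
            -- gap between the top of the door and the ceiling
        do
            Result := height - door.height
        end -- clearance

end -- class ROOM
```

A client holding room: ROOM could then write room.height, room.door.height, or room.clearance, and in each case the object to the left of the dot decides which class is searched for the feature.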
4.3 Operators
An operator is a function that is called slightly differently from a typical function. An operator provides no extra functionality in the language, because it can be implemented as a normal function. Operators provide syntactic sugar, to make the code sweeter to write. The usual mathematical notation can be used when writing expressions (such as 3 + 2) instead of the normal Eiffel function call form (which would be 3.+ (2) here).

Operators are written in two forms, called infix and prefix operators. Infix operators, such as "+", are written in the middle of their arguments, as in "3 + 5". Prefix operators, such as "not", are written before their arguments, as in "not (x > 3)". An operator is a function, so it returns a value and has a type; the returned value from "+", for example, can have the type INTEGER, REAL, or DOUBLE.

For a prefix operator, the function call lists the operator name, then the object:

    actual call:        + me
    operator header:    prefix "+": INTEGER is ...

For an infix operator, the function call lists the object, then the operator, and then one argument:

    actual call:        add + me
    operator header:    infix "+" (this: INTEGER): INTEGER is ...

Three examples of feature calls are shown below, to show the different formats. Assume that we have a variable i: INTEGER, and we wish to define addition of two integers. The first example shows a function named "+", that takes a single argument, adds it to the value of the object, and returns the new value. The second example shows the infix operator binary plus that is named "+". The third example shows the prefix operator unary plus, named "+", where there is only an object and no argument:

    form               general form                  example
    function           object.feature (argument)     i.+ (3)
    infix operator     object feature argument       i + 3
    prefix operator    feature object                + i
To define an operator, the name of the operator is enclosed in double quotes and preceded by the keyword infix or prefix. The argument (if any) is then written, the type of the operator is coded, and the local variables and routine body follow. Class INTEGER, for example, contains operator definitions with the following header lines:
    prefix "+": INTEGER is
    infix "+" (other: INTEGER): INTEGER is
    prefix "-": INTEGER is
    infix "-" (other: INTEGER): INTEGER is
    infix "*" (other: INTEGER): INTEGER is
    infix "/" (other: INTEGER): INTEGER is

The standard numeric, relational, and boolean operators are all defined as operators in the relevant class, and their code can be examined in the relevant class listing.

A user-defined operator is known as a free operator, whose name must begin with one of the characters '@', '#', '|', or '&'. A free operator is defined and called like any operator, but free operators have the highest precedence of all operators. The complete operator precedence order is given in Appendix B.

Consider an infix operator named "#percent" that is called on a real number, and takes a real number as argument. It returns the percentage value of the original number, so the operator call percent := 34.5 #percent 10 would return a value of 3.45. The operator would have to be defined as part of the class REAL, because it is called on real values. The operator definition would look like

    infix "#percent" (percent: REAL): REAL is
            -- percent of Current
        do
            Result := Current * (percent / 100.0)
        end -- #percent

This operator uses the current object (a REAL value) to provide the basic value, and the argument percent provides the percentage. The returned value is that percent of the basic value.

Operators provide no extra power to the language, because they behave the same as functions. They are used so that the normal arithmetic and logical notation can be retained, and implemented within the same framework as other routine definitions.
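The operator headers shown for INTEGER can also be written in a user class. The following is a hedged sketch, not a listing from the text: it assumes a POINT class with a hypothetical creation routine make_at, and it chooses to define point addition and negation as new points built from the co-ordinates.

```eiffel
class POINT

creation
    make_at

feature

    x, y: REAL

    make_at (new_x, new_y: REAL) is
            -- hypothetical creation routine: place the point at (new_x, new_y)
        do
            x := new_x
            y := new_y
        end -- make_at

    infix "+" (other: POINT): POINT is
            -- a new point whose co-ordinates are the sums of the
            -- co-ordinates of Current and other
        do
            !!Result.make_at (x + other.x, y + other.y)
        end -- infix "+"

    prefix "-": POINT is
            -- a new point with both co-ordinates negated
        do
            !!Result.make_at (-x, -y)
        end -- prefix "-"

end -- class POINT

class OPERATOR_DEMO     -- hypothetical client class

creation
    make

feature

    p1, p2, sum, opposite: POINT

    make is
        do
            !!p1.make_at (1.0, 2.0)
            !!p2.make_at (3.0, 4.0)
            sum := p1 + p2      -- infix call: object, operator, argument
            opposite := - p1    -- prefix call: operator, then object
        end -- make

end -- class OPERATOR_DEMO
```

Each operator creates and returns a new object, so p1 and p2 are left unchanged by the two calls.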
4.4 Value and reference semantics
In Eiffel, every line of code resides in some class definition, and every variable is of some type or class. Basic types such as INTEGER and REAL have immediately useful values stored in the value part of the variable, and reference types have a pointer or reference as their value. This difference in the value of a variable affects the way
that a variable is used, so different rules are needed to describe the meaning or semantics of value and reference types.

Every object in Eiffel is an instance of a class, so there are classes for the simple data types as well as for complex objects; the name of the class is given in the variable declaration. A class may be stored in two different ways, however, called reference and expanded types. If a class is defined as a reference type, then its value is a reference that is set to Void when the object is declared, and set to a reference when the object is created. If a class is defined as an expanded type, then the values of an object of that type are not references, but the objects themselves, and the object does not need to be explicitly created. The basic types INTEGER, REAL, DOUBLE, BOOLEAN, and CHARACTER are expanded types, and all other types are reference types. Expanded types improve the efficiency of an Eiffel system, because the value is used immediately and does not require tracing through a reference to a location in memory, then using the value at that location.

A type is defined as expanded by writing the keyword expanded as the first word of the class definition. The class INTEGER, for example, has the class header

    expanded class INTEGER

If the keyword expanded is not included in the class header, then the class is a reference type. A class may be defined as a reference type and declared as an expanded type, by using the keyword in the declaration, such as x: expanded X. The interaction between expanded and reference objects of the same type has subtle implications that are not discussed in this text. The interested reader is referred to the book Eiffel: The Language (Meyer, 1992) for further details.
4.5 Reference assignment
An assignment of the form a := b stores the value of b in the variable a. This is an assignment of values if a and b are expanded types. It is an assignment of references if the variables are reference types. To illustrate this difference, consider two points and two real values that are declared and created by the code shown below; here, the initial location of the point is passed to the creation routine as two real values:
    a, b: REAL
    p1, p2: POINT

    !!p1.make (12.6, -3.4)
    !!p2.make (12.6, -3.4)

Points p1 and p2 now exist, with their x and y co-ordinates set to the same values; there are two points. These are different objects, because the values of the identifiers p1 and p2 (the location of the co-ordinates in memory) are different; the values of a, b, p1 and p2 are indicated by the data structure chart shown below.
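    name    type     value
    p1      POINT    F039AD00
    p2      POINT    F0943200
    a       REAL     12.6
    b       REAL     -3.4

    data in the point stored at F039AD00
    name    type     value
    x       REAL     12.6
    y       REAL     -3.4

    data in the point stored at F0943200
    name    type     value
    x       REAL     12.6
    y       REAL     -3.4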
The feature call p1.x returns the value of the x co-ordinate of the object p1, and the feature call p1.y returns the value of the y co-ordinate of the object. The x and y values can be assigned to the REAL variables a and b by the code
    a := p1.x
    b := p1.y

because a, b, p1.x, and p1.y are all of the same type, REAL. If the point p1 is now moved, the values of a and b are not affected, because they are expanded types; the value has been stored in those variables, and is not affected by any change to x and y. The point p1 can be assigned to p3 by the code

    p3: POINT
    ...
    p3 := p1

because they are both of the same type, POINT. A point is a reference type, however, so the assignment of p1 to p3 assigns a reference. The name p3 now contains the same reference value as the name p1; so they both refer to the same point, as shown in the diagram below, and are different names for the same object.
    name    type     value
    p1      POINT    F039AD00
    p3      POINT    F039AD00

    data in the point stored at F039AD00
    name    type     value
    x       REAL     12.6
    y       REAL     -3.4
Changing the content of p1 by the code

    p1.move (4.2, 12.8)

changes the content of p3 because both names refer to the same object. The value of p3 is not changed; its value is still a pointer to some location in memory. The value stored at that location has changed, however, indirectly affecting the variable p3.

Care must therefore be taken to separate the name of a reference variable from its value. For basic types, the story is simple: a basic variable has a simple value, that can only be accessed through the name of that variable. For reference types, however, an object can have any number of names, and the single object can be accessed through any of these names. To be completely clear and accurate, we should say "the object that is referred to by the identifier p1", but for most purposes we can use the shorthand form of this statement and refer to the object p1. An object can have multiple names if it is created and then assigned to other variables.

As a consequence of the different values in reference and expanded types, assignment and equality work differently. Formally, we say that the meaning or semantics of assignment and equality are different for the two types of value.
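The difference between assigning a value and assigning a reference can be traced in a short system. This is a sketch under assumptions: the text does not list this version of POINT, so the argument names of make and the meaning of move (shift by the given amounts) are guesses, and SHARING_DEMO is a hypothetical class name.

```eiffel
class POINT

creation
    make

feature

    x, y: REAL

    make (new_x, new_y: REAL) is
            -- set the initial location (argument names are assumptions)
        do
            x := new_x
            y := new_y
        end -- make

    move (dx, dy: REAL) is
            -- assumed meaning of move: shift the point by dx and dy
        do
            x := x + dx
            y := y + dy
        end -- move

end -- class POINT

class SHARING_DEMO      -- hypothetical root class

creation
    make

feature

    p1, p3: POINT
    a: REAL

    make is
        do
            !!p1.make (12.6, -3.4)
            a := p1.x       -- value assignment: a stores its own copy of x
            p3 := p1        -- reference assignment: two names, one object
            p1.move (4.2, 12.8)
            if p3.x = p1.x then
                io.putstring ("p3 shows the move made through p1%N")
            end
            if a /= p1.x then
                io.putstring ("a still holds the old x value%N")
            end
        end -- make

end -- class SHARING_DEMO
```

Both messages are displayed: p3 and p1 name the same object, so the move is visible through either name, while a keeps the value that was stored in it before the move.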
4.6 Reference equality
Basic or expanded types can be tested for equality with the "=" operator, but for reference types this operator compares references, not contents. The equality operator tests if two values are the same. If you compare two points that have the same location, such as p1 and p2, they will not be "=" because you are comparing references, not contents, and the references are different. There are thus two meanings of equality that need to be separated: "has the same content" and "identical". Equality of content is called equal and is tested by the function equal. Equality of reference is called identical and is tested by the operator "=". The two POINT objects p1 and p2 are equal but not identical, because they have the same content but different reference values, whereas the two points p1 and p3 are both equal and identical.

Eiffel provides a special function that is defined on all objects to test if two objects have the same content. The function equal takes two objects and does a field by field comparison to determine if the fields are equal, and thus if the objects are equal. The two points p1 and p2 can be compared to see if they have the same location by calling the Eiffel function

    equal (p1, p2)

that returns a boolean value (true or false) saying whether the objects are field by field equal, using the '=' operator to compare each field.

It is usual for a class to define its own equality function, so the meaning of equal can be tailored to each type of object. This more specific function is usually called is_equal, and it is called on an object by passing the test object as an argument. For the class POINT, for example, the equality function could be called in the client and defined in the class POINT as

    if p1.is_equal (p2) then ...

    is_equal (test: POINT): BOOLEAN is
            -- is the location of test the same as the location of Current?
        do
            Result := (x = test.x) and (y = test.y)
        end -- is_equal

This function in class POINT tests each co-ordinate of the two points, and returns true if the x and y co-ordinates of both points are equal. Object equality may be tested using the Eiffel function equal, or by a more specific function called is_equal within a class, that tests specific fields in that class for content ("=") equality.
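The two meanings of equality can be placed side by side in one routine. The sketch below assumes a minimal POINT whose make routine takes the two co-ordinates (the argument names are assumptions), and a hypothetical root class EQUALITY_DEMO; it relies only on the "=" operator and the function equal described in this section.

```eiffel
class POINT

creation
    make

feature

    x, y: REAL

    make (new_x, new_y: REAL) is
            -- set the initial location (argument names are assumptions)
        do
            x := new_x
            y := new_y
        end -- make

end -- class POINT

class EQUALITY_DEMO     -- hypothetical root class

creation
    make

feature

    p1, p2, p3: POINT

    make is
        do
            !!p1.make (12.6, -3.4)
            !!p2.make (12.6, -3.4)
            p3 := p1
            if p1 = p2 then
                    -- not reached: p1 and p2 are different objects
                io.putstring ("p1 and p2 are identical%N")
            end
            if equal (p1, p2) then
                    -- reached: the x and y fields hold the same values
                io.putstring ("p1 and p2 are equal%N")
            end
            if p1 = p3 then
                    -- reached: both names hold the same reference
                io.putstring ("p1 and p3 are identical%N")
            end
        end -- make

end -- class EQUALITY_DEMO
```

Only the second and third messages are displayed, which matches the statement that p1 and p2 are equal but not identical, while p1 and p3 are identical.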
4.7 Object copy
Assignment of values means that a new value is stored in the new variable. Assignment of references, however, means that two variables with different names now refer to the same object. Assignment therefore cannot be used to get a copy of an object, so Eiffel supplies a command copy to copy objects, that is defined for all classes. It is called on the object that receives the copy, and takes the object to be copied as its argument. It makes a field by field copy of the argument into the receiving object. Both objects must exist, so both names must have a (non-Void) reference as their value. The content of the point p1, for example, can be copied into a point named p4 by declaring both names to be of type POINT, creating p4, and writing the code

    p4: POINT
    ...
    !!p4.make (0.0, 0.0)
    p4.copy (p1)

As a result of executing this command, the object p4 is a field by field copy of the object p1.

The function clone is also defined for all classes. It is passed an object as its argument, and returns a field by field copy of the object. The object can be Void, in which case a Void reference is returned. A copy of the point p1, for example, can be made and given the name p4 by writing

    p4 := clone (p1)

As a result of executing this command, the name p4 now refers to a new object that is a field by field copy of the object p1. While there are other, subtle differences between the two routines (see Eiffel: The Language for details), objects are normally cloned rather than copied because clone handles Void references with no problems. The code for copy, clone and equal can be inspected in class ANY.
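A short sketch may help to contrast the two routines. It assumes a minimal POINT whose make routine takes the two co-ordinates, and a hypothetical root class COPY_DEMO with illustrative attribute names. It also assumes that copy overwrites the fields of an object that already exists, so the receiving object is created before copy is called, while clone returns a new object and accepts a Void argument.

```eiffel
class POINT

creation
    make

feature

    x, y: REAL

    make (new_x, new_y: REAL) is
            -- set the initial location (argument names are assumptions)
        do
            x := new_x
            y := new_y
        end -- make

end -- class POINT

class COPY_DEMO     -- hypothetical root class

creation
    make

feature

    p1, p4, p5, nothing, p6: POINT

    make is
        do
            !!p1.make (12.6, -3.4)

            p4 := clone (p1)        -- new object, same field values as p1

            !!p5.make (0.0, 0.0)    -- the receiving object must exist
            p5.copy (p1)            -- overwrite the fields of p5 from p1

            p6 := clone (nothing)   -- nothing was never created, so p6 is Void

            if p4 /= p1 and equal (p4, p1) then
                io.putstring ("p4 is a separate object with the same content%N")
            end
            if equal (p5, p1) then
                io.putstring ("p5 now has the same content as p1%N")
            end
            if p6 = Void then
                io.putstring ("cloning a Void reference gives Void%N")
            end
        end -- make

end -- class COPY_DEMO
```

All three messages are displayed under these assumptions. Calling copy on a name whose value is still Void would instead fail at run time with a Void reference.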
4.8 Deep versus shallow operators
The Eiffel operators equal, copy, and clone are called shallow copy, clone, and equal, because they only look one level inside the object; they do a field by field operation on each object. If an object contains other objects, then the fields of that object are references, so true comparison of the content is not done. The operator only looks one level down the chain of pointers, instead of following the chain down to the bottom level.

Consider two objects of type TRIANGLE, both created with their vertices at the same location, such as the isosceles triangle with vertices at (-1, 0), (1, 0), (0, 1); the data structures for the triangles are shown below. The two objects are not = (tested by tri1 = tri2), because the value of tri1 (the memory address F000) is not the same as the value of tri2 (the memory address F050). The two triangles are not equal (tested by equal (tri1, tri2)), because equal does a field by field comparison one level down. A triangle has three points (top, left, and right), and a point is a reference type, so the value of each field (each point) refers to a different memory location and thus the values are not =. A field by field comparison of two triangles looks one level down the data structure and compares point references. Looking at the first field (top) of each object, the value F008 does not = the value F058.
    name     type        value
    tri1     TRIANGLE    F000
    tri2     TRIANGLE    F050

    data in the triangle at F000        data in the triangle at F050
    name     type     value             name     type     value
    top      POINT    F008              top      POINT    F058
    left     POINT    F020              left     POINT    F070
    right    POINT    F038              right    POINT    F088

Each point holds two REAL fields, x and y. The three points of each triangle hold the co-ordinates (-1.0, 0.0), (1.0, 0.0), and (0.0, 1.0).

    operator                       result
    tri1 = tri2                    values not =
    equal (tri1, tri2)             values not =
    deep_equal (tri1, tri2)        values =
The problem of shallow testing can be overcome by using deep versions of copy, clone, and equal, called deep_copy, deep_clone, and deep_equal. These routines are defined on all Eiffel classes, and are called in the same way as their shallow versions. The deep version traces down the references to the very end of the chain, deep within each data structure, and then copies or compares the values at the end of each reference chain. Looking at the first field at the end of each chain, the value of tri1.top.x = tri2.top.x. Looking at the next field, tri1.top.y = tri2.top.y, and the comparison can continue with the eventual decision that deep_equal (tri1, tri2). The exact hexadecimal values of the pointers can be ignored in practice; they have been added in the diagram to make clear that the pointer values are different, because they refer to different objects located in different places in memory.
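The three tests on the two triangles can be written out as code. The listing is a sketch under assumptions: the text does not show class TRIANGLE, so its make routine, the assignment of co-ordinates to top, left and right, and the root class DEEP_DEMO are all hypothetical. It uses only "=", equal, deep_equal and deep_clone as described in this section.

```eiffel
class POINT

creation
    make

feature

    x, y: REAL

    make (new_x, new_y: REAL) is
            -- set the initial location (argument names are assumptions)
        do
            x := new_x
            y := new_y
        end -- make

end -- class POINT

class TRIANGLE      -- hypothetical definition for this sketch

creation
    make

feature

    top, left, right: POINT

    make is
            -- create the three vertices at fixed locations
        do
            !!top.make (0.0, 1.0)
            !!left.make (-1.0, 0.0)
            !!right.make (1.0, 0.0)
        end -- make

end -- class TRIANGLE

class DEEP_DEMO     -- hypothetical root class

creation
    make

feature

    tri1, tri2, tri3: TRIANGLE

    make is
        do
            !!tri1.make
            !!tri2.make
            if tri1 = tri2 then
                    -- not reached: two different triangle objects
                io.putstring ("identical%N")
            end
            if equal (tri1, tri2) then
                    -- not reached: the point references differ
                io.putstring ("shallow equal%N")
            end
            if deep_equal (tri1, tri2) then
                    -- reached: the co-ordinates at the end of each chain match
                io.putstring ("deep equal%N")
            end
            tri3 := deep_clone (tri1)   -- new triangle with its own three points
            if tri3.top /= tri1.top then
                io.putstring ("tri3 has its own top point%N")
            end
        end -- make

end -- class DEEP_DEMO
```

Only the last two messages are displayed. A shallow clone of tri1 would instead share the three point objects with tri1, so moving a vertex through one triangle would be visible through the other.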
4.9 Passing an object
A variable of a reference type (an object) gets the initial value Void, and retains that value until either the object is explicitly created by !! or a value is assigned to the identifier by assignment. If the same object is used in two places, then it must be passed as an argument from the place where it was created to the place where it is used, and stored as an attribute in the second object. More formally, a reference to the object is passed and stored. Consider a system that simulates delivering a letter. One person writes (creates) the letter, hands it to the postie, and the postie then delivers it to another person. An outline of this system is given by the class outlines below:
class WRITER
creation
    make
feature
    letter: STRING
    make is
            -- write a letter
        do
            letter := "Wish you were here"
        end -- make
end -- class WRITER

class READER
feature
    letter: STRING
    get (this: STRING) is
            -- get a letter from the postie
        do
            letter := this
        end -- get
end -- class READER

class POSTIE
creation
    make
feature
    writer: WRITER
    reader: READER
    make is
            -- pick up letter from writer, give it to reader
        local
            letter: STRING
        do
            !!writer.make
            !!reader
            letter := writer.letter
            reader.get (letter)
        end -- make
end -- class POSTIE

STRING is a reference type, so when the writer is created their letter is set to its default value, Void. The make routine then gives the string an actual value. When the reader is created, their letter
also gets its initial value, Void, that is not changed by the READER creation routine. The writer’s letter - formally, the reference to the object - is then stored in the local variable letter in POSTIE; the local variable was, of course, initialised to Void. Finally, the letter is sent to the reader as an argument in the procedure call get. The get routine in the READER stores the value of the argument - the letter - in the reader’s letter. Once the value has been set, the reader can then read their letter (not shown). Until the assignment of the argument value has been made, the value of letter in READER remains Void.
Common error: Object value not passed to using object.
Error: Void reference at run-time.
What to do: if the object is only used in the class, then create the object; if the object is created in another class then pass its value as an argument and store the value.
4.10 Strings
A string is a sequence of characters treated as a single unit. Where a character always has a length of one, a string can have almost any length. A value of type STRING in Eiffel is written enclosed in double quotes. STRING in Eiffel is a reference class, so a variable of type STRING has an initial, default value of Void. A string is declared in the usual way, as shown below.

    name: STRING
    message, address: STRING
    my_name: STRING is "Rob"

name      type     value
name      STRING   Void
message   STRING   Void
address   STRING   Void
my_name   STRING   "Rob"
Although it is a reference type, a string is usually not created with a creation instruction; it is usually given a constant value, a literal value, or a value that is assigned from keyboard input. String input is done as usual with a command followed by a query, but there are three string input commands in Eiffel:
1. io.readline          read and store up to the CR, discard the CR
2. io.readstream (n)    read and store a string of n characters
3. io.readword          read and store up to the next space or CR, keep the character
io.readword reads up to the end of the word, and stores the delimiter (usually a space or a CR) in the character buffer, so you need to flush the buffer explicitly with io.next_line:
    io.putstring ("Enter a password: ")
    io.readword
    io.next_line
    password := clone (io.laststring)
Whichever input command is used, io.laststring returns the value of the string buffer. Because STRING is a reference type, however, we cannot simply assign this value to a variable, because the reference to the string buffer never changes; it is always the same location in memory. The content of the buffer changes as we read in new values, and to get this content we have to clone the string’s value. The normal way to do string input is thus
    io.putstring ("Enter your name: ")
    io.readline
    name := clone (io.laststring)
which returns a new copy of the string value stored in the string buffer. String output is like any other output, and uses the command io.putstring:
    io.putstring ("Your name is ")
    io.putstring (name)
Class STRING is covered in much more detail in the next chapter.
Common error: Omitting the clone. Forgetting to clone your input makes the value of every input string a reference to the input string buffer. The symptom is that all your input strings have the same content, the value of the last input string io.laststring.
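The effect of the clone can be seen by reading two strings in a row. The class below is a small sketch written for this purpose; the class name NAMES and its attributes are hypothetical, and the io routines are the ones used earlier in this section.

```eiffel
class NAMES
creation
    make
feature
    first, second: STRING
    make is
            -- read two names, keeping a separate copy of each
        do
            io.putstring ("Enter the first name: ")
            io.readline
                -- clone makes a new string object with the buffer's content
            first := clone (io.laststring)
            io.putstring ("Enter the second name: ")
            io.readline
            second := clone (io.laststring)
            io.putstring (first)
            io.new_line
            io.putstring (second)
            io.new_line
        end -- make
end -- class NAMES
```

If the two assignments were written without clone, first and second would both refer to the one input buffer, so both output lines would show the second name.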
4.11 Case study: objects
"A bank has a single customer, and the customer has a single account. A customer has a name, gender, address, and a bank account. Money can be deposited into and withdrawn from the account, and the balance can be displayed. Interest is added daily on the current balance; the interest rate is 4.5% a year. The customer executes a single transaction of each type."

Main points in this chapter
• An object is an instance of a class. Each object has its own variables, and all objects share the routines in their class.
• An object is created by the creation command !!. If needed, a creation routine can be called to change the object’s data from their default values.
• A feature is an attribute, a function, or a procedure. A feature is called from a client class by writing a feature call of the form object.feature (arguments)
• If a feature is called from within its own class, no object and dot are needed; Eiffel uses the current object and expands the feature call to the form Current.feature (arguments)
• A basic or expanded type has an immediate value and need not be created. A reference type has a reference as its value. The semantics of assignment and equality are different for expanded and reference types, and are called value semantics and reference semantics respectively.
• An operator may be an infix or a prefix operator. It has the same behaviour as a function, but it is called using infix or prefix notation, not the normal dot notation.
• Eiffel provides special operators to test objects for equality (equal) and to copy objects (copy, clone), in both shallow and deep versions.
• STRING is a reference type, so each string input has to be cloned to get a new version of the string from the string input buffer.
Exercises
1. What is the data structure of a basic type? What is the structure of a reference type? Draw an example of each.
2. What happens when an object is created? State what happens when there is no creation routine for the class, and when there is a creation routine for the class. Why must a creation routine be a procedure?
3. What code is needed in the client and in the supplier classes to create an object using a creation routine? If there is a creation routine in a class, can I create an object without calling that creation routine? Can a creation routine receive arguments? Can a class have several creation routines?
4. Describe the mechanism used by Eiffel to find the code needed to execute a feature call from one client to another (object.feature). What happens when there is no explicit object mentioned? What is the value of Current?
5. What is the difference between an operator and a function? What is an infix operator? What is a prefix operator? Give three examples of operators in Eiffel.
6. Show the format of an infix operator definition. Write a definition for the infix operator #mod (modulus). What is the signature of this operator?
7. Define a class FAHRENHEIT that stores a temperature in degrees Fahrenheit. Define a class CELSIUS that stores a temperature in degrees Celsius (Centigrade). Write a function in FAHRENHEIT that returns a REAL number, the temperature in Centigrade. Write a function in CELSIUS that returns a REAL number, the temperature in Fahrenheit. Write a DRIVER root class to read in a single temperature, create the two objects, and display their values. The basic form of the creation routine in the root class is
    make is
            -- drive the rest of the system
        local
            fahrenheit: FAHRENHEIT
            celsius: CELSIUS
        do
            io.putstring ("Enter the temperature in fahrenheit: ")
            io.readreal
            !!fahrenheit.make (io.lastreal)
            !!celsius.make (fahrenheit.to_celsius)
            fahrenheit.show
            celsius.show
        end -- make
8. A farmer has 3 pigs. A pig has a name, weight, and age. Write a system that creates the three pigs, then shows their average weight. Write your system in the following stages:
   a) List the classes
   b) Draw a client chart
   c) For the class PIG, define the attributes and the signature and header for each routine
   d) For the root class FARMER, define the attributes, signatures, and headers
   e) Code the class PIG
   f) Code the class FARMER
   g) Run the system
9. Look at the rock concert specification in Chapter 2, question 16. Change the solution from its procedural form to an OO form; that is, change it from a single block of code to a set of routines in a set of classes. Hint: you need to create three objects of the same class.
10. Write equality functions for the classes POINT, LINE, and TRIANGLE. An equality function returns true if two objects are equal in some sense; here, two objects (of the same type) are equal if they are in the same location. An equality routine has the header
    is_equal (this: like Current): BOOLEAN is ...
The keyword like allows the type of an argument to be defined as “like this one”.
Chapter 5: Behaviour
Keywords: signature, export, assertion, STRING, debugging
Reusable software is based on the idea that you never re-write code: you just use it in different ways, and add code as necessary. The most important technique for designing reusable code is to design for reuse from the very beginning, and write a solution for future users, not just for the current use. Reuse will fail if you design a minimal solution to the current problem. Some basic principles for designing reusable software are presented in this chapter, and illustrated with the Eiffel library class STRING. The way that these reuse principles help to avoid bugs in the first place, and to recover from any bugs that do arise, is then discussed.
5.1 Look and feel
Reuse is made possible by the definition of a clear, precise object interface. You do not need to know how an object works to use that object; all you need to know is how it behaves. To use a television set, for example, you need to know how to turn the set on, change channels, change volume, and so on. To use a computer you type on the keyboard and see the effects on the screen. To use a car you need to know how to turn it on, how to steer, and how to accelerate and brake. In no case do you actually need to know what happens “inside the box”. Few of us understand the technology of signal transmission, reception, and transformation but we all know how to watch TV.
Any discipline develops a set of standard solutions, and over time one of these standards comes to predominate: driving on the left versus right, VHS versus BetaMax, IBM clone versus Macintosh, and so on. You can buy any CD player and it will play any CD. The player and the CD were probably made by different organisations, probably in different countries, but that is not a problem because their interaction has been standardised. The definition of a standard interface allows us to “plug and play”, and any differences are hidden behind the interface.
Eiffel was designed to support reuse, so it has a set of strategies for defining a standard interface. The interface to, or external appearance of, a routine is defined by five things:
1. the name of the routine
2. any values passed to the routine
3. any values returned by the routine
4. the routine’s header comment
5. the assertions on a routine (discussed in Section 5.7).
These five parts allow a programmer to use a well-defined routine with no idea about how that routine is implemented. To find the square root of a number, for example, you use the function sqrt that receives a REAL value and returns a REAL value; exactly how the square root is calculated can be ignored as long as the routine is correct and behaves in the right way. The interface to a class is the set of features that can be used by a client of that class; formally, by the set of exported features. The external features of a class can be found by running a system tool named short on a class, to show the standard interface to that class.
5.2 Routine behaviour
The signature of a feature is a list of the types that are passed to, and returned by, a feature. The signature of a routine can be read directly from the routine header, because the header lists the received types in its argument list, and the returned type for a function. A routine can be identified as a function or a procedure purely by its signature, because a function returns a value and a procedure does not. The routine header and the routine signature contain the same type information, but the header also contains other information.
A signature is usually written as a set of input types, followed by a semi-colon, followed by the output type if any, so it has the form < input types ; output type >. The signature of the routine sqrt, for example, is < REAL ; REAL >. In system design, much of the code can be written without thinking much about the implementation, by defining the routine headers and leaving the bodies empty. This defines the interaction between routines in the system, with no internal detail.
The class ACCOUNT in the first part of the case study contains 12 features. Two of the features are attributes: one variable attribute (balance) and one constant attribute (the interest rate). Six of the features are basic procedures, to set and show the balance, to show the rate, to deposit and withdraw money, and to add interest. Two high level routines are defined to make and show the account; these call the basic routines. The remaining features are two functions, that return the daily interest rate and the interest to be added to the account each day. The header for each feature is shown to the left below, and the signature is shown to the right:

balance: REAL is              < - ; REAL >
set_balance is                < - ; - >
show_balance is               < - ; - >
rate: REAL is                 < - ; REAL >
show_rate is                  < - ; - >
make is                       < - ; - >
show is                       < - ; - >
deposit (amount: REAL) is     < REAL ; - >
withdraw (amount: REAL) is    < REAL ; - >
add_interest is               < - ; - >
interest: REAL is             < - ; REAL >
day_rate: REAL is             < - ; REAL >

The signature defines how data is received by and returned to routines, not how the user interacts with the system. User input (io.readX) and output (io.putX) are implemented by code inside a routine, not by arguments passed to, and values returned from, a routine. The only effect that IO has on the signature of a routine is to make it a procedure: if a routine interacts directly with the user, either for input or output, then that routine is a procedure. Input changes the state of the input buffer and output changes the state of the terminal screen, so any input or output routine is implemented in Eiffel as a procedure.
The feature header and the signature show the types of the input and output values for the feature. The signature may be defined before the routine is coded and thus define the external behaviour of the routine; design then consists of implementing the behaviour in code. If the correct way to divide a task into parts is not known initially, the code may be written first and then wrapped in a routine to define the signature and support reuse. In either case, the signature defines the precise, external appearance of the routine, and provides the clear division between internal and external that is essential for the design of reusable software.
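Defining the headers first and leaving the bodies empty might look like the sketch below. It is only an illustration of the idea, not the case study code: it shows a few of the ACCOUNT features, and the value 4.5 for the rate is taken from the case study specification.

```eiffel
class ACCOUNT
feature
    balance: REAL
    rate: REAL is 4.5
    deposit (amount: REAL) is
            -- add this amount to the balance
            -- signature: < REAL ; - >
        do
        end -- deposit
    withdraw (amount: REAL) is
            -- deduct this amount from the balance
            -- signature: < REAL ; - >
        do
        end -- withdraw
    add_interest is
            -- add one day of interest to the balance
            -- signature: < - ; - >
        do
        end -- add_interest
    interest: REAL is
            -- the interest to be added for one day
            -- signature: < - ; REAL >
        do
        end -- interest
    day_rate: REAL is
            -- the daily interest rate
            -- signature: < - ; REAL >
        do
        end -- day_rate
end -- class ACCOUNT
```

A class written this way can be compiled and called by its clients before any body is coded; a function with an empty body simply returns the default value of its type.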
5.3 Behaviour versus implementation
There are really only two types of behaviour, queries and commands. A query returns information about the state of the object, where a command changes the state of an object. The behaviour of an object is defined by the commands and queries that the object provides or supplies. This approach can be illustrated by thinking of an object as a big, black box with two sets of buttons, "query buttons" and "command buttons". If you push a query button, an indicator lights up on the button and gives you some information about the internal state of the machine. Pushing the button does not change the value, so if you push the button ten times in a row, you'll get the same answer each time (unless a command has changed the state in between queries). On the other hand, when you push a command button, the machine starts screeching and clicking but you do not get any information about what is happening inside the
box. When the machine stops and you push a query button, the answer you get will usually be different from the answer you had before the command was done. The machine has changed state. A procedure changes one or more attribute values, and returns nothing. The attribute values define the state of an object, so a command is implemented by a procedure. A function returns a value and changes nothing, so it is a query. An attribute also returns a value and nothing is changed by getting an attribute value, so an attribute also behaves like a query. A query can thus be implemented by a function or by an attribute. When a value is returned from a query, the value may have been stored in the object as an attribute, or computed by a function. An attribute behaves identically to a function with no arguments; from the outside, it is impossible to tell if a returned value was stored or computed. In the ACCOUNT class, for example, the interest rate was stored as a constant and the daily interest rate was calculated. The daily interest rate could have been stored as an attribute, and the code would behave identically. A function was used because there is a data dependency between the yearly and daily interest rates, so storing two separate figures is error-prone; if the yearly rate changes, the daily rate should change accordingly. It is interesting to compare the two views of a class, from the outside and from the inside. From the outside, all we see is behaviour and the distinction is between command and query; this view is shown to the left below. From the inside, we see that behaviour is implemented as data or routines; this view is shown to the right below.
Figure: two views of a class.

behaviour (from the outside)     implementation (from the inside)
command                          code: procedure
query                            code: function, or data: attribute
A feature of a class is either a command or a query. It can be implemented as an attribute, a function, or a procedure. The behaviour of the class is more important than its implementation, so all features are indented equally in the class listing.
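The choice between storing and computing a query can be sketched as follows. The listing is an illustration only: dividing the yearly rate by 365.0 is an assumption about how the daily rate is calculated, and the case study code may differ.

```eiffel
class ACCOUNT
feature
    rate: REAL is 4.5
            -- yearly interest rate, stored as a constant attribute
    day_rate: REAL is
            -- daily interest rate, computed from the yearly rate
            -- (365.0 days in a year is an assumption of this sketch)
        do
            Result := rate / 365.0
        end -- day_rate
end -- class ACCOUNT
```

A client writes account.rate and account.day_rate in exactly the same way, so it cannot tell that the first query is stored and the second is computed. Because day_rate is computed, a change to the yearly rate is picked up automatically.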
5.4 Class behaviour
Any feature of an object can always be called from inside that object. Only some features can be seen and used outside of their class, however; we call these exported features. Each feature in a class has an export policy that defines whether it can be seen by a client of the class. The export policies of all the features thus define the external behaviour of the class. The export policy is defined when the feature is defined, by writing one or more class names in curly brackets after the feature keyword preceding the feature definition. The named client classes can then use the exported features. If no names are placed after the feature keyword, then any class can use the features. The main export policies that can be set on a feature are

Export clause          Meaning
feature                exported to all classes
feature {ANY}          exported to all classes
feature {X, Y, Z}      exported to classes X, Y, Z
feature {}             exported to no class
feature {NONE}         exported to no class
The feature keyword appears in the class definition as many times as needed to define the export policy of each feature in the class. A policy is set by the feature clause, and is in effect until the
next clause. All features exported to a specific class are written under the feature keyword with that class, then the next set of features is defined under its own export policy, and so on. Hidden features are listed in the class under a feature {NONE} policy.
The export policy provides two advantages for system design. First, it provides a clean, precise definition of the behaviour of a class; the class behaviour is defined by the behaviour of its exported features. Second, it provides security for the system, because only the listed clients can use the features of that class. In the bank example, it is crucial for the security of the bank that only a small, specified set of classes can actually create a new bank account, or deposit money into it.
Common error: feature is not exported to client
Error code: VUEX (2)
Error: feature of qualified call is not available to client class.
What to do: make sure feature after dot is exported to caller.
The creation clause may also contain an export policy to list the classes that can call the routine as a creation routine. The format of the export policy is the same as for feature:
    creation {A, B}
        make
A class can specify several creation routines under the single creation keyword. There may be several creation clauses, if the designer wants a different export policy for each creation routine. The creation and the export status of a routine are independent, so a routine can be called as a creator (using !!) or as a normal feature (no !!). It is even possible (though unusual) to export the creation status (creation) to one class and the non-creation status (feature) to another.
Common error: creation routine is not available to client
Error code: VGCC (6)
Error: Creation instruction uses call to improper feature.
What to do: Make sure that feature of call is a creation procedure, is not ‘once’, and is available for creation to enclosing class.
A class is designed around the data that it contains, but is defined by its behaviour. The actual way that the data is stored or implemented is usually hidden, and only the behaviour is exported. This allows us to change the way that the data is stored, without affecting the behaviour of the class. If the attributes of the class are exported, then every time an attribute changes, the class and its users have to be changed; a simple change can thus have large effects on the system. In order to insulate as much of the system as possible from change, the attributes are normally hidden.
It is possible to export an attribute, and in fact it is often necessary, but this should be avoided where possible because it makes the code less reusable. It forces the client to know about the exact form of the data, and the client as well as the supplier has to be amended when the representation changes. If you absolutely have to export an attribute, then simply export it. Do not write a function that returns the attribute value and export the function. Seen from the outside, there is no difference between these methods.
Seen from the inside, the function takes more code, is less easy to read, and is more likely to trip you up if the attribute ever changes.
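The export policies described in this section can be combined in one class. The listing below is a sketch under stated assumptions, not the case study code: it assumes that only CUSTOMER may create an account and change its balance, that the balance itself is hidden, and that any class may ask for the balance to be shown.

```eiffel
class ACCOUNT
creation {CUSTOMER}
    make
feature {NONE}
    balance: REAL
            -- hidden: no client can read the attribute directly
feature {CUSTOMER}
    make is
            -- start with an empty account
        do
            balance := 0.0
        end -- make
    deposit (amount: REAL) is
            -- add this amount of money to the balance
        do
            balance := balance + amount
        end -- deposit
feature
    show_balance is
            -- display the balance; exported to all classes
        do
            io.putstring ("The balance is $")
            io.putreal (balance)
            io.new_line
        end -- show_balance
end -- class ACCOUNT
```

With this layout, a call such as account.deposit (10.0) written in a class other than CUSTOMER would be rejected by the compiler, while account.show_balance is accepted anywhere.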
5.5 Listing order
Exported features provide a way to group the feature definitions in a class. Exported features define the class interface, so some programmers prefer to make the interface clearly visible in the code by listing all the exported features first, then all the hidden or private features. This view of the system is provided by a class diagram (see section 5.6) or a short listing (see section 5.9). Dividing the listing into public and private features makes the calling structure of the system very hard to follow, however; the reader cannot predict where a routine is likely to occur in the class listing, and has to search through pages of code to find a feature definition. The call structure provides another way to organise a code listing. If the designer decides that calling order is most important, then each exported routine is followed by its hidden routines, in calling order. The location of a called feature definition is then easily predicted, and easily found by looking
down the listing from the call. The listing tends to look cluttered and the public features are hard to find, however, because each change (from exported to private and back) needs a new feature keyword and policy. No convention on listing order has yet become standard. This book places the attributes and their get, set, use, and show routines at the top of the class; both the attributes and their basic routines are usually private, and it is clear where to search for code that gets and shows a value. Often, the remaining features are all exported so both the class interface and the called routines are easy to find in the listing. Where an exported routine calls a private routine, they are listed in calling order to help the reader (and designer) of the code. Utility routines, that are called in many places in a class, can be listed at the end of a class because they are usually simple and are tested (called) often, so they should be correct and are seldom examined. The classes in a system are listed in client order.
5.6 System charts
The basic relation between two classes is that of client-supplier. A supplier class provides a set of services that are used by the client class. The client declares, creates, and uses objects of the supplier type. Formally, a client relation is defined by a declaration of type SUPPLIER in the client class. A client chart is used to make the structure of the system clear, without a great amount of detail. Each class in the chart is shown by the name of the class enclosed in an oval. A line is drawn from the client to the supplier, from left to right on the page, to show the client links and reveal the overall structure of the system. Note that this chart does not show the objects in the system, just the classes. The code in one class may create 10,000 objects of another class, but this defines only a single client-supplier relationship. Showing every declared type on a diagram would make the diagram hard to read and thus defeat the purpose of the diagram, so three things are not shown in a client chart:
1. The expanded types: INTEGER, REAL, DOUBLE, BOOLEAN, CHARACTER.
2. The reference type STRING.
3. The formatting classes FORMAT_INTEGER and FORMAT_DOUBLE.
The client chart for the simple banking system in Case Study 2 is shown below. It says that a bank has at least one customer, and a customer has at least one account.
BANK ---> CUSTOMER ---> ACCOUNT
A class diagram is used to describe the structure of a class. The standard form of a class diagram (Booch, 1994; Coad and Yourdon, 1990; Rumbaugh et al., 1991) is a box with round corners, divided into three parts. At the top of the box is the class name. The name of each attribute in the class is then listed, followed by the name of each routine. This notation can be extended to provide more detail by including the type of each attribute, and the argument list (if any) and the returned type (if any) for each routine. A class diagram does not quite match the way Eiffel works, because it shows all the attributes and all the routines of the class. First, Eiffel makes a strong distinction between behaviour and implementation, so strong that from outside the class you cannot tell if a query is implemented as an attribute or as a function. Second, it is good software engineering practice to hide as many of the attributes as possible. A standard class diagram does not show the interface to the class; it shows the implementation. I have adapted the standard notation to be slightly more Eiffel-like, by showing the signatures and only the exported features. A class diagram for each class in the simple banking system is shown below. A class diagram is not drawn for an Eiffel library class, because such a class often has a large number of features, and is described in the Eiffel Library Manual.
BANK
    patron: CUSTOMER
    make

CUSTOMER
    name: STRING
    gender: CHARACTER
    address: STRING
    account: ACCOUNT
    make
    display
    use
    add_interest

ACCOUNT
    balance: REAL
    interest_rate: REAL
    make
    display
    deposit (amount: REAL)
    withdraw (amount: REAL)
    show_balance
    add_interest
The two types of diagram can be combined to form a system diagram, that shows a client chart for the system and a class diagram for each user-defined class. A system diagram does not show the fine control or calling structure of the system, nor does it show the data flow in the system. It shows the client (and will later show the inheritance) structure of the system, and a partial implementation of each class. Designing and using notations is a huge industry, and no notation for OO systems has yet emerged as standard. Pure Eiffel notations for system design and documentation are described in Jézéquel (1996) and Waldén and Nerson (1995).
5.7 Assertions
An explicit feature interface allows a feature to be used without worrying about the implementation; this is just what a programmer wants. A feature is defined once, and then reused forever. A caller must pass the right arguments to the feature, because actual and formal arguments must agree in order and type. The definition of arguments says nothing about the values of those arguments; it just defines their type. In most cases, however, there are also restrictions on the values that are passed to and received back from a routine. To define these restrictions, a contract is established between client and supplier by setting preconditions and postconditions on a routine.
A precondition is a condition that must be true before the routine is executed; we say the condition is asserted to be true on entry to the routine. If I wish to calculate the average, for example, the appropriate code is average := sum / count. Before this code is executed, I must be assured that the value of count is non-zero, because a divide by zero is undefined and will crash the system. Thus, a precondition on this routine is that count be non-zero. If this precondition is satisfied (if it is true), then the routine guarantees to return the correct value for the average. The correct value of the average is a postcondition on the routine; it is an assertion that must be true when the routine exits. Pre- and postconditions can be explicitly defined, and checked when the routine is entered and exits. If the precondition is violated, then the software contract becomes void, and no result is guaranteed. If the precondition is true and the postcondition is violated, then the code in the routine is incorrect. Assertions enforce the software contract. For two assertions pre and post, the contract is defined by testing the pre assertions on entry to, and the post assertions on exit from, the routine.
The general form of an assertion is a statement that evaluates to true or false; it may be implemented as a value, expression, or function call. An assertion may be preceded by a name, that is used to make the meaning of the test clear. The general form of an assertion is thus
    name: logical expression
Multiple assertions in a pre- or postcondition are separated by semi-colons. The format for a routine with assertions is shown below, where an optional TYPE is shown for functions.
name (arguments) <:TYPE> is
        -- header comment
    require
        pre-condition(s)
    local
        declarations
    do
        code
    ensure
        post-condition(s)
    end -- name
When a routine with assertions is called, the following sequence of events occurs:
1. Any arguments are bound.
2. The pre-conditions are tested. If they fail, the system dies and an error message is shown.
3. Any local variables are created and set to their default values.
4. The routine body is executed.
5. The post-conditions are tested. If they fail, the system dies and an error message is shown.
6. The contract was valid, so control is returned to the caller.
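The average example discussed above can be written with its precondition as in the sketch below. The class name STATS and the routine add are hypothetical, introduced only to give sum and count some values; the function average is a simple expression, so it carries a precondition but no postcondition.

```eiffel
class STATS
feature
    sum: REAL
    count: INTEGER
    add (value: REAL) is
            -- include this value in the running total
        do
            sum := sum + value
            count := count + 1
        ensure
            one_more: count = old count + 1
        end -- add
    average: REAL is
            -- the mean of the values added so far
        require
            not_empty: count /= 0
        do
            Result := sum / count
        end -- average
end -- class STATS
```

If a client calls average before any call to add, the precondition not_empty fails at step 2 of the sequence above, so the division by zero in the body is never reached.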
As an example, consider withdrawing money from a bank account. The routine takes a single REAL argument and returns nothing, because an amount of money is passed to the routine, deducted from the balance, and no value is returned to the caller. Three conditions can be defined on this routine to make sure that it works in the right way. First, the amount of money to withdraw should be greater than zero; you cannot ask for a negative amount of money. Second, the value of the balance should be reduced by this amount after the routine has been executed. Third, the balance should never be negative. A routine that implements withdraw is shown below, with assertions on its input and effect; multiple assertions are separated by semi-colons.
    withdraw (amount: REAL) is
            -- deduct this amount of money from the balance
        require
            positive: amount > 0;
            funds: amount <= balance
        do
            balance := balance - amount
        ensure
            changed: balance = old balance - amount
        end -- withdraw

The old operator can be used to check the value(s) changed by a procedure. The keyword old before an attribute in a postcondition refers to the value of that attribute on entry to the procedure, so the change made by the procedure can be defined and checked by the post-condition. Functions change nothing, so functions never use this keyword.

The routine header shows the signature: a single REAL value is passed to the routine from its caller, and no value is returned. The precondition defines what must be true when this routine is called: the amount to withdraw must be positive, and must not exceed the balance. The postcondition defines what must be true when the routine exits: the current balance must equal the old balance minus the amount withdrawn. These facts completely define the behaviour or action of the routine, and can be used to design the routine, to check that it is correct, and to describe the routine to a user.

A function that is only a simple expression usually does not contain a post-condition. The post-condition in this case would simply repeat the expression, so nothing is gained by it. Local variables can only be used in a routine body or in the postcondition of their routine. The special local variable Result can only be used in the body or postcondition of a function.
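The rules for functions can be sketched with two small examples. This is an illustration only: the class CALCULATOR and the functions average and square are hypothetical names that do not appear in the case study. The function average has a precondition but no postcondition, because the postcondition would simply repeat the expression; square has a postcondition that states a property of Result instead of repeating the body.

```eiffel
class CALCULATOR
    -- Hypothetical class, used only to illustrate assertions on functions

feature

    average (sum: REAL; count: INTEGER): REAL is
            -- The mean of count values whose total is sum
        require
            non_zero: count /= 0
        do
            Result := sum / count
            -- no ensure clause: it would only repeat the expression
        end -- average

    square (x: REAL): REAL is
            -- The value of x multiplied by itself
        do
            Result := x * x
        ensure
            -- Result may be used here because this is a function
            not_negative: Result >= 0
        end -- square

end -- class CALCULATOR
```

Neither function uses old, because neither changes any attribute of the object.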
A routine does not contain code in the routine body to test if it has been called in the right way. In Eiffel, design by contract says: "If you call me in the right way, I guarantee to return the correct answer". If the client calls the routine incorrectly and violates the contract, the contract is broken by the client and the supplier routine no longer guarantees anything. This is an important principle in the design of an Eiffel system: it is the responsibility of the client to call the routine in the right way.

The signature and the assertions on a routine define the formal behaviour of a routine. They can be used to specify the behaviour of a feature before any code is written, and thus be used to design the system. Assertions may be used while the system is running, to check that the code really does do what was specified. Finally, the assertions can be used to communicate with a programmer who is looking for a routine to accomplish a task, and scans through the class to find if such a routine already exists.

You turn assertion testing on and off by setting the assertion status in the Ace file. Assertion testing can be set to one of six levels, where each level adds to the previous one:

1. assertion (no)           no assertion checking
2. assertion (require)      test pre-conditions
3. assertion (ensure)       also test post-conditions
4. assertion (invariant)    also test class invariant
5. assertion (loop)         also test loop variants and invariants
6. assertion (check)        also test check instructions
6. assertion (all)          same as assertion (check)
Loop variants and invariants, and the check instruction, are not covered in this text.
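The client's side of the contract can be sketched as follows. This is a minimal illustration, not the case study code: the routine request_withdrawal is a hypothetical name, the class ACCOUNT is cut down to what the example needs, and the if statement is not covered until Chapter 6. The creation instruction !! and the routine io.new_line are assumed to be available in your version of Eiffel.

```eiffel
-- Each class would normally be stored in its own text file

class ACCOUNT
    -- Cut-down supplier: just enough to show the contract

creation
    make

feature

    balance: REAL

    make (initial: REAL) is
            -- Open the account with this initial balance
        require
            not_negative: initial >= 0
        do
            balance := initial
        end -- make

    withdraw (amount: REAL) is
            -- deduct this amount of money from the balance
        require
            positive: amount > 0;
            funds: amount <= balance
        do
            balance := balance - amount
        ensure
            changed: balance = old balance - amount
        end -- withdraw

end -- class ACCOUNT

class CUSTOMER
    -- Client: tests the supplier's precondition before calling

creation
    make

feature

    account: ACCOUNT

    make is
            -- Create an account, then try a legal and an illegal withdrawal
        do
            !!account.make (100)
            request_withdrawal (30)     -- legal: withdraw is called
            request_withdrawal (500)    -- illegal: withdraw is never called
        end -- make

    request_withdrawal (amount: REAL) is
            -- Call withdraw only when its precondition is satisfied
        do
            if amount > 0 and amount <= account.balance then
                account.withdraw (amount)
            else
                io.putstring ("Withdrawal refused")
                io.new_line
            end
        end -- request_withdrawal

end -- class CUSTOMER
```

The test lives in the client, so the body of withdraw stays free of checking code; with assertion checking turned off, the supplier does no testing at all.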
5.8 Class invariants
An assertion can be defined on a class, as well as on a routine. The class invariant defines what must be true of any object of that type. The class invariant is checked on entry to, and on exit from, any feature of the class except the creation routine; the object does not exist until after a creation routine has returned control to the caller. The invariant may be false within a routine, though this is unusual. The format of a class invariant is one or more assertions placed under the invariant keyword; multiple assertions are separated by semi-colons. The keyword is placed by convention at the end of a class, after the last feature definition. The case study uses a class ACCOUNT. Consider the situation where the balance of an account should never be less than zero. This is part of the definition of the class, so it is an assertion on the class and not on a feature of the class, so it is defined as a class invariant. The invariant is added by writing the two lines of code
    invariant
        not_negative: balance >= 0.0

in the class listing after the last feature definition, and before the end of the class. The balance is set to zero when an object is created, and should never become negative. If the code in a routine of the class does make the balance negative, then Eiffel flags a class assertion violation on exit from the routine and generates a run-time error.

5.9 Documentation: the short form of a class
Many forms of notation and documentation are used to describe a computer system at a higher level than the code. It is a commonplace in computing that the documentation never keeps pace with the system itself. When the system changes, any existing documentation has to be changed to reflect the new system. If this is done, it is a constant drain on resources for the company. If it is not done, the documentation is out of date. Eiffel solves this problem by storing the documentation in the code, and provides a set of tools to derive class and feature definitions directly from the code.

The short tool is an executable program that takes a class as an argument and produces a document from the class definition. The tool reads the text file, and records the external interface of every exported feature in the class. Non-exported features are not listed, because they are not part of the external interface. To find the useful features in a class, I simply type short followed by the name of the class, and the system displays a short listing of the class. You can produce this document for any class that you define, or for any class in the Eiffel library. Eiffel uses your Ace file to find the location of the class, and scans the text file for that class. short will only work if you have an executable system file in your current directory.

To illustrate how useful such documentation can be, the short definitions (output by running the tool short on the class ANY) for the routines copy, clone, and equal are given below. The routine header gives the type of the input data passed to the routine, and output data passed back from a function; this defines the signature of the feature. The interface provides all the information needed to use these routines, with no knowledge of the implementation. The interface definitions are:
    copy (other: like Current) is
            -- Copy every field of other onto
            -- corresponding field of current object
        require
            other_not_void: other /= Void
        ensure
            is_equal (other)
        end -- copy

    clone (other: ANY): like other is
            -- Void if other is void.
            -- Otherwise, new object is field-by-field identical
            -- to object attached to other
        ensure
            equal (Result, other)
        end -- clone

    equal (some: ANY; other: like some): BOOLEAN is
            -- Are some and other either both void
            -- or attached to field-by-field identical objects?
        ensure
            Result = (some = Void and other = Void) or
                (some /= Void and other /= Void and then some.is_equal (other))
        end -- equal

The three routine definitions use two Eiffel features that have not yet been covered. The keyword like allows the type of an argument to be defined as "like this one". The class ANY matches a class of any type. Because the routines can be applied to an object of any type, the object received as an argument can be of any type. Defining one argument to be like another (to be of the same type) means that the routine can declare and use an argument of the appropriate type, no matter what type of object was passed to it.

A short listing does not list all the features offered by a class, just the features that are defined in the class and exported. A class may also inherit features, discussed in Chapter 10; a listing of all the available features can be generated by running flat on the class, with a command of the form flat followed by the name of the class.

Common error: No executable in the current directory; short produces no output.
What to do: Compile an Eiffel system so that you have an executable in your current directory.
5.10 The Eiffel library class STRING

An abbreviated version of the short form of the ISE Eiffel Version 3.3.7 library class STRING is shown below. The class offers many more features than these; the full list may be found by looking in the Eiffel Version 3 Library Manual, which lists the services of over 100 library classes, or by running short on the class in your system. A selection of features is shown below, and some of them are described after the short listing.
    -- Character strings
    class interface STRING

    creation procedures

        make (n: INTEGER)
                -- Allocate space for at least n characters.
            require
                non_negative_size: n >= 0
            ensure
                capacity = n

    exported features

        infix "<=" (other: like Current): BOOLEAN
                -- Is current string less than or equal to other?
            ensure
                Result implies not (Current > other)

        infix "<" (other: STRING): BOOLEAN
                -- Is current string lexicographically lower than other?

        infix ">=" (other: like Current): BOOLEAN
                -- Is current string greater than or equal to other?
            ensure
                Result implies not (Current < other)

        infix ">" (other: STRING): BOOLEAN
                -- Is current string greater than other?
            ensure
                Result implies not (Current <= other)

        append (s: STRING)
                -- Append a copy of s at end of current string.
            require
                argument_not_void: s /= Void
            ensure
                count = old count + s.count

        capacity: INTEGER
                -- Number of characters guaranteed to fit in space
                -- currently allocated for string

        copy (other: STRING)
                -- Reinitialise with copy of other.
            require
                other /= Void
            ensure
                count = other.count
                -- For all i: 1 .. count, item (i) = other.item (i)

        count: INTEGER
                -- Actual number of characters making up the string
            ensure
                Result >= 0

        empty: BOOLEAN
                -- Is string empty?

        fill_blank
                -- Fill with blanks
            ensure
                -- For all i: 1 .. capacity, item (i) = Blank

        is_equal (other: STRING): BOOLEAN
                -- Is current string made of the same character sequence as other?

        item, infix "@" (i: INTEGER): CHARACTER
                -- Character at position i
            require
                index_large_enough: i >= 1;
                index_small_enough: i <= count

        put (c: CHARACTER; i: INTEGER)
                -- Replace by c character at position i.
            require
                index_large_enough: i >= 1;
                index_small_enough: i <= count
            ensure
                item (i) = c

        remove (i: INTEGER)
                -- Remove i-th character.
            require
                index_large_enough: i >= 1;
                index_small_enough: i <= count
            ensure
                count = old count - 1

        resize (newsize: INTEGER)
                -- Reallocate if needed to accommodate at least newsize characters
                -- Do not lose any characters in the existing string
            require
                new_size_non_negative: newsize >= 0
            ensure
                count >= newsize;
                count >= old count

        substring (n1: INTEGER; n2: INTEGER): STRING
                -- Copy of a substring of current string containing all characters
                -- at indices between n1 and n2
            require
                meaningful_origin: 1 <= n1;
                meaningful_interval: n1 <= n2;
                meaningful_end: n2 <= count
            ensure
                Result.count = n2 - n1 + 1
                -- For all i: 1 .. n2 - n1, Result.item (i) = item (n1+ i - 1)

        to_lower
                -- Convert string to lower case.

        to_upper
                -- Convert string to upper case.

    end interface -- class STRING

The first feature in the listing is the creation routine for a string, which allocates enough storage to store n characters, where n is the single input argument. The precondition states that n must be non-negative, and the postcondition states that the string can now contain n characters.

The operator "<=" is an infix operator that compares the content of two strings on the basis of lexicographic order. Lexicographic order compares characters in the two strings from the first (leftmost) to the last, stopping when the characters are not equal. The value used for comparison is the ASCII value of the character, so 'A' < 'B' < ... < 'a' < 'b', and so on. Because it is an infix operator, it is called by writing "s1 <= other" instead of using a normal feature call of the form "s1.<= (other)". The feature is called on a string, and takes a string as argument (the argument is like the current object). There is no restriction on the value of the input argument. If a value of true is returned, then the current string is not greater than the string passed as an argument.

The append feature takes a string as an argument, and appends the argument to the end of the current string.
It requires that the argument not be Void, so the passed reference must actually point to an object of type STRING. If this precondition is satisfied, then the routine guarantees that the new value of the current string has a total length of the old string and the string passed as an argument; the old operator shown in the listing refers to the value of the current object on entry to the routine. The feature substring takes two integers as arguments. It returns a value of type STRING, that is the part of the current string between positions n1 and n2 in the string. Consider the strings s1 and s2 shown below, where a substring of s1 ("der") is assigned to the string s2:
    s1: STRING is "Wonderful time"
    s2: STRING

    s2 := s1.substring (4, 6)
    io.putstring (s2)            ==> "der"

The function returns a string of length 3, guaranteed by the postcondition Result.count = n2 - n1 + 1. The returned string is a copy of the part of the string containing all the characters from the fourth to the sixth position, as stated in the header comment and the commented postcondition

    -- For all i: 1 .. n2 - n1, Result.item (i) = item (n1+ i - 1)

This is a comment rather than an executable postcondition because the Eiffel proof machinery to process assertions cannot handle the logical quantifier "for all"; a discussion of this topic would take us far beyond Eiffel, however, into the field of automatic proof checking.

The preconditions of each feature are used to check that the feature has been called in the right way; this is part of the assertion checking that enforces programming by contract. If a precondition is violated, then the label to the left of that precondition is displayed as an error message. If a client called the feature substring to get a part of the current string and passed values that were invalid, then the name of the violated precondition (meaningful_origin, meaningful_interval, or meaningful_end) would be displayed as part of the error message to tell the user exactly what went wrong.

There are two names for the function that returns a character from a string, given the position of the character. The first name is item, a function that is called using the normal dot notation, such as s.item (3). The second name for the function is the infix operator @, which is called using the normal infix format, such as s @ 3. Both names refer to the same function body, and are thus names of the same feature.

The class interface illustrates how reusable software is reused. A programmer need not write code for any of these STRING features; it is enough to find the appropriate feature in the class, examine the interface to see how it is used, and then use it.
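A short client of STRING shows several of these features working together. It is a sketch that assumes the features behave as listed in the interface above; the class STRING_DEMO is hypothetical, and the output routines io.putint, io.putchar, and io.new_line are assumed to exist alongside io.putstring.

```eiffel
class STRING_DEMO
    -- Hypothetical client, written only from the STRING interface

creation
    make

feature

    make is
            -- Build a string, then examine parts of it
        local
            s, part: STRING
        do
            !!s.make (20)               -- precondition: 20 >= 0
            s.append ("Wonderful")
            s.append (" time")          -- s is now "Wonderful time"
            io.putint (s.count)         -- count = old count + s.count
            io.new_line

            part := s.substring (4, 6)  -- a copy of positions 4 to 6
            io.putstring (part)         -- "der"
            io.new_line

            io.putchar (s.item (1))     -- dot notation
            io.putchar (s @ 1)          -- same feature, infix notation
            io.new_line

            part.to_upper               -- changes the copy, not s
            io.putstring (part)
            io.new_line
        end -- make

end -- class STRING_DEMO
```

Every call satisfies its precondition: the indices passed to item and substring lie between 1 and count, and the arguments to append are not Void. A call such as s.substring (6, 4) would instead violate meaningful_interval.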
5.11 Errors

Errors can occur at three levels in the design and coding of a computer system; errors in a program are commonly called bugs. The first and simplest level is that of syntactic errors, where the syntax of a statement is incorrect; an error in the syntax is flagged when you try to compile the system. The second level includes type and interface errors, where each part of a system is correct in isolation, but the pieces don't fit together; in Eiffel, this level is also discovered and flagged at compile time. The third level includes semantic errors, where the system compiles and executes but produces the wrong result. A system may also be stylistically wrong or badly designed, but no compiler can yet catch this type of error. Eiffel tries to catch as many errors as possible at compile time, because it is better to fix bugs as soon as possible in system development.

The best solution to the problem of errors is to avoid them in the first place (antibugging). All the design and layout rules given in this book are antibugging tools; they are the products of long experience in making code easier to write and understand. If, after designing the system in parts, using classes and routines to define small modules, and coding and testing each part as it is added, your system is still buggy, then it is time for debugging; this is the most infuriating, unpleasant aspect of programming.
5.11.1 Antibugging

Antibugging is the use of tools and techniques to avoid the generation of bugs. The most common novice error is to forget an end statement. Every routine, every if statement, every inspect statement, and every loop statement requires an end. Because the end is not the focus of attention during coding, it is easy to forget it. The compiler will discover that an end is missing, and generate an error message that specifies a line number, but the error may not be in that line. The Eiffel compiler generates an error message when it can't parse the current statement; this is what the compiler means by an error. For this reason, the actual error will be at or before the flagged line.

If an end is omitted from an if statement inside a loop, the compiler will probably not notice the error at that point, because the code can still be parsed. The statements following the 'missing' end are simply added to the code under the control of the if statement, as if they were part of the if statement. The end of the loop is then interpreted as the end of the if, and the end of the routine is interpreted as the end of the loop. The first time the compiler realizes that something is wrong is when it sees the next routine header inside the 'current' routine. At that point, the compiler gives up and generates an error message, but the real error may be long before the line flagged by the compiler.

The best way to avoid this error is to check that every compound statement is terminated by an end. The best way to ensure this is to indent the code to show flow of control; a missing end then visually 'jumps out' at the programmer. Indenting should be done when the code is written, not as an afterthought; it is a simple, powerful aid to the programmer.

The second level of errors involves inconsistent code, where two pieces of code are correct by themselves, but do not fit together. This type of error is normally found by the compiler as a result of Eiffel's strict type checking. One example of type checking is that actual and formal arguments must agree in number, order and type; if they do not, then the calling and called pieces of code don't fit together.
Eiffel catches a large number of errors during the compilation stage, errors that in a less strict system would not be found until run time. Finding errors at the compilation stage is a great advantage, because the compiler tells you where the error is, and sometimes what the error is; you don't have to track it down.
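The advice on indentation in this section can be seen in a small routine with an if statement inside a loop, the situation described above. The class END_DEMO is hypothetical, the loop and if statements are covered in later chapters, and io.putint is assumed to be available. Each end is written at the same indentation as the keyword it closes, and is labelled with a comment.

```eiffel
class END_DEMO
    -- Hypothetical class: counts the even numbers from 1 to 10

creation
    make

feature

    make is
            -- Each compound statement is closed by an end that lines up
            -- with the keyword that opened it
        local
            i, evens: INTEGER
        do
            from
                i := 1
            until
                i > 10
            loop
                if i \\ 2 = 0 then      -- remainder of i divided by 2
                    evens := evens + 1
                end -- if
                i := i + 1
            end -- loop
            io.putint (evens)           -- five of the ten values are even
            io.new_line
        end -- make

end -- class END_DEMO
```

If the line end -- if were deleted, the instruction i := i + 1 would be read as part of the if statement, and the remaining ends would each close the wrong construct, exactly as described above. With the code indented this way, the gap under the if is easy to see.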
5.11.2 Debugging

The third and most difficult type of error is one that occurs at run time, when the compiled code is executed. The error may cause the system to crash with a run time error, or the system may run to completion but produce the wrong output. This last type of bug is the hardest to find, because there is no obvious place to look for it; the system executes and terminates normally. The hardest part of debugging is finding the bug; once found, it is comparatively easy to fix.

When Eiffel detects an error at run time, the system terminates with an exception message. The message says that an exception was generated because an assertion was violated; the name of the assertion is usually shown, to indicate what was wrong. Eiffel also shows the location of the error by displaying the calling stack at the time the error occurred. When a routine is called, it is placed on a stack; more formally, a record of the routine call is placed on the call stack. When the routine exits, the record is taken off the stack. The current feature is thus at the top of the stack, the feature that called it is second on the stack, and so on down to the creation routine for the root class at the bottom of the stack. By looking at the routines on the stack, it is easy to find which routine contained the run time error. By looking at the assertion that was violated, it is easy to find what the error was in that routine.

In the case study (Part 2), the make routine in BANK calls the creation routine in CUSTOMER. The make routine in CUSTOMER creates an account, then executes one transaction of each type on the account. In the scenario used here, assume that an error occurs when the withdraw routine is executed in class ACCOUNT, because the funds precondition is violated. At that point, Eiffel will halt the system and show the current state of the calling stack. While each version of Eiffel uses a different format, the run-time error output should look something like

    CLASS       ROUTINE     ERROR
    ACCOUNT     withdraw    funds: balance >= 0
    CUSTOMER    withdraw    Assertion violation
    BANK        make        Assertion violation

The bug that caused the crash is shown at the top of the stack of routine calls. The table says that the funds assertion was violated in the withdraw routine of class ACCOUNT. This routine was called from the withdraw routine of class CUSTOMER, which in turn was called from the make routine of class BANK. Knowing that the balance was wrong, it is a simple step in this example to find the error; the code in the withdraw routine of class CUSTOMER did not check that the withdrawal was legal before executing it. It is the responsibility of the caller to do this, so new code would have to be added in class CUSTOMER to validate the user input and fix the bug.

If the system runs but produces the wrong output, your code is syntactically correct but it solves the wrong problem. In this case, the location of the bug can be very hard to find. Few things are more frustrating than staring at code for minutes ... hours ... days, and then realizing that the bug was in a different part of the system entirely! Don't stare at code for more than a couple of minutes; if you don't find the bug in that period, it is time to work smarter, not harder.

There are two standard debugging tools: be the computer, and see the data. The first tool requires you to play the role of the computer, and see what the code actually does, as opposed to what you think it does. Pretend that you are really dumb, as dumb as a computer, and that you can understand nothing except very simple instructions; but you know exactly what to do with each instruction. Go through the code using actual data values, and see if the code behaves the way you expected. This technique is known as hand execution of the code, because you execute the code "by hand" and use paper and pen to write down the values made by the code. The second standard debugging tool is the use of debugging output to locate the bug.
If your system is hundreds or thousands of lines long, then the first task is to work out where the bug is not, so you can narrow down the location of where the bug must be. The technique of writing a series of small routines is the antibugging solution to this problem, because the bug must be located in the small amount of code inside the routine. If your routine is large, then you should think strongly about making it more modular by breaking it into a number of simpler pieces. This technique also makes the code reusable, and allows you to test a set of small, easy to understand routines. If you have examined the code in the small routine and still can't find the bug, then collect more information by placing output statements in the code. If the output is correct, then the error must occur after that position in the code; if the output is incorrect, then the error must occur before that statement. When you don't know the answer to a question, seek more information, don't stare at the code; debugging output gives you that information. Eiffel supplies a special debug keyword that allows you to turn debugging output on and off. In your code, you write a debug clause of the form
    debug
        instruction
        instruction
        ...
    end

and the instructions in this clause (probably some form of output) are executed when debugging is turned on. To control debugging, set the debug status in your Ace file to:

1. debug (no)     no debugging
2. debug (yes)    execute debug clause
3. debug (all)    same as debug (yes)
You can also have several types or levels of debugging, by attaching a string to the debug clause in your code, of the form
    debug ("Hard stuff")
        instruction
        instruction
        ...
    end

In your Ace file, you turn on this labelled debug clause with a statement of the form

    debug ("Hard stuff")

You can have multiple occurrences of a labelled debug clause, and multiple labels. Execution of each labelled clause is turned on by including a debug line with that label in your Ace file.

To be able to use debugging output, you must know the correct or expected value of the output, so you can compare the actual to the predicted value. For this reason, test values should be as simple as possible so you can easily calculate the correct answer; values of 0 and 1 are good candidates. In testing the gross_pay routine above, for example, you might enter 40 for the number of hours and 1 for the pay rate; if you entered 53.72 for the hours and 12.346 for the rate, then it is hard to even calculate the correct answer, and thus hard to see if your code is correct. A good place to check that the code is correct is at the boundaries of a routine; Eiffel uses pre- and postconditions for just this purpose.
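A labelled debug clause and simple test values can be combined as in the sketch below. The class PAY_DEMO and the label "pay" are hypothetical, and the body of gross_pay is only a stand-in (hours times rate) for the routine described earlier in the text; io.putreal and io.new_line are assumed to be available.

```eiffel
class PAY_DEMO
    -- Hypothetical class showing debugging output on a small routine

creation
    make

feature

    gross_pay (hours, rate: REAL): REAL is
            -- Stand-in version: the hours worked times the hourly rate
        require
            hours_not_negative: hours >= 0;
            rate_not_negative: rate >= 0
        do
            Result := hours * rate
            debug ("pay")
                -- executed only when the Ace file turns on this label
                io.putstring ("gross_pay: hours = ")
                io.putreal (hours)
                io.putstring (", rate = ")
                io.putreal (rate)
                io.putstring (", result = ")
                io.putreal (Result)
                io.new_line
            end
        end -- gross_pay

    make is
            -- Test with values whose answer is obvious: 40 * 1 = 40
        do
            io.putreal (gross_pay (40, 1))
            io.new_line
        end -- make

end -- class PAY_DEMO
```

With the line debug ("pay") in the Ace file, the routine reports its inputs and result each time it is called; without that line, the clause is skipped and only the final value is shown. Because the expected answer is 40, any other value points straight at the routine body.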
5.12 Case study: export and assertions

Part Five of the case study shows the class and feature assertions for the system of three classes, and makes most of their features private.

Main points covered in this chapter

• Reuse is made possible by defining a clear, explicit interface between parts. Eiffel makes a strong distinction between behaviour and implementation. To use something, we do not need to know how it works, only how it behaves.

• The interface to a routine is defined by its name, signature, comment, and assertions. The signature constrains the type that can be used, and the assertions constrain the value that can be used. An assertion is a statement that evaluates to true or false.

• A pre-condition defines what must be true on entry to the routine, and is listed under the require keyword before the routine body. If a pre-condition fails, then the caller is wrong.

• A post-condition defines what must be true on exit from the routine, and is listed under the ensure keyword after the routine code. If a post-condition fails, then the routine is wrong.

• It is the responsibility of the caller to call a routine in the right way. Design by contract says "If you call me in the right way, I guarantee to produce the right results. If not, not."

• The behaviour of a class is defined by its exported features. The class interface can be seen by running the short or flat tool on a class.

• Eiffel uses assertions to check that a routine has been called correctly. If a call is incorrect, then the system dies at run-time and shows the state of the call stack at that time. The failing assertion is shown at the top of the stack.

• Antibugging is the prevention of errors by careful design and good habits. The best habit is to define a set of small, simple, reusable routines that are easy to understand and check.

• The hardest part of debugging is finding the error. The most powerful debugging tools are hand execution and the use of debugging output. Eiffel supplies the debug clause to control debugging output.
Exercises

1. What is a precondition? What keyword precedes the precondition in a feature?

2. What is a postcondition? What keyword precedes the postcondition in a feature?

3. What happens when a routine with assertions is called?

4. How does Eiffel use preconditions to generate error messages?

5. What is the complete interface definition for a feature?

6. Run short on class REAL. What is the output from running short on a class?

7. Run flat on class REAL. What is the output from running flat on a class?

8. Run short on each class of the current case study to show the class interfaces. Run the system, using data that will crash the system with an assertion violation.

9. How is a creation policy specified? What does a creation policy control?

10. How is an export policy specified? What does an export policy control? Can a feature have two export policies?

11. Why must an equality function be exported to its own class?
12. Consider the following specification:

Bill the builder

Bill the builder has come to you for help. He has a job to convert a tool shed into a shrine to Elvis Presley, and needs to know how much he should charge for the building job. Write a system that prompts him for input, and shows him the amount of material he needs, and the amount of money he should charge for the job.

The tool shed consists of one large, rectangular room. It is 3 meters high, 2.8 meters wide, and 5.6 meters long. The owner, Mr. Prince, wants to cover the walls in an expensive wallpaper made of crushed red velvet, with silver outlines of Elvis on it. The windows he wants are specially glazed with frosted outlines of angels. The doors are covered with mats of Kentucky blue grass. Show Bill how much wallpaper, glass, and matting he should buy, and how much to charge Mr. Prince.

Wallpaper comes in rolls of 50 square meters, and costs $299.99 per roll; you cannot buy partial rolls. Glass is cut to size, so Bill can buy exactly as much as he needs; glass costs $89.99 per square metre. Matting is bought in units of a square metre, and costs $312.00 per square metre; you cannot buy it in smaller pieces. Bill charges $45 an hour for his labor.

Mr. Prince keeps changing his mind about how many windows and doors he wants, and how large they are. Thus, you have to read in all the relevant data as input, since the plans can change without notice. When he took the job, Bill insisted that all the doors in a wall were the same size, and all the windows in a wall were the same size. Read in the number of windows in each wall, and their dimensions, read in the number of doors in each wall and their dimensions, and calculate the amount of material that Bill has to buy. After all the room details have been input, ask Bill how many hours he thinks he will need to do the job. Show the amount and price of each material needed, the amount and price of labor needed, and the total price for the job.

Program details

The program reads in, for each wall, the number and size of the windows, and the number and size of the doors. It also reads the estimated number of hours needed for labour. The program calculates the amount of each material needed, and the cost of buying the materials. Bill can cut the wallpaper and matting to size, so you need not worry about whether a window cuts Elvis in half, or not. The program also finds the cost of labour, and the total cost of the job. Assume that all measurements are accurate to two decimal places. A sample output from the program looks like this:

    BILL'S BUILDING BILL

                  Amount    Cost      Buy                Total
    Wallpaper:    41.67     299.99    1 rolls            $299.99
    Glass:        0.57      89.99     0.57 sq. meters    $51.29
    Matting:      4.80      312.00    5 sq. meters       $1560.00
    Labor:        18        45        18 hours           $810.00
                                                         ------------
                                                         $2721.28
                                                         ------------

a) Define the data used in the system and work out the data flow.
b) Define the classes in the system by placing each of the variables in a class.
c) For each class, define the signature for each feature.
Chapter 6: Selection
Keywords: control flow, BOOLEAN, if, block, inspect
Any program can be built from three components: sequence, selection, and iteration. These components define the control flow in a system, the order in which actions are executed. Listing order and routine calls define the sequence, the if and inspect statements define the selection, and the loop statement defines the iteration in a system.
The basic flow of control is serial order: instructions are executed in the order given in the routine listing. When a routine is called, the code in the routine is executed and control returns to the caller. Both listing order and routine calls define a single, linear path through the code.
The execution path can be split into several alternate paths, depending on the value of a test. The simplest test uses a BOOLEAN value to test if an expression is true or false, and uses this value to control an if statement. Depending on the value of its test, the if statement selects which action to execute next. The inspect statement inspects an enumerated value and selects the appropriate instructions to execute next.
6.1 Sequence, selection, and iteration
A computer executes actions in the order they are listed in the code, unless told to do something different. There are three ways to change the flow of control in a system from the simple listing order of code. Calling a routine transfers control to that routine. Selecting a block of code transfers control to that block. Iteration controls how many times a block of code is repeated inside a loop. In all three cases, control returns to the code immediately after the routine, the selection statement, or the loop. Any program can be built from sequence, selection, and iteration; the approach of using only these three components to build a program is called structured programming.
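[Figure: three control flow charts. Sequence: an action followed by the next action. Selection: a test whose true and false branches lead to action 1 or action 2, then to the next action. Iteration: a test that either repeats the action or passes control to the next action.]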
A control flow chart shows the order in which actions are executed when the code is run. Because a control flow chart was the first type of chart in general use, it is often referred to as a flowchart. Control flows from one line of code to the next if actions are executed in sequence; this is shown on the left of the diagram. Control may be split so that several next actions are possible, depending on the value of a test; such a pattern is shown in the middle of the diagram. Control may return to a previous action or not, depending on the value of a test; this pattern is shown on the right of the diagram. A test evaluates to either true or false, so it is implemented as a Boolean variable, expression, or function.
In a flowchart, there are three main symbols. A small circle (not used here) indicates the start or the end of the program, a box represents one or more lines of code, and a diamond represents a Boolean test. The arrows that connect these symbols indicate the flow of control. The standard conventions are to draw the control flow from top to bottom and left to right in the chart, and to draw each box so it has only a single entry point. If a chart is split across several physical pages, then numbered circles are used at the top and bottom of each page, to show how the charts connect across pages.
6.2 BOOLEAN values
A variable of type BOOLEAN can have either the value true or the value false; true and false are keywords in Eiffel. A Boolean variable is declared like any other variable, and has an initial, default value of false.
declaration             name        type      value
valid: BOOLEAN          valid       BOOLEAN   false
good_news: BOOLEAN      good_news   BOOLEAN   false
bad: BOOLEAN is true    bad         BOOLEAN   true
A Boolean value can be calculated either by writing a Boolean expression, or by writing a Boolean function that returns a Boolean value. The values true and false can be used as literal values in an expression, but this is unusual because their values don't change and do not provide a test of anything. A Boolean value is stored in a Boolean variable by an assignment statement. There is no mechanism to read in a Boolean value from the terminal (there is no io.readbool). A Boolean value is output by the command io.putbool, which shows either the word "true" or the word "false" on the screen.
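The points above can be put together in a short sketch, written in the same Eiffel dialect as the rest of this text. The class name BOOLEAN_DEMO and the variable hours are invented for the illustration; only io.putbool and io.new_line are taken from the standard io object.

```eiffel
class BOOLEAN_DEMO
    -- hypothetical class, used only to illustrate this section

creation
    make

feature

    valid: BOOLEAN
            -- not assigned yet, so it holds the default value false

    make is
            -- output a default value, then a calculated value
        local
            hours: INTEGER
        do
            io.putbool (valid)      -- valid is still false here
            io.new_line
            hours := 8
                -- store the value of a Boolean expression
            valid := hours > 0 and hours <= 24
            io.putbool (valid)      -- valid is now true
            io.new_line
        end -- make

end -- class BOOLEAN_DEMO
```

The assignment stores the result of the whole expression in valid; no if statement is needed to set a Boolean variable.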
6.3 Relational operators
Relational operators take two values as arguments, compare the values, and produce a BOOLEAN result. The operators, their names, and an example of each, are
operator   meaning                    example
=          equal                      hours = 8
/=         not equal                  reply /= 'q'
<          less than                  weight < maximum
>          greater than               value > minimum
<=         less than or equal         my_wage <= your_wage
>=         greater than or equal      discriminant >= 0.0
All the relational operators are infix or binary operators, because they are placed between the two values they use. Relational operators have lower precedence than the numeric operators, so the numeric values in a flat expression are calculated first and then compared; the total precedence order for all operators is given in Appendix B. Relational operators all have equal precedence, so they are evaluated left to right in an expression, unless the default precedence is overridden by the use of brackets.
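As a small illustration of this precedence rule, the sketch below compares a calculated cost with a budget. The class and feature names are invented for the example. Because the numeric operators are applied first, the expression needs no brackets; the second function adds brackets only to show how the first one is read.

```eiffel
class PRECEDENCE_DEMO
    -- hypothetical class, used only to illustrate this section

creation
    make

feature

    make is
        do
                -- 40 * 45 = 1800, which is greater than 1500 + 200 = 1700
            io.putbool (over_budget (40.0, 45.0, 1500.0))
            io.new_line
            io.putbool (over_budget_bracketed (40.0, 45.0, 1500.0))
            io.new_line
        end -- make

    over_budget (hours, rate, budget: REAL): BOOLEAN is
            -- does the labour cost exceed the budget plus a 200 margin?
        do
                -- * and + are applied before >
            Result := hours * rate > budget + 200.0
        end -- over_budget

    over_budget_bracketed (hours, rate, budget: REAL): BOOLEAN is
            -- the same test, with the default order written out
        do
            Result := (hours * rate) > (budget + 200.0)
        end -- over_budget_bracketed

end -- class PRECEDENCE_DEMO
```

Both functions return the same value for the same arguments; the brackets in the second version change nothing except how the line reads.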
6.4 Boolean operators
Boolean or logical operators take one or two Boolean values as arguments, and return a Boolean value. The Boolean operators are not, and, and then, or, or else, xor, and implies; the behaviour of each operator is given by the truth tables shown below. In the table, the symbol T means true, F means false, and ? means don't care. The operator not takes a single argument, so it is a unary, prefix operator; all other Boolean operators take two arguments, so they are binary and infix operators. The operator xor is the exclusive or operator; the expression is true if one or the other, but not both, of its arguments is true. The operator implies is the logical implication operator, that has the formal behaviour that a false premise (the value of a here) implies anything.
not
a    not a
T    F
F    T

and                    or                     xor
a    b    a and b      a    b    a or b       a    b    a xor b
T    T    T            T    T    T            T    T    F
T    F    F            T    F    T            T    F    T
F    T    F            F    T    T            F    T    T
F    F    F            F    F    F            F    F    F

and then                  or else                  implies
a    b    a and then b    a    b    a or else b    a    b    a implies b
T    T    T               T    ?    T              T    T    T
T    F    F               T    ?    T              T    F    F
F    ?    F               F    T    T              F    ?    T
F    ?    F               F    F    F              F    ?    T
The operators and then, or else, and implies are lazy operators, because the second argument is evaluated only if necessary. In a or b, for example, if the value of a is true then it doesn't matter what the value of b is; true or anything evaluates to true. In the normal form of the operator, the value of both arguments is tested, so the value of b is always evaluated and tested in a or b. For the lazy operators, the first argument is always evaluated but subsequent arguments are only evaluated if they are needed to find the value of the expression. In a or else b, for example, the value of b is not tested if a is true, because there is no need; if a is true, then the expression must evaluate to true.
Arguments in a Boolean expression can be literals (true, false), variables, or expressions. There is no need to compare a Boolean value to true or false to find its value, because the value is either true or false. Two solutions are shown below that have identical behaviour, but the first is longer and harder to read:
    if good_gender = true then ...
    if good_gender then ...
If the Boolean has a simple, meaningful, and clear name, then the meaning of the test is obvious and the code is crystal clear.
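A common use of a lazy operator is to guard a calculation that is only safe when the first test fails. The sketch below is one possible illustration; the class name, the feature low_average, and its argument names are all invented for the example. The integer division is skipped whenever the number of students is zero, so the function cannot divide by zero.

```eiffel
class LAZY_DEMO
    -- hypothetical class, used only to illustrate lazy operators

creation
    make

feature

    make is
        do
                -- true: students = 0, so the division is never attempted
            io.putbool (low_average (0, 0))
            io.new_line
                -- false: 400 // 4 = 100, which is not less than 50
            io.putbool (low_average (400, 4))
            io.new_line
        end -- make

    low_average (marks_total, students: INTEGER): BOOLEAN is
            -- is there nobody to average, or is the average below 50?
        do
                -- or else: the right-hand side is evaluated
                -- only when students = 0 is false
            Result := students = 0 or else marks_total // students < 50
        end -- low_average

end -- class LAZY_DEMO
```

If the plain or operator were used instead, both arguments would be evaluated, so the first call would attempt to divide by zero.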
The precedence order for the Boolean operators is shown below, using the precedence levels given in Appendix B; a high level means that an operator is applied before an operator at a lower precedence level.
Precedence level    Operator
11                  not
5                   and, and then
4                   or, or else, xor
3                   implies
The precedence order for numeric, relational, and Boolean operators given in Appendix B states that not is evaluated first in a flat expression, then the numeric operators are evaluated, then the relational operators, and finally the remaining Boolean operators. Complex expressions are evaluated using this default operator precedence order, unless it is overridden by brackets. Several examples of complex Boolean expressions and their order of evaluation are given below; the values of the variables in each expression are given above the examples:
a = true, b = false, c = false
    a and b or not c -> a and b or true -> false or true = true
    not a and then b or c implies b -> false and then b or c implies b -> false or c implies b -> false implies b = true
x = 12, y = 3, z = 6.5
    not (x > y) and (x >= z) -> not true and (x >= z) -> false and (x >= z) -> false and true = false
    (x / 3 * y) /= z + 6 -> (4.0 * y) /= z + 6 -> 12.0 /= z + 6 -> 12.0 /= 12.5 = true
6.5 Boolean functions
An Eiffel system is built from a large number of small routines. Each routine is placed in its appropriate class, with the data that it uses or changes. Design by contract means that often the control is placed in a client class, and both the test and the actions are defined in a supplier class. There are four benefits from this convention:
1. a set of reusable routines is defined
2. the responsibility for using a routine is clearly placed in the client
3. the responsibility for supplying a correct routine is clearly placed in the supplier
4. the names of the routines make the meaning of the code obvious.
Consider the print_title routine shown earlier, that tested the gender code for two values. If the gender code was input by a user, it may be incorrect and the input has to be validated before it is used. One way to validate the gender is to use a BOOLEAN function, that returns true for a valid gender code and false otherwise. The solution shown below sets Result to be true if the code is valid; if the code is not valid, then the function returns its default value of false:
    good_gender (code: CHARACTER): BOOLEAN is
            -- is code a valid gender code?
        do
            Result := code = 'M' or code = 'F'
        end -- good_gender
The caller of this code is responsible for calling it correctly, so the caller (often a client) would contain code such as
    if good_gender (gender) then print_title (gender) end
This code is short and blindingly obvious, because care was taken to define small routines with meaningful names that behave correctly. Large blocks of in-line code are a sure sign that the author has implemented their first idea, and not thought about the next person to use the code.
6.6 Selection: the if statement
A selection operator selects an action to execute, depending on the value of its test or condition. The basic conditional operator in Eiffel is the if statement. The basic format of an if statement is the keyword if, followed by a Boolean expression, followed by the keyword then, followed by at least one action, and terminated by the keyword end. The condition in the statement is evaluated and either zero or one actions are selected and executed, then control passes to the statement following the end of the selection.
There are three variants of the if statement; each of them has a final end. Zero or one actions may be executed in the first two variants, but the last variant has to execute exactly one of the alternative actions. An action may be a single line of code, but more often is multiple lines of code, that may include routine calls. A group of actions that are executed as a single unit, without being enclosed in a routine, is called a block of code. The three variants of the if statement are
i) Single condition:
    if condition then
        action
    end
If the condition is true, then the enclosed action is executed. If the condition is false, then no action is done.
ii) Multiple conditions:
    if condition1 then
        action1
    elseif condition2 then
        action2
    elseif condition3 then
        ...
    end
If condition1 is true, then action1 is executed, and control passes to the statement after the end of the compound if statement. If the first condition is false, then the elseif part of the statement is tested, and the same rules are applied: if condition2 is true, then action2 is executed, and if condition2 is false, then the alternate part (elseif) of that statement is tested. The conditions are tested one by one until some condition evaluates to true, or all the conditions have been tested. If none of the conditions evaluate to true, then none of the actions are executed.
iii) Default final action:
    if condition1 then
        action1
    ...
    else
        default action
    end
One of the actions must be executed in this form of the statement. The conditions are tested in the listed sequence. If one of them is true, then its action is executed. If none of the conditions are true, then the final else keyword is reached, and the final action is executed, so the final action provides a default if none of the conditions turned out to be true.
Because the action can be compound, it is possible to code an if, inspect, or loop statement as part of the selected action. This practice is known as nesting code, because one complex action is contained or nested in another. It is common in traditional languages, but uncommon in Eiffel. In Eiffel, a routine provides the basic unit for action, and any complexity is hidden inside the routine. Each routine is small, does a single thing, and can be reused. Eiffel practice is to enclose complex code in a routine, and select and call that routine rather than writing the code "in-line" as a block inside the control statement. Often the selection occurs in a client class and the action is placed in a supplier class; this makes it clear that the client is responsible for calling the supplier routine in the right way.
The layout of an if statement on the listing follows the Eiffel convention for indenting. For a basic if statement, the if and the then are written on the same line, and the end is placed on a new line. When there are multiple alternate actions, each elseif starts a new line, followed by its block of code. If the block is a single instruction that fits on the same line as the condition, it is written on that line. If the action is too large to fit on the line, or if there are multiple actions, a new line is taken and the actions in the block are indented four spaces. Indenting the code is an invaluable aid to reading, understanding and debugging code. When used well, the control flow "is obvious" just from the layout of the listing.
Indenting should be done when you write code, because it is so useful in design and debugging. An indentation in the code makes the control flow obvious; if you have actions controlled by a condition, then the controlled actions are visibly indented to show that control. Sloppy, incomplete, or inconsistent indenting is a sure sign that the author did not care about the next person to read their code. Selection is not shown on a client chart. If there is a feature call, then that call defines the client relation between two classes. The relation is true whether or not a particular run of the system executes that feature call, so the client relation does not reflect conditional actions.
6.7 Examples: the if statement
The first example of selection uses the gender of a person to generate a title for their name. The task is to examine the value of a character variable ('M' or 'F'), and produce a string output ("Mr." or "Ms."). There are many ways to do almost any task in programming, and the selection of the best alternative separates the good from the bad designer. All the variants shown below except the first are reasonable solutions for this problem; better solutions are presented later. The first solution is shown below. This is a bad solution. It does the task but is error-prone, because the two if statements are independent. First, this solution does not reflect the logic of the problem, because one of the alternates must be chosen; the choices are not independent. Second, a value should be tested once, not twice, because you run the risk of getting it wrong the second time. If the valid values for gender are changed, it is possible that the person modifying your code would change the first line, consider the change to be finished, and produce buggy code by not seeing and fixing both tests.
    if gender = 'M' then io.putstring ("Mr.") end
    if gender = 'F' then io.putstring ("Ms.") end
A better solution is to have a single test, with two actions, where the second action is the default if the value of the variable is not 'M':
    if gender = 'M' then io.putstring ("Mr.")
    else io.putstring ("Ms.")
    end
This code can be wrapped in a routine, and called to generate the title. The procedure definition is shown below. This print routine is not responsible for checking that it has been called correctly; that is the responsibility of the caller.
    print_title (code: CHARACTER) is
            -- print the appropriate title
        do
            if code = 'M' then io.putstring ("Mr.")
            else io.putstring ("Ms.")
            end
        end -- print_title
The next example shows how multiple actions may be controlled by a single test, by enclosing the actions in a block; when the condition is true, then all the actions are executed in sequence. The task is to calculate the gross pay, where there are two pay rates, one for normal hours (hours <= 35) and one for overtime (hours > 35). Overtime is paid at time and a half (1.5 times the normal rate). Two solutions are given for this task. The first solution is shorter, the second is clearer; the clearest solution is the best. A short solution is good, and brevity is always to be encouraged, but not at the expense of clarity. You should always write your code with the next person in mind, and clear code is easier to understand and modify.
    gross_pay (hours, rate: REAL): REAL is
            -- gross pay for hours at this pay rate, including overtime
            -- this is the collapsed version of the feature
        do
            if hours <= 35 then
                Result := hours * rate
            else
                Result := 35 * rate + (hours - 35) * rate * 1.5
            end
        end -- gross_pay

    gross_pay (hours, rate: REAL): REAL is
            -- gross pay for hours at this pay rate, including overtime
            -- this is the explicit but long version of the feature
        local
            normal_pay, overtime, overtime_pay: REAL
        do
                -- normal rate
            if hours <= 35 then
                normal_pay := hours * rate
            else
                    -- overtime rate
                normal_pay := 35 * rate
                overtime := hours - 35
                overtime_pay := overtime * rate * 1.5
            end
            Result := normal_pay + overtime_pay
        end -- gross_pay
The normal pay and the overtime pay are calculated separately here, and added in the last line to get the total pay. If the hours worked is zero, then the normal pay is calculated to be zero. If the hours worked is not more than 35 then the overtime pay is not calculated, but is used in the final result. The overtime pay is zero if a value is not explicitly assigned, because REAL variables are initialised to zero.
This example has used a literal value for the normal time limit of 35 hours, and another for the overtime rate of 1.5. As you can see from the code, the normal time value is used three times in the code, violating the heuristic that something should be done once. A better solution would define and use a constant, so any change to the value needs a single line change (the constant definition) instead of multiple lines scattered through the code.
The last example of selection shown below illustrates a common use of selection, where a range of values is divided into a series of adjoining intervals that cover the range. In this example, a grade has to be calculated based on a student's mark. There are five possible grades, of which only one can be given. The grades are
    0 <= mark < 50        "Fail"
    50 <= mark < 65       "Pass"
    65 <= mark < 75       "Credit"
    75 <= mark < 85       "Distinction"
    85 <= mark <= 100     "High distinction"
The obvious solution for this problem is to write a series of five if statements, one for each range. In this solution, each statement tests if the value is greater than the minimum, and smaller than the maximum, so the total solution would use five statements, each with two tests. This first and worst solution is needlessly long and complicated, because the elseif can be used to test each range in turn; the best solution for this task is
    grade (mark: REAL): STRING is
            -- grade for this mark
        do
            if mark < 50 then Result := "Fail"
            elseif mark < 65 then Result := "Pass"
            elseif mark < 75 then Result := "Credit"
            elseif mark < 85 then Result := "Distinction"
            else Result := "High distinction"
            end
        end -- grade
Only a single boundary value is tested in each line. There is no need to explicitly test if the mark is greater than 50 and less than 65 (for example), because the second test (mark < 65) is executed only if the first condition failed. The mechanism of the elseif statement guarantees that, if the mark is compared to the value 65, then that mark has already failed the previous test and therefore has to be greater than or equal to 50. For the same reason, there is no need for a final condition; if the else clause is ever executed, then no previous condition was true, and the mark must be greater than or equal to 85. Explicitly testing both ends of the range in each line is bad, because each boundary is tested twice. The mechanism of the elseif form allows short, clear, and safe code.
There is also no need to include code for the cases when the argument value is less than zero, or greater than 100, because that is not the responsibility of the routine; it is the responsibility of the client to call the routine in the right way. If the routine is called in the right context, then it guarantees to return the right result; if it is not, then the software contract is broken and no result is guaranteed. A precondition could be defined on the routine to enforce this contract, but the client would still need to test the mark's value before passing it as an argument to the function.
Common error: use an if statement when a Boolean expression is shorter and clearer.
Bad solution:
    valid_gender (gender: CHARACTER): BOOLEAN is
            -- the wrong way to return a Boolean value
        do
            if gender.upper = 'M' then Result := true
            elseif gender.upper = 'F' then Result := true
            else Result := false
            end
        end -- valid_gender
Good solution:
    valid_gender (gender: CHARACTER): BOOLEAN is
            -- is the gender code a valid value?
        do
            Result := gender.upper = 'M' or gender.upper = 'F'
        end -- valid_gender
6.8 Selection: the inspect statement
The inspect statement allows a multi-way branch for discrete variables; it is similar to a case or switch statement in other languages. An expression is tested at the top of the statement, and a list of possible values is given within the statement; a selection value cannot be listed twice within the statement. When the statement is executed, the expression is evaluated, the value matched to one of the listed values in a when clause, and the corresponding action is then taken. An optional else clause may be included to deal with cases that do not match any of the listed values. The format of the inspect statement is
    inspect expression
    when values then action
    when values then action
    ...
    else action
    end
The inspect, when, and end keywords are indented equally, and the when and then are placed on the same line. If the action can also be placed on the same line, then it follows immediately; if it cannot, then the action is indented on the following line. A block of actions may be controlled by a single test. The else keyword is placed on a new line, and its action follows on that line, or is indented on the next line.
All the possible values of the expression must be enumerated in the statement, or an else clause has to be included. If the expression has a value that is not listed in the statement, then the system will crash with a run-time error. The inspect statement can only be used when the values of the expression are of type INTEGER or CHARACTER. Because unique values are of type INTEGER, inspect can be used to select from a set of unique values. There are three ways to denote possible values for selection:
i) A single value: when 3 then ...
ii) A set of values: when 'a', 'e', 'i', 'o', 'u' then ...
iii) A range of values: when 1..12 then ...
Multiple values are separated by commas, such as when 1, 2, 6..7, 43, 99..112 then...
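The three ways of listing values can be combined in one statement. The sketch below is an illustration only; the class name CALENDAR_DEMO and the feature days_in_month are invented for it, and leap years are deliberately ignored. There is no else clause, so the caller is assumed to pass a month from 1 to 12; any other value would cause the run-time error described above.

```eiffel
class CALENDAR_DEMO
    -- hypothetical class, used only to illustrate inspect values

creation
    make

feature

    make is
        do
            io.putint (days_in_month (2))       -- February: 28
            io.new_line
            io.putint (days_in_month (8))       -- August, inside the range 7..8: 31
            io.new_line
        end -- make

    days_in_month (month: INTEGER): INTEGER is
            -- number of days in this month, ignoring leap years;
            -- the caller must pass a value from 1 to 12
        do
            inspect month
            when 2 then Result := 28                        -- a single value
            when 4, 6, 9, 11 then Result := 30              -- a set of values
            when 1, 3, 5, 7..8, 10, 12 then Result := 31    -- values and a range
            end
        end -- days_in_month

end -- class CALENDAR_DEMO
```

Each month number appears exactly once across the when clauses, which satisfies the rule that a selection value cannot be listed twice.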
The inspect statement can be used to validate the gender code and to generate a title (Mr. or Ms.), and provides the best solution for this task. The possible values are listed together in a single statement, and the action for each value is shown immediately to the right of that value. In the solution shown below, either upper or lower case codes are valid. Because four values are tested here, a long Boolean expression is too cumbersome and the inspect version is both short and clear:
    good_gender (code: CHARACTER): BOOLEAN is
            -- is code a valid gender code?
        do
            inspect code
            when 'M', 'm', 'F', 'f' then Result := true
            else Result := false
            end
        end -- good_gender
Here, there needs to be an explicit else clause, because the user can input values other than the correct ones. The code to print a title can be implemented with an inspect statement, that clearly shows the action for all the valid gender codes. The best solution for this problem is:
    print_title (code: CHARACTER) is
            -- print the appropriate title
        do
            inspect code
            when 'M', 'm' then io.putstring ("Mr.")
            when 'F', 'f' then io.putstring ("Ms.")
            end
        end -- print_title
An example of the inspect statement showing multiple actions for one of the choices is shown below. In this example, the code examines a grade (fail, pass, credit, distinction, high distinction) that was produced from a mark, and prints a friendly message for the student:
    inspect grade
    when 'F' then io.putstring ("Too bad")
    when 'P' then io.putstring ("OK")
    when 'C' then io.putstring ("Good work")
    when 'D', 'H' then
        io.putstring ("%NOh frabjous day")
        io.putstring ("%NCalloo, Callay!")
    else
        io.putstring ("I know I've made some poor decisions lately, %
            %but I'm feeling much better.")
    end
In this example, the usual grades are covered by the explicitly listed choices, and any unusual but valid grades are covered by the default action.
Common error: input to inspect matches none of the choices
What to do: guard the inspect so that only valid inputs are tested, or add an else clause.
6.9 Case study: selection
The BANK system case study is extended by adding guards on the gender and on the amounts to deposit and withdraw. A gender code must be one of 'M', 'm', 'F', and 'f'. The amount must be positive, and the balance cannot go negative.
Main points covered in this section
• A Boolean value is either true or false.
• Relational operators take two basic values of the same type, compare them, and return a Boolean value.
• Boolean operators take one (not) or two Boolean values, and return a Boolean value. The operators are not, and, and then, or, or else, xor, and implies. The operators and then, or else, and implies are lazy, so their second argument is only evaluated if necessary.
• Operator precedence is not, then the numeric, relational, and Boolean operators. Brackets are used to override the default precedence and to make an expression clearer.
• A Boolean function returns a Boolean value. The body of the function is often a Boolean expression that is assigned to Result.
• The basic selection statement in Eiffel is the if statement. It has three forms:
    if ... then ... end
    if ... then ... elseif ... elseif ... end
    if ... then ... elseif ... else ... end
The selection executes zero or one of the actions, depending on the value of the conditions.
• The inspect statement provides a simple multi-way branch for selection, but it can only be used for expressions with INTEGER or CHARACTER values. Its form is
    inspect expression when values then action ... else action end
All possible values of the expression must be included in one of the tests, unless an else clause is given.
Exercises
1. What is meant by structured programming?
2. What symbols are used in a flowchart? What conventions are used in a flowchart? How many lines of code are in a box?
3. Draw a flowchart for dining at a restaurant.
4. List the relational operators. What arguments are taken by the relational operators? What is returned? What is the precedence order of the relational operators?
5. List the Boolean operators. What arguments are taken by Boolean operators? What is returned? What is the precedence order of the Boolean operators?
6. What is the precedence order for the numeric, relational, and Boolean operators?
7. Evaluate the following expressions:
• 32 * 6 <= 9 / 43 + 5 and 'a' > 'A' or "cat" /= "category"
• not (33 // 3 = 11 * 8 \\ 8) xor "myHeight" > "yourHeight"
• 4 - 6^2 / 3 < 5 and then 't' < 'z' and then not (p = Void)
• (i = 0) or (j / i) = k, where i = 0, j = 3, k = 2
• (i = 0) or else (j / i) = k, where i = 0, j = 3, k = 2
• (i = 0) or else (j / i) = k, where i = 3, j = 3, k = 1
8. Which part of structured programming does an if statement implement?
9. What are the three forms of an if statement? How many times does end occur in each?
10. Write the code to show whether you need an umbrella or not. Read in the rainfall for the last half hour as a REAL number. If it is raining (rain > 0), then output a message to take an umbrella; if not, take your sunglasses.
11. Write a class SHOP that sells teddy bears. The shop stores the number and price of a teddy bear. The user inputs a number, you check if you have enough, and reply "OK" or "Nope". If you have enough bears, then sell that number to the happy shopper (decrement the number of bears and increment the money). Develop the code in two steps:
a) Code the class template and feature headers.
b) Code a routine to process a single user input.
12. Draw a diagram to show the logic of this problem:
If it is later than 7:30, then I get up; otherwise I stay in bed. If I get up then I have breakfast. For breakfast, I have a cup of coffee and something to eat. If it is summer, I eat corn flakes; if winter I eat porridge; otherwise I eat toast. If I eat toast, then I spread it with butter and either vegemite, peanut butter, or jam. On the first and last days of the month, I use vegemite. Otherwise, on even days I use peanut butter and on odd days I use jam. If the year is 2000, I skip the toast.
13. Using the if statement, implement a routine to display the following advice:
Age     Reaction
16      child
17      eager
18      hardworking
19      let me out of here
20 +    not to be trusted
Do not consider values other than those shown here; that is the responsibility of the caller.
14. What is the format of an inspect statement? What is the restriction on its use? What are the three ways to specify selection values? Can these be combined? What happens if the current value of the test does not match any listed value?
15. Use an inspect statement to implement the following selection and output:
Age      Reaction
16-20    child
21-25    eager
26-30    hardworking
31-35    let me out of here
36 +     not to be trusted
16. For problem 15, use an inspect statement with one inspect value per branch, not a range. Is this a better solution than you used for question 15? Why?
Chapter 7: Repetition
Keywords: loop, count, sum, recursion, input validation
Any program can be built from three components: sequence, selection, and repetition. The loop statement defines one form of repetition, called iteration. It has three parts. Any loop initialisations are placed after the from keyword. The loop termination condition is placed after the until keyword. The loop body is placed between the loop and end keywords. The block of statements in the loop body is repeated until the termination condition evaluates to true, when the loop exits. If the condition is true on entry to the loop, the loop body is not executed.
A block of actions can also be repeated by recursion. In recursion, a routine executes a block of code in the routine and then tests a termination condition. If the condition is false, then the routine calls itself to execute the block of actions, passing a value as an argument. If the condition is true, then the routine exits back to its caller. The routine is called and executed until the task has been completed.
7.1
Iteration: the loop statement
Iteration or repetition allows a set of actions to be repeatedly executed inside a loop, until some test forces the loop to exit. After the loop exits, the next statement after the end of the loop is executed. There is only one iteration instruction in Eiffel, the loop statement. The loop statement is made up of a series of clauses. The format of the statement is
    from
        initializations
    until
        exit condition
    loop
        action
    end
Each keyword starts a new line. If the remainder of the clause can be placed on the same line, then it is; usually, it is placed on the next line and indented. If there are no initializations, the from keyword is still coded, but it is immediately followed by the until keyword. The loop's mechanism is
1. The initializations are performed, if there are any.
2. If the exit condition is true, the loop action is skipped.
3. Otherwise, the loop instructions are repeated until the exit condition becomes true.
Iteration is not shown on a client chart. The chart shows which class uses the services of another, so the number of times that a service is used (above zero) is not relevant. Common error: The infinite loop. If the loop is entered, then the exit condition must be set to true by some action in the loop. If the condition is not made true at some point, then the loop will never exit, and it will execute forever; this is known as an infinite loop. An infinite loop that does not contain an output statement causes the screen to freeze: there is no obvious action, but your code is being executed millions of times a second. An infinite loop that contains a counter soon causes the counter to overflow and generates a run-time error message. An infinite loop that does not cause an overflow error must be terminated by the user, usually with some special key combination such as ^C for a Unix system (hold down Control and ‘C’ at the same time). Do not use ^X or ^Y.
7.2 Examples: the loop statement
The first loop example is a function that sums the numbers from a start value to an end value, inclusive; these values are passed as arguments. It is the responsibility of the caller to pass the correct values: if start is larger than finish, the loop body is never executed and the function returns zero, so this should be checked with a pre-condition. The loop is controlled by a counter, so it is called a counter-controlled loop. Counters are very common in computing and the variables i, j, k, l, m, and n are often used as counters, because these names are often used as a counter or index in mathematical notation.
    sum_between (start, finish: INTEGER): INTEGER is
            -- sum the integers from start to finish, inclusive
        local
            i: INTEGER
        do
            from i := start
            until i > finish
            loop
                Result := Result + i
                i := i + 1
            end
        end -- sum_between
Common error: infinite loop, caused by a missing increment
What to do: Add an increment (i := i + 1 above) inside the loop body
The second example finds the average rainfall in a period. Rainfall is recorded each day of the period; end of input is signalled by the special sentinel value of -999, so this type of loop is called a sentinel-controlled loop. The input buffer is used to store each input, so the buffer can be tested to see if it is the sentinel or a valid rainfall value. If the first input is the sentinel value, the code inside the loop should not be executed, so a value has to be read in before the loop and this value tested by the loop condition. If the value is not the sentinel, it has to be processed (added to the sum here), and then the next value read in at the bottom of the loop. The code inside the loop thus has the form "process, then read". The intuitive form of a loop is "read, then process", but this form will not work in this case, because a read has to be placed before the loop test.
The input routine prompts the user and stores a value in the input buffer, here in the buffer io.lastint. The routine is a procedure because it changes the state of the terminal screen and the input buffer. The main routine below loops around getting input and processing it by incrementing the value of sum, and the day counter days. The day counter has to start at zero, because there may be no input and the counter should then contain the value zero. The function to calculate the average rainfall value from the sum and count is shown below the input routine.
    rainfall, days: INTEGER
    end_of_input: INTEGER is -999

    get_all_rainfall is
            -- get all the rainfall for the period
            -- record the sum and the interval
        do
            from
                io.putstring ("%NEnter the rainfall values for the period")
                io.putstring ("%NTo finish, enter the value -999%N")
                get_rainfall (days)
            until io.lastint = end_of_input
            loop
                rainfall := rainfall + io.lastint
                days := days + 1
                get_rainfall (days)
            end
        end -- get_all_rainfall
    get_rainfall (today: INTEGER) is
            -- prompt the user for the next day's rainfall value;
            -- today is the number of days recorded so far
        do
            io.putstring ("Enter the rainfall value for day ")
            io.putint (today + 1)
            io.putstring (": ")
            io.readint
        end -- get_rainfall

    average_rainfall: REAL is
            -- average rainfall for the period
        do
            if days > 0 then
                Result := rainfall / days
            end
        end -- average_rainfall
Common error: No input before the termination test
What to do: If the first value can be the special end indicator, use a process - read loop body, not a read - process loop body.
A third variety of loop may be called a result-controlled loop, in which the loop repeats until some flag is set to be true. The number of times that the loop executes is not known initially, so the loop cannot be controlled by a counter. The loop is not controlled by the input, so it is not a sentinel-controlled loop either. A common pattern in programming is to have a loop that executes until either a flag is set, or a counter exceeds a test value, so the exit test for a loop may itself be a complex construct. In such a case, it is usual to define the test as a Boolean function and hide the complex test in that routine.
Eiffel has a very simple syntax, so there is only one iteration construct, the loop statement. The language Pascal, in comparison, has three iteration commands: for, while, and repeat. The Eiffel loop construct corresponds to the Pascal while statement, because the test is placed at the top of the loop, before any code in the body of the loop is executed. In Eiffel, most of the effort in system building is devoted to designing a correct solution, so the implementation language has been kept very simple.
Common error: Infinite loop, caused by not setting the result value
What to do: Make sure that the exit test becomes true in the loop body
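The result-controlled pattern described above can be sketched as follows. This is only an illustration under stated assumptions: the class name LOGIN, the constant password, and the features attempts, accepted, and finished are hypothetical names chosen for the sketch, and the routines io.readline and io.laststring are assumed to be available in your version of the I/O library. The exit test combines a flag and a counter, and the combined test is hidden in the Boolean function finished.

```eiffel
class LOGIN
    -- hypothetical class, for illustration only

creation
    make

feature

    max_attempts: INTEGER is 3
    password: STRING is "teddy"    -- assumed value, for illustration only
    attempts: INTEGER              -- number of passwords read so far
    accepted: BOOLEAN              -- flag: was the last password correct?

    make is
            -- read passwords until one is correct,
            -- or until the attempts run out
        do
            from
            until finished
            loop
                io.putstring ("Enter the password: ")
                io.readline
                attempts := attempts + 1
                accepted := io.laststring.is_equal (password)
            end
            if accepted then
                io.putstring ("Welcome%N")
            else
                io.putstring ("Too many attempts%N")
            end
        end -- make

feature {NONE}

    finished: BOOLEAN is
            -- has the password been accepted, or have the attempts run out?
        do
            Result := accepted or attempts >= max_attempts
        end -- finished

end -- class LOGIN
```

Both parts of the exit test are changed inside the loop body: the counter is incremented on every pass, and the flag is set from the input. The loop therefore cannot run forever, even if the correct password is never entered.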
7.3 Input validation
It is the responsibility of the client to provide the correct data to any routine called by the client. A computer system cannot rely on the user to always input the correct data, however, so input from the user should be validated before it is allowed into the system. The standard way to validate input in Eiffel is to use the I/O system buffers (io.last) to store the input until a valid input has been entered. User input is read inside a loop, until a valid input is received. The loop exits when a Boolean test for valid data returns true, when the buffer contains a valid value. This value is then queried and
used by the system, so the standard input validation technique needs three parts: a loop to get the data, a function to validate it, and a function to return the valid value. Consider the task of getting a menu choice from the user, where the choice consists of a single character. The routine get_choice shows how a value can be read in a loop, until a valid value is input. The loop then terminates and the valid value is used by the rest of the system. The input routine must be a procedure, because it changes the state of the screen and the value of the input buffer. All communication between routines is therefore done via the input buffer using the query io.lastchar.
    get_choice is
            -- get a valid menu choice from the user
        do
            from read_choice
            until valid_choice
            loop
                show_error
                read_choice
            end
        end -- get_choice

    read_choice is
            -- read a menu choice from the user
        do
            io.putstring ("%NEnter menu choice: ")
            io.readchar
            io.next_line
        end -- read_choice

    valid_choice: BOOLEAN is
            -- has the user entered a valid choice?
        do
            inspect io.lastchar.upper
            when 'D', 'W', 'B', 'H', 'Q' then
                Result := true
            else
                Result := false
            end
        end -- valid_choice

    show_error is
            -- tell the user the choice was wrong, and how to fix it
        do
            io.putstring ("That is not a valid choice. Please try again")
            io.putstring ("%NThe valid choices are D, W, B, Q, and H%N")
            io.putstring ("%NYou may use upper or lower case letters")
        end -- show_error
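The same three-part technique (a loop to get the data, a function to validate it, and a function to return the valid value) can be applied to numeric input. The sketch below is an assumption-laden illustration, not code from the case study: the class name NUMBER_INPUT, the bounds lowest and highest, and the routine names are hypothetical. It also assumes that io.readint is given digits; what happens when the user types something that is not a number depends on your version of the library.

```eiffel
class NUMBER_INPUT
    -- hypothetical class, for illustration only

creation
    make

feature

    make is
            -- get a valid number and display it
        do
            get_number
            io.putstring ("You chose ")
            io.putint (number)
            io.new_line
        end -- make

    number: INTEGER is
            -- the valid number entered by the user;
            -- call this only after get_number has finished
        do
            Result := io.lastint
        end -- number

feature {NONE}

    lowest: INTEGER is 1     -- assumed bounds, for illustration only
    highest: INTEGER is 6

    get_number is
            -- loop until the input buffer holds a valid number
        do
            from read_number
            until valid_number
            loop
                io.putstring ("Please enter a number from 1 to 6%N")
                read_number
            end
        end -- get_number

    read_number is
            -- prompt the user and read a number into the input buffer
        do
            io.putstring ("Enter a number: ")
            io.readint
        end -- read_number

    valid_number: BOOLEAN is
            -- is the number in the input buffer within the bounds?
        do
            Result := lowest <= io.lastint and io.lastint <= highest
        end -- valid_number

end -- class NUMBER_INPUT
```

The query number simply reads the input buffer, so it returns a validated value only while no other read has been made since get_number finished.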
7.4 Menu processing
The menu system gets the choice and executes it by calling the appropriate routine in another class. In selecting a choice from an automatic teller machine (ATM) menu, for example, a MENU class will get a valid choice and call the appropriate account routine. Class ACCOUNT contains all the code used to manipulate an account, but has nothing to do with MENU handling, because there are many
ways to interface with an account, through a character or graphical menu or directly from the bank. This separation of objects and concerns is crucial to the design of an OO, reusable system. The main control structure needed to get and execute a choice in the class MENU shown below is provided by two routines, get_choice and do_choice. The get_choice routine was presented in the last section. Because it is a procedure, it cannot return a value, so do_choice picks up the valid value directly from the I/O buffer, in this case from the feature io.lastchar. The routine do_choice must also be a procedure, because it calls other procedures that change the balance of the account or display data on the screen. A customer has a single account in this example. The menu choices on that account are
    D, d    deposit money
    W, w    withdraw money
    B, b    show balance
    H, h    show choices (help)
    Q, q    quit the system
An outline of the code in class MENU is shown below.
    class MENU

    creation
        make

    feature

        account: ACCOUNT

        make is
                -- display the menu, execute the choice
            do
                show_choices
                from get_choice
                until end_chosen
                loop
                    do_choice
                    get_choice
                end
                io.putstring ("%NY'all have a nice day, hear%N")
            end -- make

    feature {NONE}

        show_choices is
                -- show and explain the menu choices
            do
                ...
            end -- show_choices

        get_choice is
            ...

        end_chosen: BOOLEAN is
                -- has the user chosen to finish?
            do
                Result := io.lastchar.upper = 'Q'
            end -- end_chosen

        do_choice is
                -- execute the choice made by the user
            do
                inspect io.lastchar.upper
                when 'D' then deposit
                when 'W' then withdraw
                when 'B' then account.show_balance
                when 'H' then show_choices
                end -- inspect
            end -- do_choice

        deposit is
                -- read an amount, deposit this amount in the account
            do
                io.putstring ("Enter the amount to deposit: ")
                io.readreal
                account.deposit (io.lastreal)
            end -- deposit

        withdraw is
                -- read an amount, withdraw this amount from the account
            do
                io.putstring ("Enter the amount to withdraw: ")
                io.readreal
                account.withdraw (io.lastreal)
            end -- withdraw

    end -- class MENU
In this solution, class MENU is a client of class ACCOUNT, because it uses the features of the account; the creation or assignment of the account has not been shown above. Such a solution is strange, because it implies that the menu has an account when we normally think that an account has a menu. Getting the choice and executing it have to be separated to make a reusable system, so two classes are required; the menu defines the interface, and the account supplies the actions. The account routines have to be called from inside the menu, so there is no choice in how the solution is coded at this point in time. A much better solution is provided by the use of inheritance; however, presentation of this solution must be deferred until the topic of inheritance is covered in Chapter 10.
7.5 Recursion
A sequence of actions can be repeated by iteration in a loop, or by recursion in a routine. A recursive routine is one that calls itself. The routine is initially called by a client, does some work on the problem, and then calls itself. The called copy of the routine then does some work, and calls itself again. At each step, part of the problem is solved, so eventually the problem is completely solved and no further calls are needed. At that point, the last copy of the routine returns control to its caller, that returns control to its caller, and so on until control is returned to the original client. Input validation provides a simple example of recursion. The input routine gets a value from the user. If the value is correct, then the routine exits. If the value is incorrect, then the routine calls itself to get a new value. Calling continues until a correct value is input, at which point the last called
version exits, returns control to the caller, that exits, returns control to the caller, and so on. The code for a routine that gets a valid menu choice is
    get_choice is
            -- get a valid menu choice from the user
        do
            read_choice
            if not valid_choice then
                show_error
                get_choice
            end
        end -- get_choice
Most recursive routines are functions, not procedures. A recursive function receives an argument, does some work, and passes on a smaller argument to the next copy of the function. Recursion continues until the argument is basic or empty; this is known as the base case in recursion. When the base case is reached, there may be many copies of the routine in memory, each waiting for the next to return control. The last copy of the routine then returns the basic answer to its caller, that returns a value to its caller, and so on. A recursive function has three parts:
1. The action: at each call, the function does some of the work and passes on a simpler value.
2. The recursive call: the simpler value is passed as an argument to the next copy.
3. The base case: when the base case is reached, control returns back up the stack of routines.
The classic example of recursion is the factorial function, that returns the value of n factorial (written n!). The factorial of an integer is the integer multiplied by all smaller integers, down to 1; four factorial, for example, is given by 4! = 4 x 3! = 4 x 3 x 2! = 4 x 3 x 2 x 1! This is a clear case for recursion, because the value of n! is defined in terms of a simpler problem, that of finding (n-1)! The code is
    factorial (n: INTEGER): INTEGER is
            -- n x n-1 x n-2 x ... x 1
        do
            if n = 1 then
                Result := 1
            else
                Result := n * factorial (n - 1)
            end
        end -- factorial
The recursive call appears in the last statement of the factorial function, but the function still has work to do when the call returns, because the returned value must be multiplied by n. In tail recursion, the routine does some work and passes on a simpler problem; nothing remains to be done when the recursive call returns. Some problems are more easily handled by head recursion, in which the recursive call is followed by the routine's processing; in head recursion, the routine passes on a simpler problem, then does some work with the returned value, as factorial does.
The factorial function is an example of one-way recursion, because the routine only calls a single copy of itself. Two-way recursion is common, and multi-way recursion is possible. A good example of two-way recursion is the quicksort algorithm, in which a list of values is split into two and each half of the list is sorted - by splitting each half and sorting each quarter of the values. This is two-way recursion, because each call to quicksort generates two recursive calls, one for each half of the argument list.
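For comparison, here is a sketch of a factorial in which the recursive call is the very last action of the routine, with no multiplication left to do when the call returns: the running product is passed down as a second argument. The class name FACTORIAL_DEMO, the helper function product_from, and its argument so_far are hypothetical names introduced for this sketch, and the pre-condition is an assumption about how the function would be protected.

```eiffel
class FACTORIAL_DEMO
    -- hypothetical class, for illustration only

creation
    make

feature

    make is
            -- display the value of 4 factorial
        do
            io.putint (factorial (4))
            io.new_line
        end -- make

    factorial (n: INTEGER): INTEGER is
            -- n x n-1 x n-2 x ... x 1
        require
            positive: n >= 1
        do
            Result := product_from (n, 1)
        end -- factorial

feature {NONE}

    product_from (n, so_far: INTEGER): INTEGER is
            -- so_far x n x n-1 x ... x 1
        do
            if n = 1 then
                -- base case: the product is already complete
                Result := so_far
            else
                -- do the multiplication first, then pass on a simpler problem
                Result := product_from (n - 1, so_far * n)
            end
        end -- product_from

end -- class FACTORIAL_DEMO
```

For factorial (4), the calls are product_from (4, 1), product_from (3, 4), product_from (2, 12), and product_from (1, 24), so the value 24 is produced at the base case and is returned unchanged through each waiting copy.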
Recursion is an extremely powerful tool that can produce very compact and powerful code, that is especially useful for scanning data structures such as lists and trees. In some languages, such as Lisp, recursion is the basic form of looping and iteration is quite rare. Traditionally, functional languages tend to use recursion and procedural languages tend to use iteration; Eiffel is a procedural, OO language.
7.6 Case study: iteration
"Build a simple banking system, in which the bank has a single customer, and the customer has a single account. A customer has a name, gender, address, and a bank account. Money can be deposited into and withdrawn from the account, and the balance can be displayed. Interest is added daily on the current balance; the interest rate is 4.5% a year. The customer has access to the account through an interactive menu, such as that used by an automatic teller machine (ATM). The system starts up and waits for the customer to enter a password. The customer is allowed three attempts to enter a valid password. If no correct password is entered after three attempts, then the system terminates. If the password is correct, then the customer is shown a menu of account choices, and the system reads and executes the choices. Any number of transactions may be made; processing on the account continues until the customer chooses to exit the system. The valid menu choices (upper or lower case) are
    D    Deposit
    W    Withdraw up to the total amount in the account
    B    Show the balance
    Q    Quit the system
    H    Help: Show the menu choices
Interest is added to the account after the system exits."
Main points covered in this section
• The iteration statement in Eiffel is the loop statement. It has the form
    from
        initializations
    until
        exit condition
    loop
        action
    end
• The most common loop error is an infinite loop, where the designer has forgotten to increment a counter.
• Input should be validated, so the rest of the system can use a value that is guaranteed to be valid. It is better to guard than to fail and recover.
• Menu processing consists of the user entering a choice, and the system validating and executing that choice. The menu class should be separate from the action class that actually executes the choice.
• Recursion occurs when a routine calls itself, and provides another way to repeat a block of code. Procedural languages tend to use iteration, where functional languages use recursion.
Exercises
1. Write a class SHOP that sells teddy bears. The shop stores the number and price of a teddy bear. The user inputs a number, you check if you have enough, and reply "OK" or "Nope". If you have enough bears, then sell that number to the happy shopper (decrement the number of bears and increment the money). Develop the code in four steps:
    a) Code the class template and feature headers.
    b) Code a routine to process a single user input.
    c) Wrap a loop around the test, and exit the loop when all the bears have been sold.
    d) Add the ability to exit the loop at any time. Show the number of bears left at that time.
2. Write a routine factorial to calculate n! (n factorial) using a loop.
3. Write code to produce the following pattern; use routines for modularity:
      *
     ***
    *****
     ***
      *
4. Adapt your code to produce the following pattern:
      *
     +++
    -----
     ^^^
      !
5. Write the code needed for a simple menu system. The menu gets a single character, and outputs a message or takes some action. The choices and actions are:
    a, A    "eh?"
    d, D    ask for a number, double it, and display the result
    e, E    exit the menu
6. Validate the user input. If the user's choice is not valid, tell them and get a new choice; repeat this until a valid choice is entered.
7. Write a recursive routine to find the power of a number. The arguments will be the number (REAL), the power (INTEGER), and the function returns a REAL value.
8. The Fibonacci numbers begin with the sequence 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, and so on. Each Fibonacci number is the sum of the previous two numbers; the first two are defined to be the value 1. Write a recursive routine to show the first ten Fibonacci numbers (tricky).
Chapter 8: Arrays
Keywords: ARRAY, index, content, parameter, strip
The Eiffel library class ARRAY is a container or data structure that allows many objects of the same type to be stored in and retrieved from the data structure. An element of an array is indexed by its position in the array, and is stored to, and retrieved from, that position in the array. This means that array lookup is fast. An array has a fixed size, and the size is not usually changed. If a new element is inserted into an array at some position, then all later elements have to be moved to make room for the new element to be placed at that position. Element insertion into, and deletion from, an array is expensive.
8.1 The definition of an array
An array is an ordered set of elements of the same type, so each element is an object. An element is stored at a given position in the array, such as first, second, third, and so on. Each element thus has an index (1, 2, 3, and so on) that is used to store and retrieve the element. A STRING is an array of characters, so a character can be stored in, and retrieved from, each location in the string. A street may be represented as an array of houses, or as two arrays, one on each side of the street. Each element of the array is a HOUSE. A book may be represented as an array of pages, a page may be represented as an array of lines, and a line may be represented as a STRING. There are two basic operations on arrays: store an element in a specified location, and retrieve the element from a specified location. The location of the element is called its index, and the value of an element is called its content. An object has a name, a type, and a value. This is also true of arrays, because an array is a (compound) object, but now there is an extra level to get to the content of the array. An element of an array has a compound name, because it is located by both the name of the array and the index of the element. Consider an array of 37 integers, whose name is roulette_wheel. To find the value of the first element in this array, you give the name of the array and the index of the element, using a notation such as roulette_wheel[1]. The value of this element is usually 27 on the standard roulette wheel. An element of an array has a compound name, a type, and a value.
8.2 Using an array
A variable is declared to be an array of a specific or base type, so an array of integers is declared as ARRAY [INTEGER]. The array is declared to be of type ARRAY, then the type of the array elements is listed in square brackets. An array of INTEGER numbers, an array of POINTs, and an array of ACCOUNTs, for example, are declared by the Eiffel code
    roulette_wheel: ARRAY [INTEGER]
    triangle: ARRAY [POINT]
    accounts: ARRAY [ACCOUNT]
Because I can declare an array to be of any type - more formally, the array can contain objects of any type - the class ARRAY is called a generic class in Eiffel. The array is a class, and the type of its elements is called the base class of the array. The base class is defined in the array declaration, and passed to the array definition as a parameter; to avoid confusion, a type passed in a declaration is called a parameter, and a value passed in a routine call is called an argument. The actual parameter is passed to the class header, and bound to the formal parameter in the header; the header for class ARRAY is
class ARRAY [T] ...
Inside the class ARRAY, the code simply refers to a variable of type T, so the code in the class is generic; it works for anything. Parameter binding occurs at compile time, when the declaration is checked and compiled. Argument binding, however, does not occur until run time when the argument value is known. The difference between a type and a class can now be stated: a non-generic class is a type, but a generic class can produce many types of objects, such as objects of type ARRAY [REAL], ARRAY [POINT], and so on. An array is created by calling the ARRAY creation routine make. The creation routine has two arguments that specify the lower and upper bounds of the array, such as
    !!roulette_wheel.make (1, 37)
Arrays often have a lower bound of 1 (the first index is 1), but this is not essential; if you wanted, you could create an array with indices 123 to 947, or even -32 to 43. The upper bound is normally at least as large as the lower bound (if it is smaller, the array is created empty), and the index must be an integer, but there is no other restriction on the values of the bounds. When an array is created, the contents of the array are set to the element default value, such as 0.0 for REAL numbers and Void for reference types. An element is placed in an array by the put feature, that takes two arguments: the element to be put, and the position to put it in. The value 27 is placed in position 1 of roulette_wheel, for example, by the command
    roulette_wheel.put (27, 1)
An element is retrieved from the array by the item feature, that takes the position of the element as its single argument. The value at location 27 of the roulette wheel, for example, is returned by the query
    roulette_wheel.item (27)
A string is an array of characters, so the same feature names are used to store (put) and retrieve (item or @) elements in the class STRING, as are used for any array class. The code to declare and create an array of REAL numbers is shown below, together with the resulting array. The array is an object with the name example, the type ARRAY [REAL], and a reference value that points to the location in memory where the array contents are stored. In the example, an array of six real numbers is created, so the initial values of the contents are six values of 0.0.
    example: ARRAY [REAL]
    a, b, c: REAL

    !!example.make (1, 6)

    example
    index      1      2      3      4      5      6
    content    0.0    0.0    0.0    0.0    0.0    0.0
Some code to use the array is shown below, together with its effect on the data structure.
    example.put (32.6, 1)
    example.put (9.7, 2)
    example.put (84.3, 6)
    a := 66.6
    example.put (a, 4)

    a := example.item (1)
    b := example.item (3)
    c := 6 * example.item (2) + a

    example
    index      1      2      3      4      5      6
    content    32.6   9.7    0.0    66.6   0.0    84.3

    a = 32.6    b = 0.0    c = 90.8
The formal definition of the client-supplier relation does not quite work for data structures, because an array of integers, say, does not contain a declaration of type INTEGER; the type is passed as a parameter, not declared in the class ARRAY. The convention given in Waldén and Nerson (1995) is to show the client, the array, and the parameter as three classes, connected by client links from left to right; this is different from the convention given in Meyer (1992). The formal parameter T (for type) is shown in square brackets for the ARRAY oval, to make explicit the fact that this is a parameter link instead of the usual client (declaration) link. Consider a casino that uses a roulette wheel, where the wheel is implemented as an array of 37 integers; the client chart for this example is:
    CASINO ---> ARRAY [T] ---> INTEGER
Values are normally stored in and retrieved from the array one at a time, but it is possible to store a series of values into an array in one operation, by using a manifest array. A manifest array is a series of values separated by commas, and enclosed by the symbols "<<" and ">>". A manifest array may be used in an assignment as shown below, where the elements of the manifest array are stored in the array a, starting from the first element of a.
    a: ARRAY [INTEGER]
    !!a.make (1, 5)
    a := <<1, 2, 43 + x>>
A manifest array may be passed as an actual argument to a routine, if the formal argument is an array of the correct type. This technique allows a routine to have any number of arguments, because the whole sequence of actual values is bound to the elements of the formal argument of class ARRAY. Arrays of higher dimension can be declared and used, such as
    matrix: ARRAY [ARRAY [REAL]]
In a matrix, each element of the (first) array is an array.
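Both ideas can be sketched in a few lines. The class below is a hypothetical illustration, not library code: the names ARRAY_DEMO and sum are assumptions made for the sketch. The function sum accepts any number of integers because they are bundled into a manifest array, and the matrix is built by creating a separate row array for each element of the outer array.

```eiffel
class ARRAY_DEMO
    -- hypothetical class, for illustration only

creation
    make

feature

    make is
            -- pass manifest arrays of different lengths to sum,
            -- then build and use a small matrix
        local
            matrix: ARRAY [ARRAY [REAL]]
            row: ARRAY [REAL]
            i: INTEGER
        do
            io.putint (sum (<<1, 2, 3>>))               -- displays 6
            io.new_line
            io.putint (sum (<<10, 20, 30, 40, 50>>))    -- displays 150
            io.new_line

                -- a matrix of 2 rows by 3 columns:
                -- each row must be created as a new array
            !!matrix.make (1, 2)
            from i := 1
            until i > 2
            loop
                !!row.make (1, 3)
                matrix.put (row, i)
                i := i + 1
            end

                -- store a value at row 2, column 3, then retrieve it
            matrix.item (2).put (9.5, 3)
            io.putreal (matrix.item (2).item (3))
            io.new_line
        end -- make

    sum (values: ARRAY [INTEGER]): INTEGER is
            -- sum of all the elements in values
        local
            i: INTEGER
        do
            from i := values.lower
            until i > values.upper
            loop
                Result := Result + values.item (i)
                i := i + 1
            end
        end -- sum

end -- class ARRAY_DEMO
```

Creating one row array and storing it in every position of the outer array would not work, because every row of the matrix would then refer to the same array.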
8.3 The Eiffel library class ARRAY
The feature interfaces for the class ARRAY may be found from the Eiffel Library Manual, or by running short on the class. The class interface for the ISE Version 3 class ARRAY is
    -- One dimensional arrays
    class interface ARRAY [T]

    creation procedures

        make (minindex, maxindex: INTEGER)
                -- Allocate array; set index interval to minindex .. maxindex
                -- (empty if minindex > maxindex).
            ensure
                minindex > maxindex implies count = 0;
                minindex <= maxindex implies count = maxindex - minindex + 1

    exported features

        copy (other: ARRAY [T])
                -- Make current array an element by element copy of other.
            require
                other /= Void
            ensure
                lower = other.lower;
                upper = other.upper;
                -- For all i: lower .. upper, item (i) = other.item (i)

        count: INTEGER
                -- Number of available indices

        empty: BOOLEAN
                -- Is array empty?

        is_equal (other: ARRAY [T]): BOOLEAN
                -- Is current array element by element equal to other?
                -- (Redefined from ANY)
            ensure
                -- Result true if and only if, for all i: other.lower .. other.upper,
                -- item (i) = other.item (i)

        item, infix "@" (i: INTEGER): T
                -- Entry at index i, if in index interval.
            require
                index_large_enough: lower <= i;
                index_small_enough: i <= upper

        lower: INTEGER
                -- Minimum index

        upper: INTEGER
                -- Maximum index

        put (v: T; i: INTEGER)
                -- Replace i-th entry, if in index interval, by v.
            require
                index_large_enough: lower <= i;
                index_small_enough: i <= upper

        resize (minindex, maxindex: INTEGER)
                -- Rearrange array so that it can accommodate indices down to minindex
                -- and up to maxindex. Do not lose any existing item.
            ensure
                lower <= minindex;
                maxindex <= upper

        wipe_out
                -- Empty the array: discard all items.
            ensure
                wiped_out: empty

    invariant
        consistent_size: count = upper - lower + 1;
        non_negative_size: count >= 0

    end interface -- class ARRAY
The basic operations are to create the array using make, store an element using put, and retrieve an element using item. The lower (start) and upper (end) bounds of the array are given by the features lower and upper, and the size of the array is given by count. It is possible to dynamically increase the size of an array with the resize command, but this is rarely done; if the number of elements changes dynamically then it is probably better to use a list, not an array. The exact name and form of the features provided depends on the version of Eiffel you are using. To be absolutely certain about the features, run short on your version.
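A minimal sketch of resize is shown below, assuming the ISE Version 3 interface listed above; the class name RESIZE_DEMO is hypothetical, and other versions of the library may name or define the feature differently. The post-condition of resize only promises that the new bounds include the requested ones, so the sketch relies on nothing more than that.

```eiffel
class RESIZE_DEMO
    -- hypothetical class, for illustration only

creation
    make

feature

    make is
            -- fill a small array, enlarge it, and use the new positions
        local
            a: ARRAY [INTEGER]
        do
            !!a.make (1, 3)
            a.put (10, 1)
            a.put (20, 2)
            a.put (30, 3)

                -- make room for indices 1 to 6; existing items are kept
            a.resize (1, 6)
            a.put (60, 6)

            io.putint (a.item (2))    -- the old value at position 2
            io.new_line
            io.putint (a.item (6))    -- the new value at position 6
            io.new_line
        end -- make

end -- class RESIZE_DEMO
```

Calling a.put (60, 6) before the resize would violate the pre-condition index_small_enough of put, because the upper bound of the array would still be 3.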
8.4 Example: ARRAY [INTEGER]
The first example stores integers in an array, writes them to the screen, and then finds the largest value in the array. The index of each element in the array is an integer, and the square of the index is stored as the content of each element. This example shows a very common pattern in array usage, where an index or count is used to access each element of the array. The index starts at the first element of the array, and is incremented by 1 inside a loop until the end of the array is reached. It is essential to increment the index in the loop, or you will generate an infinite loop, that never terminates. This general pattern can be used to fill an array, or to scan through the array. It is used in the make routine to fill the array, and in the largest routine to scan the array.
    class X

    creation
        make

    feature {NONE}

        a: ARRAY [INTEGER]
        size: INTEGER is 20

    feature

        make is
                -- create the array of size size
                -- fill each element with the square of its index
                -- show each element
            do
                !!a.make (1, size)
                fill
                display
            end -- make

        fill is
                -- fill each element with the square of its index
            local
                index, content: INTEGER
            do
                from index := a.lower              -- start at front of array
                until index > a.upper              -- stop at end of array
                loop
                    content := (index * index) \\ 43
                    a.put (content, index)         -- put the value at this position
                    index := index + 1             -- increment the index
                end
            end -- fill

        display is
                -- display each element of the array
            local
                i: INTEGER
            do
                from i := a.lower
                until i > a.upper
                loop
                    io.putint (a.item (i))
                    io.new_line
                    i := i + 1
                end
            end -- display

        largest: INTEGER is
                -- largest value in the array
            local
                i: INTEGER
            do
                from
                    Result := a.item (a.lower)
                    i := a.lower + 1
                until i > a.upper
                loop
                    if a.item (i) > Result then
                        Result := a.item (i)
                    end
                    i := i + 1
                end
            end -- largest

    end -- class X
8.5 Example: Insertion sort
The second example shows how an array is used to sort a set of numbers into ascending order. The unsorted numbers are placed in the array raw. This array is then traversed from start to end, and each number is placed in the sorted array in its correct position, so the sorted array is always sorted from smallest to largest; formally, array (i) <= array (i + 1) for i from 1 to n-1. The sorting algorithm shown here is called an insertion sort, because it inserts each new value into the correct position in a sorted list. The algorithm has four main parts. First, a new value is found; here, the index of the raw array is incremented to get the next value. Second, the sorted array is scanned from the front to find where the new value should be placed; it should be inserted when a value is found in the array greater than the new value. The new element cannot be simply stored in the array, however, because it would overwrite the value already stored in that location. The third part of the algorithm moves all the later elements up one location (to the next higher index), opening up a space in the array. Finally, the input value is stored in its correct location, and the length of the sorted array is incremented by one.
Figure: inserting the new value 12.3 into the sorted array

    Position                 1      2      3      4      5      6
    Find place to insert     3.1    23.8   66.6   66.6   0.0    0.0
    Move higher elements     3.1           23.8   66.6   66.6   0.0
    Insert new value         3.1    12.3   23.8   66.6   66.6   0.0
The insertion sort algorithm assumes that the sorted array is initially empty and values are added in their correct position one at a time. The sorted array is always sorted and gets larger as new values are added. The key steps are the second (find location) and third (move values) steps; these are illustrated in the diagram above. To insert a new value into the sorted array, elements of the array have to be moved one at a time, starting at the end of the array and moving backwards to the insertion point. Moving elements one at a time forward from the insertion point will not work. In the example, the value 23.8 has to be moved from position 2 to position 3. Simply moving it will destroy the value at position 3 (66.6), however, so the value at that position has to be moved first. The same story can be told for every value to be moved, so the last value must be moved up one position, then the second last, and so on until the desired value can be moved safely. The new value can then be inserted in its correct position, and the size of the used array incremented. The code for an insertion sort is shown below. This code has to be placed in some class, but the class wrapper is not shown here.
raw, sorted: ARRAY [REAL]
top: INTEGER is 10
length: INTEGER

make is
        -- demonstrate an insertion sort
    do
        !!raw.make (1, top)
        raw := <<3, 1, 5, 2, 9, 7, 4, 10, 8, 6>>
        !!sorted.make (1, top)
        sort
        display                                     -- routine not shown here
    end -- make

sort is
        -- sort the elements of the raw array into the sorted array
    local
        value: REAL
    do
        from
            length := 1
        until
            finished (length)
        loop
            value := raw.item (length)
            insert (value, position (value, length))
            length := length + 1
        end
    end -- sort

finished (size: INTEGER): BOOLEAN is
        -- have all the unsorted elements been used?
    do
        Result := size > top
    end -- finished

position (new: REAL; place: INTEGER): INTEGER is
        -- position to insert new number
    do
        from
            Result := 1
        until
            Result = place                          -- until finished
            or else new < sorted.item (Result)      -- or found
        loop
            Result := Result + 1
        end
    end -- position

insert (value: REAL; place: INTEGER) is
        -- insert value into the array at location place
    local
        i: INTEGER
    do
        from
            i := length                             -- from last value
        until
            i = place                               -- until place to insert
        loop
            sorted.put (sorted.item (i - 1), i)     -- move value to next position
            i := i - 1
        end
        sorted.put (value, place)                   -- insert new value
    end -- insert

8.6 Example: ARRAY [PERSON]
The third example stores complex objects in an array, then displays each object. The routine that generates and stores the people uses a single local variable of type PERSON. A person is created and stored, then the next person is created using the same variable. This destroys any existing pointer from the variable name to the object, but that is no problem; the pointer to the object is now stored in the array. The sequence of operations is illustrated below in four steps. First, the array is declared and created (1), then a person is declared (using a local variable) and created (2). At this point in time, there is a reference from the name person to the object (call this first person person-a). The object is then placed in the array (3); more formally, the value of person is placed in the array, so the array now contains a pointer to the object person-a. Person-a is now referenced from two places, from the name of the local variable and from the array. A new person (call this person person-b) is then created (4), breaking the link between the local variable person and person-a, but person-a can still be accessed by retrieving the third element of the array. The new object (person-b) may now be stored in the array.
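Figure: storing a person in an array, in four steps

    Step 1    people: ARRAY [PERSON]
              !!people.make (1, 5)
              Cells 1 to 5 of people all contain Void.

    Step 2    local person: PERSON
              !!person.make
              The name person refers to a new object, a.

    Step 3    people.put (person, 3)
              Cell 3 of people and the name person both refer to a; the other cells contain Void.

    Step 4    !!person.make
              The name person refers to a new object, b; cell 3 of people still refers to a.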
The code for this example is very similar to the first example that used an array of integers. The array of size 5 is created, five people are stored in the array, and then the people are displayed. The creation and display routines for class PERSON are not shown here; they were listed in Chapter 4.
class CROWD

creation
    make

feature {NONE}

    people: ARRAY [PERSON]
    size: INTEGER is 5

feature

    make is
            -- create the array, fill it with people, display each person
        do
            !!people.make (1, size)
            insert_people
            show_people
        end -- make

    insert_people is
            -- fill the array with people
        local
            i: INTEGER
            person: PERSON
        do
            from
                i := people.lower
            until
                i > people.upper
            loop
                !!person.make
                io.new_line
                people.put (person, i)
                i := i + 1
            end
        end -- insert_people

    show_people is
            -- display each person in order
        local
            i: INTEGER
        do
            from
                i := people.lower
            until
                i > people.upper
            loop
                io.putstring ("%NPerson number ")
                io.putint (i)
                io.putstring (": ")
                people.item (i).display
                i := i + 1
            end
        end -- show_people

end -- class CROWD

Several routines above each used a local variable to index the array. This is the correct choice, because local is beautiful. It is poor style to declare the index as an attribute so that the same variable can be shared by every routine; attributes are not a device for saving declarations or increasing efficiency. Attributes store the state of the object and are used only if a local variable cannot be used, so the number of attributes should be kept as small as is reasonable.

8.7 The strip operator
The old operator is used to describe what things have not been changed by a procedure. This can be quite tedious if the class has many features which are not modified. Consider the example of a coke machine, where you put in money and get back a can of soft drink. The class COKE_MACHINE has four attributes, as shown below:
class COKE_MACHINE
feature
    money: LINKED_LIST [COIN]
    cans: LINKED_LIST [CAN]
    size: INTEGER is 100
    paid: BOOLEAN

The class has a feature put_in_money, that records the act of putting a coin in the machine. The routine modifies the attributes paid and money, but leaves cans and size unchanged. We could write an assertion for the routine as follows:

put_in_money (coin: COIN) is
        -- put a coin in the machine
    do
        ...
    ensure
        same_size: size = old size;
        same_cokes: cans = old cans
    end -- put_in_money

This will work, but it is a bit clumsy and will get more and more so as the number of features in the class increases. We can do much better by providing a notation for "everything but". The strip operator returns an array whose elements are all the attributes of the object, except for those attributes named by the operator; it strips away the named attributes. For example, the expression strip (x, y) evaluates to an array whose elements are all the features of the current object except x and y.

We use the strip operator in a post-condition to check the features that have not changed. Because strip returns an array, we must check the content of the array using the array equality test is_equal. An expression such as strip (x, y).is_equal (old strip (x, y)) in a postcondition thus checks that the procedure only modifies the features x and y. More formally, the assertion checks that all attributes except x and y have the same values on exit from the procedure as they had on entry to the procedure. We can now modify the post-condition to contain a clause ensuring that money and paid are the only features modified by the procedure. The full routine is shown in the listing below:
put_in_money (coin: COIN) is
        -- put a coin in the machine
        -- machine must register coin entered
    require
        not_paid: not paid;
        not_empty: not empty;
        coin_in: coin /= Void
    do
        money.insert (coin)
        paid := true
    ensure
        paid: paid;
        not_empty: not empty;
        more_money: money.count = old money.count + 1;
        no_extra_changes: strip (money, paid).is_equal (old strip (money, paid))
    end -- put_in_money

The expression "strip (money, paid)" in the postcondition evaluates to an array whose elements are all the features in the class besides money and paid: an array with two elements, cans and size. The no_extra_changes clause states that the value of this array is the same on exit from the routine as it was on entry, so the only attributes changed by the routine are money and paid.

Main points in this chapter

• An array is a data structure that stores a sequence of elements of the same type. Each element has an index, that is used to store and retrieve the content of that cell.

• An array is declared by passing a parameter in the ARRAY[...] declaration. This actual parameter is bound to the formal parameter (usually called T) in the class header.

• If an element is inserted into the middle of an array, then the existing value at that location is over-written. To save the existing value, all later values may be moved up one cell. If an element is deleted from the array, that cell may be marked with a special value, or all the later cells may be moved down by one.

• The strip operator returns an array of the attributes of the object, with the named attributes stripped out. It is used in the post-condition of a procedure to check that “everything else” has not been changed.
Exercises

1. Write a program to create and display an array of integers. The content of each cell is read in from the user.

2. Execute the insertion sort program using real data; use an array 1..10 of ten integers. Simulate the operation of the sort on paper, by hand.

3. Use an insertion sort to sort an array of people into ascending alphabetical order, by name. Assume that a PERSON has a field name: STRING.

4. Use a bubble sort to sort an array of type INTEGER. In a bubble sort, adjacent elements of the data structure are compared, and swapped if out of order. This basic operation is applied in a scan through all adjacent pairs in the array, and the scan is repeated until the whole list is sorted.

5. Bubble sort an array of type STRING.
Chapter 9: Lists

Keywords: LINKED_LIST, key, scan, RANDOM

The Eiffel library class LINKED_LIST allows many objects of the same type to be stored in a list structure. An element of a list is usually indexed by a unique identifier or key. A list has a cursor, which is moved along the list until the desired item is found. The class provides features to move the cursor, and features to use the item at the current cursor location. It is expensive to search a list, because each item in turn has to be tested. Inserting an item into, and deleting an item from, a list is simple and cheap because the size of the list changes as needed.
9.1 The definition of a list
A list is a data structure composed of a series of cells or elements of the same type. Each cell contains a value (such as an integer or a person) and a pointer to the next element on the list. The last cell has a value, but no pointer; more formally, it has a null pointer. The name of the list points to the head or front of the list. To find an element on the list, you start at the front and check each element in turn until you find the cell you want, or get to the end of the list and run out of places to look. A list is normally shown from the left of the page to the right; a list built from linked elements can be shown as a series of linked cells. The name of the list points to the first cell, and each cell in the list points to the next except for the last cell, that has a null pointer.
Figure: an object; an object with a link (a linkable cell); and a list of cells, declared as list: LINKED_LIST [TYPE]

A singly-linked list, such as that shown above, can only be scanned from start to end because there are no backward pointers from a cell to the previous cell. The doubly-linked lists discussed in this chapter can be scanned in both directions.
9.2 The Eiffel library class LINKED_LIST
The list data structure is implemented by the ISE Eiffel library class LINKED_LIST. The list uses a cursor to point to the current element in the list. The cursor can move up and down the list, so it can be used to scan the list. To find if an element is in the list, for example, the cursor is set to the start of the list initially, and moved down the list one element at a time until the target has been found, or the end of the list has been reached.
9.2.1 Structure

An ISE Eiffel list (henceforth just a list) is illustrated below, showing the cursor and associated features of the list. The first element of the list is returned by calling the query first on the list, and the last element is returned by last. The element before the current cursor position is returned by previous, the element at the cursor is returned by item, and the element after the cursor is returned by next. The position of the cursor can be tested to see if it is before or after the list; if the cursor is before or after the list, then it is not pointing to an element of the list.

An Eiffel list is an object with an internal state; this state is defined by the contents of the list itself, and by the position of the cursor. There are commands to change this position, and queries to find out where it is. There are commands to change an element of the list at the current cursor position, and queries to look at its value. The routines that change or return list elements use the element at the current cursor position, or the element immediately before or after the cursor. There are also features to combine or separate whole lists of objects.
Figure: a list and its cursor, showing from left to right the positions before, first, previous, item, next, last, and after

By convention, a list grows by adding elements to the right. Thus, the front of the list is shown at the left of the page, and the end at the right. You can move the cursor forward one position in the list by the command forth, and move it backward one position by the command back. If the cursor is moved too far to the left, it goes before; too far to the right, it goes after.

A LINKED_LIST is a generic class, because it can contain elements of any type. As we saw for an array, a client chart shows the client, the generic class, and the parameter class connected by arrows from left to right. The formal parameter is shown in the LINKED_LIST oval, to make explicit the type of relationship. A bank that has a list of customers, for example, has the client chart shown below.
    BANK --> LINKED_LIST [T] --> CUSTOMER
The declarations for the bank, and for a single object of type CUSTOMER are shown below. If both declarations were in the same class, then the client chart would show an indirect link from BANK to CUSTOMER via LINKED_LIST, as well as a direct client link from BANK to CUSTOMER caused by the second declaration.
    patrons: LINKED_LIST [CUSTOMER]
    me: CUSTOMER

A list is created by calling the creation routine in the list class, called make. This creates an empty list into which values can be placed. When the list is created, the cursor is before the list. The code to create an empty list of customers is shown below; no bounds are given, because the list is empty when created, and grows as new cells are added to the list.

    !!patrons.make
    !!me.make

A cell may be inserted into the list before, at, or after the current cursor position. If it is added before or after, then the list is one cell larger after the insertion. If it is added at the current position, it replaces the cell that was there. The commands to add a cell to the list at or near the current cursor position are

    put (value)          -- replace current element
    replace (value)      -- replace current element
    put_left (value)     -- add as previous element
    put_right (value)    -- add as next element

The commands to place a new element at the front or back of the list are

    put_front (value)    -- add to front of list
    extend (value)       -- add to end of list

A cell is retrieved from the list by positioning the cursor, and then calling the query item to return the current object:

    item                 -- value at current cursor position

A cell is removed from the list by placing the cursor at a suitable position, and then removing the cell before, at, or after the current cursor position. The commands to remove a cell from the list are

    remove               -- remove current element
    remove_left          -- remove previous element
    remove_right         -- remove next element
A cell can be inserted into or deleted from a list without moving the existing elements, because only the pointers in the previous or the current cell are changed. This is a great advantage over the array, where many elements often have to be moved when a single element is deleted or inserted; this pattern was illustrated in the insertion sort routine. An illustration of how to remove the current cell in the list is shown below; the state of the list before deletion is drawn first, then the state of the list after the element has been deleted. The only action required is to reset the pointer in the previous cell, so that it points to the cell after the deleted element. Because the Eiffel implementation uses a cursor, and the cursor cannot point to a cell not in the list, the cursor position also has to be modified to point to the cell following the deleted element. The deleted cell disappears from the list, because no pointer in the list points to that cell. At some later point in time, the now unused cell is garbage collected, and the storage at that location is reused.
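Figure: a_list before the call to a_list.remove, with the cells a, b, c, d, e linked in order; and a_list after the call, with the cells a, b, d, e linked in order and the cell c no longer referenced by the list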
List insertion is done by changing the previous pointer so that it points to the new cell, and setting a link from the new cell to the next cell. No objects need be moved; only the values of the links that connect cells are changed.
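The pointer changes described above can be sketched with a simple linkable cell. This is not the library's implementation; the class CELL and its features insert_after, remove_after and set_next are hypothetical names chosen for the illustration, and the cell holds an integer to keep the example short.

```eiffel
class CELL
    -- hypothetical linkable cell holding an integer

creation
    make

feature

    item: INTEGER
            -- the value stored in this cell

    next: CELL
            -- the following cell, or Void at the end of the list

    make (value: INTEGER) is
            -- store value; the cell starts with no link
        do
            item := value
        end -- make

    set_next (other: CELL) is
            -- make other the following cell
        do
            next := other
        end -- set_next

    insert_after (fresh: CELL) is
            -- link fresh into the chain immediately after this cell
        require
            fresh_exists: fresh /= Void
        do
            fresh.set_next (next)    -- the new cell points to the old next cell
            next := fresh            -- this cell now points to the new cell
        end -- insert_after

    remove_after is
            -- unlink the cell that follows this one
        require
            has_next: next /= Void
        do
            next := next.next        -- skip over the removed cell
        end -- remove_after

end -- class CELL
```

In both routines only links are assigned; no stored value is copied or moved, which is the contrast with the array insertion shown in the insertion sort.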
9.2.2 Features

A selection of the ISE Version 3 LINKED_LIST features is shown below; a full list of features may be found by looking up the class in the Eiffel Library Manual, or by running short on the relevant class. There are three kinds of features: those that deal with the list itself, with the cursor, and with the elements of the list.

List features
    make                                 make an empty list
    count: INTEGER                       returns the number of elements in the list
    merge_left (other: like Current)     add the other list to the left of the cursor
    merge_right (other: like Current)    add the other list to the right of the cursor
    wipe_out                             remove all elements in the list

Cursor features
    forth                                move the cursor forward one element
    back                                 move the cursor backward one element
    start                                move the cursor to the first position
    finish                               move the cursor to the last position
    go_i_th (i: INTEGER)                 move to position i in the list
    before: BOOLEAN                      is the cursor pointing before the list?
    after: BOOLEAN                       is the cursor pointing after the list?
    off: BOOLEAN                         before or after
    isfirst: BOOLEAN                     is the cursor at the first position?
    islast: BOOLEAN                      is the cursor at the last position?

Element features
    item: G                              the element at the current cursor position
    first: like item                     the element at the first list position
    last: like item                      the element at the last list position
    previous: like item                  the element before the cursor
    next: like item                      the element after the cursor
    put (v: like item)                   put the value at the current cursor position
    replace (v: like item)               put the value at the current cursor position
    put_left (v: like item)              put the value before the current cursor position
    put_right (v: like item)             put the value after the current cursor position
    remove                               remove the value at the current cursor position
    remove_left                          remove the value before the cursor
    remove_right                         remove the value after the cursor
    put_front (v: like item)             add value to front of list
    extend (v: like item)                add value to end of list

There are many additional features in the ISE Eiffel library list class. Note that the feature headers given above do not show how the cursor is affected by each operation. To find out the correct and current details of your list class, run short on your version.
9.3 Scanning a list
The basic piece of code used in list manipulation is a routine to scan through a list, such as a list of people. The declarations for a list of people and a single person, and the code to add people to the list, look like
crowd: LINKED_LIST [PERSON]
person: PERSON

make is
        -- create a list of five people
    local
        i: INTEGER
    do
        !!crowd.make
        from
        until
            i = 5
        loop
            i := i + 1
            !!person.make
            crowd.extend (person)
        end
    end -- make

A list of five elements called crowd now exists. Each element is a complex object, a person, so the value of each cell of the list is a pointer to the object. A list is scanned by starting at the front, then moving the cursor forward by one position until the end of the list is reached. The basic operation in the routine is thus forth, repeated as necessary:

from
    crowd.start        -- set cursor to start of list
until
    crowd.after        -- stop scan at end of list
loop
    crowd.forth        -- move forward one cell
end
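The same scan pattern can be run in the other direction with the cursor features finish, back and before listed in the previous section. The fragment below is a sketch only: it assumes the list crowd is not empty, and that class PERSON offers the display routine used in the array example of the previous chapter.

```eiffel
from
    crowd.finish          -- set cursor to the last element
until
    crowd.before          -- stop when the cursor moves off the front
loop
    crowd.item.display    -- use the element at the cursor
    crowd.back            -- move backward one cell
end
```

As with the forward scan, the step that moves the cursor (here back) must appear inside the loop, or the loop never terminates.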
To find a specific person, the list is scanned until the person is found, or there are no more places to look. The value of each element in the list is a reference to an object, so comparing a list element with the '=' operator only tests whether two references point to the same object; it cannot be used to find a person with a given name. If each person has a unique name, then the name can be used to identify the person and each list element is compared using the name as the key; a key is a unique identifier for an object. A match function in class PERSON is thus needed, of the form:

class PERSON

feature

    name: STRING

    match (target: STRING): BOOLEAN is
            -- does the name match the target string?
        do
            Result := target.is_equal (name)
        end -- match

Note that the name cannot be tested for equality of content with the '=' operator, because it is a reference type; the test for equality is is_equal from the class STRING. A procedure that finds a person, if they are in the list, is then

find (name: STRING) is
        -- cursor points to the person with this name, or after
    do
        from
            crowd.start
        until
            crowd.after or else crowd.item.match (name)
        loop
            crowd.forth
        end
    end -- find

If a person with the given name is in the list, the loop stops when the cursor is moved to the element that contains that name. If the name is not in the list, the cursor will be after. If the cursor moves after, then the first part of the or else is true and the second part is not executed; an or test would result in an error, because the code would attempt to look at the value of the element after the list.

Common error: No forth, creating an infinite loop. When designing a scan loop, the entry and exit conditions are often the focus of attention as may be seen here, so the forth operation is often left out of the loop. The first entry is tested forever, the loop never terminates, and you are sitting staring at a “frozen” screen.
9.4 Cause and effect: matched routines
A function that searches the list and returns the matched element is incorrect Eiffel, for two reasons. First, the function has a side-effect because the position of the cursor is moved. If some other code relied on the position of the cursor, then this function would create untold havoc by changing the state of the list, and the bug would be very difficult to find because it is very effectively hidden inside the function. The second reason is the value that is returned if there is no matching object. The header of the function would be something like
find (name: STRING): PERSON is

The second problem is that the routine header says that it returns an object of type PERSON; more formally, it returns a pointer to the object. If there is no such person on the list, then what is returned?

One correct and efficient solution for finding an element is to copy the solution that Eiffel uses for reading input: have a command that scans the list for the person, then a query to test the value of the scan. The scan command causes a change, and the query checks the effect of the change. If the person was in the list, the scan command places the cursor in the correct position, so the person can be returned.

The code that scans a list to find a person with a given name is shown below. A procedure is called to scan the list, then the position of the cursor is tested. If the cursor points to an element in the list, then the person has been found, and is returned by the query item, that returns the element at the current cursor position. If the cursor is after the list, then no matching person was found. To hide the link between find and after, a function found may be defined to check the location of the cursor:

find (name)
if found then
    target := crowd.item
else
    io.putstring ("No such person")
end

found: BOOLEAN is
        -- check if the person was found in the list
    do
        Result := not crowd.after
    end -- found

All the stylistic constraints have been satisfied here: queries and commands are correct, the code in the caller is simple, clear and understandable, and the processing is hidden inside the routines. The code is also efficient, because the list is scanned only once to find the person. The pattern of matched routines is very common in Eiffel, due to the strict division between procedures and functions. The general principle is that the command makes a change, and the query tests if the command succeeded.
9.5 A local cursor
An alternate, simpler, but more dangerous solution is to just use a function, but we must be very clear about why this is a good solution and what it implies for the rest of the system. Formally, a function changes nothing and returns a value. The problem of the returned type can be solved by returning Void if no match is found; the function header is still correct because Void conforms to any reference type, including type PERSON here. The real problem is the cursor; it does move, so a function seems impossible. The trick is to note that “changes nothing” is a shorthand version of a more formal rule, that the state of the world is the same before and after the function call. With this more precise definition, we can change something as long as its value is replaced before the function ends; that is why we can use local variables in a function. Early versions of Eiffel supplied mark and return features to store and reset the cursor value, but these were removed from later versions. The solution is to treat the cursor position as though it were a local variable, that only exists while the routine is executing. Some versions of Eiffel do actually use local variables of type ITERATOR to iterate through a list, so there is no global cursor position in these versions. Given this approach, we can now write a simple scan function that returns the object if there is a match, and Void if there is no match. The function would look something like this:
find (name: STRING): PERSON is
        -- the person in the list with this name, or Void
    do
        from
            crowd.start
        until
            Result /= Void or crowd.after
        loop
            if crowd.item.match (name) then
                Result := crowd.item
            else
                crowd.forth
            end
        end
    end -- find

This solution works because the cursor is defined to not exist outside this function. The price of this solution is that if it is used once, it must be used forever more, by everyone who ever uses or extends your code. An explicit reference to the cursor position, outside this function, destroys the fiction that the cursor is a local variable and reveals that the function was a lie: it changes the (now explicit) cursor position.
9.6 Array or list?
An array is a good data structure to store a series of objects if the objects are stable. It is computationally cheap to find an element of the array from its index so array access is, in general, faster than list access. If the data is volatile (elements are often added or removed), then the list is a more efficient data structure because its elements don't need to be moved when the list length changes. A choice between the array and list data structures basically depends on the volatility of the data, where low volatility implies an array, and high volatility implies a list.
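The cost of removing an element from an array can be seen in a short sketch. The class SHELF, its attribute used, and the routine remove_at are hypothetical names invented for this illustration; the array features are those described in the previous chapter. Every element after the deleted one has to be copied down one cell, whereas a list only resets a link.

```eiffel
class SHELF
    -- hypothetical class: an array whose first used cells hold data

creation
    make

feature

    a: ARRAY [INTEGER]

    used: INTEGER
            -- number of cells currently holding data

    make is
            -- store 10, 20, 30, 40 then delete the element at index 2
        do
            !!a.make (1, 4)
            a.put (10, 1)
            a.put (20, 2)
            a.put (30, 3)
            a.put (40, 4)
            used := 4
            remove_at (2)                    -- leaves 10, 30, 40 in cells 1 to 3
        end -- make

    remove_at (place: INTEGER) is
            -- delete the element at place by moving later elements down
        require
            valid_place: place >= a.lower and place <= used
        local
            i: INTEGER
        do
            from
                i := place
            until
                i = used
            loop
                a.put (a.item (i + 1), i)    -- copy the next element down one cell
                i := i + 1
            end
            used := used - 1
        end -- remove_at

end -- class SHELF
```

Here the elements are moved starting at the deletion point and working towards the end, the opposite direction to the insertion sort, because each copy overwrites a cell whose value has already been saved or is no longer wanted.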
9.7 Class RANDOM
The Eiffel library class RANDOM provides a random number generator. An object of type RANDOM is created, and is then used to provide a sequence of random numbers. A random number generator produces a sequence of numbers within some defined range, that appear to be generated randomly. They are generated by a mechanism, however, so they are actually completely predictable. The first random number is based on a seed value. The calculation of the next random number has the form

    item := ((item * multiplier) + increment) \\ modulus

where item is the current random number and \\ is the integer remainder operator. The generator can produce integers with value from 0 to modulus - 1. The current item can also be returned as a real or as a double number, with a value from 0 to 1. Because the numbers are generated by a computation, the same sequence is generated every time a RANDOM object is created; to produce a new sequence, the new object is given a new seed value to start its sequence of random numbers. A partial short listing for the class is given below.
class interface RANDOM

creation
    make, set_seed

    make
            -- Initialize structure using a default seed
        ensure
            seed_set: seed = default value

    set_seed (s: INTEGER)
            -- Initialise sequence using s as the seed
        require
            non_negative: s > 0
        ensure
            seed_set: seed = s

    modulus: INTEGER
            -- Default value 2^31 - 1 = 2,147,483,647
            -- May be redefined for a new generator

    multiplier: INTEGER
            -- Default value 7^5 = 16,807
            -- May be redefined for a new generator

    increment: INTEGER
            -- Default value 0
            -- May be redefined for a new generator

    item: INTEGER
            -- Item at current position

    real_item: REAL
            -- The current random number as a real between 0 and 1

    double_item: DOUBLE
            -- The current random number as a double between 0 and 1

    forth
            -- Move to next position

end

A sequence of random numbers is generated by calling forth and item within a loop. Here is the code to show five REAL random numbers:

show_random is
        -- show five random real numbers
    local
        random: RANDOM
        i: INTEGER
    do
        !!random.make
        from
        until
            i = 5
        loop
            random.forth
            io.putreal (random.real_item)
            io.new_line
            i := i + 1
        end
    end -- show_random

A single random number generator object can be used in different parts of a system by defining a once routine that returns the same object every time. There is then just a single random number generator shared by every part of the system, so there is only a single sequence of numbers. A once routine to return the same RANDOM object every time is shown below, preceded by its calling code:
varied (n: REAL): REAL is
        -- the value of n varied by a factor within + or - 10% of n
    local
        random: RANDOM
        factor: REAL
    do
        random := shared
        random.forth
        factor := (random.real_item * 20) - 10    -- value in -10 to +10
        Result := n + n * (factor / 100)          -- value in n - 10% to n + 10%
    end -- varied

shared: RANDOM is
        -- the same random number generator object every time
    once
        !!Result.make
    end -- shared

The modulus, multiplier, and increment are implemented as once functions. To give them a new value, the class RANDOM is inherited and the features are redefined (see Chapter 10).
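The recurrence behind the generator can be tried by hand with small numbers. The sketch below is not the library code: the class name TINY_RANDOM and the values 5, 3, and 16 for the multiplier, increment, and modulus are assumptions chosen only to keep the arithmetic readable, and \\ is the integer remainder operator.

```eiffel
class TINY_RANDOM
        -- hypothetical class, for illustration only
creation
    make
feature
    item: INTEGER
            -- the current number in the sequence

    make is
            -- show five numbers from a toy congruential generator
        local
            i: INTEGER
        do
            item := 1                              -- the seed
            from
            until
                i = 5
            loop
                item := ((item * 5) + 3) \\ 16     -- next number in the sequence
                io.putint (item)
                io.new_line
                i := i + 1
            end
        end -- make
end -- class TINY_RANDOM
```

Starting from the seed 1, the loop displays 8, 11, 10, 5, and 12. Running it again displays the same five numbers, which is why a new seed is needed to get a new sequence.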
9.8 Case study: lists
"The bank can have many customers. Each customer has a unique integer key; successive integers are used for every new customer. The bank runs over an extended period. At the start of each day, a bank teller adds interest to every existing account and then creates new customers; customers are never deleted. The ATM then runs all day, handling multiple customers. To use the ATM, a customer enters their unique key and password. Any number of transactions may be made on the account, until the customer is tired of playing with their money and exits the system. The ATM then waits for the next customer, until a special key of 666 is entered; this exits the ATM system for the day. The cycle then resumes for the next day: the teller adds interest and new customers, and the ATM system runs. Entry of the special key value of 999 into the ATM shuts down the whole system." Main points in this chapter •
A list is a data structure that stores a sequence of elements of the same type. An element often has a unique key that is used to identify that element.
• A list is declared by passing an actual parameter in the LINKED_LIST[...] declaration. The actual parameter is bound to the formal parameter in the class header.
• A list allows you to insert and delete objects at any point in a list by changing at most one link pointer.
• The Eiffel class LINKED_LIST has a cursor that is used to move around in the list. The class offers three kinds of features: those dealing with the list, with the cursor, and with the elements of the list.
• The Eiffel class RANDOM is a random number generator. The numbers behave as though they were retrieved from a list of random numbers.
Exercises
1. Write the code to create and display a list of integers. Use a list of size 10.
2. Adapt the insertion sort code to sort the list of integers.
3. Implement the Unix command finger. finger takes a single name, and searches the list of system users for the user with that name. It returns their name and login name, so you can send mail to the login name.
   a) Assume that a single name is supplied; search the family names.
   b) Assume that a single name is supplied; search both personal and family names.
   c) What happens when there is a space in the supplied name?
4. Consider the following system specification:

The FeedMe Plant Nursery
"The FeedMe plant nursery has employed you to design a simple inventory and accounting system for them. They want to keep track of the money and the amount of stock on hand. The nursery sells fruit trees, such as apple, orange, plum and apricot trees. The nursery has to keep track of how many trees of each type it has on hand.

Money is divided into three categories. First, there is the money in the nursery's bank account. Second, there is money owed to the nursery from credit sales to customers. Third, there is the money owed by the nursery to its suppliers. At any time, the total capital of the nursery can be calculated from these sources.

There are two basic transactions in the system: selling trees to customers, and buying them from suppliers. The transactions change the amount of stock on hand, and may change the amount of money on hand. Trees are moved as soon as the transaction is complete. If a customer pays cash, then the bank balance is also changed immediately. If a transaction is on credit, the money will not change until four days later. Suppliers get the same deal.

A NURSERY has the attributes:
    name: STRING
    balance: REAL
    trees: LIST [TREE]

An instance of a TREE is not a single tree. It is a record of the nursery's stock of that tree. In particular, the instance contains a count of the number of trees on hand of that type. A TREE has the attributes:
    name: STRING;
    season: STRING;    -- season when fruit is ripe
    buyPrice: REAL;
    sellPrice: REAL;
    stock: INTEGER;

Customers owe money to the nursery; this is money coming in. Suppliers are owed money by the nursery; this is money going out. The lists of credit and debit transactions have to be stored, so that money can be transferred four days after the transaction. A TRANSACTION has the attributes:
    amount: REAL;
    delay: INTEGER;

When a credit transaction is made, a transaction object is added to the appropriate list. Four days later (a day is indicated by an 'N' message), the money is added to, or subtracted from, the bank balance of the nursery. Cash transactions change the bank balance instantly.

The system is interactive, and offers a menu of choices to the user. The user makes a choice, the choice is executed, and control then returns to the menu. The menu choices are

S    Sell any number of trees, of one or more types, to a customer, using cash or credit.
B    Buy any number of trees, of one or more types, from a supplier. The transaction can be cash or credit. There is no requirement that the nursery already has the tree type.
D    Display a list showing all the stock on hand. Show the type of the tree, the number of trees on hand, and the selling price of the tree.
T    Show all the details of a single tree type.
C    Show the capital of the nursery at this instant in time. To find this, take the bank balance, add the money owed to the nursery, and deduct the money owed by the nursery. Show the three subtotals, then the balance.
N    This transaction simulates a new day; I include it so you do not have to access a system clock. When the user chooses this option, it means that a new day has started. As stated above, credit transactions do not change the money until four days after the transaction.
E    Exit the system.
?    Show and explain the menu choices.

For each menu choice, you may need more details. If a customer buys some trees, then you must get the relevant details from the user. For each type of tree purchased by the customer, you will need to prompt for the type of tree and the number wanted."

a) Write the class names and attributes.
b) Write the routine names and signatures for each feature in each class.
c) Debate the location of the transactions. There are at least three 'reasonable' solutions.
d) Code and test the system.
Chapter 10: Inheritance
Keywords: open-closed, inherit, parent, child, redefine, rename, Precursor, export

Inheritance provides a new way to reuse a class in an OO system, separate from the client-supplier relationship. When one class inherits another, all the features of the inherited, parent class become features of the child, inheriting class. This means that an existing class can be extended with no change to its code. Existing users of the parent class are not affected, because the behaviour of the parent is not changed. A child class inherits the parent class and can use the parent features unchanged, can redefine parent features, can rename parent features, and can add new features.
10.1 Look and feel
Inheritance supports a style of software development different from traditional approaches. Instead of trying to solve a new problem from scratch, existing solutions are inherited and extended. The benefit of this approach may be explained in terms of the open-closed principle, which states that a good module structure is both closed and open; in Eiffel, the module is the class.

A module should be closed so that clients are protected from any changes in the working system. A client uses the services supplied by a module, and once these services have been defined the client should not be affected by the introduction of new services they do not need. A module should be open so that it can be changed and extended as needed. There is no guarantee that every service offered by a class can be defined once and never changed. Successful software systems continually undergo change, as the needs of the users change and develop.

This double requirement looks impossible, but it is solved by inheritance. A class is closed, because it may be compiled, stored in a library, and used by clients. A class is open, because any new class may inherit it as a parent, and add new features as desired. When the child class is defined, there is no need to change the original class or to disturb its clients. New code often contains errors, and a user does not want to deal with errors caused by new software for some other, new user.

Inheritance provides a powerful way to extend code by adding new features to an existing class without changing that class at all. This feat is accomplished by allowing one class to inherit another; the new class can use everything in the original class, plus any additional features defined in the new class. Assume we have an existing class A; this will be the parent class. If B (the child) inherits from A (the parent), then all features of A are available in B, with no need to define them. The child is free to add new code for its specific purposes, or use the inherited code in other ways. The child class treats all the inherited features exactly as though they were written inside the child. A client of the child sees no difference between the inherited and the new features; they are simply features of the child.
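The relation between A and B can be sketched in a few lines of code. This is an illustration only: the features greet and farewell and the class CLIENT are invented names, the inherit syntax is explained in section 10.3, and each class would normally be stored in its own file.

```eiffel
class A
feature
    greet is
            -- a feature defined in the parent
        do
            io.putstring ("Hello from A%N")
        end -- greet
end -- class A

class B
inherit
    A
feature
    farewell is
            -- a new feature in the child; it calls the inherited
            -- feature exactly as though greet were written in B
        do
            greet
            io.putstring ("Goodbye from B%N")
        end -- farewell
end -- class B

class CLIENT
creation
    make
feature
    child: B

    make is
            -- the client sees greet and farewell simply as features of B
        do
            !!child
            child.greet
            child.farewell
        end -- make
end -- class CLIENT
```

Class A is not changed when B is written, and any existing client of A continues to work as before; this is the closed half of the principle. Class B adds farewell without repeating the code for greet; this is the open half.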
10.2 Inheritance chart
In Eiffel, a child class inherits services or features from its parent classes. Two examples of inheritance are shown below, using an inheritance chart. In the example to the left, the class SAVING inherits the class ACCOUNT, because a savings account is a specific type of account. The class SAVING thus has all the normal features of an account, plus whatever features are particular to a savings account. In the inheritance chart shown to the right, there are two specific types of account, SAVING and CHEQUE accounts. Again, each child class has all the features of its parent, plus any specific features that separate the child from the parent.

In an inheritance chart, a single arrow is drawn from the child to the parent; by convention, parents are drawn above their children in the chart. Unlike the real world, a child class chooses its parents; this is indicated by an arrow pointing from the child to the parent. The parent class cannot know which class might later inherit it, just as a supplier cannot know which class might later use its services.
[Inheritance charts. Left: SAVING inherits ACCOUNT. Right: SAVING and CHEQUE both inherit ACCOUNT.]
A more complex inheritance chart of seven classes is shown below. Class A is inherited by both B and C, so A is the parent of the two child classes B and C. Classes D and E inherit from class B, so D and E are children of B. Classes B, C, D, and E inherit from a single parent, so they show single inheritance. The class at the top of an inheritance hierarchy is called the base class, because it is the basis for all the other, more specific classes. Here, class A is the base class for B, C, D, E, and F. Class F inherits from both class C and class G, so it has two parents and uses multiple inheritance. In general, a class can inherit from any number of parents. Inheritance from a single parent is discussed in this chapter, and multiple and repeated inheritance are discussed in Chapter 12.
[Inheritance chart: B and C inherit A; D and E inherit B; F inherits C and G.]
Inheritance is transitive, so class E has all the features inherited from its parents (B here), its grandparents (A here), and so on up the inheritance hierarchy. Any feature defined in A is a feature of B, and any feature of B is a feature of D, so class D can contain features from A, from B, and new features added in D. Class F contains features of its own, plus features inherited from C, from G, and from A. We say that any parent of a class, or parent of one of its parents, is an ancestor of that class, possibly many links removed. In turn, any child of a class is called a descendant, possibly many links removed from the original class. In the inheritance chart, class A is the ancestor of classes B to F, so classes B to F are heirs or descendants of A. Class F is also a descendant of class G, just as class G is an ancestor of class F. A child is a descendant one link down, and a parent is an ancestor one link up the chart.

A feature that is coded within a class is called an immediate feature of that class. All other (non-immediate) features of a class are inherited. A class diagram of a class shows only the immediate features of that class.
10.3 Syntax and mechanism
One class inherits another by writing the keyword inherit, followed by the name of the parent class. The keyword is written immediately after the class header. The general syntax is shown below, where CHILD is the name of the new, child class and PARENT is the name of the parent class; the child class is free to define its own features, in addition to the features inherited from its parent:

class CHILD
inherit
    PARENT
When one class inherits another, the code in the child works exactly as though the parent code was written in the child. As far as the operation of the child class is concerned, there is no difference between features inherited from the parent, and features defined in the child. Feature calls treat all features of the class identically, and the source of a feature is invisible. A class cannot inherit itself either directly or through a chain of other classes; such a situation is known as a cycle, and cycles are not allowed in the inheritance hierarchy.

The word inherit is written at the start of the line, at the same level as class and feature. The name of the inherited class is written on the next line, indented four spaces. The inherit statement has a series of clauses that modify the status of an inherited feature in various ways. The full form of the inheritance statement is shown below, where capital letters indicate a class and small letters indicate a feature. A class can inherit from multiple parents; in this case, the parent classes are separated by a semi-colon in the listing. A child may inherit all features of a class and use them unchanged, or it may change an inherited feature in various ways. If a clause is used to change the status of an inherited feature, then the set of clauses is terminated by an end.
class CHILD
inherit
    A
        rename m as n            -- new name in child
        export {X, Y} o, p       -- new export policy in child
        undefine q               -- no definition in child
        redefine r, s            -- new body in child
        select t                 -- select active feature
        end;
    B ...
The rename clause gives an inherited feature a new name in the child; the feature body and signature are retained, but the feature has a new name. The export policy is inherited as part of a feature; the export clause gives a new export policy to a feature. A feature may be deleted or undefined in the child; discussion of this clause is deferred to the next chapter. The name of a feature can be retained, but its body may be changed if the feature is redefined in the child. Finally, if there are several features in the child with the same name, one of them can be selected to be the active feature; discussion of this clause is deferred until Chapter 12.

It is the responsibility of the child to name its creation routine under the keyword creation in its class definition. A class inherits its parent's features, so it inherits the parent's creation routine if one exists. The child does not inherit the creation status of this routine, however. The child must explicitly state its creation routine in its creation clause.

A class does not inherit the expansion status of a parent. If the parent class was expanded, and you wish the child to be expanded, then the child class header or the child object declaration should contain the keyword expanded. The base class (that is expanded) must have either no creation routine, or a single creation routine with no argument.

A class usually contains features inherited from its parent, and these features do not appear in a short output. The flat command generates a listing of all the exported features in a class, both inherited and immediate; running the flat tool on a class thus shows the collection of features exported by the class and by any of its ancestors.
When classes are listed in a system, a parent class is presented before its children; no class listing order is defined on the children within this grouping. A total listing order that makes the code in the system classes easy to follow is to
• use the client order as the basic listing order
• when a child class is encountered in client order, show the parent and then the child classes
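The rule about creation status given in this section can be illustrated with a small sketch. The classes, the attribute count, and the routine double are hypothetical names used only for this example; each class would normally be stored in its own file.

```eiffel
class PARENT
creation
    make
feature
    count: INTEGER

    make is
            -- start the count at one
        do
            count := 1
        end -- make
end -- class PARENT

class CHILD
inherit
    PARENT
creation
    make    -- listed again here: the routine is inherited,
            -- but its creation status is not
feature
    double is
            -- double the inherited count
        do
            count := count * 2
        end -- double
end -- class CHILD
```

A client can then write !!c.make for an object c of type CHILD. If the creation clause were left out of CHILD, make would still be an ordinary feature of the child, but it would not be a creation routine of the child.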
10.4 Inherit or client?
An existing class can be reused in two ways, as a supplier and as a parent. When should you buy, and when should you inherit? The general answer is that inheritance means "is", and client means "has", "uses", or "contains". Looking at the BANK system, a customer has an account; a customer is not a type of account. A customer is a person; a customer does not contain a person.

One question to ask when the relation is unclear is "Can the class have two of them?". If the class can have two objects of some type, then the client relation is used; if there is always one object, then inheritance is likely. It is possible, for example, to define a bank customer as a bank account that is a person, so the class CUSTOMER would inherit from PERSON and ACCOUNT; from the person's perspective the customer is a person, but from the bank's perspective the customer is an account. A customer can easily have two accounts, however, so a customer cannot be an account.

Inheritance is used when an instance of A may also be seen as an instance of B (a rectangle is a polygon; a cat, dog, or bird is an animal). The client relation is appropriate when an instance of B uses an object of type A. A useful question to ask is "Does it have to be this way?", or "Is this temporary or permanent?". If the relationship is permanent and can never be changed, then inheritance should be used. If the relationship can change from one system to another or one use to another, then a client relation should be used.

The decision about how to structure a particular system involves many issues, and a discussion about this topic is left to more specialised texts on OO system analysis and design such as Booch (1994), Henderson-Sellers (1994), and Rumbaugh (1991). The basic idea is to define the behaviour and code you need once, at one place in one class, and then use the class as a parent or as a supplier.
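The two relations can appear side by side in a single class. The outline below is a sketch only, not the case study code: it assumes that classes PERSON and ACCOUNT already exist, and the attribute name accounts is invented for the example.

```eiffel
class CUSTOMER
inherit
    PERSON
            -- inheritance: a customer "is" a person
feature
    accounts: LINKED_LIST [ACCOUNT]
            -- client: a customer "has" accounts, possibly two or more
end -- class CUSTOMER
```

The test "Can the class have two of them?" is visible in the code: a class names a given parent once, but a list attribute can hold any number of accounts.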
10.5 Inherit example: class WORKER
A worker is a person who works. The class WORKER can thus be split into two parts, one part defining what it is to be a PERSON, and one part adding the extra features that define a WORKER. A person has a name, address, and gender. A worker is a person with extra attributes (and routines that use those attributes) that deal with the pay rate, the hours worked, and so on. The dual role of a worker can be neatly captured through the inheritance relation.

The class PERSON is defined by the code below; some of the routine bodies have been omitted for simplicity. An object of type PERSON has 11 features: three attributes, two exported routines, and six private routines. Each routine is small and does a single thing; a large number of small routines is tedious to code and creates a long class listing, but this effort has to be made once, and the class can then be reused without change.
class PERSON
creation
    make
feature {COMPANY}
    make is
            -- set the values of the attributes
        do
            io.putstring ("%NEnter the personal details%N")
            get_name
            get_gender
            get_address
        end -- make

    show is
            -- show the personal details
        do
            show_gender
            io.putstring (name)
            io.putstring (" lives at ")
            io.putstring (address)
        end -- show

feature {NONE}
    name: STRING

    get_name is
            -- set the value of the name
        do
            io.putstring (" Name: ")
            io.readline
            name := clone (io.laststring)
        end -- get_name

    gender: CHARACTER

    get_gender is
            -- loop until the user enters a valid gender, store the gender
        do
            from
                read_gender
            until
                valid_gender
            loop
                io.putstring ("Valid codes are M or F. Try again%N")
                read_gender
            end
            gender := io.lastchar
        end -- get_gender

    read_gender is
            -- read in a gender code
        do
            io.putstring (" Enter the gender (M/F): ")
            io.readchar
            io.next_line
        end -- read_gender

    valid_gender: BOOLEAN is ...
            -- has a valid gender code been entered?

    show_gender is ...
            -- print a title (Mr. or Ms.) based on the gender

    address: STRING

    get_address is
            -- set the value of the address
        do
            io.putstring (" Address: ")
            io.readline
            address := clone (io.laststring)
        end -- get_address
end -- class PERSON

The class PERSON is inherited and used by other classes. The new class WORKER inherits PERSON, and adds the additional fields pay_rate, hours, gross and tax, plus their associated routines. The first part of the code for class WORKER is shown below, and the remainder of the code is developed in the rest of this chapter.
class WORKER
inherit
    PERSON
creation
    make
feature {NONE}
    pay_rate, hours, gross, tax: REAL

The new class has seven attributes, three inherited from the class PERSON (name, gender, address) and four immediate attributes defined within the class WORKER (pay_rate, hours, gross, tax). It also contains eight routines inherited from class PERSON, plus any new routines defined within the class (not shown). A client of the class WORKER does not know if a feature was inherited, or defined within the class; the client simply uses a feature of WORKER.

Assume that we have a class COMPANY that is a client of class WORKER, as shown in the outline code below. Note that the client does not mention class PERSON, because it is not a client of PERSON; it declares and uses an object of type WORKER:

class COMPANY
feature
    me: WORKER
    ...

The client and inheritance charts for the classes PERSON, WORKER, and COMPANY are shown below. Note that there are two charts: one client chart and one inheritance chart. These charts are always separated, because they show different types of information.
[Inheritance chart: WORKER inherits PERSON. Client chart: COMPANY is a client of WORKER.]
10.6 Redefine
A child class uses features inherited from its parent in three main ways. The child may use an inherited feature unchanged. A child may redefine a feature it inherits; the new version of the feature in the child has the same name and signature, but its body is different. Finally, a child may rename a parent feature; this uses the parent feature definition, but gives it a new name in the child. These three alternatives allow a class to take its pick of the features offered by the parent; some may be kept as they are, others redefined and overwritten by more appropriate code, while other features are renamed and used as part of a child feature.

A feature is redefined in the child when its name is written after the keyword redefine in the inherit clause; if multiple features are redefined, then the names are separated by a comma. The format of the redefine clause is
class CHILD
inherit
    PARENT
        redefine x, y, z
        end

The redefine clause breaks the link between the name of a feature and its content; the name is retained, but the child defines a new version of the feature. The feature in the parent class is called the precursor of the redefined feature; in English, precursor means something like "the one before this". The feature used in the current class is called the active or final feature; it may be inherited or immediate.
[Diagram: class PARENT defines make and show; class CHILD inherits PARENT with redefine make, show and defines its own make and show, so each feature has a parent version and a child version.]
redefine creates a new version of the parent feature in the child. The feature in the parent now has a new version in the child. A simple system with two versions of make and show is shown in the diagram above.

A feature preceded by the keyword frozen cannot be redefined. Freezing the name of a feature is used for system-level features that will never be changed, and can be used by all the classes in a system. The standard system features clone, standard_copy, and is_equal are frozen, because they are Eiffel-defined features that can be used everywhere and will never change. The routine header for copy lists a name that can be redefined, copy, and a name that is frozen and cannot be redefined, standard_copy. The full feature interface is

copy, frozen standard_copy (other: like Current) is
        -- Copy every field of other onto
        -- corresponding field of current object
    require
        other_not_void: other /= Void
    ensure
        is_equal (other)
    end -- copy

The feature has two names, copy and standard_copy. The name copy can be redefined for specific classes, so each class can define its own copy routine, but the feature standard_copy cannot be redefined and is the same for all classes.

A function with no arguments can be redefined as an attribute. A constant cannot be redefined; it makes no sense to change the value of a constant. An attribute cannot be redefined as a function because this can cause working code in the parent to break. Consider the illegal parent and child classes that contain the code
class PARENT
    pay: REAL
    use is
        do
            pay := 43
            ...

class CHILD
inherit
    PARENT
        redefine pay end
    pay: REAL is
        do
            ...

In the parent, pay is an attribute and a value is assigned to it in the use routine. In the child, pay is redefined to be a function; we cannot assign a value to a function, so a call to use would cause a runtime error. Eiffel solves this problem by not allowing it to happen; you can't redefine an attribute as a function. Going the other way (function to attribute) is fine, because there is no danger: a function returns a value, an attribute returns a value and can be given a value by assignment, so nothing is lost in this transition from function to attribute.

Redefining applies the open-closed principle, because a feature is inherited and changed in the child. The parent code is unchanged so the system is closed, and the new class provides new functionality so the system is open.
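The legal direction, from function to attribute, can be sketched as follows. This is an illustration under assumed names: the routine set_pay is invented for the example, and each class would normally be stored in its own file.

```eiffel
class PARENT
feature
    pay: REAL is
            -- a function with no arguments
        do
            Result := 43
        end -- pay
end -- class PARENT

class CHILD
inherit
    PARENT
        redefine pay end
feature
    pay: REAL
            -- the same name and type, now stored as an attribute

    set_pay (amount: REAL) is
            -- an attribute can be assigned, so the child gains this ability
        do
            pay := amount
        end -- set_pay
end -- class CHILD
```

Code in PARENT only ever reads the value of pay, so it still works when pay is an attribute in the child; nothing the parent relied on has been taken away.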
10.7 Redefine example: class WORKER
The creation and display features from the class PERSON are inherited by WORKER, but they are not sufficient for the child class. The child has additional attributes that must be set and displayed, but the parent code knows nothing of these. One solution is for class WORKER to define its own features make and show, and call the PERSON features as part of these routines. The code to redefine the inherited features is
class WORKER
inherit
    PERSON
        redefine make, show
        end

The make routine in WORKER gets the personal details, then gets the pay rate for the new worker. The show routine in WORKER shows the personal details, then shows other fields used in the child. New routines are added to get the number of hours worked, and to calculate and store the gross pay and tax. The full code for a very simple class WORKER is
class WORKER
inherit
    PERSON
        redefine make, show
        end
creation {COMPANY}
    make
feature {COMPANY}
    make is
            -- read and store values for the name, gender, address, and pay rate
        do
            io.putstring ("%NEnter the worker details%N")
            get_name
            get_gender
            get_address
            get_pay_rate
        end -- make

    daily (today: REAL) is
            -- add the number of hours worked today to the total for the week
        do
            add_hours (today)
        end -- daily

    weekly is
            -- set the gross pay and tax
        do
            set_pay
            set_tax
        end -- weekly

    show is
            -- show the worker details
        do
            show_gender
            io.putstring (name)
            io.putstring (" lives at ")
            io.putstring (address)
            show_hours
            show_pay
            show_tax
        end -- show

feature {NONE}
    pay_rate: REAL

    get_pay_rate is
            -- read the pay rate from the user, store it
        do
            io.putstring ("%NEnter pay rate: ")
            io.readreal
            pay_rate := io.lastreal
        end -- get_pay_rate

    show_pay_rate is
            -- show the pay rate
        do
            io.putstring ("%NPay rate is ")
            io.putreal (pay_rate)
        end -- show_pay_rate

    hours: REAL

    add_hours (today: REAL) is
            -- update the total hours worked
        do
            hours := hours + today
        end -- add_hours

    show_hours is
            -- show the number of hours worked
        do
            io.putstring ("%NHours worked is ")
            io.putreal (hours)
        end -- show_hours

    gross: REAL

    set_pay is
            -- calculate and store the gross pay
        do
            gross := hours * pay_rate
        end -- set_pay

    show_pay is
            -- show the total pay received
        do
            io.putstring ("%NGross pay: ")
            io.putreal (gross)
        end -- show_pay

    tax: REAL

    tax_rate: REAL is 22.5

    set_tax is
            -- calculate and store the tax on the gross pay
        do
            tax := gross * tax_rate / 100
        end -- set_tax

    show_tax is
            -- show the tax paid on the gross income
        do
            io.putstring ("%NTax on gross: ")
            io.putreal (tax)
        end -- show_tax
end -- class WORKER

Class PERSON has 11 features: three attributes, six private routines, and two exported routines. Class WORKER has 24 features, nine inherited unchanged, two inherited and redefined and 13 new feature definitions; the feature names are shown in the table below, with exported features (make and show) marked by an asterisk. The child class WORKER redefines the two exported routines; the export status of a feature is inherited with the feature, so although the bodies of these features have changed from parent to child, the export status has not. Fifteen immediate features are defined in the child: two redefined features and 13 new features. Five immediate attributes have been added in the class WORKER, plus eight immediate routines that use the new attributes.

PERSON features    redefine    WORKER features
make *             make        make *
show *             show        show *
name                           name
get_name                       get_name
gender                         gender
get_gender                     get_gender
read_gender                    read_gender
valid_gender                   valid_gender
show_gender                    show_gender
address                        address
get_address                    get_address
                               pay_rate
                               get_pay_rate
                               show_pay_rate
                               hours
                               add_hours
                               show_hours
                               gross
                               set_pay
                               show_pay
                               tax
                               tax_rate
                               set_tax
                               show_tax
COMPANY is a client of WORKER, because a company uses the services of the worker. The client declares an object of type WORKER, and then uses the services of this class. The code in class COMPANY calls features of class WORKER, that may be inherited unchanged from the parent, defined as immediate routines in the child, or redefined in the child; the source of a feature is invisible to the client. The definition for the simple class COMPANY is
class COMPANY
creation
    make
feature
    me: WORKER

    make is
            -- make, work, and display the worker
        do
            !!me.make
            me.daily (40)
            me.weekly
            me.show
        end -- make
end -- class COMPANY

10.8 Redefine example: class CONTRACTOR
Consider the example of a worker who is an independent contractor, not a full-time employee. The salary of a contractor is not taxed each week; instead, the full salary is paid by the employer and the contractor pays provisional tax at the end of each year. For this reason, the methods for calculating gross pay and tax are incorrect, and need to be redefined. The rest of the information about the worker is the same, so class WORKER can be reused except for the feature that calculates gross pay and tax. The obvious approach is to redefine the tax_rate to be zero, but this is illegal because we can't redefine a constant. A better solution is to note that a contractor pays no weekly tax, so we simply redefine the routine that sets the tax, and then tax is always zero for the contractor. The complete definition for class CONTRACTOR is
class CONTRACTOR
inherit
    WORKER
        redefine set_tax
        end
creation
    make
feature
    set_tax is
            -- a contractor pays no tax on weekly income
        do
        end -- set_tax
end -- class CONTRACTOR

Class CONTRACTOR has 24 features, with the same behaviour as the features in its parent, class WORKER. All 24 features are inherited, and one is redefined as an immediate feature.
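A client uses a contractor in the same way that it uses a worker. The sketch below is an assumed variant of class COMPANY, not part of the original listing: the attribute name you is invented, and the calls use the routine names exported by class WORKER (daily, weekly, and show).

```eiffel
class COMPANY
creation
    make
feature
    you: CONTRACTOR

    make is
            -- make, work, and display a contractor
        do
            !!you.make
            you.daily (40)
            you.weekly    -- calls set_pay, then the redefined set_tax
            you.show
        end -- make
end -- class COMPANY
```

The routine weekly is inherited unchanged from WORKER, but its call to set_tax now runs the empty body defined in CONTRACTOR. The attribute tax is never assigned, so it keeps its default value of zero.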
10.9 Rename
The rename keyword in the inheritance clause gives a parent feature a different name in the child class. An inherited feature is renamed by writing the old name, the keyword as, and the new name. If multiple features are renamed, then each name change is placed on a new line, separated by commas. The syntax of the rename clause is shown below:
    class A
    inherit
        B
            rename
                x as y,
                p as q,
                r as s
            end

The name of the base version of a feature is called the original name, and the name of the feature in a child class is called the final name of the feature. Inheritance is transitive, so a descendant of the child class inherits the feature with its new name; that is, the name of the feature in the child. Because the feature has a new name in the child, the original name can be given to an immediate feature in the child. rename does not create a new feature; it simply gives the parent feature a new name in the child.

The effect of a rename clause is shown in the diagram below, where the two parent features have different names in the parent and in the child. The child then defines two new, child features that have the same names as the original, parent features.
    PARENT
        features: make, show
            make is ...
            show is ...

    CHILD (inherits PARENT)
        rename make as make_parent, show as show_parent
        inherited features, renamed: make_parent, show_parent
        immediate features: make, show
            make is ...
            show is ...
Renaming applies the open-closed principle, because a feature can now be inherited and used as part of another routine. The parent code is unchanged so the system is closed, and the new class provides new functionality so the system is open.

Common error: feature of child has same name as feature of parent, generating a name clash
    Error code: VMFN
    Error: Two or more features have the same name
    What to do: If they must indeed be different features, choose different names or use renaming
10.10 Rename example: class WORKER

The creation routine for a worker should execute the creation routine for a person, and add the extra code required to use the extra fields in the worker. The WORKER creation routine thus needs the ability to call the PERSON creation routine. This ability is already provided by inheritance, because the class WORKER inherits the individual features that set and show the attributes from class PERSON. The creation routine in WORKER, however, should use the standard name make for its own creation
routine and call the parent make routine as part of its processing, so no code is repeated. The solution is to inherit the parent's creation routine, and give it another name within the child class, such as make_person. The obvious but incorrect code to reuse a parent feature as part of the child feature of the same name is shown below:
    class WORKER

    inherit
        PERSON
            rename
                make as make_person,
                show as show_person
            redefine
                make, show
            end

This inheritance clause will not compile, because the clauses are compiled in their listed order. The rename clause gives the parent features new names. When the redefine clause is then seen by the compiler, there is no inherited feature with the listed names, due to the previous rename, so an error message is generated:

    Error code: VDRS (1)
    Error: Identifier in redefine subclause does not denote inherited feature.
    What to do: Make sure that all identifiers in subclause are final names of features inherited from the given parent.

The current solution to this common problem is to use multiple inheritance, discussed in chapter 12. The same parent class is inherited twice. On one inheritance path, the feature is renamed. On the other inheritance path, the feature is redefined and that redefined feature is selected for use in the child. The child can then use the renamed parent feature as part of the redefined parent feature! The correct inheritance code to do this is:
    class WORKER

    inherit
        PERSON
            rename
                make as make_person,
                show as show_person
            end
        PERSON
            redefine
                make, show
            select
                make, show
            end

    creation {COMPANY}
        make

    feature {COMPANY}
        make is ...
        show is ...

10.11 The precursor of a feature

The practice of using a parent feature as part of the child feature is so common that a special keyword has been added to the Eiffel language (Meyer, 1997) for just this purpose. When Eiffel sees the keyword Precursor in a routine at compile time, it replaces it with a call to the parent routine with the same name. At run time, the parent routine is then called and executed. The solution to the WORKER problem is now simple; a feature is redefined in the inheritance clause, and the new feature calls Precursor. The code in class WORKER to create and to display a worker is then:
    class WORKER

    inherit
        PERSON
            redefine
                make, show
            end

    feature {COMPANY}
        make is
                -- get and store the name, gender, address and pay rate
            do
                Precursor
                get_pay_rate
            end -- make

        show is
                -- display the worker fields
            do
                Precursor
                show_hours
                show_pay
                show_tax
            end -- show

    end -- class WORKER

If a feature with the same name is inherited from several parents then there is the possibility of a name clash, so the name of the parent is written as a policy before the keyword, such as {PARENT1} Precursor. Unfortunately, Precursor has not yet been implemented. The standard solution is to use multiple inheritance, as described above and explained in detail in Chapter 12.
10.12 Export

When a class inherits a feature, the export policy comes along with the feature. The child can retain the existing policies, or define a new policy for a feature. All the parent features are there if the
child wishes to export them, but the child can define its own, different exports and thus its own behaviour. A feature inherits its export status from the parent. The child class can use the inherited export policy, or it can override the inherited policy by using the export clause. The export clause lists the new export policy using the familiar {...} notation, and this policy is then applied to each feature listed after the policy. If there are multiple export policies, each policy is placed on a single line, separated by a semi-colon. The form of the export clause is
    export
        {classes} feature, feature, ... feature;
        {classes} feature, feature, ... feature

Each line of the clause consists of an export policy, followed by the list of features that use the export policy. The export policy is the same as that used for features and for creation, a list of classes separated by commas. Feature names in the feature list are separated by commas, and terminated by a semi-colon. The keyword all may be used instead of a feature list, to denote all the inherited features; this keyword may be used only once within an export clause. The export clause export {NONE} all, for example, hides all the features inherited from a parent. If WORKER wanted to hide the PERSON make and show routines, it could use the code
    class WORKER

    inherit
        PERSON
            rename
                make as make_person,
                show as show_person
            export
                {NONE} make_person, show_person
            end

The inheritance clauses are written in a fixed order, and are executed in that order. This means that the export policy has to use the “current” name of the feature, here the name given to the feature by the rename clause.

Common error: The inherited status of a feature overrides any immediate status; what you see is not what you get. In particular, the export status of a feature is inherited with that feature. Consider the code shown below:
    class PARENT

    feature {NONE}
        show is ...

    end -- class PARENT

    class CHILD

    inherit
        PARENT
            redefine
                show
            end

    feature {ANY}
        show is ...

    end -- class CHILD

    class CLIENT

    feature
        c: CHILD

        use is
            do
                !!c
                c.show
            end -- use

    end -- class CLIENT

This code will not compile and execute, even though the immediate export policy on the child routine show is {ANY}. The child defines a new feature by inheriting the previous version and redefining it, but the inherited export policy has not been changed along the inheritance path. The Eiffel run-time environment follows a feature down its inheritance path, so Eiffel finds no change to the parent export policy {NONE} and thus cannot call the feature. The code above generates the error

    Error code: VUEX (2)
    Error: feature of qualified call is not available to client class
    What to do: make sure feature after dot is exported to caller

To change the inherited export policy, you set a new policy in the inheritance clause:
    class CHILD

    inherit
        PARENT
            export
                {ANY} show
            redefine
                show
            end

    feature
        show is ...

    end -- class CHILD

The inherited export status is changed in the inheritance clause, so the new version has the new export policy.
10.13 Case study: inheritance

The existing BANK system is restructured by inheritance to separate out the components of a customer. A customer is a person who has a bank account, plus a user identifier and a password. Two classes can be defined to capture this distinction, where the class CUSTOMER inherits the class PERSON. The personal features are the person's name, gender, and address, plus routines that use this data. The additional customer features are the unique customer identifier, the password and the account, plus routines that use this data.

Main points in this chapter

•  When a class is inherited, the existing, original and reused class is called the parent and the new, inheriting class is called the child class.

•  A child inherits all the features of the parent, and may add new features of its own. A feature call does not distinguish between parent and child features; all features are used exactly as though they were defined in the child.

•  The creation routine is inherited, but not its creation status. It is the responsibility of the child to choose its creation routine.

•  The expanded status of a class is not inherited.

•  redefine makes a new version of a parent feature in the child; the feature name and signature are retained, but the child provides a new feature body.

•  rename changes the name of the parent feature in the child. The signature and body of the feature are retained.

•  A parent feature is used as part of a child feature of the same name either by multiple inheritance, or by writing Precursor in the redefined child feature.

•  The export status of a feature is inherited with the feature, but may be changed by listing the new export policy of the feature in the export clause.
Exercises

1. How is code reused by inheritance? Give an example, showing the client and inheritance charts, and parent, child, and client code. How does inheritance affect a client of the child?

2. Draw an inheritance chart from amoeba to human (you may gloss over some of the stages). On each node of the chart, write the new features for each child.

3. There are three main clauses within an inherit clause. What are they? What does each do? What is the format of each clause? What order are they listed in?

4. Consider the following Eiffel class headers:

    class A
    creation
        make
    feature
        a, b: M
        c is
        d: BOOLEAN is
        e (a: X) is
    end -- class A

    class B
    inherit
        A
            rename
                a as aye,
                c as cee
            redefine
                aye, d
            end
    creation
        make
    feature
        a, f, g: N
        h is
        i (a: O) is
    end -- class B

    class C
    inherit
        B
            rename
                aye as eh,
                a as aiee,
                b as bee,
                i as eye
            redefine
                h, eye
            end
    creation
        make
    feature
        k, l: Q
        a is
    end -- class C

Draw a table with the classes listed down the left side, in inheritance order. Write the names of each feature in the first class (class A) along the top of the table, then write the name of that feature in the child. If the definition of the feature has changed in the child, mark it with an asterisk.

5.
Consider the following specification:

"A bank offers four types of account: savings, cheque, scrooge, and investment. The first three types can be accessed through an ATM, so they offer the services deposit, withdraw, and show balance. For each type of account, the rules for each service are slightly different. Any amount may be deposited in an account. A withdrawal from a savings account decrements the balance by the amount withdrawn. A successful withdrawal from a cheque account costs 50 cents. An unsuccessful withdrawal from a cheque account costs $5. There are no charges or penalties for a savings account. The balance of an account cannot be negative. A savings account gets daily interest; the interest rate is 4.5% a year. A cheque account gets no interest. An investment account is created with an initial balance, and accrues daily interest for a period of 3, 6, or 12 months. A 3-month investment account has an annual interest rate of 5.5%, a 6-month account has a 6.0% rate, and a 12-month account 6.5%. When the account matures at the end of the period, the total amount is transferred into the customer's cheque account. If there is no cheque account, then one is created specially to receive the investment funds. A scrooge account allows money to be deposited, but not withdrawn. A scrooge account is set up for some period with an interest rate of 6.0%, and the balance increases over the period due to interest and deposits. Interest is paid daily. At the end of the period, the money is transferred into a cheque account."

Write the inheritance hierarchy for this fragment of the system. First, build a table that shows the common and shared parts of the system, then convert the table to an inheritance hierarchy. List the classes across the top of the table, and the behaviours down the side. Within the table, write a cross if a behaviour has the same content for each class, a circle if the details differ for each class, and leave the intersection blank if the behaviour does not occur. The detailed steps to follow are

a) give a name to each class
b) write down the attributes for each class on the table
c) write down the routines for each class on the table
d) capture the common behaviour in an inheritance hierarchy
e) define the inheritance hierarchy by writing the feature header code in each class
f) implement each feature

The problem given here is complex, but it can be decomposed into smaller pieces. A simpler problem is to consider only two classes. Write the class definition for a single class, then choose another class and design the inheritance hierarchy for those two classes. When you believe that your solution works, add a third class. Finally, add the last class.
Chapter 11: Polymorphism

Keywords: conformance, deferred, effective, dynamic type, dynamic dispatch, polymorphism

A class in an inheritance hierarchy has many types, from the current type to the most abstract type at the top of the hierarchy. A common pattern is to define the feature interface at a high level in the hierarchy, and to defer the body or action of a feature until a later, descendant class. Such a deferred feature has its name and signature defined in a parent, but the definition or body of the feature is given in the child. The child effects or gives an effective definition for the feature, so each child can define its own specific action. Objects of the various child types can be treated identically by calling the feature on the object; the feature interface is identical, but the content of the feature is specific to each class. This technique is called polymorphism, because it allows the same code to use objects of many shapes or types.
11.1 The Eiffel type hierarchy

Eiffel uses a class hierarchy to define the very top, and the very bottom levels of any user-defined class. At the top of the Eiffel hierarchy is the class GENERAL, which defines the generic routines clone, copy, and is_equal. Below that is the class PLATFORM, which defines the specific features needed to tailor the Eiffel language to run on a specific platform or computer system, such as the number of bits used to store numbers of type INTEGER, REAL, and DOUBLE. Below that is the class ANY, which is the ancestor of every user-defined class, so all user-defined classes sit below ANY in the hierarchy.
GENERAL
PLATFORM
ANY
User-defined classes
NONE
Every class written by a user inherits from the Eiffel class ANY, without the need for an explicit inheritance clause. The class ANY allows features to be defined that work for any class; more formally, that work for all classes of type ANY. System-wide features can be defined once at the appropriate level, and used by all the children of class ANY; in particular, the generic features clone, copy, and is_equal are defined once and used in any class. No class inherits NONE, by definition.
Class NONE defines the bottom node in the inheritance hierarchy, so by definition no class can inherit it. The special value Void is of type NONE.

With the introduction of inheritance, what is meant by the type of an object has to be defined in more detail. A worker is a person; more formally, an object of type WORKER is also an object of type PERSON, because WORKER inherits PERSON. The base class of a specific class is defined to be the top-level user class, at the top of the user-defined hierarchy. The inheritance hierarchy for a class may be many levels deep, so a class can be of many types, from its current type all the way up to type ANY. In the previous chapter, an inheritance hierarchy was defined for a CONTRACTOR, who is a WORKER, who is a PERSON, who is of type ANY. This reflects the real world: I am a worker, a person, a human being, an animal, a living being, and a thing (in some languages, the top level of the inheritance hierarchy is a class THING).

An object that is an instance of a class in an inheritance hierarchy is of many types. A class has an immediate type, as well as one or more inherited types. A generic class can generate many types, one type for each actual parameter.

Given the Eiffel class hierarchy, in particular the two classes ANY and NONE, it can now be seen that there is a single, consistent rule for export policies; a feature is exported to the classes listed in its export policy. A feature exported to objects of type ANY is thus available to all classes; a feature exported to objects of type NONE is available to no classes. A feature exported to one or more specific classes is available to those classes, and to descendants of those classes, because a descendant of class X is of type X, as well as more specific types.
11.2 Conformance

Inheritance in the Eiffel type system provides a way to define one type in terms of another. It also determines when one type can replace another, and ensures that the type system works in an intuitively reasonable manner. Consider an example where you go into a restaurant and ask for a salad. For this request, it is reasonable that any type of salad will be acceptable, such as a garden salad, a Waldorf salad, a chef's salad, and so on. These are specific types of salad, so they conform to the definition of a salad. On the other hand, receiving a hamburger would be a surprise, because a hamburger is not a type of salad. The notion of conformance makes this expectation explicit.

In an assignment statement, the type of the right hand side must be the same as the type of the variable on the left hand side. Because a variable may have many types due to inheritance, a more formal rule must be given: the type of the expression on the right of the assignment must conform to the type on the left. Class B conforms to class A if they are the same class, or class B is a descendant of A; A cannot also conform to B, unless they are the same type.

Consider the classes CONTRACTOR, WORKER, and PERSON, a hierarchy of three classes. A contractor is a type of worker, and a worker is a type of person. If I need a job done and advertise for the services of a worker, then there is no surprise if I use a contractor, because a contractor can take the place of a worker. On the other hand, I would be surprised if a person, who is not a worker, answered the ad; I want a more specific class. In the same way, an Eiffel variable can treat an object of a child type in the same way it treats an object of the parent type, because the child does at least as much as the parent, and possibly more. One type can be used in place of another if they conform.

    p: PERSON
    w: WORKER
    c: CONTRACTOR

    p := c    -- valid; a contractor is a person
    p := w    -- valid; a worker is a person
    w := p    -- invalid; a person is not a worker
    w := c    -- valid; a contractor is a worker
    c := p    -- invalid; a person is not a contractor
    c := w    -- invalid; a worker is not a contractor
When a feature is redefined, the signature of the new version must conform to the signature of its precursor. The signature of a feature lists the number, order and type of the values passed as arguments to the routine, and the type of any value returned from the routine. To ensure that redefinition works in a reasonable manner, the signatures of any old and new versions of a feature must conform; formally, each type in the signature of the new version must conform to the type in the old
version. A redefined feature often keeps the original signature; conformance allows descendants to replace their ancestors in the signature.

An expanded type conforms directly to its base type, and indirectly to other classes through the base type. Consider the two declarations and the assignment

    ref_type: T
    exp_type: expanded T

    ref_type := exp_type

This assignment is valid, and has the effect of copying the values of exp_type into the variable ref_type. Expanded types are discussed in more detail in Meyer (1992).
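Returning to the rule for redefined signatures, the sketch below shows a redefinition in which every type in the new signature conforms to the matching type in the old one. The classes PERSON_RECORD and WORKER_RECORD and their features are hypothetical, introduced only for illustration; the sketch assumes the classes PERSON and WORKER from the previous chapter.

```eiffel
class PERSON_RECORD
        -- hypothetical class that stores one person

feature
    holder: PERSON

    set_holder (p: PERSON) is
            -- store p as the holder of this record
        do
            holder := p
        end -- set_holder

end -- class PERSON_RECORD

class WORKER_RECORD
        -- hypothetical child; each redefined type conforms to the parent's type

inherit
    PERSON_RECORD
        redefine
            holder, set_holder
        end

feature
    holder: WORKER
            -- WORKER conforms to PERSON, so this redefinition is valid

    set_holder (w: WORKER) is
            -- the argument type WORKER conforms to the original type PERSON
        do
            holder := w
        end -- set_holder

end -- class WORKER_RECORD
```

Redeclaring holder as an INTEGER in the child would be rejected, because INTEGER does not conform to PERSON.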
11.3 Deferred features

The foundation of reusable software is to define a feature once, and use the feature as needed. One of the most powerful inheritance techniques is to define the feature interface for a general class of objects in a general class, and then leave each specific, inheriting class to define its own specific action or body of the feature. Every child plays the same role and has similar behaviour, but the exact, internal details of the behaviour are different.

The routine header defines the interface to the routine, and the routine body defines the action. A routine may be defined with a header, but no body; such a routine cannot be executed. The interface is defined in a parent class, and the body is deferred; a child class then inherits the routine, and defines the body. The parent defines a deferred routine, and the child makes this routine effective; such a process is called effecting the routine. A deferred routine is defined by replacing the keyword do with the keyword deferred, and leaving the body of the routine empty. A deferred routine to find an area, for example, is
    deferred class X

    feature
        area: REAL is
                -- area
            deferred
            end

    end -- class X

A class with a deferred routine is called a deferred class; this is stated in the class header. An instance of a deferred class cannot be created, because Eiffel cannot find a routine body to connect to the header. A deferred class therefore does not contain a creation clause. An instance of a child class can be created if the child effects every deferred routine, so there are no deferred routines in the child.

Common error: forget to state that class is deferred, or forget to effect a feature
    Error code: VCCH (1)
    Error: Class has deferred feature(s), but is not declared as deferred.
    What to do: make feature(s) effective, or include “deferred” before “class” in Class_header

Common error: Try to create an object of a deferred type. Either you need to make the deferred class effective by effecting every deferred feature, or you need to use an effective child of the deferred class.
    Error code: VGCC (2)
    Type error: creation instruction applies to target of a deferred type
    What to do: make sure that type of target is effective.

11.4 A deferred example: class POLYGON

Consider a system that implements a graphics library. Classes in the library define geometric shapes such as points, lines, circles, triangles, squares, and rectangles. A polygon is a general name for closed geometric objects made of straight lines, such as triangles and squares. An abstract class POLYGON may therefore be defined to capture the general properties and behaviour of polygons; this class is then inherited by specific types of polygon.

Operations on polygons include computing the area and perimeter of a shape, moving the shape around, or changing the size of the shape. These behaviours can be defined in the abstract class POLYGON as effective or as deferred routines. A polygon has a perimeter, and the length of this perimeter can be easily calculated for arbitrary polygons, so an effective routine to calculate the perimeter is defined in this class. A polygon has an area, but it is difficult to define a method for finding the area of an arbitrary polygon, so the body of this feature is deferred. Defining the function area as a deferred routine says that all polygons have an area, but the effective routine to calculate the area is left for more specific classes to define. The code for the deferred class POLYGON looks like
    deferred class POLYGON

    feature {NONE}
        vertices: LINKED_LIST[POINT]

    feature {ANY}
        make is
                -- get the points that define the polygon
            deferred
            end -- make

        perimeter: REAL is
                -- the length of the perimeter of the polygon
            local
                this, previous: POINT
            do
                from
                    vertices.start
                    this := vertices.item
                until
                    vertices.islast
                loop
                    previous := this
                    vertices.forth
                    this := vertices.item
                    Result := Result + this.distance (previous)
                end
                Result := Result + this.distance (vertices.first)
            end -- perimeter

        area: REAL is
                -- return the area of the figure
            deferred
            end -- area

        display is
                -- display the location of the vertices
            do
                from
                    vertices.start
                until
                    vertices.after
                loop
                    vertices.item.display
                    vertices.forth
                end
            end -- display

        move (delta_x, delta_y: REAL) is
                -- move by delta_x horizontally and delta_y vertically
            do
                from
                    vertices.start
                until
                    vertices.after
                loop
                    vertices.item.move (delta_x, delta_y)
                    vertices.forth
                end
            end -- move

    end -- class POLYGON

Specific types of polygon, such as triangles, rectangles and squares, define effective versions for each deferred routine. A deferred routine is not redefined in the child, because it was never defined; the child defines an effective routine. An effective parent routine may be used by the child as written, or it may be redefined to a more specific version.
11.5 An effective example: class RECTANGLE

A rectangle is a polygon with four sides, where the sides meet at right angles. Rectangles are created, moved, and displayed like any other polygon, and have an area and perimeter. On the other hand, a rectangle has special features of its own (matching sides, four vertices, right angles) which may result in better ways to do some of the operations. RECTANGLE can thus be defined as a child of POLYGON, the inherited features can be effected or changed, and new features can be added as needed. To create a rectangle, all the RECTANGLE routines have to be effective, because it is impossible to create an object of a deferred type. One possible way to implement the class RECTANGLE is
    class RECTANGLE

    inherit
        POLYGON
            redefine
                perimeter
            end

    creation
        make

    feature {NONE}
        number_of_vertices: INTEGER is 4
        side1, side2: REAL

    feature {ANY}
        make is
                -- make a rectangle, store the lengths of the sides
            local
                i: INTEGER
                p: POINT
            do
                !!vertices.make
                io.putstring ("%NEnter the four points of the rectangle")
                from
                until
                    i = number_of_vertices
                loop
                    !!p.make
                    vertices.extend (p)
                    i := i + 1
                end
                side1 := vertices.i_th (1).distance (vertices.i_th (2))
                side2 := vertices.i_th (2).distance (vertices.i_th (3))
            end -- make

        perimeter: REAL is
                -- length of the perimeter of a rectangle
            do
                Result := 2 * (side1 + side2)
            end -- perimeter

        area: REAL is
                -- area of the rectangle
            do
                Result := side1 * side2
            end -- area

    end -- class RECTANGLE

Because RECTANGLE is a descendant of POLYGON, all the polygon features are features of the new class. The features in the two classes are shown below, first those defined in the parent class, then the routines inherited, defined or redefined in the child. Exported features are marked with an asterisk.

    POLYGON          Change       RECTANGLE
    vertices                      vertices
    make *           effect       make *
    perimeter *      redefine     perimeter *
    area *           effect       area *
    display *                     display *
    move *                        move *
                                  number_of_vertices
                                  side1
                                  side2
Inheritance is transitive, so a class that inherits from RECTANGLE, such as SQUARE, has all the POLYGON features as well as all the additional RECTANGLE features.
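A minimal sketch of such a class is given below. It assumes class RECTANGLE as defined above, with area declared as a function of type REAL; the body of SQUARE is hypothetical and is not taken from the graphics library. The child inherits make, perimeter, display and move unchanged, redefines area, and must list its own creation routine because creation status is not inherited.

```eiffel
class SQUARE
        -- hypothetical child of RECTANGLE, for illustration only

inherit
    RECTANGLE
        redefine
            area
        end

creation
    make

feature {ANY}
    area: REAL is
            -- area of the square; all four sides have the length side1
        do
            Result := side1 * side1
        end -- area

end -- class SQUARE
```

The attribute side1 is exported to NONE in RECTANGLE, but it can still be used here, because an export policy restricts clients and not descendants.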
11.6 Dynamic types

With the use of inheritance and conformance, it is now possible for a variable to be declared as one type (such as WORKER) and to actually contain an object of another type (such as CONTRACTOR). A contractor is a type of worker, so a CONTRACTOR object can be stored in a WORKER variable because the child conforms to the parent. This flexibility means that we must be careful about stating the type of an object; more formally, we must be careful about the type of the object pointed to by a variable. In particular, we must distinguish between a variable's static and dynamic type. The static type is the type that was used in the variable declaration, and the dynamic type is the current type of the object stored in the variable. These may be the same type for a particular variable, or they may be different. Consider the example used in the last chapter, where three variables of different types were defined by the declarations and the creations
    p: PERSON        -- static type of p is PERSON
    w: WORKER        -- static type of w is WORKER
    c: CONTRACTOR    -- static type of c is CONTRACTOR

    !!p.make
    !!w.make
    !!c.make

With the existing inheritance hierarchy, the static and dynamic types of c must be the same, because CONTRACTOR has no children, so the only thing that can be stored in c is an object of type CONTRACTOR. On the other hand, I can store workers and contractors in the variable p, because both of these classes conform to the class PERSON. The following assignments are valid by conformance, and change the type of the object pointed to by the name:
    p := w    -- dynamic type of p is WORKER
    p := c    -- dynamic type of p is CONTRACTOR
    w := c    -- dynamic type of w is CONTRACTOR
Changing the type of a variable at run-time can be done by assignment, in which case we refer to the process as dynamic assignment. The same effect can occur during variable binding in a procedure call, when the formal argument is replaced by a conforming actual argument. The type can also be changed by dynamic creation, described below. The general term for the process is dynamic binding, in which the type of a variable is changed at run-time.
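Binding through an argument can be sketched as follows, assuming the classes POLYGON and RECTANGLE from sections 11.4 and 11.5, and assuming that io offers putreal alongside putstring. The class AREA_REPORT and its routines are hypothetical names used only for this illustration.

```eiffel
class AREA_REPORT
        -- hypothetical client class, not part of the graphics library

creation
    make

feature
    make is
            -- create a rectangle and report on it
        local
            r: RECTANGLE
        do
            !!r.make
            report (r)    -- the actual argument r conforms to the formal argument
        end -- make

    report (shape: POLYGON) is
            -- the static type of shape is POLYGON;
            -- its dynamic type is the type of the object passed in
        do
            shape.display
            io.putstring ("%NArea: ")
            io.putreal (shape.area)
        end -- report

end -- class AREA_REPORT
```

Within report, the dynamic type of shape during this call is RECTANGLE, so the area shown is the one computed by the RECTANGLE version of area.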
11.7 Dynamic creation

Eiffel has the ability to specify the type of an object when the object is created, as well as when the variable is declared. An explicit type is given in the creation command for the object, and must conform to the static or declared type of the variable. The syntax of the dynamic creation command is
!CLASS!object.make
where the creation type (the class between the bangs) is a child of the object's type. An example of this technique is provided by a graphics system in which the user can dynamically create objects using an interactive menu. The menu requests the type of object to create, the user types in a single character, and the system then creates an object of the appropriate type. A bad implementation of this scenario is
    class GRAPHIC

    creation
        make

    feature
        poly: POLYGON
        ...
        do_choice (choice: CHARACTER) is
                -- create an object of the appropriate type
                -- THIS IS THE WRONG WAY TO DO THE TASK
            local
                t: TRIANGLE
                r: RECTANGLE
                s: SQUARE
            do
                inspect choice
                when 'T' then
                    !!t.make
                    poly := t
                when 'R' then
                    !!r.make
                    poly := r
                when 'S' then
                    !!s.make
                    poly := s
                end -- inspect
            end -- do_choice

In this implementation, a set of local variables is declared, one of them is created, and the new object is then assigned to the attribute poly. There is no need for all these variables, because the type can be dynamically defined when the object is created. A good implementation of the menu feature do_choice is much shorter and simpler than the version shown above; it is
        poly: POLYGON
        ...
        do_choice (choice: CHARACTER) is
                -- create an object of the appropriate type
            do
                inspect choice
                when 'T' then
                    !TRIANGLE!poly.make
                when 'R' then
                    !RECTANGLE!poly.make
                when 'S' then
                    !SQUARE!poly.make
                end -- inspect
            end -- do_choice

An explicit type for the object is defined in the creation command, so no local variables are required; an object of the explicit type is simply created. The static type of the object is the declared type, here a POLYGON. The type of the object when it is created is the dynamic type; here, the dynamic type of poly is one of TRIANGLE, RECTANGLE, or SQUARE.

A dynamic type is not shown on a client chart, because there is no declaration of that type in the client. In the code for the good implementation of class GRAPHIC that uses dynamic creation, the single declaration is of type POLYGON, so this class is shown as the supplier in a client chart:
[Client chart: GRAPHIC is the client; POLYGON is its supplier.]
This example introduces a new form of the creation command, in which an explicit type is defined in the instruction. Two additional forms of creation have now been seen that are only used with inheritance; the complete list of creation forms is shown below.

Case   Creation instruction   Creation clause
1      !!p                    none
2      !!p.make               creation {<exports>} make
3      !CLASS!p.make          creation {<exports>} make
4      none                   creation
An object may have no creation routine (case 1), or it may have a routine, possibly with arguments (case 2). An object may be dynamically typed (case 3), which affects the creation command but not the creation clause. If the creation keyword is given but the creation clause is empty (case 4), then an object of that type cannot be created. Objects of a deferred class cannot be created, so a deferred class contains either no creation clause, or an empty creation clause. In the case where all features of a class are defined, but the class is not useful by itself, the creation clause is left empty. The class can still be used by inheritance, but not by a client as a stand-alone entity.
11.8 Dynamic dispatch

Eiffel finds a child feature by dynamic dispatch. The steps in this mechanism are:

1. At compile time, a pointer is set from the feature name to the feature definition, using the type (class) of the object; the type is given in the declaration.
2. At run-time, the pointer to the parent feature is used as a starting point.
3. The Eiffel run-time environment traces down the inheritance path for that feature from the parent class to find the version of that feature in the child class.
4. It executes that version of the feature.
A new, child version of a feature is created only by redefine; the parent feature is redefined. A new, child version of a feature is not created when a feature is inherited and renamed; instead, the single version (of the parent feature) is given a new name in the child. When a feature is renamed, dynamic dispatch starts with the version in the parent, and finds that this is the version of the feature in the child because the feature was not redefined. It executes the version of the feature in the child, which is the version inherited and renamed from the parent.

The mechanism is perhaps best explained with an example. Consider a class PARENT that has routines make and display. Consider a class CHILD that inherits PARENT, renames make and display, and uses the new names as part of the immediate make and display routines in the child. The key point to note is that the child does not contain a new version of the parent features. Instead, the child has two new, child features with the same names as the parent features. Consider a client of these two classes that contains the code shown below.
class CLIENT
creation
    make
feature
    p: PARENT
    c: CHILD

    make is
            -- illustrate dynamic dispatch
        do
            !!c.make
            p := c
            p.display
        end -- make
end -- class CLIENT

When the make routine in class CLIENT is executed, the following sequence of events occurs:

1. The first line of code is executed. The type of c is CHILD, so Eiffel creates an object of type CHILD, finds the make routine in class CHILD and executes that routine.
2. The second line of code is executed. The value of c (a reference) is assigned to the identifier p. c conforms to p, so the assignment is valid, and the value of p is now a reference to the child object.
3. The third line of code is executed. At compile time, the Eiffel compiler attached a pointer from the name display to the feature definition in the static type PARENT. The Eiffel run-time environment now uses dynamic dispatch to trace down the inheritance hierarchy to find the version of the parent feature that is in the child. The parent version has not been redefined in class CHILD, so the version of the parent routine in the child is the same version. Eiffel then executes the version of the inherited feature in the child, which is the parent feature.

In dynamic dispatch, the parent feature is traced down through the child classes. In the example, this means that the client code executes the make routine defined in class CHILD, and the display routine defined in class PARENT, even though there is only one object in the client, of type CHILD. Dynamic dispatch ignores rename, because rename does not create a new version of the inherited feature. The inherited status of a feature overrides any immediate definitions. Because of dynamic dispatch, what you see (an immediate child feature) may not be what you get.
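The classes PARENT and CHILD are described above only in words. A minimal sketch of how they might look is given here so the trace can be followed on real code; the names parent_make and parent_display and the printed strings are assumptions made for illustration, not code from the case study.

```eiffel
class PARENT
creation
    make
feature
    make is
            -- announce that the parent version was executed
        do
            io.putstring ("PARENT make%N")
        end -- make

    display is
            -- announce that the parent version was executed
        do
            io.putstring ("PARENT display%N")
        end -- display
end -- class PARENT

class CHILD
inherit
    PARENT
        rename
                -- hypothetical new names; rename creates no new version
            make as parent_make,
            display as parent_display
        end
creation
    make
feature
    make is
            -- immediate feature: a new feature that happens to be called make
        do
            parent_make
            io.putstring ("CHILD make%N")
        end -- make

    display is
            -- immediate feature: a new feature that happens to be called display
        do
            parent_display
            io.putstring ("CHILD display%N")
        end -- display
end -- class CHILD
```

With class CLIENT as the root class, the rules described in this section suggest that the creation call prints the PARENT make and CHILD make lines, and that p.display then prints only the PARENT display line, because the call is bound to the PARENT feature, which CHILD renamed but did not redefine.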
11.9 Polymorphism

Polymorphism means the ability to take several forms (poly = many, morph = shape), where each form can be treated identically. The mechanisms that allow polymorphism are overloading, in which an operator can have several meanings, and dynamic binding, in which a variable can have several types. These techniques allow the same symbol (operator or variable) to be used in different ways, and so support polymorphism. Polymorphism supports the design of reusable classes by defining a standard interface in the parent, and supplying the effective implementations in the child classes. Polymorphism is a very strong software technique that allows us to define a feature once, place it in the correct class, and reuse it.

Another strong technique that Eiffel uses is strict compile time type checking: Eiffel finds as many bugs as it can at compile time, when it can provide an accurate error message and allow the bug to be easily found and fixed. These two techniques interact, because the Eiffel compiler has to be able to find a feature definition for every feature name at compile time, and then use the dynamic feature at run-time.

To compile an Eiffel system that uses polymorphism, the parent class must have an exported feature with the appropriate name and signature. At compile time, the compiler has to find a definition for every identifier (class, attribute, or routine) so it can check that there is a feature definition with the correct signature for every feature call. If a dynamically created object uses a feature, then Eiffel must be able to find that feature when the code is compiled; it cannot wait until the code is executed and the actual feature is known. Eiffel does this by attaching a pointer from the identifier to the parent feature at compile time, and then using dynamic dispatch to find the child feature at run time.

Common error: feature in child but not in parent, so Eiffel can’t compile the feature call
Error code: VEEN
Error: Unknown identifier
What to do: Make sure that identifier, if needed, is the final name of feature of class, or local entity or formal argument of routine.
11.10 Polymorphism example: a list of polygons

Consider a list that contains different kinds of graphical objects, such as TRIANGLEs, RECTANGLEs, SQUAREs, and HEXAGONs. In this example, the list is statically defined to be a list of POLYGONs. Any object that conforms to POLYGON can be inserted into this list, because the objects are all polygons, so the list is indeed a list of polygons. An example of inserting "different" types of objects on a list is given by the code
    shapes: LINKED_LIST[POLYGON]

    poly_filler is
            -- insert objects of different types into the list
        local
            p: POLYGON
        do
            !!shapes.make
            !SQUARE!p.make
            shapes.extend (p)
            !HEXAGON!p.make
            shapes.extend (p)
        end -- poly_filler

Polymorphism allows extremely compact and flexible code to be written, because the same line of code can do different things! In the above list, for example, each object can be displayed by calling the display routine for that object:
    display_shapes is
            -- show each of the elements on the list
        do
            from shapes.start
            until shapes.after
            loop
                shapes.item.display
                shapes.forth
            end
        end -- display_shapes

If the names and signatures of the child features are the same, then the code can simply call a feature for that object. To execute the feature call, Eiffel examines the type of object to the left of the dot, looks up the relevant class, and executes the feature with that name in that class. As far as the client is concerned, it is simply displaying each object; how the object is actually displayed is an internal detail of each class.

The client chart shows the classes declared in the client, so it shows both the polymorphic list and any static classes, but no dynamic classes. The client chart for the code shown above, assuming that the code is contained in class GRAPHIC, is thus
[Client chart: GRAPHIC is a client of LINKED_LIST[T], with actual parameter POLYGON, and of POLYGON.]
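The routines poly_filler and display_shapes rely on every shape offering a display feature with the same name and signature. One way this could be arranged is sketched here, assuming that POLYGON is a deferred class whose display routine is deferred and that the children have a creation routine make with no arguments; the class bodies and printed strings are illustrative assumptions, not the classes used elsewhere in this text.

```eiffel
deferred class POLYGON
feature
    display is
            -- show this shape; the action is supplied by each child
        deferred
        end -- display
end -- class POLYGON

class SQUARE
inherit
    POLYGON
creation
    make
feature
    make is
            -- nothing to set up in this sketch
        do
        end -- make

    display is
            -- effective version for squares
        do
            io.putstring ("square%N")
        end -- display
end -- class SQUARE

class HEXAGON
inherit
    POLYGON
creation
    make
feature
    make is
            -- nothing to set up in this sketch
        do
        end -- make

    display is
            -- effective version for hexagons
        do
            io.putstring ("hexagon%N")
        end -- display
end -- class HEXAGON
```

The call shapes.item.display compiles because display is declared in POLYGON, the static type of the list elements. At run time the version in SQUARE or HEXAGON is chosen according to the dynamic type of each object on the list, so the client never tests the type itself.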
11.11 Assignment attempt

Polymorphism means that the type of an object is not usually tested at all. A common parent is defined, objects of the various child types are created, and from then on the various objects are treated identically by the client code. Any complexity is hidden inside the features of the children. There may be cases, however, where the type of an object does have to be used; in the case study, for example, a customer can request access to their savings account, or to their cheque account. The type of an object can be found using the conditional assignment operator.

The conditional assignment, or assignment attempt statement, uses the symbol "?=", in place of the assignment symbol ":=". If the type of the right hand side conforms to the type of the left hand side, then the assignment is successful and a useful value is assigned; if the types do not conform, then the value of Void is assigned. An object can be tested to see if it is of some type by trying to assign it to a name of that type. A conditional assignment has the form
    type: TYPE
    object: REFERENCE
    type ?= object

After this code has been executed, the name type will contain either the value Void, or have the same value as object. The static type of type is TYPE. If the conditional assignment succeeded, type is attached to the same object as object, whose dynamic type conforms to TYPE; if it failed, type is Void.

In the case study for this chapter, a customer may have up to three types of account: savings, cheque, and investment. When a customer uses the ATM, they are asked for the account type, and the reply is used to find the correct object; only savings and cheque accounts can be accessed via the ATM. A polymorphic solution is to store the three accounts in a list, and then scan through the list to find the desired object. Once the objects have been stored on the list, however, we have "lost track" of exactly where the object was stored. The standard method to find an object from a polymorphic list is to scan the list matching on a conditional assignment; this method uses the defined type hierarchy to find the type of an object.

The code shown below has the basic form of the list scan routine using a matched command and query, as presented in Chapter 7. Here, however, the match is defined by a successful assignment. The accounts are stored on the list accounts. The desired type is defined in the code below as a local variable; the type cannot be passed as an argument to the routine, because the value is changed in the routine by assignment. The type of the variable is used to filter the objects in the list. The matched scan routines are
    find_savings_account is
            -- find the savings account in the list
        local
            account: SAVINGS
        do
            from accounts.start
            until accounts.after or else account /= Void
            loop
                account ?= accounts.item
                if account = Void then
                        -- no match: move on, so the cursor stays on a matched item
                    accounts.forth
                end
            end
        end -- find_savings_account

    found_savings: BOOLEAN is
            -- was a savings account found on the list?
        do
            Result := not accounts.after
        end -- found_savings

The loop scans through the objects on the list, attempting to assign each. If an object on the list has the same type as the variable (formally, conforms to the variable), then the assignment succeeds and the loop terminates with the cursor pointing to the desired object. If no object of the desired type is found, then the cursor is left pointing after the list.
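The same scan-and-attempt pattern can be used to filter a list rather than stop at the first match. The sketch below counts the squares on a list of polygons; the class SHAPE_COUNTER and the routine count_squares are hypothetical names introduced for illustration, and the classes POLYGON and SQUARE are assumed to exist as in the polygon example of section 11.10.

```eiffel
class SHAPE_COUNTER
feature
    count_squares (shapes: LINKED_LIST[POLYGON]): INTEGER is
            -- number of objects on shapes whose dynamic type conforms to SQUARE
            -- note: the cursor of shapes is moved by this routine
        local
            s: SQUARE
        do
            from shapes.start
            until shapes.after
            loop
                    -- s is attached on a match, and set to Void otherwise
                s ?= shapes.item
                if s /= Void then
                    Result := Result + 1
                end
                shapes.forth
            end
        end -- count_squares
end -- class SHAPE_COUNTER
```

Because the assignment attempt sets s to Void whenever the current item is not a square, the local variable does not have to be reset between iterations, and Result starts at the default value of zero.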
11.12 Case study: the BANK system

"There are three types of bank account: savings, cheque, and investment. A customer may have one account of each type. Savings and cheque accounts are accessed through the ATM. Savings and investment accounts accrue daily interest. A successful withdrawal from a cheque account costs 50 cents. An unsuccessful withdrawal from a cheque account (a bounced cheque) costs $5. There are no charges or penalties for a savings account. A savings account gets daily interest; the interest rate is 4.5% a year. A cheque account gets no interest. The balance of an account cannot be negative. An investment account may not be accessed through the ATM. It is created with an initial balance, and accrues daily interest for a period of 3, 6, or 12 months. A 3-month investment account has an annual interest rate of 5.5%, a 6-month account has a 6.0% rate, and a 12-month account 6.5%. When the account matures at the end of the period, the total amount is transferred into the customer's cheque account. If there is no cheque account, then one is created specially to receive the investment funds."

Main points in this chapter

• The body of a routine may be deferred in a parent class. An effective definition is supplied by descendants of the class. A class with a deferred routine is called a deferred class; objects of a deferred class cannot be created.
• The declared type of an object is called the static type, and the actual or run-time type is called the dynamic type of the variable.
• A child class conforms to the parent class, so an object can be assigned to a variable of its own type, or to a variable of its parent type.
• The type of an object may be changed at run-time by dynamic assignment, binding, or creation; the general process is called dynamic binding.
• Dynamic binding allows a variable to contain objects of different types, and overloading allows an operator to have different actions, depending on its arguments.
• Children of a common parent with identical behaviour can be treated identically by a client. The client calls a feature, and the action of that feature is defined by each child. This technique is called polymorphism (many shapes).
• A polymorphic parent must provide definitions for all the called child features, so Eiffel can bind the parent feature at compile time and use dynamic dispatch to find the child feature at run time.
Exercises

1. Describe what is meant by each of the following terms. Give the format and the effect of the corresponding Eiffel clause:
   • inherit
   • rename
   • redefine
   • defer
2. What is meant by an effective feature? an immediate feature? an inherited feature?
3. How are two features joined? What must be true before the two features can be joined?
4. What is the difference between a static and a dynamic type? Compare and contrast static and dynamic types in dynamic creation, dynamic assignment, and dynamic binding.
5. What is polymorphism? Do polymorphic classes have to have the same class interface? How does polymorphism replace explicit selection in the code?
6. The game of battleships is a two-player game, played on two boards. Each board is a two-dimensional array, containing empty squares (sea) and occupied squares (ships). Each player in turn guesses a location on the opponent's board; if there is a ship at that location, then the ship is destroyed and the player gets another turn. Play continues until all of a player's ships have been destroyed. The most basic version of this game has a small board and a single ship that does not move. Write a system in which you play against the computer. To simplify the system, use a 3x3 board, a single ship that takes up one square, a random number generator for the computer's guesses, no memory of previous guesses, and no validation of the user input. Design a system to play the game of battleships. Hint: the only difference between the two players (person and computer) is the source of the guess (input versus generated).
7. Consider the specification in the previous chapter, which described four kinds of bank account: savings, cheque, scrooge, and investment. Examine your previous solution and see if your solution can be improved by the use of deferred features.
8. Consider the following specification:
"A bank offers six types of account: savings, check, scrooge, minimum, debit, and investment. The first five types can be accessed through an ATM, so they offer the services deposit, withdraw, and display. For each type of account, the rules for each service are slightly different.
Any amount may be deposited in an account. A withdrawal from a savings account decrements the balance by the amount withdrawn. A successful withdrawal from a check account costs 50 cents. An unsuccessful withdrawal from a check account costs $5. There are no charges or penalties for a savings account. The balance of an account cannot be negative. A savings account gets daily interest; the interest rate is 4.5% a year. A check account gets no interest. An investment account is created with an initial balance, and accrues daily interest for a period of 3, 6, or 12 months. A 3-month investment account has an annual interest rate of 5.5%, a 6-month account has a 6.0% rate, and a 12-month account 6.5%. When the account matures at the end of the period, the total amount is transferred into the customer's check account. If there is no check account, then one is created specially to receive the investment funds. A scrooge account allows money to be deposited, but not withdrawn. A scrooge account is set up for some period with an interest rate of 6.0%, and the balance increases over the period due to interest and deposits. Interest is paid daily. At the end of the period, the money is transferred into a check account. The balance of a minimum account is not allowed to fall below $20,000. It has an interest rate of 7.5%. A debit account is like an always available bank loan. The account is set up with an initial value of at least $5,000. If the balance goes below this, interest is charged on the difference at 9.5% per annum, calculated daily. Note that this is negative interest deducted from the balance; the customer has to pay for the money that was used. Interest deductions are stopped when the balance returns above the initial amount."

Write the inheritance hierarchy for this fragment of the system:

a) Give a name to each class.
b) Write the inheritance chart.
c) For each class, show the feature names.
d) Indicate which features are inherited, and which are immediate.
e) Write polymorphic code (caller and called) to create a list of accounts.
f) Write polymorphic code to add interest at the end of every day.
Chapter 12: Complex inheritance

Keywords: multiple inherit, join, undefine, repeated inherit, select

Multiple and repeated inheritance are presented in this chapter. A class may inherit from one or many parents. A common pattern is to have two features with the same name and signature, where one is deferred and defines the interface, and the other, effective feature defines the action. These features are automatically merged or joined during inheritance. If both features are effective, one version may be converted to a deferred feature by undefine, so the two features are automatically joined. With multiple inheritance, there may be a name clash in the inherited features, which is resolved by rename, undefine, or by a clause that tells the class to select a particular feature as the active feature. A class may inherit from the same parent two or more times, showing repeated inheritance.
12.1 Multiple inheritance

A class may inherit from more than one parent, and the child can use features from both parents. This is a very common practice in OO systems, because it allows each class to define a constellation of useful features that a child class can inherit and combine. Multiple inheritance is implemented by listing multiple class names in the child's inherit clause, such as
    class X
    inherit
        A; B; C

The inherited class names are separated by semi-colons; if there are sub-clauses within each class, then these are listed within each class, such as

    class X
    inherit
        A
            rename p as q, r as s
            redefine t
        end;
        B;
        C;

With multiple inheritance, all the features from all the parents are now features of the child. While the code to inherit multiple classes is simple, care must be taken if there are two or more inherited features with the same name and signature. If one feature is deferred and one effective, then the two features are automatically joined or merged in the child to define one effective feature. If both features are effective, then the name clash must be resolved.

To illustrate the power of multiple inheritance, this chapter shows how to store a data structure to a file and how to retrieve that data from file. Consider a system that uses complex data structures such as lists. The list is stored in a file when the system is not executing. When the system is executed, it retrieves the file and converts it to a list, then uses or changes the list as needed. The (possibly updated) list is then stored to file before the system terminates. The class STORABLE offers features to store an object to and retrieve an object from a file, but that class has nothing to do with lists. The desired class can be defined by inheriting from both class LINKED_LIST, and from class STORABLE, to define a storable list. The next two sections describe how to store an object in a file and retrieve it from a file using the classes FILE and STORABLE. A storable list is then defined by multiple inheritance in the third section of this chapter.
12.2 File classes

The main classes in the Eiffel file hierarchy are shown below. Class FILE supplies most of the effective features, but it is a deferred class so you need to use objects of type RAW_FILE.
[Inheritance chart showing the classes MEMORY, IO_MEDIUM, FILE, and RAW_FILE.]
Care must be taken to separate the ideas of an Eiffel file object, and the physical file itself that is stored on some external storage medium by the operating system. A file object (of type RAW_FILE) has a name and a pointer to the physical file. The name of the file is a STRING. When the file object is created, Eiffel looks in your current directory for a stored file of that name, and if the stored file exists it attaches a pointer from the file object to the stored file. If the file does not exist, then Eiffel does not create a new file; it simply notes that the file does not exist. The only way to create a file is to create a file object and then use the store command on this object to store the data in the object to file.

If a file is to be read from storage, then a file object is created and opened to read data from file. If a file is to be stored, then the file object is opened to write data to file. After the file has been read (written), it should be closed. A much abbreviated short listing of class FILE that provides these features is shown below; the full short listing describes almost 100 features.

deferred class interface FILE

feature -- Initialization

    make (fn: STRING)
            -- Create file object with fn as file name.
        require
            string_exists: fn /= void;
            string_not_empty: not fn.empty
        ensure
            file_named: name.is_equal (fn);
            file_closed: is_closed

feature -- Status report

    exists: BOOLEAN
            -- Does physical file exist? (Uses effective UID.)

feature -- Status setting

    open_read
            -- Open file in read-only mode.
        require
            is_closed: is_closed
        ensure
            exists: exists;
            open_read: is_open_read

    open_write
            -- Open file in write-only mode; create it if it does not exist.
        ensure
            exists: exists;
            open_write: is_open_write

    close
            -- Close file.
        ensure
            is_closed: is_closed

Older versions of Eiffel used the class UNIX_FILE instead of RAW_FILE, but UNIX_FILE is now an obsolete class. The Eiffel libraries (and the Eiffel language) have grown and changed over the years, so some early features are now obsolete. The Eiffel compiler converts these obsolete features to their modern form, and gives you a warning message so you can change the code at some later time. A warning message is not an error, just a warning. The Eiffel message warning about obsolete classes is

Warning code: Obsolete
Warning: Type relies on obsolete class.
What to do: update to new class at your earliest convenience. The class is still available, but may be removed in the future.
12.3 Class STORABLE

The Eiffel Library class STORABLE allows an object to be stored to a file and retrieved from a file. An abbreviated class interface is shown below.
class interface STORABLE

    basic_store (file: IO_MEDIUM)
            -- Produce on file an external representation of the
            -- entire object structure reachable from current object.
            -- Retrievable within current system only
        require
            file_not_void: file /= Void;
            file_exists: file.exists;
            file_is_open_write: file.is_open_write;
            file_is_binary: not file.is_plain_text

    general_store (file: IO_MEDIUM)
            -- Produce on file an external representation of the
            -- entire object structure reachable from current object.
            -- Retrievable from other systems for same platform
        require
            file_not_void: file /= Void;
            file_exists: file.exists;
            file_is_open_write: file.is_open_write;
            file_is_binary: not file.is_plain_text

    store_by_name (file_name: STRING)
            -- Produce on file called file_name an external
            -- representation of the entire object structure
            -- reachable from current object.
            -- Retrievable from other systems for same platform
        require
            file_name_not_void: file_name /= Void;
            file_name_meaningful: not file_name.empty

    retrieve_by_name (file_name: STRING): STORABLE
            -- Retrieve object structure, from external
            -- representation previously stored in a file
            -- called file_name
        require
            file_name_exists: file_name /= Void;
            file_name_meaningful: not file_name.empty

    retrieved (file: IO_MEDIUM): STORABLE
            -- Retrieved object structure, from external
            -- representation previously stored in file.
        require
            file_not_void: file /= Void;
            file_exists: file.exists;
            file_is_open_read: file.is_open_read;
            file_is_binary: not file.is_plain_text
        ensure
            result_exists: Result /= Void

end interface -- class STORABLE

The routine headers require an argument of type STRING or IO_MEDIUM; because IO_MEDIUM is a deferred class, an object of type RAW_FILE is actually used. One way to store and retrieve data is shown below. The first routine creates a file object, and if a physical file with the defined name exists, it opens the file for reading and retrieves the file contents. If no file exists, then a new object is created that will later be stored to file. Conditional assignment is used to retrieve the data, because we only want to attach the object to the identifier if the types are compatible. If the structure of the system has changed between the time the data was stored and the time it is retrieved, then the stored structure no longer matches the defined structure and the assignment attempt will fail. The second routine writes the object to file. The second routine could use the existing file object instead of creating a new one, but this would add to the number of attributes in the class and so has been avoided.
class X
inherit
    STORABLE
feature
    p: PERSON
    name: STRING is "person.dat"

    retrieve is
            -- retrieve the object from file if possible
            -- create a new object if there is no file
        local
            file: RAW_FILE
        do
            !!file.make (name)
            if file.exists then
                file.open_read
                p ?= retrieved (file)
                file.close
            end
            if p = Void then
                !!p
            end
        end -- retrieve

    store is
            -- store the object to file
        local
            file: RAW_FILE
        do
            !!file.make (name)
            file.open_write
            p.basic_store (file)
            file.close
        end -- store
end -- class X

Both the storing and the stored classes must be STORABLE, so both class X and class PERSON in this example inherit class STORABLE. An object is stored by a command of the form p.basic_store (file), so the class of the stored object (here PERSON) must supply the feature basic_store, and so must inherit STORABLE. An object is retrieved by an instruction of the form p ?= retrieved (file), so the storing class (here X) must supply the feature retrieved, and so must also inherit STORABLE. A client chart and an inheritance chart for the example are shown below.
[Client chart: X is a client of RAW_FILE and PERSON. Inheritance chart: X and PERSON both inherit from STORABLE.]
12.4 A storable list

A class that inherits from both LINKED_LIST and STORABLE can create an object that is a list when the system is running, and is stored away in a file between sessions. The definition of such a class is extremely simple, because no features are added or changed; the class exists only to combine the features of both its parents. The complete class definition is shown below. The formal parameter T in the class header is replaced by the actual parameter passed from the client when the system is compiled.
class STORE_LIST[T]
inherit
    STORABLE;
    LINKED_LIST[T]
creation
    make
end -- class STORE_LIST

All the features of both parents are inherited by this class, so a client can now declare an object to be of type STORE_LIST, retrieve the stored version of the list, use, change, delete, and add elements to this list during a session, then store the list at the end of a session.

A store feature stores a complete object on the external medium; it is a deep operation. The routine starts at the name of the object and traces any pointers through the object's data structure to all parts of the object. In the BANK system, for example, the bank contains all the data in the entire system, every customer and every account for every customer. The single bank object thus contains all the permanent data for this system, so only that object has to be stored. The inheritance charts for this part of the BANK system are shown below.
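A client of STORE_LIST can follow the same retrieve and store pattern as class X in section 12.3. The sketch below keeps a storable list of names; the class NAME_BOOK, its features, the file name, and the string that is added are all hypothetical, chosen only to show the pattern, and the sketch assumes the STORE_LIST class defined above.

```eiffel
class NAME_BOOK
inherit
    STORABLE
creation
    make
feature
    names: STORE_LIST[STRING]
    file_name: STRING is "names.dat"

    make is
            -- retrieve the list, change it, and store it again
        do
            retrieve
            names.extend ("new entry")
            store
        end -- make

    retrieve is
            -- attach names to the stored list if there is one,
            -- otherwise to a new, empty list
        local
            file: RAW_FILE
        do
            !!file.make (file_name)
            if file.exists then
                file.open_read
                names ?= retrieved (file)
                file.close
            end
            if names = Void then
                !!names.make
            end
        end -- retrieve

    store is
            -- write the whole list to file
        local
            file: RAW_FILE
        do
            !!file.make (file_name)
            file.open_write
            names.basic_store (file)
            file.close
        end -- store
end -- class NAME_BOOK
```

The list features (extend) come from LINKED_LIST and the file features (basic_store) come from STORABLE, yet the client sees a single type. NAME_BOOK itself inherits STORABLE only so that it can call retrieved.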
[Inheritance charts: STORE_LIST inherits from STORABLE and LINKED_LIST; BANK inherits from STORABLE.]
The client chart for this part of the BANK system is shown below, where the bank uses a storable list instead of just a list of customers. Client and inheritance charts capture different kinds of links, so as always they are shown separately.
[Client chart: BANK is a client of STORE_LIST[T], with actual parameter CUSTOMER, and of RAW_FILE.]
This example of multiple inheritance took two classes, and combined them dynamically to give the desired functionality. The same technique can be used to store and retrieve arrays, trees, graphs, or any type of data structure, because the data structure and the file access have been separated into two classes. The example nicely shows the "mix and match" philosophy of reuse in Eiffel; new code does not have to be written, because all the desired features already exist and can simply be used.
12.5 Joining features

Eiffel automatically joins or merges a deferred feature with an effective feature that has the same name and signature. One common use of this technique is to define an interface in one class as a set of deferred features, define the effective features in another class, and join the two sets of features in a child that inherits both classes. In this way, the interface to a class is separate from the class itself, but the two can be combined to form a single class that is used by a client; the separation is made through the inheritance structure, and is thus invisible to the client.

As an example, consider the account menu interface in the banking system case study. A class MENU can be defined that contains all the code to interact with the user, and then calls a deferred feature when the input is complete. A separate class ACCOUNT is defined with a set of effective features, and a child class inherits from both parents and joins the features. If the account details change, then the class ACCOUNT can be modified without changing the menu interface. If the menu changes from character-based to some other form of menu, then class MENU can be modified without affecting the account actions. The interactive account class INTERACCT inherits its interactive features from MENU, and the account features from ACCOUNT. The inheritance chart for these classes is shown below.
[Inheritance chart: INTERACCT inherits from both ACCOUNT and MENU]
The code for one of the menu choices and one of the account actions might look like the code shown below. The user interface and deferred features are given in the class MENU. Because this class contains a deferred feature, it must be defined as a deferred class:
deferred class MENU
creation
feature
    ...
    do_choice is
            -- execute the choice made by the user
        do
            inspect io.lastchar
            when 'D', 'd' then
                io.putstring ("Enter the amount to deposit: ")
                io.readreal
                deposit (io.lastreal)
            ...
        end -- do_choice

    deposit (amount: REAL) is
            -- add amount to the balance
        deferred
        end
end -- class MENU
An effective feature for deposit is defined in the class ACCOUNT. This class contains no deferred features, but an object of this type should never be created, only instances of the specific types of account. For this reason, class ACCOUNT is defined with an empty creation clause, that stops any client from creating an instance of this type. The effective code is
class ACCOUNT
creation
feature
    deposit (amount: REAL) is
            -- add amount to the balance
        do
            balance := balance + amount
        end
    ...
end -- class ACCOUNT
The child class INTERACCT inherits from both parents, getting the feature interface from class MENU, and the feature definition from class ACCOUNT. The complete code for the child class is
class INTERACCT
inherit
    MENU;
    ACCOUNT
creation
    make
end -- class INTERACCT
All the features in both classes are inherited, and the deferred and effective features are joined, because these features have the same name and signature. The child class has to nominate its own creation routine, as always. The full code for the menu and account classes is shown in Parts Six and Seven of the case study.
Common error: feature of child has same name as feature of parent, generating a name clash
Error code: VMFN
Error: Two or more features have the same name
What to do: If they must indeed be different features, choose different names or use renaming; if not, arrange for a join (between deferred features), an effecting (of deferred by effective), or a redefinition
12.6 Undefine
If a class inherits two effective features with the same name, then they cannot be joined and create a name clash. The name clash can be resolved by converting one of the features from effective to deferred, so the (now) deferred feature is joined with the (single) effective feature. The format of the undefine clause is
class A
inherit
    B
        undefine x, y, z
    end
The keyword undefine is followed by the names of all features in the inherited class to be undefined. Multiple undefined features have their names separated by commas. A feature's signature can be changed to a conforming signature if the feature is first undefined and then redefined in the same inheritance clause. The code to do this is
class A
inherit
    B
        undefine x
        redefine x
    end
feature
    x (m: N) is
        ...
This dual change allows the signature of an inherited feature to be changed, because the feature is first made deferred and then redefined by the feature definition in the child. The new definition of the feature must conform to the previous definition, so each type in the new signature must conform to the corresponding type in the inherited feature.
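The first use of undefine, resolving a clash between two effective features, can be sketched as follows. The classes PRINTER, SCREEN, and TERMINAL and the feature show are hypothetical names chosen for illustration; they are not part of the case study. The sketch assumes both parents define an effective show with the same signature.

```eiffel
class PRINTER
feature
    show is
            -- hypothetical: write a message for the printer
        do
            io.putstring ("printer%N")
        end -- show
end -- class PRINTER

class SCREEN
feature
    show is
            -- hypothetical: write a message for the screen
        do
            io.putstring ("screen%N")
        end -- show
end -- class SCREEN

class TERMINAL
inherit
    PRINTER
        undefine show    -- the PRINTER version becomes deferred in TERMINAL
    end
    SCREEN               -- the effective SCREEN version is joined with it
end -- class TERMINAL
```

Under these assumptions, TERMINAL has a single feature show, and a call to show on a TERMINAL object runs the version written in SCREEN. Without the undefine clause, the two effective versions would produce the name clash described above.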
12.7 Repeated inheritance
Repeated inheritance occurs when a class inherits from a parent more than once. This occurs every time a class inherits from two user-defined classes, because both parents inherit from class ANY, and the child thus has repeated inheritance from class ANY. Repeated inheritance may occur at a single level, but more often occurs via different inheritance paths.
There are two cases to consider in which a feature is repeatedly inherited. If the feature has not been changed in any path from its parent, then the different versions are merged into a single feature. If the feature has been changed in one of the paths, then each of the different versions is retained in the child, and care must be taken to avoid a name clash.
All the features are inherited from both parents, so repeated inheritance could produce a whole series of name clashes because each of the parents has features with the same name. Many of the features in the new child often refer to exactly the same feature in both parents, however. Eiffel simply combines features with the same names and signatures when the features are inherited by repeated inheritance. On the other hand, some features will have to be different, and these features are renamed to prevent them being merged. The shared features in the child are joined into a single feature, while those that have been changed are stored as separate features.
A common example of repeated inheritance is a teaching assistant (TA), who is both a student and a member of staff. Both students and staff members are people, so both STUDENT and STAFF classes will inherit from class PERSON. Class TA inherits from both STUDENT and STAFF so it inherits class PERSON twice, once through each inheritance path. This situation is shown in the inheritance diagram below.
[Inheritance chart: STUDENT and STAFF both inherit from PERSON; TA inherits from STUDENT (rename id as student_id) and from STAFF (rename id as staff_id)]
A TA has a single copy of the personal details, such as name, address, gender, and so on. A student has a student id, and a member of staff has a staff id, so a teaching assistant has two ids, one from each parent. These need to be renamed to avoid a name clash, and the make and display routines may also need to be renamed. An outline of the code for class TA is given below, showing the multiple inheritance that, in this case, leads to repeated inheritance of the class PERSON.
class TA
inherit
    STUDENT
        rename id as student_id
    end
    STAFF
        rename id as staff_id
    end
feature
    ...
end -- class TA
The attributes in class TA are the joined and changed attributes from both parents, so a TA has a name, address, and two ids. All the features that were unchanged on the path from the common parent are joined, while those features that were changed have their own names.
12.8 Select
If different versions of a feature are inherited from the same parent, then the feature to be used in the child can be stated in the select clause of the inheritance clause. The other, unselected versions are discarded, solving the name conflict. A feature is selected by writing the keyword select as part of the inheritance clause for that class, followed by the name of the selected feature. The format of the select clause is
class A
inherit
    B
        select x, y, z
    end
If there are multiple features selected from the same class, then the feature names are separated by commas. Select is used in repeated inheritance, where undefine is used to decide between competing versions of a feature in multiple inheritance.
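A minimal sketch of select in the teaching assistant example is given below. It rests on an assumption that is not stated in the text: that a routine display is declared in PERSON and redefined in both STUDENT and STAFF, so class TA receives two different versions of the same PERSON feature. The class bodies are stubs written for illustration only.

```eiffel
class PERSON
feature
    display is
            -- show the personal details (stub)
        do
            io.putstring ("person%N")
        end -- display
end -- class PERSON

class STUDENT
inherit
    PERSON
        redefine display
    end
feature
    id: INTEGER
    display is
            -- show the student details (stub)
        do
            io.putstring ("student%N")
        end -- display
end -- class STUDENT

class STAFF
inherit
    PERSON
        redefine display
    end
feature
    id: INTEGER
    display is
            -- show the staff details (stub)
        do
            io.putstring ("staff%N")
        end -- display
end -- class STAFF

class TA
inherit
    STUDENT
        rename id as student_id, display as student_display
    end
    STAFF
        rename id as staff_id
        select display    -- the active version of PERSON's display
    end
end -- class TA
```

In this sketch the two ids are separate features that only need renaming, because each is declared in its own class. The two versions of display both come from PERSON, so one is renamed to keep the names apart and the other is selected; a call to display through an entity of type PERSON that is attached to a TA then runs the STAFF version, while student_display remains available in class TA.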
12.9 Dynamic dispatch
The general solution to the problem of reusing parent features as features of the child may now be given. Consider two classes PARENT and CHILD, where each class has its own make and show routines, and the parent features are used as stand-alone features in the child, not only as part of a new version of the parent feature. The solution is to both rename and redefine, along two different inheritance paths. Because the same feature is now inherited twice, the child has to select one of them to be the active feature. The general solution to reusing a parent feature as a feature of the child is thus to inherit the parent twice, rename the feature in one path, redefine the feature in the other path, and select the redefined feature. The code for such a solution is shown below.
class CHILD
inherit
    PARENT
        rename make as parent_make, show as parent_show
    end
    PARENT
        redefine make, show
        select make, show
    end
creation
    make
feature
    make is
        do
            parent_make
            ...
    show is
        do
            parent_show
            ...
end -- class CHILD
This code is cumbersome to write, so its most common application was simplified in 1997 by the new Precursor keyword. If a parent routine is to be used as a stand-alone feature of the child, then repeated inheritance is the only way to ensure that dynamic dispatch finds and executes the child version of the inherited feature.
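For the common case where the parent version is only called from inside its own redefinition, a sketch using Precursor might look like the code below. The routine bodies are placeholders invented for illustration, and the sketch assumes a compiler that accepts the Precursor keyword alongside the older syntax used in this text.

```eiffel
class PARENT
creation
    make
feature
    make is
            -- placeholder body for the parent version
        do
            io.putstring ("parent make%N")
        end -- make
end -- class PARENT

class CHILD
inherit
    PARENT
        redefine make
    end
creation
    make
feature
    make is
            -- run the parent version, then the child-specific code
        do
            Precursor
            io.putstring ("child make%N")
        end -- make
end -- class CHILD
```

Here the parent is inherited only once, with no rename or select. Precursor is valid only inside the redefinition, so it does not give the child a separately named parent_make that can be called from other routines; that case still needs the repeated inheritance shown above.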
12.10 The inheritance clause
A class may inherit from a single parent, from multiple parents, or from the same parent repeatedly. Name clashes due to inheritance can be avoided by rename, undefine, or select, followed by the automatic joining of a deferred and an effective version. The signature or body of a feature can be changed by a redefine. The export status of a feature is changed by the export clause. The format of the full inheritance clause is shown below. The sub-clauses are executed in order, so the same feature may be renamed, and then redefined under that new name.
class A
inherit
    B
        rename m as n           -- new name in child
        export {C, D} o, p      -- new export policy in child
        undefine q              -- no definition in child
        redefine r, s           -- new body in child
        select t                -- select active feature
    end
A feature can be divided into four parts: a name, a type, a value, and an export policy. The name is simply the name of the routine, the type is defined by the signature, the value is defined by the routine body, and the export policy by the feature exports. A rename clause changes the name, but does not affect the signature, the body, or dynamic dispatch. An export clause changes the export status. A feature is changed from effective to deferred by the undefine clause. A redefine clause allows a new version of the parent feature to be defined in the child. The signature of the new version may be changed, so long as the new signature conforms to the old. If there are competing effective features inherited repeatedly from the same parent, then one of these versions is selected to be the active version in the child.
The action of each inheritance clause is shown below. Each part of a feature is listed across the top of the table, the clauses are listed down the side of the table, and the effect of the clause on the feature is indicated in the body of the table.
            name        signature   body        export
rename      new name
export                                          new policy
undefine                            deferred
redefine                conforms    new body
select      active      active      active
The select clause does not change the definition of a feature, it simply selects the competing feature, from the repeated parent, that is active in the child class.
Main points in this chapter
• A class may inherit from multiple parents. If one parent feature is deferred and the other is effective, they are automatically joined to provide an effective feature in the child.
• A common pattern in multiple inheritance is to have one class define the interface, and another class the action or implementation, so each role can be changed independently.
• A feature may be undefined when inherited, if two effective routines have a name clash.
• A class may inherit repeatedly from a parent. Common features are merged, and altered features are kept separate. When two effective versions of a repeated feature create a name clash, one of them may be selected as the active version when it is inherited.
• Dynamic dispatch follows the new versions of a feature down from the static type in the feature call. To guarantee that the new version of a feature is called, the feature must be inherited twice, using redefine, select, and rename.
Exercises
1. How are name clashes resolved in multiple inheritance? In repeated inheritance?
2. What happens when two features are joined? What must be true to join two features? Can two effective features be joined?
3. Why is the menu separated from the action in system design? Why is a menu inherited rather than used as a client?
4. What does multiple inheritance provide, that single inheritance does not?
Chapter 13: Generic classes
Keywords: generic, parameter, constrained genericity
A generic class is one that can produce many types of object, depending on the parameter passed to it. The code in the class is generic; it makes no assumption about the type or structure of its parameter, and so works for any type of parameter. The simplest generic classes are arrays and lists. A generic class may be used as a supplier, or as a parent class. The type of parameter passed to a generic class can be constrained in the generic class header, so a constrained generic class only accepts a parameter that conforms to the constraint.
13.1 Generic class
The classes ARRAY and LINKED_LIST are generic classes, because they can use objects of any type. A generic class is passed an actual parameter in the variable declaration, and the actual parameter is bound to the formal parameter when the system is compiled. The single class ARRAY, for example, can generate many types of objects, such as ARRAY [INTEGER], ARRAY [BANK], and so on. The class header for the class ARRAY is written so that it can receive a formal parameter:
class ARRAY [T]
    -- define the features for an array of objects of type T
In the class definition, formal parameters are written after the name of the class, enclosed in square brackets; multiple parameters are separated by commas. The names "T" (for Type) and "G" (for Generic) are common names for formal parameters. Parameter binding occurs at compile time, where argument binding (in routines) occurs at run-time. Any number of parameters may be defined in the header, separated by commas; one such class header is
class COMPLICATED [A, B, C, D]
    -- some complicated class that receives four actual parameters
The code within the class is generic: it makes no assumption about the type or structure of the parameter, and simply stores and retrieves whole objects without using any feature of these objects. This is done via the formal parameter T, which is used in the features of the class and at run-time is of whatever type was passed as a parameter. The feature item in class ARRAY, for example, has the feature header
item (i: INTEGER): T
    -- entry at index i, if in index interval
so it returns an object of the type bound to T when the code was compiled.
13.2 Generic client
The most common way to use a generic class is as a supplier class, where the client declares an array of customers (say):
class BANK
creation
    make
feature
    customers: ARRAY [CUSTOMER]
Code in class BANK creates the array, stores customers in it, and retrieves customers from it.
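A minimal sketch of such code is shown below. The index bounds 1 and 100, the local names, and the stub class CUSTOMER with an argument-free creation routine make are assumptions made for illustration; the case study's CUSTOMER class is richer than this.

```eiffel
class CUSTOMER
creation
    make
feature
    make is
            -- stub creation routine (assumption)
        do
        end -- make
end -- class CUSTOMER

class BANK
creation
    make
feature
    customers: ARRAY [CUSTOMER]

    make is
            -- create the array, then store and retrieve one customer
        local
            new_customer, first: CUSTOMER
        do
            !!customers.make (1, 100)        -- bounds are an assumption
            !!new_customer.make
            customers.put (new_customer, 1)  -- store at index 1
            first := customers.item (1)      -- item returns a CUSTOMER here
        end -- make
end -- class BANK
```

Because T was bound to CUSTOMER in the declaration, put only accepts an object that conforms to CUSTOMER, and the result of item can be assigned to an entity of type CUSTOMER without any further check.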
13.3 Generic parent
A generic class may be inherited, just like any other class. It may be inherited with the formal parameters bound, or unbound. It is often simpler to inherit a generic class, than to use a generic class as a client. A bank, for example, may be defined to contain a list of customers, so the code in class BANK could be
class BANK
creation
    make
feature
    customers: LINKED_LIST [CUSTOMER]
    ...
On the other hand, a bank may be seen as a list of customers, so long as there is only a single list; the bank is a bunch of customers. The code in class BANK would then look like
class BANK
inherit
    LINKED_LIST [CUSTOMER]
creation
    make
and a client would create the bank using the inherited list creation routine make, and could then use all the list features on the bank
class CLIENT
creation
    make
feature
    bank: BANK
    make is
        do
            !!bank.make
            from bank.start
            until bank.after
            ...
As always, the child class (BANK) can add new features that are specific to the child. The inheritance chart for class BANK in this example is shown below. Note that the right arrow here indicates parameter passing, not the usual client declaration, so it can be added to an inheritance chart; this is not a client relation. A separate client chart is needed to show the class BANK as a supplier or as a client.
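[Inheritance chart: BANK inherits from LINKED_LIST [T], with a right arrow from the formal parameter T to the actual parameter CUSTOMER]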
Class BANK is not a generic class, because it does not receive a parameter and it will only work on a list of customers. A user can define a new generic class simply by writing the appropriate generic class header and code. A generic class is defined by giving a formal parameter in the class header. To define a storable list, for example, we would use the code
class STORE_LIST [T]
inherit
    STORABLE
    LINKED_LIST [T]
creation
    make
end -- class STORE_LIST
The class STORE_LIST is generic, because it can handle any type of object passed as an actual parameter from a client. An actual parameter is passed from a client, and bound to all occurrences of the formal parameter, so the client code to create a storable list of customers would be something like
class CLIENT
creation
    make
feature
    customers: STORE_LIST [CUSTOMER]
User-defined generic classes are shown on client and inheritance charts like any generic class. The formal parameter is shown after the class name, with a right arrow to the actual parameter.
13.4 Constrained genericity
A generic class is powerful because it makes no assumptions about the objects that it uses; formally, it makes no assumption except that the object is of type ANY. An array and a list manipulate complete objects and never "look inside" an object. There are many cases, however, where a generic class needs to use features of the object. An example of this is a list, where each element has a unique key. The key can be used to search the list and find the desired object, but such a comparison has to be done by the client of the list, and has to be done by every client of a list with a key. The list class cannot assume that all objects it uses have a unique key, because such an assumption is not true for generic lists.
A special class can be defined for a list of objects with unique keys, by inheriting the list class and adding facilities to check the key of each object in the list. Such a class, however, can only handle certain types of objects: those with a key field. The type of object that can be passed as a parameter to this class must be constrained to have a key. This can be done by constraining the parameter list so only certain types of objects can be passed as parameters.
A generic class can specify that only certain types should be used by the class, by placing a constraint on the type of the formal parameter. This technique is called constrained genericity, because the parameter type is constrained by the class header. Because the type of parameter is constrained, specialised classes can be defined that make use of the features of the constrained type.
A constraint is defined on the formal parameter in the class header by stating that the formal parameter must be of some specific type; the only valid actual parameters are then classes of this type. The mechanism is more general than it might at first appear, because any actual parameter that conforms to the constraint is valid. A generic class header is constrained by listing the formal parameter, the constraint indicator "->", and the name of the constraining class. The form of a constrained class header is
class G [T -> C]
which means "In the generic class G, the class bound to T must be of type C".
Consider the class CUSTOMER in the BANK system. It would be nice to define a keyed list class that could search a list of customers, and look inside each element to see if it matched a given key value. Class CUSTOMER has to contain a key, and an exported routine match that receives a test key and returns true if the test key matches the key of the current object. The code in class CUSTOMER would look like
class CUSTOMER
feature
    key: INTEGER
    match (test: INTEGER): BOOLEAN is
            -- does the current object match this key?
        do
            Result := test = key
        end -- match
The constrained generic class, that is a list of keyed objects, may then be defined as
class KEY_LIST [T -> CUSTOMER]
inherit
    LINKED_LIST [T]
creation
    make
feature
    find (target: INTEGER) is
            -- position the cursor at the element with key value target
            -- or after if no matching element is in the list
        do
            from start
            until after or else item.match (target)
            loop forth
            end
        end -- find
The name of the list is not contained in the code, because the code refers to the current object; the object is a list, because it inherits class LINKED_LIST. The code looks inside an object because it uses a feature called match that takes an integer and returns a boolean value, so the class must use constrained genericity.
To use the power of matching arbitrary keys, the constraint in the class header is loosened so that any keyed class can be passed as a parameter. This constraint can be enforced by using the inheritance hierarchy to define a deferred class MATCHABLE that contains a deferred feature match, and making this the constraint. A CUSTOMER can then inherit MATCHABLE and effect the deferred routine. A client of class KEY_LIST would then create and use the object of type KEY_LIST [T]. A bank might use the keyed list of customers as a client with the code
class BANK
creation
    make
feature
    customers: KEY_LIST [CUSTOMER]
The inheritance (top) and client (bottom) structure for this system is shown below.
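[Inheritance chart (top): CUSTOMER inherits from PERSON and MATCHABLE. Client chart (bottom): BANK uses KEY_LIST [T], with CUSTOMER as the actual parameter]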
The code for such a system could look like
deferred class MATCHABLE
feature
    match (target: INTEGER): BOOLEAN is
            -- does the key match the target?
        deferred
        end -- match
end -- class MATCHABLE

class CUSTOMER
inherit
    MATCHABLE;
    PERSON
feature
    match (target: INTEGER): BOOLEAN is
            -- does the key match the target?
        do
            Result := key = target
        end -- match
    ...
end -- class CUSTOMER

class KEY_LIST [T -> MATCHABLE]
inherit
    LINKED_LIST [T]
creation
    make
feature
    find (target: INTEGER) is
            -- position the cursor at the element with key value target
            -- or after if no matching element is in the list
        do
            from start
            until after or else item.match (target)
            loop forth
            end
        end -- find

    found: BOOLEAN is
            -- was the target found?
        do
            Result := not after
        end -- found
end -- class KEY_LIST
The general mechanism for constrained genericity can thus be defined in four steps:
1. Define a deferred parent (MATCHABLE here)
2. Inherit the parent class and effect the deferred feature (match in class CUSTOMER here)
3. List the parent as the generic constraint (KEY_LIST here)
4. Pass the base class (CUSTOMER here) as the actual parameter
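The fourth step, seen from the client, can be sketched as below. The sketch relies on the classes MATCHABLE, CUSTOMER, and KEY_LIST from the code above; the routine show_customer and its messages are hypothetical and are not taken from the case study.

```eiffel
class BANK
creation
    make
feature
    customers: KEY_LIST [CUSTOMER]

    make is
            -- create the empty keyed list
        do
            !!customers.make
        end -- make

    show_customer (id: INTEGER) is
            -- hypothetical routine: report whether a customer has this key
        do
            customers.find (id)
            if customers.found then
                io.putstring ("Customer found%N")
            else
                io.putstring ("No customer with that key%N")
            end
        end -- show_customer
end -- class BANK
```

The search loop is written once in KEY_LIST, so each client only calls find and tests found. After a successful find, the cursor is on the matching element, and customers.item returns it with the type CUSTOMER.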
Warning: The rules of conformance for an expanded type (Section 11.2) say that an expanded type conforms to objects of the expanded type itself, and to objects of its base (reference) type. This places a restriction on the type of the argument that can be sent to a match routine in the example above. The most general header for a match routine is
match (target: ANY): BOOLEAN is ...
because any reference type can be used as an actual argument. An actual argument of an expanded type, such as INTEGER, does not conform to the formal argument and thus cannot be sent as an actual argument to this match function.
13.5 Reuse in Eiffel
A traditional programming language, such as Pascal or C, supports code reuse by routines. A routine is defined once and then called as needed, with any needed data being passed as arguments to the routine. Eiffel supports code reuse in four ways: routine, client, inherit, and generic.
An Eiffel routine is defined to do a single thing, and complex actions are achieved by calling a series of routines. A set of preconditions can be defined on each routine, to define the values that can be passed as arguments; if the precondition is true when the routine is called, then the routine guarantees the effect or postcondition of the routine.
The class provides the basic unit of reuse in OO programming, because it encapsulates a set of related data and routines. In Eiffel, both data and routines are treated equally as features of the class, so we can simply talk about the behaviour of the class. The use of a feature outside the class is explicitly controlled by the export policy; it is usual to export selected routines, and to hide the data inside the class. The simplest form of reuse is provided by the client relation, in which a client of the supplier class creates an object of that type, and then uses the services of that object. The creation of an object is controlled by the export policy on the creation routine.
A class can be reused by the inheritance relation. Inheritance occurs when one class can be viewed as a more specific version of another class; client means "has", "uses", or "contains" where inherit means "is". All features in a parent class are available to the inheriting, child class, and more features are usually added by each child. A feature may be defined in the parent class, or deferred until the child. When a feature is inherited, it can be used unchanged, effected, its name, signature, or body may be altered, and its export status may be altered in the child. A class may use single, multiple, or repeated inheritance.
A class can be reused by passing it to a generic class, that contains generic code to handle any object passed as a parameter to the class. If there is a need to use a feature of the object, then the inheritance hierarchy can be used to constrain the type of parameter passed to the constrained generic class. This allows type-specific code to be added to the new constrained class, so code can be written once inside the new class and used by a client of the generic class.
The price of reuse is a distributed system. In a traditional system, it is possible and common to write a long series of statements that are executed as a single chunk. This can produce very short and efficient code, but it does not support reuse. If any part of this code changes, then the whole routine has to be amended and recompiled. If a similar task needs to be done, then the existing code can offer a template for the new, similar code, but new code has to be written. If one part of the code needs to be done separately, then the existing code has to be re-written to separate out this part, or the code has to be written again in a separate routine.
The basic power of an OO system comes from three mechanisms. First, an object has its own data and shares the routines in its class, so we need not pass data around the system; call this data encapsulation. Second, a class contains the code that uses or changes its data, and the class can be reused as an object, or as a parent; call this class encapsulation. Third, each feature in a class has a completely-defined interface, and the implementation is hidden inside the feature; call this code encapsulation.
The promise of an OO system is that code can be defined once, placed in the correct class, and then used forever. It is much harder and more time-consuming to write a good, reusable OO system than it is to write a procedural system, but the effort has to be made only once, not every time the system is changed.
13.6 Case study: constrained genericity
A keyed list is used for account and for customer lookup.
Main points in this chapter
• A generic class handles objects of any type, passed as a parameter through the parameter list in the class header; such a process is called genericity. A generic class makes no assumptions about the behaviour of its objects.
• A generic class may be used as a client, or as a parent.
• A programmer can define a new generic class simply by writing a formal parameter in the class header, and then using it in the class code.
• Constrained genericity allows a generic class to use features of a parameter, by limiting or constraining the type of the parameter. A constraining parent class is defined, inherited by a child class, and listed as the constraint in the generic class. Only actual parameters of this type can then be passed.
• Procedural languages support code reuse by routine call. Object-oriented languages support code reuse by routine call, object creation, inheritance, and genericity.
Exercises
1. Genericity gives one way of reusing code. How does it do this?
2. When are parameters bound? How many parameters can be passed to a generic class?
3. Why do we need constrained genericity? What does the word "constrained" mean? What is constrained? How is it constrained?
4. What are the main steps in constrained genericity?
5. Build a constrained generic class:
   a) Write code to display an ARRAY of POINTs.
   b) Write code to display an ARRAY of PERSONs.
   c) Write a generic class SHARRAY (show array) that displays all items in the array.
   d) Write the code that uses class SHARRAY to replace (a) and (b)
6. List the ways that code is reused in Eiffel.
Chapter 14: Assertions and inheritance
Keywords: inheritance, class invariants, require else, ensure then.
Inheritance of assertions guarantees that the behavior of a class is compatible with that of its ancestors. For example, a class should not assume a state that would be invalid for any of its predecessors; therefore it must satisfy all their invariants. Similarly, a class must be able to perform (at least) all the functions of its ancestors; therefore, the redefinition of a routine may only weaken the precondition and strengthen the post-condition. The pre- and post-conditions for a routine redefinition are designated by the keywords require else and ensure then respectively. The precondition for a redefined routine is equivalent to the new precondition or else the precondition from the original routine, while the post-condition is equivalent to the new post-condition and then the original.
14.1 Look and feel
Pre- and post-conditions and class invariants allow us to describe many properties of software components. We can use these assertions to write software contracts which may be monitored at run-time to ensure compliance. However, we need to realize that these tools are embedded in a language which supports inheritance; therefore, classes can redefine routines inherited from their parents. To complete our understanding of assertions and their use, we must now examine how the tools presented so far interact with inheritance.
The inheritance mechanisms provided by Eiffel are powerful, and if used correctly can lead to smaller, more elegant systems consisting of easily reusable components. However, these mechanisms can also lead to chaos if misused. If redefinition and dynamic binding are allowed to arbitrarily change the behavior of operations, clients will be unable to depend on stable, predictable outcomes. Assertions can provide an answer to this problem. Briefly, in Eiffel descendants inherit the class invariants of their parents, and any redefinition of a routine must satisfy the original pre- and post-conditions; therefore, if client code relies only on properties specified using assertions, all descendants are guaranteed to perform all functions required of their ancestors.
We can extend our contracting metaphor to view the inheritance of assertions as programming by sub-contract. In the real world, a general contractor may hire any number of others to perform parts of the original job; however, these sub-contractors must perform the work up to the standards required in the overall contract, and must do it cheaply enough so that the general contractor can complete the entire job for the agreed upon price. In Eiffel, descendants can only do more, cheaper than their ancestors.
Specifically, any redefinition of a routine may only change the post-condition by making it stronger (more difficult to satisfy); therefore, all descendants are guaranteed to perform (at least) all the functions of their ancestors. Similarly, any redefinition may only change the precondition by making it weaker (easier to satisfy); therefore, all descendants are guaranteed to perform in (at least) all cases their ancestors would accept.
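As a rough sketch of a weaker precondition and a stronger post-condition, consider the two classes below. The classes SIMPLE_ACCOUNT and OVERDRAFT_ACCOUNT, the attribute limit, and all the assertion labels are hypothetical; they are not the account classes of the case study.

```eiffel
class SIMPLE_ACCOUNT
feature
    balance: REAL

    withdraw (amount: REAL) is
            -- take amount from the balance
        require
            positive: amount > 0
            funds: amount <= balance
        do
            balance := balance - amount
        ensure
            reduced: balance = old balance - amount
        end -- withdraw
end -- class SIMPLE_ACCOUNT

class OVERDRAFT_ACCOUNT
inherit
    SIMPLE_ACCOUNT
        redefine withdraw
    end
feature
    limit: REAL
            -- hypothetical overdraft limit

    withdraw (amount: REAL) is
            -- take amount from the balance, allowing an overdraft
        require else
            within_limit: amount > 0 and amount <= balance + limit
        do
            balance := balance - amount
        ensure then
            above_floor: balance >= -limit
        end -- withdraw
invariant
    sensible_limit: limit >= 0
end -- class OVERDRAFT_ACCOUNT
```

The effective precondition in OVERDRAFT_ACCOUNT is the new clause or else the original one, so every call that was valid for a SIMPLE_ACCOUNT is still valid. The effective post-condition is the original clause and then the new one, so a client that only knows SIMPLE_ACCOUNT still gets the result it was promised.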
14.2 Class invariants In Eiffel, a class inherits the invariants of all its ancestors, so each instance of the descendent class must satisfy the invariant of each ancestor class. For example, if the class HOUSE_CAT is a descendent of both CAT and PET, and CAT is a descendant of ANIMAL, then any instance of HOUSE_CAT must satisfy the class invariants of both CAT, PET and ANIMAL. In other words, the invariant for HOUSE_CAT consists of its own invariant "and"ed with the invariants of all its ancestors. If class invariants are being monitored at run-time, then the invariants of all the ancestors of a class will be evaluated before the invariant of the class itself. For example, if class invariants are being checked, then the invariants for CAT, PET and ANIMAL will all be evaluated before the invariant for HOUSE_CAT. As a more concrete example, let us consider the problem of maintaining an ordered list. An ordered list is simply a list in which every element is greater than or equal to all those preceding it; in other words, the first element is the list is the smallest, the second is the next to smallest, and so on. We can use the LIST class described previously to maintain an ordered list by defining a descendant class with an appropriate
© R. S. Rist, 1993
invariant. The first thing we must do is define the BOOLEAN function ordered, which returns true if and only if the elements of the list have the correct relationship. The function is defined on any list, so it has no precondition (in other words, it has a precondition of true). The post-condition requires that if the function returns false then the last two items examined were in the incorrect order. The routine consists of a single loop. Result is initialized to true because an empty list is always ordered. Each pair of elements in the list is compared starting at the front of the list and continuing until the end is reached or the result is known to be false. When a pair of elements is compared, the Result is set to false if they are out of order.
    ordered: BOOLEAN is
        local
            k: INTEGER
        do
            from
                Result := true
                k := count
            until
                k < 2 or not Result
            loop
                k := k - 1
                if elements.item (k - 1) < elements.item (k) then
                    Result := false
                end
            end
        ensure
            not Result implies elements.item (k - 1) < elements.item (k)
        end -- ordered

We can now define the class ORDERED_LIST as a descendant of LIST that exports the same features. The major difference between the two is that ORDERED_LIST has a class invariant requiring that the items be ordered both before and after each external call to any routine. Another difference is that the elements in an ORDERED_LIST must belong to a descendant of the kernel library class COMPARABLE; this guarantees that the necessary comparison operators are defined.

An ORDERED_LIST must be able to perform all the functions of a LIST from the client's point of view. While most of the operations will be identical, the insert routine must be modified to maintain the ordering relation when a new element is added.
    class ORDERED_LIST [T -> COMPARABLE]
    inherit
        LIST [T]
            redefine
                insert
            end
    feature {ANY}
        insert (elem: T) is
                -- insert elem, maintaining sorted order
            •••
        ordered: BOOLEAN is
            •••
    invariant
        ordered: ordered
    end -- class ORDERED_LIST

14.3 Pre- and post-conditions

In Eiffel, the inheritance of pre- and post-conditions supports programming by sub-contract; any redefinition of a routine must satisfy a precondition that is at least as easy to satisfy as the original, and a post-condition that is at least as demanding as the original. In other words, a sub-contractor may weaken the precondition and strengthen the post-condition, but not vice versa.
Specifically, if pre- and post-conditions are present in the redefined routine they are designated with the keywords require else and ensure then respectively.

    routine_name (<arguments>) is
        require else
            new_precondition
        •••
        ensure then
            new_post-condition
        end -- routine_name

The new pre- and post-conditions are evaluated before the originals. In other words, the above assertions are equivalent to the following, where the original pre- and post-conditions are from the definition of the routine in the parent class. The default new precondition is false, and the default new post-condition is true.
    routine_name (<arguments>) is
        require
            new_precondition or else original-precondition
        •••
        ensure
            new_post-condition and then original-post-condition
        end -- routine_name

If pre- and post-conditions are being monitored at run time, then the new precondition is evaluated at the beginning of each call to the routine. If it is true then execution continues with the body of the routine. If the new precondition is false then the original precondition is evaluated. If the original precondition is also false then an exception is raised and the system terminates with an appropriate message. If the original precondition is true then execution continues with the routine body.

After execution of the routine body completes, the new post-condition is evaluated. If it is false then an exception is raised and the system terminates with an appropriate message. If the new post-condition is true, then the original post-condition is evaluated. If it is true, then everything is as it should be and the call completes normally. If it is false then an exception is raised and the system terminates with an appropriate message.

As an illustration, consider the insert procedure for the ORDERED_LIST class. The original precondition for insert required that the list was not full, and the post-condition ensured that the list was not empty, that count was increased by one, and that elements and count were the only features changed by the operation.
    insert (elem: T) is
            -- put an element into the list
        require
            not_full: not full
        do
            •••
        ensure
            not_empty: not empty;
            count_increased: count = old count + 1;
            no_extra_changes: strip (elements, count).is_equal (old strip (elements, count))
        end -- insert

In ORDERED_LIST, the class invariant requires that the list be ordered at the beginning of insert; therefore the precondition need not be changed, and so none appears in the redefined routine. While the original post-condition for insert is also acceptable, we will strengthen it by requiring that elem be a member of the list when the procedure terminates.
    insert (elem: T) is
            -- put an element into the list
        do
            •••
        ensure then
            elem_member: member (elem)
        end -- insert

The pre- and post-conditions for the redefinition of insert are equivalent to the following. Since there is no new precondition, the original is used unchanged, while the new post-condition is "and"ed on to the beginning of the original. Notice that we save considerable duplication by inheriting the pre- and post-conditions for the insert routine in the LIST class, rather than re-writing them in the pre- and post-conditions for the new routine explicitly.
    insert (elem: T) is
            -- put an element into the list
        require
            not_full: not full
        do
            •••
        ensure
            elem_member: member (elem);
            not_empty: not empty;
            count_increased: count = old count + 1;
            no_extra_changes: strip (elements, count).is_equal (old strip (elements, count))
        end -- insert

If pre- and post-conditions are being monitored at run-time, the precondition for insert will be evaluated at the beginning of each attempt to add a new element to an ORDERED_LIST. The new precondition is false, so evaluation proceeds directly to the original precondition (from LIST). If it is false, then an exception is raised and the system terminates with a message stating that the precondition for insert has been violated. If the original precondition (not_full) is true, then everything is as it should be and
execution continues with the procedure body. When the body completes the post-condition is evaluated. The new post-condition (elem_member) is evaluated first. If it is false then an exception is raised and the system terminates with a message stating that the post-condition for insert has been violated. If the new postcondition is true, then the original post-condition is evaluated. If it is false then an exception is raised and the system terminates with a message stating that the post-condition for insert has been violated. If the original post-condition is also true, then everything is as it should be and the call terminates normally. While the external behavior of the insert procedure for ORDERED_LIST is quite simple, its implementation is reasonably complex; therefore, we will examine it in the context of the entire class definition.
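The evaluation order described in this section can be traced with a small executable sketch. Eiffel checks these assertions itself; Python is used here only as a stand-in that can be run and tested, and the names run_redefined and AssertionViolation are hypothetical helpers invented for this illustration, not part of any Eiffel library.

```python
class AssertionViolation(Exception):
    """Stands in for the exception Eiffel raises on a failed assertion."""


def run_redefined(body, original_pre, original_post,
                  new_pre=lambda: False, new_post=lambda: True):
    """Call body() under the combined contract of a redefined routine.

    Precondition:   new_pre or else original_pre
    Post-condition: new_post and then original_post
    Every assertion is a zero-argument callable, so the original is
    evaluated only in the cases where the text says it would be.
    The defaults mirror the text: new precondition false, new
    post-condition true.
    """
    # The original precondition is consulted only if the new one is false.
    if not new_pre() and not original_pre():
        raise AssertionViolation("precondition violated")
    result = body()
    # The new post-condition is checked first, then the original.
    if not new_post():
        raise AssertionViolation("new post-condition violated")
    if not original_post():
        raise AssertionViolation("original post-condition violated")
    return result


# Usage: a toy list standing in for ORDERED_LIST.insert (assumed data).
items = []
capacity = 2

run_redefined(
    lambda: items.append(7),
    original_pre=lambda: len(items) < capacity,   # not_full
    original_post=lambda: len(items) > 0,         # not_empty
    new_post=lambda: 7 in items,                  # elem_member
)
```

Because no new precondition is supplied, the default (false) forces the original precondition to be evaluated, just as in the redefined insert.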
14.4 An example class: ORDERED_LIST

ORDERED_LIST is a descendant of LIST with a class invariant requiring that the elements in the list be ordered both before and after each external call. The class defines a BOOLEAN function ordered that returns true if and only if the elements of the list have the correct relationship. The items in an ORDERED_LIST must belong to a descendant of the kernel library class COMPARABLE so that the necessary comparison operators are defined.

ORDERED_LIST exports the same features as its parent LIST. Most of the operations are inherited with no modifications, but the insert routine is redefined to maintain the ordering relation when a new element is added. The insert routine for ORDERED_LIST inherits both pre- and post-conditions from its parent. The redefinition leaves the precondition unchanged, but strengthens the post-condition by requiring that the new element be a member of the list when the routine completes.

The body of insert consists of two loops and two assignments. The first loop finds the proper location for the new element. The chosen element will follow the new item in the resultant list. The second loop moves all the elements from the selected one to the beginning up one location; this frees up the space for the new addition (remember that the items in a list are stored in reverse order, so that the first element has the largest index). Finally, the new item is inserted and the number of elements in the list is incremented.
    insert (elem: T) is
        local
            k: INTEGER
        do
            < first_loop >
                -- found correct location
                -- all previous elements smaller
                -- all succeeding elements greater
            < second_loop >
                -- all preceding elements moved up one
            elements.put (elem, location + 1);
            count := count + 1
        ensure then
            elem_member: member (elem)
        end -- insert

The variable location (roughly, the location of the new element in the list) is a feature of the ORDERED_LIST class, rather than local to the insert routine. While this is not necessary for our present purposes, it does no harm and will prove necessary for the example given in the next chapter.

The first loop finds the proper location for the new element to be added to the list. The loop initialization sets location to the index of the first item in the list. Each element is then considered in turn until the end of the list is reached or an element greater than or equal to the item to be inserted is found.
    from
        location := count - 1
    until
        location = - 1 or else elem <= elements.item (location)
    loop
        location := location - 1
    end

The second loop moves all the elements from the selected one to the beginning up one location in elements; this frees up the space for the new addition. The loop initialization sets the loop counter (k) to one more than the highest valid index. Each iteration of the loop then moves an item up one location and decrements k.
    from
        k := count
    until
        k = location + 1
    loop
        elements.put (elements.item (k - 1), k)
        k := k - 1
    end

We can now examine the class in its entirety.
    class ORDERED_LIST [T -> COMPARABLE]
    inherit
        LIST [T]
            rename
                make as list_make
            redefine
                insert
            end
    creation
        make
    feature {ANY}
        location: INTEGER;

        make (size: INTEGER) is
                -- create a list of size size
            do
                list_make (size)
            end -- make

        insert (elem: T) is
                -- insert elem into its correct place in the sorted list
                -- (the precondition not_full and the post-condition clauses
                -- not_empty, count_increased and no_extra_changes are
                -- inherited from LIST)
            local
                k: INTEGER
            do
                from
                    location := count - 1
                until
                    location = - 1 or else elem <= elements.item (location)
                loop
                    location := location - 1
                end
                from
                    k := count
                until
                    k = location + 1
                loop
                    elements.put (elements.item (k - 1), k)
                    k := k - 1
                end
                elements.put (elem, location + 1)
                count := count + 1
            ensure then
                elem_member: member (elem)
            end -- insert

        ordered: BOOLEAN is
            local
                k: INTEGER
            do
                from
                    Result := true
                    k := count
                until
                    k < 2 or not Result
                loop
                    k := k - 1;
                    if elements.item (k - 1) < elements.item (k) then
                        Result := false
                    end
                end
            ensure
                not Result implies elements.item (k - 1) < elements.item (k)
            end -- ordered
    invariant
        ordered: ordered
    end -- class ORDERED_LIST
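The two loops of insert and the ordered check are easy to misread because the list is stored in reverse, with the first (smallest) element at the highest index. The following Python sketch mirrors them step by step so that they can be traced and tested; it is an illustration only, assuming a fixed-size Python list in place of the elements array, and it passes count and location explicitly instead of keeping them as class features.

```python
def ordered(elements, count):
    """Mirror of the Eiffel function ordered.

    Returns True when elements[0:count] never increases with the index,
    i.e. the first item (highest index) is the smallest.
    """
    result = True
    k = count
    while not (k < 2 or not result):
        k -= 1
        if elements[k - 1] < elements[k]:
            result = False
    return result


def insert(elements, count, elem):
    """Mirror of ORDERED_LIST.insert; returns (new_count, location)."""
    assert count < len(elements), "not_full"
    # First loop: walk from the first item (index count - 1) towards
    # index 0 until an item greater than or equal to elem is found.
    location = count - 1
    while not (location == -1 or elem <= elements[location]):
        location -= 1
    # Second loop: move every item above location up one slot.
    k = count
    while k != location + 1:
        elements[k] = elements[k - 1]
        k -= 1
    elements[location + 1] = elem
    return count + 1, location


# Usage: build a small list and check the ordering afterwards.
store = [None] * 5
size = 0
for value in (3, 1, 2):
    size, _ = insert(store, size, value)
print(store[:size], ordered(store, size))
```

Reading the occupied slots from the highest index down to index 0 gives the list in ascending order.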
Main points covered in this section

• Inheritance of assertions guarantees that the behavior of a class does not differ in unpredictable ways from that of its ancestors.
• The invariants of all the ancestors of a class apply to the class itself.
• Programming by sub-contract implies that a routine redefinition may only weaken the precondition or strengthen the post-condition.
• The pre- and post-conditions for a routine redefinition are designated by the phrases require else and ensure then respectively.
• The precondition for a redefined routine is equivalent to the new precondition or else the original precondition. The default new precondition is false.
• The post-condition for a redefined routine is equivalent to the new post-condition and then the original post-condition. The default new post-condition is true.
Exercises

1. What is programming by contract? How is a contract enforced?
2. What is programming by sub-contract? How is a sub-contract enforced?
3. Why should a child class honour the contract of its parent(s)? What happens if a sub-contract is not honoured?
4. “A bird flies. A penguin is a bird”. How can you implement this fragment of a system?

The following questions use a strictly ordered list.
Consider the problem of maintaining a strictly ordered list of integers. In such a list, each element is exactly one more than the item preceding it. For example, "1, 2, 3" is a strictly ordered list, but "1, 3, 4" and "1, 2, 4" are not. The empty list is strictly ordered, as is any list with only one element.
strictly_ordered is a BOOLEAN function which returns true if and only if a list is strictly ordered. The precondition is simply true, while the post-condition ensures that if the function returns false then the list is not strictly ordered. The loop counter, k, is initialized to reference the first element in the list, while Result is initialized to true. The loop examines every pair of elements in turn until the end of the list is reached. If a pair is not in the correct relationship, then Result is set to false; otherwise, nothing is done. The invariant for the loop requires that k be within the valid range, and that if Result is false then the last pair of elements examined were in the incorrect relationship.
    strictly_ordered: BOOLEAN is
        local
            k: INTEGER
        do
            from
                k := count - 1;
                Result := true
            invariant
                k_big_enough: - 1 <= k;
                k_small_enough: k < count;
                correct_result: not Result implies elements.item (k - 1) + 1 /= elements.item (k - 2)
            variant
                k + 1
            until
                k <= 0 or not Result
            loop
                •••
            end
        ensure
            correct_result: not Result implies elements.item (k - 1) + 1 /= elements.item (k - 2)
        end -- strictly_ordered

5. Write a body for the above loop which completes the definition of strictly_ordered.
We can define the class STRICT_LIST as a descendant of ORDERED_LIST; the main difference is that STRICT_LIST has a class invariant requiring that the list be strictly ordered. For the sake of clarity, the make and insert routines from ORDERED_LIST are renamed ordered_make and ordered_insert respectively.

STRICT_LIST exports the routines empty, full, first and member, which are inherited from ORDERED_LIST with no modifications. empty and full return true when the list has no elements or no free space respectively, first returns the item at the beginning of the list, and member returns true if an item is already in the list.

STRICT_LIST defines the exported features last and strict_insert. last returns the element at the end of the list, while strict_insert places a new element into the list if possible. The BOOLEAN function strictly_ordered returns true if and only if the list is strictly ordered; it is used in the class invariant.
    class STRICT_LIST
    inherit
        ORDERED_LIST [INTEGER]
            rename
                make as ordered_make,
                insert as ordered_insert
            end
    creation
        make
    feature {ANY}
        make (size: INTEGER) is
                -- create an ordered list of size size
            do
                ordered_make (size)
            end -- make

        last: INTEGER is
            require
                not_empty: not empty
            do
                Result := elements.item (0)
            end -- last

        strict_insert (elem: INTEGER) is
            •••
            end -- strict_insert

        strictly_ordered: BOOLEAN is
            •••
            end -- strictly_ordered
    invariant
        •••
    end -- class STRICT_LIST

6. Write a clause which completes the class invariant for STRICT_LIST.
We can only insert particular items into a strictly ordered list. Specifically, the element to be inserted must either be one less than the first item on the list, or one more than the last item. Therefore, we can not simply redefine the insert procedure from ORDERED_LIST in STRICT_LIST.

7. Why can't we simply redefine the insert procedure in STRICT_LIST? (Hint: what are the allowable modifications to pre- and post-conditions when redefining a function?)

We will define a procedure strict_insert to insert an item into a STRICT_LIST. The routine takes the element to be inserted as an argument, and the precondition requires that space for a new item be available. The routine will insert the new element if it is possible to do so and maintain strict ordering; otherwise, it will leave the list unchanged.
    strict_insert (elem: INTEGER) is
        require
            not_full: not full
        do
            •••
        ensure
            •••
        end -- strict_insert

8. Write a post-condition for the procedure. (Hint: good returns true if it is possible to insert an element and maintain strict ordering, while last returns the value of the last item in the list.)
    good (elem: INTEGER): BOOLEAN is
        do
            Result := empty or else elem = first - 1 or else elem = last + 1
        end -- good

    last: INTEGER is
        do
            Result := elements.item (0)
        end -- last
Chapter 15: Exceptions

Keywords: exception, organized panic, resume, rescue, retry

An exception is generated when an assertion is violated at run time. There are two acceptable responses. In an organized panic approach, a stable state is generated before the executing routine terminates signaling failure to its caller. In a resumption approach, the conditions that precipitated the violation are corrected and the routine is restarted from the beginning. Exceptions will propagate from routine to caller until the problem can be corrected, or the top-level is reached and the system terminates.

Control passes to the rescue clause when an exception is generated during execution of a routine. The clause then produces a stable state before signaling failure to the caller, or re-invoking the routine by executing a retry instruction. Rescue clauses are designated by the keyword rescue, and retry instructions consist of the keyword retry.
15.1 Look and feel

We have seen how assertions can be used to describe the properties of programs in a way that supports the idea of programming by contract. We have also seen how the run-time monitoring of assertions can aid in the enforcement of these contracts and thereby enhance testing and debugging. We have learned that in Eiffel, an exception is raised when an assertion is violated at run-time, and that the default response to an exception is termination of the entire system with a message describing the location of the problem. While this response is satisfactory in many instances, it may have occurred to the reader that in some situations a different approach would be more appropriate.

If a routine executes a component which fails (and thereby generates an exception), execution can not proceed as if nothing had happened; however, system termination is not always necessary. Sometimes, other plans can be made. The general solution is to provide language constructs for processing exceptions. Ideally, this allows the problem which caused the exception to be corrected and execution to continue normally.

When an exception has been raised, the contract between routine and caller has been violated. It is not acceptable to simply terminate silently with an error; the caller must be informed that the contracted work could not be completed. There are two acceptable responses to an exception. In organized panic, a stable state is generated before the routine terminates signaling failure to its caller. In resumption, the conditions that caused the problem are corrected and the routine is restarted from the beginning. Properly, routines must either fulfill their contracts or fail; there is no middle ground. In general, exceptions will propagate from routine to caller until a level is found where the problem can be corrected, or the top-level is reached and the entire system terminates.
The action taken when an exception is triggered depends on whether the currently executing routine can discover and correct the underlying problem. If possible, the routine uses a resumption strategy; it corrects the problem and completes its execution successfully. If this is not possible, the routine uses organized panic; it generates a stable state and then signals failure to its caller. Failure of the called routine triggers an exception in the caller and the entire process repeats. In this manner, control is passed in a systematic way from routines to their callers until a way is found to correct the problem, or until the original program invocation is reached, at which point the entire system terminates. Specifically, in Eiffel the rescue clause and retry instruction provide the necessary facilities for exception handling. Each routine may have a rescue clause which is executed when an exception is triggered during execution of the body. The rescue clause performs the actions necessary to produce a stable state before the routine terminates. If the conditions that caused the exception can be discovered and corrected, a retry instruction may be executed in the rescue clause, which will restart the routine at the beginning. Eiffel supports only the disciplined use of exceptions; routines must either satisfy their pre- and postconditions or notify their callers of the discrepancy. Eiffel supports both organized panic and resumption strategies. Organized panic can be implemented using only a rescue clause, while resumption requires a retry instruction in addition to the rescue. Specifically, when an exception is triggered, the rescue clause associated with the current routine is first executed. If the rescue clause implements an organized panic approach then it simply restores a stable state before the routine terminates signaling failure to the caller. This raises an exception in the caller and
the entire process repeats. The default is for this process to continue until failure of the top-level call causes the system to terminate with a message describing the location of the problem. On the other hand, if the clause implements a resumption strategy, then it corrects the underlying problem and executes a retry instruction to restart the routine. If this strategy is successful then the caller will never know that an exception occurred.
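The difference between the two strategies can be sketched in a few lines of executable code. Eiffel expresses them with rescue and retry; the Python below only imitates the control flow with try/except and a loop, and every name in it (RoutineFailure, organized_panic, resumption, step, repair) is a hypothetical helper invented for this illustration.

```python
class RoutineFailure(Exception):
    """Tells the caller that a routine could not fulfil its contract."""


def organized_panic(state, step):
    """Run step(state); on failure restore a stable state, then fail."""
    saved = dict(state)              # snapshot taken before the body runs
    try:
        step(state)
    except Exception as exc:
        # Plays the role of a rescue clause without retry: undo the
        # partial execution, then signal failure to the caller.
        state.clear()
        state.update(saved)
        raise RoutineFailure("routine failed") from exc


def resumption(state, step, repair, max_attempts=2):
    """Run step(state); on failure repair the cause and start again.

    Returns the number of attempts that were needed.
    """
    attempts = 0                     # kept across restarts, like an Eiffel local
    while True:
        attempts += 1
        try:
            step(state)
            return attempts
        except Exception as exc:
            if attempts >= max_attempts:
                raise RoutineFailure("routine failed") from exc
            repair(state)            # correct the underlying problem ...
            # ... and let the loop restart the body, as retry would.
```

With resumption the caller sees only a normal return when the repair works; with organized panic the caller always sees a failure, but never a half-modified state.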
15.2 Rescue clauses

Ideally, certain processing should be performed whenever an exception occurs during a routine's execution. The object containing the routine should be restored to a stable state whether an organized panic or resumption strategy is being used. A stable state should be established before the routine terminates and signals an error to its caller because leaving the object in an inconsistent state complicates further processing. A stable state should be established before the routine is restarted because correct execution of the routine depends on this assumption.

In Eiffel, the rescue clause allows the results of a partial execution to be cleaned up and a stable state to be restored before the routine terminates or is restarted. For example, the clause may contain instructions to restore data structures that were modified during the partial execution. This allows the caller to view an invocation of the routine as an "all or nothing" proposition; either the routine terminates normally fulfilling the contract specified in the postcondition, or an exception is raised with no visible effects of the call.

In Eiffel, the rescue clause is designated by the keyword rescue and follows the body and postcondition for the routine.
    routine_name (<arguments>) is
        require
            precondition
        local
            local_variables
        do
            routine_body
        ensure
            post_condition
        rescue
            rescue_clause
        end

Consider a simple class that simulates a folding ruler. Such a ruler is hinged in the middle so that it can be used in either collapsed or expanded form. For example, if the ruler is six inches long folded, it would be twelve inches long unfolded.

The class FOLDING_RULER exports the features length, sections and section_length, which are the length, the number of sections in use, and the length of a ruler section respectively. It also provides the operations fold and unfold, which respectively reduce or increase the number of sections in use. The class has an invariant that requires the length of the ruler to be equal to the number of sections in use times the length of a section.
    class FOLDING_RULER
    feature {ANY}
        length: INTEGER;
        sections: INTEGER;
        section_length: INTEGER is 6;
        •••
    invariant
        consistent: length = sections * section_length
    end -- class FOLDING_RULER

To illustrate the use of rescue clauses, let us consider the unfold routine from this class. The precondition for this procedure requires that the ruler is currently folded, while the postcondition ensures that both length and sections are increased by the proper amount. The rescue clause for the routine simply sets the length of the ruler to the length of a single section and the number of sections to one.
    unfold is
        require
            folded: sections = 1
        do
            •••
        ensure
            longer: length = old length + section_length;
            more_sections: sections = old sections + 1
        rescue
            length := section_length;
            sections := 1
        end -- unfold

In Eiffel, the rescue clause is executed whenever an exception is triggered during execution of the routine body, including violation of the postcondition. When the rescue clause executes to completion the routine terminates and an exception is triggered for the caller. Since the call may have originated from outside the object the rescue clause should always restore the class invariant.

To illustrate, let us again consider the unfold procedure for FOLDING_RULER. Suppose the postcondition for unfold is violated and an exception is raised. The rescue clause is executed before the routine terminates and an exception is triggered in its caller. The rescue clause sets length and sections to section_length and one respectively. This restores the class invariant and ensures that the caller of unfold will see a stable state for FOLDING_RULER. Therefore, the caller can view unfold as an all or nothing operation; either unfold completes with the postcondition true, or it fails with no visible effects.

As a more complex illustration, consider the problem of maintaining an ordered list with no duplicates. We will define a class NODUP_LIST as a descendant of ORDERED_LIST and have it export roughly the same operations. The main difference is that NODUP_LIST has a class invariant requiring that no two elements in the list are identical. Unfortunately, we can not simply redefine the insert operation in the class. If we insert an element already present into a NODUP_LIST then we will invalidate the class invariant. We could avoid this situation by strengthening the precondition for insert (on NODUP_LIST) to require that the new element not already be present, but this violates our rule that redefinition can only weaken the precondition for a routine.
Similarly, we could weaken the postcondition so that a NODUP_LIST would simply be unchanged by an attempt to insert a duplicate element, but our rules require that redefinition only strengthen postconditions. Therefore, we will define a new procedure, nodup_insert, which inserts an item into a NODUP_LIST. For the sake of clarity, the insert routine from ORDERED_LIST will be renamed as ordered_insert.
    class NODUP_LIST [T -> COMPARABLE]
    inherit
        ORDERED_LIST [T]
            rename
                insert as ordered_insert
            end
    creation
        make
    feature {ANY}
        •••
    invariant
        no_duplicates: no_duplicates
    end -- class NODUP_LIST

NODUP_LIST defines a Boolean function no_duplicates which returns true only if each element in the list is unique. No_duplicates has a precondition of true, and its postcondition requires that if the function returns false then the last two elements examined are identical. The body of the routine consists of a single loop. The function value is initialized to true, and each pair of elements in the list is examined in turn. If the pair is equal then Result is set to false, otherwise nothing is done.
    no_duplicates: BOOLEAN is
        local
            k: INTEGER
        do
            from
                Result := true
                k := 1
            until
                k >= count or not Result
            loop
                if elements.item (k).is_equal (elements.item (k - 1)) then
                    Result := false
                end
                k := k + 1
            end
        ensure
            correct_result: not Result implies elements.item (k - 1).is_equal (elements.item (k - 2))
        end -- no_duplicates

To illustrate the use of rescue clauses, consider a routine that attempts to insert an element into a NODUP_LIST. The procedure try_insert takes the element to be inserted as an argument. Its precondition requires that space be available in the list, while its postcondition ensures that the element has been inserted, but is not a duplicate.

The body of the routine simply calls the insert routine from ORDERED_LIST. Specifically, the new element is inserted at position location plus one by ordered_insert. If an element with the same value was already present, then it will be at position location. The postcondition ensures that this possibility has not occurred, and the rescue clause for the procedure removes the duplicate element if necessary.

For our present purpose, we will assume that ordered_insert either inserts the element in question, or fails leaving the original list unchanged. Therefore, the postcondition for try_insert can be violated in two different ways. If ordered_insert succeeds, but inserts a duplicate element, then an item must be removed from the list. On the other hand, if ordered_insert fails then the new element is not a member of the list and no further action is possible. Therefore, the body of the rescue clause for try_insert consists of an if statement which branches on the presence of a duplicate element in the list. The code to remove the duplicate element consists of a single loop and an assignment. The loop starts at the duplicate element and
moves each element up one location until the front of the list is reached. The assignment then reduces the number of valid elements in the list by one.
    try_insert (elem: T) is
        require
            not_full: not full
        local
            k: INTEGER
        do
            ordered_insert (elem)
        ensure
            is_member: member (elem);
            not_duplicate: not_duplicate
        rescue
            if location >= 0 and then elements.item (location).is_equal (elements.item (location + 1)) then
                from
                    k := location + 1
                until
                    k = count - 1
                loop
                    elements.put (elements.item (k + 1), k)
                    k := k + 1
                end
                count := count - 1
            end
        end -- try_insert

    not_duplicate: BOOLEAN is
            -- have we found no duplicates in the list?
        do
            Result := location = - 1 or else not elements.item (location).is_equal (elements.item (location + 1))
        end -- not_duplicate

To better understand the operation of this routine, we will consider two cases. In both instances, assume that pre- and postconditions are being monitored at run-time, and that the precondition for try_insert evaluates to true.

First, suppose we call try_insert with an element that is not a member of the current list. The body of try_insert first calls ordered_insert, which successfully inserts the element at position location plus one, and then the postcondition for try_insert is evaluated. Both the is_member and not_duplicate clauses evaluate to true, so the procedure terminates normally.

On the other hand, suppose we call try_insert with an element that is already a member of the current list. As before, the body of the procedure first calls ordered_insert, which successfully inserts the element at position location plus one. Then, the postcondition for try_insert is evaluated. The is_member clause evaluates to true, but the not_duplicate clause evaluates to false as the items at positions location and location plus one are identical. Therefore, an exception is raised and the rescue clause for try_insert is executed. The condition of the if statement evaluates to true, so the code to remove a duplicate element is executed. The loop begins with the duplicate element and moves each item up one position until the front of the list is reached.
This eliminates the duplicate entry, and so when the assignment corrects count a stable state has been produced. At this point, the rescue clause terminates without having executed a retry
© R. S. Rist, 1993
206
instruction; therefore, execution of the routine is complete, try_insert terminates signaling failure to its caller.
15.3 The retry instruction

Rescue clauses allow the restoration of a stable state before a routine terminates signaling failure to its caller. In many situations this is all that can be accomplished; however, in some cases it is possible to discover and correct the cause of the exception at run-time. In these cases we would like to use a resumption strategy: make the necessary adjustments and then restart the routine to run to completion. In Eiffel, the retry instruction provides this capability. It consists of the single keyword retry and may only appear in a rescue clause.

To illustrate, consider the nodup_insert routine from NODUP_LIST. This procedure takes the element to be inserted as an argument. The precondition for the routine requires that space be available for the new element, while the postcondition ensures that the element is a member of the list, and that only elements and count have been changed during the execution. The body of nodup_insert consists of a single if statement which branches on the Boolean variable already_tried. If already_tried is false (the first time through the body) then try_insert is called with the new element as an argument; otherwise (the second time), nothing is done. The routine has a rescue clause consisting of a single if statement. If already_tried is false and the new element is already a member of the list (first try failed because element is already a member), then already_tried is set to true and the routine is restarted.
    nodup_insert (elem: T) is
        require
            not_full: not full
        local
            already_tried: BOOLEAN
        do
            if not already_tried then
                try_insert (elem)
            end
        ensure
            is_member: member (elem);
            no_extra_changes: strip (elements, count).is_equal (old strip (elements, count))
        rescue
            if not already_tried and member (elem) then
                already_tried := true
                retry
            end
        end -- nodup_insert

When a retry instruction is executed, it causes the routine that contains it to restart at the beginning; therefore, both the precondition and the class invariant should be true before it is invoked. Local variables are not reinitialized before the restart, but the precondition may be rechecked. Use of the retry instruction does not guarantee successful execution of the routine. New exceptions may be generated and must be handled as before.

To illustrate, let us again consider the nodup_insert routine. To better understand the operation of this procedure, we will consider three cases. In all instances, assume that pre- and postconditions are being monitored at run-time. Furthermore, assume that space is available to insert the new element into the list; therefore, the preconditions for nodup_insert and try_insert both evaluate to true.

First, suppose that we call nodup_insert with an item that is not presently a member of the list in question. When the routine is invoked already_tried is initialized to false; therefore, try_insert is called with the new element as an argument.
The try_insert routine successfully inserts the new element into the list; therefore, the postcondition for nodup_insert evaluates to true, and the procedure terminates normally.

Second, suppose that we call nodup_insert with an item that is already a member of the list. Execution proceeds as before, and try_insert is called with the new element as an argument; however, in this case try_insert fails and an exception is generated. The try_insert routine produces a stable state before terminating; its rescue clause ensures that the list is the same as before the routine was invoked. When try_insert terminates, the rescue clause for nodup_insert is invoked. At this point, already_tried is false and the item is a member of the list; therefore, already_tried is set to true and the retry instruction restarts nodup_insert from the beginning. This time try_insert is not executed, the body of nodup_insert terminates normally, the postcondition evaluates to true, and the routine completes its execution.

Third, suppose we call nodup_insert with an item that is not already a member of the list, but will cause try_insert to fail (for some mysterious reason, try_insert cannot add the element, but it will leave the list unchanged). In this case, execution proceeds as in the previous example until the rescue clause for nodup_insert is executed. Since the item in question was not originally a member of the list, and try_insert failed leaving the list unchanged, the item is currently not a member of the list. Therefore, the condition of the if statement in the rescue clause evaluates to false, the instructions inside the if statement are not performed, the rescue clause completes without executing a retry, and nodup_insert terminates signaling failure to its caller.

In all three cases, exceptions are used in a disciplined manner; nodup_insert either satisfies the contract defined by its pre- and postconditions, or notifies its caller of the discrepancy.
In Eiffel, the rescue clause and retry instruction provide a powerful, elegant system for run-time exception handling.
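The same restart pattern can be sketched outside the list classes. The class below is a minimal illustration written for this purpose and is not part of the NODUP_LIST system: the names RETRY_SKETCH, robust_step, flaky_step, and max_tries are hypothetical, and the sketch assumes that the kernel library class EXCEPTIONS, with its raise procedure, is available as it is in ISE Eiffel. Because local variables are not reinitialized by retry, the local counter tries keeps its value across restarts and can be used to bound the number of attempts.

```eiffel
class RETRY_SKETCH
    -- hypothetical class: a bounded number of restarts with retry

inherit
    EXCEPTIONS
        -- assumed kernel class that supplies raise

creation
    make

feature {ANY}
    calls: INTEGER
            -- number of times flaky_step has been called

    max_tries: INTEGER is 3
            -- upper bound on the number of attempts

    make is
            -- run the demonstration and report the number of calls
        do
            robust_step
            io.putstring ("calls made: ")
            io.putint (calls)
            io.new_line
        end -- make

    robust_step is
            -- call flaky_step, restarting while attempts remain
        local
            tries: INTEGER
        do
            tries := tries + 1
            flaky_step
        rescue
            -- tries is not reset by retry, so it counts every attempt
            if tries < max_tries then
                retry
            end
        end -- robust_step

feature {NONE}
    flaky_step is
            -- raise an exception on the first two calls only
        do
            calls := calls + 1
            if calls < 3 then
                raise ("flaky_step failed")
            end
        end -- flaky_step

end -- class RETRY_SKETCH
```

With flaky_step written as above, the first two attempts raise an exception and the third completes normally. If max_tries were 2, the rescue clause would finish without executing a retry, and robust_step would fail and signal the exception to its caller.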
15.4 An example: class NODUP_LIST

We can now examine the code for the entire NODUP_LIST class. NODUP_LIST is a descendant of ORDERED_LIST; therefore, the elements in the list are kept in ascending order. NODUP_LIST exports the routines empty, full, first, nodup_insert, delete, and member; all but nodup_insert are inherited from ORDERED_LIST with no modifications. The routines empty and full return true if the list has no elements or no free space, respectively. The routine first returns the element at the front of the list, nodup_insert puts an item into its correct position in the list (if it is not already present), delete removes the element at the front of the list, and member returns true if an item is already present in the list. For the sake of clarity, NODUP_LIST renames the insert routine from ORDERED_LIST as ordered_insert.

NODUP_LIST has an invariant requiring that no two elements in the list are identical. For this purpose, it defines a Boolean function no_duplicates which returns true only if each element in the list is unique. The function has a precondition of true, and its postcondition requires that if the function returns false then the last two elements examined are identical. The body of the routine consists of a single loop. The function value is initialized to true, and each pair of adjacent elements in the list is examined in turn. If the pair is equal then Result is set to false, otherwise nothing is done.

NODUP_LIST defines a procedure try_insert that attempts to add an element to the list. The precondition for try_insert requires that space be available in the list, while the postcondition ensures that the element has been inserted, but is not a duplicate. Specifically, the new element is inserted at position location plus one by ordered_insert. If an element with the same value was already present, then it will be at position location. The postcondition for try_insert ensures that this possibility has not occurred, and the rescue clause for the procedure removes the duplicate element if necessary.

The postcondition for try_insert can be violated in two distinct ways. If ordered_insert adds a duplicate element, then an item must be removed from the list. On the other hand, if ordered_insert fails to add the new element and it is not already a member of the list, then no further action is possible. Therefore, the rescue clause for try_insert consists of an if statement on the presence of a duplicate element in the list. The code to remove the duplicate element consists of a single loop and an assignment. The loop starts at the duplicate element and moves each element up one location until the front of the list is reached. The assignment then reduces the number of valid elements in the list by one.
The nodup_insert procedure takes the element to be inserted as an argument. The precondition requires that space be available for the new element, while the postcondition ensures that the element is a member of the list, and that only count and elements were changed by the procedure. The body of nodup_insert consists of an if statement on the Boolean already_tried. If already_tried is false (the first time through the body), then try_insert is called with the new element as an argument; otherwise (the second time) nothing is done. The routine has a rescue clause consisting of a single if statement. If already_tried is false and the new element is already a member of the list (first time failed because already present), then already_tried is set to true and the routine is restarted.
class NODUP_LIST [T -> COMPARABLE]

inherit
    ORDERED_LIST [T]
        rename
            insert as ordered_insert
        end

creation
    make

feature {ANY}
    make (size: INTEGER) is
            -- make a list of size size
        do
            !! list_make (size)
        end -- make

    try_insert (elem: T) is
        require
            not_full: not full
        local
            k: INTEGER
        do
            ordered_insert (elem)
        ensure
            is_member: member (elem);
            not_duplicate: not_duplicate
        rescue
            if location >= 0 and then
                    elements.item (location).is_equal (elements.item (location + 1)) then
                from
                    k := location + 1
                until
                    k = count - 1
                loop
                    elements.put (elements.item (k + 1), k)
                    k := k + 1
                end
                count := count - 1
            end
        end -- try_insert

    nodup_insert (elem: T) is
        require
            not_full: not full
        local
            already_tried: BOOLEAN
        do
            if not already_tried then
                try_insert (elem)
            end
        ensure
            is_member: member (elem);
            no_extra_changes: strip (elements, count).is_equal (old strip (elements, count))
        rescue
            if not already_tried and member (elem) then
                already_tried := true
                retry
            end
        end -- nodup_insert

    no_duplicates: BOOLEAN is
        local
            k: INTEGER
        do
            from
                Result := true;
                k := 1
            until
                k >= count or not Result
            loop
                if elements.item (k).is_equal (elements.item (k - 1)) then
                    Result := false
                end
                k := k + 1
            end
        ensure
            correct_result: not Result implies
                elements.item (k - 1).is_equal (elements.item (k - 2))
        end -- no_duplicates

    not_duplicate: BOOLEAN is
            -- have we found no duplicates in the list?
        do
            Result := location = - 1 or else
                not elements.item (location).is_equal (elements.item (location + 1))
        end -- not_duplicate

invariant
    no_duplicates: no_duplicates

end -- class NODUP_LIST
15.5 Discussion

Exceptions are an extremely powerful mechanism, and as such should be used sparingly. They are not a technique for dealing with uncommon (but acceptable) cases. They should be reserved for unpredictable events, un-testable preconditions and protection against errors remaining in software. The programs in this chapter are not examples of the ideal use of exceptions; they simply illustrate the exception mechanism in Eiffel.

In general, exception handlers should be simple. They should contain the minimum code necessary to restore a stable state and terminate or restart the routine. Complex algorithms for unusual cases should be placed in the routine body rather than in the rescue clause.

Exceptions are ideal for a situation such as an "out of memory" error on attempted object creation. Checking before each operation would be extremely expensive, and failure of the operation is unusual; therefore, it is an ideal situation to be handled in a rescue clause.

Main points covered in this section

• Exceptions are generated when an assertion is violated at run-time and when the hardware or operating system signals an error.
• Exceptions should be used sparingly. They should be reserved for unpredictable events, un-testable preconditions and protection against errors remaining in software.
• There are two acceptable responses to an exception. Either achieve a stable state and signal failure, or change the conditions that caused the problem and retry.
• The rescue clause for a routine defines the actions to be taken when an exception occurs during its execution.
• The retry instruction may be used in a rescue clause to re-invoke the routine. The routine fails if the rescue clause terminates without executing a retry.
• The rescue clause should establish both the routine precondition and the class invariant before executing a retry. It should establish the class invariant before terminating in any case.
• The rescue/retry mechanism guarantees that a routine will only terminate by executing its body to normal completion or by signaling failure to its caller.
Exercises

Consider the class STRICT_LIST developed in the exercises for the previous chapter. We will define a routine just_insert which simply calls the insert routine from ORDERED_LIST. The precondition requires that the new element can be inserted while maintaining strict ordering.

    just_insert (elem: INTEGER) is
        require
            •••
        do
            ordered_insert (elem)
        end -- just_insert
1. Write a precondition for just_insert. (Hint: the good function returns true if it is possible to insert an element and maintain strict ordering.)

    good (elem: INTEGER): BOOLEAN is
        do
            Result := empty or else elem = first - 1 or elem = last + 1
        end -- good

The strict_insert procedure inserts a new element into a STRICT_LIST. The routine takes the element to be inserted as an argument, and the precondition requires that space for a new item be available. The postcondition ensures that either the new element is a member of the list, or the insertion was not possible and the list is unchanged. The body of the routine consists of an if statement on the Boolean already_tried. If already_tried is false (the body is being executed for the first time), then just_insert is called with the item to be inserted; otherwise (the second time) nothing is done. The rescue clause for the routine also consists of an if statement on already_tried. The first time the rescue clause is executed, already_tried is set to true and the routine is restarted; the second time, nothing is done.

    strict_insert (elem: INTEGER) is
        require
            not_full: not full
        local
            already_tried: BOOLEAN
        do
            if not already_tried then
                just_insert (elem)
            end
        ensure
            member_if_possible: member (elem) or not good (elem)
        rescue
            if not already_tried then
                •••
            end
        end -- strict_insert

2. Write a body for the if statement that correctly completes the definition of strict_insert.
3. Does the rescue clause for strict_insert have to restore the class invariant? Which class invariants are involved? (Hint: which classes are ancestors of STRICT_LIST?)
4. What must be true before retry is executed in the rescue clause for strict_insert?
Appendix A: Reserved words, special characters, operator precedence

A.1 Reserved words

The reserved words in Eiffel are listed below. You are not allowed to use them as names.

    alias       all         and         as          BIT         BOOLEAN
    CHARACTER   check       class       creation    Current     debug
    deferred    do          DOUBLE      else        elseif      end
    ensure      expanded    export      external    false       feature
    from        frozen      if          implies     indexing    infix
    inherit     inspect     INTEGER     invariant   is          like
    local       loop        NONE        not         obsolete    old
    once        or          POINTER     prefix      REAL        redefine
    rename      require     rescue      Result      retry       select
    separate    STRING      strip       then        true        undefine
    unique      until       variant     when        xor

Precursor will soon be added to this list.
A.2 Special characters

    Character   Code   Mnemonic name
    @           %A     At-sign
    BS          %B     Backspace
    ^           %C     Circumflex
    $           %D     Dollar
    FF          %F     Form feed
    \           %H     backslasH
    ~           %L     tiLde
    NL          %N     Newline
    `           %Q     (back) Quote
    CR          %R     (carriage) Return
    #           %S     Sharp
    HT          %T     (horizontal) Tab
    NUL         %U     nUll character
    |           %V     Vertical bar
    %           %%     Percent
    '           %'     Single quote
    "           %"     Double quote
    [           %(     Opening bracket
    ]           %)     Closing bracket
    {           %<     Opening brace
    }           %>     Closing brace
A.3 Operator precedence order

The precedence order for all Eiffel operators is shown below; highest precedence is at the top of the table, lowest precedence at the bottom. Operators at the same precedence level are shown together; these operators are evaluated left to right in a flat expression. Brackets override the default precedence order. A free operator (levels 10 and 11) is an operator whose name begins with one of the characters '@', '#', '|', or '&'.

    Level   Operators
    12      . (Dot notation for client feature calls)
    11      old   strip   not   unary +   unary -   All free unary operators
    10      All free binary operators
    9       ^ (power)
    8       *   /   // (integer division)   \\ (integer remainder)
    7       binary +   binary -
    6       =   /= (not equal)   <   >   <=   >=
    5       and   and then
    4       or   or else
    3       implies
    2       << >> (for manifest arrays)
    1       ; (semicolon separator between assertion clauses)
Appendix B: Eiffel syntax

The syntax of each part of the Eiffel language is given below, in the order used in the book.

B.1 Class

The basic format of a class is a header followed by the class body. The header gives the name of the class, and the name of its creation routine. A class name is written in capital letters. The keywords class, inherit, and feature are written at the left edge of the line. The body consists of a list of features; a feature is an attribute or a routine. All features are indented equally, four spaces from the left edge. All indenting is done with the same step size, in steps of four spaces. A space is written after each comma, colon, and semi-colon, and on either side of an assignment symbol (:=) and a comment symbol (--).

    -- class header comment
    class NAME

    creation
        make

    feature
        attribute_name: TYPE

        function_name (arguments): TYPE is
                -- header comment
            require
                preconditions
            local
                local variables
            do
                routine body
                Result := value
            ensure
                postconditions
            end -- function_name

        procedure_name (arguments) is
                -- header comment
            require
                preconditions
            local
                local variables
            do
                routine body
            ensure
                postconditions
            end -- procedure_name

    end -- class NAME
The export policy of a feature specifies the clients that can use that feature. The policy is set in a feature clause, and remains in force until the next feature clause. The types of export policy are

    Export clause        Meaning
    feature              exported to all classes
    feature {ANY}        exported to all classes
    feature {X, Y, Z}    exported to classes X, Y, Z
    feature {}           exported to no class
    feature {NONE}       exported to no class

A class states its creation procedures and policies after the keyword creation. The creation policy specifies the clients that can call the procedure as a creator; the format of the creation policy is the same as that of an export policy. There are four forms of the creation clause:

    creation {X, Y, Z} make           X, Y, Z can use make as creator
    creation {X, Y, Z} make, setup    X, Y, Z can use make or setup as creator
    creation <no name>                an object cannot be created for this class
    no creation keyword               no creation routine
If a class has multiple creation routines with different creation policies, then each routine has its own creation clause and policy. An object is created by a creation command of the form !!. If there is no creation routine, then the creation instruction has the form !!name. If there is a creation routine, then the creation instruction has the form !!name.make. If a creation routine is given for the class, it must be called when the object is created. The export policy of a creation procedure defines who can call the feature as a non-creation routine. A creation routine may be called to change an existing object:
    !!object.make    -- create new object
    object.make      -- alter existing object
B.2 Sequence

An identifier is the name of a variable, routine, or class. An identifier begins with a letter, and may include numbers or the underscore character.

Declaration:

    length: REAL                               attribute
    length: REAL is 4.5                        constant
    Red, Orange, Yellow: INTEGER is unique     unique constants
    local length: REAL                         local variable

An identifier can be declared as the same type as an object by a declaration of the form

    this: like anchor

where anchor is the name of an object in the scope of the declaration.

Input:

    io.readint           read an integer value
    io.lastint           last integer value read
    io.readreal          read a real value
    io.lastreal          last real value read
    io.readdouble        read a double precision real value
    io.lastdouble        last double precision real value read
    io.readchar          read a character
    io.lastchar          last character read
    io.readline          read a line, discard CR
    io.readstream (n)    read a stream of n characters
    io.readword          read a word up to a space or CR
    io.laststring        last string read
    io.next_line         read from a new line

Output:

    io.putint            write an integer value
    io.putreal           write a real value
    io.putdouble         write a double precision real value
    io.putchar           write a character value
    io.putbool           write a Boolean value
    io.putstring         write a string value
    io.new_line          start a new line for output

Assignment:

    variable := expression    store the value of expression in the variable

Procedure:

    name (argument list) is
            -- header comment
        require
            preconditions
        local
            local variables
        do
            routine body
        ensure
            postconditions
        rescue
            rescue clause, possibly including retry
        end -- name

An argument list is a list of declarations, separated by semi-colons.
Function:

    name (argument list): TYPE is
            -- header comment
        require
            preconditions
        local
            local variables
        do
            routine body
            Result := expression
        ensure
            postconditions
        rescue
            rescue clause, possibly including retry
        end -- name

Infix operator:

    infix name (arguments): TYPE is ...

Prefix operator:

    prefix name (arguments): TYPE is ...

The keyword do is replaced by the word once to define a once routine.
A class is expanded by placing the keyword expanded before the keyword class. An object is expanded by writing the keyword expanded before the type in the variable declaration. The keyword Current returns the value of the current object.
B.3 Selection

If:

    if condition1 then
        action1
    elseif condition2 then
        action2
    ...
    else
        default_action
    end

Inspect:

    inspect expression
    when values then
        action
    when values then
        action
    ...
    else
        default_action
    end

Values in an inspect statement may be specified in three ways:

    i)   A single value:      when 3 then ...
    ii)  A set of values:     when 'a', 'e', 'i', 'o', 'u' then ...
    iii) A range of values:   when 1..12 then ...
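As a small illustration of the two selection forms, the sketch below uses an inspect with a set of values and two ranges, and an if with elseif and else branches. The class SELECTION_SKETCH and the functions season and kind are hypothetical names invented for this example; the output routines are those listed in B.2.

```eiffel
class SELECTION_SKETCH
    -- hypothetical class: one inspect statement and one if statement

creation
    make

feature {ANY}
    make is
            -- show one result from each selection form
        do
            io.putstring (season (7))
            io.new_line
            io.putstring (kind ('e'))
            io.new_line
        end -- make

    season (month: INTEGER): STRING is
            -- a rough season name for a month number
        require
            valid_month: 1 <= month and month <= 12
        do
            inspect month
            when 12, 1, 2 then
                -- a set of values
                Result := "winter"
            when 3..5 then
                -- a range of values
                Result := "spring"
            when 6..8 then
                Result := "summer"
            else
                Result := "autumn"
            end
        end -- season

    kind (c: CHARACTER): STRING is
            -- classify a character as vowel, consonant, or other
        do
            if c = 'a' or c = 'e' or c = 'i' or c = 'o' or c = 'u' then
                Result := "vowel"
            elseif c >= 'a' and c <= 'z' then
                Result := "consonant"
            else
                Result := "other"
            end
        end -- kind

end -- class SELECTION_SKETCH
```

The else branch of the inspect is worth keeping even when the precondition restricts the argument, because it makes the default case explicit to the reader.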
B.4 Iteration

Loop:

    from
        initialisations
    until
        exit_condition
    loop
        action
    end
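The loop form can be sketched with a function that adds the integers from 1 to n. The class LOOP_SKETCH and the function sum_to are hypothetical names used only for this example. The sketch also includes the optional invariant and variant clauses, which are not part of the basic form shown above; they are the loop assertions that are tested at the assertion (loop) level, and they can be deleted without changing the result.

```eiffel
class LOOP_SKETCH
    -- hypothetical class: a from/until/loop statement with loop assertions

creation
    make

feature {ANY}
    make is
            -- print the sum of the integers from 1 to 10
        do
            io.putint (sum_to (10))
            io.new_line
        end -- make

    sum_to (n: INTEGER): INTEGER is
            -- the sum 1 + 2 + ... + n
        require
            not_negative: n >= 0
        local
            k: INTEGER
        do
            from
                k := 0
            invariant
                -- Result always holds the sum of the first k integers
                partial_sum: Result = k * (k + 1) // 2
            variant
                -- decreases by one on each pass and never goes below zero
                n - k
            until
                k = n
            loop
                k := k + 1
                Result := Result + k
            end
        ensure
            closed_form: Result = n * (n + 1) // 2
        end -- sum_to

end -- class LOOP_SKETCH
```

The exit condition is tested before each pass through the loop body, so a call with n equal to zero executes the body zero times and returns the default value of Result.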
B.5 Inheritance

The basic form of an inheritance clause is

    class CHILD
    inherit
        PARENT

The inheritance clause has a set of sub-clauses that are written and executed in fixed order:

    class CHILD
    inherit
        PARENT
            rename
                m as n            -- new name in child
            export
                {C, D} o, p       -- new export policy in child
            undefine
                q                 -- deferred in child
            redefine
                r, s              -- new body in child
            select
                t                 -- select active feature
            end

Multiple classes can be inherited; the class names are separated by semi-colons:

    class A
    inherit
        B; C; D

The class ANY is an ancestor of every user-defined class. The class NONE is the bottom of the inheritance hierarchy; no class can inherit NONE. The special value Void is of type NONE.
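The rename and redefine sub-clauses can be sketched with two small classes. ANIMAL and DOG are hypothetical names invented for this example, and each class would normally be stored in its own file. DOG renames the inherited describe so that it can define a new describe that calls the parent version, in the same way that NODUP_LIST renames insert as ordered_insert, and it redefines name to supply a new body.

```eiffel
class ANIMAL
    -- hypothetical parent class

feature {ANY}
    name: STRING is
            -- the kind of animal
        do
            Result := "animal"
        end -- name

    describe is
            -- print a one-line description
        do
            io.putstring ("kind: ")
            io.putstring (name)
            io.new_line
        end -- describe

end -- class ANIMAL


class DOG
    -- hypothetical child class

inherit
    ANIMAL
        rename
            describe as animal_describe   -- new name in child
        redefine
            name                          -- new body in child
        end

feature {ANY}
    name: STRING is
            -- the kind of animal
        do
            Result := "dog"
        end -- name

    describe is
            -- print the inherited description, then one extra line
        do
            animal_describe
            io.putstring ("sound: woof")
            io.new_line
        end -- describe

end -- class DOG
```

Because name is redefined rather than renamed, the call to name inside animal_describe uses the DOG version when the object is a DOG. The new describe in DOG is a separate feature from the renamed one.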
B.6 Genericity

A generic class is passed a class name as a parameter and uses objects of that type. The formal parameter (usually called "T" for type or "G" for generic) is defined in the class header, and actual parameters are passed and bound at compile time. Multiple actual and formal parameters are separated by commas.

    class ARRAY [G]

A formal parameter can be constrained to be of the type given in the class header. The constraint operator is "->", so a constrained class header looks like

    class GIRLS [T -> FUN]

and only actual parameters that conform to FUN (FUN itself or one of its descendants) can be passed to GIRLS.

B.7 Assertions

An assertion is an expression that evaluates to true or false. It may be implemented as a Boolean variable, expression, or function. An assertion with a label has the form

    label: Boolean entity

Multiple assertions are separated by semi-colons. A pre-condition is written under the keyword require, and is true on routine entry. A post-condition is written under the keyword ensure, and is true on routine exit. A class invariant is written under the keyword invariant and asserts what is true for any object of that class. It is tested on entry to and on exit from every routine in the class, except for the creation routine.

old attribute in a post-condition returns the value of attribute on entry to the routine. strip (except) returns an array of all attributes in the object except for those listed. Multiple excepted attributes are separated by commas.

rescue introduces a rescue body that is executed when an exception occurs in that routine, for example when a post-condition on that routine fails. Within a rescue body, retry transfers control to the routine body; local variables retain their values set in the previous execution of the routine body.
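The assertion forms described above can be seen together in one small class. ACCOUNT_SKETCH is a hypothetical class written only for this illustration (it is not the account class of the case study); it shows labelled pre-conditions under require, post-conditions that use old under ensure, and a class invariant.

```eiffel
class ACCOUNT_SKETCH
    -- hypothetical class: pre-conditions, post-conditions, and an invariant

creation
    make

feature {ANY}
    balance: INTEGER
            -- the current balance, in whole units

    make is
            -- start with an empty account
        do
            balance := 0
        ensure
            empty: balance = 0
        end -- make

    deposit (amount: INTEGER) is
            -- add amount to the balance
        require
            positive: amount > 0
        do
            balance := balance + amount
        ensure
            added: balance = old balance + amount
        end -- deposit

    withdraw (amount: INTEGER) is
            -- remove amount from the balance
        require
            positive: amount > 0;
            covered: amount <= balance
        do
            balance := balance - amount
        ensure
            removed: balance = old balance - amount
        end -- withdraw

invariant
    not_overdrawn: balance >= 0

end -- class ACCOUNT_SKETCH
```

The covered pre-condition makes the client responsible for checking the balance before calling withdraw, so the routine body needs no test of its own, and the invariant records the property that every routine must preserve.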
B.8 Naming conventions

A class name is written in capital letters. A variable or routine name uses only lower case letters. A constant begins with an upper case letter, then uses all lower case. Compound names are combined by the underscore character. A variable name is a noun that describes the stored value. A function name is a noun that describes the returned value. A procedure name is a verb that describes the change made by that procedure. The name of a creation procedure is make.

In this text, the following conventions are used for common routine names:

    get      read and set
    read     prompt and io.readX
    set      store the value
    valid    a BOOLEAN function that tests for valid values
    show     show a value
Appendix C: Ace file

C.1 Structure

The structure of an Ace file consists of the following parts:

1. The name of the executable file to be produced is written after the keyword system.
2. The name and creation routine of the root class are written after the keyword root.
3. The various run-time options are written after the keyword default. The location of the precompiled Eiffel library is also specified here.
4. The location of the Eiffel files is written after the cluster keyword. The first cluster says that the user’s files are contained in the current directory (“./”). The other clusters used in the system are the Eiffel kernel and support files.

    system
        bank
    root
        BANK: "make"
    default
        assertion (require);
        precompiled ("$EIFFEL3/precompiled/spec/$PLATFORM/base")
    cluster
        eiffel: "./";
        kernel: "$EIFFEL3/library/base/kernel";
        support: "$EIFFEL3/library/base/support";
    end

Eiffel offers many more clusters than were needed to compile the case study system.
C.2 Assertions

Assertions can be checked at six levels, where each level adds to the previous one:

    1.   assertion (no)           no assertion checking
    2.   assertion (require)      test pre-conditions
    3.   assertion (ensure)       also test post-conditions
    4.   assertion (invariant)    also test class invariant
    5.   assertion (loop)         also test loop variants and invariants
    6.   assertion (check)        also test check instructions
    6.   assertion (all)          same as assertion (check)
C.3 Debug

The debug status is set in the defaults section of the Ace file to:

    1.   debug (no)        no debugging
    2.   debug (“name”)    turn on debugging for clauses labelled with this debug string
    3.   debug (yes)       execute debug clause
    4.   debug (all)       same as debug (yes)

© R. S. Rist, 1998
Appendix D: Charts

D.1 Client chart

A client link from a class CLIENT to a supplier class SUPPLIER is defined when CLIENT contains a declaration of type SUPPLIER. Informally, a client relation means that a client "has", "uses", or "contains" an instance of the supplier class. A class is shown as an oval that contains the class name. The client relation is shown by a right arrow from the client to the supplier; the client is shown to the left, and the supplier to the right. The code below produces the chart that follows it.

    class CUSTOMER
    feature
        account: ACCOUNT

    [Client chart: CUSTOMER → ACCOUNT]

A generic relation is defined when a class CLIENT declares a variable of the generic type and passes an actual parameter. The generic relationship is shown in a client chart by writing the client of the generic class first, then the name and formal parameter(s) of the generic class, and finally the actual parameter. The following chart reflects the code

    class BANK ...
    feature
        patrons: LINKED_LIST [CUSTOMER]

    [Client chart: BANK → LINKED_LIST [T] → CUSTOMER]

The Eiffel classes INTEGER, REAL, DOUBLE, CHARACTER, BOOLEAN, and STRING are not shown.
D.2 Inheritance chart

An inheritance link is defined when one class inherits another. Informally, inheritance means that one class “is a” type of another class. An inheritance chart shows an inheritance link by an up arrow from the child to the parent. Simple inheritance is shown below, first the code then the chart.

    class CUSTOMER
    inherit
        PERSON

    [Inheritance chart: an up arrow from CUSTOMER (below) to PERSON (above)]

Multiple inheritance is shown below, first the code then the chart.

    class KEY_LIST [T]
    inherit
        STORABLE;
        LINKED_LIST [T]

    [Inheritance chart: up arrows from KEY_LIST [T] (below) to STORABLE and to LINKED_LIST [T] (above)]

Repeated inheritance is shown by extending the hierarchy through as many levels as needed.
D.3 Class diagram

A class diagram shows the name, all the attributes and the exported features in a class. The name, type, and (for a constant) value of every attribute in the class are shown. For each exported routine, the name, the name and type of any arguments, and the name of the returned type (for a function) are shown. A simple class diagram has the class name as its header; a generic class has a more complex header.

    [Class diagrams: the simple format shows the class name (CLASS), then the attributes, then the exported routines. The generic example has the header KEY_LIST [T -> KEYED] and lists read_id, find (key: CHARACTER), and found: BOOLEAN.]
D.4 Data structure chart

A data structure chart shows the name, type, and value of the attributes in a class or in a system. The classes are drawn from left to right in client chart order. Within a class, a triple is drawn for each attribute declaration. Basic classes are shown in a data structure chart, such as the chart below that shows the data structure of a line.

    [Data structure chart: line: LINE contains start: POINT and finish: POINT; each POINT contains x: REAL and y: REAL. The values shown are -1.0 and 0.0 for one point, and 0.0 and 1.0 for the other.]
Appendix E: Design principles

The design principles used in this text are listed below, divided into four parts.

E.1 Object-oriented programming

1. An object is designed around the data it stores.
2. The data and the code live in the same class.
3. A routine is small and does a single thing.
4. Write code in a supplier and reuse it by calling.
5. Write code in a parent and reuse it by inheritance.

E.2 Eiffel

1. A value can only be changed by code inside its class.
2. A function returns a value and changes nothing.
3. A procedure changes value(s) and returns nothing.
4. Export the behaviour and hide the representation.
5. Design by contract: define pre- and post-conditions on a routine.
6. The client is responsible for calling a routine safely.
7. Design by sub-contract: a redefined feature honours the contract of its precursor.

E.3 Design guidelines

1. Local is beautiful.
2. Keep the number of attributes as small as possible.
3. Assume a variable is local; if this is not possible, make it an attribute.
4. Assume a value is a function; if this is not possible, make it an attribute.
5. Assume a feature is private; if this is not possible, export it.
6. Hide the complexity inside a routine.
7. Hide the attributes whenever reasonable.
8. Objects communicate little and publicly:
    • Store data in a class, don’t pass it as an argument.
    • Don’t export features unless you have to.
9. All names are clear, simple, and meaningful.
10. Use matched routines: a command to make a change, and a query to test the effect.
11. Don't explicitly test the type of an object:
    • Use polymorphism to define a common interface.
    • Use conditional assignment to find a type.
12. Guard so that errors don’t occur, don’t fail and recover.
13. If you repeat code, then your design is wrong.

E.4 The process of design

1. Decompose: solve a small part of the solution first.
2. Test: code and test one part of the system at a time.
3. Iterate: evaluate your solution, then improve it.
Appendix F: Glossary of Eiffel terms

F.1 Data, routine, class, and object terms

Ace file: The Ace (Assembly of Classes in Eiffel) file tells Eiffel how to compile your system. The Ace file contains the name of the executable to be produced by the compilation (after the Ace keyword system), and the place to start the compilation (the root class and its creation routine are listed after the Ace keyword root). Other parts of the Ace file tell Eiffel which directories contain your code and the code from the Eiffel libraries, and set the assertion checking level.
actual argument: the actual arguments are the actual values passed in a routine call. When a routine is called, the actual arguments are passed to the routine definition and the formal arguments in the routine header are bound to (given the value of) the actual arguments.
argument: arguments are used to send data from one routine to another. The code that calls a routine supplies any actual arguments, and the called routine provides a set of formal arguments in the routine header. Formal arguments are bound to actual arguments when the routine is called, local variables are created, and the routine body is executed. A formal argument is a local variable, so its scope is the routine body.
argument binding: when a routine is called, the formal arguments in the routine header are bound to the actual arguments supplied in the routine call. A local variable is created from the name and type of each formal argument in the header, and given the value of the actual argument. For argument binding to succeed, actual and formal arguments must agree in number, order, and type.
assignment: the expression on the right-hand side of an assignment statement is calculated and that value is stored in the variable given on the left-hand side of the assignment.
attribute: a variable declared as a feature of a class. The scope of an attribute is its class.
basic type: intuitively, the basic type of information that is stored. The Eiffel basic types are INTEGER, REAL, DOUBLE, CHARACTER, and BOOLEAN. The class STRING is a reference type, not a basic type. The five basic types are often referred to as expanded types, because they have an immediately useful value instead of a reference to a memory location. Because the value is available immediately (without needing to search down the memory pointer to find it), access is fast; because the basic types are used so often, fast access is important, so they are stored in expanded form.
bit: binary digit. There are two binary digits, zero (0) and one (1). The base level of representation in a computer is the binary digit, so every piece of data is stored as a pattern of bits, that is, as a sequence of zeros and ones. At the base hardware level a computer stores, reads, and manipulates bits, so both data and code are stored as bits. When an Eiffel system is compiled, the text characters in the code listing are translated into bits and stored as an executable file of machine code.
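The entries on arguments, argument binding, assignment, and attributes can be seen together in a small sketch. The class name BINDING_DEMO and its feature names are hypothetical, chosen only for illustration, and the syntax is the 1998-era form described in this text.

```eiffel
-- A minimal sketch; BINDING_DEMO is a hypothetical root class.
class BINDING_DEMO

creation
    make

feature
    total: INTEGER
            -- An attribute; its scope is the whole class

    make is
            -- Call `add' with two actual arguments and show the result
        do
            add (3, 4)
                -- 3 and 4 are the actual arguments
            io.putint (total)
            io.new_line
        end -- make

    add (first, second: INTEGER) is
            -- `first' and `second' are the formal arguments;
            -- they are bound to 3 and 4 when the routine is called
        do
            total := first + second
                -- Assignment: the right-hand side is calculated,
                -- then stored in the variable on the left
        end -- add

end -- class BINDING_DEMO
```

The call succeeds because the actual and formal arguments agree in number, order, and type: two values, both INTEGER.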
binary: zero (0) or one (1). A computer's hardware can only store and manipulate binary values, so text (instructions) has to be converted to binary form before the computer can execute it.
call: a routine is called by writing its name in the code. When the name of the routine is encountered during execution, control is transferred to the routine definition, any local variables are created, the routine body is executed, and control then returns to the caller; more precisely, to the code following the call (the routine name) in the calling code.
calling stack: a stack of the current routine calls. When a routine is called, a copy of that routine is placed on the calling stack; when the routine exits, it is removed from the stack. The top of the calling stack is thus the routine currently being executed.
class: a class defines the behaviour of its objects. It is an abstract definition of a class of objects, and every object of the same type behaves in the same way (has the same set of features). From the outside, the behaviour of a class is defined by its exported features. Internally, a class consists of a set of attributes and a set of routines. Because a class is defined by its code listing, we say that a class is a compile-time entity; it is defined when the code is compiled.
clone: make a field-by-field copy of an object and return the new object. You have to clone an object of a reference type to get a new object; simple assignment stores the reference to an object, not a reference to a new copy.
compile: parse the code listings and translate the code from text (characters) to executable form (binary machine code) so the code can be executed or run by a computer.
creation: an object has to be created before it exists. The first object to be created at run-time is found from the root class listed in the Ace file, then the code in that class creates the objects that it needs, those objects create other objects, and so on. Except for the root class, an object is created when a class issues a creation command of the form !!name. A class may have a creation routine, and the object can then be created and its creation routine executed by a creation command of the form !!name.make. A creation routine may have arguments (the root class's creation routine can take only a single string as an argument). A class must specify its creation routine by placing the name of the routine after the keyword creation; by convention, a creation routine is named make.
class header comment: a comment placed before the class keyword, that describes the class.
class trailer comment: the word "class" followed by the name of the class, written as a comment after the end of the class.
client: intuitively, a client uses features that are exported from or supplied by another class; when you use the services of a lawyer, you are a client of that lawyer. Formally, a client relationship is defined in Eiffel when a (client) class declares a variable of the type of the supplier (class). A client link is shown on a client chart as a straight line going from the client to the supplier, horizontally from left to right across the page.
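As an illustration of the creation, client, and class comment entries above, the following sketch shows a supplier class with a creation routine and a client that creates an object of that class. The class names ACCOUNT and BANK are hypothetical, and each class is assumed to be stored in its own source file.

```eiffel
-- File account.e
-- ACCOUNT: a hypothetical supplier class (class header comment)
class ACCOUNT

creation
    make

feature
    balance: INTEGER

    make (initial: INTEGER) is
            -- Set the balance to `initial'
        do
            balance := initial
        end -- make

end -- class ACCOUNT

-- File bank.e
-- BANK: a hypothetical root class that is a client of ACCOUNT
class BANK

creation
    make

feature
    account: ACCOUNT
            -- Declaring a variable of type ACCOUNT makes BANK a client

    make is
            -- Create an account and show its balance
        do
            !!account.make (100)
                -- Creation command: create the object, then run `make'
            io.putint (account.balance)
            io.new_line
        end -- make

end -- class BANK
```

The line "end -- class ACCOUNT" is the class trailer comment, and the comment before each class keyword is the class header comment.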
code: the exact meaning of this term depends on the context in which it is used. When people talk about "writing code", they refer to everything that has to be written; every character in every class in a system, that is compiled to produce executable code. You can talk about the code for a class, which is every character in the class listing. You can talk about the code in a class, which is the set of routines. You can talk about the code for a routine, which is every line of code from the routine header to the routine end, inclusive. You can talk about the code in a routine, which is every line of code between the do and the end. Finally, you can talk about an individual line of code.
command: a command causes a change in the world (or fails to work). It is implemented in Eiffel by a procedure.
comment: a string of words that is meant for people to read, not for the computer to execute. A comment is preceded by two minus signs, and extends to the end of the line.
constant: a variable with a fixed or constant value. The value is given in the declaration and cannot be changed by code. In Eiffel, a constant must be declared as a feature of a class; you cannot have a local constant.
data: data values are stored in the computer's memory as a pattern of bits. Each type of value (real, integer, character, etc.) is represented in a different way, so your code has to tell the compiler what type of value is to be stored, so the compiler knows the right way to represent that value internally. Data is usually stored as the value of a variable, but may also be written "in-line" in the code as a literal value.
declaration: a variable has to be declared before it can be used. A variable declaration lists the name and type of the variable. When a variable is created, it is given a default value. A declaration may occur as a feature of a class, as a formal argument, or as a local declaration in a routine. A constant is considered to be a type of variable, whose value is given in the declaration and cannot be changed.
default: the basic or initial value (or setting). When a variable is created, its value is set to a default value. INTEGER, REAL, and DOUBLE variables get a default value of 0; CHARACTER variables have a default value of the null character; BOOLEAN variables are initialised to false; and reference variables are initialised to Void.
enumerated type: a set of variables that enumerate all possible values for a type. A unique variable has a name, and an unknown value. The value can be assigned and tested for equality or comparison, but the exact value is unknown.
executable: an executable file is one that can be executed or run by the computer. The file contains code that has been translated from source (text) to executable (binary machine code) form and can thus be run by the machine.
export: a feature may be exported to, and thus called by, a client. The export policy of a feature is defined by the feature keyword preceding the feature. There are three main types of export policy: keep the feature private (feature {NONE}), export it to specific classes (feature {NAME}), and export it to every class (feature or feature {ANY}).
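The three export policies, together with a constant and a default value, might look like the following sketch. COUNTER and DISPLAY are hypothetical class names; DISPLAY stands for some client class that is assumed to exist elsewhere in the system.

```eiffel
-- A minimal sketch; COUNTER and DISPLAY are hypothetical names.
class COUNTER

feature {NONE}
        -- Private: can be used only inside COUNTER

    step: INTEGER is 1
            -- A constant; its value is given in the declaration

feature {DISPLAY}
        -- Exported only to the class DISPLAY

    reset is
            -- Set the count back to zero
        do
            count := 0
        end -- reset

feature
        -- Exported to every class; the same as feature {ANY}

    count: INTEGER
            -- An attribute; it starts with the default value 0

    increment is
            -- Add `step' to the count
        do
            count := count + step
        end -- increment

end -- class COUNTER
```

A client can read count and call increment, only DISPLAY can call reset, and no client can see step.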
expression: a sequence of identifiers or literals, connected by operators. When the expression is encountered during execution, the values of the identifiers and literals are used to calculate the value of the expression, and we say that the expression evaluates (to some value). An expression often occurs as the right-hand side of an assignment statement, but can be used immediately as well, without storing it.
feature: a feature is an attribute or a routine.
flat expression: an expression with no parentheses.
function: a function is one of the two forms of a routine. The behaviour is that a function returns a value and changes nothing. The type of the returned value is shown after the colon in the function header, before the keyword is.
hexadecimal: a base 16 number system that uses as digits 0-9, A, B, C, D, E, F. A group of four bits can be written as a single hexadecimal digit. Computer memory addresses are often written as hexadecimal values.
identifier: a feature name; the feature may be an attribute, a local, a function, or a routine.
implementation: the mechanism inside the box, the way something is actually coded. A class is implemented by a set of attributes and routines: data versus code. The implementation of a routine is the code in the routine body. When viewed from outside the entity, we see only its behaviour and the implementation is hidden.
input: getting a value from the terminal or some other external source, such as a file. Four types of values may be input from the terminal: INTEGER, REAL, CHARACTER, and STRING. Input in Eiffel occurs in a two-step process. First, a value is read from the terminal keyboard using a command of the form io.readX and stored in an internal buffer (of type X). Second, a query of the form io.lastX is used to get the value from the buffer.
instruction: a complete, executable line of code.
iteration: the use of a loop to repeat one or more actions. The statements in the loop body are executed until the exit condition evaluates to true, when the loop terminates and control flows to the statement after the loop. The value of the exit condition must be initialised before entry into the loop and changed in the loop body, or the loop will not terminate.
keyword: a word with a special meaning in the Eiffel language. A keyword cannot be used as the name of a feature, or in any way other than its special, defined meaning.
listing: a printout of the code.
literal: a value "hard coded" in a line of code, that has no name and is not stored in a variable.
local variable: a variable that is declared in a routine, as a formal argument or as a local declaration. When a routine is executed, local variables are created, used in the routine body, and destroyed (the storage is de-allocated) when the routine exits. The scope of a local variable is its routine, so the variable cannot be used outside the routine. Formal arguments are local variables, but they are normally called formal arguments and the term "local variable" is used to refer to those variables listed in a local declaration.
loop body: the code between the keywords loop and end.
machine code: the basic language of the computer. Each instruction in a source file is translated into the equivalent machine code when a class is compiled, so the executable machine code can be run at some later time. Different types of computer use different machine codes.
name: a feature has a unique name or identifier that names or identifies the feature. If two features have overlapping scopes, then you get a name clash that has to be resolved.
object: an instance of a class. An object has its own data, but all objects (of the same type) have the same behaviour and share the routines in the class definition. Because an object does not exist until the executable system code is run (and storage is allocated for the object's data), we refer to an object as a run-time entity.
operator: an operator takes one or more values and returns a value. A unary operator (such as unary -) takes a single value, and returns a single value. A binary operator (such as *) takes two values and returns a single value. An operator is formally a function and can thus have any number of arguments, but the term is normally used to refer to the "built-in" arithmetic and logical operators.
output: writing a value to an external medium, such as the terminal screen or a file. Output is a command of the form io.putX.
formal argument: the formal arguments to a routine are contained in the argument list in a routine header. The format of a formal argument is a declaration; a routine may have zero or many arguments. When the routine is called, a local variable is created for each formal argument and the formal argument is bound to (given the value of) its actual argument, passed from the routine call.
precedence: an operator has a defined precedence, so operators of higher precedence are evaluated before operators of lower precedence in a flat expression.
procedure: a procedure is one of the two forms of a routine. The behaviour of a procedure is that it changes one or more values, and returns nothing.
query: a query gets a value and changes nothing. In Eiffel, a query may be implemented as a constant, a variable, or a function.
recursion: the use of a routine to repeat one or more actions. A recursive routine is a routine that contains a call to itself in the routine body. A recursive routine has three parts: the action, the recursive call, and the base case. When the recursive routine is called the first time, it does a part of the task (the action) and passes on the reduced task to a new copy of the routine. Recursion continues until the base case is reached and no more calls are made; each version of the routine then returns to its caller, until the original call returns.
reference type: a type whose value is a reference (see basic type).
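The three parts of a recursive routine named in the recursion entry can be sketched with a factorial function. The class name MATHS is hypothetical, and the sketch assumes the caller passes a value of zero or more.

```eiffel
-- A minimal sketch; MATHS is a hypothetical class name.
class MATHS

feature
    factorial (n: INTEGER): INTEGER is
            -- The product 1 * 2 * ... * n, assuming n is not negative
        do
            if n = 0 then
                    -- Base case: no more calls are made
                Result := 1
            else
                    -- Action (multiply by n) and recursive call
                    -- on the reduced task (n - 1)
                Result := n * factorial (n - 1)
            end
        end -- factorial

end -- class MATHS
```

Each call places a new copy of factorial on the calling stack; when the base case is reached, the copies return one by one until the original call returns.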
return value: a function is called, executes, and returns a value to the caller. This value is often described as the return or returned value, that is returned from the call. The function header lists the return type after the colon in the routine header. When the function executes, a local variable called Result is created from the function header, and the returned value is whatever value was assigned to Result inside the function.
root class: the class that starts off your system. The root class is shown at the far left of a client chart.
routine: a routine is a chunk of code that is executed as a single unit; the routine is called, executes, and returns control to the caller. A routine may be a function or a procedure.
routine body: the executable code in the routine, written between the keywords do and end. Formally, the keywords are considered to be part of the routine body.
routine header: the first line of a routine definition. The routine header lists the name of the routine, then any formal arguments in parentheses; there may be no formal arguments. A function then lists the type of the (value returned by the) function; a procedure returns nothing. The routine header ends with the keyword is.
routine header comment: the comment following the routine header. The routine comment describes the behaviour of the routine, and says nothing about how the routine works. Anything to do with the implementation of the routine is hidden in the routine body.
routine trailer comment: the name of the routine, placed as a comment after the routine end.
scope: the scope of a variable is the place where it can be "seen". The default scope of an attribute is its class, so an attribute can be used and changed by any code in that class. If an attribute is exported, then the "use scope" is extended to be the class and its clients but the "set scope" is just the class. The scope of a local variable is its routine.
selection: the choice of one branch of the control flow, when control can split. Eiffel offers two selection instructions, the if and the inspect statements. Inspect can only be used with test values of type INTEGER, CHARACTER, or enumerated types.
semantics: this is a more formal word for "meaning" or "behaviour". When we talk about how something behaves, about exactly what happens when an operator is applied, we talk about the semantics of the operation.
short: a short listing of a class is generated by running short classname and shows the external behaviour of all exported features in the class. Each feature shows its name, any argument names and types, any returned type, and any assertions.
signature: intuitively, the values that are passed to, and received back from, a feature. Formally, the types of the values passed to, and received back from, a feature. The input types are shown first, separated by commas, then there is a semi-colon, then the output types are shown. The signature is one part of the behaviour of a feature.
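The two selection instructions described above, and the way a function returns its value through Result, can be sketched as follows. The class GRADER and its features are hypothetical names used only for illustration.

```eiffel
-- A minimal sketch; GRADER is a hypothetical class name.
class GRADER

feature
    describe (mark: CHARACTER): STRING is
            -- A short description of the grade `mark'
        do
            inspect mark
            when 'A', 'B' then
                Result := "good"
            when 'C' then
                Result := "pass"
            else
                Result := "fail"
            end
        end -- describe

    larger (a, b: INTEGER): INTEGER is
            -- The larger of `a' and `b'
        do
            if a >= b then
                Result := a
            else
                Result := b
            end
        end -- larger

end -- class GRADER
```

In the notation of the signature entry, describe has the signature CHARACTER; STRING and larger has the signature INTEGER, INTEGER; INTEGER.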
source file: a file that contains the code for a class in character format. For a class named THIS, for example, the source or text file must have the name this.e.
stack trace: a printout of the calling stack with the current routine at the top of the trace, showing the name of any violated assertions.
supplier: intuitively, a class that exports or supplies one or more features to a client. Formally, a supplier relationship is the inverse of the client link so it is defined by a declaration of the supplier type in the client.
system: a set of classes connected by client and inheritance links.
type: the class of an object. In Eiffel, every type is a class. As an example, Eiffel classes exist for INTEGER, BOOLEAN, CHARACTER, ARRAY, and so on. The only exception occurs for generic classes such as ARRAY, which can generate many types (such as an array of integers, an array of characters, an array of points, and so on).
value: a value can be stored in a variable, written as a literal in the code, passed as an actual argument, or returned as the value of a function. The process of calculating a value is called evaluation. An expression evaluates to some value, and that value can then be used immediately or stored in a variable. A function returns a value, so we say that a function evaluates to some value. Every value has a type. An expression, a function, a variable, a constant, and a literal evaluate to a value.
variable: a variable is used to store data in the computer, and the code (instructions) uses and changes the value of variables. A variable has three parts: a name or identifier, a type, and a value; a variable declaration specifies the name and type of a variable. A variable that is declared as a feature of a class is called an attribute, and a variable declared within a routine is called a local variable.
Void: the initial value of a reference type. When an object is created, the value of Void is replaced by a reference to the object. If you try to call a feature on an object that has not been given a reference, Eiffel does not have an object to use and tells you that you have a Void reference.
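The Void entry above can be illustrated with a reference attribute that is tested before and after creation. The class name VOID_DEMO is hypothetical; the sketch assumes the library features io.putstring, io.new_line, and the STRING features make and append.

```eiffel
-- A minimal sketch; VOID_DEMO is a hypothetical root class.
class VOID_DEMO

creation
    make

feature
    name: STRING
            -- A reference; its initial value is Void

    make is
            -- Show that `name' is Void until an object is created
        do
            if name = Void then
                io.putstring ("name is Void%N")
            end
            !!name.make (10)
                -- `name' now refers to a new, empty STRING object
            name.append ("Eiffel")
            io.putstring (name)
            io.new_line
        end -- make

end -- class VOID_DEMO
```

Calling name.append before the creation command would be a call on a Void reference, because there is no object to use.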
F.2 Inheritance, genericity, and assertion terms

ancestor: a class that is inherited by another class. The inherited class may be directly inherited (the parent class) or indirectly inherited through a parent class.
assertion: a statement of belief; more formally, a boolean expression that asserts something is true and can be tested. The main forms of assertions in Eiffel are preconditions, postconditions, and class invariants; assertions can also be used for loop invariants and variants, plus check instructions. An assertion may have a label; this label is shown on the stack trace when the assertion fails.
assignment attempt: an assignment that succeeds if the values conform. If the right-hand side of the assignment conforms to the left-hand side, then the assignment succeeds; if it does not, then the value Void is assigned.
behaviour: what something does, the way something appears when viewed from the outside. Eiffel makes a strict distinction between the outside view of an entity (its behaviour) and what happens, if you like, inside the box (the implementation). The behaviour of a feature is defined by its name, signature, and assertions. The behaviour of a class is defined by its set of exported features.
child: a class that inherits another, parent class. The child class lists the parent class under the child class's inherit keyword.
class invariant: an assertion that must be true whenever the object is in a stable state; that is, between routine calls. A class invariant must be true on entry to each routine in the class, and on exit from each routine.
conform: a variable of type D conforms to a variable of type A if either they are of the same type (D and A are the same type), or D is a descendant of A. Assignment and argument binding only work if the variable to be assigned (bound) conforms to the target (the variable on the left of the assignment, or the formal argument). For generic classes, both the generic class and the parameter must conform for substitution to work.
constrained generic class: a generic class that can only take parameters of a specified type. The constraint is given in the generic class header of the form class X [P -> T], so any actual parameter passed to X and bound to P must be of type T; more formally, it must conform to type T. Constrained genericity allows a constrained class to assume that any actual parameters have some known feature.
contract: see programming by contract.
deferred feature: a feature whose body is deferred at this level in the inheritance chart. In the routine body, the keyword do is replaced by the keyword deferred and the routine body is empty. A class containing a deferred feature cannot be created, and must be declared a deferred class. A child class must ultimately provide an effective routine body.
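A deferred feature, its effective version in a child, and a constrained generic class that relies on the feature can be sketched together. The class names SHAPE, SQUARE, and PAIR are hypothetical, and each class is assumed to be stored in its own source file.

```eiffel
-- File shape.e: a deferred class with one deferred feature
deferred class SHAPE

feature
    area: REAL is
            -- The area of the shape
        deferred
        end -- area

end -- class SHAPE

-- File square.e: a child that makes `area' effective
class SQUARE

inherit
    SHAPE

creation
    make

feature
    side: REAL

    make (s: REAL) is
            -- Set the side length to `s'
        do
            side := s
        end -- make

    area: REAL is
            -- Effective version of the deferred feature
        do
            Result := side * side
        end -- area

end -- class SQUARE

-- File pair.e: a constrained generic class
class PAIR [P -> SHAPE]

feature
    first, second: P

    set (a, b: P) is
            -- Store the two shapes
        do
            first := a
            second := b
        end -- set

    total_area: REAL is
            -- Sum of both areas; `set' must have been called first.
            -- The call to `area' is legal because every actual
            -- parameter must conform to SHAPE.
        do
            Result := first.area + second.area
        end -- total_area

end -- class PAIR
```

A declaration such as PAIR [SQUARE] is valid because SQUARE conforms to SHAPE, while PAIR [INTEGER] would be rejected by the compiler.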
deferred class: a class that contains at least one deferred feature. The class header must be of the form deferred class X.
descendant: a class that inherits another class. The inheriting class may directly inherit (the child class) or indirectly inherit through a child class.
effective feature: a feature that can be executed (see deferred feature).
flat listing: a short listing that shows the exported features of a class, as well as the exported features of any parent classes.
generic class: a container class such as ARRAY or LINKED_LIST. A generic class allows a series of objects of any type to be stored in the generic class, and makes no assumptions about the internal structure of those objects. The parameter to the generic class (INTEGER in an ARRAY [INTEGER]) is the class given in the array declaration and is bound at compile time. A generic class can thus generate many types, one for each type of parameter, such as ARRAY [TREE], ARRAY [STRING] and so on.
immediate feature: a feature whose effective definition is in the class under examination. A deferred feature is made effective by an immediate definition.
inherit: one class can inherit another class, and add more features of its own. The inheriting class is called the child class, and the inherited class is called the parent class. When a child inherits, then all the features of the parent are features of the child. A client of the child cannot tell if a feature was provided by code in the child, or by code in the parent. The child can change the signature, body, or export policy of a parent feature. Inheritance is transitive, so we can build up inheritance hierarchies at many levels, not just two (parent and child). Multiple and repeated inheritance are allowed in Eiffel.
original feature: the first effective version of a feature, going down the inheritance hierarchy.
parameter: the class passed to a generic class in the generic class declaration. The actual parameter is defined at compile time, and is bound to a formal parameter in the class header. A generic class may have multiple parameters.
parent: a class directly inherited by another class. A parent class does not know who will inherit it.
loop invariant: an assertion defined to be true while the loop is executing.
loop variant: a non-negative integer expression that decreases each time the loop body is executed, and so shows that the loop will eventually terminate.
precondition: a routine assertion that describes what must be true when the routine is called. Any preconditions are listed under the require keyword before the routine body. Preconditions are tested after arguments are bound, but before the routine body is executed. When a precondition fails, the name of the failing assertion is shown on the calling stack trace. Preconditions are half the mechanism of programming by contract.
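A labelled precondition, a loop invariant, and a loop variant might be written as in the following sketch. The class name SUMMER and its feature are hypothetical; the loop adds the integers from 1 to n.

```eiffel
-- A minimal sketch; SUMMER is a hypothetical class name.
class SUMMER

feature
    sum_to (n: INTEGER): INTEGER is
            -- The sum 1 + 2 + ... + n
        require
            not_negative: n >= 0
        local
            i: INTEGER
        do
            from
                i := 0
            invariant
                partial_sum: Result = i * (i + 1) // 2
            variant
                n - i
            until
                i = n
            loop
                i := i + 1
                Result := Result + i
            end
        end -- sum_to

end -- class SUMMER
```

The label not_negative is the name that would appear on the stack trace if a caller passed a negative value. The variant n - i starts at n, drops by one on each pass, and reaches zero exactly when the exit condition becomes true.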
postcondition: a routine assertion that describes what must be true when the routine exits. Any postconditions are listed under the ensure keyword after the routine body. Postconditions are tested after the routine body is executed, but before control is returned to the caller. When a postcondition fails, the name of the failing assertion is shown on the calling stack trace. Postconditions are half the mechanism of programming by contract.
programming by contract: a contract is established between the caller and the called routine (often between the client and supplier). The preconditions on a routine define what the caller must do, and it is the responsibility of the caller to make sure the routine is called in the right conditions. The postconditions on a routine define what the called routine must do, when called in the right way. If the caller guarantees the preconditions, then the called routine guarantees the postconditions; that is the contract. If the preconditions fail, then the contract is broken and the called routine guarantees nothing.
programming by subcontract: a child may change the signature or behaviour of an inherited feature. A change in the child is valid only if it still implements the contract of the original routine. The new version of a routine can weaken the precondition (require new_pre or else original_pre) so the new version can be called whenever the original was called, and possibly with other argument values as well. The new version can strengthen the postcondition (ensure new_post and then original_post) so the value returned by the new routine can be used wherever the original was used.
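The contract and subcontract entries above can be sketched with a parent routine and a redefined version in a child. The class names STORE and LOGGED_STORE are hypothetical. In Eiffel source code the weakened precondition is written with the keywords require else and the strengthened postcondition with ensure then, which give the "or else" and "and then" combinations described above.

```eiffel
-- File store.e: the original contract
class STORE

feature
    stock: INTEGER

    add (n: INTEGER) is
            -- Put `n' items into stock
        require
            positive: n > 0
        do
            stock := stock + n
        ensure
            added: stock = old stock + n
        end -- add

    remove (n: INTEGER) is
            -- Take `n' items out of stock
        require
            positive: n > 0
            enough: n <= stock
        do
            stock := stock - n
        ensure
            removed: stock = old stock - n
        end -- remove

end -- class STORE

-- File logged_store.e: a child that keeps the contract
class LOGGED_STORE

inherit
    STORE
        redefine
            remove
        end

feature
    removals: INTEGER
            -- Number of calls to `remove'

    remove (n: INTEGER) is
            -- Take `n' items out of stock and count the call
        require else
            nothing_to_do: n = 0
                -- Weaker: a call with zero is now also accepted
        do
            stock := stock - n
            removals := removals + 1
        ensure then
            counted: removals = old removals + 1
                -- Stronger: the original postcondition still holds,
                -- and the call is counted as well
        end -- remove

end -- class LOGGED_STORE
```

Any caller that satisfied the original precondition still gets the original postcondition, so a LOGGED_STORE can be used wherever a STORE was expected.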