Advances in Industrial Control
Other titles published in this Series:

Digital Controller Implementation and Fragility
Robert S.H. Istepanian and James F. Whidborne (Eds.)

Optimisation of Industrial Processes at Supervisory Level
Doris Sáez, Aldo Cipriano and Andrzej W. Ordys

Robust Control of Diesel Ship Propulsion
Nikolaos Xiros

Hydraulic Servo-systems
Mohieddine Jelali and Andreas Kroll

Strategies for Feedback Linearisation
Freddy Garces, Victor M. Becerra, Chandrasekhar Kambhampati and Kevin Warwick

Robust Autonomous Guidance
Alberto Isidori, Lorenzo Marconi and Andrea Serrani

Dynamic Modelling of Gas Turbines
Gennady G. Kulikov and Haydn A. Thompson (Eds.)

Control of Fuel Cell Power Systems
Jay T. Pukrushpan, Anna G. Stefanopoulou and Huei Peng

Fuzzy Logic, Identification and Predictive Control
Jairo Espinosa, Joos Vandewalle and Vincent Wertz

Optimal Real-time Control of Sewer Networks
Magdalene Marinaki and Markos Papageorgiou

Process Modelling for Control
Benoît Codrons

Computational Intelligence in Time Series Forecasting
Ajoy K. Palit and Dobrivoje Popovic

Modelling and Control of mini-Flying Machines
Pedro Castillo, Rogelio Lozano and Alejandro Dzul
Rudder and Fin Ship Roll Stabilization
Tristan Perez

Hard Disk Drive Servo Systems (2nd Edition)
Ben M. Chen, Tong H. Lee, Kemao Peng and Venkatakrishnan Venkataramanan

Measurement, Control, and Communication Using IEEE 1588
John Eidson

Piezoelectric Transducers for Vibration Control and Damping
S.O. Reza Moheimani and Andrew J. Fleming

Windup in Control
Peter Hippe

Manufacturing Systems Control Design
Stjepan Bogdan, Frank L. Lewis, Zdenko Kovačić and José Mireles Jr.

Nonlinear H2/H∞ Constrained Feedback Control
Murad Abu-Khalaf, Jie Huang and Frank L. Lewis

Modern Supervisory and Optimal Control
Sandor A. Markon, Hajime Kita, Hiroshi Kise and Thomas Bartz-Beielstein
Publication due July 2006

Wind Turbine Control Systems
Fernando D. Bianchi, Hernán De Battista and Ricardo J. Mantz
Publication due August 2006

Soft Sensors for Monitoring and Control of Industrial Processes
Luigi Fortuna, Salvatore Graziani, Alessandro Rizzo and Maria Gabriella Xibilia
Publication due August 2006

Practical PID Control
Antonio Visioli
Publication due November 2006

Magnetic Control of Tokamak Plasmas
Marco Ariola and Alfredo Pironti
Publication due May 2007
Torsten Bohlin
Practical Grey-box Process Identification
Theory and Applications
With 186 Figures
Torsten Bohlin Automatic Control, Signals, Sensors and Systems Royal Institute of Technology (KTH) SE-100 44 Stockholm Sweden
British Library Cataloguing in Publication Data
Bohlin, Torsten, 1931-
Practical grey-box process identification : theory and applications. - (Advances in industrial control)
1. Process control - Mathematical models
2. Process control - Mathematical models - Case studies
I. Title
670.4'27
ISBN-13: 9781846284021
ISBN-10: 1846284023

Library of Congress Control Number: 2006925303

Advances in Industrial Control series ISSN 1430-9491
ISBN-10: 1-84628-402-3
e-ISBN 1-84628-403-1
ISBN-13: 978-1-84628-402-1
Printed on acid-free paper
© Springer-Verlag London Limited 2006

MATLAB® and Simulink® are registered trademarks of The MathWorks, Inc., 3 Apple Hill Drive, Natick, MA 01760-2098, U.S.A. http://www.mathworks.com

Modelica® is a registered trademark of the Modelica Association. http://www.modelica.org/

Dymola™ is a trademark of Dynasim AB, Research Park Ideon, Lund 223 70, Sweden. www.Dynasim.se

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Printed in Germany

9 8 7 6 5 4 3 2 1

Springer Science+Business Media
springer.com
Advances in Industrial Control Series Editors Professor Michael J. Grimble, Professor of Industrial Systems and Director Professor Michael A. Johnson, Professor (Emeritus) of Control Systems and Deputy Director Industrial Control Centre Department of Electronic and Electrical Engineering University of Strathclyde Graham Hills Building 50 George Street Glasgow G1 1QE United Kingdom
Series Advisory Board Professor E.F. Camacho Escuela Superior de Ingenieros Universidad de Sevilla Camino de los Descobrimientos s/n 41092 Sevilla Spain Professor S. Engell Lehrstuhl für Anlagensteuerungstechnik Fachbereich Chemietechnik Universität Dortmund 44221 Dortmund Germany Professor G. Goodwin Department of Electrical and Computer Engineering The University of Newcastle Callaghan NSW 2308 Australia Professor T.J. Harris Department of Chemical Engineering Queen’s University Kingston, Ontario K7L 3N6 Canada Professor T.H. Lee Department of Electrical Engineering National University of Singapore 4 Engineering Drive 3 Singapore 117576
Professor Emeritus O.P. Malik Department of Electrical and Computer Engineering University of Calgary 2500, University Drive, NW Calgary Alberta T2N 1N4 Canada Professor K.-F. Man Electronic Engineering Department City University of Hong Kong Tat Chee Avenue Kowloon Hong Kong Professor G. Olsson Department of Industrial Electrical Engineering and Automation Lund Institute of Technology Box 118 S-221 00 Lund Sweden Professor A. Ray Pennsylvania State University Department of Mechanical Engineering 0329 Reber Building University Park PA 16802 USA Professor D.E. Seborg Chemical Engineering 3335 Engineering II University of California Santa Barbara Santa Barbara CA 93106 USA Doctor K.K. Tan Department of Electrical Engineering National University of Singapore 4 Engineering Drive 3 Singapore 117576 Professor Ikuo Yamamoto Kyushu University Graduate School Marine Technology Research and Development Program MARITEC, Headquarters, JAMSTEC 2-15 Natsushima Yokosuka Kanagawa 237-0061 Japan
To the KTH class of F53
Series Editors’ Foreword
The series Advances in Industrial Control aims to report and encourage technology transfer in control engineering. The rapid development of control technology has an impact on all areas of the control discipline. New theory, new controllers, actuators, sensors, new industrial processes, computer methods, new applications, new philosophies..., new challenges. Much of this development work resides in industrial reports, feasibility study papers and the reports of advanced collaborative projects. The series offers an opportunity for researchers to present an extended exposition of such new work in all aspects of industrial control for wider and rapid dissemination.

Experienced practitioners in the field of industrial control often say that about 70-80% of project time is spent on understanding and modelling a process, developing a simulation and then testing, calibrating and validating the simulation. Control design and investigations then absorb the other 20-30% of the project time; thus, it is perhaps a little surprising that so little is published on the formal procedures and tools for performing these developmental modelling tasks, compared with the provision of simulation software tools. There is a very clear difference between the two types of activity: simulation tools usually comprise libraries of numerical routines and a logical framework for their interconnection, often based on graphical representations like block diagrams, whereas capturing the actual steps needed to arrive at a consistent model replicating observed physical process behaviour is a far more demanding objective. Such is the agenda underlying the inspirational work of Torsten Bohlin reported in his new Advances in Industrial Control monograph, Practical Grey-box Process Identification.
The starting point for this work lies in the task of providing models for a range of industrial production processes, including: baker's yeast production, steel rinsing (the rinsing of moving steel strip in a rolling-mill process), continuous pulp digestion, cement milling, an industrial recovery-boiler process (a pulp-production process unit) and cardboard manufacturing. The practical experience of producing these models supplied the raw data for understanding and abstracting the steps needed in a formal grey-box identification procedure. The project has been active for over 15 years, and over this period the grey-box identification procedure was formulated, tested, re-formulated and so on, until a generic procedure of wide applicability finally emerged.
In parallel with this extraction of the fundamental grey-box identification procedure has been the development of the Process Model Calibrator and Validator software, the so-called MoCaVa software. This contains the tools that implement the general steps of grey-box identification. Consequently, it is based on an holistic approach to process modelling that uses a graphical block-diagram representation but incorporates routines such as loss-function minimisation for model fitting and other statistical tools to allow the testing of model hypotheses. The software has been tested and validated through its use and development with an extensive and broadly based group of individual processes, some of which are listed above.

This monograph captures three aspects of Torsten Bohlin's work in this area. Firstly, there is an introduction to the theory and fundamentals of grey-box identification (Part I) that carefully defines white-box, black-box and grey-box identification. From this emerge the requirements of a grey-box procedure and the need for software to implement the steps. Secondly, there is the MoCaVa software itself. This is available for free download from a Springer website whose location is given in the book. Part II of the monograph is a tutorial introduction and user's guide to the use of the MoCaVa software. For added realism, the tutorial is based on a drum-boiler model. Finally, the experience of the tutorial introduction is put to good use in the two fully documented case studies given as Part III of the monograph. Process engineers will be able to work at their own pace through the model development for a rinsing process for steel strip in a rolling mill and the prediction of quality in a cardboard manufacturing process.
The value of the case studies is two-fold: they provide a clear insight into the procedures of grey-box identification and give in-depth practical experience of using the MoCaVa software for industrial processes; both of these are clearly transferable skills. The Advances in Industrial Control monograph series has often included volumes on process modelling and system identification, but it is believed that this is only the second volume in the series on the generic steps of an holistic grey-box identification procedure. The volume will be welcomed by industrial process control engineers for its insights into the practical aspects of process model identification. Academics and researchers will undoubtedly be inspired by the more generic theoretical and procedural aspects that the volume contributes to the science and practice of system identification.

M.J. Grimble and M.A. Johnson
Industrial Control Centre
Glasgow, Scotland, U.K.
Preface
Those who have tried the conventional approaches to making mathematical models of industrial production processes have probably also experienced the limitations of the available methods. They must either build the models from first principles, or else apply one of the 'black-box' methods based on statistical estimation theory. Both approaches work well under the circumstances for which they were designed, and both have the advantage that well-developed tools exist to facilitate the work. Generally, the modelling tools based on first principles have their applications in electrical, mechanical, and hydrodynamical systems, where much is known about the principles governing such systems. In contrast, the statistical methods have their applications where little is known in advance, or where detailed knowledge is irrelevant for the purpose of the modelling, typically the design of feedback control. In modelling for the process industry, however, prior knowledge is typically partial, the effects of unknown input ('disturbances') are not negligible, and reproducibility of the model is desirable, for instance for the monitoring of unmeasured variables, for feed-forward control, or for long-range prediction of variables with much-delayed responses to control action. Conceivably, 'grey-box' identification, a hybrid of the two approaches, would help the situation by exploiting both available sources of information, namely i) such invariant prior knowledge as may be available, and ii) response data from experiments. Thus, grey-box methods have their application whenever there is some invariant prior knowledge of the process, and it would be a waste of information not to use it.

After the first session on grey-box identification at the 4th IFAC Symposium on Adaptive Systems in Control and Signal Processing in 1992, and the first special issue of Int. J. Adaptive Control and Signal Processing in 1994, the approach has become reasonably well accepted as a paradigm for addressing the practical problems of modelling physical processes. There are now quite a number of publications, most about special applications. (A Google search for "Grey box model" in 2005 gave 691 hits.) However, the problems of designing tools for grey-box identification are many. Chiefly, prior knowledge of industrial processes is usually diversified and ill adapted to the purpose of model making. It is in the nature of things that prior knowledge is more or less precise, reliable, and relevant (it may even be false). This raises a number of fundamental questions, in addition to the practical problems: How can I make use of what I do know? How much of my prior knowledge is useful, and even correct, when used in the particular environment? What do I do about the unknown disturbances I cannot get rid of? Are my experiment data sufficient and relevant? How do I know when the model is good enough?
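The hybrid idea admits a compact illustration. The sketch below is a hypothetical toy example (not MoCaVa code, and not the procedure developed in this book): prior knowledge fixes the model structure, a draining tank with known cross-section obeying dh/dt = (q_in - c*sqrt(h))/A, while the one unknown physical coefficient c is calibrated from noisy response data by minimizing a quadratic loss, as a statistical method would.

```python
# Grey-box sketch: known ODE structure, one unknown parameter fitted to data.
# All names and numbers here are illustrative assumptions.
import math
import random

A = 2.0           # tank cross-section [m^2], assumed known a priori
C_TRUE = 0.8      # "true" outflow coefficient, to be recovered from data
DT, N = 0.1, 300  # Euler step [s] and number of samples


def simulate(c, q_in=1.0, h0=0.5):
    """Euler-integrate the first-principles structure for a candidate c."""
    h, out = h0, []
    for _ in range(N):
        h += DT * (q_in - c * math.sqrt(max(h, 0.0))) / A
        out.append(h)
    return out


# Synthetic 'experiment': the true process plus measurement noise.
random.seed(1)
data = [h + random.gauss(0.0, 0.01) for h in simulate(C_TRUE)]


def loss(c):
    """Sum of squared output errors for candidate parameter c."""
    return sum((y - h) ** 2 for y, h in zip(data, simulate(c)))


# Crude calibration: golden-section search over the single free parameter.
lo, hi = 0.1, 2.0
g = (math.sqrt(5) - 1) / 2
for _ in range(60):
    m1, m2 = hi - g * (hi - lo), lo + g * (hi - lo)
    if loss(m1) < loss(m2):
        hi = m2
    else:
        lo = m1
c_hat = 0.5 * (lo + hi)
print(f"estimated c = {c_hat:.3f} (true value {C_TRUE})")
```

Because the structure is fixed by prior knowledge, the search runs over one physically meaningful parameter instead of the many coefficients a pure black-box model would need, and the fitted c retains its interpretation outside the experiment.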
It was the desire to find answers to these questions that initiated a long-range project at the Automatic Control department of KTH. The present book is based on the results of that project. It stands on three legs: i) A theoretical investigation of the fundamentals of grey-box identification. It revealed that sufficiently many theoretical principles were available in the literature to answer the questions that needed answering. The compilation was published in a book (Bohlin, 1991a), which ended with a number of procedures for doing grey-box identification properly. ii) A software tool, MoCaVa (Process Model Calibrator & Validator), based on one of the procedures (Bohlin and Isaksson, 2003). iii) A number of case studies of grey-box identification of industrial processes. They were carried out to see whether the theoretical procedure would also be a practical one, and to test the software being developed in parallel. Most case studies were done by PhD students at the department under the supervision of the author; the extent of the work was roughly one thesis per case.

This book focuses on the software and the case studies. Thus it serves as a manual to MoCaVa, as well as an illustration of how to apply MoCaVa efficiently. Success in grey-box identification, as in other design work, will no doubt depend on the skill of the craftsman using the tool, and I believe that skill is best gained by exercise, and that case studies are a good introduction. In addition, there is a 'theory' chapter with the purpose of describing the basic deliberations, derivations, and decisions behind MoCaVa and the way it is constructed. The purpose is to provide additional information to anyone who wants to understand more of its properties than is revealed in the user's manual. This may help the user to appraise the strengths and weaknesses of the program, either in order to do the same with the models that come out of it, or even to develop MoCaVa further.
(The source code can be downloaded from Springer.) The focus is therefore on the applicability of the theories for the purposes of MoCaVa, rather than on the theories themselves. Still, the chapter involves some non-elementary mathematics, but mathematics stripped of the painstaking exactness of strict mathematics. This too is motivated by a kind of 'grey-box thinking', this time to try to bridge the notorious gap between theory and practice. It would be futile to try to adhere to the code of strict mathematics when dealing with problems that cannot be solved in that way and that, in addition, are meant to be understood by readers who are not used to strict mathematics. Conversely, it would be impractical to try to solve all problems of grey-box identification by relying on intuition and reasoning alone, however clever. Therefore, the mathematics is interpreted in intuitive terms, and necessary approximations are motivated in the same way, whenever the mathematical problems become insurmountable, or an exact solution would take prohibitively long for a computer to process. The following is one of my favorite quotations: "The man thinks. The theory helps him to think, and to maintain his thinking consistent in complex situations" (Peterka).

The method presented in this book for building grey-box models of physical objects has three kinds of support: a systematic procedure to follow, a software package for carrying it out, and case studies for learning how to use it. Part I motivates and describes the procedure and the MoCaVa software. Part II is a tutorial on the use of MoCaVa based on simple examples. Part III contains two extensive case studies of full-scale industrial processes.
How to Use this Book

Successful grey-box identification of industrial processes requires knowledge of two kinds: i) how the process works, and ii) how the software works. Since this knowledge does not normally reside within the same person, two people must contribute. Call them the "process engineer" and the "model designer". The latter should preferably have taken a course in process identification.

Part I is for the "model designer", who needs to understand how the MoCaVa software operates in order to appreciate its limitations: what it can and cannot do.

Part II is for both. It is a tutorial on running MoCaVa, written for anyone who actually wants to build a grey-box model. It is also useful as an introduction to the case studies, since it is based on two simple examples that run throughout.

Part III is also for both. It develops the case studies in some detail, highlighting the contributions of the three 'actors' in the session, viz. the engineer, the model designer/program operator, and the MoCaVa program. The technical details in Part III are probably of interest only to those working in the relevant businesses (steel or paper and pulp), but they are still important as illustrations of the issues that must be considered in practical grey-box process identification.

The style of Parts II and III deviates somewhat from what is customary in textbooks, namely to use sentences in passive form, free of an explicit subject. The idea behind the customary practice is that science and engineering statements should be valid irrespective of the subject. Unfortunately, the custom is devastating for understanding when describing processes in which several subjects are indeed involved: "who does what" becomes crucial. Therefore, Part II is written more like a user's manual.
In describing grey-box identification practice there are, logically, no fewer than five 'actors' involved:

- The customer/engineer (providing the prior information about the physical process and appraising the result).
- The model designer/user of the program tools (often the same person as the customer, but not if he/she lacks sufficient knowledge of the physical process to be modelled).
- The computer and program (analyzing the evidence of the data).
- The author of this book (trying to reason with a reader).
- The reader of the book (trying to understand what the author tries to say).

In order to reduce the risk of confusion when describing a grey-box identification session, a process that involves at least the first three actors, the following convention is used in the book: the contributions of the different actors are marked with symbols at the beginning of the paragraph, one for the operator (doing the key pressing and mouse clicking), one for MoCaVa (computing and displaying the results), and one for the model builder (watching the screen, deliberating, and occasionally calculating on paper). It will no doubt help the reader who wants to follow the examples on a computer that the operator symbol states explicitly what to do at each moment, while the MoCaVa symbol points to the expected response. There are also paragraphs without an initiating symbol; they have the ordinary rôle of the author talking to a reader.

Also by convention, Courier fonts are used for code, as well as for variables that appear in the code and for names of submodels, files, and paths. HelveticaNarrow is used for user-communication windows and for labels that appear in screen images.
The book uses a number of special terms and concepts of relevance to process identification. Some, but not all, will be well known or self-explanatory to model designers. The "Glossary of Terms" contains short definitions, without mathematics, some with clarifying examples. The list serves the same purpose as the 'hypertext' function in HTML documentation, although less conveniently. The contents of Part II are also available in HTML format. This form has the well-known advantage that explanations of key concepts become available at a mouse click, and only if needed. In the printed Part II, explanations appear either under the headers Help or Hints, or else as references to sections in the appendix, which unavoidably means either wading through a mass of text (that can possibly be skipped) or looking up the appropriate sections in the appendix. In order to reduce the length of Part II, the number of printed screen images is also smaller than in the HTML document.

MoCaVa is downloadable from www.springer.com/1-84628-402-3, together with all material needed for running the case studies. (The package also contains the HTML manual as well as on-line help facilities.) This offers a possibility to get more direct experience of the model-design session. It would therefore be possible to use Parts II and III as study material for a course in grey-box process identification.
Acknowledgements

The author is indebted to the following individuals who participated in the Grey-box development project: Stefan Graebe, who wrote the first C version of the IdKit tool box, and later participated in the Continuous Casting case study. James Sørlie, who investigated possible interfaces to other programs. Bohao Liao, who investigated search methods. Ning He, who investigated real-time evaluation of likelihood. Anders Hasselkvist, who wrote Predat. Tomas Wenngren, who wrote the first GUI. Germund Mathiasson and Jiri Uosukainen, who wrote the first version of Validate. Olle Ehrengren, who wrote the first version of Simulate. Ping Fan, who did the Baker's Yeast case study. Björn Sohlberg, who did the first Steel Rinsing case study. Jonas Funkquist, who did the Pulp Digester case study. Oliver Havelange, who did the Cement Milling case study. Jens Pettersson, who did the second Cardboard case study. Ola Markusson, who did the EEG-signals case study. Bengt Nilsson, who contributed process knowledge to the Cardboard case study. Jan Erik Gustavsson, who contributed process knowledge to the Recovery Boiler case study. Alf Isaksson, who participated in the Pulp Refiner and Drive Train cases, and headed the MoCaVa project between 1998 and 2001. Linus Loquist, who designed the MoCaVa home page.
Contents
Part I Theory of Grey-box Process Identification
1 Prospects and Problems  3
  1.1 Introduction  3
  1.2 White, Black, and Grey Boxes  4
    1.2.1 White-box Identification  5
    1.2.2 Black-box Identification  6
    1.2.3 Grey-box Identification  10
  1.3 Basic Questions ...  13
    1.3.1 Calibration  14
    1.3.2 How to Specify a Model Set  15
  1.4 ... and a Way to Get Answers  17
  1.5 Tools for Grey-box Identification  18
    1.5.1 Available Tools  18
    1.5.2 Tools that Need to Be Developed  21

2 The MoCaVa Solution  23
  2.1 The Model Set  23
    2.1.1 Time Variables and Sampling  24
    2.1.2 Process, Environment, and Data Interfaces  25
    2.1.3 Multi-component Models  27
    2.1.4 Expanding a Model Class  29
  2.2 The Modelling Shell  31
    2.2.1 Argument Relations and Attributes  34
    2.2.2 Graphic Representations  37
  2.3 Prior Knowledge  41
    2.3.1 Hypotheses  42
    2.3.2 Credibility Ranking  43
    2.3.3 Model Classes with Inherent Conservation Law  43
    2.3.4 Modelling 'Actuators'  44
    2.3.5 Modelling 'Input Noise'  46
    2.3.6 Standard I/O Interface Models  49
  2.4 Fitting and Falsification  51
    2.4.1 The Loss Function  52
    2.4.2 Nesting and Fair Tests  54
    2.4.3 Evaluating Loss and its Derivatives  55
    2.4.4 Predictor  56
    2.4.5 Equivalent Discrete-time Model  56
  2.5 Performance Optimization  57
    2.5.1 Controlling the Updating of Sensitivity Matrices  58
    2.5.2 Exploiting the Sparsity of Sensitivity Matrices  59
    2.5.3 Using Performance Optimization  60
  2.6 Search Routine  62
  2.7 Applicability  65
    2.7.1 Applications  65
    2.7.2 A Method for Grey-box Model Design  67
    2.7.3 What is Expected from the User?  68
    2.7.4 Limitations of MoCaVa  69
    2.7.5 Diagnostic Tools  69
    2.7.6 What Can Go Wrong?  71
Part II Tutorial on MoCaVa
3 Preparations  77
  3.1 Getting Started  77
    3.1.1 System Requirements  77
    3.1.2 Downloading  77
    3.1.3 Installation  77
    3.1.4 Starting MoCaVa  78
    3.1.5 The HTML User's Manual  78
  3.2 The 'Raw' Data File  78
  3.3 Making a Data File for MoCaVa  78

4 Calibration  83
  4.1 Creating a New Project  83
  4.2 The User's Guide and the Pilot Window  85
  4.3 Specifying the Data Sample  86
    4.3.1 The Time Range Window  86
  4.4 Creating a Model Component  88
    4.4.1 Handling the Component Library  89
    4.4.2 Entering Component Statements  90
    4.4.3 Classifying Arguments  92
    4.4.4 Specifying I/O Interfaces  95
    4.4.5 Specifying Argument Attributes  98
    4.4.6 Specifying Implicit Attributes  100
    4.4.7 Assigning Data  100
  4.5 Specifying Model Class  101
  4.6 Simulating  103
    4.6.1 Setting the Origin of the Free Parameter Space  103
    4.6.2 Selecting Variables to be Plotted  104
    4.6.3 Appraising Model Class  105
  4.7 Handling Data Input  106
  4.8 Fitting a Tentative Model Structure  107
    4.8.1 Search Parameters  108
    4.8.2 Appraising the Search Result  111
  4.9 Testing a Tentative Model Structure  113
    4.9.1 Appraising a Tentative Model  116
    4.9.2 Nesting  118
    4.9.3 Interpreting the Test Results  119
  4.10 Refining a Tentative Model Structure  121
  4.11 Multiple Alternative Structures  122
  4.12 Augmenting a Disturbance Model  124
  4.13 Checking the Final Model  132
  4.14 Terminals and 'Stubs'  134
  4.15 Copying Components  135
  4.16 Effects of Incorrect Disturbance Structure  138
  4.17 Exporting/Importing Parameters  140
  4.18 Suspending and Exiting  141
    4.18.1 The Score Table  142
  4.19 Resuming a Suspended Session  143
  4.20 Checking Integration Accuracy  143

5 Some Modelling Support  147
  5.1 Modelling Feedback  147
    5.1.1 The Model Class  148
    5.1.2 User's Functions and Library  153
  5.2 Rescaling  154
  5.3 Importing External Models  159
    5.3.1 Using Dymola™ as Modelling Tool for MoCaVa  160
    5.3.2 Detecting Over-parametrization  166
    5.3.3 Assigning Variable Input to Imported Models  170
    5.3.4 Selective Connection of Arguments to Dymola™ Models  173
Part III Case Studies
6 Case 1: Rinsing of the Steel Strip in a Rolling Mill ... 185
6.1 Background ... 185
6.2 Step 1: A Phenomenological Description ... 185
6.2.1 The Process Proper ... 185
6.2.2 The Measurement Gauges ... 188
6.2.3 The Input ... 189
6.3 Step 2: Variables and Causality ... 189
6.3.1 The Variables ... 189
6.3.2 Cause and Effect ... 190
6.3.3 Data Preparation ... 191
6.3.4 Relations to Measured Variables ... 192
6.4 Step 3: Modelling ... 194
6.4.1 Basic Mass Balances ... 194
6.4.2 Strip Input ... 201
6.5 Step 4: Calibration ... 203
6.6 Refining the Model Class ... 206
6.6.1 The Squeezer Rolls ... 206
6.6.2 The Entry Rolls ... 211
6.7 Continuing Calibration ... 213
6.8 Refining the Model Class Again ... 215
6.8.1 Ventilation ... 215
6.9 More Hypothetical Improvements ... 217
6.9.1 Effective Mixing Volumes ... 217
6.9.2 Avoiding the Pitfall of 'Data Description' ... 219
6.10 Modelling Disturbances ... 222
6.10.1 Pickling ... 222
6.10.2 State Noise ... 223
6.11 Determining the Simplest Environment Model ... 225
6.11.1 Variable Input Acid Concentration ... 225
6.11.2 Unexplained Variation in Residual Acid Concentration ... 225
6.11.3 Checking for Possible Over-fitting ... 229
6.11.4 Appraising Roller Conditions ... 233
6.12 Conclusions from the Calibration Session ... 233

7 Case 2: Quality Prediction in a Cardboard Making Process ... 235
7.1 Background ... 235
7.2 Step 1: A Phenomenological Description ... 235
7.3 Data Preparation ... 237
7.4 Step 2: Variables and Causality ... 244
7.4.1 Relations to Measured Variables ... 247
7.5 Step 3: Modelling ... 248
7.5.1 The Bending Stiffness ... 248
7.5.2 The Paper Machine ... 253
7.5.3 The Pulp Feed ... 260
7.5.4 Control Input ... 262
7.5.5 The Pulp Mixing ... 265
7.5.6 Pulp Input ... 267
7.5.7 The Pulp Constituents ... 269
7.6 Step 4: Calibration ... 271
7.7 Expanding the Tentative Model Class ... 279
7.7.1 The Pulp Refining ... 279
7.7.2 The Mixing-tank Dynamics ... 284
7.7.3 The Machine Chests ... 287
7.7.4 Filtering the "Kappa" Input ... 289
7.8 Checking for Over-fitting: The SBE Rule ... 290
7.9 Ending a Calibration Session ... 293
7.9.1 'Black-box' vs 'White-box' Extensions ... 293
7.9.2 Determination vs Randomness ... 294
7.10 Modelling Disturbances ... 295
7.11 Calibrating Models with Stochastic Input ... 296
7.11.1 Determination vs Randomness Revisited ... 299
7.11.2 A Local Minimum ... 304
7.12 Conclusions from the Calibration Session ... 306
Appendices
A Mathematics and Algorithms ... 313
A.1 The Model Classes ... 313
A.2 The Loss Derivatives ... 316
A.3 The ODE Solver ... 317
A.3.1 The Reference Trajectory ... 317
A.3.2 The State Deviation ... 318
A.3.3 The Equivalent Discrete-time Sensitivity Matrices ... 318
A.4 The Predictor ... 321
A.4.1 The Equivalent Discrete-time Model ... 322
A.5 Mixed Algebraic and Differential Equations ... 322
A.6 Performance Optimization ... 326
A.6.1 The SensitivityUpdateControl Function ... 327
A.6.2 Memoization ... 330
A.7 The Search Routine ... 330
A.8 Library Routines ... 331
A.8.1 Output Conversion ... 331
A.8.2 Input Interpolators ... 331
A.8.3 Input Filters ... 334
A.8.4 Disturbance Models ... 335
A.9 The Advanced Specification Window ... 337
B.2.1 Optimization for Speed ... 337
B.2.2 User's Checkpoints ... 338
B.2.3 Internal Integration Interval ... 338
B.2.4 Debugging ... 339
Glossary ... 341
References ... 345
Index ... 349
1
Prospects and Problems
1.1 Introduction

The task of making a mathematical model of a physical object, such as an industrial process, involves a diversity of problems. Some of these have traditionally been the subject of theoretical research and software development. One such problem is "system identification", typically defined as follows: "Given a parametric class of models, find the member that fits given experiment data with the minimum loss according to a given criterion" (Ljung, 1987). Now, the three "given" conditions concern anyone who intends to apply the result, whether it comes in the form of theory, method, or computer program. Sometimes "given" means that prerequisites are built into the software, sometimes that they are expected as input from the user of the software. When one is faced with a given object instead, and possibly also with a given purpose for the model, it is certainly not obvious how to get the answers to the questions posed by identification software. It is therefore important that developers of such software do what they can to facilitate the answering.

It is not necessarily a desirable ambition to make the software more automatic by demanding less from the user. He or she is still responsible for the quality of the result, and any input that a user is able to provide, but is not asked for, may be a waste of information and reduce the quality of the model. A better goal is therefore to make the software demand its input in a form that the user can supply more easily. Secondly, user input (both prior knowledge and experiment data) is often uncertain, irrelevant, contradictory, or even false. A second goal for the software designer is therefore to provide tools for appraising the user's input. Admittedly, any software must take something as 'given', but it makes a difference whether the software wants assumptions, taken for facts, or just hypotheses that will be subjected to tests. This motivates the decision to base MoCaVa on the 'grey-box' approach.
The general and somewhat vague idea of grey-box identification is that when one is making a mathematical model of a physical object, there are two sources of information, namely response data and prior knowledge. Grey-box identification methods are those that can use both. In practice, "prior knowledge" means different things, and generally it is not easy to reconcile with the form of the models assumed by a particular identification method. In fact, each method starts by assuming a model class, and each model class requires its particular form of prior knowledge. What one can generally do in order to take prior knowledge into account is to start with a versatile class of models, for which there are general tools available for analysis and identification, and try
to adapt its freedom, its 'design parameters' (i.e., the specifications one has to enter into the identification program), to the prior knowledge. This means that 'grey-box identification methods' tend to be as numerous and as diversified as the conventional identification methods, which also start from given classes of models. This makes it hard to delimit grey-box identification from other kinds of identification, and also to make a survey of 'grey-box identification methods'. Nor is that the purpose of this chapter. Instead, the purpose is to survey the fundamentals on which the MoCaVa software is based. A user of the program will conceivably benefit from an understanding of the purposes of the operations performed by the various routines in the program. Generally, MoCaVa is constructed by specializing and codifying the general concepts used in (Bohlin, 1991a) and following one of the procedures derived in that book. In addition, the chapter will briefly discuss the prospects and problems of developing grey-box identification software further.
1.2 Black, White, and Grey Boxes

Commercially available tools for making mathematical models of dynamic processes are of two kinds, with different demands on the user. On the one hand there are modelling tools, generally associated with simulation software (e.g., Dymola®, http://www.dynasim.se/www/Publications.pdf ), which require the user to provide a complete specification of the equations governing the process, either expressed as statements written in some modelling language, such as Modelica® (Tiller, 2001), or by connecting components from a library. This alternative may be supported by combining the modelling tools with tools for parameter optimization (e.g., HQP, http://sourceforge.net/projects/hqp). Call this "white-box" identification. On the other hand there are "black-box" system identification tools (e.g., the MATLAB® System Identification Toolbox), which require the user to accept one of the generic model structures (e.g., linear) and then to determine which tools to use in the particular case, and in what order, as well as the values of a number of design parameters (order numbers, weighting factors, etc.). Finally, the user must interpret the resulting model, which is expressed in a form that is not primarily adapted to the physical object. Unless the model is to be used directly for the design of feedback control, there is some further translation to do.

Generally, the user has two sources of information on which to base the model making: prior knowledge and experiment data. "White-box" identification uses mainly one source and "black-box" identification the other. The strength of "white-box" identification is that it allows the user to exploit invariant prior knowledge. Its weakness is its inability to cope with the unknown and with random effects in the object and its environment. The latter is the strength of "black-box" identification based on statistical methods, but it also means that the reproducibility of its results may be in doubt.
In essence, "black-box" identification produces 'data descriptions', and repeating the experiment may well produce a much different model. This may or may not be a problem, depending on what the model is to be used for. The idea of "grey-box" identification is to use both sources, and thus to combine the strengths of the two approaches in order to reduce the effects of their weaknesses.

Following Ljung's definition of "system identification", and regardless of the 'colour' of the 'box', the designer of a model of a physical object must do two things: i) specify a class of models, and ii) fit its free elements to data. Call this "Modelling" and "Fitting". A method with a darker shade of 'grey' uses less prior knowledge to delimit the model class. Even if most available identification methods tend to be more or less 'grey', the following notation allows a formal distinction between the generic 'white', 'black', and 'grey box' approaches to model design.

1.2.1 White-box Identification

Since both the model class definition and the fitting procedure are implemented as algorithms, they can be described formally as functions:

  Model:    F(u_t, θ) → z(t|θ)        (1.1)
  Fitting:  min_θ E[y_N, z_N(θ)]      (1.2)
The model designer specifies the model class F, which may contain a given number of unknown parameters θ. Given a control sequence u_t (where the subscript denotes the input history from some initial time up to the present time t) and the parameter vector θ, a simulation program allows the computing of the model's response z(t|θ) at any time. Any unknown parameters θ are then estimated by applying an optimization program minimizing the deviation between measured response data y_N and those components of the model's output z_N that correspond to the measured values. The deviation is measured by a given loss function E. The latter is usually a sum of squared instantaneous deviations, but various filtering schemes may be used to suppress particular types of data contamination.

The following are some well-known obstacles to designing "white boxes" in practice:

- Unknown relations between some variables: Engineers often do not have the complete mathematical knowledge of the object needed to write a simulation model.
- Too many relations for convenience: When they do have the knowledge, the result is often a model too complex to simulate with the ease required for parameter fitting. Many physical phenomena are describable only by partial differential equations. Simulation would then require supercomputers, and identification an order of magnitude more. (Car and airplane designers could possibly afford the luxury.)
- Unknown complexity: It falls solely on the designer to determine how much of the known relations to include in the model.
- Sensitivity to low-frequency disturbances: Comparing the output of deterministic models with data in the presence of low-frequency disturbances generally gives poor parameter estimates.
- Primitive validation: Using only literature values for parameters, or determining some of them from separate experiments, in order to avoid the cumbersome calibration of a complex model and the usually expensive experimentation on a large process, makes it all the more difficult to validate the model.

Remark 1.1. The sensitivity to disturbances can sometimes be reduced by clever design of the loss function. This requires some prior information on the object's environment.

Example 1.1
Consider a cylindrical tank with cross-section area A filled with liquid of density ρ up to a level z, under pressure p, and having a free outlet at the bottom with area a. The
tank is replenished with the volume flow f. According to Bernoulli's law, the variations in the level will be governed by the following differential equation:

  dz/dt = −a √(zg + p/ρ) + f/A     (1.3)
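The white-box procedure of Equations 1.1 and 1.2 applied to this tank can be sketched as follows in Python. This is an illustration only, not part of MoCaVa; the parameter values, input profiles, and noise level are all invented for the example:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

g, rho = 9.81, 1000.0            # gravity and liquid density (water assumed)
A_true, a_true = 2.0, 0.01       # 'unknown' tank and outlet areas

def simulate(A, a, t_eval, z0, f_fun, p_fun):
    """Model F of Equation 1.1: integrate Equation 1.3 for given parameters."""
    def rhs(t, z):
        return [-a * np.sqrt(max(z[0], 0.0) * g + p_fun(t) / rho)
                + f_fun(t) / A]
    sol = solve_ivp(rhs, (t_eval[0], t_eval[-1]), [z0],
                    t_eval=t_eval, max_step=0.5)
    return sol.y[0]

t = np.linspace(0.0, 100.0, 101)
f_fun = lambda t: 0.05 * (1.0 + (t > 50.0))   # step-wise changing inflow
p_fun = lambda t: 1000.0                       # constant over-pressure
z0 = 1.0

# Synthetic 'measurements' y_N with a little observation noise
rng = np.random.default_rng(0)
y = simulate(A_true, a_true, t, z0, f_fun, p_fun) \
    + 0.01 * rng.standard_normal(t.size)

# Fitting step, Equation 1.2: least squares over theta = (A, a)
fit = least_squares(
    lambda th: simulate(th[0], th[1], t, z0, f_fun, p_fun) - y,
    x0=[1.0, 0.02], bounds=([0.1, 1e-4], [10.0, 0.1]))
A_hat, a_hat = fit.x
```

With the step change in f exciting the dynamics, the optimizer recovers estimates of A and a close to the values used to generate the synthetic data, while g and ρ must be treated as known.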
With u = (f, p) as varying control variables, the equation cannot be solved analytically, but given values of θ = (A, a, ρ, g), an ODE solver will be able to produce a sequence of values {z(kh|θ) | k = 1, …, N} of z sampled with interval h. Hence F is defined as the ODE solver operating on an equation of some form like

  der(z) = -a*sqrt(z*g + p/rho) + f/A
with given constant parameters a, A, g, rho and variable control input p, f. With a recorded sequence of measurements y_N = {y(kh) | k = 1, …, N} of the tank level z during an experiment with known, step-wise changing input sequences u_N, it will be possible to set up and evaluate the loss function

  E(u_N, θ) = Σ_{k=1}^{N} [y(kh) − z(kh|θ)]²     (1.4)
for any given value of θ. Applying an optimization program, it will then be possible to minimize the loss function with respect to any combination of the parameters, and in this way estimate the values of any unknowns among (A, a, ρ), but not the value of gravity g.

1.2.2 Black-box Identification

Defining this case is somewhat more complicated, since the task usually involves determining one or more integer 'order' numbers, the values of which determine the number of parameters to be fitted (Ljung, 1987; Söderström and Stoica, 1989):

  Model:     F_n(u_t, ω_t, θ_n) → z(t|θ_n)          (1.5)
  Predictor: P_n(u_t, y_{t−m}, θ_n) → ẑ(t|m, θ_n)   (1.6)
  Fitting:   Q_n = min_θ E[y_N, ẑ_N(m, θ_n)]        (1.7)
  Test:      Q_{n−1} − Q_n < χ²                     (1.8)
The designer cannot change F_n, which is particular to the method, except by specifying an order index n. The latter normally determines the number of unknown parameters θ_n. However, the model class accepts a second, random input signal ω_t (usually 'white noise') in order to model the effects of random disturbances. For given order numbers, the parameters θ_n are estimated by minimizing the deviation between response data and the m-steps-ahead predicted output (usually one step) according to a given loss function E. The difference between the model and the predictor is that the latter uses previous, m steps delayed response data y_{t−m} in addition to the control sequence u_t for computing the predicted responses. The predictor P_n is uniquely determined by F_n. However, exact and applicable predictors are known only for special classes F_n, and this limits the versatility of black-box identification programs. Unknown orders n are usually determined by increasing the order stepwise, and stopping when the loss reduction drops below a threshold χ². A popular alternative is to use a loss function that
weights in the increasing complexity associated with increasing n, which allows minimization with respect to both the integer parameters n and the real parameters θ_n (Akaike, 1974). The model classes are most often linear, but nonlinear black-box model classes are also used (Billings, 1980). The following are practical difficulties:

- Restricted and unfamiliar form: Many engineers do not feel comfortable with models produced by black-box identification programs based on statistical methods. Mainly, the model structure and parameters do not have a physical interpretation, and this makes it difficult to compare the estimates with values from other sources.
- Over-parametrization: The number of parameters increases rapidly with the number of variables, and even more so when the model class is nonlinear. This easily leads to 'over-fitting', with all sorts of numerical problems and poor accuracy.
- Poor reproducibility: What is produced is a 'data description'. If this is also to be an 'object description', the model class must contain a good model of the object. If it does not, i.e., if much of the variation in the data is caused by phenomena that are not modelled well enough by F_n as effects of known input u_t, then the fitting procedure tends to use the remaining free arguments ω_t and θ_n to reduce the deviations. In other words, what cannot be modelled as response to control will be modelled as disturbance. In this way even a structurally wrong model may still predict well at short range. If the data sequence is long, the estimated parameter accuracy may even be high. This means that one may well get a good model, with good short-range predicting ability and a high theoretical accuracy, but when the identification is repeated with a different data set, an equally 'good' but different model is obtained. That will not necessarily mean that the object has changed between the experiments; it may be a consequence of fitting a model with the wrong structure.
Generally, it will be difficult to get reproducibility with black-box models, unless the dynamics of the whole object are invariant, including the properties of disturbances, and the model structure is right. The basic cause of the poor reproducibility of black boxes is that it is not possible to enter the invariant and object-specific relations that are the basis of white-box models. To gain the advantages of convenience and quick results, the model designer is in fact willing to discard any result from previous research on the object of the modelling.

Remark 1.2. Adaptive control will conceivably be able to alleviate the effects of poor reproducibility, and benefit from the good predictive ability of the model, but this can be exploited only for feedback control of such variables as have online sensors. Monitoring of unmeasured variables, as well as control with long delays, will still be hazardous.

Remark 1.3. Tulleken (1993) has suggested a way to force some prior knowledge into black-box structures, thus making the models less 'black'.
Example 1.2
With the same tank object as in Example 1.1, one could choose to ignore the findings of Bernoulli and describe the process as a "black box". A linear model is the most popular first choice, but if one suspects that the process is nonlinear, and also takes into account some rudimentary prior knowledge (that a hole at the bottom tends to reduce the level), the following heuristic form would also be conceivable:

  dz/dt = p_1 z + p_2 p + p_3 f − [p_4 z + p_5 p + p_6 f]^α     (1.9)
Incidentally, this form contains the 'true' process, Equation 1.3, with p_1 = p_2 = p_6 = 0, p_3 = 1/A, p_4 = a²g, p_5 = a²/ρ, and α = ½. But normally, that is not the case. A more likely, and 'blacker', form would be

  dz/dt = p_1 + p_2 z + p_3 p + p_4 f + p_5 z² + p_6 p² + p_7 f²     (1.10)
This will define a deterministic black box of second order, F_n(u_t, 0, θ_n) → z(t|θ_n), where n = 2, u = (f, p), and θ_2 = (p_1, …, p_7). It can be processed as in Example 1.1. If the parameters are numerous enough, if measurements are accurate, and if the experiment is not subject to external or internal disturbances, the resulting model may even perform almost as well as the white box. If, however, the varying pressure p is not recorded, it might still be possible to use the following form

  dz/dt = p_1 + p_2 z + p_4 f + p_5 z² + p_7 f² + v     (1.11)
  dv/dt = p_8 ω                                         (1.12)
where ω is 'white noise', and v is 'Brownian motion', modelling the unknown term p_3 p + p_6 p². Hence θ_2 = (p_1, p_2, p_4, p_5, p_7, p_8). When models have unknown input it becomes necessary to find the one-step (or m-step) predictor associated with the model, in order to be able to minimize the sum of squares of prediction errors. Exact predictors are known only for some classes of models. And even if the model belongs to a class that does allow a predictor to be derived, the derivation is usually no simple task. However, black-box identification programs have already done this for fairly general classes of models that do allow exact derivation. One such class is the NARMAX (for Nonlinear Auto Regressive Moving Average with eXternal input) discrete-time model class (Billings, 1980):

  y(τ) + Σ_{ν=1}^{n_z} Σ_{k=1}^{n_a} a_{νk} P_ν[y(τ−k)]
       = Σ_{ν=1}^{n_z} Σ_{k=1}^{n_b} b_{νk} P_ν[u(τ−k)] + c_0 w(τ) + Σ_{k=1}^{n_c} c_k w(τ−k)     (1.13)

where τ is discrete time, P_ν are known functions, for instance powers or Legendre polynomials, and w are independent Gaussian random variables with zero means and unit variances. The model has four order numbers, n = (n_a, n_b, n_c, n_z), where the first three are the orders of the dynamics of the system, and the fourth is the degree of nonlinearity. Hence, the more common linear ARMAX model has n_z = 1. The parameter array θ_n is the collection of all a_{νk}, b_{νk}, and c_k in Equation 1.13. Notice that Equation 1.13 contains only measured output y in addition to the input u, which is an essential restriction, but makes it easy to derive a predictor (which is why the class is defined in this way). Since the values of w(τ) can be computed recursively from Equation 1.13, and since E{w(τ)|y_{τ−1}} = 0, the predictor follows directly as
  ŷ(τ|τ−1) ≡ E{y(τ)|y_{τ−1}}
           = −Σ_{ν=1}^{n_z} Σ_{k=1}^{n_a} a_{νk} P_ν[y(τ−k)]
             + Σ_{ν=1}^{n_z} Σ_{k=1}^{n_b} b_{νk} P_ν[u(τ−k)]
             + 0 + Σ_{k=1}^{n_c} c_k w(τ−k)     (1.14)
The prediction error is y(τ) − ŷ(τ|τ−1) = c_0 w(τ), and the loss function is

  E[y_N, ẑ_N(1, θ_n)] = c_0² Σ_{k=1}^{N} w(k)²     (1.15)
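For a concrete (linear, n_z = 1) instance, the recursive recovery of w(τ) from Equation 1.13 and the loss of Equation 1.15 can be sketched as follows in Python. The data-generating model and all numerical values are invented for the illustration:

```python
import numpy as np

def armax_residuals(y, u, a, b, c, c0):
    """Recover w(tau) recursively from Equation 1.13 (linear case, n_z = 1)
    and evaluate the loss of Equation 1.15."""
    w = np.zeros(len(y))
    start = max(len(a), len(b), len(c))
    for t in range(start, len(y)):
        w[t] = (y[t]
                + sum(a[k] * y[t - 1 - k] for k in range(len(a)))
                - sum(b[k] * u[t - 1 - k] for k in range(len(b)))
                - sum(c[k] * w[t - 1 - k] for k in range(len(c)))) / c0
    loss = c0 ** 2 * np.sum(w ** 2)
    return w, loss

# Synthetic data from a known first-order ARMAX model
rng = np.random.default_rng(1)
N = 300
u = rng.standard_normal(N)
w_true = rng.standard_normal(N)
a, b, c, c0 = [-0.7], [0.5], [0.3], 1.0
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a[0] * y[t - 1] + b[0] * u[t - 1] \
           + c0 * w_true[t] + c[0] * w_true[t - 1]

w_hat, loss = armax_residuals(y, u, a, b, c, c0)
# After a short transient (the unknown initial w decays geometrically),
# the recovered residuals coincide with the true noise sequence.
```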
The special case of n_c = 0 (NARX) is particularly convenient. Since the predictor in Equation 1.14 is then a linear function of all unknown parameters, and the loss function therefore quadratic, it is technically easy to fit a large number of parameters. Now back to the original model, Equations 1.11 and 1.12. If, for simplicity, one assumes that the sampling is dense enough, then

  v(t + h) = v(t) + p_8 √h w(t)     (1.16)

is a good approximation of

  v(t + h) = v(t) + p_8 ∫_t^{t+h} ω(s) ds     (1.17)

and Euler integration yields

  Δz(t + h) = h[α(t) + β(t) + v(t)]     (1.18)

where α(t) = p_5 z(t)², β(t) = p_1 + p_2 z(t) + p_4 f(t) + p_7 f(t)², and Δ is the backwards-difference operator. Take backwards differences of Equation 1.18 again, and insert Equation 1.16. Then

  Δ²z(t + h) = h[Δα(t) + Δβ(t) + Δv(t)] = h[Δα(t) + Δβ(t) + √h p_8 w(t)]     (1.19)

which yields

  z(t) = 2z(t − h) − z(t − 2h) + h[Δα(t − h) + Δβ(t − h) + √h p_8 w(t − h)]
       = (2 + p_2 h) z(t − h) + (−1 − p_2 h) z(t − 2h)
         + p_5 h z(t − h)² − p_5 h z(t − 2h)²
         + p_4 h f(t − h) − p_4 h f(t − 2h)
         + p_7 h f(t − h)² − p_7 h f(t − 2h)²
         + p_8 h^{3/2} w(t − h)     (1.20)
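The algebra above can be checked numerically. The sketch below uses invented parameter values and sets p_8 = 0 (so the noise term and its indexing convention drop out); it simulates Equation 1.11 by Euler integration and then reproduces the same sequence with the second-difference recursion of Equation 1.20:

```python
import numpy as np

# Invented parameter values; p8 = 0 drops the noise term for the check
p1, p2, p4, p5, p7 = 0.5, -0.3, 0.8, -0.01, -0.02
h, N = 0.01, 200
f = np.sin(0.05 * np.arange(N + 2))  # arbitrary known input sequence

# Euler integration of Equation 1.11 (with v = 0)
z = np.zeros(N + 2)
z[0] = 1.0
for t in range(N + 1):
    z[t + 1] = z[t] + h * (p1 + p2 * z[t] + p4 * f[t]
                           + p5 * z[t] ** 2 + p7 * f[t] ** 2)

# Second-difference recursion of Equation 1.20 (noise term dropped),
# started from the first two Euler values
z2 = np.zeros(N + 2)
z2[:2] = z[:2]
for t in range(2, N + 2):
    z2[t] = ((2 + p2 * h) * z2[t - 1] + (-1 - p2 * h) * z2[t - 2]
             + p5 * h * (z2[t - 1] ** 2 - z2[t - 2] ** 2)
             + p4 * h * (f[t - 1] - f[t - 2])
             + p7 * h * (f[t - 1] ** 2 - f[t - 2] ** 2))

# The two sequences agree; note that the constant p1 has cancelled
# in the differences, as reflected by the absence of a constant
# term in Equation 1.20.
```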
Assuming that the measurements are accurate enough to allow y(τ) to replace z(τh), and with u(τ) = f(τh), Equation 1.20 becomes embedded in Equation 1.13, with P_ν(u) ≡ u^ν, n_a = n_b = n_z = 2, n_c = 0, a_11 = 2 + p_2 h, a_12 = −1 − p_2 h, a_21 = p_5 h, a_22 = −p_5 h, b_11 = p_4 h, b_12 = −p_4 h, b_21 = p_7 h, b_22 = −p_7 h, c_0 = p_8 h^{3/2}. After a and b have been determined by minimizing Equation 1.15, the remaining parameter c_0 can be computed from the minimum loss. However, reconstructing the original parameters p from the estimated a, b, and c creates an over-determined system of equations; there are five unknowns and nine equations. This can be solved too, for instance using a pseudo-inverse, but it still causes a complication, since the relations are case-dependent and not preprogrammed into the identification package. If one wants to avoid having to determine the order numbers, set up Equations 1.11 and 1.12, and reconstruct the parameters, it is possible to specify sufficiently large maximum order numbers, and let the necessary order numbers be determined by the identification program. The SFI rule (for Stepwise Forward Inclusion) achieves this (Leontaritis and Billings, 1987). It is a recursive procedure:
The SFI rule:
  Initialize n_a = n_b = n_c = n_z = 0.
  While there is a significant reduction, repeat:
    For each x ∈ (a, b, c, z):
      Form the alternative order numbers: n → ν; ν_x + 1 → ν_x
      Compute the alternative loss: Q_{xν} = min_θ E[y_N, ŷ_N(1, θ_ν)]
    Find the alternative with the smallest loss: x = arg min_x Q_{xν}
    Test: If Q_n − Q_{xν} > χ², then Q_{xν} → Q_n, ν_x → n_x, and indicate a significant reduction.
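A simplified version of the rule, restricted to a linear ARX structure with two order numbers (n_a, n_b), might be sketched as follows in Python. The data, the fixed threshold value, and the plain sum-of-squares loss are invented for the illustration and are cruder than the Maximum-Power loss actually used by MoCaVa:

```python
import numpy as np

def arx_loss(y, u, na, nb):
    """Least-squares loss of a linear ARX(na, nb) model fitted to (y, u)."""
    start = max(na, nb, 1)
    Phi = np.array([[-y[t - k] for k in range(1, na + 1)]
                    + [u[t - k] for k in range(1, nb + 1)]
                    for t in range(start, len(y))])
    target = y[start:]
    if Phi.shape[1] == 0:                       # no regressors at all
        return float(np.sum(target ** 2))
    theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    return float(np.sum((target - Phi @ theta) ** 2))

def sfi(y, u, threshold):
    """Stepwise Forward Inclusion over the order numbers (na, nb)."""
    n = {"a": 0, "b": 0}
    Q = arx_loss(y, u, 0, 0)
    significant = True
    while significant:
        significant = False
        # alternative losses with each order number incremented in turn
        cand = {x: arx_loss(y, u, n["a"] + (x == "a"), n["b"] + (x == "b"))
                for x in ("a", "b")}
        best = min(cand, key=cand.get)
        if Q - cand[best] > threshold:          # significant reduction: accept
            Q = cand[best]
            n[best] += 1
            significant = True
    return n, Q

# Synthetic ARX(2, 1) data with small measurement noise
rng = np.random.default_rng(2)
N = 400
u = rng.standard_normal(N)
y = np.zeros(N)
for t in range(2, N):
    y[t] = 1.2 * y[t - 1] - 0.5 * y[t - 2] + 0.8 * u[t - 1] \
           + 0.05 * rng.standard_normal()

orders, Q = sfi(y, u, threshold=0.5)
```

On this data the procedure stops at the generating orders n_a = 2, n_b = 1, since further increments reduce the loss by far less than the threshold.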
It is possible to design the loss function E and compute the χ² threshold in such a way that the decision on model order can be associated with risk values. The "Maximum-Power" loss function used by MoCaVa minimizes the risk of choosing a lower order when a higher order is correct. The threshold value χ² is based on a given risk of choosing the higher-order model when the lower-order model is the correct one.
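As a generic illustration of the threshold side (not MoCaVa's Maximum-Power construction), under the usual asymptotic theory the loss-reduction statistic for one extra parameter is χ²-distributed with one degree of freedom, so the threshold for a given risk can be looked up directly:

```python
from scipy.stats import chi2

risk = 0.05               # accepted risk of choosing the higher order
extra_parameters = 1      # degrees of freedom of the test
threshold = chi2.ppf(1.0 - risk, df=extra_parameters)
# threshold is approximately 3.84 for risk 0.05 and one degree of freedom
```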
1.2.3 Grey-box Identification

The formulation of "grey-box" identification as used in this book is similar to that of the black box:

  Model:     F_n(u_t, ω_t, θ_n) → z(t|θ_n)          (1.21)
  Predictor: P_n(u_t, y_{t−m}, θ_n) → ẑ(t|m, θ_n)   (1.22)
  Fitting:   Q_n = min_θ E[y_N, ẑ_N(m, θ_n)]        (1.23)
  Test:      Q_{n−1} − Q_n < χ²                     (1.24)
The only, but crucial, difference is that the model class F_n is no longer given by the method/program; the designer is allowed to change it more freely, and in this way enter prior knowledge about the object of the modelling. How far this freedom extends in practice depends on what the designer of a grey-box identification program finds it practical to include. Basic limiting factors are that prior knowledge must be
1 Prospects and Problems
convertible into algorithms simple enough to allow i) simulation, ii) automatic derivation of at least an approximate predictor, and iii) fitting of parameters. The particular limitations imposed by MoCaVa will be specified below.

Remark 1.4. Continuous-time white noise into nonlinear models has to be handled with care (Åström, 1970; Graebe, 1990b). In practice the equations have to be discretized, and ω replaced by discrete-time white noise w (Section A.1, Restriction #3).

As would be expected, there are difficulties also with grey-box identification. Some have been experienced using MoCaVa3 and its predecessors; some may vanish with further development, while others are fundamental and will remain:

: Heavy computing: MoCaVa needs, in principle, to evaluate the sensitivities of all state derivatives with respect to all states and all noise variables, for all instants in the time range of the data sample, and for deviations in all parameters that are to be fitted. This must be repeated until convergence, and the whole process must then be repeated until a satisfactory model structure has been found. In the worst case each evaluation requires access to the model, which altogether creates a very large number of executions of the algorithm defining the model. For other than small models the dominating part of the execution time is spent inside the user's model. Since the time it takes to run the user's model once is not 'negotiable', the only option for improving the design of MoCaVa is to try to reduce the number of model accesses by taking shortcuts. However, since the model structure is relatively free, it is difficult to exploit special structural properties in order to find the shortcuts, as the black-box methods can. A way that is still open is to have MoCaVa analyze the user-determined structure in order to find such shortcuts. MoCaVa3 is provided with some tools to do this (see Section 2.5).
: Interactive: It is difficult to reduce the time spent by the user in front of the computer, for instance by doing the heavy computing overnight.
: Failures: More freedom means more possibilities to set up problems that MoCaVa cannot cope with. The result may be that the search cannot fit parameters, or, worse, produces a model that is wrong because the assumptions built into its design are not satisfied in the particular case. The 'advanced' options that may become necessary with complex models require some user specification of approximation levels, and this adds another risk. The causes of failures are discussed in Section 2.7.6.
: Stopping: Available criteria for deciding when to stop the calibration session are somewhat blunt instruments. When a model cannot be falsified by the default conditional tests, this may well be because the user has run out of ideas on how to improve it. In that case unconditional tests will have to do. However, they do not generally have maximum power, and therefore carry a larger risk of letting a wrong model pass. A user may have to supply subjective assessment, in particular by looking at 'transients' in the plotted residual sequences.
: Too much stochastics: Stochastic models are marvellous short-range predictors, and therefore generally excel in reducing the loss, in particular with slowly responding processes. Technically, they have at their disposal the whole sequence of noise input to manipulate, in addition to the free parameters, in order to reduce the loss. However, they have a tendency to assign even some responses of known input to disturbances, if given too much freedom to do so. The result is inferior reproducibility, since disturbances are by definition not reproducible.
12
Practical Grey−box Process Identification
Example 1.3 Return again to the tank object, Equation 1.3, and assume that the varying pressure p has not been recorded during the experiment. Since the model class Fn is not preprogrammed but defined by the user in each case, the user must enter code like

  der(z) = -a*sqrt(z*g + p/rho) + f/A
and, in addition, specify a model describing the unknown p. The latter may well be a black-box (preprogrammed) model, unless one knows something particular about p that prevents it from being modelled by a black box. Since p is not in the data, the next step is to find a predictor for it, and for the output. When the model is nonlinear, an optimal predictor is usually not practical, but suboptimal predictors are. Most common are various types of EKF (for Extended Kalman Filter). Armed with such a predictor, it is possible to proceed as in the black-box case, although the mode of operation of the program will be different. Mainly, the EKF (which is preprogrammed) must call a function that defines the model class, and which depends on the entered code, and therefore must be compiled and linked for each case (as in white-box identification, and as in any ODE solver in a simulation program). If, again for simplicity, the sampling interval h is short enough, and if a Brownian motion is used to model p, the discrete-time equivalent of the model will be

  z(t + h) = z(t) + h[−a √(g z(t) + p(t)/ρ) + f(t)/A]    (1.25)
  p(t + h) = p(t) + λ √h w2(t)    (1.26)
  y(t) = z(t) + σ w1(t)           (1.27)
where wi are uncorrelated Gaussian random sequences with zero means and unit variances, and σ and λ are parameters introduced to measure the average size of the measurement errors and the average rate of change of the unknown pressure p. Unlike in Example 1.2, the measurement errors need not be small. It is convenient to use the state-equation form

  x(τ + 1) = G[x(τ)] + E w2(τ)    (1.28)
  y(τ) = H x(τ) + σ w1(τ)         (1.29)

where

  G(x) = [x1 + h(−a √(g x1 + x2/ρ) + u/A), x2]ᵀ,  E = [0, λ√h]ᵀ    (1.30)
  H = [1  0]                                                       (1.31)
The EKF in MoCaVa uses an observer of the following type:

  x̂(τ) = x̄(τ) + x̃(τ)                 (1.32)
  ŷ(τ) = C [x̄(τ) + x̃(τ)]             (1.33)
  e(τ) = y(τ) − ŷ(τ)                  (1.34)
  x̄(τ + 1) = G[x̄(τ)]                 (1.35)
  x̃(τ + 1) = A(τ) x̃(τ) + K(τ) e(τ)   (1.36)
where A(τ) = ∇x G(x̄), C = H. The optimal filter gain K(τ) is computed using an algorithm that involves the solution Rxx(τ) of the Riccati equation associated with Equations 1.28 to 1.31.

Remark 1.5. It is more common to use G(x̂) in Equation 1.35 instead of G(x̄), since this makes a better approximation, should the estimate x̂ drift far from the reference trajectory x̄. On the other hand, this makes the EKF more susceptible to large disturbances, and thus increases the risk of instability. With Equation 1.35, both it and Equation 1.36 are stable as long as the model is, while a negative value of g x̂1 + x̂2/ρ, for instance caused by a spurious value in y(τ), would cause a run-time error in the evaluation of G(x̂).

Notice that most of the matrices needed to handle the consequences of having an unknown input depend on τ, which means that the calculations generally take much more time than with a white box. The predictor is given by Equation 1.33, the prediction error by Equation 1.34, and the loss function is computed from Equation 2.23:
  Re(τ) = σ² + C Rxx(τ) Cᵀ    (1.37)

  Q(θ) = (1/2) Σ_{k=1}^{N} [log Re(k) + e(k)²/Re(k)]    (1.38)
It is a function of θ = (a, A, ρ, λ, σ), which can therefore be estimated by an optimization routine, as in the white-box case. An option that can be copied from the black-box identification procedure is the estimation of model complexity, by testing whether all parameters are significant or, alternatively, whether some could be left out of the model. For instance, the SFI rule will work, and risk values for making a wrong decision can be computed.
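Example 1.3 can be condensed into code. The sketch below is an illustration only, not MoCaVa's implementation: it runs an EKF of the type in Equations 1.32 to 1.36 over a data record, accumulating the loss of Equation 1.38. The Jacobian is computed analytically for this model; the initial covariance, the clipping of the square-root argument, and the parameter values in the test are invented for the example.

```python
import numpy as np

def ekf_loss(theta, y, u, h=1.0, g=9.81):
    """Loss Q(theta) of Eq. 1.38 for the tank model of Example 1.3,
    using an EKF linearized around a simulated reference trajectory."""
    a, A, rho, lam, sigma = theta
    H = np.array([[1.0, 0.0]])
    E = np.array([[0.0], [lam * np.sqrt(h)]])

    def G(x, uk):
        s = np.sqrt(max(g * x[0] + x[1] / rho, 1e-9))  # clipped for safety
        return np.array([x[0] + h * (-a * s + uk / A), x[1]])

    xbar = np.array([y[0], 0.0])   # reference trajectory (Eq. 1.35)
    xtil = np.zeros(2)             # deviation estimate (Eq. 1.36)
    Rxx = np.eye(2)                # state covariance, Riccati recursion
    Q = 0.0
    for k in range(len(y)):
        # Jacobian A(tau) = grad_x G at xbar, analytic for this model
        s = np.sqrt(max(g * xbar[0] + xbar[1] / rho, 1e-9))
        Ak = np.array([[1.0 - h * a * g / (2 * s), -h * a / (2 * rho * s)],
                       [0.0, 1.0]])
        Re = sigma**2 + (H @ Rxx @ H.T).item()    # Eq. 1.37
        e = y[k] - (H @ (xbar + xtil)).item()     # Eqs. 1.33-1.34
        Q += 0.5 * (np.log(Re) + e**2 / Re)       # Eq. 1.38
        K = (Ak @ Rxx @ H.T) / Re                 # filter gain
        xtil = Ak @ xtil + K.ravel() * e
        xbar = G(xbar, u[k])
        Rxx = Ak @ Rxx @ Ak.T - (K @ K.T) * Re + E @ E.T
    return Q
```

An optimizer would then minimize `ekf_loss` over θ = (a, A, ρ, λ, σ), which is exactly where the "heavy computing" of the earlier discussion arises: every loss evaluation reruns the filter over the whole record.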
1.3 Basic Questions ...

MoCaVa has been conceived with the following scenario in mind: Suppose a production process is to be described by a dynamic model for simulation or other purposes. A number of submodels (or first principles, or heuristic relations) for parts of the process are available as prior information, developed under more or less well controlled conditions. However, when the submodels are assembled into a model of the integrated process, all their inputs and outputs are no longer controlled or measured, and the environment is no longer the same as when the submodels were developed. In addition, unmodelled phenomena and unmeasurable input (disturbances) may affect the responses significantly. It is not known which of the submodels are needed for a satisfactory model, or whether there will still remain unexplained phenomena in the data when all prior information has been used. And, again, prior model information is more or less precise, reliable, and relevant. This raises a number of questions for a model designer:

: How can I make use of what I do know? One usually knows some of it, but not all. And it may be a waste not to use it.
: How much of my prior knowledge is useful, or even correct, when used in the particular environment? Too much detailed knowledge tends to contribute to the complexity of the model rather than to satisfying the purpose for which it is made.
Formulas obtained from the literature are often derived and verified in an environment quite different from the circumstances under which they will be put to use.
: What do I do about the disturbances I cannot eliminate? This is the opposite problem: too little prior knowledge. The response of an object is usually the effect of two kinds of input, known and unknown. Call the second kind "disturbances". If one does not have a model for computing the unknown input, and cannot just neglect it, then some model will obviously have to be assumed instead.
: Are my experiment data sufficient and relevant? Can I use ordinary data loggings, obtained during normal production and therefore at little cost? Or do I have to make specially designed experiments (and lose production while they are going on)?
: How do I know when the model is good enough? It may (or may not) be hazardous just to try to use the model for the purpose for which it was designed, for instance control, and see if it works. That depends of course on what it costs to fail.

1.3.1 Calibration

Needless to say, none of the questions can be answered in advance. Considering the diversity of a user's prior information, originating in a variety of more or less reliable sources, it is also very unlikely that one would be able to formulate, much less solve, a mathematical problem that, given prior input and data, would produce a 'best' model according to a given criterion (and thus retain the usual definition of the identification problem). However, it is possible to conceive a multistep procedure for making a model that satisfies many of the demands one may have on it, while taking the user's prior knowledge into account. The steps in this procedure require the solutions of less demanding subproblems, like fitting to data, and testing whether one model is significantly better than another.
The literature offers principles and ideas for solving many of the subproblems, and a number of those have been compiled into a systematic procedure for grey-box identification (Bohlin, 1986, 1991a, 1994a). One of the procedures has also been implemented as a User's Shell (IKUS) to IdKit (Bohlin, 1993). However, its principles are general and can be implemented with other toolboxes that are general and open enough. MoCaVa is based on two such 'trial-and-error' procedures: calibration and validation. The procedures operate on sets of models, since it is not given a priori how extensive the set has to be in order to satisfy the experiment data and the purpose of the model making. The calibration routine finds the simplest model that is consistent with prior knowledge and not falsified by experiment data. It is a double loop of refinement and falsification derived from basic principles of inference (Bohlin, 1991a):
Calibration procedure:
  While the model set is falsified, repeat
    Refine the tentative model set
    Fit model parameters
    Falsify the tentative model:
      Until falsified, repeat
        Specify an alternative model set
        If any alternative model is better, then indicate falsified
Notice that the procedure works with two sets and two models, namely tentative, which is the best so far, and alternative, which may or may not be better.
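The double loop can be sketched as a driver routine. The names `fit` and `propose_alternatives` below are hypothetical placeholders for the user/program interplay (fitting a model set to data, and suggesting alternative sets), and the threshold stands for the statistical test; none of this is MoCaVa's actual interface.

```python
def calibrate(initial, fit, propose_alternatives, threshold):
    """Double loop of refinement and falsification: keep replacing the
    tentative model set as long as some alternative beats it significantly."""
    tentative = initial
    q_tent = fit(tentative)
    while True:
        falsified = False
        for alternative in propose_alternatives(tentative):
            q_alt = fit(alternative)
            if q_tent - q_alt > threshold:   # alternative is better: falsified
                tentative, q_tent = alternative, q_alt
                falsified = True
                break
        if not falsified:
            return tentative, q_tent         # simplest unfalsified model
```

The returned model is tentative in the book's sense: it is simply the best so far, kept because no proposed alternative improved on it significantly.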
The questions that have to be answered are now i) how to specify a model set, ii) how to fit a model within a given set, and iii) how to decide whether an alternative model is better than a tentative one.

1.3.2 How to Specify a Model Set

The following first structuring of the model set F is motivated by the mode of operation of computers and common system software, like Unix and Windows. Assume there is a given component library {ℒi}, such that a given selection of its members will combine into a system defining an algorithm able to compute response values z(t), given input arguments. Define the model sets

  Model: F(u^t, ω^t, κ, ν, θν)           (1.39)
  Model structure: F(u^t, ω^t, κ, ν, ·)  (1.40)
  Model class: F(u^t, ω^t, κ, ·, ·)      (1.41)
  Model library: F(u^t, ω^t, ·, ·, ·)    (1.42)
where
: A model library is the set of all models that can be formed by combining components ℒi. It is the maximum set within which to look for a model.
: A model class is a smaller set, defined by the argument κ, which is an array of indices of selected components.
: A model structure is an even smaller set, where also the free-space index ν is given. This determines the dimension of the free parameter space with coordinates θν.
: A model is a single member of the model structure, selected by specifying also the values of the free coordinates θν. It includes all specifications necessary to carry out a simulation of the model, given the control input, the random sequence, and the time range.

Notice that this creates two means of refining a model set: with more components, or with more free parameters. Change of model class requires recompilation in order to generate efficient code; change of model structure and model does not. The definition also concretizes 'prior knowledge' as (hypothetical) algebraic relations between variables, and various variable attributes (like ranges, scales, values, and uncertainty). The case-specific model library contains the prior knowledge of the object, while class and structure will depend also on the experiment data. The slightly more structured procedure becomes

Calibration procedure:
  While the structure is falsified, repeat
    Refine the tentative model structure: → F(u^t, ω^t, κ̂, ν̂, ·)
    Fit model parameters: → F(u^t, ω^t, κ̂, ν̂, θ̂ν)
    Falsify the tentative model:
      Until falsified, repeat
        If no more alternative structures, then expand the alternative model class: → F(u^t, ω^t, κ, ·, ·)
        Specify alternative model structures: → F(u^t, ω^t, κ, ν, ·)
        If any alternative model is better, then indicate falsified and assign κ → κ̂, ν → ν̂
The procedure does fitting and testing of an expanding sequence of hypothetical model structures. It starts with those based on the simplest and most reliable components in the component library, for instance those based on mass and energy balances. The structure is then expanded by a procedure of 'pruning and cultivation': hypothetical submodels (components) are added and the structure is tested again. Those that do not contribute to reducing the value of the loss function are eliminated. Those that do contribute become candidates for further refinement. The procedure is interactive: the computer does the elimination, and also suggests the most probable of a limited number of alternatives. The model designer suggests the alternatives. In this way a user of MoCaVa is given an opportunity to exploit his or her knowledge of the physical object and exercise a probably increasing skill in modelling, in order to reduce the number of tests of alternatives that would otherwise be needed. The construction is based on the belief that, even if it is difficult to specify the right model structure in advance, an engineer is usually good at improving a model once it has been revealed where it fails. As a last resort it is possible to use empirical 'black boxes' to model such parts or physical phenomena of a process as have been revealed as significant, but for which there is no prior knowledge.

Example 1.4 Consider the following transfer function model
  z = [B1(p|nb1)/A1(p|na1)] P1(u) + [B2(p|nb2)/A2(p|na2)] P2(u) + [C(p|nc)/D(p|nd)] ω    (1.43)
where D(p|n) = d0 + d1 p + … + dn p^n, and C, Ai, Bi have similar forms. P1 and P2 are polynomials, for instance Legendre polynomials, of first and second order, and ω is continuous white noise with unit power density. Then Equation 1.43 defines the model F(u^t, ω^t, κ, ν, θν) with

  κ = (1, 2, 3)                          (1.44)
  ν = (na1, nb1, na2, nb2, nc, nd)       (1.45)
  θν = (θa1,1, …, θa1,n, θb1,1, …, θb1,n, θa2,1, …, θa2,n, θb2,1, …, θb2,n, θc,1, …, θc,n, θd,1, …, θd,n)    (1.46)

The point of the double indexing with κ and ν is that it allows the definition of a number of smaller model classes F(u^t, ω^t, κ, ·, ·), for instance:

: Linear and deterministic: κ = (1), ν = (na1, nb1)
: Linear and stochastic: κ = (1, 3), ν = (na1, nb1, nc, nd)
: Nonlinear and deterministic: κ = (1, 2), ν = (na1, nb1, na2, nb2)

plus a number of less likely alternatives. Notice that change of class changes the functions (differential equations) and hence the source code of the computer program, which means recompilation. Each class allows a number of model structures F(u^t, ω^t, κ, ν, ·), defined by the values of the order numbers ν, which also determine the number of parameters in the model structure. Change of structure does not generally require recompilation, provided enough space has been allocated for a maximum order, or dynamic allocation is used. When also the values of the parameters are given, this defines the model.
The model library F(u^t, ω^t, ·, ·, ·) is the set of model classes from which the user can pick one by specifying κ. Each transfer function in Equation 1.43 defines a component. If, as in this example, all can be combined, this generates eight model classes in the library, including the 'null' model class y = 0.
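The component/class bookkeeping of Example 1.4 is easy to enumerate: the classes are exactly the subsets of the component set, represented here as arrays of component indices. This sketch is illustrative only, not part of any tool.

```python
from itertools import combinations

def model_classes(n_components):
    """All model classes obtainable by selecting a subset of components,
    including the empty 'null' class (y = 0)."""
    classes = []
    for r in range(n_components + 1):
        classes.extend(combinations(range(1, n_components + 1), r))
    return classes

# Three components, as in Example 1.4, give 2**3 = 8 classes:
classes = model_classes(3)
```

The empty tuple corresponds to the null class y = 0, and (1, 2, 3) to the full model of Equation 1.43.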
1.4 ... and a Way to Get Answers

How does this answer the original questions from a model designer?

: Question: How can I make use of what I do know? Answer: By entering hypotheses, and by specifying which of those to try next.
: Question: How much of my prior knowledge is useful, or even correct, when used in the particular environment? Answer: That which reduces the loss significantly is useful. It is correct if fitting its parameters yields values that do not contradict any prior knowledge.
: Question: What do I do with the disturbances I cannot eliminate? Answer: Describe them as stochastic processes.
: Question: How do I know when the model is good enough? Answer: There are two meanings of "good enough": i) the model is not good enough for the available data and prior knowledge as long as it can still be falsified; ii) the model is good enough for its purpose when the validation procedure yields a satisfactory result.
: Question: Are my experiment data sufficient and relevant? Answer: They are, if the validation procedure yields a result satisfying the purpose of the model design.

Remark 1.6. Valid logical objections can be raised against these rather flat answers. For instance, it is possible to conceive of cases where the experiment data are adequate for the purpose, but where the calibration procedure has failed to reveal errors in the model structure (because there are no better alternative hypotheses). It is also possible to conceive of disturbances that do not let themselves be described by stochastic processes of the types available in the library, or at all. All hinges on the assumption that the model library will indeed allow an adequate modelling.

Remark 1.7. Even if there are cases where it is theoretically correct to use the same data for calibration and validation ('auto-validation'), it is generally safer to base the validation on a second data set ('cross-validation').
Still, much of the value of the validation procedure hinges on the assumption that the second data set is demanding enough. If it is not, the validation procedure will not reveal an inadequate model. In fact, a failure will not be revealed until the model has been put to use and has failed obviously. The costs related to the latter event will therefore determine how much work to put into the validation process. For instance, paper machine control can afford to fail occasionally; Mars landers rather not at all.

Remark 1.8. Logically, the calibration and validation procedures have little to do with one another, since the meanings of a "good model" are different. A model may well be good enough for such a limited purpose as feedback control, and thus easily validated in that respect, but still be unable to satisfy an extensive data sequence generated by a complex object. Conversely, a model fitted to a data sequence containing little dynamic information may well satisfy that data, as well as all one knows about the object in advance, but still be unable to satisfy its purpose when validated with different and more demanding data.
1.5 Tools for Grey-box Identification

The following is a list of what is needed for realizing the calibration and validation schemes:

: A versatile class of models: so that it does contain a suitable model for the particular purpose. It must be possible to simulate and fit the models conveniently.
: A tool to restrict this class according to prior knowledge: this is the whole point of the grey-box concept. It means that there must be some modelling tool allowing the user to formulate the prior knowledge conveniently. Model class restriction is what identification is all about, and user-supported restriction is what grey-box identification is all about.
: A tool to fit parameters: in order to find the model that agrees most with data.
: A tool to falsify fitted models: in order to eliminate incorrect hypotheses about the object.
: A tool to validate models: so that the model will not be more complicated than needed for the purpose.
: A procedure to follow: in addition to the tool kit there is also a need for some kind of 'handbook' or 'guide' on how to build grey-box models using the tools. Again, grey-box model making is an interactive process: at each step, the software may or may not need more information from the user, or more data, depending on whether the result so far is satisfactory.

Remark 1.9. The list leaves out the problem of what to do when no model is good enough for the purpose. An answer is to try to get better data, and there are methods in the literature for doing this, again valid for certain classes of models. MoCaVa does not support this.

1.5.1 Available Tools

Some of the tools have been available for some time. Let's look at what they can and cannot do, in order to find out what more is needed.

Nonlinear State Models

A reasonably general form that evades the subtleties of continuous-time stochastic differential equations, and lets itself be simulated, is

  dx/dt = G(x, u, p) + ω    (1.47)
  y = H(x, p) + w           (1.48)
where x is the state vector, u is known control, ω and w are continuous and discrete 'white noises', and p are parameters. This is a rather versatile class that suits many physical processes, and is used, for instance, in the identification toolboxes in the Cypros (Camo, 1987), Matrixx (Gupta et al., 1993), IdKit (Graebe, 1990a-d; Graebe and Bohlin, 1992; Bohlin and Graebe, 1994a,b), and CTSM (Kristensen and Madsen, 2003) packages. Given models of this form, the tools fit parameters p to given data. All use the Maximum Likelihood criterion, but different ways of fitting. The first two are commercial products. IdKit is not commercial; it has been used for case studies and has been developed into MoCaVa. Kristensen, Madsen, and Jørgensen (2004) use a somewhat more general form, where the variance of the 'diffusion term' ω may depend on u, p, and t.
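A model of the form 1.47 and 1.48 can be simulated with a simple Euler-Maruyama scheme, approximating the continuous white noise ω by √h-scaled discrete noise (in the spirit of Remark 1.4). The functions and parameter values below are placeholders chosen for illustration, not taken from any of the packages mentioned.

```python
import numpy as np

def simulate(G, H, p, x0, u, h, q, r, seed=0):
    """Euler-Maruyama simulation of dx/dt = G(x,u,p) + omega, y = H(x,p) + w,
    with process-noise scale q and measurement-noise scale r."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    ys = []
    for uk in u:
        y = H(x, p) + r * rng.standard_normal()
        ys.append(y)
        x = x + h * G(x, uk, p) + q * np.sqrt(h) * rng.standard_normal(x.shape)
    return np.array(ys)

# Placeholder first-order model: dx/dt = -p*x + u, y = x, noise-free here
ys = simulate(lambda x, u, p: -p * x + u, lambda x, p: x[0],
              p=1.0, x0=[0.0], u=np.ones(50), h=0.1, q=0.0, r=0.0)
```

With q = r = 0 the simulated output simply relaxes towards the steady state u/p = 1; turning the noise scales on gives one realization of the stochastic model.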
What are the obstacles to a wider application of grey-box Maximum Likelihood identification tools? Mainly that they are difficult to use for anyone other than experts. There are difficulties with

: Modelling: since one usually has to try a large number of structures before finding a suitable one, in particular with models of the complexity required by a full-scale industrial process, it becomes quite a tedious task to write all the models, and also to maintain the necessary housekeeping of all rejected attempts.
: Setup and interpretation: it is easy to set up meaningless problems for the tools to solve (which they gladly do). It is more difficult to see whether the solutions are any good.
: The state-differential equations also leave out important dynamical properties of some real objects, for instance those containing delays, or phenomena better described by partial differential equations, or containing hard discontinuities, like dry friction.
Modelling Tools

These are tools to enter prior knowledge. Examples are Simulink (www.mathworks.com/products/simulink), Bond graphs (Margolis, 1992), Dymola (Elmqvist, 1978), Omola (Mattson et al., 1993), and Modelica (Tiller, 2001). Simulink is probably the best known, and the most adapted to the way control engineers like to describe systems. It generates the model-defining statements from block diagrams. The others are in principle model specification languages and tools, and they are normally combined with simulation programs that accept models defined in these particular languages. Sørlie (1994a, 1995a, 1996d) has shown a way to use Omola to write models for IdKit. It is still a considerable effort to write models in these languages, instead of directly in some programming language, such as M-files or C (in addition to the effort of learning a new language). However, the advantage of using a comprehensive modelling language is that it prevents the writing of inconsistent model equations. It is also possible to include extensive libraries of component models, thus simplifying the modelling. There is still no guarantee that the identification problems set up using these tools make sense. The languages were developed for simulation purposes. There are some problems with using them for grey-box identification:

: Specialized languages: the languages are basic, and the user has to learn one of them. Like other computer languages they tend to develop towards covering more and more objects, and this makes them more general and more abstract. Libraries may show a way out, but are of course limited by what the vendor finds it profitable to develop. In addition, since calibrating and validating a model is a much more demanding task than simulating it, the development tends to allow the writing of models increasingly less suitable for identification purposes. Again, more libraries may be a way out, if specialized to suit the identification purpose.
: ODE solving and parameter optimization: there are special numerical problems associated with combining standard optimizers with efficient ODE solvers using step-length control. The numerical errors interfere. This means in practice that both integration and optimization have to be done with extremely high numerical precision. There is at least one program (diffpar; Edsberg and Wikström, 1995) designed to do simultaneous integration and optimization. It handles only models without disturbances.
: Not predicting models: grey-box identification is not simulation plus fitting; it is prediction plus fitting (and more). Modelling languages do not primarily produce predicting models. The difference is that a predictor uses past input and output to compute the next output, while a simulator uses only past input. The difference is important when disturbances are important. Whenever it pays to have feedback control, it also pays to use a predictive model, most obviously if the purpose is Model Predictive Control. Even if it is possible, in principle, to derive a predicting model from a simulating one, this is no easy task. It is known as 'the nonlinear filtering problem', and, in fact, only a few cases have been solved so far. In practice it is not as bad as that, since approximating filters may be enough. Sørlie (1996) has investigated the possibilities of combining Omola with an Extended Kalman Filter.
Optimization Tools

Classical optimization methods are those of Fletcher-Powell and Newton-Raphson type, and there are well-developed computer libraries for doing the kind of continuous parameter optimization needed in white-, black-, and grey-box identification alike. A particular constraint of model fitting is that one cannot usually afford to evaluate the loss function a large number of times. Quasi-Newton methods are particularly effective for predictive models (Liao, 1989a). The reason is that one obtains the fast convergence of a second-order search method from evaluations of first-order derivatives of the model output. However, the search problem remains hard in more difficult cases:

: Multiple minima: global search methods, like the new and intuitively attractive method of 'genetic programming', tend to take an uncomfortably large number of loss evaluations. Alternatively, local search methods may have to be applied several times with different start values.
: Discontinuities: the presence of discontinuities in the model's parameter dependence ruins any search based on assumptions of continuity. Less serious, but still troublesome, are discontinuities in parameter sensitivity.

Validation and Falsification

Once again, these tasks have basically different purposes: falsification decides whether a model is good enough for the available data; validation decides whether it is good enough for its purpose. A model can be both "false" and "valid", as well as any other of the four possible combinations of the outcomes of the two tests. There are several quite general statistical tests for the falsification task, and most black-box identification packages support some of them, mainly 'chi-square' and 'cross-correlation' tests. They are typically used for order determination.
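A cross-correlation test of the kind just mentioned checks whether the residuals are uncorrelated with the known input; if they are not, there is input information the model has failed to use. The sketch below is a generic illustration with an approximate 95% bound of 1.96/√N (valid for white residuals), not the test of any particular package.

```python
import numpy as np

def cross_correlation_test(e, u, max_lag=10, bound=None):
    """Return the lags at which the normalized cross-correlation between
    residuals e and input u exceeds the significance bound (empty = pass)."""
    e = (e - e.mean()) / e.std()
    u = (u - u.mean()) / u.std()
    N = len(e)
    if bound is None:
        bound = 1.96 / np.sqrt(N)   # approximate 95% bound for white residuals
    failing = []
    for lag in range(max_lag + 1):
        r = np.mean(e[lag:] * u[:N - lag])   # correlation at this lag
        if abs(r) > bound:
            failing.append(lag)
    return failing
```

An empty result means the test could not falsify the model; any listed lag points at unexplained input-to-output dynamics around that delay.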
Likelihood-ratio tests are applicable to nonlinear models, and in addition have maximum discriminating power, i.e., they have the maximum probability of rejecting an incorrect model for a given risk of rejecting a correct one. Validation is conventionally done by constructing a loss function that reflects the purpose of the modelling, evaluating the loss for the candidate model, and seeing whether it is below a likewise given threshold. The simplest case is when the modelling is done for control purposes, because a suitable loss is then the prediction-error variance (when the model is evaluated using a different data sample).

Remark 1.10. Falsification methods are sometimes found under the "validation" keyword in the literature.
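For losses of the negative log-likelihood form in Equation 1.38, the likelihood-ratio test reduces to a one-line comparison; twice the loss reduction is asymptotically chi-square distributed with as many degrees of freedom as extra parameters. The sketch below uses the standard 95% chi-square quantiles (3.84, 5.99, 7.81 for 1 to 3 degrees of freedom); the function itself is illustrative.

```python
def lr_test(q0, q1, extra_params=1):
    """Likelihood-ratio falsification test: q0 and q1 are minimized losses
    (negative log-likelihoods up to a constant) of a tentative model and an
    alternative with extra_params more parameters. Returns True when the
    tentative model is falsified at the 5% risk level."""
    crit = {1: 3.84, 2: 5.99, 3: 7.81}[extra_params]  # chi-square 95% quantiles
    return 2.0 * (q0 - q1) > crit
```

The "given risk of rejecting a correct one" in the text is exactly the 5% level encoded in the quantile table.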
Calibration Procedure

The procedure is a codification of the general approach scientists use when drawing conclusions from observations (which is the core of model making). In essence, the method adapts the rules to the case of state-vector models, and in this way lays a foundation for a user's guide, which can conceivably be implemented on a computer, and has been implemented in MoCaVa.

1.5.2 Tools that Need to Be Developed

Generally, there are tools enough to make grey-box models, and evidence that it can be done in practice, if one knows how to use the tools. What remains is to make it easier. This is not without problems, however. The man-machine communication problem has to be considered. And communication has two directions:

: User input: what prior information is it reasonable to ask from the user? The problem is enhanced by the fact that users in different branches of engineering have different ways of looking at models, and therefore different kinds of prior knowledge. This means that, ideally, there should be different man-machine interfaces for different categories of users. The interface implemented in MoCaVa is designed for process engineers more than control engineers.
: User support: the task that rests most heavily on the user is deciding what to do next, when a model has been found inadequate. What the computer can conceivably do to facilitate this is to present the evidence of the test results in a way that reveals at which point the model fails and that is also easy to understand. Unfortunately, general tests are rather blunt instruments in this respect. The result of a statistical test has the binary value of either "passed" or "failed" (in practice, it tends to be "failed", since maximum-power statistical tests are honed to razor sharpness in that respect). However, there are some means to get more information out of testing a given model.
An option in MoCaVa works in connection with the stepwise refinement and falsification of the model structure outlined above. It is based on an idea that can be illustrated by the following simple example: Assume that the current tentative structure is expanded by a free parameter p, whose value is known a priori to be positive. Instead of limiting the search to positive values, it is more informative not to limit it at all. The test then has one of three possible outcomes, as depicted in Figure 1.1: Hypothesis H0 represents the tentative model (p = 0) and H1 an alternative (p ≠ 0). The particular case that there is an alternative but inadmissible model with a significantly lower loss Q(p < 0) means that H0 is still rejected (since a better model does exist), but H1 is not the one, and the alternative structure does not contain one. This gives two pieces of information to the model designer: 1) continue the search for a better model, and 2) use another model structure. In addition, the component of the total model to improve is the one containing the unsuccessful expansion. This determines whether a component is worth cultivating or not. In conclusion, statistical tests give a two-valued answer, but tests combined with prior structure knowledge may yield more.

Remark 1.11. Notice that H0 is rejected as soon as there is some alternative model H1 within the alternative structure with a loss below the threshold χ². This means that there is no need to search for the alternative with the smallest loss in order to test the tentative model, except when it cannot be rejected.
Practical Grey-box Process Identification
[Figure 1.1. Illustrating the three possible results of falsification: the loss Q(p) plotted against p with the threshold χ². Near p = 0, H0 is not rejected; where Q drops below χ² for admissible p > 0, H0 is rejected and H1 is better; where Q drops below χ² only for inadmissible p < 0, H0 is rejected but H1 is wrong.]
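The three outcomes illustrated in Figure 1.1 can be sketched in Python (illustrative only, not MoCaVa code). The loss function, the grid search, and the rejection rule, written here as a loss improvement exceeding a χ²-type threshold, are hypothetical simplifications:

```python
import numpy as np

def falsification_outcome(loss, threshold, p_grid):
    """Classify the test of a tentative model H0 (p = 0) against an
    expanded structure, when p is known a priori to be positive.

    Returns one of:
      'H0 not rejected'          -- no p improves significantly on H0
      'H0 rejected, H1 better'   -- an admissible p > 0 improves on H0
      'H0 rejected, H1 wrong'    -- only inadmissible p < 0 improve on H0
    """
    q = np.array([loss(p) for p in p_grid])
    below = q < loss(0.0) - threshold      # significant improvement over H0
    if not below.any():
        return 'H0 not rejected'
    if (np.asarray(p_grid)[below] > 0).any():
        return 'H0 rejected, H1 better'
    return 'H0 rejected, H1 wrong'

grid = np.linspace(-2.0, 2.0, 401)
chi2 = 0.5                                 # hypothetical threshold
# A loss minimized at an inadmissible p < 0: H0 rejected, but H1 is wrong.
out = falsification_outcome(lambda p: (p + 1.0) ** 2, chi2, grid)
```

Running the same function with a loss minimized at p = 1 yields 'H0 rejected, H1 better', and with a flat loss 'H0 not rejected', matching the three regions in the figure.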
Conditional and Unconditional Tests

The rule used to decide whether a tentative model structure is falsified or not depends on the alternative structure, and is therefore 'conditional' on the alternative. 'Unconditional' tests do not assume an explicit alternative, but instead aim at testing the basic hypothesis that known and unknown input are independent. If they are not, there is obviously information in the input data that could be used to improve the estimation of the unknown input and thus the predicting ability of the model. The disadvantage of unconditional tests is that they are less discriminating, i.e., they let a wider range of similar models pass the test. This is because the set of implicit 'alternatives' is much wider. However, they are still applicable when the model designer has run out of useful prior knowledge. The following modified calibration procedure takes into account the prospects offered by the various tests:
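A simple unconditional test of the independence hypothesis can be sketched as follows (illustrative Python, not the specific test MoCaVa implements): check whether the cross-correlation between the model residuals and the known input stays within the sampling band that holds under independence. The band width and lag range are conventional choices, not taken from the book:

```python
import numpy as np

def independence_test(e, u, max_lag=10, conf=1.96):
    """Check whether residuals e are uncorrelated with the known input u.

    Normalized cross-correlations at lags 0..max_lag are compared with
    the +/- conf/sqrt(N) band that holds under the independence
    hypothesis. Returns (passed, correlations).
    """
    e = (e - e.mean()) / e.std()
    u = (u - u.mean()) / u.std()
    N = len(e)
    r = np.array([np.dot(e[lag:], u[:N - lag]) / N
                  for lag in range(max_lag + 1)])
    return bool(np.all(np.abs(r) < conf / np.sqrt(N))), r

rng = np.random.default_rng(1)
u = rng.normal(size=500)
e_good = rng.normal(size=500)     # independent of the input
e_bad = e_good + 0.5 * u          # input 'leaks' into the residuals
ok_good, r_good = independence_test(e_good, u)
ok_bad, r_bad = independence_test(e_bad, u)
```

Each lag is tested at roughly the 5% level, so an occasional marginal excursion is expected even for a correct model; the leaking case, however, fails decisively, since its lag-zero correlation is far outside the band.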
Calibration procedure:
    While there is a better model, repeat:
        Refine the tentative model structure:  → F(u_t, ω_t, ŝ, ν̂, ·)
        Fit the model parameters:  → F(u_t, ω_t, ŝ, ν̂, θ̂_ν)
        Test the tentative model:
            Until a better model is found, repeat:
                If no more alternative structures remain, expand the alternative model class:  → F(u_t, ω_t, s, ·, ·)
                Specify an alternative model structure:  → F(u_t, ω_t, s, ν, ·)
                If an alternative model is significantly better, indicate "falsified"
                If an admissible alternative model is significantly better, indicate "better model" and assign s → ŝ, ν → ν̂
        If unfalsified, test unconditionally:  → falsified | unfalsified
2 The MoCaVa Solution
The analysis in Chapter 1 outlines what the purpose of the model making would require MoCaVa to do. That must be reconciled, somehow, with the restrictions set by what a computer can do in reasonable execution time. MoCaVa therefore contains further restrictions in order to compromise between the two. In essence, MoCaVa makes use of the following tools:
- State-vector models.
- A new modelling procedure based on elementary MATLAB® statements for user-defined relations, and library routines for some tasks common to all models.
- Extended Kalman filtering to produce approximate predictors.
- Newton-Raphson search.
- Modified likelihood-ratio and correlation tests.
- The general calibration procedure outlined in Section 1.5.2.
- A collection of heuristic validation rules.
Chapter 2 describes how these general tools are implemented, and motivates the restrictions that make this possible.
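Among the tools listed, extended Kalman filtering is what produces the approximate predictors. A minimal scalar sketch of one predict/update cycle is given below; it is illustrative only (IdKit's actual filter handles the continuous-discrete case with a stiff ODE solver), and the example system is invented:

```python
import numpy as np

def ekf_step(x, P, y, f, h, F, H, Q, R):
    """One predict/update cycle of a discrete-time extended Kalman filter.

    f, h are the state-transition and measurement functions;
    F, H are their derivatives evaluated at the current estimate;
    Q, R are process- and measurement-noise variances (scalar case).
    """
    # Predict
    x_pred = f(x)
    P_pred = F(x) * P * F(x) + Q
    # Update with measurement y
    e = y - h(x_pred)                        # innovation (prediction residual)
    S = H(x_pred) * P_pred * H(x_pred) + R   # innovation variance
    K = P_pred * H(x_pred) / S               # Kalman gain
    return x_pred + K * e, (1 - K * H(x_pred)) * P_pred, e, S

# Mildly nonlinear example: x' = 0.9*x + 0.1*sin(x), measured as y = x + noise.
f = lambda x: 0.9 * x + 0.1 * np.sin(x)
F = lambda x: 0.9 + 0.1 * np.cos(x)
h = lambda x: x
H = lambda x: 1.0

rng = np.random.default_rng(2)
x_true, x_est, P = 1.0, 0.0, 1.0
for _ in range(50):
    x_true = f(x_true) + 0.05 * rng.normal()
    y = x_true + 0.1 * rng.normal()
    x_est, P, e, S = ekf_step(x_est, P, y, f, h, F, H, Q=0.05**2, R=0.1**2)
```

The innovations e and their variances S are exactly the quantities the likelihood-ratio and correlation tests above are computed from.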
2.1 The Model Set

A second compromise that must be made in the design of MoCaVa is that between the conflicting goals of versatility and convenience in the user's modelling task. The model set used in MoCaVa is therefore structured further, to adapt to common properties of industrial production processes, in particular continuous transport processes. The latter may be characterized as systems of separate units, each one accepting flows of commodities from one or more preceding units, changing their properties, and feeding the product to one or more following units. Since there is an obvious cause-and-effect relationship between the input and output variables of the units, state-vector models (defined by assignment statements) are convenient to use in those cases. Secondly, the operation of an individual unit is generally the result of interaction between particular physical phenomena (at least 'first principles' are generally expressed in this way). The different phenomena may also be described by submodels. A third common characteristic of production processes is that the operation of some units may be affected by the operation of other units, in particular control units. Instead of flows (of mass or energy), these produce information input to the affected unit, but are still describable by the same type of submodel. In order to satisfy these requirements, MoCaVa is able to administer the creation of submodels, and to connect them into systems.
Remark 2.1. Narrowing the area of easy applications necessarily makes the model set less versatile. In particular, it will not be convenient to model mechanical objects consisting of a large number of linked moving parts within the framework of MoCaVa.

Remark 2.2. The assumption of causality between variables is crucial to the design of MoCaVa, and cannot be amended by a different user interface. The motivation for still requiring the direction of causality to be specified by the user (by entering assignment statements) is that in production processes causality is usually known from the construction of the process, that it is important prior knowledge, and that using it may prevent the program from processing mathematically feasible but unphysical alternatives. Example: An engineer knows whether an electric circuit is driven by current (i, R → V) or by voltage (V, R → i), and so does the computer, if given an assignment statement. It does not know, if given only the equation iR = V, and will therefore have to keep both alternatives open.

2.1.1 Time Variables and Sampling

One of the properties of production processes is that they seldom have regular and reliable sampling of all variables. There are gaps, outliers, different sampling frequencies, and possibly laboratory measurements at irregular intervals. In order to handle this, IdKit (the subkit of MoCaVa doing the simulation and fitting) recognizes three time variables:
- Physical time t (continuous)
- Discrete time τ (equidistant)
- Sampling time t_k (discrete and irregular)
Only physical time is used in the model equations and as sampling time in the data file. The discrete time is internal to IdKit and hidden from the user. Even so, its meaning is not irrelevant to the user, since it plays a rôle in the execution of the predictor associated with the model.
When the loss function is evaluated, that is done in discrete time; the continuous-time model is integrated over consecutive intervals of fixed length h, called the "time quantum", in order to compute the residuals e(τ) the loss is based on. The relation between a discrete time and the corresponding physical time at which the model is accessed is t_τ = t_i + hτ, where t_i is the startup time of the model, i.e., the time at which the state variables are initialized. The relation is depicted in Figure 2.1, together with its relation to the sampling times.
[Figure 2.1. Illustrating the relation between continuous and discrete time variables: physical times t_i, t_1, ..., t_N on the continuous axis; discrete times τ = 0, 1, 2, ... spaced h apart; and the irregular sampling times t_k.]
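The relation t_τ = t_i + hτ, together with the approximation of adjusting irregular sampling points to the nearest quantum point, can be sketched as follows (illustrative Python; function names and the example times are invented):

```python
def quantum_grid(t_i, h, n_tau):
    """Physical times t_tau = t_i + h*tau for tau = 0..n_tau."""
    return [t_i + h * tau for tau in range(n_tau + 1)]

def snap_to_grid(t_samples, t_i, h):
    """Map irregular sampling times onto the nearest quantum point,
    returning the discrete-time indices tau_k. Duplicate indices would
    indicate that h is longer than the shortest sampling interval."""
    return [round((t_k - t_i) / h) for t_k in t_samples]

# Startup one quantum before the first data point (the default t_i = t_1 - h):
h = 0.5
t_samples = [1.0, 1.6, 2.5, 3.9]
t_i = t_samples[0] - h
tau_k = snap_to_grid(t_samples, t_i, h)
```

Here the samples at 1.6 and 3.9 do not fall exactly on the grid and are snapped to τ = 2 and τ = 7, the kind of approximation mentioned in the text for too-irregular sampling.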
The time quantum is an important design parameter, basically to be set by the user. Making it as large as possible reduces computing. However, its length is limited in two ways: i) It must not be longer than the shortest sampling interval, since residuals have to be evaluated at least each time there is a data point, and ii) it is limited by the longest range still allowing prediction with acceptable error of the relevant variables in the model, i.e., state and output. The default value of h is the shortest sampling interval,
i.e., the interval between the time markings in the data records. If it becomes necessary to use a shorter value (because the model cannot predict that far), h must be an integer fraction of the shortest sampling interval. It follows that all sampling times must also be multiples of the time quantum. If the sampling is too irregular, it may be necessary to approximate for the sake of fast execution, for instance by adjusting the sampling points to the nearest quantum point. The startup time is also a design parameter, with the restriction that the startup must be at least one time quantum before the first data point in the sample, t_i ≤ t_1 − h, which is also the default value (otherwise the first data point would not be taken into account). However, in cases where it is difficult to find good initial values for the states (for instance, when one would like to start in steady state to avoid large startup transients in the model), this can be remedied by specifying a starting time sufficiently far before the first data point.

Remark 2.3. In case the user is uncertain of the ability of the model to predict over the default time quantum, a feasible test is to cut the time quantum in half and see whether the loss function changes noticeably.

Remark 2.4. Notice that the time quantum does not play the rôle of an integration step for the model's ODE. Hence, it is not limited by the shortest time constant of the dynamic model. IdKit uses a stiff ODE solver capable of integrating over longer intervals.

2.1.2 Process, Environment, and Data Interfaces

A second restriction built into the model classes used in MoCaVa is also motivated by common properties of industrial production processes: Models are structured as the four-box system depicted in Figure 2.2. The purpose is to separate the modelling of the basically continuous-time process from that of its environment and the two computer interfaces, which must be modelled using both continuous and discrete time.
The separation relieves the user of the trickier task of writing hybrid models for conversion and disturbance generation. The only modelling that requires explicit entering of algebraic statements is that of the process proper (using only physical time). Modelling of the other blocks is done by selecting standard routines from a library.
[Figure 2.2. Structure of the identification object (shaded area): the Experimenter supplies a control sequence d_u(k) to the Actuator, which produces the stimulus u(t); the Environment, driven by process noise w(τ), produces the disturbance v(t); the Process produces the response z(t); the Sensor, subject to measurement noise w_y(k), feeds the Data acquisition block, which outputs d_y(k).]
The following terminology will be used for the variables appearing in Figure 2.2: The output z(t) from the process is called the "response". The two kinds of input are the "stimulus" u(t) and the "disturbance" v(t), the difference being that the stimulus is influenced by an external "control sequence" d_u(k), while disturbances are not. Some of the responses, disturbances, and stimuli have sampling sensors attached to them, which produce "sensor output" y(k) = d_y(k). The two random sequences are called "process noise" w(τ) and "measurement noise" w_y(k).

Remark 2.5. The terms "input" and "output" must be used with care. For instance, the "stimulus" u(t) is input to the process proper, the box of most interest to the user, but it is an "output" from the "object", which also includes the other boxes.

The following generic forms of the block contents are a tradeoff between what needs to be modelled in normal processes and what can be handled conveniently by IdKit. The restrictions are discussed in Section A.1. The purpose of this is to provide the user with a list of the requirements that have to be satisfied a priori in order to make it worthwhile to try MoCaVa.

Environment:
    dx_v(t)/dt = G_v[x_v(t), w(τ), p]                                  (2.1)
    v(t) = Z_v[x_v(t), p],    t ∈ [t_τ, t_τ+1)                         (2.2)

Actuator:
    dx_u(t)/dt = G_u[x_u(t), u_d(τ+1), p]                              (2.3)
    u(t) = Z_u[x_u(t), u_d(τ), p],    t ∈ [t_τ, t_τ+1)                 (2.4)

Process proper:
    dx_z(t)/dt = G_z[x_z(t), v(t), u(t), t, p]                         (2.5)
    z(t) = Z_z[x_z(t), v(t), u(t), t, p],    t ∈ [t_τ, t_τ+1)          (2.6)

Sensor:
    y(k) = Z_y[z(t_k), v(t_k), u(t_k), w_y(k), p]                      (2.7)
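To make the four-box structure concrete, the following Python sketch simulates equations of the form (2.1)-(2.7) for scalar states. The specific functions, the forward-Euler integration, and the one-sample-per-quantum assumption are all illustrative stand-ins, not MoCaVa/IdKit code:

```python
import numpy as np

def simulate(G_v, Z_v, G_u, Z_u, G_z, Z_z, Z_y, u_d, w, w_y, p,
             h=0.1, n_tau=50, sub=10):
    """Simulate a scalar four-box model of the form (2.1)-(2.7).

    Forward Euler with `sub` substeps per time quantum h stands in
    for the stiff ODE solver IdKit actually uses; here every quantum
    is also taken to be a sampling instant.
    """
    x_v = x_u = x_z = 0.0           # environment, actuator, process states
    dt = h / sub
    y = []
    for tau in range(n_tau):
        for i in range(sub):        # integrate over [t_tau, t_tau+1)
            t = tau * h + i * dt
            v = Z_v(x_v, p)                                        # (2.2)
            u = Z_u(x_u, u_d[tau], p)                              # (2.4)
            x_v += dt * G_v(x_v, w[tau], p)                        # (2.1)
            x_u += dt * G_u(x_u, u_d[min(tau + 1, n_tau - 1)], p)  # (2.3)
            x_z += dt * G_z(x_z, v, u, t, p)                       # (2.5)
        z = Z_z(x_z, v, u, (tau + 1) * h, p)                       # (2.6)
        y.append(Z_y(z, v, u, w_y[tau], p))                        # (2.7)
    return np.array(y)

# A first-order lag driven by a step stimulus, with all noise set to zero:
n = 50
y = simulate(
    G_v=lambda x, w, p: -x + w,              Z_v=lambda x, p: x,
    G_u=lambda x, ud, p: (ud - x) / p["Tu"], Z_u=lambda x, ud, p: x,
    G_z=lambda x, v, u, t, p: (u + v - x) / p["Tz"],
    Z_z=lambda x, v, u, t, p: x,
    Z_y=lambda z, v, u, wy, p: z + wy,
    u_d=np.ones(n), w=np.zeros(n), w_y=np.zeros(n),
    p={"Tu": 0.5, "Tz": 2.0}, h=0.1, n_tau=n)
```

With the noise sequences set to zero, the sampled response rises monotonically toward the step amplitude, which is a quick sanity check on the block wiring.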
Remark 2.6. The actuator requires a discrete-time input u_d(τ) defined for each τ-value. If the time quantum is shorter than the sampling interval, or some data points are missing, the stimulus u_d(τ) has to be reconstructed from the data d_u(k). This is done by linear interpolation, which may not always be satisfactory. The problem of a more 'tailored' interpolation between input data with irregular intervals is not solved in the current version, MoCaVa3.

The restrictions are compromises between the desire to allow a large time quantum h (for the sake of fast computing) and the desire not to exclude too many common process properties:
- IdKit uses a special, stiff ODE solver, which does not have a variable integration step. It does, however, require that the small-signal dynamics (local sensitivity matrices) change only little during a time quantum. Since no input u_d(τ), w(τ) changes during a quantum, the restriction means that any rapid state changes during a quantum must be small enough to stay within a state interval where local linearization around the operating point is acceptable. Hence, changes in state variables are allowed to be steep or large, but not both steep and large within the same quantum interval. If they are, and if the rapid change is caused by random effects, then the discrete extended Kalman filter will also break down. It is obviously impossible to predict anything at all, with a reasonable meaning of the word, if states may jump around randomly with large increments. Neither would there be any hope of calibrating a model of such a process, unless some of its responses are sampled at a much higher rate. The latter would then change the value of h, and the result may again satisfy the requirement.
- Only disturbances with frequency spectra below 1/(2h) can be modelled by the standard models in the MoCaVa library. The environment models create continuous-time disturbances v(t) from discrete-time random input w(τ), and these cannot vary too fast. For instance, the simplest standard model, the "Brownian", integrates a step function with Gaussian (0,1) random amplitudes normalized by a factor h^(−1/2). The normalization ensures approximately the same disturbance level, should the user change the value of h.

Remark 2.7. Earlier versions of IdKit supported certain classes of disturbances with larger bandwidths (such as 'white noise'). Such disturbances must enter as terms added to the state derivatives. However, the modelling kit in MoCaVa3 does not allow them. Since the mathematics of high-frequency noise entering nonlinear ODE models has certain properties that may be unexpected to an engineer used to either white-box or linear black-box identification (Graebe, 1990b), the option would greatly increase the risk of getting a nonsense result, without any diagnostic message. High-frequency disturbances whose variances do not depend on the state are reasonably safe. They may be allowed in future releases of MoCaVa, but then as an 'advanced modelling' option (see Section A.3). In practice, the restriction should not be a serious one. Since h is normally determined by the sampling frequency, the effects of much faster disturbances will not be observable in the sampled response. They can therefore be ignored in the modelling, at least as long as their effects are dominantly linear. A troublesome case is when they are not.
For instance, should their effects accumulate in a strongly nonlinear way, as would be the case with heating due to, say, friction caused by an unknown and fast-changing load (for instance a fast-moving vehicle in rough terrain), the effect would not be negligible. In that case something should be measured at a sufficiently high rate, preferably the load.

Remark 2.8. It has not been investigated what classes of random signals can be generated by linear filters triggered randomly at discrete events. The frequency spectrum is certainly not limited by 1/(2h). For instance, an oscillator would be able to make 'music' with a limited rate of change of notes, but without a limit on the highest note.

Remark 2.9. An alternative to the 'state-vector' model is the 'phase-variable' model y(τ) = P(y_τ−1, u_τ, w_τ, τ, p), where P is an invertible function, such that w(τ) = W(y_τ, u_τ, w_τ−1, τ, p). The designer is required to reduce the number of noise sources to equal the number of measured variables, but that should be no problem in practice. More restrictive is the fact that the designer must specify a "predictor model" P from the beginning, eliminating all unmeasured variables, for instance by using a likewise prespecified "observer". However, if that can be done, the evaluation of the likelihood function is straightforward (Bohlin, 1987b; Liao, 1989b-c, 1990). The model class is not implemented in MoCaVa3. However, an application has been published (Markusson and Bohlin, 1997; Markusson, 2002).

2.1.3 Multi-component Models

The requirement of easy modelling of production processes calls for convenient handling of submodels and of their connections into systems. The following allows a further structuring of the general modelling concepts discussed in Section 1.3.2:
- Model component: C_i
- Component library: F = {C_i}
- Parameter map: p_i = I_i(o_i, ν_i, θ_i), with o_i = I_i(o_i, ν_i, 0)
- Model class: F = {C_i, I_i | i ∈ A}
- Model structure: F_n = F_ν = {C_i, I_i, o_i, ν_i | i ∈ A}, where n = Σ_i |ν_i|
- Model: M_ν = {C_i, I_i, o_i, ν_i, θ_i | i ∈ A}

Model components are ordered in the library according to their given cause-and-effect relationships. Hence, activated components are executed in order of decreasing index i. The parameter map I_i maps the infinite range of all free coordinate vector values θ_i into the admissible range of the physical parameter vector p_i belonging to the component. The parameter value o_i corresponding to θ_i = 0 is called the "origin". The map is prior information, while o_i, ν_i, and θ_i are not. The point of the mapping is that it allows the user to enter prior information on parameter ranges and to change the selection of parameters that are free to estimate. Typically, the entries of ν_i take positive values only for those parameters in p_i that are currently to be fitted and tested for deviation from the current origin o_i. The free-parameter index is an important control variable for the user in the interactive identification procedure. Parameters may be shared between components. Such a parameter belongs to the first component it is defined in (the one with the lowest number). The model class is defined by setting the activity index A (which requires automatic recompilation). It contains definitions of the activated components C_i, sufficient to simulate the model when given in-data and parameter values. In addition, it contains the parameter maps I_i. Setting also the origins o_i and the dimensions of the free parameter spaces, via the free-space indices ν_i, makes the model class into a model structure. That is done without recompilation, and the values of o and ν, like A, are not prior knowledge. The model is defined by also specifying the free coordinates.
In addition to all the information needed to simulate the model, the model also contains information on how it was fitted to data. The latter yields information on its degree of dependence on the data sample: The number n = Σ_i |ν_i| is the number of parameter entries fitted to data, and is used in MoCaVa as a measure of model complexity. It has an important rôle in the falsification procedure. Following the principle of parsimony, MoCaVa prefers models having low values of n, unless a model with a larger n has a significantly smaller loss.

Remark 2.10. The required ordering of components is only partial: Components whose operation depends on those of other components must have a smaller index. It is therefore irrelevant where in the library a new component is placed, as long as that is after the last component it is designed to connect to. Placing components in the order they are created is therefore safe, unless a new component contains a parameter that has been defined before and whose defining component may be deactivated and possibly replaced by another, previously created component.

Remark 2.11. If an argument in an active component is shared, and the component it was defined in has been deactivated, the system is incomplete. Rectifying this requires re-defining the active component to include certain specifications of implicit attributes. The restriction resembles that of linking object modules into an executable program: The linker will not accept undefined global variables. Some of the consequences of constructing counter-logical networks of components are detected by MoCaVa3, and result in error messages stating the cause. But possibly not all...
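The parameter map I_i can be illustrated with a concrete (hypothetical) choice: a logistic map taking a free coordinate θ from (−∞, ∞) into a prescribed admissible range (lo, hi), with θ = 0 giving the origin o. The functional form and names are invented for this sketch, and the free-parameter selection ν is omitted; the book does not specify MoCaVa's actual maps:

```python
import math

def parameter_map(o, lo, hi, theta):
    """Map a free coordinate theta in (-inf, inf) into the admissible
    range (lo, hi), with theta = 0 mapping exactly to the origin o.

    A logistic map is used here purely as an illustration of the
    role the map I_i plays; it is not MoCaVa's actual map.
    """
    assert lo < o < hi
    # Offset chosen so that theta = 0 gives exactly o.
    b = math.log((o - lo) / (hi - o))
    return lo + (hi - lo) / (1.0 + math.exp(-(theta + b)))

p0 = parameter_map(o=1.0, lo=0.0, hi=4.0, theta=0.0)       # the origin
p_neg = parameter_map(o=1.0, lo=0.0, hi=4.0, theta=-10.0)  # stays above lo
p_pos = parameter_map(o=1.0, lo=0.0, hi=4.0, theta=10.0)   # stays below hi
```

The search algorithm can then move θ freely over the whole real line while the physical parameter p never leaves its admissible range, which is exactly the point of the mapping described above.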
2.1.4 Expanding a Model Class

Expanding a model class is done by appending one or more components. The mechanism for connecting components is the following: Some of the time-invariant parameters or constants in a submodel are replaced by time-variable signals produced by other submodels placed upstream. Secondly, the place(s) in the target component(s) at which to connect a new source component's output signals are defined already at the creation of the source component. Technically, this is done by giving the output signal of a source component the same name as the parameter it is to replace, when and if the source component is activated. The rationale for this construction is that it allows an intuitively appealing way of building a model: The user starts by hypothesizing the crudest model class that still makes sense (called the "root model class"), where all unknown quantities that may or may not be important are replaced by constants or parameters. Maybe that is enough, if the input data contains little stimulus, or the output data is heavily contaminated, and therefore will not allow the calibration of a better model. (One may even start with the simplest conceivable root model class y(k) = c + σ w_y(k), in case one suspects that the experiment data may actually be quite useless.) In order to put the first hypothesis to the test, namely that parameter variations are negligible, the model designer must conceive an explanation for the opposite event, that they do vary significantly, and describe the hypothetical source of the variation. Since it is likely that there are several parameters whose time invariance may be suspected a priori, MoCaVa provides for several alternative hypotheses to be tried out, each one in turn. It is also possible to try several alternative hypotheses simultaneously, or in combinations, although the need for the last option is less obvious.
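Fitting the simplest conceivable root model class y(k) = c + σ w_y(k) amounts to estimating a mean and a standard deviation. A quick sketch of this fit and its negative log-likelihood loss (illustrative; the synthetic data and function name are invented):

```python
import numpy as np

def fit_root_model(y):
    """Maximum-likelihood fit of y(k) = c + sigma * w_y(k) with
    Gaussian (0,1) noise w_y: c is the sample mean and sigma the
    (ML, biased) sample standard deviation."""
    c = float(np.mean(y))
    sigma = float(np.std(y))
    n = len(y)
    # Negative log-likelihood evaluated at the ML estimate
    nll = 0.5 * n * (np.log(2 * np.pi * sigma**2) + 1.0)
    return c, sigma, nll

rng = np.random.default_rng(3)
y = 5.0 + 0.5 * rng.normal(size=1000)
c, sigma, nll = fit_root_model(y)
```

The loss of this trivial model is the baseline that any refinement, obtained by replacing c or σ with the output of a new upstream component, must beat significantly in order to survive the falsification test.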
The new submodels normally contain new unknown quantities, which are first approximated by parameters and then (possibly) need replacement by further submodels. Thus the refining proceeds until the model class becomes as complex as data allows, but no further. The model class grows like a tree, pruned and cultivated to satisfy the evidence of experiment data, but also sculptured according to the gardener's preconception of what the tree ought to look like. The following are some consequences of the construction:
- A component is not basically connected to another component, but to a parameter or constant. This means that if one were to draw a block diagram of the system, then i) some of a component's output signals may connect to targets in several other components, if these share parameters, ii) a component may output several signals to the same or different components, and iii) a component may receive signals from several source components. Hence, even if the model class in some respects grows like a tree, a block-diagram representation of it will not be a tree. It will more resemble a river system, with tributaries, bifurcations, and deltas. The subject of automatically drawing a block-diagram representation from component specifications will be treated in a separate section.
- Information may also be transferred between components through the state variables, which may be shared. Unlike the unidirectional signal channels, this information allows feedback (upstream) in the partially ordered component set. However, since this information may only be output through the state derivatives, and consequently must first pass the integrator before it is input to another component, the construction prevents an unintended creation of 'algebraic loops' (which is the point of the restriction). Feedback loops may of course also be modelled within components, but must then be resolved by the user to allow a description by assignment statements.
- In spite of this, MoCaVa3 allows direct feedback over components, at the expense of having to deal with algebraic loops. In such cases MoCaVa3 treats systems containing algebraic loops as 'stiff differential equations' with some very fast time constants (see Section 5.1 for an example).
- The specifications of each component are independent of whether it will have connections to source components or not. Connecting a source component needs only an 'activate' decision, and requires no modification of the target component(s) it is connected to. This makes it possible to create a component once and for all and place it in a library. However, since connection is done by replacing a parameter or a constant in the target component, the latter must have a point to connect to. If there is no natural parameter to replace, the user must create one or more 'stubs', for instance in the form of zero constants to be added, or unit constants to be multiplied with, at places where he or she suspects a priori that a refinement might possibly be needed. This means that the body of prior knowledge exploited when writing equations for a component also includes knowledge of where those equations are uncertain! Additive stubs are a convenient means of preparing for a possible modification of a deterministic model with stochastic disturbances. Replacing multiplicative stubs with disturbances is a way to investigate whether nominally constant parameters are actually constant, for instance, the composition of the raw material fed into a process.
- The component(s) defining the 'root' model class must always be active. This means that it is not possible to test whether the root model class is already unnecessarily complex. If one wants to try a different root model class, a new calibration session is required. Afterwards, it is possible to compare the two models growing from the two root model classes. A feasible alternative would be to create a rudimentary root model and grow two 'trunks' from it.
None of those consequences are serious obstacles to the convenient handling of a large number of alternatives.

Denote by p_i the array of parameter vectors in component #i, and let p^i = (p_1, ..., p_i). Denote the other arguments in similar ways, except that x^i = (x_z1, ..., x_zi). Then the generic four-box component is:

Environment:
    dx_vi(t)/dt = G_vi[x_vi(t), w_i(τ), p^i]                                 (2.8)
    v_i(t) = Z_vi[x_vi(t), p^i],    t ∈ [t_τ, t_τ+1)                         (2.9)

Actuator:
    dx_ui(t)/dt = G_ui[x_ui(t), u_d(τ+1), p^i]                               (2.10)
    u_i(t) = Z_ui[x_ui(t), u_d(τ), p^i],    t ∈ [t_τ, t_τ+1)                 (2.11)

Process proper:
    dx_zi(t)/dt = G_zi[x^i(t), v_i(t), u_i(t), t, p^i]                       (2.12)
    z_i(t) = Z_zi[x^i(t), v_i(t), u_i(t), t, p^i]                            (2.13)
    s_i(t) = S_zi[x^i(t), v_i(t), u_i(t), t, p^i],    t ∈ [t_τ, t_τ+1),      (2.14)
        where s_i ⊂ p^(i−1)

Sensor:
    y_i(k) = Z_yi[z_i(t_k), v_i(t_k), u_i(t_k), w_yi(k), p^i]                (2.15)
The signal output s_i must be a subset of the parameter set p^(i−1) of the downstream components.

Remark 2.12. Connections are restricted to the process proper in order to keep things simple. It would have been feasible to allow connections between the other blocks as well.

The form of the four-box submodel component primarily suits the modelling of individual units in a multi-unit production system. It is also general enough to allow the modelling of particular physical phenomena, as well as of control equipment, even if not all four blocks will be needed in all cases. However, the motivation for the particular way of connecting components may appear better suited to the second use of the component, that of refining a unit model by increasingly detailed descriptions. This raises the question of how the models of units are to be connected. The answer is "by the same mechanism". The motivation, however, is different: The mechanism is particularly efficient for modularizing a model when its structure is uncertain. Once the library has been provided with sufficient components, parts of models can easily be refined or replaced with an alternative hypothesis (by clicking on a menu). Even if the input flows from an upstream unit or the signals from control equipment are strongly variable, they have to be defined tentatively as "parameters" at the creation of the component(s) modelling the unit. When all the preceding units have also been defined, and connected, the input parameters automatically become the outputs of the component(s) modelling the source unit(s), and no harm is done by originally calling them "parameters". (Process engineers do not generally assume that "parameters" are constant, as system identifiers tend to do.) Now, even if one would like to model the whole series of units that constitute the production process, available data may not allow one to do that.
For instance, it may be that the effects of what happens dynamically in the early units of the plant are too heavily filtered by the following storages and units to be detectable in the final response data. The same effect is caused by control equipment. In that case the outputs of a number of the upstream units may simply be replaced by constant parameters, and those units deactivated in order to simplify the model. Secondly, even if the model of a source unit never has to be deactivated, it may have to be replaced by another model. Placing the models of the units in separate components (instead of packing both a source and its target unit into the same component) avoids duplication of equations. Generally, the higher the degree of modularization, the easier the model is to modify.

Remark 2.13. A third motivation is that it facilitates the debugging of large models: Start by running the root model only (and hence with constant parameter input). Even if its input would vary in normal operation, the test may reveal the presence of a 'bug' (for instance due to instability or a scale that is way off). Bugs are clearly easier to find in small models. Do the same when the next component is added. In this way the whole of a complex model is debugged stepwise, and in each step one knows roughly where to look for the bug.
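The connection mechanism, a source component's output signal taking over the name of a parameter in downstream components, can be sketched as follows. The component representation (dicts with a parameter list and an output function) is invented for this illustration and is not MoCaVa's internal form:

```python
def evaluate_system(components, inputs):
    """Evaluate active components in library order (upstream first).

    Each component is a dict with default parameter values and an
    'outputs' function mapping (values, inputs) -> {name: signal}.
    A signal produced upstream replaces any like-named parameter
    downstream, which is the connection rule sketched in the text.
    """
    produced = {}
    results = {}
    for comp in components:          # upstream components come first
        # Parameters are overridden by like-named upstream signals.
        values = {name: produced.get(name, default)
                  for name, default in comp["params"].items()}
        out = comp["outputs"](values, inputs)
        produced.update(out)
        results[comp["name"]] = out
    return results

# Root model: gain k is a constant parameter (a potential 'stub').
tank = {"name": "tank", "params": {"k": 1.0},
        "outputs": lambda v, inp: {"z": v["k"] * inp["u"]}}
# Refinement: an upstream component producing a signal named 'k'.
feed = {"name": "feed", "params": {"k0": 2.0},
        "outputs": lambda v, inp: {"k": v["k0"] + inp["d"]}}

r_root = evaluate_system([tank], {"u": 3.0, "d": 0.0})
r_refined = evaluate_system([feed, tank], {"u": 3.0, "d": 1.0})
```

Activating the `feed` component needs no change at all to `tank`: its parameter `k` is simply overridden by the like-named signal, which is the point of the name-matching construction.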
2.2 The Modelling Shell

The purpose of the 'modelling shell' (ModKit) is to supply the user with tools to create components of the form described above. This means no less than finding a 'translator' from the user's reference frame to the form accepted by IdKit. As argued
32
Practical Grey−box Process Identification
above, it may be necessary to have different user interfaces for different branches of technology. The type of models used in Simulink® are generated by connecting 'boxes' graphically and specifying the contents of each box. This user interface suits control engineers. Since there is an obvious cause-and-effect relationship between the input and output of the boxes, it would also suit IdKit. However, a similar interface would be less suited to describing the type of connections between units in a production process, which is the purpose of the MoCaVa user's shell. The dilemma may be expressed (somewhat unscientifically) like this:

Control engineers connect components into a system much like the programming of old-fashioned analog computers, by connecting the output signals of one component to the input terminals of other components. This also suits digital computers: input variables to a subroutine cause output variables to change (according to an algorithm), which become the input of the next subroutine. Control engineers envisage their systems as based on signals, and there is an obvious causality between input and output.

Process engineers envisage their processes as based on flows: output flows (of mass or energy) from one unit become input flows to other units. However, flows have several properties, and it is not always the case that all properties of an output flow are caused by the corresponding properties of the input flows. Causality might be reversed, as in the following simple example.

Example: Two mixing tanks in tandem (Figure 2.3): The three flows in and out of the tanks have two properties, flow rate f and concentration c. The concentrations are determined by the flow rates, which are determined by the pump speeds. Hence the flow rates are all input signals, in spite of the fact that two of them are properties of output flows.

Figure 2.3. Illustrating different graphic representations of the same model (upper panel: process-oriented graph of the two tanks in tandem, with flow rates f0, f1, f2 and concentrations c0, c1, c2; lower panel: the equivalent signal-oriented graph for simulation purposes)
The lower structure has two drawbacks: i) it is more complicated, and ii) it does not allow an object-oriented description of the flows. Instead, the two attributes of the same flow must be split into two signals to accommodate the fact that they enter the signal-oriented simulation blocks in different ways. The example demonstrates a basic problem in displaying a system of models of connected physical objects graphically: in general, the input and output of the physical objects and those of their causal submodels do not correspond!
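The reversed causality of the two-tank example can be made concrete in a short sketch. The mass-balance equations below anticipate the component definitions in Section 2.2.2.1; the tank volumes V1, V2 and the numerical values are illustrative assumptions.

```python
def tank_derivatives(c1, c2, f0, f1, f2, c0, V1=1.0, V2=1.0):
    """Causal (signal-oriented) evaluation of the two-tank example.

    All three flow rates f0, f1, f2 enter as input signals, even though
    f1 and f2 are properties of the tanks' *output* flows; only the
    concentrations c1, c2 are genuine states.
    """
    dc1 = (f0 * c0 - f1 * c1) / V1   # mass balance, tank 1
    dc2 = (f1 * c1 - f2 * c2) / V2   # mass balance, tank 2
    return dc1, dc2

# With equal in/out flow rates and equal concentrations the system is in
# steady state: both derivatives vanish.
dc1, dc2 = tank_derivatives(c1=0.5, c2=0.5, f0=2.0, f1=2.0, f2=2.0, c0=0.5)
```

Note that the function signature itself exhibits the point of the example: the flow rates appear among the inputs, not among the computed outputs.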
2 The MoCaVa solution
33
Remark 2.14. The difference is recognized in the theory of bond graphs, which is another way to create and visualize system models (Margolis, 1992). But bond graphs do not look like ordinary process engineers' graphs either. It is also the reason for defining object-oriented modelling languages like Modelica® (Tiller, 2001).

In conclusion: a process engineer is used to seeing the upper graph, a control engineer is used to the lower graph, and the software (IdKit) is based on functions implementing the lower graph. In addition, the three different reasons for modularizing a model, namely to allow convenient descriptions of process units, refinement of units, and control of units, suggest several types of components. So there is some translation to do. It is done by means of the new concept of a "cell", which is the element by which the user builds the structures. It is a slight modification of the relations defining a component; two input variables, "feed" and "control", have been added to connect components. The cell may be represented graphically as in Figure 2.4.
Figure 2.4. Illustrating the generic "cell" for building models of industrial production processes (blocks PROCESS, SENSOR, STARTUP, ENVIRONMENT, and ACTUATOR, with connectors for Feed, Control, Disturbance, Parameters, State, Stimulus, Response, Sensor output, and Data). A "signal" may connect to any of the three points indicated by dots.
Remark 2.15. The "cells" play the role of building blocks for the 'body' of the model. They are all complete and self-contained, have the necessary connectors to the environment, and are able to 'live their own lives' (= be simulated) either as individual units or in interaction with other cells. They are also constructed to find the right cells to connect to automatically. This means that a user trying out various stages of 'evolution' of the model body may tentatively add or remove a cell simply by activating or deactivating it, without also having to specify where to connect it. Instead, that information is provided once, when the cell is built.

Even if the cell may be represented graphically, the user input is not graphic. Instead, the cell is defined by classifying variables and specifying various attributes and interrelations, as follows.
2.2.1 Argument Relations and Attributes

The prior information needed to define a component consists of argument relations (assignment statements) and argument attributes. The term "argument" will be used for variables, as well as for constants and parameters, where "parameters" may either stay constant or vary, depending on whether there is a connected source component or not.

2.2.1.1 Relations Between Arguments

Argument relations (equations) are entered as series of assignment statements using a subset of MATLAB® statements. The set is restricted in two ways:

• It contains only elementary algebraic, for, if, and else statements, and such elementary transcendental functions (sin, cos, tan, asin, acos, atan, exp, log, log10, sqrt) as are in both the standard C and M libraries.

• The statements may involve only scalar arguments or explicitly indexed vector elements, for instance inletpressure(i). Vectors with fixed dimensions can be manipulated using for statements.

There is a single addition to the syntax of MATLAB® statements, viz. a time-differentiation operator D to be placed in front of a state variable. (See Section 4.4.2 for more detail.) There are several reasons behind the much restricted facilities for writing functions, compared with what is offered by MATLAB® statements:

• One restriction originates in two somewhat conflicting strategic decisions in the design of MoCaVa: on one hand, to use MATLAB® as platform and M-statements for entering user specifications (because MATLAB® is believed to be more generally known to engineers), and on the other hand, to use C for the time-critical calculations (because the execution of M-files is not fast enough in other than simple cases). It follows that M-statements have to be translated into C-statements automatically. The rudimentary translator in MoCaVa3 is therefore one bottleneck narrowing the set of models that can be handled. Nor can MoCaVa3 handle the translation into C of all possible variable types in MATLAB® (for instance, C-structures are different from M-structures). Only scalars and vectors are translated.

• A second, and more limiting, restriction originates in difficulties with generating predictors and fitting parameters, both tasks particular to grey-box identification: the predictor derivation and the loss evaluation both use numerical differentiation, to which are added various other approximations designed to gain speed, some of which also generate numerical errors. It follows in particular that when the model residuals e(k|θ) are evaluated with slightly displaced parameter values θ, the very long sequences of operations evaluating e(k|θ) must be exactly the same for each value of θ. From this follows that iterative loops cannot be allowed in the evaluation of the predictor (because they might stop after different numbers of iterations, however small the displacement in θ). This means that loops cannot be allowed in the model statements, or in any function that the statements may call. From this follows that while statements, for instance, must be excluded. Everywhere in the complex processing of the model, loops must either be avoided or controlled to a common number of iterations for all displacements of θ.

Remark 2.16. The translation problems were avoided in an earlier all-C version of MoCaVa (used mainly within the department for the case studies). The user must
have a knowledge of how to write a C-function, but is then able to exploit the scope of the C language freely. An 'advanced' modelling option is clearly a possibility in future versions of MoCaVa.

Remark 2.17. Due to the 'multi-shell' architecture of IdKit, where the user's model is in the innermost layer, the strength of the 'vectorization' option in MATLAB® cannot be exploited to advantage. The bulk of IdKit is a set of nested for loops with calls to subroutines. Since the model is nonlinear and recursive, it is not obvious how the operations are to be parallelized.

Remark 2.18. The possibilities of using formula-manipulation algorithms to replace some of the numerical differentiations in the derivation of the predictor have been investigated by Sørlie (1995c, 1996a-c). It appears that formula manipulation will be efficient in only part of the predictor.

Remark 2.19. A third argument for limiting the scope of the models is strategic: MATLAB® will no doubt develop the scope of its M-language. If MoCaVa had the ambition to cover the same scope, its M-to-C translator would need the resources to follow that development, which is unlikely. For much the same reason, the possibilities of using Mex-functions or generic M-to-C translators for solving some of the translation problems have not been considered seriously. The bottom line is that it is very uncertain how much of the development of modelling for simulation purposes can be handled also by the more demanding task of identification. (What about general hybrid systems?) Better, then, to restrict to a class MoCaVa has control over.

2.2.1.2 Sources of Input Arguments

The IdKit model set (in Equations 2.8-15) uses the following classes of arguments appearing in a model of the process proper (other arguments participate in the data and environment interfaces):

• Input: state variable x, stimulus u, disturbance v, parameter p, constant c, time t.
• Output: state derivative ẋ, response z, and signal s.
The classification decides how the arguments are to be treated in the calibration process. In the modelling process (based on the "cell" in Figure 2.4) the user encounters the following argument classes instead:

• Input: Control r, Feed q, Disturbance v, Parameter p, Constant c, and Time t.
• Output: Response z.

Their classification must be based on their meanings in the object being modelled, and belongs to the prior knowledge. The labels are set in the HelveticaNarrow font, in order to emphasize that they will appear in user communication windows. The motives for the choice of information asked from the user are the following:

Since the connections of components into models may have to be changed frequently, this should be particularly easy to do. Connection of a component in MoCaVa is therefore done simply by activating it. The basis for this is prepared at the component definition by giving the signal outputs the same names as the parameters or constants they are to replace, if activated. Thus, the "signal" classification is done automatically. Also the classes of "state" x and "state derivative" ẋ are determined automatically from the names of the variables: a variable with a D in front of it is a state derivative, if there is also a variable with the same name without the D.

Secondly, the response of the model depends on how the stimulus u was generated, in particular on how the continuous signal behaves between the discrete time instants.
It is the user's responsibility to provide at least partial information on that point. In industrial production processes there may be two kinds of stimulus:

• Control: Actuator response to stepwise and known set-point changes.
• Feed: Input from an external known source.

"Known" means that there is a model for the source, normally dependent on filed input data. The third kind of input is

• Disturbance: Input from an external unknown source.

In all cases the user must enter some information about the source model. If the main source of information about a Feed or Control input is a data sequence, it may be possible to use one of the standard library models for interpolation between the sample points. The user is asked to select from a menu of interpolation formulas, or else to indicate that the input is to be provided by another component. In the case of Disturbance input, the only available source models are a number of stochastic library models.

Remark 2.20. If users would like to create their own stochastic model, this can be done by classifying the disturbance as Feed, marking its source as User model, and writing a source component having one of the standard 'environment models' as input. For instance, a positive disturbance d is generated by classifying d as Feed, indicating that d is to be defined by another component, writing the latter as d = exp(v), and classifying v as Disturbance. An amplitude-limited disturbance may be created in a similar way, for instance using an atan function.

The Parameter, Constant, Time, and Response classifications are the same as in the IdKit model:

• Parameters may be fitted, left unchanged, or replaced by signals from other source components.
• Constants may be left unchanged or replaced by signals, but cannot be fitted.
• Responses are all output variables of interest (e.g., for plotting), regardless of whether or not they have sensors attached to them.
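The construction in Remark 2.20 can be sketched numerically. The random-walk source v below is only a stand-in for one of MoCaVa's stochastic library models; the function name and the step size are illustrative assumptions.

```python
import math
import random

def shaped_disturbances(n_steps, seed=0, limit=2.0):
    """Sketch of Remark 2.20: deriving user-defined disturbances from a
    standard stochastic source. The random walk v stands in for a
    library disturbance model (an assumption for this sketch).
    """
    rng = random.Random(seed)
    v, positive, limited = 0.0, [], []
    for _ in range(n_steps):
        v += rng.gauss(0.0, 0.1)          # the Disturbance source v
        positive.append(math.exp(v))      # d = exp(v): always positive
        # amplitude-limited variant, confined to (-limit, limit):
        limited.append(limit * 2.0 / math.pi * math.atan(v))
    return positive, limited

positive, limited = shaped_disturbances(1000)
```

Whatever values v takes, exp(v) stays positive and the atan-shaped version stays inside its bounds, which is the point of the construction.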
Conceptually, the classification should be easy to do for one who has written the statements. Technically, it is done by clicking on the appropriate classification for each entry in the displayed list of arguments. The re-classification into IdKit arguments is done automatically according to Table 2.1.

Table 2.1. Relations between user and IdKit arguments

  Classified as   Data assigned   Active source   Treated as
  Feed            no              no              parameter
  Feed            yes             no              stimulus
  Feed            no              yes             signal
  Control         no              no              parameter
  Control         yes             no              stimulus
  Control         no              yes             signal
  Disturbance     -               -               disturbance
  Parameter       -               no              parameter
  Parameter       -               yes             signal
  Constant        -               no              constant
  Constant        -               yes             signal
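The mapping of Table 2.1 is simple enough to be written as a function. The sketch below is only an illustration of the table's logic; the function and argument names are not part of MoCaVa's interface.

```python
def treated_as(classified_as, data_assigned=False, active_source=False):
    """Sketch of Table 2.1: map a user classification to the IdKit
    argument class. Names are illustrative, not MoCaVa's actual API.
    """
    if classified_as == "Disturbance":
        return "disturbance"
    if active_source:                 # an active source component makes
        return "signal"               # the argument a signal
    if classified_as in ("Feed", "Control"):
        return "stimulus" if data_assigned else "parameter"
    if classified_as == "Parameter":
        return "parameter"
    if classified_as == "Constant":
        return "constant"
    raise ValueError(classified_as)

# A Feed input with neither data nor an active source acts as a parameter:
print(treated_as("Feed"))             # parameter
```

This also restates the rule in the text: activating a source component automatically turns the target's parameter or constant into a signal.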
2.2.1.3 Other Argument Attributes

Arguments are defined automatically when they are entered, as in M-statements, and may have arbitrary names. However, the arguments defined in this way also have a number of attributes that may be specified by the user, in case there is some prior knowledge contradicting the default values:

• Variables associated with experiment data must be given references to their counterparts in the data file.

• Most variables need scales, telling MoCaVa what variation to expect. They are needed for numerical differentiation and for setting the default scales when plotting a variable and some dependent variables. Scales are not required for Constants. However, they are required for Parameters, for the purpose of numerical differentiation. The values of scales are not critical. However, without scales it would not be possible to write equations in standard units (in order to avoid hazardous unit conversions in components using mixed units); the ranges of the variables would vary too much for convenience. For instance, the thickness variation of the paper in a roll would vanish in comparison with its length.

• Nominal values are required for Parameters and Constants. For Parameters they function as start values in a possible first fitting.

• Ranges are optional, and are most commonly used for such parameters as are known to be positive, and where a negative value could be expected to give low-level run-time errors. Other bounded parameters measure fractions of some variables' values, and should therefore be confined to the range (0,1).

• Scales, nominal values, and ranges are specified by editing the default values in a form displayed on the screen. Scales and nominal values may also be specified implicitly by entering an alphanumeric label. The option is necessary when the argument is a vector and the nominal values of its elements differ. It is also convenient when several arguments share a scale or nominal value.
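The role of scales in numerical differentiation can be sketched as follows. The increment rule rel_step*scale is an illustrative assumption, not MoCaVa's exact formula; the paper-roll numbers echo the example in the text.

```python
def finite_diff(loss, theta, scales, rel_step=1e-6):
    """Sketch of scale-based numerical differentiation: the increment
    for each parameter is chosen relative to that parameter's scale, so
    that it is neither negligible nor enormous compared with the
    parameter's expected variation.
    """
    grad = []
    for i, scale in enumerate(scales):
        h = rel_step * scale              # increment proportional to scale
        up = list(theta); up[i] += h
        dn = list(theta); dn[i] -= h
        grad.append((loss(up) - loss(dn)) / (2.0 * h))
    return grad

# Parameters of wildly different magnitudes, e.g. a paper roll's
# thickness (~1e-4 m) and length (~1e4 m); a single fixed absolute step
# would be far too large for one parameter and far too small for the other.
scales = [1e-4, 1e4]
loss = lambda th: ((th[0] - 1e-4) / 1e-4) ** 2 + ((th[1] - 1e4) / 1e4) ** 2
grad = finite_diff(loss, [2e-4, 2e4], scales)
```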
Also, arguments in different components may share scales or nominal values. The value of an implicit attribute must be entered in the first component in which it is introduced.

Remark 2.21. Deactivating a component that defines an implicit attribute that is also used by other, active components gives an error message. The only way to rectify this is to rename the attribute in the first active component and enter its numerical value again. This is not entirely satisfactory, since it means that components are not fully independent of the model class created when connecting them. The problem is similar to that caused by shared state variables, although in that case it is clearly against logic to deactivate a component that would otherwise have computed the value of a state variable used by another component. Not all combinations of active components are feasible, also for other reasons (for instance, all target components of active components must themselves be active), and it is necessary to exercise one's common sense when selecting a combination. If arguments are shared, expanding the class may be done with much less caution than changing the class in other ways.

2.2.2 Graphic Representations

The components in MoCaVa are defined to include also the information needed to establish any of the possible connection patterns. The actual pattern is created when the participating components are activated. Thus, the system is defined without explicit reference to a graph. However, in order to give some feedback to the model maker, MoCaVa generates a graph automatically from the component specifications and the activity index. The main motivation for this, as it would seem, 'backwards' way of defining connections is that, as argued above, it simplifies the connection process for the user. It also allows the construction of different types of graphs depending on the user's preference. In particular, the user may shape the graph to distinguish between the three ways of refining a model, viz. by i) adding a physical unit feeding flows of matter or energy into a target, ii) adding a control unit entering signals to change the response of the target, and iii) adding a model of a physical phenomenon inside the target. Since the graphics generator by default enters Feed input from the right and Control input from below, a series of units will be displayed as an array of boxes, with boxes representing control equipment appended below. This makes the graph look more like the upper graph in Figure 2.3. A source cell representing an internal refinement is displayed as a box within the target box. Thus, refinement in several steps may result in 'Chinese boxes' with several layers, including subsystems of connected boxes within boxes. This too is visually informative; a more refined unit also appears more complex.

The purpose of the construction is to facilitate the user's understanding of how the various components may interact, and to corroborate this with the assumed operations in various parts of the actual production process. The point is that the classification should appeal to intuition, and therefore be quite easy to do in normal cases. If one wants a graphical representation of the model, and many do, then the block diagrams representing the physical object and the model should look alike. That is particularly important in grey-box identification, since it concerns the possibility of entering prior structural information without misinterpreting its effect. However, the attempt in MoCaVa to satisfy this carries a logical restriction: not all systems that can be defined as connections of cells can also be described graphically.
For instance, a cell that has an output signal connected to a Parameter input cannot also have another output connected to a Feed input; it is not possible to draw such a graph. Inadmissible classification results in diagnostic messages, and no graph. (It is also possible that there will be a garbled graph, and no message, since MoCaVa3 might not be able to diagnose all cases of inadmissible combinations of input classifications.) The user may then try a better classification, or else proceed without a graph, since it is still possible to use the model for identification.

Remark 2.22. It follows that the class of models that can be processed by MoCaVa is wider than the class that might have been defined by drawing graphs. However, it is difficult to see any practical usefulness of this observation. And, again, it does not mean that MoCaVa would be able to handle any model class defined by Simulink®. The component specification in MoCaVa accepts only a limited set of M-statements, and it is limited in a way that facilitates prediction and fitting.

Remark 2.23. The setup and display of the graph is done by a separate user interface defined in the file NewDrawGraph.m. It may be suppressed by the user.

Remark 2.24. A negative consequence of the emphasis on easy connection of cells is that the same cell cannot be used in several places. Each cell is basically an "instance", i.e., a description of a particular unit or phenomenon, and it should therefore be known to the user what target parameter(s) it is associated with. For instance, models of technical units of the same kind must be copied and then given individual names for their input, output, and state variables. The deviation in design from what is common in simulation software is motivated by the different requirements: in simulation (and in white-box identification) the modelling of the total object should be easy. In grey-box identification it is even more important to be able to change the total model easily.
Since it is not obvious how to cater to both needs, MoCaVa puts the emphasis on the second one.

Remark 2.25. MoCaVa3 has two options to facilitate reuse of model components: i) a "copy component" option, where the user may copy the equations but change the names of the external arguments, and ii) allowing function calls of the form outputarguments = libfunction(inputarguments) among the component-defining statements. The functions may be either user-defined or standard, as well as static or dynamic.

2.2.2.1 A Simple Example: Cascaded Tanks

With the use of cells, the graph of the two-tank example will look as in Figures 2.5-2.8, depending on how detailed one wants the display. The graphs are grey-scale images of those displayed by MoCaVa3. In all cases pulp0, pulp1, pulp2 represent the three flows in the upper graph of Figure 2.3. They have two properties, flow rate and concentration (of pulp), and are therefore defined as arrays with two elements. The purpose of this is to keep the elements together, which emphasizes in the graph that they belong to the same physical object. Figure 2.5 illustrates a 'root model' consisting of the two tanks plus a rudimentary mechanism for supplying tank1 with pulp. The null hypothesis is that the flow rates f0, f1, f2 and the input concentration c0, as well as the start concentrations c10 and c20, are all constant parameters, possibly to be determined by fitting.
Figure 2.5. Graph of the root model of cascaded tanks
The components are defined as follows:

tank2:
  M-Statements:
    Dc2 = (pulp1(1)*pulp1(2) - f2*c2)/V2
    pulp2(1) = f2
    pulp2(2) = c2
  Initialization:
    c2 = c20
  Argument classification:
    pulp2   Response
    pulp1   Feed
    f2      Control
    c20     Parameter
    V2      Constant
tank1:
  M-Statements:
    Dc1 = (pulp0(1)*pulp0(2) - f1*c1)/V1
    pulp1(1) = f1
    pulp1(2) = c1
  Initialization:
    c1 = c10
  Argument classification:
    pulp0   Feed
    f1      Control
    c10     Parameter
    V1      Constant
feed:
  M-Statements:
    pulp0(1) = f0
    pulp0(2) = c0
  Argument classification:
    f0      Control
    c0      Parameter
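The root model above is complete enough to simulate. The sketch below integrates the two tank components by forward Euler; the parameter values and the step size are illustrative assumptions, not taken from the book's case study.

```python
def simulate_root_model(n_steps=2000, dt=0.01,
                        f0=1.0, f1=1.0, f2=1.0, c0=0.8,
                        c10=0.0, c20=0.0, V1=1.0, V2=1.0):
    """Sketch of the cascaded-tanks root model (feed, tank1, tank2),
    integrated by forward Euler with assumed numerical values.
    """
    c1, c2 = c10, c20                     # Initialization: c1 = c10, c2 = c20
    for _ in range(n_steps):
        pulp0 = (f0, c0)                  # feed:  pulp0(1) = f0, pulp0(2) = c0
        dc1 = (pulp0[0] * pulp0[1] - f1 * c1) / V1   # tank1
        pulp1 = (f1, c1)                  # tank1: pulp1(1) = f1, pulp1(2) = c1
        dc2 = (pulp1[0] * pulp1[1] - f2 * c2) / V2   # tank2
        c1 += dt * dc1
        c2 += dt * dc2
    return c1, c2

c1, c2 = simulate_root_model()
# With equal flow rates, both tank concentrations approach the inlet
# concentration c0.
```

Note how the "parameters" pulp0(1) and pulp0(2) of tank1 are supplied by the feed component, illustrating the rule that activating a source component turns a parameter into a signal.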
Figure 2.6 shows a slightly more detailed graph, where three rudimentary pump models have been added. The tank2, tank1, and feed components are unchanged. The null hypothesis is that all pumps are perfect (actual output equals reference input).
Figure 2.6. A more detailed graph of the root model
pump0 (similar for pump1 and pump2):
  M-Statements:
    f0 = fr0
  Argument classification:
    fr0     Control
The presence of the new components obviously does not change the model's response. They were added for two reasons: i) to show how graphs using both feed and control input will look, and ii) to create 'stub' components for possible refinement, should the reference inputs fr0, fr1, fr2 not be constant and the pumps not be perfect. Figure 2.7 shows the case where the reference inputs of all flows are entered from a data file. This changes only the specification of the input sources of the pump models.
Figure 2.7. The same model with input sources assigned
2 The MoCaVa solution
41
Input source model:
  fr0     Hold
  fr1     Hold
  fr2     Hold

Input data assignment:
  fr0     fd0
  fr1     fd1
  fr2     fd2
Each pump model now has three arguments for speed. The distinction between them is the following: fd0 is the name of the recorded speed of pump0 in the data file, fr0 is the name of the (stepwise constant) continuous-time reference variable from the Hold function, and f0 is the signal from the pump model into the feed unit. The distinction becomes important whenever the pump or the DAC conversion is not perfect. Figure 2.8 shows the case where one wants to emphasize that the modelling of the pumps is to be regarded as a refinement of the physical units they belong to. The following has been changed:

Argument classification:
  f2      Parameter
  f1      Parameter
  f0      Parameter
Figure 2.8. The same model, displayed differently
Notice that the model structures corresponding to the graphs in Figures 2.7 and 2.8 are the same; only the layouts of the graphs differ. In contrast, the graphs in Figures 2.5 and 2.6 correspond to slightly different models; 2.6 has three arguments more, even if the models are equivalent in the sense that they have the same responses.
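The Hold source model used above converts the recorded samples fd0 into the stepwise-constant reference fr0. The text's "stepwise constant" suggests a zero-order hold, which the following sketch assumes; the sample times and values are invented.

```python
import bisect

def hold(times, values):
    """Sketch of a 'Hold' input-source model: convert discrete-time data
    (e.g. the recorded pump speed fd0) into a stepwise-constant
    continuous-time reference (fr0). Assumes a zero-order hold, as the
    text's 'stepwise constant' description suggests.
    """
    def fr(t):
        # index of the latest sample time not exceeding t
        i = bisect.bisect_right(times, t) - 1
        return values[max(i, 0)]
    return fr

# Recorded samples fd0 at t = 0, 10, 20:
fr0 = hold([0.0, 10.0, 20.0], [1.0, 1.5, 1.2])
fr0(4.7)    # -> 1.0 (holds the sample taken at t = 0)
```

A different menu choice (e.g. linear interpolation) would replace only the body of fr, which is exactly the substitution the interpolation menu performs.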
2.3 Prior Knowledge

Grey-box identification hinges on prior knowledge as well as on data, and is necessarily interactive. And any interactive program is necessarily restrictive concerning the kind of prior knowledge it can receive. MoCaVa expects three types of prior information, viz. facts, hypotheses, and credibility ranking. The facts are values of constants and well-established natural laws. Facts and hypotheses involve attributes of arguments as well as relations between them. They are used directly by the computer to construct the component library. The credibility ranking is used mainly by the human operator to determine in which combinations, and in which order, to test the various hypotheses. The computer uses it to build a structure from the selected components.
Remark 2.26. The building of a model resembles an evolutionary process in the sense that only successful systems survive. A difference, however, is that the 'mutations' are not entirely random, but partially guided by the model maker. In this process the credibility ranking plays a key role in reducing the large number of unsuccessful trials that nature can afford but an engineer cannot.

2.3.1 Hypotheses

The hypotheses determine structural and argument attributes. They state how arguments are (hypothetically) related, and what one may know or guess about their likely values.

2.3.1.1 Structural Attributes

Structural attributes are relations between defined arguments, argument class and dimension, and interface specifications to data and environment. Relations between arguments are entered as assignment statements (see Section 2.2.1). They are used by MoCaVa to create C source files for inclusion by the linker in the automatic generation of executable tasks for simulation and loss evaluation. The argument classes have been defined above. The classes are used, together with the argument dimensions, to build an interface between the user-named arguments and the generic identification toolbox IdKit. The interface specifications apply to arguments classified as Response, Feed, Control, or Disturbance: all except Control require specification of whether a sensor is attached or not. The Feed and Control classifications require specification of the source of the input and, if that is a data file, also of what standard routine to use for conversion from discrete-time input data to continuous-time stimulus. This includes interpolation between the sampling times, as well as possible filtering of contaminated data. The Disturbance classification requires specification of what standard routine to use for modelling the source of the random disturbance.

Remark 2.27.
Notice that, in addition to the Response output, the Feed and Disturbance inputs may or may not have sensors attached to them. This allows the modelling of the case of measured random input, as well as the case of 'input noise' (see Section 2.3.5).

2.3.1.2 Argument Attributes

Argument attributes are scale, nominal value, and range. The scales determine the default layout in plots, as well as the sizes of the small increments used in numerical differentiation. The scales of fitted parameters also have a third effect: they are used by IdKit to put weights on the deviation of an estimated parameter from its origin (see Section 2.4.1.2). The origin is either the nominal value or a previously fitted value. The weights function as soft barriers implementing the prior knowledge that parameters should not be much outside the range set by the scale, unless data says otherwise, and so strongly that the weights become overpowered. The ranges set hard boundaries for parameters. They are necessary for such parameters as must be bounded in order to make the model stable, or are input arguments of functions with a limited domain of definition (such as sqrt), and would cause run-time errors otherwise. Explicit ranges apply only to parameters. If necessary, other arguments must be limited in the definition statements, for instance using if statements for hard boundaries (e.g., overflow), or atan for soft boundaries (e.g., saturation).
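The "soft barrier" role of scales can be sketched as a scaled quadratic penalty added to the data loss. IdKit's actual weighting (Section 2.4.1.2) may differ; this is an illustration of the idea only, with invented numbers.

```python
def penalized_loss(data_loss, theta, origin, scales, weight=1.0):
    """Illustrative 'soft barrier' on fitted parameters: the data loss
    is augmented with a quadratic penalty on the scaled deviation of
    each parameter from its origin (the nominal value or a previously
    fitted value). A sketch, not IdKit's exact formula.
    """
    penalty = sum(((t - o) / s) ** 2
                  for t, o, s in zip(theta, origin, scales))
    return data_loss(theta) + weight * penalty

# Because deviations are scaled, drifting one 'scale' away from the
# origin costs the same penalty for a tiny and for a huge parameter:
flat = lambda th: 0.0
small = penalized_loss(flat, [1e-4 + 1e-4], [1e-4], [1e-4])
large = penalized_loss(flat, [1e4 + 1e4], [1e4], [1e4])
```

Unless the data term grows fast enough to overpower the penalty, the estimate is kept near the range set by the scale, which is the behaviour the text describes.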
The nominal values are the prior values of constants and of such parameters as are not fitted to data. Otherwise they serve as start values in the search.

2.3.2 Credibility Ranking

MoCaVa is designed to take into account that prior information is more or less reliable, as well as more or less relevant. The design distinguishes between structural uncertainty and parameter uncertainty: a hypothetical relation may or may not hold and, independently of this, may or may not be significant for the calibration loss. Similarly, the prior (nominal) value of a parameter may or may not be accurate, as well as significant or insignificant. The most reliable items are the facts, such as natural laws and models that build on first principles. Next come well-tested heuristic relations between well-defined variables, the element of uncertainty being a possible influence from the environment and the range of validity of the models. After that comes a range of hypothetical relations, from those based on engineering common sense about which physical phenomena should be most important, at one end, to just a hunch that there might well be a relation, although unlikely to be significant, at the other.

The prior uncertainties affect the testing order, and it is partially the model designer's responsibility to decide on that order (partially, because several hypotheses may be tried in parallel). Thus, within each hypothetical model class the effects of free parameters are tested first. When the freedom of the parameters has been exhausted, the class is expanded by augmenting it with new hypothetical relations. The idea is that if a model designer has some basis for ordering his or her prior facts and hypotheses by degree of credibility, at least partially, this can be exploited to reduce the calibration task. Now, the rationales for ordering the structural uncertainties and the argument uncertainties into 'credibility rankings' are different.
It is reasonable to test the structural hypotheses in the following order: 1) significant & reliable, 2) significant & unreliable, 3) insignificant & reliable, 4) insignificant & unreliable. In contrast, the parameters are reasonably estimated in the order 1) significant & unreliable, 2) insignificant & unreliable, 3) significant & reliable, 4) insignificant & reliable. The recommendation is self-evident as well as vague; the point of making it here is to emphasize that the degree of accuracy and reliability of prior knowledge is itself prior knowledge, and to point out how such knowledge can be used in structuring the model.

2.3.3 Model Classes with Inherent Conservation Law

The following is a way to exploit the fact that matter and energy are preserved during transformation. If the conservation law is stated in a separate component, that component may be retained in all models built by augmenting other components based on less reliable relations. The following is an example of such a construction.

2.3.3.1 Example: Chemical Reactor Models. It is a known fact that the mass fractions of the different atoms in the constituents of a chemical reactor are preserved during reactions. What is usually less well known are the reaction rates. This suggests that one try model classes of the forms in Figures 2.9-2.11. The 'root model' contains only the balance equations, where f are the flows of constituents in and out of the reactor, m is a vector of constituents currently in the reactor,
Practical Grey-box Process Identification
[Figure 2.9. Root model with conservation law. Block diagram: inflows f_in and constant reaction rates r enter the balance equations ṁ = Cr, f_out = f_in − Cr; the flows f = (f_in, f_out) pass through a sensor y = f + σw with noise level σ.]
[Figure 2.10. Augmenting a reaction-rate model. Same block diagram as Figure 2.9, with the constant r replaced by the hypothetical relation r = R(m, f_in, p).]
and r is a vector of reaction rates. The matrix C is constant and known (provided one knows what reactions are going on in the reactor). The flows in and out of the reactor are measured, and a sensor model adds white noise with rms-values σ.

The first hypothesis is that the reaction rates are constant. Start therefore by fitting y to data in order to estimate as many of the reaction rates in r as the data permit. This is done by freeing one more parameter for each fitting task, either selected a priori, or else determined by trying several alternatives in parallel. In order to test the null hypothesis that the reactions are in steady state (r is constant), it is necessary to expand the model with a component containing a hypothesis of what might cause r to vary. Let r = R(m, f_in, p) be a relation modelling this, i.e., the reaction rates may depend on the inflow, the concentrations in the reactor, and known or unknown parameters p (Figure 2.10). It is conceivable that this will have to be expanded further, for instance by a temperature model, but the point is that all expansions will conserve the balance of matter.

It may also be that the reactions are too complex for modelling, for instance in combustion processes. In such a case one may have to rely on a black-box Disturbance model for the expansion, in order to test the null hypothesis that the reactions are steady-state processes. The simplest black-box model with no known input is a Brownian motion ṙ = λw (Figure 2.11), but a number of input-dependent models are also conceivable.

2.3.4 Modelling 'Actuators'

The generic actuator model in Equations 2.3 and 2.4,

    dx_u(t)/dt = G_u[x_u(t), u_d(τ), u_d(τ+1), p]
    u(t) = Z_u[x_u(t), u_d(τ), p],   t ∈ [t_τ, t_{τ+1})

defines an interpolation function between the discrete-time data u_d(τ) and u_d(τ+1). Hence, the model depends on the process that causes the input u(t) to vary in between, and must be determined a priori.
The physical conditions are different for Control and Feed input:
[Figure 2.11. Augmenting a 'black-box' reaction-rate model. Same block diagram as Figure 2.9, with the constant r replaced by a Brownian-motion disturbance v̇ = λw, r = v, with drift rate λ.]
2.3.4.1 Control Input

The typical case is digital control; the input u(t) is constant between the discrete points t_τ, and the only thing to decide is whether it should be continuous to the right, u(t) = u_d(τ), or to the left, u(t) = u_d(τ+1). Now, a natural sequence of events in digital control is the following: first, the control routine receives the measurements y(t_τ), computes a correction u_d(τ), and sends it to the actuator; then the logging routine sends the record y(t_τ), u_d(τ) to the file. This suggests continuity to the right ('hold'), i.e., u(t) = u_d(τ).

Remark 2.28. Notice that if the response to control is immediate, for instance y(t) = g u(t), this holds for t_τ < t < t_{τ+1}, but not for t = t_τ. No control loop can have zero response time in both the forward and the backward paths.

A conceivable alternative sequence of events may occur if there are separate and unsynchronized routines for data logging and control: first, the logging routine sends the record y(t_τ), u(t_τ) to the file; then the control routine retrieves the measurement y(t_τ), computes the next set point u(t_{τ+1}), and sends it to the actuator. This would suggest continuity to the left, u(t) = u_d(τ+1). In the worst case the sequencing may even change with time, depending on the priority rules in the process-control system. However, if the routines do not execute during a common cycle, then it is not likely that the data logging will occur in the narrow interval where the independent control routine executes and u(t) ≠ u_d(τ). Hence, even then u(t) = u_d(τ) is to be preferred.

If the actuator does not have a negligible response time, its model must be modified. MoCaVa offers a number of alternatives to model this case.

2.3.4.2 Feed Input

In the typical case nothing is known about the source of the input beyond the data sequence, which may also include measurement errors. (If more is known, it should be described in an augmented component.)
The interpolation will obviously have to be an approximation, and the choice of interpolation rule depends on the effect of the approximation error on the model response. When the object responds fast to input (compared with the time quantum), an instantaneous model will often be adequate; the only value of interest is then u(t_τ), and it is irrelevant what happens between the discrete times. Hence the simplest assumption of a constant u(t) will do, but this time one will have to decide on two items: i) continuity
to the right or to the left, and ii) how to estimate the 'true' input. Leaving the second problem aside for the moment, the deciding point for the first item is that in a noise-free case one would like to have u(t_τ) = u_d(τ), and this implies 'hold' again. The estimation problem remains, however.

Also when the object responds slowly it is often irrelevant what happens between the discrete times. However, if the 'true' input also changes slowly, a 'hold' model would generate a staircase function, which would cause a systematic delay. In such cases it seems better to interpolate linearly between the discrete points (if no more is known about the origin of the input data). The actuator model for this case will be

    dx_u(t)/dt = [u_d(τ+1) − u_d(τ)]/h                  (2.16)
    u(t) = x_u(t),   t ∈ [t_τ, t_{τ+1})                 (2.17)
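The 'hold' rule and the linear interpolation of Equations 2.16-2.17 can be sketched as follows; this is illustrative code of my own, not MoCaVa's library routines:

```python
def hold(u_d, tau, t, t_grid):
    """Stepwise-constant 'hold': u(t) = u_d(tau) on [t_tau, t_tau+1).
    The arguments t and t_grid are unused but kept for symmetry."""
    return u_d[tau]

def linear(u_d, tau, t, t_grid):
    """Linear interpolation: the solution of Equations 2.16-2.17,
    dx_u/dt = [u_d(tau+1) - u_d(tau)]/h with u = x_u."""
    h = t_grid[tau + 1] - t_grid[tau]
    return u_d[tau] + (u_d[tau + 1] - u_d[tau]) * (t - t_grid[tau]) / h

# Between the first two samples the hold rule keeps the old value,
# while the linear rule moves steadily toward the next one:
t_grid = [0.0, 1.0, 2.0]
u_d = [0.0, 2.0, 2.0]
```

Note how the hold rule reproduces the systematic delay mentioned above: halfway through the interval it still returns the old sample, while the linear rule is already at the midpoint.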
However, this requires an additional state variable, which may become costly if the inputs are many.

The remaining case is when the response of the object is such that it does matter what happens with the input between sampling points. The remaining option is then to use 'black-box' models for interpolation. MoCaVa3 has only one model that can be used for the purpose: an approximate delay function with unknown delay time. When input data are noisy enough for this to be significant, there is usually not much point in asking what happens between the sampling instants, and again a 'hold' function will be the answer. It follows also that if low-pass filtering of the input is the answer, the filter should be digital, i.e., the hybrid actuator will consist of a digital filter followed by a 'hold' function. However, the question of finding an estimate is much more difficult to answer, and it is not even clear that the answer is filtering. It will be the topic of the next section.

Remark 2.29. The library routines available in MoCaVa3 for interpolation of Control input and interpolation and filtering of Feed input are defined in Section 2.3.6.

2.3.5 Modelling 'Input Noise'

The following somewhat lengthy discussion reflects on some of the intricacies of the problem of modelling unknown input to industrial processes. Some points may be repeated in several places in other words, and that is because I believe in the salutary effect of redundancy when transmitting through as imprecise a channel as natural language. However, using mathematics will not help to clarify things either, since the problem lies in its interface to the physical world.

Theoretically, MoCaVa requires that the input to the 'actuator' be constant between sampling points, and known exactly. The requirement is typically satisfied when the input is actually generated by a computer and entered as 'set points' to the actuator.
More often, however, the input to the process is known only partially, by sampling a continuous-time input originating in another process that is not to be modelled. This creates the problem of reconstructing the continuous stimulus from discrete data, which, in addition, may be recorded with measurement errors. The two classifications of Control and Feed distinguish between the cases.

The difficult case is when the input is not known exactly, but measured with errors. This is known as 'the problem of input noise' and is fundamental in identification from 'historical' or 'operating' data, rather than from 'controlled experiments' (Bohlin, 1987). The actual input is unknown, and this, in turn, requires additional
models of the processes that caused the actual input to vary. If these processes are known and their set points have been recorded, this is a case of controlled experiment, and the set points are the input (known exactly). If the input processes are not known, then their models are part of the system to be identified, but the input (set points) to the whole model of object and input process is still known. However, if no set points have been recorded, there remains only the possibility of providing no-input models for the sources of their variation, and if the latter are unknown too, the possibility of hypothesizing a black-box model for the input behavior. The same holds for other unknown input, not directly associated with measurements, but whose effects pass through the process proper before any measured output possibly responds.

In order to formulate and solve the problem of 'input noise' it is necessary to assume something about the source of the 'true' input. The following are some feasible assumptions:

- The input is band-limited: If the input does not contain frequencies above the Nyquist frequency 1/2h, then according to Shannon's sampling theorem it is theoretically feasible to reconstruct the continuous input exactly from the samples (Kuo, 1992). This is an intuitively rather surprising conclusion, but it may on the other hand also be interpreted as an indication that the assumption is unlikely to hold in practice. The option is not supported in MoCaVa.

- The input follows one of a number of interpolation formulas: If the formulas contain unknown parameters, they will be able to shape various behavior patterns between the sampling points (including delayed action), but the assumption is still that the pattern is unchanged throughout the sample.
A more severe limitation is that this does not solve the problem of 'input noise', since the continuous input and the discrete data will always agree at the sampling points, noise or no noise. However, MoCaVa supports a small number of interpolation formulas for low-noise cases.

- The input is stochastic with a given structure: This may not seem a quite satisfactory assumption either, since the input is probably generated by a process that does not behave like any of the stochastic models available. However, assuming no further knowledge of the source, there is no option left but to assume a black-box model with no input, either as a smooth function of time or as a stochastic model.

When appraising the a priori credibility of the various assumptions, it would appear that one is as credible or dubious as the other. But the advantage of the stochastic assumption is that it offers a way to deal with the problem of measurement errors. A theoretically correct way of dealing with the problem of 'input noise' (assuming a stochastic input) is to model the variables in agreement with what they really are, i.e., the 'true' input as unknown input, and the data as the known output of a sampling sensor; cases without exactly known input have only output data:

    dx_v(t)/dt = G_v[x_v(t), w(τ), p]                   (2.18)
    v(t) = Z_v[x_v(t), p]                               (2.19)
    y(k) = Z_y[v(t_k), w(k), p]                         (2.20)
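As an aside, the way such a stochastic input model lets the calibration estimate the unknown input can be illustrated with a scalar Kalman filter for a random-walk input measured in noise. The model and all numbers below are my own illustrative choices, not taken from MoCaVa:

```python
import numpy as np

def kalman_filter_random_walk(y, q, r, v0=0.0, p0=1.0):
    """Estimate a random-walk 'true' input v from noisy samples y.
    q: drift variance per step, r: measurement-noise variance."""
    v, p = v0, p0
    estimates = []
    for yk in y:
        p = p + q                      # time update: random-walk drift
        k = p / (p + r)                # Kalman gain = effective bandwidth
        v = v + k * (yk - v)           # measurement update (low-pass action)
        p = (1 - k) * p
        estimates.append(v)
    return np.array(estimates)

rng = np.random.default_rng(1)
true_v = np.cumsum(0.05 * rng.standard_normal(200))   # slow random walk
y = true_v + 0.5 * rng.standard_normal(200)           # noisy samples of it
v_hat = kalman_filter_random_walk(y, q=0.05**2, r=0.5**2)
```

The estimate v_hat follows the slow drift while suppressing most of the measurement noise, which is exactly the 'optimal low-pass filter' behaviour described in the text.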
The quality that makes this work, in spite of the dubious assumptions behind it, is the versatility of the stochastic model. The calibration includes an 'inversion' operation on the model; given output data, it in effect estimates the actual unknown input, in addition to the unknown parameters. In essence, the predictor generated from it will function as an optimal low-pass filter for the measurements, and the estimated stimulus will follow the actual measurements as well as the estimated bandwidth of the stimulus and the noise level will allow. And that is so independently of what process actually caused the 'true' input.

Thus, there are three kinds of unknown input ('disturbances'), and one may have to provide descriptions for all:

- Measurement errors (added to the output of the object)
- Measured input (unknown input to be included in the object model, in addition to known input)
- Disturbance input (other unknown input to the object, or input generated by an unmodelled process within the object)

The difference between the second and third kinds of input is that one but not the other will be available when the model is later put to use, while neither is available for the calibration.

Remark 2.30. Notice that even if the intended use of the model does not allow the inclusion of disturbance input (e.g., open-loop prediction), the disturbances were still there when the experiment data were recorded, and may therefore play a role in the calibration process. The idea is to model and calibrate the whole system of disturbance input, measurement process, and process proper, but keep only the latter for the open-loop prediction purpose.

Remark 2.31. In closed-loop prediction the disturbance models also play a role (they are important when deriving a predictor), in addition to the model of the process proper. The difference is that in closed-loop prediction measurements of both input and output variables are available, in addition to the reference input available to an open-loop predictor. The closed-loop predictor will generally be able to use these response values to estimate some of the unknown input, and thus improve the prediction. The improvement does of course depend on the prediction horizon, which in practice is determined by the sampling rate and by transport and measurement delays.

2.3.5.1 Time-saving Shortcuts

The theoretically correct modelling has a practical drawback: output variables cost much more computing than input variables. It may therefore be necessary to compromise.
The observation that the predictors function as input filters gives a clue to how to do that: replace the input with low-pass filtered values of the measurements. If the bandwidth of the input filter is unknown, include it as an unknown parameter among those fitted to data. This reduces the number of outputs and increases the number of inputs.

If one wants more security (at the cost of more computing) there is also the possibility of combining an input filter with a measurement of its output. In this way the same data are used twice: once as input data for creating an estimate of the unknown input, and once again as output data for fitting the input filter to data. This double use of the same data is not in accordance with any theory, but it is intuitively plausible: estimating the input by filtering input data combined with feedback from measurements should not reasonably be inferior to estimating it via feedback from measurements only. The alternative replaces a stochastic input model with a deterministic one, which is less costly.

It is also possible to make a further shortcut by first determining optimal filters for estimating individual inputs, by fitting to their measurements, without involving any of the process model. This would lose information only in the unlikely event that other outputs hold more information about the input than the direct measurement of it. Once the filters have been fitted, the calibration may proceed based only on the filtered input, without the costly 'double' use of the input data.

In summary, the following are feasible ways to model a noisy input in MoCaVa:
- Classify as Disturbance, and assign a sensor. Fit source model and input sequence to output and input data: theoretically correct, but costly.

- Classify as Feed and assign a data source. Fit source model to output data: faster, but more hazardous.

- Classify as Feed and assign both a data source and a sensor. Fit source model to output and input data: intuitively better and less costly than the theoretically correct alternative.

- Classify as Feed and assign both a data source and a sensor. Fit source model to input data only: the fastest alternative when the model is large. However, the loss of accuracy may or may not be negligible, depending on the process.

2.3.6 Standard I/O Interface Models

MoCaVa3 supports the following input interpolators, filters, and disturbance generators. Formulas and code are given in Section A.8.

2.3.6.1 Interpolators. Interpolators are applicable when measurement errors are small:

- Hold: A stepwise constant function, continuous to the right. It describes an error-free actuator with negligible response time. It is also suitable for objects with responses fast enough to be described by instantaneous (algebraic) models.

- Linear: Linear interpolation is suitable when the source of the measured input is unknown and slow-changing.

- FirstOrder: Describes a case of digital control where the actuator's response time is significant. The time constant may be fitted.

- SecOrder: Describes a case of digital control where the actuator may overshoot. Time constant and overshoot may be fitted.

- Delay: This function produces an exact time delay for ramp input. For other input it approximates a time delay, and should therefore be applicable under the same circumstances as the Linear interpolator. It is possible to fit a delay that ranges over several time quanta, as well as one that is a fraction of a quantum.

Only Hold and Linear are true 'interpolators' in the sense that their output agrees with their input at the sampling points. The others may lag behind.

2.3.6.2 Filters. Filters are applicable to contaminated data, where it is less important what happens between sampling points:

- LPFilter: This linear digital low-pass filter may be used when the 'true' input can be expected to change slowly. Its bandwidth may be fitted to data.
- NLFilter: A nonlinear filter for the case when the 'true' input normally changes slowly, but can be expected to take occasional large steps. Linear filters with the narrow bandwidths required to suppress noise do not react with sufficient speed to step changes. The nonlinear filter has a gain that varies with the error amplitude, and thus responds fast to changes that are above a soft threshold. The threshold may be fitted to the average noise level, if that is unknown.

The standard models are admissible for different input classes. Control input may use Hold, FirstOrder, and SecOrder; they all assume a step input with no error. Feed input may use Linear, Delay, LPFilter, NLFilter, and Hold. The last alternative is allowed since it is less costly, and may be good enough in many cases.
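The LPFilter and NLFilter ideas can be sketched as follows. The gain law and parameter names below are my assumptions, simple stand-ins for the exact library formulas given in Section A.8:

```python
def lp_filter(y, gain):
    """First-order digital low-pass filter (the LPFilter idea):
    x(k) = x(k-1) + g * [y(k) - x(k-1)], with 0 < g <= 1 setting the bandwidth."""
    x, out = y[0], []
    for yk in y:
        x = x + gain * (yk - x)
        out.append(x)
    return out

def nl_filter(y, g_min=0.05, g_max=1.0, threshold=1.0):
    """Nonlinear filter (the NLFilter idea): the gain rises smoothly with the
    error amplitude, so large steps pass fast while small noise is smoothed."""
    x, out = y[0], []
    for yk in y:
        e = yk - x
        s = (e / threshold) ** 2
        g = g_min + (g_max - g_min) * s / (1.0 + s)   # soft threshold on |e|
        x = x + g * e
        out.append(x)
    return out

# A large step buried in small noise: the nonlinear filter reaches the new
# level almost at once, while the narrow linear filter lags far behind.
data = [0.0, 0.1, -0.1, 10.0, 10.0, 10.0]
```

This exhibits exactly the trade-off described above: the linear filter must choose between noise suppression and step response, while the error-dependent gain gives the nonlinear filter both.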
Remark 2.32. The combination of assigning a sensor to Feed input and selecting one of the input models Linear and Delay is not admissible. Since the Linear interpolator agrees with the data at the sampling points, there is no point in trying to fit the interpolator output to data. Neither is there a point in fitting the output of Delay; that would always yield a zero delay estimate.

Remark 2.33. Both noise-suppressing filters are purely discrete-time, and produce stepwise constant output. When data are noisy, there is not much point in assuming some more sophisticated interpolation between end points that are uncertain anyhow. It also saves the overhead of the continuous-time filter states that would otherwise have been required. Continuous-time states (used in FirstOrder, SecOrder, and Linear) are numerically costly, in contrast to the discrete states in the digital filters.

Remark 2.34. It is in the nature of shortcuts that they are faster, more risky, and require some knowledge of the 'geography'. The shortcut of using noise-suppressing filters is no exception. For instance, the filters may not work well for input data that are sampled irregularly or much more sparsely than the time quantum h. In that case MoCaVa computes the u_d(τ)-sequence in Equation 2.4 by linear interpolation over the intervals of missing data d_u(k), which means that the sequence being filtered loses some of the character of 'slow signal plus noise' that the filter is designed for. This means that if one has to reduce the quantum for some reason, the filter parameters cannot be expected to remain unchanged. A safer (and more cumbersome) way of dealing with noisy data from sparse sampling is to model them as output data.

Remark 2.35. A reasonable modelling strategy is to ignore the measurement errors to start with (select Hold), and later try better input models to see if that helps.
Often the effects of high frequencies in upstream input are filtered naturally by the passage through several downstream units, and do not show up in the output measured downstream. Hence spurious input into the model (from poor filtering of noisy data) will be filtered out by the model, and cause no harm.

Remark 2.36. Usually, input filtering is a modelling task for which there is little prior knowledge, and one may have to try several filters. If each filter is given its own component, it will be easy to change filters. It would be just as easy to instead edit the component that is the target for the input; however, in that case MoCaVa will no longer be able to keep a correct log of the hypothetical structures and how they score.

2.3.6.3 Disturbance Models. MoCaVa3 supports three stochastic disturbance models: Brownian, Lowpass, and Bandpass. In order to select one of them it helps to have at least an idea of the general character of the actual physical disturbance. If nothing is known about its source, and no plotted data are available, one may have to try several models. Try the simplest model, the Brownian, first, and take a look at the estimated disturbance that results from simulating the fitted model. This may reveal whether the disturbance has any of the general characteristics assumed by the alternative standard models. All library models are dominated by low frequencies (compared with the Nyquist frequency 1/2h).

- The Brownian model accumulates Gaussian random numbers, and makes a linear interpolation between the discrete values in order to create a continuous-time approximation of Brownian motion. It has one characteristic parameter, the average drift rate. Its general appearance is random drift with no given direction and no attraction to a 'zero level'. It is a very robust disturbance model, able to describe most irregular low-frequency variation, including infrequent large random steps.
- The Lowpass model makes a stepwise constant function from Gaussian random numbers and then applies a first-order linear filter. The result varies randomly around zero, and has little power at frequencies above its bandwidth. It has two characteristic parameters, viz. bandwidth and rms-value. The bandwidth must be well below the Nyquist frequency. It is suitable for modelling low-frequency disturbances with the same general appearance throughout the sample, without a pronounced periodicity.

- The Bandpass model makes a stepwise constant function from Gaussian random numbers and then applies a second-order linear filter with one pair of complex poles. It has three characteristic parameters, viz. rms-value, frequency, and bandwidth. Its general appearance is random variation with a more or less pronounced dominating frequency, which must be well below the Nyquist frequency. It is suitable for modelling waves, effects of slow and poorly damped control loops, limit cycles, and clock-dependent environmental effects. It would also be able to model vibration (for instance in a rolling mill), if the time quantum can be made small enough.
2.4 Fitting and Falsification

The outcome of the test procedure decides whether or not a model structure is good enough to describe the data. With the requirement that the test be maximally efficient, i.e., have maximum probability of rejecting a wrong hypothesis, this also determines the loss function to be used for fitting! The answer is to use the Likelihood-Ratio test for falsification and the Maximum-Likelihood criterion for fitting (Bohlin, 1991a). The following is a list of assumptions and restrictions leading to this result (see Bohlin, 1991a or 1994b for an analysis):

1) Popper's principle of scientific discovery. This is the basis of the calibration procedure described in Section 1.3.1.

2) Statistical decision theory. Assuming that uncertainty may be described using probabilities makes it possible to define 'risk' and to formulate the problem of finding the test procedure with the smallest risk of accepting a wrong hypothesis, i.e., an inadequate model structure.

3) Nested structures. The problem of minimizing risk can be solved provided one has an alternative, wider model structure to compare with the one being tested, and that structure also contains the one being tested. This is the basis for 'refining' the model structure by 'expanding' it in the way described in Section 1.3.2, "How to specify a model set".

4) Parametric models. This assumption basically serves to reduce the unknown elements in a model to a finite number. (Even in 'non-parametric' methods it is possible to estimate only a finite number of parameters, characterizing, for instance, an unknown frequency distribution.)

5) Long samples. This makes it possible to apply the 'law of large numbers' and the 'central limit theorem' of statistical theory to derive an explicit form of the likelihood function appearing in both the test criterion and the fitting loss.

The two optimality criteria will be:

    Fitting: M_ν = arg min_M Q(M, d | M ∈ F_ν)          (2.21)
where F_ν is the tentative model structure, d is the data sample, and M_ν is the best model within this structure.
    Testing: If Q(M_ν, d) − Q(M_ν′, d) > χ²(|ν′| − |ν|, β)/2          (2.22)
             for some M_ν′ ∈ F_ν′ with ν′ ⊇ ν, then reject the model M_ν
where χ²(r, β) is the chi-square variable for r degrees of freedom and risk β. Notice that the same loss function Q appears in both criteria. And since the function has a given form, derived from quite general optimality criteria, this eliminates two conventional user tasks, viz. selecting a loss function (including the specification of weighting factors) and selecting a test criterion.

2.4.1 The Loss Function

The common loss function is
    Q(M, d) = ½ Σ_{k=1}^{N} [ log det R_e(k|M, d)
              + e(k|M, d)ᵀ R_e(k|M, d)⁻¹ e(k|M, d) ]            (2.23)
where e(k|M, d) are the 'residuals', i.e., the errors incurred when predicting over the sampling intervals (t_{k−1}, t_k), and R_e(k|M, d) is the covariance matrix of the prediction errors.

Remark 2.37. The form of the loss function is that of the negative logarithm of a multivariate normal distribution, thanks to the 'central limit theorem'. The chi-square threshold in the test is due to a theorem of Cochran (Wilks, 1962).

Remark 2.38. Generally, noisy data require long samples to yield satisfactory model accuracy. If the data sample is not long enough for the central limit theorem to be applicable, then the loss function is not necessarily the best one in the sense of giving the test maximum power. In addition, there is no convenient way to test whether the sample is long enough for the asymptotic loss function to be optimal. However, it is still reasonable to use the function for short data samples, since the criterion still makes sense: it punishes the prediction errors weighted by their estimated accuracies. Thus, large factual errors e contribute heavily to the loss only if both the measurements and the predictions are estimated to be accurate, based on the model (otherwise the error may be more due to chance and therefore less informative). The term log det R_e prevents an optimizer from minimizing the loss simply by reducing the weights R_e⁻¹. Provided that it will still be possible to get an acceptable model out of the calibration procedure, even with a short sample, there is nothing to prevent one from specifying the criterion instead of deriving it from 'higher' principles. See Bohlin (1991a) for a further discussion of this point.

2.4.1.1 Normalized Loss

The function defining Q(M, d) has an absolute value that is difficult to interpret, and that therefore makes little sense to a user. For one thing, it is not independent of the physical units of the sensor output. In fact, the value makes sense only in relation to other models.
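For concreteness, Equation 2.23 can be evaluated directly from the residuals and their covariances; this sketch is my own, not MoCaVa code:

```python
import numpy as np

def likelihood_loss(residuals, covariances):
    """Q(M, d) = 1/2 * sum_k [log det R_e(k) + e(k)^T R_e(k)^-1 e(k)]  (Eq. 2.23).
    residuals: list of error vectors e(k); covariances: list of matrices R_e(k)."""
    q = 0.0
    for e, R in zip(residuals, covariances):
        # slogdet is numerically safer than log(det(R)) for large matrices
        q += np.linalg.slogdet(R)[1] + e @ np.linalg.solve(R, e)
    return 0.5 * q

# Two scalar residuals 1 and 2 with unit variance:
# Q = 1/2 * (0 + 1 + 0 + 4) = 2.5
e = [np.array([1.0]), np.array([2.0])]
R = [np.eye(1), np.eye(1)]
```

Note how the log det term enters with a positive sign: inflating the claimed accuracy (shrinking R_e) is punished, as discussed in Remark 2.38.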
In order to normalize the loss function, define a 'null' model as

    y(t_k) = Λ w_y(k),   Λ = diag[rms(y)]               (2.24)

which can be interpreted as a model that contains no information from the data except the rms-values. The idea is to measure the loss of a model M relative to that of the null model. The latter has the loss

    Q_0 = Σ_i N_i (log Λ_ii + ½)                        (2.25)
where N_i is the number of data from sensor #i. Subtracting this constant does not change the results of fitting or testing, but has the following advantages:

- The normalized loss Q − Q_0 becomes independent of scale.
- It is always negative (everything better than the 'null' model is a gain).
- The value of exp[(Q − Q_0)/N] defines a loss that lies in the range (0,1); it is 1 for the null model and approaches 0 for a perfect model.

2.4.1.2 Weighted Loss

The loss is a function of the unweighted Likelihood function, which assumes that nothing is known a priori about the values of θ. However, the prior knowledge entered by specifying parameter attributes places restrictions on the likely ranges of variation in θ. The relations are given by the parameter maps p = I(o, ν, θ), where p normally has a physical meaning and attributes, while θ are scale-free coordinates. MoCaVa uses the following four mappings, depending on the intervals specified by the user:

- No boundary: p = o + S_p θ
- Lower bound: p = p_min + (o − p_min) exp(βθ), where β = S_p / √[S_p² + (o − p_min)²]
- Upper bound: p = p_max − (p_max − o) exp(−βθ), where β = S_p / √[S_p² + (p_max − o)²]
- Upper and lower bound: p = c_1 − c_2 β/√(1 + β²), where β = θ − α/√(1 − α²), c_1 = (p_max + p_min)/2, c_2 = (p_max − p_min)/2, α = (o − c_1)/c_2

The mappings affect the likely ranges of θ in different ways. In the unbounded case the likely range is defined by the scale S_p only, in the singly bounded cases by the scale, the bound and the origin, and in the doubly bounded case by the two bounds and the origin. However, in all cases it is not likely a priori that θ² ≫ 1. Assume therefore the following a priori weight on the Likelihood function: exp(−α θ²/2), where α is a design parameter. The normalized and weighted loss function will be

Q(M, d) = ½ Σ_{k=1}^{N} [log det R_e(k|M, d) + e(k|M, d)^T R_e(k|M, d)^{−1} e(k|M, d)] − Σ_i N_i (log Λ_ii + ½) + α‖θ_ν‖²/2   (2.26)
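The four parameter maps above can be sketched as follows (a minimal illustration with hypothetical function names, not MoCaVa code; each map sends θ = 0 to the origin o and keeps p inside the specified bounds):

```python
import math

def map_free(theta, o, S_p):
    # No boundary: p = o + S_p * theta
    return o + S_p * theta

def map_lower(theta, o, S_p, p_min):
    # Lower bound: p = p_min + (o - p_min) * exp(beta * theta)
    beta = S_p / math.sqrt(S_p**2 + (o - p_min)**2)
    return p_min + (o - p_min) * math.exp(beta * theta)

def map_upper(theta, o, S_p, p_max):
    # Upper bound: p = p_max - (p_max - o) * exp(-beta * theta)
    beta = S_p / math.sqrt(S_p**2 + (p_max - o)**2)
    return p_max - (p_max - o) * math.exp(-beta * theta)

def map_bounded(theta, o, p_min, p_max):
    # Upper and lower bound: p = c1 - c2 * beta / sqrt(1 + beta^2);
    # requires p_min < o < p_max so that |alpha| < 1.
    c1 = (p_max + p_min) / 2
    c2 = (p_max - p_min) / 2
    alpha = (o - c1) / c2
    beta = theta - alpha / math.sqrt(1 - alpha**2)
    return c1 - c2 * beta / math.sqrt(1 + beta**2)
```

A quick check of the stated property: every map returns p = o at θ = 0, and the bounded maps approach (but never reach) their bounds as θ → ±∞.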
54
Practical Grey−box Process Identification
where M = (C, ν, θ_ν).

Remark 2.39. The weighted loss is also called the MAP (Maximum A Posteriori) criterion, since it maximizes the Likelihood of the θ-value after the data d has been obtained. The maximum a priori value is θ = 0. For long samples, the prior weighting normally has little effect. However, it prevents the estimates from drifting too far, which may otherwise happen when a bounded parameter tends to its boundary.

2.4.2 Nesting and Fair Tests

A tentative structure F_ν and an alternative structure F′_ν′ are 'nested' if F_ν ⊂ F′_ν′. When the condition is satisfied, the maximally efficient Likelihood-Ratio test can be applied. If the classes of the tentative and alternative structures are the same, C′ = C, the condition is simply ν′ > ν, i.e., the model structure is expanded within the class by freeing more parameters. If also the class is expanded, C′ ⊃ C, then nesting may still be possible. It is required that the null model in the alternative set be equivalent to the tentative model, i.e., F′(u_t, w_t, C′, p°′) ≡ F(u_t, w_t, C, p°) for all u_t, w_t. A sufficient condition is that the 'origins' of the parameters in the added component(s) have been set to their 'null' values, i.e., values that make the effect of the component(s) 'null and void'. For instance, a factor in front of a component replacing a zero constant has a zero null value. When the nesting condition is satisfied, and there is more than one alternative, MoCaVa selects a modified ALMP (Asymptotic Locally Most Powerful) test by default (Bohlin, 1978). For models that are linear in the parameters the ALMP test can be interpreted as an LR test, where the search for the best alternative model to be compared with the tentative model has been interrupted after one step (Söderström, 1981). With nonlinear parameter dependence this does not hold, but the test that is implemented is still based on the loss reduction Q(M_ν, d) − Q(M′_ν′, d), and it is still called ALMP.
The ALMP test not only avoids the time-consuming search required by the LR test, but also allows several alternative structures to be tested in parallel. It is not as discriminating as the LR test, except when the difference between structures becomes small. However, that is the case where an efficient test is most important, while large differences only have to be detected well enough by ALMP to surpass the threshold. Even when the nesting condition is satisfied, the user has the option of overruling the default selection of the ALMP test in favour of the LR test. The reason is that ALMP is less safe than LR in nonlinear cases: the statistic is based on a single iteration, which in turn is based on a step length formula that is only partly reliable in nonlinear cases (see Section 2.4.3). There is a possibility that the step will overshoot, that the loss reduction will be negative, and that consequently a wrong model will not be rejected. Hence, a negative loss reduction is an indication to the user to consider switching to the LR test. Null values do not always exist. When they do not, and the tentative model structure is to be tested against an alternative structure in a different model class, then the principle is different: it is to compare the losses of the best models within the two structures, and reject the tentative structure if the alternative has a smaller loss. However, if the alternative is to be preferred, then it must not be more complex (according to the principle of parsimony), meaning that it must not have more free parameters. A comparison must be 'fair'. The test may be interpreted as an LR test with no more degrees of freedom, i.e., with a zero threshold.
2 The MoCaVa solution
55
Remark 2.40. No method is known for computing a test statistic for structures that are not nested and have different numbers of free parameters. When MoCaVa detects such a case, the cause of the 'unfairness' is displayed to the user, who is then given an opportunity to redefine the alternative structure. The simplest way is to reduce the number of free parameters.

Algorithm 2.1. Setting up test

    Set up default test
    Check 'nesting' conditions:
    If nested, then select ALMP test
    Else check 'fairness' conditions:
        If fair, then select LR test
        Else indicate no valid test
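Algorithm 2.1 amounts to a small decision rule; a sketch (function name and return values are mine, not MoCaVa's):

```python
def select_test(nested: bool, fair: bool) -> str:
    # Algorithm 2.1: prefer the ALMP test for nested structures; fall
    # back to the LR test when the comparison is at least 'fair' (equal
    # numbers of free parameters); otherwise no valid test exists.
    if nested:
        return "ALMP"
    if fair:
        return "LR"
    return "none"
```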
2.4.3 Evaluating Loss and its Derivatives

The following forms of the loss function and its derivatives with respect to the coordinates θ in free parameter space are derived in Section A.2:

Q = Σ_{k=1}^{N} [γ^T γ(k) + ε(k)^T ε(k)/2] + α‖θ‖²/2 − Q_0   (2.27)

Q_θ = Σ_{k=1}^{N} [γ^T γ_θ(k) + ε(k)^T ε_θ(k)] + α θ   (2.28)

Q_θθ ≈ Σ_{k=1}^{N} [γ_θ(k)^T γ_θ(k) + ε_θ(k)^T ε_θ(k)] + α I   (2.29)
where R_e(k) = Γ(k) Γ(k)^T, ε(k) = Γ(k)^{−1} e(k), γ(k) = log diag Γ(k), and Γ is lower left triangular. The matrices γ_θ(k) and ε_θ(k) are the gradients of the vectors γ(k) and ε(k) with respect to the vector θ, and γ is a constant vector. The right hand side of the Hessian Q_θθ is a relatively inexpensive approximation of the exact value, since it can be evaluated without also computing the expensive second-order gradients ε_θθ(k), γ_θθ(k) in the full formula (Section A.2). However, it is valid only for long samples, and is a good approximation only when the model is close to a good model. Hence, there is motive for investigating whether it can reasonably be used also throughout the search for a good model. IdKit uses the Hessian for two purposes, viz. i) for computing the accuracy of parameter estimates, and ii) for determining the direction and step length in a Newton-Raphson search for values that make Q_θ = 0. Fortunately (because one can hardly afford to use the exact Hessian), there are some favourable circumstances:

- The parameter accuracy will not be needed until a good model has been found, and then the approximation is valid.
- When the search is far from the minimum, the exact Hessian will be of little use to a search method (Newton-Raphson) that basically assumes a constant Hessian. However, the positive definiteness of the approximate Hessian will ensure that the steps taken always point in the direction of smaller loss values.
- When the search is close to the minimum, the approximate Hessian will provide fast second-order convergence.
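The positive definiteness claimed for the approximate Hessian follows directly from its form in Equation 2.29: it is a sum of Gram matrices plus αI. A minimal numerical illustration (the per-sample gradients here are random stand-ins for what the predictor would actually produce):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.1          # prior weight (design parameter)
n_theta, N = 3, 50   # number of free coordinates and of samples

# Stand-ins for the per-sample gradients gamma_theta(k) and eps_theta(k)
# (2 outputs x n_theta coordinates each), random for illustration only.
gamma_th = rng.normal(size=(N, 2, n_theta))
eps_th = rng.normal(size=(N, 2, n_theta))

# Approximate Hessian, Equation 2.29: sum of Gram matrices plus alpha*I.
Q_thth = alpha * np.eye(n_theta)
for k in range(N):
    Q_thth += gamma_th[k].T @ gamma_th[k] + eps_th[k].T @ eps_th[k]

# Every eigenvalue is at least alpha, so a Newton-Raphson step computed
# from Q_thth always points in a descent direction.
min_eig = np.linalg.eigvalsh(Q_thth).min()
```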
Remark 2.41. Considering how little has been assumed about the model structure so far, I wonder whether these fortunate circumstances are just pure luck, or whether there is some hidden logic behind them, still to be revealed. If, some day, someone explains why it could not have been any other way...

Remark 2.42. Even if the direction of the next step in the search is guaranteed, its length is not. Far from the minimum both the exact and the approximate Hessian may cause step lengths either too short or too long to be of any use. This possibility will require modifications to the basic Newton-Raphson search rule (see Section 2.6).

2.4.4 Predictor

It remains to find a way to evaluate the residuals and their covariances for a given model and data sample. Since the residuals are e(k) = y(t_k) − ŷ(t_k|t_{k−1}), the task is now to find a predictor P(k|M, d_k) → ŷ(t_k|t_{k−1}), R_e(k). IdKit uses a version of the Extended Kalman Filter (Algorithm A.2 in Section A.4) to derive a class of approximate predictors P for the given model class F. They are valid in cases where prediction errors due to random disturbances are small enough to stay in the linear range when linearizing the model locally. There is no restriction on the size of errors caused by large stimuli and a wrong model. In principle, the algorithm predicts between two sampling points by applying a discrete-time Extended Kalman Filter as many times as there are time quanta in the sampling interval (normally once). It also computes the normalized residuals ε and residual variances γ needed to compute the loss and its derivatives in Equations 2.27 to 2.29.

2.4.5 Equivalent Discrete-time Model

The object model in IdKit is obviously restricted by what an EKF can handle, most generally a nonlinear stochastic state-vector model in discrete time, but quasilinearizable with respect to its stochastic elements.
This includes, in principle, equivalent discrete-time models obtained as solutions of ordinary differential equations with stochastic input, describing a sampled continuous-time object. However, in order to increase the efficiency of the algorithms in IdKit it is worthwhile to introduce more restrictions in the modelling, in order to allow more efficient algorithms for classes satisfying the restrictions, provided, of course, they seem acceptable in practice. The structural restrictions introduced in Section A.1 give the compact form

dx = G[x, w(τ), u_d(τ)] dt + E_ω(τ) dω,   t ∈ (t_τ, t_τ + h]   (2.30)
η = Z[x, u_d(τ)],   where η = (z, v, u)   (2.31)
where all except x, ω and η are constant during a quantum interval. Hence, a discrete-time equivalent can be obtained, in principle, by integrating the continuous-time model one quantum to obtain a relation between the initial values x(t_τ), the constant input u_d(τ), w(τ) and the next state x(t_{τ+1}).

Remark 2.43. Only the case of low-frequency disturbances (E_ω = 0, and all disturbance via w) is implemented in MoCaVa3, but Section A.3 shows a way to extend that to high-frequency direct state noise ω.

The sensitivity matrices between the input and output of the continuous-time model (the user's model) and the equivalent discrete-time model play a main role in the predictor. They are matrices of gradients (= derivatives) of all output with respect to
all input variables. They correspond to the 'transition' matrices in linear models. Computing them also takes the main part of the time of evaluating the loss and its derivatives. The continuous-time sensitivities are

G_x(t) = ∂ẋ(t)/∂x(t)   (2.32)
G_w(t) = ∂ẋ(t)/∂w(τ)   (2.33)

and the discrete-time sensitivities

H(τ) = ∂x(τ+1)/∂ẋ(t_τ)   (2.34)
A(τ) = ∂x(τ+1)/∂x(τ)   (2.35)
E(τ) = ∂x(τ+1)/∂w(τ)   (2.36)
C(τ) = ∂z(τ)/∂x(τ)   (2.37)
F(τ) = ∂y(τ)/∂w(τ)   (2.38)
2.4.5.1 Linearization and Sampling

Since a discrete EKF requires both linearization and sampling, there are two options: i) either linearize first and integrate after, or ii) integrate first and then linearize the nonlinear discrete-time equivalent. MoCaVa uses the first alternative, since this opens a very efficient way to handle 'stiff' systems. The linearized and sampled system is derived in Section A.3 (Equations A.35 and A.39):

x_r(τ+1) = x_r(τ) + H(τ) G(τ)   (2.39)
x(τ+1) = A(τ) x(τ) + E(τ) w(τ) + w_ω(τ)   (2.40)

where

A(τ) = exp[G_x(τ) h]   (2.41)
H(τ) = G_x(τ)^{−1} [A(τ) − I]   (2.42)
E(τ) = H(τ) G_w(τ)   (2.43)

The continuous-time sensitivity matrices are computed by numerical differentiation, and there are fast algorithms for evaluating the expressions in Equations 2.41 and 2.42 (Section A.3.3). They allow a wide range of eigenvalues, which normally characterizes 'stiff' systems. The DiscreteModel algorithm is defined in Section A.4.1, and carries out the following operations:

DiscreteModel [x_r(τ), u_d(τ), τ] → x_r(τ+1), η_r(τ), H(τ), A(τ), C(τ), E(τ), F(τ)

It appears at a single place in the Predictor algorithm (A.2), and is the only algorithm that needs access to the user's model. The other statements in Predictor serve to reduce the effects of random disturbances.
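Equations 2.41–2.43 can be reproduced with a general-purpose matrix exponential (a sketch, not MoCaVa's fast special-purpose algorithms of Section A.3.3; the function name is mine, and G_x is assumed nonsingular):

```python
import numpy as np
from scipy.linalg import expm

def sample_linearized(G_x, G_w, h):
    """Discrete-time sensitivities per Equations 2.41-2.43:
    A = exp(G_x h), H = G_x^{-1}(A - I), E = H G_w."""
    n = G_x.shape[0]
    A = expm(G_x * h)                          # Equation 2.41
    H = np.linalg.solve(G_x, A - np.eye(n))    # Equation 2.42 (G_x nonsingular)
    E = H @ G_w                                # Equation 2.43
    return A, H, E
```

For a scalar system ẋ = g x + b w this reduces to the familiar A = e^{gh} and H = (e^{gh} − 1)/g, which gives a quick sanity check.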
2.5 Performance Optimization

Basically, the sensitivity matrices H(τ), A(τ), C(τ), E(τ), F(τ) must be evaluated for each time quantum, each time requiring the evaluation of the user's model dim(x) +
dim(w) + 1 times. And this has to be repeated dim(θ) + 1 times to yield one step in the search for the minimum loss. Since running the model tends to dominate the computing in cases with large models, an investigation of whether it would be possible to reduce the number of accesses to the user's model seems well motivated. The prospects would also seem favourable: for instance, it is not likely that all the dynamic properties of a process would actually change at all times. Three possibilities of making shortcuts have been investigated, exploiting three expected sparsity properties of industrial production processes:

- Sparse changes in dynamics: Not all entries in the sensitivity matrices change at all times. Normally, processes change their dynamic properties only in connection with changes in operating point. This suggests an option for predicting the values of sensitivity matrices instead of updating them by numeric differentiation and accesses to the user's model.
- Sparse parameter dependence: Not all entries in the sensitivity matrices depend on all free coordinates. This means that their values do not have to be recomputed for all displaced coordinate values θ_i + δθ_i.
- Sparse sensitivity matrices: Not all outputs from all components depend on all inputs to all other components. For instance, state-transition matrices of systems without feedback over components are 'triangular': the states of upstream components do not depend on what happens in downstream components. Neither do parallel feed or production lines influence each other. This means that not all components in the user's model have to be accessed for computing the output with displaced states or noise variables.

Exploiting these possibilities requires additional routines, in the first case for detecting changes in the model dynamics, in the last two cases for analyzing the structure of the system of components.
This will carry some overhead, which means that some models with dense structures or frequently changing dynamics will not benefit from using it. MoCaVa therefore offers the possibilities as 'advanced user options'.

2.5.1 Controlling the Updating of Sensitivity Matrices

In order to exploit the possibility of infrequently changing sensitivity matrices, a function SensitivityUpdateControl has been added in the algorithm computing the sensitivity matrices. The function does two things: 1) in a preceding 'probing' pass it analyses the model for significant changes in sensitivity matrices, and computes and stores information χ_g needed to control the updating of the sensitivity matrices; 2) in regular passes it updates the sensitivity matrices by prediction (and thus bypasses accesses to the user model) whenever χ_g(τ) indicates this to be adequate. In essence, the function does the following (see Section A.6.1 for a more detailed description):

- In the probing pass all sensitivity matrices Ψ(τ), Ψ ∈ {H, A, C, E, F}, are computed by numeric differentiation to be used as 'data' for the analysis.
- The probing pass serves to create a predictor for the updating of sensitivity matrices. A 'zero order hold' would be simplest, meaning that matrix values are updated only when the computed values indicate that they have changed. However, a linear predictor is also manageable, based on the assumption that the sensitivity matrices are at most linear functions of the state and input arguments x and u within the range of argument variation during a time quantum. Setting up a linear predictor obviously costs more overhead, but can also be expected to reduce the number of accesses. In order to find the slope of the extrapolating predictor it is necessary first to estimate the 'curvature' of the nonlinear model (which is a three-dimensional array). Hence, linear 'models' of the variations of the elements in a sensitivity matrix are set up,

  Ψ_ij(τ) = Ψ_ij(τ−1) + Δ(τ) h_Ψij + σ_ij w_ij(τ)

  and fitted to the {Ψ_ij(τ)}-sequences in order to determine the curvature vectors h_Ψij. The common input arguments in all models are Δ(τ) = [Δx(τ) Δu(τ)], where Δ is the backwards-difference operator. The fitting is done recursively, using a Kalman estimator. This avoids another probing pass. Variables are normalized with their scales in order to allow a comparison of the residual sizes.
- At the end of the probing pass the residual sequences

  e_Ψij(τ) = Ψ_ij(τ) − Ψ_ij(τ−1) − Δ(τ) h_Ψij

  are first computed, based on the final curvature estimates h_Ψij and the stored values Ψ_ij and Δ. They represent the prediction errors. Next are computed the p-percentiles e_p^Ψ of e^Ψ(τ), i.e., values such that p% of the residuals are larger, where p is a value specified by the user. The residuals and the percentiles are instrumental in deciding at what times to update Ψ(τ). The following rule is used: if |e^Ψ(τ)| > max(ē^Ψ, e_p^Ψ), then mark Ψ(τ) for update, where ē^Ψ are thresholds set by the user.

The motive for the double threshold is the following: the thresholds limit the error accepted without correcting the deviation. However, in difficult cases the user is also given the opportunity to limit the number of times the predictor accesses the model. That may raise the approximation errors above those specified, in which case the more tolerant threshold is displayed. The option may help the user to assess how difficult the case is, and avoid a situation where the computer becomes bogged down, trying to achieve an accuracy that is neither achievable nor actually needed at all times. Accurate calculations are normally needed only for the final model.
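The update rule and the linear extrapolation described above can be sketched as follows (function names and shapes are illustrative, not MoCaVa internals; ē^Ψ and e_p^Ψ appear as `e_bar` and `e_p`):

```python
import numpy as np

def needs_update(e_psi, e_bar, e_p):
    # Decision rule from the text: mark Psi(tau) for update by a model
    # access when the prediction error exceeds both the user threshold
    # e_bar and the p-percentile e_p estimated in the probing pass.
    return abs(e_psi) > max(e_bar, e_p)

def predict_element(psi_prev, delta, h_curv):
    # Linear extrapolation of one sensitivity-matrix element:
    # Psi_ij(tau) ~ Psi_ij(tau-1) + Delta(tau) . h_ij, where Delta(tau)
    # collects the backward differences of the states and inputs.
    return psi_prev + float(np.dot(delta, h_curv))
```

Elements whose prediction error stays below both thresholds are simply extrapolated, so the expensive call to the user's model is skipped for that time quantum.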
2.5.2 Exploiting the Sparsity of Sensitivity Matrices

Unlike with the SensitivityUpdateControl option, the user cannot affect the operation of these options in any way (except by activating or deactivating them). Neither is there any approximation involved. It is therefore less important for the user to know the details of how the options operate, and the following is a summary.

2.5.2.1 Independence of Parameter Perturbation

"Memoization" is a simple and general technique to avoid computing the same value several times: check whether the sensitivity matrices have changed their values due to perturbed coordinates during the first pass, and create an indicator to be used in the next passes. The algorithm is defined in Section A.6.2.

2.5.2.2 Independence of State and Noise Perturbation

In essence, an indicator matrix is constructed for each sensitivity matrix, which contains zeroes in places where there can be no sensitivity for structural reasons, and ones otherwise. The chain rule of differentiation is invoked to do that. It can be written as a recursive process running over the set of active components in the order determined by their cause-and-effect relationships (i.e., backwards in the component index): from the component models
s(i−1) = S[i, x(i), w(i), s(i)]   (2.44)
ẋ(i) = G[i, x(i), w(i), s(i)]   (2.45)
y(i) = H[i, x(i), w(i), s(i)]   (2.46)

follows

δs(i−1) = (∂S(i)/∂x(i)) δx(i) + (∂S(i)/∂w(i)) δw(i) + (∂S(i)/∂s(i)) δs(i)   (2.47)
δẋ(i) = (∂G(i)/∂x(i)) δx(i) + (∂G(i)/∂w(i)) δw(i) + (∂G(i)/∂s(i)) δs(i)   (2.48)
δy(i) = (∂H(i)/∂x(i)) δx(i) + (∂H(i)/∂w(i)) δw(i) + (∂H(i)/∂s(i)) δs(i)   (2.49)
The formulas would allow the recursive computing (downstream) of all sensitivity matrices in the total model from those of each component. However, the following alternative is believed to be more efficient. Use the recursion to compute the indicator matrices, once and for all:
Bδs(i−1) = B(∂S(i)/∂x(i)) Bδx(i) + B(∂S(i)/∂w(i)) Bδw(i) + B(∂S(i)/∂s(i)) Bδs(i)   (2.50)
Bδẋ(i) = B(∂G(i)/∂x(i)) Bδx(i) + B(∂G(i)/∂w(i)) Bδw(i) + B(∂G(i)/∂s(i)) Bδs(i)   (2.51)
Bδy(i) = B(∂H(i)/∂x(i)) Bδx(i) + B(∂H(i)/∂w(i)) Bδw(i) + B(∂H(i)/∂s(i)) Bδs(i)   (2.52)
where B is an 'indicator' operator replacing all nonzero elements with ones. The first step is to determine the nine indicator matrices for each component. This is done by analyzing the model structure to establish which component outputs depend on which inputs. Next, the sparsity routine loops over the recursion, once for each column in the indicator matrices, i.e., with Bδs = 0 and varying positions of single units in Bδx and Bδw as start values. The procedure yields the indicator matrices B(dẋ/dx), B(dẋ/dw), B(dy/dx), B(dy/dw) for the sensitivity functions. Armed with this, the routine in 'sparse' mode operates with three consecutive screening processes, applied to each element in a continuous-time sensitivity matrix:

1) when the SensitivityUpdateControl routine has decided that a total sensitivity matrix must be updated,
2) and when the Memoization routine has decided that its value may deviate from one already computed,
3) and when the Sparsity routine has decided that its indicator is not zero,
4) and when the response element has not been computed already,

only then will access be given to the relevant sequence of user-specified components for computing a response vector to the particular input variation.
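The boolean chain rule of Equations 2.50–2.52 can be illustrated on a toy two-state cascade without feedback (a sketch with names of my own; the point is that the indicator stays triangular, so downstream states never enter upstream sensitivities):

```python
import numpy as np

def B(M):
    # 'Indicator' operator: replace every nonzero element with one.
    return (np.asarray(M) != 0).astype(int)

def step(B_Jx, B_Jw, B_Js, Bdx, Bdw, Bds):
    # One component step of the boolean chain rule (cf. Equation 2.51):
    # the indicator of the response is B of the sum of indicator
    # Jacobians applied to indicator perturbations.
    return B(B_Jx @ Bdx + B_Jw @ Bdw + B_Js @ Bds)

# Two-state cascade: state 2 feeds state 1, nothing flows back upstream.
B_Gx = np.array([[1, 1],
                 [0, 1]])           # indicator of dG/dx, upper triangular
no_w = np.zeros((2, 2), dtype=int)  # no noise coupling in this toy case
no_s = np.zeros((2, 2), dtype=int)
zero = np.zeros((2, 1), dtype=int)

# Loop over unit start vectors, one per column, with Bds = 0 (as the
# sparsity routine does), collecting the columns of B(dxdot/dx).
cols = [step(B_Gx, no_w, no_s, np.eye(2, dtype=int)[:, [j]], zero, zero)
        for j in range(2)]
B_dxdot_dx = np.hstack(cols)        # remains upper triangular
```

The zero in the lower-left corner of `B_dxdot_dx` is exactly the structural zero the routine exploits: that element of the sensitivity matrix never has to be computed.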
2.5.3 Using Performance Optimization The success or failure of using the advanced options depends on the model structure, and with the SensitivityUpdateControl option also on the user’s specifications and the actual process generating the data. It is therefore important for the user to get a feel for what changing the specifications may achieve, and to obtain some indication from
the computer, by which he or she can appraise the effect. Since the purpose of all options is to reduce computing by reducing the number of accesses to the user's model, and this can only be achieved at a cost, the indicators should reveal what reduction is achieved and at what cost. With the SensitivityUpdateControl option the cost is in the form of approximation errors, and the specifications change the approximation levels. The indicators should therefore reveal what approximation errors are actually achieved. Learning to use this option requires some exercise. On the other hand, the gain may be dramatic in cases that would otherwise be prohibitively cumbersome (as in the cases of "Bending stiffness of cardboard" and "Continuous pulp digester"). With the other options there is no approximation error, and the cost has the form of program overhead. There is no explicit cost indicator; a stop watch will be enough in all cases where performance optimization matters.

2.5.3.1 User's Specifications

With the SensitivityUpdateControl option the following specifications are required from the user:

- Prediction error levels for the five sensitivity matrices H, A, E, C, F, normalized with their scales. Default values are 10^{−8} for H and 10^{−5} for the other sensitivity matrices, since the trajectory values need to be determined more accurately than the sensitivity matrices.
- A limit for the percentage of accesses to the user's model. Default value is no limit: p̄ = 1, i.e., 100%.
- A value for the number of initial steps before the SensitivityUpdateControl is started. Default value is 5.

No specifications are required with the other options.

2.5.3.2 Performance Indicators

The following indicators appear on the screen while the model structure is being fitted:

- Error thresholds in sensitivity matrices: e_p^H, e_p^A, e_p^F, e_p^E, e_p^C. These are the error percentiles, if those exceed the specified thresholds. The latter eventuality indicates that care should be taken, either to accept the higher thresholds, or else to increase the percentage limit on the number of accesses.
- Number of sensitivity matrices updated by access to the user's model: n_2^H, n_2^A, n_2^F, n_2^E, n_2^C.
- Number of sensitivity matrices updated by extrapolation: n_1^H, n_1^A, n_1^F, n_1^E, n_1^C.
- Number of sensitivity matrices that need no updating: n_0^H, n_0^A, n_0^F, n_0^E, n_0^C. The counter values indicate the degree of variation and predictability in the object dynamics. Notice that the first set of counters is limited by the percentage limit p̄. Large values for the second set therefore suggest that the limit may have been set too low, in particular if the error levels have also increased.
- Number of accesses to components in the probing pass, in the first fitting pass, and in consecutive passes: The user's model is accessed one component at a time. Depending on the different degrees of dependence on parameters and other input, the minimum numbers of accesses to the individual components vary. The latter cannot be exploited in the 'probing' and 'first' passes, since the input dependence is analyzed during the probing pass, and the parameter dependence during the first pass, where also loss derivatives are computed. Hence, the second set of values indicates
how much has been gained by the 'sparsity' option, and the third set indicates how much more has been gained by 'memoization'.

- Correction for approximation error: This is the difference between the exact loss computed in the probing pass and that computed in the first fitting pass. It is used to correct all the following approximate loss values in the search. It is a measure of the overall accuracy. Since the threshold for a significant reduction of loss is at least 4, an approximation error much smaller than that should cause no concern.
2.6 Search Routine

The search routine is basically Newton-Raphson with some elementary modifications to compensate for the fact that the Hessian is uncertain. The choice is motivated by the following circumstances:

- Loss evaluations are expensive and must be few.
- Once the gradient has been evaluated, an approximate Hessian adds little to the cost.
- The approximate Hessian is positive definite.
- Due to the special loss function the proper measure of convergence is the distance of the loss from the minimum (and not that of the parameters from the optimum). There is also a known and case-independent criterion for stopping the search. It involves the Hessian.

The algorithm is given in Section A.7. The modified Newton-Raphson search suits the loss function used for the fitting, since it uses a non-negative definite approximation of the Hessian, computed from only first-order derivatives of the model residuals. For parameter values far from the optimum, and large variations in the residuals' sensitivities to the free parameters, the estimate of the Hessian may however deviate much, causing the step taken towards the minimum to be inefficient or even counter-productive. For structures such that the residuals are affine (linear) functions of the free parameters, the search converges in one step, but normally it takes more. How many steps it takes depends on a number of design parameters for the search routine, and setting those requires some skill in difficult cases. Some guidelines are given in Section 4.8.1.

Remark 2.44. The main reason for adhering to the Newton-Raphson procedure is that, since loss evaluations dominate the computing, the advantage of few steps becomes paramount.
Safer search routines take many more steps, and it is believed to be less frustrating for the user to have to intervene at one more point, in an already interactive calibration procedure, and on rare occasions, than to have to wait much longer in all cases for a wholly automatic procedure to finish. On the other hand, there is still the possibility of letting the program search overnight, which means that here is a subject for further development.

Example: The loss function for a simple stochastic model

It turns out that the modified Newton-Raphson search method sometimes performs badly when both state-disturbance and measurement-disturbance variances are among the free parameters. In order to explain this it is helpful to analyze a simpler case where only the two variances are free parameters, namely
x(τ+1) = a x(τ) + e_1(τ)   (2.53)
y(τ) = x(τ) + e_2(τ)   (2.54)
E{e_1(τ)²} = r_1,  E{e_2(τ)²} = r_2,  E{e_1(τ) e_2(τ)} = 0   (2.55)
and let r_1 and r_2 be unknown (scalar) parameters. In order to compute the Likelihood function, apply the (steady-state) Kalman filter. This yields

K = r_x / r_e,  r_e = r_x + r_2   (2.56)
r_x = a² (r_x − K² r_e) + r_1   (2.57)
ê(τ) = y(τ) − x̂(τ)   (2.58)
x̂(τ+1) = a [x̂(τ) + K ê(τ)]   (2.59)

Let σ² = r_x/r_2 and η² = r_1/r_2. Inserting this into Equation 2.57 yields

σ²(σ² + 1) = a² [σ²(σ² + 1) − σ⁴] + η²(σ² + 1)   (2.60)

Hence, σ² satisfies

σ⁴ − σ²(η² + a² − 1) − η² = 0   (2.61)

with the positive solution

σ² = (η² + a² − 1)/2 + √[(η² + a² − 1)²/4 + η²]   (2.62)
The logarithm of the likelihood function defined by Equation 2.23 is asymptotically

log L(r_1, r_2) → −(N/2) [log r_e + E{ê(t)²}/r_e]   (2.63)

In order to compute the second term, rewrite Equations 2.53 and 2.54 into operator form

y(τ) = [q^{−1}/(1 − a q^{−1})] e_1(τ) + e_2(τ)   (2.64)

where q^{−1} is the backwards-shift operator. This is statistically equivalent to

y(τ) = λ [(1 − c q^{−1})/(1 − a q^{−1})] w(τ)   (2.65)

for some values of (λ, c) that depend on the values of (r_1, r_2). From Equations 2.59 and 2.65 follows

ê(τ) = [(1 − a q^{−1})/(1 − a(1 − K) q^{−1})] y(τ) = λ [(1 − c q^{−1})/(1 − a(1 − K) q^{−1})] w(τ)   (2.66)
Figure 2.12. Likelihood loss function for the disturbance variances r_1 and r_2 (logarithmic scales, 0.01 to 100; a = 1, c = 0.8)
Since 1 − K = (σ² + 1)^{−1}, the left member of Equation 2.66 has the variance

E{ê(t)²} = λ² [1 + c² − 2ac(σ² + 1)^{−1}] / [1 − a²(σ² + 1)^{−2}]   (2.67)

Inserting this into Equation 2.63, and noticing that adding a constant does not change anything of importance, yields the following loss function:

Q(r_1, r_2) = −2 lim_{N→∞} (log L)/N + const.
            = [1 + c² − 2ac(σ² + 1)^{−1}] / {[1 − a²(σ² + 1)^{−2}] (σ² + 1) r_2/λ²} + log[(σ² + 1) r_2/λ²]   (2.68)
This yields the surface depicted in Figure 2.12, where σ² has been computed from Equation 2.62. The parameter scale is logarithmic, which agrees with the parameter map r_i = exp(θ_i). Hence, the figure shows the loss as seen by the search routine, i.e., as a function of θ. Apparently, it is far from quadratic, in particular for small values of r_2, which explains why the search routine has difficulties in estimating the direction and distance to the minimum. However, the form suggests the following recommendation: when assigning nominal values for the variances r_2 of measurement errors, choose a value one or two orders of magnitude above the estimated actual value! Not only is this motivated because a substantial amount of modelling error will probably add to the measurement errors, but too large start values will also make the search easier than too small start values.
A second recommendation based on the same figure is to keep the start values for the disturbance level r_1 small.
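Under the reconstructed Equations 2.62 and 2.68, the example's loss can be evaluated numerically (a sketch assuming |a| < 1 and fixed true innovations parameters (λ, c); function names are mine):

```python
import math

def sigma2(r1, r2, a):
    # Equation 2.62: steady-state normalized state variance
    # sigma^2 = r_x / r_2, the positive root of Equation 2.61.
    eta2 = r1 / r2
    b = eta2 + a**2 - 1.0
    return b / 2.0 + math.sqrt(b**2 / 4.0 + eta2)

def loss(r1, r2, a, lam, c):
    # Equation 2.68: asymptotic normalized loss for trial variances
    # (r1, r2), with (lam, c) the true innovations-form parameters.
    s2 = sigma2(r1, r2, a)
    re_norm = (s2 + 1.0) * r2 / lam**2       # (sigma^2 + 1) r_2 / lambda^2
    num = 1.0 + c**2 - 2.0 * a * c / (s2 + 1.0)
    den = 1.0 - a**2 / (s2 + 1.0)**2         # positive whenever |a| < 1
    return num / (den * re_norm) + math.log(re_norm)
```

Evaluating `loss` over a logarithmic grid in (r_1, r_2) reproduces the kind of surface shown in Figure 2.12 and makes the two start-value recommendations easy to check experimentally.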
2.7 Applicability

The topic of this section is some issues of relevance to the decision of whether or not it would be worthwhile to try to use MoCaVa. In practice, much depends on experience, even if some hints about the types of problems that can be solved may be obtained from the case studies. Some of the contents are summaries and conclusions of what has been discussed in the previous sections, for the benefit of readers who have skipped those sections. Some can be stated immediately:

Do not use MoCaVa in cases with much data and large linear models of standard type, such as static or dynamic regression models. It will be slower than black-box packages. Do not use it for large mechanical or electro-mechanical systems, where disturbances are small and it is known what causes the dynamics. The user interface is not designed for this type of system. Do not use it for objects whose dominating dynamical properties are discontinuous, for instance due to dry friction, dead zones, or saturation. The quasi-linearization will not work. Do not use it for objects whose proper modelling depends on a complicated geometry, and where a finite-element description is needed. The computing will be prohibitive.

Use it for industrial processes of continuous type, where prior knowledge is partial and uncertain and disturbances are substantial. The long-range predicting ability and the possibility of monitoring unmeasured variables will be useful.

Remark 2.45. Kristensen, Madsen, and Jørgensen (2004) have made a comparison between MoCaVa and CTSM. Their conclusions are that CTSM gains in estimation accuracy for large disturbances, and MoCaVa in speed and in support functions for the user.

2.7.1 Applications

The types and sizes of processes where MoCaVa will be applicable cannot be delimited a priori, but they are well exemplified by cases where the method has proved itself.
The purpose of the case studies has been to test the theory and the software being developed in parallel, and to see what problems may appear in trying to apply the theory in practice. Table 2.2 summarizes the number of arguments involved, as crude measures of the sizes of the applications. "In" and "out" are the recorded variables.

Table 2.2. Sizes of case models: Number of arguments

Case               in   out   states   dist.   par.
Baker's yeast       2    4       4       1      25
Steel rinsing       4    5      10       5      31
Pulp digester       2    3      70       3      25
Cement milling      2    2       2       0       6
Recovery boiler     7    2       1       1       6
Bending stiffness  16    2       8       2      25
EEG signals         0    2      19       2       9
The cases were selected as suitable for grey-box modelling, characterized by partial prior knowledge, some unknown input, and contaminated data. In most cases
Practical Grey-box Process Identification
physical data from either pilot plants or full-scale production units have been analyzed using predecessors of MoCaVa. The following is a brief survey of case studies done within the grey-box identification program at KTH.

Baker's Yeast Production

This is a bio-technical process. Data were collected during production of 'mother yeast' using a pilot plant at Jästbolaget in Sweden, normally used for that purpose and for experiments. Some prior knowledge of the nonlinear dynamics of yeast growth was available, but unknown variables, in particular the start conditions, play a major role. The study revealed that a single internal unmeasured disturbance was the main cause of the large variations in the quality of the final product, and that this disturbance is observable (Fan, 1990; Fan and Bohlin, 1989).

Steel Rinsing

This is a process in industrial steel production. Data were collected from a full-scale production unit at the Domnarvet plant in Sweden. One of the four control variables was varied experimentally for the purpose of identification. Part of the dynamics of the process was known a priori; other parts were determined empirically. Unmeasured disturbances were significant (Sohlberg, 1991, 1992a, 1992b, 1993a, 1993b, 1998b; Bohlin, 1991b, 1994a). The case is the first of the two applications treated in Part III.

Continuous Pulp Digester

This is an industrial pulp production process. Data were collected during experiments on a full-scale production unit at the SCA Wifsta-Östrand plant in Sweden. Five inputs were varied by sequences of steps with different intervals. This is the largest of the case studies; modelling required five coupled nonlinear partial differential equations, and unknown disturbances play a major role. Applying a collocation method transformed the PDE into approximating ODE with 70 state variables.
The study revealed that out of the three unmeasurable internal disturbances expected for physical reasons, one was the dominating source of model error in all measured output (Funkquist, 1993, 1994a-d, 1995).

Cement Milling

This is a subprocess in industrial cement production. Data were collected during experiments on a full-scale production unit at Lafarge Canada, Richmond. Two input were varied stepwise and two output measured. Some prior knowledge of the nonlinear structure was used in the modelling. No disturbances were modelled (Havelange, 1995). The case study was the first to be carried out by someone who had not been involved in the design of MoCaVa.

Recovery Boiler

This is an industrial process in pulp production, recovering chemicals and energy from the pulp digesting process. The main physical process is combustion. Data were collected during normal production at the Husum and Värö plants in Sweden. Semi-physical modelling of the combustion process (involving chemical reactions as well as heat transfer and radiation) suggested a four-compartment, single-state, nonlinear model with seven input and three measured output. No disturbances were modelled. The study showed that a grey-box model predicted as well as a black box, but needed
2 The MoCaVa solution
fewer parameters. The study was discontinued (by a project deadline) before all information in the data was exhausted (Bohlin, unofficial internal report).

Bending Stiffness of Cardboard

This is the main quality variable of an industrial cardboard manufacturing process. Data were collected during normal production at the Frövi plant in Sweden. The nonlinear model has eight states, sixteen input, and two measured output. Two unmeasured disturbances also need to be modelled. The case is reviewed in Chapter 7 (Bohlin, 1996; Pettersson et al., 1997; Pettersson, 1998).

EEG Signals

This is a physiological process; the signals are obtained by placing electrodes on the human scalp. In essence, some hypotheses about the source of the electrical processes in the brain producing the signals were tested. Simulated data were used, as well as physiological data recorded at the University of Houston, Texas. The model is nonlinear and purely stochastic, with no evoked input. The particular method used to compute the likelihood loss has not yet been implemented in IdKit (Markusson and Bohlin, 1997; Markusson, 2002).

Other Applications of the Method

The following projects are some 'offspring' of the grey-box program:

- Mould level control in continuous casting (Graebe, Elseley, and Goodwin, 1992)
- Pulp refiner (Allison, Isaksson, and Karlström, 1995)
- Air-cooling process for steel strips (Spännar and Sohlberg, 1997)
- Monitoring and failure diagnosis in steel strips (Sohlberg, 1998a)
- Paperboard properties (Bortolini, 2001)
- Indirect measurement of steel strip temperature (Spännar, Wide, and Sohlberg, 2002)
- River control (Sohlberg and Särnfelt, 2002)
- A heating process (Sohlberg, 2003)
- Drive train identification (Isaksson and Lindkvist, 2003)

2.7.2 A Method for Grey-box Model Design

MoCaVa is a tool for setting up and solving the tasks of calibrating and validating models. However, making a mathematical model of a physical object also involves other tasks.
For instance, the model designer must produce the data and propose a reasonable mathematical framework for the modelling. For one thing, he or she must decide what variables are interesting to measure and record. That will in fact define the object, by delimiting it from the environment (carving it out from the rest of the universe). It will save much work later if it is also possible to anticipate which of the variables will have the largest influence on the response. This prior knowledge is needed for designing a 'root' model class to start from. If this class involves too many variables, there will be uncomfortably many possible relations between those variables to test and reject.

Not surprisingly, there are no clear rules on how to prepare for using MoCaVa when one is faced with an industrial process or other physical object. But the following procedure has provided some support in the case studies. It involves five steps, of which MoCaVa offers support for the last three:

1) Phenomenological description: Make a verbal, graphical, or other mental description of the object and the experiment conditions. This serves to delimit the object of the modelling.
2) Variables and causality: Translate the description into a system of causal dependences between defined variables, for instance in the form of a block diagram. This may introduce internal variables that are not measured, or random disturbances (real or fictitious). The step serves to eliminate a number of otherwise mathematically possible relations between the object's input and output variables. It also creates a skeleton structure on which to hang any prior knowledge about the (usually internal) variables and the relations between them.

3) Mathematical modelling: Specify known relations between variables, including parametrization, or choose structures for unknown relations, including disturbances. If it is not known how many relations are needed, or which ones, be prepared instead to create several hypothetical model structures with increasing detail, but not before an improvement is called for, based on the results of the next step. Start with the simplest conceivable model structure, containing the most well-founded relations.

4) Calibration: Find the simplest model that is not falsified by experiment data. This involves fitting to data and tests of significance. The results are measures of uncertainty and credibility, and usually cause a return to step 3.

5) Validation: Confront the model with independent data. If the calibrated model is more complex than the purpose requires, then reduce the model. If it is inadequate, then get more experiment data. If that is not possible, then accept a less ambitious purpose.

Notice that calibration and validation follow different rules for when to end the sessions. They may be formulated as:

- The scientist's rule: Proceed until the model explains the data.
- The engineer's rule: Proceed until the model satisfies the purpose.

Remark 2.46.
The recommendation to find the best model that the data allow ("the scientist's rule") before looking for the simplest model that satisfies the purpose ("the engineer's rule") may seem somewhat roundabout, in particular when the end result is a much simpler model than the one calibrated, and particularly if the latter has been found with considerable effort. Why not go directly to validation? Hjalmarsson (2005) argues that calibrating before validating gives the best accuracy of the final model.

Remark 2.47. The task may also be to calibrate a model purchased from an outside vendor. An obstacle is then that the model structure is fixed by the program, and it may or may not be difficult to reduce its complexity. A conceivable solution would be to redesign MoCaVa to accept external models. However, that is only possible if it is also possible to derive a predictor, as in deterministic models. The EKF needs the state-transition matrices, which are not available unless the states of the external model are available. The last condition is satisfied by the model generator in Dymola®, which allows MoCaVa to accept model structures from that external source (Section 5.3).

2.7.3 What is Expected from the User?

How convenient is it to use MoCaVa? Certainly not as easy as a black-box method, for instance the MATLAB® Identification Toolbox. In summary, a user of MoCaVa must be able to do the following:

- Define one or more submodels in the form of state equations.
- Classify the input arguments in the equations as either feed, control, disturbances, parameters, constants, or time.
- Specify various attributes of some of the arguments, such as scales, nominal values, and ranges.
- Assign data to the control variables, and pick suitable standard models for interpolating between sampled input data.
- Pick suitable standard models for disturbances.
- Determine the sequence of execution of the submodels.
- Suggest one or more alternative submodels to augment, when the current tentative model structure has been rejected as inadequate.
- Specify one or more purposes, if other than finding the simplest model that agrees with available data and prior knowledge.

A user does not have to know:

- All submodels, in particular not those of disturbances.
- Which of the known submodels are needed.
- The values of all parameters.
- The number of parameters to calibrate.
- A fitting criterion.
- A falsification criterion.
- A purpose, provided the simplest model that satisfies data and prior knowledge will suffice.

2.7.4 Limitations of MoCaVa
Basically, MoCaVa cannot handle:

- Models with discontinuities. But the program may still work well with models having a small number of discontinuities in derivatives.
- Implicit models (where causation has to be determined automatically). But the idea is that causality is prior knowledge, which saves tests.
- Continuous models with delayed states. But delayed input variables can be modelled approximately using a library routine.
- General discrete-time or hybrid models. But simple hybrid models with deterministic discrete-time parts can be treated, for instance a continuous process connected with digital input filters or controllers.
- Distributed-parameter systems. But it is possible to supply the model with a user-defined routine for transforming PDE to ODE and treat the latter (Funkquist, 1995; Ekvall, Funkquist, Largerberg, 1994).

2.7.5 Diagnostic Tools

In practice, two tasks will take much of the user's time: i) debugging, i.e., finding errors in the writing of variable relations, when the model is not according to the user's intentions, and ii) finding out what to do when a model is indeed in accordance with intentions, but still not good enough. MoCaVa offers some tools to facilitate this.

Step-wise Design

Primarily, each component is compiled stand-alone, and is therefore free of compilation errors before components are linked into a model. However, the difficulties in model debugging usually start after a successful linking. Fortunately, the tree-like structure of the model makes it possible also to test-run one more component at a time. Inputs are then replaced with their nominal values, and the model is run up to the last not yet debugged component. This means that the user can make debugging easier by modularizing more. Basically, the only effective tool for eliminating run-time errors is to print
or plot suspiciously behaving variables and try to track the cause from the submodels that computed them. It is therefore of crucial importance that the user recognizes the variables and the statements that produce them. For that reason no formula manipulation is done, and variable names are retained during execution.

Modularization

The modularization, together with the sequential testing procedure inherent in the design, also provides information on what to do to improve a bug-free but inadequate model: mainly, submodels that contribute significantly are worth considering further, while those that do not may be eliminated from consideration. Again, the more modularization, the more information.

Nesting

The tentative model (= the best so far) and the alternatives (= the hopefully better model structures) need not contain the same components. The statistical risk of rejecting the tentative model (which is the primary tool for determining model complexity) can still be computed, provided the alternatives are nested. That condition is checked by MoCaVa, and the cause of a violation is displayed. The user can then choose to modify the alternative(s) in order to satisfy the nesting condition (and have the benefits of a less cumbersome test and of getting a risk value), or else refrain from nesting and instead use the prediction-error indicators or loss values to determine whether an alternative is better. However, even in non-nested cases a comparison has to be 'fair'. This condition is also checked by MoCaVa, and a violation results in a diagnostic message indicating the cause.

The Test Statistics

The table of test statistics displays the outcome of the test of a tentative model, and is the main source of information to the user on what to do next.
Basically, it yields a three-valued piece of information for each alternative:

- The tentative model has not been falsified.
- The tentative model is false and the alternative is better.
- The tentative model is false, but the alternative is inadmissible.

Generally, the results of statistical tests indicate whether a model is false, but not what is wrong. However, the third outcome provides additional information: an alternative is inadmissible when its fitted parameters are outside admissible ranges, indicating that there is indeed a better model, but not among those tried so far. Since parameters are associated with certain model components, this gives information on what component to amend or replace.

The Residuals

The residuals of a tentative model constitute another source of hints to what may be wrong. For instance, they may contain transients, either initially or coinciding with large and rapid changes in other variables, or delayed such coincidences, or certain disturbance patterns, like drift or oscillations. Correlation measures are not displayed by default, since they tend to average out such transient errors, unless they occur frequently. In difficult cases it takes some experience to interpret the patterns, and occasionally this does not help either. No doubt, there are cases when it is wiser to suspend the calibration temporarily and instead put the effort into getting better data.
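The principle behind the first two outcomes can be illustrated by a likelihood-ratio test between nested structures: an expanded structure with extra free parameters is accepted only if the loss reduction it buys is statistically significant. This is a sketch of the principle only, not MoCaVa's actual test; the function names and the Wilson-Hilferty quantile approximation are ours:

```python
from statistics import NormalDist

def chi2_quantile(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty); adequate for a sketch."""
    z = NormalDist().inv_cdf(p)
    return df * (1.0 - 2.0 / (9.0 * df) + z * (2.0 / (9.0 * df)) ** 0.5) ** 3

def tentative_falsified(loss_tentative, loss_alternative, extra_params, risk=0.05):
    """Likelihood-ratio test. The losses are negative log-likelihood values,
    and the alternative must be a nested expansion of the tentative model
    with `extra_params` additional free parameters."""
    statistic = 2.0 * (loss_tentative - loss_alternative)
    return statistic > chi2_quantile(1.0 - risk, extra_params)

# Freeing 2 extra parameters must reduce the loss by roughly 3 units
# (half of the chi-square 95% quantile for 2 degrees of freedom, ~6)
# before the tentative model is rejected in favor of the alternative.
print(tentative_falsified(100.0, 98.8, extra_params=2))  # small drop: False
print(tentative_falsified(100.0, 90.0, extra_params=2))  # large drop: True
```

This also shows why nesting makes the test less cumbersome: for nested structures the risk value follows directly from a chi-square threshold, whereas non-nested comparisons must fall back on loss values or prediction-error indicators.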
2.7.6 What Can Go Wrong?

Since the designer of a program for black-box identification knows the model structure, it is conceivable in this case to make a program that is guaranteed always to produce a correct result, provided the assumptions about the model structure hold. The only cases with guaranteed solutions are those that are linear in the parameters. Among nonlinear structures, linearity in the parameters holds, for instance, for the Volterra type (Atherton, 1992), NARX (Nonlinear AutoRegressive with eXternal control) (Billings, 1980), and 'semi-physical' structures (Lindskog and Ljung, 1995), all with various shades of 'blackness'. Not even linear black-box models are always safe in that respect: for instance, ARX models are linear in the parameters, while ARMAX models are not.

MoCaVa has fewer restrictions on the model structure, but gives no guarantee always to solve the problem set up by the user. However, in both the black-box and the grey-box cases the correctness of a result still hinges on the assumptions, which rarely hold exactly. This means that the responsibility for the model being produced still rests with the model designer. Two things may generally go wrong: i) no model is produced, since the program cannot solve the problem, or ii) a model is produced that passes all tests, but it is still wrong, because the structural assumptions are wrong. The latter case is the serious one: the model is wrong, and one thinks it is right. Call it a "pitfall". MoCaVa has fewer assumptions, and therefore ought to reduce the risk of producing pitfalls. But what can go wrong when using MoCaVa?

- The largest model class is too restricted. The risk is shared with all black-box cases, but should be smaller for MoCaVa, since the feasible classes are less restricted. The restrictions are partly inherent in the state-vector class in MoCaVa (see Section 2.1), and partly introduced by the user.
A clear distinction is necessary here: the largest class is an assumption, and cannot be tested. The tentative classes (subclasses of the largest) are hypotheses, which are tested. Hence a too restricted largest class is a pitfall (it cannot be diagnosed by MoCaVa), while a too restricted tentative class is not.

- The search routine does not converge properly. The construction of the search routine does not allow the search to increase the loss at any step. However, the search may reduce the loss very slowly, and even cycle. When this happens there are three typical patterns of search: either i) taking too small steps in about the same direction, or ii) taking too long steps, only to step back about the same length in the opposite direction, or iii) overshooting so much that the loss would increase, again to step back, but then half-way along the same track. Since the negative gradient never points upwards, all the cases of slow convergence are caused by the fact that the loss function is generally not quadratic in the parameters (it may not even be convex); hence the local value of the Hessian may say little about the curvature of the loss function far from the minimum, and even less about the distance and direction to the minimum. In such cases the user may have to intervene by changing some of the design parameters of the search routine. Another cause of slow convergence is near-singularity of the Hessian, generally caused by freeing too many parameters at a time.

Evidently, there is always the possibility that several local minima may cause the search to converge to the wrong minimum. Trying different start values may reveal whether that is the case. The first default start values are the nominal values that are part of the specification of a component. However, when a minimum has been found in the first structure, the minimizing estimate becomes the start value for the next search in the expanded structure. This
helps to keep the search in the region of one and the same minimum. Experience from test cases has indicated that multiple minima are a practical problem only if the model structure is seriously wrong, or the expansion is done with too many free parameters at a time. However, there are no theoretical results to support the conjecture. The case is not a pitfall, except possibly when a local minimum has been reached and no alternative hypotheses are available to falsify the wrong model.

Remark 2.48. Attempts to fit parameters in models with discontinuities in the parameter sensitivities have revealed the following typical pattern: in spite of the discontinuities in the model, the loss function remains a seemingly smooth function of the parameters. The search also quickly reaches the vicinity of the bottom, and often it even converges normally. Occasionally, however, the search has difficulties converging: when close to the optimum, the search may become chaotic, taking small and inefficient steps. When the loss function derivatives are magnified, some of them display large numbers of small discontinuities, which accounts for the difficulties in taking short steps. An attempted explanation of the peculiar behavior is the following: the loss function is a sum of a large number of functions of model residuals, all with continuous parameter sensitivities, except a few where the current state, the input variable, and the parameter value happen to combine into a point close to a discontinuity. Since those terms in the sum are few, those with continuous sensitivities will dominate, and the effect will be seen only 'through a magnifying glass'. The observation also suggests a solution: augment the Newton-Raphson search with a new stopping rule, taking loss changes into account as well.

Remark 2.49.
There are some observations (Bohlin, unpublished) from trying black-box identification using data collected in closed loop, which indicate that the smallest minimum is not necessarily the right minimum. An explanation would be that if the level of disturbance acting on the process is high, the smallest minimum may actually correspond to the negative inverse of the feedback path (since it contains less noise and is therefore easier to describe with small errors). There is no experience of using MoCaVa for identification in closed loop. However, the more specified the structure of the forward path becomes, the more difficulty the search routine should have in describing the inverse of the feedback path with a model of the same class. That should raise a possible wrong minimum, conceivably above the right one.

- The conditions for numerical differentiation are not satisfied. This is a consequence of allowing nonlinear structures and relying on user-specified scales to compute the argument increments. If scales are wrong by too many powers of ten, then rounding errors may cause erratic results. Numerical differentiation is used for linearizing the model locally as well as for computing the residuals' parameter sensitivities. Errors in the latter are the more easily detected, since the search will not converge. Differentiation errors in sensitivity matrices mean that the actual model structure will be different from the one that was entered, but the matrices may still constitute valid sensitivity matrices; the test will, in effect, evaluate another model. If the error is large, this should conceivably reject the model (since it will be contradicted by the data), and thus the error will not pass unnoticed. However, it will not be possible to distinguish this case from that of a wrong structural hypothesis.
A user is therefore advised to be aware of the risk of numerical errors, and to guard against them by confirming that a change in scale does not change the final model noticeably.

- The conditions for the applicability of the EKF are not satisfied. It is required that linearization be acceptable around a nominal trajectory computed without feedback from the data sequence. This is a more severe requirement than generally holds for Extended Kalman Filters, where linearization is done around a state trajectory estimated from past data. The reason for the more restricted filter is that the nominal trajectory will remain stable independent of any outliers in the data. However, the requirement gets IdKit into difficulties in case of drifting disturbances, where even small levels of noise may cause the disturbed trajectory eventually to drift far from the nominal, and possibly out of the linear range. The user should therefore avoid entering drifting disturbances into strong nonlinearities ('Brownian motion' is the only such disturbance in the library), and instead select a stationary disturbance model and put a bound on its unknown rms value, in order to prevent it from getting out of the linear range.

Remark 2.50. A possibility to allow drifting disturbances and still retain the stability of the reference trajectory has been analyzed by He (1991). However, it has not been implemented in MoCaVa3.

- The conditions for the applicability of the stiff ODE solver are not satisfied. If sensitivity matrices change considerably within a time quantum h, then the quasi-linearization operation the ODE solver is based on cannot be justified. Whether that has serious consequences or not can be investigated, for instance by checking the final model with half the time quantum: the loss value should not change by much more than one.

- The conditions for the applicability of the 'advanced' options for optimization are not satisfied. The option of SensitivityUpdateControl involves approximations, based on the user's prior specifications of what would be acceptable approximation errors. There are also a number of performance indicators displayed, to tell the user whether the approximations are reasonable.
If the option fails, that would most likely be because the dynamics of the model change so frequently that predicting transition matrices from one time instant to the next is meaningless. In particular, discontinuities in the model may cause this, but also too strong nonlinearities. However, a prediction failure will show up in the performance indicators (see Section 2.5.3), and the case is not a pitfall per se. However, another assumption may be: the indicators controlling the updating of sensitivity matrices are computed during the first pass in the search, and it is assumed that they will remain applicable throughout the search. This means that if a search carries the model parameters too far from their initial values, then it is conceivable that the indicators will no longer be adequate for the much changed dynamic properties. This is not detectable outright, since the performance indicators will also be wrong. If in doubt, the user may check this by initiating a second search with the same free parameters. This will start from the estimated values; if it converges immediately with only little change in parameter values, then the approximations are acceptable. There is always the possibility to deactivate the option in dubious cases, for instance to run a final search with the exact model overnight.

- The balance between the models of object and environment is wrong. When it is difficult to find a physical explanation for the variation of a parameter, it is tempting either to assume a linear regression with any other variables that may possibly affect it, or else to model the variation as a disturbance. Both have pitfalls, if used indiscriminately. As stated in the introduction, it is easy for a predictor based on a stochastic model to predict well over a sampling interval, which yields a small loss, in particular if the sampling is dense or the process responds slowly.
It is not unusual that the Brownian model alone, with the trivial predictor y(t_{k+1} | t_k) = y(t_k),
predicts better than any deterministic model, which cannot base its prediction on previous output. Hence, as soon as a stochastic disturbance is introduced, the loss normally drops dramatically and the predictor plots look good (which increases the temptation). The variations in the data are responses to known as well as unknown input, described by deterministic and stochastic models respectively. The pitfall in a skew balance between the deterministic and stochastic parts in a model structure is that the search may allocate too much of the variation in the data to disturbances, since it is easier to predict that way than by means of an underdeveloped deterministic model. Since the purpose of the modelling is normally not to predict the particular data sample, it is recommended that stochastic disturbances be introduced late, and preferably not before all possibilities of explaining a variation have been exhausted.

On the other hand, the alternative 'easy way' of a regression model may lead into another pitfall: if the regression model has too many free parameters (which is clearly a risk if the number of variables is large), the result may be over-parametrization and spurious 'dependencies' discovered. Again, the search tries to minimize the loss, and when given sufficiently many free parameters to use for that purpose it may succeed in establishing 'relations' that are merely data descriptions. In other words, too 'forgiving' deterministic models tend to include phenomena that are actually disturbances. Since stochastic disturbance models are easier to fit to random phenomena, they have the salutary effect of reducing spurious regression. In conclusion: introduce disturbances late, but do it eventually!

Remark 2.51. No doubt, it would be desirable to have an objective method of deciding the right balance between deterministic and stochastic parts in a model structure, but, alas, MoCaVa3 does not have one.
However, a number of guidelines are given in the case studies in Part III.

To the list of causes of failure may be added that the freedom to specify argument relations and attributes may introduce ordinary programming errors. MoCaVa3 has an option for using the debugging facilities in MATLAB® for detecting errors in the user's components, before subjecting them to fitting and testing.
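The 'skew balance' pitfall is easy to reproduce numerically: on a densely sampled, slowly varying signal, the trivial random-walk predictor y(t_{k+1} | t_k) = y(t_k) beats even a reasonably good deterministic model in one-step prediction, although it embodies no process knowledge. The signal below is our own illustrative construction, not data from the case studies:

```python
import math
import random

random.seed(0)

# Slowly varying "process output": a sine trend plus small measurement
# noise, sampled densely relative to the process dynamics.
N = 1000
y = [math.sin(0.01 * k) + 0.01 * random.gauss(0.0, 1.0) for k in range(N)]

# Deterministic model with a slight structural error (wrong frequency).
# It cannot use past output, so its one-step error is the full mismatch.
det_pred = [math.sin(0.011 * k) for k in range(N)]

# Trivial stochastic predictor: y(t_{k+1} | t_k) = y(t_k).
rw_err = [y[k + 1] - y[k] for k in range(N - 1)]
det_err = [y[k + 1] - det_pred[k + 1] for k in range(N - 1)]

def rmse(e):
    return math.sqrt(sum(v * v for v in e) / len(e))

print("random-walk RMSE:  ", round(rmse(rw_err), 4))
print("deterministic RMSE:", round(rmse(det_err), 4))
# The random-walk predictor wins despite knowing nothing about the process.
```

This is exactly why a small loss after introducing a disturbance model is not, by itself, evidence that the deterministic part is adequate.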
3 Preparations
3.1 Getting Started

3.1.1 System Requirements

The following are the minimum requirements for running MoCaVa; later and faster configurations are preferable:

- Intel Pentium III 500 MHz
- > 64 MB RAM
- > 100 MB hard disk space
- Screen resolution at least 1024×768
- Microsoft Windows® 95, 98, NT, 2000, or XP
- MathWorks MATLAB® 5.3 or later

Remark 3.1. Not all combinations have been tested, which might cause difficulties in some cases. Neither Windows® nor MATLAB® is fully backwards compatible.

3.1.2 Downloading

Go to internet address www.springer.com/1-84628-402-3. Under "Related links" click on "Download Supplementary Files Here". Then click on "here" to download MoCaVa-3.2_setup.exe. Alternatively, go to http://mocava.s3.kth.se and click Download on the MoCaVa home page. Then click on MoCaVa-3.2_setup.exe and save the file.

3.1.3 Installation

Run MoCaVa-3.2_setup.exe, for instance by double-clicking on the icon. This will start the installation. Click Next twice. The program will open a window to enter the Destination Folder. Type the full path to the MATLAB® toolbox directory. Alternatively, click the button marked "..." to open a browser and locate toolbox. Then click Finish. MoCaVa3 will then be installed automatically in a new MoCaVa3 directory under the MATLAB® toolbox directory. Some files will also be installed in the toolbox\local directory.

3.1.4 Starting MoCaVa

Start MATLAB® and type mocava3 in the command window. MoCaVa opens a window for accepting some legal terms for the use of MoCaVa.
Practical Grey−box Process Identification
Click I accept the conditions for using MoCaVa. MoCaVa opens the MoCaVa window for selecting one of the Predat, Calibrate, Validate, or Simulate sessions.

3.1.5 The HTML User’s Manual

The contents of Part II are available in more detail in hypertext format in MoCaVa3\HTML_doc\UserManual. The manual also describes the Validate and Simulate sessions in MoCaVa, which are not treated in this book.
3.2 The ‘Raw’ Data File

Data must be in an ASCII file and contain records as follows:
: Any headlines must have been removed.
: All records must contain the same number of fields, separated by blanks or tab stops.
: One of the fields may contain physical time, which need not be equally spaced, but must consist of multiples of a smallest sampling interval. This means that the time variable must be increasing, and never reset like a clock reading at midnight.
: Fields with missing data must be marked with NaN or inf. This means that some data columns will contain NaN frequently, if they are sampled more sparsely than others.
: If no time variable is included, this will be interpreted as sampling with constant interval for the variable with the densest sampling (the others must therefore contain NaN).
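The rules above are easy to check mechanically. The following is a minimal sketch (plain Python; the function name and the tiny sample are invented for illustration) that verifies equal field counts, strictly increasing time, and intervals that are whole multiples of the smallest one:

```python
# Hypothetical checker for a raw ASCII data file in the format described above;
# column 0 is assumed to hold physical time.
def check_raw_data(lines, time_col=0):
    rows = [ln.split() for ln in lines if ln.strip()]
    n_fields = len(rows[0])
    assert all(len(r) == n_fields for r in rows), "unequal field counts"
    times = [float(r[time_col]) for r in rows]
    assert all(t2 > t1 for t1, t2 in zip(times, times[1:])), "time must increase"
    # every interval must be a whole multiple of the smallest one
    steps = [t2 - t1 for t1, t2 in zip(times, times[1:])]
    q = min(steps)
    assert all(abs(s / q - round(s / q)) < 1e-9 for s in steps), "not multiples"
    return q  # the smallest sampling interval (candidate time quantum)

# Record at t=3 is missing; sparser columns carry NaN, as the rules require.
sample = ["0 1.0 NaN", "1 1.1 2.0", "2 NaN 2.1", "4 1.3 2.3"]
print(check_raw_data(sample))  # -> 1.0
```

Note that the NaN markers pass through unharmed, since only the time column is parsed for the interval check.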
3.3 Making a Data File for MoCaVa Select Predat and New data file in the MoCaVa window (Figure 3.1). The alternative Modify data file is for changing the specifications, for instance scales, in a previously prepared file.
Figure 3.1. Starting data preparation
MoCaVa opens the Predat control window (Figure 3.2). It contains some tools for editing data files. Only the Main tab will be used in this tutorial. Click on Get data file. MoCaVa opens a browser window (Figure 3.3).
Figure 3.2. Predat control window
Figure 3.3. Browser window showing defined projects
Figure 3.4. DrumBoiler data directory
It contains directories of all projects that are currently defined, so far only the demo projects. One of them is DrumBoiler, the one to be demonstrated first. Find the directory holding the response data file(s). After getting the data from an independent source (like a data retrieval or simulation program) it is natural to move the data file into the directory of the project that will use it (for instance, that would package all that is needed in a single directory for easy backup or export). However, since the same data may be used for different projects, as well as several data files used for the same project, the data preparation in MoCaVa is in fact independent of the project that will use it. Hence, the location of the data file may or may not be in the directory of any project. In the present demo case it resides in mocava3\Examples\DrumBoiler (Figure 3.4).

The directory contains two ASCII data files containing data generated by the ‘true’ model. However, record #5 has been deleted from Dboil, and two entries in records #2 and #3 have been replaced with NaN, in order to simulate the effect of missing data. Open Dboil, for instance by double-clicking on Dboil (or by selecting Dboil and clicking Öppna; the Swedish label is due to Windows®). MoCaVa opens two windows: the Plot Outline (Figure 3.5), showing a graph of the data in the file, and the Data Outline, for editing the data in the file. Indicate that time is in the first column in the data file and select its unit. Enter also names for the variables and their scales, if other than default. Notice that variable names must not contain spaces. After editing, the Data Outline window will look as in Figure 3.6. Click Apply Now in the Data Outline window, and then Save in the Predat control window. MoCaVa opens a browser window to store the prepared data (Figure 3.7). Give the file the same name but extension mcv, and place it in an arbitrary directory, in this case the same.
Click Exit predat in the Predat control window. This concludes the data preparation in a case where no outlier removal is necessary. The new file contains the variable names and some data statistics in addition to the data. Remark 3.2. For instructions on how to remove outliers, see the CardBoard case in Section 7.3. Repeat the preparation for the second data file.
Figure 3.5. Raw data for DrumBoiler
Figure 3.6. Window for editing DrumBoiler data
Figure 3.7. Window for storing prepared data
4
Calibration
This and the following chapter use two simple examples to introduce the concepts and communication windows needed to run MoCaVa. The first example, DrumBoiler, demonstrates the calibration of a simulated two-input two-output bilinear model. The second example, CascadeControl, illustrates the modelling of a nonlinear feedback system involving an ‘algebraic loop’. The drum boiler model has been developed and used on many occasions. The following is quoted from Sørlie (1996a): “The process ... is based on a simplified model of a power plant, originally developed by Eklund (1970); c.f. (Åström and Eklund, 1972, 1975). Later, Ekstam and Smed (1987) augmented a model with a reheater cycle, adding a second state. The result is a set of bilinear state and output equations. Through the inclusion of physically motivated state disturbances and output measurement errors, the example demonstrates the minimal complexity of the three-block functional decomposition devised in IdKit (Graebe, 1990a)”. The purpose of calibration is to find the simplest model that i) is consistent with the user’s prior knowledge and ii) is not falsified by response data. Start the calibration session by selecting Calibrate in the MoCaVa window (Figure 4.1).
Figure 4.1. Starting Calibrate
4.1 Creating a New Project MoCaVa opens the Select project window (Figure 4.2) for input. The project directory mocava3\mcvprojects contains four cases for demonstration purposes. Click on DrumBoiler to select the project. The name of the selected project appears in the header (Figure 4.3). If the DrumBoiler case had not been created already, it would have been necessary to do the following: Click on New Project. This opens a window for naming the
Figure 4.2. Window for selecting project
Figure 4.3. Selecting project
project (Figure 4.4). Enter the name DrumBoiler and click OK. This creates a project directory mocava3\mcvprojects\DrumBoiler for storing all information
Figure 4.4. Naming a new project
that is particular to the project, and adds the project name to the list of defined projects in the window. When a project has been selected, it can be deleted, copied, or opened for processing. It is also possible to use the Microsoft Explorer for deleting, copying, or backup. Click Open to start the project. MoCaVa will first set up five permanent windows for communication with the user:
: The Main window will receive user input and display intermediate results and messages from the user’s guide.
: The Plot window will display the results of simulations of the current model.
: The Model window will display a block diagram of the current model class.
: The Pilot window will display the position of the latest executed subtask in the current session.
: The MoCaVa window controls the execution.
Any of the other windows may be shut off under the View tab in the MoCaVa window.
4.2 The User’s Guide and the Pilot Window

Basically, MoCaVa consists of a number of independently executable script files (in MoCaVa3\Source\Commands) for the various tasks that have to be handled in a calibration session, but employs a special user's shell (in MoCaVa3\Source\MoCaVa\calibrate5) to run the tasks according to a procedure of ‘proper calibration’. The shell serves to ensure that scripts are executed in a logically correct order and that input arguments are set properly. However, the sequence is not fixed; there are a number of decision points, where the user may control what will be done next, or has an opportunity to acknowledge or overrule the proposals of the user’s guide. Communication windows have a number of ‘decision’ buttons, the clicking of which determines the sequel.

MoCaVa displays the User’s guide in the Pilot window (Figure 4.5). It lists the names of the scripts and the conditions under which they are executed. It also highlights the script executed last (except at start-up, when it highlights the first script). Each time a task has been completed, the position changes to the next logical step. Only places where the user can make a decision will normally be highlighted. The numbered lines show the names of the scripts, those marked with > are comments, and the other lines are the conditions. Each script can also be executed independently, by simply typing its name in the MATLAB® command window. This allows an experienced user to run MoCaVa in a ‘command mode’, and thus to accept the responsibility that scripts are executed in a logically correct order and that all inputs are set properly. However, because the outcome of executing a script can be difficult to interpret when some input has not been set or updated properly, even an experienced user is advised not to use this option in other than simple cases.
Figure 4.5. The Pilot Window
The Pilot window is also useful when the user wants to suspend a session temporarily and resume afterwards, or regrets a decision and wants to step back.
4.3 Specifying the Data Sample

At the first start of the Calibrate session (or after Reset) MoCaVa opens a browser to select the data file used for the calibration (Figure 4.6). The default directory will be mocava3\Examples. Change the directory to where the prepared data were placed, i.e., mocava3\Examples\DrumBoiler (Figure 4.7). Open the dboil.mcv file.

4.3.1 The Time Range Window

MoCaVa sets up the first user's entry point in the Main window (Figure 4.8). It requires specification of which segment of the data file is to be used for calibration (if not all). Default values are the following:
Figure 4.6. Directory for data used for demo and case studies
Figure 4.7. Selecting data file for calibration
Figure 4.8. Specifying sample
: Time quantum: The smallest interval between sampling times.
: Start time: The time of the first record minus one quantum.
: Stop time: The sampling time of the last record.
: Time scale: The scale specified in Predat.
Click OK to accept the default values.

Help: Time range window

Edit the numbers in order to specify the calibration sample as a segment or all of the data file:
: Time quantum is the constant time increment used in the stepwise integration of the ODE defining the model. It should be as large as possible to make computing efficient. It is limited by the (shortest) sampling interval. Since IdKit accepts ‘stiff’ ODE, the time quantum does not have to be shorter than the fastest time constant in the model, i.e., the rate at which the states may change. Instead, it is limited by the rate at which time constants may change (“time constants” are the inverses of the eigenvalues of the state-to-derivative transfer matrix). If the time quantum is smaller than the sampling interval, it must be an integer fraction of it (1/2, 1/3, ...).
: Start time is the time at which states are to be initialized. This determines the first data record in the sample, as the first one that follows the start time. The default start time is therefore one default time quantum before the time in the first data record.
: Stop time determines the end of the sample.
: Time scale determines the interval between tick marks in plots.
The time unit of all entries is the same as that in the time variable in the data records.

Hints
: Whenever the specifications differ from the default values, some statistics of the sample will be displayed.
: If you are uncertain whether you can use the whole sampling interval as time quantum, make a test by halving it and see if the model responses change noticeably.
: You may specify a start time well before that of the first data record. The option is useful when the data were recorded in steady state, and it is difficult to specify prior values for the states. In that case you may use the model for determining a steady state, by running it past the transient phase, before comparing its output with the data.
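The halving test in the hint above can be sketched outside MoCaVa as well. The following plain-Python illustration (the first-order model dx/dt = -x + u and the step sizes are invented for the example) integrates with explicit Euler at a quantum h and at h/2 and compares the end responses:

```python
# Integrate dx/dt = -x + u with explicit Euler, step length h.
def simulate(h, t_end=5.0, u=1.0):
    x, t = 0.0, 0.0
    while t < t_end:
        x = x + h * (-x + u)   # one Euler step of length h
        t += h
    return x                   # response at the end of the run

x_h = simulate(0.5)
x_h2 = simulate(0.25)
print(abs(x_h - x_h2))  # small difference -> quantum 0.5 is adequate here
```

If the two responses differ noticeably, the quantum is too coarse and should be reduced to an integer fraction of the sampling interval.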
4.4 Creating a Model Component

After the sample specification MoCaVa requires specification of a model class. It consists of a set of connected components (or a single component) corresponding to different parts or phenomena of the physical object to be modelled. Components are ordered according to cause-and-effect relationships. The set is built progressively, as the calibration process proceeds, by appending components for more parts or phenomena, as needed. The DrumBoiler example will have four components at most. Preferably, the first component should describe only well-known phenomena in the object (so that it need not be changed later). MoCaVa opens the Component naming window (Figure 4.9) for the user to give the first component a name of alphanumeric characters without blanks. (From now on, the screen images will be cropped to show only the relevant parts.) Enter DrumBoiler and click OK. (If you need to do an extra click, you did not press Enter.)
Figure 4.9. Naming a new component
Figure 4.10. Selecting a component for editing
4.4.1 Handling the Component Library

MoCaVa opens the Component library window (Figure 4.10). The DrumBoiler component is empty. Mark it for Change and click OK.

Help: Component Library Window

It shows the list of defined components.
: Select Retain for components to be unchanged.
: Select Change to open an existing component for modification or define a new one.
: Select Delete to remove a component from the library index.
: Select Insert to add a new component to the library index. Its position in the list will be on the line after the Insert.
Click OK after selection. Alternatively, click User lib to define or modify a user function. A “user function” is a static or dynamic function common to several components in the project.

Hints

A component (with the exception of the top component) is always associated with one or more components that receive its Signal output. It must therefore be executed before all receiving components. MoCaVa executes active components from the bottom to the top of the list of components. A new component should therefore be placed below all its target components. The placement of the components in the list thus defines the causality between variables. Unlike in some simulation programs, causality is not determined automatically from the mathematical structure of the total system, but instead regarded as prior information, to be entered in this manner. In addition to simplifying the implementation, this also, and more importantly, removes the otherwise necessary fitting and testing of a number of mathematically feasible but physically impossible combinations of components. Since causality usually follows from the construction of the process to be modelled, it would be a waste not to use this prior information.
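The two conventions above, outputs connected to equally named inputs and execution from the bottom of the list to the top, can be illustrated with a toy sketch (plain Python, not MoCaVa; the component names, the sum-of-inputs behaviour, and the signal names are invented):

```python
# Component list in MoCaVa order: targets above, sources below.
components = [
    {"name": "Top",    "inputs": ["f_filtered"], "outputs": ["E"]},
    {"name": "Filter", "inputs": ["f"],          "outputs": ["f_filtered"]},
]

def run(components, external):
    signals = dict(external)
    for comp in reversed(components):            # execute bottom -> top
        ins = [signals[i] for i in comp["inputs"]]
        # stand-in behaviour: each toy component just sums its inputs
        for out in comp["outputs"]:
            signals[out] = sum(ins)
    return signals

print(run(components, {"f": 2.0})["E"])  # -> 2.0
```

Because Filter sits below Top, its output f_filtered is already available when Top executes; placing a new component above its target would break exactly this ordering.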
Components are connected automatically based on the names of their outputs; each output variable is connected to all inputs with the same name. Activating a component causes the otherwise constant inputs of its target(s) to be replaced with the component output signal(s) when the model is executed. Thus, a component may connect to several targets, and may receive input from several sources. Since the arrangement of components in the list determines the (reversed) order of execution, it is possible to connect signals only to components placed higher in the list.

Any connections in the opposite direction (downwards in the list) are established through the state variables, again automatically by their names. This means that, basically, systems involving dynamic feedback loops can be modelled using components inside the loop, but those having algebraic loops cannot. The reason is that states do not depend directly on the input, but will first have to pass an integrator (the ODE solver), before their values affect the next execution of the component sequence. If other component output with a direct dependence on the input were fed back, that would create an algebraic loop, and it would require iteration over the component sequence to resolve such loops.

However, MoCaVa provides a way out for such algebraic loops that can be approximated with ‘fast’ state equations. Hence, it is still admissible to feed back component output with direct dependence on input. Any algebraic loop created in this way will introduce implicit states with time constants fast enough to reach steady state within one time quantum. This creates a ‘stiff’ ODE which is resolved by the integrator in IdKit. Notice again that an algebraic loop within a component is not allowed. Since component equations are defined by assignment statements and not by equations, it is not possible to write such a component. The top component may execute alone.
Other components may execute as long as all their target components are active. It is possible to write the whole model as a single component. However, if the prior knowledge of parts of the object is uncertain, it is convenient to place the uncertainties in separate components. The advantage of this is that the model may be changed easily (by making another component active) in order to try several alternatives for a physical phenomenon or sub-unit with uncertain description. This also takes care of the housekeeping of the large number of alternatives that may play a part in the calibration session. Creating the components may require some thinking and key strokes, but may then, by combination, generate many more model classes, and it is the handling of the latter that is automated as far as possible.

The calibration session works by a series of expansions and tests of tentative model classes and structures (see the Pilot window). The ‘root’ model class is the one that is tried first, and should preferably remain unchanged during the calibration session. It contains the top component, and possibly other components. It should therefore contain only relations that can be trusted a priori, like those based on conservation laws. If the ‘root’ model class needs to be expanded in order to describe the sample data adequately, this is done by augmenting more components.

4.4.2 Entering Component Statements

MoCaVa opens the Component function window (Figure 4.11). The window expects state equations and algebraic equations describing the component in the form of a subset of MATLAB® M-file statements. Click OK to close the window.
Figure 4.11. Window for editing state equations
Figure 4.12. Window for editing start conditions
If the function is dynamic, i.e., there are state variables, MoCaVa opens a second function window for specifying initialization of the state variables (Figure 4.12). The simplest initialization model is to assign constant values to the states. Notice that component functions may use for statements to generate variable array indices. This makes it possible to write models that process arrays. Click OK.

Help: Component Function Window

Enter known or hypothesized relations between arbitrarily named arguments using a subset of MATLAB® M-statements. The following restrictions apply:
: Statements may contain elementary algebraic, for, if, else, elseif, and end statements, and the transcendental functions: sin, cos, tan, asin, acos, atan, exp, log, log10, sqrt. For-loop ranges may be either numbers or have symbolic names (but not expressions).
: The statements may contain calls to user-defined functions with the same syntax as in M-files, i.e., [out1,out2,...] = MyFunction(in1,in2,...) The brackets [] must be present, even with a single output.
: The statements may involve only scalar arguments or explicitly indexed vector elements, for instance inletpressure(i). Arguments must not contain underscores (_).
: Comment lines (with % in the first position), blank lines, and continuation symbols (...) ending lines are allowed. Statements do not have to end with semi-colons.
: There is a single addition to the MATLAB® statements: a time-differentiation operator D to be placed as prefix to a state variable.

Hints

Notice that x^2 and x^y would be illegal. They must be written as the equivalent expressions x*x and exp(y*log(x)).
Vectors can be manipulated using for statements. It is also feasible to handle matrices, even if not as convenient as in MATLAB®. For instance, a linear state-vector model dx/dt = Ax + Bu may look like this:

% Linear state-vector model:
for i = 1:n
  Dx(i) = 0
  for j = 1:n
    ij = i*n + j - n
    Dx(i) = Dx(i) + A(ij) * x(j)
  end
  for j = 1:m
    ij = i*m + j - m
    Dx(i) = Dx(i) + B(ij) * u(j)
  end
end
Avoid writing iterative loops in a function definition. The result of fitting a model whose response depends on a variable number of iterations is unpredictable. In order to prevent iterative loops, the while statement has been excluded. Do not try to outsmart this restriction, for instance by manipulating the for-loop index or range, or by using an if statement with a condition that depends on some error criterion to terminate the loop. Because of the different syntaxes of the MATLAB® elseif and the C else if statements, complex conditional statements involving elseif are somewhat hazardous. Preferably, stick to if and else. MATLAB® and C will also interpret expressions involving integers differently. For instance, C evaluates the expression 1/2 to an integer, which is 0. The ratio must therefore be written 1./2. to mark that the result should be a real number.

4.4.3 Classifying Arguments

MoCaVa opens a number of windows to allow the user to enter further qualifications and prior information about the object. The first one is the Argument classification window (Figure 4.13). It requests a classification of the arguments defined in the function, according to the rôles they will have in a simulation of the model:
: E and P are the dependent variables of interest outside the component. Select Response.
: K1, K2, K3, A1, A2, A3 are known constants. Select Constant.
: f is fuel flow, i.e., an input variable whose value is determined externally. Select Feed.
: u is control valve position, again determined externally. Select Control.
: TD, TR, A4, and initstate are constants whose values are uncertain and possible candidates for estimation. Select Parameter.
: distE and distP are unknown inputs, either constant or varying. Classify them tentatively as Parameter, since it might be enough to have non-varying disturbances in the model.
Click OK.
Figure 4.13. Classifying component arguments
Help: Argument Classification Window

Specify the class of each one of the listed arguments in the component. The class determines how the argument is to be interpreted by the calibration and validation programs.

There are four classes of component output:
: State derivatives: Time derivatives of states. Their classification is determined automatically from context and not changeable from this window.
: Signals: Variables that will be input to other, downstream components. They are classified automatically.
: Responses: Output that are of interest outside the component, but are not Signals. They may be connected to sensors (through a library function), or fed back to upstream component(s), or just used for display.
: Internal: Auxiliary arguments that are used in the statements defining the component functions, but whose values are not of interest outside the component.

There are nine classes of component input:
: States: Arguments whose values are predicted from the previous call. Their classification is determined automatically and not changeable from this window.
: Feedback: Arguments whose values are Responses of downstream components. They are classified automatically.
: Parameters: Arguments that are basically independent of time but may change between passes. Some or all may be subject to fitting. Alternatively, they function as targets for the output of other components, and can in this way be made to vary with time, when a constant value has turned out to be inadequate.
: Control and Feed are known inputs to the component, whose values are determined by a source model. The latter may be another component, or else a standard routine converting input from a data file. Control and Feed differ in three ways: 1) In their physical interpretations: Control inputs model known input (with negligible error), typically set points from digital controllers. Feed inputs model other input from known sources, typically other units or components. The corresponding data (if any) may have measurement errors. 2) In their conversion of input data: This is done by selecting from a menu of standard library functions for interpolation between discrete-time data. The admissible models for Control and Feed differ, in order to adapt to the different properties of the expected data sequences. In particular, Feed may need to filter input data. 3) In the way the graph of connected components is laid out: When the source is another component, the Feed terminals are placed on the right side, and the Control terminals on the bottom side of the (target) component. Control and Feed input without a source (data or active component) are treated as Parameters.
: Disturbance: Arguments whose values are generated by library models of stochastic processes.
: Constants: Arguments that are constant in time and may not be subject to fitting. The difference between Constants and such Parameters that are not fitted is that Constants will not appear in lists of parameters displayed to the user. The only way to change their values is to modify the component definition. Constants may be used as targets, in the same way as Parameters, Feed, and Control.
: Time and Start time: Arguments to be interpreted as physical time. Time arguments appear only in a component whose behavior depends on a clock reading.
Time is synchronized to the time variable in the data file.

Clicking Cancel takes you back to the window specifying the component equations. This is the typical response when you find a variable in the list that is a misspelling, or which you do not think should be there for some other reason.

Hints

The argument classification can be used to model the object in a way that suits physical intuition and makes the graph of the system of connected components look familiar to process engineers:
: The Feed classification is used mainly to visualize the operation of industrial production processes, consisting of separate units, each one accepting flows of commodities from other units, modifying their properties, and feeding the result to the next unit(s). In this interpretation, a component models a physical unit. In the graph any Feed input enters the component from the right.
: The Control classification is useful for such variables that affect the operation of the component. In the graph any Control input enters the component from below.
: When a Parameter input to a component is connected to the Signal output of another component, this is visualized by placing the source component inside the target. In this interpretation a component is a sub-model describing an internal physical phenomenon, that may or may not be important for the modelling. Finding this out is done by testing whether a constant parameter value (Dormant sub-model) will do almost as well as the (Active) sub-model in describing the model response.
Figure 4.14. Specifying I/O interfaces
: Constants are typically established physical data, their values obtained from handbooks, or fixed attributes of the object, like volumes. They may also be used to create ‘stubs’, i.e., points where to enter modifications of the model. Thus, adding a zero constant, or multiplying with a unit constant, creates such stubs. Making a component for a modifying sub-model and connecting it to the stub is a way to refine the model class without having to change any other component.

Note: Library functions may use one more class of input and output arguments, namely discrete-time states. They are transparent to the user, but needed for describing the conversions between discrete and continuous-time variables.

4.4.4 Specifying I/O Interfaces
MoCaVa opens the I/O interface window (Figure 4.14). It handles the specification of three kinds of interfaces to external variables, including conversion functions, when needed:
: Connections to sensors: This causes the inclusion of a library routine defining the transition from continuous variables to sampled data. The user is asked whether or not the variables have sampling sensors attached to them. The option applies to arguments classified as State, Response, Feed, or Disturbance.
: Source models: This causes the inclusion of source models generating continuous input from a library routine interpolating between the discrete-time data, in order to define the values of the continuous-time input between the sample points. The user may choose between a number of interpolators, or else indicate that the input will be generated by another component.
: Disturbance: This causes the inclusion of library ‘environment’ models. The user must choose between a number of standard models for generating random input with different characteristics.
Notice that it is possible to have a sensor attached to a Feed input, thus making it an output of the model. That is in agreement with the actual circumstances when the actual process input has been logged by the computer producing the data file, regardless of whether the source is an external process, or whether the input signal is generated inside the computer and then passed through an actuator before its values are logged.
Hence, if the actuator is not ideal, it makes sense to have both an interpolator and a sensor attached to the same process input. Using the option of assigning sensors to the Feed input may be useful in cases where there is no reliable information about the source of the input, except that the values have been recorded, probably with measurement errors, and possibly irregularly. In the DrumBoiler case, however, the input is known without measurement errors (it is assumed to have been issued by a control computer). The obvious choice of interface would therefore be the Hold model. In a case where the interpolation routine is not as clear-cut, it is usually better not to choose any of the library functions at this point, but instead let the sources of Feed and Control be modelled by separate components. The point is that this makes it easy to change the interpolation rule, without also having to change the root model. In order to illustrate this (even if it would not be needed in the present case), select User model for input f and u. Select Sensor for output E and P. Click OK to accept the selection.

Help: I/O Interface Window

This window handles the interfaces with one of three types of variables external to the component:
: Filed data.
: ‘Noise’ sequences from a random number generator.
: Signals from other components.
The first two require specifications of how to do the conversion between the different types of variables. The third type obviously does not need conversion, and neither does it necessarily need a source component to be connected (if not, it will be treated as Parameter and assigned its nominal value). Conversion specifications (if any) are requested for arguments with the following classifications:
: State, Response, Feed, and Disturbance allow inclusion into the component of a sensor model sampling the continuous-time values in the model.
There are two options, Sensor and NoSensor; the choice depends on whether or not there is a corresponding variable in the sample file that one wants to use for prediction and/or fitting.
- Feed input allows inclusion of selected routines describing relations between filed discrete-time data and the continuous-time input to the model. There are six options, either for interpolating or filtering data in the sample file, or else indicating that the input is to be a signal from another component.
- Control input allows inclusion of selected routines describing relations between filed discrete-time data and the continuous-time input to the model. There are four options, either for interpolating between data in the sample file, or else indicating that the input is to be a signal from another component.
- Disturbance input allows inclusion of stochastic models for the environment. There are three alternatives. The choice depends on the general character of the disturbance.
Notice that the conversion function is basically a part of the component. However, even if the source of a component input is a data file, it may be better not to include a conversion function. Instead, it is possible to select User model, and then create a separate component to do the conversion. Since it is generally not evident how to interpolate between discrete data, in particular if the latter are also contaminated by measurement errors, it may be necessary to try several interpolators. Placing them in separate components makes it convenient to try alternative input conversions.
The following are some comments on the definitions and properties of the standard routines for conversion between variables and data. They implement the algorithms in Section A.9.
Output Conversion Functions
The Sensor model samples variables with classes State, Response, Feed, or Disturbance with an interval equal to the time quantum, and adds a Gaussian random error. In case the data are sampled with a lower frequency, or some data points are missing, the sensor output will be ignored. Notice that Disturbance and Feed inputs may also have sensors. Both Feed and Disturbance are naturally regarded as inputs to the object of the modelling. However, they are also (continuous-time) outputs of the component, since the latter includes descriptions of input conversion (by library functions). The corresponding inputs to the component are discrete-time filed data and random numbers. The option of modelling input sensors provides a solution to the well-known problem of 'input noise': an estimate of the 'true' input is generated by a standard input filter fed with the noisy data, and then a Sensor function attached to the filter output provides a coupling to the same noisy data, which allows estimation of the filter parameters as well as a check of whether the right input filter has been selected. (If you dislike the idea of using the same data twice, you may classify the unknown 'true' input as Disturbance instead. This would be logically correct, since the input is in fact unknown. However, it is reasonable to expect that a stochastic disturbance model would not perform better, since it does not use any information in the data, in particular the presence of large transients. It will also make the calibration more time-consuming. When in doubt, you can always try both ways.)
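The input-noise pattern above — feed the noisy data through a standard input filter and let a Sensor on the filter output couple back to the very same data — can be sketched outside MoCaVa. The following is an illustrative Python toy, not MoCaVa code; the filter constant, noise level, and signal shape are made up:

```python
import random

def lp_filter(data, a):
    """First-order low-pass filter with unit low-frequency gain:
    a crude estimate of the 'true' input from its noisy samples."""
    est, x = [], data[0]
    for d in data:
        x = a * x + (1.0 - a) * d     # pole at a
        est.append(x)
    return est

random.seed(0)
true_u = [1.0 if 20 <= k < 60 else 0.0 for k in range(100)]   # unknown 'true' input
noisy = [u + random.gauss(0.0, 0.3) for u in true_u]          # filed data

est = lp_filter(noisy, a=0.8)
# The Sensor couples the filter output back to the same noisy data; the rms
# of these residuals is what fitting the filter parameter would reduce:
residuals = [d - e for d, e in zip(noisy, est)]
rms = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
```

If the right filter has been selected, the residual rms approaches the measurement-error level (here 0.3), apart from transients at the steps.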
See Section 2.3.5 for further discussion on the subject.
Data Input Conversion Functions
The Hold model makes a stepwise constant function from the data values, continuous to the right.
The FirstOrder model outputs the response of a first-order linear model with unit low-frequency gain to stepwise input. It is a reasonable choice when the response time of a stepwise driven actuator is unknown.
The SecOrder model outputs the response of a second-order linear model with unit low-frequency gain to stepwise input. It is a reasonable choice when the actuator may overshoot or its step response has a continuous derivative.
The Linear model makes a linear interpolation between the data values.
The Delay model is an approximation of a delay function with a (possibly unknown) delay time, which need not be an integer number of time quanta. The response is exact for ramp input, and a good approximation for processes with long response times.
The LPFilter applies a first-order linear digital filter to the data, and then makes a continuous-to-the-right stepwise constant function from the filtered values. It is useful for suppressing input measurement errors.
The NLFilter model is designed to eliminate a drawback of the linear filter, namely its tendency to respond slowly to large step changes when the measurement error level is high and therefore calls for a long time constant of a linear filter. The nonlinear filter uses a nonlinear gain function to reduce the response only to changes below the noise level (instead of the constant 'smoothing' of the linear filter). Thus it provides a faster response to changes above the noise level, while preserving the noise-filtering effect of the linear filter.
Disturbance Functions
In order to select a model for disturbances it helps to have at least an idea of the general character of the actual physical disturbance. If nothing is known about its source, and no plot is available, one may have to try several models. Try first the simplest model, the Brownian, and take a look at the estimated disturbance that results from simulating the model. This may reveal whether the disturbance has any of the general characteristics assumed by the alternative library models.
The Brownian model accumulates Gaussian random numbers, and makes a linear interpolation between the discrete values to create a continuous-time approximation of 'Brownian motion'. It has one characteristic parameter, the 'average drift rate'. Its general appearance is random drift with no given direction and no attraction to a 'zero level'. It is a very robust disturbance model, behaving reasonably well for most irregular, low-frequency variation, including infrequent and large random steps.
The Lowpass model makes a stepwise constant function from Gaussian random numbers and then applies a first-order linear filter. The result varies randomly around zero, and has little power in frequencies above its bandwidth. It has two characteristic parameters, namely 'bandwidth' and 'rms-value'. It is suitable for modelling low-frequency disturbances with the same general appearance throughout the sample, without a pronounced periodicity.
The Bandpass model makes a stepwise constant function from Gaussian random numbers and then applies a second-order linear filter with a pair of complex poles. It has three characteristic parameters, namely 'rms-value', 'frequency', and 'bandwidth'. Its general appearance is random variation with a dominating frequency, more or less pronounced. The dominating frequency must be well below the Nyquist frequency.
It is suitable for modelling waves, effects of slow and poorly damped control loops, limit cycles, clock-dependent environmental effects, and possibly vibrations, if the time quantum can be made small enough.
Note: Functions using only discrete-time state variables are less costly than those using continuous-time state variables. Thus the LPFilter and NLFilter take less computing than, for instance, the Linear interpolation. The reason is that continuous-time states add to the overall order of the model, which affects the computing time approximately by the second power of the order.
Note: There is no 'white-noise' model in the library. The only way to model high-frequency disturbances into a component is to use the Lowpass model with a bandwidth well above the relevant time constants in the component. The 'no-white-noise' restriction is introduced to prevent unexpected consequences of feeding white noise into a non-linear model.
4.4.5 Specifying Argument Attributes
MoCaVa opens the Argument attributes window (Figure 4.15). This window is one of the main entry points for the user's prior information (together with the Component function window). If possible, replace the default values with better ones.
- The nominal values for the input f and u have been set to approximate the sample averages. The nominal values for the rms values are much larger than the expected measurement error. The reason for this initial guess is that it anticipates that the first model structure is wrong, and that therefore the total model error will be much larger than the measurement error.
- The start value initstate has been given an implicit nominal value initstate, since it is a vector, while scaleE and scaleP are implicit because they are common to several arguments.
- The simple expressions for the scales and nominal values of the rms values derive from the Sensor library function. The unit factors specify that both the scales and the nominal values equal the scales of the measured variables. They may be edited to lower values.
- The Min and Max values specify that all physical parameters have positive values, and that A4 is limited to the range (0,1).
- The only arguments not created in the Component function window are rms_E and rms_P. They originate in the Sensor library function.
Figure 4.15. Editing argument attributes
After editing, the window for the DrumBoiler component should look as in Figure 4.15.
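The third use of Scale — setting the likely range of search — can be pictured as normalizing each parameter into a dimensionless search coordinate. The snippet below is a hypothetical Python illustration of that idea only; the function name and the exact mapping are invented, and MoCaVa's actual transformation (which also involves the Min/Max range) may differ:

```python
def to_free_coordinate(value, origin, scale, vmin=None, vmax=None):
    """Hypothetical normalization: dimensionless deviation from the origin,
    measured in units of the Scale attribute. Values outside the admissible
    Min/Max range are rejected."""
    if vmin is not None and value < vmin:
        raise ValueError("below admissible range")
    if vmax is not None and value > vmax:
        raise ValueError("above admissible range")
    return (value - origin) / scale

# A well-scaled parameter should give a coordinate of size about one:
c = to_free_coordinate(1.3e5, origin=1.0e5, scale=3.0e4, vmin=0.0)
```

This is why a coordinate much larger than one (see the Hints in Section 4.8.2) suggests either a too-small scale or an origin far from the optimum.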
Figure 4.16. Specifying implicit attributes
Click OK to accept the values in the table.
Help: Argument Attributes Window
This window is one of the main entry points for user prior information (together with the function window). If possible, replace the default values with better ones:
- Short description will be appended to the name of the argument in most communication with the user. The text string may include spaces.
- Dim is the dimension of the array. Dimensions are either editable or un-editable. Editable values must agree with the dimensions implicit in the component definition. Un-editable values either correspond to scalars, or else mean that the dimensions of the arguments have been set in previously defined components.
- Scale is mainly used for setting error levels in numerical approximations. It also determines the default scale in plots. Scales of Parameters have a third use: they are instrumental in setting the likely range of search for parameter estimates (the prior probability distribution in the Maximum Aposteriori Probability criterion, Section 2.4.1.2).
- Nominal values of parameters determine the start values in a potential search, and the values of known parameters and constants.
- Min and Max values set ranges of admissible values (optional).
Scales and nominal values may be specified implicitly by entering simple expressions of one of the forms a, b*a, label, label*a, or label(i)*a, where a and b are numbers, i is an integer, and label is the symbolic name of an array of constants, its values to be defined later. Implicit attributes are useful when several variables have the same scale, and are necessary when an argument is an array whose elements have different scales or nominal values. Simple expressions are useful when different arguments have strong and obvious physical relations.
4.4.6 Specifying Implicit Attributes
MoCaVa opens the Implicit attributes window for entering numerical values (Figure 4.16). The initial state is unknown; try zeroes. Click OK.
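The admissible simple expressions (a, b*a, label, label*a, label(i)*a) are easy to resolve mechanically once the label arrays are known. The following small evaluator is a hypothetical Python sketch of their semantics, not MoCaVa code (MoCaVa resolves the expressions internally):

```python
import re

def eval_attribute(expr, labels):
    """Resolve a simple attribute expression of one of the forms
    a, b*a, label, label*a, or label(i)*a (i is a 1-based index).
    'labels' maps each symbolic name to its list of constants."""
    expr = expr.strip()
    m = re.fullmatch(r'([A-Za-z]\w*)(?:\((\d+)\))?(?:\*(\S+))?', expr)
    if m and m.group(1) in labels:
        name, idx, a = m.groups()
        base = labels[name][int(idx) - 1 if idx else 0]
        return base * (float(a) if a else 1.0)
    prod = 1.0                      # plain numeric forms: a or b*a
    for tok in expr.split('*'):
        prod *= float(tok)
    return prod

# Hypothetical label table, in the spirit of the DrumBoiler example:
consts = {'scaleE': [2.0e6], 'initstate': [0.0, 5.0]}
```

For instance, 'scaleE*0.5' then resolves to half of the scaleE constant, and 'initstate(2)' to the second element of the initstate array.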
4.4.7 Assigning Data
MoCaVa opens the Data assignment window (Figure 4.17). Pull down the menu of variables in the data file to find those that correspond to the model variables.
Figure 4.17. Assigning data
The units of logged data often differ from those used in the model of the process proper. Therefore, and in order to eliminate errors caused by confusion of units, it is generally a good idea to use scale converting factors at this point: data = Factor * model output. When a model variable is a vector, then data for all entries in the vector must be stored in consecutive columns. In that case the first of the columns should be selected. The same factor will also be valid for all entries in the vector. In the present case there is no units conversion. Click OK. This ends the definition of the DrumBoiler component.
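The Factor convention (data = Factor * model output, with a vector variable stored in consecutive columns) amounts to simple bookkeeping, sketched below in Python. This is an illustration only, not MoCaVa code, and the example factor of 1e-5 (pressure in Pa logged as bar) is made up:

```python
def read_assigned(columns, first_col, dim, factor):
    """Convert logged data back to model units under the convention
    data = factor * model_output; a vector variable occupies 'dim'
    consecutive columns starting at 'first_col'."""
    return [[v / factor for v in columns[first_col + j]] for j in range(dim)]

# Column 1 holds pressure logged in bar; the (hypothetical) model works in Pa:
cols = [[0.0, 0.0, 0.0],            # column 0: some other variable
        [1.01, 1.02, 1.03]]         # column 1: pressure data in bar
pressure_pa = read_assigned(cols, first_col=1, dim=1, factor=1e-5)
```

Selecting the first column and one factor per variable, as described above, is all that is needed even for vector-valued outputs.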
4.5 Specifying Model Class
MoCaVa opens the Model class specification window (Figure 4.18).
Figure 4.18. Specifying model class
It shows a list of defined components, so far consisting of a single item. A tentative model class is specified by marking each component as either Active or Dormant. Selecting Show and clicking OK will display the component statements. Click Graph. MoCaVa draws a block diagram of the current tentative model class in the Model window, so far consisting of a single box (Figure 4.19). The boxes show only the input and output arguments involved, their classification, and, in graphs with more components, also their connections to other boxes. Outputs are listed to the left and inputs to the right.
Help: Model Class Specification Window
A tentative model class is defined by the following items: 1) a number of component definition directories (in casedir\Clib); 2) a system definition file (casedir\activity), indicating which components are currently selected. The latter is set through this window: mark the components of your choice as Active, and the others as Dormant. The components will connect automatically.
Figure 4.19. Trivial graph of a single component
Selecting Show for one or more components (and clicking OK) will display the statements defining them in a separate window. This window stays until it is closed manually. The decision buttons indicate a number of logical ways to proceed:
- OK will accept the model class as the tentative class and proceed to fitting and falsification.
- Simulate is a less bold decision. It causes the predictor defined by the model class with nominal parameter values to be simulated, and the predictions to be compared with the data sample.
- Graph will generate and display the graph corresponding to the current selection of active components. It is useful for checking the connections in a model class consisting of several components.
- Edit will open the Component library window for defining new components or changing or deleting old ones.
- Advanced will open a window for activating a number of tools for enhancing the processing speed in cases of large models, bypassing some of the default user check points, changing the time quantum, or debugging the user model.
- Suspend will suspend the session temporarily and return control to the MoCaVa window. You must also click Exit in the latter window to return to the MATLAB command window. Restarting Calibrate will open a window where the user may choose between Resume and Reset (see Section 4.19).
Hints
The graph (in the Model window) shows a block diagram of the current tentative model class. The boxes show only the input and output arguments involved, their classification, and their connections to other boxes. The colours of the argument names indicate the class (on a colour screen), and arrows indicate connections to other boxes or to the data file. The convention for positioning the blocks is that Feed inputs enter from the right, while Control inputs enter from below.
The graph is laid out to support the modelling of industrial production processes, comprising chains of units modifying the properties of input commodities and feeding the product to the next unit. Boxes within another box connect to parameters of the receiving box and indicate a refinement of that box; the constant parameters have been replaced by the output of the boxes inside. A new graph will be created whenever there is a change in a component or in activity status.
Figure 4.20. Editing parameter origin
The graph is constructed automatically from the model class specifications (as opposed to the case in Simulink). This means that in more complicated cases a box may overlay some connecting lines between other boxes. One can see this from the fact that the box has no terminal (arrow head) at places where connections disappear under the box and reappear on the other side. Notice that the terminals are either input or output. If you do not want to see a new graph all the time (and want to save some time in complicated cases), click Advanced and have the display suppressed.
4.6 Simulating
A cautious user will try out a new component before building a complete system from this and other components. That is possible to do, since all input have been (tentatively) classified as either Parameter, Feed, or Control, and given nominal values. The 'model' will simulate badly in this case, since without source components Feed and Control are constant, while the corresponding data are variable. However, the step may reveal any bugs in the newly written component. In more dubious cases than the present one there is another reason for simulating as soon as a new component has been created, namely that some of its inputs may change very little, or their variation may have little effect on the output. In the interest of parsimony it is often worthwhile to investigate the hypothesis that a constant input will do. Click Simulate.
4.6.1 Setting the Origin of the Free Parameter Space
MoCaVa opens the Origin window (Figure 4.20). The "origin" is the set of parameter values used in the predictor simulation that follows. The default values are either the nominal values entered into the Argument attributes window in Figure 4.15 (not necessarily the 'true' values), or else values previously fitted to data. The list is termed "origin", since any variation in the parameter values (for fitting or other purposes) is done around the values in the list, i.e., in the "free parameter space". Click OK.
Figure 4.21. Selecting variables for plotting
Help: Origin Window
The values displayed in the table are those of the best of the significantly better alternatives so far to the current tentative model (except at the start of a session, or at the activation of a new component, in which cases they are the nominal values set when the component was defined). They are used in simulations, and normally also make a good starting point for the fitting of a new tentative model. You may overrule the recommended values, if you have information that suggests otherwise. Press Export to save the current values to an ASCII file. Press Import to enter parameter values from a prepared ASCII file, instead of doing the editing manually. In order to allow the Import option, the file containing the values to be imported must be one of:
- An ASCII file with records of the following type: parametername value1 value2 ...
- An M-file with records (possibly commented): parametername = [value1 value2 ...];
where value# are valid scalar real numbers. Pressing Export creates a file of the second type with address casedir\status\parameters. Notice that only the values may be edited. (Changing a short description in this window will have no effect. It has been placed in a field that allows editing only because this allows sidewise scrolling of long text strings.)
4.6.2 Selecting Variables to be Plotted
MoCaVa starts the predictor simulator and, after finishing, opens the Plot specification window (Figure 4.21). Check the boxes of the variables to be plotted. Default are the variables associated with data and residuals (one-step prediction errors). You may also specify a smaller time range to be displayed.
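The two parameter-file formats accepted by Import above (plain 'name value1 value2 ...' records, and the M-file form that Export writes) are simple enough to read and write with a short script, for instance when preparing start values outside MATLAB. A hedged Python sketch, not part of MoCaVa:

```python
import os
import tempfile

def export_parameters(params, path):
    """Write the M-file form: parametername = [value1 value2 ...];"""
    with open(path, 'w') as f:
        for name, values in params.items():
            f.write(f"{name} = [{' '.join(repr(v) for v in values)}];\n")

def import_parameters(path):
    """Read either accepted form; '%' starts an M-style comment."""
    params = {}
    with open(path) as f:
        for line in f:
            line = line.split('%')[0].strip()
            if not line:
                continue
            sep = '=' if '=' in line else ' '
            name, _, rest = line.partition(sep)
            rest = rest.strip().rstrip(';').strip('[]')
            params[name.strip()] = [float(v) for v in rest.split()]
    return params

# Round trip through a temporary file:
path = os.path.join(tempfile.mkdtemp(), 'parameters')
export_parameters({'rms_E': [0.1], 'initstate': [0.0, 0.0]}, path)
restored = import_parameters(path)
```

The parameter names ('rms_E', 'initstate') follow the DrumBoiler example; the file location casedir\status\parameters is where Export itself writes.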
Figure 4.22. Response of DrumBoiler model with zero input
Figure 4.23. Standard deviations of prediction errors
Click OK. MoCaVa displays the result in two windows: the Plot window (Figure 4.22) and the Model class appraisal window (Figure 4.23).
4.6.3 Appraising Model Class
There is no obvious error (except that the inputs are constant and the start values are wrong). The model obviously needs the inputs. Click Reject.
Help: Model Class Appraisal Window
Study the plots to appraise the model class with default parameters. Continuous curves are model output and discrete points are data.
- Click Accept if you believe the model class will hold a tentative model, provided some parameter values are fitted.
- Click Reject if something is obviously wrong with the model class. This will take you back to the Model class specification window to set up another model class.
- Click Simulate if you want to see whether other parameter values will help. This will take you back to the Origin window.
- Click Layout if you want to display other variables or limit the time range in order to obtain better time resolution. This will take you back to the Plot specification window.
- Click Rescale to change the scale of a variable for better amplitude resolution. This will allow you to click on the y-axis graduation, in order to open a window for changing the scale attribute of the variable. The new scale will be retained until further manual scaling.
The prediction errors are expressed in percentages of the rms-values of the output. Hence a 'null' model (all predictions zero) will have 100% prediction error.
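The percentage normalization just described can be stated exactly in a few lines. An illustrative Python check (not MoCaVa code):

```python
def prediction_error_pct(data, predictions):
    """One-step prediction error as a percentage of the rms value of the
    output, so an all-zero ('null') predictor scores exactly 100%."""
    n = len(data)
    rms_data = (sum(y * y for y in data) / n) ** 0.5
    rms_err = (sum((y - p) ** 2 for y, p in zip(data, predictions)) / n) ** 0.5
    return 100.0 * rms_err / rms_data

y = [1.0, -2.0, 3.0, -1.0]
null_score = prediction_error_pct(y, [0.0] * 4)     # the 'null' model
perfect_score = prediction_error_pct(y, y)          # exact predictions
```

Any useful model should therefore land well below 100% on this scale.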
4.7 Handling Data Input
MoCaVa opens the Model class specification window again. Click Edit to indicate that the model does not have enough components. MoCaVa opens the Component library window again to allow you to expand the library. Select Insert and click OK in order to add a component handling the control input. MoCaVa opens the Component naming window. Give the second component the name Control. MoCaVa opens the Component library window, now containing two components. Mark Change for the new, empty component. MoCaVa opens the Component function window to receive M-statements.
The purpose of the component is to describe the connection between the continuous control signal u and the corresponding values in the data file. Since the user-defined functions only allow continuous variables, a library routine must be used to handle the conversion from discrete-time data to continuous-time input. Now, a component must have an output, which means at least one assignment statement. In the current case the output is a signal that will replace the Control input u in the DrumBoiler component, once it has been connected. The minimum component is therefore u = uc, where uc (continuous-time) is the output of a library routine interpolating between the data points. (If the input conversion were included in the DrumBoiler component instead, the auxiliary uc variable would not appear.) Click OK.
MoCaVa opens the Argument classification window. Select Control and click OK. MoCaVa opens the I/O interface window to receive the user's choice of library routine. Select Hold and click OK. This will cause the library routine modelling the selected interpolation mechanism to be included. MoCaVa opens the Argument attributes window. Edit the default values, if necessary, and click OK. MoCaVa opens the Data assignment window. Again there is an option for units conversion, data = Factor * model input, but the DrumBoiler case does not need it.
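The Hold rule selected above (and the Linear alternative from the library) can each be stated in a few lines. The following is an illustrative Python sketch of the two interpolation rules, not MoCaVa's library code:

```python
import bisect

def hold(times, values, t):
    """Hold: stepwise constant, continuous to the right — u(t) is the
    most recent logged value at or before time t."""
    i = bisect.bisect_right(times, t) - 1
    return values[max(i, 0)]

def linear(times, values, t):
    """Linear: straight-line interpolation between neighbouring samples."""
    i = min(max(bisect.bisect_right(times, t) - 1, 0), len(times) - 2)
    w = (t - times[i]) / (times[i + 1] - times[i])
    return values[i] + w * (values[i + 1] - values[i])

# Made-up sample times and control values:
ts = [0.0, 1.0, 2.0]
us = [0.5, 1.5, 0.7]
```

"Continuous to the right" means that at a sample instant the new value applies, as the second assertion below checks.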
Figure 4.24. Graph of the root model class
Pull down to find the data that correspond to uc and click OK. This ends the definition of the Control component. MoCaVa opens the Model class specification window, containing two components. It would be possible to simulate and plot again, to see how much it helps to have added a source for the Control input. However, to shorten the proceedings somewhat, first provide for the Feed input. Create a Feed component in the same way. MoCaVa opens the Model class specification window, containing three components. All components are Active, indicating that they are connected to form the model class. Click Graph. MoCaVa adds two new boxes in the Model window (Figure 4.24). In this graph of the three-component system the former parameters u and f have been replaced by input terminals connected to the outputs of components generating the variables u and f. The latter have input connections to the data file. Click Simulate again, and OK twice. MoCaVa shows the result in Figure 4.25. This looks better. Nothing in the plots suggests an error that cannot be amended by some fitting of parameters. The obvious first candidates are the start values. Click Accept to acknowledge the first tentative ('root') model class consisting of three connected components.
4.8 Fitting a Tentative Model Structure
MoCaVa opens the Tentative structure window (Figure 4.26). Again, "structure" means "class" with a given selection of free parameters. Select the parameters to be fitted by checking the boxes. By default, the standard deviations of the measurement errors are indicated, since they are notoriously difficult to know in advance. In addition to the 'true' measurement errors, they generally have to account for all modelling errors in a fitting. Also free the two elements in initstate, since it is evident from Figure 4.25 that their zero values are far off. Click OK.
Figure 4.25. Responses of the ‘root’ model with nominal parameter values
Figure 4.26. Selecting free parameters
MoCaVa opens the Origin window again (Figure 4.27), this time to specify start values for the search for optimal values of the free parameters. Click OK to acknowledge the default parameter values.
4.8.1 Search Parameters
MoCaVa opens the Search specification window (Figure 4.28). It allows the user to change the design parameters in the search routine. Click OK to accept the default values and start the search.
Figure 4.27. Setting start values for the search
Figure 4.28. Setting search parameters
Help: Search Specification Window
Both the determination of the best model within the tentative structure, and the determination of alternative model(s) to compare with the tentative one, use the same search routine. A difference is that in the first case the search is to go on until convergence, while in the second case one or a few iterations may achieve what is needed to reject the tentative model and point out a better alternative. The default maximum number of iterations is therefore set differently in the two cases, namely to 1 if the test satisfies the "nesting" condition, and otherwise to 16. Basically, the search routine is a modified Newton-Raphson procedure, which normally requires few iterations, but may have trouble when the loss function differs much from a quadratic form of linear expressions in the free coordinates. It may therefore need some 'coaching' from the user in difficult cases, and the listed design parameters are the means to control the search.
Hints
The search is time-efficient for the loss function used for the fitting, since it uses a non-negative definite approximation of the Hessian, computed from only first-order derivatives of the model residuals. For parameter values far from the optimum, and large variations in the residuals' sensitivities to the free coordinates, the estimate of the Hessian may however deviate much, causing the steps taken towards the minimum to be inefficient or even counter-productive. For structures such that the residuals are affine (linear) functions of the free coordinates, the search converges in one step, but normally it takes more. The number it takes depends on the search parameters, and the setting takes some skill in difficult cases. The following provides some guidelines:
- The Maximum number of iterations: In the beginning of the search (when parameters are far from the optimum) it may be wise to use few iterations (two or four), in order to check the start-up. This causes the search to be halted, and gives the user a possibility to modify the search parameters and restart (click Reject), or else to continue (click Unfinished) if things seem satisfactory. In difficult cases, one may even limit the search to one iteration, in order to have user control over the search all the way. A quick way to change the maximum number of iterations is to use the buttons [*2] or [/2] for doubling or halving the displayed number.
- The Step reduction factor: The direction of a step never points in a direction where the loss function increases. However, the length of the step may be wrong in two ways (since it is estimated from an uncertain estimate of the curvature of the loss function): 1) It may be much shorter than optimal, recognizable from the fact that neither the steps nor the convergence indicator values change much from one iteration to the next, and the loss function decreases by about the same (small) decrement each time. 2) It may be substantially longer than optimal, recognizable from the fact that successive steps change sign to step back. This indicates 'overshoot'. If the step is much longer than optimal, the overshoot may even be so large that the loss increases. Modifying the step reduction factor is a means of rectifying this.
- The Step limit: This sets a limit to the step length.
The search coordinates are normalized (using the Scale values of the parameters) in such a way that step lengths larger than one should be unlikely. Hence, much longer steps indicate that the search may be going astray, and possibly out of the region of attraction of the optimum. The idea is that in cases of multiple minima, the search should be drawn to the one closest to the start value. It would therefore be reasonable to lower the step limit in difficult cases, and accept the risk of an increased number of iterations.
- The Regularization parameters: Ill-conditioned cases (most often caused by freeing too many parameters simultaneously) cause the Hessian to be nearly singular and the steps to deviate much from the steepest-gradient direction. Positive regularization parameters reduce the degree of singularity of the estimated Hessian. The two design parameters r1 and r2 cause a constant value N*r2 to be added to the diagonal elements of the Hessian (where N is the number of data records), which is then amplified by a factor 1 + r1. The effect is that the search direction will be drawn towards the steepest-gradient direction, with a strength given by the values of the parameters. Well-conditioned cases are not affected by small values of the parameters, and that is the reason why the default values are 0.001. Larger values (up to 1, and more) slow down the search, and it is generally better to reconsider the number of free parameters.
- The Overshoot limit: The loss value is normalized in such a way that its statistical uncertainty is about one. If the loss value were to increase much more than that, this would be a clear indication of a large overshoot. The overshoot control will then reverse the direction, take half a step back automatically, and continue to do so until the limit is satisfied. This is a safety measure for the case that the user has left the search control to the computer (by specifying a large maximum number of iterations).
- The Prior weight: This sets the factor α in the term ½α‖θ‖² in the loss function (Equation 2.26), and thus reflects the user's confidence in the origin of the free parameter space. Notice that this also has an effect on the Hessian; a zero weight may have to be compensated by a positive regularization r2. However, the parameters are not interchangeable; a positive weight changes the optimum, while a positive regularization does not.
- Generally, manual search control is one of the points where a user will have an opportunity to exercise his/her (hopefully increasing) skill in dealing with difficult non-linear model structures. The auxiliary displays evoked by setting the Logging index constitute the data from which one can appraise the success of the search. Admissible values are
0: No auxiliary printout
1: Printed are Loss, Convergence indicator, Free coordinates
2: Printed are Loss, Convergence indicator, Free coordinates, Loss gradient, Hessian, Free coordinate changes.
As a rule of thumb, a smooth search should reduce the convergence indicator by about one unit per iteration, at least after some initial and transient steps. Less smooth searches may still converge, but take longer.
- In cases where the routine has difficulties converging, you may also try the step-adaptation routine. You activate it by checking the Adaptation box.
4.8.2 Appraising the Search Result
MoCaVa compiles and executes the C-based fitting routine and displays the result in the Search appraisal window (Figure 4.29). The record shows that the loss decreases rapidly, and the convergence indicator becomes negative after four iterations. Click Plot to get more information on which to base the next decision (Figure 4.30). The model predicts poorly, but the fitting seems to work. Click Accept in the Search appraisal window.
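The interplay of the first-order Hessian approximation, the regularization parameters r1 and r2, the step reduction factor, and the step limit can be illustrated on a one-parameter toy problem. The sketch below is illustrative Python, not MoCaVa's compiled C routine (which carries far more machinery, including the overshoot control):

```python
def gn_step(theta, xs, ys, r1=0.001, r2=0.001, step_red=1.0, step_lim=1.0):
    """One regularized Gauss-Newton step for the residuals e_k = y_k - theta*x_k.
    The (here scalar) first-order Hessian approximation gets N*r2 added and is
    then amplified by 1 + r1, drawing the step toward the gradient direction."""
    N = len(xs)
    grad = sum(-x * (y - theta * x) for x, y in zip(xs, ys))   # dLoss/dtheta
    hess = sum(x * x for x in xs)            # J'J only, no second derivatives
    step = -step_red * grad / ((hess + N * r2) * (1.0 + r1))
    step = max(-step_lim, min(step_lim, step))                 # the Step limit
    return theta + step

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]              # exact fit at theta = 2
unregularized = gn_step(1.9, xs, ys, r1=0.0, r2=0.0)   # affine residuals: one step
regularized = gn_step(1.9, xs, ys, r1=0.0, r2=1.0)     # shorter, safer step
```

With affine residuals the unregularized step lands on the optimum in one iteration, exactly as the Hints above describe; a positive r2 shortens the step without changing the optimum it converges to.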
Help: Search Appraisal Window
Look at the search log and make an assessment of the search:
: Click Accept if the search has converged (convergence indicator < 0) and nothing is obviously wrong with the model, taking into account the restrictions of the model structure (it is the search, and not necessarily the model structure, that is 'accepted').
: Click Reject to restart the search, for instance from a new origin or with new values of the search routine parameters. Even if the search has converged, you may click Reject if you suspect you have reached a local minimum, or just want to make sure by trying a different start.
: Click Unfinished if the search still looks promising, but needs more iterations, and possibly better values of the search routine parameters.
: Click Confirm to restart the search from the point where it stopped. This may be useful if you have assigned a prior weight to deviations from the origin, thus forcing the search to compromise between the origin and the Likelihood maximum. Large deviations from the origin will cause a warning of unsuitable parameter scaling.
Practical Grey−box Process Identification
Figure 4.29. Search score
Figure 4.30. Responses of the root model class with fitted start values
Confirm can be used to amend that, without actually doing a rescaling. Since the search will be restarted from an origin closer to the unweighted optimum, the prior loss will decrease, and vanish after a few Confirms (see also Hints).
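The interplay between the prior weight and Confirm can be illustrated with a sketch of the penalized loss of Equation 2.26. The names here are illustrative, not MoCaVa's API; the coordinates are the dimensionless free search coordinates, measured from the origin:

```python
import numpy as np

def prior_weighted_loss(ml_loss, coords, alpha):
    """Loss with the prior term (1/2)*alpha*||theta||^2 of Equation 2.26.

    coords: free search coordinates measured from the origin, so the
    penalty vanishes at the origin and grows quadratically away from it.
    """
    theta = np.asarray(coords, dtype=float)
    return ml_loss + 0.5 * alpha * float(theta @ theta)
```

Each Confirm restarts the search with the origin moved to the latest optimum, so the coordinates (and with them the prior term) shrink toward zero over successive restarts.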
: Click Plot if you want to see the standard variables (data input, sensor output, and residuals) before deciding.
: Click SelectPlot if you want to see other variables as well.
: Click Customize to change some default design parameters, in particular the increments used in numerical differentiation with respect to states and parameters.
Hints
The absolute loss values are difficult to interpret directly; only the differences matter. Generally, loss differences smaller than one are insignificant (the loss is scaled in that way). A better interpretation is provided in the Test appraisal window, which shows the outcome of testing the fitted model.

The free coordinates are those used in the search. They are dimensionless functions of the values and attributes of the physical parameters (origin, range, and scale). Normally, the coordinate values should not be much larger than one. If they are, a warning is displayed. This suggests one of two things: either i) the scale of the parameter corresponding to the largest coordinate is too small, or ii) its origin is too far from the optimum. The second case can be checked by re-running the search starting from the optimum. The Confirm button is for that purpose. Only if that does not help will it be necessary to increase the scale attribute (click Accept, NewClass, Edit, Change, etc., until you get back to the Argument attributes window to change the scale).

If the search log shows a long sequence of positive and unchanging values of Conv (indicating nonzero gradients) followed by unchanging values of Loss, this may be due to rounding errors caused by too small an increment. Try the Customize option and increase DELTA. However, too large an increment may also cause similar problems.

The values of the physical parameters are shown together with the estimated standard deviations for those that have been fitted. However, these are reasonable measures of the significance of the estimates only if the model class is right. Again, a better assessment of significance is provided by the Test appraisal window.
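The DELTA trade-off mentioned in the Hints can be demonstrated with a simple forward-difference sketch. This is illustrative only, not IdKit's differentiation code: too small an increment lets floating-point rounding dominate, too large an increment lets truncation error dominate.

```python
import numpy as np

def fd_gradient(f, x, delta):
    """Forward-difference gradient of a scalar function f at x.

    delta plays the role of the DELTA increment: accuracy degrades at
    both extremes of its value.
    """
    x = np.asarray(x, dtype=float)
    g = np.empty_like(x)
    for i in range(x.size):
        xp = x.copy()
        xp[i] += delta
        g[i] = (f(xp) - f(x)) / delta
    return g
```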
4.9 Testing a Tentative Model Structure
The next step is to try to falsify the tentative model structure by falsifying the best tentative model (just obtained). In preparation for this, one or more alternative (and preferably better) structures have to be conceived. For this purpose MoCaVa opens the Alternative structures window (Figure 4.31). (The window also has an alternative form, allowing more options and suiting more complex cases.) This is a point where the user must contribute 'engineering sense' in order to suggest amendments to the discrepancies seen in the model response. At least two things are wrong in Figure 4.30:
: The power transients are wrong, as seen most clearly from the sequence of power residuals.
: There are drifting disturbances in both outputs.
It is reasonable to assume that fitting some or all of the parameters that affect the responses directly will help. The free parameter spaces in this window are used to set up the search for better values. A "free parameter space" is an array of dimensions of the parameters that the
Figure 4.31. Specifying alternative model structures
Figure 4.32. The DrumBoiler statements
search routine may use to improve the fit. The dimension of a free scalar parameter is 1, and that of a bound one is 0. But which parameters should be freed, and how many? The window allows up to eight alternatives, in case it is unclear what to do next. A general strategy would be to free each one of the currently unfitted parameters, thus making as many alternatives (the "Stepwise Forward Inclusion" rule). However, it is often more efficient to use one's prior knowledge of the model structure to decide what to do. Click Show to display the model statements and check the box for DrumBoiler in the window that opens. MoCaVa displays the statements in Figure 4.32. You may close the display after using it, or keep it as a reminder of what you have defined. It seems easiest to reduce the disturbances first, by fitting the two parameters distE and distP, thus making a single alternative #1. Click [+] twice to indicate this (Figure 4.33). Then click NewDim to initiate the testing.

Figure 4.33. A single alternative with two more free parameters

Help: The Alternative Structures Window
In order to create efficient tests of the tentative model structure, MoCaVa needs specifications of one or more alternative structures for comparison. Providing such specifications is the user's main tool for entering prior information in addition to the component specifications. You may create the alternative(s) in one of two ways:
: By expanding the tentative structure within the current model class (= the set of active components). There is room for a maximum of eight such alternatives, specified by the free-space indices in the eight columns. As long as the number of free parameters of any of the listed parameters (indicated by the current index values) is smaller than the maximum number (indicated under Max), you may free more parameters by increasing one or more indices (click on +). The indices of specified alternatives turn red. For the sake of parsimony, it is recommended that you increase only a few index values (normally one), and the same number for each alternative (column). Click NewDim when all alternatives have been set.
: By expanding the current model class. Click NewClass. This will open another window for changing the model class.
Click Show if you want to display selected components before deciding. Click Plot or SelectPlot to appraise the response of the current tentative model before deciding.
Click Verify if the plots look good and you therefore may consider ending the calibration session. This starts a number of unconditional tests based on computing actual correlations between variables that should not be correlated if the tentative model were true. In addition, the appearance of 'outliers' among the residuals is tested. Logically, these tests cannot stop the calibration conclusively, since a hypothesis can never be verified by statistical tests. However, it can be falsified, which yields a one-sided stopping rule: it determines when not to stop.
Click Advanced if you want to suppress some or all of the user's checking of the computer's proposals on parameter values, free space, and search parameters. The window will take long to build in cases with many parameters.
MATLAB needs the time to generate the large number of [+] and [−] buttons required for minimizing the number of the user's mouse clicks. As an alternative, the user may call for a different 'primitive' window, which takes somewhat more keystrokes and mouse clicks to fill in, but much less time to generate. This window is also more versatile, and allows more alternatives (up to 16 vector or scalar parameters). Click Advanced and check the appropriate box. The setting will not take effect until the next time the Alternative structures window is opened. To get it opened immediately, you may use the 'dummy' commands of clicking first NewClass and then OK.
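The "Stepwise Forward Inclusion" rule mentioned above can be sketched as a routine that generates one alternative per parameter that can still be expanded, each alternative freeing exactly one more entry than the tentative structure. The dictionary representation here is hypothetical, not MoCaVa's internal format:

```python
def sfi_alternatives(free_index, max_index):
    """Stepwise Forward Inclusion.

    free_index: per-parameter count of currently freed entries (tentative).
    max_index:  per-parameter maximum (the Max column in the window).
    Returns one alternative index set per parameter that can be expanded.
    """
    alts = []
    for name in free_index:
        if free_index[name] < max_index[name]:
            alt = dict(free_index)
            alt[name] += 1  # free exactly one more entry
            alts.append(alt)
    return alts
```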
Help: The Primitive Alternative Structures Window
Being 'primitive', it allows more combinations of alternatives. You can specify alternative model structures in three ways:
: Enter the index value (the number of free entries) of a parameter and press Enter. The corresponding index for the tentative structure is indicated by the number of checked boxes. The column will be completed automatically with the tentative values of the other indices. Any column created in this way may be edited further, in case one wants to free more parameters in the same alternative. Thus, instead of a single click, each alternative will take two keystrokes and one mouse click.
: Check the boxes to the right and verify the choice by checking one in the top array of boxes for creating a column of dimension values.
: Use the "<" macro (for expand). When applied to array parameters this will create a number of alternatives, each one freeing one more of the entries in the array. This implements the "Stepwise Forward Inclusion" rule. For an application see Section 7.6, Figure 7.15.

4.9.1 Appraising the Tentative Model
Suggesting alternatives in order to improve the tentative model is a task that obviously needs human support. The evidence presented by the computer consists of plots of the model's performance and the parameter values maximizing the performance. The following more or less obvious guidelines may help in interpreting and using the evidence.

The Model Output
The obvious thing to do with deterministic models (no Disturbance variables) is to examine how the sensor output of the model matches the data. The residual sequences are the deviations. When the model is to be used for purposes of feedback control, how the response patterns match locally is more important than whether the residuals have a tendency to drift. Hence, the value of the loss is not a good measure of the model's predicting ability.
Only when the purpose is long-range feed-forward control do the levels of the residuals also become important.

The Residuals
The most important plots in less clear cases are the residuals, i.e., the variations in the response data that are not explained by the current model. The possibility of improving the model hinges on whether the residuals contain information that could still be exploited to reduce the unexplained part. This information is of two kinds:
: Auto-correlation: residuals depend on past residuals.
: Cross-correlation: residuals depend on past and present input.
A way to exploit the information in past residuals is to look for the physical cause of the deviation and, if it cannot be explained in other ways, introduce a Disturbance variable at the point in the model where the actual disturbance is believed to enter. An easier but less informative alternative is to add fictitious disturbances to the output, where the effects of physical phenomena from unknown sources finally appear. The alternative has a drawback in case the deviations in several measured outputs have a common cause in the form of a dominating and un-modelled physical phenomenon inside the process or in its environment. Then the fictitious disturbance variables will be unnecessarily many, which is detrimental to the model accuracy. Hence, do not introduce fictitious disturbances until you are convinced that you have really no idea of where
the deviations come from! And do not unnecessarily pre-filter the raw data, or remove bias, trends, and the like! The phenomena behind such behaviour are real and should be taken into account in the modelling, instead of being swept under the carpet as 'contaminations'.

The effect of having disturbance variables in the model is that the residuals will now be the errors in a prediction of the sensor output, using past sensor data in addition to input data. IdKit derives the predictor from the disturbance model. This means that the residuals will no longer contain information in past residuals that can be explained by the disturbance model. Ideally, residuals should look 'white', which means that there is no information in past residuals that could be used to improve the prediction.

Even if auto-correlation has been removed by introducing disturbances, there may still be unused information in past and present input. Look for large transients in the residuals. If such transients coincide in time with large changes in an input, or follow closely after, this is evidence that the dependence on this particular input has not been modelled, or not modelled correctly. This indicates which part of the model to improve.

The Disturbance
The 'whiteness' of the residuals is not a sufficient condition for a correct model. Often, it is also difficult to see whether residuals are actually 'white'. It may therefore help to look at the predicted disturbance variables. If they vary in coincidence with the data input, this is evidence that there is a correlation, contrary to the hypothesis that they are independent disturbances. This could then be used to improve the disturbance model. If possible, try to find the physical cause. If that is not possible, add a black-box model for the hypothesized relation to the disturbance variable.

The Parameter Values
Preferably use parameters that have physical meanings.
This has the advantage that one may have an idea of a parameter's likely range of variation, and even a reasonable value to start with, obtained from external sources. The knowledge that a parameter is positive may be valuable information. The obvious way to exploit this is to set the range of the parameter, thus preventing the search routine from trying any value outside the range. However, the information (of a given physical range) can also be exploited in another way: if you remove the range, the search routine may or may not find a value outside the range. And if that 'model' (with an unphysical parameter value) should still be significantly better than the tentative model, the conclusion is this: the tentative model is rejected, but the alternative model does not describe the data either. This puts the focus on the sub-model to improve, namely the one containing the unphysical parameter. In this way a test of the tentative model yields one of three answers (instead of the ordinary "reject" or "do not reject"), namely "do not reject the tentative model", "reject the tentative model and accept the alternative", and "reject the tentative model, but do not accept the alternative" (see also Section 1.5.2).

Cross-correlation
The Verify command evaluates a number of auto- and cross-correlation estimates. However, they are somewhat blunt falsification instruments, meaning that although the probability of rejecting a correct model is small, that of not rejecting an incorrect model may be larger. There are two reasons for this:
: The averaging of a large number of products that computes the correlation estimates tends to reduce the effect of even significantly large transients, which will
therefore not show up in the correlation estimates. Again, look for such transients in the residuals, and their possible coincidences with other variables. The Verify option points out the largest of the transients, if any.
: Cross-correlation tests are unconditional tests and do not necessarily have 'maximum power' (= maximum probability of rejecting an incorrect hypothesis). The maximum-power Likelihood-Ratio tests require alternative hypotheses (the alternative model structures), and are conditional on those alternative hypotheses. They are evoked by the NewDim command.

4.9.2 Nesting
MoCaVa opens the Origin window again. The values of the four fitted parameters have replaced the nominal values. In normal cases the window is displayed for confirmation and as a reminder of the parameter values of the tentative model that the expanded alternatives are to be compared with. It would therefore save some mouse clicking to have it suppressed (click Advanced). However, entering a different value will set up a comparison with a single alternative model within the same structure. This may be useful, for instance, if one suspects that the tentative model has actually been over-fitted, i.e., some fitted parameter could well be replaced with its nominal value without a significant increase of loss. The same effect can be obtained, however, in a more systematic way: see Section 7.8 for a case of over-fitting and reduction of model structure. Click OK. MoCaVa opens the Search specification window (Figure 4.34).

Figure 4.34. Setup for the ALMP test

The purpose this time is for the user to set or acknowledge the search parameters used for finding alternative models within the alternative structures to compare with the tentative model. Since these do not have to be the best models within the alternative structures, the search needs fewer iterations, normally one.
A single iteration allows the much faster ALMP (Asymptotic Locally Most Powerful) test to be applied, instead of the more cumbersome LR (Likelihood−Ratio) test. Both tests require the tentative and the alternative structures to be “nested” (Section 2.4.2).
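The kind of unconditional correlation check that Verify performs might be sketched as follows. This is a rough illustration, not MoCaVa's actual statistics: normalized cross-correlations of the residuals with an input at several lags, flagged when they exceed the usual two-standard-error bound of roughly 2/√N for white residuals.

```python
import numpy as np

def cross_correlation_flags(residuals, inputs, max_lag=10):
    """Cross-correlation of residuals with an input at lags 0..max_lag.

    Returns (lag, estimate, flagged) triples; flagged means the estimate
    exceeds the approximate 2/sqrt(N) whiteness bound.
    """
    r = np.asarray(residuals, float)
    r = r - r.mean()
    u = np.asarray(inputs, float)
    u = u - u.mean()
    n = r.size
    threshold = 2.0 / np.sqrt(n)
    flags = []
    for lag in range(max_lag + 1):
        c = float(np.dot(r[lag:], u[:n - lag]) / ((n - lag) * r.std() * u.std()))
        flags.append((lag, c, abs(c) > threshold))
    return flags
```

As the text notes, such averages can wash out short but significant transients, which is one reason the conditional Likelihood-Ratio tests are preferred when alternatives are available.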
Notes on "Nesting"
It simplifies the testing to use "nested" structures, i.e., structures such that the tentative model structure is contained in all of the wider structures defined by the alternatives. If there are still reasonable parameters to free in the current class (in addition to those indicated in the table), the nesting condition is that no index in the alternative be smaller than that in the tentative structure, and at least one be larger. It is also recommended that the number of free parameters be increased by the same amount (preferably one) in all alternatives. If the alternatives are to be defined within a different class, the nesting conditions are less simple, but a necessary requirement is that the alternative structure contain at least the same components as the model to be tested, plus at least one component more. Thus, increasing the free space within the current model class satisfies the nesting condition. Expanding the model class by augmenting a component may also satisfy the nesting condition, provided the parameters in the new component have their "null" values. "Null" values are parameter values that make the expansion 'null and void', i.e., when simulated, the alternative responds as the tentative model. Changing the model class in other ways, for instance by replacing a component, does not satisfy the nesting condition. In such a case, the comparison must be "fair" for even the test of comparing likelihoods to be feasible. A necessary condition for a "fair" test is that the total dimensions (= sums of index values) of the free spaces of the tentative structure and those of the alternative(s) be the same. Normally, MoCaVa determines automatically whether the "nesting" or "fairness" conditions are satisfied and chooses the maximum number of iterations accordingly. Occasionally, the routine cannot decide this based on the structure and parameter information alone, and will therefore need a trial test to find out which test to use.
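The nesting and fairness conditions on the free-space indices can be written down compactly. The dictionary-of-indices representation is a sketch, not MoCaVa's internal one:

```python
def is_nested(tentative, alternative):
    """Nesting within one class: no per-parameter free-space index in the
    alternative is smaller than in the tentative structure, and at least
    one is larger."""
    return (all(alternative[p] >= tentative[p] for p in tentative)
            and any(alternative[p] > tentative[p] for p in tentative))

def is_fair(tentative, alternative):
    """'Fair' comparison: equal total dimension of the free spaces."""
    return sum(tentative.values()) == sum(alternative.values())
```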
Even an 'unfair' test will pass the checking, as long as it is unfair to the alternative. The logic for this is that if an alternative turns out better, even though it has fewer free parameters than the tentative model, the latter must still be rejected. The cause of any violation of the conditions, and therefore of a failure to set up a test, will be displayed. This gives the user some hints on what to amend in the alternative structures in order to allow testing. Click OK to go back and reduce the number of free parameters. This will reduce the degree of dependence on data, and thus change the outcome of the "fairness" check. Alternatively, the user may force MoCaVa to proceed with the testing by pressing Overrule, thus taking over the responsibility for interpreting the test result. Even if a test is 'unfair', its outcome may still be obvious enough to make it unnecessary to compute test statistics other than the loss reduction.

4.9.3 Interpreting the Test Results
MoCaVa shows the test result in the Test appraisal window (Figure 4.35). The 'framed' parameter values are those freed in the alternative structure in addition to those in the tentative structure. Normally, it is enough to select the alternative with the smallest risk among those with physically reasonable parameter values. In less clear-cut cases than this one, see "Help: Test Appraisal Window" on how to interpret the test statistics in general. Click Select_#1, since the risk is zero (= smaller than allowed by the numerical accuracy). This falsifies the tentative model structure, and indicates that the alternative is a better one.
Figure 4.35. Test results with a single alternative
Help: Test Appraisal Window
There are three sets of statistics to support an appraisal of the results of the hypothesis testing:
: The Test statistics are the first to look at. The Risk value is the probability that rejecting the tentative model in favour of the alternative is wrong, i.e., that instead the 'null' hypothesis holds that the tentative model is as good as the alternative, while the loss difference is due to chance alone. It can therefore be interpreted as the risk of getting a somewhat more complex model than would have been needed to satisfy the data sample. The risk has been calculated from the loss reduction. The loss reduction is the logarithm of the likelihood ratio between the alternative and the tentative model. Twice its value is approximately chi-square distributed (for long samples), with the number of degrees of freedom indicated in the next row.
: The risk value is valid for "nested" cases. In other cases, the test has to be "fair", i.e., the number of degrees of freedom must not be positive. In this case Risk is defined as the value of the a priori probability one would have to assign to the alternative model in order not to prefer it. If that probability is small, there is little risk in rejecting the tentative model.
: In the case that the tested model and the alternatives are based on different data samples, a test statistic cannot be computed, and the appraisal must be based primarily on the rms values of the prediction errors. They are expressed as percentages of the rms values of the data sample.
: Check also whether the parameter estimates are reasonable. In the case that the rejection risk is small for an alternative, but the corresponding parameters are out of reasonable ranges, this indicates that the tentative structure is wrong, but that there is no physically reasonable model within the alternative structure either.
If no alternative so far is significantly better, then you may either click Continue to search further for a better alternative model within the same structure, or Restart to try better search parameters, or QuitSearch to try a better alternative structure. Click Advanced if you want to reduce the number of windows that open for checking the computer's proposals.

Notes on Model Structure Selection and Risk
Primarily, MoCaVa aims at determining a model of the least complexity that the data allow, and uses repeated falsifications of tentative models of various, usually increasing, complexities to achieve this. The falsification of a tentative model is based on the concept of 'risk'. Likelihood-Ratio tests are employed, since they have the maximum 'power' (discriminating capacity) of all statistical tests, i.e., they have the minimum risk of accepting an inadequate model for a given risk of rejecting an adequate model, and thus the smallest risk of selecting a somewhat more complex model than would have been necessary. The second, 'given' risk level must be specified by the user. Reasonable values are 0.1 to 0.001, but the value is decisive only in cases where the alternatives fit the data almost equally well or badly, so that the data are insufficient to discriminate clearly between the alternatives. Hence, the risk level displayed for each alternative is the level that would be needed to reject the tentative model in favour of the alternative. Notice that the risk value for each alternative increases when there are several alternatives, since this increases the probability that one of them will deliver a spurious statistic by chance. However, this may be decisive only when the best alternative is only marginally better.
The risk level can be used to control the complexity of the model indirectly: a smaller specified risk of false rejection results in a less complex model (but with an increased risk of having accepted an inadequate model). When the "nesting" condition is satisfied, MoCaVa uses the LR or ALMP tests to compute the risk. Those tests are used to discriminate between model structures of different complexities (different numbers of free parameters). Discrimination between structures of equal complexity is done by computing the Likelihood Ratio LR. The risk is 1/(1 + LR).
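The two risk computations just described can be sketched as follows. This is illustrative, not MoCaVa's code; the chi-square survival function is given in closed form for one and two degrees of freedom only:

```python
import math

def lr_risk_nested(loss_reduction, dof):
    """Nested case: 2 * loss_reduction ~ chi-square(dof) under the null
    hypothesis; the risk is the survival probability at that value."""
    x = 2.0 * loss_reduction
    if dof == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if dof == 2:
        return math.exp(-x / 2.0)
    raise NotImplementedError("general dof needs an incomplete-gamma routine")

def lr_risk_equal_complexity(loss_reduction):
    """Equal complexity: risk = 1 / (1 + LR), where the loss reduction is
    the logarithm of the likelihood ratio LR."""
    return 1.0 / (1.0 + math.exp(loss_reduction))
```

A loss reduction of zero gives risk 1 (no reason to reject) in the nested case and risk 0.5 in the equal-complexity case, consistent with the interpretations above.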
4.10 Refining a Tentative Model Structure
The next step in the calibration procedure is to refine the tentative structure. The MoCaVa user's guide proposes the obvious: that the replacement be the selected alternative structure. MoCaVa opens the Tentative structure window, in order to give the user the option of having a different opinion. Click OK to acknowledge the alternative structure as the new tentative structure. This does not acknowledge the alternative model, since it is not necessarily the best within the alternative structure. The user's guide therefore returns to the fitting procedure, which means first setting the start values for the search. MoCaVa proposes the falsifying parameter values as the new origin, and opens the Origin window once more, again to give the user an opportunity to overrule. Click OK to acknowledge the alternative model as the start value for the search for the best model within the same structure.
Figure 4.36. Responses of the new tentative model
As before, the Search specification window appears. Click OK. MoCaVa searches for the minimum loss and opens the Search appraisal window to report the result. Click Plot to see the effect of the search. MoCaVa displays Figure 4.36 in the Plot window. The search worked well, and fitting the disturbance parameters apparently helped. Click Accept. This creates the new tentative model and closes the recursion.
4.11 Multiple Alternative Structures
MoCaVa returns to the Alternative structures window. The power transients in Figure 4.36 are still wrong, and it is unclear which of the remaining parameters will offer the most effective amendment. As seen from the component equations (if you kept the window; otherwise click Show), the parameters TD, TR, and A4 all affect the power output E directly. Lack of other prior knowledge suggests that the SFI rule be used. Make three alternatives by freeing each one of the parameters (Figure 4.37).

Figure 4.37. Setting up for three alternative structures

Click NewDim. MoCaVa opens the Origin window. Click OK. MoCaVa opens the Search specification window. The routine should be clear by now: the best of the alternative structures will be the new tentative structure, and the next step is to find the best model within that structure, which is again tested against one or more models in still better alternative structures. In this procedure MoCaVa will repeatedly open windows to display the settings of "origin", "free parameter space", and "search specifications" proposed by the user's guide in the Calibrate procedure, thus giving the user frequent opportunities to overrule the proposals. Since it may be somewhat irritating to see such never-used options in clear-cut cases, the user also has the option of suppressing the check points. Regardless of this, the windows are opened whenever a search fails and MoCaVa has to ask for help from the user. Click Advanced to set up for fewer check points. MoCaVa opens the Advanced specifications window (Figure 4.38). Section A.9 describes the options available in the Advanced specifications window. Uncheck Origin, Free parameters, and Search rule. Then click OK. MoCaVa opens the Search specification window once more; the turn-off does not take effect until the next time. Click OK. MoCaVa opens the Test appraisal window (Figure 4.39). Rejecting the tentative model (with the wrong transients) in favour of alternatives #1 and #3 is risk free. Choose #1, since it has the larger loss reduction. Press Select_#1 to make alternative #1 the new tentative structure. Owing to the suppression of three of the user's check points, the refining of the tentative structure and the fitting of the best tentative model within this expanded structure are now automatic. MoCaVa displays the result of fitting alternative #1. Click Plot to see the effect. MoCaVa displays Figure 4.40. Click Accept to make the fitted model the new tentative model. MoCaVa opens the Alternative structures window. The transients remain. Free each one of the remaining parameters to make two alternatives. MoCaVa shows the test results in Figure 4.41. Press Select_#2. MoCaVa displays the result of fitting alternative #2. Click Plot to see the effect. MoCaVa displays Figure 4.42. Click Accept.
Figure 4.38. Some options for running MoCaVa
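The fit-test-refine recursion that this walkthrough follows can be summarized as a skeleton. It is purely illustrative; the callback functions stand in for MoCaVa's fitting and testing machinery and are not part of its API:

```python
def calibrate(fit, falsify, propose_alternatives, structure):
    """Skeleton of the calibration recursion.

    fit:                  structure -> fitted model
    falsify:              (model, alternatives) -> chosen structure or None
    propose_alternatives: model -> list of candidate structures
    """
    model = fit(structure)
    while True:
        alternatives = propose_alternatives(model)
        if not alternatives:
            return model          # nothing left to try within the class
        better = falsify(model, alternatives)
        if better is None:
            return model          # tentative model not falsified
        model = fit(better)       # refit within the selected structure
```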
MoCaVa opens the Alternative structures window. The transients have been reduced, but are still evident. Free the last remaining parameter as well. MoCaVa shows the test results in Figure 4.43. Press Select_#1. MoCaVa displays the result of fitting the last alternative. Click Plot to see the effect. MoCaVa displays Figure 4.44. The obvious transients have vanished, but there are still significant low-frequency disturbances. Click Accept. MoCaVa opens the Alternative structures window. Since all the freedom in the current model class has now been exploited, the class is unable to model the disturbances and must be amended with more components. Click NewClass.
4.12 Augmenting a Disturbance Model
MoCaVa opens the Model class specification window. Click Edit.
Figure 4.39. Test results for three alternative structures
Figure 4.40. Responses of the current tentative model
Figure 4.41. Test results for two alternative structures
Figure 4.42. Responses of the current tentative model
Figure 4.43. Test result
Figure 4.44. Responses of the current tentative model
Figure 4.45. Specifying argument attributes for low−amplitude disturbance models
MoCaVa opens the Component library window for expansion. Select Insert and click OK. MoCaVa opens the Component naming window. Name the component Disturbance and click OK. MoCaVa opens the Component library window again, this time for defining the component. Click OK. MoCaVa opens the Component function window. The disturbances in the Plot window look irregular, and there is no knowledge about their sources. It remains to describe them as unknown varying input. The relation is the same as for known input from a data file, namely output = input. Thus, the minimum component is the same for known and unknown input. The difference is the definition of the source of the input. Enter distE = Ev and distP = Pv. Click OK. MoCaVa opens the Argument classification window. Classify Ev and Pv as Disturbance. MoCaVa opens the I/O interface window. The disturbances are clearly dominated by low frequencies. Try Brownian models. There are no sensors measuring the disturbances. Select Brownian and NoSensor. Click OK. MoCaVa opens the Argument attributes window (Figure 4.45). The default values for the drift-rate parameters are small, in order for the alternative to be close to the tentative model (see the example in Section 2.6 for a motivation). Click OK to acknowledge. MoCaVa opens the Model class specification window. Click Graph. MoCaVa displays the block diagram in Figure 4.46. A new box has appeared, this time inside the DrumBoiler. That is because the arguments replaced by input variables were classified as Parameter. The placement serves to illustrate a 'refinement' of the DrumBoiler component: it now has a better disturbance model (than a constant disturbance). Notice that the standard functions generating the disturbances Ev and Pv introduced the new parameters drift_Ev and
4 Calibration
Figure 4.46. Graph of the model class
Figure 4.47. Result of simulating with nominal disturbance parameters
drift_Pv. They also introduced the new states x_Ev and x_Pv, but states are not shown in the graph. Click Simulate. MoCaVa opens the Plot specification window. Click OK. MoCaVa displays the result in Figure 4.47.
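The 'Brownian' option models each unmeasured disturbance as a random walk whose increments are white noise scaled by the drift-rate parameter. A minimal sketch of such a disturbance generator (plain Python with hypothetical names; MoCaVa builds the corresponding state equations for x_Ev and x_Pv internally):

```python
import math
import random

def brownian_disturbance(n_steps, drift, dt=1.0, x0=0.0, seed=0):
    """Random-walk ('Brownian motion') disturbance: x[k] = x[k-1] + w[k],
    where w[k] is zero-mean white noise with standard deviation drift*sqrt(dt)."""
    rng = random.Random(seed)
    x, path = x0, []
    for _ in range(n_steps):
        x += drift * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

# A small drift rate keeps the disturbance close to a constant, so the
# alternative model class starts out close to the tentative one.
dist = brownian_disturbance(1000, drift=0.002)
```

With drift = 0 the generator degenerates to a constant, which is exactly why small nominal drift rates make the alternative nearly nested in the tentative model.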
Figure 4.48. Message issued during the process of selecting test
Figure 4.49. Message issued during the process of selecting test
This is worse than before! Either the new parameters are wrong, or the 'Brownian motion' model is inadequate. However, it is the model class that is being appraised, and fitting the parameters has not been tried yet; the start values are obviously wrong. Click Accept. MoCaVa opens the Alternative structures window. There is no point in testing the tentative model with constant disturbances against the current alternative with nominal values of the drift rates. Therefore, free the two drift rates to allow a better alternative. The number of free parameters will be the same, so a comparison should be "fair". Free all parameters and click NewDim. MoCaVa starts a number of routines for setting up a suitable falsification test, and finds that the nesting condition for the faster ALMP test cannot be determined a priori (Figure 4.48). The message gives the reason. The ALMP test may or may not still be feasible. To find that out, MoCaVa will next do a 'trial' test and analyze the result. Click OK to proceed. MoCaVa displays the message in Figure 4.49. Even the a posteriori test has failed to ascertain nesting. Click OK to evoke the checking of the condition of "fairness". MoCaVa has found the test "fair", searches for an alternative model, and opens the Test appraisal window (Figure 4.50). The tentative model is falsified with zero risk. Since the parameter values are not physically unreasonable, the alternative is better. Click Select_#1. MoCaVa searches for the best model in the new structure and displays the result in Figure 4.51. Click Plot. MoCaVa displays Figure 4.52.
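The falsification step rests on comparing nested model structures by their likelihoods. As a generic illustration (not MoCaVa's ALMP implementation, which also handles the nesting analysis described above), a likelihood-ratio test computes twice the log-likelihood gain and reads the rejection risk from a chi-square tail. The helper below covers only even degrees of freedom, where the chi-square survival function has a closed form:

```python
import math

def chi2_sf_even_df(x, df):
    """Chi-square survival function for even df, from the closed form
    sf(x) = exp(-x/2) * sum_{i < df/2} (x/2)^i / i!."""
    assert df % 2 == 0 and df > 0
    h = x / 2.0
    return math.exp(-h) * sum(h ** i / math.factorial(i) for i in range(df // 2))

def lr_test_risk(loglik_tentative, loglik_alternative, extra_params):
    """Risk (p-value) of wrongly rejecting the tentative model in favour of a
    nested alternative with `extra_params` additional free parameters
    (restricted to even `extra_params` by the closed-form tail above)."""
    stat = 2.0 * (loglik_alternative - loglik_tentative)
    return chi2_sf_even_df(max(stat, 0.0), extra_params)

# A large likelihood gain relative to the two added drift-rate parameters
# gives a near-zero risk, i.e., the tentative model is falsified.
risk = lr_test_risk(-520.0, -480.0, extra_params=2)
```

A risk near zero, as reported in the Test appraisal window, means the tentative structure can be rejected with essentially no chance of the decision being wrong.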
Figure 4.50. Result of testing the tentative model
Figure 4.51. Result of fitting
Since the residuals appear to be ‘white’, this looks good. The residuals no longer contain any predictable parts that could be used to improve the model’s predicting ability. Click Accept.
Figure 4.52. Result of simulating the new tentative predictor
Notes on "Fairness"

If a comparison between tentative and alternative structures is 'unfair', this is detected by the routine setting up the test. In that case, click OK to go back and reduce the number of free parameters. Alternatively, Overrule the default common origin (= parameter values for the null hypothesis). This will reduce the degree of dependence on data, and thus change the outcome of the "fairness" check. However, if you choose to overrule, you should be aware of the conditions for a fair test: for instance, do not try to be clever by replacing a fitted value with the same value, or a value close to it. This may let you pass the test set-up procedure, but it would cheat the test, and the computed risk values would be wrong.
4.13 Checking the Final Model

MoCaVa opens the Alternative structures window. The observation that residuals appear to be 'white' is the main indicator for when to start thinking about ending the calibration session (see Section 4.9.1 for other indicators). However, MoCaVa provides some means for checking further the hypotheses inherent in all the models, namely that all disturbances are independent of each other and of input data. If that turned out not to be true, there would be more information in the data that could be exploited to improve the model. Click Verify. MoCaVa computes a number of relevant correlations and checks that they are not significantly different from zero. In addition, the program investigates possible 'outliers' among the residuals (Figure 4.53). The probability that any of the outcomes of those multiple (seven) tests may be due to chance alone (and thus the risk of rejecting a correct model) is also displayed. The
Figure 4.53. Results of correlation tests
test clearly falsifies the model. However, the largest structural error is 0.67%, which says that there is no more reduction in prediction error to be gained by looking for an improvement. Still, there might be structural information to gain. As long as there are other alternatives to investigate there is good reason for returning to the more informative conditional Likelihood-Ratio tests. Click OK to close the window and then NewClass to look for something better.

Notes on Correlation

Computing auto-correlations of residuals and cross-correlations with other residuals and with input is a standard way of making an unconditional test of a tentative model. In particular, it tests the hypothesis that stochastic disturbances (from the 'environment') are independent of other (Control and Feed) input. The test is 'unconditional', since it does not rely on explicit alternative hypotheses for comparisons. It has less power than the conditional LR and ALMP tests used in MoCaVa, meaning that the risk that an incorrect model will not be rejected is higher, although the risk of rejecting a correct model is the same. Therefore, correlation tests are used as a last resort, when one has already used all prior information on which to base an alternative hypothesis. The results are displayed as a number of chi2-statistics with the indices of the correlated sequences and the number of degrees of freedom as arguments. The degrees of freedom equal the maximum lag used in the auto-correlations, and one more than used
in the cross-correlations. For a correct model the probability is small that the chi2-statistic will have values much larger than the number of degrees of freedom. Therefore the probabilities associated with the actual values of the chi2-statistics are also displayed. If a probability is small for any of the correlations, the model may be rejected with an equally small risk that the decision is wrong. The total risk associated with the multiple testing is also computed. Notice that it is larger than the smallest of the risks associated with each test (according to the theory of "order statistics", Wilks, 1962). The following is a conventional proposal, although without theoretical support: require a risk value below 0.001 ("strong significance") before rejecting a model.
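The idea behind such a whiteness statistic can be sketched with a Box-Pierce-style test: N times the sum of squared sample autocorrelations up to a maximum lag is approximately chi-square with that many degrees of freedom when the residuals are white. (A generic sketch, not MoCaVa's implementation; normalization details differ between variants.)

```python
import random

def autocorr_chi2(residuals, max_lag):
    """Box-Pierce-style whiteness statistic: Q = N * sum of squared sample
    autocorrelations up to max_lag.  For white residuals Q is approximately
    chi-square distributed with max_lag degrees of freedom."""
    n = len(residuals)
    mean = sum(residuals) / n
    c0 = sum((e - mean) ** 2 for e in residuals) / n
    q = 0.0
    for k in range(1, max_lag + 1):
        ck = sum((residuals[t] - mean) * (residuals[t - k] - mean)
                 for t in range(k, n)) / n
        q += (ck / c0) ** 2
    return n * q

# For white residuals Q should be of the order of max_lag; values far
# above it indicate predictable structure left in the residuals.
random.seed(1)
white = [random.gauss(0.0, 1.0) for _ in range(500)]
q = autocorr_chi2(white, max_lag=10)
```

Comparing Q against the chi-square tail with max_lag degrees of freedom then gives the displayed risk of rejecting a correct model.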
4.14 Terminals and 'Stubs'

MoCaVa opens the Model class specification window for the user to make structural improvements. Since the present case is simulated, we know that at least the root component of DrumBoiler is correct. Data was copied from a file used in the testing of IdKit1, and originally generated using the same root model. The disturbances entered into the original model were 'Brownian' too. However, for the demo case let us pretend that it is unknown where the disturbances entered. Click Show, check DrumBoiler, and click OK. MoCaVa displays the statements in Figure 4.32. Three places are reasonable for adding disturbances: a) at the outputs E and P, b) at the inputs f and u, or c) at the state derivatives Dstate. Case a is the hypothesis used so far, and it has been falsified. Click Edit to try the alternatives. MoCaVa opens the Component library window. As before, introducing disturbances is done by replacing existing parameters or constants with signals from disturbance components modelling an 'environment'. They function as 'input terminals' for the signals. Generally, all input of classes Feed, Control, Parameter, and Constant may be used as input terminals. However, in the DrumBoiler component only the output disturbances have parameters distE and distP to replace, and thus terminals to connect to. Hence, the root component must first be modified; it does not have the appropriate input terminals. That appears to break the rule that the root component should be unchanged when components are added. However, a complete specification of a component includes information on the points at which the component is uncertain. In other words, one must anticipate where it may need improvement (but not how). Since the modelling apparently failed in that respect, it must be re-done. Select Change for DrumBoiler and click OK. MoCaVa opens the Component function window. 'Stubs' are a particular kind of constants used to create input terminals.
They are quantities in the model equations that have no effect, until they are possibly replaced by connecting a component. Thus, additive constants with zero values and multiplicative constants with unit values are stubs. Introduce stubs for adding to f, u, and Dstate (Figure 4.54). MoCaVa opens the Argument classification window.
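The stub idea can be illustrated in code terms: with the stubs at their zero nominal values the model is unchanged, and a disturbance component is connected later by replacing a stub with its output signal. (A toy Python sketch with invented coefficients, not the actual DrumBoiler equations.)

```python
def toy_rhs(x, f, u, stubf=0.0, stubu=0.0, stubx=0.0):
    """Toy right-hand side with additive stubs.  With the stubs at their
    zero nominal values the model behaves exactly as before; a disturbance
    component can later be connected by replacing a stub with its output."""
    f_eff = f + stubf        # stub as input terminal on the feed input
    u_eff = u + stubu        # stub as input terminal on the control input
    # Invented coefficients, for illustration only:
    dx = -0.1 * x + 0.05 * f_eff + 0.02 * u_eff + stubx  # stub on the derivative
    return dx

# With all stubs at zero the response is identical to the original model.
assert toy_rhs(1.0, 2.0, 3.0) == toy_rhs(1.0, 2.0, 3.0, 0.0, 0.0, 0.0)
```

Multiplicative stubs work the same way, with unit nominal values instead of zero.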
Figure 4.54. DrumBoiler modified by added stubs
Figure 4.55. Copying a component
Classify the new stubs as Constant. (The old ones, distE and distP, were classified as Parameter in order to make it possible to test the hypothesis that a constant output disturbance would be adequate.) Assign zero nominal values to the stubs. Click OK until MoCaVa opens the Model class specification window. Then click Edit to open the Component library window, select Insert, and click OK to expand the library.
4.15 Copying Components

MoCaVa opens the Component naming window. The new component describes the source of the input to the arguments stubf and stubu. It is convenient to make it a copy of the Disturbance component, although with a different name. Enter a new name. Pull down the list of previously defined components and pick Disturbance to be copied (Figure 4.55). Click OK. MoCaVa opens the Targets specification window for editing argument names. Only the outputs must be renamed (they are marked Unique). Other arguments may share names with those of another component, as long as both components are not Active. Rename the outputs (Signals) to agree with the input 'stubs' and click OK (Figure 4.56). This concludes the definition of the InputDisturbance component. Repeat the procedure to define the StateDisturbance component. MoCaVa opens the Model class specification window. Make InputDisturbance Active and the other two disturbance models Dormant (Figure 4.57). MoCaVa opens the Alternative structures window.
Figure 4.56. Editing argument names of a copy
Figure 4.57. Hypothesizing input disturbance
Free all parameters except the (zero) distE and distP and click NewDim (Figure 4.58). MoCaVa executes the test selection and test routines and then opens the Test appraisal window (Figure 4.59). There is a zero risk in rejecting the Disturbance model. That has already been established by the unconditional correlation tests. But the conditional LR test also yields a better model. Click Select_#1 to make this the next tentative model structure. MoCaVa searches for the best tentative model and displays the result. Simulation and plotting show no noticeable difference. Click Accept and try Verify again. MoCaVa displays the result in Figure 4.60. The risk of 0.7 of rejecting this model is too high even to consider. There is only a 0.026 probability that the first residual of E is not an 'outlier', so there is some initial misfit. A look at the plot reveals that this vanishes rapidly and should be of no concern. However, try also the other alternative disturbance (it might still be better).
Figure 4.58. Defining an alternative structure
Figure 4.59. Results of testing the output disturbance model
Click OK and then NewClass. MoCaVa opens the Model class specification window. Make StateDisturbance Active and the other two disturbance models Dormant. MoCaVa opens the Alternative structures window. Free all parameters except the (zero) distE and distP and click OK.
Figure 4.60. Results of correlation tests
MoCaVa executes the test selection and test routines and then opens the Test appraisal window (Figure 4.61). The StateDisturbance model is not significantly different. Click Select_#1 to get the model. MoCaVa searches for the best tentative model and displays the result. Click Accept and try Verify again. MoCaVa displays Figure 4.62. The result is similar.
4.16 Effects of Incorrect Disturbance Structure

The following can be concluded from the testing of the different disturbance models:

- The difference in simulation results is not noticeable to the eye, and the prediction errors are the same.
- Both correlation tests (Verify option) and Likelihood-Ratio tests are able to reject the case of 'output disturbances'.
- The estimates of the drum boiler parameters are the following:

                          TD       TR      A4
  True model:             300      15      0.28
  Output disturbances:    285.3    16.5    0.261
  Input disturbances:     301.4    14.6    0.277
  State disturbances:     301.7    14.6    0.277
Figure 4.61. Results of testing the input disturbance model
This is evidence that an incorrect disturbance structure ('output disturbance') has a significant detrimental effect on the estimation accuracy, even if the models' responses are indistinguishable to the eye.

- Unknown to the user, the 'true', data-generating model had state disturbances. MoCaVa cannot distinguish objectively between that and the InputDisturbance model. However, a comparison of the disturbance levels needed to satisfy the data in the two cases indicates that the incorrect placement of the disturbances requires a (fictitious) disturbance level drift_Ev that is two orders of magnitude higher than needed otherwise. Hence, Figures 4.59 and 4.61 yield some added evidence for the user to select the (correct) StateDisturbance model.
- Outlier detection (Verify option) pointed out a small initial transient. Notice that 'outliers' in residuals mean that there are either corresponding outliers in the data, or transient errors in the model. The Verify option cannot detect which. A conditional Likelihood-Ratio test possibly could, provided the user is able to propose an alternative structure without the error.
- When estimating parameter values, it is customary also to compute the estimated standard deviations of the estimation errors (they appear within parentheses following the estimates in the Search appraisal window). However, they are generally not to be trusted. For one thing, they assume that the structure is correct. If it is not, their values still decrease with an increasing amount of data, without the actual accuracy increasing at the same rate.
Figure 4.62. Results of correlation tests
4.17 Exporting/Importing Parameters

The current parameter estimates may be saved each time the Origin window is open. Click Advanced and check Origin to restore the display of the Origin window. MoCaVa opens the Origin window (Figure 4.63). Click Export. MoCaVa creates an ASCII file with contents

% DrumBoiler
TD = 301.668;
TR = 14.5905;
A4 = 0.2772;
distE = 0;
distP = 0;
initstate = [ 148.005 27.6649];
rms_E = 1.03253;
rms_P = 1.01403;
% Control
% Feed
% StateDisturbance
drift_Ev = 0.00198599;
drift_Pv = 0.00322468;
Figure 4.63. Parameters values to export
and name mcvprojects\DrumBoiler\casedir\status\parameters. The file contains information on the components and parameter values that together define the model. The file can be loaded at a later stage by clicking Import. It can also be executed directly from the MATLAB command window to set parameters used in other programs.
4.18 Suspending and Exiting

Exiting from the calibration session is done in the same way as suspending it temporarily. Click Suspend. MoCaVa opens the Session save window (Figure 4.64).
Figure 4.64. Save history or not?
If you want to resume the session after having closed MATLAB (or logged out), the session status history must be saved on disk. That may take some time, since the status history is quite a complex structure. However, the status history is always in the MATLAB work space, which means that you may save time by not writing it to the disk (as long as you do not clear the work space, for instance by closing MATLAB). The window allows the user to decide whether or not the session history should be saved on disk. If you select No, then clear the work space (or log out), and then Resume the session, the status history from the latest time you stored it will be loaded. This is a way to make a 'detour' to experiment with the model structures, without losing the option of falling
Figure 4.65. The score table
back on what you have, should the experiment fail. You may use the Export/Import option for the same purpose, but will then have to restore the model class and structure manually by editing the contents of the Model class specification and the Tentative structure windows. Select Yes and click OK. MoCaVa displays the Calibration score (Figure 4.65), and then transfers control to the MoCaVa window. Click Exit to suspend the calibration session. MoCaVa stores all information relevant to the project in mcvprojects\DrumBoiler\casedir. You may continue the project from where you left off.

4.18.1 The Score Table

The display that follows Suspend contains a summary of the tentative models that were tried during the calibration session. It lists the model structures that were accepted in various stages of the session. In addition to the combination of active components that define the model classes, the table displays the chi2-statistic for each test (with its number of degrees of freedom), the normalized loss of each tentative model, and the free parameter spaces of the structures. The square brackets group the indices after the components they belong to, and parentheses enclose entries of vector parameters. The
loss may be interpreted, roughly, as the one-step prediction error level relative to that of the variation in the predicted variables. Thus, the level is one for the 'null' model and zero for a perfect model.
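That reading of the normalized loss can be sketched as a ratio of the one-step prediction error variance to the variance of the predicted variable (a rough interpretation, not the exact statistic MoCaVa tabulates):

```python
def normalized_loss(y, y_pred):
    """Rough reading of the score-table loss: mean squared one-step
    prediction error divided by the variance of the predicted variable.
    Equals 1 for the 'null' predictor (the mean) and 0 for a perfect one."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    mse = sum((v - p) ** 2 for v, p in zip(y, y_pred)) / n
    return mse / var

y = [1.0, 2.0, 3.0, 4.0]
assert normalized_loss(y, y) == 0.0                     # perfect predictor
assert normalized_loss(y, [2.5, 2.5, 2.5, 2.5]) == 1.0  # 'null' predictor (the mean)
```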
4.19 Resuming a Suspended Session

The purpose of this exercise is mainly to show how a session can be continued, in case the user wants to test the model further, or has new ideas on how to improve it. Start Calibrate again from the MoCaVa window, Figure 4.1 (or, if that has been closed, by first typing mocava3 in the MATLAB command window). Select the same project (Figure 4.3). MoCaVa opens the Restart window (Figure 4.66).
Figure 4.66. Restart
You may resume the session from the point where you left it, step back a specified number of steps, or reset the session history and start from the beginning. Look at the Pilot window to see where you left the session (Figure 4.67). The user's guide follows the logic in the Pilot. It is particularly useful in complicated cases, when the user has found it expedient to suspend, step back, and resume several times to amend the consequences of poor decisions. The Suspend/Resume option offers the only way to step back in the procedure. The reason is that the session status needs to be updated and the history saved in order to make it possible to step back from an arbitrary position. Click Resume. MoCaVa loads the session status and history and reopens the same window as when the session was suspended. Click NewClass to reach a point where you can change the model class. MoCaVa changes the position in the Pilot accordingly, and opens the Model class specification window.
4.20 Checking Integration Accuracy

When the final model has been tested against all conceivable alternatives, and also with respect to the independence of input and disturbances, it may still be useful to check it once more, namely with respect to the numerical accuracy of the predictor, which is based on the model and used for all fitting and testing. A simple test is to halve the predictor interval, the "time quantum", and compare the results. Click Advanced and halve the value of Time quantum in the window that opens. Click OK. Then specify no alternative index set and click NewDim. The label for the latter command, which initiates the comparison of the two model structures, is somewhat illogical in this case, since the dimensions are the same, although with differently approximated model classes.
Figure 4.67. The Pilot window
MoCaVa displays the Test appraisal window (Figure 4.68). The test result shows that the risk is about 'fifty-fifty' for preferring either of the models. Hence, there is little point in reducing the numerical error. In the present simple case the doubling of the computing time caused by halving the time quantum is of little concern, but it may not be so in other cases. This concludes the Calibrate session for the DrumBoiler example. Click Suspend, OK, and Exit.
Figure 4.68. Result of halving the time quantum
5 Some Modelling Support
This chapter introduces some more advanced tools, mainly to facilitate the modelling of large objects. In essence, MoCaVa3 has support for feedback loops, reuse of models, and import of Dymola models. Again, simple examples will be used as illustrations.
5.1 Modelling Feedback

The first example illustrates the modelling of long feedback loops, i.e., models transferring signals from one component to another one placed 'upstream'. Basically, signal information can be transferred 'backwards' only via state variables, since this prevents the creation of algebraic loops. Having such loops would mean that a signal would pass the loop in zero time, which is unphysical. It would make it unclear what is cause and what is effect. Technically, algebraic loops would also have to be resolved in real time, suggesting all sorts of numerical difficulties and much computing. However, if one regards the algebraic loops as 'fast' dynamic loops, with short but positive response times, then a mixed system of ordinary differential equations and algebraic equations becomes a system of 'stiff' ordinary differential equations. Since MoCaVa already needs a stiff ODE solver, to be able to handle models of physical processes with widely different time constants, it is no large step to feed it with algebraic equations transformed to ODEs. One might expect that this would slow down the integration of the ODEs considerably, depending on the value of the fictitious time constant that is introduced, but the processing speed depends very little on the time constants. It depends on the time quantum, which is normally much longer (see Sections A.3 and A.5 for details).

Start Calibrate and open the Cascade control project. This is a simple model designed to illustrate the use of feedback between components, including the effect of having an algebraic loop. It simulates a tank that is being tapped irregularly and replenished through a valve, whose opening is controlled by proportional feedback from the tank level. What makes the case less trivial is that the tank volume is large and the range of the proportional controller is limited (making the control slow), while the valve pressure varies fast and irregularly.
This means that the proportional controller will be inefficient. The control is therefore improved by an inner, proportional, and also range-limited, feedback loop from the valve output. Since the valve model and the proportional controller both lack dynamics, this creates the algebraic loop. Test data have been generated using the 'true' model, processed by Predat, and stored in Examples\CascadeControl\testdata.mcv.
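The algebraic-loop trick described above can be reproduced outside MoCaVa: replace the instantaneous controller relation by a fast first-order lag with a small fictitious time constant tau, and integrate with a stiff (implicit) method. A toy sketch with an invented first-order plant and gain, not the CascadeControl equations:

```python
def simulate(t_end, dt=0.01, tau=1e-3, k=2.0, ref=1.0):
    """Break the algebraic loop u = k*(ref - y), y driven by u, by giving u
    a fast lag: tau*du/dt = k*(ref - y) - u, with the slow plant
    dy/dt = u - y.  Backward (implicit) Euler keeps the fast mode stable
    even when dt >> tau, mimicking a stiff ODE solver."""
    y, u, t = 0.0, 0.0, 0.0
    a = dt / tau
    while t < t_end:
        # One implicit-Euler step is the 2x2 linear system
        #   (1+dt)*y+ - dt*u+      = y
        #   a*k*y+   + (1+a)*u+    = u + a*k*ref
        # solved here by Cramer's rule:
        det = (1 + dt) * (1 + a) + dt * a * k
        rhs = u + a * k * ref
        y_next = ((1 + a) * y + dt * rhs) / det
        u_next = ((1 + dt) * rhs - a * k * y) / det
        y, u, t = y_next, u_next, t + dt
    return y

# The loop settles where u = k*(ref - y) and y = u, i.e. y = k*ref/(1+k).
```

The steady state is independent of tau, which is why a small fictitious time constant changes the solution negligibly while making the system integrable.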
Figure 5.1. The model class
Figure 5.2. Specifying model class
5.1.1 The Model Class

Open the file and click OK in the next window. MoCaVa shows a graph of the system in Figure 5.1 and opens the Model class specification window (Figure 5.2). The graph illustrates a system with cascade control of the level in the tank to a constant reference value z0. The outer (master) controller feeds back the tank level z2, in order to reduce its difference from z0, and the inner (slave) controller feeds back the valve flow z1, in order to reduce its difference from the output signal of the outer controller. The outer controller serves to compensate for random variation in the tapping rate tap, and the inner controller serves to compensate for random variations in the valve pressure. Tapping and pressure variations are the main disturbances and have been recorded in testdata, which also contains (simulated) measurements of level and feed. Both controllers are of proportional type, but limited ranges make them nonlinear and call for two controllers to steady the level effectively. The Model class specification window also contains a third controller of the same type. When it is activated in place of the first two, a comparison of performance reveals how much cascaded control will reduce the variation in level. The created system requires three options in MoCaVa that have not been described in Chapter 4:

- The two controllers have input arguments that are also output arguments of components placed downstream (= higher in the list of components).
- Controller2 feeds back a direct output from Valve without first passing an integrator, which creates an algebraic loop.
- Both controllers call the same user-defined function PController.
Click Edit and select Change for all active components to see how the components were created. This will open the usual sequence of windows for editing the component's properties.

The Tank

The first two windows show the component statements:

% Tank level
Dx = (feed - tap)/V
z2 = x
% Start level
x = x0
The next window shows the argument classification:

- z2 is of class Response, since it is of interest outside the component.
- feed is of class Feed, since its placement in the graph should illustrate its physical meaning.
- tap is of class Control, also for physical reasons.
- V is of class Parameter, since the tank volume is unknown. Otherwise it would have been Constant.
- x0 is of class Parameter, since the tank level is unknown at start up.

Argument               Class
Component output
  z2                   Response
Component input
  feed                 Feed
  tap                  Control
  V                    Parameter
Initialization input
  x0                   Parameter
The next window shows the I/O interfaces:

- A Sensor is attached to z2, since the variable is recorded in the data file.
- The feed input is also recorded in the data, but its source is a User model, i.e., another component. Logically, one could choose between including the sensor in either of the two components. However, the MoCaVa3 implementation requires that an input attached to another component must have its sensor (if any) attached to the latter. Attempts to attach the sensor to the input instead will be ignored, and a message issued.
- Since z2 = x, the latter needs no extra sensor.
- tap is in the data file and was generated using the Hold function. MoCaVa does not ask whether it has a sensor, since it was classified as Control. This classification implies that its values are known, either generated by another component or by interpolating in a data sequence generated by logging (without error) the output of some device generating the true input. If these requirements cannot be satisfied the input should be classified as Feed, which would allow various filtering of noisy data as well as attaching a sensor to the input.

Argument               Source
Connections to sensors
  z2                   Sensor
  feed                 NoSensor
  x                    NoSensor
Feed input: Source model
  feed                 User model
Control input: Source model
  tap                  Hold
The next window shows the argument attributes:

- The nominal values of the parameters are the 'true' values. The case is designed to illustrate the modelling and not the identification process.
- The scale of rms_z2 may seem large compared to its nominal value, but it is the same as that of z2.

Argument      Short description   Dim   Scale   Nominal   Min   Max
Parameters
  V           Volume              1     10      100       0
  x0          Initial level       1     1       1         0
  rms_z2      StDError_z2         1     1       0.01      0
Control input
  tap         Tapping rate        1     1
Feed input
  feed        Feed rate           1     1
Process output
  z2          Level               1     1
States
  x           Level               1     1
The next window shows the data assignment:

- The tapping rate has the same name tap in the model and in the data file, but the level has not. It would be most logical to have different names, since one is a model output and the other the data to compare the output with. MoCaVa3 allows the slight abuse of correct naming for convenience.

Argument   Data    Conversion
  z2       level   1
  tap      tap     1
The Valve

The first window shows the component statements:

% Valve
z1 = u1*p
feed = z1
The next window shows the argument classification:

- z1 is of class Response, since it is of interest outside the component.
- u1 is of class Control, since it represents the valve opening issued by the control system.
- p is of class Feed; it represents the pressure input to the Valve model. It could also have been classified as Control (in case one would consider the varying pressure as another means to control the flow through the valve). This would affect only the layout of the graph, and only if another component were added to describe the pressure variation.

Argument               Class
Component output
  z1                   Response
Component input
  u1                   Control
  p                    Feed
The next window shows the I/O interfaces:

- A Sensor is attached to z1, since the variable is recorded in the data file.
- No sensor is attached to p. The value is interpolated from the data using a Hold function, and these values are exact and need no checking by a measuring device.
- The source of u1 is another component describing a controller (either Controller1 or Controller3).

Argument               Source
Connections to sensors
  z1                   Sensor
  p                    NoSensor
Feed input: Source model
  p                    Hold
Control input: Source model
  u1                   User model
The next window shows the argument attributes:

Argument      Short description   Dim   Scale   Nominal   Min   Max
Parameters
  rms_z1      StDError_z1         1     1       0.01      0
Control input
  u1          Control signal      1     1       1
Feed input
  p           Pressure            1     1
Process output
  z1          Feed rate           1     1
The next window shows the data assignment:

Argument    Data       Conversion
Output data
  z1        feed       1
Input data
  p         pressure   1
The Slave Controller

The first window shows the component statements:

% Slave controller
[u1] = PController(z1,u2,k1,u1max,u1max)
where PController is a User function, shared by several components. The next window shows the argument classification:

- u2 is of class Control, since it represents the set point issued by the outer (master) controller.
- k1 is of class Parameter; it represents the controller gain and can be varied.
- u1max is of class Constant; it represents the controller range and is fixed.

Argument               Class
Component input
  u2                   Control
  k1                   Parameter
  u1max                Constant
The next window shows the argument attributes:

Argument   Short description                    Dim   Scale   Nominal   Min   Max
Parameters
  k1       Proportional gain slave controller   1     1       10        0
Control input
  u2       Reference signal slave controller    1     1       1
Feedback input
  z1       Feed rate                            1     1
Constants
  u1max    Saturation limit slave controller    1             2
The Master Controller

The first window shows the component statements:

% Master controller
u2ref = 0
[u2] = PController(z2,z0,k2,u2ref,u2max)
The next window shows the argument classification:

- u2ref is a dummy variable, zeroing the output reference of PController. The function must not have numbers as input.
- z0 is of class Constant; it represents the fixed level reference.
- k2 is of class Parameter; it represents the controller gain and can be varied.
- u2max is of class Constant; it represents the controller range and is fixed.

Argument               Class
Component output
  u2ref                Internal
Component input
  z0                   Constant
  k2                   Parameter
  u2max                Constant
The next window shows the argument attributes:

- The master controller has a much higher gain than the slave, in order to be in tune with the large tank volume.

    Argument         Short description                     Dim  Scale  Nominal  Min  Max
    Parameters
      k2             Proportional gain master controller   1    1      100      0
    Feedback input
      z2             Level                                 1    1
    Constants
      z0             Level reference                       1           1
      u2max          Saturation limit master controller    1           2
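The two controllers form a cascade: the master turns the level error into a feed-rate set point u2, and the slave turns the feed-rate error into the valve signal u1. The sketch below (Python rather than M-statements) reuses the PController law with the nominal gains from the tables above; the first-order valve and tank dynamics are invented for illustration and are not the book's model:

```python
import math

def p_controller(z, zref, k, uref, umax):
    """PController law: ideal P controller followed by a soft limiter."""
    u = k * (zref - z) / umax
    return uref + umax * u / math.sqrt(1.0 + u * u)

# Cascade loop with assumed first-order dynamics (illustrative only).
z1, z2 = 0.0, 0.0              # feed rate, tank level
z0, k1, k2 = 1.0, 10.0, 100.0  # level reference and nominal gains from the tables
u1max, u2max = 2.0, 2.0        # saturation limits from the tables
dt = 0.01
for _ in range(20000):
    u2 = p_controller(z2, z0, k2, 0.0, u2max)    # master: level -> feed set point
    u1 = p_controller(z1, u2, k1, u1max, u1max)  # slave: feed rate -> valve signal
    z1 += dt * (u1 - z1)                         # assumed valve/feed dynamics
    z2 += dt * 0.1 * (z1 - 0.5)                  # assumed tank level dynamics
# The level z2 settles close to the reference z0 (small P-control offset).
```

With pure proportional control the level settles near, but not exactly at, the reference; the high master gain makes the residual offset small.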
5.1.2 User's Functions and Library

The components of the CascadeControl system have now been defined, except for the PController function. To see how a user's function is created click Edit in the Model class specification window (Figure 5.2). Then click User lib in the Component library window to open the User's library window (Figure 5.3). The User's library window works like the Component library window, except that its contents reside in casedir\Flib instead of casedir\Clib. In this case the library contains a single function. To see how it was created select Change and click OK. The first window shows the function statements:
Figure 5.3. Editing PController

    function [u] = PController(z,zref,k,uref,umax)
    u = k*(zref − z)/umax
    u = uref + umax*u/sqrt(1 + u*u)
The function is defined in the same way as a component, except that it must start with a function statement, in the same way as a MATLAB® function. The brackets [] are compulsory. The variables are dummy arguments. The statements that follow have the same syntax (a subset of M-statements) as in component definitions, except that user functions cannot call other user functions. If some outputs are derivatives (i.e., if an input also appears in the output prefixed with D), then this defines a dynamic user's function. That would be the case for a PIController.

In PController the output of an ideal linear controller passes through a soft limiter with half-range umax and is added to the output reference uref. Notice that if uref equals umax the controller output is positive. That is the case with Controller1 setting the valve opening. No further specification is required. Unlike components, a function's (dummy) arguments do not have attributes and cannot have an I/O interface. The argument attributes of the calling components provide what is necessary to execute a call.
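The soft-limiter arithmetic is easy to check outside MoCaVa. A Python transcription of the listing above (a sketch; PController itself is an M-statement function):

```python
import math

def p_controller(z, zref, k, uref, umax):
    """Ideal P controller followed by a soft limiter, transcribed from
    the PController listing above."""
    u = k * (zref - z) / umax              # normalized linear control signal
    return uref + umax * u / math.sqrt(1.0 + u * u)

# The limited term umax*u/sqrt(1+u*u) stays within (-umax, +umax), so with
# uref == umax the output stays within (0, 2*umax), i.e. strictly positive.
mid = p_controller(1.0, 1.0, 10.0, 2.0, 2.0)   # zero error -> output equals uref
```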
5.2 Rescaling

Scaling of a variable or parameter is primarily done (by editing the Argument attributes window) as part of the definition of the first component that contains the variable or parameter. Sometimes, however, suitable scales cannot be determined until the model has been simulated and it is possible to see the consequences of the initial scaling. This motivates a rescaling option:

Click Simulate in the Model class specification window. MoCaVa opens the Origin window (Figure 5.4). Click OK. MoCaVa opens the Plot specification window (Figure 5.5). Click OK. MoCaVa displays Figure 5.6.

In this case one would be most interested in the level the control system is meant to stabilize, and may also want to reduce the amplitudes of the two switching input variables. Generally, the y-ranges of the plots are given by those of their variables, and the scales are determined by their scale attributes, specified when the variables were introduced in the modelling. However, MoCaVa3 has an option for changing the scales directly, based on the plots:

Click Rescale in the Model class appraisal window (Figure 5.7). Then click on the enumeration of the graph whose scale you want to change, in this case on the very compressed numbers in the upper right graph.
Figure 5.4. Parameter values for CascadeControl
Figure 5.5. Selecting plotted variables
MoCaVa opens the Rescale window (Figure 5.8). The scale attribute of z2 appears in the editable field. Click on x 0.1 to reduce the number by that factor, i.e., to amplify the graph of z2 ten times. If you prefer to edit the scale attribute manually, you must also close the window by clicking x 1. MoCaVa displays the new graph in Figure 5.9.

Notice that the scale of the residuals of z2 has also changed. It is always the same as that of z2, in order to ensure a fair comparison between the variable and its residuals. (Trying to click in the graduation of a residuals graph causes an error message.) Notice also that the scale attribute has been changed, and not only the plotting resolution. The latter, but not the former, can also be changed by double-clicking directly on a graph. This will open, for instance, the window in Figure 5.10. The magnifying glass in the tool bar can now be used to study a particularly interesting part. For instance, magnifying the first step in the feed response twice will yield Figure 5.11.
Figure 5.6. Results of simulating the CascadeControl model
Figure 5.7. Allowing layout and rescaling
Figure 5.8. Window to rescale z2
One might wonder what could be causing the spikes. A look at the measurements alone might suggest that they are 'outliers' (had the data not been simulated). Click Layout in the Model class appraisal window (Figure 5.7). MoCaVa opens the Plot specification window. Select the time range (40,140) to increase the time resolution (Figure 5.12) and click OK. MoCaVa shows the graphs in Figure 5.13.
Figure 5.9. Rescaled results of simulating the CascadeControl model
Figure 5.10. Graph of feed rate
Figure 5.11. Magnified part of the Feed rate graph
Figure 5.12. Specifying time range and plotted variables
Figure 5.13. Selected part of the responses of the CascadeControl model
It is now evident that the spikes are due to the slave controller’s response to the changing pressure.
5.3 Importing External Models

Running MoCaVa with this option requires a Dymola® license.

The writing of components in MoCaVa is based on a limited set of M-statements (see Section 4.4.2, Help: Component Function Window), and the user has to enter those statements manually. However, it is conceivable that the user has already created a model for other purposes than calibration and validation, most likely for simulation. MoCaVa3 has an option for importing such models produced by Dymola®. The advantage of using modelling and simulation software like Dymola® is that the modelling of complex systems is more developed and supported, with more graphics and an extensive library. A disadvantage is that it is easier to make models of a complexity that can well be simulated, but not calibrated, both because the latter is a basically more difficult task, and because available data may not allow a complex model to be calibrated and validated. Generally, MoCaVa cannot cope with hybrid models or models that contain essential discontinuities, like dominating effects of dry friction or dead zones. MoCaVa3 is therefore equipped with a 'filter' that tries to sort out Dymola® models that MoCaVa cannot process. The filter may not always succeed, so there is some hazard in importing models.

The import function is based on the Dymola® option for producing an S-function dsmodel.c, which contains all information necessary to simulate the model, and can be linked with the other C-functions that constitute the core of IdKit. The importing of Dymola® models will be illustrated by three examples:

- A simple Dymola® demo example written in Modelica®: "Pendulum".
Figure 5.14. Modelica® code

Figure 5.15. Translating Modelica® to C-code

- The Drum Boiler again, but created with Dymola® using Modelica®.
- A moderately large Dymola® demo example using the Dymola® library: "Furuta pendulum".
5.3.1 Using Dymola® as a Modelling Tool for MoCaVa

Start Dymola® and open the window for defining models in the Modelica® language. The code defining Pendulum is shown in Figure 5.14. The essential differences from modelling in MoCaVa are that Dymola® recognizes fewer argument classes ("Parameters", "States", "Derivatives", and "Output") and that fixed values are assigned in the code. Click on Simulation and select Translate (Figure 5.15). Dymola® generates a number of files in the directory Dymola\work\, in particular dsmodel.c. Close Dymola®.
Figure 5.16. Importing the Pendulum model
Figure 5.17. Components library for imported Pendulum
Start MoCaVa and select Calibrate. Click New Project and enter its name, Pendulum. Open the new project. Find and open the prepared test data in Examples\Pendulum\ (it was generated using the same model). MoCaVa opens the Time range window. Click OK to indicate that the whole data file be used. MoCaVa opens the component naming window. So far, the procedure does not deviate from that of processing a MoCaVa model. Enter Pendulum and check the box for Dymola dsmodel.c (Figure 5.16). MoCaVa opens the Component library window (Figure 5.17). This is also the same, except that the only component has a D_ prefix to mark that it contains a Dymola® model. Click OK to edit the so far empty component. MoCaVa opens a browser window to find the directory of the C-model produced by Dymola®. Find and open dsmodel.c.

The dsmodel.c file contains everything needed to run the model with given parameters and start values, and is the only thing imported. After loading it MoCaVa will first process the file to obtain information about its arguments (including their names), in order to make a MoCaVa 'root' component, and then link dsmodel.c with the MoCaVa C-programs for fitting and simulation. Only the 'root' component can be created by Import, and its inputs are all Parameters. The DymolaDrumBoiler example in Section 5.3.3 will illustrate how input can be made variable, for instance entered from a data file.

The information entered into the browser is also used to find and store the path to Dymola®. The path is needed to locate the include files in dsmodel.c residing in Dymola\Sources\ (and the property of the Dynasim company). MoCaVa will copy dsmodel.c (but not the include files) to its case directory for processing. MoCaVa opens the Dymola connection window (Figure 5.18).
The purpose of this window is to give the user an opportunity to select the input (Parameters) and output (Variables) from among the arguments defined in Dymola®, and thus to make a full or partial connection to dsmodel.c. The option is useful in
Figure 5.18. Window for specifying terminals
Figure 5.19. All input and output of Pendulum connected to MoCaVa
Figure 5.20. Classifying the arguments
cases where the variables are many, while only some of them are of interest to the user of MoCaVa. Some Dymola® demo models define many hundreds of variables, but can still be run by MoCaVa, provided that the smaller number of variables to be fitted and parameters to be estimated is of a size that the MoCaVa windows can handle and the available data can support. Notice that states cannot be selectively connected; it is not possible to process a model without access to all its state variables. (In Dymola® the corresponding selection is done when the user decides what variables to plot after a simulation.) An example of a selective connection of a large Dymola® model will be given in the Furuta pendulum case below.

For the simple Pendulum model make the selections in the pull-down menus and click Connect for each column. Connected variables and parameters are listed (Figure 5.19). Click OK. MoCaVa opens the Argument classification window (Figure 5.20).
Figure 5.21. Both states are measured
Figure 5.22. Specifying argument attributes for Pendulum
The moment of inertia J is computed by Pendulum and output to MoCaVa each time quantum. It is constant and of little interest outside the model, and is therefore classified as Internal. It would also have been feasible not to connect it in the Dymola connection window.

The pendulum has two parameters, length and mass, while gravity is a known constant. It is well known that mass does not affect the movements of an ideal pendulum, so that only L is identifiable from response data. But that will turn up in an attempt to calibrate the model. Notice that the start values are not among the parameters. Unlike the parameters, which have names, the start values have been 'hard-coded'; that of phi is set in the Modelica® statement "Real phi(start=0.3);", and since no start value has been given for w, it is zero by default. Hence start values cannot be fitted in this model, even though they would have been identifiable from the data.

Click OK. MoCaVa opens the I/O interface window (Figure 5.21). The simulated data file contains records of both angle and velocity. Click OK. MoCaVa opens the Argument attributes window. The default values are inherited from the Dymola® model, except the Short description fields. Enter the Short descriptions and click OK (Figure 5.22). This ignores the prior knowledge that mass and length have positive values.
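The unidentifiability of m can be verified numerically. Assuming the standard ideal-pendulum equation J·dw/dt = −m·g·L·sin(phi) with J = m·L² (an assumption; the Modelica code appears only as a figure), the mass cancels, and simulated responses for different masses coincide:

```python
import math

def simulate_pendulum(L, m, phi0=0.3, w0=0.0, g=9.81, dt=1e-3, steps=2000):
    """Forward-Euler simulation of an ideal (frictionless) pendulum.

    Equation of motion: J*dw/dt = -m*g*L*sin(phi) with J = m*L**2, so
    dw/dt = -(g/L)*sin(phi) and the mass m cancels.  The start value
    phi0 = 0.3 mirrors the statement "Real phi(start=0.3);".
    """
    phi, w = phi0, w0
    traj = []
    for _ in range(steps):
        dw = -(m * g * L * math.sin(phi)) / (m * L ** 2)   # m cancels here
        phi, w = phi + dt * w, w + dt * dw
        traj.append(phi)
    return traj

# Two very different masses, same length: identical angle responses,
# so response data carry no information about m.
t_light = simulate_pendulum(L=0.5, m=1.0)
t_heavy = simulate_pendulum(L=0.5, m=10.0)
```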
Figure 5.23. Assigning data to Pendulum
Figure 5.24. Single−component graph of Pendulum
Figure 5.25. Parameter values for Pendulum
MoCaVa opens the Data assignment window (Figure 5.23). The recordings of the states have been given the same names as in the model. Click OK. This concludes the import of Pendulum and the assignment of data. The single-component model class is shown in Figure 5.24.

The graph of the model class gives information only on the model's input (parameters and constants) and output. Notice that phi and w are the sensor output, and the arrows point at data records with the same names. In general, there are three variables associated with each output, which in the present case is hidden by the fact that they all have the same names: the state variables, the output of the sensors attached to the states, and the actual recordings in the data. The graph always shows the sensor output and never the state variables. Trying to click Show in the Model class specification window to see what is in the box yields no other information than the component name. The statements defining the model are in Dymola® and not in MoCaVa.

Click Simulate. MoCaVa opens the Origin window (Figure 5.25). The values of m and L are those defined in the Modelica® code in Figure 5.14, and therefore correct. The rms-values originate in the MoCaVa Sensor library model and are much larger than the values 0.01 used when generating the test data. However, this does not affect simulation. Click OK.
Figure 5.26. Selecting Pendulum state variables
Figure 5.27. Simulating Pendulum
MoCaVa opens the Plot specification window. Check phi and w. Click OK (Figure 5.26). MoCaVa shows the plots in Figure 5.27. This shows only that the importing works, since the data has been generated by the same model with small noise sequences added. These coincide with the residuals. Click Accept to initiate calibration.
Figure 5.28. Wrong start values
Figure 5.29. Result of search for Pendulum parameters
5.3.2 Detecting Over-parametrization

MoCaVa opens the Tentative structure window. The only difficulty with the Pendulum case is that it is over-parametrized. However, and for the sake of argument, let a reckless user ignore that m is unidentifiable and try to fit all parameters, and, to make things worse, guess very wrong start values for the search. Check all parameters free and click OK. Edit the default values and click OK (Figure 5.28). MoCaVa shows the search score in Figure 5.29. After some initial difficulties (caused by the strong nonlinearity and the grossly erroneous start values), the search arrived at a loss minimum. The rms values are correct,
Figure 5.30. The Pilot
Figure 5.31. Result of fitting only the rms−values
and the L-value is correct, except that it has the wrong sign. However, the sign is unidentifiable, since the response depends only on the square of L. The totally unidentifiable m-value stays where it started, and the value within parentheses says that it is uncertain. In order to get the correct sign of L the user should have specified, in the Argument attributes window, the prior knowledge that L has a zero minimum.

Although even a 'reckless' user would arrive at the correct result in this case (if limiting L), a 'cautious' user would do this: Suspend, start Calibrate, and Resume from where the free parameters were picked, backing up, in this case, 3 session steps (Figure 5.30). Fit the rms-values first. With the same erroneous start values the outcome of the search is shown in Figure 5.31. Accept. Free m and L to see whether either of them will yield a better model. MoCaVa shows the result in the Test appraisal window (Figure 5.32). The answer is, correctly, that L but not m may be used to improve the model. Click Select_#2 and fit. MoCaVa shows the result in Figure 5.33.
Figure 5.32. Result of testing whether fitting one more parameter will help
Figure 5.33. Result of fitting length of Pendulum
Again the estimate is correct except for the sign, again because the search space was not limited to positive values. Since the search started with a positive value, this indicates that the search initially took too long a step, into the domain of attraction of the wrong minimum. It should therefore help to reduce the step length. Since m has not been freed, it retains its nominal value.
Figure 5.34. Reducing step length
Figure 5.35. Result of fitting length of Pendulum
Click Reject. MoCaVa opens the Search specification window. Enter 0.5 for Step reduction and click *2 (Figure 5.34). Click OK. MoCaVa shows the result in Figure 5.35. The sign is now correct, which confirms the suspicion of an initial 'overshoot'. Click Suspend to end the exercise.
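The mechanism is generic: when a full step overshoots into the attraction domain of the mirror-image minimum, scaling every proposed step keeps the search on the correct side. A schematic illustration (plain gradient descent, not MoCaVa's actual search procedure) on a loss with two sign-symmetric minima, analogous to the ±L ambiguity:

```python
def damped_descent(grad, x0, lr, step_reduction=1.0, iters=100):
    """Gradient descent in which every proposed step is scaled by
    step_reduction; a value < 1 shortens the steps (cf. Figure 5.34)."""
    x = x0
    for _ in range(iters):
        x = x - step_reduction * lr * grad(x)
    return x

# Loss (x**2 - 1)**2 has minima at x = +1 and x = -1, so only |x| is
# identifiable, like the pendulum length L.
grad = lambda x: 4.0 * x * (x ** 2 - 1.0)

x_full = damped_descent(grad, x0=1.5, lr=0.22)                      # overshoots to -1
x_half = damped_descent(grad, x0=1.5, lr=0.22, step_reduction=0.5)  # stays at +1
```

With a full step the first iterate jumps past zero and the search settles at the wrong-sign minimum; halving the step keeps every iterate positive.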
Figure 5.36. Modelica® code for DrumBoiler
5.3.3 Assigning Variable Input to Imported Models

The Pendulum model standing alone has only constant input. The next example will illustrate how input can be made variable, for instance entered from a data file. It is the same model as treated in Chapter 4, but now written in Modelica®, generated by Dymola®, and imported into MoCaVa.

Start Dymola® and open the window for defining models in the Modelica® language. The code corresponding to the DrumBoiler component is shown in Figure 5.36. Click on Simulation and select Translate. This will again generate a file Dymola\work\dsmodel.c.

Only the 'root' component can be imported, and its inputs are all Parameters. In order to make some input vary one must connect them with other components generating variable input, either by connecting to data files, or by using library functions for stochastic disturbances, or by writing some other source model. This is the same construction as when the root component is created in MoCaVa and then connected to other input-source components. In fact, the same source components will be used as in the DrumBoiler case. Hence, importing a Dymola® model generally makes a model that is a hybrid between MoCaVa and Dymola® models, simply because a Dymola® model is independent of its input data, and something more is needed to get that data into the model. Only when a Dymola® model has only constant input can it be run alone by MoCaVa, like the Pendulum example.

To process the Dymola® DrumBoiler using MoCaVa: Start Calibrate and make a new project DymolaDrumBoiler. MoCaVa adds the new project to the list. Click Open. Then select and open the same data file and the same sample from that file. Enter the root component as before, but check also the box after Dymola dsmodel.c. The Component library window is also the same, except that the only component so far has a D_ prefix to mark that it contains a Dymola® model.
Figure 5.37. Pull down to select parameters to be connected
Click OK. Find Dymola\work\dsmodel.c and open it. After processing the file MoCaVa opens the Dymola connection window. Connect all output (Variables) and input (Parameters). Click OK (Figure 5.37). MoCaVa opens the Argument classification window. Classify the inputs as in the DrumBoiler project (Figure 4.13). In dsmodel.c all inputs are parameters by default. Notice that the Disturbance classification is no longer available. A stochastic input must be modelled by connecting the output of a source component to any of the four classes of input. As before, both response variables have sensors, but not the states.

MoCaVa opens the Argument attributes window (Figure 5.38). The window contains the attributes needed for the calibration. Some of their values have been extracted from dsmodel.c, mainly names, dimensions, and nominal values. Scales and ranges are not available in dsmodel.c and have been given default values. Short descriptions are sometimes available, but not if the model has been written in Modelica®. Edit the default values to agree with those in the DrumBoiler project (Figure 4.15). Edit the Implicit attributes (Figure 4.16) and Data assignment windows (Figure 4.17).

This completes the creation of the 'root' component. It consists of the Dymola® model with two sensors attached and data records assigned. It can execute standing alone with constant input. The result would be as in Figure 4.22. MoCaVa opens the Model class specification window.

The way to give the model variable input is the same as in the DrumBoiler project. In fact, it is possible to copy its input source components into the DymolaDrumBoiler project. The simplest way is to use the Windows Explorer: Press and hold down the right mouse button on the highlighted Control directory and drag to copy it into the DymolaDrumBoiler\casedir\Clib directory
Figure 5.38. Default attributes for the DymolaDrumBoiler
(Figure 5.39). Repeat this for the Feed and Disturbance components. (The Modelica® version has no 'stubs' for input and state disturbances.)

There are two more things to do before the components are properly installed and can be activated:

- Register the new components in the component library index.
- Generate their corresponding C-functions.

Both tasks are done in the same way as when a new component is created. The point of copying the directories is that all necessary specifications (except the component names) will already be in the input windows, and what remains to do is to click OK sufficiently many times. Click Edit, select Insert and enter Control. (Do not use the Copied option; it is only for copying a component within the same library. That is described in Section 4.15.) Continue clicking OK until the Model class specification window appears again. Repeat this installation procedure for the Feed and Disturbance components, until the Model class specification window shows that all have been installed. The model is now equivalent to that in the DrumBoiler project, and can be calibrated in the same way.
Figure 5.39. Using Windows Explorer to copy components

However, since it has already been established that all components and parameters are significant, let it suffice to fit the full model structure to data. Click OK. Free all parameters (Figure 5.40) and start from the nominal values. The search finds the estimates in Figure 5.41. A comparison with the values from the DrumBoiler project shows that the estimates differ somewhat (since they have been found by different search procedures), but not significantly (Figure 5.42).

5.3.4 Selective Connection of Arguments to Dymola® Models

Start Dymola® and open the Furuta pendulum demo (Figure 5.43). This example is all continuous and much larger than the DrumBoiler. It is also constructed using the Dymola® graphic facilities and library, which means some preliminary complications. Mainly, parameters and variables are structures, and their number (873 output and 122 input) is larger than MoCaVa can handle. Press the Simulation button, and use the Simulation and Translation tools to generate dsmodel.c. This demo case is created in the directory
Figure 5.40. Search for all parameters
Figure 5.41. Parameter estimates for DymolaDrumBoiler
Figure 5.42. Parameter estimates for DrumBoiler
Dymola\Modelica\Library\ModelicaAdditions\Multibody\Examples\Loops\, unlike the user-created models, which reside in Dymola\work\.

The Furuta pendulum describes an inverted triple pendulum. It consists of eleven blocks, as depicted in the graph and also listed under Components in the Modeling window. Assume one wants to simulate and plot the angles of each pendulum. Dymola® achieves this by first simulating all defined variables, and then allowing the user to select the variables to be plotted. The latter is done by opening the appropriate directories in the Simulation window and checking the boxes for the variables to be plotted, in this case R1.q, R2.q, R3.q (Figure 5.44).

Exit from Dymola® and start MoCaVa and Calibrate. Select New Project, and enter its name Furuta. Open the Furuta project. Open the prepared test data in MoCaVa\Examples\Furuta\testdata.mcv. Use the whole file.
Figure 5.43. Opening the Dymola® Furuta pendulum demo
Name the 'root' component Furuta and check the box after Dymola dsmodel.c. The imported Furuta model is now registered in the component library but not connected. The latter is done in two steps: Click OK to start the connection, and find and open dsmodel.c to get its path. The path is the key to extracting structure information from Furuta and linking with MoCaVa. The second step is selecting the variables (out of the 1001 defined by Dymola®) that are of interest to the user in the particular case.

In the Pendulum example (Section 5.3.1) the same selection was done by classifying an uninteresting output as Internal. What prevents something similar in this case is the sheer size of the Argument classification window that would be required. Dymola® reduces the problem by using structured variables, but it is still necessary to scroll a large virtual window to be able to mark the interesting members (Figure 5.44). The Dymola connection window serves the same purpose and works without scrolling (Figure 5.45).

In the present case the three angles R1.q, R2.q, R3.q that we primarily want to connect (since there are corresponding values in the data file) are also among the six state variables (the other three are the angular velocities R1.qd, R2.qd, R3.qd). Since states are connected by default, and since the inputs are all constant parameters and have values specified in dsmodel.c, it is quite possible to proceed without connecting any of the other 995 variables. However, for the sake of argument, assume we want also to check some particular detail in the model, for instance the vector FT3.torque.size.
Figure 5.44. Selecting Dymola® variables to be plotted
Figure 5.45. Specifying variables to be connected
Pull down the Variables menu to find FT3, then the FT3 menu to find torque, and finally the torque menu to find size (Figure 5.46). Thus, the partial-connection procedure emulates the selection of variables to be plotted in Dymola® (as well as the similar procedure of finding files in Windows Explorer), and eliminates the need for scrolling large lists. Click Connect. MoCaVa lists the connected variables, except the default states (Figure 5.47). Click OK. Classify the torque sizes as Response (Figure 5.48). MoCaVa opens the I/O interface window.
Figure 5.46. Connecting three torque size variables
Figure 5.47. Three torque sizes have been connected
Figure 5.48. Classifying torque sizes
Notice that the object-oriented names of the variables are preserved in a connection to MoCaVa. Attach Sensors to all recorded variables (Figure 5.49). MoCaVa opens the Argument attributes window (Figure 5.50). Since no input was connected, the only parameters are those introduced by the sensors. All other entries are extracted from dsmodel.c, including the Short descriptions of the states. The latter appear in the list of Dymola® variables, and in this particular case hold only the units of the variables. Since the corresponding field for FT3.torque.size has been left empty in the Dymola® model, MoCaVa has filled it in with the default text, the variable name. Click OK to accept the attributes from Dymola®. Click OK to assign data to the sensors (Figure 5.51).
Figure 5.49. Append Sensor functions
Figure 5.50. Default attributes extracted from Dymola®
This completes the Furuta component. Since the component has no variable input, it also completes the definition of the model class. To start processing Furuta: Click Simulate. MoCaVa compiles dsmodel.c and links it to MoCaVa, executes the resulting program simp.exe, and opens the Plot selection window. Select the three angles and click OK (Figure 5.52). MoCaVa plots the result in Figure 5.53. The results agree with those plotted by Dymola® (Figure 5.44). Click Layout and check the boxes for the three torques. MoCaVa plots the result in Figure 5.54. This too agrees with Dymola®. The simulation times are also about the same, a few seconds excluding compilation time.
Figure 5.51. Assigning data to state sensors
Figure 5.52. Selecting the three angles for plotting
Figure 5.53. Result of simulating Furuta with MoCaVa
Figure 5.54. The torque sizes
Fitting the model to data yields estimates of the only free parameters, the standard deviations of the measurement errors, all without significant deviation from the true values 0.01 (Table 5.1).

Table 5.1. Parameter estimates

    Parameter     Value
    rms_R1.q      0.00980 ± 0.0002
    rms_R1.qd     0.01007 ± 0.0002
    rms_R2.q      0.00989 ± 0.0002
    rms_R2.qd     0.00986 ± 0.0002
    rms_R3.q      0.00961 ± 0.0002
    rms_R3.qd     0.01005 ± 0.0002
6 Rinsing of the Steel Strip in a Rolling Mill
6.1 Background

The case has been published previously (Bohlin, 1991b, 1994a) as an application of the earlier IdKit1 package. That study, in turn, was based on data collected in a study by Sohlberg (1990; 1991), using the CYPROS® package (CAMO A/S, Trondheim, Norway) for analysis and identification. The purpose of presenting the case a third time is to illustrate the application of the user's guide in MoCaVa3 as an interface to IdKit. The main difference between the presentations is that the systematic procedure for setting up problems for IdKit, which the previous study was intended to try out, has now been formalized and partially automated by MoCaVa.

The data originate in an experiment carried out at the SSAB Domnarvet steel plant in Borlänge, Sweden. During this experiment one of the control variables, the fresh water feed, was varied for the sole purpose of identification. Other inputs were logged as in normal production. Permission to use the data is gratefully acknowledged. The original purpose of the model design was to find a better control strategy for the fresh water feed, in particular during the frequent grade changes.
6.2 Step 1: A Phenomenological Description

This first step in the model-making procedure means recognizing a priori what units and phenomena mainly contribute to making the object of the modelling do its task, and the circumstances under which the experiment data were produced. The step serves to delimit the object, and to focus the modelling on the a priori most important phenomena. This normally requires frequent consultations with an engineer responsible for the process. Björn Sohlberg (1990) is the main source of the prior information used below.

6.2.1 The Process Proper

The mechanism of the rinsing process is illustrated by Figure 6.1 (Sohlberg, 1990). The steel strip enters the process from a pickling bath. After the pickling, highly concentrated hydrochloric acid adheres to the strip surface. Most of it is removed by double push-back rollers, also called "squeezer rolls". The rollers are pressed against the strip by springs in order to make the removal efficient. A residue passes through, however, as a film on both sides of the strip, and enters a series of five rinsing zones. Each zone has the same construction in principle.
186
Practical Grey−box Process Identification
Figure 6.1. The rinsing process (tanks #1 to #5, with strip, control water, and outlet)
While the strip passes a rinsing zone it is sprayed from both sides by showers under high pressure. The rinsing fluid is pumped from the bottom of a recipient below the rinsing zone. Rinsing zone and recipient are confined in a housing (call it "tank"), which prevents most of the spray from contaminating the surroundings. The acid diluted by the spraying goes mainly down into the recipient, either by streaming over the edges of the strip, or falling from the lower surface of the strip, or dropping from the walls or ceiling of the tank. There is no forced mixing mechanism in the recipient; mixing is obtained naturally on the strip surface by the spraying and in the recipient by the suction from the bottom sink caused by the rinsing pump. However, since the diluted acid falls on the surface in the recipient, while the rinsing fluid is pumped from the bottom, the acid concentrations in the recipient may or may not be homogeneous. This would obviously depend on the pump capacity (which is about 100 m3/h) as well as on the volume of the recipient (which is about 10 m3). This yields a turn-over time of about 0.1 h, or half the sampling interval. At the end of the rinsing zone the strip passes between another (but single) pair of push-back rollers and enters the next zone. The single pair of rollers is not pressed against the strip by springs, but only by the weight of the strip and the upper roller. The successive rinsing zones dilute the acid adhering to the strip stepwise, until it exits from the last zone with an acceptably low concentration. The exit rollers in the last zone are doubled and pressed by springs. The rinsing fluids (diluted acid of various concentrations) are continuously refreshed by flows of lower concentration from the next recipient, except that the last recipient is diluted by distilled water. The flow mechanisms differ somewhat. The flows into recipients #2, #3, and #4 are caused by natural overflow through three slots in the walls between the recipients.
Protective plates prevent undesirable 'backwards' flow by spray through the slots. Recipient #1 is refreshed by pumping from #2. Fluid is also pumped out of recipient #1 at a rate of 0.3 m3/h into a recovery process. The pump flow into #1 is regulated to hold a constant level in the recipient. A visual inspection revealed that the regulator error is about ±0.05 m. The excess fluid in recipient #2 also goes to the recovery process by natural overflow through a circular hole in the wall.
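The turn-over argument above is simple arithmetic; a quick sketch (with the pump capacity and recipient volume rounded as in the text) confirms the comparison with the sampling interval:

```python
# Rough mixing check: recipient turn-over time vs. the 0.2 h sampling interval.
# The values are the approximate figures quoted in the text.
pump_flow = 100.0        # rinsing pump capacity [m3/h]
recipient_volume = 10.0  # approximate recipient volume [m3]
sampling_interval = 0.2  # [h]

turnover = recipient_volume / pump_flow  # time to cycle one recipient volume [h]
ratio = turnover / sampling_interval     # fraction of a sampling interval
```

The turn-over time is 0.1 h, half the sampling interval, which makes perfect mixing at least a plausible first hypothesis.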
6 Rinsing of the Steel Strip in a Rolling Mill
187
The mechanism of the spraying and the effect of the rollers are complicated and known only vaguely. Although most of the acid diluted by the spraying no doubt returns to the recipient, there are (at least) two other possible exits: i) through the rollers and further through the slot letting out the steel strip, ii) through the ventilation system. Very little is known about the effect of the latter. The inlet of ventilation air is through the natural openings (mainly with the strip). There is no reason to assume other than constant ventilation. The flows through the rollers and the slot depend on the geometry and the placement of the shower heads, and are likely to be prohibitively complicated for hydrodynamic modelling. The following can be stated, however, with some confidence:
- The rollers are rubber coated, which means that the strip digs into the roller surface, and makes the slot between the rollers, and beside the strip, smaller than the strip thickness (Figure 6.2).
Figure 6.2. Rollers cavitate
- The coating deteriorates by wear, gradually increasing the film thickness on the strip.
- The rollers push most of the fluid on the upper surface over the edges, but since the flow has approximately the same velocity as the strip, there should be a flow ejected through the gap between the rollers, at least close to the edges.
- A third exit flow (to the next recipient) may go directly from the showers into the exit slot (not dependent on strip velocity).
- The input flow from the pickling bath should be mainly via the surface film, since there is no spraying, and the rollers are doubled.
- Showers are turned off when the strip is stopped.
Some Geometrical Measures (Reliable)
Height of tanks = 2.38 m
Width of tanks = 2.6 m
Lengths of tanks #1-5 = (3.4, 3.4, 3.4, 3.4, 4.3) m
Total width of the three slots in a wall = 2.04 m
Height of slots = 0.09 m
Volumes of recipients #1-5 = (6.0, 5.12, 6.01, 6.80, 9.84) m3
Diameter of the circular hole in tank #2 = 0.2 m
Volume of each one of the four pickling baths = 16 m3
Other Prior Information
Flow pumped from tank #1 = 0.3 m3/h (reasonably reliable)
Rinsing flow = 100 m3/h (reasonably reliable)
Input concentration of acid ≈ 200 kg/m3 (unreliable)
Figure 6.3. Experiment data. The first four variables are input, the last five are output. The time unit is hours.
6.2.2 The Measurement Gauges

The conductivities in the rinsing fluids are measured by reliable gauges. There is also a reliable linear relation to the acid concentration. The calibration constant is known.
The sizes of random measurement errors are uncertain, but believed to be of the order of 1%.

6.2.3 The Input

The object process has four known input variables, namely control water flow, strip velocity, strip width, and strip thickness. All are control variables for the integrated steel-making process, but only the first one is used to control the main quality variable, the residual acid on the strip surface. Their variations are depicted in the four top graphs in Figure 6.3. They have been generated as follows:
- The control input flow has been switched between a priori chosen high and low values (3 and 1 m3/h) according to a given pattern of short and long intervals. After half the time the pattern has been repeated. The actual input flow was however disturbed by irregularities in the actuator, and also by unscheduled stops. As a rule, whenever the strip has been stopped, so has the input flow.
- The three other input sequences have been those of the normal operation of the plant. In particular, the strip has been stopped frequently for unrecorded reasons. Width and thickness vary according to product specifications.
Two more inputs are determined by the preceding pickling process and the efficiency of the initial rollers, but are unknown, namely the volume flow and acidity of the surface film into the first tank.
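The switching pattern described in the first bullet can be sketched as follows. The interval lengths here are invented for illustration (the actual pattern is not given in the text); only the two levels 1 and 3 m3/h, the 0.2 h sampling, and the repeat-after-half-time structure come from the description above.

```python
# Hypothetical sketch of the two-level perturbation input: a pattern of short
# and long intervals, toggling between low and high flow, repeated once.
def two_level_sequence(pattern_h, low=1.0, high=3.0, dt=0.2):
    """Expand a list of interval lengths [h] into a flow sequence sampled every dt hours."""
    u, level = [], low
    for length in pattern_h:
        u += [level] * int(round(length / dt))
        level = high if level == low else low   # toggle at each interval end
    return u

half = two_level_sequence([2.0, 0.6, 1.4, 0.6])  # invented short/long pattern
u = half + half                                   # repeat after half the time
```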
6.3 Step 2: Variables and Causality

The purpose of the second step is to define the most important variables and the causal relations between them. Like the first step this must be based on prior knowledge only. It will also create a skeleton structure on which to hang any prior knowledge about the variables and the relations between them; the variable definitions will be the 'nodes' to be connected with functional relations.

6.3.1 The Variables

A list of variables follows immediately from Figure 6.1.

Known input:
uf = Control input flow [m3/h]
uv = Strip velocity [m/h]
uw = Strip width [m]
ut = Strip thickness [m]
F10 = Outlet flow from recipient #1 [m3/h]
Fs = Rinsing flow (all tanks) [m3/h]

Unknown input:
c0 = Input acid concentration
F01 = Input acid flow

Measurements:
Ci = Conductivity of the rinsing fluid in tank #i (i = 1,...,5) [mS/m]
Figure 6.4. A two-compartment tank model (strip and recipient compartments, with flows Fs, F_ji, F_j,i-1, F_ri, F_oi, F_o,i+1 and concentrations c_si, c_i)
Other Endogenous Variables
F_ji = Volume flow from roller #i into the next tank (i = 1,...,4) [m3/h]
q_ji = Acid flow from roller #i into the next tank (i = 1,...,4) [kg/h]
F_ri = Rinsing fluid from strip into the recipient in tank #i (i = 1,...,4) [m3/h]
F_oi = Overflow from recipient #i (i = 3,...,5) [m3/h]
F_o2 = Flow between recipients #2 and #1, controlled by the level regulator
F20 = Outlet from recipient #2 [m3/h]
F50 = Residual exit flow with the strip [m3/h]

State Variables
In a rinsing process the number of state variables is generally infinite (a distributed-parameter system), but a reasonably detailed (and perhaps too detailed) compartmental model would have only a few per rinsing tank. A simple start is to model each tank by two compartments, i.e., the boxes in Figure 6.4 represent perfect mixing tanks.
c_si = Concentration on the strip at the end of rinsing zone #i
c_i = Concentration(s) in the recipient in tank #i (possibly vector-valued)
z_i = Depth of fluid in recipient #i
A number of other (and more elusive) state variables are conceivable, e.g., caused by waves on the surface.

6.3.2 Cause and Effect

The causality between the state variables in different tanks is not clear-cut, since the overflows mean feedback flows between tanks. At the minimum, there are two states per tank, namely concentration and level (if c_si has an algebraic relation to c_i). The first state feeds to the next tank and the second feeds to the previous one. However, it is evident that the time constants of the level overflow are much shorter than those of the concentration (water flow is faster than mixing), which makes the system 'stiff'. In the simplest conceivable model it will even be reasonable to hypothesize that the effect of overflow is instantaneous, and thus to resort to a single state per tank. That would still mean feedback between tanks, only that the equations would now be algebraic.
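The 'instantaneous overflow' simplification replaces a differential equation by an algebraic one; conversely, as discussed below, an algebraic relation can be emulated by a 'fast' state equation. A toy sketch of the latter idea (not MoCaVa code; the relation g and all numbers are invented):

```python
# Toy illustration: an algebraic relation z = g(x) emulated by a 'fast' state
# equation eps*dz/dt = g(x) - z, as in the stiff-system discussion.
def g(x):
    return 2.0 * x          # hypothetical algebraic relation (invented)

def relax_to_manifold(x=1.0, eps=1e-3, dt=1e-4, T=0.05):
    """Forward-Euler integration of the fast state; z is pulled onto z = g(x)."""
    z, t = 0.0, 0.0          # start z off the algebraic manifold
    while t < T:
        z += dt * (g(x) - z) / eps
        t += dt
    return z
```

With eps much smaller than the other time constants, z tracks g(x) closely, at the price of a small integration step (dt must be well below eps for explicit Euler), which is why keeping all tanks in one component and using plain assignment statements is computationally cheaper.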
When modelling the process it would be most straightforward to assign a separate component to each tank. That would mean connecting components both ‘downstream’ and ‘upstream’. MoCaVa is able to handle such systems, even when some relations are
algebraic. However, this is done, in essence, by replacing the algebraic equations with 'fast' state equations (see Section A.5). A better way is to avoid feeding state information between components, simply by modelling all five tanks within the same component. In essence, this replaces the algebraic equations by assignment statements, and the level 'states' become output, which cost very little computing. The simplification is feasible because the level 'states' are now bound algebraically to the concentration states (hypothetically). Also, since models of all tanks have to be present in all reasonable model structures, it is better to include at least rudimentary models of all five tanks in a single component. Refinement of the model will then mean including more physical phenomena in the expanding model classes. Thus, components will mainly represent physical phenomena and not process units. The 'root' model class should preferably describe only the mass balances of acid and water in the (assumed perfect) mixing taking place in the five tanks. With a single component the causality issue will be no problem, since the state equations of the root model class follow immediately. Additional components, describing other phenomena, are natural causes of changes in the root model, and the order of causality between components follows naturally too. The following steps are supported by MoCaVa.

6.3.3 Data Preparation

The experiment data used for the calibration and validation have been recorded during normal operation, except that the fresh water regulator has been disconnected and replaced by a digital perturbation-signal generator. However, in other ways the range of the experiment covers a representative set of product specifications. Raw data are available from a 280-hour experiment, and are stored in a single ASCII file residing in mocava\Examples\Rinsing\domnarvet.dat. Table 6.1 shows the contents of the file.

Table 6.1. Record specifications of data file domnarvet.dat
Pos.  Variable
1     Time_[h]
2     ConductivityTank#1_[8000 mS/m]
3     ConductivityTank#2_[500 mS/m]
4     ConductivityTank#3_[25 mS/m]
5     ConductivityTank#4_[3 mS/m]
6     ConductivityTank#5_[mS/m]
7     FreshWaterFlow_[m3/h]
8     StripVelocity_[230 m/min]
9     StripWidth_[m]
10    StripThickness_[mm]
Notice the peculiar units, which have been set to fit the data acquisition system (they serve to bring all data to the same order of magnitude). The sampling interval is constant, and there are no missing data or gaps in the sequence of records. Neither are there any obvious outliers. The data preparation will therefore be straightforward.
Figure 6.5. Entering data file properties
Remark 6.1. A reader who wants to use MoCaVa to repeat the model design procedure described below may skip the data preparation steps, since the result is available as domnarvet.mcv in the same directory.

Start MoCaVa by typing mocava3 in the MATLAB workspace, and select Predat, New project, and Get data file. MoCaVa opens a browser window for locating the file. Move to the directory Examples\Rinsing and open domnarvet.dat. MoCaVa opens the Plot outline and the Data outline windows. Consult Table 6.1 to change the following in the Data outline window:
- Rename the variables. This will also have immediate effect on the Plot outline window.
- Indicate Measured Variable Nr 1 as time, and select hours.
Click Apply now. The resulting Data outline and Plot outline windows are shown in Figures 6.5 and 6.6. End Predat by clicking Main, and then Save as and Exit. Place the prepared sample in mocava\Examples\Rinsing\domnarvet.mcv.

6.3.4 Relations to Measured Variables

The measured variables are displayed in Figure 6.6. The measurement conditions will need specification later. Make one more table to support this (Table 6.2). The conversion factor is needed for converting from the units used in the model to those used in the data file. The model will use the standardized units [m] and [kg] for length and weight, but [h] for time and [mS/m] (milliSiemens/meter) for conductivity.
Figure 6.6. Data as they appear on the screen
Table 6.2. Relations between model variables and data

Variable    Data                        Mean    Conversion
C1 [mS/m]   Conductivity#1 [8000mS/m]   1.036   1/8000
C2 [mS/m]   Conductivity#2 [500mS/m]    0.694   1/500
C3 [mS/m]   Conductivity#3 [25mS/m]     1.64    1/25
C4 [mS/m]   Conductivity#4 [3mS/m]      2.11    1/3
C5 [mS/m]   Conductivity#5 [mS/m]       0.743   1
uf [m3/h]   FreshWaterFlow [m3/h]       1.62    1
uv [m/h]    StripVelocity [230m/min]    0.496   1/13800
uw [m]      StripWidth [m]              1.40    1
ut [m]      StripThickness [mm]         3.11    1000
6.4 Step 3: Modelling

Connecting the variables with relations of various complexities and credibilities, with known or unknown parameters, will create the expanding set of model classes that MoCaVa needs as prior information (in addition to the experiment data). This will exploit more specific knowledge about the phenomena governing the behavior of the object. So far, preparations for this step have to be done only for the 'root' model, i.e., for the simplest conceivable and preferably also reliable relations between the defined variables. Further modelling is postponed until the result of the calibration step indicates that refinement is necessary. Thus, steps 3 and 4 are taken repeatedly in a 'loop'.

Start the loop by selecting Calibrate in the MoCaVa window. Select and open the Rinsing project. (If it is not in the list, click New project and enter its name Rinsing.) In the next window select the data file where you put the prepared data, i.e., mocava3\Examples\Rinsing\domnarvet.mcv. Click OK in the Sample specification window to indicate that the whole file is to be used for the calibration.

The modelling procedure always starts with describing the elements that are preferably both reliable and a priori most likely to be important for describing the object's response to known input. It is clear already from a look at the data that a static model will not be enough to describe the variations in acid concentrations in the tanks (they vary more slowly than the input). In addition, the concentrations covary strongly, and it is difficult to see which of the four inputs cause the variation. Therefore make a 'root' model class that takes all input and output into account, but also allows for testing the hypothesis that some of the inputs do not contribute significantly.

6.4.1 Basic Mass Balances

The descriptions in Figures 6.1 and 6.4 suggest the following hypotheses for the 'root' model class:

Hypothesis: Recipients have perfect mixing.
Hence, material balances hold for the contents of acid:

dc_i/dt = [F_ri (c_si - c_i) + F_o,i+1 (c_i+1 - c_i)] / V_i,  (i = 1, ..., 4)   (6.1)
dc_5/dt = [F_r5 (c_s5 - c_5) + F_05 (0 - c_5)] / V_5   (6.2)

The equations hold also for variable volumes.

Hypothesis: The mixing on the strip is instantaneous.

Hence, static balances hold for flow and concentration on the strip:

c_s1 = (F_s c_1 + F_01 c_0) / (F_s + F_01)   (6.3)
F_r1 = F_s + F_01 - F_j1 + F_1^stub   (6.4)
q_j1 = F_j1 (c_s1 + c_1^stub) + q_1^stub   (6.5)
c_si = (F_s c_i + q_j,i-1) / (F_s + F_j,i-1),  (i = 2, ..., 4)   (6.6)
F_ri = F_s + F_j,i-1 - F_ji + F_i^stub,  (i = 2, ..., 4)   (6.7)
q_ji = F_ji (c_si + c_i^stub) + q_i^stub,  (i = 2, ..., 4)   (6.8)
c_s5 = (F_s c_5 + q_j4) / (F_s + F_j4)   (6.9)
F_r5 = F_s + F_j4 - 0   (6.10)
Hypothesis: The effective volumes of the recipients V_1, ..., V_5 are constant and known.

Hypothesis: Input concentration c_0 and flow F_01 of acid are constant.

Hypothesis: Changes of flows between tanks are instantaneous (much faster than the sampling interval 0.2 h). The fluid balance yields

F_o5 = F_r5 + F_05 - F_s   (6.11)
F_o4 = F_r4 + F_o5 - F_s   (6.12)
F_o3 = F_r3 + F_o4 - F_s   (6.13)
The relations express the balances in tanks #5, #4, and #3. Since the outlet through the hole in the wall of tank #2 is not measured, a fluid balance in tank #2 would contribute no prior information.

Hypothesis: The flow F_10 pumped out of tank #1 is constant = 0.3 m3/h, and the level of the recipient is regulated perfectly.

F_o2 = F_s - F_r1 + F_o1   (6.14)
It remains to model the residual flows of rinsing fluid ejected from each tank into the next. That is much less well supported by prior information, but the following would appear to be the simplest hypothesis.

Hypothesis: The ejected flows are mainly proportional to the strip velocity, possibly added to a constant 'spray':

F_ji = a_oi u_v + F_i^spray,  (i = 1, ..., 4)   (6.15)

where a_oi are parameters measuring the effective areas of the jets of fluid ejected with the strip.

Hypothesis: The concentrations are measured with identical, linear conductivity gauges with constant scale factor L_k = 1100 [mS m2/kg]. Measurements are subject to independent Gaussian errors with constant standard deviation:

C_i = L_k c_i,  (i = 1, ..., 5)   (6.16)
Stubs

The equations have been supplied with a number of extra constants, whose purpose is to indicate where it would be expected a priori that the model structure is inadequate. Determining their places is a means to enter a prior assessment of the credibility of the model structures.
- a_oi is the effective area of the jet of fluid ejected with the strip, and the quantity most likely not to be constant.
- F_i^stub, q_i^stub, c_i^stub allow for additional modelling of the flow dynamics. Since it is unknown what physical phenomena may need to be described, if any, there are several logical places to put the stubs.
- Add also Dc^stub to Equations 6.1 and 6.2 in anticipation that they will not explain all concentration variation. Changes in the input acid concentration c_s,i-1 will not be enough, since the transfer gain to dc_i/dt is approximately F_ji / V_i. This yields an approximate response time of 100 h. Responses to changes in input flow will be equally inadequate.
Entering Function Statements

Type Tanks in the Component naming window, and select Change to define it. MoCaVa opens the Component function window to receive the assignment statements that specify the basic mass balances for the acid flows in the five tanks. Rewriting Equations 6.1 to 6.16 as M-statements yields the following code to be entered (this has already been done in the demo case):

% Forward flows with strips:
for i = 1:4
  Fjet(i) = ao(i)*uv + Fspray(i)
end
% Static balance on strips:
cs(1) = (Fs*c(1) + F01*c0)/(Fs + F01)
Fr(1) = Fs + F01 - Fjet(1) + Fstub(1)
qjet(1) = Fjet(1)*(cs(1) + cstub(1)) + qstub(1)
for i = 2:4
  cs(i) = (Fs*c(i) + qjet(i-1))/(Fs + Fjet(i-1))
  Fr(i) = Fs + Fjet(i-1) - Fjet(i) + Fstub(i)
  qjet(i) = Fjet(i)*(cs(i) + cstub(i)) + qstub(i)
end
cs(5) = (Fs*c(5) + qjet(4))/(Fs + Fjet(4))
Fr(5) = Fs + Fjet(4) - 0
% Static flow balances in recipients #3-5 (overflow):
Fout(5) = Fr(5) + F05 - Fs
Fout(4) = Fr(4) + Fout(5) - Fs
Fout(3) = Fr(3) + Fout(4) - Fs
% Outlet from tank #1:
Fout(1) = 0.3
% Perfect level regulation in recipient #1:
Fout(2) = Fs - Fr(1) + Fout(1)
% Temporary halt:
if uv < 1
  for i = 1:5
    Fout(i) = 0
    Fr(i) = 0
    cs(i) = 0
  end
end
% Dynamic acid balance in recipients:
for i = 1:4
  Dc(i) = (Fr(i)*(cs(i) - c(i)) + Fout(i+1)*(c(i+1) ...
      - c(i)))/V(i) + Dcstub(i)
end
Dc(5) = (Fr(5)*(cs(5) - c(5)) + F05*(0 - c(5)))/V(5) ...
    + Dcstub(5)
% Conductivities:
C1 = Lk*c(1)
C2 = Lk*c(2)
C3 = Lk*c(3)
C4 = Lk*c(4)
C5 = Lk*c(5)
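As a plausibility check of the balance equations (outside MoCaVa), the same statements can be sketched in Python, with all stubs and the strip-stop logic omitted and with nominal constants taken from the attribute tables that follow. This is a rough sketch of the equations, not the MoCaVa component itself:

```python
# Python sketch of Equations 6.1-6.15 with stubs = 0 and nominal constants
# (values assumed from the attribute tables below).
AO = [2.28e-5] * 4                    # jet cross areas ao(i) [m2]
FS = 100.0                            # rinsing (shower) flow Fs [m3/h]
F01, C0 = 0.0134, 200.0               # input acid film: flow [m3/h], conc. [kg/m3]
F05 = 1.62                            # mean fresh water flow [m3/h]
UV = 6845.0                           # mean strip velocity [m/h]
VOL = [6.00, 5.12, 6.01, 6.80, 9.84]  # recipient volumes V(i) [m3]

def dcdt(c):
    """Right-hand sides of Equations 6.1-6.2, via the statics 6.3-6.15."""
    Fjet = [a * UV for a in AO]                           # (6.15), Fspray = 0
    cs, Fr, qjet = [0.0] * 5, [0.0] * 5, [0.0] * 4
    cs[0] = (FS * c[0] + F01 * C0) / (FS + F01)           # (6.3)
    Fr[0] = FS + F01 - Fjet[0]                            # (6.4)
    qjet[0] = Fjet[0] * cs[0]                             # (6.5)
    for i in range(1, 4):
        cs[i] = (FS * c[i] + qjet[i-1]) / (FS + Fjet[i-1])  # (6.6)
        Fr[i] = FS + Fjet[i-1] - Fjet[i]                    # (6.7)
        qjet[i] = Fjet[i] * cs[i]                           # (6.8)
    cs[4] = (FS * c[4] + qjet[3]) / (FS + Fjet[3])        # (6.9)
    Fr[4] = FS + Fjet[3]                                  # (6.10)
    Fout = [0.0] * 5
    Fout[4] = Fr[4] + F05 - FS                            # (6.11)
    Fout[3] = Fr[3] + Fout[4] - FS                        # (6.12)
    Fout[2] = Fr[2] + Fout[3] - FS                        # (6.13)
    Fout[0] = 0.3                                         # pumped outlet of tank #1
    Fout[1] = FS - Fr[0] + Fout[0]                        # (6.14)
    dc = [0.0] * 5
    for i in range(4):                                    # (6.1)
        dc[i] = (Fr[i]*(cs[i] - c[i]) + Fout[i+1]*(c[i+1] - c[i])) / VOL[i]
    dc[4] = (Fr[4]*(cs[4] - c[4]) + F05*(0.0 - c[4])) / VOL[4]   # (6.2)
    return dc

def simulate(c_start, T=50.0, dt=0.01):
    """Crude forward-Euler integration over T hours."""
    c = list(c_start)
    for _ in range(int(T / dt)):
        d = dcdt(c)
        c = [ci + dt * di for ci, di in zip(c, d)]
    return c
```

Starting from the nominal concentrations and integrating for 50 h, the concentrations stay positive and strictly decreasing along the cascade, with tank #5 orders of magnitude more dilute than tank #1.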
Click OK. MoCaVa opens the Component function window again for entering start conditions. The start values are unknown. Introduce an array parameter cinit:

for i = 1:5
  c(i) = cinit(i)
end
Click OK. MoCaVa opens the Argument classification window.

Argument Classification

The entries in the Argument classification window follow immediately:
- All dependent variables except C1,...,C5 are Internal.
- The strip velocity uv and fresh water flow F05 are Control variables.
- Input acid flow F01 and concentration c0 are Feed variables.
- The ao array is unknown. Classify it tentatively as Parameter, even though it may (or may not) need to be replaced by variables from other component(s) describing how the acid flows ejected with the strip depend on the strip properties.
- The known rinsing flow Fs and tank volumes V are Constant.
- Stubs Fstub, cstub, qstub, Dcstub are Constant.
- The unknown initial value cinit of the state is Parameter.

Edit the Argument classification window:

Argument    Class

Component output
Fjet        Internal
cs          Internal
Fr          Internal
qjet        Internal
Fout        Internal
C1          Response
C2          Response
C3          Response
C4          Response
C5          Response

Component input
ao          Parameter
uv          Control
Fspray      Parameter
Fs          Constant
F01         Feed
c0          Feed
Fstub       Constant
cstub       Constant
qstub       Constant
F05         Control
V           Constant
Dcstub      Constant
Lk          Constant
Initialization input
cinit       Parameter
I/O Interfaces

Specifying the sources of the input and the possible targets of response arguments allows some freedom of choice:
- The conductivities C1,...,C5 are measured regularly, and the results are in the sample file. Select Sensor.
- The concentrations c are measured, but only indirectly through the conductivities. Select NoSensor.
- Neither of the properties F01 and c0 of the acid input is measured. Select NoSensor and User model to indicate that more modelling may be needed to describe their variation (if any).
- The control inputs F05 and uv have data sequences associated with them. One can choose either to assign library conversion functions, and thus to include the latter as part of the Tanks component, or else to indicate that separate components will do the conversion. Generally, it is not clear at this point what interpolation routine to specify. Since the latter may need to change, it may be advantageous to have separate components for the DA conversion. Another general argument leading to the same conclusion is that the model may need to be expanded later with other components referencing the same control or feed data. This will allow the conversion functions to be specified only once. In view of this, the classification will be different for the two inputs: i) F05 has accurate data and enters at a well-defined point in a reliable relation. Therefore assign a Hold function as source. ii) Even if the velocity input can also be assumed to have been recorded accurately, it may also appear as input to other components, for instance the source of ao. Therefore select User model.

Edit the I/O interface window:

Argument    Source

Connections to sensors
C1          Sensor
C2          Sensor
C3          Sensor
C4          Sensor
C5          Sensor
F01         NoSensor
c0          NoSensor
c           NoSensor

Feed input: Source model
F01         User model
c0          User model

Control input: Source model
uv          User model
F05         Hold
Argument Attributes

The next two windows are the second most important entry points for prior information (after the function statements).
The following are some notes of relevance to the Argument attributes window:
- Since the variables are many, it is important to associate them with informative Short descriptions, including units.
- The dimensions of ao, Fspray, Fstub, cstub, qstub are one less than the number of tanks. The reason for this is that the balance equations are different for the last tank.
- The arrays ao, Fspray, cinit, c, Fstub, cstub, qstub, and V must have implicit attributes. This means that the problem of getting their values is postponed to the next window.
- All parameters have positive values, except the stub Fspray.
- The scale of Fspray is the same as that of the fresh water flow F05, since there is a flow balance throughout the tank array. Use zero for its nominal value, since the term may not be significant.
- Some Constants are scalars and need no implicit attributes.
- The scales and nominal values of the rms values are by default expressed as factors of the scales of the measured variables. Giving the factors the high value of one would seem somewhat unphysical, since errors are normally smaller than the average amplitude of the measured variables. However, if the first model structure is wrong, the rms values have to account for more errors than those caused by the sensors, which would suggest a value between the assumed 0.01 and 1. Selecting the latter value would emphasize the risk that the first trials may well fail completely.

Edit the Argument attributes window.
Short description
Dim Scale
Nominal
Min
Parameters ao Fspray cinit rms_C1 rms_C2 rms_C3 rms_C4 rms_C5
JetCrossArea_[m2] Spray_[m3/h] InitAcidConc_[kg/m3] StDError_C1 StDError_C2 StDError_C3 StDError_C4 StDError_C5
4 4 5 1 1 1 1 1
ScaleArea ScaleFW Scalec ScaleC1*1 ScaleC2*1 ScaleC3*1 ScaleC4*1 ScaleC5*1
NomArea 0 Nomc ScaleC1*1 ScaleC2*1 ScaleC3*1 ScaleC4*1 ScaleC5*1
0
Control input uv F05
StripVelocity_[m/h] FreshWaterFlow_[m3/h]
1 1
Scaleuv ScaleFW
Nomuv
Feed input F01 c0
InputAcidFlow_[m3/h] InputFlowConcentration_[kg/m3]
1 1
ScaleF01 Scalec0
NomF01 Nomc0
Process output C1 C2 C3 C4 C5
ConductivityTank#1_[mS/m] ConductivityTank#2_[mS/m] ConductivityTank#3_[mS/m] ConductivityTank#4_[mS/m] ConductivityTank#5_[mS/m]
1 1 1 1 1
ScaleC1 ScaleC2 ScaleC3 ScaleC4 ScaleC5
States c
AcidConcentration_[kg/m3}
5
Scalec
0 0 0 0 0 0
Max
Constants
Fs         ShowerCapacity_[m3/h]             1    100
Fstub      JetFlowDisturbance_[m3/h]         4    0
cstub      ConcentrationDisturbance_[kg/m3]  4    0
qstub      AcidFlowDisturbance_[kg/h]        4    0
V          RecipientVolumes                  5    V
Dcstub     StateDisturbance_[kg/m3/h]        5    0
Lk         InstrumentFactor_[mS*m2/kg]       1    1100

Internal arrays
Fjet       Fjet                              4
cs         cs                                5
Fr         Fr                                5
qjet       qjet                              4
Fout       Fout                              5
Nominal Values and Scaling

The values of implicit attributes are specified in the next window. Getting the values requires some prior analysis. Considering the entries carefully will save trouble later. That holds in particular for parameters. For nonlinear model structures, it may be very important to have reasonable start values for the parameters as well as expected ranges of variation. Both are in principle prior information. It is therefore worthwhile to put some effort into trying to estimate the ranges of possible values before starting the calibration.
- The nominal values of conductivities are obtained from the averages in Table 6.2 divided by the scale factors. Further division by the instrument factor 1100 yields the nominal values of the acid concentrations c. Scales follow from nominals by rounding to the nearest power of ten.
- Nominal value and scale for the control input are also obtained from Table 6.2.
- Finding the nominal values NomArea of the a_o array requires some prior analysis. Acid balances in tank arrays #i,...,5 (i = 2,3,4,5) yield F_j,i-1 c_i-1 = F_05 c_i. This allows the computing of nominal values of F_j1, ..., F_j4. However, it will be advantageous to have a common nominal value, since this allows the testing of the hypothesis that (besides different volumes) the tanks have equal properties. The geometric mean is (F_j1 F_j2 F_j3 F_j4)^(1/4) = F_05 (c_5/c_1)^(1/4) = 1.62 (0.000675/7.56)^(1/4) = 0.157. The nominal value of a_oi is obtained by dividing by the nominal strip velocity: a_oi = 0.157/6845 = 0.0000228 m2.
- Because of the outlet in tank #2 (the hole in the wall), the acid balance of the total tank array #1,...,5 is different: F_01 (c_0 - c_2) = (F_05 - F_10) c_2 + F_10 c_1. However, F_01 cannot be computed, since the inlet concentration c_0 has not been measured. It will be necessary first to guess a value, c_0 = 200, say, in order to be able to compute an equally uncertain value of F_01: F_01 (c_0 - c_2) = (1.62 - 0.3)*0.312 + 0.3*7.56 = 2.68, so F_01 = 2.68/(200 - 0.312) = 0.0134. The total flow of input acid, 2.68 kg/h, will be more accurate.

Enter the following values into the Implicit attributes window:
Attribute   Values
ScaleC1     1000
ScaleC2     100
ScaleC3     10
ScaleC4     1
ScaleC5     0.1
Scalec      1  0.1  0.01  0.001  0.0001
Scaleuv     1000
Nomuv       6845
ScaleF01    0.01
NomF01      0.013
Scalec0     10
Nomc0       200
ScaleFW     0.2
ScaleArea   0.00001  0.00001  0.00001  0.00001
NomArea     0.0000228  0.0000228  0.0000228  0.0000228
V           6.00  5.12  6.01  6.80  9.84
Nomc        7.56  0.315  0.0373  0.00575  0.000675
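The prior analysis in the bullets above can be re-run numerically; the small differences from the quoted values are rounding in the book's arithmetic:

```python
# Re-deriving the nominal values above from the Table 6.2 averages.
means = [1.036, 0.694, 1.64, 2.11, 0.743]  # data-file averages (Table 6.2)
units = [8000, 500, 25, 3, 1]              # data units [mS/m]
Lk = 1100.0                                # instrument factor [mS*m2/kg]
c_nom = [m * u / Lk for m, u in zip(means, units)]  # acid concentrations [kg/m3]

# Common nominal jet flow: geometric mean of the per-tank acid balances.
Fj = 1.62 * (c_nom[4] / c_nom[0]) ** 0.25  # F05*(c5/c1)^(1/4) [m3/h]
ao_nom = Fj / 6845.0                       # divide by the nominal strip velocity

# Total acid input and the (uncertain) film flow for a guessed c0 = 200 kg/m3.
q_in = (1.62 - 0.3) * c_nom[1] + 0.3 * c_nom[0]  # [kg/h]
F01_nom = q_in / (200.0 - c_nom[1])              # [m3/h]
```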
Assigning Data

The next window requires the position in the data file and a units conversion factor. Both have been specified in Table 6.1. Edit the Data assignment window:

Argument    Data            Conversion

Output data
C1          CondTank#1      0.000125
C2          CondTank#2      0.002
C3          CondTank#3      0.04
C4          CondTank#4      0.3333
C5          CondTank#5      1

Input data
F05         FreshWaterFlow  1
6.4.2 Strip Input

Next, a component is needed that couples the strip velocity to data. However, since it is likely a priori that some of the other recorded strip properties, width and thickness, also affect the acid concentrations, it is convenient to include those in the same component. Select Edit and Insert, enter Strip, and select Change. Connecting input variables to data is done in a standard way; the only user's freedom is that of selecting the interpolation routine.

Entering Function Statements

Enter the M-statements:

uv = speed
uw = width
ut = thickness
The statements do two things: i) define three continuous-time variables (speed, width, thickness) to receive the results of library routines interpolating between the discrete-time data, and ii) direct those to all inputs named (uv, uw, ut) in other components.

Argument Classification

The classification is immediate. Edit the Argument classification window:

Argument    Class

Component output
uw          Response
ut          Response

Component input
speed       Control
width       Control
thickness   Control
I/O Interfaces

Try the simplest input model, the Hold. Edit the I/O interface window:

Argument    Source

Connections to sensors
uw          NoSensor
ut          NoSensor

Control input: Source model
speed       Hold
width       Hold
thickness   Hold
Argument Attributes

No nominal values are required, since the variables will be connected to data. Edit the Argument attributes window:

Argument    Short description      Dim  Scale    Nominal  Min  Max

Control input
speed       StripVelocity_[m/h]    1    Scaleuv
width       StripWidth_[m]         1    Scaleuw
thickness   StripThickness_[m]     1    Scaleut

Process output
uw          StripWidth_[m]         1    Scaleuw
ut          StripThickness_[m]     1    Scaleut
Scaling The scales of width and thickness are obtained from the averages in Table 6.2 divided by the scale factors.
6 Rinsing of the Steel Strip in a Rolling Mill
Enter the following values into the Implicit attributes window:

Attribute   Values
Scaleuw     1
Scaleut     0.001
Assigning Data

Again, the names of the associated variables in the data file and the conversion factors are listed in Table 6.2. Edit the Data assignment window:

Argument     Data             Conversion
speed        StripVelocity    0.0000724
width        StripWidth       1
thickness    StripThickness   1000
This concludes the definition of the Strip component. The two components make a meaningful ‘root’ model class for a first calibration. But first, in order to detect any obvious faults in the components and nominal parameters, do a simulation of the root model. Click Simulate. The averages of the model output in the Plot window (Figure 6.7) agree roughly with the data averages in tanks #1, 4, and 5. However, it is apparent that at least the concentrations in tanks #2 and 3 have large biases, which indicates that the hypothesis of common tank properties does not hold. This suggests trying different eject−area parameters a_oi.
6.5 Step 4: Calibration

Select Accept (the model class), click OK to fit the five error parameters rms_C1, ..., rms_C5. MoCaVa searches for optimal values and opens the Search appraisal window. Click Accept. MoCaVa opens the Alternative structures window. This brings the session to the starting point of the calibration loop, where the user must define one or more alternative structures, either by freeing selections of the remaining locked parameters, or by expanding the model class.

Tentative model structure #0

As long as there are unknown parameters in the tentative class, they are obvious tools by which to improve the current tentative model by fitting. In order to set up for this, one has to define one or more alternative structures, i.e., structures created by freeing the parameters in the tentative class that will be subject to fitting. The obvious candidates are the elements in ao. Free all of ao, in addition to the already free rms_C1, ..., rms_C5 (Figure 6.8). MoCaVa calculates the loss reduction to 2065 and the risk value to 0. As anticipated from Figure 6.7, the tentative structure has been falsified.
Figure 6.7. Response of tentative model #0
Figure 6.8. Specifying a single alternative structure

Select Advanced and suppress the user’s check points for Origin and Free parameters, thus making the fitting occur automatically after the new structure has been acknowledged. Then click Select_#1 (to acknowledge that the alternative with different ao is better), and select Accept (the new model with fitted ao). This also accepts the ‘root’ model class with free ao as tentative model structure #1. MoCaVa calculates the normalized loss to 0.432202, and opens the Alternative structures window again.

Tentative model structure #1

Click Plot. The levels of variation in tanks #2 and #3 now agree better, but nothing else has improved. Click Accept. In the second round, test also the nominal values of F01 and c0. Free each one of F01 and c0, thus creating two alternative structures. Loss reductions are 6.8 and 6.8 for the two alternatives. Risk is 0.0009. Both are just significantly better, and one can choose either of them. However, earlier analysis indicated that although the product of F01 and c0 (the acid input) is strongly related to data, the factors (flow and concentration) are not. If it is possible to determine only one of them, the choice must be based on preference. Prefer to lock the input concentration, since it is believed to vary less than the flow, which should be proportional to the frequently changing strip velocity. (Taking and analyzing some samples from the pickling bath would have helped.) Click Select_#1. Accept the fitted model and the ‘root’ model class with free ao and F01 as tentative model structure #2. MoCaVa computes the normalized loss to 0.431782, and opens the Alternative structures window again.

Tentative model structure #2

It remains to test the effect of better start values. Free cinit (click [+] five times). Loss reduction is 42.6. Start values are significant. Click Select_#1. Then Accept model structure #3. MoCaVa computes the normalized loss to 0.428216, and opens the Alternative structures window. The freedom of the ‘root’ model class has now been exhausted. As expected, freeing also c0 yields no loss reduction. The amplitude variations in the model agree only
partly with the variations in the data. The model needs a better description of the mechanism generating the acid flow that follows the strip between tanks.
6.6 Refining the Model Class

The “push−back” or “squeezer rolls” are designed to prevent as much as possible of the rinsing fluid from following the strip into the next tank. They are the obvious next candidates for modelling.

6.6.1 The Squeezer Rolls

Click NewClass (to mark that the tentative class has been exhausted), select Edit (to change the model library), select Insert on the same row as Tanks (to add a new component), enter Rollers (to give it a name), and select Change (to define the component). The Rollers component has to be placed above Strip, since it will depend on the output of that component. MoCaVa opens the Component function window to receive M−statements. Since the fluid streams around the rollers are governed by complex hydrodynamics, a model of their effects will obviously be much more uncertain. However, the following seems reasonable for a starter.

Hypothesis: The flow of acid is partly carried with the surface film on the strip, partly adhering to the edges of the strip, partly ejected through the gap between rollers beside the strip, and partly as additional spray through the exit slots in the tanks.

Hypothesis: Because of the weights of the upper roller and the strip, the latter digs into the rubber surface and creates a gap that is smaller than the strip thickness. For thin strips the gap may even close.

This suggests the description

a_oi = h_fi u_w + h_ei u_t + h_gi max(u_t − δ_i, 0) + a_si    (6.17)
where h_f and h_e are the thicknesses of the films adhering to the surface and the edges of the strip, h_g is the effective width of the gap flow, and a_s is the effective area of spray following the strip (for instance pulled from the surfaces of the rotating rollers by centrifugal forces). Figure 6.9 illustrates the hypothesis.

Figure 6.9. Hypothesized contributions to the ejected residual acid flow (upper half)
The impression δ in the roller surface depends on the weights of the strip and roller. A formula for the dependence can be derived as follows: The force acting on the lower roller is

f_1 = g (m_r + ρ_s l_s u_w u_t)    (6.18)
where m_r is the weight of the upper roller, ρ_s is the density of steel, l_s is the effective length of the strip resting on the roller, and g is the constant of gravity. The force equals the resistance against impression

2 E u_w r ∫₀^ψ [1 − cos(ψ − φ)] dφ = 2 E u_w r (ψ − sin ψ) ≈ (1/3) E u_w r ψ³    (6.19)
where 1/E is the elasticity of the rubber, r is the radius of the roller, and ψ is the largest angle of contact between roller and strip. The maximum impression is

δ_1 = r (1 − cos ψ) ≈ r ψ²/2    (6.20)
which yields

δ_1 = [ g (m_r + ρ_s l_s u_w u_t) / ( (√8/3) E r^(−1/2) u_w ) ]^(2/3) ≡ d_c ū_t (ū_w/u_w + d_cw u_t/ū_t)^(2/3)    (6.21)
where d_c and d_cw are unknown parameters, the first measuring, in essence, the elasticity of the rubber, and the second the relative weight of the strip section. The constants ū_w and ū_t are rated values introduced to make the parameters free of scale. The impression on the upper roller is smaller:
δ_2 = d_c ū_t (ū_w/u_w)^(2/3)    (6.22)
There are 24 new parameters replacing the five in the ‘root’ model class. The following prior knowledge will help in reducing the number. Rollers wear out with use in two respects: i) the rubber coating hardens, which would be expected to decrease the elasticity parameter d_c; ii) the surface becomes less smooth, which would be expected to increase the thickness h_f of the film of acid residuals. Normally, the first effect of wear requires regular replacement of the rubber coating. The second effect requires only a less expensive ‘re−conditioning’ of the roller surface. During the experiment, rollers in different states of wear were used.

Hypothesis: The variations in ejected fluid are due to ageing of the rollers with two effects: i) corrosion of the roller surface, increasing the residual film, and/or ii) hardening of the rubber coating.

This means that at least h_f and d_c need to be arrays, while scalars will do for h_e and d_cw. The h_g parameter measures the elusive property of ‘effective width’ of the flow through the roller gap, and is likely to be an array. The purpose of the remaining parameters a_s is to play the role of ‘stubs’ accounting for all other sources of acid ejected through the exit slits, whether that comes from the roller surface, from effects of the shower placement, or from any other conceivable phenomenon. Let therefore also a_s be an array. The number of parameters is now 18.
One more thing needs attention: The roller model introduces a discontinuity in the dependence on d_c. That may (or may not) cause a particular kind of difficulty in the fitting of that parameter. The effect is general and described in Part I (Section 2.7.6, Remark 2.48). The following modification will remove the discontinuity

a_oi = h_fi u_w + h_ei u_t + h_gi smooth(u_t − δ_i | c_g) + a_si    (6.23)

where

smooth(z|α) = z/2 + √(z²/4 + α²)    (6.24)
and α is the degree of ‘smoothing’.

Remark 6.2. In this case the fitting would work even without the smoothing, except for the stopping rule: the user would have to stop the search by overruling it, which causes no other problem than the slight inconvenience.

Remark 6.3. The choice of notation for parameters is a for area, h for film thickness, q for volume flow, and d for dimensionless parameters.
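The behaviour of the smoothing in Equation 6.24 can be sketched in a few lines of plain Python (illustration only, not MoCaVa code):

```python
import math

# smooth(z|alpha) from Equation 6.24: a differentiable replacement for max(z, 0).
def smooth(z, alpha):
    return z / 2 + math.sqrt(z * z / 4 + alpha * alpha)

# Far from z = 0 the function approaches max(z, 0); at z = 0 it equals alpha.
for z in (-1.0, 0.0, 1.0):
    print(z, smooth(z, 0.01), max(z, 0.0))
```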
Entering Function Statements

Rewriting Equations 6.21 to 6.24 into M−statements yields the code to be entered into the Component function window. (Notice that it will be necessary to write (ū_w/u_w)^(2/3) as exp(log(rvw/uw)/1.5), since (rvw/uw)^(2/3) is not supported by MoCaVa3.)

% Ejected acid jet cross−section areas:
for i = 1:4
  % Contribution from spray, surface film, and edges:
  z = as(i) + hf(i)*uw + he*ut
  % Impressions into roller surfaces:
  delta = dc(i)*rvt*(exp(log(rvw/uw)/1.5) ...
        + exp(log(rvw/uw + dcw*ut/rvt)/1.5))
  % Roller gap:
  z1 = ut − delta
  z1 = 0.5*z1 + sqrt(cg*cg + 0.25*z1*z1)
  % Total cross−section area of ejected fluid:
  ao(i) = z + hg(i)*z1
end
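The exp/log workaround used in the M-code above can be verified outside MoCaVa; a quick check in plain Python:

```python
import math

# MoCaVa3 does not support the power operator with fractional exponents,
# so x^(2/3) is written exp(log(x)/1.5). For positive x the two agree
# (the ratios rvw/uw etc. in the model are always positive).
def pow23(x):
    return math.exp(math.log(x) / 1.5)

for x in (0.45, 1.0, 2.3):
    print(x, pow23(x), x ** (2 / 3))
```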
Argument classification

The entries in the Argument classification window follow immediately:
: as, hf, he, dc, dcw, hg are Parameter input.
: rvt, rvw, cg are Constant.
Edit the Argument classification window:

Argument         Class
Component output
z                Internal
delta            Internal
z1               Internal
Component input
as               Parameter
hf               Parameter
uw               Control
he               Parameter
ut               Control
dc               Parameter
rvt              Constant
rvw              Constant
dcw              Parameter
cg               Constant
hg               Parameter
I/O Interfaces

The source of uw and ut is User model, which is the already defined Strip component. Edit the I/O interface window:

Argument                      Source
Control input: Source model
uw                            User model
ut                            User model
Argument Attributes

Again, the attributes require more consideration:
: Nominal values of uw and ut are given immediately in Table 6.2.
: Use zero nominal values for the as stub.
: Nominal values of hf and he can be computed from those of ao, provided each assumed effect dominates the flow:

h_f = a_o/ū_w = 0.0000228/1.4 = 0.0000162
h_e = a_o/ū_t = 0.0000228/0.00311 = 0.0073

However, since it is unknown which type of flow dominates, if any, it may be necessary to try both hypotheses. This is not straightforward under the user’s guide in MoCaVa3, but can be done in one of two ways: 1) Assign zero nominal values to all, and set up two alternative hypotheses (that each one of hf or he dominates). If the searches for the two alternative parameter sets converge, in spite of the very unphysical zero start values, the continuation will be straightforward. 2) Assign zero nominal values to all. Then set up for one of the alternatives, for instance hf, and overrule the default zero start value of hf by entering the precalculated value 0.0000162 in the Origin window. When and if the alternative has been accepted as ‘better’ and thus made the next tentative model, set up for free he only, as the next alternative. Then overrule the default start values by entering 0 for hf
and 0.0073 for he in the Origin window. The test result will decide which alternative is better.
: Even if the nominal values cannot be used for origins, they determine the scales of hf and he. Reasonable values are 0.00001 and 0.01.
: Nominal values of the parameters characterizing the ‘cavitation’ effect, namely hg, dc and dcw, must be determined in other ways. It would be conceivable to assign zeroes to all, thus hypothesizing that the effect of cavitation is negligible. However, it is apparent from the equations that with hg = 0 and dc = 0 both parameters will be unidentifiable. Hence, use 1 as nominal value of the scale−free cavitation parameter dc measuring the depth of the impression. Both parameters will then be identifiable, even with hg = 0 as start value.
: Limit dc and dcw to positive values.
: Do not limit the stubs to positive values. As argued before, a possible negative estimate associated with a significant loss reduction would yield the useful information that the model class must be improved.
: For rvw and rvt, any value in the vicinity of the widths and thicknesses of the strip may be used. Use the sample averages.
: A value of cg is more difficult to determine. It should be as large as possible to simplify the search, but not so large as to change the effect of the roller model substantially. A possibility is to relate it to the minimum strip thickness 0.00155 and make it considerably smaller: cg = 0.0001. The constant will have an effect only in the thickness range where the roller gap just closes.

Edit the Argument attributes window:

Argument    Short description              Dim   Scale        Nominal   Min   Max
Parameters
as          AreaSprayRoller_[m2]           4     ScaleArea    0
hf          ThicknessFilm_[m]              4     ScaleFilm    0         0
he          ThicknessEdges_[m]             1     ScaleEdges   0         0
dc          FactorCavitation               4     1            1         0     2
dcw         FactorCavitationStripWeight    1     1            0         0
hg          WidthRollerGapJet_[m]          4     Scalehg      0         0
Control input
uw          StripWidth_[m]                 1     Scaleuw      Nomuw
ut          StripThickness_[m]             1     Scaleut      Nomut
Constants
rvt         RatedValueThickness_[m]        1                  Nomut
rvw         RatedValueWidth_[m]            1                  Nomuw
cg          SmoothingConstant_[m]          1                  0.0001
Nominal Values and Scaling

Enter the following values into the Implicit attributes window:

Attribute    Values
Scaleuw      1
Nomuw        1.4
Scaleut      0.001
Nomut        0.00311
ScaleFilm    0.00001
ScaleEdges   0.001
Scalehg      0.1
6.6.2 The Entry Rolls

Make a separate component Roller0 describing the spring−pressed double rollers that reduce the film of acid on the surfaces of the incoming strip. Place Roller0 immediately before Strip. The model should be simpler than that describing Rollers, since there is no rinsing fluid involved.

Hypothesis: The flow of acid into the rinsing tanks is partly carried with the surface film on the strip, and partly adhering to the edges of the strip.

This suggests the description

a_i = a_is + h_if u_w + h_ie u_t    (6.25)
where a_is, h_if, and h_ie are (scalar) parameters associated with the input acid and with similar meanings as the (array) parameters of the rinsing fluids.

Entering Function Statements

Enter the following M−statements into the Component function window:

% Entry rollers:
ai = ais + hif*uw + hie*ut
F10 = ai*uv
Argument classification

The entries in the Argument classification window follow immediately. Edit the Argument classification window:

Argument         Class
Component output
ai               Internal
Component input
ais              Parameter
hif              Parameter
hie              Parameter
Argument Attributes

Again, the attributes require some consideration:
: Use zero nominal value for ais.
: Scales of hif and hie can be computed from those of F01, provided each assumed effect dominates the flow:

h_if = F_01/(ū_v ū_w) = 0.0134/(6845 × 1.40) = 0.0000014
h_ie = F_01/(ū_v ū_t) = 0.0134/(6845 × 0.00311) = 0.00061

Edit the Argument attributes window:
Figure 6.10. Block diagram of the tentative model class

Argument    Short description              Dim   Scale      Nominal   Min   Max
Parameters
ais         StubInputAcid_[m2]             1     0.000001   0
hif         ThicknessFilmInputAcid_[m]     1     0.000001   0         0
hie         ThicknessEdgesInputAcid_[m]    1     0.001      0         0
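The nominal-value arithmetic used above, for both the rinsing-fluid film/edge thicknesses and the input-acid ones, is a simple back-of-envelope calculation; a sketch in plain Python (not MoCaVa code), assuming each flow mechanism dominates in turn:

```python
# Rated strip values: speed 6845 m/h, width 1.40 m, thickness 0.00311 m.
uv, uw, ut = 6845.0, 1.40, 0.00311

# Rinsing-fluid film/edge thicknesses from the nominal eject area a_o:
ao = 0.0000228
hf = ao / uw   # if the surface film dominates
he = ao / ut   # if the edge film dominates

# Input-acid film/edge thicknesses from the nominal acid feed F01 = 0.0134 m3/h:
F01 = 0.0134
hif = F01 / (uv * uw)
hie = F01 / (uv * ut)

print(hf, he, hif, hie)
```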
Since the Rollers and Roller0 components were placed ‘downstream’ of, and depend on, the Strip component, the latter must be regenerated in order to allow automatic connection of its output to the new target. Select Change for the Strip component and click OK sufficiently many times. This concludes the definition of the Rollers and Roller0 components. Because of the unphysical nominal values, there is no point in simulating the model again. The new model class is shown in Figure 6.10. The placement of the Rollers component inside the Tanks is determined by the classification as Parameter of the terminals it connects to, and thus illustrates its physical location. The Roller0 is outside, since F01 was classified as Feed. The Strip model is connected to inputs classified as Control, and its placement therefore illustrates that the strip properties enter externally and in several places.
6.7 Continuing Calibration

The expansion allows more alternative hypotheses to be compared with the tentative model #3, which is the best within the ‘root’ model class.

Tentative model structure #3

Assume for the first alternative hypothesis that either one of the acid films adhering to the strip surface or to its edges dominates the flow. This may well be different in the rinsing fluid and the input acid, since in the latter case the acid on the surface should conceivably be reduced with particular efficiency by the doubled and spring−pressed rollers, while the acid clinging to the edges passes unhindered by the entry rollers. However, the fact that the effects of (hie,hif) and (he,hf) multiply creates a complication. Since the parameters all have zero nominal values, the SFI rule of freeing only one at a time cannot be used. At least one of hie and hif must be non−zero in all alternatives. This makes four alternative hypotheses with free parameters, namely (hie,he), (hif,he), (hie,hf), (hif,hf). However, the case is not nested (the model classes differ, and the added component has no null values), which creates another complication. All alternatives must have the same number of free parameters as the tentative model structure (five) in order to make a ‘fair’ comparison, and the first two groups have only two. It is obvious that they will have a fatal disadvantage a priori: since they are common to all tanks, they will not be able to describe the observed differences in tank dynamics.

Remark 6.4. It would not be incorrect to set up for the four alternatives, since a test will pass the routine checking the feasibility, as long as it is not unfair to the tentative structure. It would only be a waste of computing; two of the alternatives would not be able to falsify the tentative structure.

Remark 6.5. When the number of parameters becomes large, the default Alternative structures window takes time to respond to button pressing.
Use the Primitive alternative structures window instead (see the appropriate Help section in Section 4.9).

Continue the calibration by freeing each of the two combinations (hif,hf), (hie,hf), thus making two alternative structures. The test results show a loss reduction of 685 for the first and 734 for the second alternative. Select_#2, and Accept the new tentative model with fitted (hie,hf). The normalized loss is 0.383999.

Tentative model structure #4

Next, test whether the other hypothesized flows contribute too. Free each one of hif and he. The loss reductions are 37.6 for free he and −26.8 for free hif. The second result does not mean that a positive hif would not be significant, only that the test routine ALMP failed to find one. However, one falsifying alternative is enough. Select_#1, and Accept the new tentative model with fitted (hie,hf,he). The normalized loss is 0.378838. It is now straightforward to free the remaining parameters in the roller model, one at a time. In order to avoid tedious repetition, the sequel will be reported in more compact form, and only exceptions from the normal course will motivate further comment. The following formalized reports emphasize the roles of the two ‘actors’ in the interactive procedure. The test routine is either ALMP (for Asymptotic Locally Most Powerful
test) or LR(#) (for Likelihood−Ratio test with # number of iterations). LR appears when ALMP is not applicable, or has failed to find a better model in the alternative structure. The default value of # is 4, when ALMP fails. More iterations means that also LR(4) has failed. LR(*) means that LR has searched to the maximum of the Likelihood Ratio. If that fails too, there is no better model in the alternative structure. ML(#) means that the fitting routine has found the Likelihood maximum after # number of iterations. MLA(#) means that the step adaptation option has been activated.

Test routine for tentative model structure #5
Window: Primitive alternative structures
Motivation: Testing the contribution to input acid from surface film.
User decision: Free hif.
Response from LR(*): The loss reduction is 48.3.
Interpretation: The alternative is better.
User decision: Select_#1. Accept structure #6.
Response from ML(1): The normalized loss is 0.376230.

Test routine for tentative model structure #6
Window: Primitive alternative structures
Motivation: Testing the contribution to ejected fluid from roller gap.
User decision: Free hg.
Response from LR(8): The loss reduction is 44.3.
Interpretation: The alternative is better.
User decision: Select_#1. Accept structure #7.
Response from ML(31): The normalized loss is 0.345452.

Test routine for tentative model structure #7
Window: Primitive alternative structures
Motivation: Testing the effect of different elasticity of the roller surface.
User decision: Free dc.
Response from ALMP: The loss reduction is 151.
Interpretation: The large loss reduction indicates a strong dependence on roller elasticity. All parameters have reasonable values. The alternative is better.
User decision: Select_#1. Accept structure #8.
Response from ML(141): The normalized loss is 0.274644.

Remark 6.6. The fitting of Model #8 required many iterations.
An explanation for the difficulty is the very strong nonlinearity introduced by modelling the cavitation effect. When the gap closes, this causes a dramatic change in parameter sensitivity, which is only partly alleviated by the smooth function in Equation 6.23. In such cases of strong nonlinearity in parameter sensitivity, the approximate Hessian used in the search routine fails to be a good estimate of the curvature of the loss function at the minimum, and the step length calculated from it will also fail. (The step−length adaptation option reduced the number of iterations somewhat, but only to 104.)
Remark 6.7. Nonlinearities do not cause approximation errors in the loss values and gradients, which means that the minimum can still be trusted, once it has been found.

Test routine for tentative model structure #8
Window: Primitive alternative structures
Motivation: Testing the significance of strip weight.
User decision: Free dcw.
Response from LR(*): The loss reduction is 0.2.
Interpretation: Strip weight is not significant.
User decision: QuitSearch.

Test routine for tentative model structure #8
Window: Primitive alternative structures
Motivation: Testing the presence of spray from rollers.
User decision: Free as.
Response from LR(4): The loss reduction is 14.6. One of the parameter values is negative.
Interpretation: The tentative model #8 has been falsified, but no better alternative has been found yet.
User decision: QuitSearch.

Freeing Fspray turns out similarly. Since there is no more freedom in the model class that can be used to find a better model, this falsifies also the tentative model class, and calls for a second refinement to create an expanded alternative class.
6.8 Refining the Model Class Again

The session has reached another point where the user must contribute ‘engineering sense’. Since the present alternative model suggests negative ‘spray’ to obtain a better fit to data, this points at the ventilation system as a candidate for absorbing some of the spray.

6.8.1 Ventilation

Click NewClass, select Edit, select Insert, enter Ventilation, and select Change. MoCaVa opens the Component function window to receive M−statements. The following analysis will suggest the statements.

Hypothesis: The ventilation system removes some of the mist of acid at a constant rate.

Entering Function Statements

The ‘stubs’ F_stub,i in the fluid balances in Equations 6.4 and 6.7 provide convenient places to connect the new component. With negative constant values, the terms will be able to describe a deficiency in the balance due to loss by ventilation. Enter the following M−statements:
% Evaporation:
for i = 1:4
  Fstub(i) = −vent(i)
end
Argument classification

Edit the Argument classification window:

Argument         Class
Component input
vent             Parameter
Argument Attributes

Edit the Argument attributes window:

Argument    Short description         Dim   Scale     Nominal   Min   Max
Parameters
vent        VentilationFlow_[m3/h]    4     ScaleFW   0
Testing

Test routine for tentative model structure #8
Window: Primitive alternative structures
Motivation: Testing the effect of ventilation.
User decision: Free vent.
Response from LR(4): The loss reduction is 63 for vent = (0.002, −0.027, −0.32, −0.31) m3/h.
Interpretation: The loss reduction is large. However, it is achieved for parameter values having the wrong sign and also far in excess of what would be expected (Sohlberg, 1993). Hence, model structure #8 has been falsified again, but still no better model has been found. It also remains unanswered whether ventilation has a significant effect.
User decision: QuitSearch, NewClass, deactivate Ventilation.

At this point it is not easy to see how to proceed. The inadmissible parameters appear in the formulas

F_ji = a_oi u_v + F_spray,i    (6.15)
a_oi = h_fi u_w + h_ei u_t + h_gi max(u_t − δ_i, 0) + a_si    (6.17)

The significantly negative values obtained for a_si suggest that the total fluid F_ji ejected into the next tank increases more than linearly with either the strip width or thickness. And the width seems to be the most likely candidate. The problem is to find a reasonable physical explanation for this.
Remark 6.8. Attempts to model the nonlinearity in various ‘black−box’ ways, i.e., assigning heuristic formulas containing unknown parameters without physical interpretation, also gave significant improvements in loss. This confirms the decision to reject, but still does not explain anything.
6.9 More Hypothetical Improvements

Another hypothesis that may be suspected is that of perfect mixing and equal acid concentrations on the strip and in the recipient. The problem is obviously what to replace it with. It would be conceivable to partition the rinsing system in each tank into, for instance, three fictitious ‘compartments’, namely one for the mixing on the strip, one near the surface of the recipient, and one in the rest of the recipient. There would also be a physical motive for creating a fourth compartment for possible ‘backwater’ regions far from the bottom pump. This would have some disadvantages, however, mainly the expected sluggishness of computing:
: Each added compartment would require an additional state variable, and there are five tanks. A large number of states is costly to IdKit, in particular in combination with the stochastic disturbances that are also reasonable to try.
: The equivalent volumes of compartments would have to differ greatly, and so would the flows between compartments. This would result in very different time constants. In particular, the rinsing flow is approximately 100 m3/h, as compared with a flow of approximately 0.1 m3/h through the tank. One would easily create a set of differential equations that are ‘stiff’. Even if IdKit allows stiff differential equations, it would mean more computing.

A way out of the dilemma of heavy computing is to use some heuristics. That may be a break with the philosophy of sticking, as far as possible, to physically explainable phenomena in the modelling. However, it is still possible to exploit some prior information and intuition, in good agreement with the philosophy of grey−box identification:
: When the inlet is placed far from the outlet, incomplete mixing generally tends to slow down the overall process response. Hence, it would be reasonable simply to reduce the mixing speed by a constant factor η ∈ (0, 1). This would create an ‘effective’ mixing volume larger than the actual one by the factor 1/η.
: It is also easy to envisage an acid gradient between the top and the bottom of the recipient. That is particularly motivated by the fact that the acid concentration is measured in the pipeline pumping acid from the bottom of the tank, while the interchange of acid between tanks enters and exits at the top (Figure 6.1).

6.9.1 Effective Mixing Volumes

Click NewClass, select Edit, select Insert, enter EffectiveVolumes, and select Change. MoCaVa opens the Component function window to receive M−statements.

Entering Function Statements

Enter the following M−statements:
% Effective volumes:
for i = 1:5
  V(i) = V0(i)/eta
end
Argument classification

Edit the Argument classification window:

Argument         Class
Component input
V0               Constant
eta              Parameter
Argument Attributes

Edit the Argument attributes window:

Argument    Short description    Dim   Scale   Nominal   Min   Max
Parameters
eta         MixingEfficiency     1     1       1         0
Constants
V0          TankVolumes_[m3]     5     1       V
Testing

Test routine for tentative model structure #8
Window: Primitive alternative structures
Motivation: Testing the effect of incomplete mixing.
User decision: Free eta.
Response from ALMP: The loss reduction is 298 for eta = 0.785.
Interpretation: The loss reduction is large. The alternative is better.
User decision: Select_#1. Accept model structure #9.
Response from MLA(18): The normalized loss is 0.249665.

Remark 6.9. The ML search (without adaptation) will also work, but this requires a correct setting of the step−length reduction factor. The number of iterations will be (65, 33, 17, >256) for factors (0.1, 0.2, 0.5, 1). The adaptation eliminates the need to experiment with the factor when, as in this case, the default value of 1 will not work.

Test routine for tentative model structure #9
Window: Primitive alternative structures
Motivation: Testing again the effect of strip weight on roller gap.
User decision: Free dcw.
Response from LR(*): The loss reduction is 213.
Interpretation: The strip weight is now significant.
User decision: Select_#1. Accept model structure #10.
Response from MLA(1): The normalized loss is 0.242170.

The best model so far is defined in Table 6.3.

Table 6.3. Components and parameters defining the deterministic model

Active components: Tanks, EffectiveVolumes, Rollers, Roller0, Strip

Parameter   Short description
c_init      InitialAcidConc_[kg/m3]
rms_C1      StDError_C1_[mS/m]
rms_C2      StDError_C2_[mS/m]
rms_C3      StDError_C3_[mS/m]
rms_C4      StDError_C4_[mS/m]
rms_C5      StDError_C5_[mS/m]
eta         MixingEfficiency
hf          ThicknessFilm_[m]
he          ThicknessEdges_[m]
dc          FactorCavitation
dcw         FactorCavitationStripWeight
hg          WidthRollerGapJet_[m]
hif         ThicknessFilmInputAcid_[m]
hie         ThicknessEdgesInputAcid_[m]

Values: 10.94 0.356 1232 102 13.4 1.99 0.262 0.620 6.82×10−6 5.87×10−6 0.321×10−6 1.67 0.61 0.165 0.00029 0.254 0.277×10−6 0.000433 0.000062 0.00117 0.000440 8.42×10−6 14.01×10−6 0.53 0.54 0.097 0.044
6.9.2 Avoiding the Pitfall of 'Data Description'

The number of parameters fitted to the data is now 22, not counting the five rms-values. All have come out statistically significant, although there has been little visual improvement at the end. This suggests that further attempts in this direction might only add to the number of 'significant' parameters and still explain nothing about the source of the data. In other words: the session may have come close to the pitfall of 'data-fitting', where some of the many free parameters are used by the search routine to adjust the modelling errors to variations in the data that have quite different origins than those that have been modelled. Figure 6.11 reveals at least two such places:
- The model is quite unable to describe the variations of the conductivity in tank #1. Apparently, the flow model of the acid input (Equation 6.25) is not adequate, or its concentration is not constant. The conductivity in tank #2 also has difficulties agreeing with data, but that is possibly a consequence of the differences in tank #1.
- There are some large and apparently transient phenomena, particularly in tank #5 around 125 and 225 hours, some of which spill over into tank #4. Each is associated with a stand-still lasting several hours. Figures 6.12 and 6.13
Figure 6.11. Responses of the best deterministic model
Figure 6.12. Transient errors at restart after long halt
show the restarts on the two occasions, and reveal what happens: the model fails to describe the rapid change of conductivity in the last tank after restart.
Figure 6.13. Transient errors at restart after long halt
Since there is neither any data on the incoming acid, nor any prior knowledge about its source (the "pickling" process), it remains to accept that the source is indeed unknown, and to use stochastic models to describe the variation. Modelling the restart from temporary halts would require more information on what happens during the halts and immediately after, and possibly also on what caused the stopping in the first place (why is the conductivity in tank #3 in Figure 6.13 changing when the process is standing still?). Since the information is not available, one must refrain from explaining the transients as caused by other known variables, and regard them as random. The point of this is to prevent the errors at the halts from propagating into other segments of the sample.

Remark 6.10. Obviously, one could choose first to try to get some data or physical knowledge about the pickling process, or even give up on tank #1, and decide that the modelling starts with tank #2. However, that is just a question of where to delimit the object of the modelling. Even if the pickling process were included in the modelling, it would still have some unrecorded and non-constant input that has to be described somehow.

Remark 6.11. Admittedly, there are still a number of other conceivable shortcomings in the model: The flow dynamics have been assumed fast, which means for instance that stepwise changes in the input flow to tank #5 would instantaneously affect the level in tank #1, which is obviously not true in the real process. However, simulations have indicated that within the rather long sampling interval of 0.2 h the effect of flow changes has propagated at least to tank #2 (Sohlberg, 1990). Since the level of tank #1 is controlled by a level regulator (and also affects the system dynamics very little), it does not seem worthwhile a priori to take the effort of modelling the flow dynamics better at this point. It would also cost five more state variables (the levels).
Remark 6.12. The current deterministic model can easily be falsified, unconditionally of an explicit alternative. A correlation test of the residuals would do it (Section 4.9.1). Simpler still: a look at the residual sequences immediately reveals that they are not uncorrelated in time. Hence, they cannot be independent measurement errors, as was implicitly hypothesized by trying a deterministic model. However, the same result would probably be obtained from any reasonably complex deterministic model for this case. Hence, in order to stop the process of developing progressively more complex models, one will sooner or later have to resort to some principle that allows for modelling error (in addition to measurement error), in order to decide when it is enough. Stochastic modelling provides such an error concept.
6.10 Modelling Disturbances

The concentration of the hydrochloric acid from the pickling process depends on a number of factors that have not been recorded in the data. It is reasonable to suspect that it is not constant.

6.10.1 Pickling

Click NewClass, select Edit, select Insert, enter Pickling, and select Change. The following analysis will suggest the form of the model.

Hypothesis: The input acid concentration from the pickling bath varies randomly and slowly, and well within the range (0, 1000) kg/m3. The boundaries can be modelled by the following formula:

c0 = 500 [1 + (v − 0.75)/√(1 + (v − 0.75)²)]    (6.26)

where v is an unbounded stochastic variable with zero mean and unit standard deviation. A zero value of v yields the nominal value of c0 = 200.

Remark 6.13. The same general formula is automatically used in IdKit to limit parameters in the search. However, it has not been implemented for stochastic variables. There is no guarantee that the estimated values of c0 will stay within the limits. Under extreme conditions the linearization performed by the EKF in IdKit may violate the boundaries.

Entering Function Statements

Enter the following M-statements into the Component function window:

% Bounded acid concentration from the pickling bath:
beta = v0 - 0.75
cpb = 500*(1 + beta/sqrt(1 + beta*beta))
c0 = cpb
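The bounding property of Equation 6.26 is easy to verify numerically; the following is a minimal sketch in Python (not part of MoCaVa, names illustrative):

```python
import math

def bounded_conc(v):
    """Map an unbounded disturbance v to a concentration in (0, 1000) kg/m3,
    following Equation 6.26: c0 = 500*(1 + beta/sqrt(1 + beta^2)), beta = v - 0.75."""
    beta = v - 0.75
    return 500.0 * (1.0 + beta / math.sqrt(1.0 + beta * beta))

print(bounded_conc(0.0))   # 200.0, the nominal value quoted in the text
# Strictly inside (0, 1000) for any input:
print(all(0.0 < bounded_conc(float(v)) < 1000.0 for v in range(-1000, 1000)))  # True
```

The saturation is smooth, so the EKF linearization mentioned in Remark 6.13 sees a well-defined slope everywhere; the boundaries are only approached asymptotically as |v| grows.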
Argument Classification

Make v0 randomly variable by classifying it Disturbance. Classifying cpb as Response makes it possible to plot the estimated acid concentration. Edit the Argument classification window:

Argument           Class
Component output
  beta             Internal
  cpb              Response
Component input
  v0               Disturbance
Source Models

Select the Lowpass model, since this allows estimation of the unknown bandwidth as well as the rms-value. Unlike the simpler Brownian model, it makes it possible to limit the variance, which is important to ensure that linearization of the nonlinear function in the model will be possible. Edit the I/O Interface window:

Argument           Source
Connections to sensors
  cpb              NoSensor
  v0               NoSensor
Unknown input: Environment model
  v0               Lowpass
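A Lowpass source model is, in essence, first-order filtered ('coloured') white noise characterized by an rms-value and a bandwidth. The sketch below shows merely this generic idea with an assumed discretization; the pole mapping a = exp(−2π·bw·dt) and all names are illustrative assumptions, not MoCaVa's actual implementation:

```python
import math
import random

def lowpass_noise(n, rms, bw, dt=0.2, seed=0):
    """Generate a first-order low-pass noise sequence with a prescribed
    stationary rms value and bandwidth bw (in 1/h), sampled every dt hours.
    Discrete-time AR(1) approximation: x[k+1] = a*x[k] + e[k]."""
    rng = random.Random(seed)
    a = math.exp(-2.0 * math.pi * bw * dt)   # pole derived from the bandwidth
    sigma_e = rms * math.sqrt(1.0 - a * a)   # keeps the stationary rms fixed
    x, out = 0.0, []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, sigma_e)
        out.append(x)
    return out

# Slowly varying disturbance: bandwidth well below the Nyquist frequency 2.5 1/h.
v0 = lowpass_noise(10000, rms=0.5, bw=0.093)
```

Because the driving variance is scaled by (1 − a²), the rms and bandwidth parameters can be estimated independently, which is exactly what motivates the choice of the Lowpass model over the Brownian one.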
Argument Attributes

The disturbance has two characterizing parameters. Use default attributes, but maximize rms_v0 to stay approximately within the linear range. The Nyquist frequency is 0.5/time quantum = 2.5 1/h, which is also the maximum bandwidth. Edit the Argument attributes window.

Argument   Short description            Dim  Scale  Nominal  Min  Max
Parameters
  rms_v0   Rms_v0                       1    1*1    0        0    0.5
  bw_v0    BandWidth_v0                 1    0.5    0.5      0    2.5
Disturbance input
  v0       AcidInputNoise               1    1
Process output
  cpb      ConcPicklingBath_[kg/m3]     1    100
States
  x_v0     StateAcidInputNoise          1    1
6.10.2 State Noise

Figure 6.11 reveals that there is also unmodelled actual acid flow that adds to the modelled rate of change of the acid concentration in each tank. The simplest hypothesis is that the error flows are random and of high frequency. Call them "spray" to emphasize their unpredictability, and model them like the other internal disturbances.
Internal disturbances can be entered conveniently by connecting to some of the 'stubs' in the Tanks model, either to Fstub, cstub, qstub, or Dcstub, depending on whether one would expect the unmodelled disturbance to affect liquid flow, concentration, acid flow, or the concentration derivatives. However, the first three hypotheses have now lost their appeal, since this would mean accepting the 'unphysical' negative values that have previously been rejected. In addition, the analysis in Section 6.4.1 revealed that only Dcstub will be effective enough to describe the fast responses to restart after a halt. One will simply have to accept that there are phenomena that are too rapid to be explained by a mixing model. Adding high-frequency 'noise' to the state derivatives is a conventional way out of the dilemma.

Click NewClass, select Edit, select Insert, enter ConcDisturbance, and select Change.

Entering Function Statements

The state disturbances are not necessarily positive, and it is difficult to find any other boundaries. Assume therefore unlimited disturbances. Enter into the Component function window:

% Unbounded state disturbances:
for i = 1:5
  Dcstub(i) = v(i)
end
Argument Classification

Edit the Argument classification window:

Argument           Class
Component input
  v                Disturbance

I/O Interfaces

Select the Lowpass model, since this allows estimation of the unknown bandwidth as well as the rms-value. Edit the I/O interface window:

Argument           Source
Connections to sensors
  v                NoSensor
Unknown input: Environment model
  v                Lowpass
Argument Attributes

Edit the Argument attributes window.

Argument   Short description        Dim  Scale     Nominal  Min  Max
Parameters
  rms_v    Rms_v                    5    Scalev*1  0        0
  bw_v     BandWidth_v              5    0.5       0.5      0    2.5
Disturbance input
  v        ConcNoise_[kg/m3/h]      5    Scalev
States
  x_v      StateConcNoise           5

Nominal Values and Scaling

The scale of v is related to that of c; divide by the sampling interval: Scalev = Scalec/0.2. Enter the following values into the Implicit attributes window:

Attribute   Values
Scalev      5   0.5   0.05   0.005   0.0005
Remark 6.14. The six new states introduced by the disturbance models make the calibration run considerably slower. However, activating the options for speeding up the processing will not help in this case. The transfer matrices vary too fast and are too dense for the speed optimization to be effective. An obvious cause is the frequent speed changes.
6.11 Determining the Simplest Environment Model

Introducing stochastics into the model means accepting the fact that some phenomena in the process behaviour cannot be modelled by mathematical equations, at least not without unreasonable complexity. Whether this is a serious shortcoming or not depends on how one intends to use the model. It might seem that if one wants to use the model for calculating the process's responses to various known stimuli, then one needs a deterministic model (stochastic input would obviously be unknown), and one should therefore stick to deterministic modelling. However, that conclusion runs contrary to the conjecture that deterministic model structures, without the 'padding' of internal disturbance models, will make fitted parameters more susceptible to disturbances. The conjecture is supported by a study using the same data (Bohlin, 1991b), and also by the DrumBoiler example in Chapter 4.

6.11.1 Variable Input Acid Concentration

In view of the fact that only the product of concentration and volume flow of the input acid affects the steady-state acid balance of the rinsing process, it may be expected a priori that it would be difficult to estimate both the random input acid concentration and the state noise in tank #1. Start therefore by activating only the Pickling component. Make Pickling Active and ConcDisturbance Dormant.
Test routine for tentative model structure #10
Window: Primitive alternative structures
Motivation: Testing the effect of variable input concentration.
User decision: Free rms_v0.
Response from ALMP: The loss reduction is 930.
Interpretation: The alternative is better.
User decision: Select_#1. Accept model structure #11.
Response from MLA(10): The normalized loss is 0.146867.

Test routine for tentative model structure #11
Window: Primitive alternative structures
Motivation: Testing the effect of optimal input concentration bandwidth.
User decision: Free bw_v0.
Response from ALMP: The loss reduction is 31.2.
Interpretation: The loss reduction is significant. The alternative is better.
User decision: Select_#1. Accept model structure #12.
Response from MLA(4): The normalized loss is 0.146070.

Figure 6.14 shows the simulation result in tank #1. The only obvious prediction errors are at the two long halts.

6.11.2 Unexplained Variation in Acid Concentration

Click NewClass, and make ConcDisturbance Active.

Test routine for tentative model structure #12
Window: Primitive alternative structures
Motivation: Testing the effect of concentration state noise.
User decision: Use the SFI rule (click <) to free each one of the entries in rms_v.
Response from ALMP: The maximum loss reduction is 1556 for free rms_v(5).
Interpretation: Not unexpected, since the largest transient errors appeared in tank #5. The loss reduction is significant. The alternative is better.
User decision: Select_#5. Accept model structure #13.
Response from MLA(24): The normalized loss is 0.090561.

Figure 6.15 shows that the introduction of two disturbance variables is effective. However, there are still obvious errors in tanks #2 - #4.

Test routine for tentative model structure #13
Window: Primitive alternative structures
Motivation: Testing the effect of concentration state noise in more tanks.
User decision: Free rms_v(1),...,rms_v(4) simultaneously.
Response from LR(4): The loss reduction is 7044.
Figure 6.14. Result of modelling input acid flow
Interpretation: The loss reduction is significant. The alternative is better.
User decision: Select_#1. Accept model structure #14.
Response from MLA(11): The normalized loss is 0.031156.
Figure 6.15. Predicted and measured conductivities
Test routine for tentative model structure #14
Window: Primitive alternative structures
Motivation: Testing the effect of bandwidth.
User decision: Free bw_v(1),...,bw_v(5).
Response from ALMP: The loss reduction is 49.0.
Interpretation: The loss reduction is significant. The alternative is better.
User decision: Select_#1. Accept model structure #15.
Response from MLA(7): The normalized loss is 0.030696.

Figure 6.16 shows the result of simulating the model. The prediction of all conductivities appears almost perfect. However, this is mainly a consequence of the fact that the concentration data are dominated by low frequencies. With a stochastic model it is easy to predict any low-frequency data sequence well one sampling interval ahead. The plots of prediction errors are more revealing. The residuals no longer contain any obvious low frequencies that could be used to improve the predicting ability of the model. Neither do the sizes and numbers of 'outliers' seem alarming. Figure 6.17 shows the same section as in Figure 6.12 after state disturbances have been added. The transient errors in tank #5 no longer propagate into tank #4.

The set of model equations offers no more alternatives. In principle, the modelling can go on forever by expanding alternative model classes. In order to stop the procedure, one would have to take into account the purpose of applying the model, and validate the model with respect to that purpose. This step is supported by the Validation shell, but will not be reviewed here. However, there are some other options for checking the final model.
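The point that any low-frequency sequence is easy to predict one step ahead can be illustrated with the crudest stochastic predictor of all, persistence (predict the previous value); the data below are synthetic and purely illustrative:

```python
import math

# A slowly varying 'conductivity-like' sequence sampled every 0.2 h,
# with a period of 50 h (two full periods over 500 samples):
y = [10.0 + 3.0 * math.sin(2.0 * math.pi * k * 0.2 / 50.0) for k in range(500)]

# One-step-ahead persistence prediction: y_hat[k+1] = y[k].
errors = [y[k + 1] - y[k] for k in range(len(y) - 1)]

rms_err = math.sqrt(sum(e * e for e in errors) / len(errors))
mean = sum(y) / len(y)
rms_sig = math.sqrt(sum((v - mean) ** 2 for v in y) / len(y))
print(rms_err / rms_sig)   # a few percent: 'almost perfect' one-step prediction
```

Even this trivial predictor achieves an error of only a few percent of the signal variation, which is why the prediction-error plots, not the predicted trajectories, carry the diagnostic information.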
6.11.3 Checking for Possible Over−fitting
When fitting parameters in an insufficiently modelled deterministic structure (such as structure #10), there is generally a risk that part of the estimated values are spurious, fitted to compensate for real phenomena in the data that have not been modelled. The purpose of the stochastic disturbances introduced into an otherwise deterministic model structure is to account for the unmodelled parts. This means that the question of whether the introduction of disturbance models has made some of the previously significant parameters superfluous still has to be answered. If so, their values will have low credibility; they may be regarded as part of a 'data description'. A way to answer the question is to reduce the tentative structure, and appraise the increase in loss.

In the Primitive alternative structures window click > on the rows of all parameters to test for over-fitting. MoCaVa starts a long procedure of systematically locking each parameter entry to its nominal value (usually zero), fitting the reduced structure, and computing the increase of loss. The result is shown in Table 6.4. The interpretation of the values in the table is that a parameter is either significant (if the loss increase is above a threshold), or else possibly insignificant. Among the latter, the one with the smallest loss increase is insignificant. Since the outcome is not conclusive for all parameters, the reduction must be done in several rounds, although with a decreasing number of alternatives.
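The significance threshold mentioned above can be illustrated with a generic likelihood-ratio test, in which twice the loss difference (in negative log-likelihood units) is compared with a chi-square quantile. Whether MoCaVa uses exactly this criterion is not stated here, so the sketch below is generic; the critical values are the standard 5% chi-square quantiles:

```python
# Generic likelihood-ratio significance check: twice the loss reduction
# (difference in negative log-likelihood) is asymptotically chi-square
# distributed with df = number of freed parameters.
CHI2_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def is_significant(loss_reduction, n_freed):
    """True when freeing n_freed parameters reduces the loss significantly."""
    return 2.0 * loss_reduction > CHI2_5PCT[n_freed]

print(is_significant(298.0, 1))   # True  - e.g. freeing eta in structure #8
print(is_significant(0.1, 1))     # False - e.g. locking dcw costs nothing
```

On this scale, loss changes of a few hundred are overwhelmingly significant, while the near-zero increases in Table 6.4 mark parameters that can safely be locked to their nominal values.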
Figure 6.16. Conductivities and prediction errors
Table 6.4. Loss increases due to reduced parameter freedom

Parameter    Loss increase
he           0.0
dcw          0.1
hg           187
hif          12.2
hie          0.1
…            346
…            156
…            65
The first round of the reduction normally involves much calculation, in particular for those alternatives that end up as significant, and they are normally the majority. The reason is that the search for an alternative model within the reduced structure must continue until the best parameters within the alternative structure have been found. (This is in contrast to the testing with expanding alternatives, where it is enough to find a falsifying alternative.) If a parameter is significant, the search starts with values that are far from the optimum. However, the next rounds will be fast, due to the following circumstances:
- The parameters that turn out significant need not be tested again.
- The possibly insignificant ones must be tested again, but when they turn out possibly insignificant again (which is likely), the computing will not take long.
The procedure yields the sequence of results shown in Table 6.5.
Figure 6.17. A segment of the sample showing restart after a long halt
Table 6.5. Loss increases due to reduced parameter freedom

Round   Parameter   Loss increase
#2      dcw         0.0
        hie         0.0
#3      dcw         0.0
Table 6.6. Components and parameters defining the deterministic model with stochastic disturbances

Active components: Tanks, EffectiveVolumes, Rollers, Roller0, Strip, Pickling, ConcDisturbance

Parameter  Short description              Values
c_init     InitialAcidConc_[kg/m3]        7.35
rms_C1     StDError_C1_[mS/m]             46.2
rms_C2     StDError_C2_[mS/m]             6.97
rms_C3     StDError_C3_[mS/m]             1.18
rms_C4     StDError_C4_[mS/m]             0.204
rms_C5     StDError_C5_[mS/m]             0.0122
eta        MixingEfficiency               0.491
hf         ThicknessFilm_[m]              12.7·10−6  3.32·10−6  0.264·10−6  17.6·10−6
he         ThicknessEdges_[m]             0
dc         FactorCavitation               0.77  0.55  0.58  0.58
dcw        FactorCavitationStripWeight    0
hg         WidthRollerGapJet_[m]          0.048
hif        ThicknessFilmInputAcid_[m]     1.38·10−6
hie        ThicknessEdgesInputAcid_[m]    0
rms_v0     Rms_v0                         0.498
bw_v0      BandWidth_v0                   0.093
rms_v      Rms_v                          0.228  0.0580  0.00936  0.00208  0.000180
bw_v       BandWidth_v                    0.367  0.088  0.528  2.40  2.30
…          …                              0.155  0.0227  0.00265  0.000158
…          …                              0.84  0.144  0.095  0.248
Accept model structure #16 with zero nominal values for he, dcw, and hie.
Response from MLA(1): The normalized loss is 0.030988.

Test routine for tentative model structure #16
Window: Primitive alternative structures
Motivation: Testing the final model.
User decision: Free Fspray, as, ais simultaneously.
Response from ALMP: The loss reduction is 22.0, but again with inadmissible parameter values. The risk value is 0.0000014.
Interpretation: The loss reduction is significant. The model is falsified, but no alternative is better.
User decision: QuitSearch.

The best stochastic model so far is defined in Table 6.6.

Remark 6.15. The bandwidths of the state disturbances v(4) and v(5) are close to the Nyquist frequency = 0.5/0.2 = 2.5, which indicates that for all practical purposes they could have been replaced by 'white' sequences. Using a 'white noise' disturbance model would not cost extra states, and would thus speed up computing. However, MoCaVa does not allow 'white noise' disturbances (or a discrete-time approximation of it). The main reason is transparency: White noise into nonlinear structures may yield intuitively unexpected responses (Graebe, 1990b). Figure 6.18 shows a block diagram of the final model class.

6.11.4 Appraising the Roller Conditions

The well-established result that the push-back rollers vary in their efficiency to prevent acid from being ejected into the next tank suggests that this information be used to monitor the rollers. The estimation of the status of the rollers has been used to devise a new strategy for maintenance of the rollers (Sohlberg, 1993b). It is therefore interesting to compare the estimates of hf (film thickness) and dc (cavitation factor) when fitted with and without disturbance models. From Tables 6.3 and 6.6:
- Without: hf = (6.8, 5.9, 8.4, 14.0) µm, dc = (1.67, 0.61, 0.53, 0.54)
- With: hf = (12.7, 3.3, 0.26, 17.6) µm, dc = (0.77, 0.55, 0.58, 0.58)
The values differ. However, the results indicate that rollers #1 and #4 have worn surfaces (since they let much more acid through) and are in need of reconditioning. The latter is achieved by grinding the surface to remove a thin layer of the rubber.
Using the values of dc to assess the elasticity of the rubber would also be feasible. A low value would indicate that the rubber has hardened or become thin from frequent reconditioning. However, the values are not yet alarmingly low.
6.12 Conclusions from the Calibration Session

The following may be concluded from the calibration session:
- The tanks have different dynamic properties.
- The rollers have individual surface properties. They are identifiable, and can be used to monitor the roller conditions.
- 'Cavitation' is significant and the impressions are individual to the rollers.
- Much of the acid transferred between tanks is ejected beside the strip through the slot, and not, as first expected, adhering to the strip surfaces.
- The mixing in a recipient is not perfect, causing a significant difference between the acid concentrations at the bottom and on the surface.
- No effect of ventilation can be verified.
Figure 6.18. Block diagram of the final model class
- The input acid concentration varies significantly.
- There are significant unmodelled phenomena in the models of the rinsing flows between tanks.
- Long halts cause unmodelled transient phenomena after restart.
- The model of tank #1 is uncertain. In order to improve it, the concentration of input acid must either be measured, or its variation described by a model of the pickling bath.
7 Quality Prediction in a Cardboard Making Process
7.1 Background The case is based on a preliminary study (Bohlin, 1996) with the purpose of investigating whether grey−box identification would eliminate some of the problems with an earlier attempt to model the variations in one of the main quality variables of cardboard, the “bending stiffness”, at the Frövi plant in Sweden. Although a semi−physical regression model predicted well (Gutman and Nilsson, 1996, 1998), the parameter estimates varied much with time. The main study was later carried out by Pettersson (1998), who also designed the on−line predictor running at the plant. All studies were sponsored by the Swedish national NUTEK program for promoting the use of modern methods for industrial control problems. A large amount of experiment data was available from the Gutman−Nilsson study, recorded during normal production of cardboard of a representative series of qualities. The purpose of the model building was two−fold: i) for long−range prediction of the bending stiffness based on preset control variables to be used for preparing grade changes, and ii) for design of model−predictive feedback control from online measurements of cardboard thickness and laboratory measurements of bending stiffness.
7.2 Step 1: A Phenomenological Description This first step in the model making procedure means briefly identifying what physical units and phenomena that mainly contribute to making the cardboard manufacturing process do its task, and the circumstances under which the experiment data were produced. The step serves to delimit the object of the modelling, and to focus the modelling on the a priori most important phenomena. This normally requires frequent consultations with an engineer responsible for the process. Remark 7.1. I recommend a team of two to be responsible in cases of grey−box modelling. An engineer generally skilled in grey−box modelling cannot also be expected always to make a well balanced prior judgement of what details that can and cannot be eliminated a priori. A cardboard plant is a huge collection of machinery, and far too complex for modelling without a drastic reduction of its complexity already from the beginning. Instead of starting with the process and trying to describe how it operates, it is more prudent to start by considering the basic ideas behind its operation, which are usually much simpler. And a process engineer (or possibly a handbook) is a good source of such information. What makes the real process complex is usually
all the secondary support and control units that make the process operate as intended, or serve to trim its performance to maximum. Only if the secondary units do not do their jobs will it be necessary to take them into account. The following builds on the report of Gutman and Nilsson (1996) and further consultation with Bengt Nilson at Frövi. The cardboard at the Frövi plant is manufactured by joining four layers of board from four separate systems of headbox and wire, each fed from a separate pulping process, except the two middle layers, which are fed from the same pulp. The four head boxes eject emulsions of wood fibers and chemicals onto the four running wire nets. Most of the water is drained in the process, leaving damp fiber mats on the wires. The joining takes place on the elongated wire of the bottom layer, as depicted in Figure 7.1. In certain qualities a layer of coating is added on the top. The figure describes the process in the detail that is assumed relevant for the modelling of bending stiffness. Processes not included are the many water flows in the system, the consistency control, and the drying.
Figure 7.1. Illustrating the sub-processes in the manufacturing of multi-layer cardboard that are assumed to affect the bending stiffness index BSI
The pulp in each layer is a blend of six kinds of 'raw' pulp: pine, birch, ctmp, B75, reject, and B70. Before the blending the pulp has been treated in refiners, which has affected the properties in a way that may or may not be advantageous to the bending stiffness. Even if one may assume that refining is not done unnecessarily and to an adverse effect, the latter may still be the effect of an attempt to improve other quality variables than the bending stiffness. The joining of the layers certainly affects the bending stiffness, and the subsequent processing in the paper machine, in particular pressing, drying, and coating, may or may not change cardboard properties significantly. Speed differences along the machine may do the same. The main control variables in the model are the flows of raw pulp (illustrated by six pumps), and the specific refining energies of some of the pulp qualities, altogether nine possible controls. The flows extracted from the pulping system and fed to the head boxes are not free, but are used to control the basis weight, as illustrated by the information fed back from the basis weight measurements. It is assumed that the control is good enough to render the difference between actual and specified basis weight negligible in the stiffness model. The recording in Figure 7.3 provides some support to that assumption. It is also assumed, although not supported by recordings, that pulp flows and consistencies are regulated well enough.
7.3 Data Preparation

Before proceeding further, it is worthwhile to take a look at the experiment data, since that has bearings on the modelling. Generally, raw data from data acquisition systems not particularly designed for process identification must be preprocessed. In the present case the preparation of suitable data files is done in two steps, first using general functions for data processing in MATLAB and then special functions in MoCaVa.

Remark 7.2. A reader who would want to use MoCaVa to repeat the model design procedure described below may skip the somewhat tedious data preparation steps, since the results are available in the directory Examples\CardBoard\. In particular the first manual step depends very much on the particular plant, and has been included for completeness, and also to illuminate that data preparation may require serious consideration.

The experiment data used for the calibration and validation have been recorded during normal operation, covering a representative set of product specifications. Sixteen of the variables have been measured on line with a sampling interval of 12 minutes, and two more, BSI and the kappa-number of the B70 pulp, have been measured in a laboratory more sparsely and at irregular intervals. Raw data were available from two loggings of about one and five months' length and with about four months in between, viz. during 940829 - 940925 (files out0.txt - out3.txt) and during 950209 - 950703 (files out4.txt - out21.txt). The files out0.txt - out10.txt contain records of 25 variables, and files out11.txt - out21.txt two variables more. The first task is to create two samples to be used for calibration and validation.

This is not straightforward, however:
- There is a four months gap between samples out3.txt and out4.txt.
- The numbers of recorded variables differ between out10.txt and out11.txt.
- There are more variables recorded than needed for the identification purpose, and the positions of some variables differ in the files.
- There are outliers.

These and similar circumstances are certainly not uncommon in practice, and would preferably require a more sophisticated data preparation support than is available in the Predat program of MoCaVa. Instead the following will be a somewhat tedious and detailed account of how the raw data were molded into samples suitable for processing by MoCaVa3. First, some MATLAB statements were applied to merge the raw data samples:
This is not straightforward however: : There is a four months gap between samples out3.txt and out4.txt : The numbers of recorded variables differ between out10.txt and out11.txt : There are more variables recorded than needed for the identification purpose, and the positions of some variables differ in the files. : There are outliers. These and similar circumstances are certainly not uncommon in practice, and would preferably require a more sophisticated data preparation support than available in the Predat program of MoCaVa. Instead the following will be a somewhat tedious and detailed account of how the raw data were molded into samples suitable for processing by MoCaVa3. First, some MATLABX statements were applied to merge the raw data samples: out0 = load(’..\MoCaVa3\Examples\CardBoard\raw\out0.txt’);
........................................
out21 = load('..\MoCaVa3\Examples\CardBoard\raw\out21.txt');
out0010 = cat(1,out1,out2,out3,out4,out5,out6,out7,...
    out8,out9,out10);
out1121 = cat(1,out11,out12,out13,out14,out15,out16,...
    out17,out18,out19,out20,out21);
save '..\MoCaVa3\Examples\CardBoard\out0010.dat' out0010 ...
    -ascii
save '..\MoCaVa3\Examples\CardBoard\out1121.dat' out1121 ...
    -ascii
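For readers without MATLAB, the same row-wise catenation can be sketched in Python; the file names below are illustrative, and the whitespace-separated ASCII format is assumed from the MATLAB step above:

```python
import os

def merge_ascii_logs(filenames, out_path):
    """Catenate whitespace-separated ASCII log files row-wise into one file,
    mirroring MATLAB's cat(1, ...) followed by save -ascii."""
    rows = []
    for name in filenames:
        with open(name) as f:
            for line in f:
                if line.strip():                 # skip blank lines
                    rows.append(line.strip().split())
    with open(out_path, "w") as f:
        for r in rows:
            f.write(" ".join(r) + "\n")
    return len(rows)

# Illustrative use on two small temporary files:
with open("a.txt", "w") as f:
    f.write("1 2 3\n4 5 6\n")
with open("b.txt", "w") as f:
    f.write("7 8 9\n")
n = merge_ascii_logs(["a.txt", "b.txt"], "merged.txt")
print(n)   # 3 rows in the merged file
os.remove("a.txt"); os.remove("b.txt"); os.remove("merged.txt")
```

Row counts returned this way provide the same sanity check as the record counts quoted below for the two merged samples.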
The first statements load the 22 raw data files into the MATLAB workspace. The next two statements catenate the resulting matrices into two, out0010 and out1121. The last two save them as ASCII files with a .dat extension in the (arbitrary) directory Examples\CardBoard\ for further processing by Predat. The compositions of the two files were determined by the different numbers of variables in the raw records. This means that the first file, out0010, will contain the four-month time gap. Use this sample for validation purposes, and the longer and later sample out1121 for calibration. The numbers of records are 9401 and 10464. The following is a table of the contents of the second file:

Table 7.1. Record specifications of raw data file out1121.dat
Pos.   Name               Physical meaning
 1     On/off-indicator
 2     Date
 3     Time_[h,min]
 4     -
 5     F197               PulpFlowPine_[t/h]
 6     FFC181             PulpFlowBirch_[l/s]
 7     F193               PulpFlowCTMP_[t/h]
 8     F194               PulpFlowB75_[t/h]
 9     F192               PulpFlowReject_[t/h]
10     F196               PulpFlowB70_[t/h]
11     EQ900              SpecRefEnergyPine_[kWh/t]
12     EQ903              SpecRefEnergyReject_[kWh/t]
13     EQ904              SpecRefEnergyB70_[kWh/t]
14     ASPOP              PopeSpeed_[m/min]
15     ASVIR              WireSpeed_[m/min]
16     SP2LT              PressureShoe_[kN/m2]
17     SLT                PressureHotNip_[kN/m2]
18     APAL               Coating_[g/m2]
19     AKYT2              BasisWeight_[g/m2]
20     ATJO2              Thickness_[micron]
21     -
22     QKAP01             KappaNumberB70
23-25  -
26     KSTY8              BendingStiffnessIndex_[mN]
27     -
7 Quality Prediction in a Cardboard Making Process
Variables without specification are not of interest for the case. The specifications for the other data file, out0010.dat, are the same, except that KSTY8 is now in position #24. Start MoCaVa by typing mocava3 in the Matlab window, and select Predat, New data file, and Get data file. Use the browser that opens to locate the MoCaVa3\Examples\CardBoard\ directory. The window in Figure 7.2 appears. The directory contains three sets of data files in various stages of preparation: i) a directory raw containing the data files from the original data acquisition, ii) the two data files out0010 and out1121 resulting from the manual catenation process just described, and iii) the two data files out0010.mcv and out1121.mcv resulting from running Predat, as described below.
Figure 7.2. Window for selecting the sample to be prepared
Open the file out1121. This opens the Plot Outline and Data Outline windows (Figures 7.3 and 7.4). Clearly, the data are in need of some editing. The time indications, records #2 and #3, cannot be used in this case. The time variable must be increasing, while the clock readings (#3) are reset at midnight. It will be necessary to use the option for generating a fictitious time variable instead, based on the record counter. Consult Table 7.1 to change the following in the Data Outline window:

- Rename the variables to be used.
- Mark the variables not to be used, #1, 2, 3, 4, 21, 23, 24, 25, 27, for removal and click Delete Now.
- Select Const. Sampl. interval from the 'pull-down' menu, set the sampling interval to 0.2, and select hours. Click Apply Now.

Remark 7.3. The reason for not considering the indicator variable in the first column is that the options in Predat do not allow for the elimination of individual records based on an indicator variable. In the preliminary study (Bohlin, 1996) a temporary C program was written to do this, in addition to the other data preparation tasks.
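The fictitious time variable generated by the Const. Sampl. interval option amounts to a time base proportional to the record counter. A minimal Python sketch of the same idea (the function name is hypothetical; the 0.2 h interval and record count are taken from the text):

```python
import numpy as np

def fictitious_time(n_records, dt_hours=0.2):
    """Strictly increasing time base [hours] built from the record counter."""
    return dt_hours * np.arange(n_records)

t = fictitious_time(10464)  # the calibration sample out1121 has 10464 records
```

Unlike the clock readings, this time base never resets, which is all the identification algorithms require.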
Figure 7.3. Recorded data sample #2
However, most 'indicator-off' values are also associated with NaN or inf in the corresponding records, which MoCaVa3 is able to handle.

Figure 7.4. Window for editing sample attributes

The data show a number of breaks in the data flow and occasional missing data, as well as a large number of 'outliers', both in connection with the breaks and during normal operation. Predat applies a simple rule to eliminate the worst parts of the raw data automatically. This reduces the need for the user to eliminate spurious data manually. Click Edit in the MoCaVa window, and then Remove outliers. The MoCaVa - Remove Obvious Outliers window opens. Click Detect Outliers. This draws 3σ thresholds in all graphs that have values outside the thresholds (Figure 7.5). Click Remove Outliers. The values are replaced with NaN and the graphs change accordingly (Figure 7.6). Click Quit.

Figure 7.5. Selected records with 3σ limits

It is generally difficult to know which values are 'outliers' and how many of them to remove, and MoCaVa3 offers no direct solution to the problem. However, the possibility of modelling disturbances in both input and output should reduce the sensitivity to outliers, in particular since the larger outliers have been removed. Even so, a glance at the graph still suggests that a few more values should be removed, for physical reasons: the two speed variables ASPOP and ASVIR display a few values that are clearly not in accordance with the normal accuracy of speed measurements.

Remark 7.4. The SP2LT and SLT pressure values in Figure 7.6 include quite a number of zeroes, which may indicate failures in the pressure gauge, and hence outliers. However, they might also indicate that the press itself has been temporarily inactivated, in which case the values are not outliers. In order to decide, one would have to make quite a thorough investigation of the events during the intervals of abnormal operation of the plant. Since this would appear to be overdoing it for a preliminary study, no outliers will be removed from SP2LT and SLT.

Remark 7.5. Generally, the variables for which outliers would have the largest effect, if any, are those at the end of the process, namely the speed, pressure, and basis weight settings, which would be expected a priori to affect the stiffness index instantaneously. Their measurement accuracies would also be expected to be generally higher than those of the other input variables. The recordings of pulp flows F and specific refining energies EQ are more uncertain, and it is even more difficult to know whether an outlier is a measurement error, or whether the pulp flow has actually been reduced for a short while.

To eliminate outliers in a variable manually, first double-click on the corresponding graph. This opens the window in Figure 7.7 for ASPOP. Check
Figure 7.6. Selected records purged from obvious outliers
the box Mark Outliers, and draw boxes around the values to be removed. The values turn from red to green. When all selected outliers have been marked in this way, click Remove all. The values will be replaced with NaN and the graphs will change accordingly. Do the same for ASVIR. One more thing has to be taken care of: Figure 7.5 reveals that there are some negative values in the flow variable measurements FFC181. In order to avoid having to include statements checking for non-positive flow values in the model, it is expedient to eliminate them in the data at the outset. It is possible to use Predat again for the purpose. A faster alternative would probably be to use a MATLAB statement for eliminating all negative values from the data matrix. After the manual outlier elimination the data sample appears as in Figures 7.8 and 7.9. File the sample by clicking Main, and then Save as. Place the prepared sample in MoCaVa3\Examples\CardBoard\out1121.mcv. Finally, click Exit predat.
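Both the automatic 3σ rule and the suggested one-line fix for the negative FFC181 values amount to simple masking. Predat's exact rule is not documented here, so the following NumPy sketch is only a plausible imitation of the behavior described above (function names are hypothetical):

```python
import numpy as np

def mask_outliers_3sigma(x):
    """Replace values outside mean +/- 3*std with NaN (cf. Detect/Remove Outliers)."""
    mu = np.nanmean(x)
    sigma = np.nanstd(x)
    y = x.copy()
    y[np.abs(y - mu) > 3.0 * sigma] = np.nan
    return y

def mask_negative(x):
    """Replace negative values (physically impossible flows) with NaN."""
    y = x.copy()
    y[y < 0] = np.nan
    return y
```

Applied column by column, e.g. data[:, col] = mask_negative(data[:, col]), this leaves NaN markers that MoCaVa3 is able to handle, just like the 'indicator-off' records.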
Figure 7.7. Marking outliers in ASPOP
Repeat the procedure for out0010.dat. This brings the level of data contamination down to what is a priori believed to be acceptable for the preliminary study. The remaining, and still considerable, amount of 'errors' of various kinds will be left to the Calibration and Validation functions in MoCaVa to deal with.

Remark 7.6. The need to decide a priori when to stop the data preparation is not entirely satisfactory, but it is well in accordance with the grey-box concept of supporting 'objective' data analysis with prior physical knowledge: some data simply cannot have been caused by the object to be modelled. However, it is the author's opinion that, ideally, 'data preparation' in the form of elimination of parts of the data believed to be erroneous on subjective grounds should be reduced to clear-cut cases. 'Contamination' in a data sequence is also data, since there is still some physical process that produced it. It should preferably be modelled as caused by a 'disturbance' model. However, a practical issue is also the trade-off between the time spent on preparing 'good data' and that spent on 'disturbance modelling'. Since MoCaVa does not have enough support for modelling contamination as serious as 'outliers', the need for separate data preparation remains.
7.4 Step 2: Variables and Causality

The purpose of the second step is to define the most important variables and the causal relations between them. Like the first step, this must be based on prior knowledge only. It will also create a skeleton structure on which to hang any prior knowledge about the variables and the relations between them.
Figure 7.8. Sample prepared for calibration
The following translates the physical description of the object illustrated by Figure 7.1 into the block-diagram form in Figure 7.10, which is better suited to simulation and identification:

f = Input mass flow of pulp constituents [kg/s]
r = Specific refining energies of constituents [Wh/kg]
κ = 'Kappa numbers' of constituents
Z = (Q, E, B) = Pulp flow properties in pulp lines
Q = Pulp flows [kg/s]
B = Bulk of pulp [m3/kg]
E = Tensile strengths [N/m2]
v = Speeds at wire and pope [m/s]
p = Shoe and hot nip pressures [N/m2]
c = Coating [kg/m2]
H = Thicknesses of layers (including coating) [m]
Figure 7.9. Sample attributes and statistics
w = Basis weight [kg/m2]
BSI = Bending-stiffness index [mN]

[Block diagram: (f, r, κ) → Pulp prep. → Z^pc → Mixing → Z^mx → Pulp feed → Z^pu → Paper machine → (E, H) → Lab. → BSI, with control inputs (w, c, v, p) entering the paper machine and (w, c) the Lab. block.]

Figure 7.10. Partitioning of the object, determining the internal and external variables involved in the modelling
The partitioning is based on the following preconceptions:
- Lab.: The bending stiffness BSI should mainly depend on the thicknesses H and tensile strengths E (= stiffness against pulling) of the layers of the finished cardboard. The relation is static and given by basic mechanics (Gavelin, 1995), and should therefore be invariant and reasonably reliable.
- Paper machine: Thickness equals basis weight times bulk, where basis weight is mainly dependent on the amount of pulp in the different layers. The relation should be reliable, since it can be based on mass balances, where the input flows are known, as well as the output speed. The sum of the layer thicknesses is measured. Bulk and tensile strength are material properties, determined primarily by the mixing and refining of the pulp, but transported through the system, and transformed in the paper machine (headboxes, wires, pressing, and drying). The effects of pressing and drying should be instantaneous, but are unknown, except possibly for the sign.
- Pulp feed: Since fiber strengths and bulk are additive pulp properties, the relations between those properties in the input and output flows Q^mx and Q^pu can be based mainly on mass balances, and are therefore expected to be reliable. The output flow Q^pu is controlled to regulate the basis weight w. This is assumed to work without error. The machine chests may or may not introduce dynamics, but with well-regulated consistency that should be of first order (with two properties and three chests, that would possibly amount to six state variables).
- Mixing: The mixing process should introduce simple and comparatively reliable dynamics, since its volumes and flows are known. The dynamics should be at most first order (possibly adding another six state variables). Transport delays might add to that, if significant. Since fiber strengths and bulk are additive pulp properties, the relations between those properties in the six input ingredients E^pc, B^pc and the three output mixtures E^mx, B^mx are based on mass balances.
- Pulp preparation: It is generally unknown how a given blend of pulp and refining affects the mechanical properties and bulk of the pulp entering the mixing tanks and machine chests. It is reasonable to expect that such a relation would be less reliable, since the raw material and previous processing of the pulp cannot be expected to produce a homogeneous result.
The variations should be slow, however, compared to the overall processing time. The primary setting of pulp properties should be static (instantaneous), at least on the time scale of the 12 min sampling interval.

Since there are several pulp lines and layers, most variables are arrays. The two middle layers in the cardboard have the same properties, and will be regarded in the sequel as a single layer, so the cardboard has three layers, not counting possible coating. Still, 59 scalar variables have been defined altogether, of which 18 have direct but contaminated relations to experiment data. Hence, most of the variables cannot be measured. But the point of dividing the object into several parts is that it will be possible, on physical grounds, to exclude directly a large number of otherwise mathematically possible relations between the input and output variables. And the number of unknown parameters will still be smaller than in the simplest conceivable 'black box' with sixteen inputs and two outputs.

7.4.1 Relations to Measured Variables

The measured variables are indicated in Figure 7.1. The measurement conditions will need specifications later. Make one more table to support this. The following specifications will be needed:

- The names of the recorded variables that correspond to the model variables.
- The mean values, as given in the Data outline window in Figure 7.9; they will be needed for scaling.
- A factor for converting from the units used in the model to those used in the data file. The latter follows easily, if the data units have been specified. The model will
use the standardized units [m], [kg], [W] for length, weight, and power, but [h] for the time unit. The deviation is an adaptation to the sampling interval, 0.2 h.

Table 7.2. Relations between model variables and data

Variable          Data             Mean   Conversion
f1  [kg/h]        F197 [t/h]       3.35   0.001
f2  [m3/h]        FFC181 [l/s]     27.8   1000/3600
f3  [kg/h]        F193 [t/h]       6.21   0.001
f4  [kg/h]        F194 [t/h]       5.84   0.001
f5  [kg/h]        F192 [t/h]       6.27   0.001
f6  [kg/h]        F196 [t/h]       5.51   0.001
r1  [Wh/kg]       EQ900 [kWh/t]    205    0.001/0.001
r5  [Wh/kg]       EQ903 [kWh/t]    106    0.001/0.001
r6  [Wh/kg]       EQ904 [kWh/t]    180    0.001/0.001
κ6                QKAP01           63.8   1
w   [kg/m2]       AKYT2 [g/m2]     273    1000
v_w [m/h]         ASVIR [m/min]    318    1/60
v_p [m/h]         ASPOP [m/min]    330    1/60
p_s [N/m2]        SP2LT [kN/m2]    488    0.001
p_h [N/m2]        SLT [kN/m2]      54.4   0.001
c   [kg/m2]       APAL [g/m2]      20.2   1000
H   [m]           ATJO2 [micron]   445    1000000
BSI [mN]          KSTY8 [mN]       14.1   1
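Since the conversion factors in Table 7.2 map model units to data units, a nominal model value follows from dividing a data average by its factor. A small Python check of two of the table entries:

```python
# The conversion factor maps model units -> data units, so model = data / factor
def to_model_units(data_value, factor):
    return data_value / factor

nom_bw = to_model_units(273, 1000)     # AKYT2 [g/m2] -> w [kg/m2]
nom_h = to_model_units(445, 1000000)   # ATJO2 [micron] -> H [m]
# nom_bw = 0.273 and nom_h = 0.000445, consistent with the nominal basis
# weight and total-thickness average used later in this section
```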
7.5 Step 3: Modelling

Filling the blocks with relations of various complexities and credibilities, with known or unknown parameters, will create the expanding set of model structures that MoCaVa needs as prior information (in addition to the experiment data). This will exploit more specific knowledge about the phenomena governing the behavior of each block. So far, preparations for this step have to be done only for the 'root' model, i.e., for the simplest conceivable, and preferably also reliable, relations between the defined variables. Further modelling is postponed until the result of the calibration step indicates that refinement is necessary. Thus, steps 3 and 4 are taken repeatedly in a 'loop'.

To start the loop select Calibrate in the MoCaVa window. Click New project and enter its name CardBoard (unless it has already been defined). Select and open the new project. In the next window select the data file containing the prepared data, namely MoCaVa3\Examples\CardBoard\out1121.mcv, and click OK in the next window to indicate that the whole file is to be used for the calibration. The modelling procedure starts from the end of the causality chain, with the component producing the final output of the model, in this case the "Lab" unit.

7.5.1 The Bending Stiffness

Enter Lab in the Component naming window, and select Change to define it. The next window will receive the assignment statements that specify how the bending stiffness BSI depends on the thicknesses H and strengths E of the layers in the cardboard.
Entering Function Statements

In order to find formulas for the dependence, assume that the cardboard has the mechanical properties of an ideal I-beam. The following relations are obtained from the theory of bending an elastic four-layer beam. Let H_0 = H^c and E_0 = E^c be the thickness and elasticity of the coating, and H_i = H^cb_i and E_i = E^cb_i those of the three layers of the uncoated cardboard. Then the following formulas compute the bending stiffness index BSI from E, H, w, c (Pettersson, 1998):

Ply coordinates:

z_0 = -(H_0 + H_1 + H_2 + H_3)/2   (7.1)
z_i = H_{i-1} + z_{i-1},  i = 1, 2, 3, 4   (7.2)

Bending stiffness per unit width:

A_0 = E_0 (z_1 - z_0)   (7.3)
B_0 = E_0 (z_1^2 - z_0^2)/2   (7.4)
D_0 = E_0 (z_1^3 - z_0^3)/3   (7.5)

and, for i = 1, 2, 3, compute

A_i = A_{i-1} + E_i (z_{i+1} - z_i)   (7.6)
B_i = B_{i-1} + E_i (z_{i+1}^2 - z_i^2)/2   (7.7)
D_i = D_{i-1} + E_i (z_{i+1}^3 - z_i^3)/3   (7.8)
S_b = D_3 - B_3^2/A_3   (7.9)

The bending stiffness is measured as the force it takes to depress the free end of a sample of cardboard of standard size fixed at the other end. The "bending stiffness index" is further normalized by a factor depending on basis weight:

BSI = K_si S_b / [(w + c)/0.1]^3   (7.10)

The known instrument factor is K_si = 3 × depression × width/(length)^3 of the test sample. Rewriting the formulas into M-statements yields the statements to enter. The credibility of this prior knowledge is high, and the relations are obvious parts of the 'root' model.

Remark 7.7. From the point of view of identification, it is interesting to note that the relation is cubic in the thickness variables for given strength parameters E. However, the latter parameters may also depend on thickness; in particular, pressing may affect both H and E. It is reasonable to conceive that when a piece of cardboard is compressed, the pulling force it will withstand will not be reduced as much as its thickness, if at all. This will conceivably reduce the degree of nonlinearity.

Enter the following M-statements into the Component function window (this has already been done in the demo case):

% PLY COORDINATES
z0 = -(Hc + Hcb(1) + Hcb(2) + Hcb(3))/2
z(1) = Hc + z0
z(2) = Hcb(1) + z(1)
z(3) = Hcb(2) + z(2)
z(4) = Hcb(3) + z(3)
% BENDING STIFFNESS PER UNIT WIDTH
A = Ec * (z(1) - z0)
B = Ec * (z(1)*z(1) - z0*z0)/2
D = Ec * (z(1)*z(1)*z(1) - z0*z0*z0)/3
for i = 1:3
    A = A + Ecb(i) * (z(i+1) - z(i))
    B = B + Ecb(i) * (z(i+1)*z(i+1) - z(i)*z(i))/2
    D = D + Ecb(i) * (z(i+1)*z(i+1)*z(i+1) - z(i)*z(i)*z(i))/3
end
Sb = D - B*B/A
% BENDING STIFFNESS INDEX
A = (BW+BWc)/0.1
BSI = Ksi * Sb/(A*A*A)
% Note: The instrument factor is
% Ksi = 3 * depression * width/(length^3) of the test sample
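The M-statements translate directly into other languages, which is convenient for sanity-checking Equations (7.1)-(7.10) outside MoCaVa. A Python transcription (the nominal values in the call are those derived later in this section; with equal strengths in all layers the index should reproduce the KSTY8 data average of about 14.1 mN):

```python
def bending_stiffness_index(Hc, Hcb, Ec, Ecb, BW, BWc, Ksi=12.6):
    """Bending stiffness index of a coated three-layer board, Eqs. (7.1)-(7.10)."""
    H = [Hc] + list(Hcb)            # layer thicknesses, coating first
    E = [Ec] + list(Ecb)            # layer tensile strengths
    z = [-sum(H) / 2.0]             # ply coordinate z0, Eq. (7.1)
    for Hi in H:
        z.append(z[-1] + Hi)        # Eq. (7.2)
    A = B = D = 0.0
    for i in range(4):              # Eqs. (7.3)-(7.8)
        A += E[i] * (z[i + 1] - z[i])
        B += E[i] * (z[i + 1]**2 - z[i]**2) / 2.0
        D += E[i] * (z[i + 1]**3 - z[i]**3) / 3.0
    Sb = D - B * B / A              # Eq. (7.9)
    a = (BW + BWc) / 0.1
    return Ksi * Sb / a**3          # Eq. (7.10)

bsi = bending_stiffness_index(
    Hc=0.000013, Hcb=(0.00011, 0.000225, 0.00011),
    Ec=3.41e12, Ecb=(3.41e12, 3.41e12, 3.41e12),
    BW=0.27, BWc=0.02)
# bsi is approximately 14.1, matching the data average
```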
Argument Classification

The entries in the Argument classification window follow immediately:

- All dependent variables except BSI are Internal.
- The properties of the cardboard Hcb, Ecb, Hc are Feed variables, to be determined by a not yet defined source component.
- Since the coating substance is not changed, its strength Ec is classified as Parameter.
- The basis weight of cardboard BW and coating BWc affect BSI in two ways, namely indirectly through the thicknesses Hcb and Hc, and directly through the standardized normalization of the index. They are both variables determining the specifications of the cardboard, and are measured with relatively high accuracy. Classify them as Control variables.
- Ksi has the standardized value of 12.6 [1000/m]. Classify it as Constant.

Edit the Argument classification window:

Argument   Class

Component output
  z0       Internal
  z        Internal
  A        Internal
  B        Internal
  D        Internal
  Sb       Internal
  BSI      Response

Component input
  Hc       Feed
  Hcb      Feed
  Ec       Parameter
  Ecb      Feed
  BW       Control
  BWc      Control
  Ksi      Constant
I/O Interfaces

Specifying the sources of the input and the possible targets of Response arguments is also immediate:
- BSI is measured manually and the results are in the sample file. Select Sensor, even though this means envisaging a 'sensor' that delivers its results only sparsely in time. Missing values are indicated by NaN in the file.
- Hc, Hcb, Ecb are not measured and are to be determined by another component.
- BW, BWc are measured, and it would be possible to select input interpolation models as their sources, and thus to include the latter as part of the Lab component. However, they are also input at other places in the process, and this makes it advantageous to use a common model for the data input. Therefore select User model.

Edit the I/O interface window:

Argument   Source

Connections to sensors
  BSI      Sensor
  Hc       NoSensor
  Hcb      NoSensor
  Ecb      NoSensor

Feed input: Source model
  Hc       User model
  Hcb      User model
  Ecb      User model

Control input: Source model
  BW       User model
  BWc      User model
Argument Attributes

The Argument attributes window is the second most important entry point for prior information (after the function statements). Considering the entries carefully will save trouble later. The following are some hints to support the specifications:

- Since the variables are many, it is important to associate them with informative Short descriptions, including units.
- Dimensions are required for arguments that are arrays, namely z, Hcb, Ecb. They follow immediately from the statements.
- Since Hcb and Ecb are arrays, their scales and nominal values must be specified implicitly using labels. (A numeric value would be interpreted by MoCaVa as valid for all elements in the array.)
- It is advantageous for other reasons, too, to use implicit specification: variables sharing scales and/or nominal values need to be given numerical values only once. For instance, BW and BWc share scales, but not nominal values. However, they appear in other components, and are given implicit nominal values for that reason. Ecb will share both scales and nominal values with other tensile strength variables upstream in the process.
- The arguments associated with the output, namely BSI and rms_BSI, will not appear elsewhere, and may therefore be given numeric attributes.
- Both Ec and rms_BSI are positive parameters, which is indicated under Min. The Control and Feed input are also positive. However, those boundaries need not be specified, since they would have an effect only if the input were not connected to other components and instead fitted as parameters.
Edit the Argument attributes window:

Argument   Short description                Dim  Scale    Nominal  Min  Max

Parameters
  Ec       TensileStrengthCoating [N/m2]    1    ScaleEc  NomEc    0
  rms_BSI  StdError_BSI                     1    1*1      1*1      0

Control input
  BW       BasisWeight [kg/m2]              1    ScaleBW  NomBW
  BWc      Coating [kg/m2]                  1    ScaleBW  NomBWc

Feed input
  Hc       ThicknessCoating [m]             1    ScaleHc  NomHc
  Hcb      ThicknessLayers [m]              3    ScaleH   NomH
  Ecb      TensileStrengthLayers [N/m2]     3    ScaleE   NomE

Process output
  BSI      BendingStiffnessIndex [mN]       1

Constants
  Ksi      InstrumentFactor [1000/m]        1             12.6

Internal arrays
  z        PlyCoordinates [m]               4
Nominal Values and Scaling

The values of implicit attributes are specified in the next window. Getting the values requires some prior analysis. Nominal values can be determined directly or indirectly from the data averages in Table 7.2:

- Data averages yield (after units conversion) NomBW = 0.27, NomBWc = 0.02.
- The thickness of the coating is not measured directly. However, it is related to the basis weight by its density. Assuming a density of 1500 kg/m3 yields NomHc = 0.000013.
- NomH requires a prior distribution of the measured average of the total thickness into (say) 1/4, 1/2, 1/4 of 0.000445, yielding NomH = (0.000100, 0.000225, 0.000100).
- The remaining values NomE and NomEc are more difficult to find. However, a crude input-output balance of the Lab model will yield the answer: With the same tensile strength E for all layers of the cardboard the bending formula reduces to S_b = E (H^c + |H^cb|)^3/12, BSI = K_si S_b / [(w + c)/0.1]^3, where |H^cb| is the thickness of the board and H^c that of the coating. This yields the balance E = BSI × K_si^-1 × [(w + c)/0.1]^3 × 12 × (H^c + |H^cb|)^-3. Inserting the data averages yields NomEcb = 14.1 × 12.6^-1 × [(0.27 + 0.02)/0.1]^3 × 12 × (0.000445 + 0.000013)^-3 = 3.41·10^12.

Remark 7.8. The value may seem surprisingly large. However, it is a consequence of using the same metric unit for all lengths. The value illustrates the obvious: that it would require many milli-Newtons indeed to stretch a one-meter cube of cardboard out to double its length (assuming it does not break). The extreme value also emphasizes the importance of scaling. Using the default scale of 1 would obviously not work for the tensile strength variables.
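The input-output balance used to back out the nominal tensile strength can be verified with a few lines of Python (14.1 mN is the KSTY8 data average and 12.6 the instrument factor from the text):

```python
# Crude input-output balance of the Lab model: with equal tensile strength E in
# all layers, Sb = E*(Hc + |Hcb|)^3/12 and BSI = Ksi*Sb/((w + c)/0.1)^3, so
# E = BSI/Ksi * ((w + c)/0.1)^3 * 12 / (Hc + |Hcb|)^3
Ksi = 12.6                        # instrument factor [1000/m]
BSI = 14.1                        # data average [mN]
w, c = 0.27, 0.02                 # basis weights of board and coating [kg/m2]
thickness = 0.000445 + 0.000013   # board plus coating thickness [m]

NomE = BSI / Ksi * ((w + c) / 0.1)**3 * 12 / thickness**3
# NomE comes out at about 3.41e12, the value entered for NomE and NomEc
```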
Since the variables are positive, the scales are easily determined from the nominal values by rounding to the nearest power of ten. Enter the following values into the Implicit attributes window:

Attribute  Values
ScaleHc    0.0001
NomHc      0.000013
ScaleH     0.0001   0.0001   0.0001
NomH       0.00011  0.000225 0.00011
ScaleE     1e12     1e12     1e12
NomE       3.41e12  3.41e12  3.41e12
ScaleBW    0.1
NomBW      0.27
NomBWc     0.02
ScaleEc    1e12
NomEc      3.41e12
Assigning Data

The next window requires the position in the data file and a units conversion factor. Both have been specified in Table 7.1. There is no units conversion; the instrument factor Ksi serves the same purpose. Edit the Data assignment window:

Argument   Data    Conversion
BSI        KSTY8   1
The 'model' so far has no input variables defined. However, there is still a point in simulating it with nominal input: it is a way to check whether the nominal tensile strength has been computed correctly. It may also reveal a 'bug' in the assignment statements of the function specifications. Select Simulate. The model output BendingStiffness [mN] in the Plot window agrees with the data average. Select Reject, however, to indicate that the model class so far is not enough for a meaningful calibration.

7.5.2 The Paper Machine

Select Edit (to change the model library), select Insert (to add a new component), enter PaperMachine (to give it a name), and select Change (to define the component). The Component function window opens again to receive M-statements.

Entering Function Statements

A first question in the modelling of the cardboard properties is how to define the bulk and strength variables B and E, since the bulk varies considerably as the fiber mass is transformed from pulp of various concentrations into cardboard in the various sections of the paper machine. Mainly, the fibers hold different amounts of water, which affects the bulk. It may be even harder to envisage a property of "tensile strength" in the pulp phase, since pulp will obviously not withstand pulling. However, it is still possible to conceive of a remaining bulk property, when water has been removed down to the moisture content of the finished cardboard. A strength property of the dry fiber is also conceivable. Thus, the values of B^pc, B^mx, B^pu, E^pc, E^mx, E^pu should be interpreted as equivalent dry-fiber values.

Remark 7.9. If the latter appears hard to accept, it is still possible to argue that the physical interpretation will not have any effect on the final model, since none of the properties are measured directly.

Remark 7.10. The 'filling' (various chemicals) added to the pulp may of course also change the bulk and strength of the cardboard. Since no data are available on filling, its effect must be ignored so far, possibly to be modelled later as an unknown factor or even a disturbance.

Assumption: The fiber properties of relevance to bending stiffness do not undergo significant changes between mixing and pressing. This means that possible effects of varying 'formation' on the wires can be neglected.

After that, some relations follow immediately: The thicknesses are computed from the basis weights w_i and densities ρ_i (the inverse of bulk), H_i = w_i/ρ_i = w_i B_i. The basis weights of the layers are computed by distributing the known total basis weight w according to the mass proportions in the layers: x_i = Q_i/(Q_1 + Q_2 + Q_3), w_i = x_i w, H_i = x_i w/ρ_i, where Q_i is the fiber flow in layer #i. These relations hold at all places in the paper machine. However, at least two phenomena can be expected to change the fiber properties, namely pressing in the two press sections and stretching between 'wire' and 'pope' (see Figure 7.1). There is no obvious reason for suspecting that the speed of the machine would influence the bulk; however, the speed difference may well do so, since this stretches the cardboard. The effect on bulk is by no means clear, but here too, mass balances provide partial knowledge: We have v_w w_w = v_p w_p, H_w = B_w w_w, H_p = B_p w_p, where the subscripts indicate 'wire' and 'pope' respectively.
Now, if the cardboard had the elasticity of an isotropic rubber band, its bulk would not be affected, and hence H_p = B_w w_p. If it were more like a bundle of rubber bands, its thickness would not be affected, which yields H_p = B_w w_p v_p/v_w, i.e., its bulk would increase. Unless the cardboard is perfectly isotropic, its bulk should change by something in between, for instance by a factor of 1 + θ_v (v_p - v_w)/v_w, (0 ≤ θ_v ≤ 1). Introduce therefore θ_v as an unknown parameter, with zero nominal value, as a means for testing whether the hypothesis of isotropy is in agreement with the data.

Remark 7.11. The way θ_v appears in the formula makes it possible to regard it as an empirical coefficient measuring the effect of speed differences (drag), in case one has difficulties accepting the reasoning on 'anisotropy'.

Little is known about the effect of pressing, except possibly that the bulk should be reduced. With the 'hot nip' press there is an open question whether a reduction would be due to some sort of 'plasticity' of the cardboard, or to the fact that the press is hot. In the latter case it may be expected to affect mainly the top layer, since heat penetrates slowly. In the opposite case force would be the main cause of reduction, and the pressing would affect all layers. Assuming that the lasting compression of each layer is proportional to its thickness, the compression would be ΔH_i = θ_i H_i p_h, where θ_i is an unknown 'plasticity' parameter. (The possibility that plasticity might be related to elasticity has been ignored, since such a relation would be even more speculative.) The effect of the heat is certainly speculative too, but the same relation will be used, mainly for lack of a better alternative. What is known is that heat has some effect on the cardboard, or it wouldn't be there. Altogether, the assumed model for the thickness reduction due to hot nip pressing will be g_i = exp(-θ^bhn_i p_h/p̄_h). The 'shoe' press is believed to have a smaller effect, and there is no heating. Assume
therefore a reduction by a common factor of the same type.

Remark 7.12. Whenever lack of prior knowledge does not allow the use of well-founded relations containing parameters with physical meanings, it is important that the empirical relations used instead have an effective form and that their (fictitious) parameters are scaled properly. Usually, it is possible to make fictitious parameters free of scale by introducing relations of the form scaled parameter = scale-free parameter × rated value, where the first factor expresses the variation and the second the constant scale. This has the advantage that it becomes easy to appraise whether a fitted parameter is 'large' or 'small'. The latter event suggests that it may be worth trying a simpler model without the parameter. In the present case, the exponential form expresses the important prior knowledge that the variables are positive. By means of the constant p̄_h the parameters are made free of scale. A small value of θ will indicate either that there is no significant effect of pressing, or that the pressure does not vary enough to reveal any effect of pressing. These possibilities suggest first trying zero values of the fictitious parameters.

The following empirical formula for computing the stiffness will also be used: E_i = E^pu_i (ρ_i/ρ̄_i)^ν. It contains the new parameter ν, whose value (between 1 and 3) indicates how much the tensile strength is affected by an increase of density, for instance due to pressing.
Prior Credibility
The relations are uncertain. In particular, it is not likely that the formulas for computing density and the change of tensile strength will include all effects of the processing through the paper machine. In order to indicate the uncertainty (which is prior knowledge), introduce two 'stubs' s_b and s_e, in the form of constants with unit values. They will serve as "terminals" for other components refining the paper machine model. In summary, the thickness and stiffness models will be:
ρ_i = (s_b B_pu,i)^(−1) [1 + θ_v (v_p − v_w)/v_w]^(−1) exp(θ_bs p_s/p̄_s) exp(θ_bhn,i p_h/p̄_h)    (7.11)
H_cb,i = x_i w / ρ_i    (7.12)
E_cb,i = s_e E_pu,i (ρ_i/ρ̄_i)^ν    (7.13)
Enter the following statements into the Component function window:

% EFFECT OF SHOE AND HOT NIP PRESSING ON BULK
for i = 1:3
  Bcb(i) = stubB * Bpu(i) * exp(-CBhn(i) * PressHotNip/nomPhn)
  Bcb(i) = Bcb(i) * exp(-CBs * PressShoe/nomPs)
end
% EFFECT OF VELOCITY DIFFERENCE
drag = (SpeedPop + 1)/(SpeedVir + 1) - 1
if drag > 0.1
  drag = 0.1
end
if drag < 0
  drag = 0
end
Gain(1) = 1/(1 - Cdrag * drag)
Gain(2) = Gain(1)
Practical Grey−box Process Identification
Gain(3) = Gain(1)
% DENSITY
for i = 1:3
  Dens(i) = Gain(i)/Bcb(i)
end
% THICKNESS
Hc = BWc * Bc
Q = Qpu(1) + Qpu(2) + Qpu(3)
H = 0
for i = 1:3
  Hcb(i) = BW * Qpu(i)/Q/Dens(i)
  H = H + Hcb(i)
end
% EFFECT OF DENSITY ON TENSILE STRENGTH
for i = 1:3
  Ecb(i) = stubE * Epu(i) * exp(nu*log(Dens(i)/nomDens(i)))
end
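The component statements can be sanity-checked outside MoCaVa. The following Python sketch is a hypothetical re-implementation of the same calculations (not MoCaVa code); with the fictitious parameters at their zero nominal values the predicted thickness collapses to the basis weight times the pulp bulk, as the mass balance requires.

```python
import math

def paper_machine_thickness(Bpu, Qpu, BW, speed_pope, speed_wire,
                            p_hn, p_s, CBhn=(0.0, 0.0, 0.0), CBs=0.0,
                            Cdrag=0.0, stubB=1.0,
                            nomPhn=54400.0, nomPs=488000.0):
    """Mirror of the MoCaVa component statements (illustrative only)."""
    # Effect of shoe and hot-nip pressing on bulk
    Bcb = [stubB * Bpu[i] * math.exp(-CBhn[i] * p_hn / nomPhn)
           * math.exp(-CBs * p_s / nomPs) for i in range(3)]
    # Effect of velocity difference (drag), clamped to [0, 0.1]
    drag = (speed_pope + 1) / (speed_wire + 1) - 1
    drag = min(max(drag, 0.0), 0.1)
    gain = 1.0 / (1.0 - Cdrag * drag)
    # Density and thickness per layer
    dens = [gain / b for b in Bcb]
    Q = sum(Qpu)
    Hcb = [BW * q / Q / d for q, d in zip(Qpu, dens)]
    return sum(Hcb)

# With zero-valued fictitious parameters the model reduces to H = BW * Bpu
H = paper_machine_thickness(Bpu=[0.00165] * 3, Qpu=[9081, 21946, 6811],
                            BW=0.273, speed_pope=19800, speed_wire=19080,
                            p_hn=54400, p_s=488000)
print(round(H, 6))  # close to the thickness average 0.000445
```

The call uses the nominal values derived later in this section; any other figures are placeholders.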
Remark 7.13. The limitations on the drag variable are introduced to eliminate the effects of possibly remaining outliers in the speed data.

Argument Classification
Again, the argument classification is immediate:
: All not previously defined dependent variables are Internal, except H.
: The properties of the pulp Qpu, Epu, Bpu are set upstream, and therefore Feed input.
: SpeedPop, SpeedVir, PressHotNip, PressShoe, BW, BWc govern the running of the paper machine and are therefore Control input.
: CBhn, CBs, Cdrag, nu are dimensionless parameters in heuristic formulas. They are all unknown and to be classified as Parameter.
: Bc is the bulk of coating. Its value should be constant, and possible to measure separately. However, since that has not been done, it must be classified as Parameter.
: nomPhn, nomPs, nomDens are rated values. Classify them as Constant.
: Classify stubB and stubE as Constant.

Edit the Argument classification window:

Argument       Class

Component output
Bcb            Internal
drag           Internal
Gain           Internal
Dens           Internal
Q              Internal
H              Response

Component input
stubB          Constant
Bpu            Feed
CBhn           Parameter
PressHotNip    Control
nomPhn         Constant
CBs            Parameter
PressShoe      Control
nomPs          Constant
SpeedPop       Control
SpeedVir       Control
Cdrag          Parameter
Bc             Parameter
Qpu            Feed
stubE          Constant
Epu            Feed
nu             Parameter
nomDens        Constant
I/O Interfaces
Specifying the sources of the input and the possible targets of response arguments is also immediate:
: Thickness values H have been recorded in the sample file. Select Sensor.
: None of Qpu, Bpu, Epu are measured. Select NoSensor.
: Qpu, Bpu, Epu are input from other components. Select User model.
: BW and BWc have already been assigned to a separate User model in the definition of the Lab component.
: SpeedPop, SpeedVir, PressHotNip, PressShoe are also variables with a direct relation to input data. However, like BW and BWc they will also be input to other units in the cardboard process. Select User model.

Edit the I/O interface window:

Argument       Source

Connections to sensors
H              Sensor
Bpu            NoSensor
Qpu            NoSensor
Epu            NoSensor

Feed input: Source model
Bpu            User model
Qpu            User model
Epu            User model

Control input: Source model
PressHotNip    User model
PressShoe      User model
SpeedPop       User model
SpeedVir       User model
Argument Attributes
Again, the attributes require more consideration:
: CBhn, CEhn, Bpu, Qpu, Epu, nomDens, Bcb, Gain, Dens are three-dimensional arrays.
: CBhn, CBs, Cdrag are units-free parameters, and have no effect for zero values. Hence, they have unit scales and zero nominal values.
: stubB and stubE enter as factors and are therefore positive with unit nominal values.
: nu has an admissible value between 1 and 3.
: Cdrag has zero nominal value.
: Bc is positive with nominal value 0.00067.
: The values rms_H measuring output errors will initially have to include modelling errors, in addition to measurement errors. Therefore, it is usually better to start with higher values than the expected measurement accuracy. The conclusion is also supported by the observation that a search for rms-parameters appears to converge more easily from too high than from too low start values (see the example in Section 2.6). Hence, set the nominal value and scale of rms_H somewhat lower than the thickness average 0.000445, even though the actual measurement accuracy would be expected to be much smaller.
: Nominal values for SpeedPop, SpeedVir, PressHotNip, PressShoe are obtained from the data averages in Table 7.2 (after rescaling).
: Since nominal values and scales of Bpu, Qpu, Epu are shared with the Lab component, specify them implicitly, using symbolic attributes.
: Set the 'rated values' nomPhn, nomPs to the nominal values of the corresponding pressures. (It would have been better to use implicit values instead, thus avoiding the need to enter the same figures twice.)

Edit the Argument attributes window:
Short description
Dim Scale
Nominal
Min
Parameters CBhn CBs Cdrag Bc nu rms_H
HopNipPress−BulkCoefficients ShoePress−BulkCoefficient DragCoefficient BulkCoating [m3/kg] DensiyExponent StDError_H
3 1 1 1 1 1
1 1 1 ScaleB 1 0.0001*1
0 0 0 0.00067 1 0.0001*1
0 0 1 0
Control input PressHotNip PressShoe SpeedPop SpeedVir
HotNipPressure [N/m2] ShoePressure [N/m2] PopeSpeed [m/h] WireSpeed [m/h]
1 1 1 1
10000 100000 10000 10000
54400 488000 19800 19080
0 0 0 0
Feed input Bpu Qpu Epu
BulkPulp [m3/kg] PulpFlows [kg/h] StrengthPulpFibers [N/m2]
3 3 3
ScaleB ScaleQ ScaleE
NomB NomQ NomE
0 0 0
Process output H
Thickness [m]
1
0.0001
Constants stubB nomPhn nomPs stubE nomDens
StubFactorBulk NominalHotNipPressure [N/m2] NominalShoePressure [N/m2] StubFactorStrength NominalBoardDensity [kg/m3]
1 1 1 1 3
1 54400 488000 1 NomDens
Max
3
7 Quality Prediction in a Cardboard Making Process Internal arrays Bcb Gain Dens
Bcb Gain Dens
259
3 3 3
Nominal Values and Scaling
The implicit nominal values are again determined by different means:
: The bulk may be computed from the basis weight and thickness, B = h/w, assuming that all layers have the same bulk. Hence, from Table 7.2, and after rescaling: nomB(i) = 0.000445/0.27 = 0.00165. The densities are the inverses: nomDens(i) = 606.
: The nominal values of the input flows must be determined more indirectly, again by means of a mass balance: The total output flow of fibers is Q = w v_p b, where b = 7 m is the width of the paper machine. That is distributed according to the proportions of the mass contents of the layers. Logically, the proportions x are part of the specifications of the quality being produced. However, their values are not in the data file, and must therefore be substituted by estimates. The recorded proportions of pulp being fed into the process yield an obvious source from which to calculate the mass distribution in the layers: The averages follow immediately from Table 7.2, with one exception: Since FFC181 is a volume flow, it must first be converted using the known consistency 42 kg/m3 of the pulp. This yields x = {3350+4218, 6230+5680+6270, 5520}/31268 = {0.24, 0.58, 0.18}, and hence nomQ = {0.24, 0.58, 0.18} × 0.273 × 19800 × 7 = {9081, 21946, 6811}. Notice that the total output flow is about 20% higher than the input flow. The difference consists mainly of 'filling'. Again the scales are the nominal values rounded to the nearest power of 10.

Enter into the Implicit attributes window:

Attribute     Values

ScaleB        0.001     0.001     0.001
NomB          0.00165   0.00165   0.00165
ScaleQ        10000     10000     10000
NomQ          9081      21946     6811
NomDens       606       606       606
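The figures above are easy to reproduce. The short calculation below (plain Python, with the Table 7.2 averages hard-coded) recovers the implicit nominal values; as in the text, the mass proportions are rounded to {0.24, 0.58, 0.18} before computing nomQ.

```python
# Averages from Table 7.2 (FFC181 already converted to kg/h via the
# consistency 42 kg/m3)
feed = [3350, 4218, 6230, 5680, 6270, 5520]    # kg/h, six pulp lines
layers = [feed[0] + feed[1], feed[2] + feed[3] + feed[4], feed[5]]
total = sum(feed)                               # 31268 kg/h into the process

x = [round(q / total, 2) for q in layers]       # mass distribution per layer

BW, v_pope, width = 0.273, 19800, 7             # kg/m2, m/h, m
Q = BW * v_pope * width                         # total output flow, ~37838 kg/h
nomQ = [round(xi * Q) for xi in x]

nomB = round(0.000445 / 0.27, 5)                # bulk = thickness / basis weight
nomDens = round(1 / nomB)                       # density is the inverse bulk

print(x, nomQ, nomB, nomDens)
```

Running this prints the proportions {0.24, 0.58, 0.18}, the flows {9081, 21946, 6811}, the bulk 0.00165, and the density 606 entered in the window above.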
Assigning Data
The variable in the data file associated with H and the conversion factor are listed in Table 7.2 as ATJO2 and 1000000. Edit the Data assignment window:

Argument      Data      Conversion
H             ATJO2     1e6
This concludes the definition of the PaperMachine component.

Checking the Component
Again, it is worthwhile to check the function statements and the nominal values.
Click Simulate and then OK in the next two windows to confirm that the simulation is to be based on the list of nominal values, and that the two measured outputs are to be plotted and compared with the data. The Plot window does not indicate that anything is obviously wrong with the nominal values.

7.5.3 The Pulp Feed
Select Edit and Insert, enter PulpFeed, and select Change.

Entering Function Statements
The three flows of pulp Q_pu into the paper machine are controlled, in order to maintain flows that are consistent with the prescribed basis weights of the three layers of the cardboard. The basis weight controller uses mass balance calculations to inject the right pulp flows into the paper machine: Q_pu,i = w b v_p x_i, where x is the given distribution of pulp to the three layers. (This assumes that the pumps actuate the controller signals without error and that consistency measurements are accurate.) Again, the proportions x are part of the specifications of the quality being produced, but the values have not been recorded. Use therefore x_i = Q_mx,i/(Q_mx,1 + Q_mx,2 + Q_mx,3).

The bulks B and tensile strengths E are properties of the pulp pumped out of the machine chests and into the paper machine. The corresponding properties into the machine chests are B_mx and E_mx (Figure 7.10). The machine chests may or may not introduce some significant dynamics into the model, depending on their volumes. With mass contents of about 42 × 10 kg and pulp flows of about (9000, 22000, 7000) kg/h, this will mean time constants around (0.05, 0.02, 0.06) h. The longest is about a third of the sampling time 0.2 h. It is therefore likely a priori that the dynamics will have little effect. However, the estimate is still a crude one, and the actual mass contents may differ. Prepare therefore for a later addition of a dynamic machine-chest model.
This can be done by writing B_pu = B_mx + B_mc, E_pu = E_mx + E_mc, where B_mc, E_mc are 'stubs' with the purpose of accounting for possibly significant 'transient' effects of the machine chests. Enter the M-statements:

% STATIC MASS BALANCE
a = BW * width * SpeedPop/(Qmx(1) + Qmx(2) + Qmx(3))
for i = 1:3
  Qpu(i) = a * Qmx(i)
  Bpu(i) = Bmx(i) + Bmc(i)
  Epu(i) = Emx(i) + Emc(i)
end
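The time-constant estimate that justifies the static approximation can be reproduced in a few lines. The sketch below assumes the quoted product 42 × 10 kg means roughly 420 kg of fiber per chest (consistency 42 kg/m3 in roughly 10 m3; the book only quotes the product, so this reading is an assumption):

```python
# Rough machine-chest time constants: tau = mass content / pulp flow.
mass = 42 * 10                       # kg of fiber in each machine chest (assumed)
flows = [9000, 22000, 7000]          # kg/h, approximate pulp flows
taus = [mass / q for q in flows]     # hours

print([round(t, 2) for t in taus])   # about (0.05, 0.02, 0.06) h

sampling = 0.2                       # h, sampling time of the data
# The longest time constant is roughly a third of the sampling time,
# which is why neglecting chest dynamics is a plausible first hypothesis.
print(max(taus) / sampling)
```

The same arithmetic, scaled by the factor five in the tank volumes, gives the mixing-tank time constants (0.2, 0.1, 0.3) h quoted in Section 7.5.5.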
Argument Classification
Classification is immediate:
: a is Internal and of no interest outside the component.
: Qmx, Bmx, and Emx are Feed input.
: Bmc, Emc, and width are Constant.

Edit the Argument classification window:

Argument      Class

Component output
a             Internal

Component input
width         Constant
Qmx           Feed
Bmx           Feed
Bmc           Constant
Emx           Feed
Emc           Constant
Remark 7.14. The BW and SpeedPop input is not in the list of variables to classify, since this has been done in earlier components.

I/O Interfaces
The pulp feed is a unit without direct relation to available data.
: None of Qmx, Bmx, Emx are measured. Select NoSensor.
: Qmx, Bmx, Emx are input from other components. Select User model.

Edit the I/O interface window:

Argument      Source

Connections to sensors
Qmx           NoSensor
Bmx           NoSensor
Emx           NoSensor

Feed input: Source model
Qmx           User model
Bmx           User model
Emx           User model
Argument Attributes
Scales and nominal values of the input are the same as for the output and therefore defined implicitly, except that the nominal values of the 'stubs' Bmc and Emc are zero. Edit the Argument attributes window:

Argument   Short description                 Dim   Scale    Nominal   Min   Max

Feed input
Qmx        FeedPulpMixture [kg/h]            3     ScaleQ   NomQ      0
Bmx        BulkPulpMixture [m3/kg]           3     ScaleB   NomB      0
Emx        FiberStrengthPulpMixture [N/m2]   3     ScaleE   NomE      0

Constants
width      MachineWidth [m]                  1              7
Bmc        TransBulkMachineChest [m3/kg]     3              0
Emc        TransStrMachineChest [N/m2]       3              0
This concludes the definition of the PulpFeed component. The values of the implicit scales and nominal values have already been defined, which illustrates the advantage of using implicits.
7.5.4 Control Input
The components defined so far use control input from basis weight, speed, and pressure models, yet to be defined. It is obvious from the data records, and not surprising, that basis weight BW (AKYT2) plays a major role for the variations in thickness H (ATJO2), and that coating BWc (APAL) affects the bending stiffness BSI (KSTY8). Hence, the next step is to associate the input variables BWc and BW with the data. Select Edit and Insert, enter BWInput, and select Change. Connecting input variables to data is done in a standard way; the only freedom is that of selecting the interpolation routine. The purpose of the component statements is to create two continuous-time variables to receive the results of library routines interpolating between the discrete-time data.

Entering Function Statements
M-statements:

BW = BWin
BWc = BWcin
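The Hold interface used below presumably performs zero-order-hold interpolation: between samples, the input keeps its last recorded value. A minimal stand-alone sketch of that behaviour (plain Python, not MoCaVa code, with hypothetical sample values):

```python
import bisect

def hold(times, values, t):
    """Zero-order hold: return the last recorded value at or before t."""
    i = bisect.bisect_right(times, t) - 1
    return values[max(i, 0)]

# Sampled basis-weight record (hypothetical numbers, sampling time 0.2 h)
times = [0.0, 0.2, 0.4, 0.6]
bw = [0.270, 0.273, 0.273, 0.268]

print(hold(times, bw, 0.25))  # 0.273: value held from t = 0.2
print(hold(times, bw, 0.60))  # 0.268
```

Higher-order interpolation routines would smooth the steps, which is why the text later considers alternative data interface models for the noisy feed data.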
Argument Classification
The variables have the same classification. Edit the Argument classification window:

Argument      Class

Component input
BWin          Control
BWcin         Control
I/O Interfaces
Try first the simplest interpolation, the Hold. Edit the I/O interface window:

Argument      Source

Control input: Source model
BWin          Hold
BWcin         Hold
Argument Attributes
No nominal values are required, since the variables have been connected to data. Edit the Argument attributes window:

Argument      Short description       Dim   Scale     Nominal   Min   Max

Control input
BWin          BasisWeight [kg/m2]     1     ScaleBW
BWcin         Coating [kg/m2]         1     ScaleBW
Assigning Data
Again, the names of the associated variables in the data file and the conversion factors are listed in Table 7.2. Edit the Data assignment window:

Argument      Data      Conversion
BWin          AKYT2     1000
BWcin         APAL      1000
This concludes the definition of the BWInput component. Any effects of the other control inputs SpeedPop, SpeedVir, PressHotNip, PressShoe, although likely, are less obvious to the eye. In particular, it may be difficult to distinguish between the effects of PressHotNip and BWc, since the corresponding input data SLT and APAL change their levels almost simultaneously and hence are strongly correlated in normal production. This suggests that the building of a 'root' model class stops here, since the latter should include only such parts that are obviously needed to describe the variations in the data. The correct way to proceed would therefore be to go to Step 4, "Calibration", find the best model within the 'root' model class, and then continue by testing whether the class is sufficient. In principle, any hypothetical relation may be tested, but this requires first expanding the model class, which means looping back to Step 3, "Modelling". Thus, an expansion will have to be done sooner or later to test at least the most likely shortcomings of the model so far, and it saves time to do some expansion at this stage. The obvious thing to do is to append models of the two remaining control inputs, the speed and pressure settings, which may or may not have significant effects on the model's response. Notice that appending them will not change the model's response, since the nominal values of the gain parameters CBhn, CBs, CEhn, Cdrag are zero. Hence, any test result will not be affected by the expansion of the root model. Select Edit and Insert, enter SpeedInput, and select Change.... Select Edit and Insert, enter PressureInput, and select Change.... Click Simulate. MoCaVa shows the responses in Figure 7.11. The result confirms that the nominal values of nomQ, nomB and nomE computed from steady-state balances are reasonable. The variations in the model output are mainly the effects of varying basis weight and coating.
Apparently the basis weight variations explain most of the variations in thickness, but not all. The variations in coating fail to explain those in bending stiffness, although apparently correlated. However, fitting parameters may improve the model. Again the question arises whether it is time to stop modelling and turn to fitting and testing. And again the argument for considering reasonable expansions to the root model is that they will be needed for testing the hypothesis that the root model class is sufficient. With the current model class as the 'root' (or 'null hypothesis' in the terminology of statistical testing) one or more 'alternative hypotheses' must be defined.

Figure 7.11. Results of simulating first model

Remark 7.15. The issue of when to stop modelling and go to fitting and testing is one of practical convenience. It does not matter to the test results whether the expanding components are created before or after fitting and testing the current model class, as long as components are appended with parameter values that make them 'null and void'. On one hand it pays to do it before, as long as one suspects a priori that the current model class will not be adequate. On the other hand it would mean unnecessary work to create expansions to a component before even the unexpanded component has been tested significant. Again, the decision depends on prior knowledge, this time on an appraisal of the uncertainty of the hypotheses a component is built on. Thus, the components created so far define what is needed for the model class to make sense and
to include effects that are obvious a priori to be significant. The following components are expansions of the 'root' model class and are needed to allow the testing of such hypotheses that are most obviously suspected a priori. The alternatives that suggest themselves follow from the construction of the cardboard machine: The idea of having three layers of board is to increase the bending stiffness by having a bulky middle layer and strong surface layers. It is also the idea that changing the mix of pulp ingredients is a way to control the bending stiffness. Since the nominal values in the present model are constant, they cannot describe the latter effect, and a model allowing Bmx and Emx to change with the variation in pulp feed is needed.

7.5.5 The Pulp Mixing
The mixing of pulp for the three layers takes place in three mixing tanks, which have two, three, and one pulp constituents as input, in addition to sufficient water to keep the volumes constant. As with the machine chests, it is unclear a priori whether the dynamics of the mixing tanks will have significant influence on the pulp properties. The volumes of the mixing tanks are five times those of the machine chests, which means that the time constants are about (0.2, 0.1, 0.3) h. Since this is in the vicinity of the sampling interval 0.2 h, it would be no surprise if the dynamics turned out not to be negligible. However, in order to find that out, model the mixing in a way similar to that of the machine chests, i.e., as static mixing plus a 'stub' for absorbing possibly significant transients from the mixing tanks. Since bulk and tensile strength are both additive properties, the following property-conserving relations hold:

Q_mx,i = Σ_{j∈J_i} Q_j    (7.14)
B_mx,i = Σ_{j∈J_i} B_j Q_j / Q_mx,i + B_mt,i    (7.15)
E_mx,i = Σ_{j∈J_i} E_j Q_j / Q_mx,i + E_mt,i    (7.16)

where i = 1,2,3, J_1 = {1,2}, J_2 = {3,4,5}, J_3 = {6}, and B_mt, E_mt are the 'stubs'. Select Edit, select Insert, type PulpMixing, select Change, and enter the M-statements:

% PULP FLOW IN LAYERS
Qmx(1) = Q(1) + Q(2)
Qmx(2) = Q(3) + Q(4) + Q(5)
Qmx(3) = Q(6)

% BULK IN LAYERS
Bmx(1) = (B(1) * Q(1) + B(2) * Q(2))/Qmx(1) + Bmt(1)
Bmx(2) = (B(3) * Q(3) + B(4) * Q(4) + B(5) * Q(5))/Qmx(2) + Bmt(2)
Bmx(3) = B(6) + Bmt(3)

% STIFFNESS COEFFICIENTS IN LAYERS
Emx(1) = (E(1) * Q(1) + E(2) * Q(2))/Qmx(1) + Emt(1)
Emx(2) = (E(3) * Q(3) + E(4) * Q(4) + E(5) * Q(5))/Qmx(2) + Emt(2)
Emx(3) = E(6) + Emt(3)
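Since bulk and strength are flow-weighted averages over the constituents feeding each layer, the statements above can be checked with a small stand-alone sketch (plain Python, not MoCaVa code). One useful property to verify: with identical constituent properties and zero stubs, the mixture properties equal the common value regardless of the flows.

```python
# Flow-weighted mixing of additive pulp properties, after (7.14)-(7.16).
J = [(0, 1), (2, 3, 4), (5,)]         # constituents feeding layers 1..3

def mix(Q, P, Pmt=(0.0, 0.0, 0.0)):
    """Layer flows Qmx and flow-weighted property Pmx, plus transient stubs."""
    Qmx = [sum(Q[j] for j in Ji) for Ji in J]
    Pmx = [sum(P[j] * Q[j] for j in Ji) / Qmx[i] + Pmt[i]
           for i, Ji in enumerate(J)]
    return Qmx, Pmx

Q = [3350, 4218, 6230, 5680, 6270, 5520]   # kg/h, Table 7.2 averages
B = [0.00165] * 6                           # equal constituent bulks [m3/kg]

Qmx, Bmx = mix(Q, B)
print(Qmx)   # layer flows [7568, 18180, 5520]
print(Bmx)   # equal bulks are preserved by the mixing
```

This is exactly the hypothesis ('common values will do for all constituents') that the PulpConstituents parametrization in Section 7.5.7 is designed to test.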
Argument Classification
Again, the classification is obvious:
: Q is Feed input.
: B and E are Feed input.
: Bmt and Emt are Constant.

Edit the Argument classification window:

Argument      Class

Component input
Q             Feed
B             Feed
Bmt           Constant
E             Feed
Emt           Constant
I/O Interfaces
The only direct relation to data is that Q is measured. However, make a separate component for the relation, in order to facilitate a test of alternative data interface models. Edit the I/O interface window:

Argument      Source

Connections to sensors
Q             NoSensor
B             NoSensor
E             NoSensor

Feed input: Source model
Q             User model
B             User model
E             User model
Argument Attributes
There are six input lines and three output lines, which determines the dimension numbers. Scales and nominal values of the input are the same as for the output and therefore defined implicitly, except that the nominal values of the 'stubs' Bmt and Emt are zero. All implicit values except NomQin have been defined previously. The latter has six elements with different values. Also B and E are six-dimensional arrays. However, since it is possible to use equal nominal values for all elements, it is feasible to use the same three-dimensional array of nominal values for the six-dimensional arrays. When an attribute has insufficient dimensionality MoCaVa uses its last value as the default value for the missing dimensions. Edit the Argument attributes window:

Argument   Short description                 Dim   Scale    Nominal   Min   Max

Feed input
Q          FlowPulpConstituents [kg/h]       6     ScaleQ   NomQin    0
B          BulkPulpConstituents [m3/kg]      6     ScaleB   NomB      0
E          StrengthPulpConstituents [N/m2]   6     ScaleE   NomE      0

Constants
Bmt        TransBulkMixingTanks [m3/kg]      3              0
Emt        TransStrengthMixingTanks [N/m2]   3              0
Nominal Values
The nominal values of Q are obtained from Table 7.2. Enter the following into the Implicit attributes window:

Attribute     Values

NomQin        3350   4218   6230   5680   6270   5520
This concludes the definition of the PulpMixing component. Data will be assigned through the next component.

7.5.6 Pulp Input
The relation of the input flow of constituents Q to the corresponding data is modelled in analogy with the other data input. The only deviation is that one of the flows must be converted from volume to mass flow. Select Edit, select Insert, type PulpInput, select Change, and enter the M-statements:

Q(1) = Qpine
Q(2) = C_birch * Fbirch
Q(3) = Qctmp
Q(4) = QB75
Q(5) = Qreject
Q(6) = QB70
Argument Classification
Edit the Argument classification window:

Argument      Class

Component input
Qpine         Feed
C_birch       Constant
Fbirch        Feed
Qctmp         Feed
QB75          Feed
Qreject       Feed
QB70          Feed
I/O Interfaces
Specification of the data interface:
: The Feed data look noisy and in need of filtering. However, it is still not evident that the effect of the input noise is significant in comparison with other contamination in the data. Try therefore the simplest interpolation first, the Hold.
: Since Hold does not have a parameter to fit to the data, there is no point in assigning a sensor. (The contribution to the overall loss of the prediction error of such a sensor output would depend only on the data, and would not contribute to the calibration process.)

Edit the I/O interface window:

Argument      Source

Connections to sensors
Qpine         NoSensor
Fbirch        NoSensor
Qctmp         NoSensor
QB75          NoSensor
Qreject       NoSensor
QB70          NoSensor

Feed input: Source model
Qpine         Hold
Fbirch        Hold
Qctmp         Hold
QB75          Hold
Qreject       Hold
QB70          Hold
Argument Attributes
Edit the Argument attributes window:

Argument   Short description          Dim   Scale   Nominal   Min   Max

Feed input
Qpine      PulpFeedPine [kg/h]        1     1000
Fbirch     PulpFeedBirch [m3/h]       1     100
Qctmp      PulpFeedCtmp [kg/h]        1     1000
QB75       PulpFeedB75 [kg/h]         1     1000
Qreject    PulpFeedReject [kg/h]      1     1000
QB70       PulpFeedB70 [kg/h]         1     1000

Constants
C_birch    ConsistencyBirch [kg/m3]   1             42
Assigning Data
Again, the names of the associated variables in the data file and the conversion factors are listed in Table 7.2. Edit the Data assignment window:

Argument      Data      Conversion
Qpine         F197      0.001
Fbirch        FFC181    0.2777
Qctmp         F193      0.001
QB75          F194      0.001
Qreject       F192      0.001
QB70          F196      0.001
7.5.7 The Pulp Constituents
The purpose of adding the PulpMixing and PulpInput components is to describe how the six unknown variables Bmx and Emx (bulk and tensile stiffness of the pulp mixture into the PulpFeed component) may depend on the ten input variables, namely the six input flows and the remaining four inputs, refining and the "kappa number". (The latter is a standard measure of pulp properties that is generally considered to have some bearing on paper quality.) However, it is also possible that there is no significant effect of refining and the kappa number. In this case the varying Feed input may still cause significant variation in Bmx and Emx, even for constant B and E, provided the entries differ. Since this is the whole idea of mixing pulp with different properties, it would be reasonable to hypothesize that having different pulp properties would be a more important improvement of the model than making the properties vary with varying refinement. This would suggest proceeding with calibrating the model class expanded so far. However, again there are two reasons for adding one more component to compute B and E:
: The component is needed to hold 'stubs' for a later modelling of the effects of refining and kappa. The introduction of the stubs thus enters the prior knowledge that independence of refining and kappa is an uncertain hypothesis and must be tested.
: Even if the six bulk and strength parameters in B and E differ, the differences may not be large enough to reveal themselves in the data. In order to allow for the possibility that common values will do for all constituents, a new parametrization will be needed.

Select Edit, select Insert, type PulpConstituents, select Change, and enter the M-statements:

% BULK OF CONSTITUENTS
B(1) = AveBulk * CB(1) * SBR(1)
B(2) = AveBulk * CB(2)
B(3) = AveBulk * CB(3)
B(4) = AveBulk * CB(4)
B(5) = AveBulk * CB(5) * SBR(2)
B(6) = AveBulk * CB(6) * SBR(3) * SBK

% STRENGTH OF CONSTITUENTS
E(1) = AveStrength * CE(1) * SER(1)
E(2) = AveStrength * CE(2)
E(3) = AveStrength * CE(3)
E(4) = AveStrength * CE(4)
E(5) = AveStrength * CE(5) * SER(2)
E(6) = AveStrength * CE(6) * SER(3) * SEK

The Importance of Parametrization
The particular parametrization may be interpreted as follows:
: AveBulk and AveStrength are parameters carrying units, which are common to all pulp lines.
: CB, CE are arrays measuring the influences of various pulp ingredients in the different layers in the cardboard. They are all units-free and have nominal values one.
: SBR, SER, SBK, SEK are 'stubs', and thus of no effect until they are possibly replaced by output from other components describing the effects of refining and the kappa number.
: Any of the factors may be used as 'stubs', i.e., locking an arbitrary selection to their nominal values makes it possible to test the hypothesis that any one will not be needed.
: The two common parameters AveBulk and AveStrength increase the number of unknowns to 14, of which at most 12 are identifiable. However, this is not necessarily a problem, since not all of them will be freed at the same time. The point of the particular parametrization is that it makes it possible to test the hypothesis that equal factors will do.

Argument Classification
The stubs SBR, SBK, SER, SEK are Constant. The other arguments are Parameter. Edit the Argument classification window:

Argument      Class

Component input
AveBulk       Parameter
CB            Parameter
SBR           Constant
SBK           Constant
AveStrength   Parameter
CE            Parameter
SER           Constant
SEK           Constant
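The point of the multiplicative parametrization is that locking any factor to one makes it vanish from the model. A small sketch of the bulk statements (plain Python, illustrative only; the fitted value 1.1 is hypothetical) shows that with unit factors and stubs every constituent bulk collapses to the common AveBulk:

```python
def constituent_bulk(AveBulk, CB, SBR, SBK):
    """Multiplicative parametrization of the six constituent bulks."""
    return [AveBulk * CB[0] * SBR[0],
            AveBulk * CB[1],
            AveBulk * CB[2],
            AveBulk * CB[3],
            AveBulk * CB[4] * SBR[1],
            AveBulk * CB[5] * SBR[2] * SBK]

# Null values: unit factors and stubs -> all constituents share AveBulk,
# i.e., the hypothesis of equal bulk properties.
B = constituent_bulk(0.00165, [1.0] * 6, [1.0] * 3, 1.0)
print(B)

# Freeing one CB entry (hypothetical fitted value) breaks the equality
# without touching the units-carrying parameter AveBulk.
B2 = constituent_bulk(0.00165, [1.1, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0] * 3, 1.0)
print(B2[0] > B2[1])  # True
```

This is why locking an arbitrary selection of factors to their nominal values tests the hypothesis that those factors are not needed.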
Argument Attributes
The attributes follow immediately from the parametrization:
: The only units-carrying parameters are AveBulk and AveStrength, which may preferably be given the same implicit attributes as the other bulk and tensile strength arguments.
: Other parameters have unit scales and nominal values.
: All parameters are positive for physical reasons.

Edit the Argument attributes window:

Argument      Short description        Dim   Scale    Nominal   Min   Max

Parameters
AveBulk       AverageBulk [m3/kg]      1     ScaleB   NomB      0
CB            BulkFactors              6     1        1         0
AveStrength   AverageStrength [N/m2]   1     ScaleE   NomE      0
CE            StrengthFactors          6     1        1         0

Constants
SBR           StubBulkRefining         3              1
SBK           StubBulkKappa            1              1
SER           StubStrengthRefining     3              1
SEK           StubStrengthKappa        1              1

This concludes the definition of the PulpConstituents component.
Figure 7.12. Block diagram of the root model class
A block diagram of the model class is shown in Figure 7.12, which is generated automatically from the component specifications. The model class has now been expanded enough to make a meaningful alternative for testing the hypothesis that the tentative (unexpanded) model is good enough. But first, and in order to detect any obvious faults in the added components, do a simulation of the expanded model. Click Simulate. The two windows in Figure 7.13 appear, one stating that the number of signals (= the total number of scalar variables connecting the components) is 49, larger than the default maximum 32. The other provides a possibility to rectify this. Increase the value of MAX_NR_SIGNALS, for instance four times. MoCaVa stores the new maximum sizes (in the ASCII file MoCaVa3\mcvprojects\CardBoard\casedir\status\custom_spec) and prompts you to exit and restart the session to allow new allocation of memory. Click OK, Exit, and Start, select Calibrate, select and open CardBoard, click Resume and Simulate again. Since the parameters in the added components have their 'null' values, they do not contribute, and the result is the same as in Figure 7.11.
7.6 Step 4: Calibration The minimum calibration task is to get the residuals unbiased and their variances estimated. This is achieved by fitting the four parameters rms_BSI, rms_H, AveBulk, AveStrength.
Figure 7.13. Customization window
Select Accept (the model class) and click OK. Check the boxes after the appropriate parameters, and click OK three times to initiate the fitting. Then click Accept (the tentative model with estimated values). This brings the session to the starting point of the calibration loop, where the user must define one or more alternative structures, either by freeing selections of the remaining bounded parameters, or by expanding the model class. As long as there are unknown parameters in the tentative class, they are obvious tools by which to improve the current tentative model by fitting. It is prudent to free only a few more parameters, preferably one, in order to reduce the risk of 'over-fitting'. However, it is generally not obvious which one to fit first. If the user has some means to decide that, or at least to limit the number of candidates, the calibration session will be shorter.

Tentative Model Structure #0
For the first round of tests the most obvious misfit is the response of the tentative model to the hot-nip pressure (Figure 7.14). The bulk parameter array CBhn affects the responses to hot-nip pressure directly. However, it is neither clear that all of the three entries do so significantly, nor which one has the largest effect. MoCaVa3 offers two alternative windows for selecting the parameters to free for alternative structures. The default window is the easier to use. However, its options are limited to those of selecting scalars and whole or truncated arrays. That is not enough for the present purpose, since it would presume that among the three layers in the cardboard the top layer is the most affected, and the bottom layer the least. Instead, it will be necessary to test each combination of individual entries in the two arrays as scalars. Click Advanced and check the box Primitive Alternative Structures Window. Click NewClass and then OK.
An alternative structure may be specified in two ways:
i) By checking the boxes to the right of the parameter (arrays) that are to be freed, and then creating the alternative by checking one of the boxes above the fields displaying the alternatives. This allows any conceivable combination to be set up.
ii) By using the 'expand' macro (<). This implements the SFI (Stepwise Forward Inclusion) rule by setting up as many alternatives as there are entries in the array that have not yet been used, each one freeing one more parameter entry. There is also a 'reduce' macro (>) that implements the SBE (Stepwise Backwards Exclusion) rule.

Proceed by clicking < for the parameter array CBhn, thus creating three alternative structures, each one having five free parameters. Figure 7.15 shows the setup. The check marks indicate the free parameters in the current tentative model structure (fitted to define the current tentative model). Click NewDim and then OK sufficiently many times. The result of the test is shown in Figure 7.16. The tentative model is falsified, and at least two alternatives are better with zero risk. Click Select_#2 (to acknowledge that the alternative with free CBhn(2) is better, and also the best). Then click OK three times to fit the new tentative model.

Tentative Model Structure #1

The obvious way to proceed would be to free each of the remaining parameter entries in CBhn, and possibly also the other parameters in PaperMachine, until either no alternative is better, or else all have been freed and fitted. It is quite straightforward to do this by repeated use of the "<" macro.
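The expand step automated by the "<" macro can be sketched as a generic stepwise forward inclusion loop. The following Python sketch is purely illustrative: fit_loss and the toy parameter gains stand in for MoCaVa's fitting engine and are not part of the tool.

```python
# Hypothetical sketch of one SFI (Stepwise Forward Inclusion) round: free one
# more parameter at a time, refit, and keep the alternative with the largest
# significant loss reduction.

def sfi_step(free, candidates, fit_loss, threshold=4.5):
    """One SFI round over the locked candidate parameters.

    free       -- set of currently free parameter names (tentative structure)
    candidates -- all parameters that may be freed
    fit_loss   -- callable mapping a set of free parameters to a fitted loss
    threshold  -- minimum significant loss reduction (0.5*chi2; 4.5 matches
                  the conventional 3-sigma limit for one extra parameter)
    """
    q_tentative = fit_loss(free)
    best, best_reduction = None, threshold
    for p in sorted(candidates - free):
        reduction = q_tentative - fit_loss(free | {p})
        if reduction > best_reduction:
            best, best_reduction = p, reduction
    return best  # None => the tentative model is no longer falsified

# Toy stand-in for the fitting engine: each useful parameter cuts the loss.
gains = {"CBhn(1)": 6090.0, "CBhn(2)": 7090.0, "CBhn(3)": 854.0}
def toy_loss(free):
    return -44346.6 - sum(gains.get(p, 0.0) for p in free)

print(sfi_step(set(), set(gains), toy_loss))   # CBhn(2): largest reduction
```

When no candidate clears the threshold the function returns None, which corresponds to the tentative model surviving the round.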
Figure 7.14. Response of model #0
However, considering that the idea of having several layers in the cardboard is to utilize the different properties of the pulp, it is clearly important to be able to falsify at an early stage the hypothesis that the bulk and strength properties are not at their nominal values but are still equal across the pulps. Figure 7.17 shows the setup of alternatives this will require: in the last two alternatives the scalar parameters AveStrength and AveBulk are locked to their nominal values and the array parameters CE and CB are freed instead.

Figure 7.15. Three alternative structures

Free parameters according to the setup in Figure 7.17, and click NewDim. MoCaVa issues two warnings: i) two of the alternatives are not nested, which means that the faster of the two tests (ALMP) cannot be used, and ii) no alternative allows a 'fair' test. However, it is still possible to compute the risk of accepting each alternative and base the decision on that. Click Overrule (the recommendation from MoCaVa) and proceed with the LR−test.

Loss reductions and parameter values:

Parameter   Value                             Loss reduction   Dgf
Ec          3.47·10^12                        1.0              1
CBhn(1)     0.284                             6090             1
CBhn(3)     0.301                             854              1
CBs         −0.0625                           541              1
Cdrag       2.46                              994              1
Bc          0.000687                          2.2              1
nu          0.879                             1.1              1
CB          (0.71,0.74,1.64,1.09,1.09,0.91)   7329             5
CE          (1,1,1,1,1,1)                     −546             5
When, as in this case, the alternatives have different numbers of free parameter entries, the loss reductions (= 0.5·chi2) are not the deciding numbers, but instead the risk values, which also take the 'degrees of freedom' into account. Both the loss reductions associated with CBhn(1) and CB are high enough to make the risk lower than the numerical resolution of the routine computing them, and for all practical purposes one could select either of them. Select CBhn(1) for the sake of parsimony (it will also be the one with the mathematically smallest risk).

Remark 7.16. The setup deviates from the main rule of expanding the tentative structure with alternatives having an equal number of parameters, preferably one. However, this is a consequence of wanting to test the hypothesis that a common value for all entries in an array will do. If the main rule (SFI) were followed, i.e., testing whether any selection of parameter entries deviates from a common value, this might result in the conclusion that some entries deviate from a common value that remains valid for the rest. Such a result would seem overly sophisticated and difficult to interpret physically.

Figure 7.16. Test results for three alternatives

Select_#2, and Accept the fitted model.

Tentative Model Structure #2

The next step is a repetition of the previous one.
Figure 7.17. Alternative structures to model #1
Free parameters as before (using the "<" macro, except for the last two alternatives), thus making eight alternatives, and click NewDim.

Loss reductions and parameter values:

Parameter   Value                             Loss reduction   Dgf
Ec          3.68·10^12                        21.0             1
CBhn(3)     −0.071                            88.7             1
CBs         −0.00065                          9.8              1
Cdrag       0.961                             217              1
Bc          0.000719                          21.5             1
nu          0.805                             63.9             1
CB          (0.94,0.99,1.51,0.98,0.95,1.03)   2118             5
CE          (1,1,1,1,1,1)                     −5681            5
The hypothesis of a common bulk property has been falsified. Select_#7, and Accept the fitted model.

Tentative Model Structure #3

Again a repetition, since the freedom of the model class has not been exhausted. Use the "<" macro, except for the last alternative, replacing AveStrength by CE. This makes seven alternatives.

Loss reductions and parameter values:

Parameter   Value                             Loss reduction   Dgf
Ec          3.67·10^12                        16.2             1
CBhn(3)     −0.213                            237              1
CBs         0.0051                            7.1              1
Cdrag       0.650                             108              1
Bc          0.000718                          17.0             1
nu          0.723                             90.5             1
CE          (0.45,1.02,1.08,1.22,0.81,0.90)   217              5
Select_#4, and Accept the fitted model.

A Succession of Tentative Model Structures

Also the sequel follows mainly the SFI rule: free the remaining parameters, one more at a time, and select the alternative that has the smallest acceptable risk (or promises the largest significant loss reduction with zero risk). In order to avoid more tedious repetitions the results are condensed into Table 7.3.

Table 7.3. Fitted and alternative free parameters and their test scores

#    Lab   PaperMachine   PulpConstituents       Q, oQ      it
#0   [01]  [(000)00001]   [1(000000)1(000000)]   −44346.6
     [01]  [(<<<)00001]   [1(000000)1(000000)]       7090   1
#1   [01]  [(010)00001]   [1(000000)1(000000)]   −53223.6
     [<1]  [(<1<)<<<<1]   [0(######)0(######)]       6090   4
#2   [01]  [(110)00001]   [1(000000)0(000000)]   −59313.9
     [<1]  [(11<)<<<<1]   [0(######)0(######)]       2247   4
#3   [01]  [(110)00001]   [0(111111)1(000000)]   −61561.2
     [<1]  [(11<)<<<<1]   [0(111111)0(######)]        108   4
#4   [01]  [(110)01001]   [0(111111)1(000000)]   −61669.6
     [<1]  [(11<)<1<<1]   [0(111111)0(######)]        199   4 a)
#5   [01]  [(110)01001]   [0(111111)0(111111)]   −61869.0
     [<1]  [(11<)<1<<1]   [0(111111)0(111111)]       26.0   1 a)
#6   [01]  [(110)11001]   [0(111111)0(111111)]   −61895.5
     [<1]  [(11<)11<<1]   [0(111111)0(111111)]       26.2   1 a)
#7   [11]  [(110)11001]   [0(111111)0(111111)]   −62226.5
     [11]  [(11<)11<<1]   [0(111111)0(111111)]       19.3   1 a)
#8   [11]  [(110)11101]   [0(111111)0(111111)]   −62267.2
     [11]  [(11<)111<1]   [0(111111)0(111111)]               b)

a) There are alternatives with larger loss reductions, but inadmissible parameter values.
b) There are alternatives with significant loss reductions, but none is admissible.

The table also includes the previous tests. It is interpreted as follows: The first column contains the number of the tentative model structure, i.e., the latest not (yet) falsified. The next columns indicate which parameters have been fitted to data and which retain their nominal values (marked by 1 and 0, respectively) in order to define the tentative model. Parameter indices are grouped within square brackets according to the components they belong to. Individual entries in array parameters are further grouped within parentheses. Lines without a model number specify the alternative structures to compare with the tentative one. A number of parameters marked with < have been freed, one at a time, to specify a number of alternatives. When all components in an array are freed simultaneously this is marked with as many hash signs #. The second last column has different uses, depending on whether it belongs to a tentative or an alternative structure: in the first case it shows the loss value Q, and in the second the loss reduction oQ that falsified the tentative model. The loss value is normally negative, since it measures the loss relative to that of a 'null model'; any model that does better is a gain. When the reduction becomes insignificant (less than 3²/2 = 4.5 for one more freed parameter, equivalent to the conventional 3σ limit), the tentative model is no longer falsified, and another combination of free parameters or active components is tried. The last column shows the number of iterations used to find an alternative model to be compared with the tentative one. Notice that this does not have to be the best model in the alternative structure. An asterisk marks cases where iteration has proceeded until the best has been found. Occasionally, the last column is also used for marking comments. QuitSearch.
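The loss-reduction threshold and the risk values can be reproduced outside MoCaVa. The sketch below is illustrative only; chi2_sf is a pure-stdlib survival function for the chi-square distribution with integer degrees of freedom, and the numbers are taken from the tests above.

```python
import math

def chi2_sf(x, k):
    """P(X >= x) for a chi-square variable with k (integer) dof,
    via the closed forms for even and odd k."""
    if k % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, k // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # odd k: recurrence SF(k+2) = SF(k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2+1)
    sf = math.erfc(math.sqrt(x / 2))                        # k = 1
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)    # k=1 -> k=3 step
    j = 1
    while j + 2 <= k:
        sf += term
        term *= x / (j + 2)
        j += 2
    return sf

# Risk = P(chi2_Dgf >= 2 * loss reduction). For CBhn(1) (Dgf 1, oQ 6090) and
# CB (Dgf 5, oQ 7329) the risks underflow to zero, as the text notes.
for name, oq, dgf in [("CBhn(1)", 6090.0, 1), ("CB", 7329.0, 5)]:
    print(name, chi2_sf(2 * oq, dgf))

# The falsification threshold for one extra freed parameter: a loss reduction
# of 3**2 / 2 = 4.5 corresponds to the conventional 3-sigma risk level.
print(round(chi2_sf(9.0, 1), 4))   # ~0.0027, the two-sided 3-sigma probability
```

This is why a loss reduction has to be judged together with its degrees of freedom: 2118 with Dgf 5 and 217 with Dgf 1 are compared through their risks, not their raw sizes.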
7.7 Expanding the Tentative Model Class

The current model class (Figure 7.12) has been rejected. An obvious possible shortcoming is that the effects of pulp refining have not been considered. Try amending this first, in particular since it will involve more evidence in the form of recorded data.

7.7.1 The Pulp Refining

It is unknown a priori how the grinding of the fibers affects their bulk and strength, or what the relation is to the refining−dependent mixed quality measure called the "kappa number". Heuristic relations will therefore have to do. Introduce scale−free parameters in the same way as when modelling the effects of pressing. Click NewClass, select Edit, select Insert, type PulpRefining, select Change, and enter the M−statements

% BULK FACTORS
SBR(1) = 1 + CBR(1) * (SREpine/nomSREpine − 1)
SBR(2) = 1 + CBR(2) * (SREreject/nomSREreject − 1)
SBR(3) = 1 + CBR(3) * (SREB70/nomSREB70 − 1)
SBK = 1 + CBK * (KapB70/nomKapB70 − 1)
% STIFFNESS FACTORS
SER(1) = 1 + CER(1) * (SREpine/nomSREpine − 1)
SER(2) = 1 + CER(2) * (SREreject/nomSREreject − 1)
SER(3) = 1 + CER(3) * (SREB70/nomSREB70 − 1)
SEK = 1 + CEK * (KapB70/nomKapB70 − 1)
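All of the heuristic relations above share one scale-free form: a factor that equals 1 when the input sits at its nominal value, with a dimensionless sensitivity parameter C. A minimal Python sketch (function name and values are illustrative, not MoCaVa identifiers):

```python
def scale_free_factor(u, nominal, c):
    """1 + C*(u/nominal - 1): unity at the nominal operating point,
    so a zero C (the nominal parameter value) leaves the model unchanged."""
    return 1.0 + c * (u / nominal - 1.0)

# At the nominal specific refining energy the bulk factor is exactly 1 ...
print(scale_free_factor(204.0, 204.0, c=0.3))        # 1.0
# ... while 10% extra refining energy changes it by C * 10%.
print(round(scale_free_factor(224.4, 204.0, c=0.3), 4))   # 1.03
```

The point of the form is that C is directly testable: locking C at its nominal value 0 removes the effect entirely, which is exactly what the falsification tests below exploit.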
The argument classifications are obvious, except that of KapB70. The latter is associated with the data QKAP01, which are laboratory measurements taken at irregular intervals. Logically, it is an output from the refining process, which would require a model describing its relation to SREB70 (Specific Refining Energy of the B70 pulp). However, in order to avoid for the moment the problem of defining such a model, regard KapB70 as a parameter in the PulpRefining component. It will then be possible to improve the component by associating the parameter with the data, either as the output of another component (by assigning a sensor), or as an input (by assigning an input filter), whichever alternative turns out to describe the responses best. That probably needs to be done sooner or later, but the point of doing it later is that the PulpRefining component will not have to be changed. In the graph of the model class the classification will visualize that KapB70 is a different kind of argument than the specific refining energies SRE. Edit the Argument classification window:

Argument        Class
Component input
CBR             Parameter
SREpine         Control
nomSREpine      Constant
SREreject       Control
nomSREreject    Constant
SREB70          Control
nomSREB70       Constant
CBK             Parameter
KapB70          Parameter
nomKapB70       Constant
CER             Parameter
CEK             Parameter

Place the interfaces to data in a separate component. Edit the I/O interface window:

Argument        Source
Control input: Source model
SREpine         User model
SREreject       User model
SREB70          User model

Again, those argument attributes that are not obvious from the context may be based on data averages. Notice that the effects of refining (CBR, CER) and kappa (CBK, CEK) are not necessarily positive. Edit the Argument attributes window:

Argument      Short description                Dim   Scale   Nominal   Min   Max
Parameters
CBR           BulkRefiningFactors              3     1       0
CBK           BulkKappaFactor                  1     1       0
KapB70        KappaB70                         1     10      64
CER           StrengthRefiningFactors          3     1       0
CEK           StrengthKappaFactor              1     1       0
Control input
SREpine       SpecRefEnergyPine_[Wh/kg]              100     204
SREreject     SpecRefEnergyReject_[Wh/kg]            100     105
SREB70        SpecRefEnergyB70_[Wh/kg]               100     179
Constants
nomSREpine    nomSpecRefEnergyPine_[Wh/kg]                   204
nomSREreject  nomSpecRefEnergyReject_[Wh/kg]                 105
nomSREB70     nomSpecRefEnergyB70_[Wh/kg]                    179
nomKapB70     nomKappaB70                                    64
This concludes the definition of the PulpRefining component. Make another component to connect the specific refining energies to data, in the same manner as in PulpInput. Select Edit, select Insert, type RefiningInput, select Change, and enter the M−statements

% SPECIFIC REFINING ENERGIES
SREpine = SRE1
SREreject = SRE5
SREB70 = SRE6

Edit the Argument classification window:

Argument        Class
Component input
SRE1            Control
SRE5            Control
SRE6            Control

Edit the I/O interface window:

Argument        Source
Control input: Source model
SRE1            Hold
SRE5            Hold
SRE6            Hold

Edit the Argument attributes window:

Argument   Short description              Dim   Scale   Nominal   Min   Max
Control input
SRE1       SpecRefEnergyPine_[Wh/kg]      1     100
SRE5       SpecRefEnergyReject_[Wh/kg]    1     100
SRE6       SpecRefEnergyB70_[Wh/kg]       1     100

Edit the Data assignment window:

Argument   Data     Conversion
SRE1       EQ900    1
SRE5       EQ903    1
SRE6       EQ904    1
The KapB70 input argument to the pulp refining model is still a constant parameter and also unidentifiable. So far, it serves the purpose of a 'stub', indicating that the hypothesis of a constant kappa is uncertain. In order to test the hypothesis, it must be connected to a source making it vary with time. The simplest alternative is to connect it to data as input. Select Edit, select Insert, type Kappa, select Change, and enter the M−statement

% KAPPA NUMBER FOR B70
KapB70 = Kappa6

The classification of Kappa6 affects only the menu of interface routines. Classify it as Feed, since this also allows input filters to be used. Edit the Argument classification window:

Argument          Class
Component input
Kappa6            Feed

Try assigning a Hold interpolation first, since it is simplest and the effect on the bending stiffness might be small. Edit the I/O interface window:

Argument          Source
Connections to sensor
Kappa6            NoSensor
Control input: Source model
Kappa6            Hold

Remark 7.17. The alternative is dubious from a physical point of view. Since MoCaVa3 interpolates a sparsely sampled variable linearly between the sampling points, the actual input becomes a saw−tooth−like discrete−time signal. Assigning a Hold interpolation on top of that makes a sequence of staircase signals going up and down. If the original data are also noisy, the two conversion operations may change the characteristics of the converted input significantly. The problem will remain to some extent if one tries to filter the input: the signal that is filtered will still be the saw−tooth signal. A physically more satisfying alternative would be to classify Kappa6 as an output, and assign a sensor. This would avoid the linear interpolation and treat the variable as sparsely sampled, with measurement error (see Section 2.3.5 in Part I).

Edit the Argument attributes window:

Argument   Short description   Dim   Scale   Nominal   Min   Max
Feed input
Kappa6     KappaB70            1     10

Edit the Data assignment window:

Argument   Data     Conversion
Kappa6     QKAP01   1
The new model class is shown in Figure 7.18. The box representing the rudimentary Kappa model appears inside the PulpRefining model to indicate that it is a part of the refining process, which in turn is part of the process preparing the pulp constituents. The refining inputs, however, are true inputs that control the refiners, even though the Hold interpolator is applicable only if the actual set points agree with the data (another questionable hypothesis). The current tentative model structure is #8. The expanded model class has four new parameters. Two of them are arrays, and unlike the case with the parameters representing the bulk and strength profiles, it is not unlikely that some of their entries are insignificant enough to make zero values acceptable. Hence, it will be necessary to test each combination of individual entries in the arrays as scalars. Proceed by clicking OK, and then < for each one of the new parameters.
Figure 7.18. Block diagram of tentative model class
Table 7.4 shows the result.
Table 7.4. Fitted and alternative free parameters and their test scores

#     Lab   PaperMachine   PulpConstituents       PulpRefining     Q, oQ      it
#8    [11]  [(110)11101]   [0(111111)0(111111)]                    −62267.2
      [11]  [(110)11101]   [0(111111)0(111111)]   [(<<<)<(<<<)<]        136   1
#9    [11]  [(110)11101]   [0(111111)0(111111)]   [(000)0(001)0]   −62410.0
      [11]  [(11<)111<1]   [0(111111)0(111111)]   [(<<<)<(<<1)<]       48.8   1
#10   [11]  [(110)11101]   [0(111111)0(111111)]   [(000)0(001)1]   −62461.3
      [11]  [(11<)111<1]   [0(111111)0(111111)]   [(<<<)<(<<1)1]       38.5   1
#11   [11]  [(110)11101]   [0(111111)0(111111)]   [(010)0(001)1]   −62500.0
      [11]  [(11<)111<1]   [0(111111)0(111111)]   [(<1<)<(<<1)1]       30.9   1
#12   [11]  [(110)11101]   [0(111111)0(111111)]   [(010)1(001)1]   −62537.6
      [11]  [(11<)111<1]   [0(111111)0(111111)]   [(<1<)1(<<1)1]              *a)
#13   [11]  [(110)01101]   [0(111111)0(111111)]   [(010)1(001)1]   −62537.6
      [11]  [(11<)011<1]   [0(111111)0(111111)]   [(<1<)1(<<1)1]        292   1
#14   [11]  [(110)01101]   [0(111111)0(111111)]   [(110)1(001)1]   −62836.1
      [11]  [(11<)<11<1]   [0(111111)0(111111)]   [(11<)1(<<1)1]       36.4   1
#15   [11]  [(110)01101]   [0(111111)0(111111)]   [(110)1(101)1]   −62873.6
      [11]  [(11<)<11<1]   [0(111111)0(111111)]   [(11<)1(1<1)1]       14.8   1
#16   [11]  [(110)01101]   [0(111111)0(111111)]   [(111)1(101)1]   −62888.5
      [11]  [(11<)<11<1]   [0(111111)0(111111)]   [(111)1(1<1)1]        0.9   *
a) The value of Bs has turned negative in all falsifying alternatives. The earlier decision to free it has thus proved to be wrong, and it has been locked again to its zero nominal value. The increase in loss is not noticeable.
In the latest falsifying (and rejected) model the value of the parameter array CBhn is (0.451, 0.063, −0.016), measuring the effect of hot−nip pressure on ply density. Although that would be expected a priori to be positive, a negative value is not inconceivable, in particular since the positive entries dominate. However, it seems better to postpone the decision until other alternatives have been tried. One may suspect that the relatively small negative value has still turned out statistically significant as a consequence of the very large amount of measured data for the model to satisfy, in particular since the model structure is deterministic. This means that the structure allows only independent Gaussian measurement errors to describe all deviations between model and data. Taking that into account, it would seem that surprisingly many refining parameters have turned out significant, which is suspicious enough to motivate further testing. There are now only less obvious expansions to explore. Some prior analysis indicated that the turnover times in the mixing tanks may be significant, and possibly also those of the machine chests, although the latter would be less likely.

7.7.2 The Mixing−tank Dynamics

Mass balances yield the following differential equations for the three mixing tanks (assuming that the bulk and tensile−stiffness properties mix with the pulp):

Q_i^{mx} = \sum_{j \in J_i} Q_j                                      (7.17)
B_i^{in} = \sum_{j \in J_i} B_j Q_j / Q_i^{mx}                       (7.18)
E_i^{in} = \sum_{j \in J_i} E_j Q_j / Q_i^{mx}                       (7.19)
dB_i^{xmt}/dt = Q_i^{mx} (B_i^{in} − B_i^{xmt}) / (C_p V_i^{mt})     (7.20)
dE_i^{xmt}/dt = Q_i^{mx} (E_i^{in} − E_i^{xmt}) / (C_p V_i^{mt})     (7.21)
B_i^{mt} = B_i^{xmt} − B_i^{in}                                      (7.22)
E_i^{mt} = E_i^{xmt} − E_i^{in}                                      (7.23)

where i = 1,2,3, J_1 = {1,2}, J_2 = {3,4,5}, J_3 = {6}, B^{xmt} and E^{xmt} are state vectors, B^{mt} and E^{mt} are the transients, V^{mt} are the volumes of the mixing tanks, and C_p is the pulp density. The transients are proportional to the derivatives. The equations hold even for varying volumes. Click NewClass, select Edit, select Insert on the line of PulpMixing, type MixingTanks, and enter the following statements into the first Component function window:

% MIXING TANK DYNAMICS
Qin = Q(1) + Q(2)
Bin = (Q(1) * B(1) + Q(2) * B(2))/Qin
Ein = (Q(1) * E(1) + Q(2) * E(2))/Qin
DBxmt(1) = Qin * (Bin − Bxmt(1))/(Cpulp * Vta(1))
DExmt(1) = Qin * (Ein − Exmt(1))/(Cpulp * Vta(1))
Bmt(1) = Bxmt(1) − Bin
Emt(1) = Exmt(1) − Ein
Qin = Q(3) + Q(4) + Q(5)
Bin = (Q(3) * B(3) + Q(4) * B(4) + Q(5) * B(5))/Qin
Ein = (Q(3) * E(3) + Q(4) * E(4) + Q(5) * E(5))/Qin
DBxmt(2) = Qin * (Bin − Bxmt(2))/(Cpulp * Vta(2))
DExmt(2) = Qin * (Ein − Exmt(2))/(Cpulp * Vta(2))
Bmt(2) = Bxmt(2) − Bin
Emt(2) = Exmt(2) − Ein
DBxmt(3) = Q(6) * (B(6) − Bxmt(3))/(Cpulp * Vta(3))
DExmt(3) = Q(6) * (E(6) − Exmt(3))/(Cpulp * Vta(3))
Bmt(3) = Bxmt(3) − B(6)
Emt(3) = Exmt(3) − E(6)
Enter the initial conditions into the second window:

for i = 1:3
Bxmt(i) = Bxmt0(i)
Exmt(i) = Exmt0(i)
end
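The behaviour the mixing-tank balance produces can be checked with a small stand-alone simulation. The Python sketch below integrates one tank's bulk balance with forward Euler; all numbers (flow, volume, consistency, step size) are illustrative, not plant data.

```python
def simulate_tank(b_feed, q_in, volume, c_pulp, b0, dt, steps):
    """Forward-Euler integration of dB/dt = q_in*(b_feed - B)/(c_pulp*volume).
    Returns the trajectory of the tank state B."""
    b, traj = b0, []
    for _ in range(steps):
        b += dt * q_in * (b_feed - b) / (c_pulp * volume)
        traj.append(b)
    return traj

# A step in the feed bulk relaxes towards the feed value with time constant
# c_pulp * volume / q_in (here 42*100/5 = 840 time units).
traj = simulate_tank(b_feed=2.0, q_in=5.0, volume=100.0, c_pulp=42.0,
                     b0=1.0, dt=1.0, steps=20000)
print(round(traj[-1], 3))   # approaches the feed value 2.0
```

This also makes the text's later remark plausible: with time constants of this size, start transients decay long before a loss evaluated over some 10000 samples is dominated by them.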
Edit the Argument classification window:

Argument          Class
Component output
Qin               Internal
Bin               Internal
Ein               Internal
Component input
Cpulp             Constant
Vta               Constant
Initialization input
Bxmt0             Parameter
Exmt0             Parameter

The state variables Bxmt and Exmt are not measured. Edit the I/O interface window:

Argument          Source
Connections to sensors
Bxmt              NoSensor
Exmt              NoSensor

Most attributes are implicit and have been defined before. The exceptions are Cpulp and Vta, which are known constants. Edit the Argument attributes window:

Argument   Short description                   Dim   Scale    Nominal   Min   Max
Parameters
Bxmt0      InitialBulkMixingTank_[m3/kg]       3     ScaleB   NomB      0
Exmt0      InitialStrengthMixingTank_[N/m2]    3     ScaleE   NomE      0
States
Bxmt       BulkMixingTanks_[m3/kg]             3     ScaleB
Exmt       StrengthMixingTanks_[N/m2]          3     ScaleE
Constants
Cpulp      PulpConsistency_[kg/m3]             1              42
Vta        MixingTankVolumes_[m3]              3              Vta

Enter into the Implicit attributes window:

Attribute   Values
Vta         50   100   50
This concludes the definition of the MixingTanks component. First test whether the dynamics improve the model with nominal values of the initial states, and continue by freeing the initial states. Leave the Alternative structures window empty to indicate no more free parameters. Click NewDim.

Increasing the Speed of Processing

Introducing six states makes the processing considerably slower, in particular since the data sample is quite large and the free parameters many. It is time to use some of the options for speeding up the processing. The most time−reducing option is limiting the number of accesses to the model equations (see Section 2.5.1). This unavoidably causes approximation errors, which have to be controlled or compensated for. Now, approximations affect only the search for parameter values, and not the evaluation of the loss value associated with given parameter values. Hence, approximations are theoretically allowed when an alternative model is computed (since any parameter values that falsify the tentative model will do), but they are not allowed in a search for the next tentative model (since it has to be the best in the tentative structure, and therefore exact). A search for alternatives may take a long time if there are many alternatives, even if the number of steps may be few. This motivates the speed optimization option in this case. A search for a tentative model may take a long time because it has to converge. This motivates speed optimization also in this case, but the error in the optimum has to be compensated. This is done by running an extra search with the speed optimization disconnected. The latter should take few iterations, since the approximate optimum can be expected to be close to the exact optimum (unless the approximation level is set much too high). Click Advanced, check all options under Optimization for Speed, and set Max ratio of model accesses to 0.1. When the next (approximate) tentative model has been found click Confirm to initiate the second, correct search. Finally click Accept to end the current round in the testing procedure. Table 7.5 shows the result of testing the need for a dynamic model.

Table 7.5. Fitted and alternative free parameters and their test scores

#     Lab   PaperMachine   Tank   PulpConstituents       PulpRefining     Q, oQ      it
#16   [11]  [(110)01101]          [0(111111)0(111111)]   [(111)1(101)1]   −62888.5
      [11]  [(110)01101]   [00]   [0(111111)0(111111)]   [(111)1(101)1]       39.0   *
#17   [11]  [(110)01101]   [00]   [0(111111)0(111111)]   [(111)1(101)1]   −62927.5
      [11]  [(110)01101]   [##]   [0(111111)0(111111)]   [(111)1(101)1]        0.9   *
      [11]  [(11<)<11<1]   [00]   [0(111111)0(111111)]   [(111)1(1<1)1]        0.1   *
The result shows that mixing tank dynamics are significant (tentative model structure #17). However, estimating the start values Bxmt0 and Exmt0 has little effect. That agrees with expectations: start transients that last only a few sampling intervals should not have a noticeable impact on a fitting loss evaluated over some 10000 sampling points.

7.7.3 The Machine Chests

Proceed by expanding the model with the dynamics of the machine chests. The model is similar to that of the third mixing tank, with a single feed line. Click NewClass, select Edit, select Insert on the line of PulpFeed, type MachineChests, and enter the following statements into the first Component function window:

% DYNAMIC MASS BALANCES
for i = 1:3
rate = Filler * Qmx(i)/(Cpulp * Vmc(i))
DBxmc(i) = (Bmx(i) − Bxmc(i)) * rate
Bmc(i) = Bxmc(i) − Bmx(i)
DExmc(i) = (Emx(i) − Exmc(i)) * rate
Emc(i) = Exmc(i) − Emx(i)
end

Enter the initial conditions into the second window:

for i = 1:3
Bxmc(i) = Bxmc0(i)
Exmc(i) = Exmc0(i)
end
The classification of the variables is obvious, except that of Filler. It is a factor that takes account of the increase in mass flow caused by substances added to the pulp. However, the form of the relations indicates that the fillers affect only the time constants, which are expected anyhow to be barely significant, if at all. It should therefore be sufficient to classify Filler as Constant. Edit the Argument classification window:

Argument            Class
Component output
rate                Internal
Component input
Filler              Constant
Cpulp               Constant
Vmc                 Constant
Initialization input
Bxmc0               Parameter
Exmc0               Parameter

Edit the I/O interface window:

Argument            Source
Connections to sensors
Bxmc                NoSensor
Exmc                NoSensor

The value of Filler can be computed as the ratio of the flows of output cardboard and input pulp: (9081+21946+6811)/(3350+4218+6230+5680+6270+5520) = 1.2. Edit the Argument attributes window:

Argument   Short description                    Dim   Scale    Nominal   Min   Max
Parameters
Bxmc0      InitialBulkMachineChest_[m3/kg]      3     ScaleB   NomB      0
Exmc0      InitialStrengthMachineChest_[N/m2]   3     ScaleE   NomE      0
States
Bxmc       BulkStateMachineChest_[m3/kg]        3     ScaleB
Exmc       StrengthStateMachineChest_[N/m2]     3     ScaleE
Constants
Filler     FillerFactor                         1              1.2
Cpulp      PulpConsistency_[kg/m3]              1              42
Vmc        MachineChestVolumes_[m3]             3              Vmc

Enter into the Implicit attributes window:

Attribute   Values
Vmc         10   10   10
Test again whether the dynamics improve the model with nominal initial state values. Proceed as before.

Table 7.6. Fitted and alternative free parameters and their test scores

#     Lab   PaperMachine   Tank        PulpConstituents       PulpRefining     Q, oQ      it
#17   [11]  [(110)01101]   [00]        [0(111111)0(111111)]   [(111)1(101)1]   −62927.5
      [11]  [(110)01101]   [00] [00]   [0(111111)0(111111)]   [(111)1(101)1]        1.8   *
The machine chest dynamics are not significant. Click QuitSearch and NewClass, and make MachineChests Dormant.

7.7.4 Filtering the "Kappa" Input

Unlike the other input variables, the kappa value is measured in a laboratory and at irregular intervals. It may therefore be worthwhile to see if filtering will improve the model. Select Edit, select Insert, type FilteredKappa, select Change, and enter the M−statement

KapB70 = Kappa6

Edit the Argument classification window:

Argument          Class
Component input
Kappa6            Feed

Edit the I/O interface window:

Argument          Source
Connections to sensor
Kappa6            NoSensor
Feed input: Source model
Kappa6            LPFilter

Edit the Argument attributes window:

Argument    Short description   Dim   Scale   Nominal   Min   Max
Parameters
bw_Kappa6   BandWidth_Kappa6    1     0.5     0.5       0     25
Feed input
Kappa6      KappaB70            1     10

Edit the Data assignment window:

Argument   Data     Conversion
Kappa6     QKAP01   1
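What an input low-pass filter like LPFilter does to a noisy, sparsely sampled signal can be sketched with a first-order discrete filter. The signal, bandwidth, and step values below are illustrative and not taken from the plant data.

```python
import math

def lowpass(samples, bw, dt):
    """First-order low-pass: y[k] = a*y[k-1] + (1-a)*u[k], a = exp(-bw*dt).
    bw is the bandwidth in rad per unit time; dt the sampling interval."""
    a = math.exp(-bw * dt)
    y, out = samples[0], []
    for u in samples:
        y = a * y + (1.0 - a) * u
        out.append(y)
    return out

# A saw-tooth-like sequence (the shape Remark 7.17 warns about) is smoothed:
# the filtered output stays between the extremes, near the signal's mean.
saw = [10, 12, 10, 12, 10, 12, 10, 12] * 4
smooth = lowpass(saw, bw=0.5, dt=1.0)
print(round(smooth[-1], 2))   # between 10 and 12, close to 11
```

Note that this only mitigates the problem raised in Remark 7.17: the filter still operates on the interpolated saw-tooth, not on the underlying sparsely sampled laboratory values.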
Make FilteredKappa Active and Kappa Dormant. Try first the nominal bandwidth of the input filter.

Table 7.7. Fitted and alternative free parameters and their test scores

#     Lab   PaperMachine   Tank   PulpConstituents       PulpRefining     Kap   Q, oQ      it
#17   [11]  [(110)01101]   [00]   [0(111111)0(111111)]   [(111)1(101)1]         −62927.5
      [11]  [(110)01101]   [00]   [0(111111)0(111111)]   [(111)1(101)1]   [0]        9.0   *
#18   [11]  [(110)01101]   [00]   [0(111111)0(111111)]   [(111)1(101)1]   [0]   −62936.5
      [11]  [(110)01101]   [00]   [0(111111)0(111111)]   [(111)1(101)1]   [<]        0.2   *
Hence it helps to filter the Kappa6 measurements using the nominal bandwidth, but not to search for a better prefilter bandwidth. There are no more simple ways to improve the model class based on prior knowledge, and no others will be tried here. The long−range predicting ability of model #18 is shown in Figure 7.19. It explains 94.5% of the variation in the measurements of the stiffness index and 98.0% of those in the thickness of the cardboard. It is worth noticing from the residual sequences that something out of the ordinary happened with the bending stiffness at about 950 hours (which a careful model designer should look into), and also that both residuals contain low frequencies, which indicates that better short−time predicting ability may be obtained by feedback. A crude way to achieve this would be to append a 'black box' using the measured output as 'input', for instance in a linear relation to the model's output. However, another concern at this point (besides how to expand) is the risk of 'over−fitting'. Again, the number of 'significant' parameters is surprisingly large, considering how noisy the data look. One might therefore suspect that some of the parameters have actually helped in fitting the output to spurious effects in the data.
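"Explains 94.5% of the variation" corresponds to the usual explained-variance (R-squared) measure. A minimal sketch with made-up series (the numbers are not the cardboard data):

```python
def explained_variance(measured, predicted):
    """1 - SS_res/SS_tot: the fraction of the variation in the measurements
    accounted for by the predictions."""
    mean = sum(measured) / len(measured)
    ss_tot = sum((y - mean) ** 2 for y in measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    return 1.0 - ss_res / ss_tot

y  = [1.0, 2.0, 3.0, 4.0, 5.0]    # illustrative measurements
yp = [1.1, 1.9, 3.2, 3.9, 5.1]    # illustrative long-range predictions
print(round(100 * explained_variance(y, yp), 1))   # -> 99.2 (percent)
```

Note that a high value says nothing about whiteness of the residuals: the low-frequency content mentioned above can coexist with a large explained variance.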
7.8 Checking for Over−fitting: The SBE Rule

Checking whether more parameters have been freed than necessary, as a result of the expansion procedure, can be done in a straightforward way: lock each one to its nominal value and see whether this causes little increase in loss. It is basically a more tedious operation than expansion, since each search has to continue until convergence, and no approximation is allowed. However, the reward is two kinds of information:
- Parameters causing a large increase in loss need not be tested further.
- Among the parameters with a small increase in loss, the one with the smallest increase need not be tested further.
Since, normally, the first condition holds for the majority of parameters, the initial step of the parameter reduction procedure is the only one that takes extensive time. In difficult cases it may be run overnight.
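One SBE round can be sketched in the same illustrative style as the SFI sketch earlier; again, fit_loss and the toy gains stand in for MoCaVa's fitting engine.

```python
# Hypothetical sketch of one SBE (Stepwise Backwards Exclusion) round: lock
# each free parameter in turn, refit, and lock permanently the parameter whose
# removal costs the least, provided the increase in loss is insignificant.

def sbe_step(free, fit_loss, threshold=4.5):
    """Return the cheapest-to-remove parameter, or None if every free
    parameter is still significant (increase >= threshold)."""
    q_full = fit_loss(free)
    cheapest, smallest_increase = None, threshold
    for p in sorted(free):
        increase = fit_loss(free - {p}) - q_full
        if increase < smallest_increase:
            cheapest, smallest_increase = p, increase
    return cheapest

# Toy stand-in: each parameter's contribution to the loss reduction.
gains = {"Cdrag": 217.0, "nu": 63.9, "CBs": 0.8}
def toy_loss(free):
    return -62000.0 - sum(gains.get(p, 0.0) for p in free)

print(sbe_step(set(gains), toy_loss))   # CBs: locking it costs only 0.8
```

This also shows why the first SBE round is the expensive one: it requires one converged refit per free parameter, after which most parameters are exempt from further testing.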
Figure 7.19. Response of the best deterministic model #18
Table 7.8. Fitted and alternative free parameters and their test scores

#     Lab   PaperMachine   Tank   PulpConstituents       PulpRefining     Kap   Q, oQ      it
#18   [11]  [(110)01101]   [00]   [0(111111)0(111111)]   [(111)1(101)1]   [0]   −62936.5
      [>1]  [(>>0)0>>01]   [00]   [0(111111)0(111111)]   [(>>>)>(>0>)>]   [0]       −4.9   *
It appears that the suspicion of over−fitting has not been confirmed. However, structure #18 has still been falsified, although by alternatives that have no physical meaning. Click Simulate to open the Origin window, and then Export to save the parameter values of the best model so far in the file parameters_18. As before, the sequel hinges on the ability of the user to come up with further alternatives that do have physical meanings. At this point, however, the user has run out of useful and sufficiently simple prior knowledge of the effects of pressure on pulp properties. This raises the question of whether or not the ‘unphysical’ negative values of some pressure parameters could still be accepted for the purpose of the modelling. Taking into account that this is a preliminary study with the purpose of finding out whether a grey−box model would do better than a black−box in predicting bending stiffness, the result suggests the following: Find out how much the predicting may improve by allowing the unphysical values, and use the result to appraise how much more effort to devote to the modelling of pressure effects. Table 7.9. Fitted and alternative free parameters and their test scores #
#     Lab   PaperMachine   PulpRefining     Kap  Q, oQ     it
#18   [11]  [(110)01101]   [(111)1(101)1]   [0]  -62936.5
 alt  [>1]  [(>>0)0>>01]   [(>>>)>(>0>)>]   [0]  -4.9      *
 alt  [11]  [(11<)<11<1]   [(111)1(101)1]   [0]  66.6      1
#19   [11]  [(110)11101]   [(111)1(101)1]   [0]  -63005.1
 alt  [11]  [(11<)111<1]   [(111)1(101)1]   [0]  14.4      1
#20   [11]  [(111)11101]   [(111)1(101)1]   [0]  -63043.6
 alt  [11]  [(111)111<1]   [(111)1(101)1]   [0]  67.9      *
#21   [11]  [(111)11111]   [(111)1(101)1]   [0]  -63111.6
 alt  [11]  [(111)11111]   [(111)1(1<1)1]   [<]  5.6       *
#22   [11]  [(111)11111]   [(111)1(111)1]   [0]  -63117.2
 alt  [11]  [(111)11111]   [(111)1(111)1]   [<]  0.9       *

Tank is [00] and PulpConstituents is [0(111111)0(111111)] in every row.
Click QuitSearch, NewClass, Simulate and OK. The standard deviations of prediction errors are KSTY2: 5.5%, ATJO2: 2.0%. This agrees to the first decimal with the performance of the ‘physical’ model #18, and supports the conclusion to stop the calibration here. In order to restore model #18 click NewDim and Select_#1. Then Uncheck the three parameters CBhn(3), CBs, nu having inadmissible values, and click OK to fit model #18 again. (Alternatively, Import parameters_18.)
7.9 Ending a Calibration Session

The main rule for stopping the calibration process is to proceed as long as there are conceivable alternatives that falsify the current tentative model structure. An obvious weakness of the rule is that the result depends on the designer's ability to come up with reasonable alternatives. The latter need not necessarily be well founded on prior knowledge to be instrumental in falsifying a tentative model structure. A ‘black-box’ extension may be able to achieve this. A ‘white-box’ extension with inadmissible parameter values may do the same, but would then lose the advantage of being based on prior knowledge.

7.9.1 ‘Black-box’ vs ‘White-box’ Extensions

The straightforward way to proceed would be to extend the model class by adding dynamic linear relations (transfer functions) between all input variables and residual sequences, and, starting with low-order dynamics, define as many alternatives as there are parameters in the black boxes. In the actual case that would mean 32 alternatives in a first round, which is not an inconceivable number. However, some information can be obtained more directly from the fact that, if something is to be gained by relating residuals to input, there must be a correlation. The Verify routine basically computes a number of cross-correlations between variables that would be uncorrelated for a correct model. It provides three sets of information:
- Probability measures for the event that each actual correlation is small enough to be due to chance alone.
- A number of estimates of the minimum reduction of residuals that would be achieved by utilizing the correlation to improve on the model by a black-box extension.
- The extreme residual in each sequence, with probability measures that they are not outliers.
Theoretically, this would seem rather conclusive for a stopping rule. In summary: if ‘white-box’ extensions falsify, then proceed. Otherwise, if there is no correlation, then stop. Otherwise, proceed with ‘black-box’ extensions, until there is no correlation.

In practice, it is another matter. The problem with the cardboard modelling is that it seems to be well nigh impossible to obtain cross-correlations that are small enough not to be theoretically significant. The result is partly due to the very large set of data (given enough data it is possible to statistically falsify any given model, simply because the threshold for significant correlation decreases indefinitely with an increasing number of data), but also to the fact that the actual physical process is indeed far too complex for a model that has to be manageable to serve its purpose. So the question of when to stop in practice remains.

Click Verify. The largest input-output correlation is that between coating and thickness residuals: chi2 = 2640, probability = 0, structure error = 1.3%. The ‘structure error’ is approximately the part of the residual variation that could possibly be compensated for by adding black-box relations. Since that part is small, the result offers a way out of the dilemma: it says that although the probability is overwhelming that there is indeed a remaining correlation, exploiting the correlation would only contribute marginally to improving the model.
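The kind of cross-correlation test the Verify routine performs can be sketched as follows. The chi2-type statistic and the 95% threshold used here are textbook choices, not MoCaVa's internals; all names and data are illustrative.

```python
# A minimal residual/input cross-correlation check: under a correct model the
# residuals are uncorrelated with the input, and N times the sum of squared
# cross-correlations is roughly chi2-distributed with max_lag degrees of freedom.

def cross_corr_test(residuals, inputs, max_lag=5):
    n = len(residuals)
    def centred(x):
        m = sum(x) / len(x)
        return [v - m for v in x]
    e, u = centred(residuals), centred(inputs)
    norm = (sum(v * v for v in e) * sum(v * v for v in u)) ** 0.5
    stat = 0.0
    for k in range(max_lag):
        r = sum(e[t] * u[t - k] for t in range(k, n)) / norm
        stat += n * r * r
    return stat

# Residuals that are a delayed copy of the input are maximally correlated,
# so the statistic lands far above the 95% chi2(5) threshold of about 11.07:
u = [(-1.0) ** t for t in range(200)]
e = u[1:] + [1.0]
stat = cross_corr_test(e, u)
```

Note how the statistic scales with the number of data N: even a tiny true correlation eventually exceeds any fixed threshold, which is exactly the dilemma described above for the very large cardboard data set.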
Remark 7.18. Logically, the practical solution precludes the validation procedure, since it assumes implicitly that the prediction error is the most important measure for the purpose of the modelling, and hence that a 1.3% improvement would be a small return on an investment in modelling effort.

The auto-correlations of residuals yield: Bending stiffness: chi2 = 1870, probability = 0, structure error = 21%. Thickness: chi2 = 53800, probability = 0, structure error = 48%. This is far from negligible, and was to be expected already from Figure 7.19; the residuals do not look ‘white’. It also indicates how to improve the model.

7.9.2 Determinism vs Randomness

Compensating for auto-correlation in residuals can be done by ‘black-box’ amendments in a similar way as when compensating for cross-correlation: fit linear relations between residuals and past output, and include them in a model-based controller. A crucial difference is that this introduces relations between present and past output, in addition to past input. This makes the model a one-step predictor. Hence, it can no longer be used for predicting over longer ranges, which is one of the original purposes of the modelling. However, it would still be useful for designing feedback control of bending stiffness and thickness.

An alternative to the ‘black-box’ output feedback is to model random disturbances. This would be intuitively more appealing, since it would make use of some prior information about the points at which the model is most likely to be inadequate. Some parameters assumed to be constant, or insignificant, might not be so. The conclusion is to stop developing the ‘deterministic’ model and proceed with a ‘stochastic’ one.

Remark 7.19. The term “stochastic” means here that the model has unrecorded input as well as recorded, and the ‘stochastics’ (= random variables) are present only in the model components describing the hypothetical sources of the unrecorded input. A better term would therefore be “environment” model, had this term not already been used in the modelling of wider parts of nature than the cardboard machine.

The output of many industrial production processes is undoubtedly affected by phenomena that are not recorded. In essence, one can choose between trying to model them by ‘stochastic’ descriptions, and thus exploit the recorded data to get a basis for designing compensating feedback, or else try to design the feedback without modelling. The latter alternative would not necessarily be more difficult or less well founded on data, since it would be possible to fit unknown parameters in the feedback compensation to the same data. The choice depends on whether or not one has some basis for hypothesizing where in the model the unknown input appears.

A basic problem with the introduction of unknown input is the balance between the ‘deterministic’ and ‘stochastic’ parts of the model. There are two kinds of structural error:
- Too much of the variation that actually depends on known input may be modelled as effects of unknown input. This leads to models that are useful for prediction, but less useful for control, since the effects of control input will be under-rated.
- Too much of the variation that is actually random is modelled as due to variation in known input. This leads to over-parametrization, and poor robustness, since the effects of control input will be over-rated. There is also a risk that the parameters are partially fitted to spurious effects in the data, even if their number is not too high.
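The ‘black-box’ residual amendment can be sketched in miniature: fit a first-order autoregression to the residual sequence and see how much a one-step predictor could shrink it. All names and data are illustrative, not MoCaVa's.

```python
# Fit e_hat[t] = a*e[t-1] to a residual sequence; the achievable one-step
# shrinkage factor sqrt(1 - a^2) plays the role of the 'structure error' idea.
import math

def ar1_gain(residuals):
    n = len(residuals)
    m = sum(residuals) / n
    e = [v - m for v in residuals]
    power = sum(v * v for v in e)
    a = sum(e[t] * e[t - 1] for t in range(1, n)) / power  # lag-1 autocorrelation
    return a, (1.0 - a * a) ** 0.5

# Slowly varying (strongly autocorrelated) residuals leave a lot to gain:
slow = [math.sin(0.05 * t) for t in range(1000)]
a, shrink = ar1_gain(slow)
# a is close to 1, so the one-step residual std could shrink dramatically,
# but only by feeding back past output, which rules out long-range prediction.
```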
7.10 Modelling Disturbances

The constants stubB and stubE now come in handy as places to introduce stochastic disturbances. The idea is that this will replace the effects of all the conceivable but unmodelled physical phenomena that play a rôle in the complex operations of the paper machine. Since those unmodelled effects are present in the data that the model has to be fitted to, the disturbances will conceivably help, by freeing the parameters from the burden of describing part of the unmodelled disturbances in addition to the physical properties they were intended to describe. The prior knowledge about the disturbances is naturally slim. There is some, however:
- The factors are positive. This suggests models of the form exp(v), where v is the output of one of the library models.
- Some information about the character of the disturbances can be obtained from the residuals in Figure 7.19. They appear to be stationary (not drifting), non-periodic, and contain some low frequencies. This suggests the Lowpass library model.
Click NewClass, select Edit, select Insert on the line of PaperMachine, type Disturbance, and enter the following statements in the function window:

% FACTOR DISTURBANCES IN BULK AND TENSILE STRENGTH
stubB = exp(vB)
stubE = exp(vE)
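The construction can be simulated in a few lines: a first-order low-pass filter driven by discrete white noise, exponentiated so that the factor stays positive. The parameter names mirror the text (rms, bw); the numerical values and the discretization are illustrative only, not the Lowpass library model itself.

```python
# stub = exp(v), with v the output of a first-order low-pass filter:
# stationary (not drifting), non-periodic, dominated by low frequencies.
import math
import random

def lowpass_disturbance(n, rms, bw, dt=0.2, seed=1):
    rng = random.Random(seed)
    a = math.exp(-bw * dt)                 # discrete-time low-pass pole
    gain = (1.0 - a * a) ** 0.5 * rms      # scaled so x has stationary std = rms
    x, out = 0.0, []
    for _ in range(n):
        x = a * x + gain * rng.gauss(0.0, 1.0)
        out.append(math.exp(x))            # stubB = exp(vB): positive, near 1
    return out

stub = lowpass_disturbance(n=2000, rms=0.1, bw=0.5)
```

With a small rms the factor stays close to 1, which is why the text recommends trying small values for rms_vB and rms_vE first.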
Edit the Argument classification window:

Argument           Class
Component input
vB                 Disturbance
vE                 Disturbance

Edit the I/O interface window:

Argument           Source
Connections to sensors
vB                 NoSensor
vE                 NoSensor
Unknown input: Environment model
vB                 Lowpass
vE                 Lowpass

Try first small values for rms_vB and rms_vE. Edit the Argument attributes window:

Argument   Short description      Dim   Scale    Nominal   Min   Max
Parameters
rms_vB     Rms_vB                 1     0.1*1    0.1*0.1   0
bw_vB      BandWidth_vB           1     0.5      0.5       0     2.5
rms_vE     Rms_vE                 1     0.1      0.1*0.1   0
bw_vE      BandWidth_vE           1     0.5      0.5       0     2.5
Disturbance input
vB         BulkDisturbance        1     0.1
vE         StiffnessDisturbance   1     0.1
States
x_vB       State_vB               1     1
x_vE       State_vE               1     1

This ends the definition of the Disturbance model. It can now be used as any other component in the model class.
7.11 Calibrating Models with Stochastic Input

Click NewClass and select Active for the Disturbance component. Figure 7.20 shows the new model class. The new parameters are rms_vB, bw_vB, rms_vE, bw_vE. Proceed as before with the expanded model structure as alternative, starting with no more free parameters (an empty Alternative structures window). The result is summarized in Table 7.10.

Table 7.10. Fitted and alternative free parameters and their test scores
#     Lab   PaperMachine         PulpRefining    Kap  Q, oQ        it
#18   [11]  [(110)01101]         [(111)1(101)1]  [0]  -62936.5
 alt  [11]  [(110)01101][0000]   [(111)1(101)1]  [0]  4437         4
#23   [11]  [(110)01101][0000]   [(111)1(101)1]  [0]  -67398.6 a)
 alt  [11]  [(110)01101][<<<<]   [(111)1(101)1]  [0]  2167         1
#24   [11]  [(110)01101][0100]   [(111)1(101)1]  [0]  -69864.8
 alt  [11]  [(110)01101][<1<<]   [(111)1(101)1]  [0]  563          1
#25   [11]  [(110)01101][0101]   [(111)1(101)1]  [0]  -70531.6
 alt  [11]  [(110)01101][<1<1]   [(111)1(101)1]  [0]  462          1
#26   [11]  [(110)01101][1101]   [(111)1(101)1]  [0]  -71155.1
 alt  [11]  [(110)01101][11<1]   [(111)1(101)1]  [0]  61.4         1
#27   [11]  [(110)01101][1111]   [(111)1(101)1]  [0]  -72054.2
 alt  [11]  [(11<)<11<1][1111]   [(111)1(1<1)1]  [<]  149          4
#28   [11]  [(110)01111][1111]   [(111)1(101)1]  [0]  -72890.7 b)
 alt  [11]  [(11<)<1111][1111]   [(111)1(1<1)1]  [<]  * c)
 alt  [>1]  [(>>0)0>>>1][1111]   [(>>>)>(>0>)>]  [0]  -0.6         * d)
#29   [11]  [(110)01111][1111]   [(111)1(100)1]  [0]  -72890.2
 alt  [>1]  [(110)01111][1111]   [(111)>(100)>]  [0]  -1.1         *
#30   [11]  [(110)01111][1111]   [(111)1(100)0]  [0]  -72889.1
 alt  [>1]  [(110)01111][1111]   [(111)>(100)0]  [0]  -0.1         *
#31   [11]  [(110)01111][1111]   [(111)0(100)0]  [0]  -72889.0
 alt  [>1]  [(110)01111][1111]   [(111)0(100)0]  [0]  -1.5         *
#32   [01]  [(110)01111][1111]   [(111)0(100)0]  [0]  -72887.5 e)

Tank is [00] and PulpConstituents is [0(111111)0(111111)] in every row.
Figure 7.20. Block diagram of tentative model class

a) The introduction of the Disturbance components appears to make the loss function more difficult to optimize. (It is not possible, however, to conclude that it will generally be so.) No fitting failed, but some took more than the default maximum of 16 iterations.
b) The estimated value of nu is now admissible.
c) There is a very significant loss reduction of 114, although again for an inadmissible value of CBhn(3).
d) The SBE rule limits the parameters that have to be tested again, this time to three.
e) A final confirmation. Neither the SFI nor the SBE rule changes the selection of significant parameters: all selected parameters, and no other admissible ones, contribute significantly to the loss value.
Figure 7.21 shows the one-step (0.2 h) predicted values of stiffness index and thickness, together with the measured values and the prediction errors. The standard deviations of prediction errors are KSTY2: 4.4%, ATJO2: 1.0%. The prediction errors of thickness look dramatically better when compared with those of the deterministic model in Figure 7.19. The standard deviation is halved, compared with that of the best deterministic model #18. Most of the improvement comes from the feedback from measurements, which one model allows but the other does not. The number of significant parameters not associated with stochastic input or with output measurement errors has been reduced from 24 to 22. This gives some support to the suspicion of an over-parametrized deterministic model; the freedom in some estimated parameters has served to fit the model output to spurious data. Click Simulate to open the Origin window, and then Export to save the parameter values of the best stochastic model so far in the file parameters_32.
Because they couldn’t say “But on the other hand ...”. An obvious way to settle the matter would be to try the alternatives with the independent data sample and compare the long−range prediction errors. However, some evidence can also be obtained from the results of two more tests that do not involve independent data: : Cross−correlate to see to what extent estimated stochastic input depends on known input. If it does so significantly, this is evidence that the deterministic part of the model is under−parametrized. It should have been possible to describe part of ‘disturbances’ by a deterministic model. : The second reason to suspect the deterministic model, that its parameters estimates are more contaminated with errors, suggests that one compare the performances of the models. However, since the first alternative will always perform better (when the same data is used for the comparison), one has also to take that into account, and observe whether the difference is marginal or not. Thus, check if zeroed stochastic input makes a long−range predictor that is almost as good. Click Verify.
Figure 7.21. Response of the best stochastic model #33
The maximum dependence of disturbances on known input occurs for the dependence of the bulk disturbance (vB) on the specific refining energy of the B70 pulp (QB70). It is no larger than 8.2%.
The conclusion is that at most 0.044 x 0.082 = 0.0036 of the total variation in BSI can possibly be gained by improving the deterministic part. It does not seem worth the effort.

Click NewClass, make Disturbance Dormant, and click Simulate. The standard deviations of residuals are 7.9% and 3.1%, to be compared with 5.5% and 2.0% for the first alternative. The difference is not marginal, so the result is not conclusive. It may still be interpreted as indicating either that the better agreement of the first parameter set has been achieved by fitting to spurious values, or else that the poorer outcome of the second set indicates that the disturbances have partly modelled even such physical phenomena as the disturbance-free structure is in fact able to describe. Evidently, a decision requires independent data.

Long-range Predictor Evaluated with Independent Data

Click Suspend and Exit. Restart by Calibrate, CardBoard, and Reset. Open out0010.mcv. Then make Disturbance Dormant, and click Simulate. Click Import and select parameters_18. The standard deviations are 7.5% and 2.9% for the first alternative. Figure 7.22 illustrates the performance of the deterministic model for long-range prediction. Click Reject and Simulate. Click Import and select parameters_32. The standard deviations are 9.6% and 4.2% for the second alternative. Figure 7.23 illustrates the performance of the stochastic model for long-range prediction.

The results indicate that the parameters fitted with the deterministic model #18 are better. The deficiency of the stochastic model for long-range prediction is evident from Figure 7.23; the response to pressure changes is clearly underestimated. This reveals a common weakness of stochastic models: they are generally good at predicting one step ahead when the output changes slowly compared to the sampling interval, and this makes it difficult for deterministic models to compete in likelihood-based inference, which tends to favour the effect of the unknown input.
An attempt at an intuitive explanation is the following: the model output is a response to both recorded and unrecorded (‘deterministic’ and ‘stochastic’) input. Hence, two things that are needed to predict the output are unknown: i) the unrecorded input, and ii) the response to recorded and unrecorded input. The stochastic modelling scheme approaches the problem in four steps: i) building a ‘disturbance’ model, i.e., one that describes the general characteristics of the unrecorded input, ii) building a ‘process’ model, describing the response to both kinds of input, iii) using both models to estimate the unrecorded input from recorded input and output, and iv) simulating the model with the recorded and estimated input. The result is a ‘predictor model’, which is fitted to recorded output.

The ‘disturbance’ model, in essence, determines what variation in the output data may be judged as ‘spurious’, i.e., unlikely to be the response to known input. Both the ‘process’ and the ‘disturbance’ models contain unknown parameters, which the fitting program is more or less free to use to choose the degree of influence of either. However, a disturbance model is obviously more versatile in fitting to all kinds of irregularities in the data, spurious or not. Since the model's response to known input is unable to fit spurious irregularities, while that of random input is good at describing those, and quite able to describe reasonably well also such variation as actually depends on known input, this tends to set a necessary level of influence for the ‘disturbance’ model. If, as in the cardboard case, the data are much contaminated, and large changes in recorded input are infrequent, then the errors due to wrong responses to the
Figure 7.22. Response of deterministic model for long−range prediction
known input changes tend to be outweighed in the predictor loss by the frequent errors due to spurious effects.

Remark 7.20. The disturbance model works like a ‘signal-to-noise ratio’, which determines how much of the measured data one should believe is ‘signal’ and how much is ‘noise’. One can estimate a level, assuming something about the signal.
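The one-step versus long-range distinction can be illustrated with a minimal scalar state-space sketch. All numbers, including the filter gain k, are hypothetical: the one-step predictor feeds each new measurement back, while the long-range predictor simulates the response to known input only.

```python
# x[t+1] = a*x[t] + b*u[t] + w (unrecorded disturbance), y[t] = x[t] + e.
import random

random.seed(0)
a, b, q, r = 0.9, 1.0, 0.3, 0.1                          # dynamics and noise stds
u = [1.0 if (t // 50) % 2 else 0.0 for t in range(400)]  # infrequent input steps

x, ys = 0.0, []
for t in range(400):
    x = a * x + b * u[t] + random.gauss(0.0, q)
    ys.append(x + random.gauss(0.0, r))

k = 0.8                                      # hypothetical steady-state gain
xh, xsim, e1, e2 = 0.0, 0.0, [], []
for t in range(400):
    p1 = a * xh + b * u[t]                   # one-step prediction from past y
    e1.append(ys[t] - p1)
    xh = p1 + k * (ys[t] - p1)               # feedback from the new measurement
    xsim = a * xsim + b * u[t]               # long-range: simulation only
    e2.append(ys[t] - xsim)

rms = lambda e: (sum(v * v for v in e) / len(e)) ** 0.5
rms_onestep, rms_longrange = rms(e1), rms(e2)
```

The one-step errors come out much smaller, solely thanks to the measurement feedback, which mirrors why likelihood-based fitting tends to favour the disturbance model even when its long-range response to known input is poor.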
Figure 7.23. Response of stochastic model for long−range prediction
Short-range Predictor Evaluated with Independent Data

Make Disturbance Active and click Simulate. Click Import and select parameters_32.
Figure 7.24. Response of stochastic model for one−step prediction
The standard deviations are 4.9% and 1.3% for the first alternative. Figure 7.24 illustrates the performance of the stochastic model for one−step prediction. The results are not much inferior to the values 4.4% and 1.3% obtained for the data sample the model was fitted to. Since that sample was originally selected as being
somewhat less contaminated, the result supports the conclusion that the controller based on the stochastic model #32 is reasonably robust. Hence, the conclusion is to prefer the deterministic model for long-range prediction and planning of grade changes, and the stochastic model for short-range prediction and on-line control. This would also have been the naive choice. Notice, however, that in the steel-rinsing case study the naive choice turned out not to be the best!

Remark 7.21. It was reported in (Bohlin, 1994a) that augmenting a stochastic disturbance model with the deterministic one actually made the parameter estimates vary less, when fitted to consecutive segments of the data sample. This supports the explanation that disturbances function as ‘slack’ variables to relieve the parameters.

7.11.2 A Local Minimum

Introducing stochastic elements in the model has several advantages at first sight, the most striking being that it is an easy way to obtain large reductions in the loss function. In fact, if the Disturbance component had been included in the model from the beginning, it is very likely that its four parameters would have been among the first to be fitted. In addition, the strategy used so far (to explore all ‘deterministic’ explanations for the measured responses to stimuli) seems to have first produced an over-parametrized model, which later had to be purged of spurious values at considerable cost of computing time. This suggests a reiteration of the structure identification session, where the characteristic parameters of the disturbances are fitted first, together with the two rms values of the output errors. The strategy may well favour the disturbance model even more, but might also produce a simpler model that might still work for the short-range predictor. The result is summarized in Table 7.11.

Remark 7.22. Model #35 already has a much better loss than the value -62936.5 reached by the best deterministic model #18.
A comparison of the responses of the two models, Figure 7.19 and Figure 7.25, is also very favourable to the latter. Even the prediction errors, 4.7% and 1.3%, are better than the 5.5% and 2.0% for model #18. In spite of this, model #35 is obviously quite useless for control purposes, since it does not use any of the nine control inputs to the pulping process. This is just a reminder of the fact that the prediction error is no more than a lower limit for the control error. It is not possible to do better, but quite possible to do worse, if the model describing the response to control input is faulty.

Remark 7.23. Not until model #40 do the properties of the input to the pulping process play any rôle in the model's response. Being closer to the sensor output, the paper machine characteristics dominate. The actual variations in pulp properties are apparently predicted sufficiently well as random variation in the incoming pulp to the paper machine.

The most noticeable result of a comparison between the final result of this session and the best model #32 in the previous session is that the minimum loss is larger, in spite of the fact that two more parameters have been fitted. The conclusion is that starting with modelling disturbances has led to a local minimum. Hence, it is not irrelevant in which way the model class is expanded. Again, prior knowledge plays a decisive rôle. The ability of disturbance models to inoculate against spurious fitting and over-parametrization motivates the recommendation to model disturbances. And the fact that there is more prior information in a deterministic model than in a stochastic one motivates adding the disturbance models late.
Figure 7.25. Excellent prediction from the useless model #35
Table 7.11. Fitted and alternative free parameters and their test scores

#     Lab   PaperMachine         Q, oQ        it
#33   [01]  [(000)00001]         -43287.9
 alt  [01]  [(000)00001][0000]   2359         4
#34   [01]  [(000)00001][0000]   -45647.0
 alt  [01]  [(000)00001][1111]   12187        1
#35   [01]  [(000)00001][1111]   -68426.5
 alt  [<1]  [(<<<)<<<<1][1111]   2320         1
#36   [01]  [(100)00001][1111]   -70905.4
 alt  [<1]  [(1<<)<<<<1][1111]   136          1
#37   [01]  [(100)00011][1111]   -71128.7
 alt  [<1]  [(1<<)<<<11][1111]   252          1
#38   [01]  [(110)00011][1111]   -71538.0
 alt  [<1]  [(11<)<<<11][1111]   87.5         1
#39   [01]  [(110)01011][1111]   -71626.5
 alt  [<1]  [(11<)<1<11][1111]   48.9         1
#40   [01]  [(110)01011][1111]   -71855.5
 alt  [<1]  [(11<)<1<11][1111]   47.3         1
#41   [01]  [(110)01011][1111]   -71906.9
 alt  [<1]  [(11<)<1<11][1111]   19.5         1
#42   [01]  [(110)01111][1111]   -71927.7
 alt  [<1]  [(11<)<1111][1111]   57.1         4
#43   [11]  [(110)01111][1111]   -71985.4
 alt  [11]  [(11<)<1111][1111]   397          1
 alt  [11]  [(11<)<1111][1111]                4
#44   [11]  [(110)01111][1111]   -72383.6
 alt  [11]  [(11<)<1111][1111]   272          4
#45   [11]  [(110)01111][1111]   -72772.7
 alt  [11]  [(110)01111][1111]   34.1         1
#46   [11]  [(110)01111][1111]   -72807.2
 alt  [11]  [(110)01111][1111]   25.7         1
#47   [11]  [(110)01111][1111]   -72833.7

(The PulpConstituents codes range from [0(000000)0(000000)] in the early models to [0(111111)0(111111)] in the final ones; the PulpRefining codes end at [(010)0(001)0].)
7.12 Conclusions from the Calibration Session

Use model #18 for long-range prediction. It predicts the Bending Stiffness Index and thickness variables BSI and H with error standard deviations 7.5% and 2.9% of the total variations, when evaluated for an independent data set.

Use model #32 for short-range prediction and feedback control. It predicts the Bending Stiffness Index and thickness variables BSI and H one sampling interval ahead with error standard deviations 4.9% and 1.3% of the total variations, when evaluated for an independent data set.

The structure of the PaperMachine model has been falsified by the data. An improvement of the model should therefore focus on improving the submodels describing the paper machine, in particular the effects of pressing. That has been done in the main study (Pettersson et al., 1997; Pettersson, 1998). The values of the estimated parameters are shown in Tables 7.12 and 7.13.
Table 7.12. Parameter estimates for deterministic model #18
Box               Parameter    Physical meaning                     Value                           St.D.
Lab:              Ec           TensileStrengthCoating_[N/m2]        2.28x10^12                      9%
                  rms_BSI      StDError_BSI                         0.349                           3%
PaperMachine:     CBhn         HotNipPress-BulkCoefficients         0.3737  0.0353  0               2%  6%
                  CBs          ShoePress-BulkCoefficient            0
                  Cdrag        DragCoefficient                      0.742                           6%
                  Bc           BulkCoating_[m3/kg]                  0.00204                         6%
                  nu           DensityExponent                      1
                  rms_H        StdError_H                           8.54x10^-6                      1%
PulpFeed: BWInput: SpeedInput: PressureInput: PulpMixing:
MixingTanks:      Bxmt0        InitialBulkMixingTanks_[m3/kg]       0.00165
                  Exmt0        InitialStrengthMixingTanks_[N/m2]    3.41x10^12
PulpInput: PulpConstituents:
                  AveBulk      AverageBulk_[m3/kg]                  0.00165
                  CB           BulkFactors                          1.09 1.04 1.55 0.95 0.86 1.19   2% 1% 1% 1% 1% 1%
                  AveStrength  AverageStrength_[N/m2]               3.41x10^12
                  CE           StrengthFactors                      1.21 2.37 0.66 0.90 0.61 0.47   4% 3% 5% 4% 6% 2%
PulpRefining:     CBR          BulkRefiningFactors                  -0.43 -0.074 0.061              6% 8% 21%
                  CBK          BulkKappaFactor                      0.057                           12%
                  CER          StrengthRefiningFactors              -0.64 0 0.44                    11% 5%
                  CEK          StrengthKappaFactor                  0.16                            8%
RefiningInput: FilteredKappa:
                  bw_Kappa6    BandwidthKappa6                      0.5
Table 7.13. Parameter estimates for stochastic model #32

Box               Parameter    Physical meaning                     Value                                  St.D.
Lab:              Ec           TensileStrengthCoating_[N/m2]        3.41x10^12
                  rms_BSI      StDError_BSI                         0.208                                  11%
PaperMachine:     CBhn         HotNipPress-BulkCoefficients         0.205  0.0311  0                       5%  11%
                  CBs          ShoePress-BulkCoefficient            0
                  Cdrag        DragCoefficient                      0.680                                  6%
                  Bc           BulkCoating_[m3/kg]                  0.00210                                3%
                  nu           DensityExponent                      2.65657                                1%
                  rms_H        StdError_H                           1.81x10^-6                             3%
Disturbance:      rms_vB       RmsBulkVariation                     0.0308                                 3%
                  bw_vB        BandwidthBulkVariation               0.0402                                 7%
                  rms_vE       RmsStrengthVariation                 0.0625                                 3%
                  bw_vE        BandwidthStrengthVariation           0.0142                                 1%
PulpFeed: BWInput: SpeedInput: PressureInput: PulpMixing:
MixingTanks:      Bxmt0        InitialBulkMixingTanks_[m3/kg]       0.00165
                  Exmt0        InitialStrengthMixingTanks_[N/m2]    3.41x10^12
PulpConstituents: AveBulk      AverageBulk_[m3/kg]                  0.00165
                  CB           BulkFactors                          1.13 0.998 1.56 0.993 0.822 1.06       3% 2% 2% 2% 3% 2%
                  AveStrength  AverageStrength_[N/m2]               3.41x10^12
                  CE           StrengthFactors                      3.9811 2.26 0.923 0.00814 0.00521 0.507   11% 9% 4% 31% 31% 4%
PulpRefining:     CBR          BulkRefiningFactors                  -0.571 -0.315 -0.215                  11% 8% 9%
                  CBK          BulkKappaFactor                      0
                  CER          StrengthRefiningFactors              -0.723 0 0                            19%
                  CEK          StrengthKappaFactor                  0
RefiningInput: FilteredKappa:
                  bw_Kappa6    BandwidthKappa6                      0.5
Many of the standard deviations of the estimation errors may seem surprisingly small. But keep in mind that they are calculated under the assumption that the model structure is correct, i.e., they are the values obtained if the data were actually generated by a model of the same structure, with the estimated parameter values, and with Gaussian random input in addition to the data input. Again the low values are mainly a consequence of the very large number of data.
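The point in miniature: nominal standard errors computed under the assumption of a correct model structure shrink like 1/sqrt(N), so a very large data set makes every parameter look precisely estimated. The sketch below uses the sample mean as the 'estimated parameter'; the values are illustrative.

```python
# Nominal standard error of an estimate based on n samples of per-sample
# standard deviation sigma: sigma / sqrt(n).

def nominal_se(sigma, n):
    return sigma / n ** 0.5

se_hundred = nominal_se(sigma=1.0, n=100)       # 0.1
se_million = nominal_se(sigma=1.0, n=10 ** 6)   # 0.001
```

A 1% quoted standard deviation therefore says less about the model being right than about the data set being large, which is exactly the caveat stated above.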
Appendices

A Mathematics and Algorithms
In this appendix are treated some more advanced topics, supporting the ‘theory’ chapters in Part I, for anyone who wants to know more about the fundamentals of MoCaVa, or would even consider developing the program further. The section also contains outlines of the main algorithms implemented in MoCaVa.
A.1 The Model Classes

This section derives the generic model class used in IdKit and summarizes the structural restrictions that are made. Let the objects "process" and "environment" be described by the following Itô differential equations

    dx_v = G_v(x_v, t) dt + E_v(x_v, t) dω_v                (A.1)
    v = Z_v(x_v, t)                                         (A.2)
    dx_z = G_z(x_z, v, u, t) dt + E_z(x_z, v, u, t) dω_z    (A.3)
    z = Z_z(x_z, u, v, t)                                   (A.4)
where ω v and ω z are Wiener processes. Divide time into quanta of length h and let t τ = τh. The idea is to have the quantum so short that it is reasonable to approximate all variables with solutions of linear dynamic equations with coefficients that are constant during a quantum. If the system is stiff, some of the variables may still change rapidly during a time quantum. The sampling “sensor” model has the form y(t k) = Z y[z(t k), v(t k), u(t k), t k, w y(k)]
(A.5)
Notice that some of the input variables u and v may also be measured, with errors. Notice also that the model allows different intervals between the control and sampling times, and also variable intervals.

Restriction #1: Noise-free Actuator Model. The "actuator" model has the form

dx_u = G_u[x_u, u_d(τ + 1)] dt    (A.6)
u = Z_u[x_u, u_d(τ)],  t ∈ [t_τ, t_τ + h)    (A.7)
This means that it is driven by a control sequence u_d(τ) coming from a digital controller with intervals h. If h is short, the sequence may be constant for several τ-values. Unless the control sequence is issued by the computer, and recorded, it must usually be reconstructed from measurements of the actuator output. If explicit time t is present, this means that there must be a signal from a physical clock affecting the process, or some other clock-dependent effect, like daylight.

Remark A.1. If the actuator model were affected by stochastic variables, then the latter would have to be modelled as part of x_z. In that case u_d(τ) may also appear explicitly in G_z and Z_z. The case is not implemented.

Restriction #2: Synchronization. The stepwise control sequence (hold circuit) and the sampling device must be active only at the quantization points. This eliminates the necessity to interpolate to get {y(t_k)} from {y(t_τ)}.

Restriction #3: Band-limited and Autonomous Disturbance.

dx_v = G_v[x_v, w(τ)] dt    (A.8)
v = Z_v[x_v],  t ∈ [t_τ, t_τ + h)    (A.9)
The disturbance is generated in the same way as the input signal, except that w(τ) is 'discrete white noise' instead of a control sequence. Notice that this allows an unknown control sequence to be treated as a disturbance, in case it cannot be reconstructed conveniently from measurements of u. Notice also that w(τ) does not appear in Z_v, which makes v continuous. The bandwidth of v is about 1/2h. Since h should be as large as possible to favour computing efficiency, only slow disturbances can be modelled in this way. Faster disturbances, for instance those affecting rates of change directly, must instead be modelled by the term E_z(x_z, v, u, t) dω_z added to dx_z. (It would seem more logical to add the term E_v(x_v, t) dω_v to dx_v, but that would prevent the modelling of hf-noise affecting state derivatives directly.) In any case, high-frequency disturbances require a further restriction:

Restriction #4: State-independent and Stepwise Constant Noise Variance.

E_v[x_v, t] = E_v[τ]    (A.10)
E_z[x_z, v, u, t] = E_z[u_d(τ), τ]    (A.11)
The state independence simplifies the treatment of the Itô equation considerably, and also its modelling (Graebe, 1990b). It is feasible to retain the dependence on u, since that is a deterministic variable.

Remark A.2. MoCaVa3 does not allow the modelling of fast disturbances, although IdKit would be able to handle them. The reason is the unexpected effects that may result from feeding high-frequency continuous-time noise into nonlinear models.

In summary,

dx_u = G_u[x_u, u_d(τ)] dt,  u = Z_u[x_u, u_d(τ)],  t ∈ [t_τ, t_τ + h)    (A.12)
dx_v = G_v[x_v, w(τ)] dt,  v = Z_v[x_v],  t ∈ [t_τ, t_τ + h)    (A.13)
dx_z = G_z(x_z, v, u, t) dt + E_z(τ) dω_z,  z = Z_z(x_z, u, v, t)    (A.14)
Figure A.1. Structure of the restricted model assumed in IdKit. The control sequence u_d(τ) drives the Actuator, which produces u(t); the Environment, driven by w(τ), produces the disturbance v(t); the Process maps u(t) and v(t) to the response z(t); and the Sampler, subject to sensor noise w_y(k), delivers the output y(t_k).
A condensed form of the model is

dx = G[x, w(τ), u_d(τ)] dt + E_ω(τ) dω,  t ∈ [t_τ, t_τ + h)    (A.15)
η = Z[x, u_d(τ)],  where η = (z, v, u)    (A.16)
The object model in IdKit is further restricted by what an Extended Kalman Filter can handle: most generally, a nonlinear stochastic state-vector model in discrete time, quasilinearizable in its stochastic elements. This includes, in principle, equivalent discrete-time models obtained as solutions of ordinary differential equations with stochastic input, describing a sampled continuous-time object. There are two ways to go when transforming a nonlinear continuous-time model into a discrete-time linearized model: either integrate first and linearize second, or linearize first and integrate second. The first alternative would require the fewest restrictions on the model, since one would only have to apply a stiff ODE solver to get an arbitrarily exact discrete-time model to linearize. Even though one would need to restrict the class of discrete-time models to be able to linearize it, one would not necessarily have to do so already in the continuous-time model; the transient effects of some short-lived nonlinearities may have vanished by the end of the time quantum. However, IdKit uses the second alternative. The reason is that it clears the way to integrating an important subset of stiff differential equations very efficiently. And speed of computing becomes paramount when the models are not small.

Restriction #5: Quasi-linearizability.

G[x, w(τ), u_d(τ)] = G[x(t_τ), 0, u_d(τ)] + G_x[x(t_τ), 0, u_d(τ)] [x − x(t_τ)] + G_w[x(t_τ), 0, u_d(τ)] w(τ)    (A.17)
Z[x, u_d(τ)] = Z[x(t_τ), u_d(τ)] + Z_x[x(t_τ), u_d(τ)] [x − x(t_τ)]    (A.18)
This means that any changes over a time quantum of state and noise gradients are neglected. Notice that both the state x and the input u may still change much during that interval. In addition, the input may change stepwise between intervals. To indicate this, write the total restricted model

dx = {G(τ) + G_x(τ) [x − x(τ)] + G_w(τ) w(τ)} dt + E(τ) dω    (A.19)
η = Z(τ) + Z_x(τ) [x − x(τ)],  t ∈ [t_τ, t_τ+1)    (A.20)
It is still nonlinear, since gradients depend on x(τ). However, the point of the quasilinearization is this: Differential equations may be ‘stiff’ in two ways: Either some of
the gradients change fast with the states, or some of the states change fast. The quasi-linearizability restriction assumes that in industrial production systems the gradients do not change much during a time quantum, even if the states do. When this is true, the model may be integrated analytically over h as a linear time-invariant differential equation.

Remark A.3. It is not difficult to conceive realistic cases where the assumption does not hold. For instance, dynamic effects in gas or liquid flows (not to mention turbulence) will most likely be difficult to cope with. Laminar flows, on the other hand, are likely to be quasi-linear. Although it is a much simpler case, IdKit still has difficulties with some binary reactions: ẋ = p(u_1 − x)(u_2 − x). If the two fuel ingredients in u are both depleted at the same time, and before the end of the time quantum, then the linearized model will be ẋ = p(u_1 − x_r)(u_2 − x_r) − p(u_1 + u_2 − 2x_r)(x − x_r), and will not even approximate the nonlinear model. On the other hand, linearization may work if the fuel is replenished continuously and at a sufficient rate.
A.2 The Loss Derivatives

This section derives the formulas used by IdKit for evaluating the likelihood loss, its first derivatives, and the Hessian. From (2.26):

Q(M, d) = Σ_{k=1}^{N} q(k|M, d) + α ‖θ‖²/2 − Σ_i N_i (log Λ_ii + ½)    (A.21)

q(k|M, d) = ½ [log det R_e(k|M, d) + e(k|M, d)^T R_e(k|M, d)^{−1} e(k|M, d)]    (A.22)

Drop the arguments for convenience and do a Choleski factorization, R_e = ΓΓ^T, where Γ is lower triangular. Let ε = Γ^{−1} e. Then

q = ½ (log det R_e + ε^T ε) = log det Γ + ε^T ε/2 = trace log Γ + ε^T ε/2    (A.23)
Let δ denote a small variation. Then, according to the rules of differentiation,

δq = δ(trace log Γ + ε^T ε/2) = trace(−Γ δΓ^{−1}) + ε^T δε    (A.24)
δ²q = δ[trace(−Γ δΓ^{−1}) + ε^T δε] = trace(Γ δΓ^{−1} Γ δΓ^{−1} − Γ δ²Γ^{−1} + δε^T δε + ε^T δ²ε)    (A.25)

But

δε = δ[Γ^{−1} e] = δΓ^{−1} e + Γ^{−1} δe    (A.26)
δ²ε = δ²Γ^{−1} e + 2 δΓ^{−1} δe + Γ^{−1} δ²e    (A.27)

Inserting this into Equation A.25 yields
δ²q = trace(Γ δΓ^{−1} Γ δΓ^{−1} − Γ δ²Γ^{−1} + δε^T δε + ε^T δ²Γ^{−1} e + 2 ε^T δΓ^{−1} δe + ε^T Γ^{−1} δ²e)
    = trace(Γ δΓ^{−1} Γ δΓ^{−1} + δε^T δε) + trace[Γ (−I + ε ε^T) δ²Γ^{−1} + 2 e^T Γ^{−T} δΓ^{−1} δe]    (A.28)
Now, provided the residuals e(k) are uncorrelated and have covariance matrix R_e(k) (which holds for a correct model), the last term has zero mean. This follows since E[ε(k) ε(k)^T] = I, and all factors in e(k)^T Γ(k)^{−T} δΓ(k)^{−1} δe(k) = − e(k)^T Γ(k)^{−T} δΓ(k)^{−1} δy(k|k − 1), except e(k), depend only on data up to k − 1 and are therefore uncorrelated with e(k). Introduce two column vectors: γ̄, with all unit elements, and γ, containing the logarithms of the diagonal elements of Γ; the latter are all real, since R_e is symmetric and positive definite. Then it follows, for instance from writing out the expressions in trace(Γ δΓ^{−1}) and trace(Γ δΓ^{−1} Γ δΓ^{−1}), that the variations can be written in a form that requires less computing:
q = γ̄^T γ + ε^T ε/2    (A.29)
δq = γ̄^T δγ + ε^T δε    (A.30)
δ²q = δγ^T δγ + δε^T δε + a term with zero mean    (A.31)
The value and gradients of Q follow immediately:

Q = Σ_{k=1}^{N} [γ̄^T γ(k) + ε(k)^T ε(k)/2] + α ‖θ‖²/2    (A.32)
∇^T Q = Σ_{k=1}^{N} [γ̄^T ∇^T γ(k) + ε(k)^T ∇^T ε(k)] + α θ    (A.33)
∇∇^T Q ≈ Σ_{k=1}^{N} [(∇^T γ(k))^T ∇^T γ(k) + (∇^T ε(k))^T ∇^T ε(k)] + α I    (A.34)
where ∇ is the column vector of gradient operators. The approximation holds for large N and good models.
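As an illustration of (A.22) and (A.23), the per-sample loss can be evaluated through the Choleski factor, without forming det R_e or R_e^{−1} explicitly. A minimal sketch (illustrative Python, not part of IdKit; `loss_term` is a hypothetical name):

```python
import numpy as np

def loss_term(e, R_e):
    """Per-sample likelihood loss q from (A.22)/(A.23):
    with R_e = Gamma Gamma^T (Choleski) and eps = Gamma^{-1} e,
    q = sum(log diag Gamma) + eps^T eps / 2."""
    Gamma = np.linalg.cholesky(R_e)      # lower-triangular factor
    eps = np.linalg.solve(Gamma, e)      # normalized residuals
    return float(np.log(np.diag(Gamma)).sum() + 0.5 * (eps @ eps))
```

Since log det R_e = 2 Σ_i log Γ_ii, this agrees with the direct formula ½(log det R_e + e^T R_e^{−1} e) while staying numerically stable.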
A.3 The ODE Solver

This section derives the particular stiff ODE solver used in IdKit to create the equivalent linearized discrete-time model called by the EKF predictor. The task is to compute the reference trajectory and sensitivity matrices.

A.3.1 The Reference Trajectory

Let x_r(t) be the undisturbed trajectory satisfying

dx_r/dt = G[x_r, 0, u_d(τ)]    (A.35)
With quasilinearization,

dx_r/dt = G(τ) + G_x(τ) [x_r − x(τ)]    (A.36)

This has the discrete-time solution

x_r(τ + 1) = Φ(τ, h) x_r(τ) + ∫_{t_τ}^{t_τ+1} Φ(τ, t_τ+1 − s) [G(τ) − G_x(τ) x(τ)] ds
           = Φ(τ, h) x_r(τ) + G_x(τ)^{−1} [Φ(τ, h) − I] [G(τ) − G_x(τ) x(τ)]
           = x_r(τ) + Γ(τ, h) G(τ)    (A.37)

where

Φ(τ, s) = exp[G_x(τ) s]    (A.38)
Γ(τ, h) = G_x(τ)^{−1} [Φ(τ, h) − I]    (A.39)
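Assuming G_x(τ) is invertible, one step of Equation A.37 with (A.38)–(A.39) takes only a matrix exponential and one linear solve. A sketch (illustrative, not the IdKit code):

```python
import numpy as np
from scipy.linalg import expm

def reference_step(x_r, G, G_x, h):
    """One reference-trajectory step (A.37):
    x_r(tau+1) = x_r(tau) + Gamma(tau, h) G(tau),
    where Phi = exp(G_x h) (A.38) and Gamma = G_x^{-1} (Phi - I) (A.39)."""
    Phi = expm(G_x * h)
    Gamma = np.linalg.solve(G_x, Phi - np.eye(len(x_r)))
    return x_r + Gamma @ G
```

For the scalar test case dx/dt = a·x the step reproduces the exact solution e^{ah} x_r, since Γ G = (e^{ah} − 1) x_r.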
A.3.2 The State Deviation

The deviation from the reference trajectory satisfies

dx = [G_x(τ) x + G_w(τ) w(τ)] dt + E_ω(τ) dω    (A.40)

The equivalent discrete-time model will be

x(τ + 1) = Φ(τ, h) x(τ) + Γ(τ, h) G_w(τ) w(τ) + w_ω(τ)    (A.41)

where {w_ω(τ)} is an uncorrelated Gaussian sequence with covariance matrix

E[w_ω(τ) w_ω(τ)^T] = R_ω(τ, h) = ∫_0^h Φ(τ, s) E_ω(τ) E_ω(τ)^T Φ(τ, s)^T ds    (A.42)

The value of the integral is the solution of the following Lyapunov equation:

G_x R_ω + R_ω G_x^T = Φ E_ω E_ω^T Φ^T − E_ω E_ω^T    (A.43)

The discrete noise-to-next-state sensitivity function E(τ) follows from factorizing the covariance of the added effects of external disturbances and state noise:

E(τ) E(τ)^T = Γ(τ, h) G_w(τ) G_w(τ)^T Γ(τ, h)^T + R_ω(τ, h)    (A.44)
Although it would be numerically feasible to allow state noise, the option is not implemented in IdKit, in order to avoid solving the Lyapunov equation.

A.3.3 The Equivalent Discrete-time Sensitivity Matrices

There are three matrices to evaluate for each quantum τ, viz. Φ(τ, h), Γ(τ, h), and R_ω(τ, h). The following derives the method used in IdKit:
Divide the time quantum into smaller intervals δ = h/n, where n = 2^m, and suppress τ. Then

Φ(2νδ) = Φ(νδ) Φ(νδ),  ν = 1, 2, 4, ..., 2^{m−1}    (A.45)

which requires only quite few matrix multiplications to obtain the accuracy of a large number of small discrete steps over the time quantum. This way of evaluating exponential matrices is a standard routine (Golub and van Loan, 1996). Now, the second matrix Γ can be evaluated almost as efficiently:

Γ(2νδ) = G_x^{−1} [Φ(2νδ) − I] = G_x^{−1} [Φ(δ) − I] [I + Φ(δ) + ... + Φ(δ)^{2ν−1}] = Γ(δ) Σ(2νδ)    (A.46)

where

Σ(2νδ) = I + Φ(δ) + ... + Φ(δ)^{2ν−1}
       = [I + Φ(δ) + ... + Φ(δ)^{ν−1}] + Φ(δ)^ν [I + Φ(δ) + ... + Φ(δ)^{ν−1}]
       = [I + Φ(νδ)] Σ(νδ)    (A.47)

All are fast algorithms requiring O(m) operations, provided the start values Φ(δ) and Γ(δ) are known. It remains to calculate those matrices and to find a sufficiently small value of δ. Golub and van Loan suggest Padé approximation. However, this involves matrix inversion, which suggests a Taylor expansion instead, also since the accuracy requirements are quite moderate in grey-box identification:

Φ(δ) = exp(G_x δ) = I + G_x δ + (G_x δ)²/2! + ...

MoCaVa3 uses a linear approximation. However, it has not been investigated which start approximation is most efficient. The equivalent discrete-time sensitivity matrices are
A(τ) = exp[G_x(τ) h] = Φ(τ, 2^m δ)    (A.48)
H(τ) = G_x(τ)^{−1} [A(τ) − I] = Γ(τ, 2^m δ)    (A.49)
E(τ) = H(τ) G_w(τ)    (A.50)
C(τ) = Z_x(τ)    (A.51)
F(τ) = Z_w(τ)    (A.52)
They are needed for Algorithm A.3 as well as for computing the nominal trajectory from Equation A.41:

x_r(τ + 1) = x_r(τ) + H(τ) G[x_r(τ), 0, u_d(τ)]    (A.53)
η_r(τ) = Z[x_r(τ), u_d(τ)]    (A.54)
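The doubling recursions (A.45)–(A.47) can be sketched as follows (illustrative Python; for simplicity the start value Φ(δ) is computed exactly rather than by the linear approximation MoCaVa3 uses):

```python
import numpy as np
from scipy.linalg import expm

def phi_gamma_by_doubling(G_x, h, m):
    """Compute Phi(h) and Gamma(h) = G_x^{-1}[Phi(h) - I] from the
    small-step values at delta = h / 2**m with m doublings:
    Phi(2 nu delta)   = Phi(nu delta)^2                      (A.45)
    Sigma(2 nu delta) = [I + Phi(nu delta)] Sigma(nu delta)  (A.47)
    Gamma(h)          = Gamma(delta) Sigma(h)                (A.46)"""
    n = G_x.shape[0]
    I = np.eye(n)
    delta = h / 2**m
    Phi_d = expm(G_x * delta)                  # start value Phi(delta)
    Gamma_d = np.linalg.solve(G_x, Phi_d - I)  # start value Gamma(delta)
    Phi, Sigma = Phi_d, I                      # Sigma(delta) = I
    for _ in range(m):
        Sigma = (I + Phi) @ Sigma              # (A.47), uses the old Phi
        Phi = Phi @ Phi                        # (A.45)
    return Phi, Gamma_d @ Sigma                # (A.46)
```

Each doubling costs two matrix products, so m doublings replace 2^m small integration steps.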
A.3.3.1 Error Analysis

The following serves to find the smallest number of iterations m. This requires first an error criterion on which to base an assessment of a sufficient value of m.
In order to obtain comparable sizes of the variables involved, define the scale-free sensitivity matrix Ḡ_x = S_x^{−1} G_x S_x, where S_x is the diagonal matrix of scale factors of x, and define the error matrix as

ε(Ḡ_x h) = exp(Ḡ_x h) − Φ(Ḡ_x h/n)^n = exp(Ḡ_x h) − (I + Ḡ_x h/n)^n    (A.55)

Transform to Jordan form: U (Ḡ_x h) U^{−1} = (J h), where J has the eigenvalues λ_i in its diagonal. Now, since ε is a monotone scalar function of all diagonal elements, it is reasonable to limit its maximum value. But

max_i ε(h |λ_i|) ≤ ε(h (Σ_i |λ_i|²)^{1/2}) ≤ ε(h ‖Ḡ_x‖)    (A.56)

where ‖Ḡ_x‖ = [trace(Ḡ_x Ḡ_x^T)]^{1/2} is the Euclidean norm. Hence, determine m as the smallest integer that makes ε(h ‖Ḡ_x‖) smaller than a given small value. This yields δ = h/n ∝ 1/‖Ḡ_x‖, which agrees with the expectation that δ is determined by the shortest time constant in the stiff system.
Algorithm A.1. Computing the minimum number of iterations
Computing the number of iterations m:
Time constant: 1/‖S_x^{−1} G_x(τ) S_x‖ → T(τ)
Until |exp[−h/T(τ)] − exp[n log(1 − n^{−1} h/T(τ))]| ≤ 0.001, repeat: Increment m; 2^m → n
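Algorithm A.1 can be sketched as follows (an illustrative reading: the test compares exp(−h/T) with the compounded linear start value (1 − h/(nT))^n, stopping when they agree to within 0.001):

```python
import math

def min_iterations(h, T, tol=1e-3, m_max=30):
    """Smallest m such that n = 2**m linear substeps of length h/n
    reproduce exp(-h/T) to within tol, for a time constant T."""
    for m in range(m_max + 1):
        n = 2**m
        # guard: the linear factor 1 - h/(n T) must stay positive
        if n * T > h and abs(math.exp(-h / T) - (1.0 - h / (n * T))**n) <= tol:
            return m
    return m_max
```

For h = T the rule gives m = 8, i.e. δ = h/256; δ shrinks in proportion to the shortest time constant, as expected.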
The time constants T(τ) may vary with time, which causes no problems. Notice, however, that the same array of time constants must be used throughout a fitting session, in order to ensure that the number of operations will be the same. It must therefore be computed once, stored, and reused.

A.3.3.2 Extension to High-frequency Disturbance

The following extension of the scope of IdKit is feasible, although not implemented in MoCaVa3: The fast recursion for computing Φ and Γ can be extended to include R_ω.
R_ω(2νδ) = ∫_0^{2νδ} Φ(s) E_ω E_ω^T Φ(s)^T ds
         = ∫_0^{νδ} Φ(s) E_ω E_ω^T Φ(s)^T ds + Φ(νδ) [ ∫_{νδ}^{2νδ} Φ(s − νδ) E_ω E_ω^T Φ(s − νδ)^T ds ] Φ(νδ)^T
         = R_ω(νδ) + Φ(νδ) R_ω(νδ) Φ(νδ)^T    (A.57)

Since Equation A.57 offers an alternative to the Lyapunov Equation A.43, it will be possible to handle direct state noise, allowing also high-frequency disturbances.
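A sketch of the recursion (A.57), started from a crude one-interval value R_ω(δ) ≈ E_ω E_ω^T δ (illustrative only; as noted, this extension is not implemented in MoCaVa3):

```python
import numpy as np
from scipy.linalg import expm

def noise_cov_by_doubling(G_x, E_w, h, m):
    """R_omega(h) = integral_0^h Phi(s) E E^T Phi(s)^T ds, built up by
    R_omega(2L) = R_omega(L) + Phi(L) R_omega(L) Phi(L)^T   (A.57)
    from the start value R_omega(delta) ~ E E^T delta, delta = h/2**m."""
    delta = h / 2**m
    Phi = expm(G_x * delta)
    R = E_w @ E_w.T * delta          # crude start value, O(delta) accurate
    for _ in range(m):
        R = R + Phi @ R @ Phi.T      # (A.57)
        Phi = Phi @ Phi              # double the interval length
    return R
```

For the scalar case G_x = −a, E_ω = 1, the exact value is (1 − e^{−2ah})/(2a), which the recursion approaches as m grows.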
A.4 The Predictor

The following algorithm computes the variables needed to evaluate the loss and its derivatives. All recursive algorithms need a state; that of the EKF is called the "information state". Denote it by χ.

Algorithm A.2. Computing residuals and their variances
Predictor [y_d(τ), u_d(τ), χ(k)] → χ(k + 1), ε(k), γ(k)
Notations:
τ = Discrete time
τ_s(k) = Discrete sampling time for data record #k
x(τ) = State
y(τ) = Sensor output
z(τ) = Response
v(τ) = Disturbance
u(τ) = Stimulus
u_d(τ) = Input data (interpolated)
y_d(k) = Output data
ξ_r(τ) = Reference value of ξ(τ), ξ ∈ {x, y, z, v, u}
ξ_δ(τ) = Deviation of ξ(τ) from ξ_r(τ)
ξ(τ|τ) = Estimated value of ξ(τ)
ξ(τ|τ − 1) = Predicted value of ξ(τ)
R_x(τ|τ − 1) = E[x_δ(τ) x_δ(τ)^T] = Covariance of state prediction
A(τ) = State-to-next-state sensitivity matrix
C(τ) = State-to-output sensitivity matrix
C_y(τ) = State-to-sensor sensitivity matrix
E(τ) = Noise-to-next-state sensitivity matrix
F(τ) = Noise-to-sensor sensitivity matrix
χ(τ) = {x_r(τ), x_δ(τ|τ − 1), R_x(τ|τ − 1)} = Information state
η(τ) = {y(τ), z(τ), v(τ), u(τ)} = Model output
ε(τ), γ(τ) = Normalized residuals
Loop over the sampling interval: For τ = τ_s(k), ..., τ_s(k + 1) − 1, do
  Update nominal trajectory and sensitivity matrices:
    DiscreteModel [x_r(τ), u_d(τ), τ] → x_r(τ + 1), η_r(τ), A(τ), C(τ), E(τ), F(τ)
  Update a priori estimates of model output:
    η_r(τ) + C(τ) x_δ(τ|τ − 1) → η(τ|τ − 1)
  If sampling time, then do estimation update of the information state: If τ = τ_s(k), do
    Residuals: y_d(k) − Z_y[η(τ|τ − 1), 0] → e(k)
    Kalman gain: C_y(τ) R_x(τ|τ − 1) C_y(τ)^T + F(τ) F(τ)^T → R_y(τ)
      Choleski [R_y(τ)] → Γ(k)
      R_x(τ|τ − 1) C_y(τ)^T Γ(k)^{−T} → K(k)
    Normalized residuals: Γ(k)^{−1} e(k) → ε(k); −ln diag Γ(k)^{−1} → γ(k)
    x_δ(τ|τ − 1) + K(k) ε(k) → x_δ(τ|τ)
    R_x(τ|τ − 1) − K(k) K(k)^T → R_x(τ|τ)
  Else x_δ(τ|τ − 1), R_x(τ|τ − 1) → x_δ(τ|τ), R_x(τ|τ)
  Prediction update of the information state:
    A(τ) x_δ(τ|τ) → x_δ(τ + 1|τ)
    A(τ) R_x(τ|τ) A(τ)^T + E(τ) E(τ)^T → R_x(τ + 1|τ)
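The estimation update in Algorithm A.2 is a square-root-flavoured Kalman measurement update. A standalone sketch (illustrative; here K is formed as R C_y^T Γ^{−T}, which makes R − K K^T the standard covariance update):

```python
import numpy as np

def measurement_update(x_pred, R_pred, e, C_y, F):
    """One estimation update in the spirit of Algorithm A.2:
    R_y = C_y R C_y^T + F F^T;  Gamma = Choleski(R_y);
    eps = Gamma^{-1} e;  K = R C_y^T Gamma^{-T};
    x <- x + K eps;  R <- R - K K^T."""
    R_y = C_y @ R_pred @ C_y.T + F @ F.T
    Gamma = np.linalg.cholesky(R_y)
    eps = np.linalg.solve(Gamma, e)               # normalized residuals
    K = np.linalg.solve(Gamma, C_y @ R_pred).T    # = R C_y^T Gamma^{-T}
    return x_pred + K @ eps, R_pred - K @ K.T, eps
```

In one dimension with R = 2, C_y = 1, F = 1: R_y = 3, K = 2/√3, and the update reduces the covariance to 2 − 4/3 = 2/3, exactly as the standard form R(1 − KC) does.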
Remark A.4. Since the algorithm in principle estimates the noise sequences entering the model from given values of its measured output, a better name for it would be "model inverter". However, the predictor is the dominating part.

Remark A.5. The subprogram implementing Predictor is based on a somewhat longer specification, since it has to include a startup procedure and some administration of missing data. Predictor is also used for several purposes, which adds some interface handling.

Remark A.6. The algorithm differs from that used by Kristensen, Madsen, and Jørgensen (2004), which shows better performance in cases with large deviations from the reference trajectory.

A.4.1 The Equivalent Discrete-time Model

This routine implements the 'stiff' ODE solver derived in Section A.3.

Algorithm A.3. Computing reference trajectory and sensitivity matrices
DiscreteModel (x_r, u_d, τ) → x_r, η_r, H, A, C, E, F
Update nominal discrete-time trajectory and sensitivity matrices:
Notations:
g = State derivatives
G_x = State-to-derivative sensitivity matrix
G_w = Noise-to-derivative sensitivity matrix
H = State-derivative-to-next-state sensitivity matrix
Linearized-model response, state derivatives, and gradients:
  Quasilinearization (x_r, u_d, τ) → η_r, g, G_x, G_w, C, F
Discrete-time sensitivity matrices:
  State-to-next-state: exp(h G_x) → A
  State-derivative-to-next-state: G_x^{−1} (A − I) → H
  Noise-to-next-state: H G_w → E
Next reference state: x_r + H g → x_r
A.5 Mixed Algebraic and Differential Equations

Occasionally, models are described by ADE systems, either because they contain implicit algebraic equations in addition to the assignment statements computing output and state derivatives, or else because they are written in such a way that they create 'algebraic loops'. Such constructions may even lead to systems for which no solution exists. However, a restricted case that can be handled by IdKit is that where the algebraic equations are explicit. Expand Equations A.15 and A.16 into the following form:

dx = G[x, s, w(τ), u_d(τ)] dt + E_ω(τ) dω    (A.58)
s = H[x, s, w(τ), u_d(τ)]    (A.59)
η = Z[x, s, u_d(τ)]    (A.60)
The dependence on w(τ) and u_d(τ) means that s and dx/dt may be discontinuous at the time-quantization points. It would be most straightforward to solve Equation A.59 for s by some iterative loop like

s(τ) ← s(τ) − [H_s(τ) − I]^{−1} [H(τ) − s(τ)]    (A.61)

and enter the result in Equation A.58 as start values for an integration in the interval (t, t + h). However, the following alternative is a way to avoid the numerical problems associated with iterations with a variable number of steps. If one regards Equation A.59 as the equilibrium of a differential equation with very short time constants (which is probably what it is, if it describes a physical equilibrium), this would be modelled by replacing Equation A.59 with

ds/dt = − δ^{−1} [H_s(τ) − I]^{−1} [H(x, s, w(τ), u_d(τ)) − s]    (A.62)

with a small value of δ > 0. Assume Equation A.62 is stable (otherwise the iterative solution does not work either), and linearize around x(τ), s(τ), w(τ), as in Equations A.19 and A.20:

dx/dt = G(τ) + G_x(τ) [x − x(τ)] + G_s(τ) [s − s(τ)] + G_w(τ) w(τ)    (A.63)
ds/dt = − δ^{−1} [H_s(τ) − I]^{−1} {H(τ) + H_x(τ) [x − x(τ)] + [H_s(τ) − I] [s − s(τ)] + H_w(τ) w(τ) − s(τ)}    (A.64)
η = Z(τ) + Z_x(τ) [x − x(τ)] + Z_s(τ) [s − s(τ)]    (A.65)
The system can be solved in the same way as Equations A.15 and A.16, as long as δ is positive:

Φ(τ) = exp( h [ G_x(τ)  G_s(τ); −δ^{−1} [H_s(τ) − I]^{−1} H_x(τ)  −δ^{−1} I ] )    (A.66)

Γ(τ) = [Φ(τ) − I] [ G_x(τ)  G_s(τ); −δ^{−1} [H_s(τ) − I]^{−1} H_x(τ)  −δ^{−1} I ]^{−1}    (A.67)

[x(τ + 1); s(τ + 1)] = Γ(τ) [ G(τ) + G_w(τ) w(τ); −δ^{−1} [H_s(τ) − I]^{−1} [H(τ) + H_w(τ) w(τ) − s(τ)] ]    (A.68)
The output increment follows from

η(τ + 1) = [Z_x(τ)  Z_s(τ)] [x(τ + 1); s(τ + 1)]    (A.69)

and the totals are obtained by adding the increments to the previous values:

[x(τ + 1); s(τ + 1); η(τ + 1)] = [x(τ); s(τ); η(τ)] + [Δx(τ + 1); Δs(τ + 1); Δη(τ + 1)]    (A.70)

where the Δ-quantities are the increments delivered by Equations A.68 and A.69.
However, this requires that δ not be too small, otherwise the routine evaluating Φ and Γ will be slow. The following alternative avoids this problem, and also reduces the number of state variables. Let δ → 0. Then Equations A.66 to A.68 reduce to

G_x^s(τ) = G_x(τ) − G_s(τ) [H_s(τ) − I]^{−1} H_x(τ)    (A.71)
Φ^s(τ) = exp[h G_x^s(τ)]    (A.72)
Γ^s(τ) = [Φ^s(τ) − I] [G_x^s(τ)]^{−1}    (A.73)
g_x(τ) = G(τ) + G_w(τ) w(τ)    (A.74)
g_s(τ) = − [H_s(τ) − I]^{−1} [H(τ) + H_w(τ) w(τ) − s(τ)]    (A.75)
x(τ + 1) = Γ^s(τ) [g_x(τ) + G_s(τ) g_s(τ)]    (A.76)
s(τ + 1) = − [H_s(τ) − I]^{−1} H_x(τ) x(τ + 1) + g_s(τ)    (A.77)
Proof. Drop the τ arguments and introduce

g_x = G + G_w w,  g_s = − (H_s − I)^{−1} (H + H_w w − s)    (A.78)

[G_xx  G_xs; G_sx  G_ss] = [G_x  G_s; −(H_s − I)^{−1} H_x  −I]    (A.79)

and the block matrices

[Φ_xx  Φ_xs; Φ_sx  Φ_ss] = exp( h [I  0; 0  δ^{−1} I] [G_xx  G_xs; G_sx  G_ss] )    (A.80)

[Γ_xx  Γ_xs; Γ_sx  Γ_ss] = [Φ_xx − I  Φ_xs; Φ_sx  Φ_ss − I] ( [I  0; 0  δ^{−1} I] [G_xx  G_xs; G_sx  G_ss] )^{−1}    (A.81)

Then

[x(τ + 1); s(τ + 1)] = [Γ_xx  Γ_xs; Γ_sx  Γ_ss] [g_x; g_s]    (A.82)
Do a linear transformation into Jordan form:

[I  0; 0  δ^{−1} I] [G_xx  G_xs; G_sx  G_ss] [X_11  X_12; X_21  X_22] = [X_11  X_12; X_21  X_22] [J_1  0; 0  J_2 δ^{−1}]    (A.83)

Order the eigenvalues in such a way that J_2 δ^{−1} contains the large values of the system. Then the diagonal elements of J_2 are all negative, or the system would not be stable. Writing out the four block relations yields
G_xx X_11 + G_xs X_21 = X_11 J_1    (A.84)
δ^{−1} (G_sx X_11 + G_ss X_21) = X_21 J_1    (A.85)
G_xx X_12 + G_xs X_22 = X_12 J_2 δ^{−1}    (A.86)
δ^{−1} (G_sx X_12 + G_ss X_22) = X_22 J_2 δ^{−1}    (A.87)
From Equations A.84 to A.87,

X_12 = δ (G_xx X_12 + G_xs X_22) J_2^{−1}    (A.88)
G_sx X_12 + G_ss X_22 = X_22 J_2    (A.89)
[G_xx − G_xs G_ss^{−1} G_sx] X_11 = X_11 J_1 − G_xs G_ss^{−1} X_21 J_1 δ    (A.90)
X_21 = − G_ss^{−1} (G_sx X_11 − X_21 J_1 δ)    (A.91)
Using the relation f(X^{−1} G X) = X^{−1} f(G) X, which holds for all analytic functions f with matrix argument G:

[Γ_xx  Γ_xs; Γ_sx  Γ_ss] = { exp( h [I  0; 0  δ^{−1} I] [G_xx  G_xs; G_sx  G_ss] ) − I } { [I  0; 0  δ^{−1} I] [G_xx  G_xs; G_sx  G_ss] }^{−1}
= [X_11  X_12; X_21  X_22] { exp( h [J_1  0; 0  J_2 δ^{−1}] ) − I } [J_1  0; 0  J_2 δ^{−1}]^{−1} [X_11  X_12; X_21  X_22]^{−1}    (A.92)

Let δ → 0. Then, from Equations A.88 to A.91,

X_12 = δ G_xs X_22 J_2^{−1} + O(δ²)    (A.93)
X_22 = − G_ss^{−1} X_22 J_2 + O(δ)    (A.94)
[G_xx − G_xs G_ss^{−1} G_sx] X_11 = X_11 J_1 + O(δ)    (A.95)
X_21 = − G_ss^{−1} G_sx X_11 + O(δ)    (A.96)
Introduce the partitioned inverse of X:

[X^11  X^12; X^21  X^22] [X_11  X_12; X_21  X_22] = [I  0; 0  I]    (A.97)

From Equations A.93 to A.97 follows

X^11 = (X_11 − X_12 X_22^{−1} X_21)^{−1} = X_11^{−1} + O(δ)    (A.98)
X^22 = (X_22 − X_21 X_11^{−1} X_12)^{−1} = X_22^{−1} + O(δ)    (A.99)
X^12 = − X_11^{−1} X_12 [X^22 + O(δ)]    (A.100)
X^21 = − X_22^{−1} X_21 [X^11 + O(δ)]    (A.101)

Insert Equations A.93 to A.96:

X^12 = − X_11^{−1} G_xs X_22 J_2^{−1} X_22^{−1} δ + O(δ²) = − X_11^{−1} G_xs G_ss^{−1} δ + O(δ²)    (A.102)
X^21 = X_22^{−1} G_ss^{−1} G_sx + O(δ)    (A.103)
Define the analytic function Γ(x) ≡ [exp(hx) − 1]/x. Then, from Equations A.92, A.93, and A.102:

[Γ_xx  Γ_xs; Γ_sx  Γ_ss] = [X_11  X_12; X_21  X_22] [Γ(J_1)  0; 0  Γ(J_2 δ^{−1})] [X^11  X^12 δ^{−1}; X^21  X^22 δ^{−1}]
= [X_11  0; X_21  X_22] [Γ(J_1)  0; 0  −J_2^{−1}] [X_11^{−1} + O(δ)  −X_11^{−1} X_12 X_22^{−1} δ^{−1}; 0  X_22^{−1}]
= [X_11  0; X_21  X_22] [Γ(J_1)  0; 0  −J_2^{−1}] [X_11^{−1} + O(δ)  −X_11^{−1} G_xs G_ss^{−1}; 0  X_22^{−1}]    (A.104)

Writing out the block relations and using Equations A.94 and A.95 yields

Γ_xx = X_11 Γ(J_1) X_11^{−1} + O(δ) = Γ(G_xx − G_xs G_ss^{−1} G_sx) + O(δ)    (A.105)
Γ_xs = − X_11 Γ(J_1) X_11^{−1} G_xs G_ss^{−1} + O(δ) = − Γ_xx G_xs G_ss^{−1} + O(δ)    (A.106)
Γ_sx = X_21 Γ(J_1) X_11^{−1} + O(δ) = − G_ss^{−1} G_sx Γ_xx + O(δ)    (A.107)
Γ_ss = − X_21 Γ(J_1) X_11^{−1} G_xs G_ss^{−1} − X_22 J_2^{−1} X_22^{−1} + O(δ) = G_ss^{−1} G_sx Γ_xx G_xs G_ss^{−1} − G_ss^{−1} + O(δ)    (A.108)
Insert Equations A.105 to A.108 into Equation A.82. Then

x = Γ_xx g_x + Γ_xs g_s = Γ_xx (g_x − G_xs G_ss^{−1} g_s)    (A.109)
s = − G_ss^{−1} G_sx Γ_xx g_x + G_ss^{−1} G_sx Γ_xx G_xs G_ss^{−1} g_s − G_ss^{−1} g_s = − G_ss^{−1} G_sx x − G_ss^{−1} g_s    (A.110)

Insert Equation A.78. Then

x = Γ[G_x − G_s (H_s − I)^{−1} H_x] (g_x + G_s g_s)    (A.111)
s = − (H_s − I)^{−1} H_x x + g_s    (A.112)

This completes the proof.
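As a footnote to the direct approach, the iterative loop in Equation A.61 is simply Newton's method applied to H(s) − s = 0. A minimal sketch (illustrative; H and its Jacobian H_s are supplied by the caller):

```python
import numpy as np

def solve_algebraic(H, H_s, s0, tol=1e-12, max_iter=50):
    """Newton iteration (A.61) for the explicit algebraic equation s = H(s):
    s <- s - [H_s(s) - I]^{-1} [H(s) - s]."""
    s = np.atleast_1d(np.asarray(s0, dtype=float))
    I = np.eye(s.size)
    for _ in range(max_iter):
        step = np.linalg.solve(H_s(s) - I, H(s) - s)
        s = s - step
        if np.linalg.norm(step) < tol:
            break
    return s
```

For the scalar equilibrium s = cos(s) this converges to the fixed point s ≈ 0.739085 in a handful of iterations.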
A.6 Performance Optimization

The numerical operations that take most of the time in evaluating the loss function are those computing the sensitivity matrices H, A, C, E, F, which all depend on τ. The "performance optimization" option in IdKit is designed to exploit three common properties of physical systems: (i) they are often 'sparse', (ii) some of their non-zero elements are constant, and (iii) if they are not, they are possibly continuous functions of the reference trajectory. This motivates the inclusion in IdKit of three functions designed to exploit these properties:

– SparseQuasilinearization: Analyses the block-matrix structure of a multi-component model in order to avoid computing zero blocks.
– SetMemoization: Keeps track of elements that have already been computed.
– SensitivityUpdateControl: Keeps track of the need to access the model when updating the sensitivity matrices.

With performance optimization, the DiscreteModel algorithm becomes:
Algorithm A.4. DiscreteModel algorithm with performance optimization
DiscreteModel (x_r, u_d, τ) → x_r, η_r, H, A, C, E, F
Update nominal discrete-time trajectory and sensitivity matrices:
Notations:
χ_g = Sensitivity-update control indicators
χ_m = Memoization indicators
g = State derivatives
G_x = State-to-derivative sensitivity matrix
G_w = Noise-to-derivative sensitivity matrix
H = State-derivative-to-next-state sensitivity matrix
Linearized-model response, state derivatives, and gradients:
  If sparsity option is on, then SparseQuasilinearization (x_r, u_d, τ, χ_g, χ_m) → χ_g, η_r, g, G_x, G_w, C, F
  else Quasilinearization (x_r, u_d, τ, χ_g) → χ_g, η_r, g, G_x, G_w, C, F
If first iteration, then check whether perturbed coordinates change sensitivity matrices:
  SetMemoization (G_x, G_w, C, F, τ, χ_g) → χ_m
Discrete-time sensitivity matrices:
  State-to-next-state and state-derivative-to-next-state: If χ_g.G_x = 1, then exp(h G_x) → A and G_x^{−1} (A − I) → H
  Noise-to-next-state: If χ_g.G_x = 1 or χ_g.G_w = 1, then H G_w → E
Control sensitivity matrix updates:
  SensitivityUpdateControl (x_r, u_d, τ, χ_g, H, A, E, C, F) → χ_g, H, A, E, C, F
Next reference state: x_r + H g → x_r
Remark A.7. The notation χ_g.X means a member X of the variable set χ_g (in the MATLAB or C style).

A.6.1 The SensitivityUpdateControl Function

Assume that the discrete-time state model is at most quadratic in the state vector. This makes it possible to estimate the gradients' linear dependences on the state. The following method is used: In a first pass, compute all derivatives in G[x(k)] using numerical differentiation, and let the results be g_ij(k). Let H = ∇_x G be the 'curvature' (assumed constant, but possibly non-zero), i.e., the gradient with respect to the state. Set up the following system to describe the variation of G[x(k)] with k:

h_ijl(k) = h_ijl(k − 1) + λ w_1ijl(k)    (A.113)
g_ij(k) = g_ij(k − 1) + Σ_l h_ijl(k − 1) [x_l(k) − x_l(k − 1)] + σ w_2ij(k)    (A.114)

where w_1 and w_2 are independent normal variables with zero means and unit variances. The second equation uses the assumption that the gradients depend linearly
on the states, and the first that the assumption may not always hold. The curvature h_ijl(k) can now be estimated by a Kalman filter, using g_ij(k) as 'measured variable' and x(k) as 'input' (below, Δx(k) = x(k) − x(k − 1)):

ĥ_ij(k) = ĥ_ij(k − 1) + K(k) g̃_ij(k)    (A.115)
g̃_ij(k) = g_ij(k) − ĥ_ij(k − 1)^T Δx(k)    (A.116)
R_g(k − 1) = Δx(k)^T R_h(k − 1) Δx(k) + σ²    (A.117)
K(k) = R_h(k − 1) Δx(k) R_g(k − 1)^{−1}    (A.118)
R_h(k) = R_h(k − 1) − K(k) R_g(k − 1) K(k)^T + λ² I    (A.119)

Notice that R_g is a scalar, and that K does not depend on i and j. The residuals g̃_ij(k) are measures of how well the quadratic assumptions hold in each interval k. Let G_c be the function computing the state derivatives and output of the continuous-time model, and partition

[dx̃/dt; z̃] = [G_cx; G_cz] + [G_cxx  G_cxv  G_cxw; G_czx  G_czv  G_czw] [x̃; ṽ; w(k)]    (A.120)

The equivalent discrete-time model is

Φ = exp[h G_cxx],  Γ = (Φ − I) (G_cxx)^{−1}    (A.121)
G_xx = Φ,  G_xv = Γ G_cxv,  G_xw = Γ G_cxw    (A.122)
G_zx = G_czx,  G_zv = G_czv,  G_zw = G_czw    (A.123)
In the consecutive passes the following heuristic rules apply:

If ‖G̃_xx(k)‖ > ε, then update: G_cxx(k) → Φ(k), Γ(k) → G_xx(k), G_xv(k), G_xw(k);
else if ‖G_xxx(k) x̃(k)‖ > ε, then G_xx(k) = G_xx(k − ν) + G_xxx(k) x̃(k).
If ‖G̃_xv(k)‖ > ε, then update: G_cxv(k) → G_xv(k);
else if ‖G_xvx(k) x̃(k)‖ > ε, then G_xv(k) = G_xv(k − ν) + G_xvx(k) x̃(k).
If ‖G̃_xw(k)‖ > ε, then update: G_cxw(k) → G_xw(k);
else if ‖G_xwx(k) x̃(k)‖ > ε, then G_xw(k) = G_xw(k − ν) + G_xwx(k) x̃(k).

The threshold ε is nominally set to an absolute value (e.g., 0.001). To increase safety without losing much efficiency, the level is controlled by the frequency of violations by the residuals. This means that the level is set by the upper percentiles of ‖G̃(k)‖. The first pass files the average curvatures G_zvx, G_zxx, G_zxv, G_xvx, G_xxx, G_xxv and the residuals. The second pass files the instances where the gradients violate the thresholds. The following passes use the instances to control the updating. This ascertains that exactly the same loss function is evaluated in each pass. The following algorithm has been implemented; for a discussion see Section 2.5.1.

Algorithm A.5. The SensitivityUpdateControl Function
SensitivityUpdateControl (x, u, τ, χ_g, H, A, E, C, F) → χ_g, H, A, E, C, F
Control sensitivity matrix updates:
User 'advanced' settings:
ζ.gain = Indicator for active sensitivity-update control
ζ.τ = Start time for sensitivity-update control
ζ.p = Maximum fraction of updates
ζ.e_Ψ = Maximum scaled error levels, Ψ ∈ {H, A, E, C, F}
Sensitivity-update control variables:
χ_g.Ψ(τ) = Sensitivity-update control indicators
χ_g.T(τ) = Time constants
χ_g.Ψ = Indicators for updated discrete sensitivity matrices
Local gain control variables:
χ.h_Ψ = Curvature coefficients
χ.e_Ψ(τ) = Current extrapolation error of Ψ
χ.e0_Ψ(τ) = Current hold error
χ.e_Ψp = Error percentiles
χ.Ψ = Latest update
χ.x = Latest updated state
χ.u = Latest updated input
ΔΨ(τ) = Scaled increments of sensitivity matrices
Δx(τ), Δu(τ) = Scaled argument differences
e_Ψ = Error threshold of Ψ
Initialize: Ψ → χ.Ψ; 0 → χ.h_Ψ
Scaled argument differences: S_x^{−1} (x − χ.x) → Δx; S_u^{−1} (u − χ.u) → Δu
Predict sensitivity functions, if not updated from the user's model:
  For Ψ ∈ {H, A, E, C, F}, if χ_g.Ψ = 0, do
    If χ_g.Ψ(τ) = 0, then hold: χ.Ψ → Ψ
    If χ_g.Ψ(τ) = 1, then extrapolate: χ.Ψ + [Δx Δu] χ.h_Ψ → Ψ
If probing pass, then update curvature statistics:
  Scaled sensitivity matrix increments:
    If χ_g.H(τ) = 1, then S_x^{−1} (H − χ.H) S_g → ΔH(τ)
    If χ_g.A(τ) = 1, then S_x^{−1} (A − χ.A) S_x → ΔA(τ)
    If χ_g.E(τ) = 1, then S_x^{−1} (E − χ.E) → ΔE(τ)
    If χ_g.C(τ) = 1, then S_η^{−1} (C − χ.C) S_x → ΔC(τ)
    If χ_g.F(τ) = 1, then S_η^{−1} (F − χ.F) → ΔF(τ)
  Save scaled arguments and sensitivity matrix increments:
    Store Δx(τ), Δu(τ), ΔH(τ), ΔA(τ), ΔE(τ), ΔC(τ), ΔF(τ)
  Update curvature estimates:
    For Ψ ∈ {H, A, E, C, F}, do UpdateCurvatureEstimates [τ, ΔΨ(τ), Δx(τ), Δu(τ), χ.h_Ψ] → χ.h_Ψ, χ.e_Ψ
If probing pass and end of sample, then compute update indicators: If τ = τ_f, then
  Loop over sample range: For τ = 1, ..., τ_f, do
    Hold and extrapolation errors: For Ψ ∈ {H, A, E, C, F}, do
      χ.e0_Ψ(τ) = |ΔΨ(τ)|
      χ.e_Ψ(τ) = |ΔΨ(τ) − [Δx(τ) Δu(τ)] χ.h_Ψ|
  Rescale curvature estimates:
    S_x χ.h_H S_g⁻¹ → χ.h_H; S_x χ.h_A S_x⁻¹ → χ.h_A; S_x χ.h_E → χ.h_E;
    S_η χ.h_C S_x⁻¹ → χ.h_C; S_η χ.h_F → χ.h_F
  Compute percentiles: For Ψ ∈ {H, A, E, C, F}, do Percentile(B.p, χ.e_Ψ, χ.e_Ψ0) → χ.e_Ψp
  Relax thresholds according to prescribed percentage of updates:
    For Ψ ∈ {H, A, E, C, F}, do max(B.e_Ψ, χ.e_Ψp) → e_Ψ
  Set update indicators:
    For τ = 1, …, B.τ, do
      For Ψ ∈ {H, A, E, C, F}, do 2 → χg.Ψ(τ)
    For τ = B.τ + 1, …, τ_f, do
      For Ψ ∈ {H, A, E, C, F}, do
        If min[χ.e_Ψ0, χ.e_Ψ] > e_Ψ, then update: 2 → χg.Ψ(τ);
        else if χ.e_Ψ0(τ) < χ.e_Ψ(τ) + 0.1 B.e_Ψ, then hold: 0 → χg.Ψ(τ);
        else extrapolate: 1 → χg.Ψ(τ)
Update previous sensitivity matrices: For Ψ ∈ {H, A, E, C, F}, do Ψ → χ.Ψ
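The hold/extrapolate/update decision at the heart of the algorithm above can be illustrated in a few lines. The sketch below is our own Python rendering (IdKit implements this in C); the function and variable names are illustrative, not IdKit's:

```python
import numpy as np

def update_indicator(d_psi, d_arg, h_psi, e_max, e_margin=0.1):
    """Classify one sensitivity matrix at one time step.

    d_psi : scaled increment of the sensitivity matrix
    d_arg : scaled argument differences [dx, du] (row vector)
    h_psi : curvature estimate used for extrapolation
    e_max : error threshold for this matrix
    Returns 2 (update), 1 (extrapolate), or 0 (hold).
    """
    hold_error = np.max(np.abs(d_psi))                    # error of keeping the old matrix
    extrap_error = np.max(np.abs(d_psi - d_arg @ h_psi))  # error of first-order extrapolation
    if min(hold_error, extrap_error) > e_max:
        return 2    # neither approximation is good enough: access the model
    if hold_error < extrap_error + e_margin * e_max:
        return 0    # holding is (nearly) as good: cheapest option
    return 1        # extrapolate from the curvature estimate

# A matrix that moved exactly along the curvature model is safely extrapolated:
d_arg = np.array([[0.2, 0.1]])      # scaled [dx du]
h_psi = np.ones((2, 3))             # curvature coefficients
d_psi = d_arg @ h_psi               # increment predicted perfectly
print(update_indicator(d_psi, d_arg, h_psi, e_max=0.01))
```

The three return values mirror the indicator χg.Ψ(τ) ∈ {0, 1, 2} set at the end of each probing pass.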
Remark A.8. The SensitivityUpdateControl routine in IdKit also includes a function for checking and utilizing the sparseness, if any, of the curvature array (see the source code in MoCaVa3\C\SensitivityUpdateControl.c).
A.6.2 Memoization
Algorithm A.6. Setting indicators for changing sensitivity matrices.
SetMemoization(G_x, G_w, C, F, τ, χg) → χm
Check whether perturbed coordinates change sensitivity matrices:
Notations:
χm.X0 = Unperturbed sensitivity matrix, X ∈ {G_x, G_w, C, F}
χm.X ∈ {0|1} = Memoization index ∈ {Dependent|Independent}
If unperturbed θ, then save updated sensitivity matrices:
For X ∈ {G_x, G_w, C, F}, if χg.X = 1, then X → χm.X0
If perturbed θ + δθ, then check if this changes the sensitivity matrices:
For X ∈ {G_x, G_w, C, F}, do
  If |X − χm.X0| is above threshold, then 0 → χm.X, else 1 → χm.X
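The memoization test above amounts to comparing each sensitivity matrix computed under a perturbed θ with its saved unperturbed value. A minimal Python sketch (the names and the tolerance are ours, not IdKit's):

```python
import numpy as np

def set_memoization(mats, saved, perturbed, tol=1e-12):
    """Illustrative sketch of the memoization indicators.

    mats      : dict of current sensitivity matrices, e.g. {'C': ..., 'F': ...}
    saved     : dict holding the unperturbed matrices (chi_m.X0)
    perturbed : False on the unperturbed pass, True on a perturbed-theta pass
    Returns a dict of indicators: 1 = independent of the perturbation
    (value may be reused), 0 = dependent (must be recomputed).
    """
    indicators = {}
    if not perturbed:
        # Unperturbed theta: save the updated matrices for later comparison.
        for name, m in mats.items():
            saved[name] = np.array(m, copy=True)
        return indicators
    # Perturbed theta + d_theta: check which matrices actually changed.
    for name, m in mats.items():
        changed = np.max(np.abs(m - saved[name])) > tol
        indicators[name] = 0 if changed else 1
    return indicators

saved = {}
set_memoization({'C': np.eye(2), 'F': np.ones((2, 2))}, saved, perturbed=False)
ind = set_memoization({'C': np.eye(2), 'F': 2 * np.ones((2, 2))}, saved, perturbed=True)
print(ind)   # C unchanged -> independent; F changed -> dependent
```

Matrices flagged as independent need not be recomputed for that parameter perturbation, which is what saves the model accesses.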
A.7 The Search Routine
Algorithm A.7. Modified Newton–Raphson search
Search(n, N, Q, Q_θ, Q_θθ, χf) → χf, θ, Q_θ, q
Modified Newton–Raphson search:
Notations:
ΔQ = A priori loss
q = Score
N = Length of sample
χf = Search state
χf.Q = Previous loss
χf.θ = Previous coordinates
B = Design parameters
Initialize: If n = 0, then ∞ → χf.Q; 0 → χf.θ
Add a priori loss:
  ½ θᵀθ B.prior_weight → ΔQ; Q + ΔQ → Q;
  Q_θ + θ B.prior_weight → Q_θ;
  Q_θθ + I (B.prior_weight + N B.regul2) → Q_θθ
Overshoot: If Q > χf.Q + B.ΔQ, then
  Step back: ½(θ + χf.θ) → θ; 99 → q
No overshoot: Else do
  Update search state: θ → χf.θ; Q → χf.Q
  Regularize: Q_θθ + diag(Q_θθ) B.regul1 → Q_θθ
  NR solution: Choleski(Q_θθ) → Γ; −Γ⁻ᵀ Γ⁻¹ Q_θ → Δθ
  Step reduction: θ + Δθ B.step_reduction → θ
  Search score: log₁₀(−Q_θᵀ Δθ) → q
Stopping rule: If q < 0, then 'true' → converged, else 'false' → converged
If converged = 'true' or n + 1 = B.max_iterations, then terminate search
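A single iteration of the regularized Newton–Raphson step, Cholesky solution, and step reduction can be sketched as follows (illustrative Python; the a priori terms and the overshoot logic of the full algorithm are omitted, and the parameter names stand in for the design-parameter settings of the text):

```python
import numpy as np

def nr_step(theta, Q_grad, Q_hess, regul1=0.0, step_reduction=1.0):
    """One regularized Newton-Raphson step (sketch only)."""
    # Regularize: inflate the diagonal to keep the Hessian positive definite.
    H = Q_hess + np.diag(np.diag(Q_hess)) * regul1
    # NR solution via Cholesky: solve H d_theta = -Q_grad.
    L = np.linalg.cholesky(H)
    d_theta = -np.linalg.solve(L.T, np.linalg.solve(L, Q_grad))
    # Step reduction and search score q = log10(-Q_grad . d_theta);
    # q < 0 means the predicted loss decrease has become negligible.
    q = np.log10(-Q_grad @ d_theta)
    return theta + step_reduction * d_theta, q

# Minimize Q(theta) = 0.5 theta' A theta - b' theta: one full step lands on the optimum.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
theta = np.zeros(2)
theta, q = nr_step(theta, A @ theta - b, A)
print(np.allclose(A @ theta, b))
```

For a quadratic loss a full step solves the problem exactly; the step-reduction factor and the overshoot back-stepping of the algorithm guard against the non-quadratic case.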
A.8 Library Routines
This section contains the functions the user can choose from to model interfaces to discrete-time data, i.e., ADC and DAC conversion and disturbance generation.
A.8.1 Output Conversion
The Sensor model samples variables with an interval equal to the time quantum, and adds a Gaussian random error. In case the data are sampled with a lower frequency, or some data points are missing, the sensor output will be ignored. The model is
function [y] = Sensor_y(z,wy,rms)
y = z + rms * wy;
where z is the sampled variable, wy is a normalized Gaussian variable, rms is a parameter to be fitted, and y is the sensor output.
A.8.2 Input Interpolators
Interpolators are applicable when the input data is accurate, and it is important what happens between the sampling points.
Name: Hold
u(t) = u_d(τ), t ∈ [t_τ, t_τ + h)    (A.124)
The Hold model makes a stepwise constant function from the data values, continuous to the right. The model is
% Processing:
function [u,xd] = Hold_u(xd,ud)
u = xd;
xd = ud;
% Initialisation:
function [xd] = Hold_i(ud)
xd = ud;
where ud is the discrete-time data value at the end of the interpolation interval, xd a state variable holding the current value, and u the zero-hold output. The predicting function in Hold does not use the data value at the end of the interval, but other interpolators do.
Name: Linear
dx_u(t)/dt = [u_d(τ + 1) − x_d(τ)]/h    (A.125)
u(t) = x_u(t), t ∈ [t_τ, t_τ + h)    (A.126)
The Linear model makes a linear interpolation between the data values. The model is
% Processing:
function [u,Dx,xd] = Linear_u(x,xd,ud)
global TIME_QUANTUM
Dx = (ud - xd)/TIME_QUANTUM;
u = x;
xd = ud;
% Initialisation:
function [x,xd] = Linear_i(ud)
xd = ud;
x = ud;
where ud is the right-end discrete-time data, xd a discrete-time state variable, x a continuous-time state variable, Dx its time derivative, and u the interpolated value.
Name: FirstOrder
ẋ(t) = a [u_d(τ + 1) − x(t)]    (A.127)
u(t) = x(t), t ∈ [t_τ, t_τ + h)    (A.128)
The FirstOrder model outputs the response to a first-order linear model with unit low-frequency gain and stepwise input. The model is
% Processing:
function [u,Dx] = FirstOrder_u(x,ud,a)
Dx = a * (ud - x);
u = x;
% Initialisation:
function [x] = FirstOrder_i(ud)
x = ud;
where ud is the right-end discrete-time data, x is a continuous-time state, Dx is the state derivative, u the response, and a is a parameter.
Name: SecOrder
ẋ_1(t) = x_2(t)    (A.129)
ẋ_2(t) = −a_2 x_1(t) − a_1 x_2(t) + a_2 u_d(τ + 1)    (A.130)
u(t) = x_1(t), t ∈ [t_τ, t_τ + h)    (A.131)
The SecOrder model outputs the response to a second-order linear model with unit low-frequency gain and stepwise input. The model is
% Processing:
function [u,Dx1,Dx2] = SecOrder_u(x1,x2,ud,a1,a2)
Dx1 = x2;
Dx2 = - a2*x1 - a1*x2 + a2*ud;
u = x1;
% Initialisation:
function [x1,x2] = SecOrder_i(ud)
x1 = ud;
x2 = 0;
where ud is the right-end discrete-time data, x are continuous-time states, Dx are the state derivatives, u the response, and a_1, a_2 are parameters.
Name: Delay
u_0(τ) = (1 − a) u_d(τ − k + 1) + a u_d(τ − k)    (A.132)
u_1(τ) = (1 − a) u_d(τ − k) + a u_d(τ − k − 1)    (A.133)
dx(t)/dt = [u_0(τ) − u_1(τ)]/h    (A.134)
u(t) = x(t), t ∈ [t_τ, t_τ + h)    (A.135)
where k = floor(t_0), a = t_0 − floor(t_0). Parameter t_0 is the time lag. The Delay model is an approximation of a delay function with a (possibly unknown) delay time, which need not be an integer number of time quanta. The function carries out two linear interpolations, first between the discrete data, and then between two delayed points on the resulting sequence of linear segments. The response is exact for ramp input, and a good approximation for processes with long response times. The model is:
% Processing:
function [u,Dx,xd] = Delay_u(xd,x,ud,delay)
global MAX_DELAY
global TIME_QUANTUM
tau = 1:MAX_DELAY+2;
% Shift data sequence:
x1 = xd(MAX_DELAY+1);
xd(2:MAX_DELAY+1) = xd(1:MAX_DELAY);
xd(1) = ud;
% Linear interpolation in two delayed points:
ui = interp1(tau,cat(2,xd,x1),tau(1:2)+delay);
% Set up for linear interpolation between those points:
Dx = (ui(1) - ui(2))/TIME_QUANTUM;
u = x;
% Initialisation:
function [xd,x] = Delay_i(ud)
global MAX_DELAY
x = ud;
for i = 1:MAX_DELAY+1
  xd(i) = ud;
end
where ud is right-end discrete-time data, xd is a discrete-time state vector holding delayed data, x is a continuous-time state variable, Dx is its time-derivative, delay is the (possibly unknown) time delay, and u is the delayed and interpolated value. Other variables are internal.
A.8.3 Input Filters
Filters are applicable to contaminated input data, where it is less important what happens between sampling points:
Name: LPFilter
u(t) = x_d(τ), t ∈ [t_τ, t_τ + h)    (A.136)
a = 1 − exp(−Bh)    (A.137)
x_d(τ + 1) = x_d(τ) + a [u_d(τ + 1) − x_d(τ)]    (A.138)
The LPFilter applies a first-order linear digital filter to the data, and then makes a continuous-to-the-right stepwise constant function from the filtered values. The model is
% Processing:
function [u,xd] = LPFilter_u(xd,ud,bw)
global TIME_QUANTUM
a = 1 - exp(-bw*TIME_QUANTUM);
u = xd;
xd = xd + a*(ud - xd);
% Initialisation:
function [xd] = LPFilter_i(ud)
xd = ud;
where ud is the right-end discrete-time data, xd a discrete-time state variable, bw is the (possibly unknown) bandwidth of the filter, and u the filtered value.
Name: NLFilter
u(t) = x_d(τ), t ∈ [t_τ, t_τ + h)    (A.139)
e(τ) = u_d(τ + 1) − x_d(τ)    (A.140)
x_d(τ + 1) = x_d(τ) + e(τ)³/[e(τ)² + e²]    (A.141)
The NLFilter model combines a linear filter with a nonlinear gain e(τ)²/[e(τ)² + e²] that varies with the error amplitude. Thus, the filter responds fast to changes that are well above the (known or unknown) threshold parameter e. The model is
% Processing:
function [u,xd] = NLFilter_u(xd,ud,h)
u = xd;
e = ud - xd;
xd = xd + e*e*e/(e*e + h*h);
% Initialisation:
function [xd] = NLFilter_i(ud)
xd = ud;
where ud is the right-end discrete-time data, xd a discrete-time state variable, h is the (possibly unknown) error level, and u the filtered value.
A.8.4 Disturbance Models
Disturbance models are applicable when little is known about the input and no data is available.
Name: Brownian
dx(t)/dt = w(τ)/√h    (A.142)
v(t) = λ x(t)    (A.143)
where w(τ) are independent Gaussian variables with zero means and unit variances, v is the continuous-time disturbance, and the parameter λ is the constant drift rate. The normalization with 1/√h makes the covariance function of v independent of h. The Brownian model accumulates Gaussian random numbers, and makes a linear interpolation between the discrete values to create a continuous-time approximation of 'Brownian motion'. The model is
% Processing:
function [v,Dx] = Brownian_v(x,wv,rate)
global TIME_QUANTUM
Dx = wv/sqrt(TIME_QUANTUM);
v = rate * x;
% Initialisation:
function [x] = Brownian_i
x = 0;
where wv is a Gaussian variable with zero mean and unit variance, x is a continuous-time state vector, Dx is its time-derivative, rate is a (possibly unknown) parameter specifying the drift rate, and v is the continuous-time disturbance.
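The effect of the 1/√h normalization can be verified numerically. The sketch below is a Python transcription of the Brownian routine (the Euler integration loop and the sample sizes are ours, not part of MoCaVa):

```python
import numpy as np

def brownian(n_steps, h, rate, rng):
    """Generate the Brownian disturbance by Euler integration over
    n_steps time quanta of length h (illustrative sketch)."""
    x = 0.0
    v = np.empty(n_steps)
    for k in range(n_steps):
        wv = rng.standard_normal()
        dx = wv / np.sqrt(h)       # Dx = wv/sqrt(TIME_QUANTUM)
        x += h * dx                # Euler step: increment sqrt(h)*wv
        v[k] = rate * x
    return v

# The 1/sqrt(h) normalization makes Var[x(t)] = t regardless of h:
rng = np.random.default_rng(0)
T, rate = 1.0, 1.0
for h in (0.01, 0.002):
    ends = [brownian(int(T / h), h, rate, rng)[-1] for _ in range(1000)]
    print(round(np.var(ends), 2))   # approximately 1.0 for both step sizes
```

Each quantum contributes variance h to the state, so the endpoint variance depends only on the elapsed time T, as claimed.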
Name: Lowpass
dx(t)/dt = −4B x(t) + √(8B/h) w(τ)    (A.144)
v(t) = σ x(t)    (A.145)
where B and σ are parameters characterizing the disturbance v. For Bh ≪ ½ the disturbance has bandwidth B and standard deviation σ. When B approaches the Nyquist frequency 1/(2h) the variance will change to
σ² [1 − exp(−4Bh)] / [2Bh (1 + exp(−4Bh))] = σ² (1 − 4B²h²/3 + …)    (A.146)
and the bandwidth to
[1 − exp(−4Bh)] / [2h (1 + exp(−4Bh))] = B (1 − 4B²h²/3 + …)    (A.147)
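The variance and bandwidth corrections near the Nyquist frequency share the factor (1 − e^{−4Bh}) / (2Bh (1 + e^{−4Bh})) = tanh(2Bh)/(2Bh), which follows from the exact discretization of the first-order filter with piecewise-constant noise input. A quick numeric check (Python; the helper name is ours):

```python
import numpy as np

def correction(Bh):
    """Common correction factor: (1 - exp(-4Bh)) / (2Bh (1 + exp(-4Bh)))."""
    return (1 - np.exp(-4 * Bh)) / (2 * Bh * (1 + np.exp(-4 * Bh)))

for Bh in (0.01, 0.1, 0.25):
    exact = correction(Bh)
    taylor = 1 - 4 * Bh**2 / 3          # leading terms of the expansion
    print(Bh, round(exact, 4), round(taylor, 4))
```

For small Bh the factor is indistinguishable from 1, confirming that the nominal bandwidth and standard deviation hold well away from the Nyquist frequency.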
The Lowpass model makes a step-wise constant function from Gaussian random numbers and then applies a first-order linear filter. The model is
% Processing:
function [v,Dx] = Lowpass_v(x,wv,rms,bw)
global TIME_QUANTUM
Dx = - 4*bw*x + sqrt(8*bw/TIME_QUANTUM)*wv;
v = rms * x;
% Initialisation:
function [x] = Lowpass_i
x = 0;
where wv is a Gaussian variable with zero mean and unit variance, x is a continuous-time state variable, Dx is its time-derivative, bw and rms are (possibly unknown) parameters characterising the disturbance, and v is the continuous-time disturbance.
Name: Bandpass
dx_1(t)/dt = x_2(t)    (A.148)
dx_2(t)/dt = −2βω x_2(t) − ω² x_1(t) + 2ω √(βω/h) w(τ)    (A.149)
v(t) = σ x_1(t)    (A.150)
where ω = 2πf_0, and β, f_0, and σ are parameters characterizing the disturbance. For f_0 h ≪ ½ the disturbance has standard deviation σ, mean frequency f_0, and damping factor β. The Bandpass model makes a step-wise constant function from Gaussian random numbers and then applies a second-order linear filter with a pair of complex poles. The model is
% Processing:
function [v,Dx] = Bandpass_v(x,wv,rms,beta,freq)
global TIME_QUANTUM
omega = 2*pi*freq;
Dx(1) = x(2);
Dx(2) = - 2*beta*omega*x(2) - omega*omega*x(1)...
    + 2*omega*sqrt(beta*omega/TIME_QUANTUM)*wv;
v = rms * x(1);
% Initialization:
function [x] = Bandpass_i
x(1) = 0;
x(2) = 0;
where wv is a Gaussian variable with zero mean and unit variance, x is a continuous-time state vector, Dx is its time-derivative, beta, freq, and rms are (possibly unknown) parameters characterising the disturbance, and v is the continuous-time disturbance. For freq*TIME_QUANTUM << 1/2, the disturbance has standard deviation rms, mean frequency freq, and damping factor beta. Approaching the Nyquist frequency changes this.
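The claimed output level can be checked by simulating the Bandpass disturbance directly. The following Python sketch (the Euler integration loop, sample sizes, and tolerance are ours, not part of MoCaVa) confirms that for freq*h well below 1/2 the output standard deviation is close to rms:

```python
import numpy as np

def bandpass(n_steps, h, rms, beta, freq, rng):
    """Euler simulation of the Bandpass disturbance (illustrative sketch)."""
    omega = 2 * np.pi * freq
    x1 = x2 = 0.0
    v = np.empty(n_steps)
    for k in range(n_steps):
        wv = rng.standard_normal()
        dx1 = x2
        dx2 = (-2 * beta * omega * x2 - omega**2 * x1
               + 2 * omega * np.sqrt(beta * omega / h) * wv)
        x1 += h * dx1
        x2 += h * dx2
        v[k] = rms * x1
    return v

rng = np.random.default_rng(0)
v = bandpass(200_000, 0.001, rms=2.0, beta=0.2, freq=1.0, rng=rng)
print(np.std(v))   # close to rms = 2.0
```

The noise scaling 2ω√(βω/h) is exactly what makes the stationary variance of x_1 equal to one, so the output standard deviation equals rms in the low-frequency regime.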
A.9 The Advanced Specification Window
This window is used for setting some options for increasing the speed of execution, controlling the display, and debugging.
A.9.1 Optimization for Speed
Three schemes have been implemented to increase the speed of execution of the compiled C-program MinLoss.exe evaluating the likelihood and its derivatives. All are designed to reduce the number of times the user's model needs to be accessed. This may be important in cases of large models. Basically, the operations that need access to the user's model are the evaluations of sensitivity or 'transition' matrices (partial derivatives) during a pass over the sample. Each scheme is based on some restricting property of large models (see Section 2.5). Click the appropriate boxes to activate.
: Sparse changes in dynamics: This option pays off when significant changes in transition matrices occur only sparsely in time, for instance at large changes in the operating point. The GainControl.c routine decides when the transition matrices can and cannot be estimated from previous values with a given error level, thus avoiding unnecessary accesses to the user's model. It uses a number of design parameters, whose default values may be changed in this window:
: Start range of gain control: At start-up, transition matrices are always updated for a preset number of time quanta.
: Max ratio of model accesses: This is a limit on the number of times the model will be accessed, entered as a given fraction of the maximum number of accesses. If the transition matrices change more frequently, this may result in violation of one or more approximation error thresholds. The resulting error levels are displayed to the user.
: Error threshold of H|A|E|C|F-matrix: These are normalized error levels of the five transition matrices. H is the derivative-to-next-state transition, A is the state-to-derivative transition, E is the noise-to-derivative transition, C is the state-to-output transition, and F is the noise-to-output transition.
: Sparse transition matrices: This option pays off when the model is composed of several components, and the total transition matrices therefore contain many zero values. The SparseQuasilinearization.c routine analyses the structure of the model to avoid model accesses that would otherwise result in zero derivatives. No approximation is used in this option. However, the routine carries an overhead, which means that it is not recommended for dense models with few components.
: Sparse parameter dependence: This option pays off when most transition-matrix entries depend on only a few members of the total array of free parameters. The SetMemoization.c routine analyses the structure of the model to avoid model accesses that would otherwise result in zero sensitivity to parameter changes. No approximation is used in this option. However, the routine carries an overhead, which means that it is not recommended for dense models with few components. This option requires the Sparse transition matrices option, which will be activated automatically.
: Primitive alternative structures window: This option becomes useful when the number of parameters is so large that the default Alternative structures window takes uncomfortably long to generate all the buttons required, and a faster window is desirable. It also allows individual entries in vector parameters to be freed independently, which is not possible with the default window.
Using the options for reducing the number of model accesses may require some skill, and understanding how the options operate will no doubt help in that respect. During operation the computer will issue some auxiliary printout that will guide you to efficient computing, if interpreted correctly. See Section 2.5.3 for more information.
A.9.2 User's Checkpoints
MoCaVa suggests default specifications whenever possible, namely for the tentative and alternative origins of the free parameter space (= start values for a search), the dimensions of the tentative free space (= parameters to search for), and the search parameters for fitting and testing. The values are displayed to the user to check and possibly overrule. Each display can be suppressed in order to speed up the procedure in cases when the session runs smoothly and no checking is necessary. Click in the appropriate boxes. The first box causes the display of a graph illustrating the connections of the active components and the variables involved. The graph cannot be changed via the same window; the Model class specification window must be used for that purpose.
A.9.3 Internal Integration Interval
The "time quantum" is the interval over which the ODE solver integrates the continuous-time model in order to create the equivalent discrete-time model used for prediction. The latter operation is the core of all CPU-demanding tasks of simulation, fitting, and testing. The time quantum should therefore be as large as possible in order to minimize the number of accesses to the user's model. Since IdKit accepts 'stiff' ODE, the quantum does not have to be shorter than the shortest time constant in the model, i.e., the rate at which the states may change. Instead, it is limited by the rate at which the time constants may change. It is also limited by the (shortest) sampling interval, which is the default value. If the time quantum is smaller than the sampling interval, it must be an integer fraction of it (1/2, 1/3, ...).
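The integer-fraction rule for the time quantum is easy to automate. A small Python sketch (the helper name is ours):

```python
from math import ceil

def time_quantum(sampling_interval, max_quantum):
    """Pick the largest time quantum that does not exceed max_quantum and is
    an integer fraction (1/1, 1/2, 1/3, ...) of the sampling interval."""
    n = ceil(sampling_interval / max_quantum)   # smallest divisor that fits
    return sampling_interval / n

print(time_quantum(1.0, 1.0))    # -> 1.0 (default: the sampling interval itself)
print(time_quantum(1.0, 0.4))    # -> 0.3333... (1/3 of the interval)
```

Choosing the largest admissible divisor keeps the number of model accesses per sampling interval as small as the accuracy constraint allows.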
A.9.4 Debugging
This section determines the modes of execution of the five basic tasks in IdKit:
Simp.c: One-step predictor simulation.
MinLoss.c: Fitting parameters.
ALMP.c: Computing test statistics.
Simm.c: Long-range predictor simulation.
Simpm.c: M-step predictor simulation.
The options are
: Default: The program will be run at full speed. C-compiling will be done automatically, but only when needed.
: Recompile: The program will be compiled and then run at full speed.
: Trace C: A traceable version of the C-program will be set up, compiled, and executed. The trace option opens a communication window allowing the user to trace the execution of any variable. The options are Step (execute and log the current statement, or enter the current subroutine), Skip (execute the current subroutine, but do not log), Run (execute and log to a break point), Show (display the value of an indicated structured variable or member of it), Monitor (run and log a variable each time it changes), Off (leave the debugging mode). By default, the subroutines to be traced are limited to those that are parts of the user's model. However, the setup routine allows the user to expand this to include also subroutines that are parts of IdKit.
: Retrace C: A previously set up traceable version of the C-program will be executed. This bypasses the setup part, which otherwise takes several minutes.
: Trace M: A MATLAB M-file version simp.m will be executed, invoking the MATLAB debugger option the first time the user model is reached. This option may be easier to handle for a user who is familiar with debugging M-files. However, it cannot detect conceivable errors in the automatic translation of the user's model M-statements into C. The idea is to debug the user's model by running simp.m, one component at a time, before the model is processed further. The C-tracing options are for more hard-to-detect errors. An M-version of the model is generally much slower.
The reason is the multi-level hierarchical structure of the IdKit programs, which follows from the fact that the forms of the user's models are not given a priori, and may be non-linear. This means that the 'vectorisation' feature, which normally makes MATLAB run fast, cannot be exploited in the programming of IdKit, and must be replaced with more time-consuming explicit loop control and function calls. Hence, the M-version is not an option for the more demanding tasks of fitting and testing.
Glossary
Physical concepts
: Process proper: The physical system to be described by a mathematical model. When subject to 'stimulus' input it produces 'response' output (in addition to the product it is designed to produce). It is usually called "the system" in the identification literature.
: Actuator: A physical system producing continuous-time stimulus when subject to discrete-time 'control' input.
: Environment: A physical system producing 'disturbance' input to the process proper, independent of control input.
: Sensor: A physical system producing sampled data by measuring a process variable (response, stimulus, or disturbance).
: Experiment: A process producing discrete-time response data from the process proper. It creates a file of control data to actuators and response data from sensors.
: Identification object: It consists of process proper, actuators, sensors, and environment.
Theoretical, mathematical, and programming concepts
: ODE solver: A program that integrates given systems of ordinary differential equations.
: ADE solver: A program that integrates given systems of ordinary differential and algebraic equations.
: Model: A set of differential and/or algebraic equations. Together with an ODE or ADE solver it allows simulation of the output of the identification object when given input stimulus. Examples: 1) A set of nonlinear state equations. 2) A linear continuous-time transfer function.
: The true model: A perfect model of the identification object, generally fictitious.
: Parametric model: A model containing "parameters", i.e., input with values that are constant during a simulation. Example: A subroutine defining the model algorithm, where some constants are not 'hard coded', i.e., they have been declared but not assigned values. They must therefore be set outside the subroutine. This book treats only parametric models, all encoded as subroutines. No distinction is therefore made between "model" and "parametric model", or between the set of model equations and the subroutine that codifies them.
: Arguments: All scalar or array variables and constants defined in a model.
: Argument attributes: Properties associated with an argument, specified a priori. Examples: Dimensions, scales, nominal values, ranges.
: Argument class: The class decides how an argument should be interpreted by the identification software. Examples: “stimulus”, “response”, “parameter”, “state”.
: Deterministic model: A model whose input are all known when it is put to use.
: Stochastic model: A model that describes its unknown, "disturbance" input as stochastic processes. Example: Brownian motion.
: Noise: Uncorrelated, often fictitious random variables that are input to stochastic models. Examples: 1) 'Measurement noise' is used to explain deviations between data and the model's response. 2) 'State noise' is used as 'slack' variables to allow for imperfections in the model.
: Predictor: An algorithm that forecasts a given model's response a given time ahead, given output data up to the present time and input data up to the predicted time. For a deterministic model the predictor is equivalent to the model, since output data provides no new information. Example: A discrete-time Kalman filter is a one-step-ahead predictor.
: Time quantum: A fixed time interval, short enough to allow prediction of the model's response with sufficient accuracy. It is normally equal to the shortest sampling interval.
: Observer: An algorithm that estimates the current values of unmeasured state variables from values of known stimulus and measured response up to the present time.
: Residuals: One-step prediction errors.
: Model component: A submodel that is part of the object model. A model is a system of one or more connected components. Examples: 1) The model of a single unit in a production line, 2) a physical law, like Boyle's law for an ideal gas, 3) an actuator model.
: Signal: An output of a 'source' component that is input to one or more components.
: Terminal: An argument that receives a signal when the 'source' component has been connected. It is assigned the signal value after the source has been executed.
: Stub: A parameter introduced for the sole purpose of serving as terminal for signals from another component, which a priori may or may not be needed. Example: A parameter added to or multiplied with a variable in the component equations.
: Parameter map: A function specifying nominal values and admissible ranges of parameters. It maps fictitious "free coordinates" θ into physical parameter values, and the "origin" θ = 0 into the nominal value. Example: A volume V has range (0,∞). Its "map" is the function V = V_0 exp(θ), where V_0 is the nominal value and θ is the free coordinate with domain (−∞,∞).
: Free parameter space: A set of indicators and parameter maps specifying conceivable alternatives to the nominal values. The dimension of the space is defined by the indicator array, and its geometry by the corresponding set of parameter maps.
: Model structure: A parametric model with unspecified parameter values, but specified free parameter space. Example: All second-order linear continuous-time transfer functions, stable or not.
: Model complexity: The number of free parameters in a model structure. Example: In a rational linear transfer function the model complexity is twice the order.
: Model class: A parametric model with unspecified parameter values and unspecified free parameter space. Example: All linear continuous-time transfer functions.
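The parameter-map idea can be made concrete (a Python sketch; the nominal value V_0 = 2.0 is an arbitrary example):

```python
from math import exp

def volume_map(theta, V0=2.0):
    """Parameter map for a positive volume: the free coordinate theta in
    (-inf, inf) maps to V = V0*exp(theta) in (0, inf), with theta = 0
    giving the nominal value V0."""
    return V0 * exp(theta)

print(volume_map(0.0))          # origin maps to the nominal value -> 2.0
print(volume_map(-50.0) > 0.0)  # any free coordinate stays in range -> True
```

The search routine can thus operate in an unconstrained coordinate space while the physical parameter always stays within its admissible range.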
: Model library: The set of all model classes that can be formed within the limitations of the designer’s prior knowledge and the identification software.
: White box: A model class, where all input are available for a simulation, and all unknown parameters are indicated as free, i.e., a “model structure”.
: Black box: A model class picked from a preprogrammed library of model classes, but not all input are available for a simulation, and the model order is unspecified.
: Grey box: A model class, where not all input are available for a simulation, and free parameters are not indicated.
: Equivalent discrete-time model: An algorithm describing the relations between variables at the beginning and end of a time interval of fixed length, the "time quantum". The relations are determined by integrating the continuous-time model.
: Modelling: Specifying the model class. Examples: 1) Deriving and writing the state equations. 2) Limiting the set of models to linear transfer functions.
: Likelihood: A measure of agreement between model response and data. It is based on the theory of conditional probabilities, and requires the computing of residuals and their covariances. Example: The negative logarithm of the likelihood of a deterministic model is proportional to the sum of squared residuals weighted by the inverses of their variances.
: Fitting: Finding the most likely parameter values within a given free parameter space, based on data from a given experiment. Example: Fitting the response of a second-order transfer function to data; this minimizes the sum of squared residuals.
: Over-parametrization: Creating a model class with unnecessarily many parameters.
: Over-fitting: Fitting a model structure with more free parameters than the data will support. Parameter estimates will be uncertain.
: ML (Maximum Likelihood): A fitting criterion based on likelihood.
: MAP (Maximum A posteriori Probability): The "Maximum Likelihood" criterion, slightly modified by a given a priori distribution of the unknown parameter values.
: Falsification: Testing whether given data contradicts a given model. Falsifying the fitted model within a structure falsifies also the structure. Falsifying the most complex structure within a model class falsifies the class.
: Tentative model structure: A model structure not (yet) falsified by data.
: Refining a model structure: Expanding the free parameter space, or adding a component to the model class.
: Alternative model structure: A refined tentative model structure, possibly more complex.
: Nested model structures: A tentative and an alternative structure are nested if all models in the first set are also in the second, larger set. Making an alternative structure by expanding the free parameter space of a tentative structure makes the case nested.
: Conditional falsification: Falsification based on one or more alternative structures.
: Unconditional falsification: Falsification without an alternative structure.
: Power of a test: The probability of falsifying an incorrect tentative structure, for a given risk of falsifying a correct tentative structure.
: LR test (Likelihood-Ratio): A conditional falsification method based on computing the probability that a given alternative model is better than any one in the tentative structure. It applies to nested cases, and is then a "Maximum-Power" test.
: ALMP test (Asymptotic Locally Most Powerful): A sub-optimal conditional falsification method for nested cases. It requires less computing, but does not have maximum power, except in cases where there is little difference between the tentative and the alternative models (when power is needed the most).
: Correlation tests: Unconditional falsification methods, based on the idea that residuals should be uncorrelated. Otherwise, the correlation could be used to improve the predicting ability of the model.
: Calibration: Finding, within the model class, the most likely model with the lowest complexity that is not falsified by data from a given experiment. This includes testing whether the model class is adequate. Example: Finding the minimum order of transfer functions fitted to data, and testing whether the hypothesis of linearity holds.
: Identification: The definition varies in the literature. It is used here as another term for "calibration".
: Validation: Deciding whether the fitted model within a given model structure is good enough for a given purpose.
: Model design: Modelling + Calibration + Validation.
: Interactive design: A design method with several steps, where a human designer and a computer program interchange information between steps (user input and computer output), typically via the user's 'windows' to the program. Example: Grey-box identification.
Software
: MATLAB: A commercial software package for mathematical computing.
: Simulink: A commercial software package for modelling and simulation.
: Dymola: A commercial software package for modelling and simulation.
: Modelica: A language for specifying models.
: ModKit (Modelling tool Kit): A set of MATLAB M-files supporting the specification of model classes. Examples: Entering equations and argument attributes.
: IdKit (Identification tool Kit): A set of C-programs executing subtasks during calibration and validation sessions. Examples: Fitting and testing.
: MoCaVa (Model Calibrator & Validator): An interactive software package supporting calibration and validation. MoCaVa contains user interfaces and user's guides to ModKit and IdKit.
References
The references TRITA-REG and IR-S3-REG are issued by the Dept. of Automatic Control, Royal Institute of Technology, Stockholm, Sweden. Internet address: www.s3.kth.se/control/local/ftp/reports/

Allison B, Isaksson AJ, Karlström A (1995) Grey-Box Identification of a TMP Refiner. IR-S3-REG-9513. International Mechanical Pulping Conference, Ottawa, Canada
Akaike H (1974) A new look at the statistical model identification. IEEE Trans. Automatic Control AC-19:716-723
Åström KJ (1970) Introduction to Stochastic Control Theory. Academic Press, New York
Åström KJ, Eklund K (1972) A simplified nonlinear model of a drum-boiler turbine unit. Int. J. Control 16:145-169
Åström KJ, Eklund K (1975) A simple nonlinear drum boiler model. Int. J. Control 22:739-740
Atherton DP (1992) Nonlinear effects and their modelling. In: Atherton D, Borne P (eds.) Concise Encyclopedia of Modelling & Simulation. Pergamon Press, Oxford, pp 297-300
Billings SA (1980) Identification of nonlinear systems - a survey. IEE Proc. D. 127:272-285
Bohlin T (1978) Maximum-power validation of models without higher-order fitting. Automatica 14:137-146
Bohlin T (1986) Computer-aided grey-box identification. In: Byrnes CI, Lindquist A (eds.) Modelling, Identification and Robust Control. Elsevier, Amsterdam, pp 549-562
Bohlin T (1987a) Identification: Practical aspects. In: Singh MG (ed.) Systems & Control Encyclopedia, vol 4. Pergamon Press, Oxford, pp 2301-2307
Bohlin T (1987b) Evaluation of likelihood functions for grey-box identification. IFAC 10th World Congress on Automatic Control, Munich, FRG
Bohlin T (1991a) Interactive System Identification: Prospects and Pitfalls. Springer-Verlag, Berlin
Bohlin T (1991b) Grey-Box Identification: A Case Study. TRITA-REG-91-1
Bohlin T (1993) A Designer's Guide for Grey-Box Identification of Nonlinear Dynamic Systems with Random Disturbances. IFAC 1993 World Congress, Sydney, Australia
Bohlin T (1994a) A case study of grey-box identification. Automatica 30:307-318
Bohlin T (1994b) Derivation of a "designer's guide" for interactive grey-box identification of nonlinear stochastic objects. Int. J. Control 59:1505-1524
Bohlin T (1996) Modelling the Bending Stiffness of Card Board at the Frövifors Plant. IR-S3-REG-9608
Bohlin T, Graebe SF (1994a) Issues in nonlinear stochastic grey-box identification. 10th IFAC Symposium on System Identification, Copenhagen
Bohlin T, Graebe SF (1994b) Issues in nonlinear stochastic grey-box identification. Int. J. Adaptive Control and Signal Processing 9:465-490
Bohlin T, Isaksson AJ (2003) Gray-box model calibrator and validator. 13th IFAC Symposium on System Identification
Bortolini G (2002) On modelling and estimation of curl and twist in multi-ply paperboard. Licentiate thesis, Optimization and Systems Theory, Royal Inst. of Technology, Sweden
Camo A/S (1987) Modelling & Identification. Trondheim, Norway
Edsberg L, Wikström G (1995) Toolbox for Parameter Estimation and Simulation in Dynamic Systems with Applications to Chemical Kinetics. Nordic MATLAB Conference '95, COMSOL, Stockholm
Eklund K (1970) Numerical model building. Int. J. Control 11:973-985
Ekstam L, Smed T (1987) Parameter Estimation in Dynamic Systems with Application to Power Engineering. Report UPTEC 8747 R, Uppsala University, Uppsala, Sweden
Ekvall J, Funkquist J, Lagerberg A (1994) A Program Package for Modelling and Simulation of Distributed Parameter Processes. IR-S3-REG-9405. SIMS'94, 36th Simulation Conference on Applied Simulation in Industry
Elmqvist H (1978) Structured Model Language for Large Continuous Systems. Dissertation TFRT-1015, Dept. of Automatic Control, Lund Institute of Technology, Lund, Sweden
Fan P (1990) Modelling and Control of a Fermentation Process. Dissertation, TRITA-REG-9007
Fan P, Bohlin T (1989) Modelling, Estimation and Control of Baker's Yeast Production. IEEE International Conference on Control and Applications, Jerusalem, Israel
Funkquist J (1993) On Modeling and Control of a Continuous Pulp Digester. Licentiate thesis, TRITA-REG-9301
Funkquist J (1994a) Control of the Washing Zone in a Continuous Digester. IR-S3-REG-9402. 3rd IEEE Conference on Control Applications, 1994
Funkquist J (1994b) A Dynamic Model of the Continuous Digester - a Simulation Study. IR-S3-REG-9404. SIMS'94, 36th Simulation Conference on Applied Simulation in Industry
Funkquist J (1994c) On Modeling and Identification of a Continuous Pulp Digester. 10th IFAC Symposium on System Identification, Copenhagen
Funkquist J (1994d) A Dynamic Model of a Continuous Digester Suitable for System Identification. IR-S3-REG-9403. SIMS'94, 36th Simulation Conference on Applied Simulation in Industry
Funkquist J (1995) Modelling and Identification of a Distributed Parameter Process: The Continuous Digester. Dissertation, TRITA-REG-9504
Gavelin G (1995) Papp och kartong [Paper and board] (in Swedish). Skogsindustrins utbildning i Markaryd
Golub GH, van Loan CF (1996) Matrix Computations. Johns Hopkins University Press
Graebe SF (1990a) Theory and Implementation of Gray Box Identification. Dissertation, TRITA-REG-9006
Graebe SF (1990b) IDKIT: A Software for Grey-Box Identification. Mathematical Reference. TRITA-REG-9003
Graebe SF (1990c) IDKIT: A Software for Grey-Box Identification. User's Guide, Version RT. TRITA-REG-9004
Graebe SF (1990d) IDKIT: A Software for Grey-Box Identification. Implementation and Design Reference. TRITA-REG-9005
Graebe SF, Elsley G, Goodwin GC (1992) Nonlinear identification and control of mold level oscillations in continuous bloom casting. Report EE9204, Centre for Industrial Control Science, University of Newcastle, Australia
Graebe SF, Bohlin T (1992) Identification of Nonlinear Stochastic Grey-Box Models. 4th IFAC Symposium on Adaptive Systems in Control and Signal Processing, Grenoble
Gupta K, Groshans D, Houtchens SP (1993) MATRIXx. In: Linkens DA (ed.) CAD for Control Systems. Dekker, New York
Gutman PO, Nilsson B (1996) Modelling and prediction of bending stiffness for paper board manufacturing. IFAC World Congress, San Francisco
Gutman PO, Nilsson B (1998) Modelling and prediction of bending stiffness for paper board manufacturing. J. Process Control 4:229-237
Havelange O (1995) Grey Box Modelling of a Cement Milling Circuit. Report IR-RT-EX-9510, Automatic Control, Royal Institute of Technology, Stockholm, Sweden
He N (1991) Approximate Evaluation of Likelihood Functions for Nonlinear Stochastic State-Vector Models. Licentiate thesis. TRITA-REG-89/9
Hjalmarsson H (2005) From experiment design to closed-loop control. Automatica 41:393-438
Isaksson AJ, Lindkvist R (2003) Identification of mechanical parameters in drive train systems. 13th IFAC Symposium on System Identification
Kristensen NR, Madsen H (2003) Continuous Time Stochastic Modelling - CTSM 2.2 - User's Guide. Technical University of Denmark, Lyngby
Kristensen NR, Madsen H, Jørgensen SB (2004) Parameter estimation in stochastic grey-box models. Automatica 40:225-237
Kuo BC (1992) Digital Control Systems. Oxford University Press, New York
Leontaritis IJ, Billings SA (1987) Model selection and validation methods for non-linear systems. Int. J. Control 45:311-341
Liao B (1989a) A Comparison of Three Optimization Methods for Maximum Likelihood Parameter Estimation. TRITA-REG-89-6
Liao B (1989b) Evaluation of Likelihood Functions and Their Derivatives for Nonlinear Phase-Variable Models. TRITA-REG-89-7
Liao B (1989c) Convergence and Consistency in Maximum Likelihood Identification for Nonlinear Phase-Variable Models. TRITA-REG-89-8
Liao B (1990) Convergence and consistency in maximum likelihood identification of nonlinear systems. IASTED Conference on Modelling, Identification and Control, Innsbruck
Lindskog P, Ljung L (1995) Tools for semi-physical modelling. Int. J. Adaptive Control and Signal Processing 9:509-523
Ljung L (1987) System Identification: Theory for the User. Prentice-Hall, New York
Margolis DL (1992) Simulation modelling formalism: Bond graphs. In: Atherton D, Borne P (eds.) Concise Encyclopedia of Modelling & Simulation. Pergamon Press, Oxford, pp 415-420
Markusson O (2002) Model and System Inversion with Applications to Nonlinear System Identification and Control. Dissertation, Automatic Control, Royal Institute of Technology, Stockholm
Markusson O, Bohlin T (1997) Identification of a nonlinear EEG-generating model. IR-S3-REG-9712. IEEE Workshop on Nonlinear Signal and Image Processing, Michigan
Mattsson SE, Andersson M, Åström KJ (1993) Object-oriented modelling and simulation. In: Linkens DA (ed.) CAD for Control Systems. Marcel Dekker, New York, pp 31-69
Pettersson J, Gutman PO, Bohlin T, Nilsson B (1997) A grey-box bending stiffness model for paper board manufacturing. Proc. IEEE International Conference on Control Applications, pp 395-400
Pettersson J (1998) On Model Based Estimation of Quality Variables for Paper Manufacturing. Licentiate thesis, TRITA-REG-9804
Sohlberg B (1990) Datorstödd modellering och optimalreglering av sköljningsprocess [Computer-aided modelling and optimal control of a rinsing process] (in Swedish). Licentiate thesis, TRITA-REG-9008
Sohlberg B (1991) Computer Aided Modelling of a Rinsing Process. IMACS Symposium MTCS, Casablanca
Sohlberg B (1992a) Computer Aided Modelling of a Rinsing Process. IMACS Conference on Modelling and Control of Technological Processes, Lille
Sohlberg B (1992b) Supervision of a steel strip rinsing process. In: Proceedings of the 1st IEEE Conference on Decision and Control, pp 2557-2561
Sohlberg B (1993a) Optimal control of a steel strip rinsing process. In: Proceedings of the 2nd Conference on Control Applications, pp 147-253
Sohlberg B (1993b) Supervision and Control of a Steel Strip Rinsing Process. Dissertation, TRITA-REG-9302
Sohlberg B (1998a) Monitoring and failure diagnostics of a steel strip process. IEEE Trans. Control Systems Technology 6:294-303
Sohlberg B (1998b) Supervision and Control for Industrial Processes: Using Grey-Box Models, Predictive Control, and Fault Detection Methods. Springer, London
Sohlberg B, Särnfelt M (2002) Grey box modelling for river control. Journal of Hydroinformatics 4:265-280
Sohlberg B (2003) Grey box modelling for model predictive control of a heating process. Journal of Process Control 13:225-238
Sørlie JA (1994a) On the Interfacing of Software for Identification of Grey Box Models. Licentiate thesis. TRITA-REG-9401
Sørlie JA (1995a) omsim2maple - A Translation Utility for OmSim Simulation Code. IR-S3-REG-9503. EuroSim'95, TU Vienna
Sørlie JA (1995c) EKF and TplC Maple Packages for Symbolic Derivation and Code Generation of Higher-order Extended Kalman Filters. IR-S3-REG-9506
Sørlie JA (1996a) On Grey-Box Model Definition and Symbolic Derivation of Extended Kalman Filters. Dissertation, TRITA-REG-9601
Sørlie JA (1996b) Demarcation Application Dependence and the Role of Code Generation in Symbolic-Numeric Tools. IR-S3-REG-9601. ACM's ISSAC '96, ETH Zurich
Sørlie JA (1996c) On the Modular Symbolic-Numeric Implementation of Extended Kalman Filters. IR-S3-REG-9602. CASC'96, Dearborn, Michigan
Sørlie JA (1996d) Model Definition and Management Using Multiple Realization - Case Studies in Omola. IR-S3-REG-9603. SIMS'96, Trondheim
Spännar J, Sohlberg B (1997) Modelling and identification of an air-cooling process. Proc. IEEE International Conference on Control Applications, pp 383-385
Spännar J, Wide P, Sohlberg B (2002) A method for measuring strip temperature in the steel industry. IEEE Trans. Instrumentation and Measurement 51:1240-1246
Söderström T (1981) On a method for model structure selection in system identification. Automatica 17:387-388
Söderström T, Stoica P (1989) System Identification. Prentice-Hall, New York
Tiller M (2001) Introduction to Physical Modeling with Modelica. Kluwer Academic Publishers, Boston
Tulleken HJAF (1993) Grey-box modelling and identification using physical knowledge and Bayesian techniques. Automatica 29:285-308
Wilks SS (1962) Mathematical Statistics. Wiley, London
Index
Actuator model 25, 26, 30, 44, 313
Algebraic loop 147, 322
Argument 34
  attributes 34, 37, 42, 98
  class 92
Assignment statements 34
Bond graphs 19, 33
Calibration 14, 68, 83, 203, 271
  procedure 21, 22
Causality 24, 32, 68, 189, 244
Conversion function
  output 97
  input 98
Control sequence 25, 314
Credibility 43
Data
  MoCaVa 78
  sample 86
  raw 78
Diagnostic tools 69
Disturbance 25, 36, 98, 117, 138
Downloading 77
Dymola 19, 160
EKF (Extended Kalman Filter) 12, 56, 317
Environment 25, 313
Environment model 26, 30, 96, 225
Equation solver
  ADE 322
  ODE 317
Falsification 20, 51
Feedback 93, 147
Fitting 5, 6, 10, 51, 107
Free coordinates 15, 28
Free parameter space 15, 103, 113
Help
  alternative structures 114
  argument attributes 100
  argument classification 93
  component function 91
  component library 89
  I/O interface 96
  model class appraisal 105
  model class specification 101
  origin 104
  primitive alternative structures 116
  search appraisal 111
  search specification 109
  test appraisal 120
  time range 88
Hypothesis 42
IdKit 18
Identification 344
  black-box 6
  grey-box 10
  white-box 5
Importing
  model 159, 170
  parameters 104, 140
Input
  constant 35, 94
  control 33, 35, 36, 45, 94
  disturbance 35, 36, 47, 94
  feed 33, 35, 36, 45, 94
  parameter 35, 36
  time 35, 36
Input filter 48, 97, 334
Input noise 46
Installation 77
Interpolator 49
I/O interface 49, 95
Library
  component 15, 27, 89
  conversion 331
  model 15, 17
  user 153
Likelihood function 51, 53, 63
Loss function 5, 6, 9, 13, 53, 55, 63
  normalized 52
  weighted 53
Loss derivatives 55, 316
MAP (Maximum A Posteriori) 54, 343
MATLAB M-statements 91
ML (Maximum Likelihood) 52, 343
Model 5, 6, 10, 15, 28
  disturbance 50, 124, 335
  Dymola 160, 170, 174
  equivalent discrete-time 12, 56, 322
  parametric 51, 341
  null 53
  state 18
Model cell 33, 35
Model class 5, 10, 15, 28, 43, 101, 105, 313
  ARMAX 8, 71
  expanding 29
  NARMAX 8
  NARX 9, 71
  root 29
Model complexity 28
Model component 15, 27, 88
  statements 90
Model set 15, 23
  alternative 14
  tentative 14
Model structure 15, 16, 28, 121
  alternative 15, 22, 122
  nested 51, 54, 119
  tentative 14, 107, 113, 116
Modelica 19, 160, 170
Modelling 4, 68, 194, 248
  disturbance 222, 295
  tools 19
  shell 31
Noise
  process 26
  measurement 26
  continuous-time white 6, 16, 26, 314
  discrete-time white 314
  state 223
Observer 12, 342
Omola 19
Optimization tool
  parameter 20
  performance 57, 327, 337
Order number 6, 16
Outliers 139, 240
Over-parametrization 7, 73, 166
Over-fitting 118, 229, 290
Parameter 28, 34, 93, 140
  map 28, 53, 342
  origin 28, 103
Predictor 6, 8, 10, 13, 56, 321
Prior knowledge 41
Process proper 26, 30, 185, 341
Project 83
Refining 121, 206
Residuals 52, 56, 116
Response 25, 36, 93
Resuming 143
Risk 120, 121
SBE rule 290
Search routine 62, 330
  parameters 108
Sensitivity matrix 56, 58
  equivalent discrete-time 319
Sensor model 26, 30, 313
SFI rule 10, 122
Signal 29, 35, 93
Simulink 19
State 93
Stimulus 25
Stub 134
Suspending 141
Terminal 134
Test 6, 10
  ALMP 54, 121
  correlation 117, 133
  fair 54, 119
  LR 52, 119, 121
  power of 20, 121
Time quantum 24, 26, 87, 143
User function 89, 154
User's checkpoints 338
Validation 17, 68
Window
  advanced specification 123, 337
  alternative structures 113
  argument attributes 98
  argument classification 92
  component function 90
  component library 89
  component naming 88
  customization 272
  Dymola connection 161, 175
  data assignment 100
  data outline 80
  implicit attributes 100
  I/O interface 95
  main 86
  MoCaVa 78, 83
  model 85, 107
  model class appraisal 105
  model class specification 101
  origin 103
  plot 105
  plot outline 192
  plot specification 104
  pilot 85
  predat control 78
  rescale 155
  restart 143
  search appraisal 111
  search specification 108
  select project 83
  session save 141
  targets specification 135
  tentative structure 107
  test appraisal 119
  time range 86
  user's library 153
Other titles published in this Series (continued):

Analysis and Control Techniques for Distribution Shaping in Stochastic Processes
Michael G. Forbes, J. Fraser Forbes, Martin Guay and Thomas J. Harris
Publication due August 2006

Process Control Performance Assessment
Andrzej Ordys, Damien Uduehi and Michael A. Johnson (Eds.)
Publication due August 2006

Adaptive Voltage Control in Power Systems
Giuseppe Fusco and Mario Russo
Publication due September 2006

Advanced Fuzzy Logic Technologies in Industrial Applications
Ying Bai, Hanqi Zhuang and Dali Wang (Eds.)
Publication due September 2006

Distributed Embedded Control Systems
Matjaž Colnarič, Domen Verber and Wolfgang A. Halang
Publication due October 2006

Modelling and Analysis of Hybrid Supervisory Systems
Emilia Villani, Paulo E. Miyagi and Robert Valette
Publication due November 2006

Model-based Process Supervision
Belkacem Ould Bouamama and Arun K. Samantaray
Publication due February 2007

Continuous-time Model Identification from Sampled Data
Hugues Garnier and Liuping Wang (Eds.)
Publication due May 2007

Process Control
Jie Bao and Peter L. Lee
Publication due June 2007

Optimal Control of Wind Energy Systems
Iulian Munteanu, Antoneta Iuliana Bratcu, Nicolas-Antonio Cutululis and Emil Ceanga
Publication due November 2007